2008 IEEE International Conference on Data Mining Workshops

chapter

Hunting for Coherent Co-clusters in High Dimensional and Noisy Datasets

M. Deodhar, J. Ghosh, G. Gupta, Hyuk Cho, more

2008 IEEE International Conference on Data Mining Workshops > 654 - 663

Clustering problems often involve datasets where only a part of the data is relevant to the problem, e.g., in microarray data analysis only a subset of the genes show cohesive expressions within a subset of the conditions/features. The existence of a large number of non-informative data points and features makes it challenging to hunt for coherent and meaningful clusters from such datasets. Additionally,...

chapter

Efficient Distance Computation Using SQL Queries and UDFs

S.K. Pitchaimalai, C. Ordonez, C. Garcia-Alvarado

2008 IEEE International Conference on Data Mining Workshops > 533 - 542

Distance computation is one of the most computationally intensive operations employed by many data mining algorithms. Performing such matrix computations within a DBMS creates many optimization challenges. We propose techniques to efficiently compute Euclidean distance using SQL queries and user-defined functions (UDFs). We concentrate on efficient Euclidean distance computation for the well-known...

chapter

Multiple-Instance Regression with Structured Data

K.L. Wagstaff, T. Lane, A. Roper

2008 IEEE International Conference on Data Mining Workshops > 291 - 300

We present a multiple-instance regression algorithm that models internal bag structure to identify the items most relevant to the bag labels. Multiple-instance regression (MIR) operates on a set of bags with real-valued labels, each containing a set of unlabeled items, in which the relevance of each item to its bag label is unknown. The goal is to predict the labels of new bags from their contents...

chapter

Extension of Partitional Clustering Methods for Handling Mixed Data

Y. Naija, S. Chakhar, K. Blibech, R. Robbana

2008 IEEE International Conference on Data Mining Workshops > 257 - 266

Clustering is an active research topic in data mining and different methods have been proposed in the literature. Most of these methods are based on the use of a distance measure defined either on numerical attributes or on categorical attributes. However, in fields such as road traffic and medicine, datasets are composed of numerical and categorical attributes. Recently, there have been several proposals...

chapter

Clustering Events on Streams Using Complex Context Information

YongChul Kwon, Wing Yee Lee, M. Balazinska, Guiping Xu

2008 IEEE International Conference on Data Mining Workshops > 238 - 247

Monitoring applications play an increasingly important role in many domains. They detect events in monitored systems and take actions such as invoking a program or notifying an administrator. Often administrators must then manually investigate events to figure out the source of a problem. Stream processing engines (SPEs) are general purpose data management systems for monitoring applications. They provide...

chapter

Distributed Data Mining Models as Services on the Grid

E. Cesario, D. Talia

2008 IEEE International Conference on Data Mining Workshops > 486 - 495

This paper describes how distributed data mining models, such as collective learning, ensemble learning, and meta-learning models, can be implemented as WSRF mining services by exploiting the Grid infrastructure. Our goal is to design a general distributed architectural model that can be exploited for different distributed mining algorithms deployed as Grid services for the analysis of dispersed data...

chapter

Harmonic Blind Sound Source Isolation Enhanced by Spectrum Clustering

Xin Zhang, Wenxin Jiang, Z.W. Ras

2008 IEEE International Conference on Data Mining Workshops > 310 - 319

Automatic indexing of music by instruments and their types is a challenging problem, especially when multiple instruments are playing at the same time. We have built a database containing more than one million music instrument sounds, each described by a large number of features including standard MPEG7 audio descriptors, features for speech recognition, and many new audio features developed by...

chapter

Automatic Multimodal Aggregation of News from Television and the Web

A. Messina

2008 IEEE International Conference on Data Mining Workshops > 979 - 982

This demonstration concerns a system designed and implemented to automatically build multimodal aggregations of informative news items coming from the two domains of digital television and the Web. Though in recent times several technological solutions have addressed the problem of clustering online articles, little is available which is capable of integrating these two sources of information. The...

chapter

A New Graph-Based Algorithm for Clustering Documents

A.P. Suarez, J.F.M. Trinidad, J.A.C. Ochoa, J.E.M. Pagola

2008 IEEE International Conference on Data Mining Workshops > 710 - 719

In this paper a new algorithm, called CStar, for document clustering is presented. This algorithm improves on recently developed algorithms such as the generalized star (GStar) and ACONS algorithms, originally proposed for reducing some drawbacks present in previous Star-like algorithms. The CStar algorithm uses the condensed star-shaped sub-graph concept defined by ACONS, but defines a new heuristic that...

chapter

Parallel Hierarchical Clustering on Market Basket Data

Baoying Wang, Qin Ding, I. Rahal

2008 IEEE International Conference on Data Mining Workshops > 526 - 532

Data clustering has been proven to be a promising data mining technique. Recently, there have been many attempts at clustering market-basket data. In this paper, we propose a parallelized hierarchical clustering approach on market-basket data (PH-Clustering), which is implemented using MPI. Based on the analysis of the major clustering steps, we adopt a partial local and partial global approach to...

chapter

Detecting and Tracking Spatio-temporal Clusters with Adaptive History Filtering

J. Rosswog, K. Ghose

2008 IEEE International Conference on Data Mining Workshops > 448 - 457

This paper addresses the problem of detecting and tracking moving clusters in spatio-temporal data sets. Spatio-temporal data sets contain data elements that move in space over time. Traditional data clustering algorithms work well on static data sets that contain well separated clusters. When traditional techniques are applied to spatio-temporal data, they break down when the moving data elements intersect...

chapter

A New Method for Multi-view Face Clustering in Video Sequence

Panpan Huang, Yunhong Wang, Ming Shao

2008 IEEE International Conference on Data Mining Workshops > 869 - 873

In the problem of face clustering with multi-views, the similarity between faces of different persons with similar pose is usually greater than the similarity between multi-view faces of the same person. This may exert a tremendous impact on the clustering result that is sent back to the user. To solve this problem, we should do pose clustering first and then within each 'pose group', clustering...

chapter

Semi-supervised Collaborative Clustering with Partial Background Knowledge

G. Forestier, C. Wemmert, P. Gancarski

2008 IEEE International Conference on Data Mining Workshops > 211 - 217

In this paper we present a new algorithm for semi-supervised clustering. We assume that a small set of labeled samples is available and use it in a clustering algorithm to discover relevant patterns. We study how our algorithm performs against two other semi-supervised algorithms when the data are multimodal. Then, we study the case where the user is able to produce few samples for some classes but not for each...

chapter

If Constraint-Based Mining is the Answer: What is the Constraint? (Invited Talk)

J.-F. Boulicaut

2008 IEEE International Conference on Data Mining Workshops > 730

Constraint-based mining has been proven to be extremely useful. It has been applied not only to many pattern discovery settings (e.g., for sequential pattern mining) but also, recently, to classification and clustering tasks. It appears as a key technology for an inductive database perspective on knowledge discovery in databases (KDD), and constraint-based mining is indeed an answer...

chapter

Bounding and Estimating Association Rule Support from Clusters on Binary Data

C. Ordonez, Kai Zhao, Zhibo Chen

2008 IEEE International Conference on Data Mining Workshops > 609 - 618

The theoretical relationship between association rules and machine learning techniques needs to be studied in more depth. This article studies the use of clustering as a model for association rule mining. The clustering model is exploited to bound and estimate association rule support and confidence. We first study the efficient computation of the clustering model with K-means; we show the sufficient...
