2008 IEEE International Conference on Data Mining Workshops

chapter

Hunting for Coherent Co-clusters in High Dimensional and Noisy Datasets

M. Deodhar, J. Ghosh, G. Gupta, Hyuk Cho, et al.

2008 IEEE International Conference on Data Mining Workshops > 654 - 663

Clustering problems often involve datasets where only a part of the data is relevant to the problem, e.g., in microarray data analysis only a subset of the genes show cohesive expressions within a subset of the conditions/features. The existence of a large number of non-informative data points and features makes it challenging to hunt for coherent and meaningful clusters from such datasets. Additionally,...

chapter

Efficient Distance Computation Using SQL Queries and UDFs

S.K. Pitchaimalai, C. Ordonez, C. Garcia-Alvarado

2008 IEEE International Conference on Data Mining Workshops > 533 - 542

Distance computation is one of the most computationally intensive operations employed by many data mining algorithms. Performing such matrix computations within a DBMS creates many optimization challenges. We propose techniques to efficiently compute Euclidean distance using SQL queries and user-defined functions (UDFs). We concentrate on efficient Euclidean distance computation for the well-known...

chapter

Food Sales Prediction: "If Only It Knew What We Know"

P. Meulstee, M. Pechenizkiy

2008 IEEE International Conference on Data Mining Workshops > 134 - 143

Sales prediction is an important problem for different companies involved in manufacturing, logistics, marketing, wholesaling and retailing. Food companies are more concerned with sales prediction of products having a short shelf-life and seasonal changes in demand. The demand may depend on many hidden contexts, not given explicitly in the form of predictive features. Even if some changes are known...

chapter

A Vector-Geometry Based Spatial kNN-Algorithm for Traffic Frequency Predictions

M. May, D. Hecker, C. Korner, S. Scheider, et al.

2008 IEEE International Conference on Data Mining Workshops > 442 - 447

We introduce s-kNN, a nearest-neighbor-based spatial data mining algorithm. It belongs to the class of vector-geometry-based algorithms that reason on complex spatial objects instead of point measurements. In contrast to most methods in this class, it performs on-the-fly spatial computations that cannot be replaced by a pre-processing step without sacrificing efficiency. The key is a partial evaluation...

chapter

Extension of Partitional Clustering Methods for Handling Mixed Data

Y. Naija, S. Chakhar, K. Blibech, R. Robbana

2008 IEEE International Conference on Data Mining Workshops > 257 - 266

Clustering is an active research topic in data mining and different methods have been proposed in the literature. Most of these methods are based on the use of a distance measure defined either on numerical attributes or on categorical attributes. However, in fields such as road traffic and medicine, datasets are composed of numerical and categorical attributes. Recently, there have been several proposals...

chapter

Hierarchical Text Categorization in a Transductive Setting

M. Ceci

2008 IEEE International Conference on Data Mining Workshops > 184 - 191

Transductive learning is a learning setting that allows learning from "particular to particular" and considering both labelled and unlabelled examples when making classification decisions. In this paper, we investigate the use of transductive learning in the context of hierarchical text categorization. To this end, we exploit a modified version of an inductive hierarchical learning framework...

chapter

Clustering Events on Streams Using Complex Context Information

YongChul Kwon, Wing Yee Lee, M. Balazinska, Guiping Xu

2008 IEEE International Conference on Data Mining Workshops > 238 - 247

Monitoring applications play an increasingly important role in many domains. They detect events in monitored systems and take actions such as invoking a program or notifying an administrator. Often, administrators must then manually investigate events to figure out the source of a problem. Stream processing engines (SPEs) are general-purpose data management systems for monitoring applications. They provide...

chapter

k-Nearest Neighbor Classification on First-Order Logic Descriptions

S. Ferilli, M. Biba, T. Basile, N. Di Mauro, et al.

2008 IEEE International Conference on Data Mining Workshops > 202 - 210

Classical attribute-value descriptions induce a multi-dimensional geometric space. One way of computing the distance between descriptions in such a space consists in evaluating a Euclidean distance between tuples of coordinates. This is the ground on which a large part of the Machine Learning literature has built its methods and techniques. However, the complexity of some domains requires the use...

chapter

Risk Assessment of Atmospheric Hazard Releases Using K-Means Clustering

G. Cervone, P. Franzese, Y. Ezber, Z. Boybeyi

2008 IEEE International Conference on Data Mining Workshops > 342 - 348

Unsupervised machine learning algorithms are used to perform statistical analysis of several transport and dispersion model runs which simulate emissions from a fixed source under different atmospheric conditions. A clustering algorithm is used to automatically group the results of the transport and dispersion simulations according to their respective cloud characteristics. Each cluster of clouds...

chapter

Using Betweenness Centrality to Identify Manifold Shortcuts

W.J. Cukierski, D.J. Foran

2008 IEEE International Conference on Data Mining Workshops > 949 - 958

High-dimensional data presents a significant challenge to a broad spectrum of pattern recognition and machine-learning applications. Dimensionality reduction (DR) methods serve to remove unwanted variance and make such problems tractable. Several nonlinear DR methods, such as the well known ISOMAP algorithm, rely on a neighborhood graph to compute geodesic distances between data points. These graphs...

chapter

A Study on the Reliability of Case-Based Reasoning Systems

Ke Wang, J. Liu, Wei-min Ma

2008 IEEE International Conference on Data Mining Workshops > 60 - 68

Case-based reasoning (CBR) is a problem-solving methodology that suggests a solution to a new problem based on previously solved problems and their associated solutions. A key question in this methodology is whether we can always trust the solutions suggested by a case-based reasoning system. This paper first studies the reliability of CBR systems at an overall level. Factors affecting the reliability...

chapter

A Radar for the Internet

M. Latapy, C. Magnien, F. Ouedraogo

2008 IEEE International Conference on Data Mining Workshops > 901 - 908

In contrast with most Internet topology measurement research, our concern here is not to obtain a map as complete and precise as possible of the whole Internet. Instead, we claim that each machine's view of this topology, which we call the ego-centered view, is an object worthy of study in itself. We design and implement an ego-centered measurement tool, and perform radar-like measurements consisting of...

chapter

A New Method for Multi-view Face Clustering in Video Sequence

Panpan Huang, Yunhong Wang, Ming Shao

2008 IEEE International Conference on Data Mining Workshops > 869 - 873

In the problem of face clustering with multi-views, the similarity between faces of different persons with similar pose is usually greater than the similarity between multi-view faces of the same person. This may exert a tremendous impact on the clustering result that is sent back to the user. To solve this problem, we should do pose clustering first and then within each 'pose group', clustering...

chapter

Unifying Unknown Nodes in the Internet Graph Using Semisupervised Spectral Clustering

A. Almog, J. Goldberger, Y. Shavitt

2008 IEEE International Conference on Data Mining Workshops > 174 - 183

Most research on Internet topology is based on active measurement methods. A major difficulty in using these tools is that one comes across many unresponsive routers. Different methods of dealing with these anonymous nodes to preserve the connectivity of the real graph have been suggested. One of the more practical approaches involves using a placeholder for each unknown, resulting in multiple copies...

chapter

Detection and Exploration of Outlier Regions in Sensor Data Streams

C. Franke, M. Gertz

2008 IEEE International Conference on Data Mining Workshops > 375 - 384

Sensor networks play an important role in applications concerned with environmental monitoring, disaster management, and policy making. Effective and flexible techniques are needed to explore unusual environmental phenomena in sensor readings that are continuously streamed to applications. In this paper, we propose a framework that makes it possible to detect outlier sensors and to efficiently construct outlier...

chapter

Kernels for the Investigation of Localized Spatiotemporal Transitions of Drought with Support Vector Machines

M.W. Collier, A. McGovern

2008 IEEE International Conference on Data Mining Workshops > 359 - 368

We present and discuss several spatiotemporal kernels designed to mine real-life and simulated data in support of drought prediction. We implement and empirically validate these kernels for support vector machines. Issues related to the nature of geographic data such as autocorrelation and directionality are investigated.

chapter

Detecting Suspicious Behavior in Surveillance Images

D. Barbara, C. Domeniconi, Z. Duric, M. Filippone, et al.

2008 IEEE International Conference on Data Mining Workshops > 891 - 900

We introduce a novel technique to detect anomalies in images. The notion of normalcy is given by a baseline of images, under the assumption that the majority of such images is normal. The key to our approach is a featureless probabilistic representation of images, based on the length of the codeword necessary to represent each image. These codeword lengths are then used for anomaly detection based...
