2008 IEEE International Conference on Data Mining Workshops

chapter

Comparing Accuracies of Rule Evaluation Models to Determine Human Criteria on Evaluated Rule Sets

H. Abe, S. Tsumoto

2008 IEEE International Conference on Data Mining Workshops > 1 - 7

In data mining post-processing, rule selection using objective rule evaluation indices is a useful method for finding valuable knowledge in mined patterns. However, the relationship between an index value and experts' criteria has never been clarified. In this study, we have compared the accuracies of classification learning algorithms for datasets with randomized class distributions and...

chapter

Online Reliability Estimates for Individual Predictions in Data Streams

P.P. Rodrigues, J. Gama, Z. Bosnic

2008 IEEE International Conference on Data Mining Workshops > 36 - 45

Several predictive systems are nowadays vital for operations and decision support. The quality of these systems is most often defined by their average accuracy, which conveys little or no information about the estimated error of each individual prediction. In many sensitive applications, users should be allowed to associate a measure of reliability with each prediction. In the case of batch systems,...

chapter

A Comparative Study of Data Sampling and Cost Sensitive Learning

C. Seiffert, T.M. Khoshgoftaar, J. Van Hulse, A. Napolitano

2008 IEEE International Conference on Data Mining Workshops > 46 - 52

Two common challenges data mining and machine learning practitioners face in many application domains are unequal classification costs and class imbalance. Most traditional data mining techniques attempt to maximize overall accuracy rather than minimize cost. When data is imbalanced, such techniques result in models that highly favor the overrepresented class, the class which typically carries a...

chapter

TransRank: A Novel Algorithm for Transfer of Rank Learning

Depin Chen, Jun Yan, Gang Wang, Yan Xiong, et al.

2008 IEEE International Conference on Data Mining Workshops > 106 - 115

Recently, learning to rank technique has attracted much attention. However, the lack of labeled training data seriously limits its application in real-world tasks. In this paper, we propose to break this bottleneck by considering the cross-domain "transfer of rank learning" problem. Simultaneously, we propose a novel algorithm called TransRank, which can effectively utilize the labeled data...

chapter

Food Sales Prediction: "If Only It Knew What We Know"

P. Meulstee, M. Pechenizkiy

2008 IEEE International Conference on Data Mining Workshops > 134 - 143

Sales prediction is an important problem for different companies involved in manufacturing, logistics, marketing, wholesaling and retailing. Food companies are more concerned with sales prediction of products having a short shelf-life and seasonal changes in demand. The demand may depend on many hidden contexts, not given explicitly in the form of predictive features. Even if some changes are known...

chapter

Hierarchical Text Categorization in a Transductive Setting

M. Ceci

2008 IEEE International Conference on Data Mining Workshops > 184 - 191

Transductive learning is the learning setting that permits learning from "particular to particular" and considering both labelled and unlabelled examples when taking classification decisions. In this paper, we investigate the use of transductive learning in the context of hierarchical text categorization. To this end, we exploit a modified version of an inductive hierarchical learning framework...

chapter

k-Nearest Neighbor Classification on First-Order Logic Descriptions

S. Ferilli, M. Biba, T. Basile, N. Di Mauro, et al.

2008 IEEE International Conference on Data Mining Workshops > 202 - 210

Classical attribute-value descriptions induce a multi-dimensional geometric space. One way of computing the distance between descriptions in such a space consists in evaluating a Euclidean distance between tuples of coordinates. This is the ground on which a large part of the Machine Learning literature has built its methods and techniques. However, the complexity of some domains requires the use...

chapter

Semi-supervised Collaborative Clustering with Partial Background Knowledge

G. Forestier, C. Wemmert, P. Gancarski

2008 IEEE International Conference on Data Mining Workshops > 211 - 217

In this paper we present a new algorithm for semi-supervised clustering. We assume that a small set of labeled samples is available and use it in a clustering algorithm to discover relevant patterns. We study how our algorithm performs against two other semi-supervised algorithms when the data are multimodal. Then, we study the case where the user is able to produce few samples for some classes but not for each...

chapter

Harmonic Blind Sound Source Isolation Enhanced by Spectrum Clustering

Xin Zhang, Wenxin Jiang, Z.W. Ras

2008 IEEE International Conference on Data Mining Workshops > 310 - 319

Automatic indexing of music by instruments and their types is a challenging problem, especially when multiple instruments are playing at the same time. We have built a database containing more than one million music instrument sounds, each described by a large number of features including standard MPEG7 audio descriptors, features for speech recognition, and many new audio features developed by...

chapter

Risk Assessment of Atmospheric Hazard Releases Using K-Means Clustering

G. Cervone, P. Franzese, Y. Ezber, Z. Boybeyi

2008 IEEE International Conference on Data Mining Workshops > 342 - 348

Unsupervised machine learning algorithms are used to perform statistical analysis of several transport and dispersion model runs which simulate emissions from a fixed source under different atmospheric conditions. A clustering algorithm is used to automatically group the results of the transport and dispersion simulations according to their respective cloud characteristics. Each cluster of clouds...

chapter

Scalable Sparse Bayesian Network Learning for Spatial Applications

T. Liebig, C. Korner, M. May

2008 IEEE International Conference on Data Mining Workshops > 420 - 425

Traffic routes through a street network contain patterns and are not random walks. Such patterns exist, for instance, along streets or between neighbouring street segments. The extraction of these patterns is a challenging task due to the enormous size of city street networks, the large amount of required training data and the unknown distribution of the latter. We apply Bayesian Networks to model the...

chapter

A Semi-supervised Learning Algorithm for Recognizing Sub-classes

R.R. Vatsavai, S. Shekhar, B. Bhaduri

2008 IEEE International Conference on Data Mining Workshops > 458 - 467

In many practical situations it is not feasible to collect labeled samples for all available classes in a domain. Especially in supervised classification of remotely sensed images, it is impossible to collect ground truth information over large geographic regions for all thematic classes. As a result, analysts often collect labels for aggregate classes (e.g., Forest, Agriculture, Urban). In this paper...

chapter

Co-training by Committee: A New Semi-supervised Learning Framework

M. Hady, F. Schwenker

2008 IEEE International Conference on Data Mining Workshops > 563 - 572

For many data mining applications, it is necessary to develop algorithms that use unlabeled data to improve the accuracy of the supervised learning. Co-Training is a popular semi-supervised learning algorithm. It assumes that each example is represented by two or more redundantly sufficient sets of features (views) and these views are independent given the class. However, these assumptions are not...

chapter

An Adaptive Pre-filtering Technique for Error-Reduction Sampling in Active Learning

M. Davy, S. Luz

2008 IEEE International Conference on Data Mining Workshops > 682 - 691

Error-reduction sampling (ERS) is a high performing (but computationally expensive) query selection strategy for active learning. Subset optimisation has been proposed to reduce computational expense by applying ERS to only a subset of examples from the pool. This paper compares techniques used to construct the subset, namely random sub-sampling and pre-filtering. We focus on pre-filtering which populates...

chapter

ZCS Revisited: Zeroth-Level Classifier Systems for Data Mining

F.A. Tzima, P.A. Mitkas

2008 IEEE International Conference on Data Mining Workshops > 700 - 709

Learning classifier systems (LCS) are machine learning systems designed to work for both multi-step and single-step decision tasks. The latter case presents an interesting, though not widely studied, challenge for such algorithms, especially when they are applied to real-world data mining problems. The present investigation departs from the popular approach of applying accuracy-based LCS to data mining...

chapter

The Set Classification Problem and Solution Methods

Xia Ning, G. Karypis

2008 IEEE International Conference on Data Mining Workshops > 720 - 729

This paper focuses on developing classification algorithms for problems in which there is a need to predict the class based on multiple observations (examples) of the same phenomenon (class). These problems give rise to a new classification problem, referred to as set classification, that requires the prediction of a set of instances given the prior knowledge that all the instances of the set belong...

chapter

Semantic Features for Multi-view Semi-supervised and Active Learning of Text Classification

Shiliang Sun

2008 IEEE International Conference on Data Mining Workshops > 731 - 735

For multi-view learning, existing methods usually exploit originally provided features for classifier training, which ignore the latent correlation between different views. In this paper, semantic features integrating information from multiple views are extracted for pattern representation. Canonical correlation analysis is used to learn the representation of semantic spaces where semantic features...

chapter

Semantic Concept Learning through Massive Internet Video Mining

Peijiang Yuan, Bo Zhang, Jianmin Li

2008 IEEE International Conference on Data Mining Workshops > 847 - 853

Semantic concept learning is one of the most challenging problems in video retrieval. The key barrier for semantic concept learning is the lack of annotated training data. Internet videos are different from ordinary videos: massive, rich information, customized, non-uniform format, uneven quality, little descriptive text, only a few shots with limited length etc. Therefore, the Internet is a potential repository...

chapter

Using Betweenness Centrality to Identify Manifold Shortcuts

W.J. Cukierski, D.J. Foran

2008 IEEE International Conference on Data Mining Workshops > 949 - 958

High-dimensional data presents a significant challenge to a broad spectrum of pattern recognition and machine-learning applications. Dimensionality reduction (DR) methods serve to remove unwanted variance and make such problems tractable. Several nonlinear DR methods, such as the well-known ISOMAP algorithm, rely on a neighborhood graph to compute geodesic distances between data points. These graphs...

