2008 IEEE International Conference on Data Mining Workshops

chapter

Comparing Accuracies of Rule Evaluation Models to Determine Human Criteria on Evaluated Rule Sets

H. Abe, S. Tsumoto

2008 IEEE International Conference on Data Mining Workshops > 1 - 7

In data mining post-processing, rule selection using objective rule evaluation indices is a useful method for finding valuable knowledge in mined patterns. However, the relationship between an index value and experts' criteria has never been clarified. In this study, we have compared the accuracies of classification learning algorithms for datasets with randomized class distributions and...

chapter

Online Reliability Estimates for Individual Predictions in Data Streams

P.P. Rodrigues, J. Gama, Z. Bosnic

2008 IEEE International Conference on Data Mining Workshops > 36 - 45

Several predictive systems are nowadays vital for operations and decision support. The quality of these systems is most often defined by their average accuracy, which conveys little or no information about the estimated error of each individual prediction. In many sensitive applications, users should be allowed to associate a measure of reliability with each prediction. In the case of batch systems,...

chapter

A Case Study on Classification Reliability

Honghua Dai

2008 IEEE International Conference on Data Mining Workshops > 69 - 73

The reliability of an induced classifier can be affected by several factors, including data-oriented and algorithm-oriented factors. In some cases, the reliability could also be affected by knowledge-oriented factors. In this paper, we analyze three special cases to examine the reliability of the discovered knowledge. Our case study results show that (1) in the cases of mining from...

chapter

One-Class Classification of Text Streams with Concept Drift

Yang Zhang, Xue Li, M. Orlowska

2008 IEEE International Conference on Data Mining Workshops > 116 - 125

Research on streaming data classification has mostly been based on the assumption that data can be fully labelled. However, this is impractical. Firstly, it is impossible to make a complete labelling before all data has arrived. Secondly, it is generally very expensive to obtain fully labelled data using manpower. Thirdly, user interests may change with time, so the labels issued earlier may be inconsistent...

chapter

Food Sales Prediction: "If Only It Knew What We Know"

P. Meulstee, M. Pechenizkiy

2008 IEEE International Conference on Data Mining Workshops > 134 - 143

Sales prediction is an important problem for different companies involved in manufacturing, logistics, marketing, wholesaling and retailing. Food companies are more concerned with sales prediction of products having a short shelf-life and seasonal changes in demand. The demand may depend on many hidden contexts, not given explicitly in the form of predictive features. Even if some changes are known...

chapter

Semi-supervised Collaborative Clustering with Partial Background Knowledge

G. Forestier, C. Wemmert, P. Gancarski

2008 IEEE International Conference on Data Mining Workshops > 211 - 217

In this paper, we present a new algorithm for semi-supervised clustering. We assume that a small set of labeled samples is available, and we use it in a clustering algorithm to discover relevant patterns. We study how our algorithm performs against two other semi-supervised algorithms when the data are multimodal. Then, we study the case where the user is able to produce a few samples for some classes but not for each...

chapter

Plant Protein Localization Using Discriminative and Frequent Partition-Based Subsequences

S.V. Jazayeri, O.R. Zaiane

2008 IEEE International Conference on Data Mining Workshops > 228 - 237

The function of proteins in the living cells varies with respect to their localizations. Extracellular plant proteins are responsible for vital functions such as nutrition acquisition, protection from pathogens, communication with other soil organisms, etc. Hence, characterizing these proteins and distinguishing them from intracellular proteins is of high interest to biologists. Nonetheless, the small...

chapter

Investigation of Various Matrix Factorization Methods for Large Recommender Systems

G. Takacs, I. Pilaszy, B. Nemeth, D. Tikk

2008 IEEE International Conference on Data Mining Workshops > 553 - 562

Matrix factorization (MF) based approaches have proven to be efficient for rating-based recommendation systems. In this work, we propose several matrix factorization approaches with improved prediction accuracy. We introduce a novel and fast (semi)-positive MF approach that approximates the features by using positive values for either users or items. We describe a momentum-based MF approach. A transductive...

chapter

Rules Extraction from Multiple Decisions Ordered Information Tables

Bin Shen, Min Yao, Zhaohui Wu

2008 IEEE International Conference on Data Mining Workshops > 573 - 582

Ordered information tables are one of the most important research areas of granular computing. In this thesis, we introduce multiple decisions ordered information tables based on the concept of ordered information tables. Multiple decisions ordered information tables are used to describe real-world situations that involve multiple decision attributes. We study the process of rule extraction from multiple...

chapter

An Adaptive Pre-filtering Technique for Error-Reduction Sampling in Active Learning

M. Davy, S. Luz

2008 IEEE International Conference on Data Mining Workshops > 682 - 691

Error-reduction sampling (ERS) is a high-performing (but computationally expensive) query selection strategy for active learning. Subset optimisation has been proposed to reduce computational expense by applying ERS to only a subset of examples from the pool. This paper compares techniques used to construct the subset, namely random sub-sampling and pre-filtering. We focus on pre-filtering, which populates...

chapter

ARUBAS: An Association Rule Based Similarity Framework for Associative Classifiers

B. Depaire, K. Vanhoof, G. Wets

2008 IEEE International Conference on Data Mining Workshops > 692 - 699

This article introduces ARUBAS, a new framework to build associative classifiers. In contrast with many existing associative classifiers, it uses class association rules to transform the feature space and uses instance-based reasoning to classify new instances. The framework allows the researcher to use any association rule mining algorithm to produce the class association rules. Every aspect of the...

chapter

Semantic Features for Multi-view Semi-supervised and Active Learning of Text Classification

Shiliang Sun

2008 IEEE International Conference on Data Mining Workshops > 731 - 735

For multi-view learning, existing methods usually exploit the originally provided features for classifier training, which ignores the latent correlation between different views. In this paper, semantic features integrating information from multiple views are extracted for pattern representation. Canonical correlation analysis is used to learn the representation of semantic spaces where semantic features...

chapter

Using Contextual Information to Decrease the Cost of Incorrect Predictions in On-line Customer Behavior Modeling

M. Gorgoglione, C. Palmisano, S. Lombardi

2008 IEEE International Conference on Data Mining Workshops > 780 - 788

The performance of user profiling models depends on both the predictive accuracy and the cost of incorrect predictions. In this paper we study whether including contextual information leads to a decrease in the misclassification cost. Several experimental analyses were done by varying the cost ratio, the market granularity and the granularity of context. The experimental results show that context...

chapter

G-REX: A Versatile Framework for Evolutionary Data Mining

R. Konig, U. Johansson, L. Niklasson

2008 IEEE International Conference on Data Mining Workshops > 971 - 974

This paper presents G-REX, a versatile data mining framework based on genetic programming. What distinguishes G-REX from other GP frameworks is that it doesn't strive to be a general purpose framework. This allows G-REX to include more functionality specific to data mining, such as preprocessing, evaluation and optimization methods, but also a multitude of predefined classification and regression models. Examples...

chapter

A Data Stream Mining System

H. Thakkar, B. Mozafari, C. Zaniolo

2008 IEEE International Conference on Data Mining Workshops > 987 - 990

On-line data stream mining has attracted much research interest, but systems that can be used as a workbench for online mining have not been researched, since they pose many difficult research challenges. The proposed system addresses these challenges with an architecture based on three main technical advances: (i) introduction of new constructs and synoptic data structures whereby complex KDD queries...
