2008 IEEE International Conference on Data Mining Workshops

chapter

Using Contextual Information to Decrease the Cost of Incorrect Predictions in On-line Customer Behavior Modeling

M. Gorgoglione, C. Palmisano, S. Lombardi

2008 IEEE International Conference on Data Mining Workshops > 780 - 788

The performance of user profiling models depends on both the predictive accuracy and the cost of incorrect predictions. In this paper we study whether including contextual information leads to a decrease in the misclassification cost. Several experimental analyses were done by varying the cost ratio, the market granularity and the granularity of context. The experimental results show that context...

chapter

An Adaptive Pre-filtering Technique for Error-Reduction Sampling in Active Learning

M. Davy, S. Luz

2008 IEEE International Conference on Data Mining Workshops > 682 - 691

Error-reduction sampling (ERS) is a high performing (but computationally expensive) query selection strategy for active learning. Subset optimisation has been proposed to reduce computational expense by applying ERS to only a subset of examples from the pool. This paper compares techniques used to construct the subset, namely random sub-sampling and pre-filtering. We focus on pre-filtering which populates...

chapter

Keyword Extraction Based on Lexical Chains and Word Co-occurrence for Chinese News Web Pages

Xinghua Li, Xindong Wu, Xuegang Hu, Fei Xie, et al.

2008 IEEE International Conference on Data Mining Workshops > 744 - 751

This paper presents a new keyword extraction algorithm for Chinese news Web pages using lexical chains and word co-occurrence combined with frequency features, cohesion features, and correlation features. A lexical chain is an external performance consistency by semantically related words of a text, and is the representation of the semantic content of a portion of the text. Word co-occurrence distribution...

chapter

Semantic Features for Multi-view Semi-supervised and Active Learning of Text Classification

Shiliang Sun

2008 IEEE International Conference on Data Mining Workshops > 731 - 735

For multi-view learning, existing methods usually exploit originally provided features for classifier training, which ignore the latent correlation between different views. In this paper, semantic features integrating information from multiple views are extracted for pattern representation. Canonical correlation analysis is used to learn the representation of semantic spaces where semantic features...

chapter

Efficient Distance Computation Using SQL Queries and UDFs

S.K. Pitchaimalai, C. Ordonez, C. Garcia-Alvarado

2008 IEEE International Conference on Data Mining Workshops > 533 - 542

Distance computation is one of the most computationally intensive operations employed by many data mining algorithms. Performing such matrix computations within a DBMS creates many optimization challenges. We propose techniques to efficiently compute Euclidean distance using SQL queries and user-defined functions (UDFs). We concentrate on efficient Euclidean distance computation for the well-known...

chapter

Co-training by Committee: A New Semi-supervised Learning Framework

M. Hady, F. Schwenker

2008 IEEE International Conference on Data Mining Workshops > 563 - 572

For many data mining applications, it is necessary to develop algorithms that use unlabeled data to improve the accuracy of supervised learning. Co-Training is a popular semi-supervised learning algorithm. It assumes that each example is represented by two or more redundantly sufficient sets of features (views) and that these views are independent given the class. However, these assumptions are not...

chapter

A Semi-supervised Learning Algorithm for Recognizing Sub-classes

R.R. Vatsavai, S. Shekhar, B. Bhaduri

2008 IEEE International Conference on Data Mining Workshops > 458 - 467

In many practical situations it is not feasible to collect labeled samples for all available classes in a domain. In supervised classification of remotely sensed images especially, it is impossible to collect ground truth information over large geographic regions for all thematic classes. As a result, analysts often collect labels for aggregate classes (e.g., Forest, Agriculture, Urban). In this paper...

chapter

Association Action Rules

Z.W. Ras, A. Dardzinska, L.-S. Tsay, H. Wasyluk

2008 IEEE International Conference on Data Mining Workshops > 283 - 290

Action rules describe possible transitions of objects from one state to another with respect to a distinguished attribute. Previous research on action rule discovery usually required the extraction of classification rules before constructing any action rule. This paper gives a new approach for generating association-type action rules. The notion of frequent action sets and Apriori-like strategy generating...

chapter

A Case Study on Classification Reliability

Honghua Dai

2008 IEEE International Conference on Data Mining Workshops > 69 - 73

The reliability of an induced classifier can be affected by several factors including the data oriented factors and the algorithm oriented factors. In some cases, the reliability could also be affected by knowledge oriented factors. In this paper, we analyze three special cases to examine the reliability of the discovered knowledge. Our case study results show that (1) in the cases of mining from...

chapter

One-Class Classification of Text Streams with Concept Drift

Yang Zhang, Xue Li, M. Orlowska

2008 IEEE International Conference on Data Mining Workshops > 116 - 125

Research on streaming data classification has been mostly based on the assumption that data can be fully labelled. However, this is impractical. Firstly it is impossible to make a complete labelling before all data has arrived. Secondly it is generally very expensive to obtain fully labelled data by using man power. Thirdly user interests may change with time so the labels issued earlier may be inconsistent...

chapter

TransRank: A Novel Algorithm for Transfer of Rank Learning

Depin Chen, Jun Yan, Gang Wang, Yan Xiong, et al.

2008 IEEE International Conference on Data Mining Workshops > 106 - 115

Recently, the learning-to-rank technique has attracted much attention. However, the lack of labeled training data seriously limits its application in real-world tasks. In this paper, we propose to break this bottleneck by considering the cross-domain "transfer of rank learning" problem. Simultaneously, we propose a novel algorithm called TransRank, which can effectively utilize the labeled data...

chapter

Hierarchical Text Categorization in a Transductive Setting

M. Ceci

2008 IEEE International Conference on Data Mining Workshops > 184 - 191

Transductive learning is the learning setting that permits learning from "particular to particular" and considers both labelled and unlabelled examples when taking classification decisions. In this paper, we investigate the use of transductive learning in the context of hierarchical text categorization. To this end, we exploit a modified version of an inductive hierarchical learning framework...

chapter

Clustering Events on Streams Using Complex Context Information

YongChul Kwon, Wing Yee Lee, M. Balazinska, Guiping Xu

2008 IEEE International Conference on Data Mining Workshops > 238 - 247

Monitoring applications play an increasingly important role in many domains. They detect events in monitored systems and take actions such as invoking a program or notifying an administrator. Often administrators must then manually investigate events to figure out the source of a problem. Stream processing engines (SPEs) are general purpose data management systems for monitoring applications. They provide...

chapter

Identification of Causal Variables for Building Energy Fault Detection by Semi-supervised LDA and Decision Boundary Analysis

K. Yoshida, M. Inui, T. Yairi, K. Machida, et al.

2008 IEEE International Conference on Data Mining Workshops > 164 - 173

This paper addresses the problem of identifying the causal variables of a system anomaly. In complicated real-world systems, even experts often fail to specify causal factors, so they attempt to detect the anomaly with exploratory heuristics. Our goal is to offer further information that supports anomaly cause analysis using the incomplete empirical knowledge. The proposed technique discovers responsible...

chapter

Chi-Square Test Based Decision Trees Induction in Distributed Environment

Jie Ouyang, N. Patel, I.K. Sethi

2008 IEEE International Conference on Data Mining Workshops > 477 - 485

Decision tree-based classification is a popular approach for pattern recognition and data mining. Most decision tree induction methods assume that the training data are present at one central location. Given the growth in distributed databases at geographically dispersed locations, methods for decision tree induction in distributed settings are gaining importance. This paper describes one distributed...

chapter

Harmonic Blind Sound Source Isolation Enhanced by Spectrum Clustering

Xin Zhang, Wenxin Jiang, Z.W. Ras

2008 IEEE International Conference on Data Mining Workshops > 310 - 319

Automatic indexing of music by instruments and their types is a challenging problem, especially when multiple instruments are playing at the same time. We have built a database containing more than one million musical instrument sounds, each described by a large number of features including standard MPEG7 audio descriptors, features for speech recognition, and many new audio features developed by...

chapter

GeoDMA - A Novel System for Spatial Data Mining

T.S. Korting, L.M.G. Fonseca, M.I.S. Escada, F.C. da Silva, et al.

2008 IEEE International Conference on Data Mining Workshops > 975 - 978

Although a huge amount of remote sensing data has been provided by Earth observation satellites, few data manipulation techniques and information extraction in large data sets have been developed. In this context, the present paper aims to show a new system for spatial data mining, and two test cases applied to land use change in the Brazilian Amazon region. We present the operational environment...

chapter

Human Action Recognition by Radon Transform

Yan Chen, Qiang Wu, Xiangjian He

2008 IEEE International Conference on Data Mining Workshops > 862 - 868

A new feature description is used for human behaviour representation and recognition. The feature is based on Radon transforms of extracted silhouettes. Key postures are selected based on the Radon transform. Key postures are combined to construct an action template for each sequence. Linear discriminant analysis (LDA) is applied to the set of key postures to obtain low dimensional feature vectors...

chapter

Estimating True and False Positive Rates in Higher Dimensional Problems and Its Data Mining Applications

A. Foss, O.R. Zaiane

2008 IEEE International Conference on Data Mining Workshops > 673 - 681

If we can estimate the accuracy of our observations then we can estimate the true and false positive rates over a series of samples in high dimensional data mining problems. To date such issues have been largely neglected and previously no algorithm has been provided to facilitate the computations involved. In high dimensional data mining tasks, increasing sparsity leads to decreasing true positive...

chapter

ZCS Revisited: Zeroth-Level Classifier Systems for Data Mining

F.A. Tzima, P.A. Mitkas

2008 IEEE International Conference on Data Mining Workshops > 700 - 709

Learning classifier systems (LCS) are machine learning systems designed to work for both multi-step and single-step decision tasks. The latter case presents an interesting, though not widely studied, challenge for such algorithms, especially when they are applied to real-world data mining problems. The present investigation departs from the popular approach of applying accuracy-based LCS to data mining...

Keywords:
CLASSIFICATION ALGORITHMS

Keywords

DATA MINING (18)
TRAINING (13)
LEARNING (ARTIFICIAL INTELLIGENCE) (11)
PATTERN CLASSIFICATION (11)
ACCURACY (9)
FEATURE EXTRACTION (7)
CLUSTERING ALGORITHMS (5)
DECISION TREES (5)
PATTERN CLUSTERING (5)
TEXT ANALYSIS (5)
TRAINING DATA (5)
CLASSIFICATION (4)
CONFERENCES (4)
DISTANCE MEASUREMENT (4)
IMAGE CLASSIFICATION (4)
KERNEL (4)
QUERY PROCESSING (4)
ALGORITHM DESIGN AND ANALYSIS (3)
ASSOCIATION RULES (3)
CLASSIFICATION TREE ANALYSIS (3)
DATABASES (3)
LABELING (3)
REMOTE SENSING (3)
TEXT CATEGORIZATION (3)
ACTIVE LEARNING (2)
BENCHMARK TESTING (2)
BUILDINGS (2)
CLUSTERING (2)
CO-TRAINING (2)
ESTIMATION (2)
GEOPHYSICAL SIGNAL PROCESSING (2)
HUMANS (2)
IMAGE SEQUENCES (2)
INDEXES (2)
INFORMATION SYSTEMS (2)
KNOWLEDGE DISCOVERY (2)
LEARNING SYSTEMS (2)
MACHINE LEARNING (2)
MATHEMATICAL MODEL (2)
MERGING (2)
NICKEL (2)
OPTIMIZATION (2)
PREDICTION ALGORITHMS (2)
PREDICTIVE MODELS (2)
SEMI-SUPERVISED LEARNING (2)
SET THEORY (2)
STATISTICAL ANALYSIS (2)
SUPPORT VECTOR MACHINES (2)
TESTING (2)
VECTORS (2)
WEB PAGES (2)
ACTION RECOGNITION (1)
ACTION RULE DISCOVERY (1)
ACTION TEMPLATE (1)
ACTIONABLE PATTERNS (1)
ADABOOST (1)
ADAPTIVE FILTERS (1)
ADAPTIVE PREFILTERING TECHNIQUE (1)
ADJACENT SPECTRAL BANDS (1)
AGGREGATES (1)
AGRICULTURE (1)
ANOMALY DETECTION (1)
APRIORI-LIKE STRATEGY (1)
ARTIFICIAL SYSTEMS (1)
ARUBAS-SCHEFFER ALGORITHM (1)
ASIA (1)
ASSOCIATION ACTION RULES (1)
ASSOCIATION RULE MINING ALGORITHM (1)
ASSOCIATION RULE-BASED SIMILARITY FRAMEWORK (1)
ASSOCIATIVE CLASSIFIER (1)
AUDIO CODING (1)
AUDIO DATABASES (1)
AUDIO FEATURE (1)
AUTOMATIC INDEXING (1)
AUTOMATIC MUSIC INDEXING (1)
BAGGING (1)
BALANCED RANDOMIZED CLASS DISTRIBUTION (1)
BAND PASS FILTERS (1)
BENCHMARK TEXT CATEGORISATION DATASETS (1)
BINARY TREES (1)
BLIND SOUND SOURCE SEPARATION (1)
BLIND SOUND SOURCE SEPARATION ALGORITHM (1)
BLIND SOURCE SEPARATION (1)
BOOSTING (1)
BRAZILIAN AMAZON REGION (1)
BUILDING ENERGY FAULT DETECTION (1)
BUILDING ENERGY FAULT DIAGNOSIS (1)
BUILDING MANAGEMENT SYSTEMS (1)
CATEGORY THEORY (1)
CAUSAL VARIABLE IDENTIFICATION (1)
CDM (1)
CHAID ALGORITHM (1)
CHI SQUARE TEST (1)
CHI-SQUARE TEST (1)
CHINESE NEWS WEB PAGE (1)
CLASSIFICATION LEARNING ALGORITHM (1)
CLASSIFICATION RELIABILITY (1)
CLASSIFICATION TECHNIQUE (1)
CLASSIFIER TRAINING (1)