2008 IEEE International Conference on Data Mining Workshops

chapter

Wavelet-Based Data Perturbation for Simultaneous Privacy-Preserving and Statistics-Preserving

Lian Liu, Jie Wang, Jun Zhang

2008 IEEE International Conference on Data Mining Workshops > 27 - 35

With the rapid development of data mining technologies, preserving the privacy of certain data has become a challenge for data mining applications in many fields, especially the medical, financial and homeland security fields. We present a privacy-preserving strategy based on wavelet perturbation that simultaneously maintains data privacy, the statistical properties of the data, and data mining utility. Our mathematical...

chapter

Online Reliability Estimates for Individual Predictions in Data Streams

P.P. Rodrigues, J. Gama, Z. Bosnic

2008 IEEE International Conference on Data Mining Workshops > 36 - 45

Several predictive systems are nowadays vital for operations and decision support. The quality of these systems is usually defined by their average accuracy, which carries little or no information about the estimated error of each individual prediction. In many sensitive applications, users should be allowed to associate a measure of reliability with each prediction. In the case of batch systems,...

chapter

A Comparative Study of Data Sampling and Cost Sensitive Learning

C. Seiffert, T.M. Khoshgoftaar, J. Van Hulse, A. Napolitano

2008 IEEE International Conference on Data Mining Workshops > 46 - 52

Two common challenges that data mining and machine learning practitioners face in many application domains are unequal classification costs and class imbalance. Most traditional data mining techniques attempt to maximize overall accuracy rather than minimize cost. When data is imbalanced, such techniques result in models that highly favor the overrepresented class, the class which typically carries a...

chapter

Region Classification with Decision Trees

J. van Prehn, E.N. Smirnov

2008 IEEE International Conference on Data Mining Workshops > 53 - 59

The region-classification task is to construct class regions containing the correct classes of the objects being classified with a given probability. To turn a point classifier into a region classifier, the conformal framework is used. However, applying the framework requires a non-conformity function. This function estimates the instances' non-conformity for the point classifier used. This paper...

chapter

A Study on the Reliability of Case-Based Reasoning Systems

Ke Wang, J. Liu, Wei-min Ma

2008 IEEE International Conference on Data Mining Workshops > 60 - 68

Case-based reasoning (CBR) is a methodology for problem solving that suggests a solution to a new problem based on previously solved problems and their associated solutions. A key question in this methodology is whether we can always trust the solutions suggested by a case-based reasoning system. This paper first studies the reliability of CBR systems at an overall level. Factors affecting the reliability...

chapter

A Case Study on Classification Reliability

Honghua Dai

2008 IEEE International Conference on Data Mining Workshops > 69 - 73

The reliability of an induced classifier can be affected by several factors, including data-oriented factors and algorithm-oriented factors. In some cases, the reliability could also be affected by knowledge-oriented factors. In this paper, we analyze three special cases to examine the reliability of the discovered knowledge. Our case study results show that (1) in the cases of mining from...

chapter

Domain Driven Data Mining (D3M)

Longbing Cao

2008 IEEE International Conference on Data Mining Workshops > 74 - 76

When deploying data mining in real-world business, we have to cater for business scenarios, organizational factors, user preferences and business needs. However, current data mining algorithms and tools often stop at the delivery of patterns satisfying expected technical interestingness. Business people are not told how to act on the technical deliverables, or what to do with them. The...

chapter

Parameter Tuning for Differential Mining of String Patterns

J. Besson, C. Rigotti, I. Mitasiunaite, J.-F. Boulicaut

2008 IEEE International Conference on Data Mining Workshops > 77 - 86

Constraint-based mining has been proven to be extremely useful for supporting actionable pattern discovery. However, useful conjunctions of constraints that support domain-driven mining tasks generally require setting several parameter values, and how to tune these parameters remains a fairly open question. We study this problem for substring pattern discovery, when using a conjunction of maximal frequency, minimal...

chapter

Behavior Informatics and Analytics: Let Behavior Talk

Longbing Cao

2008 IEEE International Conference on Data Mining Workshops > 87 - 96

Behavior is increasingly recognized as a key component in business intelligence and problem-solving. Unlike traditional behavior analysis, which mainly focuses on implicit behavior and explicit business appearance as a result of business usage and customer demographics, this paper proposes the field of Behavior Informatics and Analytics (BIA), to support explicit behavior involvement through...

chapter

Scoring Models for Insurance Risk Sharing Pool Optimization

N. Chapados, C. Dugas, P. Vincent, R. Ducharme

2008 IEEE International Conference on Data Mining Workshops > 97 - 105

We introduce a flexible scoring model that can be used by property and casualty insurers that have access to a risk-sharing pool to better select the insureds to transfer to the pool. The model discriminates between insureds whose transfer is likely to be profitable under the pool regulations and those paying a fair premium. This model makes use of feature selection methods to automatically discover...

chapter

TransRank: A Novel Algorithm for Transfer of Rank Learning

Depin Chen, Jun Yan, Gang Wang, Yan Xiong, et al.

2008 IEEE International Conference on Data Mining Workshops > 106 - 115

Recently, the learning-to-rank technique has attracted much attention. However, the lack of labeled training data seriously limits its application in real-world tasks. In this paper, we propose to break this bottleneck by considering the cross-domain "transfer of rank learning" problem. We also propose a novel algorithm called TransRank, which can effectively utilize the labeled data...

chapter

One-Class Classification of Text Streams with Concept Drift

Yang Zhang, Xue Li, M. Orlowska

2008 IEEE International Conference on Data Mining Workshops > 116 - 125

Research on streaming data classification has been mostly based on the assumption that data can be fully labelled. However, this is impractical. Firstly, it is impossible to make a complete labelling before all data has arrived. Secondly, it is generally very expensive to obtain fully labelled data using manpower. Thirdly, user interests may change with time, so the labels issued earlier may be inconsistent...

chapter

Post-Processing of Discovered Association Rules Using Ontologies

C. Marinica, F. Guillet, H. Briand

2008 IEEE International Conference on Data Mining Workshops > 126 - 133

In data mining, the usefulness of association rules is strongly limited by the huge number of delivered rules. In this paper we propose a new approach to prune and filter discovered rules. Using domain ontologies, we strengthen the integration of user knowledge in the post-processing task. Furthermore, an interactive and iterative framework is designed to assist the user throughout the analysis task....

chapter

Food Sales Prediction: "If Only It Knew What We Know"

P. Meulstee, M. Pechenizkiy

2008 IEEE International Conference on Data Mining Workshops > 134 - 143

Sales prediction is an important problem for different companies involved in manufacturing, logistics, marketing, wholesaling and retailing. Food companies are more concerned with sales prediction of products having a short shelf-life and seasonal changes in demand. The demand may depend on many hidden contexts, not given explicitly in the form of predictive features. Even if some changes are known...

chapter

Discovering Implicit Redundancies in Network Communications for Detecting Inconsistent Values

B.T. Nassu, T. Nanya, H. Nakamura

2008 IEEE International Conference on Data Mining Workshops > 144 - 153

Detecting inconsistent values received in a communication is a challenging problem faced in networked systems. Inconsistent values occur when a message contains incorrect data, even though the syntax is correct and there is no corruption due to transmission errors. In many cases, traditional schemes based on voting protocols or error detection codes cannot be used. An alternative is discovering implicit...

chapter

Actionable Knowledge Discovery for Threats Intelligence Support Using a Multi-dimensional Data Mining Methodology

O. Thonnard, M. Dacier

2008 IEEE International Conference on Data Mining Workshops > 154 - 163

This paper describes a multi-dimensional knowledge discovery and data mining (KDD) methodology that aims at discovering actionable knowledge related to Internet threats, taking into account domain expert guidance and the integration of domain-specific intelligence during the data mining process. The objectives are twofold: i) to develop global indicators for assessing the prevalence of certain malicious...

chapter

Identification of Causal Variables for Building Energy Fault Detection by Semi-supervised LDA and Decision Boundary Analysis

K. Yoshida, M. Inui, T. Yairi, K. Machida, et al.

2008 IEEE International Conference on Data Mining Workshops > 164 - 173

This paper addresses the problem of identifying causal variables for system anomalies. In complicated real-world systems, even experts often fail to specify causal factors, so they attempt to detect the anomaly with exploratory heuristics. Our goal is to offer further information that supports anomaly cause analysis using the incomplete empirical knowledge. The proposed technique discovers responsible...

chapter

Unifying Unknown Nodes in the Internet Graph Using Semisupervised Spectral Clustering

A. Almog, J. Goldberger, Y. Shavitt

2008 IEEE International Conference on Data Mining Workshops > 174 - 183

Most research on Internet topology is based on active measurement methods. A major difficulty in using these tools is that one comes across many unresponsive routers. Different methods of dealing with these anonymous nodes to preserve the connectivity of the real graph have been suggested. One of the more practical approaches involves using a placeholder for each unknown, resulting in multiple copies...

chapter

Hierarchical Text Categorization in a Transductive Setting

M. Ceci

2008 IEEE International Conference on Data Mining Workshops > 184 - 191

Transductive learning is the learning setting that permits learning from "particular to particular" and considering both labelled and unlabelled examples when making classification decisions. In this paper, we investigate the use of transductive learning in the context of hierarchical text categorization. To this end, we exploit a modified version of an inductive hierarchical learning framework...

chapter

Towards Combining Structured Pattern Mining and Graph Kernels

F. Costa, B. Bringmann

2008 IEEE International Conference on Data Mining Workshops > 192 - 201

This paper presents a novel approach to feature construction for structured data in order to enhance graph prediction classification performance. To this end we combine graph mining techniques with graph kernel based classifiers. The main idea is to employ efficient mining techniques to extract a set of patterns correlated with the target concept and use these, or a selected subset of these, to annotate...
