2008 IEEE International Conference on Data Mining Workshops

Items from 81 to 100 out of 139 results

chapter

Distributed Linear Programming and Resource Management for Data Mining in Distributed Environments

H. Dutta, H. Kargupta

2008 IEEE International Conference on Data Mining Workshops > 543 - 552

Advances in computing and communication have resulted in very large-scale distributed environments in recent years. They are capable of storing large volumes of data and often have multiple compute nodes. However, the inherent heterogeneity of data components, the dynamic nature of distributed systems, the need for information synchronization and data fusion over a network and security and access control...

chapter

Investigation of Various Matrix Factorization Methods for Large Recommender Systems

G. Takacs, I. Pilaszy, B. Nemeth, D. Tikk

2008 IEEE International Conference on Data Mining Workshops > 553 - 562

Matrix factorization (MF) based approaches have proven to be efficient for rating-based recommendation systems. In this work, we propose several matrix factorization approaches with improved prediction accuracy. We introduce a novel and fast (semi)-positive MF approach that approximates the features by using positive values for either users or items. We describe a momentum-based MF approach. A transductive...

chapter

Co-training by Committee: A New Semi-supervised Learning Framework

M. Hady, F. Schwenker

2008 IEEE International Conference on Data Mining Workshops > 563 - 572

For many data mining applications, it is necessary to develop algorithms that use unlabeled data to improve the accuracy of supervised learning. Co-Training is a popular semi-supervised learning algorithm. It assumes that each example is represented by two or more redundantly sufficient sets of features (views) and that these views are independent given the class. However, these assumptions are not...

chapter

Rules Extraction from Multiple Decisions Ordered Information Tables

Bin Shen, Min Yao, Zhaohui Wu

2008 IEEE International Conference on Data Mining Workshops > 573 - 582

Ordered information tables are one of the most important research areas of granular computing. In this thesis, we introduce multiple decisions ordered information tables based on the concept of ordered information tables. Multiple decisions ordered information tables are used to describe real-world situations that involve multiple decision attributes. We study the process of rule extraction from multiple...

chapter

An Efficient Sequential Pattern Mining Algorithm Based on the 2-Sequence Matrix

Chia-Ying Hsieh, Don-Lin Yang, Jungpin Wu

2008 IEEE International Conference on Data Mining Workshops > 583 - 591

Sequential pattern mining has become more and more popular in recent years due to its wide applications and the fact that it can find more information than association rules. Two famous algorithms in sequential pattern mining are AprioriAll and PrefixSpan. These two algorithms not only need to scan a database or projected-databases many times, but also require setting a minimal support threshold to...

chapter

Mining Allocating Patterns in One-Sum Weighted Items

Y.J. Wang, Xinwei Zheng, F. Coenen, C.Y. Li

2008 IEEE International Conference on Data Mining Workshops > 592 - 598

An association rule (AR) is a common knowledge model in data mining that describes an implicative co-occurring relationship between two disjoint sets of binary-valued transaction database attributes (items), expressed in the form of an "antecedent ⇒ consequent" rule. A variant of the AR is the weighted association rule (WAR). With regard to a marketing context, this paper introduces a...

chapter

Remarks to Logical Aspects of Measures of Interestingness of Association Rules

J. Rauch

2008 IEEE International Conference on Data Mining Workshops > 599 - 608

Relations of logical calculi of association rules to measures of interestingness of association rules are studied. Logical calculi of association rules, 4ft-quantifiers and important classes of association rules are briefly introduced. New 4ft-quantifiers and association rules are defined by applications of suitable thresholds to several known measures of interestingness. It is proved that some of...

chapter

Bounding and Estimating Association Rule Support from Clusters on Binary Data

C. Ordonez, Kai Zhao, Zhibo Chen

2008 IEEE International Conference on Data Mining Workshops > 609 - 618

The theoretical relationship between association rules and machine learning techniques needs to be studied in more depth. This article studies the use of clustering as a model for association rule mining. The clustering model is exploited to bound and estimate association rule support and confidence. We first study the efficient computation of the clustering model with K-means; we show the sufficient...

chapter

Reclassification Rules

Li-Shiang Tsay, Z.W. Ras, Seunghyun Im

2008 IEEE International Conference on Data Mining Workshops > 619 - 627

The ultimate goal of knowledge discovery (KD) is to extract sets of patterns leading to useful knowledge for obtaining user-desirable outcomes. The key characteristic of knowledge usefulness is that these patterns are actionable. In the last decade, KD algorithms such as mining for association rules, clustering, and classification rules have made tremendous progress and have been demonstrated...

chapter

A Logical Formulation of the Granular Data Model

Tuan-Fang Fan, Churn-Jung Liau, Tsau-Young Lin, K. Lee

2008 IEEE International Conference on Data Mining Workshops > 628 - 634

In data mining problems, data is usually provided in the form of data tables. To represent knowledge discovered from data tables, decision logic (DL) is proposed in rough set theory. While DL is an instance of propositional logic, we can also describe data tables by other logical formalisms. In this paper, we use a kind of many-sorted logic, called attribute value-sorted logic, to study association...

chapter

Multilayer Change-Point Detection on Stock Order Flows by Wavelet Transformation

Xiaoyan Liu, Xindong Wu, Huaiqing Wang, Yingfeng Wang

2008 IEEE International Conference on Data Mining Workshops > 635 - 642

In empirical finance, an increase or decrease in the number of stock buy/sell orders is caused by information asymmetry, which eventually affects the change of the stock price. To monitor the change in the stock order flow, we propose a multilayer change-point detection algorithm which makes use of the multi-resolution property of wavelet transformation. We first detect the change-points in...

chapter

Statistical Independence and Contingency Matrix

S. Tsumoto, S. Hirano

2008 IEEE International Conference on Data Mining Workshops > 643 - 648

This paper shows the meaning of Pearson residuals as an indicator of statistical independence. While information granules of statistical independence of two variables can be viewed as determinants of 2×2 submatrices, those of three variables consist of several combinations of linear equations which will become residuals for odds ratio (outer products) when they are equal to 0. Interestingly, the...

chapter

An FUSP-Tree Maintenance Algorithm for Record Modification

Chun-Wei Lin, Tzung-Pei Hong, Wen-Hsiang Lu, Hsin-Yi Chen

2008 IEEE International Conference on Data Mining Workshops > 649 - 653

Several algorithms have been proposed for maintaining sequential patterns as records are inserted. In addition to record insertion, pattern maintenance for record modification is also very important in real applications. In the past, we have proposed the fast updated sequential pattern tree (called FUSP tree) structure for handling record insertion. In this paper, we attempt to handle...

chapter

Hunting for Coherent Co-clusters in High Dimensional and Noisy Datasets

M. Deodhar, J. Ghosh, G. Gupta, Hyuk Cho, et al.

2008 IEEE International Conference on Data Mining Workshops > 654 - 663

Clustering problems often involve datasets where only a part of the data is relevant to the problem, e.g., in microarray data analysis only a subset of the genes show cohesive expressions within a subset of the conditions/features. The existence of a large number of non-informative data points and features makes it challenging to hunt for coherent and meaningful clusters from such datasets. Additionally,...

chapter

Text Knowledge Mining: An Alternative to Text Data Mining

D. Sanchez, M.J. Martin-Bautista, I. Blanco, C. Torre

2008 IEEE International Conference on Data Mining Workshops > 664 - 672

In this paper we introduce an alternative view of text mining and review several alternative views proposed by different authors. We propose a classification of text mining techniques into two main groups: techniques based on inductive inference, which we call text data mining (TDM, comprising most of the existing proposals in the literature), and techniques based on deductive or abductive inference,...

chapter

Estimating True and False Positive Rates in Higher Dimensional Problems and Its Data Mining Applications

A. Foss, O.R. Zaiane

2008 IEEE International Conference on Data Mining Workshops > 673 - 681

If we can estimate the accuracy of our observations then we can estimate the true and false positive rates over a series of samples in high dimensional data mining problems. To date such issues have been largely neglected and previously no algorithm has been provided to facilitate the computations involved. In high dimensional data mining tasks, increasing sparsity leads to decreasing true positive...

chapter

An Adaptive Pre-filtering Technique for Error-Reduction Sampling in Active Learning

M. Davy, S. Luz

2008 IEEE International Conference on Data Mining Workshops > 682 - 691

Error-reduction sampling (ERS) is a high-performing (but computationally expensive) query selection strategy for active learning. Subset optimisation has been proposed to reduce computational expense by applying ERS to only a subset of examples from the pool. This paper compares techniques used to construct the subset, namely random sub-sampling and pre-filtering. We focus on pre-filtering, which populates...

chapter

ARUBAS: An Association Rule Based Similarity Framework for Associative Classifiers

B. Depaire, K. Vanhoof, G. Wets

2008 IEEE International Conference on Data Mining Workshops > 692 - 699

This article introduces ARUBAS, a new framework to build associative classifiers. In contrast with many existing associative classifiers, it uses class association rules to transform the feature space and uses instance-based reasoning to classify new instances. The framework allows the researcher to use any association rule mining algorithm to produce the class association rules. Every aspect of the...

chapter

ZCS Revisited: Zeroth-Level Classifier Systems for Data Mining

F.A. Tzima, P.A. Mitkas

2008 IEEE International Conference on Data Mining Workshops > 700 - 709

Learning classifier systems (LCS) are machine learning systems designed to work for both multi-step and single-step decision tasks. The latter case presents an interesting, though not widely studied, challenge for such algorithms, especially when they are applied to real-world data mining problems. The present investigation departs from the popular approach of applying accuracy-based LCS to data mining...

chapter

A New Graph-Based Algorithm for Clustering Documents

A.P. Suarez, J.F.M. Trinidad, J.A.C. Ochoa, J.E.M. Pagola

2008 IEEE International Conference on Data Mining Workshops > 710 - 719

In this paper a new algorithm, called CStar, for document clustering is presented. This algorithm improves recently developed algorithms like generalized star (GStar) and ACONS algorithms, originally proposed for reducing some drawbacks presented in previous Star-like algorithms. The CStar algorithm uses the condensed star-shaped sub-graph concept defined by ACONS, but defines a new heuristic that...
