2008 IEEE International Conference on Data Mining Workshops


chapter

One-Class Classification of Text Streams with Concept Drift

Yang Zhang, Xue Li, M. Orlowska

2008 IEEE International Conference on Data Mining Workshops > 116 - 125


Research on streaming data classification has mostly assumed that data can be fully labelled. However, this is impractical. Firstly, it is impossible to complete the labelling before all data has arrived. Secondly, obtaining fully labelled data through manual effort is generally very expensive. Thirdly, user interests may change with time, so the labels issued earlier may be inconsistent...

chapter

Hierarchical Text Categorization in a Transductive Setting

M. Ceci

2008 IEEE International Conference on Data Mining Workshops > 184 - 191


Transductive learning is a learning setting that allows learning from "particular to particular" and considers both labelled and unlabelled examples when making classification decisions. In this paper, we investigate the use of transductive learning in the context of hierarchical text categorization. To this end, we exploit a modified version of an inductive hierarchical learning framework...

chapter

Mining Unstructured Text at Gigabyte per Second Speeds

A. Ratner

2008 IEEE International Conference on Data Mining Workshops > 468 - 476


Humans communicate with text in thousands of languages, in dozens of scripts, in a variety of binary codes, on millions of topics. There is a need, for both government and commercial applications, to identify these text characteristics to enable follow-on processing such as transcoding, translation, transliteration, routing and prioritization. This paper deals with the implementation of real-time...

chapter

Text Knowledge Mining: An Alternative to Text Data Mining

D. Sanchez, M.J. Martin-Bautista, I. Blanco, C. Torre

2008 IEEE International Conference on Data Mining Workshops > 664 - 672


In this paper we introduce an alternative view of text mining and review several alternative views proposed by different authors. We propose a classification of text mining techniques into two main groups: techniques based on inductive inference, which we call text data mining (TDM, comprising most of the existing proposals in the literature), and techniques based on deductive or abductive inference,...

chapter

An Adaptive Pre-filtering Technique for Error-Reduction Sampling in Active Learning

M. Davy, S. Luz

2008 IEEE International Conference on Data Mining Workshops > 682 - 691


Error-reduction sampling (ERS) is a high-performing (but computationally expensive) query selection strategy for active learning. Subset optimisation has been proposed to reduce computational expense by applying ERS to only a subset of examples from the pool. This paper compares techniques used to construct the subset, namely random sub-sampling and pre-filtering. We focus on pre-filtering, which populates...

chapter

Semantic Features for Multi-view Semi-supervised and Active Learning of Text Classification

Shiliang Sun

2008 IEEE International Conference on Data Mining Workshops > 731 - 735


For multi-view learning, existing methods usually train classifiers on the originally provided features, which ignores the latent correlation between different views. In this paper, semantic features integrating information from multiple views are extracted for pattern representation. Canonical correlation analysis is used to learn the representation of semantic spaces where semantic features...

chapter

Keyword Extraction Based on Lexical Chains and Word Co-occurrence for Chinese News Web Pages

Xinghua Li, Xindong Wu, Xuegang Hu, Fei Xie, et al.

2008 IEEE International Conference on Data Mining Workshops > 744 - 751


This paper presents a new keyword extraction algorithm for Chinese news Web pages using lexical chains and word co-occurrence combined with frequency features, cohesion features, and correlation features. A lexical chain is the outward expression of consistency among semantically related words in a text, and represents the semantic content of a portion of the text. Word co-occurrence distribution...

Filter options

Keywords:
TEXT ANALYSIS

