Search results

Items from 1 to 20 out of 24 results

chapter

Investigating analysis of speech content through text classification

S Ezzat, N E Gayar, M M Ghanem

2010 International Conference of Soft Computing and Pattern Recognition > 105 - 110

2010 International Conference of Soft Computing and Pattern Recognition (SoCPaR 2010)

The field of Text Mining has evolved over the past years to analyze textual resources. However, it can be used in several other applications. In this research, we are particularly interested in applying text mining techniques to audio materials after translating them into texts in order to detect the speakers' emotions. We describe our overall methodology and present our experimental results. In...

chapter

Search-based short-text classification

Kang Wei, Ruiquan Zhang, Xinguo Xu

5th International Conference on Pervasive Computing and Applications > 297 - 301

2010 5th International Conference on Pervasive Computing and Applications (ICPCA 2010)

Since the traditional classification algorithm does not work well in the case of short-text classification, we propose a search-based method employing the Naïve Bayes classification algorithm. This paper describes the whole process, including the classification algorithms, training, and the evaluation. The results indicate that the classifier has better performance compared with other methods.

chapter

Improving Arabic document categorization: Introducing local stem

Eiman Tamah Al-Shammari

2010 10th International Conference on Intelligent Systems Design and Applications > 385 - 390

10th International Conference on Intelligent Systems Design and Applications (ISDA 2010)

Stemming is a fundamental step in processing textual data preceding the tasks of text mining, Information Retrieval (IR), and natural language processing (NLP). The common goal of stemming is to standardize words by reducing a word to its base (root or stem), so it can also be considered a feature reduction technique. This paper aims at presenting a new dictionary-free, content-based Arabic stemmer...

chapter

An Improved Algorithm to Term Weighting in Text Classification

Ran Li, Xianjiu Guo

2010 International Conference on Multimedia Technology > 1 - 3

2010 International Conference on Multimedia Technology (ICMT)

The traditional TF-IDF algorithm is a common method that is used to measure feature weight in text categorization. However, the algorithm doesn't take the distribution of feature terms in inter-class and intra-class into consideration. Consequently, the algorithm can't effectively weigh the distribution proportion of feature items. In order to solve this problem, information entropy in inter-class...

chapter

Learning to integrate unlabeled data in text classification

Eric P Jiang

2010 3rd International Conference on Computer Science and Information Technology > 4 > 82 - 86

2010 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT 2010)

The paper deals with the text classification problem where labeled training samples are very limited while unlabeled data are readily available in large quantities. The paper proposes an efficient classification algorithm that incorporates a weighted k-means clustering scheme into an Expectation Maximization (EM) process. It aims to balance predictive values between labeled and unlabeled training...

chapter

Feature selection for text classification using OR+SVM-RFE

Meixiang Luo, Linkai Luo

2010 Chinese Control and Decision Conference > 1648 - 1652

2010 Chinese Control and Decision Conference (CCDC)

Feature selection is the key issue in text classification because there are a large number of attributes. In this paper, we propose a new algorithm, OR+SVM-RFE, that integrates Odds Ratio (OR) with recursive feature elimination based on SVM (SVM-RFE). Odds Ratio is first used to roughly and rapidly select a feature subset. Then SVM-RFE is used to delicately select a smaller feature subset. Experiment...

chapter

A New Method of Training Sample Selection in Text Classification

Yixing Liao, Xuezeng Pan

2010 Second International Workshop on Education Technology and Computer Science > 1 > 211 - 214

2010 2nd International Workshop on Education Technology and Computer Science (ETCS)

To address noise samples in the training dataset, the paper proposes a new method for reducing the size of the training dataset that is applicable to text classification. This method describes the distribution of the training dataset according to the representativeness score of samples in the class they belong to, so as to show representative samples and noise samples in each class. The new method...

chapter

Collecting health related text from patient health writings

Saiful Akbar, Laura Slaughter, Øystein Nytrø

2010 The 2nd International Conference on Computer and Automation Engineering (ICCAE) > 1 > 15 - 19

2nd International Conference on Computer and Automation Engineering (ICCAE 2010)

The Internet has been a huge resource for sharing and collecting information including health related information. Some health related information is written by patients (lay persons) discussing their experience about health problems and treatments. This paper introduces our initial work on providing physicians with clinically useful patient health writings. More specifically, the paper presented...

chapter

A Survey on Text Classification Techniques for E-mail Filtering

Upasana, S Chakravarty

2010 Second International Conference on Machine Learning and Computing > 32 - 36

2nd International Conference on Machine Learning and Computing (ICMLC 2010)

The continuing explosive growth of textual content within the World Wide Web has given rise to the need for sophisticated Text Classification (TC) techniques that combine efficiency with high quality of results. E-mail filtering is one application that has the potential to affect every user of the internet. Even though a large body of research has delved into this problem, there is a paucity of survey...

chapter

Study on Method of Word Segmentation in Feature Selection in Chinese Text Categorization

Huang Wei, Liu Yi, Gao Bing, Yang Ke-wei

2010 Third International Conference on Knowledge Discovery and Data Mining > 411 - 415

2010 3rd International Conference on Knowledge Discovery and Data Mining (WKDD 2010)

Since automatic word segmentation of Chinese text causes a loss of information, a method of word segmentation that uses the lexical chunk as the segmentation unit is proposed. A traditional segmentation method is used to segment the Chinese text; on that basis, the mutual information between two lexical entries and the adjacent frequency of two or more lexical entries are calculated, and according to this calculated value judge and...

chapter

Chinese Question Classification Based on Semantic Gram and SVM

Liang Wang, Hui Zhang, Deqing Wang, Jia Huang

2009 International Forum on Computer Science-Technology and Applications > 1 > 432 - 435

2009 International Forum on Computer Science-Technology and Applications (IFCSTA 2009)

Question classification plays a crucial role in the question answering system. Recent research on open-domain question classification mostly concentrates on using machine learning methods to resolve this special kind of text classification. This paper presents our research about Chinese question classification using machine learning methods and gives our approach based on SVM and semantic...

chapter

A Novel Feature Selection Approach and Feature Weight Adjustment Technique in Text Classification

Yixing Liao, Xuezeng Pan

2009 Seventh ACIS International Conference on Software Engineering Research, Management and Applications > 41 - 44

2009 7th ACIS International Conference on Software Engineering Research, Management and Applications (SERA 2009)

Feature selection and feature weight calculation are key preprocessing steps in text classification. A new feature selection approach based on average interaction gain (AIG) is presented, and a new feature weight adjustment technique (WA) taking inter-class distribution and intra-class distribution into consideration is presented too. Then a new approach combining AIG with WA, called AIG-WA, is presented. In...

chapter

Cross-domain classification: Trade-off between complexity and accuracy

E. Lex, C. Seifert, M. Granitzer, A. Juffinger

2009 International Conference for Internet Technology and Secured Transactions, (ICITST) > 1 - 6

2009 4th International Conference for Internet Technology and Secured Transactions (ICITST 2009)

Text classification is one of the core applications in data mining due to the huge amount of uncategorized digital data available. Training a text classifier generates a model that reflects the characteristics of the domain. However, if no training data is available, labeled data from a related but different domain might be exploited to perform cross-domain classification. In our work, we aim to...

chapter

Method for feature word weight calculating

Yanling Li, Jing Yuan, Xia Ye

2009 IEEE International Conference on Intelligent Computing and Intelligent Systems > 1 > 309 - 312

2009 IEEE International Conference on Intelligent Computing and Intelligent Systems (ICIS 2009)

Automatic text categorization has been one of the hotspots in the information processing field. Given the important impact of feature weight calculation on text classification accuracy, first, the relationship between the text representation model and feature weight calculation is studied, and the existing methods of feature weight calculation are analyzed, then the common idea of feature weighting...

chapter

Text classification based on limited bibliographic metadata

K. Denecke, T. Risse, T. Baehr

2009 Fourth International Conference on Digital Information Management > 1 - 6

2009 Fourth International Conference on Digital Information Management

In this paper, we introduce a method for categorizing digital items according to their topic, only relying on the document's metadata, such as author name and title information. The proposed approach is based on a set of lexical resources constructed for our purposes (e.g., journal titles, conference names) and on a traditional machine-learning classifier that assigns one category to each document...

chapter

Classifying Sentence-Based Summaries of Web Documents

M.S. Pera, Yiu-Kai Ng

2009 21st IEEE International Conference on Tools with Artificial Intelligence > 433 - 440

2009 21st IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2009)

Text classification categorizes Web documents in large collections into predefined classes based on their contents. Unfortunately, the classification process can be time-consuming and users are still required to spend a considerable amount of time scanning through the classified Web documents to identify the ones that satisfy their information needs. In solving this problem, we first introduce CorSum,...

chapter

Classifying Text with Statistically Selected Features to Closely Related Categories

M. Janaki Meena, K.R. Chandran

2009 International Conference on Advances in Recent Technologies in Communication and Computing > 297 - 301

2009 International Conference on Advances in Recent Technologies in Communication and Computing. ARTCom 2009

Text classification continues to be one of the most researched problems due to the continuously increasing amount of electronic documents and digital data. Classifying documents to closely related categories is the most complex task in text categorization. Feature selection is an essential preprocessing step for improving the efficiency and accuracy of the text classifiers by removing redundant and...

chapter

Chinese Text Classification Using Key Characters String Kernel

Shiqiang Zheng, Yujiu Yang, Haiping Wu, Wenhuang Liu

2009 Fifth International Conference on Semantics, Knowledge and Grid > 113 - 119

2009 Fifth International Conference on Semantics, Knowledge and Grid (SKG 2009)

Most Chinese text classification methods are based on Chinese word segmentation and bag of words (BOW). The classification performance largely relies on the accuracy of segmentation. Unfortunately, perfect precision and disambiguation of segmentation cannot be reached. In order to solve this problem, a novel Chinese text classification method using string kernel is presented. String kernel computes...

chapter

A new method for attribute extraction with application on text classification

G. Biricik, B. Diri, A.C. Sonmez

2009 Fifth International Conference on Soft Computing, Computing with Words and Perceptions in System Analysis, Decision and Control > 1 - 4

2009 Fifth International Conference on Soft Computing, Computing with Words and Perceptions in System Analysis, Decision and Control

We introduce a new method for dimensionality reduction by attribute extraction and evaluate its impact on text classification. The textual contents in body sections of the news in Reuters-21578 are the selected attributes for classification. Using the offered method, the high-dimensional attributes (words extracted from the news bodies) are projected onto a new hyperplane having dimensions equal to...

chapter

Semi-supervised text classification from unlabeled documents using class associated words

Hong-qi Han, Dong-hua Zhu, Xue-feng Wang

2009 International Conference on Computers & Industrial Engineering > 1255 - 1260

2009 International Conference on Computers & Industrial Engineering (CIE39)

Automatically classifying text documents is an important field in machine learning. Unsupervised text classification does not need training data but is often criticized for clustering blindly. Supervised text classification needs large quantities of labeled training data to achieve high accuracy. However, in practice, labeled samples are often difficult, expensive or time consuming to obtain. Meanwhile,...

Data set:
ieee
Keywords:
ACCURACY
CLASSIFICATION
TEXT CLASSIFICATION

Keywords

TEXT ANALYSIS (23)
CLASSIFICATION ALGORITHMS (16)
TEXT CATEGORIZATION (14)
TRAINING (12)
FEATURE EXTRACTION (10)
SUPPORT VECTOR MACHINES (9)
DATA MINING (8)
FEATURE SELECTION (8)
BAYES METHODS (5)
INTERNET (4)
LEARNING (ARTIFICIAL INTELLIGENCE) (4)
MACHINE LEARNING (4)
NAIVE BAYES CLASSIFIER (4)
SUPPORT VECTOR MACHINE CLASSIFICATION (4)
ELECTRONIC MAIL (3)
NATURAL LANGUAGE PROCESSING (3)
PATTERN CLASSIFICATION (3)
SUPPORT VECTOR MACHINE (3)
ARTIFICIAL NEURAL NETWORKS (2)
CLASSIFICATION ALGORITHM (2)
CLUSTERING (2)
DICTIONARIES (2)
ENTROPY (2)
EXPECTATION-MAXIMISATION ALGORITHM (2)
FEATURE WEIGHT (2)
FILTERING (2)
INFORMATION FILTERING (2)
INFORMATION RETRIEVAL (2)
MACHINE LEARNING ALGORITHMS (2)
MUTUAL INFORMATION (2)
NIOBIUM (2)
PATTERN CLUSTERING (2)
SVM (2)
TEXT MINING (2)
TEXT REPRESENTATION MODEL (2)
TEXTUAL CONTENT (2)
TRAINING DATA (2)
UNSOLICITED E-MAIL (2)
ALGORITHM DESIGN AND ANALYSIS (1)
ANALYTICAL MODELS (1)
ARABIC DOCUMENT CATEGORIZATION (1)
ARABIC STEMMING ALGORITHMS (1)
ARABIC TEXT CATEGORIZATION (1)
ARGON (1)
ARRAYS (1)
ATOMIC MEASUREMENTS (1)
ATTRIBUTE EXTRACTION (1)
AUDIO AND TEXT MINING (1)
AUDIO MATERIAL (1)
AUDIO STREAMING (1)
AUDIO SYSTEMS (1)
AUTHOR NAME (1)
AUTOMATIC TEXT CATEGORIZATION (1)
AUTOMATIC TEXT DOCUMENT CLASSIFICATION (1)
AVERAGE INTERACTION GAIN (1)
BAYESIAN METHODS (1)
BIBLIOGRAPHIC METADATA (1)
BIOLOGICAL SYSTEM MODELING (1)
BLOGS (1)
BREAST CANCER MAILING LIST (1)
CANCER (1)
CATEGORY WEIGHT (1)
CENTROID-BASED ALGORITHM (1)
CHEMISTRY (1)
CHI-SQUARE MAX METHOD (1)
CHI-SQUARE STATISTICS (1)
CHINESE QUESTION CLASSIFICATION (1)
CHINESE TEXT CATEGORIZATION (1)
CHINESE TEXT CLASSIFICATION (1)
CLASS ASSOCIATED WORDS (1)
CLASS SPACE MODEL (1)
CLASS SPACE MODEL (CSM) (1)
CLASS-FEATURE-CENTROID CLASSIFIER (1)
CLASSIFICATION ACCURACY (1)
CLASSIFICATION METHOD (1)
CLASSIFIER SELECTION (1)
CLINICAL ANALYSIS (1)
COMPUTATIONAL COMPLEXITY (1)
CONFERENCE NAMES (1)
CONTENT-BASED ARABIC STEMMER (1)
CONTEXT (1)
CONTEXT-BASED EMAIL FILTERING (1)
CORRELATION (1)
CORSUM-GENERATED SUMMARIES (1)
CROSS-DOMAIN CLASSIFICATION (1)
CULTURAL DIFFERENCES (1)
DATA MODELS (1)
DEMOCRATIC CLASSIFIER (1)
DENSITY FUNCTIONAL THEORY (1)
DICTIONARY FREE ARABIC STEMMER (1)
DIGITAL DATA (1)
DIGITAL ITEM CATEGORIZATION (1)
DIMENSIONALITY REDUCTION (1)
DISTRIBUTION PROPORTION (1)
DIVERGENCE MEASURE (1)
DIVERGENCE-BASED (1)
DIVERGENCE-BASED FEATURE SELECTION (1)