Search results

Items from 1 to 6 out of 6 results

chapter

A feature selection method for document clustering based on part-of-speech and word co-occurrence

Zitao Liu, Wenchao Yu, Yalan Deng, Yongtao Wang, et al.

2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery > 5 > 2331 - 2334

2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

Feature selection is a process that chooses a subset of the original feature set according to some rules. The selected features retain their original physical meaning and provide a better understanding of the data and the learning process. However, few modern feature selection approaches take advantage of features' context information. Based on this analysis, we propose a novel feature selection method...

chapter

A Feature Selection Method based on Improved TFIDF

Wei Yong-qing, Liu Pei-yu, Zhu Zhen-fang

2008 Third International Conference on Pervasive Computing and Applications > 1 > 94 - 97

2008 Third International Conference on Pervasive Computing and Applications (ICPCA08)

Feature selection is a valid method for reducing the dimensionality of the vectors in a text categorization system. After analyzing several common evaluation functions for feature selection, we applied a term weighting function to feature selection. A new evaluation function based on an improved TFIDF method is presented; in this function the category information is introduced to feature items, and the feature items of...

chapter

Sequential Pattern Mining for Chinese E-mail Authorship Identification

Jianbin Ma, Ying Li, Guifa Teng, Fang Wang, et al.

2008 3rd International Conference on Innovative Computing Information and Control > 73

2008 3rd International Conference on Innovative Computing Information and Control (ICICIC)

With the rapid growth of computer technology and the popularization of the Internet, e-mail has become an economical and convenient form of communication. However, various types of crime and civil action involving e-mail documents have appeared, harming people's lives and social stability. The authorship of criminal e-mail therefore has to be identified automatically for the purposes of computer forensics. To solve...

chapter

Research on Chinese Text Automatic Categorization Based on VSM

Tong Xiao-Jun, Cui Ming-Gen, Song Guo-Long

2007 International Conference on Wireless Communications, Networking and Mobile Computing > 3863 - 3866

2007 3rd International Conference on Wireless Communications, Networking, and Mobile Computing - WiCOM '07

Automatic text classification is an important application of information processing technology. This paper introduces the key techniques of Chinese text categorization, such as text preprocessing, feature selection, feature representation, and training and classification algorithms, and analyzes in particular several of the most important current feature selection methods. A Chinese text classifier based...

chapter

Histogram-Based Dimensionality Reduction of Term Vector Space

K. Ciesielski, M.A. Klopotek, S.T. Wierzchoń

6th International Conference on Computer Information Systems and Industrial Management Applications (CISIM'07) > 103 - 108

2007 6th International Conference on Computer Information Systems and Industrial Management Applications

One of the most vital problems of free-text document processing is the curse of dimensionality. The paper presents a dimensionality reduction algorithm based on informed feature selection. Terms describing the document are based on histogram-like statistics which can be computed as well as incrementally updated at low complexity. The document representation can adapt to changing document collection...

article

A New Text Categorization Technique Using Distributional Clustering and Learning Logic

H. Al-Mubaid, S.A. Umair

IEEE Transactions on Knowledge and Data Engineering > 2006 > 18 > 9 > 1156 - 1165

Text categorization continues to be one of the most researched NLP problems due to the ever-increasing number of electronic documents and digital libraries. In this paper, we present a new text categorization method that combines the distributional clustering of words and a learning logic technique, called Lsquare, for constructing text classifiers. The high dimensionality of text in a document...

Filter options

Keywords:
WORD PROCESSING
FEATURE SELECTION


INFONA - science communication portal
