Search results

Items from 1 to 6 out of 6 results

chapter

Fusing Gini Index and Term Frequency for Text Feature Selection

Lin Wu, Yongbin Wang, Shengyan Zhang, Yannan Zhang

2017 IEEE Third International Conference on Multimedia Big Data (BigMM) > 280 - 283

Automatic text classification is a key technology for processing and organizing large-scale text data. It is well known that the high dimensionality of the feature space is a major challenge for text classification. To mitigate this problem, and inspired by existing work, we propose an effective text feature selection algorithm that fuses the classical methodologies of Gini index and...

chapter

Optimized Approach of Feature Selection Based on Information Gain

Guohua Wu, Junjun Xu

2015 International Conference on Computer Science and Mechanical Automation (CSMA) > 157 - 161

Text feature selection is a key technology in text classification and text information retrieval. Information gain, a feature selection method, is widely used in text categorization. This paper theoretically analyzes the deficiencies of information gain as a feature selection method and then introduces two improvement factors, among them LDFWF (Limiting Document Frequency's Word Frequency)...
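
For orientation, classical information gain for a term can be sketched as below. This is a minimal illustration of the standard textbook definition, IG(t) = H(C) - P(t) H(C|t) - P(not t) H(C|not t), and not the optimized method of the paper above; the improvement factors it proposes (LDFWF and the one cut off by the truncated abstract) are not reproduced. The function names and the toy corpus are assumptions made for this example.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Classical information gain of `term` with respect to the class labels.

    docs   -- list of token sets, one per document (hypothetical representation)
    labels -- list of class labels, aligned with docs
    """
    classes = sorted(set(labels))
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    without_term = [lab for doc, lab in zip(docs, labels) if term not in doc]

    def class_entropy(subset):
        # An empty subset contributes nothing to the conditional entropy.
        if not subset:
            return 0.0
        return entropy([subset.count(c) for c in classes])

    n = len(labels)
    return (
        class_entropy(list(labels))
        - len(with_term) / n * class_entropy(with_term)
        - len(without_term) / n * class_entropy(without_term)
    )


# Toy corpus, invented for illustration only.
docs = [
    {"goal", "match"},
    {"goal", "team"},
    {"vote", "party"},
    {"vote", "election"},
]
labels = ["sport", "sport", "politics", "politics"]

for term in ("goal", "team"):
    print(term, round(information_gain(docs, labels, term), 4))
```

In this toy corpus "goal" separates the two classes perfectly and therefore scores higher than "team", which occurs in only one document.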

chapter

A k-Highest Expert Text Classification Algorithm Based on Choquet Integral

Shuchao Feng, Wenqian Shang, Yuqi Wang

2015 3rd International Conference on Applied Computing and Information Technology/2nd International Conference on Computational Science and Intelligence (ACIT-CSI) > 499 - 503

Research on text classification algorithms remains a hot topic in text mining. KNN is a classic text classification algorithm. The rule for finding the nearest neighbors directly affects the performance and precision of categorization. In this paper, we focus mainly on distance measures and similarity. We propose a new text classification algorithm which combines KNN and Choquet...

chapter

An empirical evaluation of linear and nonlinear kernels for text classification using Support Vector Machines

Ya Gao, Shiliang Sun

2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD) > 4 > 1502 - 1505

This paper compares the performance of linear and nonlinear kernels of Support Vector Machines (SVM) used for text classification. The study is motivated by the earlier view that linear SVMs perform better than nonlinear ones, and by the observation that, although many investigations have shown that SVMs perform well in text classification, there is no serious investigation on the comparison between...
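
As a reminder of what the two kernel families look like, the sketch below contrasts a linear kernel with one common nonlinear kernel. The choice of the Gaussian RBF kernel and the value of `gamma` are assumptions made for illustration; the truncated abstract does not say which nonlinear kernels the paper evaluates.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel: exp(-gamma * ||x - y||^2).

    `gamma` is a free hyperparameter; 0.5 is an arbitrary illustrative default.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two toy term-frequency vectors, invented for this example.
u = [1.0, 2.0, 0.0]
v = [3.0, 4.0, 0.0]

print(linear_kernel(u, v))
print(rbf_kernel(u, v))
```

The linear kernel grows with the overlap of term weights, while the RBF kernel depends only on the distance between the vectors and always lies in (0, 1].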

chapter

A study of the identification of authorship for Chinese texts

Zhang Jian, Yao Tianfang

2008 IEEE International Conference on Intelligence and Security Informatics (ISI 2008) > 263 - 264

Style-based text authorship identification extracts features from texts of known authorship, constructs a classifier, and then identifies disputed texts. Authorship identification belongs to the domain of style classification and is a branch of text classification. In contrast to text classification, which deals with the content of texts, authorship identification focuses on the formal properties of texts...

article

A Fast Tracking Algorithm for Generalized LARS/LASSO

S.S. Keerthi, S. Shevade

IEEE Transactions on Neural Networks > 2007 > 18 > 6 > 1826 - 1830

This letter gives an efficient algorithm for tracking the solution curve of sparse logistic regression with respect to the regularization parameter. The algorithm is based on approximating the logistic regression loss by a piecewise quadratic function, using Rosset and Zhu's path tracking algorithm on the approximate problem, and then applying a correction to get to the true path. Application of the...

Filter options

Keywords:
TEXT CATEGORIZATION
COMPUTER SCIENCE

INFONA - science communication portal
