Search results

Items from 141 to 160 out of 474 results

chapter

A Model for Term Selection in Text Categorization Problems

Laura Maria Cannas, Nicoletta Dessi, Stefania Dessi

2012 23rd International Workshop on Database and Expert Systems Applications > 169 - 173

2012 23rd International Workshop on Database and Expert Systems Applications (DEXA)

In the last ten years, automatic Text Categorization (TC) has attracted increasing interest from the research community, due to the need to organize a massive number of digital documents. Following a machine learning paradigm, this paper presents a model which regards TC as a classification task supported by a wrapper approach and combines a Genetic Algorithm (GA) with a filter...

chapter

A Method for Chinese Text Classification Based on Three-Dimensional Vector Space Model

Jixian Zhang, Qinglin Wang, Yuan Li, Dongmei Li, et al.

2012 International Conference on Computer Science and Service System > 1324 - 1327

2012 International Conference on Computer Science and Service System (CSSS)

Text classification is an important research direction in text mining, and automatic classification of Chinese text is becoming a research focus of intelligent classification. To address the particularities of Chinese text classification, this paper presents a three-dimensional vector space model, built on the vector space model, to improve the accuracy and efficiency of text...

chapter

An Improved Ambiguity Measure Feature Selection for Text Categorization

Zhiying Liu, Jieming Yang

2012 4th International Conference on Intelligent Human-Machine Systems and Cybernetics > 1 > 220 - 223

2012 4th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC)

The high dimensionality of text categorization raises big hurdles in applying many sophisticated learning algorithms. Feature selection, which reduces the number of features that represent documents, is an absolute requirement in text categorization. In this paper, we propose a feature selection method that improves the performance of the Ambiguity Measure feature...

chapter

Hybrid ACO and TOFA feature selection approach for text classification

Hanan S. Alghamdi, H. Lilian Tang, Saleh Alshomrani

2012 IEEE Congress on Evolutionary Computation > 1 - 6

2012 IEEE Congress on Evolutionary Computation (CEC)

With the rapidly increasing availability of text data on the Internet, selecting an appropriate set of features for text classification becomes more important, not only for reducing the dimensionality of the feature space but also for improving classification performance. This paper proposes a novel feature selection approach to improve the performance of text classifiers based on...

chapter

Text categorization based on improved Rocchio algorithm

Guanyu Gao, Shengxiao Guan

2012 International Conference on Systems and Informatics (ICSAI2012) > 2247 - 2250

2012 International Conference on Systems and Informatics (ICSAI)

Text categorization is used to assign each text document to predefined categories. This paper presents a new text classification method for classifying Chinese text based on the Rocchio algorithm. We first use TF-IDF to extract document vectors from training documents that have been correctly categorized, and then use those document vectors to generate codebooks as classification models using...

chapter

Active semi-supervised framework with data editing

Xue Zhang, Wang-xin Xiao

2012 International Conference on Systems and Informatics (ICSAI2012) > 46 - 50

2012 International Conference on Systems and Informatics (ICSAI)

Self-labeled training data in semi-supervised learning may contain much noise due to insufficient initial training data, which may hurt the generalization ability of the final hypothesis. In this paper, we propose an Active Semi-Supervised framework with Data Editing (ASSDE) to improve sparsely labeled text classification. A data editing technique is used to identify and remove noise introduced...

chapter

Enhanced intelligent text categorization using concise keyword analysis

Amir Mohammad Shahi, Biju Issac, Jashua Rajesh Modapothala

2012 International Conference on Innovation Management and Technology Research > 574 - 579

2012 International Conference on Innovation Management and Technology Research (ICIMTR)

Supervised learning is a popular approach to text classification among the research community as well as within the software development industry. It enables intelligent systems to solve various text analysis problems such as document organization, spam detection and report scoring. However, the extremely difficult and time-intensive process of creating a training corpus makes it inapplicable to many...

chapter

An improved KNN algorithm for text classification based on clustering center vector

Jie Ling, Lina Zou

2012 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems (CYBER) > 584 - 588

2012 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems (CYBER)

The traditional KNN algorithm for text classification has some shortcomings; this paper presents an improved KNN algorithm. Using the clustering center vector, we put the distance between the text to be classified and the text category into the similarity calculation formula, and take the ratio of the number of common features appearing in the two texts to the maximum number of respective features...

chapter

Research on grain information classification based on improved support vector machine

Ruihuan Geng, Miao Zhang, Dexian Zhang, Jiajia Chai

2012 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems (CYBER) > 293 - 296

2012 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems (CYBER)

This paper analyses the defects of the traditional support vector machine (SVM). According to the characteristics of grain information on the web, a multi-class classification method based on a Huffman binary tree SVM (HBT-SVM) is presented for grain information classification. Compared with existing SVM methods, this method has higher computational efficiency. The experimental results...

chapter

Clustering based two-stage text classification requiring minimal training data

Xue Zhang, Wang-xin Xiao

2012 International Conference on Systems and Informatics (ICSAI2012) > 2233 - 2237

2012 International Conference on Systems and Informatics (ICSAI)

Clustering-aided classification methods are based on the assumption that the clusters learned under the guidance of initial training data can to some extent characterize the underlying distribution of the data set. However, our experiments show that whether this assumption holds depends on both the separability of the data set considered and the size of the training data set. It is often violated on data...

chapter

A New Method of Class Centriod Vectors Classification Based on the Feedback

Li Weijiang, Chen Xing, Zhao Tiejun, Wang Xiangang

2012 International Conference on Computer Distributed Control and Intelligent Environmental Monitoring > 56 - 60

2012 International Conference on Computer Distributed Control and Intelligent Environmental Monitoring (CDCIEM)

Organizing and managing large amounts of document data, and finding the information users are interested in quickly and exactly, is a great challenge for information technology. Text classification can achieve the goal of information distribution and solve the problem of information disorder, making it more convenient for users to make decisions. The centroid classifier is one of the most efficient...

chapter

A Feature Selection Method Based on Information Gain and Genetic Algorithm

Shang Lei

2012 International Conference on Computer Science and Electronics Engineering > 2 > 355 - 358

2012 International Conference on Computer Science and Electronics Engineering (ICCSEE)

With the rapid development of computer science and technology, quickly finding useful or needed information has become a major problem for users. Text categorization can help people solve this problem. Feature selection has become one of the most critical techniques in the field of automatic text categorization. A new method of text feature selection based...

chapter

Arabic text classification using Learning Vector Quantization

Mohammed Azara, Tamer Fatayer, Alaa El-Halees

2012 8th International Conference on Informatics and Systems (INFOS) > NLP-39 - NLP-43

2012 8th International Conference on Informatics and Systems (INFOS 2012)

One of the several benefits of text classification is automatically assigning documents to predefined categories. Researchers have used the LVQ algorithm for English and Persian [1, 2] but have paid no attention to Arabic. In our research, we therefore used a neural network approach to classify Arabic text using the Learning Vector Quantization (LVQ) algorithm. This algorithm is based on Kohonen self-organizing...

chapter

A comparison between keywords and key-phrases in text categorization using feature section technique

Vatinee Nuipian, Phayung Meesad, Pudsadee Boonrawd

2011 Ninth International Conference on ICT and Knowledge Engineering > 156 - 160

2011 9th International Conference on ICT and Knowledge Engineering (ICT & Knowledge Engineering 2011) - Conference postponed to 2012

Text categorization is the main issue which affects search results. Moreover, most approaches suffer from the high dimensionality of feature space. To overcome this problem, the use of feature selection techniques with statistical text categorization is investigated. The methods were evaluated based on Chi-Square, Information Gain and Gain Ratio. The data used to test the system consisted of 1,510...

chapter

TKNN: An Improved KNN Algorithm Based on Tree Structure

Li Juan

2011 Seventh International Conference on Computational Intelligence and Security > 1390 - 1394

2011 Seventh International Conference on Computational Intelligence and Security (CIS)

Text classification is the process of assigning documents to a set of previously fixed categories. It is widely used in many applications, such as web page categorization, email spam filtering, and document indexing. Many popular algorithms for text classification have been proposed, such as Naive Bayes, K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). However, these classification...

chapter

Feature selection algorithm based on the Community discovery

Xiaoqiang Jia

2011 Seventh International Conference on Computational Intelligence and Security > 455 - 458

2011 Seventh International Conference on Computational Intelligence and Security (CIS)

To overcome two limitations, namely that SVM-based text classification ignores contextual semantic information and that community-based text classification allows a boundary point to belong to only one community, the concepts of contribution and overlapping coefficient based on the complex network graph are introduced, and a feature selection algorithm based on community discovery is proposed. Experiments...

chapter

Different similarity measures in semi-supervised text classification

Mohammed Abdul Wajeed, T. Adilakshmi

2011 Annual IEEE India Conference > 1 - 5

2011 Annual IEEE India Conference (INDICON)

Information has great value; to use existing information, we need to store it in a manner that allows easy retrieval when needed. Classifying the available information therefore becomes necessary. In addition to the existing supervised and unsupervised paradigms of classification, the paper attempts to exploit the semi-supervised learning paradigm. Semi-supervised learning is...

chapter

Arabic text categorization based on rough set classification

Moawia Elfaki Yahia

2011 9th IEEE/ACS International Conference on Computer Systems and Applications (AICCSA) > 293 - 294

2011 9th IEEE/ACS International Conference on Computer Systems and Applications (AICCSA)

The process of text categorization has been used in many applications and areas. Classifying Arabic texts differs from classifying English texts because Arabic is a highly inflectional and derivational language, which makes morphological analysis a very complex task. This short paper reviews some research on Arabic text categorization, and recent work on adopting rough sets...

chapter

A global evaluation criterion for feature selection in text categorization using Kullback-Leibler divergence

Zhilong Zhen, Xiaoqin Zeng, Haijuan Wang, Lixin Han

2011 International Conference of Soft Computing and Pattern Recognition (SoCPaR) > 440 - 445

2011 International Conference of Soft Computing and Pattern Recognition

A major difficulty of text categorization is the extremely high dimensionality of the text feature space. Feature selection techniques are desirable for large-scale text categorization tasks to improve accuracy and efficiency. The χ² statistic and simplified χ² are two effective feature selection methods in text categorization. Using these two feature selection criteria, for a term, one needs to...

chapter

A Web Page Classification Algorithm Based on Link Information

Zhaohui Xu, Fuliang Yan, Jie Qin, Haifeng Zhu

2011 10th International Symposium on Distributed Computing and Applications to Business, Engineering and Science > 82 - 86

2011 Tenth International Symposium on Distributed Computing and Applications to Business, Engineering and Science (DCABES)

Effective classification of web pages can improve the quality of information retrieval. Traditional classification algorithms are mostly based on analysis of web content, but web page content is complicated and filled with a large amount of false or erroneous information, which seriously affects the accuracy of classifying network information. To solve this problem, this...

Keywords:
CLASSIFICATION ALGORITHMS
Publication type:
book


INFONA - science communication portal
