Search results

Items from 41 to 49 out of 49 results

chapter

An Improved Algorithm for Multiclass Text Categorization with Support Vector Machine

Fubo Shao, Guoping He, Xin Zhang

2008 International Symposium on Computational Intelligence and Design > 1 > 336 - 339

Automated text categorization is attractive because it frees organizations from the need to manually organize document bases. Support Vector Machine (SVM) is an efficient technique for text categorization. Computing the kernel matrix is the key step in text categorization with SVM. When the number of kinds of text is large, the matrix of texts becomes sparse. If we compute the kernel matrix directly, it will waste...

chapter

Using Linguistic Information to Classify Portuguese Text Documents

T. Goncalves, P. Quaresma

2008 Seventh Mexican International Conference on Artificial Intelligence > 94 - 100

2008 Seventh Mexican International Conference on Artificial Intelligence (MICAI)

This paper examines the role of various linguistic structures in text classification, applying the study to the Portuguese language. Besides using a bag-of-words representation, where we evaluate different measures and use linguistic knowledge for term selection, we run several experiments using syntactic information, representing documents as strings of words and strings of syntactic parse trees. To...

chapter

Exploiting syntactic and semantic information in coarse chinese question classification

Xin Kang, Xiaojie Wang, Fuji Ren

2008 International Conference on Natural Language Processing and Knowledge Engineering > 1 - 7

2008 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE)

Recent years have seen great progress in studying English question classification. In our research, we learn Chinese question classification by exploiting the results of lexical, syntactic and semantic parsing on question sentences. Support vector machines are adopted to train a classifier on 6 coarse categories, using single parsing results and combinations of different parsing results as features. We find that even...

chapter

Arabic part-of-speech tagger based Support Vectors Machines

J.H. Yousif, T. Sembok

2008 International Symposium on Information Technology > 3 > 1 - 7

Support vector machines (SVMs) and related kernel methods have become widely known tools for text mining tasks such as classification and regression. An Arabic part-of-speech (POS) tagger based on support vector machines is designed and implemented. The NeuroSolutions software is used to implement and train the proposed tagger. The radial basis function (RBF) is used as a linear function approximator. The experiments...

chapter

A Hybrid Text Classification Model based on Rough Sets and Genetic Algorithms

Xiaoyue Wang, Zhen Hua, Rujiang Bai

2008 Ninth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing > 971 - 977

2008 Ninth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD)

Automatic categorization of documents into pre-defined taxonomies is a crucial step in data mining and knowledge discovery. Standard machine learning techniques like support vector machines (SVMs) and related large margin methods have been successfully applied to this task. Unfortunately, the high dimensionality of input feature vectors slows classification. The kernel parameters setting...

chapter

Classifiers based on Bernoulli mixture models for text mining and handwriting recognition tasks

M. Saeed, H. Babri

2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence) > 2169 - 2175

In this paper we describe a model for classifying binary data using classifiers based on Bernoulli mixture models. We show how Bernoulli mixtures can be used for feature extraction and dimensionality reduction of raw input data. The extracted features are then used for training a classifier for supervised labeling of individual sample points. We have applied this method to two different types of datasets,...

chapter

A local Latent Semantic Analysis-based kernel for document similarities

S. Aseervatham

2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence) > 214 - 219

The document similarity measure is a key point in textual data processing. It is chiefly responsible for the performance of a processing system. For the past decade, kernels have been used as similarity functions within inner-product-based algorithms such as the SVM for NLP problems, especially text categorization. In this paper, we present a semantic space constructed from latent concepts. The concepts...

chapter

A Platform Framework for Cross-Lingual Text Relatedness Evaluation and Plagiarism Detection

Chung-Hong Lee, Chih-Hong Wu, Hsin-Chang Yang

2008 3rd International Conference on Innovative Computing Information and Control > 303

2008 3rd International Conference on Innovative Computing Information and Control (ICICIC)

Research work on plagiarism detection methods for monolingual texts (e.g. English texts) has been well established in recent years. However, little attention has been paid to plagiarism detection in cross-lingual text collections (e.g. English and Chinese texts). In this paper we present a system platform for evaluating text similarity and relatedness in multilingual...

article

A Discriminative Kernel-Based Approach to Rank Images from Text Queries

D. Grangier, S. Bengio

IEEE Transactions on Pattern Analysis and Machine Intelligence > 2008 > 30 > 8 > 1371 - 1384

This paper introduces a discriminative model for the retrieval of images from text queries. Our approach formalizes the retrieval task as a ranking problem, and introduces a learning procedure optimizing a criterion related to the ranking performance. The proposed model hence addresses the retrieval problem directly and does not rely on an intermediate image annotation task, which contrasts with previous...

Keywords:
TEXT ANALYSIS
KERNEL

INFONA - science communication portal
