Search results

Items from 1 to 6 out of 6 results

article

A comparative study of TF*IDF, LSI and multi-words for text classification

Wen Zhang, Taketoshi Yoshida, Xijin Tang

Expert Systems With Applications > 2011 > 38 > 3 > 2758-2765

One of the main themes in text mining is text representation, which is fundamental and indispensable for text-based intelligent information processing. Generally, text representation includes two tasks: indexing and weighting. This paper has comparatively studied TF*IDF, LSI and multi-word for text representation. We used a Chinese and an English document collection to respectively evaluate the three...

chapter

Use semantic meaning of coreference to improve classification text representation

Ziqiang Li, Mingtian Zhou

2010 2nd IEEE International Conference on Information Management and Engineering > 416 - 420

2010 2nd IEEE International Conference on Information Management and Engineering (ICIME 2010)

On large-scale datasets, the performance of automatic text classification is still far from perfect. It is commonly agreed that richer semantic meaning should be adopted in text representation to deal with this challenge. This paper introduces the semantic meaning of coreference to improve the traditional BOW representation. The result of the text classification experiment shows that, contrasted...

chapter

Increasing the Accuracy of Discriminative of Multinomial Bayesian Classifier in Text Classification

T. Mouratis, S. Kotsiantis

2009 Fourth International Conference on Computer Sciences and Convergence Information Technology > 1246 - 1251

2009 Fourth International Conference on Computer Sciences and Convergence Information Technology

Text classification plays an important role in information extraction and summarization, text retrieval, and question-answering. The discriminative multinomial naive Bayes classifier has been a focus of research in the field of text classification. This paper increases the accuracy of discriminative multinomial Bayesian classifier with the usage of the feature selection technique that evaluates the...

chapter

Text representation and classification based on multi-instance learning

He Wei, Wang Yu

2009 International Conference on Management Science and Engineering > 34 - 39

2009 16th International Conference on Management Science and Engineering (ICMSE)

In multi-instance learning, the training set comprises labeled bags which are composed of unlabeled instances, and the task is to predict the labels of unseen bags. In this paper, a text mining problem, i.e. text representation, is investigated from a multi-instance view. In detail, each text is regarded as a bag while each of its sentences is regarded as an instance. A bag can be labeled by its class...

chapter

A Novel Conception Based Texts Classification Method

Bai Rujiang, Liao Junhua

2009 International e-Conference on Advanced Science and Technology > 30 - 34

2009 International e-Conference on Advanced Science and Technology (AST 2009)

Text classification has been widely used to assist users with the discovery of useful information from the Internet. However, current text classification systems are based on the "Bag of Words" (BOW) representation, which only accounts for term frequency in the documents, and ignores important semantic relationships between key terms. To overcome this problem, previous work attempted to enrich...

chapter

TFIDF, LSI and multi-word in information retrieval and text categorization

Wen Zhang, T. Yoshida, Xijin Tang

2008 IEEE International Conference on Systems, Man and Cybernetics > 108 - 113

2008 IEEE International Conference on Systems, Man and Cybernetics (SMC 2008)

Text representation, which is a fundamental and necessary process for text-based intelligent information processing, includes the tasks of determining the index terms for documents and producing the numeric vectors corresponding to the documents. In this paper, multi-word, which is regarded as containing more contextual semantics than an individual word and possessing favorable statistical characteristics,...

Filter options

Keywords:
TEXT CATEGORIZATION
TEXT REPRESENTATION

INFONA - science communication portal
