Search results

Items from 1 to 20 out of 21 results

chapter

Study on question classification approach mixing multiple semantic characteristics together

LiGuo Duan, YanQin Niu, JunJie Chen

2011 3rd International Conference on Computer Research and Development > 1 > 354 - 357

2011 3rd International Conference on Computer Research and Development (ICCRD 2011)

This article proposes a question classification approach that integrates multiple semantic features. It targets two problems in Chinese question classification models: inaccurate extraction of semantic information and slow processing caused by an excessively high eigenvector dimension. With the help of HowNet and the support vector machine and syntactic and semantic information of question...

chapter

Job Opportunity Mining by Text Categorization

Shilin Zhang, Mei Gu

2010 2nd International Conference on Information Engineering and Computer Science > 1 - 4

2010 2nd International Conference on Information Engineering and Computer Science (ICIECS)

Text classification is an important field of research. There are a number of approaches to classifying text documents. However, improving computational efficiency and recall remains an important challenge. In this paper, we propose a novel framework to segment Chinese words, generate word vectors, train on the corpus and make predictions. Based on the text classification technology, we successfully...

chapter

Affective-word based Chinese text sentiment classification

Yue Ning, Tingshao Zhu, Yan Wang

5th International Conference on Pervasive Computing and Applications > 111 - 115

2010 5th International Conference on Pervasive Computing and Applications (ICPCA 2010)

When browsing news on the web, various emotions may be evoked in readers and may in turn influence their minds and lives in different ways. We expect that emotional analysis and classification of text may provide good performance and significance to users surfing the Internet. Most previous research focuses only on bi-emotion classification, that is, Positive and Negative, e.g., identifying whether...

chapter

Automatic extraction and classification approach of opinions in texts

Rihab Bouchlaghem, Aymen Elkhlifi, Rim Faiz

2010 10th International Conference on Intelligent Systems Design and Applications > 918 - 922

10th International Conference on Intelligent Systems Design and Applications (ISDA 2010)

In this paper, we present an approach to automatically extract and classify opinions in texts. We propose a similarity measure that calculates semantic distances between a word and predefined subgroups of seed words. We have evaluated our algorithm on the “SemEval 2007” semantic evaluation corpus, and we obtained best Precision and F1 values of 62% and 61%, respectively. As an improvement of 20 %...

chapter

Chinese Web Text Classification System Model Based on Naive Bayes

Gong Zheng, Yu Tian

2010 International Conference on E-Product E-Service and E-Entertainment > 1 - 4

2010 International Conference on E-Product E-Service and E-Entertainment (ICEEE 2010)

Web text classification is the process of automatically determining the type of a text under a given classification scheme, according to the text content. A web text categorization system draws on machine learning, knowledge engineering and other related fields: it accesses text on the web and, after text preprocessing, Chinese word segmentation and classifier training, uses a classification algorithm...

chapter

Using genetic algorithms in word-vector optimisation

P W H Smith

2010 UK Workshop on Computational Intelligence (UKCI) > 1 - 5

2010 UK Workshop on Computational Intelligence (UKCI)

Word vectors and sets of words are used in a wide range of text-based applications. Yet these word sets are often chosen on an ad hoc basis. In this study, we examine two text-based applications that use word sets and in both cases find that classification performance can be optimised using a fairly simple genetic algorithm. The first study is in authorship attribution, the second one is sentiment...

chapter

Discriminating the Machine-Printed and Hand-Written Words Based on Legibility

Shahin Akbarpour, Md Sulaiman, Norwati Mustapha, Rahmita Wirza Rahmat

2010 Seventh International Conference on Information Technology: New Generations > 364 - 369

Seventh International Conference on Information Technology: New Generations (ITNG 2010)

Discriminating machine-printed from hand-written words is regarded as a major problem in the recognition of mixed texts. The objective of this study is to present a new method to distinguish between machine-printed words and hand-written words using a novel statistical feature based on legibility and a discriminator threshold. Because of the hand trembling, sudden uncontrollable movement of hand and...

chapter

Applying latent semantic analysis to classify emotions in Thai text

P Inrak, S Sinthupinyo

2010 2nd International Conference on Computer Engineering and Technology > 6 > V6-450 - V6-454

2010 2nd International Conference on Computer Engineering and Technology (ICCET)

With the rapid growth of Internet communication, many types of text are produced. They can convey meanings that can contribute to text categorization. Emotion classification is also attracting more interest, but emotion in Thai text still cannot be classified correctly. Thus, this paper proposes a novel approach that takes advantage of bi-word occurrence to classify emotion...

chapter

A new topic-bridged model for transfer learning

Meng-Sung Wu, Jen-Tzung Chien

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5346 - 5349

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

In real-world information systems, there are abundant unlabeled data but sparse labeled data. It is challenging to construct an adaptive model to classify a large number of documents covering different domains. Classifiers trained on a source domain perform poorly on test data in a target domain due to the domain mismatch. In this study, we build a topic-bridged latent Dirichlet...

chapter

Commodity Classification in Hierarchies

Jie Shen, Cang Chen, Ying Gao

2009 International Conference on Wireless Networks and Information Systems > 267 - 269

2009 International Conference on Wireless Networks and Information Systems (WNIS 2009)

In e-commerce transactions, goods are classified according to a hierarchical structure, that is, a category tree. In the process of classification, we must consider the special features. For instance, when the brand name is used for categorization, the brand is a more distinctive characteristic. Based on this, we prepare a dictionary of brands for Chinese word segmentation on one hand...

chapter

The Application Research of Topic Word List In Text Automatic Classification

Huan Huang, Qingtang Liu, Linjing Wu, Tao Huang, et al.

2009 Second International Symposium on Knowledge Acquisition and Modeling > 2 > 111 - 114

2009 Second International Symposium on Knowledge Acquisition and Modeling (KAM 2009)

When traditional text classification technologies classify academic dissertations, the dimension of the extracted feature terms is high, and the terms cannot represent the theme of the thesis. This makes the efficiency very low and the accuracy poor. Topic words are small in quantity and reflect the theme of a thesis well. Accordingly, the paper proposes to extract the topic words with topic...

chapter

An Efficient Word Searching Algorithm through Splitting and Hashing the Offline Text

B. Singh, I. Yadav, S. Agarwal, R. Prasad

2009 International Conference on Advances in Recent Technologies in Communication and Computing > 387 - 389

2009 International Conference on Advances in Recent Technologies in Communication and Computing. ARTCom 2009

Word matching problem is to find all the occurrences of a pattern P[0...m-1] in the text T[0...n-1], where P neither contains any white space nor preceded and followed by space. In this paper, we assume that our text is offline. Ibrahiem et al. in 2008 have proposed an algorithm (WSA) for solving the word matching problem by splitting the offline text into number of tables in the preprocessing phase...

chapter

An Efficient Bit-Parallel Multi-Patterns Word Searching Algorithm through Splitting the Text

I. Yadav, B. Singh, S. Agarwal, R. Prasad

2009 International Conference on Advances in Recent Technologies in Communication and Computing > 406 - 410

2009 International Conference on Advances in Recent Technologies in Communication and Computing. ARTCom 2009

Word matching problem is to find all the occurrences of a pattern P[0...m-1] in the text T[0...n-1], where P neither contains any white space nor preceded and followed by space. In the multi-patterns word matching problem, all the occurrences of multiple words P₀, P₁, P₂, ..., Pᵣ₋₁ (r ≥ 1) in the given text T are to be reported. In the present discussion, we assume that all the patterns have equal size...

chapter

A Novel Method of Three Dimensional Text Representation

Jinzhu Hu, Chunxiu Xiong, Jiangbo Shu, Xing Zhou, et al.

2009 International Conference on Management and Service Science > 1 - 4

2009 International Conference on Management and Service Science (MASS)

In view of the high-dimensional, sparse features involved in storing textual documents, this paper puts forward a novel model that expresses text data in a 3-dimensional space. In this model, one dimension registers the count of feature words, another denotes the part of speech of the feature words, and the third records the count of textual documents; that is, the 3-dimensional space model expresses...

chapter

A Text Classification Method with an Effective Feature Extraction Based on Category Analysis

Yun Li, Yan Sheng, Luan Luan, Ling Chen

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery > 1 > 95 - 99

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2009)

Text classification refers to determining the class of an unknown text according to its content within a given classification system. In order to express as much of the information in the text as possible with fewer features, the paper analyzes the statistical properties of the various features and extracts the global features according to Zipf's law; and then, based on the statistical analysis of the...

chapter

Automatic Tamil Content Generation

S. Kohilavani, T. Mala, T.V. Geetha

2009 International Conference on Intelligent Agent & Multi-Agent Systems > 1 - 6

2009 International Conference on Intelligent Agent & Multi-Agent Systems (IAMA 2009)

Automatic content generation aims at developing an intelligent tutoring system in the Tamil language. This system focuses on delivering personalized content in Tamil to meet individual users' needs, based on their learning abilities and interests. This paper deals with automatic classification of Tamil documents and also the information extraction from those documents to construct the knowledge base...

chapter

A Novel Hybrid system for Large-Scale Chinese Text Classification Problem

Zhong Gao, Guanming Lu, Daquan Gu

2008 Japan-China Joint Workshop on Frontier of Computer Science and Technology > 121 - 124

2008 Japan-China Joint Workshop on Frontier of Computer Science and Technology

Most Chinese text classification systems are based on the bag of words (BW) technique, which is a valid probability tool for text representation and can provide a better semantic architecture. But the weakness in classification accuracy has still not been overcome. Support vector machine (SVM) has become a popular classification tool and can be applied in the scheme, but the main disadvantages...

chapter

A Feature Selection Method based on Improved TFIDF

Wei Yong-qing, Liu Pei-yu, Zhu Zhen-fang

2008 Third International Conference on Pervasive Computing and Applications > 1 > 94 - 97

2008 Third International Conference on Pervasive Computing and Applications (ICPCA08)

Feature selection is a valid method to reduce the vector dimension in a text categorization system. After analyzing several common evaluation functions for feature selection, we applied a term weight function to feature selection. A new evaluation function based on an improved TFIDF method is presented; in this function the category information is introduced to feature items, and the feature items of...

chapter

Using Linguistic Information to Classify Portuguese Text Documents

T. Goncalves, P. Quaresma

2008 Seventh Mexican International Conference on Artificial Intelligence > 94 - 100

2008 Seventh Mexican International Conference on Artificial Intelligence (MICAI)

This paper examines the role of various linguistic structures in text classification, applying the study to the Portuguese language. Besides using a bag-of-words representation where we evaluate different measures and use linguistic knowledge for term selection, we do several experiments using syntactic information, representing documents as strings of words and strings of syntactic parse trees. To...

chapter

Sequential Pattern Mining for Chinese E-mail Authorship Identification

Jianbin Ma, Ying Li, Guifa Teng, Fang Wang, et al.

2008 3rd International Conference on Innovative Computing Information and Control > 73

2008 3rd International Conference on Innovative Computing Information and Control (ICICIC)

With the rapid growth of computer technology and the popularization of the Internet, e-mail has become an economical and convenient form of communication. But different types of crime and civil action involving e-mail documents have appeared, harming people's lives and social stability. So the authorship of criminal e-mail has to be identified automatically for the purpose of computer forensics. To solve...

Keywords:
WORD PROCESSING
CLASSIFICATION ALGORITHMS
