Search results

Items from 1 to 20 out of 68 results

chapter

A supervised ranking approach for detecting relationally similar word pairs

D Bollegala

2010 Fifth International Conference on Information and Automation for Sustainability > 323 - 328

2010 5th International Conference on Information and Automation for Sustainability (ICIAfS)

The similarity between the semantic relations that exist between two word pairs is defined as their relational similarity. For example, the semantic relation "is a large" holds between the words in the word pair (lion, cat) and (ostrich, bird), because lion is a large cat, and ostrich is the largest living bird on earth. Consequently, the two word pairs, (lion, cat) and (ostrich, bird), are considered...

chapter

Sentiment classification for stock news

Yang Gao, Li Zhou, Yong Zhang, Chunxiao Xing, more

5th International Conference on Pervasive Computing and Applications > 99 - 104

2010 5th International Conference on Pervasive Computing and Applications (ICPCA 2010)

Web news articles play an important role in the stock market. Sentiment classification of news articles can help investors make investment decisions more efficiently. In this paper, we implemented an approach to Chinese new word detection using an N-gram model and applied the result to Chinese word segmentation and sentiment classification. Appraisal theory was introduced into sentiment analysis...

chapter

Review of language identification techniques

P Roy, P K Das

2010 IEEE International Conference on Computational Intelligence and Computing Research > 1 - 4

2010 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC 2010)

Language identification (LID) has long been regarded as a fascinating field of study. Studies on language identification have been carried out since the early 1970s, and a great deal of research has been done in this area to date. In this paper a few of the papers are highlighted and reviewed based on the past history and the current state of research on various techniques that have been applied for...

chapter

Automatic lexical stress detection for Chinese learners' of English

Jin-Yu Chen, Lan Wang

2010 7th International Symposium on Chinese Spoken Language Processing > 407 - 411

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

This paper investigates lexical stress detection for Chinese learners of English, where a combined differential acoustic feature is developed to represent the lexical stress of polysyllabic words in continuous speech. The use of frame-averaged feature and the contextual information intra-word can be input to the classifiers without normalization. The word-based stress detection method proposed in...

chapter

Thai-English spam SMS filtering

C Khemapatapan

2010 16th Asia-Pacific Conference on Communications (APCC) > 226 - 230

2010 16th Asia-Pacific Conference on Communications (APCC 2010)

SMS spam filtering for Thai-English language has not previously been studied and implemented. Two methods of spam SMS message filtering, intended to filter spam SMS messages written in Thai and English, have been studied and implemented. The first method simply uses current English spam message filtering and then upgrades it for Thai language support. The second one applies text normalization, word segmentation...

chapter

Extracting Parallel Texts from the Web

Le Quang Hung, Le Anh Cuong

2010 Second International Conference on Knowledge and Systems Engineering > 147 - 151

2010 Second International Conference on Knowledge and Systems Engineering (KSE)

A parallel corpus is a valuable resource for some important applications of natural language processing such as statistical machine translation, dictionary construction, and cross-language information retrieval. The Web is a huge resource of knowledge, which partly contains bilingual information in various kinds of web pages. It currently attracts many studies on building parallel corpora based on the...

chapter

Chinese Chunk Recognition Using HMSVM Method

Wang Zhong-Hua, Qi Hui

2010 International Conference on Artificial Intelligence and Computational Intelligence > 1 > 3 - 7

2010 International Conference on Artificial Intelligence and Computational Intelligence (AICI 2010)

Hidden Markov Support Vector Machines is a novel structural SVMs model. Its efficiency has been proved in label sequence learning task such as English text chunking. In this paper, we treat Chinese chunk recognition as a label sequence learning problem. After giving the definition of Chinese chunk, we apply HMSVM to solve Chinese chunk problem. The results of experiment show that it achieves a better...

chapter

Web Text Categorization for Large-scale Corpus

Zhijuan Jia, Jianbo Mu

2010 International Conference on Computer Application and System Modeling (ICCASM 2010) > 8 > V8-188 - V8-191

2010 International Conference on Computer Application and System Modeling (ICCASM 2010)

Corpus is the set of language materials which are stored in computers and can use computers to search, query and analyze for enterprise decision-makers. Automated text categorization has been extensively studied and various techniques for document categorization. But based on the current scarcity of Chinese corpus, especially in the field of text categorization, the Chinese categorization corpus is...

chapter

Feature selection for Chinese Text Categorization based on improved particle swarm optimization

Yaohong Jin, Wen Xiong, Cong Wang

Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010) > 1 - 6

2010 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE 2010)

Feature selection is an important preprocessing step of Chinese Text Categorization, which reduces the high dimension and keeps the reduced results comprehensible compared to feature extraction. A novel criterion to filter features coarsely is proposed, which integrates the strengths of term frequency-inverse document frequency as an inner-class measure and chi-square as an inter-class measure, and a new feature...

chapter

Holistic Urdu Handwritten Word Recognition Using Support Vector Machine

Malik Waqas Sagheer, Chun Lei He, Nicola Nobile, Ching Y Suen

2010 20th International Conference on Pattern Recognition > 1900 - 1903

2010 20th International Conference on Pattern Recognition (ICPR 2010)

Since the Urdu language has more isolated letters than Arabic and Farsi, a research on Urdu handwritten word is desired. This is a novel approach to use the compound features and a Support Vector Machine (SVM) in offline Urdu word recognition. Due to the cursive style in Urdu, a classification using a holistic approach is adapted efficiently. Compound feature sets, which involves in structural and...

chapter

A hybrid textual entailment system using lexical and syntactic features

Partha Pakray, Sivaji Bandyopadhyay, Alexander Gelbukh

9th IEEE International Conference on Cognitive Informatics (ICCI'10) > 291 - 296

2010 9th IEEE International Conference on Cognitive Informatics (ICCI)

A two-way textual entailment (TE) recognition system that uses lexical and syntactic features has been described in this paper. The hybrid TE system is based on the Support Vector Machine that uses twenty three features for lexical similarity and the output tag from a rule based syntactic two-way TE system as another feature. The important lexical features that are used in the present system are:...

chapter

Language identification using Fuzzy-SVM technique

G Mishra, S L Nitharwal, S Kaur

2010 Second International conference on Computing, Communication and Networking Technologies > 1 - 5

2010 International Conference on Computing, Communication and Networking Technologies (ICCCNT'10)

Language Identification is an important issue in today's multilingual world. In this paper we have analyzed Fuzzy-SVM technique for identification of romanized plaintexts of five Indian regional languages namely Hindi, Bangla, Manipuri, Urdu and Kashmiri. Distinguishing features/characteristics have been extracted from romanized plaintexts of each of these five languages and represented suitably through...

chapter

Extracting Biomarker Information Applying Natural Language Processing and Machine Learning

Md Tawhidul Islam, Mostafa Shaikh, Abhaya Nayak, Shoba Ranganathan

2010 4th International Conference on Bioinformatics and Biomedical Engineering > 1 - 4

2010 4th International Conference on Bioinformatics and Biomedical Engineering (iCBBE 2010)

In this paper, we detail an approach to a very specific task of information extraction namely, extracting biomarker information in biomedical literature. Starting with the abstract of a given publication, we first identify the evaluative sentence(s) among other sentences by recognizing words and phrases in the text belonging to semantic categories of interest to bio-medical entities (i.e., semantic...

chapter

Text categorization algorithms representations based on inductive learning

Cao Jian-fang, Wang Hong-bin

2010 2nd IEEE International Conference on Information Management and Engineering > 352 - 355

2010 2nd IEEE International Conference on Information Management and Engineering (ICIME 2010)

Text categorization, the assignment of natural language texts to one or more predefined categories based on their content, is an important component in many information organization and management tasks. Categorization algorithm is the most critical factor to text categorization system performance. The inductive learning classifiers are put forward. Very accurate text categorization result can be learned...

chapter

Applying latent semantic analysis to classify emotions in Thai text

P Inrak, S Sinthupinyo

2010 2nd International Conference on Computer Engineering and Technology > 6 > V6-450 - V6-454

2010 2nd International Conference on Computer Engineering and Technology (ICCET)

With a rapid growth of the internet communication, many types of text are produced. They can convey the meanings that can contribute to text categorization. Emotion classification also becomes more interesting, but emotion classification in Thai text is still not able to be correctly classified. Thus, this paper proposes a novel approach that takes advantage of bi-words occurrence to classify emotion...

chapter

Summarization- and learning-based approaches to information distillation

Boriska Toth, Dilek Hakkani-Tür, Sibel Yaman

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5306 - 5309

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

Information distillation is the task that aims to extract relevant passages of text from massive volumes of textual and audio sources, given a query. In this paper, we investigate two perspectives that use shallow language processing for answering open-ended distillation queries, such as “List me facts about [event]”. The first approach is a summarization-based approach that uses the unsupervised...

chapter

Sentiment Classification Based on Ontology and SVM Classifier

K.P.P. Shein, T.T.S. Nyunt

2010 Second International Conference on Communication Software and Networks > 169 - 172

2010 Second International Conference on Communication Software and Networks (ICCSN 2010)

There are a lot of text documents on the Web which contain opinions or sentiments about an object, such as software reviews, product reviews, movie reviews, music reviews, and book reviews. Opinion mining or sentiment classification aims to extract the features on which the reviewers express their opinions and determine whether they are positive or negative. In this paper we proposed an ontology based...

chapter

The sensitive feature selection for both English and Chinese text chunking

Liang Ying-Hong, Li Jin-xiang, Zhou De-fu, Wang De-peng

2010 The 2nd International Conference on Computer and Automation Engineering (ICCAE) > 4 > 305 - 309

2nd International Conference on Computer and Automation Engineering (ICCAE 2010)

Traditional text chunking approach is to identify many phrases using only one model, and the same features are used to identify these phrases too. So the helpful features of each phrase are ignored. In fact, different phrases have different helpful features. In this paper, the concept of "sensitive feature" is proposed, and the sensitive features of eleven English types and seven Chinese types of...

chapter

Applying machine learning algorithms for automatic Persian text classification

Mojgan Farhoodi, Alireza Yari

2010 6th International Conference on Advanced Information Management and Service (IMS) > 318 - 323

2010 6th International Conference on Advanced Information Management and Service (IMS 2010)

Automatic document classification due to its various applications in data mining and information technology is one of the important topics in computer science. Classification plays a vital role in many information management and retrieval tasks. Document classification, also known as document categorization, is the process of assigning a document to one or more predefined category labels. Classification...

chapter

Tagger voting improves morphosyntactic tagging accuracy on Croatian texts

Z Agić, M Tadić, Zdravko Dovedan

Proceedings of the ITI 2010, 32nd International Conference on Information Technology Interfaces > 61 - 66

2010 32nd International Conference on Information Technology Interfaces (ITI 2010)

We present results of an experiment dealing with combining outputs of five part-of-speech taggers via tagger voting in order to improve the overall accuracy of morphosyntactic tagging of Croatian texts using a subset of the Multext-East v3 tagset. The increase in accuracy over the best-performing single tagger is shown to exist, but not to be statistically significant. We discuss the performance of...

Keywords:
SUPPORT VECTOR MACHINES
TEXT ANALYSIS
NATURAL LANGUAGE PROCESSING

Content availability

Available (66)
None (2)

Keywords

FEATURE EXTRACTION (31)
SUPPORT VECTOR MACHINE (28)
TRAINING (25)
DATA MINING (24)
TEXT CATEGORIZATION (22)
CLASSIFICATION ALGORITHMS (21)
PATTERN CLASSIFICATION (18)
ACCURACY (15)
LEARNING (ARTIFICIAL INTELLIGENCE) (15)
CLASSIFICATION (14)
MACHINE LEARNING (14)
SVM (14)
KERNEL (12)
FEATURE SELECTION (9)
TEXT CLASSIFICATION (9)
SPEECH (8)
SUPPORT VECTOR MACHINE CLASSIFICATION (8)
COMPUTATIONAL LINGUISTICS (7)
INFORMATION RETRIEVAL (7)
BAYES METHODS (5)
CHINESE TEXT CATEGORIZATION (5)
SEMANTICS (5)
TAGGING (5)
WORD PROCESSING (5)
DICTIONARIES (4)
ENTROPY (4)
GRAMMARS (4)
HIDDEN MARKOV MODELS (4)
INTERNET (4)
MEDICAL INFORMATION SYSTEMS (4)
NAIVE BAYES (4)
NIOBIUM (4)
PROBABILITY DENSITY FUNCTION (4)
SENTIMENT CLASSIFICATION (4)
SVM CLASSIFIER (4)
WORD SENSE DISAMBIGUATION (4)
ARRAYS (3)
CHINESE TEXT (3)
CONFERENCES (3)
CONTEXT (3)
DATABASES (3)
DOCUMENT CATEGORIZATION (3)
DOCUMENT CLASSIFICATION (3)
ELECTRONIC MESSAGING (3)
EQUATIONS (3)
FEATURE SELECTION METHOD (3)
INFORMATION EXTRACTION (3)
INFORMATION TECHNOLOGY (3)
MACHINE LEARNING ALGORITHMS (3)
MEASUREMENT (3)
NATURAL LANGUAGE (3)
NATURAL LANGUAGES (3)
SPEECH PROCESSING (3)
STATISTICAL ANALYSIS (3)
TEXT MINING (3)
TEXT PROCESSING (3)
WEB PAGES (3)
ART (2)
ARTIFICIAL INTELLIGENCE (2)
BAYESIAN METHODS (2)
BELIEF NETWORKS (2)
BIOLOGY COMPUTING (2)
BIOMEDICAL LITERATURE (2)
BOOK REVIEWS (2)
CANCER (2)
CHARACTER RECOGNITION (2)
COMPUTER SCIENCE (2)
COMPUTERS (2)
CONCEPT DESCRIPTION LANGUAGE (2)
DATA ANALYSIS (2)
DECISION TREE (2)
DECISION TREES (2)
DEPENDENCY PARSING (2)
DIMENSION REDUCTION (2)
DIMENSIONALITY REDUCTION (2)
DISTANCE MEASUREMENT (2)
DOCUMENT HANDLING (2)
DOCUMENT REPRESENTATION (2)
EDUCATIONAL INSTITUTIONS (2)
EMOTION RECOGNITION (2)
ENGLISH LANGUAGE (2)
ERROR ANALYSIS (2)
FREE-TEXT HISTOLOGY REPORTS (2)
GRAPH THEORY (2)
HANDWRITING RECOGNITION (2)
HANDWRITTEN CHARACTER RECOGNITION (2)
IMAGE CLASSIFICATION (2)
INFORMATION GAIN (2)
INFORMATION MANAGEMENT (2)
INFORMATION ORGANIZATION (2)
K-NEAREST NEIGHBOR (2)
KERNEL METHOD (2)
KNN (2)
KNOWLEDGE BASED SYSTEMS (2)
LABELING (2)
LEARNING (2)
LEXICAL FEATURES (2)

INFONA - science communication portal
