Search results

Items 121 to 140 of 708 results

chapter

Bengali parts-of-speech tagging using Global Linear Model

Sankar Mukherjee, Shyamal Kumar Das Mandal

2013 Annual IEEE India Conference (INDICON) > 1 - 4

2013 Annual IEEE India Conference (INDICON)

The paper describes automatic part-of-speech tagging for Bengali sentences using a Global Linear Model (GLM), which learns to represent the whole sentence through a feature vector called the global feature. The tagger has been trained using the averaged perceptron algorithm. Its performance has been compared to Conditional Random Field (CRF), Support Vector Machine (SVM), Hidden Markov Model (HMM)...

chapter

Effect of Verb Subdivision and Noun Incorporation on Dependency Parsing

Hongsheng Wang, Rui Xiao, Yu'e Li

2013 6th International Conference on Intelligent Networks and Intelligent Systems > 1 - 4

2013 6th International Conference on Intelligent Networks and Intelligent Systems (ICINIS)

Treebank-based parsing is a central issue in current natural language processing. This work adopts the SVM machine learning method and the HIT-IR-CDT dependency treebank. To increase parsing accuracy by linguistic means, verb subdivision and noun incorporation are applied. The results show that, after verb subdivision, the unlabeled attachment score increases from...

chapter

Entailment analysis for improving Chinese textual entailment system

Shih-Hung Wu, Shan-Shun Yang, Hung-Sheng Chiu, Liang-Pu Chen, et al.

2013 IEEE 14th International Conference on Information Reuse & Integration (IRI) > 75 - 81

2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)

Textual Entailment (TE) is a critical issue in natural language processing (NLP); many NLP applications can benefit from the recognition of textual entailment (RTE). In this paper we report our observations on how to improve the Chinese textual entailment system and the experimental results on the NTCIR-10 RITE-2 dataset. To complement the traditional machine learning approach, which treats every...

chapter

N-gram based algorithm for distinguishing between Hindi and Sanskrit texts

C Sreejith, M Indu, P C Reghu Raj

2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT) > 1 - 4

2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT)

Language Identification (LI) is the process of determining the natural language in which the given content is written. It is an important preprocessing step in many tasks of Natural Language Processing (NLP). In a multilingual society like India, automatic language identification has a wider scope, since it would be a vital step in bridging the digital divide between the Indian masses and others....

chapter

Extraction of gene regulatory networks from biological literature

Karthik Tangirala, Doina Caragea

2013 IEEE 3rd International Conference on Computational Advances in Bio and Medical Sciences (ICCABS) > 1 - 6

2013 IEEE 3rd International Conference on Computational Advances in Bio and Medical Sciences (ICCABS)

A gene regulatory network (GRN) is a network of interacting cellular components. The components are genes and their products, and the interactions represent regulatory relationships among genes, specifically activation and inhibition of gene expression, under certain conditions. Many regulatory relationships are known in the literature. However, assembling isolated relationships into networks is a...

chapter

Arabic Topic Detection using automatic text summarisation

Rim Koulali, Mahmoud El-Haj, Abdelouafi Meziane

2013 ACS International Conference on Computer Systems and Applications (AICCSA) > 1 - 4

2013 ACS International Conference on Computer Systems and Applications (AICCSA)

With the exponential growth of Arabic documents available online, classifying and processing large Arabic corpora has become a challenging task. The presence of noisy information embedded in these documents has made it even more difficult to get accurate results when applying a Topic Detection (TD) process. To address this problem, a proper feature selection approach is needed to enhance the...

chapter

Studying the impact of various features on the performance of Conditional Random Field-based Arabic Named Entity Recognition

Alia Morsi, Ahmed Rafea

2013 ACS International Conference on Computer Systems and Applications (AICCSA) > 1 - 5

2013 ACS International Conference on Computer Systems and Applications (AICCSA)

The task of Named Entity Recognition (NER) is crucial to Natural Language Processing (NLP). NER can be defined as the computational identification and classification of Named Entities in running text. The importance of NER stems from the variety of Natural Language Processing applications where accurate NLP would be highly useful. These include machine translation and information extraction. In this...

chapter

Detection of wordplay generated by reproduction of letters in social media texts

Pawanrat Hirankan, Atiwong Suchato, Proadpran Punyabukkana

The 2013 10th International Joint Conference on Computer Science and Software Engineering (JCSSE) > 6 - 10

2013 10th International Joint Conference on Computer Science and Software Engineering (JCSSE)

Wordplay generated by repeating letters of the original word is commonly found in social network texts. Most of the time, wordplay items of this type are ambiguous to machines in language processing tasks such as Text-to-Speech. This paper shows some statistics on the number of letters from 102,586 real social network text items and proposes a set of classification features together with a few...

chapter

The impact of Arabic inter-character proximity and similarity on spell-checking

Hicham Gueddah, Abdallah Yousfi

2013 8th International Conference on Intelligent Systems: Theories and Applications (SITA) > 1 - 4

2013 8th International Conference on Intelligent Systems: Theories and Applications (SITA)

A statistical study of the typographical errors made when typing documents in Arabic found that most of these typos are character permutation errors, accounting for 65% of all errors.

chapter

Stop-words in keyphrase extraction problem

S. Popova, L. Kovriguina, D. Mouromtsev, I. Khodyrev

14th Conference of Open Innovation Association FRUCT > 113 - 121

2013 14th Conference of Open Innovations Association (FRUCT)

Keyword extraction is one of the most significant tasks in information retrieval. High-quality keyword extraction significantly influences progress in the following subtasks of information retrieval: classification and clustering, data mining, knowledge extraction and representation, etc. The research environment has specified a layout for keyphrase extraction. However, some of the possible...

chapter

Domain Adaptation Using Domain Similarity- and Domain Complexity-Based Instance Selection for Cross-Domain Sentiment Analysis

Robert Remus

2012 IEEE 12th International Conference on Data Mining Workshops > 717 - 723

2012 IEEE 12th International Conference on Data Mining Workshops

We propose an approach to domain adaptation that selects the instances from a source domain training set that are most similar to a target domain. The factor by which the original source domain training set size is reduced is determined automatically by measuring domain similarity between source and target domain as well as their domain complexity variance. Domain similarity is measured as divergence...

chapter

Learning Domain-Specific Polarity Lexicons

Gulsen Demiroz, Berrin Yanikoglu, Dilek Tapucu, Yucel Saygin

2012 IEEE 12th International Conference on Data Mining Workshops > 674 - 679

2012 IEEE 12th International Conference on Data Mining Workshops

Sentiment analysis aims to automatically estimate the sentiment in a given text as positive or negative. Polarity lexicons, often used in sentiment analysis, indicate how positive or negative each term in the lexicon is. However, since creating domain-specific polarity lexicons is expensive and time consuming, researchers often use a general purpose or domain independent lexicon. In this work, we...

chapter

A Knowledge Discovery Methodology for Semantic Categorization of Unstructured Textual Sources

Daniele Toti, Paolo Atzeni, Fabio Polticelli

2012 Eighth International Conference on Signal Image Technology and Internet Based Systems > 944 - 951

2012 Eighth International Conference on Signal-Image Technology & Internet-Based Systems (SITIS 2012)

We describe a methodology for identifying characterizing terms from a source text or paper and automatically building an ontology around them, with the purpose of semantically categorizing a paper corpus where documents sharing similar subjects may be subsequently clustered together by means of ontology alignment. We first employ a Natural Language Processing pipeline to extract relevant terms from...

chapter

A Pointwise Approach for Vietnamese Diacritics Restoration

Tuan Anh Luu, Kazuhide Yamamoto

2012 International Conference on Asian Language Processing > 189 - 192

2012 International Conference on Asian Language Processing (IALP)

The automatic insertion of diacritics in electronic texts is necessary for a number of languages, including French, Romanian, Croatian, Sindhi, Vietnamese, etc. When diacritics are removed from a word and the resulting string of characters is not a word, it is easy to recover the diacritics. However, sometimes the resulting string is also a word, possibly with different grammatical properties or a...

chapter

Multi-view Learning for Semi-supervised Sentiment Classification

Yan Su, Shoushan Li, Shengfeng Ju, Guodong Zhou, et al.

2012 International Conference on Asian Language Processing > 13 - 16

2012 International Conference on Asian Language Processing (IALP)

The standard supervised approach to sentiment classification requires a large amount of manually labeled data, which is costly and time-consuming to obtain. To tackle this problem, we propose a novel semi-supervised learning method based on multi-view learning. The main idea of our approach is to generate multiple views by exploiting both feature partition and language translation strategies and then standard...

chapter

An empirical study to address the problem of Unbalanced Data Sets in sentiment classification

Asmaa Mountassir, Houda Benbrahim, Ilham Berrada

2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC) > 3298 - 3303

2012 IEEE International Conference on Systems, Man and Cybernetics - SMC

With the emergence of Web 2.0, Sentiment Analysis is receiving more and more attention. Several interesting studies have addressed different issues in Sentiment Analysis. Nevertheless, the problem of Unbalanced Data Sets has not been sufficiently tackled within this research area. This paper presents the study we have carried out to address the problem of unbalanced data sets in supervised sentiment...

chapter

Twitter part-of-speech tagging using pre-classification Hidden Markov model

Shichang Sun, Hongbo Liu, Hongfei Lin, Ajith Abraham

2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC) > 1118 - 1123

2012 IEEE International Conference on Systems, Man and Cybernetics - SMC

Hidden Markov models (HMM) have been widely used in natural language processing (NLP), especially in syntactic-level applications, which appear naturally as short-range-dependent sequence recognition problems. But the structure of HMM limits the use of global knowledge, including the sentiment analysis of the text, which has become an increasingly popular research topic in NLP. In this paper,...

chapter

Word sense disambiguation in Mongolian language

Batzolboo Bataa, Khuder Altangerel

2012 7th International Forum on Strategic Technology (IFOST) > 1 - 4

2012 7th International Forum on Strategic Technology (IFOST)

Word sense disambiguation is an important intermediate stage for many natural language processing applications, especially transformation from Cyrillic into Mongolian script. A word sense can be disambiguated by other words in the context, such as nouns and verbs used with the word. In this research, we have analyzed the result of an experiment on a word disambiguation system for the Mongolian language based...

chapter

Automatic Evaluation of Document Classification Using N-Gram Statistics

Dongjin Choi, Byeongkyu Ko, Eunji Lee, Myunggwon Hwang, et al.

2012 15th International Conference on Network-Based Information Systems > 739 - 742

2012 15th International Conference on Network-Based Information Systems (NBiS)

Due to the development of World Wide Web technologies, people live in an environment flooded with trillions of web pages at every moment. The size of the web has been increasing dramatically. For this reason, it is getting more difficult to find relevant web documents corresponding to what users want to read. Classifying documents into predefined categories is one of the most important tasks in Natural...

chapter

Improving Binary-class Chinese Textural Entailment by monolingual machine translation technology

Shan-Shun Yang, Shih-Hung Wu, Liang-Pu Chen, Wen-Tai Hsieh, et al.

2012 IEEE 13th International Conference on Information Reuse & Integration (IRI) > 65 - 68

2012 IEEE 13th International Conference on Information Reuse & Integration (IRI)

In this paper, we describe how we improve our system for Chinese Textual Entailment Recognition using a monolingual machine translation system. Previously, our approach was based on standard supervised learning classification. We integrate the result of the monolingual machine translation system with the other available computational linguistic resources of Chinese language processing to build the system...

Search keywords: TRAINING, NATURAL LANGUAGE PROCESSING

INFONA - science communication portal