Search results

Items from 1 to 20 out of 63 results

chapter

Gramatika: A grammar checker for the low-resourced Filipino language

Matthew Phillip Go, Nicco Nocon, Allan Borra

TENCON 2017 - 2017 IEEE Region 10 Conference > 471 - 475

This research focuses on the implementation of Gramatika, a grammar checker designed for the Filipino language given its available resources and linguistic tools. The checker uses hybrid n-grams generated from n-grams of words, part-of-speech tags, and lemmas of grammatically correct texts. It covers a variety of error types, including those unique to Filipino: wrong word form, and incorrectly merged...

chapter

Authorship recognition of tweets: A comparison between social behavior and linguistic profiles

Madeena Sultana, Padma Polash, Marina Gavrilova

2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC) > 471 - 476

Authorship recognition from micro-blogs such as Twitter is a challenging task because text length is limited to 140 characters. However, identification of micro-blog authors is crucial in many cyber-crime investigations as well as in forensic applications. So far, traditional linguistic profiles such as Bag-Of-Words (BOW) and style-based markers have been investigated for identification of micro-blog...

chapter

Improving the rule based machine translation system using sentence simplification (english to tamil)

B. Kavirajan, M. Anand Kumar, K.P. Soman, S. Rajendran, et al.

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 957 - 963

The ultimate aim of this research is to develop a Rule Based Machine Translation (RBMT) system using sentence simplification. The sentence pattern of English is SVO and that of Tamil is SOV. Complex and longer sentences are not easy to parse and translate, so a sentence simplifier is included in the rule-based system to split a long sentence into multiple simple sentences. Machine translation...

chapter

Natural language processing based features for sarcasm detection: An investigation using bilingual social media texts

Mohd Suhairi Md Suhaimin, Mohd Hanafi Ahmad Hijazi, Rayner Alfred, Frans Coenen

2017 8th International Conference on Information Technology (ICIT) > 703 - 709

The presence of sarcasm in text can hamper the performance of sentiment analysis. The challenge is to detect the existence of sarcasm in texts. This challenge is compounded when bilingual texts are considered, for example using Malay social media data. In this paper a feature extraction process is proposed to detect sarcasm using bilingual texts; more specifically public comments on economic related...

chapter

Improving Thai-English word alignment for interrogative sentences in SMT by grammatical knowledge

Kanyalag Phodong, Rachada Kongkachandra

2017 9th International Conference on Knowledge and Smart Technology (KST) > 226 - 231

This paper presents a method to improve Thai-English word alignment in statistical machine translation (SMT) for interrogative sentences in a parallel corpus. We utilize Thai and English grammatical knowledge, i.e., tense, part of speech (POS), and question inversion patterns. The proposed method handles the differences between Thai and English interrogative sentences using sentence transformation, interrogative...

chapter

Rule based chunker for Hindi

Sneha Asopa, Pooja Asopa, Iti Mathur, Nisheeth Joshi

2016 2nd International Conference on Contemporary Computing and Informatics (IC3I) > 442 - 445

In this research paper, a rule-based chunker is developed and evaluated. To develop the chunker, handcrafted linguistic rules were written, mainly for noun, adverb, verb, and adjective phrases and conjuncts. The Indian Languages Chunk Tagset is used for annotation. For evaluation, 500 Hindi sentences tagged by an HMM tagger were given as input to our chunker...

chapter

Syntax or semantics? knowledge-guided joint semantic frame parsing

Yun-Nung Chen, Dilek Hakkani-Tur, Gokhan Tur, Asli Celikyilmaz, et al.

2016 IEEE Spoken Language Technology Workshop (SLT) > 348 - 355

Spoken language understanding (SLU) is a core component of a spoken dialogue system; it involves intent prediction and slot filling and is also called semantic frame parsing. Recently, recurrent neural networks (RNN) have obtained strong results on SLU due to their superior ability to preserve sequential information over time. Traditionally, the SLU component parses semantic frames for utterances considering...

chapter

A grapheme-level approach for constructing a Korean morphological analyzer without linguistic knowledge

Jihun Choi, Jonghem Youn, Sang-goo Lee

2016 IEEE International Conference on Big Data (Big Data) > 3872 - 3879

Morphological analysis is an essential step in processing the Korean language, due to the highly agglutinative properties of the language. In this paper, we propose a novel approach for constructing a Korean morphological analyzer that can capture linguistic properties using graphemes as basic processing units. Since our model does not utilize prior linguistic knowledge, the model can be applied to other...

chapter

Developing learner corpus annotation for Chinese grammatical errors

Lung-Hao Lee, Li-Ping Chang, Yuen-Hsien Tseng

2016 International Conference on Asian Language Processing (IALP) > 254 - 257

This study describes the construction of the TOCFL (Test Of Chinese as a Foreign Language) learner corpus, including the collection and grammatical error annotation of 2,837 essays written by Chinese language learners originating from a total of 46 different mother-tongue languages. We propose hierarchical tagging sets to manually annotate grammatical errors, resulting in 33,835 inappropriate usages...

chapter

Semantic annotation for Mandarin verbal lexicon

Mei-chun Liu, Jui-ching Chang

2016 International Conference on Asian Language Processing (IALP) > 30 - 36

This study examines the challenging issues in the semantic annotation of the characteristics of verbal information of Mandarin Chinese. It proposes a frame-based constructional approach that aligns with linguistic premises in Frame Semantics, Construction Grammar and Cognitive Grammar. Given that semantic processing has a lot to do with human cognitive capacities, semantic transfer and profile on...

chapter

A study on the construction of a grade-level reading corpus for TCSL

Juan Li, Lijiao Yang, Liang Wen, Zhiying Liu, et al.

2016 International Conference on Asian Language Processing (IALP) > 279 - 282

Reading ability is one of the most important skills for language learners. A grade-level reading corpus can be more targeted at improving learners' reading abilities. Based on the Corpus of Teaching Chinese as a Second Language (CTC), this paper presents a grade standard for the construction of a grade-level reading corpus. The corpus is tagged with linguistic information, and it can be used as a language...

chapter

Automatic extraction and recommendation of Grammar Points in L2 Chinese

Tianbao Song, Jing He, Weiming Peng, Jihua Song

2016 International Conference on Asian Language Processing (IALP) > 184 - 188

Grammar teaching and learning have always been important and difficult parts of L2 Chinese. This paper demonstrates a method for automatically extracting and recommending Grammar Points to L2 Chinese teachers and learners. First, an L2 Chinese grammar syllabus is reconstructed based on a corpus of international Chinese teaching materials. Second, a regular expression-based learning algorithm is explored...

chapter

Product aspect detection for sentiment analysis by employing aggrandized affinity measure

A. Soundariya, N. Balaganesh, K. Muneeswaran

2016 Online International Conference on Green Engineering and Technologies (IC-GET) > 1 - 7

Customer reviews on online websites have increased greatly. Detecting aspects in those reviews is becoming a challenging task because of their size and complexity. Hence, an automated mechanism is needed to detect product aspects in online consumer reviews. In this paper we model an unsupervised technique to detect product aspects. In general, a product aspect may be a single word or...

chapter

ICT in humanities: Phenomenon of Corpus Linguistics: Russian national experience

Galina Kedrova, Maria Volkova

2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT) > 1 - 5

The paper presents various Russian language corpora to discuss professional advantages and cultural benefits of linguistic corpora technology in comparison with the pre-computational and pre-corpora state-of-the-art in language research and Arts and Humanities. As the most faithful ‘mirror’ of political, intellectual and spiritual life of a nation during current state and in historical perspective,...

chapter

Building a Chinese Dependency GraphBank

Bin Li, Yuan Wen, Cuijuan Xing, Yichu Zhou, et al.

2016 IEEE/WIC/ACM International Conference on Web Intelligence Workshops (WIW) > 9 - 12

How to represent the structure of a sentence is a key issue in linguistics and NLP. Dependency Grammar (DG) has been widely used as it directly describes the relations between words in a sentence. However, it always follows a tree structure, which does not fit the argument sharing phenomenon. On the other hand, the Semantic Role Labeling (SRL) annotation does not give a full structure for a...

chapter

ExATO - High Quality Term Extraction for Portuguese and English

Lucelene Lopes, Paulo Fernandes, Renata Vieira

2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI) > 540 - 545

This paper presents a novel version of ExATO, a term extractor originally designed to extract relevant terms from corpora in Portuguese. In this new version not only corpora in Portuguese can be handled, but also texts in English are accepted. This extension is likely to offer the same quality pattern already achieved for Portuguese. In this paper, we draw the analysis of results in parallel corpora...

chapter

A method to extract essential keywords from a tweet using NLP tools

Tharindu Weerasooriya, Nandula Perera, S.R. Liyanage

2016 Sixteenth International Conference on Advances in ICT for Emerging Regions (ICTer) > 29 - 34

A tweet is an authentic use of Natural Language where the user has to deliver the message in 140 characters or less. According to previous researchers, this restriction increases the possible ambiguity of a tweet making it difficult for traditional Natural Language Processing (NLP) tools to analyze it. This research enhances the machine learning based Stanford CoreNLP Part-of-Speech (POS) tagger with...

chapter

Document Summarization in Malayalam with sentence framing

Kavya Kishore, Greeshma N Gopal, Neethu P H

2016 International Conference on Information Science (ICIS) > 194 - 200

Document Summarization is a technique for conveying the important information in a given document. It is one of the most important tasks in Natural Language Processing, as the summary produced is helpful for information retrieval systems, question answering systems, the medical domain, the news domain, etc. Most summarization work in Indian languages is extractive in nature, and not much work is oriented...

chapter

Pragmatic analysis of malayalam sentences

T. A. Shaharban, Rosna P Haroon

2016 International Conference on Inventive Computation Technologies (ICICT) > 3 > 1 - 5

Natural language processing (NLP) is one of the major fields in computer science. NLP is the ability of a system to process sentences in natural language. Part-of-speech tagging, pragmatic analysis, machine translation, discourse analysis, etc. are different fields within NLP. Malayalam is one of the important languages in the Dravidian family, where the difficult grammar structure will be the...

chapter

Summarizing customer review based on product feature and opinion

Jawad Khan, Byeong Soo Jeong

2016 International Conference on Machine Learning and Cybernetics (ICMLC) > 158 - 165

Opinion mining, or sentiment analysis, has emerged as a way to extract useful information from large amounts of unstructured text, such as customer reviews of products and their features or online SNS data. Customer reviews are helpful not only for potential customers but also for manufacturers seeking to improve their products and services. The reviews conciseness takes...

Keywords:
TAGGING

Publication type

book (62)
article (1)

Keywords

NATURAL LANGUAGE PROCESSING (22)
SYNTACTICS (17)
SEMANTICS (14)
SPEECH (13)
FEATURE EXTRACTION (12)
DICTIONARIES (9)
GRAMMAR (9)
CONTEXT (8)
EDUCATIONAL INSTITUTIONS (8)
TWITTER (8)
COMPUTATIONAL LINGUISTICS (7)
HIDDEN MARKOV MODELS (7)
ACCURACY (5)
DATA MINING (5)
MACHINE LEARNING (5)
MEDIA (5)
TEXT ANALYSIS (5)
TRAINING (5)
COMPUTER SCIENCE (4)
ELECTRONIC MAIL (4)
MANUALS (4)
POS TAGGING (4)
SENTIMENT ANALYSIS (4)
STANDARDS (4)
ARABIC LANGUAGE (3)
ARTIFICIAL NEURAL NETWORKS (3)
COMPUTERS (3)
CONFERENCES (3)
DATABASES (3)
EDUCATION (3)
INTERNET (3)
LINGUISTICS (3)
MORPHOLOGY (3)
ONTOLOGIES (3)
OPINION MINING (3)
ORGANIZATIONS (3)
STATISTICAL ANALYSIS (3)
SUPPORT VECTOR MACHINES (3)
WORD ALIGNMENT (3)
WORD PROCESSING (3)
WRITING (3)
ANALYTICAL MODELS (2)
ANNOTATION (2)
BLOGS (2)
BOOK REVIEWS (2)
BUILDINGS (2)
CASE MARKER (2)
CLASSIFICATION ALGORITHMS (2)
COMPOUNDS (2)
CULTURAL DIFFERENCES (2)
DEEP LEARNING (2)
ENCODING (2)
EUROPE (2)
FILTERING (2)
FREQUENCY MEASUREMENT (2)
INDEXES (2)
INFORMATION EXTRACTION (2)
KNOWLEDGE BASED SYSTEMS (2)
LABELING (2)
LINGUISTIC FEATURES (2)
LINGUISTIC KNOWLEDGE (2)
MACHINE TRANSLATION (2)
MUTUAL INFORMATION (2)
N-GRAM (2)
NATURAL LANGUAGES (2)
NIOBIUM (2)
PARSING (2)
PART OF SPEECH (2)
PRESSES (2)
REGULAR EXPRESSION (2)
SEMANTIC CATEGORY (2)
STATISTICAL MACHINE TRANSLATION (2)
STRESS (2)
SUPERVISED LEARNING (2)
SYNTAX (2)
SYSTEMATICS (2)
TERM EXTRACTION (2)
VOCABULARY (2)
ABSTRACTIVE SUMMARIZATION (1)
ADAPTATION MODELS (1)
AFFINITY MEASURE (1)
ANAPHORA RESOLUTION (1)
ANNOTATION PROCESS (1)
ARABIC PHRASES (1)
ASPECT DETECTION (1)
ASSAMESE (1)
ASSOCIATION MEASUREMENT (1)
ASSOCIATION RULES (1)
AUTHORSHIP RECOGNITION (1)
AUTOMATED ERROR DETECTION (1)
AUTOMATED ESSAY SCORING (1)
AUTOMATIC QUESTION ANSWERING SYSTEMS (1)
AUTOMATIC TAGGING ALGORITHM (1)
BAG-OF-WORDS (1)
BATTERIES (1)
BILINGUAL ANNOTATION (1)
BILINGUAL FEATURE (1)
BINARY ADJACENT WORD PAIR (1)

INFONA - science communication portal
