This paper describes the development and details of a linguistic resource, the Sense Annotated Hindi Corpus. Word Sense Disambiguation (WSD) is an important task in Natural Language Processing. The Sense Annotated Hindi Corpus was developed for the Lexical Sample WSD task for the Hindi language. It consists of 60 polysemous Hindi nouns. The sense inventory for the Sense Annotated Hindi Corpus was derived from Hindi...
This study examines the challenging issues in the semantic annotation of the characteristics of verbal information of Mandarin Chinese. It proposes a frame-based constructional approach that aligns with linguistic premises in Frame Semantics, Construction Grammar and Cognitive Grammar. Given that semantic processing has a lot to do with human cognitive capacities, semantic transfer and profile on...
We present a word sense disambiguation (WSD) tool for Japanese Hiragana words. Unlike other WSD tasks, which output something like “sense #3” as a result, our WSD task rewrites the target word as a Kanji word, which is a different orthography. The task is therefore a kind of orthographical normalization as well as WSD. In this paper we present the task, our method, and the performance.
Automatic Question Answering (QA) is a hot topic in both Natural Language Processing (NLP) and Information Retrieval (IR), and question classification is the key step of a successful automatic QA system. In this paper, an SVM-based approach is first proposed as our baseline system. Then two additional features, i.e., top-words and dependency relations, are introduced to improve the performance of...
The study of intonation in a tonal language presents a challenge: to see how such a language succeeds in encoding the functions that are expressed by means of intonation in non-tonal languages. This paper demonstrates most of the intonational marking in the tonal language of Yichang.
This paper presents the systems submitted by the joint team of Dublin City University and National Taiwan University to the IALP 2016 Shared Task: Dimensional Sentiment Analysis for Chinese Words. The systems learn a vector representation for each Chinese word using the Word2Vec algorithm for sentiment analysis. The corpus used for the calculation of vector representation is 5 years (2006 to...
This paper describes a model for named-entity recognition on Indonesian microblog messages, a task that is useful for higher-level tasks and text mining applications on Indonesian microblogs. We view the task as a sequence labeling problem and use a machine learning approach. We also propose various word-level and orthographic features, including ones that are specific to the Indonesian...