Named Entity Identification (NEI) is the task of identifying named entities in textual data. While NEI for English can be done with considerable accuracy owing to tools like the Stanford NER tagger, the accuracy for Indian languages like Hindi is comparatively poor. One of the reasons for this is the lack of sufficiently large annotated corpora in Indian languages on which NE-taggers...
The fifth Dialog State Tracking Challenge (DSTC5) introduces a new cross-language dialog state tracking scenario, in which participants are asked to build their trackers on the English training corpus and evaluate them on the unlabeled Chinese corpus. Although computer-generated translations for both the English and Chinese corpora are provided in the dataset, these translations contain...
Discovering topics in short texts, such as news titles and tweets, has become an important task for many content analysis applications. However, due to the lack of rich context information in short texts, the performance of conventional topic models on short texts is usually unsatisfactory. In this paper, we propose a novel topic model for short text corpora using word embeddings. Continuous space word...
Understanding changes in the mood and mental health of large populations is a challenge, with the need for large numbers of samples to uncover any regular patterns within the data. The use of data generated by online activities of healthy individuals offers the opportunity to perform such observations on the large scales and for the long periods that are required. Various studies have previously examined...
A number of recent works have concentrated on the task of recommending to Twitter users whom they should follow, including the WTF (Who To Follow) service provided by Twitter. Recommenders are based on the user's network structure, on some notion of topical similarity with other users, or on both. We present a method for analysis of Twitter users supported by a hierarchical representation...
Traditional Chinese Medicine (TCM) has been around for over 2000 years and is a significant part of Chinese cultural heritage. The theoretical framework of TCM is unique and rich in content; it captures the complex relationships between disease and medicine and forms a distinctive system for diagnosing and curing illness. Research on question-answering (QA) over TCM is significant for Chinese...
While a large number of well-known knowledge bases (KBs) in life science have been published as Linked Open Data, there are few KBs in Chinese. However, life science KBs in Chinese are necessary when we want to automatically process and analyze electronic medical records (EMRs) in Chinese. Among these, a symptom KB in Chinese is the most urgently needed, since symptoms are the starting point of...
Semantic analysis is an important component of recommendation systems and information retrieval in computer-aided detection. Previous research has made certain breakthroughs in disease diagnosis and drug recommendation through semantic analysis. We propose a bilateral shortest paths method for computing semantic relatedness, based on human thought patterns, for making sufficient use of the hyperlink...
Verifying mappings is a laborious task. The paper presents a Game with a Purpose based system for the verification of automatically generated mappings. A general description of the idea behind games with a purpose is given. A description of the TGame system, a 2D platform mobile game with the verification process included in the gameplay, is provided. Additional mechanisms for anti-cheating, increasing...
In the contemporary world, translation has become a critical need. Parallel dictionaries are now the source most accessible to humans, but they have limits: they do not offer good-quality translation because of neologisms and out-of-vocabulary words. To overcome this problem, the use of statistical translation systems is becoming more and more important in maintaining...
Automatic classification of news articles is a relevant problem due to the large amount of news generated every day, so it is crucial that these articles are classified to allow users to access information of interest quickly and effectively. On the one hand, traditional classification systems represent documents as bag-of-words (BoW), which are oblivious to two problems of language: synonymy and...
Relation discovery is a crucial task in the ontology learning process. The classical approaches for relation extraction, based on statistical, syntactical or pattern matching techniques, typically focus on the taxonomic aspect. The discovery of non-taxonomic relationships is often neglected. We extend these approaches by taking into account the document structure, which bears additional knowledge. This...
This paper deals with the development of an effective Intelligent Control teaching environment using Challenge Based Learning (CBL) at the Mechatronics Department of Politeknik Elektronika Negeri Surabaya (PENS). CBL is a problem-based, student-centered learning approach in which a group of students learn at their own pace and set their own challenging final target. Moreover, social media was utilized...
This work focuses on two specific types of sentiment information analysis for traditional Chinese words: valence, which represents the degree of pleasant and unpleasant feelings (i.e., sentiment orientation), and arousal, which represents the degree of excitement and calm (i.e., sentiment strength). To address this, we proposed supervised ensemble learning models to assign appropriate real-valued ratings...
This paper presents the IALP 2016 shared task on Dimensional Sentiment Analysis for Chinese Words (DSAW), which seeks to identify a real-valued sentiment score of Chinese words in both the valence and arousal dimensions. Valence represents the degree of pleasant and unpleasant (or positive and negative) feelings, and arousal represents the degree of excitement and calm. Of the 22 teams registered for...
Studying examples of working code is an excellent way for users of advanced computing resources to understand how to complete a task given a set of resources, constraints and goals. Expert users are adept at searching the Internet for examples that they can use or modify, yet novices can be easily overwhelmed by the quantity and variety of examples available online. Even when a relevant example has...
In modern data centers a large amount of energy can be saved by intelligently distributing load on the available servers and transferring idle nodes into low energy modes. Distributing load leads to a more energy-efficient usage of the servers within a server farm. Additionally, the use of energy saving modes like suspend to main memory can decrease the energy consumption dramatically. The selection...
This paper presents a new selection-based question answering dataset, SelQA. The dataset consists of questions generated through crowdsourcing and sentence length answers that are drawn from the ten most prevalent topics in the English Wikipedia. We introduce a corpus annotation scheme that enhances the generation of large, diverse, and challenging datasets by explicitly aiming to reduce word co-occurrences...
Recommender systems are widely used in daily life to recommend objects that meet users' preferences. In this paper, we focus on objects with temporally variable features, such as restaurants with seasonal dishes and points of interest (POIs) with seasonal attractions, and propose a method to automatically generate temporal feature vectors for those objects. The basic idea of the...
In this work, we describe the design, development, and deployment of NEREA (Named Entity Recognizer for spEcific Areas), an automatic Named Entity Recognizer and Disambiguation system, developed in collaboration with professional documentalists. The aim of NEREA is to keep accurate and current information about the entities mentioned in a local repository, and then support building appropriate infoboxes,...