The need for short-text classification arises in many text mining applications, particularly health care applications. In such applications, shorter texts mean that linguistic ambiguity limits the semantic expression, which in turn makes typical methods fail to capture the exact semantics of the scarce words. This is particularly true in health care domains when the text contains domain-specific or...
The report is devoted to the systematization of large terminological texts on the basis of a hierarchical representation of definitions of concepts. By integration, such models form a semantic network of terms. Thanks to the binary, homogeneous and oriented nature of this network, it is possible to obtain a hierarchical representation model and quantitative indicators for evaluating the integrity...
In spite of the proliferation of the business process and data compliance checking approaches, in practice, regulatory compliance management still demands considerable manual intervention. Previous research in the field of compliance has established that the manual specification/tagging of the regulations not only fails to ensure their proper coverage but also negatively affects the turnaround time...
The sentiment lexicon, which is the basis of research on opinion mining and sentiment analysis, plays an important role in the field of natural language processing. A generalized sentiment lexicon lacks domain adaptability and does not adequately meet the sentiment analysis needs of the target domain, so automatically building a domain-specific sentiment lexicon is particularly important for specific...
Zero-shot learning (ZSL) aims to transfer knowledge from observed classes to unseen classes, based on the assumption that both the seen and unseen classes share a common semantic space, among which attributes enjoy great popularity. However, few works study whether the human-designed semantic attributes are discriminative enough to recognize different classes. Moreover, attributes are often...
In this study, we investigate whether sparse coding helps explain the semantic representation in the human cerebral cortex. We show this by using sparse coding to model semantic representation in the cerebral cortex. We propose three methods for estimating semantic representation from brain activity data. For estimating a new semantic representation, in the first method, we use only a semantic representation...
Suicide is becoming a serious problem, and how to prevent suicide has become a very important research topic. The development of Social Network System (SNS) provides an ideal platform to monitor persons' suicidal ideation. Based on Sina microblog (Weibo), this paper proposes a real-time monitoring system detecting users' suicidal ideation. From 59046 posts collected with labels of either suicide or...
This paper proposes an emotion classification method for spoken utterances using a spoken-term detection (STD) method. This is a keyword extraction method using spoken utterances. The extracted keywords are used to decide on the emotion category of an utterance. Most keywords extracted by the STD system are redundant and some of them negatively affect the emotion classification performance. Therefore,...
In this paper, we introduce our multi-document summarization system for Turkish news. The aim of the summarization system is to build a single document from multi-document news that has been collected previously. The news items were collected from several Turkish news sources via Really Simple Syndication (RSS). They were separated into clusters according to their topics. We utilized the cosine similarity metric...
Machine Translation (MT) is very useful in supporting multicultural communication. The resources required by existing approaches are scarce for low-resource languages: Statistical Machine Translation (SMT) requires corpora of high quality and quantity, and Rule-Based Machine Translation (RBMT) requires bilingual dictionaries and morphological, syntactic, and semantic analyzers. Due to the lack of language resources, it is difficult...
Scene recognition is an important and challenging problem in the field of computer vision owing to the variations in the same class and the similarities between different classes. This paper presents a novel approach that learns a reasonable dictionary from convolutional features to effectively describe the distinctive and shared properties in scene images. Substantial convolution operations in Deep...
A simple semantic lexicon extraction method is proposed based on one hypothesis and three filtering rules from Baidu Chinese Network Encyclopedia. The acquired affective lexicon includes emotional words and their lexical semantic relations, including synonyms and antonyms. The acquisition method is a recursive algorithm that uses seed words. The extracted affective lexicon is labeled with affective tendency...
Twitter, and social media as a whole, has great potential as a source of disease surveillance data; however, the general messiness of tweets presents several challenges for standard information extraction methods. Current methods for disease surveillance on Twitter rely on inflexible keyword-based approaches that require messages to be pre-filtered on the basis of a disease name which is supplied a priori...
Document summarization is a strategy intended to extract information from multiple documents discussing the same subject. Many software applications handle document summarization, helping people grasp the main idea of a collection of documents within a short time. Automatic summaries present information algorithmically extracted from multiple sources, without any impressionistic human intervention...
Recently, many researchers have shown interest in using Word2Vec as the features for text classification tasks such as sentiment analysis. Its ability to model high-quality distributional semantics among words has contributed to its success in many of these tasks. However, the high dimensionality of the Word2Vec features increases the complexity of the classifier. In this paper, a method...
This paper designs and implements an automatic evaluation system for experimental reports in the field of university computer virtual experiments. The evaluation is divided into three types: the single-answer type, the rule-related type and the subjective short-answer type. For the problems of the subjective short answer, a simple and effective method based on the participle of the standard answer...
Onomatopoeia is a generic name for onomatopoeia and mimetic words. Using onomatopoeia can express the behavior and state of things in more detail, widening the range of communication. However, learning onomatopoeia has been a difficult task for learners of Japanese. Several existing studies aim to support onomatopoeia learning, but no platform is available to help learners find...
Research on malicious comments on Sina Weibo is very important because a large number of malicious comments seriously undermine the user experience on Sina Weibo. Building on a malicious comment detection technique based on semantic information, this paper presents a different technique which improves the process of malicious dictionary construction and the process of malicious comment detection...
Source code readability is critical to software quality assurance and maintenance. In this paper, we present a novel approach to the automated measurement of source code readability based on Word Concreteness and Memory Retention (WCMR) of variable names. The approach considers programming and maintenance as processes of organizing variables and their operations to describe solutions to specific problems...
Zero-shot learning, a special case of unsupervised domain adaptation where the source and target domains have disjoint label spaces, has become increasingly popular in the computer vision community. In this paper, we propose a novel zero-shot learning method based on discriminative sparse non-negative matrix factorization. The proposed approach aims to identify a set of common high-level semantic...