The characteristics of big data challenge not only the methods for processing large volumes of data, but also the way we make use of such semantically rich resources; in particular, how users plan to manipulate the intermediate or final results needs to be carefully considered. This is especially challenging in building analytics systems as big data is richer in semantics and the heterogeneous data modalities...
In molecular biology, phenotypes are often described using complex semantics and diverse biomedical expressions, thereby facilitating the development of named entity recognition (NER). Here, we propose a novel approach of recognizing plant phenotypes by cascading word embedding to sentence embedding with a class label enhancement. We utilized a word embedding method to find high-frequency phenotypes...
Post Traumatic Stress Disorder (PTSD) is a public health problem afflicting millions of people each year. It is especially prominent among military veterans. Understanding the language, attitudes, and topics associated with PTSD presents an important and challenging problem. Based on their expertise, mental health professionals have constructed a formal definition of PTSD. However, even the most assiduous...
Opioid (e.g., heroin and morphine) addiction has become one of the largest and deadliest epidemics in the United States. To combat such a deadly epidemic, there is an urgent need for novel tools and methodologies to gain new insights into the behavioral processes of opioid addiction and treatment. In this paper, we design and develop an intelligent system named iOPU to automate the detection of opioid...
The present work proposes an unsupervised approach for recognising relations between named entities from a large corpus on crime in Indian states and union territories. Initially, named entities have been identified from the extracted crime corpus, and certain pairs of entities have been chosen that facilitate the crime analysis. Then the entity pairs with their intermediate context words have...
Extracting stop purpose information from raw GPS data is a crucial task in most location-aware applications. With the continuous growth of GPS data collected from mobile devices, this task is attracting more and more interest; a lot of recent research has focused on pedestrian (mobile phone) data, while the commercial vehicle sector is almost unexplored. In this paper we target the problem of...
Spatiotemporal event sequences (STESs) are ordered series of event types whose evolving region-based instances frequently follow each other in time and are located close by. Previous studies on STES mining require significance and prevalence thresholds for the discovery, which are usually unknown to domain experts. As the quality of the discovered STESs is of great importance to the domain experts...
We deal with online learning of acyclic Conditional Preference networks (CP-nets) from data streams, possibly corrupted with noise. We introduce a new, efficient algorithm relying on (i) information-theoretic measures defined over the induced preference rules, which allow us to deal with corrupted data in a principled way, and on (ii) the Hoeffding bound to define an asymptotically optimal decision...
In this article we address the problem of expanding the set of papers that researchers encounter when conducting bibliographic research on their scientific work. Using classical search engines or recommender systems in digital libraries, some interesting and relevant articles could be missed if they do not contain the same search key-phrases that the researcher is aware of. We propose a novel model...
In crises caused by unexpected events, data sources are complex and diverse. Applying the phrase weight measurement technique and free marking by network users within big data technology transforms multimodal crisis information into a single information source, and an integrated model for the extraction of crisis information is established. The integrative course includes...
Currently, open source projects receive various kinds of issues daily because of the extreme openness of the Issue Tracking System (ITS) in GitHub. Issue categorization in the ITS is a labor-intensive and time-consuming task for project managers. However, a contributor is only required to provide a short textual abstract to report an issue in GitHub. Thus, most traditional classification approaches based on detailed...
Automatic sentiment classification is becoming a popular and effective way to help online users or companies process and make sense of customer reviews. In this article, a learning-based method for classification of online reviews that achieves better classification accuracy is obtained by (a) combining valence shifters and opinion words into bigrams for use as features in an ordinal margin classifier...
Understanding user query intent is a crucial task in the question-answering area. With the development of online health services, online health communities generate a huge amount of valuable medical question-answering data, from which user intention can be mined. However, the queries posted by common users contain many domain concepts and colloquial expressions, which make the understanding of user intents very...
Commit comments increasingly receive attention as an important complementary component in code change comprehension. To address the comment scarcity issue, a variety of automatic approaches for commit comment generation have been intensively proposed. However, most of these approaches mechanically outline a superficial level summary of the changed software entities, the change intent behind the code...
This paper deals with modeling human behavior routines during driving. We propose a new vision of the maximum causal entropy framework for inverse reinforcement learning to predict actions to be triggered in a particular situation (lane change). We designed a plugin to enhance the functionalities of the vCar platform, which presents an open source solution for the analysis and visualization of data from...
Natural language processing methods are widely used to study the relationship between traditional Chinese medicine (TCM) prescriptions and diseases in textual data, and the results can reveal the essence of TCM literature. In this paper, we first obtain TCM treatment information from the abstract text by using web crawlers. Second, the eigenvectors are selected from the cleaned abstract...
Online opinions play an important role in helping consumers make decisions about purchasing products or services. In addition, customer reviews allow companies to understand the strengths and limitations of their products and services, which aids in improving their marketing campaigns. Such valuable information can only be obtained via appropriate analysis of the opinions provided by customers...
Building on the general pivot method for paraphrase extraction, which can introduce considerable noise into the extracted paraphrases, this paper proposes a syntactic knowledge-enhanced method to extract higher-quality paraphrases and thereby further improve the quality of statistical machine translation. First, syntactic knowledge is acquired and added to the paraphrase extraction algorithm as constraints to obtain...
In the last decade, the field of information extraction and retrieval has grown exponentially. Sentiment analysis is the task of identifying the polarity of given content. Extracting useful content from opinion sources has become a challenging task. This paper uses a lexicon-based approach for classifying a review document as positive, negative or neutral. This paper extracts the sentiments from customer...
The construction of a knowledge graph of dangerous goods (KGDG) is of great significance for inferring related information about dangerous goods (DG), developing corresponding policies for their storage and transport, preventing disasters caused by dangerous goods, and providing emergency plans when a disaster happens. Since distributed representation of natural language is an effective method for knowledge...