Sentiment analysis, or recognizing emotions in short and noisy text from social networks such as Twitter, has been a challenging task. Most existing models use word-level embeddings for the final sentiment classification. This paper proposes a novel representation of short text derived from a combination of word embeddings and character embeddings using Bidirectional LSTM (BiLSTM)....
The proliferation of Web 2.0 technologies and the increasing use of computer-mediated communication have resulted in a new form of written text, termed microtext. This poses new challenges to natural language processing tools, which are usually designed for well-written text. This paper proposes a phonetic-based framework for normalizing microtext to plain English and, hence, improving the classification...
Emotion cause extraction is one of the promising research topics in sentiment analysis, but has not been well-investigated so far. This task enables us to obtain useful information for sentiment classification and possibly to gain further insights about human emotion as well. This paper proposes a bootstrapping technique to automatically acquire conjunctive phrases as textual cue patterns for emotion...
User-generated mobile application reviews have become a gold mine for the timely identification of functional defects in this type of software artifact. In this work, we develop a hidden structural SVM model for extracting detailed defect descriptions from user reviews at the sentence level. Structured features and constraints are introduced to reduce the demand for exhaustive manual annotation at the sentence...
Understanding user query intent is a crucial task in the question-answering area. With the development of online health services, online health communities generate a huge amount of valuable medical question-answering data, from which user intention can be mined. However, the queries posted by common users contain many domain concepts and colloquial expressions, which make the understanding of user intents very...
Adapted from biological sequence alignment, trace alignment is a process mining technique used to visualize and analyze workflow data. Any analysis done with this method, however, is affected by the alignment quality. The best existing trace alignment techniques use progressive guide-trees to heuristically approximate the optimal alignment in O(N^2 L^2) time. These algorithms are heavily dependent on...
Gene ontology (GO) defines terms and classes used to describe gene functions and the relationships between them. GO has been the standard for describing the functions of specific genes in different model organisms. GO annotation, which tags genes with GO terms, has mostly been a manual and time-consuming curation process. In this paper we describe the development and evaluation of an innovative predictive...
Post Traumatic Stress Disorder (PTSD) is a public health problem afflicting millions of people each year. It is especially prominent among military veterans. Understanding the language, attitudes, and topics associated with PTSD presents an important and challenging problem. Based on their expertise, mental health professionals have constructed a formal definition of PTSD. However, even the most assiduous...
Effective mining of the large amount of DNA and RNA fragments obtained from next-generation sequencing technologies depends on the availability of efficient analytical tools to process them. One important aspect of this analysis, which deals with a huge number of fragments, is partitioning them based on their level of similarity. In this paper we propose a space-transformation-based clustering approach...
Biomarkers have tremendous potential in different phases of treatment, such as risk assessment, screening/detection, diagnosis, and prediction of patient response. In this paper, we present an approach for developing a generic tool for end-to-end analysis of expression data to identify probable biomarkers. We follow machine learning as well as network analysis approaches in parallel. We use...
Medical insurance claims data offer a coarse view of a patient's medical profile, including information about previous diagnoses and procedures performed. These data have been exploited in the past to predict the presence of unmanifested conditions. Rarer conditions, however, provide an extremely limited amount of ground truth to train supervised models, but predicting relevant co-morbidities can help...
Driving is an activity that requires considerable alertness. Insufficient attention, imperfect perception, inadequate information processing, and sub-optimal arousal are possible causes of poor human performance. Understanding these causes and implementing effective remedies are key to increasing traffic safety and improving drivers' well-being. For this purpose, we used deep...
Given some (but not all) monthly totals of people with measles (or counts of product units sold, or counts of retweets), how can we recover the weekly counts? Requiring smoothness between successive weeks is reasonable, but can we do better if we have some domain knowledge? For example, we know that measles (flu, count of retweets, etc.) follows a specific cascade model, like the so-called 'SIS'....
Opioid (e.g., heroin and morphine) addiction has become one of the largest and deadliest epidemics in the United States. To combat such a deadly epidemic, there is an urgent need for novel tools and methodologies to gain new insights into the behavioral processes of opioid addiction and treatment. In this paper, we design and develop an intelligent system named iOPU to automate the detection of opioid...
In this paper, we propose a new discriminative dictionary learning framework, called robust Label Embedding Projective Dictionary Learning (LE-PDL), for data classification. LE-PDL can learn a discriminative dictionary and the block-diagonal representations without using the l0-norm or l1-norm sparsity regularization, since the l0-norm or l1-norm constraint on the coding coefficients used in the existing...
The bag of words (BOW) represents a corpus as a matrix whose elements are word frequencies. However, each row in the matrix is a very high-dimensional sparse vector. Dimension reduction (DR) is a popular method to address sparsity and high-dimensionality issues. Among the different strategies for developing DR methods, Unsupervised Feature Transformation (UFT) is a popular strategy to map all words on...
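As an illustration of the bag-of-words matrix described in the abstract above, here is a minimal sketch in Python. It is not taken from the paper: the function name `bag_of_words`, the whitespace tokenization, and the two sample documents are assumptions made for the example. It only shows how each document becomes one row of word counts, and why rows become long and mostly zero as the vocabulary grows.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, matrix), where matrix[i][j] counts vocabulary[j] in docs[i].

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split, which real pipelines usually refine.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # One column per distinct word in the whole corpus, in sorted order.
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    index = {word: j for j, word in enumerate(vocabulary)}
    matrix = []
    for toks in tokenized:
        row = [0] * len(vocabulary)
        for word, count in Counter(toks).items():
            row[index[word]] = count
        matrix.append(row)
    return vocabulary, matrix


# Assumed sample corpus, not from the source.
docs = ["the cat sat", "the dog sat on the mat"]
vocabulary, matrix = bag_of_words(docs)
print(vocabulary)
print(matrix)
```

Every new distinct word adds a column to all rows, so on a realistic corpus most entries in any given row are zero, which is the sparsity and high-dimensionality issue that dimension reduction is meant to address.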
Consider the problem of estimating an unknown high-dimensional density whose support lies on an unknown low-dimensional data manifold. This problem arises in many data mining tasks, and the paper proposes a new geometrically motivated solution to the problem in a manifold learning framework, including an estimation of the unknown support of the density. Firstly, tangent bundle manifold learning problem is...
The Krylov subspace based information retrieval (IR) approach has been shown to provide comparable accuracy to latent semantic indexing (LSI), while providing some computational advantages. Recently, in the area of numerical linear algebra, attention has been drawn to the block Krylov subspace methods, which are shown to be more efficient than the classic Krylov subspace methods in solving linear...
Comparing images to recommend items from an image inventory is a subject of continued interest. With the scalability of deep-learning architectures, the once 'manual' job of hand-crafting features has been largely alleviated, and images can be compared according to features generated from a deep convolutional neural network. In this paper, we compare distance metrics (and divergences) to rank...
Mid-Infrared (MIR) spectroscopy has emerged as the most economically viable technology to determine milk values as well as to identify a set of animal phenotypes related to health, feeding, well-being and environment. However, Fourier transform-MIR spectra incur a significant amount of redundant data. This creates critical issues such as increased learning complexity while performing Fog and Cloud...