Automatic sentiment classification is becoming a popular and effective way to help online users or companies process and make sense of customer reviews. In this article, a learning-based method for classification of online reviews that achieves better classification accuracy is obtained by (a) combining valence shifters and opinion words into bigrams for use as features in an ordinal margin classifier...
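The bigram idea mentioned in (a) can be illustrated with a minimal Python sketch. The two word lists and the function name `shifter_bigrams` are hypothetical stand-ins, not the lexicons or code used in the article, and the ordinal margin classifier itself is not shown.

```python
# Hypothetical mini-lexicons, for illustration only; the article's actual
# lists of valence shifters and opinion words are not given in the abstract.
VALENCE_SHIFTERS = {"not", "never", "very", "hardly"}
OPINION_WORDS = {"good", "bad", "great", "poor"}


def shifter_bigrams(text):
    """Return 'shifter_opinion' features for adjacent token pairs in text."""
    tokens = text.lower().split()
    features = []
    # Walk over adjacent token pairs and keep those where a valence
    # shifter is immediately followed by an opinion word.
    for prev, cur in zip(tokens, tokens[1:]):
        if prev in VALENCE_SHIFTERS and cur in OPINION_WORDS:
            features.append(f"{prev}_{cur}")
    return features


features = shifter_bigrams("The screen is not good but very great sound")
print(features)
```

Features such as `not_good` keep the shifter attached to the word it modifies, which a plain unigram representation would lose.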
An individual's personality determines the probable repertoire of their reactions to a particular situation. A social robot is much more effective if it is able to learn and so take into account the properties of the humans around it, including personalities. We investigate how well personality can be estimated based on modest amounts of speech or writing, which a social robot might (over)hear. Such...
Global aging brings new challenges to elderly health data management. Existing systems such as HIS and CIS focus on the storage and management of information. Their key limitations are that they lack effective mining approaches and usually cannot handle large-scale health data, which makes it hard for them to serve as robust, lightweight systems. In this paper, we develop a memory computing...
TCM is a traditional medicine in China that has made great contributions to the Chinese nation and has rich information resources. With the development of information technology, data mining technology is rising rapidly. How to combine data mining technology with Chinese medicine, under the guidance of Chinese medicine theory, so that it serves people has become a new topic. This paper mainly applies...
Psoriasis is a chronic inflammatory skin disease that has adverse effects on patients' quality of life. As psoriasis is intractable and its cause is difficult to discover, Traditional Chinese Medicine has been shown in China to be a more effective treatment. In Chinese Medicine, the decision on a prescription is based on ZHENG rather than on the disease. Only after successful differentiation of ZHENG, can...
It is well recognized that air quality inference is of great importance for environmental protection. However, due to the limited number of monitoring stations and various impact factors, e.g., meteorology, traffic volume and human mobility, inference of the air quality index (AQI) can be a difficult task. Recently, with the development of new ways of collecting and integrating urban, mobile, and public service...
Adapted from biological sequence alignment, trace alignment is a process mining technique used to visualize and analyze workflow data. Any analysis done with this method, however, is affected by the alignment quality. The best existing trace alignment techniques use progressive guide-trees to heuristically approximate the optimal alignment in O(N^2 L^2) time. These algorithms are heavily dependent on...
User-generated mobile application reviews have become a gold mine for the timely identification of functional defects in this type of software artifact. In this work, we develop a hidden structural SVM model for extracting detailed defect descriptions from user reviews at the sentence level. Structured features and constraints are introduced to reduce the need for exhaustive manual annotation at the sentence...
Social media serves as a unified platform for users to express their thoughts on subjects ranging from their daily lives to their opinion on consumer brands and products. These users wield an enormous influence in shaping the opinions of other consumers and influence brand perception, brand loyalty and brand advocacy. In this paper, we analyze the opinion of 19M Twitter users towards 62 popular industries,...
Understanding user query intent is a crucial task in the question-answering area. With the development of online health services, online health communities generate a huge amount of valuable medical question-answering data, from which user intent can be mined. However, the queries posted by ordinary users contain many domain concepts and colloquial expressions, which make the understanding of user intents very...
Opinion mining and demographic attribute inference have many applications in social science. In this paper, we propose models to infer daily joint probabilities of multiple latent attributes from Twitter data, such as political sentiment and demographic attributes. Since it is costly and time-consuming to annotate data for traditional supervised classification, we instead propose scalable Learning...
Since their introduction over a decade ago, time series motifs have become a fundamental tool for time series analytics, finding diverse uses in dozens of domains. In this work we introduce Time Series Chains, which are related to, but distinct from, time series motifs. Informally, time series chains are a temporally ordered set of subsequence patterns, such that each pattern is similar to the pattern...
With the rapid development of location-based social networks, Point-of-Interest (POI) recommendation has played an important role in helping people discover attractive locations. However, existing POI recommendation methods assume a flat structure of POIs, which are better described in a hierarchical structure in reality. Furthermore, we discover that both users' content and spatial preferences exhibit...
Recommender systems, which help users explore new items in many applications, have attracted much attention in recent decades. As a popular technique in recommender systems, item recommendation works by recommending items to users based on their historical interactions. Conventional item recommendation methods usually assume that users and items are stationary, which is not always the case in...
Understanding newly emerging events or topics associated with a particular region on a given day can provide deep insight into the critical events occurring in rapidly evolving metropolitan cities. We propose herein a novel topic modeling approach on text documents with spatio-temporal information (e.g., when and where a document was published) such as location-based social media data to discover prevalent...
In this paper, we study relations ranking and object classification for multi-relational data where objects are interconnected by multiple relations. The relations among objects should be exploited to achieve good classification. While most existing approaches exploit them either by directly counting the number of connections among objects or by learning the weight of each relation from labeled data...
Time series motifs are approximately repeating patterns in real-valued time series data. They are useful for exploratory data mining and are often used as inputs for various time series clustering, classification, segmentation, rule discovery, and visualization algorithms. Since the introduction of the first motif discovery algorithm for univariate time series in 2002, multiple efforts have been made...
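To make the notion of a motif concrete, here is a minimal brute-force sketch that finds the closest pair of non-overlapping subsequences under z-normalized Euclidean distance. The function names, the choice of distance, and the overlap rule are assumptions made for illustration; this is not the algorithm proposed in the paper, and it runs in quadratic time.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def motif_pair(series, m):
    """Return (distance, i, j) for the closest pair of length-m subsequences.

    Pairs whose windows overlap (j < i + m) are skipped so that a
    subsequence is not matched with a shifted copy of itself.
    """
    subs = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best = (math.inf, -1, -1)
    for i in range(len(subs)):
        for j in range(i + m, len(subs)):
            d = math.dist(subs[i], subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best


# Toy series with the pattern [0, 1, 3, 1, 0] planted at offsets 0 and 8.
series = [0, 1, 3, 1, 0, 5, 5, 5, 0, 1, 3, 1, 0, 7, 2]
distance, first, second = motif_pair(series, 5)
print(distance, first, second)
```

On real data the subsequence length `m` has to be chosen by the analyst, and faster methods are needed once the series grows beyond a few thousand points.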
Given the soaring amount of data being generated daily, graph mining tasks are becoming increasingly challenging, leading to tremendous demand for summarization techniques. Feature selection is a representative approach that simplifies a dataset by choosing features that are relevant to a specific task, such as classification, prediction, and anomaly detection. Although it can be viewed as a way to...
Subgroup discovery is a local pattern mining technique to find interpretable descriptions of sub-populations that stand out on a given target variable. That is, these sub-populations are exceptional with regard to the global distribution. In this paper we argue that in many applications, such as scientific discovery, subgroups are only useful if they are additionally representative of the global distribution...
The contents generated from different data sources are usually non-uniform, such as long texts produced by news websites and short texts produced by social media. Uncovering topics over large-scale non-uniform texts becomes an important task for analyzing network data. However, the existing methods may fail to recognize the difference between long texts and short texts. To address this problem, we...