In this paper, we propose a new discriminative dictionary learning framework, called robust Label Embedding Projective Dictionary Learning (LE-PDL), for data classification. LE-PDL can learn a discriminative dictionary and the block-diagonal representations without using the l0-norm or l1-norm sparsity regularization, since the l0 or l1-norm constraint on the coding coefficients used in the existing...
The bag of words (BOW) represents a corpus in a matrix whose elements are the frequency of words. However, each row in the matrix is a very high-dimensional sparse vector. Dimension reduction (DR) is a popular method to address sparsity and high-dimensionality issues. Among different strategies to develop DR method, Unsupervised Feature Transformation (UFT) is a popular strategy to map all words on...
Post Traumatic Stress Disorder (PTSD) is a public health problem afflicting millions of people each year. It is especially prominent among military veterans. Understanding the language, attitudes, and topics associated with PTSD presents an important and challenging problem. Based on their expertise, mental health professionals have constructed a formal definition of PTSD. However, even the most assiduous...
Opioid (e.g., heroin and morphine) addiction has become one of the largest and deadliest epidemics in the United States. To combat this deadly epidemic, there is an urgent need for novel tools and methodologies to gain new insights into the behavioral processes of opioid addiction and treatment. In this paper, we design and develop an intelligent system named iOPU to automate the detection of opioid...
In recent years, predicting future hot events in online social networks has become increasingly meaningful in marketing, advertisement, and recommendation systems to support companies' strategy making. Currently, most prediction models require long-term observations of the event or depend heavily on other features that are expensive to extract. However, at the early stage of an event, the temporal...
HDBSCAN*, a state-of-the-art density-based hierarchical clustering method, produces a hierarchical organization of clusters in a dataset w.r.t. a parameter mpts. While the performance of HDBSCAN* is robust w.r.t. mpts, choosing a "good" value for it can be challenging: depending on the data distribution, a high or low value for mpts may be more appropriate, and certain data clusters may...
We study bribery resistance properties in two classes of reputation-based ranking systems, where the rankings are computed by weighting the ratings given by users with their reputations. In the first class, the rankings are the result of the aggregation of all the ratings, and all users are provided with the same ranking for each item. In the second class, there is a first step that clusters users by...
Time series classification has attracted much attention due to the ubiquity of time series. With the advance of technologies, the volume of available time series data becomes huge and the content is changing rapidly. This requires time series data mining methods to have low computational complexities. In this paper, we propose a parameter-free time series classification method that has a linear time...
Media analysis can reveal interesting patterns in the way newspapers report the news and how these patterns evolve over time. One example pattern is the quoting choices that media make, which could be used as bias indicators. Media slant can be expressed not only through the choice to report an event, e.g. a person’s statement, but also through the words used to describe the event. Thus, automatic discovery...
Product bundling is widely adopted for information goods and online services because it can increase profit for companies. For example, cable companies often bundle Internet access and video streaming services together. However, it is challenging to obtain an optimal bundling strategy, not only because it is computationally expensive, but also because customers’ private information (e.g., valuations...
In recent years, finding repetitive similar patterns in time series has become a popular problem. These patterns are called time series motifs. Recent studies show that using grammar compression algorithms to find repeating patterns from the symbolized time series holds promise in discovering approximate motifs with variable length. However, grammar compression algorithms are traditionally designed...
NB-UVB phototherapy is one of the most common treatments administered by dermatologists to psoriasis patients. Although the treatment generally improves the condition, it can also worsen it. If a model can predict the treatment response beforehand, dermatologists can adjust the treatment accordingly. In this paper, we use data mining techniques and conduct four experiments. The...
The rapidly increasing availability of healthcare data from multiple heterogeneous sources has spearheaded the adoption of data-driven approaches for improved clinical research, decision making, and patient management. The patient healthcare data are usually longitudinal and can be expressed as medical event sequences, where the events include clinical diagnosis, medications, laboratory reports, etc...
Granger causality is proposed to fuse stock prices and social media sentiment information for stock market prediction. Sentiment extraction is performed on the Twitter data from major stock companies. Analysis shows that authoritative users' sentiment affects other users after an event with a lag of 3 days. The prediction is performed for Twitter and stock data from four companies. The sentiment...
In recent years, the usage of unmanned aircraft systems (UAS) for security-related purposes has increased, ranging from military applications to different areas of civil protection. The deployment of UAS can support security forces in achieving an enhanced situational awareness. However, in order to provide useful input to a situational picture, sensor data provided by UAS has to be integrated with...
Background: Code smells are indicators of quality problems that make software hard to maintain and evolve. Given the importance of smells in the source code's maintainability, many studies have explored the characteristics of smells and analyzed their effects on the software's quality. Aim: We aim to investigate fundamental characteristics of code smells through an empirical study on frequently...
We suggest a clustering method that makes it possible to build a conceptual clustering model for objects of a fuzzy nature and to increase the clustering accuracy for such objects. We used the Cobweb clustering method as a base. We modified the formula for assessing the utility of conceptual clustering for objects with fuzzy parameter values. Then we suggested a modified Cobweb version for working...
The present work proposes an unsupervised approach for recognising relations between named entities from a large corpus based on crime in Indian states and union territories. Initially, named entities have been identified from the extracted crime corpus and certain pairs of entities have been chosen that facilitate the crime analysis. Then the entity pairs with their intermediate context words have...
Though accident data have been collected across industries, they may inherently contain uncertainty due to randomness and fuzziness, which in turn leads to misleading interpretations of the analysis. To handle the issue of uncertainty within accident data, the present work proposes a rough set theory (RST)-based approach to provide a rule-based solution to industry to minimize the number of accidents...
Machine learning is widely used in various applications such as data mining, computer vision, and bioinformatics owing to the explosion of available data. However, in practice, many data have some missing attributes. Graph theory serves as a powerful tool for modeling and analyzing many such practical problems, such as networks of communication and data organization. This paper focuses on semi-supervised...