Biomarkers have tremendous potential in different phases of treatment, such as risk assessment, screening/detection, diagnosis, and prediction of patient response. In this paper, we present an approach to developing a generic tool for end-to-end analysis of expression data to identify probable biomarkers. We follow machine learning and network analysis approaches in parallel. We use...
The bag of words (BOW) represents a corpus as a matrix whose elements are word frequencies. However, each row in the matrix is a very high-dimensional sparse vector. Dimension reduction (DR) is a popular method to address the sparsity and high-dimensionality issues. Among the different strategies for developing DR methods, Unsupervised Feature Transformation (UFT) is a popular strategy to map all words on...
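As a minimal illustration of the BOW matrix described in this abstract (not the authors' code), the sketch below builds a document-by-word count matrix in plain Python. The function name `bag_of_words`, the whitespace tokenizer, and the two toy documents are assumptions made only for this example.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocab, matrix), where matrix[i][j] counts word vocab[j] in docs[i].

    Hypothetical helper: lowercases and splits on whitespace, which is a
    simplifying assumption rather than a tokenizer taken from the paper.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # The vocabulary is the sorted set of all distinct words in the corpus.
    vocab = sorted({word for tokens in tokenized for word in tokens})
    index = {word: j for j, word in enumerate(vocab)}
    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocab)
        for word, count in Counter(tokens).items():
            row[index[word]] = count
        matrix.append(row)
    return vocab, matrix


# Toy corpus (illustrative data, not from the paper).
docs = ["the cat sat", "the dog sat on the mat"]
vocab, matrix = bag_of_words(docs)

# Fraction of zero entries: a rough measure of how sparse the rows are.
zeros = sum(value == 0 for row in matrix for value in row)
sparsity = zeros / (len(matrix) * len(vocab))
```

Even with two short documents, a third of the entries are zero. On a realistic vocabulary of tens of thousands of words, almost every entry in a row is zero, which is the sparsity problem the abstract refers to.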
Post Traumatic Stress Disorder (PTSD) is a public health problem afflicting millions of people each year. It is especially prominent among military veterans. Understanding the language, attitudes, and topics associated with PTSD presents an important and challenging problem. Based on their expertise, mental health professionals have constructed a formal definition of PTSD. However, even the most assiduous...
Given some (but not all) monthly totals of people with measles (or counts of product-units sold, or counts of retweets), how can we recover the weekly counts? Requiring smoothness between successive weeks is reasonable - but can we do better, if we have some domain knowledge? For example, we know that measles (flu, count-of-retweets, etc) follow a specific cascade model, like the so-called 'SIS'....
Opioid (e.g., heroin and morphine) addiction has become one of the largest and deadliest epidemics in the United States. To combat such a deadly epidemic, there is an urgent need for novel tools and methodologies to gain new insights into the behavioral processes of opioid addiction and treatment. In this paper, we design and develop an intelligent system named iOPU to automate the detection of opioid...
We treat failure prediction in a supervised learning framework using a convolutional neural network (CNN). Due to the nature of the problem, learning a CNN model on this kind of dataset is generally associated with three primary problems: 1) negative samples (indicating a healthy system) outnumber positives (indicating system failures) by a great margin; 2) implementation design often requires chopping...
In this paper, we propose an online spatiotemporal data-driven methodology to detect malicious cyber attacks that target power system balancing and frequency control. The anomaly detection, which spots abnormal generator behavioral patterns in real time, is achieved locally at a power plant with peer to peer communication capability. We mainly consider the data integrity attack targeting Automatic...
Recently, heterogeneous information network (HIN) analysis has attracted a lot of attention. One application of HINs is recommendation. Because a HIN contains multiple types of objects and links with rich semantic meanings, it is a promising basis for generating better recommendations. Previous studies on movie recommendation have combined the single implicit feedback information with heterogeneous information...
Advanced statistics have proved to be a crucial tool for basketball coaches in order to improve training skills. Indeed, the performance of the team can be further optimized by studying the behaviour of players under certain conditions. In the United States of America, companies such as STATS or Second Spectrum use a complex multi-camera setup to deliver advanced statistics to all NBA teams, but the...
Sparse subspace clustering (SSC) is an effective approach to clustering high-dimensional data. However, how to adaptively select the number of clusters/eigenvectors for different data sets, especially when the data are corrupted by noise, is a big challenge in SSC and also an open problem in the field of data mining. In this paper, considering the fact that the eigenvectors are robust to noise, we develop...
In this work we demonstrate a method to detect controversy on news issues. This is done by analyzing people's reactions on social media to news articles reporting these issues. Detecting controversial news topics on the web is a relevant problem today. It helps to identify the issues on which people have divided opinions and is especially useful on topics such as a presidential election,...
Social-media debates on longitudinal political topics often take the form of adversarial discussions: highly polarized user posts, favoring one of two opposing parties, over an extended time period. Recent prominent cases are the US Presidential campaign and the UK Brexit referendum. This paper approaches such discussions as a multi-faceted data space, and applies data mining to identify interesting...
Support vector data description (SVDD) is a popular technique for detecting anomalies. The SVDD classifier partitions the whole space into an inlier region, which consists of the region near the training data, and an outlier region, which consists of points away from the training data. The computation of the SVDD classifier requires a kernel function, and the Gaussian kernel is a common choice for...
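The Gaussian kernel mentioned in this abstract can be sketched as follows. This is not the paper's implementation: the function name `gaussian_kernel`, the bandwidth parameter `s`, and the parametrization exp(-||x - y||^2 / (2 s^2)) are assumptions, since the truncated abstract does not say which form the authors use.

```python
import math


def gaussian_kernel(x, y, s=1.0):
    """Gaussian (RBF) kernel k(x, y) = exp(-||x - y||^2 / (2 * s^2)).

    `s` is the bandwidth; this parametrization is assumed for illustration.
    """
    if s <= 0:
        raise ValueError("bandwidth s must be positive")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * s ** 2))


# Toy points (illustrative only).
training_point = [0.0, 0.0]
query_point = [3.0, 4.0]  # Euclidean distance 5 from the training point

# A small bandwidth makes the two points look dissimilar, a large one similar.
k_narrow = gaussian_kernel(training_point, query_point, s=1.0)
k_wide = gaussian_kernel(training_point, query_point, s=5.0)
```

The kernel value falls from 1 toward 0 as points move apart, and the bandwidth controls how fast. That is why the bandwidth strongly affects how tightly a kernel-based boundary wraps the training data.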
Change point analysis is a statistical tool to identify homogeneity within time series data. We propose a pruning approach for approximate nonparametric estimation of multiple change points. This general purpose change point detection procedure 'cp3o' applies a pruning routine within a dynamic program to greatly reduce the search space and computational costs. Existing goodness-of-fit change point...
Deep learning techniques have been successfully applied to solve many problems in climate and geoscience using massive-scale observed and modeled data. For extreme climate event detection, several models based on deep neural networks have recently been proposed and attain superior performance that overshadows all previous handcrafted, expert-based methods. The issue arising, though, is that accurate...
This paper explores recent achievements and novel challenges of the annoying privacy-preserving big data stream mining problem, which consists of applying mining algorithms to big data streams while ensuring the privacy of the data. Recently, the emerging big data analytics context has shed new light on this exciting research area. This paper follows this research trend.