This paper proposes a Contrarian Probabilistic Model (CPM) to evaluate the effectiveness of contrarians' investment in preferred stocks using big data from Tradeline. CPM accommodates the unique features of investment data, which are often correlated, nested, heterogeneous, and non-normal, with missing values. Clustering and statistical inference are integrated in CPM, which enables joint investment...
In this paper we accelerate the Alternating Least Squares (ALS) algorithm used for generating product recommendations on the basis of implicit feedback datasets. We approach the algorithm with concepts proven to be successful in High Performance Computing. This includes the formulation of the algorithm as a mix of cache-optimized algorithm-specific kernels and standard BLAS routines, acceleration...
A Business Cloud is defined to be a collection of company datasets that are stored on the "Cloud". For simplicity, we assume that each company has only one dataset and that there are information flows among these datasets. Within such an environment, the Chinese Wall Security Policy (CWSP) is revisited. Based on the "physical" view of Brewer and Nash, the Chinese Wall policy that regulates...
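As a rough illustration of the access rule this abstract builds on, the sketch below implements the classic Brewer and Nash read rule: a subject may access a company dataset only if it has not already accessed a different company in the same conflict-of-interest class. It is not the revisited policy from the paper, and the class name `ChineseWall`, the company names, and the conflict classes are hypothetical.

```python
class ChineseWall:
    """Minimal sketch of the classic Brewer-Nash read rule (not the paper's variant)."""

    def __init__(self, conflict_class):
        # Hypothetical mapping: company name -> conflict-of-interest class.
        self.conflict_class = dict(conflict_class)
        # Access history: subject -> set of companies already accessed.
        self.history = {}

    def can_access(self, subject, company):
        target_class = self.conflict_class[company]
        for seen in self.history.get(subject, set()):
            # A different company in the same conflict class blocks access.
            if seen != company and self.conflict_class[seen] == target_class:
                return False
        return True

    def access(self, subject, company):
        """Record the access if the rule allows it; return whether it was allowed."""
        if not self.can_access(subject, company):
            return False
        self.history.setdefault(subject, set()).add(company)
        return True


wall = ChineseWall({"BankA": "banks", "BankB": "banks", "OilCo": "oil"})
first = wall.access("alice", "BankA")    # first access in the "banks" class
second = wall.access("alice", "OilCo")   # different conflict class
third = wall.access("alice", "BankB")    # competitor of BankA
```

Here the third request is refused because "alice" has already read a dataset from a competing bank, while a fresh subject would still be free to choose either bank.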
Volatility analysis plays a major role in finance and economics. It is the key input for many financial topics including risk management, option and derivative pricing. One pressing computational hurdle in high frequency financial statistics is the tremendous amount of data and the optimization procedures that require computing power beyond the currently available desktop systems. In this article,...
In this paper, we present a method to dynamically predict the failure of physiological subsystems from patients admitted to the Intensive Care Unit (ICU) using heterogeneous data. We model the probability of failure in each subsystem as a latent state that evolves over time. We propose a method using Generalized Linear Dynamic models to model this latent state which is updated each time new patient...
The task of community detection in a graph formalizes the intuitive task of grouping together subsets of vertices such that vertices within clusters are more tightly connected than those in disparate clusters. This paper approaches community detection in graphs by constructing Markov random walks on the graphs. The mixing properties of the random walk are then used to identify communities. We use coupling...
To efficiently utilize their cloud-based services, consumers have to continuously monitor and manage the Service Level Agreements (SLA) that define the service performance measures. Currently this is still a time- and labor-intensive process since the SLAs are primarily stored as text documents. We have significantly automated the process of extracting, managing and monitoring cloud SLAs using natural...
Nodes of a social graph often represent entities with specific labels, denoting properties such as age-group or gender. Design of algorithms to assign labels to unlabeled nodes by leveraging node-proximity and a-priori labels of seed nodes is of significant interest. A semi-supervised approach to solve this problem is termed "LPA-Label Propagation Algorithm" where labels of a subset of nodes...
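The general idea of semi-supervised label propagation can be sketched as follows. This is a generic textbook-style variant, assuming synchronous rounds, fixed seed labels, and ties broken by the smallest label; it is not necessarily the algorithm studied in the paper, and the function name `propagate_labels` and the toy graph are hypothetical.

```python
from collections import Counter


def propagate_labels(adjacency, seeds, max_rounds=100):
    """Assign labels to unlabeled nodes by majority vote of labeled neighbors.

    adjacency: dict mapping each node to a list of neighbor nodes.
    seeds: dict mapping seed nodes to their a-priori labels (never changed).
    """
    labels = dict(seeds)
    for _ in range(max_rounds):
        updates = {}
        for node, neighbors in adjacency.items():
            if node in seeds:
                continue  # seed labels stay fixed
            counts = Counter(labels[n] for n in neighbors if n in labels)
            if not counts:
                continue  # no labeled neighbor yet
            top = max(counts.values())
            # Assumption: break ties deterministically by the smallest label.
            choice = min(label for label, c in counts.items() if c == top)
            if labels.get(node) != choice:
                updates[node] = choice
        if not updates:
            break  # converged
        labels.update(updates)  # synchronous update at the end of the round
    return labels


# Toy graph: two triangles (1-2-3 and 4-5-6) joined by the edge 3-4.
graph = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5, 6], 5: [4, 6], 6: [4, 5]}
result = propagate_labels(graph, {1: "A", 6: "B"})
```

On this toy graph each triangle ends up with the label of its own seed node, since every unlabeled node has more neighbors on its own side of the bridging edge.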
The increasing popularity of mobile devices has brought severe challenges to device usability and big data analysis. In this paper we investigate the intellectual recommender system on cell phones by incorporating mobile data analysis. Nowadays, with the development of smart phones, more and more applications have emerged in various areas, such as entertainment, education and health care. While these...
Truth table optimization is of great importance for simplification of combinational logic circuits. In this paper, Granular Computing (GrC) and statistical methods are combined to convert the traditional big truth table optimization problem into the minimal rule discovery of the logic information system. The proposed method improves on our former work. The possible solutions were searched in...
This paper proposes to study a novel problem, discovering a Smallest Unique Subgraph (SUS) for any node of interest specified by the user in a heterogeneous social network. The rationale of the SUS problem lies in how a person is different from all others in a social network, and how to represent the identity of a person using her surrounding relational structure in a social network. To deal with the...
Acquiring a network of trust relations among users in social media sites, e.g., item-review sites, is important for analyzing users' behavior and efficiently finding reliable information on the Web. We address the problem of predicting trust links among users for an item-review site. Non-negative matrix factorization (NMF) methods have recently been shown to be useful for trust-link prediction in such a...
Data quality is a challenging problem in many real world application domains. While a lot of attention has been given to detecting anomalies for data at rest, detecting anomalies for streaming applications still largely remains an open problem. For applications involving several data streams, the challenge of detecting anomalies has become harder over time, as data can dynamically evolve in subtle ways...
We provide an algorithm to build quantile regression trees in O(N log N) time, where N is the number of instances in the training set. Quantile regression trees are regression trees that model conditional quantiles of the response variable, rather than the conditional expectation as in standard regression trees. We build quantile regression trees by using the quantile loss function in our node splitting...
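The quantile loss mentioned here is commonly defined as the pinball loss. The sketch below assumes that standard definition and shows that the constant minimizing it over a sample is a sample quantile. The function names are hypothetical, and the brute-force search is quadratic in the sample size; it does not reproduce the O(N log N) tree-building algorithm of the paper.

```python
def quantile_loss(y_true, y_pred, tau):
    """Average pinball loss of predictions y_pred at quantile level tau in (0, 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must be strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def best_constant(values, tau):
    """Brute-force search for the sample value that minimizes the pinball loss."""
    n = len(values)
    return min(sorted(values), key=lambda c: quantile_loss(values, [c] * n, tau))


sample = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(sample, 0.5)
```

With tau = 0.5 the minimizer is the sample median, which is unaffected by the outlier 100.0, whereas a standard regression tree leaf would predict the mean.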
The problem of discovering all formal concepts embedded in a binary relational dataset is of significant interest for many data analysis and processing problems. The problem of enumerating all concepts for a dataset is known to be NP-hard. A number of Map-Reduce based algorithms have been developed to conquer the difficulty of processing large datasets. But these algorithms are not very scalable because...