This paper proposes a Contrarian Probabilistic Model (CPM) to evaluate the effectiveness of contrarians' investment in preferred stocks using big data from Tradeline. CPM accommodates the unique features of investment data, which are often correlated, nested, heterogeneous, and non-normal, with missing values. Clustering and statistical inference are integrated in CPM, which enables joint investment...
Diffusion Tensor Imaging (DTI) is an effective tool for the analysis of structural brain connectivity in normal development and in a broad range of brain disorders. However, efforts to derive inherent characteristics of structural brain networks have been hampered by the very high dimensionality of the data, relatively small sample sizes, and the lack of widely acceptable connectivity-based regions...
Influence among objects is prevalent in graph-structured data. However, most existing research efforts detect influence among objects from snapshots of homogeneous graphs. In this paper, we study a new problem of detecting time-evolving influence among objects from dynamic heterogeneous graphs. We propose a probabilistic graphical model, Time-evolving Influence Model (TIM), to capture the temporal...
Entity matching is the problem of determining whether two entities in a data set refer to the same real-world object. In the last decade, a growing number of large-scale knowledge bases have been created online. Tools for automatically aligning these sources would make it possible to unify them into structured knowledge and to answer complex queries. Here we present Holistic Entity Matching (HolisticEM),...
In this paper we present novel experimental results comparing two interpretations of missing attribute values: attribute-concept values and "do not care" conditions. Experiments were conducted on 12 data sets with many missing attribute values using the MLEM2 rule induction system. In the experiments, three kinds of probabilistic approximations were used: singleton, subset and concept; with...
This paper discusses how to formalize medical diagnostic reasoning from the viewpoint of rule reasoning. The characteristics of the rules show that the rule model is closely related to the rough set rule model. The important point is that medical diagnostic reasoning is characterized by a focusing mechanism, composed of screening and differential diagnosis, which corresponds to upper approximation and lower approximation...
We introduce a new neural network based similarity model for learning document relevance under a query. The main idea is to use the binomial distribution to model the proportion of people who clicked document d under query q among the users who viewed d under q. Our model is a generalization of existing neural network based latent semantic models in that both its objective function and its parametrization...
The water management field has attracted great interest, with the potential to affect long-term well-being, the societal economy, and security. In parallel, it poses specific research challenges that have not yet been met, due to the lack of fine-grained data. Knowledge extraction and decision making for efficient management in the energy field have attracted a lot of interest in Big Data...
A set-valued dataset contains different types of items/values per individual, for example, visited locations, purchased goods, watched movies, or search queries. As it is relatively easy to re-identify individuals in such datasets, their release poses significant privacy threats. Hence, organizations aiming to share such datasets must adhere to personal data regulations. In order to get rid of these...
There are users who generate significant amounts of domain knowledge in online forums or community question and answer (CQA) websites. Existing literature defines them as ‘experts.’ These users attain such statuses by providing multiple relevant answers to the question askers. Past works have focused on recommending relevant posts to these users. With the rise of web forums where certified experts...