This paper focuses on developing classification algorithms for problems in which there is a need to predict the class based on multiple observations (examples) of the same phenomenon (class). These problems give rise to a new classification problem, referred to as set classification, that requires the prediction of a set of instances given the prior knowledge that all the instances of the set belong...
Constraint-based mining has been proven to be extremely useful. It has been applied not only to many pattern discovery settings (e.g., for sequential pattern mining) but also, recently, to classification and clustering tasks. It appears as a key technology for an inductive database perspective on knowledge discovery in databases (KDD), and constraint-based mining is indeed an answer...
For multi-view learning, existing methods usually exploit the originally provided features for classifier training, which ignores the latent correlation between different views. In this paper, semantic features integrating information from multiple views are extracted for pattern representation. Canonical correlation analysis is used to learn the representation of semantic spaces where semantic features...
This paper proposes a novel framework for incorporating protein-protein interaction (PPI) ontology knowledge into PPI extraction from biomedical literature in order to address the emerging challenges of deep natural language understanding. It is built upon the existing work on relation extraction using the hidden vector state (HVS) model. The HVS model belongs to the category of statistical learning...
This paper presents a new keyword extraction algorithm for Chinese news Web pages using lexical chains and word co-occurrence combined with frequency features, cohesion features, and correlation features. A lexical chain is the outward expression of cohesion among semantically related words of a text, and represents the semantic content of a portion of the text. Word co-occurrence distribution...
Motivated by the need for unification of the field of data mining and the growing demand for formalized representation of outcomes of research, we address the task of constructing an ontology of data mining. The proposed ontology, named OntoDM, is based on a recent proposal of a general framework for data mining, and includes definitions of basic data mining entities, such as datatype and dataset,...
Active KDD research groups typically make their software tools available to others over the net. However, integrating and reusing these tools typically requires a considerable amount of time to understand the software's scope and use, install it, and transform data into a format compatible with the required input. This paper introduces a semantic based, service-oriented framework for tools sharing and reuse,...
We describe Deimos, a system that automatically discovers and models new sources of information. The system exploits four core technologies developed by our group that make an end-to-end solution to this problem possible. First, given an example source, Deimos finds other similar sources online. Second, it invokes and extracts data from these sources. Third, given the syntactic structure of a source,...
The performance of user profiling models depends on both the predictive accuracy and the cost of incorrect predictions. In this paper we study whether including contextual information leads to a decrease in the misclassification cost. Several experimental analyses were done by varying the cost ratio, the market granularity and the granularity of context. The experimental results show that context...
A variety of services have recently been provided depending on highly developed networks and personal equipment. With these advances, connecting this equipment has become increasingly complicated. Connection failures have increased, and determining their cause has become difficult in some cases because software is often updated to keep up with advancements in services or security. Telecom...
The growing complexity and variability characterizing markets have induced scholars and marketers to propose new segmentation approaches. Recent research has shown that including the context in which a transaction occurs in customer behavior models improves the ability to predict their behavior. However, no systematic research has studied whether contextual information really matters in market...
Several marketing problems involve prediction of customer purchase behavior and forecasting future preferences. We consider predictive modeling of large scale, bi-modal or multimodal temporal marketing data, for instance, datasets consisting of customer spending behavior over time. Such datasets are characterized by variability in purchase patterns across different customer subgroups and shifting...
This paper proposes a support system for composing good titles for research papers in order to reach new audiences. Our system takes titles as input, evaluates the understandability and interest level of each title, ranks the titles, and outputs a title list. Users are able to recompose their titles by referring to the list and each evaluation value. Using the system, users can obtain...
The "value" in this paper can be dealt with as a new variable which business workers create from their interaction with the dynamic environment, on which they redesign products and the market sustainably. Here we first show how data mining and data visualization can provide useful tools for aiding marketers'/designers' sensitivity to emerging values of consumers/users. By visualizing...
The purpose of this study is to propose a knowledge-discovery system that can abstract helpful information from character strings representing shopper visits to product sections associated with positive and negative purchasing events by applying character string parsing technologies to stream data describing customer purchasing behavior inside a store. Taking data that traced customers' movements...
There are two main requirements for effective advertising in social networks. The first is that links in the social network are relevant to the targeted ads. The second is that social information can be easily incorporated with existing targeting methods to predict response rates. Our purpose in this paper is to investigate these requirements. We measure the relevance of a social network, the Yahoo!...
Semantic concept learning is one of the most challenging problems in video retrieval. The key barrier for semantic concept learning is the lack of annotated training data. Internet videos are different from ordinary videos: massive, rich information, customized, non-uniform format, uneven quality, little descriptive text, only a few shots with limited length etc. Therefore, the Internet is a potential repository...
Conventional video representation methods focus predominantly on a single video, aiming at reducing the space-time redundancy as much as possible, while this paper describes a novel approach to simultaneously presenting dynamics of multiple videos, aiming at a less intrusive viewing experience. Given a main video and multiple supplementary videos, the proposed approach automatically constructs a synthesized...
A new feature description is used for human behaviour representation and recognition. The feature is based on Radon transforms of extracted silhouettes. Key postures are selected based on the Radon transform. Key postures are combined to construct an action template for each sequence. Linear discriminant analysis (LDA) is applied to the set of key postures to obtain low dimensional feature vectors...
In the problem of face clustering with multi-views, the similarity between faces of different persons with similar pose is usually greater than the similarity between multi-view faces of the same person. This may exert a tremendous impact on the clustering result that is sent back to the user. To solve this problem, we should do pose clustering first and then within each 'pose group', clustering...