In modern sales organizations, the sales force is constantly in flux due to sellers retiring or leaving to take positions in other organizations, and due to new sellers being hired from universities, from other organizations, and transferring in from different divisions within the enterprise. The productivity of sellers in the time period after these human resource events is not the same as that of...
Computerization of hospital information enables us to visualize and analyze temporal characteristics of hospital services, which can be viewed as a first step to improve and innovate clinical services. This paper proposes a temporal data mining process which consists of decision tree, clustering, MDS and three-dimensional trajectories mining. The results show that the reuse of stored data will give...
This paper addresses information extraction from differently structured web sites using a user-instantiated example. A user-instantiated example consists of labels, which serve as criteria for deciding whether to purchase a target product or service, and instances related to the labels. When the information extraction method outputs the information in table form, labels are used as the column headings of the table and...
Support Vector Machines (SVM) and Nonnegative Matrix Factorization (NMF) are standard tools for data analysis. We explore the connections between these two problems, thereby enabling us to import algorithms from the SVM world to solve NMF and vice-versa. In particular, one such algorithm developed to solve SVM is adapted to solve NMF. Empirical results show that this new algorithm is competitive with...
A lot of constrained patterns (e.g., emerging patterns, subgroup discovery, classification rules) emphasize the contrasts between data classes and are at the core of many classification techniques. Nevertheless, the extremely large collection of generated patterns hampers the end-user interpretation and the deep understanding of the knowledge revealed by the whole collection of patterns. The key idea...
This study shows a method of determining and visualizing the existence probability of customers from shopping-path data in supermarkets using a database collected by an RFID (Radio Frequency Identification) technique, which allows us to analyze the detailed behaviors of customers. First, we present a method to estimate customer existence probability density on the sales floor using a Kernel density...
Multitask network structure learning is an important problem in several scientific domains, such as computational neuroscience and bioinformatics. However, existing algorithms do not leverage valuable domain knowledge about the relatedness of tasks. We present the first multitask Bayesian network learning algorithm that incorporates task-relatedness. Empirical results demonstrate that our algorithm...
In this paper, we address two sub-problems within the broad topic of similarity search, focusing on the enhancement of search efficiency based on their common clue, "distance". One is the fundamental query type, k-nearest neighbor (k-NN) and range queries, which concern distance comparison in terms of nearness. The other is a relatively special query type, reverse furthest neighbor (RFN) query,...
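For readers unfamiliar with the two fundamental query types named in this abstract, the following is a minimal sketch of k-NN and range queries as a brute-force linear scan. It is not the paper's method (the abstract is truncated and its efficiency techniques are not shown here); the function names `knn` and `range_query`, the Euclidean metric, and the sample points are assumptions made for illustration.

```python
import heapq
import math

def knn(points, query, k):
    """Return the k points closest to `query` (Euclidean distance, linear scan)."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))

def range_query(points, query, radius):
    """Return every point whose distance to `query` is at most `radius`."""
    return [p for p in points if math.dist(p, query) <= radius]

# Hypothetical sample data, for illustration only.
points = [(0, 0), (1, 1), (3, 4), (5, 5)]
nearest_two = knn(points, (0, 0), 2)
within_five = range_query(points, (0, 0), 5)
```

Both functions compare every point against the query, so each costs time linear in the number of points; index structures for similarity search aim to avoid most of these distance computations.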
We revisit well-known variables for database marketing/CRM and relationship marketing using a new methodology: Binary Bayesian Quantile regression. This method allows for a more thorough investigation of the relationship between the response variable and the covariates. The main conclusion is that taking intentions as a proxy for real churn behavior yields biased results because the effects are differentially...
Concept drift is believed to be prevalent in most data gathered from naturally occurring processes and thus warrants research by the machine learning community. There are a myriad of approaches to concept drift handling which have been shown to handle concept drift with varying degrees of success. However, most approaches make the key assumption that the labelled data will be available at no labelling...
Concept drift is usually met in rapidly changing environments, especially in sequential data classification, where different types of concept drift occur on a regular basis. This paper presents an approach to dynamic visualization of sequential data characteristics aiming to improve the comprehensibility of concept drifts that result in significant change of classification performance. The proposed...
Privacy protection is one of the key requirements of smart grids. To understand the importance of privacy threats, it is necessary to study the nature of power signals. In this paper, we propose a well-known statistical method which relies on the empirical probability distribution. The method is used to reveal trends in the power signal data and how these trends are changed if a) different data sampling...
In the cloud computing environment, resources are accessed as services rather than as a product. Monitoring this system for performance is crucial because of typical pay-per-use packages bought by the users for their jobs. With the huge number of machines currently in the cloud system, it is often extremely difficult for system administrators to keep track of all machines using distributed monitoring...
The analysis of large-scale networks requires parallel graph-processing techniques. Hadoop, an open-source implementation of Map/Reduce, has gained popularity through its high efficiency, scalability and fault tolerance. However, Map/Reduce, as a simplified programming model, tends to be used in applications with massive datasets and simple processing. In this paper, we aim to adapt Map/Reduce...
A system for efficient team formation in social networks is demonstrated. Given a project whose completion requires a set of skills, our system finds a set of experts that together have all of the required skills and also have the minimal communication cost. The system finds the best teams with or without a leader using two types of communication structures. After discovering the teams of experts,...
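The problem statement in this abstract (cover all required skills while minimizing communication cost) can be illustrated with a small exhaustive search. This is a sketch under stated assumptions, not the demonstrated system: the communication cost is assumed here to be the sum of pairwise distances between team members, the leader variant is omitted, and the function name `best_team`, the expert names, and the distances are all hypothetical.

```python
from itertools import combinations

def best_team(experts, required, dist):
    """Exhaustively find the skill-covering team with the lowest pairwise cost.

    experts:  dict mapping expert name -> set of skills
    required: set of skills the project needs
    dist:     dict mapping frozenset({a, b}) -> communication cost (assumed metric)
    """
    names = sorted(experts)
    best, best_cost = None, float("inf")
    for size in range(1, len(names) + 1):
        for team in combinations(names, size):
            covered = set().union(*(experts[n] for n in team))
            if not required <= covered:
                continue  # team is missing at least one required skill
            # Assumed cost model: sum of distances over all pairs in the team.
            cost = sum(dist[frozenset(pair)] for pair in combinations(team, 2))
            if cost < best_cost:
                best, best_cost = team, cost
    return best, best_cost

# Hypothetical example data.
experts = {
    "ann": {"ml", "db"},
    "bob": {"ui"},
    "cat": {"ui", "db"},
    "dan": {"ml"},
}
dist = {
    frozenset({"ann", "bob"}): 3, frozenset({"ann", "cat"}): 1,
    frozenset({"ann", "dan"}): 2, frozenset({"bob", "cat"}): 1,
    frozenset({"bob", "dan"}): 1, frozenset({"cat", "dan"}): 4,
}
team, cost = best_team(experts, {"ml", "db", "ui"}, dist)
```

The search enumerates every subset of experts, so it is only practical for a handful of candidates; it is meant to make the objective concrete, not to scale to a real social network.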