In this paper we present a novel approach for the early detection of events in blog entries. Trend detection has already been discussed frequently. Nevertheless, in our understanding the detection of events goes one step further. The presented algorithm detects unique happenings at a given point in time by perceiving unusually frequent occurrences of words or word groups. We introduce an implementation...
Using page-level metrics of a randomly selected group of 15,625 among the top 100,000 Facebook check-in locations which rank high in terms of customer engagement, we explore whether the short-term dynamical information on these metrics could deliver, via a clustering approach, some new insights for marketing decision making. Using a highly-scalable clustering algorithm, statistical methods, and combinatorial...
Real data often comprise multiple modalities or different views, which provide complementary and consensus information to each other. Exploring this information is important for multi-view data clustering and classification. Multiview embedding is an effective method for multi-view data which uncovers the common latent structure shared by different views. Previous studies assumed that...
Digital photo management is becoming indispensable for the explosively growing family photo albums due to the rapid popularization of digital cameras and mobile phone cameras. An effective photo management system could accurately and efficiently group all faces of the same person into a small number of clusters. In this paper, we present a novel photo grouping method based on spectral theory. The...
Differentiating people on the basis of their names has always been a complex issue, and our desire for grouping people, in a particular domain, based on their attributes is growing day by day. Despite years of research and numerous proposed techniques, the name ambiguity problem remains largely unsolved, and the techniques proposed so far have each faced one problem or another. In case of author name...
Semi-supervised clustering is a popular machine learning technique, used for challenging data categorization tasks when some prior knowledge is available to users. In this paper, we report empirical studies of our newly proposed semi-supervised clustering framework, which utilizes multiple viewpoints for the similarity measure, with the help of the prior knowledge. Two different MVS-based approaches...
Three different families of hierarchical clustering methods are introduced that satisfy the axioms of value (in a network with two nodes, the nodes cluster together at resolutions at which both can influence each other) and transformation (when we reduce some pairwise dissimilarities and increase none, the resolutions at which nodes cluster together may decrease but not increase). The grafting family...
Schema matching is widely used in many database applications, such as data integration, data warehousing, dataspaces, and ontology merging. In this paper, we propose multi-schema matching based on web structured information sources. There are two meanings at this point. Traditional matching techniques mainly address matching tasks between two attributes, namely pairwise attribute correspondence....
Schema matching plays an important role in many database applications, such as ontology merging, data integration, data warehousing, and dataspaces. The problem of schema matching is to find the semantic correspondence between attributes of the schemas to be matched. In this paper, we propose multi-schema matching based on clustering techniques. Traditional matching techniques mainly address matching tasks...
In this paper, we propose a skyline computation system, UCOS (User Clustering based Online Skyline), which divides the computation into offline and online stages. Based on the observation that QoS similarity implies skyline similarity, the offline stage of the UCOS system performs user clustering according to the historical user-service QoS records using given distance metrics. Then, we compute the representative...
In active learning, as few raw samples as possible are queried to learn an accurate classifier. However, queried samples may suffer from low diversity if they are selected without considering sample content. The classifier would then be trained inefficiently on similar queried samples. In this paper, an approach, ALUC, is proposed to increase the diversity of queried uncertain samples...
Context-aware querying aims to give users suitable query results based on their contexts. When a user issues a query as a leader or a representative, s/he often needs to consider a group of people. To this end, a context-aware database should satisfy the contexts of most of the people in this situation. In this paper, we propose an approximation algorithm to compute context-aware group top-k query...
The problem of hierarchically clustering items from pairwise similarities is found across various scientific disciplines, from biology to networking. Often, applications of clustering techniques are limited by the cost of obtaining similarities between pairs of items. While prior work has been developed to reconstruct clustering using a significantly reduced set of pairwise similarities via adaptive measurements,...
In recent years, there has been increasing interest in the research community in finding community structure in complex networks. The networks are usually represented as graphs, and the task is usually cast as a graph clustering problem. Traditional clustering algorithms and graph partitioning algorithms have been applied to this problem. New graph clustering algorithms have also been proposed. Random...
The affinity propagation clustering algorithm is of broad value in science and engineering because it does not require the number of clusters to be specified in advance and because of its robustness and good generalization. However, the algorithm needs the initial similarity (the distance between any two points) as a parameter, and calculating the similarity requires a lot of time and storage space. It's limited to apply to cluster...
Currently, the amount of information on the Internet is growing explosively. To help users find the items they are interested in, such as news and books, we propose in this paper an automatic personalized recommendation algorithm that constructs a social graph from the users' implicit interaction information. We first introduce a metric to measure the users' affinity based...
This paper presents a novel pairwise clustering approach. We pose the problem as a question of parameter estimation and show that the pairwise indicator variables can be estimated by using the maximum likelihood estimation (MLE) method. Based on this, a two-level clustering algorithm is developed: the grouping graph is first condensed by using the MLE results and then the k-means clustering method is applied...
Gesture recognition is an important aspect of interpersonal social interaction. Developing a similar capacity in a robot will improve human-robot interaction. Various unsupervised clustering methods applied to clustering a set of dynamic human arm gestures are compared. Unsupervised clustering is important in gesture recognition as it imposes no a priori bound on the set of gestures. Results are compared...
Evolutionary changes in object-oriented systems can result in large, complex classes, known as "God Classes". In this paper, we present a tool, developed as part of the JDeodorant Eclipse plugin, that can recognize opportunities for extracting cohesive classes from "God Classes" and automatically apply the refactoring chosen by the developer.
The repeated random walks algorithm (RRW) is a recently proposed graph clustering algorithm. RRW has been shown to achieve better performance on functional module discovery in protein-protein interaction networks than the Markov Clustering Algorithm (MCL). There is, however, little work applying RRW to community detection in social networks. We ran RRW on some real-world social networks that are well-documented...