Restructuring web search results is the best solution for ambiguous queries submitted to a search engine. When an ambiguous query is entered, the search engine returns multiple results for the same query, so users do not get specific and accurate information about what they really want, and it becomes difficult for them to find specific information related to the submitted keyword. For this reason...
Clustering is a widely used technique in data mining applications for discovering patterns in underlying data. Most traditional clustering algorithms are limited to handling datasets that contain either numeric or categorical attributes. However, data sets with mixed types of attributes are common in real life data mining applications. In this paper, we introduce a new framework for clustering mixed...
In large-scale environments, robots should have proper internal representation of the surroundings for achieving tasks such as localization, navigation, and exploration. Internal representations could be categorized in two ways: metric (grid-based) map and topological map. In this paper, we aim to generate a topological map representation (collision-free graph) of the large-scale environment from...
Regression testing is an activity during the maintenance phase to validate the changes made to the software and to ensure that these changes do not affect the previously verified code or functionality. Often, regression testing is performed with limited computing resources and time budget. So in this phase, it is infeasible to run the complete test suite. Thus, test-case prioritization approaches...
Generating synthetic network graphs that capture key topological and electrical characteristics of real-world electric power systems is important in aiding widespread and accurate analysis of these systems. Classical statistical models of graphs, such as small-world networks or Erdős-Rényi graphs, are unable to generate synthetic graphs that accurately represent the topology of real electric power...
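For readers unfamiliar with the classical baseline named in this abstract, the following is a minimal sketch of an Erdős-Rényi G(n, p) generator: every possible edge is included independently with probability p. The function names `erdos_renyi` and `degrees` are hypothetical and chosen only for illustration; this shows the baseline model the abstract refers to, not the authors' own generator.

```python
import random


def erdos_renyi(n, p, seed=None):
    """Return the edge list of a G(n, p) random graph on nodes 0..n-1.

    Each of the n*(n-1)/2 possible undirected edges is included
    independently with probability p.
    """
    rng = random.Random(seed)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if rng.random() < p
    ]


def degrees(n, edges):
    """Count the degree of every node from an undirected edge list."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


if __name__ == "__main__":
    edges = erdos_renyi(20, 0.2, seed=1)
    print(len(edges), "edges; degree sequence:", degrees(20, edges))
```

Because edges are placed independently, node degrees in this model follow a binomial distribution.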
In batch systems, monitoring information at the level of individual jobs is crucial to optimize resource utilization and prevent misuse. However, the usage of network resources in particular is difficult to track. In order to understand usage patterns in modern computing clusters, more detailed monitoring than existing solutions provide is required. Monitoring at the job level leads to dynamic graphs of processes...
Text segmentation (TS) aims at dividing long text into coherent segments which reflect the subtopic structure of the text. It is beneficial to many natural language processing tasks, such as Information Retrieval (IR) and document summarisation. Current approaches to text segmentation are similar in that they all use word-frequency metrics to measure the similarity between two regions of text, so that...
We propose an approach for discovering functional communities in social media by identifying groups of users who interact with similar content, represented as dense biclusters in a user-content matrix. We present a heuristic algorithm to efficiently search the space of possible co-clusterings for one which maximizes the value of a given metric, along with a new class of co-clustering metrics that...
Dynamic community detection algorithms try to identify communities in a dynamic network, which consists of a series of network snapshots. To address this issue, here we propose a new dynamic community detection algorithm based on incremental identification according to a vertex-based metric called permanence. We incrementally analyze the community ownership of partial vertices, so as to...
Clustering explores meaningful patterns in the non-labeled data sets. Cluster Ensemble Selection (CES) is a new approach, which can combine individual clustering results for increasing the performance of the final results. Although CES can achieve better final results in comparison with individual clustering algorithms and cluster ensemble methods, its performance can be dramatically affected by its...
The goal of this paper was to apply the fuzzy clustering algorithm known as Fuzzy C-Means to color image segmentation, which is an important problem in pattern recognition and computer vision. For computational experiments, serial and parallel versions were implemented. Both were tested using various parameters and random number generator seeds. Various distance measures were used: Euclidean, Manhattan...
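As an illustration of the technique this abstract names, here is a minimal serial sketch of textbook Fuzzy C-Means with a pluggable distance function and a seeded random initialization. It is not the authors' implementation: the function names, the fuzzifier default `m=2.0`, the stopping tolerance, and the sample pixel values are all assumptions made for the example. Note that the weighted-mean center update is the standard one derived for Euclidean distance; using it with Manhattan distance, as done here for simplicity, is a heuristic.

```python
import random


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def fuzzy_c_means(points, c, m=2.0, dist=euclidean, iters=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for points given as equal-length tuples."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership**m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ))
        # Membership update from relative distances to all centers.
        exponent = 2.0 / (m - 1.0)
        new_u = []
        for p in points:
            d = [dist(p, ctr) for ctr in centers]
            if any(x == 0.0 for x in d):
                # Point coincides with a center: give it full membership there.
                hits = sum(1 for x in d if x == 0.0)
                row = [1.0 / hits if x == 0.0 else 0.0 for x in d]
            else:
                row = [
                    1.0 / sum((d[j] / d[k]) ** exponent for k in range(c))
                    for j in range(c)
                ]
            new_u.append(row)
        change = max(abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(c))
        u = new_u
        if change < tol:
            break
    return centers, u


if __name__ == "__main__":
    # Hypothetical RGB pixels: three dark, three bright.
    pixels = [(0, 0, 0), (10, 5, 0), (5, 10, 5),
              (250, 250, 250), (240, 245, 255), (255, 240, 250)]
    centers, u = fuzzy_c_means(pixels, c=2, dist=manhattan)
    for p, row in zip(pixels, u):
        print(p, [round(v, 3) for v in row])
```

To segment an image, each pixel would be assigned to the cluster with its highest membership value.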
We suggest an effective method for solving the problem of correlation clustering. This method is based on an extension of a partial tolerance relation to clusters. We present several implementations of this method using different data structures, and we show a method to speed up the execution through quasi-parallelism.
Graph partitioning, or graph cut, has been studied by several authors as a tool for image segmentation. It refers to partitioning a graph into several subgraphs such that each of them represents a meaningful object of interest in the image. In this work we propose a hierarchical agglomerative clustering algorithm driven by the cut and mean cut criteria. Some preliminary experiments were performed...
Clustering items using textual features is an important problem with many applications, such as root-cause analysis of spam campaigns, as well as identifying common topics in social media. Due to the sheer size of such data, algorithmic scalability becomes a major concern. In this work, we present our approach for text clustering that builds an approximate k-NN graph, which is then used to compute...
In this paper, we propose a novel user clustering algorithm for non-orthogonal multiple access (NOMA) considering the channel correlation between users and the channel gain. We also adopt a sum-rate maximization approach to find an optimal precoding matrix in the multi-user multiple-input-multiple-output (MU-MIMO) setup after the user clustering. Grouping two users in a single beam to serve users in...
Multi-tenant storage management environments typically manage multiple enterprise accounts with heterogeneous storage footprints. Identifying and grouping accounts with similar storage footprints into clusters reduces account management overhead, and provides a framework for data-driven storage recommendation services. This paper describes a method for the clustering of accounts in multi-tenant storage...
The rollout of smart meters introduces “Time of Use” tariffs to incentivize demand response for household customers. This paper describes a methodology to identify the impact of demand response in customer load profiles and applies it to a smart meter data set. The smart meter data for residential households is from the Irish CER Smart Metering Project. The profiles are segmented via k-means clustering...
This paper reports a robustness comparison of clustering-based multi-label classification methods versus non-clustering counterparts for multi-concept associated image and video annotations. In the experimental setting of this paper, we adopted six popular multi-label classification algorithms, two different base classifiers for problem transformation based multi-label classifications, and three different...
Interference Alignment (IA) in heterogeneous networks (HetNets) is a promising technique that improves the spectral efficiency significantly. We showed in [1] that transmit antennas at pico BSs could be utilized more efficiently by clustering pico cells in IA in HetNet where the clustering formation was optimized so as to minimize the rate loss caused by inter-cluster interference. In [1], the optimum...