Locality Sensitive Hashing (LSH) is a widely used similarity search technique for many web services, such as content-based retrieval services for images and videos. Due to its popularity, much research effort has been devoted to improving the search quality and the indexing and query performance of LSH. However, most existing variants of LSH can only run on a single node, which limits their applicability...
Large-scale graph computation is central to applications ranging from language processing to social networks. However, natural graphs tend to have skewed power-law distributions where a small subset of the vertices have a large number of neighbors. Existing graph-parallel systems suffer from load imbalance, high communication cost, or suboptimal and complex processing. In this paper we present GraphA,...
This paper proposes the GraphF abstraction, which exploits the Adaptive Radix Tree for efficient graph indexing with lower storage cost. Leveraging the GraphF abstraction, we implement a separate graph computation engine on Spark. Experiments showed that on average GraphF outperforms GraphX and PowerGraph by up to 8.1X and 3.6X respectively in execution time, for both real-world and synthetic graphs....
This paper presents SMARTPARTITION, an efficient approach to partitioning large-scale natural graphs [5]. We design a new partitioning algorithm that perceives graph layout locality to improve the performance of graph computation. Experimental results demonstrate that SMARTPARTITION can achieve a significant reduction in ingress time and execution time for both real-world and synthetic graph datasets.
Locality Sensitive Hashing (LSH) is an important indexing technique for approximate similarity search in high-dimensional spaces. An obvious limitation of LSH approaches is the lack of capability and scalability to deal with massive data. This paper proposes a distributed variant of LSH called Spark-LSH, which is implemented on Apache Spark, a well-known distributed computing framework. We design...
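To make the idea of LSH-based approximate similarity search concrete, here is a minimal single-machine sketch of one common LSH family, random-hyperplane hashing for cosine similarity. It is an illustration only and is not the Spark-LSH design described in the abstract; the function names (`make_hyperplanes`, `signature`, `build_index`, `query`) and the parameters `num_bits` and `seed` are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions (assumed setup)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """Hash a vector to a bit tuple: one bit per hyperplane, set by the sign of the dot product."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def build_index(vectors, hyperplanes):
    """Group vector indices into buckets keyed by their signature."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[signature(vec, hyperplanes)].append(idx)
    return buckets


def query(vec, buckets, hyperplanes):
    """Return candidate neighbours: indices whose signature matches the query's."""
    return buckets.get(signature(vec, hyperplanes), [])


# Hypothetical toy data: vector 1 points in the opposite direction to vector 0.
vectors = [[1.0, 2.0, 3.0], [-1.0, -2.0, -3.0]]
planes = make_hyperplanes(num_bits=8, dim=3)
index = build_index(vectors, planes)
candidates = query([2.0, 4.0, 6.0], index, planes)
```

Vectors pointing in similar directions tend to fall on the same side of most hyperplanes and so share a bucket, which is why only the candidates in the matching bucket need an exact comparison. A distributed variant would additionally have to decide how buckets are spread across nodes, which this sketch does not address.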
Data clustering is usually time-consuming, since by default it needs to iteratively aggregate and process large volumes of data. Approximate aggregation based on samples provides fast, quality-assured results. In this paper, we propose applying approximation techniques to data clustering to trade off clustering efficiency against result quality, along with online accuracy estimation...
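As a rough illustration of sample-based approximate aggregation with an accuracy estimate, the sketch below estimates a mean from a uniform random sample and reports a normal-approximation error margin. It is not the method of the paper above; the function name `approximate_mean` and the parameters `sample_size`, `seed`, and `z` are assumptions made for this example.

```python
import math
import random


def approximate_mean(values, sample_size, seed=0, z=1.96):
    """Estimate the mean of values from a uniform sample drawn without replacement.

    Returns (estimate, margin), where margin is z times the standard error of
    the sample mean (about a 95% interval for z=1.96, assuming the sample is
    large enough for the normal approximation). Requires sample_size >= 2.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    mean = sum(sample) / sample_size
    # Unbiased sample variance.
    variance = sum((x - mean) ** 2 for x in sample) / (sample_size - 1)
    margin = z * math.sqrt(variance / sample_size)
    return mean, margin


# Hypothetical data: estimate the mean of 10,000 values from a 500-element sample.
data = [float(i % 100) for i in range(10_000)]
estimate, margin = approximate_mean(data, sample_size=500)
```

In an iterative algorithm such as clustering, an estimate of this kind could stand in for a full-data aggregate at each step, with the margin indicating how far the approximation may be from the exact value.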