Motif search plays an important role in gene finding and in understanding gene regulation relationships. Motif search is one of the most challenging problems in bioinformatics. In this paper, we present three data partitions for the PMSP algorithm and propose the PMSP MapReduce algorithm (PMSPMR) for solving the motif search problem. For instances of the problem with different difficulties, the experimental...
Although testing is a standard method for improving the quality of software, conventional testing methods often fail to detect faults. Concolic testing attempts to remedy this by automatically generating test cases to explore execution paths in a program under test, helping testers achieve greater coverage of program behavior in a more automated fashion. Concolic testing, however, consumes a significant...
Entity resolution (ER) is the problem of identifying which records in a database represent the same entity. Often, records of different types are involved (e.g., authors, publications, institutions, venues), and resolving records of one type can impact the resolution of other types of records. In this paper we propose a flexible, modular resolution framework where existing ER algorithms developed...
The ever-growing amount of data requires highly scalable storage solutions. The most flexible approach is to use storage pools that can be expanded and scaled down by adding or removing storage devices. To make this approach usable, it is necessary to provide a solution to locate data items in such a dynamic environment. This paper presents and evaluates the Random Slicing strategy, which incorporates...
Data Distribution Management (DDM) is one of the High Level Architecture (HLA) services that reduce message traffic over the network. The major purpose of the DDM is to filter the exchange of data between federates during a federation. However, this traffic reduction usually suffers from higher computational overhead when calculating the intersection between update regions and subscription regions...
The availability of streaming data in different fields and in various forms increases the importance of streaming data analysis. The huge size of continuously flowing data poses a number of challenges in data stream analysis. Exploring the structure of streamed data has been a major challenge, which has led to the introduction of various clustering algorithms. However, current clustering...
Several distinct multi-robot patrolling strategies have been presented over the last decade in the context of security applications. However, there is a deficit of studies comparing these strategies, namely in terms of their performance and their scalability in the number of robots. For that reason, in this paper, an evaluation of five representative patrolling approaches is presented. This analysis...
Since a significant number of XML documents have been generated in various application domains, efficient XML management has become an important problem. Distributed XML storage and parallel query based on MapReduce can be an effective solution to this problem. As the XML data placement strategy is a key factor in parallel system performance, in this paper we present an XML placement strategy, which...
Information technology infrastructure plays an important role in the success of business applications. However, these applications suffer from performance and availability problems. In this vein, resource utilization is out of balance. Load balancing is a very important approach to minimizing execution time because there are many processing units running at the same time. It is important to decompose...
Collaborative filtering (CF) techniques have achieved widespread success in E-commerce nowadays. The tremendous growth of the number of customers and products in recent years poses some key challenges for recommender systems in which high quality recommendations are required and more recommendations per second for millions of customers and products need to be performed. Thus, the improvement of scalability...
Recommender systems play an important role in online activities by making personalized recommendations to users, as finding what users are looking for among an enormous number of items in huge databases is a tedious job. The most popular recommender systems employ collaborative filtering algorithms. These methods require large amounts of training data, which causes scalability problems. One approach...
Eigensolvers are important tools for analyzing and mining useful information from scale-free graphs. Such graphs are used in many applications and can be extremely large. Unfortunately, existing parallel eigensolvers do not scale well for these graphs due to the high communication overhead in the parallel matrix-vector multiplication (MatVec). We develop a MatVec algorithm based on 2D edge partitioning...
By taking into account communication startup overhead and the assigned processor distribution order, and by applying a hashing technique, a novel sequence distribution strategy is presented, and a parallel local alignment algorithm for multiple sequences is designed for a heterogeneous cluster system in which the computing nodes have different computing speeds and communication capabilities, based on divisible...
Given a collection of objects, the Similarity Self-Join problem requires discovering all those pairs of objects whose similarity is above a user-defined threshold. In this paper we focus on document collections, which are characterized by a sparseness that allows effective pruning strategies. Our contribution is a new parallel algorithm within the MapReduce framework. This work borrows from the state...
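To make the problem statement in the preceding abstract concrete, here is a minimal brute-force sketch of a similarity self-join. It is not the paper's MapReduce algorithm and uses no pruning; the choice of cosine similarity, the sparse term-weight dictionary representation, and the names `cosine` and `similarity_self_join` are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_self_join(docs, threshold):
    """Return (i, j, similarity) for every pair i < j whose similarity is above threshold.

    Brute force: compares all pairs, so the cost is quadratic in len(docs).
    """
    result = []
    for (i, a), (j, b) in combinations(enumerate(docs), 2):
        s = cosine(a, b)
        if s > threshold:
            result.append((i, j, s))
    return result


if __name__ == "__main__":
    # Hypothetical toy collection: documents 0 and 1 share all their terms.
    docs = [{"a": 1, "b": 1}, {"a": 1, "b": 1}, {"c": 1}]
    for i, j, s in similarity_self_join(docs, threshold=0.9):
        print(i, j, round(s, 3))
```

The quadratic pair comparison is what makes the problem expensive at scale; sparse document vectors allow many pairs to be skipped without computing their similarity, which is the kind of pruning the abstract refers to.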
Many problems are characterized by dynamics occurring on a wide range of length and time scales. One approach to overcoming the tyranny of scales is adaptive mesh refinement/coarsening (AMR), which dynamically adapts the mesh to resolve features of interest. However, the benefits of AMR are difficult to achieve in practice, particularly on the petascale computers that are essential for difficult problems...
Research and discussion on the Internet's addressing and routing scalability problems have been ongoing for more than a decade. A growing IP-aware population and learned needs in terms of site multi-homing, traffic engineering, non-aggregatable address allocations and policy-based routing have resulted in the continuous, alarming growth of the routing tables in the Default Free Zone (DFZ). Constraints posed by...
Graph algorithms are widely used in image processing techniques. With technology advancements, image sizes are increasing, and the contents inside images are becoming more complex, resulting in increased runtimes for graph algorithms on these images. Breadth First Search (BFS) is a fundamental graph traversal approach. A key to parallelizing graph algorithms used in image processing is to parallelize...
Asynchronous algorithms have been demonstrated to improve scalability of a variety of applications in parallel environments. Their distributed adaptations have received relatively less attention, particularly in the context of conventional execution environments and associated overheads. One such framework, MapReduce, has emerged as a commonly used programming framework for large-scale distributed...
This work focuses on the scalability of the Evidence Accumulation Clustering (EAC) method. We first address the space complexity of the co-association matrix. The sparseness of the matrix is related to the construction of the clustering ensemble. Using a split and merge strategy combined with a sparse matrix representation, we empirically show that a linear space complexity is achievable in this framework,...
In recent years, extensive research has been conducted to develop approaches that address two major challenges for collaborative filtering problems, namely sparsity and scalability. In this paper, we propose a novel collaborative filtering recommendation approach to alleviate these challenges. Our approach first converts the user-item ratings matrix to a user-class matrix, and hence increases greatly...