Search results

chapter

A MapReduce-based Algorithm for Motif Search

Hongwei Huo, Shuai Lin, Qiang Yu, Yipu Zhang, et al.

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum > 2052 - 2060

2012 26th IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

Motif search plays an important role in gene finding and in understanding gene regulation relationships. Motif search is one of the most challenging problems in bioinformatics. In this paper, we present three data partitions for the PMSP algorithm and propose the PMSP MapReduce algorithm (PMSPMR) for solving the motif search problem. For instances of the problem with different difficulties, the experimental...

chapter

A Scalable Distributed Concolic Testing Approach: An Empirical Evaluation

Moonzoo Kim, Yunho Kim, Gregg Rothermel

2012 IEEE Fifth International Conference on Software Testing, Verification and Validation > 340 - 349

2012 IEEE Fifth International Conference on Software Testing, Verification and Validation (ICST)

Although testing is a standard method for improving the quality of software, conventional testing methods often fail to detect faults. Concolic testing attempts to remedy this by automatically generating test cases to explore execution paths in a program under test, helping testers achieve greater coverage of program behavior in a more automated fashion. Concolic testing, however, consumes a significant...

chapter

Joint Entity Resolution

Steven Euijong Whang, Hector Garcia-Molina

2012 IEEE 28th International Conference on Data Engineering > 294 - 305

2012 IEEE International Conference on Data Engineering (ICDE 2012)

Entity resolution (ER) is the problem of identifying which records in a database represent the same entity. Often, records of different types are involved (e.g., authors, publications, institutions, venues), and resolving records of one type can impact the resolution of other types of records. In this paper we propose a flexible, modular resolution framework where existing ER algorithms developed...

chapter

Reliable and randomized data distribution strategies for large scale storage systems

Alberto Miranda, Sascha Effert, Yangwook Kang, Ethan L. Miller, et al.

2011 18th International Conference on High Performance Computing > 1 - 10

2011 18th International Conference on High Performance Computing (HiPC)

The ever-growing amount of data requires highly scalable storage solutions. The most flexible approach is to use storage pools that can be expanded and scaled down by adding or removing storage devices. To make this approach usable, it is necessary to provide a solution to locate data items in such a dynamic environment. This paper presents and evaluates the Random Slicing strategy, which incorporates...

chapter

A binary partition-based matching algorithm for Data Distribution Management

Junghyun Ahn, Changho Sung, Tag Gon Kim

Proceedings of the 2011 Winter Simulation Conference (WSC) > 2723 - 2734

2011 Winter Simulation Conference - (WSC 2011)

Data Distribution Management (DDM) is one of the High Level Architecture (HLA) services that reduce message traffic over the network. The major purpose of the DDM is to filter the exchange of data between federates during a federation. However, this traffic reduction usually suffers from higher computational overhead when calculating the intersection between update regions and subscription regions...

chapter

Discovering Clusters with Arbitrary Shapes and Densities in Data Streams

Amr Magdy, Noha A. Yousri, Nagwa M. El-Makky

2011 10th International Conference on Machine Learning and Applications and Workshops > 1 > 279 - 282

2011 Tenth International Conference on Machine Learning and Applications (ICMLA 2011)

The availability of streaming data in different fields and in various forms increases the importance of streaming data analysis. The huge size of continuously flowing data has put forward a number of challenges in data stream analysis. Exploring the structure of streamed data has been a major challenge, which resulted in the introduction of various clustering algorithms. However, current clustering...

chapter

On the performance and scalability of multi-robot patrolling algorithms

David Portugal, Rui P. Rocha

2011 IEEE International Symposium on Safety, Security, and Rescue Robotics > 50 - 55

2011 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR)

Several distinct multi-robot patrolling strategies have been presented over the last decade in the context of security applications. However, there is a deficit of studies comparing these strategies, namely in terms of their performance and their scalability with the number of robots. For that reason, in this paper, an evaluation of five representative patrolling approaches is presented. This analysis...

chapter

An XML Data Placement Strategy for Distributed XML Storage and Parallel Query

Jing Zhang, Bo Lang, Yawei Duan

2011 12th International Conference on Parallel and Distributed Computing, Applications and Technologies > 433 - 439

2011 12th International Conference on Parallel and Distributed Computing Applications and Technologies (PDCAT)

Since a significant number of XML documents have been generated in various application domains, efficient XML management has become an important problem. Distributed XML storage and parallel query based on MapReduce can be an effective solution to this problem. As the XML data placement strategy is a key factor in parallel system performance, in this paper we present an XML placement strategy, which...

chapter

Evaluation of Load Balance Algorithms

Hamid Mcheick, Ziad Rajih Mohammed, Abbass Lakiss

2011 Ninth International Conference on Software Engineering Research, Management and Applications > 104 - 109

2011 9th International Conference on Software Engineering Research, Management and Applications (SERA)

Information technology infrastructure plays an important role in the success of business applications. However, these applications suffer from performance and availability problems. In this vein, resource utilization is out of balance. Load balancing is a very important approach to minimizing execution time because many processing units run at the same time. It is important to decompose...

chapter

Scaling-Up Item-Based Collaborative Filtering Recommendation Algorithm Based on Hadoop

Jing Jiang, Jie Lu, Guangquan Zhang, Guodong Long

2011 IEEE World Congress on Services > 490 - 497

2011 IEEE World Congress on Services (SERVICES)

Collaborative filtering (CF) techniques have achieved widespread success in E-commerce nowadays. The tremendous growth of the number of customers and products in recent years poses some key challenges for recommender systems in which high quality recommendations are required and more recommendations per second for millions of customers and products need to be performed. Thus, the improvement of scalability...

chapter

A scalable collaborative recommender algorithm based on user density-based clustering

Siavash Ghodsi Moghaddam, Ali Selamat

The 3rd International Conference on Data Mining and Intelligent Information Technology Applications > 246 - 249

2011 3rd International Conference on Data Mining and Intelligent Information Technology Applications (ICMiA)

Recommender systems play an important role in online activities by making personalized recommendations to users, as finding what users are looking for among an enormous number of items in huge databases is a tedious job. The most popular recommender systems employ collaborative filtering algorithms. These methods require large amounts of training data, which cause scalability problems. One approach...

chapter

A scalable eigensolver for large scale-free graphs using 2D graph partitioning

Andy Yoo, Allison H. Baker, Roger Pearce, Van Emden Henson

2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC) > 1 - 11

2011 SC - International Conference for High Performance Computing, Networking, Storage and Analysis

Eigensolvers are important tools for analyzing and mining useful information from scale-free graphs. Such graphs are used in many applications and can be extremely large. Unfortunately, existing parallel eigensolvers do not scale well for these graphs due to the high communication overhead in the parallel matrix-vector multiplication (MatVec). We develop a MatVec algorithm based on 2D edge partitioning...

chapter

Parallel Local Alignment Algorithm for Multiple Sequences on Heterogeneous Cluster Systems

Xin Cui, Cheng Zhong, Xiang-Yan Lu

2010 3rd International Symposium on Parallel Architectures, Algorithms and Programming > 316 - 320

Third International Symposium on Parallel Architectures, Algorithms and Programming (PAAP 2010)

By taking into account the communication startup overhead and the assigned processor distribution order, and by applying a hashing technique, a novel sequence distribution strategy is presented. The parallel local alignment algorithm for multiple sequences is designed for a heterogeneous cluster system in which the computing nodes have different computing speeds and communication capabilities based on divisible...

chapter

Document Similarity Self-Join with MapReduce

Ranieri Baraglia, Gianmarco De Francisci Morales, Claudio Lucchese

2010 IEEE International Conference on Data Mining > 731 - 736

2010 10th IEEE International Conference on Data Mining (ICDM 2010)

Given a collection of objects, the Similarity Self-Join problem requires discovering all pairs of objects whose similarity is above a user-defined threshold. In this paper we focus on document collections, which are characterized by a sparseness that allows effective pruning strategies. Our contribution is a new parallel algorithm within the MapReduce framework. This work borrows from the state...

chapter

Extreme-Scale AMR

Carsten Burstedde, Omar Ghattas, Michael Gurnis, Tobin Isaac, et al.

2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis > 1 - 12

2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis

Many problems are characterized by dynamics occurring on a wide range of length and time scales. One approach to overcoming the tyranny of scales is adaptive mesh refinement/coarsening (AMR), which dynamically adapts the mesh to resolve features of interest. However, the benefits of AMR are difficult to achieve in practice, particularly on the petascale computers that are essential for difficult problems...

chapter

IPv6 Address Space Allocation Schemes - Issues and Challenges, A Survey

M R Kumar, S Ramadass

2010 Second International Conference on Network Applications, Protocols and Services > 170 - 175

2010 Second International Conference on Network Applications Protocols and Services (NETAPPS 2010)

Research and discussion on the Internet's addressing and routing scalability problems have been ongoing for more than a decade. IP aware growing population, and learned needs in terms of site multi-homing, traffic engineering, non-aggregatable address allocations and policy-based routing have resulted in the continuous alarming growth of the routing tables in the Default Free Zone (DFZ). Constraints posed by...

chapter

Parallel BFS graph traversal on images using structured grid

Bor-Yiing Su, Tasneem G Brutch, Kurt Keutzer

2010 IEEE International Conference on Image Processing > 4489 - 4492

2010 17th IEEE International Conference on Image Processing (ICIP 2010)

Graph algorithms are widely used in image processing techniques. With technology advancements, image sizes are increasing, and the contents inside images are becoming more complex, resulting in increased runtimes for graph algorithms on these images. Breadth First Search (BFS) is a fundamental graph traversal approach. A key to parallelizing graph algorithms used in image processing is to parallelize...

chapter

Asynchronous Algorithms in MapReduce

Karthik Kambatla, Naresh Rapolu, Suresh Jagannathan, Ananth Grama

2010 IEEE International Conference on Cluster Computing > 245 - 254

2010 IEEE International Conference on Cluster Computing (CLUSTER 2010)

Asynchronous algorithms have been demonstrated to improve scalability of a variety of applications in parallel environments. Their distributed adaptations have received relatively less attention, particularly in the context of conventional execution environments and associated overheads. One such framework, MapReduce, has emerged as a commonly used programming framework for large-scale distributed...

chapter

On the Scalability of Evidence Accumulation Clustering

André Lourenço, Ana L N Fred, Anil K Jain

2010 20th International Conference on Pattern Recognition > 782 - 785

2010 20th International Conference on Pattern Recognition (ICPR 2010)

This work focuses on the scalability of the Evidence Accumulation Clustering (EAC) method. We first address the space complexity of the co-association matrix. The sparseness of the matrix is related to the construction of the clustering ensemble. Using a split and merge strategy combined with a sparse matrix representation, we empirically show that a linear space complexity is achievable in this framework,...

chapter

Collaborative filtering recommendation based on fuzzy clustering of user preferences

Jing Wang, Nai-Ying Zhang, Jian Yin

2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery > 4 > 1946 - 1950

2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

In recent years, extensive research has been conducted to develop approaches that address two major challenges for collaborative filtering, namely sparsity and scalability. In this paper, we propose a novel collaborative filtering recommendation approach to alleviate these challenges. Our approach first converts the user-item ratings matrix to a user-class matrix, and hence increases greatly...

INFONA - science communication portal
