The chemical task in internal combustion engine simulations concerns the solution of a non-linear stiff system of Ordinary Differential Equations (ODEs) for each cell of a discretization grid representing the engine geometry. When a detailed kinetic scheme is used, the computational cost of this task dominates engine simulations. Due to local physical-chemical conditions, each system...
We present a multi-core virtual platform based on a single-core architecture, SPARC v8, which is available as an open source development suite. The proposed multi-SPARC system operates at the electronic system level to accelerate its simulation speed. TLM channels are devised to connect the processors. To simplify the use of the proposed virtual platform, we define some specific APIs for data transaction and...
The emergence of cloud computing and Google's MapReduce paradigm is renewing interest in the development of broadly applicable high level abstractions as a means to deliver easy programmability and cyber resources to the user, while hiding complexities of system architecture, parallelism and algorithms, heterogeneity, and fault-tolerance. In this paper, we present a high-level framework for computations...
We explore the multisend interface as a data mover interface to optimize applications with neighborhood collective communication operations. One of the limitations of the current MPI 2.1 standard is that the vector collective calls require counts and displacements (zero and non-zero bytes) to be specified for all the processors in the communicator. Further, all the collective calls in MPI 2.1 are...
Emerging large-scale systems have many nodes with several processors per node and multiple cores per processor. These systems require effective task distribution between cores, processors and nodes to achieve high levels of performance and utilization. Current scheduling strategies distribute tasks between cores according to a count of available cores, but ignore the execution time and energy implications...
Recent research in multi-agent systems incorporates fault tolerance concepts. However, the research does not explore the extension and implementation of such ideas for large scale parallel computing systems. The work reported in this paper investigates a swarm array computing approach, namely 'Intelligent Agents'. In the approach considered, a task to be executed on a parallel computing system is decomposed...
Multicore nodes have become ubiquitous in just a few years. At the same time, writing portable parallel software for multicore nodes is extremely challenging. Widely available programming models such as OpenMP and Pthreads are not useful for devices such as graphics cards, and more flexible programming models such as RapidMind are only available commercially. OpenCL represents the first truly portable...
Many computing-intensive applications in different domains are benefiting from new High Performance Computing (HPC) architectures such as clusters and computational grids. This experimental study shows the performance results of parallelizing a computing-intensive risk management financial simulator application using the Message Passing Interface (MPI) and running it on two configurations of a dedicated...
In this work, parallel preconditioning methods based on "hierarchical interface decomposition (HID)" and hybrid parallel programming models were applied to finite-element based simulations of linear elasticity problems in media with heterogeneous material properties. Reverse Cuthill-McKee reordering with cyclic multicoloring (CM-RCM) was applied for parallelism through OpenMP. The developed code...
For applications like 3D seismic migration, it is important to improve the I/O performance within a cluster computing system. Such seismic data processing applications are I/O intensive. For example, a large 3D data volume cannot be held entirely in computer memory. Therefore the input data files have to be divided into many fine-grained chunks. Intermediate results...
As the size of today's supercomputers grows exponentially in number of processors, the applications that run on these systems scale to larger processor counts. The majority of these applications use the Message Passing Interface (MPI); a trace of these MPI communication events is an important input to the tools that visualize, simulate for performance modeling, or enable tuning of parallel applications...
DNA sequence alignment is the most common application in computational biology. It is an essential prerequisite of many other operations in computational biology applications. Optimal alignment of a large-scale DNA sequence dataset is a known example of a time- and space-consuming task. The Smith-Waterman algorithm is a well-recognized technique for producing an optimal alignment between DNA sequences....
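As an illustration of the Smith-Waterman technique named in the abstract above, the following is a minimal serial sketch of the local alignment scoring recurrence with a linear gap penalty. It is not the implementation from the paper: the function name `smith_waterman_score` and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are assumptions chosen only for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Hypothetical helper for illustration; scoring values are assumed,
    and a linear (not affine) gap penalty is used.
    """
    # Only two rows of the dynamic programming table are kept,
    # so memory use is O(len(b)) instead of O(len(a) * len(b)).
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment here
                prev[j - 1] + step, # extend along the diagonal
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best
```

The doubly nested loop makes the quadratic time cost visible, which is why large datasets are described as time and space consuming. Recovering the alignment itself, rather than only its score, would require keeping the full table for traceback.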
Scientific computing algorithms on parallel computing environments are widely used to simulate scientific and engineering phenomena in place of physical experiments. The performance of these applications on parallel computing environments depends on the communication delay between processors. To reduce the delay, communication patterns have been studied by many researchers. The communication...
Fault tolerance is a critical issue in the arena of large-scale computing. The fault-tolerant parallel algorithm (FTPA) is an application-level technique for tolerating hardware failures. FTPA achieves fast failure recovery by making use of parallel recomputing. However, it complicates the coding of the application program. This paper uses compiler technology to automate the design of FTPA, and introduces...
Modern scientific research is a collaborative process, with researchers from many disciplines and institutions working toward a common goal. Dynamic languages, like Ruby, provide a platform for quickly developing simulation and analysis tools, freeing researchers to focus on research instead of spending time developing infrastructure. Ruby is a particularly good fit, allowing incorporation of existing...
This paper addresses the challenge of how to permit tightly coupled parallel applications, optimised for uniform, stable, static environments, to execute equally efficiently in environments which exhibit the opposite characteristics. Using the N-body problem as a case study, both the traditional and proposed grid-enabled MPI implementations of the popular ring algorithm are analysed. Results...
As an emerging technology, P2P is spreading to the distributed simulation area, and many distributed simulation frameworks have used P2P as the middleware to interconnect their existing single-processor simulators to form distributed environments for simulation execution. In terms of simulation time management, most existing tools use a middleware layer to implement and support time management in a...
Many parallel applications running on a distributed memory cluster generate data dynamically to process during their execution. In this case it is possible that some cluster nodes become overloaded. To improve performance we can integrate a dynamic data distribution algorithm. The integration of a dynamic load distribution policy into an application must consider the correct programming of several...
View-oriented parallel programming (VOPP) is a novel parallel programming model which uses views for communication between multiple processes. With the introduction of views, mutual exclusion and shared data access are bundled together, which offers both convenience and high performance to parallel programming. This paper presents the implementation of VOPP on chip-multi threading processors, e.g...
This paper presented the parallel solution of general 3D EM problems using the FETI-DPEM (dual-primal finite element tearing and interconnecting) method. Excellent parallel efficiency has been achieved on a cluster system with an automatic decomposition of the computational domain into hundreds of subdomains. In this work, the parallel implementation of the FETI-DPEM method on a distributed-memory...