Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in machine learning and other domains. Since this is likely to continue for the foreseeable future, it is important to study techniques that can make it run fast on parallel hardware. In this paper, we provide the first analysis of a technique called BUCKWILD! that uses both asynchronous execution and low-precision...
Large-scale graph analysis, also called network analysis, is supported by different algorithms, among the most relevant of which are PageRank (Web page ranking), Betweenness centrality (centrality in a graph) and Community Detection. Because of their complexity and the large amount of data that diverse applications process, these increasingly need to use computational resources such as processor, memory...
Embedded multi-core systems are implemented as systems-on-chip (SoC) that rely on packet store-and-forward networks-on-chip (NoC) for communications. These systems use neither buses nor a global clock. Instead, routers are used to move data between the cores and each core uses its own local clock. This implies concurrent asynchronous computing. Implementing algorithms in such systems is very much facilitated...
Heterogeneous computing platforms containing a wide range of computing resources, from CPUs to specialized hardware accelerators, are the trend today, resulting from the physical limitations on processor speed and the increasing demand for computing performance. Hence many optimization strategies are studied to get better throughput and lower energy consumption in heterogeneous systems. Various memory...
With cloud computing, the efficient management of resources is of great importance as an increased utilization of the available resources can result in higher scalability and significant energy and cost reductions. Experimental validation of novel resource management strategies is costly and time consuming, and often requires in-depth knowledge of and control over the underlying cloud platform. As...
With cloud computing, efficient resource management is of great importance, as it has a direct impact on the scalability of the cloud application, and can result in significant energy and cost reductions. In recent years, a lot of research has been done regarding the management of cloud resources, resulting in multiple novel resource allocation strategies. Validation of these strategies however is...
There has been increasing interest in processing large-scale real-world graphs, and recently many graph systems have been proposed. Vertex-centric GAS (Gather-Apply-Scatter) and Edge-centric GAS are two widely adopted graph computation models, and existing graph analytics systems commonly follow only one computation model, which is not the best choice for real-world graph processing. In fact,...
Side-channel attacks are among the most powerful and cost-effective attacks on cryptographic systems. Simulators that are developed for side-channel analysis are very useful for preliminary analysis of new schemes, in-depth analysis of existing schemes, as well as for analysis of products at early stages of development. The contribution of this paper is three-fold. We present a first survey of existing...
FPGAs are promising platforms to efficiently execute distributed graph algorithms. Unfortunately, they are notoriously hard to program, especially when the problem size and system complexity increase. In this paper, we propose GraVF, a high-level design framework for distributed graph processing on FPGAs. It leverages the vertex-centric paradigm, which is naturally distributed and requires the user...
Real-time simulation of power systems is becoming feasible owing to the significant growth in the computational power of computing platforms. This paper builds a real-time prototype platform based on PXI and LabVIEW as its main hardware and software architecture. Taking advantage of the integration characteristics of NI products, the platform embodies high expansibility and good compatibility...
The paper presents our approach to implementing a similarity measure for big data analysis in a parallel environment. We describe the algorithm for parallelisation of the computations. We provide results from a real MPI application for computations of similarity measures as well as results achieved with our simulation software. The simulation environment allows us to model parallel systems of various...
Recent work on graph analytics has sought to leverage the high performance offered by GPU devices, but challenges remain due to the inherent irregularity of graph algorithms and limitations in GPU-resident memory for storing large graphs. The Graph Reduce methods presented in this paper permit a GPU-based accelerator to operate on graphs that exceed its internal memory capacity. Graph Reduce operates...
Reconfigurable computing (RC) is a compromise between General-purpose processor (GPP) computing and Application Specific Integrated Circuit (ASIC) computing, with both hardware efficiency and software flexibility. An efficient algorithm to tackle the scheduling and placement problem for the dynamically reconfigurable Field-Programmable Gate Arrays (FPGAs) with real time decisions is highly concerned for...
This paper presents both parallel and sequential implementations of linear models for the computation of message update, a critical operation in belief propagation (BP)-based stereo matching that computes the depth information from two images captured at different positions. An improved parallel implementation of the message update is presented that can execute the forward and backward pass concurrently...
As software and hardware have grown in functionality and complexity, existing computer systems are confronting serious challenges with safety, dependability and reliability. As one of the primary causes of these challenges, software-hardware interaction has been widely studied in recent years. Unfortunately, none of the state-of-the-art research has achieved widespread...
This article outlines a complete process for exploring and evaluating a multi-core SoC architecture on which target radar algorithms run. Because powerful radar systems are needed and SoC technology is widely used in the radar field, the architecture considers not only the speed of the processing units but also the large data throughput of multiple channels. This work focuses on the architecture...
Texture is an essential feature in modeling the appearance of objects and is instrumental in making virtual objects appear interesting and/or realistic. Unfortunately, obtaining textures is a labor-intensive task requiring parameter tuning for procedural methods or careful photography and post-processing for natural images. Many texture synthesis techniques have been developed to generate textures...
Computing today is largely not about calculating a precise numerical end result. Instead, computing platforms are increasingly used to execute applications (such as search, analytics, sensor data processing, recognition, mining, and synthesis) for which “correctness” is defined as producing results that are good enough, or of sufficient quality. These applications are often intrinsically resilient...
The neuron machine (NM) is a hardware architecture that can be used to design efficient neural network simulation systems. However, owing to its intrinsic unidirectional nature, NM architecture does not support backpropagation (BP) learning algorithms. This paper proposes novel schemes for NM architecture to support BP algorithms. Reverse-mapping memories, synapse placement algorithm, and a memory structure...
The High Level Architecture (HLA), a well-known IEEE standard for developing parallel and distributed simulation systems, has been around for many years. In this paper, the Runtime Infrastructure (RTI) of HLA is re-evaluated in the light of current trends in many-core processor architectures. Future many-core processor architectures will contain thousands of cores connected with on-chip networks...