Search results

Items from 1 to 20 out of 54 results

chapter

Understanding and optimizing asynchronous low-precision stochastic gradient descent

Christopher De Sa, Matthew Feldman, Christopher Re, Kunle Olukotun

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) > 561 - 574

Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in machine learning and other domains. Since this is likely to continue for the foreseeable future, it is important to study techniques that can make it run fast on parallel hardware. In this paper, we provide the first analysis of a technique called BUCKWILD! that uses both asynchronous execution and low-precision...

chapter

Analysis and simulation of graphs applied to learning with parallel programming in HPC

Edwin Malagon, Alexis Rojas

2017 CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON) > 1 - 7

Large-scale graph analysis, also called network analysis, is supported by different algorithms; among the most relevant are PageRank (Web page ranking), Betweenness centrality (centrality in a graph) and Community Detection. Because of their complexity and the large amount of data that diverse applications process, these increasingly need to use computational resources such as processor, memory...

chapter

Data-flow implementation of concurrent asynchronous systems

Fayez Gebali, Ali Alzahrani

2017 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM) > 1 - 5

Embedded multi core systems are implemented as systems-on-chip (SoC) that rely on packet store-and-forward networks-on-chip (NoC) for communications. These systems use neither buses nor a global clock. Instead routers are used to move data between the cores and each core uses its own local clock. This implies concurrent asynchronous computing. Implementing algorithms in such systems is very much facilitated...

chapter

Hierarchical Read/Write Analysis for Pointer-Based OpenCL Programs on RRAM

Lin-Ya Yu, Shao-Chung Wang, Jenq-Kuen Lee

2017 46th International Conference on Parallel Processing Workshops (ICPPW) > 45 - 52

Heterogeneous computing platforms containing a wide range of computing resources, from CPUs to specialized hardware accelerators, are the trend today, resulting from the physical limitations on processor speed and the increasing demand for computing performance. Hence many optimization strategies are studied to get better throughput and lower energy consumption in heterogeneous systems. Various memory...

chapter

Resource Allocation in the Cloud: From Simulation to Experimental Validation

Pieter-Jan Maenhaut, Hendrik Moens, Bruno Volckaert, Veerle Ongenae, more

2017 IEEE 10th International Conference on Cloud Computing (CLOUD) > 701 - 704

With cloud computing, the efficient management of resources is of great importance as an increased utilization of the available resources can result in higher scalability and significant energy and cost reductions. Experimental validation of novel resource management strategies is costly and time consuming, and often requires in-depth knowledge of and control over the underlying cloud platform. As...

chapter

Demo abstract: RPiaaS: A raspberry pi testbed for validation of cloud resource management strategies

Pieter-Jan Maenhaut, Bruno Volckaert, Veerle Ongenae, Filip De Turck

2017 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) > 946 - 947

IEEE INFOCOM 2017 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)

With cloud computing, efficient resource management is of great importance, as it has a direct impact on the scalability of the cloud application, and can result in significant energy and cost reductions. In recent years, a lot of research has been done regarding the management of cloud resources, resulting in multiple novel resource allocation strategies. Validation of these strategies however is...

chapter

Mermaid: Integrating Vertex-Centric with Edge-Centric for Real-World Graph Processing

Jinhong Zhou, Chongchong Xu, Xianglan Chen, Chao Wang, more

2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) > 780 - 783

There has been increasing interest in processing large-scale real-world graphs, and recently many graph systems have been proposed. Vertex-centric GAS (Gather-Apply-Scatter) and Edge-centric GAS are two graph computation models being widely adopted, and existing graph analytics systems commonly follow only one computation model, which is not the best choice for real-world graph processing. In fact,...

chapter

Use of Simulators for Side-Channel Analysis

Nikita Veshchikov, Sylvain Guilley

2017 IEEE European Symposium on Security and Privacy (EuroS&P) > 51 - 59

Side-channel attacks are among the most powerful and cost-effective attacks on cryptographic systems. Simulators that are developed for side-channel analysis are very useful for preliminary analysis of new schemes, in depth analysis of existing schemes as well as for analysis of products on early stages of development. The contribution of this paper is three-fold. We present a first survey of existing...

chapter

GraVF: A vertex-centric distributed graph processing framework on FPGAs

Nina Engelhardt, Hayden Kwok-Hay So

2016 26th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 4

FPGAs are promising platforms to efficiently execute distributed graph algorithms. Unfortunately, they are notoriously hard to program, especially when the problem size and system complexity increases. In this paper, we propose GraVF, a high-level design framework for distributed graph processing on FPGAs. It leverages the vertex-centric paradigm, which is naturally distributed and requires the user...

chapter

A real-time simulation algorithm for power electronics circuit considering multiple switching events and its application on PXI platform

Weijiang Ji, Keyou Wang, Guojie Li, Jian Zhuang, more

2016 IEEE 16th International Conference on Environment and Electrical Engineering (EEEIC) > 1 - 6

Real-time simulation technique of power systems is becoming realizable due to the growing significant computational power of computing platform. This paper builds a real-time prototypical platform based on PXI and LabVIEW as its main hardware and software architecture. Taking advantage of the integration characteristics of NI products, the platform embodies high expansibility and good compatibility...

chapter

Simulation of parallel similarity measure computations for large data sets

Pawel Czarnul, Pawel Rosciszewski, Mariusz Matuszek, Julian Szymanski

2015 IEEE 2nd International Conference on Cybernetics (CYBCONF) > 472 - 477

The paper presents our approach to implementation of similarity measure for big data analysis in a parallel environment. We describe the algorithm for parallelisation of the computations. We provide results from a real MPI application for computations of similarity measures as well as results achieved with our simulation software. The simulation environment allows us to model parallel systems of various...

chapter

GraphReduce: Large-Scale Graph Analytics on Accelerator-Based HPC Systems

Dipanjan Sengupta, Kapil Agarwal, Shuaiwen Leon Song, Karsten Schwan

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 604 - 609

2015 IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW)

Recent work on graph analytics has sought to leverage the high performance offered by GPU devices, but challenges remain due to the inherent irregularity of graph algorithm and limitations in GPU-resident memory for storing large graphs. The Graph Reduce methods presented in this paper permit a GPU-based accelerator to operate on graphs that exceed its internal memory capacity. Graph Reduce operates...

chapter

A Task Scheduling and Placement Strategy Based on Tasks' Aspect Ratio

Weiguo Wu, Tao Wang, Chaohui Wang, Qing Zhang

2014 IEEE 17th International Conference on Computational Science and Engineering > 476 - 482

2014 IEEE 17th International Conference on Computational Science and Engineering (CSE)

Reconfigurable computing (RC) is a compromise between general-purpose processor (GPP) computing and Application Specific Integrated Circuit (ASIC) computing, with both hardware efficiency and software flexibility. An efficient algorithm to tackle the scheduling and placement problem for the dynamically reconfigurable Field-Programmable Gate Arrays (FPGAs) with real time decisions is highly concerned for...

chapter

VLSI implementation of belief-propagation-based stereo matching with linear-model message update

Shen-Fu Hsiao, Jun-Ming Huang, Po-Sheng Wu

2014 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) > 73 - 76

This paper presents both parallel and sequential implementations of linear models for the computation of message update, a critical operation in belief propagation (BP)-based stereo matching that computes the depth information from two images captured at different positions. An improved parallel implementation of the message update is presented that can execute the forward and backward pass concurrently...

chapter

Software-hardware interaction analysis based on Petri Net

Zhaoxiang Yi, Xiaodong Mu, Peng Zhao, Yaqiao Yi

The 26th Chinese Control and Decision Conference (2014 CCDC) > 2815 - 2820

2014 26th Chinese Control And Decision Conference (CCDC)

As software and hardware have grown in functionality and complexity, the existing computer systems are confronting serious challenges with safety, dependability and reliability. Being one of the primary causes with responsibility for these challenges, the software-hardware interaction is advanced and widely studied in recent years. Unfortunately, none of the state-of-art researches has achieved widespread...

chapter

Multi-core SoC architecture exploration with radar digital system based on dataflow graph method

Zhiyuan Lu, Yuanyi Shen, Hu He

2013 IEEE 20th International Conference on Electronics, Circuits, and Systems (ICECS) > 617 - 620

This article outlines a fully complete process in the term of exploration and evaluation about the multi-core SoC architecture with target radar algorithms computing on it. As powerful radar system is in need and SoC technology is widely used in radar field, architecture considers not only speed of processing units but also large data throughput of multi-channels. This work focuses on the architecture...

chapter

Creating Texture Exemplars from Unconstrained Images

Yitzchak David Lockerman, Su Xue, Julie Dorsey, Holly Rushmeier

2013 International Conference on Computer-Aided Design and Computer Graphics > 397 - 398

2013 International Conference on Computer-Aided Design and Computer Graphics (CAD/Graphics)

Texture is an essential feature in modeling the appearance of objects and is instrumental in making virtual objects appear interesting and/or realistic. Unfortunately, obtaining textures is a labor intensive task requiring parameter tuning for procedural methods or careful photography and post-processing for natural images. Many texture synthesis techniques have been developed to generate textures...

chapter

Approximate computing: An integrated hardware approach

Vinay K. Chippa, Swagath Venkataramani, Srimat T. Chakradhar, Kaushik Roy, more

2013 Asilomar Conference on Signals, Systems and Computers > 111 - 117

Computing today is largely not about calculating a precise numerical end result. Instead, computing platforms are increasingly used to execute applications (such as search, analytics, sensor data processing, recognition, mining, and synthesis) for which “correctness” is defined as producing results that are good enough, or of sufficient quality. These applications are often intrinsically resilient...

chapter

Computation of Backpropagation Learning Algorithm Using Neuron Machine Architecture

Jerry B. Ahn

2013 Fifth International Conference on Computational Intelligence, Modelling and Simulation > 23 - 28

2013 Fifth International Conference on Computational Intelligence, Modelling and Simulation (CIMSim)

The neuron machine (NM) is a hardware architecture that can be used to design efficient neural network simulation systems. However, owing to its intrinsic unidirectional nature, NM architecture does not support backpropagation (BP) learning algorithms. This paper proposes novel schemes for NM architecture to support BP algorithms. Reverse-mapping memories, synapse placement algorithm, and a memory structure...

chapter

The High Level Architecture (HLA) on Photonic Torus: Hardware and Software Co-design

Kayhan Imre, Nevzat Sevim

2013 8th EUROSIM Congress on Modelling and Simulation > 550 - 554

2013 8th EUROSIM Congress on Modelling and Simulation (EUROSIM)

The High Level Architecture (HLA) as a well-known IEEE standard for developing parallel and distributed simulation systems has been around for many years. In this paper, Runtime Infrastructure (RTI) of HLA is re-evaluated in the light of the current trends in many-core processor architectures. The future many-core processor architectures will contain thousands of cores connected with on chip networks...

Data set:
ieee
Keywords:
HARDWARE
COMPUTATIONAL MODELING
ALGORITHM DESIGN AND ANALYSIS
Publication type:
book

Keywords

COMPUTER ARCHITECTURE (9)
PARALLEL PROCESSING (8)
FIELD PROGRAMMABLE GATE ARRAYS (7)
COMPLEXITY THEORY (5)
FPGA (5)
KERNEL (5)
SOFTWARE (5)
ANALYTICAL MODELS (4)
COMPUTATIONAL COMPLEXITY (4)
DATA MINING (4)
DIGITAL SIGNAL PROCESSING (4)
OPTIMIZATION (4)
PROGRAMMING (4)
SYSTEM-ON-CHIP (4)
CLOCKS (3)
COMPUTER GRAPHICS (3)
COMPUTERS (3)
DATA MODELS (3)
FORMAL SPECIFICATION (3)
FORMAL VERIFICATION (3)
GRAPHICS PROCESSING UNITS (3)
IMAGE COLOR ANALYSIS (3)
MATHEMATICAL MODEL (3)
PARALLEL PROGRAMMING (3)
PIPELINES (3)
PIXEL (3)
PROGRAM PROCESSORS (3)
RENDERING (COMPUTER GRAPHICS) (3)
SIMULATION (3)
SYSTEM-ON-A-CHIP (3)
BENCHMARK TESTING (2)
BIG DATA (2)
CLOUD COMPUTING (2)
COGNITION (2)
COMPUTER GRAPHIC EQUIPMENT (2)
CONCRETE (2)
CRYPTOGRAPHY (2)
GPU (2)
GRAPHICS (2)
GRAPHS (2)
HIGH PERFORMANCE COMPUTING (2)
IMAGE PROCESSING (2)
MATLAB (2)
MODEL CHECKING (2)
OBJECT ORIENTED MODELING (2)
OPTIMISATION (2)
PREDICTIVE CONTROL (2)
RADAR (2)
REAL-TIME (2)
REAL-TIME SYSTEMS (2)
REGISTERS (2)
RESOURCE ALLOCATION (2)
RESOURCE MANAGEMENT (2)
SOFTWARE ALGORITHMS (2)
SOLID MODELING (2)
SWITCHES (2)
TIMING (2)
4G BASEBAND MODEM (1)
4G FAUST CHIPSET (1)
4G MOBILE COMMUNICATION (1)
ABSTRACT STATE MACHINE FORMALISM (1)
ABSTRACTION REFINEMENT ALGORITHM (1)
ABSTRACTS (1)
ACCELERATION (1)
ACTIVE SET METHOD (1)
ADAPTATION MODELS (1)
ADAPTER (1)
ALGORITHM (1)
ALGORITHM THEORY (1)
ALGORITHMS FOR REDUCED POWER (1)
ANALYSIS (1)
ANALYSIS ALGORITHM (1)
APPLICATION FEATURES (1)
APPLICATION SPECIFIC INTEGRATED CIRCUITS (1)
APPLICATION-TO-PLATFORM ADEQUATION (1)
APPROXIMATION ALGORITHMS (1)
APPROXIMATION METHODS (1)
ARBITRARY POLYHEDRAL OBJECT MODEL (1)
ARBITRARY STRUCTURE SCENES (1)
ARCHITECTURE COMPONENTS (1)
ASM SPECIFICATIONS (1)
ASPECT RATIO (1)
ASSERTION BASED VERIFICATION (1)
ASYNCHRONOUS OCCLUSION QUERIES (1)
ASYNCHRONY (1)
BACK-END DECISION PROCEDURE (1)
BACKGROUND ESTIMATION (1)
BACKGROUND GENERATION (1)
BACKGROUND SUBTRACTION (1)
BACKPROPAGATION (1)
BANDWIDTH (1)
BAYES METHODS (1)
BAYESIAN ESTIMATION (1)
BELIEF PROPAGATION (1)
BIG DATA ANALYSIS (1)
BIT-LEVEL MODEL CHECKER (1)
BLOCK MATCHING (1)

INFONA - science communication portal
