Search results

chapter

Algorithm and hardware co-optimized solution for large SpMV problems

Fazle Sadi, Larry Pileggi, Franz Franchetti

2017 IEEE High Performance Extreme Computing Conference (HPEC) > 1 - 7

Sparse Matrix-Vector multiplication (SpMV) is a fundamental kernel for many scientific and engineering applications. However, SpMV performance and efficiency are poor on commercial off-the-shelf (COTS) architectures, especially when the data size exceeds on-chip memory or last-level cache (LLC). In this work we present an algorithm co-optimized hardware accelerator for large SpMV problems. We start...

chapter

A 0.13 μm CMOS integrated circuit for electrical impedance spectroscopy from 1 kHz to 10 GHz

Ronny Garcia-Ramirez, Alfonso Chacon-Rodriguez, Renato Rimolo-Donadio

2017 30th IEEE International System-on-Chip Conference (SOCC) > 126 - 131

The design of an electrical impedance spectroscopy acquisition and processing system using a 0.13 μm CMOS technology with a 1 kHz to 10 GHz functional frequency range is presented. The system is based on a quadrature modulator in a lock-in architecture. The design of each one of the modules of the system is explained, and post-layout simulations are used to validate the main features of the design...

chapter

Optimizing the heterogeneous network on-chip design in manycore architectures

Tung Thanh Le, Rui Ning, Dan Zhao, Hongyi Wu, more

2017 30th IEEE International System-on-Chip Conference (SOCC) > 184 - 189

Current hybrid network-on-chip designs in manycore systems are agnostic to the application requirements and thus are provided for general cases. This results in high cost in the manycore systems design, wasted energy and performance. We observe that the cost of network-on-chip designs can be reduced by optimizing the application-specific traffic onto the system. This paper presents mincostflow-based...

chapter

A flexible multi-frequency channel correlator upgrade for MUSER-I array

Fei Liu, Yihua Yan, Wei Wang, Linjie Chen

2017 IEEE Radio and Antenna Days of the Indian Ocean (RADIO) > 1 - 2

We propose a flexible multi-frequency channel correlator upgrade for MUSER-I array. The upgrade correlator has a more flexible architecture to be extensible for signal receiving elements. It can process 1024 frequency channels in IF band and the correlation sensibility is improved by utilizing 4-bit quantization in pre-correlation as well. The enhanced multi-frequency channel processing capability...

chapter

Using the Integrated GPU to Improve CPU Sort Performance

Grigore Lupescu, Emil-Ioan Slusanschi, Nicolae Tapus

2017 46th International Conference on Parallel Processing Workshops (ICPPW) > 39 - 44

In this paper we discuss the potential of the integrated GPU to accelerate sorting by performing a partial sort prior to a comparison-based CPU sort. We experiment with several comparison-based CPU sorting algorithms and outline the performance gain for a random input data set. We then analyze different x86 SoC architectures, and show that by sorting chunks stored inside the on-chip GPU memory,...

chapter

MPI Process and Network Device Affinitization for Optimal HPC Application Performance

Ravindra Babu Ganapathi, Aravind Gopalakrishnan, Russell W. McGuire

2017 IEEE 25th Annual Symposium on High-Performance Interconnects (HOTI) > 80 - 86

High Performance Computing (HPC) applications are highly optimized to maximize allocated resources for the job such as compute resources, memory and storage. Optimal performance for MPI applications requires the best possible affinity across all the allocated resources. Typically, setting process affinity to compute resources is well defined, i.e., MPI processes on a compute node have processor affinity...

chapter

Design and performance of high-speed Ge-on-Si waveguide photodiodes

N. K. Hon, S. Sahni, A. Mekis, G. Masini

2017 IEEE 14th International Conference on Group IV Photonics (GFP) > 177 - 178

Three Ge-on-Si photodetector architectures with different contacting schemes are compared, with emphasis on their bandwidth. The study shows that bandwidth > 50 GHz and responsivity > 1 A/W at 1490 nm can be achieved using a commercial silicon photonics process.

chapter

Boosting the Efficiency of HPCG and Graph500 with Near-Data Processing

Erik Vermij, Leandro Fiorin, Christoph Hagleitner, Koen Bertels

2017 46th International Conference on Parallel Processing (ICPP) > 31 - 40

HPCG and Graph500 can be regarded as the two most relevant benchmarks for high-performance computing systems. Existing supercomputer designs, however, tend to focus on floating-point peak performance, a metric less relevant for these two benchmarks, leaving resources underutilized and resulting in little performance improvement for these benchmarks over time. In this work, we analyze the implementation...

chapter

A network traffic shunt system in SDN network

Chia-Wei Tseng, Yu-Kai Huang, Yao-Tsung Yang, Chien-Chang Liu, more

2017 International Conference on Computer, Information and Telecommunication Systems (CITS) > 195 - 199

The popularity of smart devices across a variety of activities is driving greater usage of wireless broadband networks. OTT (over-the-top) refers to the delivery of video, audio and other media over the Internet. Video-related applications and services pose major challenges that will affect network performance in the future. It is important to achieve network and service traffic offloading to overcome high-speed, real-time,...

chapter

Analyzing Performance of Multi-cores and Applications with Cache-aware Roofline Model

Diogo Marques, Helder Duarte, Leonel Sousa, Aleksandar Ilic

2017 International Conference on High Performance Computing & Simulation (HPCS) > 933 - 934

To satisfy growing computational demands of modern applications, significant enhancements have been introduced in the contemporary processor architectures with the aim to increase their attainable performance, such as increased number of cores, improved capability of memory subsystem and enhancements in the processor pipeline [1]. Therefore, the performance improvements are usually coupled with an...

chapter

Performance Analysis with Cache-Aware Roofline Model in Intel Advisor

Diogo Marques, Helder Duarte, Aleksandar Ilic, Leonel Sousa, more

2017 International Conference on High Performance Computing & Simulation (HPCS) > 898 - 907

The recent increase in the complexity of processor architectures imposes significant challenges when designing and optimizing the execution of real-world applications, even on general-purpose hardware. To help in this process, tools for fast and insightful visualization of architecture and application execution bottlenecks are particularly useful for computer architects and application engineers,...

chapter

OpenFlow-based control mechanism for Coflow-aware multi-connections in DCN

Qi Wu, Hongxiang Guo, Cen Wang, Hong Cao, more

2017 Opto-Electronics and Communications Conference (OECC) and Photonics Global Conference (PGC) > 1 - 3

We proposed an extended OpenFlow-based control mechanism for multiple connections in the OpenScale architecture to achieve Coflow-aware bandwidth scheduling. Experimental demonstration verifies its overall feasibility.

chapter

An in-network packet processing architecture for distributed data storage

Corey Morrison, Alex Sprintson

2017 IEEE Conference on Network Softwarization (NetSoft) > 1 - 5

Distributed file systems enable the reliable storage of exabytes of information on thousands of servers distributed throughout a network. These systems achieve reliability and performance by storing multiple copies of data blocks in different locations across the network. The management of these copies of data is commonly handled by intermediate servers that track and coordinate the placement of data...

chapter

Sampling architectures for ultra-wideband signals

Stephen D. Casey, Howard S. Cohl

2017 International Conference on Sampling Theory and Applications (SampTA) > 246 - 250

Ultra-wideband (UWB) signal processing is a technology that has tremendous potential to develop advances in communication and information technology. However, it also presents challenges to the signal processing community, and, in particular, to sampling theory. This article outlines a UWB signal processing system via a basis projection and a basis system designed specifically for UWB signals. The...

chapter

An Autonomic QoS management architecture for Software-Defined Networking environments

Felipe Volpato, Madalena Pereira Da Silva, Alexandre Leopoldo Goncalves, Mario Antonio Ribeiro Dantas

2017 IEEE Symposium on Computers and Communications (ISCC) > 418 - 423

Software-Defined Networking (SDN) is an innovative approach to provisioning and delivering QoS (Quality of Service) services, yet it is still devoid of context-differentiating services. In this paper we propose a network application (Autonomic QoS Broker) and a controller module that implements the OpenVSwitch Database Management Protocol (OVSDB). These two components were implemented and validated...

chapter

Software Defined Survivable Optical Interconnects for Data Centers

Sonali Chandna, Nabil Naas, Hussein T. Mouftah

2017 19th International Conference on Transparent Optical Networks (ICTON) > 1 - 4

Extending the notion of Software Defined Network (SDN) from packet switching in Layers 2 and 3 to circuit switching in transport layer for service providers is a promising scenario to meet the high burstiness and high bandwidth requirements. For service providers to have a multilayer, multi-domain controller, which can provide automated controller based restoration and protection even in unprotected...

chapter

DENA: A DVFS-Capable Heterogeneous NoC Architecture

Luca Cremona, William Fornaciari, Andrea Marchese, Michele Zanella, more

2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) > 489 - 494

The current design drivers for multi-cores, namely performance per watt, scalability and flexibility, make the Networks-on-Chip (NoCs) the de-facto on-chip interconnect. State of the art NoCs can exploit heterogeneous solutions and complex DVFS techniques to fulfill also the variability of the application requirements. Relevant showstoppers to the design of a truly flexible NoC fitting all the possible...

chapter

Efficient Reconfigurable Global Network-on-Chip Designs towards Heterogeneous CPU-GPU Systems: An Application-Aware Approach

Tung Thanh Le, Dan Zhao, Magdy Bayoumi

2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) > 439 - 444

Different applications require different communication performance between subnets in a global hybrid network-on-chip (NOC) of a heterogeneous CPU-GPU architecture (HSA). It is impractical to deploy (at design time) or switch-on (at runtime) all the hybrid routers in the network for a certain application that needs several hybrid routers for communication. Reconfiguring the customized global hybrid...

chapter

PFSI.sw: A programming framework for sea ice model algorithms based on Sunway many-core processor

Binyang Li, Bo Li, Depei Qian

2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP) > 119 - 126

The sea ice model is a typical high performance computing problem. CPU- and GPU-based parallel methods have been proposed to accelerate the simulation process, but it is still hard to meet the large-scale calculation demand due to the compute-intensive nature of the model. The Sunway TaihuLight supercomputer uses the SW26010 processor as its computing unit and achieves high performance for large-scale scientific...

chapter

Application-aware resource provisioning in a heterogeneous Internet of Things

Eric Sturzinger, Massimo Tornatore, Biswanath Mukherjee

2017 International Conference on Optical Network Design and Modeling (ONDM) > 1 - 6

Internet of Things (IoT) traffic will become increasingly heterogeneous not only in terms of traditional metrics such as required bandwidth and maximum latency, but also in terms of functional requirements such as compute power and temporary storage. Sophisticated planning and engineering approaches must be adopted by service providers to account for this heterogeneity, inherent in IoT applications. Metropolitan...
