Search results

chapter

Agave: A benchmark suite for exploring the complexities of the Android software stack

Martin K. Brown, Zachary Yannes, Michael Lustig, Mazdak Sanati, more

2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) > 157 - 158

2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

Traditional suites used for benchmarking high-performance computing platforms or for architectural design space exploration use much simpler virtual memory layouts and multitasking/multithreading schemes, which means that they cannot be used to study the complex interactions among the layers of the Android software stack. To demonstrate this, we present memory reference and concurrency data showing...

chapter

X-Mem: A cross-platform and extensible memory characterization tool for the cloud

Mark Gottscho, Sriram Govindan, Bikash Sharma, Mohammed Shoaib, more

2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) > 263 - 273

2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

Effective use of the memory hierarchy is crucial to cloud computing. Platform memory subsystems must be carefully provisioned and configured to minimize overall cost and energy for cloud providers. For cloud subscribers, the diversity of available platforms complicates comparisons and the optimization of performance. To address these needs, we present X-Mem, a new open-source software tool that characterizes...

chapter

Data-driven approach of FS-SKPLS monitoring with application to wastewater treatment process

Zelin Ren, Jianxing Liu, Zhiyong She, Chengming Yang, more

2016 IEEE International Conference on Industrial Technology (ICIT) > 950 - 955

2016 IEEE International Conference on Industrial Technology (ICIT)

In this paper, a data-driven scheme of spherical kernel partial least squares based on feature subspace (FS-SKPLS) is applied to the wastewater treatment process (WWTP). First, select appropriate data variables. Utilize the benchmark simulation model no. 1 (BSM1) to obtain the large amounts of training and testing data needed for process monitoring. Then, introduce the feature subspace method...

chapter

Sparse encoding of binocular images for depth inference

Sheng Y. Lundquist, Dylan M. Paiton, Peter F. Schultz, Garrett T. Kenyon

2016 IEEE Southwest Symposium on Image Analysis and Interpretation (SSIAI) > 121 - 124

2016 IEEE Southwest Symposium on Image Analysis and Interpretation (SSIAI)

Sparse coding models have been widely used to decompose monocular images into linear combinations of small numbers of basis vectors drawn from an overcomplete set. However, little work has examined sparse coding in the context of stereopsis. In this paper, we demonstrate that sparse coding facilitates better depth inference with sparse activations than comparable feed-forward networks of the same...

chapter

Specific Read-Only Data Management for Memory System Optimization

Gregory Vaumourin, Guerre Alexandre, Dombek Thomas, Denis Barthou

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) > 337 - 340

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)

This paper proposes a new way of managing the cache by exploiting the difference in memory-system behavior between read-only data and read-write data. A division of the existing cache-based memory hierarchy is proposed in order to create a dedicated data path for read-only data. In order to justify this approach, an analysis performed on a set of benchmarks shows that read-only data count for...

chapter

A Quantitative Performance Evaluation of Fast on-Chip Memories of GPUs

Elias Konstantinidis, Yiannis Cotronis

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) > 448 - 455

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)

Modern Graphics Processing Units (GPUs) have evolved into high-performance general-purpose processors, forming an alternative to CPUs. However, programming them effectively has proven to be a challenge, not only because of the mandatory requirement of extracting massive fine-grained parallelism but also because their performance is sensitive to memory traffic. Apart from regular memory caches, GPUs feature...

chapter

4.6 A 65nm CMOS 6.4-to-29.2pJ/FLOP@0.8V shared logarithmic floating point unit for acceleration of nonlinear function kernels in a tightly coupled processor cluster

Michael Gautschi, Michael Schaffner, Frank K. Gurkaynak, Luca Benini

2016 IEEE International Solid-State Circuits Conference (ISSCC) > 82 - 83

2016 IEEE International Solid-State Circuits Conference (ISSCC)

Energy-efficient computing and ultra-low-power operation are requirements for many application areas, such as IoT and wearables. While for some applications, integer and fixed-point processor instructions suffice, others (e.g. simultaneous localization and mapping - SLAM, stereo vision, nonlinear regression and classification) require a larger dynamic range, typically obtained using single/double-precision...

chapter

Beyond Photo-Domain Object Recognition: Benchmarks for the Cross-Depiction Problem

Hongping Cai, Qi Wu, Peter Hall

2015 IEEE International Conference on Computer Vision Workshop (ICCVW) > 74 - 79

2015 IEEE International Conference on Computer Vision Workshop (ICCVW)

The cross-depiction problem is that of recognising visual objects regardless of whether they are photographed, painted, drawn, etc. It poses a great challenge, as the variance across photo and art domains is much larger than within either alone. We extensively evaluate classification, domain adaptation and detection benchmarks for leading techniques, demonstrating that none perform consistently well given...

chapter

A Performance Evaluation Model for Virtual Servers in KVM-Based Virtualized System

Jing Yang, Yuqing Lan

2015 IEEE International Conference on Smart City/SocialCom/SustainCom (SmartCity) > 66 - 71

2015 IEEE International Conference on Smart City/SocialCom/SustainCom (SmartCity)

According to statistics, traditional servers have low resource utilization and high energy consumption. To reduce costs, more and more companies are beginning to build virtual servers. Server virtualization implements the mapping from virtual resources to physical resources and deals with resource contention among all VMs. Because of the complexity of virtualized server systems, it is necessary to...

chapter

To Co-run, or Not to Co-run: A Performance Study on Integrated Architectures

Feng Zhang, Jidong Zhai, Wenguang Chen, Bingsheng He, more

2015 IEEE 23rd International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems > 89 - 92

2015 IEEE 23rd International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS)

Architecture designers tend to integrate both CPU and GPU on the same chip to deliver energy-efficient designs. To effectively leverage the power of both CPUs and GPUs on integrated architectures, researchers have recently put substantial efforts into co-running a single application on both the CPU and the GPU of such architectures. However, few studies have been performed to analyze a wide range...

chapter

SnabbSwitch user space virtual switch benchmark and performance optimization for NFV

Michele Paolino, Nikolay Nikolaev, Jeremy Fanguede, Daniel Raho

2015 IEEE Conference on Network Function Virtualization and Software Defined Network (NFV-SDN) > 86 - 92

2015 IEEE Conference on Network Function Virtualization and Software Defined Network (NFV-SDN)

New paradigms in the networking industry, such as Software Defined Networking (SDN) and Network Functions Virtualization (NFV), require hypervisors to enable the execution of Virtual Network Functions in virtual machines (VMs). In this context, the virtual switch function is critical to achieving carrier-grade performance, hardware independence, advanced features and programmability. SnabbSwitch is...

chapter

A novel local success weighted ensemble classifier

Raghvendra Kannao, Prithwijit Guha

2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) > 469 - 473

2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)

Ensemble methods aggregate the decisions of diverse component classifiers to achieve superior classification performances. Most of the previous ensemble frameworks have used fixed weights to determine the influence of each of the component classifiers on the ensemble decision. However, in practice base classifiers usually have expertise in local regions of the feature space. This paper presents a...

chapter

An OpenCL-Compliant Multi-core Platform and Its Companion Compiler

Ramon S. Nepomuceno, Jonatas C. Santos, Laysson O. Luz, Ivan S. Silva

2015 Brazilian Symposium on Computing Systems Engineering (SBESC) > 116 - 121

2015 Brazilian Symposium on Computing Systems Engineering (SBESC)

Nowadays, multi-core architectures have become mainstream in the microprocessor industry. However, as the number of cores integrated in a single chip grows, the need for an adequate programming model becomes more important. In recent years, the OpenCL programming model has attracted the attention of the multi-core design community. This paper presents an OpenCL-compliant architecture and demonstrates...

chapter

Enhanced simulation performance through parallelization using a synthetic and a real-world simulation model

Tommy Baumann, Bernd Pfitzinger, Dragan Macos, Thomas Jestadt

2015 Federated Conference on Computer Science and Information Systems (FedCSIS) > 1335 - 1341

2015 Federated Conference on Computer Science and Information Systems (FedCSIS)

Taking an existing large-scale simulation model of the German toll system, we identify possibilities for parallelization in order to enhance simulation performance. We transform parts of the model from its current serial implementation to a parallel implementation. Afterwards, we evaluate the achieved performance enhancement and compare the results to a synthetic benchmark model.

chapter

TRACO: An automatic loop nest parallelizer for numerical applications

Marek Palkowski, Tomasz Klimek, Wlodzimierz Bielecki

2015 Federated Conference on Computer Science and Information Systems (FedCSIS) > 681 - 686

2015 Federated Conference on Computer Science and Information Systems (FedCSIS)

We present the source-to-source TRACO compiler allowing for increasing program locality and parallelizing arbitrarily nested loop sequences in numerical applications. Algorithms for generation of tiled code and extracting synchronization-free slices composed of tiles are presented. Parallelism of arbitrary nested loops is obtained by creating a kernel of computations represented in the OpenMP standard...

chapter

Patch-based scale calculation for visual tracking

Yulong Xu, Yafei Zhang, Jiabao Wang, Yang Li, more

2015 International Conference on Wireless Communications & Signal Processing (WCSP) > 1 - 5

2015 International Conference on Wireless Communications & Signal Processing (WCSP)

Robust scale calculation is a challenging problem in visual object tracking. Most state-of-the-art trackers fail to handle large scale variations in complex image sequences. This paper proposes a novel approach for robust scale calculation in a tracking-by-detection framework. The proposed approach divides the target into four patches and computes the scale factor by finding the maximum response position...

chapter

Performance analysis of Web application in Xen-based virtualized environment

Reza NasiriGerdeh, Negin Hosseini, Keyvan RahimiZadeh, Morteza AnaLoui

2015 5th International Conference on Computer and Knowledge Engineering (ICCKE) > 256 - 261

2015 5th International Conference on Computer and Knowledge Engineering (ICCKE)

Virtualization technologies are experiencing renewed interest for diverse applications such as cloud computing and server consolidation. These technologies reduce costs and improve the flexibility and reliability of services. However, they pose a new performance challenge. The performance of an application running inside a virtual machine may differ considerably from its performance in a native environment because of...

chapter

Performance Evaluation of Hypervisors for HPC Applications

David Beserra, Felipe Oliveira, Jean Araujo, Felipe Fernandes, more

2015 IEEE International Conference on Systems, Man, and Cybernetics > 846 - 851

2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC)

High Performance Computing (HPC) aggregates computing power in order to solve large and complex problems in different knowledge areas. Nowadays, HPC users can utilize virtualized infrastructures as a low-cost alternative for deploying their applications. However, virtualization brings some challenges for HPC, especially with regard to the overhead caused by hypervisors. In this work, our main goal is to analyze...

INFONA - science communication portal
