Traditional suites used for benchmarking high-performance computing platforms or for architectural design space exploration use much simpler virtual memory layouts and multitasking/multithreading schemes, which means that they cannot be used to study the complex interactions among the layers of the Android software stack. To demonstrate this, we present memory reference and concurrency data showing...
Effective use of the memory hierarchy is crucial to cloud computing. Platform memory subsystems must be carefully provisioned and configured to minimize overall cost and energy for cloud providers. For cloud subscribers, the diversity of available platforms complicates comparisons and the optimization of performance. To address these needs, we present X-Mem, a new open-source software tool that characterizes...
In this paper, a data-driven scheme of spherical kernel partial least squares based on feature subspace (FS-SKPLS) is applied to the wastewater treatment process (WWTP). First, appropriate data variables are selected. The benchmark simulation model no. 1 (BSM1) is used to obtain the large amounts of training and testing data needed for process monitoring. Then, introduce the feature subspace method...
Sparse coding models have been widely used to decompose monocular images into linear combinations of small numbers of basis vectors drawn from an overcomplete set. However, little work has examined sparse coding in the context of stereopsis. In this paper, we demonstrate that sparse coding facilitates better depth inference with sparse activations than comparable feed-forward networks of the same...
This paper proposes a new way of managing the cache by exploiting the difference of behavior in the memory system between read-only data and read-write data. A division of the existing cache-based memory hierarchy is proposed in order to create a dedicated data path for read-only data. In order to justify this approach, an analysis performed on a set of benchmarks shows that read-only data count for...
Modern Graphics Processing Units (GPUs) have evolved into high-performance general-purpose processors, forming an alternative to CPUs. However, programming them effectively has proven to be a challenge, not only because massive fine-grained parallelism must be extracted but also because their performance is highly sensitive to memory traffic. Apart from regular memory caches, GPUs feature...
Energy-efficient computing and ultra-low-power operation are requirements for many application areas, such as IoT and wearables. While for some applications, integer and fixed-point processor instructions suffice, others (e.g. simultaneous localization and mapping - SLAM, stereo vision, nonlinear regression and classification) require a larger dynamic range, typically obtained using single/double-precision...
The cross-depiction problem is that of recognising visual objects regardless of whether they are photographed, painted, drawn, etc. It poses a great challenge, as the variance across photo and art domains is much larger than within either alone. We extensively evaluate classification, domain adaptation and detection benchmarks for leading techniques, demonstrating that none perform consistently well given...
According to statistics, traditional servers suffer from low resource utilization and high energy consumption. To reduce costs, more and more companies are beginning to build virtual servers. Server virtualization implements the mapping from virtual resources to physical resources and deals with resource contention among all VMs. Because of the complexity of virtualized server systems, it is necessary to...
Architecture designers tend to integrate both CPU and GPU on the same chip to deliver energy-efficient designs. To effectively leverage the power of both CPUs and GPUs on integrated architectures, researchers have recently put substantial efforts into co-running a single application on both the CPU and the GPU of such architectures. However, few studies have been performed to analyze a wide range...
New paradigms in the networking industry, such as Software Defined Networking (SDN) and Network Functions Virtualization (NFV), require hypervisors to enable the execution of Virtual Network Functions in virtual machines (VMs). In this context, the virtual switch function is critical to achieve carrier-grade performance, hardware independence, advanced features and programmability. SnabbSwitch is...
Ensemble methods aggregate the decisions of diverse component classifiers to achieve superior classification performances. Most of the previous ensemble frameworks have used fixed weights to determine the influence of each of the component classifiers on the ensemble decision. However, in practice base classifiers usually have expertise in local regions of the feature space. This paper presents a...
Nowadays, multi-core architectures have become mainstream in the microprocessor industry. However, as the number of cores integrated in a single chip grows, the need for an adequate programming model becomes more important. In recent years, the OpenCL programming model has attracted the attention of the multi-core designers' community. This paper presents an OpenCL-compliant architecture and demonstrates...
Taking an existing large-scale simulation model of the German toll system we identify possibilities for parallelization in order to enhance simulation performance. We transform parts of the model from its current serial implementation to a parallel implementation. Afterwards we evaluate the achieved performance enhancement and compare the results to a synthetic benchmark model.
We present the source-to-source TRACO compiler allowing for increasing program locality and parallelizing arbitrarily nested loop sequences in numerical applications. Algorithms for generation of tiled code and extracting synchronization-free slices composed of tiles are presented. Parallelism of arbitrary nested loops is obtained by creating a kernel of computations represented in the OpenMP standard...
Robust scale calculation is a challenging problem in visual object tracking. Most state-of-the-art trackers fail to handle large scale variations in complex image sequences. This paper proposes a novel approach for robust scale calculation in a tracking-by-detection framework. The proposed approach divides the target into four patches and computes the scale factor by finding the maximum response position...
Virtualization technologies are experiencing renewed interest for diverse applications such as cloud computing and server consolidation. These technologies reduce costs and improve the flexibility and reliability of services. However, they pose a new performance challenge. The performance of an application running inside a virtual machine may differ considerably from its performance in a native environment because of...
High Performance Computing (HPC) aggregates computing power in order to solve large and complex problems in different knowledge areas. Nowadays, HPC users can utilize virtualized infrastructures as a low-cost alternative to deploy their applications. However, virtualization brings some challenges for HPC, especially with regard to the overhead caused by hypervisors. In this work, our main goal is to analyze...