In this letter, an improved kernel density estimation (KDE) constant false alarm rate (CFAR) method is proposed for ship detection in single polarization synthetic aperture radar (SAR) images. The proposed method consists of a target enhancement filter, an adaptive KDE bandwidth estimation method and an improved KDE-CFAR. The gravity-based target enhancement filter is utilized to remove the inhomogeneity...
As a nonlinear extension of the Kalman filter, the extended Kalman filter (EKF) is also based on the minimum mean square error (MMSE) criterion. In general, the EKF performs well in Gaussian noise, but its performance may deteriorate substantially when the system is disturbed by heavy-tailed impulsive noise. In order to improve the robustness of the EKF against impulsive noise, a new filter for nonlinear...
Service performance degradation and downtime are common on the Internet today. Many online services (e.g., Amazon.com, Spotify, and Netflix) report huge losses in revenue and traffic per episode. This is perhaps due to the correlation between performance and end users' satisfaction.
For GPUs to achieve their peak performance, effective and efficient usage of memory bandwidth is necessary. To this end, programmers invest extensive development effort to optimize a GPU program, especially its memory bandwidth usage. The OpenACC programming model has been introduced to tackle the complexity of programming accelerators. However, this model's coarse-grained control over a program can make...
Deep learning, and especially the Convolutional Neural Network (CNN), is among the most powerful and widely used techniques in computer vision. Applications range from image classification to object detection, segmentation, Optical Character Recognition (OCR), etc. At the same time, CNNs are both computationally intensive and memory intensive, making them difficult to deploy on low power lightweight...
The recent advent of stacked memory devices has led to a resurgence of research associated with the fundamental memory hierarchy and associated memory pipeline. The bandwidth advantages provided by stacked logic and DRAM devices have inspired research associated with eliminating the bandwidth bottlenecks associated with many applications in high performance computing. Further, recent efforts have focused...
The presented paper deals with trend modeling of spectral coefficients that represent material properties at very rapid load. We focus on identification and optimization of the curve describing the dependence between frequencies and time in the spectrogram. The spectrogram is first processed in order to identify significant spectral coefficients. Subsequently, for such coefficients we apply non-parametric kernel...
The paper presents an assessment of the reliability of medium voltage networks within a power company. Reliability of power supply in medium voltage networks is one of the commonly recognized targets of Smart Grid. Novel approaches are needed for evaluating the reliability of electricity distribution and the reliability of supply in distribution network planning. This paper presents a stochastic supply...
The degradation of the Amazon Rainforest is a worldwide concern. The rainforest has been endangered by uncontrolled illegal wood extraction, even in preservation areas. Due to the large geographic extent, preventing these crimes with an unmanned aerial vehicle (UAV) is not always possible. Wireless Acoustic Sensor Network (WASN) technology can alleviate this problem. Here, we present an acoustical...
We establish a framework that can be used by Origin Servers (content-generating organizations) for claiming Content Delivery Network (CDN) resources in a fine-grained way. The basis of our work lies in the use of Stocks as well as a Secondary Market for the stock trading, tools and products commonly used in modern capital markets. Network and disk resources are being monitored through well-established...
Modern Graphics Processing Units (GPUs) have evolved into high performance general purpose processors, forming an alternative to CPUs. However, programming them effectively has proven to be a challenge, not only due to the mandatory requirement of extracting massive fine-grained parallelism but also due to the sensitivity of their performance to memory traffic. Apart from regular memory caches, GPUs feature...
Our target in this work is to study ways of exploring the parallelism offered by vectorization on accelerators with very wide vector units. To this end, we implemented two kernels that derive from the Wilson Dslash operator and investigate several data layout techniques for increasing the scalability of lattice QCD scientific kernels suitable for the Intel Xeon Phi. In parts of the application where...
Processing data in or near memory (PIM), as opposed to in conventional computational units in a processor, can greatly alleviate the performance and energy penalties of data transfers from/to main memory. Graphics Processing Unit (GPU) architectures and applications, where main memory bandwidth is a critical bottleneck, can benefit from the use of PIM. To this end, an application should be properly...
Execution of GPGPU workloads consists of different stages including data I/O on the CPU, memory copy between the CPU and GPU, and kernel execution. While the GPU can remain idle during I/O and memory copy, prior work has shown that overlapping data movement (I/O and memory copies) with kernel execution can improve performance. However, when there are multiple dependent kernels, the execution of the kernels...
Sparse matrix-vector multiplication (SpMV) is an important computational kernel in many applications. For performance improvement, software libraries designated for SpMV computation have been introduced, e.g., MKL library for CPUs and cuSPARSE library for GPUs. However, the computational throughput of these libraries is far below the peak floating-point performance offered by hardware platforms, because...
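For readers unfamiliar with the kernel named in the abstract above, the following is a minimal sketch of SpMV in plain Python. It assumes the compressed sparse row (CSR) storage format, which the abstract does not specify; the function name `spmv_csr` and its parameter names are hypothetical and are not taken from MKL, cuSPARSE, or the paper.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    values  : nonzero entries of A, listed row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits row i inside `values`
    x       : dense input vector
    (All names are illustrative assumptions, not a library API.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Only the stored nonzeros of this row are visited.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))
```

The inner loop reads `x` through `col_idx`, so its memory accesses are indirect and irregular.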
With the recent advancement of multilayer convolutional neural networks (CNN), deep learning has achieved amazing success in many areas, especially in visual content understanding and classification. To improve the performance and energy-efficiency of the computation-demanding CNN, the FPGA-based acceleration emerges as one of the most attractive alternatives. In this paper we design and implement...
As massive multi-threading in GPU imposes tremendous pressure on memory subsystems, efficient bandwidth utilization becomes a key factor affecting the GPU throughput. In this work, we propose thread batch enabled memory partitioning (TEMP), to improve GPU performance through the improvement of memory bandwidth utilization. In particular, TEMP clusters multiple thread blocks sharing the same set of...
The growth of memory capacity in computer systems has not kept up with the growth in the memory requirements of large-memory applications. Also, big-memory systems have been too expensive for many researchers and students. Therefore, approaches that utilize remote memory have been considered a cost-effective way to run large-memory applications in a cluster environment where...