By taking advantage of both the CPU and the GPU, as well as the shared DRAM and cache, the integrated CPU-GPU architecture has the potential to boost performance for a variety of applications, including real-time applications. However, before it can be applied to hard real-time and safety-critical applications, the time predictability of the integrated CPU-GPU architecture needs to be studied...
This paper studies visual pattern discovery in large-scale image collections via binarized mode seeking, where images can only be represented as binary codes for efficient storage and computation. We address this problem from the perspective of binary space mode seeking. First, a binary mean shift (bMS) is proposed to discover frequent patterns via mode seeking directly in binary space. The binomial-based...
The capability of GPUs to accelerate general-purpose applications that can be parallelized into a massive number of threads makes it promising to apply GPUs to real-time applications as well, where high throughput and intensive computation are also needed. However, due to the different architecture and programming model of GPUs, the worst-case execution time (WCET) analysis methods and techniques designed...
DNNs (Deep Neural Networks) have demonstrated great success in numerous applications such as image classification, speech recognition, video analysis, etc. However, DNNs are much more computation-intensive and memory-intensive than previous shallow models. Thus, it is challenging to deploy DNNs in both large-scale data centers and real-time embedded systems. Considering performance, flexibility, and...
The recent adoption of the OpenCL programming model by FPGA vendors has realized the function portability of OpenCL workloads on FPGAs. However, poor performance portability prevents its wide adoption. To harness the power of FPGAs using the OpenCL programming model, it is advantageous to design an analytical performance model to estimate the performance of OpenCL workloads on FPGAs and provide insights...
The release of OpenCL support for FPGAs represents a significant improvement in extending database applications to the reconfigurable domain. Taking advantage of the programmability offered by the OpenCL HLS tool, an OpenCL database can be easily ported and re-designed for FPGAs. A single SQL query in these database systems usually consists of multiple operators, and each one of these operators in...
Heterogeneous computing is rapidly gaining increased attention due to the promise it holds in overcoming power and performance walls in traditional computing systems. With its focus on customized processing nodes dedicated to the different tasks in an application, it is hoped that these walls will be overcome. Therefore, CPU-FPGA co-architectures are also gaining ground in application areas like recognition,...
In an elastic cloud computing environment, multiple virtual machines may reside on the same physical machine for service consolidation. For guest domains residing on the same machine or for multi-tiered hosting services, inter-domain communications are complex and frequent. However, traditional inter-domain communications are conducted through the virtual network interfaces of both sending and receiving virtual...
Intelligent GPU cache bypassing can improve the efficiency of using GPU memory bandwidth, which can benefit GPU performance. In this paper, we study a pure hardware-based GPU cache bypassing method that can be applied to GPU applications without having to recompile the programs. Moreover, we introduce a hybrid method that can exploit profiling information to further enhance the hardware-based bypassing...
Cache memories have been introduced in recent generations of Graphics Processing Units (GPUs) to benefit general-purpose computing on GPUs (GPGPUs). In this work, we analyze the memory access patterns of GPGPU applications and propose a cost-effective profiling-based method to identify the data accesses that should bypass the L1 data cache to improve performance. The evaluation indicates that the...
Recent Graphics Processing Units (GPUs) have employed cache memories to boost performance. However, cache memories are well known to be harmful to time predictability for CPUs. For high-performance real-time systems using GPUs, it remains unknown whether or not cache memories should be employed. In this paper, we quantitatively compare the performance for GPUs with and without caches, and find that...
Graphics Processing Units (GPUs) have become a popular choice for general-purpose high-performance computing. Encryption and decryption algorithms such as the Advanced Encryption Standard (AES) have been implemented on GPUs to gain significant speedup. However, the security of the GPU architecture is not well studied, making it potentially risky to offload sensitive computation to GPUs. In this paper,...
A parallelized and pipelined architecture based on FPGA and a higher-level Self Reconfiguration Platform are proposed in this paper to model the Generalized Laguerre-Volterra MIMO system, which is essential in identifying the time-varying neural dynamics underlying spike activities. Our proposed design is based on the Xilinx Virtex-6 FPGA platform and the processing core can produce data samples at a speed of 1...
The deconvolution of blurred and noisy images is an ill-posed inverse problem, which can be regularized under the Bayesian framework by introducing an appropriate image prior. In this paper, inspired by the state-of-the-art nonlocal means (NLM) denoising technique, which exploits the similarity of image patches, we construct an inhomogeneous and anisotropic image prior under the Markov random field...
In this paper, a new approach to improving least squares support vector machines is presented. We consider the membership of every sample in the constraints; that is, a sample is not fully assigned to one class. The membership is computed using fuzzy rough sets, and a new least squares support vector machine algorithm based on fuzzy rough sets is then proposed. Experiments...
Blurring and jaggy artifacts are the primary culprits that plague current super-resolution techniques. In this paper, we propose a simple but effective approach that is capable of producing a pleasing, artifact-free high-resolution image from a single low-resolution input. Specifically, we first magnify the low-resolution image to the desired resolution through structure adaptive interpolation...
Face recognition plays a very important role in security surveillance, secure access and identity authentication. In this paper, we propose a novel face recognition method based on supervised learning. Our method first extracts face features using supervised spectral regression and then uses a multiple kernel SVM to classify faces. Experimental results on the Yale B face database and AR face...
To address the difficulty of obtaining real-time, online measurements of parameters in the wood drying process, a novel PSO-SVM model is presented that hybridizes particle swarm optimization (PSO) and support vector machines (SVM) to handle the nonlinearity caused by ambient temperature and other disturbance factors. Support vector machines (SVM) based on statistical learning theory and structural...
Dynamic analysis of the current network status is critical for detecting large-scale intrusions and ensuring that networks continue to function. Collecting and analyzing traffic in real time and reporting the current status promptly provides a feasible way to do so. In this paper we use a refined naive Bayes method, the naive Bayes kernel estimator (NBKE), to identify flooding attacks and port scans from normal...
In this paper, we address the problem of producing a super-resolved image from a single low-resolution input. Unlike most previous work, the camera's point spread function (PSF) is not assumed to be known in advance, and the single image super-resolution problem is formulated as a blind deconvolution problem under a MAP framework, which can be optimized effectively in an iterative manner. Experimental...