In this paper, we propose a 2-D grouping FIFO-based FFT hardware architecture that supports 36 different FFT sizes defined in 3GPP-LTE systems. The design is founded on a hybrid-radix computing kernel engine with 4 configuration types. Implemented in TSMC 90-nm CMOS technology, the reconfigurable FFT chip occupies a core area of only 1.51 mm², dissipating...
The evolution of convolutional neural networks (CNNs) into more complex forms of organization, with additional layers, larger convolutions and increasing connections, established the state-of-the-art in terms of accuracy errors for detection and classification challenges in images. Moreover, as they evolved to a point where Gigabytes of memory are required for their operation, we have reached a stage...
Network virtualization offers flexibility by decoupling the virtual network from the underlying physical network. A Software-Defined Network (SDN) could utilize the virtual network. For example, in Software-Defined Networks, the entire network can be run on commodity hardware and operating systems that use virtual elements. However, this could present new challenges for data plane performance. In this paper,...
Field-Programmable Gate Arrays (FPGAs) are gaining considerable momentum in mainstream high-performance systems in recent years due to their flexibility and low power consumption. Still, FPGAs remain largely unavailable to software programmers due to programming and debugging difficulties that are inherent to standard Hardware Description Languages. The performance that hardware-oblivious software...
Recent developments in storage class memory such as PCM, MRAM, RRAM, and STT-RAM have strengthened their leadership as storage media for memory-based file systems. Traditional Linux memory-based file systems such as Ramfs and Tmpfs utilize the Linux page cache as a file system. These file systems, when adopted as a file system for SCM, have the following problems. First, current implementation of...
Pre-trained convolutional deep neural networks (CNNs) are widely used in embedded systems, which require high power and area efficiency. In that case, the CPU is too slow, the embedded GPU dissipates too much power, and the ASIC cannot keep up with the rapid progress of CNN variations. This paper uses a binarized CNN, which uses only two-valued (binary) inputs and weights. Since the...
In this paper we propose a novel CNN hardware accelerator, called AIScale, capable of accelerating convolutional, pooling, fully-connected and adding CNN layers. In contrast to most existing solutions, AIScale offers a complete solution to the full CNN acceleration. AIScale is designed as a coarse-grained reconfigurable architecture, which uses rapid, dynamic reconfiguration during the CNN layer processing...
Emerging non-volatile memory (NVM) technologies provide opportunities to improve the performance of key-value databases (KVDBs) by deploying the database on NVM. However, existing in-memory KVDBs cannot fully exploit the advantages of NVM. They process data in an in-memory database and store an image on persistent storage via an underlying file system. The performance of database operations is degraded by...
Next-generation memory technologies, which we denote as new memory, are both nonvolatile and byte-addressable. These characteristics are expected to bring changes to the conventional computer system structure. Most previous research on the use of new memory has focused on how to efficiently store files, objects, and data structures while exploiting persistence in new memory. Unlike...
Attacks on memory, revealing secrets, for example, via DMA or cold boot, are a long known problem. In this paper, we present TransCrypt, a concept for transparent and guest-agnostic, dynamic kernel and user main memory encryption using a custom minimal hypervisor. The concept utilizes the address translation features provided by hardware-based virtualization support of modern CPUs to restrict the...
Heterogeneous computing platforms containing a wide range of computing resources, from CPUs to specialized hardware accelerators, are the trend today, resulting from the physical limitations on processor speed and the increasing demand for computing performance. Hence, many optimization strategies are studied to achieve better throughput and lower energy consumption in heterogeneous systems. Various memory...
Mobile device forensics is an interdisciplinary field consisting of techniques applied to a wide range of computing devices. Android devices are among the most disruptive technologies of recent years, gaining ever wider diffusion and success in the daily lives of a wide range of people. Android devices have become even more important in the forensic field due to the rich amount of personal information...
Frequent itemset mining (FIM) is a widely-used data-mining technique for discovering sets of frequently-occurring items in large databases. However, FIM is highly time-consuming when datasets grow in size. FPGAs have shown great promise for accelerating computationally-intensive algorithms, but they are hard to use with traditional HDL-based design methods. The recent introduction of Xilinx SDAccel...
Smartphones have become a vital part of our business and everyday life, as they constitute the primary communication vector. Android dominates the smartphone market (86.2%) and has become pervasive, running in ‘smart’ devices such as tablets, TV, watches, etc. Nowadays, instant messaging applications have become popular amongst smartphone users and since 2016 are the main way of messaging communication...
Convolutional Neural Network (CNN) has become one of the most successful technologies for visual classification and other applications. As CNN models continue to evolve and adopt different kernel sizes in various applications, it is necessary for the hardware architecture to support reconfigurability. Previous FPGAs and programmable ASICs are fine-grained reconfigurable but with energy efficiency...
This paper describes the implementation of approximate memory support in the Linux operating system kernel. The new functionality allows the kernel to distinguish between normal memory banks, which are composed of standard memory cells that retain data without corruption, and approximate memory banks, where memory cells are subject to read/write faults with controlled probability. Approximate memories...
We compare the multi-GPU performance of the multilevel fast multipole algorithm (MLFMA) on two different systems: a shared-memory IBM S822LC workstation with four NVIDIA P100 GPUs, and 16 XK nodes (each equipped with a single NVIDIA K20X GPU) of the Blue Waters supercomputer. MLFMA is implemented for solving scattering problems involving two-dimensional inhomogeneous bodies. Results show that the multi-GPU...
Machine Learning techniques such as Support Vector Machines (SVM) have found applications in many fields, e.g. in Wireless Sensor Networks (WSN) and sensor data processing in general. Especially in the case of WSN, energy is very limited, as agents operate solely on battery power after they have been deployed; energy efficiency is therefore of great importance. Furthermore, agents are supposed...
This work presents an efficient hardware accelerator design of deep residual learning algorithms, which have shown superior image recognition accuracy (>90% top-5 accuracy on ImageNet database). Two key objectives of the acceleration strategy are to (1) maximize resource utilization and minimize data movements, and (2) employ scalable and reusable computing primitives to optimize physical design...
Deep learning has been gaining popularity in recent years due to its impressive performance in different application areas. The Convolutional Neural Network (CNN) is the state-of-the-art deep learning architecture, widely used in image recognition, speech recognition and many other applications. A CNN is a computationally intensive and resource-hungry architecture. Hence, its efficient...