A fork bomb attack is a denial-of-service attack in which an attacker rapidly spawns many processes, exhausting the resources of the target computer system. There is prior work on detecting and removing the processes that cause fork bomb attacks. However, operating systems using these previous methods risk terminating legitimate processes that are not fork bomb processes. In this paper,...
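The attack mechanism and the classic coarse POSIX mitigation can be sketched as follows. This is a generic illustration of why a fork bomb works and how a per-process limit contains it, not the detection method the paper proposes; the limit value 1024 is arbitrary:

```python
import resource

# A fork bomb calls fork() in a loop; every child does the same, so the
# number of processes grows exponentially until PIDs or memory run out:
#
#     while True:
#         os.fork()   # never run this outside a sandbox
#
# The classic coarse mitigation is the POSIX per-user process limit,
# RLIMIT_NPROC: once the limit is reached, fork() fails with EAGAIN
# instead of creating another process, which contains the bomb.
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)

# Tighten the soft limit for this process (1024 is illustrative only).
new_soft = 1024 if soft == resource.RLIM_INFINITY else min(soft, 1024)
resource.setrlimit(resource.RLIMIT_NPROC, (new_soft, hard))
print("process limit lowered to", new_soft)
```

The drawback of this blanket limit is exactly the problem the abstract raises: it cannot distinguish a fork bomb from a legitimate process that simply spawns many children.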
GPUs are widely used as powerful accelerators for data-parallel applications, such as financial and scientific workloads, in both industry and research. Effective scheduling of kernels can significantly enhance performance and utilization. In shared environments such as the cloud, many kernels from different users are requested for execution. An effective kernel scheduling method can...
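To illustrate why the launch order of queued kernels matters, here is a toy comparison of first-come-first-served against shortest-job-first ordering on hypothetical kernel run times. This is a generic scheduling illustration, not the method the paper proposes:

```python
def avg_wait(durations):
    """Average time each job waits before starting when jobs run
    back-to-back in the given order."""
    wait, elapsed = 0.0, 0.0
    for d in durations:
        wait += elapsed
        elapsed += d
    return wait / len(durations)

kernels = [8.0, 1.0, 2.0]            # hypothetical kernel run times (ms)
print(avg_wait(kernels))             # FCFS (arrival) order
print(avg_wait(sorted(kernels)))     # shortest-job-first order
```

Running the short kernels first lowers the average waiting time, which is one reason kernel schedulers in shared environments do better than naive FIFO queues.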
In this paper, we introduce memos, which integrates suitable memory management policies and schedules resources over the entire memory hierarchy in hybrid memory systems. Powered by an OS-kernel-level monitoring tool, memos captures memory access patterns online and then leverages them to guide memory page placement and data mapping. Experimental results show that, on average, memos can improve memory utilization,...
The hardware and software structures of the real-time operating system of the SATELLITE programmable controller are considered. The processor module uses a microprocessor based on an ARM Cortex-M4F core. The user application program is developed in the CODESYS environment and executed under the control of the RTS program from 3S-Smart Software Solutions GmbH. For the interface between RTS and the hardware, the ARM...
Estimating the failure probabilities of SRAM memory cells using Monte Carlo or Importance Sampling techniques is expensive in the number of SPICE simulations needed. This paper presents a methodology for estimating the dynamic margin failure probabilities by building a surrogate model of the dynamic margin using Gaussian Process regression. Additive kernel functions that can extrapolate the margin...
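The cost of plain Monte Carlo for rare cell failures can be quantified from the standard error of the Bernoulli estimator: with N samples, the relative standard error of an estimate of failure probability p is sqrt((1-p)/(N·p)). A small sketch, where the target probability and error tolerance are illustrative values, not numbers taken from the paper:

```python
import math

def mc_samples_needed(p, rel_err):
    """Number of Monte Carlo samples so that the relative standard
    error of the failure-probability estimate is at most rel_err,
    using Var(p_hat) = p * (1 - p) / N for a Bernoulli estimator."""
    return math.ceil((1.0 - p) / (p * rel_err ** 2))

# A rare SRAM cell failure rate of ~1e-9 at 10% relative error
# requires on the order of 1e11 SPICE simulations:
print(mc_samples_needed(1e-9, 0.10))
```

This sample count is what motivates surrogate models such as the Gaussian process regression described above: each surrogate evaluation is far cheaper than a SPICE run.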
Hardware accelerators for convolutional neural networks (CNNs) incorporate large amounts of SRAM in order to reduce the number of expensive off-chip DRAM accesses. This design trend has an implication for architects: SRAM area will dominate the entire chip area of future CNN accelerators. Since the probability of soft errors, such as those caused by energetic particle strikes, scales with SRAM density, errors...
Many studies have shown that memory hardware error rates are orders of magnitude higher than previously reported. To combat these memory hardware errors, many memory testing tools have been developed, especially software-level online memory testers, i.e. memory testers implemented in software that can run alongside the OS (operating system). However, validation...
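As a concrete example of the kind of check a software memory tester performs, here is the textbook walking-1s pattern test over a simulated memory. This is a standard pattern test for illustration, not the validation technique the paper develops, and the stuck-at fault model below is made up:

```python
def walking_ones(write, read, addr, width=8):
    """Classic walking-1s test at one address: write each single-bit
    pattern and check that it reads back unchanged."""
    for bit in range(width):
        pattern = 1 << bit
        write(addr, pattern)
        if read(addr) != pattern:
            return False    # a bit failed to hold the written value
    return True

# Fault-free simulated memory: every pattern reads back correctly.
mem = {}
print(walking_ones(mem.__setitem__, mem.__getitem__, 0x10))   # passes

# Simulated memory with a stuck-at-0 fault in bit 3 (mask 0x08):
faulty = {}
print(walking_ones(lambda a, v: faulty.__setitem__(a, v & ~0x08),
                   faulty.__getitem__, 0x10))                 # fails
```

Real online testers differ mainly in that they must steal pages from a running OS and restore their contents afterwards, which is what makes validating them difficult.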
Single-ISA heterogeneous multicore processors have gained increasing popularity with the introduction of recent technologies such as ARM big.LITTLE. These processors offer increased energy efficiency through combining low power in-order cores with high performance out-of-order cores. Efficiently exploiting this attractive feature requires careful management so as to meet the demands of targeted applications...
We present a novel architecture for sparse pattern processing, using flash storage with embedded accelerators. Sparse pattern processing on large data sets is the essence of applications such as document search, natural language processing, bioinformatics, subgraph matching, machine learning, and graph processing. One slice of our prototype accelerator is capable of handling up to 1TB of data, and...
For numerous scientific applications, sparse matrix-vector multiplication (SpMV) is one of the most important kernels. Unfortunately, due to its very low ratio of computation to memory accesses, SpMV is inherently a memory-bound problem. On the other hand, the main memory bandwidth of commercial off-the-shelf (COTS) architectures is insufficient for the available computation resources on these platforms,...
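The memory-bound nature of SpMV is visible in its standard compressed sparse row (CSR) form: each nonzero costs one multiply-add but two data loads, one of them an indirect gather through the column index. A minimal sketch (the matrix values are an arbitrary example):

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x for a sparse matrix A stored in CSR format."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # one multiply-add, two loads (values[k] and the
            # indirectly addressed x[col_idx[k]])
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y

# 3x3 example: A = [[4, 0, 1], [0, 2, 0], [3, 0, 5]]
values  = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # → [5.0, 2.0, 8.0]
```

With roughly two flops per 12-16 bytes moved, the kernel saturates memory bandwidth long before it saturates the floating-point units, which is the imbalance the abstract refers to.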
The use of accelerators such as GPUs is increasing, but efficient use of GPUs requires good design choices. Such choices include the type of memory allocation and the overlapping of data transfers with parallel computation. Performance varies with the application, the hardware version (such as the GPU generation), and the software version, including programming-language drivers. This large number...
Real-time, low-latency image processing with high throughput is vital for many time-critical applications in fields such as medical imaging, robotics, and wearable computers. Traditionally, FPGAs have often been employed to meet these requirements. However, due to productivity challenges, using FPGAs may not be viable in some cases. Alternatively, the typical approach of processing an image on...
The Parallella is a hybrid computing platform that came into existence as the result of a Kickstarter project by Adapteva. It combines the high-performance, energy-efficient, manycore Epiphany chip (used as a co-processor) with a Zynq-7000 series chip, which normally runs a regular Linux OS, serves as the main processor, and implements "glue logic" in its internal...
Cloud computing is a new IT delivery paradigm that offers computing resources as on-demand services over the Internet. Like all forms of outsourcing, cloud computing raises serious concerns about the security of the data assets that are outsourced to cloud service providers. Security issues of cloud platforms have gradually drawn the attention of research institutions and various security companies...
This paper introduces a software policy for memory management in heterogeneous memory systems in order to improve the trade-offs between performance and power consumption, while attempting to make the best use of different characteristics of the underlying memory technologies. In this policy, the operating system and the application co-schedule page management in order to make informed decisions about...
In the past few years, nonlocal filters have emerged as a serious contender for denoising synthetic aperture radar (SAR) images, offering superior noise reduction and detail preservation compared to many other filters. In this manuscript we analyze how nonlocal filters, whose computational costs have so far been prohibitive for large-scale processing, can be implemented efficiently on graphics processing...
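The core of a nonlocal filter, in its basic nonlocal-means form, replaces each sample by a weighted average of all samples, weighted by patch similarity; the quadratic all-pairs loop is what made these filters expensive and, being embarrassingly parallel per output sample, what makes them a good fit for GPUs. A 1-D sketch, using plain nonlocal means rather than the SAR-specific variant the paper analyzes, with illustrative `patch_radius` and `h` values:

```python
import math

def nl_means_1d(signal, patch_radius=1, h=0.5):
    """Nonlocal means on a 1-D signal: each output sample is a weighted
    average over ALL input samples, with weights driven by the squared
    distance between the small patches centered at the two positions."""
    n = len(signal)
    out = []
    for i in range(n):                       # independent per output -> GPU-friendly
        num = den = 0.0
        for j in range(n):                   # all-pairs loop: O(n^2) cost
            d2 = 0.0
            for k in range(-patch_radius, patch_radius + 1):
                a = signal[min(max(i + k, 0), n - 1)]   # clamp at borders
                b = signal[min(max(j + k, 0), n - 1)]
                d2 += (a - b) ** 2
            w = math.exp(-d2 / (h * h))      # similar patches get weight ~1
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out

noisy = [0.0, 0.1, 0.0, 1.0, 0.9, 1.0]
print(nl_means_1d(noisy))
```

Since every output sample depends only on read-only input data, the outer loop maps directly onto one GPU thread per pixel, which is the parallelism the paper exploits.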
Error data collected at runtime play a key role for dependability analysis and improvement of software systems. The use of monitoring frameworks for legacy mission-critical systems is hindered by limited intervention degree and low intrusiveness requirements. We present the design and experimentation of an error monitoring service for a legacy large-scale critical system in the Air Traffic Control...
OpenCL is a high-level language that allows mixed hardware/software systems to be specified and compiled to run on heterogeneous parallel computing platforms. The hardware parallelism can take the form of multi-core central processing units (CPUs), massively parallel graphics processing units (GPUs), and, most recently, field-programmable gate array (FPGA) fabrics. OpenCL compilers for CPUs and GPUs...
The recent advent of stacked memory devices has led to a resurgence of research associated with the fundamental memory hierarchy and associated memory pipeline. The bandwidth advantages provided by stacked logic and DRAM devices have inspired research associated with eliminating the bandwidth bottlenecks associated with many applications in high performance computing. Further, recent efforts have focused...
Classification is one of the core tasks in machine learning and data mining. One of several classification models is classification rules, which use a set of if-then rules to describe a classification model. In this paper we present a set of FPGA-based compute kernels for accelerating classification rule induction. The kernels can be combined to perform specific procedures in the rule induction process,...
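An if-then rule set of the kind mentioned above can be sketched as an ordered decision list, where the first matching rule assigns the class and a default class catches everything else. This is a toy software illustration with made-up rules, not the FPGA kernels the paper presents:

```python
# Each rule is (condition, class); the first matching rule wins, and a
# default class acts as the fallback -- an ordered "decision list".
def classify(rules, default, sample):
    for condition, label in rules:
        if condition(sample):
            return label
    return default

# Hypothetical toy rule set over dict-shaped samples:
rules = [
    (lambda s: s["temp"] > 30 and s["humidity"] > 0.8, "storm"),
    (lambda s: s["temp"] > 30,                          "hot"),
]

print(classify(rules, "mild", {"temp": 35, "humidity": 0.9}))  # → storm
print(classify(rules, "mild", {"temp": 20, "humidity": 0.4}))  # → mild
```

Rule induction is the reverse problem: searching for a small rule list like this that fits labeled training data, and it is the repetitive candidate-evaluation step of that search that FPGA kernels can accelerate.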