Linear algebraic operations such as matrix manipulations form the kernel of many machine learning and other crucial algorithms. Scaling up as well as scaling out such algorithms is highly desirable to enable efficient processing over millions of data points. To this end, we present a matrix manipulation approach to effectively scale up each node in a scale-out data parallel platform such as Apache...
The popularity of neural networks (NNs) spans academia [1], industry [2], and popular culture [3]. In particular, convolutional neural networks (CNNs) have been applied to many image based machine learning tasks and have yielded strong results [4]. The availability of hardware/software systems for efficient training and deployment of large and/or deep CNN models is critical for the continued success...
Network Functions Virtualization (NFV) has been expected to flexibly compose Virtual Network Functions (VNFs) by virtualizing existing network appliances and logically chaining them. Currently used VNFs are realized as VM-based appliances and shared by multiple users (VMs). However, the notion of NFV can be extended to reinforce network functionality of user VMs by introducing VM-dedicated VNFs. In...
In this paper, the finite element method for 3D DC resistivity modeling is accelerated using multiple GPUs (Graphics Processing Units). Solving the large system of linear equations is the most expensive computation in the finite element method, so it is performed on GPUs to reduce the computational time. A conjugate gradient solver is used to solve the large system of linear equations. We developed a kernel for conjugate gradient...
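For readers unfamiliar with the conjugate gradient method named in this abstract, the following is a minimal CPU sketch in plain Python. It is not the paper's GPU kernel: the function name `conjugate_gradient`, the dense list-of-lists matrix, and the tolerance defaults are all assumptions made for illustration, and the method as written assumes a symmetric positive-definite system.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A (illustrative sketch).

    Hypothetical helper, not taken from the paper.
    """
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with the initial guess x = 0
    p = list(r)            # first search direction is the residual
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break          # residual norm is small enough
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                           # keeps directions A-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Usage: a small 2x2 symmetric positive-definite system.
solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(solution)
```

The matrix-vector product inside the loop dominates the cost of each iteration, which is why it is the natural candidate for offloading to a GPU in a setting like the one the abstract describes.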
Hi-BoX is a generic library implementing state-of-the-art fast direct and iterative solvers for existing codes based on Boundary Element Method (BEM) or Method of Moments (MoM). It benefits from recent advances in numerical methods, linear algebra and High Performance Computing (HPC). This includes new advances in H-matrix and Fast Multipole Method (FMM) and their hybridization, new approaches for...
Device drivers are essential components of any operating system (OS). They specify the communication protocol that allows the OS to interact with a device. However, drivers for new devices are usually created for a specific OS version. These drivers often need to be backported to the older versions to allow use of the new device. Backporting is often done manually, and is tedious and error prone....
The use of GPUs for accelerating parallel applications is a consolidated approach. However, it is still difficult to write applications for this type of hardware, which is mostly done in compiled languages like C. Some effort has been employed to provide developers with libraries and frameworks for interpreted languages to be able to take advantage of the computing capabilities of GPUs. In this context...
System-level checkpoint-restart is a critical technology for long-running jobs in high-performance computing. Yet, only two approaches to checkpointing MPI applications continue to survive in wide use today. One approach is to use the kernel module-based BLCR in combination with an MPI checkpoint-restart service particular to the MPI implementation in use. Unfortunately, this lacks support for some...
The ability to execute the original source code for network protocols and applications within a network simulation environment frees the simulation modeler from the time consuming task of having to create, test and debug models representing these applications. This work extends the functionality of the Direct Code Execution (DCE) framework of ns-3 by incorporating the ability to call NVIDIA CUDA kernels...
Matrices are frequently decomposed in various ways in order to meet the conditions of an application, and therefore, algorithms for doing this are very important in the field of numerical linear algebra. In the tile algorithm, it is very critical to find a tile size that is suitable for the size of the matrix and the run-time environment. Smaller tiles can generate many fine-grained tasks. This can...
Many real-world graphs, such as those that arise from the web, biology and transportation, appear random and without a structure that can be exploited for performance on modern computer architectures. However, these graphs have a scale-free graph topology that can be leveraged for locality. Existing sparse data formats are not designed to take advantage of this structure. They focus primarily on reducing...
In this paper, we seek vulnerabilities in, and conduct possible attacks on, the crucial and essential parts of the Android OS architecture, including the framework and the Android kernel layers. In this regard, we explain the Binder component of Android OS from a security point of view. Then, we demonstrate how to penetrate the Binder and control the data exchange mechanism in Android OS by proposing...
The development of microkernels has sharply increased. One of the most successful microkernel implementations is L4. L4Linux is an L4 version that is able to run virtualized Linux. We have also built our own microkernel, named FLoW. In this paper, we describe our achievements in developing virtualized Linux on top of our FLoW microkernel. We implemented a unique design for virtualizing more than one Linux...
The Parallella is a hybrid computing platform that came into existence as the result of a Kickstarter project by Adapteva. It is composed of the high-performance, energy-efficient, manycore Epiphany chip (used as a co-processor) and one Zynq-7000 series chip, which normally runs a regular Linux OS version, serves as the main processor, and implements "glue logic" in its internal...
In any operating system, processes do not share resources well, and there is a high context-switching overhead. A thread (or lightweight process), by contrast, is a basic unit of CPU utilization and comprises a thread identifier (ID), program counter, register set, and stack space. A thread within a process shares its code section, data section, and other operating-system resources, such...
Security is a prime concern in today's era of technology when dealing with digital data. All information is managed by the file system, which is the core layer of security in an operating system. Due to a lack of security at this layer, private information can be accessed by an intruder or, in case of theft, data can be read by mounting it onto a mount point and accessing the information. Other layer...
Fault detection and severity classification are critical to gearbox structural health monitoring. A common approach to fault severity classification is to identify the patterns associated with features extracted from raw sensor data that vary with fault deterioration. However, since features represent only partial information contained in the raw data, they may indicate different interactions as faults...
The memory subsystem of modern multi-core architectures is becoming more and more complex with the increasing number of cores integrated in a single computer system. This complexity creates a need for profiling that lets software developers understand how programs use the memory subsystem. Modern processors come with hardware profiling features to help build tools for these profiling needs. Regarding...
In this paper we present several algorithms used to construct a tool that automatically optimizes static dataflow graphs for the purpose of high level hardware synthesis. Our target is to automatically merge multiple dataflow graphs in order to create a single structure implementing all distinct operations with minimal area overhead by time-slicing hardware resources. We show that a combination of...
With the emergence of heterogeneous architectures, the development of parallel software has become an increasingly complex issue. The fact of using multiple programming models targeted to specific devices has turned the implementation process into a challenging task that comes along with a variety of difficulties. In this sense, developers are preoccupied with finding ways to alleviate the burden...