Search results

chapter

FatMan vs. LittleBoy: Scaling Up Linear Algebraic Operations in Scale-Out Data Platforms

Luna Xu, Seung-Hwan Lim, Ali R. Butt, Sreenivas R. Sukumar, more

2016 1st Joint International Workshop on Parallel Data Storage and data Intensive Scalable Computing Systems (PDSW-DISCS) > 25 - 30

Linear algebraic operations such as matrix manipulations form the kernel of many machine learning and other crucial algorithms. Scaling up as well as scaling out such algorithms is highly desirable to enable efficient processing over millions of data points. To this end, we present a matrix manipulation approach to effectively scale up each node in a scale-out data parallel platform such as Apache...

chapter

Boda-RTC: Productive generation of portable, efficient code for convolutional neural networks on mobile computing platforms

Matthew W. Moskewicz, Forrest N. Iandola, Kurt Keutzer

2016 IEEE 12th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob) > 1 - 10

The popularity of neural networks (NNs) spans academia [1], industry [2], and popular culture [3]. In particular, convolutional neural networks (CNNs) have been applied to many image based machine learning tasks and have yielded strong results [4]. The availability of hardware/software systems for efficient training and deployment of large and/or deep CNN models is critical for the continued success...

chapter

vNFChain: A VM-Dedicated Fast Service Chaining Framework for Micro-VNFs

Ryota Kawashima, Hiroshi Matsuo

2016 Fifth European Workshop on Software-Defined Networks (EWSDN) > 13 - 18

Network Functions Virtualization (NFV) has been expected to flexibly compose Virtual Network Functions (VNFs) by virtualizing existing network appliances and logically chaining them. Currently used VNFs are realized as VM-based appliances and shared by multiple users (VMs). However, the notion of NFV can be extended to reinforce network functionality of user VMs by introducing VM-dedicated VNFs. In...

chapter

Acceleration of finite element method for 3D DC resistivity modeling using multi-GPU

Hairil Anwar, Achmad Imam Kistijantoro

2016 International Conference on Information Technology Systems and Innovation (ICITSI) > 1 - 5

In this paper, the finite element method for 3D DC resistivity modeling is accelerated using multiple GPUs (Graphics Processing Units). Solving the large system of linear equations is the most expensive computation in the finite element method, so it is performed on GPUs to reduce the computational time. A conjugate gradient solver is used to solve the large system of linear equations. We developed kernel for conjugate gradient...

chapter

Hi-BoX: A generic library of fast solvers for boundary element methods

Toufic Abboud, Denis Barbier

2016 IEEE Conference on Antenna Measurements & Applications (CAMA) > 1 - 4

Hi-BoX is a generic library implementing state-of-the-art fast direct and iterative solvers for existing codes based on Boundary Element Method (BEM) or Method of Moments (MoM). It benefits from recent advances in numerical methods, linear algebra and High Performance Computing (HPC). This includes new advances in H-matrix and Fast Multipole Method (FMM) and their hybridization, new approaches for...

chapter

Recommending Code Changes for Automatic Backporting of Linux Device Drivers

Ferdian Thung, Xuan-Bach D. Le, David Lo, Julia Lawall

2016 IEEE International Conference on Software Maintenance and Evolution (ICSME) > 222 - 232

Device drivers are essential components of any operating system (OS). They specify the communication protocol that allows the OS to interact with a device. However, drivers for new devices are usually created for a specific OS version. These drivers often need to be backported to the older versions to allow use of the new device. Backporting is often done manually, and is tedious and error prone....

chapter

Towards a GPU Abstraction for Lua

Raphael Ribeiro, Paulo Motta

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) > 13 - 18

The use of GPUs for accelerating parallel applications is a consolidated approach. However, it is still difficult to write applications for this type of hardware, which is mostly done in compiled languages like C. Some effort has been employed to provide developers with libraries and frameworks for interpreted languages to be able to take advantage of the computing capabilities of GPUs. In this context...

chapter

Design and Implementation for Checkpointing of Distributed Resources Using Process-Level Virtualization

Kapil Arya, Rohan Garg, Artem Y. Polyakov, Gene Cooperman

2016 IEEE International Conference on Cluster Computing (CLUSTER) > 402 - 412

System-level checkpoint-restart is a critical technology for long-running jobs in high-performance computing. Yet, only two approaches to checkpointing MPI applications continue to survive in wide use today. One approach is to use the kernel module-based BLCR in combination with an MPI checkpoint-restart service particular to the MPI implementation in use. Unfortunately, this lacks support for some...

chapter

Designing and Enabling Simulation of Real-World GPU Network Applications with ns-3 and DCE

Jared Ivey, George Riley, Brian Swenson, Margaret Loper

2016 IEEE 24th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS) > 445 - 450

The ability to execute the original source code for network protocols and applications within a network simulation environment frees the simulation modeler from the time consuming task of having to create, test and debug models representing these applications. This work extends the functionality of the Direct Code Execution (DCE) framework of ns-3 by incorporating the ability to call NVIDIA CUDA kernels...

chapter

Faster Method for Tuning the Tile Size for Tile Matrix Decomposition

Tomohiro Suzuki

2016 IEEE 10th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC) > 329 - 336

Matrices are frequently decomposed in various ways in order to meet the conditions of an application, and therefore, algorithms for doing this are very important in the field of numerical linear algebra. In the tile algorithm, it is very critical to find a tile size that is suitable for the size of the matrix and the run-time environment. Smaller tiles can generate many fine-grained tasks. This can...

chapter

A scale-free structure for power-law graphs

Richard Veras, Tze Meng Low, Franz Franchetti

2016 IEEE High Performance Extreme Computing Conference (HPEC) > 1 - 7

Many real-world graphs, such as those that arise from the web, biology and transportation, appear random and without a structure that can be exploited for performance on modern computer architectures. However, these graphs have a scale-free graph topology that can be leveraged for locality. Existing sparse data formats are not designed to take advantage of this structure. They focus primarily on reducing...

chapter

Welcome to Binder: A kernel level attack model for the Binder in Android operating system

Majid Salehi, Farid Daryabar, Mohammad Hesam Tadayon

2016 8th International Symposium on Telecommunications (IST) > 156 - 161

In this paper, we seek vulnerabilities and conduct possible attacks on the crucial and essential parts of the Android OS architecture, including the framework and the Android kernel layers. In this regard, we explain the Binder component of Android OS from a security point of view. Then, we demonstrate how to penetrate the Binder and control the data exchange mechanism in Android OS by proposing...

chapter

FLoW-Linux: Virtualization distribution scheme for fault tolerant and system enhancement

Imaduddin Mukhtar, Adhe Widianjaya, J. Michael Saputra, Tito Pramudana, more

2016 International Electronics Symposium (IES) > 426 - 431

The development of microkernels has sharply increased. One of the most successful microkernel implementations is L4. L4Linux is an L4 version that is able to run virtualized Linux. We have also built our own microkernel, named FLoW. In this paper we describe our achievement in developing virtualized Linux on top of our FLoW microkernel. We implemented a unique design for virtualizing more than one Linux...

chapter

Generation of the Single Precision BLAS Library for the Parallella Platform, with Epiphany Co-processor Acceleration, Using the BLIS Framework

Miguel Tasende

2016 IEEE 14th Intl Conf on Dependable, Autonomic and Secure Computing, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress(DASC/PiCom/DataCom/CyberSciTech) > 894 - 897

The Parallella is a hybrid computing platform that came into existence as the result of a Kickstarter project by Adapteva. It is composed of the high performance, energy-efficient, manycore architecture, Epiphany chip (used as co-processor) and one Zynq-7000 series chip, which normally runs a regular Linux OS version, serves as the main processor, and implements "glue logic" in its internal...

chapter

Thread execution on embedded processor - ARM9 core in Embedded Linux environment

Bhairavi N. Savant, Shubhangi M. Deshmukh, Surekha K S Hegde

2016 International Conference on Computing Communication Control and automation (ICCUBEA) > 1 - 5

In any operating system, processes do not share resources well, and there is a high context-switching overhead. A thread (or lightweight process), by contrast, is a basic unit of CPU utilization and comprises a thread identifier (ID), program counter, register set and stack space. A thread within the process shares its code section, data section, and other operating-system resources, such...

chapter

Improved file system security through restrictive access

Navneet Kaur, Maninder Singh

2016 International Conference on Inventive Computation Technologies (ICICT) > 3 > 1 - 5

Security is a prime concern in today's era of technology when dealing with digital data. All the information is managed by the file system which is the core layer of security in an Operating System. Due to lack of security at this layer, private information can be accessed by an intruder or in case of theft data can be read via mounting it on to a mount point and accessing the information. Other layer...

chapter

A sparse approach to fault severity classification for gearbox monitoring

Chuang Sun, Peng Wang, Ruqiang Yan, Robert X. Gao

2016 19th International Conference on Information Fusion (FUSION) > 2303 - 2308

Fault detection and severity classification are critical to gearbox structural health monitoring. A common approach to fault severity classification is to identify the patterns associated with features extracted from raw sensor data that vary with fault deterioration. However, since features represent only partial information contained in the raw data, they may indicate different interactions as faults...

chapter

numap: A portable library for low-level memory profiling

Manuel Selva, Lionel Morel, Kevin Marquet

2016 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS) > 55 - 62

The memory subsystem of modern multi-core architectures is becoming more and more complex with the increasing number of cores integrated in a single computer system. This complexity creates a need for profiling that lets software developers understand how programs use the memory subsystem. Modern processors come with hardware profiling features to help build tools for these profiling needs. Regarding...

chapter

Automated dataflow graph merging

Nils Voss, Stephen Girdlestone, Oskar Mencer, Georgi Gaydadjiev

2016 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS) > 219 - 226

In this paper we present several algorithms used to construct a tool that automatically optimizes static dataflow graphs for the purpose of high level hardware synthesis. Our target is to automatically merge multiple dataflow graphs in order to create a single structure implementing all distinct operations with minimal area overhead by time-slicing hardware resources. We show that a combination of...

chapter

CID: A Compile-Time Implementation Decider for Heterogeneous Platforms Based on C++ Attributes

Luis Miguel Sanchez, David del Rio Astorga, Manuel F. Dolz, Javier Fernandez

2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld) > 1149 - 1156

With the emergence of heterogeneous architectures, the development of parallel software has become an increasingly complex issue. The fact of using multiple programming models targeted to specific devices has turned the implementation process into a challenging task that comes along with a variety of difficulties. In this sense, developers are preoccupied with finding ways to alleviate the burden...

INFONA - science communication portal
