Search results

chapter

Evaluating weightless neural networks for bias identification on news

Rafael Dutra Cavalcanti, Priscila M.V. Lima, Massimo De Gregorio, Daniel Sadoc Menasche

2017 IEEE 14th International Conference on Networking, Sensing and Control (ICNSC) > 257 - 262

Identifying biases in articles published in the news media is one of the most fundamental problems in the realm of journalism and communication, and automatic mechanisms for detecting that a piece of news is biased have been studied for decades. In this paper, we compare the WiSARD classifier, a lightweight efficient weightless neural network architecture, against Logistic Regression, Gradient Tree...

chapter

Power Efficient Sharing-Aware GPU Data Management

Abdulaziz Tabbakh, Murali Annavaram, Xuehai Qian

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) > 698 - 707

The power consumed by the memory system in GPUs is a significant fraction of the total chip power. As thread-level parallelism increases, GPUs are likely to stress cache and memory bandwidth even more, thereby exacerbating power consumption. We observe that neighboring concurrent thread arrays (CTAs) within GPU applications share a considerable amount of data. However, the default GPU scheduling policy...

chapter

MOCHA: Morphable Locality and Compression Aware Architecture for Convolutional Neural Networks

Syed Mohammad Asad Hassan Jafri, Ahmed Hemani, Kolin Paul, Naeem Abbas

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) > 276 - 286

Today, machine learning based on neural networks has become mainstream in many application domains. A small subset of machine learning algorithms, called Convolutional Neural Networks (CNNs), are considered state-of-the-art for many applications (e.g. video/audio classification). The main challenge in implementing CNNs in embedded systems is their large computation, memory, and bandwidth...

chapter

Argo NodeOS: Toward Unified Resource Management for Exascale

Swann Perarnau, Judicael A. Zounmevo, Matthieu Dreher, Brian C. Van Essen, et al.

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) > 153 - 162

Exascale systems are expected to feature hundreds of thousands of compute nodes with hundreds of hardware threads and complex memory hierarchies with a mix of on-package and persistent memory modules. In this context, the Argo project is developing a new operating system for exascale machines. Targeting production workloads using workflows or coupled codes, we improve the Linux kernel on several fronts...

chapter

Latency Tails of Byte-Addressable Non-Volatile Memories in Systems

Chao Sun, Damien Le Moal, Qingbo Wang, Robert Mateescu, et al.

2017 IEEE International Memory Workshop (IMW) > 1 - 4

Next generation non-volatile memories, like Resistive RAM, Spin-Transfer Torque Magnetic RAM and Phase Change Memory, are byte-addressable with very low latency, bridging the large performance gap between DRAM memory and NAND flash storage. For this reason we think of them as Storage Class Memories (SCMs), meaning their main use could ideally be as main memory but the non-volatility and high density...

chapter

An Evaluation of the NVIDIA TX1 for Supporting Real-Time Computer-Vision Workloads

Nathan Otterness, Ming Yang, Sarah Rust, Eunbyung Park, et al.

2017 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS) > 353 - 364

Autonomous vehicles are an exemplar for forward-looking safety-critical real-time systems where significant computing capacity must be provided within strict size, weight, and power (SWaP) limits. A promising way forward in meeting these needs is to leverage multicore platforms augmented with graphics processing units (GPUs) as accelerators. Such an approach is being strongly advocated by NVIDIA,...

chapter

A novel zero weight/activation-aware hardware architecture of convolutional neural network

Dongyoung Kim, Junwhan Ahn, Sungjoo Yoo

Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017 > 1462 - 1467

It is imperative to accelerate convolutional neural networks (CNNs) due to their ever-widening application areas, from server and mobile to IoT devices. Based on the fact that CNNs can be characterized by a significant amount of zero values in both kernel weights and activations, we propose a novel hardware accelerator for CNNs exploiting zero weights and activations. We also report a zero-induced load...

chapter

Accelerating read atomic multi-partition transaction with remote direct memory access

Naofumi Murata, Hideyuki Kawashima, Osamu Tatebe

2017 IEEE International Conference on Big Data and Smart Computing (BigComp) > 239 - 246

Many applications these days require data processing that is both efficient and reliable. Distributed databases are one way to meet these requirements, but must be updated using distributed transactions. To manage foreign key constraints, secondary indices, and materialized views in distributed environments, read atomic multi-partition (RAMP) transactions demonstrate high efficiency. RAMP transactions...

chapter

Efficient Bayesian yield optimization approach for analog and SRAM circuits

Mengshuo Wang, Fan Yang, Changhao Yan, Xuan Zeng, et al.

2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC) > 1 - 6

Conventional yield optimization approaches rely on accurate yield estimation for given design parameters, which can be computationally intensive. In this paper, a novel Bayesian yield optimization approach is proposed for analog and SRAM circuits. An equivalent problem is formulated by applying Bayes' theorem to the augmented yield problem. The yield optimization problem is converted to identifying...

chapter

Taming memory related performance pitfalls in linux Cgroups

Zhenyun Zhuang, Cuong Tran, Jerry Weng, Haricharan Ramachandra, et al.

2017 International Conference on Computing, Networking and Communications (ICNC) > 531 - 535

The Linux kernel feature Cgroups (Control Groups) is being increasingly adopted for running applications in multi-tenant environments. Many projects (e.g., Docker) rely on Cgroups to isolate resources such as CPU and memory. It is critical to ensure high performance for such deployments. At LinkedIn, we have been using Cgroups and have investigated their performance. This work presents our findings about...

chapter

A Memory Accessing Method for the Parallel Aho-Corasick Algorithm on GPU

JinMyung Yoon, Kang-Il Choi, HyunJin Kim

2016 International Conference on Information Science and Security (ICISS) > 1 - 3

In this paper, we propose a memory accessing method for the Parallel Failureless Aho-Corasick (PFAC) algorithm that takes the Graphics Processing Unit (GPU) memory architecture into account to improve throughput. Compared with the Aho-Corasick (AC) algorithm on a Central Processing Unit (CPU) and Data-Parallel Aho-Corasick (DPAC) using Open Multi-Processing (OpenMP), PFAC on a GPU achieves a large performance improvement...

chapter

Analyzing and Comparing Android HTC Aria, Apple iPhone 3G, and Windows Mobile HTC TouchPro 6850

F. Chevonne Thomas Dancer

2016 International Conference on Computational Science and Computational Intelligence (CSCI) > 1037 - 1042

Digital Forensics is a field of computer science that aids in determining what may or may not have occurred during some computer task. The bit-by-bit concept satisfies computer media, but it does not apply to smartphones. One experiment was designed using three devices, Android HTC Aria, Apple iPhone 3G, and Windows Mobile HTC TouchPro 6850. These experiments compare and contrast the device by carrier,...

chapter

Forensic reconstruction of executables from Windows 7 physical memory

S Dija, G S Suma, Dagma D Gonsalvez, Arun T Pillai

2016 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) > 1 - 5

Memory forensics has become indispensable in cyber forensics investigation, as the Random Access Memory, or physical memory, of a computer holds crucial evidence that is not available on hard disks or in other non-volatile storage media. This is because most malware today is memory resident and leaves no footprints in hard disk storage. In this paper, a novel methodology is described for...

chapter

Hugepage & Swappiness functions for optimization of the search graph algorithm using Hadoop framework

Sunita Choudhary, Preeti Narooka

2016 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) > 1 - 5

The 21st century is best known as the technology century, and advances in technology have helped build close-knit networks of people around the world. With the advent of the internet and social networking sites, connecting with people is a click away. These advancements warranted deep research in social networking [1] and its related technologies. The social networking sites store humongous data and when a...

chapter

Always-on motion detection with application-level error control on a near-threshold approximate computing platform

Giuseppe Tagliavini, Andrea Marongiu, Davide Rossi, Luca Benini

2016 IEEE International Conference on Electronics, Circuits and Systems (ICECS) > 552 - 555

Pushing supply voltages in the near-threshold region is today one of the main avenues to minimize power consumption in digital integrated circuits. This works well with logic units, but memory operations on standard six-transistor static RAM (6T-SRAM) cells become unreliable at low voltages. Standard cell memory (SCM) works fully reliably at near-threshold voltages, but has much lower area density...

chapter

Memory access pattern based insider threat detection in big data systems

Santosh Aditham, Nagarajan Ranganathan, Srinivas Katkoori

2016 IEEE International Conference on Big Data (Big Data) > 3625 - 3628

Big data platforms like Hadoop and Spark are being widely adopted both by academia and industry. In this paper, we propose a runtime intrusion detection technique that understands and works according to the memory properties of such distributed compute platforms. The proposed method is based on runtime analysis of memory access patterns of tasks running on the slave nodes of a distributed compute...

chapter

Emulating an Octeon MIPS64 based embedded system on X86 in QEMU

Muhammad Amir Mehmood, Qurrat Ul Ain, Ayaz Akram, Abdul Qadeer, et al.

2016 19th International Multi-Topic Conference (INMIC) > 1 - 7

Embedded systems are proliferating with their growing hardware capabilities. Their application areas include internet of things, cellular devices, network devices, etc. Application development and testing natively on such embedded hardware is expensive, time consuming, and challenging. In this case, system emulation is a cost-effective alternative. We have extended Quick Emulator (QEMU) to support...

chapter

FPGA implementation of the coupled filtering method

Chen Zhang, Tianzhu Liang, Philip K.T. Mok, Weichuan Yu

2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) > 435 - 442

In ultrasound image analysis, speckle tracking methods are widely applied to study the elasticity of body tissue. However, “feature-motion decorrelation” remains a challenge for speckle tracking methods. Recently, a coupled filtering method was proposed to accurately estimate strain values when the tissue deformation is large. The major drawback of the new method is its high computational...

chapter

ZNNi: Maximizing the Inference Throughput of 3D Convolutional Networks on CPUs and GPUs

Aleksandar Zlateski, Kisuk Lee, H. Sebastian Seung

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 854 - 865

Sliding window convolutional networks (ConvNets) have become a popular approach to computer vision problems such as image segmentation and object detection and localization. Here we consider the parallelization of inference, i.e., the application of a previously trained ConvNet, with emphasis on 3D images. Our goal is to maximize throughput, defined as the number of output voxels computed per unit...

chapter

Parallelization of GST algorithm for source code similarity detection

Marko J. Misic, Dusan V. Nikolov, Jelica Z. Protic, Milo V. Tomasevic

2016 24th Telecommunications Forum (TELFOR) > 1 - 4

Source code is a frequent target for plagiarism in massive computing courses. Plagiarism detection requires a significant effort from the teaching staff, thus software tools have been used to detect similar source codes. This paper examines parallelization of source code similarity detection based on Greedy-String-Tiling and Karp-Rabin algorithms. CPU implementation is parallelized using Pthreads,...

INFONA - science communication portal
