Items from 1 to 20 out of 26 results

chapter

Improved Stereo Matching with Constant Highway Networks and Reflective Confidence Learning

Amit Shaked, Lior Wolf

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 6901 - 6910

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

We present an improved three-step pipeline for the stereo matching problem and introduce multiple novelties at each stage. We propose a new highway network architecture for computing the matching cost at each possible disparity, based on multilevel weighted residual shortcuts, trained with a hybrid loss that supports multilevel comparison of image patches. A novel post-processing step is then introduced,...

chapter

x86 computer architecture simulators: A comparative study

Ayaz Akram, Lina Sawalha

2016 IEEE 34th International Conference on Computer Design (ICCD) > 638 - 645

2016 IEEE 34th International Conference on Computer Design (ICCD)

The significance of computer architecture simulators in advancing computer architecture research is widely acknowledged. Computer architects have developed numerous simulators in the past few decades and their number continues to rise. This paper explores different simulation techniques and surveys many x86 simulators. Comparing simulators with each other and validating their correctness has been...

chapter

SSDUP: An Efficient SSD Write Buffer Using Pipeline

Ming Li, Xuanhua Shi, Wei Liu, Hai Jin, et al.

2016 IEEE International Conference on Cluster Computing (CLUSTER) > 166 - 167

2016 IEEE International Conference on Cluster Computing (CLUSTER)

High performance computing (HPC) applications are becoming more data-intensive and produce increasingly large I/O demands on storage systems. New storage devices such as SSDs, which have nearly no seek latency and high throughput, have been widely used together with HDDs in hybrid storage systems. To solve the I/O bottleneck problem, existing hybrid storage solutions such as Burst Buffer have...

chapter

Incorporating benchmark programming in the teaching of undergraduate Computer Architecture

James R. Moulic, Jacob D. See

2015 IEEE 7th International Conference on Engineering Education (ICEED) > 1 - 5

2015 IEEE 7th International Conference on Engineering Education (ICEED)

Advanced Computer Architecture is an upper-level required course offered by the Department of Computer Science and Engineering at the University of Alaska-Anchorage (UAA). Course content is structured to provide students with a qualitative and quantitative approach to computer architecture, which addresses both the hardware and software aspects of parallelism in modern computing systems. Historically,...

chapter

Reconfigurable Dynamic Trusted Platform Module for Control Flow Checking

Sanjeev Das, Wei Zhang, Yang Liu

2014 IEEE Computer Society Annual Symposium on VLSI > 166 - 171

2014 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Trusted Platform Module (TPM) has gained its popularity in computing systems as a hardware security approach. TPM provides the boot time security by verifying the platform integrity including hardware and software. However, once the software is loaded, TPM can no longer protect the software execution. In this work, we propose a dynamic TPM design, which performs control flow checking to protect the...

chapter

Minimally buffered single-cycle deflection router

Gnaneswara Rao Jonna, John Jose, Rachana Radhakrishnan, Madhu Mutyam

2014 Design, Automation & Test in Europe Conference & Exhibition (DATE) > 1 - 4

2014 Design, Automation & Test in Europe Conference & Exhibition (DATE)

With the drift from computation centric designs to communication centric designs in the Chip Multi Processor (CMP) era, the interconnect fabric is gaining more importance. An efficient NoC in terms of power, area and average flit latency has a huge impact on the overall performance of a CMP. In the current work, we propose MinBSD — a minimally buffered, single cycle, deflection router. It incorporates...

chapter

Automatic Extraction of pipeline parallelism for embedded heterogeneous multi-core platforms

Daniel Cordes, Michael Engel, Olaf Neugebauer, Peter Marwedel

2013 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES) > 1 - 10

2013 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES)

Automatic parallelization of sequential applications is the key for efficient use and optimization of current and future embedded multi-core systems. However, existing approaches often fail to achieve efficient balancing of tasks running on heterogeneous cores of an MPSoC. A reason for this is often insufficient knowledge of the underlying architecture's performance. In this paper, we present a novel...

chapter

EJOP: An Extensible Java Processor with Reasonable Performance/Flexibility Trade-off

Samaneh Talebi, Niloofar Abolghasemi, Ali Jahanian

2012 15th Euromicro Conference on Digital System Design > 415 - 418

2012 15th Euromicro Conference on Digital System Design (DSD)

Architectural advances in hardware implementations of Java increase performance. Java processors reduce the execution-time and memory-access overhead of traditional JVM implementations in embedded systems. To improve the performance of Java processors and decrease execution time, we customize a processor called JOP. We design a Reconfigurable Functional Unit (RFU) which...

chapter

A cycle-level SIMT-GPU simulation framework

Po-Han Wang, Chien-Wei Lo, Chia-Lin Yang, Yu-Jung Cheng

2012 IEEE International Symposium on Performance Analysis of Systems & Software > 114 - 115

2012 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS)

The massive parallelism provided by modern graphics processing units (GPUs) makes them attractive processors for accelerating applications with high data-level parallelism. Therefore, GPU architecture has recently gained a lot of attention in the research community. However, advances in GPU architecture are impeded by the limited documentation released by the major GPU vendors. Furthermore,...

article

Portable, Flexible, and Scalable Soft Vector Processors

Peter Yiannacouras, J. Gregory Steffan, Jonathan Rose

IEEE Transactions on Very Large Scale Integration (VLSI) Systems > 2012 > 20 > 8 > 1429 - 1442

Field-programmable gate arrays (FPGAs) are increasingly used to implement embedded digital systems; however, the hardware design necessary to do so is time-consuming and tedious. The amount of hardware design can be reduced by employing a microprocessor for less-critical computation in the system. Often this microprocessor is implemented using the FPGA reprogrammable fabric as a soft processor which...

chapter

X86-ARM binary hardware interpreter

Hussein Karaki, Haitham Akkary, Shahrokh Shahidzadeh

2011 18th IEEE International Conference on Electronics, Circuits, and Systems > 145 - 148

2011 18th IEEE International Conference on Electronics, Circuits and Systems - (ICECS 2011)

In the computer hardware industry, there are currently two highly successful instruction set architectures (ISAs): the CISC x86 ISA which is an established standard architecture in the personal computer and server markets, and the RISC ARM ISA which has become the standard in the fast growing ultra-mobile computing devices market, such as smart-phones and tablets. Program binaries that run on one...

chapter

PEPSC: A Power-Efficient Processor for Scientific Computing

Ganesh Dasika, Ankit Sethia, Trevor Mudge, Scott Mahlke

2011 International Conference on Parallel Architectures and Compilation Techniques > 101 - 110

2011 International Conference on Parallel Architectures and Compilation Techniques (PACT)

The rapid advancements in the computational capabilities of the graphics processing unit (GPU) as well as the deployment of general programming models for these devices have made the vision of a desktop supercomputer a reality. It is now possible to assemble a system that provides several TFLOPs of performance on scientific applications for the cost of a high-end laptop computer. While these devices...

chapter

How sensitive is processor customization to the workload's input datasets?

Maximilien Breughe, Zheng Li, Yang Chen, Stijn Eyerman, et al.

2011 IEEE 9th Symposium on Application Specific Processors (SASP) > 1 - 7

2011 IEEE 9th Symposium on Application Specific Processors (SASP)

Hardware customization is an effective approach for meeting application performance requirements while achieving high levels of energy efficiency. Application-specific processors achieve high performance at low energy by tailoring their designs towards a specific workload, i.e., an application or application domain of interest. A fundamental question that has remained unanswered so far though is to...

chapter

Control Independence Using Dual Renaming

Lin Meng, Shigeru Oyanagi

2010 First International Conference on Networking and Computing > 264 - 267

2010 First International Conference on Networking and Computing (ICNC 2010)

Modern superscalar processors squash all wrong-path instructions when a branch is mispredicted. In deeper pipelines, the branch misprediction penalty increases considerably owing to the large number of squashed instructions. Exploiting control independence has been proposed for reducing this penalty. The control independence method reuses control-independent instructions (CI instructions) without squashing...

chapter

A Load-Forwarding Mechanism for the Vector Architecture in Multimedia Applications

Ye Gao, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools > 412 - 415

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools (DSD)

Nowadays, multimedia applications (MMAs) form an important workload for general-purpose processors. Although the vector architecture is considered the most promising candidate for media processing, the traditional vector architecture executes MMAs inefficiently. This paper proposes a media-oriented vector architecture, which improves the traditional one with a load-forwarding mechanism. The...

chapter

Permutation optimization for SIMD devices

Libo Huang, Li Shen, Zhiying Wang

Proceedings of 2010 IEEE International Symposium on Circuits and Systems > 3849 - 3852

2010 IEEE International Symposium on Circuits and Systems. ISCAS 2010

Single-instruction-multiple-data (SIMD) devices have been widely incorporated into baseline instruction level parallelism (ILP) processors to enable more efficient data level parallelism (DLP) support. This paper addresses the unsolved problem of the need to permute the SIMD elements packed in registers for maximum parallelism performance. An implicit data permutation (IDP) mechanism is proposed for...

chapter

Domain specific architecture for next generation wireless communication

Botao Zhang, Hengzhu Liu, Heng Zhao, Fangzheng Mo, et al.

2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010) > 1414 - 1419

2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)

In order to solve the challenges in processor design for the next generation wireless communication systems, this paper first proposes a system level design flow for communication domain specific processor, and then proposes a novel processor architecture for the next generation wireless communication named GAEA using this design flow. GAEA is a shared memory multi-core SoC based on Software Controlled...

chapter

Light Speed Labeling for RISC architectures

L. Lacassagne, B. Zavidovique

2009 16th IEEE International Conference on Image Processing (ICIP) > 3245 - 3248

2009 16th IEEE International Conference on Image Processing (ICIP 2009)

This article introduces a fast algorithm for Connected Component Labeling of binary images called Light Speed Labeling. It is a segment-based, line-relative labeling algorithm designed especially for RISC computers. An extensive benchmark on both structured and unstructured images substantiates that the algorithm, the way it is designed, is faster and more runtime predictable than Wu's algorithm...

chapter

Architecture design of variable lengths instructions expansion for VLIW

Yuan Liu, Hu He, Teng Xu

2009 IEEE 8th International Conference on ASIC > 29 - 32

2009 IEEE 8th International Conference on ASIC (ASICON)

In current instruction set architecture (ISA) design, fixed-length instructions improve the efficiency of instruction dispatching. But in embedded computers, where memory is limited, variable-length instructions cost much less memory. In this VLIW (very long instruction word) architecture, a two-stage pipeline is used to expand and dispatch the variable-length instructions...

chapter

Characterizing the TLB Behavior of Emerging Parallel Workloads on Chip Multiprocessors

A. Bhattacharjee, M. Martonosi

2009 18th International Conference on Parallel Architectures and Compilation Techniques > 29 - 40

2009 18th International Conference on Parallel Architectures and Compilation Techniques (PACT 2009)

Translation Lookaside Buffers (TLBs) are a staple in modern computer systems and have a significant impact on overall system performance. Numerous prior studies have addressed TLB designs to lower access times and miss rates; these, however, have been targeted towards uniprocessor architectures. As the computer industry embraces chip multiprocessor (CMP) architectures, it is important to study the...

Keywords:
COMPUTER ARCHITECTURE
BENCHMARK TESTING
PIPELINES

Publication type

book (25)
article (1)

Keywords

HARDWARE (9)
REGISTERS (9)
MICROPROCESSOR CHIPS (7)
PARALLEL PROCESSING (5)
INSTRUCTION SETS (4)
PROGRAM PROCESSORS (4)
EMBEDDED SYSTEMS (3)
FIELD PROGRAMMABLE GATE ARRAYS (3)
MULTIPROCESSING SYSTEMS (3)
PARALLEL ARCHITECTURES (3)
SOFTWARE (3)
ALGORITHM DESIGN AND ANALYSIS (2)
CLOCKS (2)
COMPUTATIONAL MODELING (2)
COMPUTERS (2)
DATA LEVEL PARALLELISM (2)
DATA MINING (2)
GRAPHICS PROCESSING UNIT (2)
INSTRUCTION LEVEL PARALLELISM (2)
MAGNETIC CORES (2)
MICROARCHITECTURE (2)
MICROPROCESSORS (2)
MULTI-THREADING (2)
OPTIMIZATION (2)
PROGRAM COMPILERS (2)
PROTOTYPES (2)
SIMD (2)
VERY LONG INSTRUCTION WORD (2)
16-BIT EMBEDDED PROCESSOR (1)
32-BIT ARCHITECTURES (1)
32-BIT EMBEDDED PROCESSOR (1)
ADDRESSING MODE (1)
ADVANCED OPTIMIZATION (1)
ANALYTICAL MODELS (1)
ARCHITECTURAL VULNERABILITY FACTORS (1)
ARCHITECTURE DESIGN (1)
AUTOMATIC PARALLELIZATION (1)
BANDWIDTH (1)
BINARY IMAGES (1)
BRANCH MISS PREDICTION PENALTY (1)
BRANCH PREDICTORS (1)
BRANCH TARGET BUFFER (1)
BTB ACCESS FILTERING (1)
BUFFER STORAGE (1)
CELL BROADBAND ENGINE (1)
CELL PROCESSOR (1)
CHIP MULTIPROCESSOR (1)
CHIP MULTIPROCESSOR ARCHITECTURE (1)
CHIP MULTIPROCESSORS (1)
CLUSTER-LEVEL SIMULTANEOUS MULTITHREADING (1)
COMMUNICATION DOMAIN SPECIFIC PROCESSOR (1)
COMPILER GENERATING QUEUE PROGRAMS (1)
COMPILER-FRIENDLY PROCESSOR (1)
COMPUTER PERFORMANCE (1)
COMPUTER SCIENCE EDUCATION (1)
COMPUTER SYSTEM (1)
CONNECTED COMPONENT LABELING (1)
CONTROL FLOW CHECKING (1)
CONTROL INDEPENDENCE (1)
CONTROL INDEPENDENT INSTRUCTION (1)
CPU INSTRUCTION DISPATCH (1)
CUSTOM INSTRUCTION (1)
CUSTOMIZATION (1)
DATA DEPENDENCY (1)
DATA-LEVEL PARALLELISM (1)
DECISION SUPPORT SYSTEMS (1)
DECODING (1)
DECOUPLED THREADED ARCHITECTURE (1)
DEEPLY-PIPELINED DESIGNS (1)
DELAY (1)
DESIGN (1)
DESIGN SPACE EXPLORATION (1)
DIGITAL SIGNAL PROCESSING (1)
DOMAIN SPECIFIC ARCHITECTURE (1)
DSP (1)
DUAL RENAMING (1)
DYNAMIC IMPLIED ADDRESSING MODE (1)
DYNAMIC TPM (1)
DYNAMIC VOLTAGE FREQUENCY SCALING (1)
EDUCATIONAL INSTITUTIONS (1)
EMBEDDED APPLICATIONS (1)
EMBEDDED PROCESSOR (1)
EMBEDDED PROCESSORS (1)
EMBEDDED SOFTWARE (1)
EMBEDED COMPUTER (1)
ENCODING (1)
ENERGY CONSUMPTION (1)
EXTENSIBLE PROCESSOR (1)
FIELD-PROGRAMMABLE GATE ARRAY (FPGA)-BASED SOFT-CORE PROCESSORS (1)
FILTERING (1)
FIXED LENGTH INSTRUCTION (1)
FPGA DEVICE (1)
GAEA (1)
GALS ARCHITECTURES (1)
GILP (1)
GPGPU (1)
GROUPED INDEPENDENT INSTRUCTIONS (1)

INFONA - science communication portal
