Search results for: Philippe Coussy

Items from 1 to 20 out of 34 results

chapter

Efficient scalable hardware architecture for highly performant encoded neural networks

Hugues Wouafo, Cyrille Chavet, Philippe Coussy, Robin Danilo

2017 IEEE International Workshop on Signal Processing Systems (SiPS) > 1 - 6

2017 IEEE International Workshop on Signal Processing Systems (SiPS)

Different neural network models have been proposed to design efficient associative memories like Hopfield networks, Boltzmann machines or Cogent confabulation. Compared to the classical models, Encoded Neural Network (ENN) is a recently introduced formalism with a proven higher efficiency. This model has been improved through different contributions like Clone-based ENN (CbNNs) or Sparse ENNs (S-ENNs)...

chapter

A 142MOPS/mW integrated programmable array accelerator for smart visual processing

Satyajit Das, Davide Rossi, Kevin J. M. Martin, Philippe Coussy, more

2017 IEEE International Symposium on Circuits and Systems (ISCAS) > 1 - 4

2017 IEEE International Symposium on Circuits and Systems (ISCAS)

Due to the increasing demand for low-power computing, and diminishing returns from technology scaling, industry and academia are turning with renewed interest toward energy-efficient programmable accelerators. This paper proposes an Integrated Programmable-Array accelerator (IPA) architecture based on an innovative execution model, targeted to accelerate both data and control-flow parts of deeply embedded...

chapter

Efficient mapping of CDFG onto coarse-grained reconfigurable array architectures

Satyajit Das, Kevin J. M. Martin, Philippe Coussy, Davide Rossi, more

2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC) > 127 - 132

2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC)

In the approaching era of IoT, flexible and low power accelerators have become essential to meet aggressive energy efficiency targets. During the last few decades, Coarse Grain Reconfigurable Arrays (CGRA) have demonstrated high energy efficiency as accelerators, especially for high-performance streaming applications. While existing CGRAs mostly rely on partial and full predication techniques to support...

chapter

Associative Memory based on clustered Neural Networks: Improved model and architecture for Oriented Edge Detection

Robin Danilo, Hugues Nono Wouafo, Cyrille Chavet, Vincent Gripon, more

2016 Conference on Design and Architectures for Signal and Image Processing (DASIP) > 51 - 58

2016 Conference on Design and Architectures for Signal and Image Processing (DASIP)

Associative Memories (AM) are storage devices that allow addressing content from part of it, in contrast to classical index-based memories. This property makes them promising candidates for various search challenges including pattern detection in images. Clustered based Neural Networks (CbNN) allow efficient design of AM by providing fast pattern retrieval, especially when implemented in hardware...

chapter

A Scalable Design Approach to Efficiently Map Applications on CGRAs

Satyajit Das, Thomas Peyret, Kevin Martin, Gwenole Corre, more

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) > 655 - 660

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Coarse-Grained Reconfigurable Architectures (CGRAs) are promising high-performance and power-efficient platforms. However, their use is still limited by the current capabilities of the mapping tools. This paper presents a new scalable, efficient design flow to map applications written in a high-level language on CGRAs. This approach leverages simultaneous scheduling and binding steps respectively...

chapter

A dynamically reconfigurable ECC decoder architecture

Awais Sani, Philippe Coussy, Cyrille Chavet

2016 Design, Automation & Test in Europe Conference & Exhibition (DATE) > 1437 - 1440

2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)

Due to their impressive error correction performances, Error Correcting Codes (ECC) are now widely used in communication systems. In order to achieve high throughput requirements ECC decoders are based on parallel architectures, which results in a major issue: memory access conflicts. In this paper, we introduce a new class of ECC decoder architectures that dynamically reconfigures by executing on-chip...

article

A Unified Design Flow to Automatically Generate On-Chip Monitors During High-Level Synthesis of Hardware Accelerators

Mohamed Ben Hammouda, Philippe Coussy, Loic Lagadec

IEEE Transactions on Computer-Aided Design of Integrated Circuits and... > 2017 > 36 > 3 > 384 - 397

Security and safety are increasingly important in embedded system design. A key issue hence lies in the ability of systems to respond safely when errors occur at runtime, to prevent unacceptable behaviors that can lead to failures or sensitive data leakage. In this paper, we propose a design approach that automatically generates on-chip monitors (OCMs) during high-level synthesis (HLS) of hardware...

chapter

Algorithm and implementation of an associative memory for oriented edge detection using improved clustered neural networks

Robin Danilo, Hooman Jarollahi, Vincent Gripon, Philippe Coussy, more

2015 IEEE International Symposium on Circuits and Systems (ISCAS) > 2501 - 2504

2015 IEEE International Symposium on Circuits and Systems (ISCAS)

Associative memories are capable of retrieving previously stored patterns given parts of them. This feature makes them good candidates for pattern detection in images. Clustered Neural Networks is a recently-introduced family of associative memories that allows a fast pattern retrieval when implemented in hardware. In this paper, we propose a new pattern retrieval algorithm that results in a dramatically...

chapter

Improving storage of patterns in recurrent neural networks: Clone-based model and architecture

Hugues Wouafo, Cyrille Chavet, Philippe Coussy

2015 IEEE International Symposium on Circuits and Systems (ISCAS) > 577 - 580

2015 IEEE International Symposium on Circuits and Systems (ISCAS)

Artificial neural networks are used in various domains like computer science and computer engineering for tasks like image processing or design of associative memories. The goal is to mimic the impressive brain ability to process or to memorize and retrieve information. Recently a new model of neural network has been proposed and can be used to design associative memories. When considering patterns...

chapter

In-place memory mapping approach for optimized parallel hardware interleaver architectures

Saeed Ur Reehman, Cyrille Chavet, Philippe Coussy, Awais Sani

2015 Design, Automation & Test in Europe Conference & Exhibition (DATE) > 896 - 899

2015 Design, Automation & Test in Europe Conference & Exhibition (DATE)

Due to their impressive error correction performances, turbo-codes or LDPC architectures are now widely used in communication systems and are one of the most critical parts of decoders. In order to achieve high throughput requirements these decoders are based on parallel architectures, which results in a major problem to be solved: parallel memory access conflicts. To solve these conflicts, different...

chapter

A modeling and code generation framework for critical embedded systems design: From Simulink down to VHDL and Ada/C code

Mickael Lanoe, Matteo Bordin, Dominique Heller, Philippe Coussy, more

2014 21st IEEE International Conference on Electronics, Circuits and Systems (ICECS) > 742 - 745

2014 21st IEEE International Conference on Electronics, Circuits and Systems (ICECS)

The P project gathers industrial and academic partners to address the issue of a modeling approach and automatic code generation for critical embedded systems. Works target the definition of an open design flow which integrates qualified tools to produce both hardware and software implementations. This paper introduces the project through two code generators that allow generating Ada, C and VHDL from...

chapter

A HLS-Based Toolflow to Design Next-Generation Heterogeneous Many-Core Platforms with Shared Memory

Paolo Burgio, Andrea Marongiu, Philippe Coussy, Luca Benini

2014 12th IEEE International Conference on Embedded and Ubiquitous Computing > 130 - 137

2014 12th IEEE International Conference on Embedded and Ubiquitous Computing (EUC)

This work describes how we use High-Level Synthesis to support design space exploration (DSE) of heterogeneous many-core systems. Modern embedded systems increasingly couple hardware accelerators and processing cores on the same chip, to trade specialization of the platform to an application domain for increased performance and energy efficiency. However, the process of designing such a platform is...

chapter

Efficient application mapping on CGRAs based on backward simultaneous scheduling/binding and dynamic graph transformations

Thomas Peyret, Gwenole Corre, Mathieu Thevenin, Kevin Martin, more

2014 IEEE 25th International Conference on Application-Specific Systems, Architectures and Processors > 169 - 172

2014 IEEE 25th International Conference on Application-specific Systems, Architectures and Processors (ASAP)

Mapping an application on a coarse grained reconfigurable architecture (CGRA) is a complex task which is still often completely or partially realized manually. This paper presents an automated synthesis flow based on simultaneous scheduling and binding steps. The proposed method uses a backward traversal of the formal model obtained after compilation and dynamically transforms it when needed. Our...

chapter

A design approach to automatically synthesize ANSI-C assertions during High-Level Synthesis of hardware accelerators

Mohamed Ben Hammouda, Philippe Coussy, Loic Lagadec

2014 IEEE International Symposium on Circuits and Systems (ISCAS) > 165 - 168

2014 IEEE International Symposium on Circuits and Systems (ISCAS)

Evolution of Systems-On-Chip (SoC) increases the challenge of verification and post-silicon debug. Nowadays, Assertion Based Verification (ABV) is a widely used methodology. Languages like PSL (Property Specification Language) or SVA (SystemVerilog Assertions) allow engineers to define properties at Register Transfer Level (RTL). Properties can then be used to generate simulation/hardware assertion...

chapter

Embedding polynomial time memory mapping and routing algorithms on-chip to design configurable decoder architectures

Saeed-ur-Rehman, Awais Sani, Cyrille Chavet, Philippe Coussy

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5036 - 5040

ICASSP 2014 - 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

To fulfill the high data rate requirement of current telecommunication standards, error-correction code decoders are implemented on parallel architectures, leading to a memory conflict problem. Different memory mapping approaches are proposed in the literature to solve this problem. However, these approaches can only be executed offline due to their computational complexity and resultant memory mapping...

chapter

A tightly-coupled hardware controller to improve scalability and programmability of shared-memory heterogeneous clusters

Paolo Burgio, Robin Danilo, Andrea Marongiu, Philippe Coussy, more

2014 Design, Automation & Test in Europe Conference & Exhibition (DATE) > 1 - 4

2014 Design, Automation & Test in Europe Conference & Exhibition (DATE)

Modern designs for embedded many-core systems increasingly include application-specific units to accelerate key computational kernels with orders-of-magnitude higher execution speed and energy efficiency compared to software counterparts. A promising architectural template is based on heterogeneous clusters, where simple RISC cores and specialized HW units (HWPU) communicate in a tightly-coupled manner...

chapter

A conflict-free memory mapping approach to design parallel hardware interleaver architectures with optimized network and controller

Aroua Briki, Cyrille Chavet, Philippe Coussy

SiPS 2013 Proceedings > 201 - 206

2013 IEEE Workshop on Signal Processing Systems (SiPS)

Recent communication standards and storage systems (e.g. wireless access, digital video broadcasting or magnetic storage in hard disk drives) use error correcting codes such as LDPC (Low Density Parity Check) or Turbo-codes to reliably transfer data between source and destination. For high data rate applications, Turbo and LDPC codes are decoded on parallel architectures. However, parallel architectures...

chapter

Dynamic branch prediction for high-level synthesis

Vianney Lapotre, Philippe Coussy, Cyrille Chavet, Hugues Wouafo, more

2013 23rd International Conference on Field programmable Logic and Applications > 1 - 6

2013 23rd International Conference on Field Programmable Logic and Applications (FPL)

Branch prediction is a widely used technique to optimize the performance of pipelined microprocessor architectures. In the High-Level Synthesis (HLS) domain, few synthesis techniques for optimizing control flows of data-dominated applications have been proposed. Previous works mainly focus on using techniques like path-based scheduling algorithms, speculation techniques or static branch prediction for pipelined...

chapter

On-chip implementation of memory mapping algorithm to support flexible decoder architecture

Saeed-ur Rehman, Awais Sani, Philippe Coussy, Cyrille Chavet

2013 IEEE International Conference on Acoustics, Speech and Signal Processing > 2751 - 2755

ICASSP 2013 - 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Parallel hardware architectures are used to design turbo-like iterative decoders to meet the requirements of high data rate applications. However, parallel architectures suffer from a memory conflict problem due to the interleaving law used in turbo-like codes. To solve this conflict problem, different memory mapping approaches have been developed. These methods automatically generate a set of control words stored...

chapter

Architecture and programming model support for efficient heterogeneous computing on tigthly-coupled shared-memory clusters

Paolo Burgio, Andrea Marongiu, Robin Danilo, Philippe Coussy, more

2013 Conference on Design and Architectures for Signal and Image Processing > 22 - 29

2013 Conference on Design and Architectures for Signal and Image Processing (DASIP)

Modern computer vision and image processing embedded systems exploit hardware acceleration inside scalable parallel architectures, such as tightly-coupled clusters, to achieve stringent performance and energy efficiency targets. Architectural heterogeneity typically makes software development cumbersome, thus shared memory processor-to-accelerator communication is typically preferred to simplify code...
