Search results

Items from 1 to 20 out of 42 results

chapter

Fast RNS implementation of elliptic curve point multiplication in GF(p) with selected base pairs

Yifeng Mo, Shuguo Li

2017 27th International Conference on Field Programmable Logic and Applications (FPL) > 1 - 6

Implementing elliptic curve point multiplication (ECPM) based on residue number system (RNS) can efficiently use FPGA resources. In this paper, we propose a modular reduction method, where a kind of RNS pair is selected to achieve fast reduction. Our reduction method mainly needs several parallel additions while the reduction unit of previous designs requires two multiplications which are computed...

article

Efficient FPGA Mapping of Pipeline SDF FFT Cores

Carl Ingemarsson, Petter Kallstrom, Fahad Qureshi, Oscar Gustafsson

IEEE Transactions on Very Large Scale Integration (VLSI) Systems > 2017 > 25 > 9 > 2486 - 2497

In this paper, an efficient mapping of the pipeline single-path delay feedback (SDF) fast Fourier transform (FFT) architecture to field-programmable gate arrays (FPGAs) is proposed. By considering the architectural features of the target FPGA, significantly better implementation results are obtained. This is illustrated by mapping an R2²SDF 1024-point FFT core toward both Xilinx Virtex-4 and Virtex-6...

chapter

A high-speed and area-efficiency DSP Block Embedded in FPGAs

Bang Zhang, Peng Lu, Jian Wang, Jinmei Lai

2016 13th IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT) > 1497 - 1499

DSP blocks are integrated in most modern high-performance FPGA devices in order to improve the speed and efficiency of computation-intensive DSP designs. Based on the existing ones, this paper proposes a new DSP block which can improve the speed and area-efficiency by reducing the delay of cascade path and supporting multi-input addition. Both architecture and implementation are described. Virtual...

chapter

HPAZ: A high-throughput pipeline architecture of ZUC in hardware

Zongbin Liu, Qinglong Zhang, Cunqing Ma, Changting Li, et al.

2016 Design, Automation & Test in Europe Conference & Exhibition (DATE) > 269 - 272

In this paper, we propose a high-throughput pipeline architecture of the stream cipher ZUC which has been included in the security portfolio of 3GPP LTE-Advanced. In the literature, the schema with the highest throughput only implements the working stage of ZUC. The schemas which implement ZUC completely can only achieve a much lower throughput, since a self-feedback loop in the critical path significantly...

chapter

Effectiveness of matrix and pipeline FPGA-based arithmetic components of safety-related systems

Julia Drozd, Oleksandr Drozd, Svetlana Antoshchuk, Alex Kushnerov, et al.

2015 IEEE 8th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS) > 2 > 785 - 789

The paper is devoted to the design of digital components for safety-related instrumentation and control systems using modern CAD tools. Traditionally, the digital components are built with matrix parallelism that reduces fault tolerance of circuits and safety of systems in their checkability. Circuits with bitwise pipeline data processing have an advantage in checkability, but are considered as less...

chapter

Design and Implementation of an Embedded FPGA Floating Point DSP Block

Martin Langhammer, Bogdan Pasca

2015 IEEE 22nd Symposium on Computer Arithmetic > 26 - 33

2015 IEEE 22nd Symposium on Computer Arithmetic (ARITH)

This paper describes the architecture and implementation, from both the standpoint of target applications as well as circuit design, of an FPGA DSP Block that can efficiently support both fixed and single precision (SP) floating-point (FP) arithmetic. Most contemporary FPGAs embed DSP blocks that provide simple multiply-add-based fixed-point arithmetic cores. Current FP arithmetic FPGA solutions make...

chapter

Efficient montgomery multiplier for pairing and elliptic curve based cryptography

Khalid Javeed, Xiaojun Wang

2014 9th International Symposium on Communication Systems, Networks & Digital Signal Processing (CSNDSP) > 255 - 260

2014 9th International Symposium on Communication Systems, Networks & Digital Signal Processing (CSNDSP)

In this paper, we propose an efficient 256 × 256 bit modular multiplier based on the Montgomery reduction algorithm. The 256 × 256 bit modular multiplier is required in elliptic curve and pairing based cryptographic protocols to achieve a 128-bit security level. The built-in features of modern FPGAs are efficiently utilized. Two time-consuming components, (1) 512-bit addition and (2) 256 × 256 bit multiplier, are...

chapter

An Implementation of Montgomery Modular Multiplication on FPGAs

Xinkai Yan, Guiming Wu, Dong Wu, Fang Zheng, et al.

2013 International Conference on Information Science and Cloud Computing > 32 - 38

2013 International Conference on Information Science and Cloud Computing (ISCC)

Modular multiplication is one of the most important operations in the public key cryptographic algorithms. In order to design a high-performance modular multiplier, we present a novel hybrid Montgomery modular multiplier over GF(p) on FPGAs, which employs Karatsuba and Knuth multiplication algorithms in different levels to implement large integer multiplication. A 9-stage pipeline full-word multiplier...

chapter

An area efficient multiplexer based CORDIC

V. Naresh, B. Venkataramani, R. Raja

2013 International Conference on Computer Communication and Informatics > 1 - 5

2013 International Conference on Computer Communication and Informatics (ICCCI)

In the literature, multiplexers have been proposed for the ASIC implementation of an unrolled CORDIC (COordinate Rotation DIgital Computer) processor. In this paper, the efficacy of this approach is studied for implementation on FPGAs. For this study, both non-pipelined and 2-level pipelined CORDIC with 8 stages and using two schemes, one using adders in all the stages and another using multiplexers...

article

Multiply-accumulator using modified booth encoders designed for application in 16-bit RISC processor

He Jing-yu, Li Li-li, Zhu Yan-chao, Yang Wen-tao, et al.

2013 2nd International Symposium on Instrumentation and Measurement,... > 2013 > 416 - 419

2013 2nd International Symposium on Instrumentation & Measurement, Sensor Network and Automation (IMSNA)

In this paper, a multiply-accumulator (MAC) is designed for application in simple 16-bit RISC processors to enhance the processor's capability by adding a new instruction set. The new instruction set is created by modifying the processor's architecture using Verilog Hardware Description Language (Verilog HDL). The new instruction set has a simple structure, and can be fully compatible with the...

chapter

The design and implementation of reconfigurable multiplier with high flexibility

Jiangyi Shi, Gang Jing, Zhixiong Di, Si Yang

2011 International Conference on Electronics, Communications and Control (ICECC) > 1095 - 1098

This paper presents a reconfigurable mechanism for the multiplier. The proposed mechanism is applied to generate a multiplier whose data width, type and pipeline depth can be customized. The data width of each operand of these generated multipliers can be configured as 4i, where i = 1, 2, 3, 4, 5, 6, 7, 8. The data type of each operand can be unsigned or signed. The multiplier is composed of...

chapter

Parametrized hardware architectures for the Lucas primality test

Adrien Le Masle, Wayne Luk, Csaba Andras Moritz

2011 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation > 124 - 131

2011 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS XI)

We present our parametric hardware architecture of the NIST approved Lucas probabilistic primality test. To our knowledge, our work is the first hardware architecture for the Lucas test. Our main contributions are a hardware architecture for calculating the Jacobi symbol based on the binary Jacobi algorithm, a pipelined modular add-shift module for calculating the Lucas sequences, methods for dependence...

chapter

Iterative Refinement on FPGAs

Jun Kyu Lee, Gregory D. Peterson

2011 Symposium on Application Accelerators in High-Performance Computing > 8 - 13

2011 Symposium on Application Accelerators in High-Performance Computing (SAAHPC)

Achievable accuracy for mixed precision iterative refinement depends on the precisions supported by computing platforms. Even though the arithmetic unit precision can be flexible for programmable logic computing architectures (e.g. FPGAs), previous work rarely discusses the performance benefits due to enabling flexible achievable accuracy. Hence, we propose an iterative refinement approach on FPGAs...

chapter

Multiprocessor FPGA implementation of a 2D digital filter

Danny Teng-Hsiang Tsuei, Mohamed-Yahia Dabbagh, Manoj Sachdev

2011 24th Canadian Conference on Electrical and Computer Engineering (CCECE) > 630 - 633

2011 24th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)

High-performance implementations of 2D digital filters are highly desired in many applications for real-time processing. In this paper, a multiprocessor realization of a 2D denominator separable digital filter is implemented in an Altera Stratix III FPGA. The implementation achieves a data throughput equivalent to one multiplication and two additions, plus one clock cycle. It has been found that the maximum...

chapter

Controller design for matrix multiplication on FPGAs

Ahmad Khayyat, Naraig Manjikian

2011 24th Canadian Conference on Electrical and Computer Engineering (CCECE) > 1327 - 1332

2011 24th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)

FPGA technology constitutes an attractive platform for high-performance accelerators of parallel workloads in general-purpose computers. Matrix multiplication is a computationally intensive application that is highly parallelizable. Previous work has typically described custom floating-point components and reported on specific designs or implementations using these components for FPGA-based matrix...

chapter

A Sparse Matrix Personality for the Convey HC-1

K K Nagar, J D Bakos

2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines > 1 - 8

2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM 2011)

In this paper we describe a double precision floating point sparse matrix-vector multiplier (SpMV) and its performance as implemented on a Convey HC-1 reconfigurable computer. The primary contributions of this work are a novel streaming reduction architecture for floating point accumulation, a novel on-chip cache optimized for streaming compressed sparse row (CSR) matrices, and end-to-end integration...

chapter

Multiple data set reduction on FPGAs

Yi-Gang Tai, Chia-Tien Dan Lo, K Psarris

2010 International Conference on Field-Programmable Technology > 45 - 52

2010 International Conference on Field-Programmable Technology (FPT 2010)

Many scientific or engineering applications perform reduction of sets of sequential data streams. If the core operator of the reduction is deeply pipelined, dependencies between the input data elements cause data hazards in the pipeline. To tackle this problem, we propose a multiple set variable length reduction design with low latency and high pipeline utilization in this paper. We prove the buffer...

chapter

Design and implementation of an efficient montgomery modular multiplier with a new linear systolic array

Jizhong Liu, Jinming Dong

2010 IEEE International Conference on Information Theory and Information Security > 225 - 229

To resolve the latency problem of implementing the Montgomery modular multiplication algorithm using a linear systolic array, this paper proposes an improved Montgomery algorithm, and improves the systolic array by combining the long carry save adder (CSA) structure. This paper also proposes a series of methods to optimize the critical path and a non-waiting modular multiplication strategy which can...

chapter

Cascading Deep Pipelines to Achieve High Throughput in Numerical Reduction Operations

Mingjie Lin, Shaoyi Cheng, J Wawrzynek

2010 International Conference on Reconfigurable Computing and FPGAs > 103 - 108

2010 International Conference on Reconfigurable Computing and FPGAs (ReConFig 2010)

This work proposes a cascaded and pipelined (CAP) reconfigurable architecture to achieve high throughput in executing numerical reduction operations commonly found in many scientific computations by (1) cascading multiple deeply-pipelined floating-point arithmetic cores to match the inherent computing structure underlying target operations, (2) interleaving multiple computing threads to eliminate...

chapter

Implementation of a Floating Point Adder and Subtracter in NoGAP, A Comparative Case Study

Per Karlström, Wenbiao Zhou, Dake Liu

2010 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing > 68 - 72

2010 IEEE/IFIP 8th International Conference on Embedded and Ubiquitous Computing (EUC 2010)

Flexible Application Specific Instruction-set Processors (ASIPs) are starting to replace monolithic Application Specific Integrated Circuits (ASICs) in a wide variety of fields. However, the construction of an ASIP is today associated with a substantial design effort. Novel Generator of Accelerators And Processors (NoGap) is a tool for ASIP design utilizing hardware multiplexed data paths. One of the...

Data set:
ieee
Keywords:
FIELD PROGRAMMABLE GATE ARRAYS
ADDERS
PIPELINES

Publication type

book (37)
article (5)

Keywords

CLOCKS (16)
FPGA (16)
HARDWARE (11)
COMPUTER ARCHITECTURE (10)
PARALLEL PROCESSING (9)
FLOATING POINT ARITHMETIC (8)
REGISTERS (8)
ALGORITHM DESIGN AND ANALYSIS (7)
DELAY (7)
DIGITAL SIGNAL PROCESSING (7)
LOGIC DESIGN (7)
PIPELINE PROCESSING (7)
DATA MINING (6)
MULTIPLEXING (6)
TABLE LOOKUP (6)
THROUGHPUT (6)
MULTIPLYING CIRCUITS (4)
RANDOM ACCESS MEMORY (4)
ARRAYS (3)
BUFFER STORAGE (3)
CLOCK FREQUENCY (3)
DSP (3)
ENCODING (3)
FPGA IMPLEMENTATION (3)
PIPELINE ARITHMETIC (3)
VLSI (3)
ACCURACY (2)
ARTIFICIAL NEURAL NETWORKS (2)
BANDWIDTH (2)
COMPUTERS (2)
COPROCESSORS (2)
DECODING (2)
DELAYS (2)
EDUCATIONAL INSTITUTIONS (2)
ELLIPTIC CURVE CRYPTOGRAPHY (2)
EMBEDDED PROCESSORS (2)
EMBEDDED SYSTEMS (2)
ENGINES (2)
FAST FOURIER TRANSFORMS (2)
FFT PROCESSOR (2)
FIELD PROGRAMMABLE GATE ARRAY (2)
FLOATING POINT (2)
FLOATING-POINT ADDERS (2)
GENERATORS (2)
HARDWARE DESCRIPTION LANGUAGES (2)
HARDWARE DESIGN LANGUAGES (2)
HARDWARE-SOFTWARE CODESIGN (2)
INDEXES (2)
IP NETWORKS (2)
MATRIX ALGEBRA (2)
MATRIX MULTIPLICATION (2)
MICROPROCESSOR CHIPS (2)
MONTGOMERY MODULAR MULTIPLICATION (2)
MULTIPLY-ACCUMULATOR (2)
PROGRAM PROCESSORS (2)
PUBLIC KEY CRYPTOGRAPHY (2)
RECONFIGURABLE ARCHITECTURES (2)
RECONFIGURABLE COMPUTING (2)
REDUCED INSTRUCTION SET COMPUTING (2)
REDUCTION (2)
SIGNAL PROCESSING ALGORITHMS (2)
SOFTWARE (2)
SPARSE MATRICES (2)
SYSTOLIC ARRAYS (2)
TIMING (2)
VERY LARGE SCALE INTEGRATION (2)
WORD LENGTH 32 BIT (2)
6.34 GFLOPS (1)
ACCELERATOR (1)
ACCELERATOR ARCHITECTURES (1)
ACCUMULATOR DESIGN (1)
ACCURATE SCALAR PRODUCT (1)
ACOUSTICS (1)
ADAPTATION MODEL (1)
ADDER (1)
ADDER ACCUMULATOR OPERATOR (1)
ADL (1)
AGGREGATE STATISTICS (1)
ALGORITHMIC TRANSFORMATIONS (1)
ALTERA STRATIX III FPGA (1)
AMPLIFICATION PROBLEM (1)
APPLICATION SPECIFIC INSTRUCTION-SET PROCESSORS (1)
APPLICATION SPECIFIC INTEGRATED CIRCUITS (1)
APPROXIMATION METHODS (1)
ARCHITECTURAL LEVEL POWER REDUCTION (1)
ARCHITECTURE (1)
ARITHMETIC (1)
ARITHMETICAL DIGITAL COMPONENT (1)
ASSOCIATIVE BINARY OPERATOR (1)
AUTOMATIC TEST-BENCH GENERATION (1)
BASE-CONVERTING FLOATING-POINT ADDER (1)
BASIC LINEAR ALGEBRA SUBROUTINES (1)
BCD RECODING (1)
BCD-4221 CARRY SAVE ADDER REDUCTION TREE (1)
BEE3 (1)
BERKELEY EMULATION ENGINE (1)
BINARY ADDER (1)

INFONA - science communication portal
