Based on the characteristics of large space manipulators, an on-board real-time singularity detection design is proposed. On the basis of forward and inverse kinematics calculations, the forward and inverse power methods are applied to obtain the singularity by iterative computation. First, the 7-DOF manipulator kinematics model is described and analyzed, and the main computational process is presented;...
Recent activity in near-data processing has built or proposed systems that can exploit technologies such as 3D stacks, in-situ computing, or dataflow devices. However, little effort has been applied to exploit the natural parallelism and throughput of DRAM. This article details research from Micron Technology in the area of processing in memory as a form of memory-centric computing. In-Memory Intelligence...
Conventional processor architectures are restricted in exploiting instruction level parallelism (ILP) due to the limited number of available registers in their instruction sets. Therefore, recent processor architectures expose their datapaths so that the compiler not only schedules instructions to functional units, but also takes care of directly moving values between functional units avoiding the...
With the improvement of processor and SDRAM performance, the performance of the SDRAM controller has become the bottleneck of system performance. In this paper, a Two-Level Buffered SDRAM controller is proposed, and its design and verification are described. To some extent, the controller improves the throughput of the processor for the SDRAM memory, and provides a solution for the design of high performance...
This paper describes the retargeting and further enhancement of a compact multitasking kernel for the 32-bit Altera Nios II processor. The kernel, called QUERK for Queen's University Educational Real-time Kernel, was originally written in assembly language and then the C language for the Motorola (and then Freescale) 68HC11 processor. Consisting of less than 200 lines of assembly-language instructions,...
Because of their high throughput and power efficiency, massively parallel architectures like graphics processing units (GPUs) have become a popular platform for general-purpose computing. However, there are few studies and analyses of GPU instruction set architectures (ISAs), although it is well known that the ISA is a fundamental design issue of all modern processors, including GPUs.
Many algorithms have been designed to improve the performance of filters by using a convolution design. The proposed RISC CPU is a single-cycle, non-pipelined processor with a uniform 32-bit instruction format. It has a load/store architecture, where operations are performed only on registers, and not on memory locations. It follows the classical von-Neumann...
Our proposed architecture of dynamically reconfigurable hardware for protocol processing (DRHPP) provides flexibility with high area efficiency. It can be used for a communications system-on-a-chip (SoC) in access networks. The DRHPP enables the modification and addition of various functions for protocol processing. Our architecture consists of three types of cells. The optimized number of these types...
FPGA-based platforms allow implementing reconfigurable systems that can change the functionality of portions of hardware at runtime. For this purpose, non-volatile, off-chip storage is required to hold the partial-configuration bitstreams that will be used for reconfiguration. Accessing such devices requires high CPU usage or dedicated hardware such as a Direct Memory Access (DMA) module, especially...
We describe the design and performance of the GRAPE-MPs, a series of SIMD accelerator boards for quadruple/hexuple/octuple-precision arithmetic operations. The basic design of the GRAPE-MPs consists of a number of processing elements (PEs) and memory components that handle data with quadruple/hexuple/octuple precision. A GRAPE-MPs processor is implemented on a structured ASIC chip and an FPGA chip...
The delay of instruction broadcast has a significant impact on the performance of Single Instruction Multiple Data (SIMD) architectures. This is especially true for massively parallel processing Systems-on-Chip (mppSoC), where the processing stage and the stage that sets up the communication mechanism need several clock periods. Subnetting is the strategy used to partition a single physical network into...
The data flow technique is a multiprocessor technique which enables parallelism to be found without being explicitly declared. One of the most important steps based on the dynamic data flow model is direct operand matching. The concept of direct operand matching represents the elimination of the costly process (in terms of computing time) related to associative searching of the operands. This paper...
Computer architecture is beset by two opposing trends. Technology scaling and deep pipelining have led to high memory access latencies; meanwhile, power and energy considerations have revived interest in traditional in-order processors. In-order processors, unlike their superscalar counterparts, do not allow execution to continue around data cache misses. In-order processors, therefore, suffer a greater...
A one-dimensional SIMD array, a PIM-based data-parallel computer architecture, has been proposed for multimedia processing applications. This paper describes the implementation of a controller for the one-dimensional SIMD array. The main components of the controller and the instruction format are presented. PE array control is introduced in detail. Finally, the results of simulation are given to show...
Real-time execution of applications is one of the key requirements for Cyber-Physical Systems (CPS) that integrate computational and physical elements for our social infrastructure, such as robotics, transportation, and consumer appliances. In such real-time systems, a task must be executed so as not to violate given time constraints. Moreover, it is desirable that the execution time of the task is predictable...
We present an obfuscation strategy to protect a program against injection attacks. The strategy represents the program as a set of code fragments in-between two consecutive system calls (the system blocks) and a graph that represents the execution order of the fragment (the system block graph). The system blocks and the system block graph are partitioned between two virtual machines (VMs). The Blocks-VM...
The hardware design of the Godson-3A processor adopts a scalable distributed multi-core structure based on a 2D mesh. It can use multi-chip interconnection to construct a unified topology at the board level or system level. Such an interconnected system cannot be achieved entirely by hardware design; it also needs a reasonable design of the BIOS and upper-layer software. As the...
Reconfigurable Field Programmable Gate Arrays (FPGAs) are attracting growing attention from developers of mission- and safety-critical applications (e.g., aerospace ones), as they allow unprecedented levels of performance, which makes these devices particularly attractive as ASIC replacements, and as they offer the unique feature of in-the-field reconfiguration. However, the sensitivity of reconfigurable...
This paper presents a novel, high performance and low cost execution architecture for the system level GALS programming language SystemJ, which extends Java with synchronous reactive features present in Esterel and asynchronous constructs of CSP (Communicating Sequential Processes). The new architecture is based on JOP (Java Optimized Processor), which is a hardware implementation of the Java Virtual...
In this paper, we propose a programmable string matching architecture that processes multiple characters in a single cycle. To simplify the architecture of previous works, we employ a method of realigning the input data stream by offsets. We show that some registers can be eliminated by using this method. Additionally, we present two different approaches to implementing programmable hardware for string...