We present an ASIC architecture with coarse-grain reconfigurability that uses accelerators to improve performance over fine-grain reconfigurable architectures. A reconfigurable FFT ASIC was built as a proof of concept, and it successfully demonstrated valid switch operation for reconfiguration.
Devices connected to the internet are increasingly the targets of deliberate and sophisticated attacks [1]. Embedded system engineers tend to focus on well-defined functional capabilities rather than “obscure” security and resilience. However, “after-the-fact” system hardening could be prohibitively expensive or even impossible. The co-design of security and resilience with functionality has to overcome...
The Air Force Research Laboratory Information Directorate Advanced Computing and Communications Division is developing a new computing architecture designed to provide a high-performance embedded computing (HPEC) pod solution for real-time operational and tactical intelligence, surveillance, and reconnaissance (ISR) missions. This newly designed system, Agile Condor, is a scalable and...
DDR3 memory is at the heart of almost all cloud computing servers today. A recently publicized failure mechanism in DDR3 memory, coined Row Hammer, has been shown to not only be a reliability issue but also a security risk. No industry standards group, government agency or trade association has signed up to address this issue. Data Centers and end users are on their own. This paper will discuss briefly...
Data processing systems impose multiple views on data as it is processed by the system. These views include spreadsheets, databases, matrices, and graphs. There are a wide variety of technologies that can be used to store and process data through these different steps. The Lustre parallel file system, the Hadoop distributed file system, and the Accumulo database are all designed to address the largest...
The gap between data production and user ability to access, compute and produce meaningful results calls for tools that address the challenges associated with big data volume, velocity and variety. One of the key hurdles is the inability to methodically remove expected or uninteresting elements from large data sets. This difficulty often wastes valuable researcher and computational time by expending...
High-performance computing applications that will run on future exascale-class supercomputing systems are projected to encounter accelerated rates of faults and errors. For these large-scale systems, maintaining fault resilient operation is a key challenge. The most widely used resiliency approach today, which is based on checkpoint and rollback (C/R) recovery, is not expected to remain viable in...
High fidelity prediction of the link budget between a pair of transmitting and receiving antennas in dense and complex environments is computationally very intensive at high frequencies. Iterative physical optics (IPO) is a scalable solution for electromagnetic (EM) simulations with complex geometry. In this paper, an efficient and robust solution is presented to predict the link budget between antennas...
The sheer size of data sets from application domains such as biomedical and social networks will lead to the need to develop algorithms that have strict time bounds and can tolerate temporary unavailability of data if they are to produce acceptable results in feasible time. In this paper we describe a simple, yet powerful, object-based concurrent programming model that features atomicity, timed execution...
K-means clustering is a popular unsupervised machine learning method that has been used in diverse applications including image processing, information retrieval, social sciences and weather forecasting. However, clustering is computationally expensive, especially when applied to large datasets. In this paper, we explore accelerating the performance of K-means clustering using three approaches: 1)...
A fast and efficient vector reduction circuit is very important for real-time applications. In this paper, a new tag-based, fully pipelined vector reduction circuit is proposed that can concurrently handle multiple input vectors arriving in arbitrary sequence. The proposed circuit also provides a simple and efficient interface with access timing similar to that of a RAM, so it can be used in a broad range of applications.
Array-type reductions represent a frequently occurring algorithmic pattern in many scientific applications. A special case occurs if array elements are accessed in an irregular, often random manner, making their concurrent and scalable execution difficult. In this work we present a new approach that consists of language and runtime support and targets popular parallel programming models such as OpenMP...
With the introduction of low power System on a Chip (SoC) processor architectures in enterprise server configurations, there is a growing need to develop the software that will support scale-out, data intensive cloud applications that are deployed in data centers today. In this paper, we describe the design and implementation of a low latency user space fully compliant TCP/IP socket stack on a low...
Gene regulatory network reconstruction is a fundamental problem in computational biology. We recently developed an algorithm, called PANDA (Passing Attributes Between Networks for Data Assimilation), that integrates multiple sources of omics data and estimates regulatory network models. This approach was initially implemented in the C++ programming language and has since been applied to a number of...
Graphics processing units (GPUs) have inadvertently become supercomputers in and of themselves, to the benefit of applications outside of graphics. Acceleration of multiple orders of magnitude has been achieved in scientific computing, co-processing and the like. However, the Single Instruction Multiple Data (SIMD) design of GPUs is extremely sensitive to thread divergence. So much so that performance...
The ability to simultaneously leverage multiple modes of sensor information is critical for perception of an automated vehicle's physical surroundings. Spatio-temporal alignment, or registration, of the incoming information is often a prerequisite to analyzing the fused data. The persistence and reliability of multi-modal registration are therefore the key to the stability of decision support systems...
Scientific applications are typically compute intensive, often due to the requirement of solving large sparse linear systems of equations. The geometric multigrid method (GMG) is one of the most efficient algorithms for solving these systems and is well suited for parallelization. Herein we focus on an in-depth analysis of a GPU-based GMG implementation and compare the results against an optimized...
Path planning problems arise in many applications where the objective is to find the shortest path from a given source to a destination. In this paper, we explore the comparison of programming languages in the context of parallel workload analysis. We characterize parallel versions of path planning algorithms, such as Dijkstra's algorithm, across the C/C++ and Python languages. Programming language...