As the memory and storage hierarchy gets deeper and more complex, it is important to have new benchmarks and evaluation tools that allow us to explore the emerging middleware solutions to use this hierarchy. Skel is a tool aimed at automating and refining this process of studying HPC I/O performance. It works by generating application I/O kernel/benchmarks as determined by a domain-specific model....
Increasing data set sizes motivate a shift of focus from computation-centric systems to data-centric systems, where data movement is treated as a first-class optimization metric. An example of this emerging paradigm is in-situ computing in large-scale computing systems. Observing that data movement costs are increasing at an exponential rate even at a node level (as a node itself is fast-becoming...
Wireless technology has recently grown rapidly to meet user demand and to push toward the limits of system performance. Simulation and verification frameworks play an important role in accelerating the investigation of technology proofs of concept, field trials, and large-scale commercial prototyping. In this paper, we present system-level simulation of heterogeneous model and unified HW/SW...
In this paper, we present a distributed data visualization framework for HPC environments based on the PBVR (Particle Based Volume Rendering) method. The PBVR method is a point-based rendering approach in which the volumetric data to be visualized is represented as a set of small, opaque particles. This method has object-space and image-space variants, defined by the place (object or image-...
Independent applications co-scheduled on the same hardware will interfere with one another, affecting performance in complicated ways. Predicting this interference is key to efficiently scheduling applications on shared hardware, but forming accurate predictions is difficult because there are many shared hardware features that could lead to the interference. In this paper we investigate machine learning...
As the most active project in the Hadoop ecosystem these days (Zaharia, 2014), Spark is a fast and general purpose engine for large-scale data processing. Thanks to its advanced Directed Acyclic Graph (DAG) execution engine and in-memory computing mechanism, Spark runs programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk (Apache, 2016). However, Spark performance is impacted...
Availability of affordable hardware that in effect enables desktop supercomputing has enabled more ambitious neural simulations driven by more complex software. However, this opportunity comes with costs, in terms of long learning curves to take advantage of the performance possibilities of idiosyncratic, architecturally heterogeneous hardware and decreasing ability to be confident in the quality of...
Today, cloud computing has become a promising paradigm that aims at delivering computing resources and services on demand. The adoption of these services has been rapidly increasing. One of the main issues in this context is how to evaluate the ability of cloud systems to provide the desired services while respecting the QoS constraints. Experimentation in a real environment is a hard problem. In...
Accurate IPC estimates are critical for generating performance projections of key workloads on future designs. However, the need to respond to projection requests in a timely manner, in the face of rapidly evolving applications and software stacks and tight schedule constraints, often precludes design teams from executing detailed workload analysis, sampling and simulation flows for such purposes....
DNNs (Deep Neural Networks) have demonstrated great success in numerous applications such as image classification, speech recognition, video analysis, etc. However, DNNs are much more computation-intensive and memory-intensive than previous shallow models. Thus, it is challenging to deploy DNNs in both large-scale data centers and real-time embedded systems. Considering performance, flexibility, and...
In this paper we describe a flexible infrastructure that can directly interface unmodified application executables with FPGA hardware acceleration IP in order to 1) facilitate faster computer architecture simulation and 2) prototype microarchitecture or accelerator IP. Dynamic binary modification tool plugins are directly interfaced to the application under evaluation via flexible software interfaces...
The growing demands in IT services for improving efficiency and quality at low cost to handle complex compute requirements have led to the integration of high-performance computing (HPC) systems and cloud infrastructure in data centers. Earlier, HPC systems were limited to academic and research institutions and engineering laboratories. However, the emergence of cloud infrastructures and their successful...
Online education interaction is an important part of online education research. The emergence and development of cloud computing and big data technology provide new opportunities for online education interaction research, and have a great influence on its service mode and data processing. Based on the characteristics of cloud computing and big data, this paper discusses the problems faced by online...
Deep Neural Networks (DNNs) have emerged as a powerful and versatile set of techniques showing successes on challenging artificial intelligence (AI) problems. Applications in domains such as image/video processing, autonomous cars, natural language processing, speech synthesis and recognition, genomics and many others have embraced deep learning as the foundation. DNNs achieve superior accuracy for...
We present a technique to automatically generate SystemVerilog Assertions from designs using dynamic dependency graphs. We extract relations between signals of the design using only a few simulation runs, which drastically reduces the required number of use cases compared to other approaches. Additionally, unlike previous approaches, we do not use expression templates to establish those relations...
The employment of five distinct benchmarks on the Distributed Environment for Academic Computing (DEAC) Cluster at Wake Forest University provides meaningful metrics of cluster processor and memory performance. Given the heterogeneous nature of the DEAC Cluster, the benchmarks taken consider the specific processor architectures comprising the cluster. The data obtained will be assessed via two modeling...
Predicting the performance of an application running on parallel computing platforms is becoming increasingly important due to the long development time of an application and the high resource management cost of parallel computing platforms. However, predicting overall performance is complex and must take into account both parallel calculation time and communication time. Difficulty in accurate performance...
Hardware-in-the-loop simulation testing combines the advantages of live testing and digital simulation testing, and can build a realistic test environment. This kind of test allows repeated testing over multiple samples. The key techniques of hardware-in-the-loop simulation testing are real-time algorithms and communication technology. In this paper, based on a reflective memory network, the design for hardware...
Performance Prediction Toolkit (PPT) is a simulator mainly developed at Los Alamos National Laboratory to facilitate rapid and accurate performance prediction of large-scale scientific applications on existing and future HPC architectures. In this paper, we present three interconnect models for performance prediction of large-scale HPC applications. They are based on interconnect topologies widely...
Inferring activities on smartphones is a challenging task. Prior works have elaborated on using sensory data from built-in hardware sensors in smartphones or taking advantage of location information to understand human activities. In this paper, we explore two types of data on smartphones to conduct activity inference: 1) Spatial-Temporal: reflecting daily routines from the combination of spatial...