Deep learning (DL) training-as-a-service (TaaS) is an important emerging industrial workload. TaaS must satisfy a wide range of customers who lack the experience and/or resources to tune DL hyper-parameters (e.g., mini-batch size and learning rate), and meticulous tuning for each user's dataset is prohibitively expensive. Therefore, TaaS hyper-parameters must be fixed with values that are applicable...
Deep learning is nowadays one of the most popular research topics in computer science. In recent years, the extensive application of convolutional neural networks has made them a rapidly developing direction in computer architecture research. Currently, there is growing demand for deploying deep learning networks offline on embedded mobile systems. However, how to balance...
The basic features of some of the most versatile and popular open source frameworks for machine learning (TensorFlow, Deep Learning4j, and H2O) are considered and compared. A comparative analysis was performed, and conclusions were drawn as to the advantages and disadvantages of these platforms. Performance tests on the de facto standard MNIST data set were carried out on the H2O framework for...
Demand is mounting in the industry for scalable GPU-based deep learning systems. Unfortunately, existing training applications built atop popular deep learning frameworks, including Caffe, Theano, and Torch, are incapable of conducting distributed GPU training over large-scale clusters. To remedy this situation, this paper presents Nexus, a platform that allows existing deep learning frameworks...
Deep Learning over Big Data (DLoBD) is becoming one of the most important research paradigms to mine value from the massive amount of gathered data. Many emerging deep learning frameworks start running over Big Data stacks, such as Hadoop and Spark. With the convergence of HPC, Big Data, and Deep Learning, these DLoBD stacks are taking advantage of RDMA and multi-/many-core based CPUs/GPUs. Even though...
Deep learning methods have resulted in effective strategies for improving performance in a large number of applications, becoming one of the most used strategies by developers and researchers. In order to facilitate the implementation of those approaches, a set of software frameworks have been developed and are currently available. Selection of a specific framework is an important task, especially...
On-road obstacle detection and classification is one of the key tasks in the perception system of self-driving vehicles. Since vehicle tracking involves localization and association of vehicles between frames, detection and classification of vehicles is necessary. Vision-based approaches are popular for this task due to cost-effectiveness and usefulness of appearance information associated with the...
According to estimates by the World Health Organization (WHO), in 2014 more than 1.9 billion adults aged 18 years and older were overweight. Overall, about 13% of the world's adult population (11% of men and 15% of women) were obese, and 39% of adults aged 18 years and over (38% of men and 40% of women) were overweight. The worldwide prevalence of obesity more than doubled between 1980 and 2014. The...
Recent advances in deep learning have enabled researchers across many disciplines to uncover new insights about large datasets. Deep neural networks have shown applicability to image, time-series, textual, and other data, all of which are available in a plethora of research fields. However, their computational complexity and large memory overhead require advanced software and hardware technologies...
Deep neural networks have gained popularity in recent years, obtaining outstanding results in a wide range of applications such as computer vision, in both academia and multiple industry areas. The progress made in recent years cannot be understood without taking into account the technological advancements seen in key domains such as High Performance Computing, more specifically in the Graphic Processing...
Deep Learning (DL) algorithms have become ubiquitous in data analytics. As a result, major computing vendors, including NVIDIA, Intel, AMD, and IBM, have architectural road-maps influenced by DL workloads. Furthermore, several vendors have recently advertised new computing products as accelerating DL workloads. Unfortunately, it is difficult for data scientists to quantify the potential of these...
Estimating human age from brain MR images is useful for early detection of Alzheimer's disease. In this paper we propose a fast and accurate method based on deep learning to predict a subject's age. Compared with previous methods, our algorithm achieves comparable accuracy using fewer input images. With our GPU version of the program, the time needed to make a prediction is 20 ms. We evaluate our methods using...
Deep learning is a model of machine learning loosely based on our brain. Artificial neural networks have been around since the 1950s, but recent advances in hardware such as graphical processing units (GPUs), software such as cuDNN, TensorFlow, Torch, Caffe, Theano, and Deeplearning4j, and new training methods have made training artificial neural networks fast and easy. In this paper, we are comparing some...
Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC, called a Tensor Processing Unit (TPU), deployed in datacenters since 2015, that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second...
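The 8-bit MAC scheme this abstract refers to can be illustrated with a small quantized matrix-vector product: weights and activations are mapped to 8-bit integers, products are accumulated in wider integers, and the result is rescaled to floating point. This is a minimal sketch of the general technique, not the TPU's actual pipeline; the shapes, per-tensor max-abs scales, and values below are illustrative assumptions.

```python
# Sketch of 8-bit multiply-accumulate (MAC) arithmetic: quantize to int8,
# accumulate integer products in a wide accumulator, then dequantize.

def quantize(vec, scale):
    """Map floats to the int8 range [-127, 127] given a per-tensor scale."""
    return [max(-127, min(127, round(v / scale))) for v in vec]

def int8_matvec(matrix_q, vec_q):
    """Integer matrix-vector product; accumulation stays in (wide) integers,
    mirroring the 32-bit accumulators that feed an 8-bit MAC array."""
    return [sum(w * x for w, x in zip(row, vec_q)) for row in matrix_q]

# Example: a 2x3 weight matrix and a length-3 activation vector.
W = [[0.5, -1.0, 0.25], [1.5, 0.75, -0.5]]
x = [1.0, 2.0, -1.0]

w_scale, x_scale = 1.5 / 127, 2.0 / 127   # max-abs scaling per tensor
W_q = [quantize(row, w_scale) for row in W]
x_q = quantize(x, x_scale)

acc = int8_matvec(W_q, x_q)               # wide integer accumulation
y = [a * w_scale * x_scale for a in acc]  # dequantize back to float

print(y)  # approximates the exact float result [-1.75, 3.5]
```

The small quantization error visible in the output is the usual price paid for the large area and energy savings of integer MAC units.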
Many studies have shown that Deep Convolutional Neural Networks (DCNNs) exhibit high accuracy on image recognition tasks given large training datasets. An optimization technique known as asynchronous mini-batch Stochastic Gradient Descent (SGD) is widely used for deep learning because it gives fast training speed and good recognition accuracies, although it may increase generalization error if training...
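The asynchronous mini-batch SGD mentioned in this abstract can be sketched in a few lines: several workers draw mini-batches and update a shared parameter without coordination. The lock-free (Hogwild-style) update, the toy one-dimensional regression, and the learning-rate and thread-count choices below are illustrative assumptions, not the setup of the paper.

```python
# Minimal sketch of asynchronous mini-batch SGD: worker threads update a
# shared weight without locking, trading gradient staleness for speed.
import random
import threading

random.seed(0)

# Toy data on the line y = 3x; we fit a single weight w.
data = [(x, 3.0 * x) for x in (i / 10 for i in range(1, 101))]

w = [0.0]          # shared parameter, updated lock-free by every worker
LR = 0.001         # small step size keeps the racy updates stable
BATCH = 8

def worker(steps):
    for _ in range(steps):
        batch = random.sample(data, BATCH)
        # Gradient of the mean squared error 0.5*(w*x - y)^2 w.r.t. w
        grad = sum((w[0] * x - y) * x for x, y in batch) / BATCH
        w[0] -= LR * grad   # racy read-modify-write: the "asynchronous" part

threads = [threading.Thread(target=worker, args=(300,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"learned w = {w[0]:.3f} (true slope is 3.0)")
```

On this noise-free toy problem the races are harmless; the generalization-error concern the abstract raises arises at scale, where stale gradients effectively enlarge the mini-batch.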
With recent advances in deep convolutional neural networks (CNN), deep learning has brought significant quality improvement and flexibility on single image super resolution (SR). In this paper, we describe how CNN based SR can be accelerated on integrated GPUs. To this end, we employ a CNN model from an existing single image SR approach, and develop the model within a well-known deep learning framework...
Deep learning has been shown to be a successful machine learning method for a variety of tasks, and its popularity has resulted in numerous open-source deep learning software tools becoming publicly available. Training a deep network is usually a very time-consuming process. To address the huge computational challenge in deep learning, many tools exploit hardware features such as multi-core CPUs and many-core GPUs to...
In recent years convolutional neural networks (CNNs) have been successfully applied to various applications that are appropriate for deep learning, from image and video processing to speech recognition. The advancements in both hardware (e.g. more powerful GPUs) and software (e.g. deep learning models, open-source frameworks and supporting libraries) have significantly improved the accuracy and training...
The stacked autoencoder is a deep learning model that consists of multiple autoencoders. This model has been widely applied in numerous machine learning applications. A significant amount of effort has been made to increase the size of deep learning models, in terms of both the training dataset and the number of model parameters, to improve performance. However, training a large deep learning...
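The stacking described in this abstract is commonly realized by greedy layer-wise pretraining: each autoencoder is trained to reconstruct its own input, and the next one trains on the codes of the previous. The sketch below uses linear layers, full-batch gradient descent, and a toy rank-1 dataset purely for clarity; real stacked autoencoders use nonlinear layers and far larger data.

```python
# Sketch of greedy layer-wise pretraining for a stacked (linear) autoencoder.

def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def train_autoencoder(data, in_dim, code_dim, lr=0.05, epochs=3000):
    """Train encoder E (code_dim x in_dim) and decoder D (in_dim x code_dim)
    by full-batch gradient descent on squared reconstruction error."""
    E = [[0.1 + 0.01 * (i + j) for j in range(in_dim)] for i in range(code_dim)]
    D = [[0.1 + 0.01 * (i + j) for j in range(code_dim)] for i in range(in_dim)]
    n = len(data)
    for _ in range(epochs):
        gE = [[0.0] * in_dim for _ in range(code_dim)]
        gD = [[0.0] * code_dim for _ in range(in_dim)]
        for x in data:
            h = matvec(E, x)                      # code
            err = [r - xi for r, xi in zip(matvec(D, h), x)]
            for i in range(in_dim):
                for j in range(code_dim):
                    gD[i][j] += err[i] * h[j]
            for i in range(code_dim):
                for j in range(in_dim):
                    gE[i][j] += sum(D[k][i] * err[k] for k in range(in_dim)) * x[j]
        for i in range(in_dim):
            for j in range(code_dim):
                D[i][j] -= lr * gD[i][j] / n
        for i in range(code_dim):
            for j in range(in_dim):
                E[i][j] -= lr * gE[i][j] / n
    return E, D

# Data on a 1-D line in 3-D space, so a 3 -> 2 -> 1 stack can be near-lossless.
data = [[t, 0.5 * t, -t] for t in (i / 5 - 1 for i in range(11))]

E1, D1 = train_autoencoder(data, 3, 2)            # first autoencoder
codes = [matvec(E1, x) for x in data]
E2, D2 = train_autoencoder(codes, 2, 1)           # second, trained on codes

# Reconstruction through the full stack: decode in reverse order of encoding.
recon = [matvec(D1, matvec(D2, matvec(E2, matvec(E1, x)))) for x in data]
mse = sum(sum((a - b) ** 2 for a, b in zip(r, x))
          for r, x in zip(recon, data)) / len(data)
print(f"stacked reconstruction MSE = {mse:.4f}")
```

The greedy schedule is what makes the model a *stacked* autoencoder: each layer's weights are fit before the next layer ever sees data, which is also why training cost grows with model size, the scaling problem the abstract goes on to address.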
Recently, convolutional networks have achieved great success in the field of computer vision. In order to improve the efficiency of convolutional networks, a large number of solutions focusing on training algorithms and parallelism strategies have been proposed. In this paper, a novel algorithm based on a look-up table is proposed to speed up convolutional networks with small filters by applying GPU...
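The look-up-table idea for small filters can be sketched as follows: when inputs are quantized to a small value set (here 8-bit, 0..255), every product of a filter weight with a possible input value can be precomputed once, so the convolution replaces multiplications with table lookups. This is a generic illustration under those assumptions, not the paper's algorithm; on a GPU the tables would live in fast shared memory.

```python
# Look-up-table convolution sketch: precompute weight*value products so the
# inner loop does table lookups and additions instead of multiplications.

def build_tables(filt):
    # tables[k][v] == filt[k] * v for every possible 8-bit input value v
    return [[w * v for v in range(256)] for w in filt]

def conv1d_lut(signal, filt):
    """Valid 1-D convolution (correlation form) using precomputed tables."""
    tables = build_tables(filt)
    k = len(filt)
    out = []
    for i in range(len(signal) - k + 1):
        acc = 0
        for j in range(k):
            acc += tables[j][signal[i + j]]   # lookup instead of multiply
        out.append(acc)
    return out

signal = [10, 0, 255, 3, 7, 128]   # 8-bit quantized input
filt = [1, -2, 1]                  # a small filter, as in the abstract
print(conv1d_lut(signal, filt))    # identical to the direct computation
```

The table costs 256 entries per filter weight, so the scheme only pays off for small filters reused across many positions, exactly the regime the abstract targets.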