The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
For large-scale visual search, highly compressed yet meaningful representations of images are essential. Structured vector quantizers based on product quantization and its variants are usually employed to achieve such compression while minimizing the loss of accuracy. Yet, unlike binary hashing schemes, these unsupervised methods have not yet benefited from the supervision, end-to-end learning and...
We present an approach to accelerating a wide variety of image processing operators. Our approach uses a fully-convolutional network that is trained on input-output pairs that demonstrate the operator’s action. After training, the original operator need not be run at all. The trained network operates at full resolution and runs in constant time. We investigate the effect of network architecture on...
This paper proposed an innovative education platform-VREX (Virtual Reality based Education eXpansion), with combination of online and offline, to improve the curriculum building and teaching experience. VREX is based on Virtual Reality (VR) and we believe VR can revolutionize the education ecosystem. With some trials, we found VR can be used to promote curriculum effectiveness in an immersive environment...
Training of Artificial Neural Networks (ANN) is an important step to make the network able to accomplish the desired task. This capacity of learning in such networks makes them applied in many applications as modeling and control. However, many of training algorithms have some drawbacks like: too many parameters to be estimated, important calculus time. In this paper, we propose a very simple method...
This paper deals with the problem of audio source separation. To handle the complex and ill-posed nature of the problems of audio source separation, the current state-of-the-art approaches employ deep neural networks to obtain instrumental spectra from a mixture. In this study, we propose a novel network architecture that extends the recently developed densely connected convolutional network (DenseNet),...
In this paper, we use Diagonal Recurrent Neural Networks on a sequence prediction task. The modification from standard RNN is simple: Diagonal recurrent matrices are used instead of full. This results in better test likelihood and faster convergence compared to regular full RNNs in most of our experiments. We show the benefits of using diagonal recurrent matrices with popularly used LSTM and GRU architectures...
Deep learning (deep structured learning, hierarchical learning or deep machine learning) is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using multiple processing layers with complex structures or otherwise composed of multiple non-linear transformations. In this paper, we present the results of testing neural networks architectures...
In the realm of surface electromyography (sEMG) gesture recognition, deep learning algorithms are seldom employed. This is due in part to the large quantity of data required for them to train on. Consequently, it would be prohibitively time consuming for a single user to generate a sufficient amount of data for training such algorithms. In this paper, two datasets of 18 and 17 able-bodied participants...
Pedestrian detection is an important topic in object detection. Compared with other object detectors, YOLOv2 achieves high accuracy and fast speed for general object detection, however it degrades accuracy when detecting crowed pedestrians. In this paper, combining with the skip structure of FCN, we tailor the YOLOv2 network to improve the accuracy in detecting small pedestrians which appear in groups...
Fine-grained vehicle classiflcation is a challenging task due to the subtle differences between vehicle classes. Several successful approaches to fine-grained image classification rely on part-based models, where the image is classified according to discriminative object parts. Such approaches require however that parts in the training images be manually annotated, a laborintensive process. We propose...
In this paper, we develop an end-to-end framework for text region proposal generation and text detection in the natural scene, which takes advantage of the efficient training strategies of the neural network. We make some technical changes over the SSD architectures, so that our method can handle the Chinese and English bilingual situation. Experiments show that our method exceeds the state-of-the-art...
The contribution of this paper is to bridge the gap on understanding the mathematical structure and the computational implementation of a convolutional neural network using a minimal model. The proposed minimal convolutional neural network is presented using a layering approach. This approach provides a clear understanding of the main mathematical operations in a convolutional neural network. Hence,...
Information and communication technologies (ICTs) are changing the way people develop their lives. This change is not indifferent to activities involved in education. For this reason, several learning techniques, based on ICTs, appear to improve the efficiency of the educational process. However, the problem of a maintained individualized learning arises, which does not focus on needs of the student,...
With the advent of new technologies in the field of medicine, there is rising awareness of biomechanisms, and we are better able to treat ailments than we could earlier. Deep learning has helped a lot in this endeavor. This paper deals with the application of deep learning in brain tumor segmentation. Brain tumors are difficult to segment automatically given the high variability in the shapes and...
Crowd counting on still images is very challenging due to heavy occlusions and scale variations. In this paper, we aim to develop a method that can accurately estimate the crowd count from a still image. Recently, convolutional neural networks have been shown effective in many computer vision tasks including crowd counting. To this end, we propose a fully convolutional network (FCN) architecture to...
The deployment of deep convolutional neural networks (CNNs) in many real world applications is largely hindered by their high computational cost. In this paper, we propose a novel learning scheme for CNNs to simultaneously 1) reduce the model size; 2) decrease the run-time memory footprint; and 3) lower the number of computing operations, without compromising accuracy. This is achieved by enforcing...
Future frame prediction in videos is a promising avenue for unsupervised video representation learning. Video frames are naturally generated by the inherent pixel flows from preceding frames based on the appearance and motion dynamics in the video. However, existing methods focus on directly hallucinating pixel values, resulting in blurry predictions. In this paper, we develop a dual motion Generative...
Deep Convolutional Neural Networks (CNNs) achieve substantial improvements in face detection in the wild. Classical CNN-based face detection methods simply stack successive layers of filters where an input sample should pass through all layers before reaching a face/non-face decision. Inspired by the fact that for face detection, filters in deeper layers can discriminate between difficult face/non-face...
Human actions captured in video sequences are threedimensional signals characterizing visual appearance and motion dynamics. To learn action patterns, existing methods adopt Convolutional and/or Recurrent Neural Networks (CNNs and RNNs). CNN based methods are effective in learning spatial appearances, but are limited in modeling long-term motion dynamics. RNNs, especially Long Short- Term Memory (LSTM),...
As handheld video cameras are now commonplace and available in every smartphone, images and videos can be recorded almost everywhere at anytime. However, taking a quick shot frequently yields a blurry result due to unwanted camera shake during recording or moving objects in the scene. Removing these artifacts from the blurry recordings is a highly ill-posed problem as neither the sharp image nor the...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.