The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
This paper aims to develop a framework for vehicle type classification using convolutional neural network based on vehicle rear view images. Compared with the extraction of the appearance features from vehicle side view and frontal view images, there has been relatively little research on vehicle type classification by using vehicle rear view images' information. The vehicle rear view images are detected...
For remote sensing image understanding, target detection is one of the most important tasks. In this paper, we propose one object detection method based on region proposal detection via active contour model and detection based on one-class classification method. The large scale remote sensing image is split into several connected components. And then, the proposed algorithm detects the object from...
Recently, deep learning has been proposed and verified to possess the strong ability to learn and express complex features, which has brought significant research achievements in signal processing. As a challenging task in speech signal processing, monaural speech separation has always been the research focus of researchers. From the usage of traditional signal processing methods and shallow models...
The deep learning is a popular research direction in machine learning field now. In this paper, the deep learning algorithms are used to recognize the underwater target radiated noises. The deep belief network (DBN) model and the stacked denoising autoencoder (SDAE) model are built respectively. Then the underwater acoustic simulated data of different types of targets as well as different states of...
Manual wafer-level die inking is a common procedure for excluding die locations that are likely to be defective. Although this is a more cost-effective process, as compared to the expensive burn-in tests, it remains a labor-intensive step during IC testing. For each manufactured wafer, test engineers have to visually inspect every failure map in order to identify any regions where additional die need...
Human action recognition is one of the most active research areas of computer vision. With the rapid development of deep learning, using neural networks to realize action recognition becomes a popular thesis. This paper proposes a self-learned action recognition method based on neural networks. The proposed method trains dictionaries with sparse autoencoder (SAE) and extracts the key frames with sparse...
There are many species of tomato diseases and pests, and the pathology of which is complex. It is difficult and error-prone to simply rely on manual identification. For the ten most common tomato diseases and pests in China, This paper explores the detection algorithms on leaf images and constructs the convolution neural network model to detect tomato pests and diseases based on VGG16[8] and transfer...
Recognizing fine-grained categories (e.g., bird species) highly relies on discriminative part localization and part-based fine-grained feature learning. Existing approaches predominantly solve these challenges independently, while neglecting the fact that part localization (e.g., head of a bird) and fine-grained feature learning (e.g., head shape) are mutually correlated. In this paper, we propose...
This paper introduces a novel approach for modeling visual relations between pairs of objects. We call relation a triplet of the form (subject; predicate; object) where the predicate is typically a preposition (eg. ’under’, ’in front of’) or a verb (’hold’, ’ride’) that links a pair of objects (subject; object). Learning such relations is challenging as the objects have different spatial configurations...
Despite the recent success of deep-learning based semantic segmentation, deploying a pre-trained road scene segmenter to a city whose images are not presented in the training set would not achieve satisfactory performance due to dataset biases. Instead of collecting a large number of annotated images of each city of interest to train or refine the segmenter, we propose an unsupervised learning approach...
We propose a novel video object segmentation algorithm based on pixel-level matching using Convolutional Neural Networks (CNN). Our network aims to distinguish the target area from the background on the basis of the pixel-level similarity between two object units. The proposed network represents a target object using features from different depth layers in order to take advantage of both the spatial...
In this work we propose a new CNN+LSTM architecture for camera pose regression for indoor and outdoor scenes. CNNs allow us to learn suitable feature representations for localization that are robust against motion blur and illumination changes. We make use of LSTM units on the CNN output, which play the role of a structured dimensionality reduction on the feature vector, leading to drastic improvements...
We consider the question: what can be learnt by looking at and listening to a large number of unlabelled videos? There is a valuable, but so far untapped, source of information contained in the video itself – the correspondence between the visual and the audio streams, and we introduce a novel “Audio-Visual Correspondence” learning task that makes use of this. Training visual and audio networks from...
We propose the Anchored Regression Network (ARN), a nonlinear regression network which can be seamlessly integrated into various networks or can be used stand-alone when the features have already been fixed. Our ARN is a smoothed relaxation of a piecewise linear regressor through the combination of multiple linear regressors over soft assignments to anchor points. When the anchor points are fixed...
Traditional vehicle detectors always utilize singletemplate model to represent the vehicle which can not encircle vehicles with different aspect ratios. In this paper, we propose a fast and accurate approach for detecting vehicles which joints classification and aspect ratio regression. The key idea is extending the boosting decision trees method to estimate vehicle's aspect ratio during vehicle detection,...
We propose a novel framework for abnormal event detection in video that requires no training sequences. Our framework is based on unmasking, a technique previously used for authorship verification in text documents, which we adapt to our task. We iteratively train a binary classifier to distinguish between two consecutive video sequences while removing at each step the most discriminant features....
Subitizing (i.e., instant judgement on the number) and detection of salient objects are human inborn abilities. These two tasks influence each other in the human visual system. In this paper, we delve into the complementarity of these two tasks. We propose a multi-task deep neural network with weight prediction for salient object detection, where the parameters of an adaptive weight layer are dynamically...
Discriminative correlation filters (DCFs) have been shown to perform superiorly in visual tracking. They only need a small set of training samples from the initial frame to generate an appearance model. However, existing DCFs learn the filters separately from feature extraction, and update these filters using a moving average operation with an empirical weight. These DCF trackers hardly benefit from...
Sensor based algorithms need to extract features from raw sensor data. However, different devices have different sensor data distributions. This distribution differences lead to a problem that model trained on device A may be invalid when applied to device B. However, it is labor-consuming to collect data and label them on device B from scratch. To solve the problem, a solution is proposed to learn...
A novel ultra-wideband radar sensor system for simultaneous detection of falls and vital signs is presented. The suggested system is able to deal with real-life conditions, such as lack of real-fall data for training, body movements, several people present, and privacy issues. Micro-Doppler features, extracted from time-frequency spectrograms, are used to classify human actions as related to normal...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.