In this paper, we use an advanced method called Faster R-CNN to detect traffic signs. This method represents the state of the art in object recognition: it no longer requires manual extraction of image features and can segment images to generate candidate region proposals automatically. Our experiment is based on a traffic sign detection competition held in 2016 by CCF and the UISEE company. The mAP (mean average...
Although recent progress in speaker verification has produced powerful models, malicious attacks in the form of spoofed speech are generally not addressed. In previous attempts, deep neural networks were used to extract high-dimensional features, which were later classified using an independent classifier. Even though the results of this approach are promising, this architecture's disadvantage is it's...
Although recent progress in speaker verification has produced powerful models, malicious attacks in the form of spoofed speech are generally not addressed. Recent results in the ASVSpoof2015 and BTAS2016 challenges indicate that spoof-aware features are a possible solution to this problem. Most successful methods in both challenges focus on spoof-aware features, rather than focusing on a powerful classifier...
This paper investigates a new voice conversion technique using phone-aware Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs). Most existing voice conversion methods, including Joint Density Gaussian Mixture Models (JDGMMs), Deep Neural Networks (DNNs) and Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNNs), only take acoustic information of speech as features to...
Model-based VAD approaches have been widely used and have achieved success in practice. These approaches usually cast VAD as a frame-level classification problem and employ statistical classifiers, such as a Gaussian Mixture Model (GMM) or a Deep Neural Network (DNN), to assign a speech/silence label to each frame. Because classification assumes that frames are independent, the VAD results tend to be fragile....
Band selection plays an important role in reducing the dimensionality of hyperspectral data sets. It is a combinatorial optimization problem of selecting an optimal band (feature) subset, which generally involves high computational complexity. In this paper, we present an efficient band selection method based on the covariance matrix. The method tries to compute the subset of bands with the largest...
In recent years, deep neural networks (DNNs) have achieved great success as acoustic models in speech recognition. An important application of DNNs is to derive bottleneck features. In this paper, we first investigate the robustness of bottleneck features generated by three types of DNN structures on the Aurora 4 task without any explicit noise compensation. Secondly, we propose the node-pruning...
Night video surveillance is crucial for constructing an all-weather video surveillance system. However, night video surveillance faces several problems: no color information, low brightness, low contrast, and low signal-to-noise ratio (SNR). These problems can cause serious false and missed object detections. In this paper, we propose a novel night video surveillance method based on the image second-order...
We consider the automated recognition of human actions in surveillance videos. Most current methods build classifiers based on complex handcrafted features computed from the raw inputs. Convolutional neural networks (CNNs) are a type of deep model that can act directly on the raw inputs. However, such models are currently limited to handling 2D inputs. In this paper, we develop a novel 3D CNN model...
Recently, combining Conditional Random Fields (CRFs) with neural networks has proven successful at learning high-level features in sequence labeling tasks. However, such models are difficult to train because the increased number of parameters to tune requires enormous amounts of labeled data to avoid overfitting. In this paper, we propose a transfer learning framework for the sequence labeling task of gesture...
The traditional SPM approach based on bag-of-features (BoF) requires nonlinear classifiers to achieve good image classification performance. This paper presents a simple but effective coding scheme called Locality-constrained Linear Coding (LLC) in place of the VQ coding in traditional SPM. LLC utilizes the locality constraints to project each descriptor into its local-coordinate system, and the projected...
Recent years have witnessed significant progress in detection of basic human actions. However, most existing methods rely on assumptions such as known spatial locations and temporal segmentations or employ very computationally expensive approaches such as sliding window search through a spatio-temporal volume. It is difficult for such methods to scale up to handle the challenges in real applications...
Many techniques for recognizing and identifying disturbed signal waveforms are primarily based on visual inspection. This paper proposes a technique based on wavelet packet decomposition that extracts features from the disturbed signals in order to identify the possible causes of the disturbance. On the basis of the defined groups of Electromagnetic Interference, the information of interest about...