Music tracking is a useful technique for many music-related tasks. Applications include evaluating musicians' performance, automatic score page turning, and syncing music lyrics. In this work, we describe a WiSARD-based system for real-time music tracking and evaluate the performance of the corresponding implementation. Given any audio example, the system is capable of recognizing which point of...
Cognitive radio is a technique for using the electromagnetic spectrum efficiently, which is important for future wireless communication, including 5G networks. Neural networks are nature-inspired computational models used to solve cognitive radio prediction problems. This paper presents the use of a contextual Sigma-if neural network to predict channel states for cognitive radio. Our results indicate that Sigma-if...
This paper deals with the use of deep neural networks (DNNs) for speech recognition. The main goal is to find the best strategy for training and using these models within the acoustic modeling module of a large vocabulary continuous speech recognition (LVCSR) system for the Czech language. For this purpose, various DNNs are trained a) using several training strategies, b) with different...
In this paper, we tackle the task of symbolic gesture recognition using inertial MicroElectroMechanical Systems (MEMS) present in smartphones. We propose to build a non-linear similarity metric based on a Siamese Neural Network (SNN), trained using a new error function that models the relations between pairs of similar and dissimilar samples in order to structure the network output space. Experiments...
Truly autonomous robots require the capacity to recognise their surroundings by interpreting their sensorimotor stream. We present an online learning algorithm for training a mixture of echo state network experts that can segment a compliant robot's sensorimotor stream. Our method follows a probabilistic approach, using a hidden Markov model to model the switching dynamics between the experts. The...
This paper presents a deep recurrent regularization neural network (DRRNN) for speech recognition. Our idea is to build a regularization neural network acoustic model by applying hybrid Tikhonov and weight-decay regularization, which compensates for the variations due to the input speech as well as the model parameters in the restricted Boltzmann machine as a pre-training stage for feature learning...
Dropout and DropConnect can be viewed as regularization methods for deep neural network (DNN) training. In DNN acoustic modeling, the huge number of speech samples makes it expensive to repeatedly sample the neuron mask (Dropout) or the weight mask (DropConnect) from a high-dimensional distribution. In this paper we investigate the effect of Gaussian stochastic neurons on DNN acoustic modeling...
Standard Hidden Markov Models (HMMs) have proved to be a very useful tool for temporal sequence pattern recognition, although they have poor discriminative power. Neural Networks (NNs), by contrast, have been recognized as powerful tools for classification tasks, but they model temporal variation less efficiently than HMMs. In order to get the advantages of both HMMs and NNs, different hybrid...
Dynamic spectrum access (DSA) technologies offer solutions to the spectral crowding associated with static frequency allocation. Hierarchical DSA networks aim at allowing secondary users to efficiently utilize licensed spectrum, while still protecting primary users and ensuring them first priority to spectrum access. However, these networks are often multi-tiered and the concept of different operating...
Many machine learning methods have been applied to Named Entity Recognition (NER). Such methods generally build on a large manually annotated training set. However, the training set is usually limited, as human labeling is costly and time-consuming. Compared to the training set, the unlabeled corpus is usually much bigger and contains rich information about language. In this paper, a hybrid Deep Neural...
This paper presents an efficient segmentation method which uses a simple neural network architecture. The goal was to implement an automatic annotation instrument for the vocal signal, capable of separating vowel signals, consonant signals, and pauses between utterances. This instrument is used for feature extraction from the vowel regions by an emotion recognition application...
Much research has been done over the past decades to develop speech recognition systems. However, their performance under speaker variability lags behind that of human recognition. To solve this problem, speaker adaptation methods have been proposed. These methods adapt either the acoustic model parameters or the input features of the speech recognition systems to improve their performance...
The purpose of analyzing gene network structure is to identify and understand unknown related functions and regulatory mechanisms at the molecular level in organisms. Traditional models of gene regulatory networks often lack an effective solution method for gene expression profiling data because of high time and space complexity. In this study, a new model of gene regulatory network based...
Recently, automatic speech recognition has advanced significantly with the introduction of deep neural networks for acoustic modeling. However, there is no clear evidence yet that this does not come at the price of poorer generalization to conditions that were not present during training. On the other hand, acoustic modeling with Reservoir Computing (RC) did not offer improved clean speech recognition...
As neurobiological evidence points to the neocortex as the brain region mainly involved in high-level cognitive functions, an innovative model of neocortical information processing has recently been proposed. Based on a simplified model of a neocortical neuron, and inspired by experimental evidence of neocortical organisation, the Hierarchical Temporal Memory (HTM) model attempts to understand...
Speech affective recognition is an important branch of speech recognition, whose main purpose is to analyze the emotional characteristics contained in speech signals. Using a single model for identification has significant limitations. This paper presents a recognition model based on HMM and PNN, which uses the PNN for classification and the HMM for generating feature matching...
The Extreme Learning Machine (ELM) is a simplified neural network. It non-linearly embeds input data in a higher dimensional space using randomly generated sigmoidal basis functions. The training target vector is then approximated by a linear weighted sum of these basis functions. In this paper the ELM algorithm has been applied to classify static hand gestures that represent different letters of...
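The two steps described in the ELM abstract above (a random sigmoidal embedding followed by a linear fit to the targets) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `elm_fit` and `elm_predict`, the Gaussian initialization of the hidden weights, the hidden-layer size, and the XOR toy data are all hypothetical choices made for this example, and the output weights are obtained here with an ordinary least-squares solve.

```python
import numpy as np


def _hidden(X, W, b):
    """Sigmoidal basis functions applied to a random linear projection."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def elm_fit(X, T, n_hidden=20, seed=0):
    """Fit an ELM-style model (illustrative sketch).

    X: (n_samples, n_features) inputs, T: (n_samples, n_outputs) targets.
    The hidden weights W and biases b are drawn at random and never trained
    (Gaussian draws are an assumption of this sketch). Only the linear
    output weights beta are fitted, by least squares.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = _hidden(X, W, b)                          # higher-dimensional embedding
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)  # linear weighted sum fit
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Approximate the targets as a weighted sum of the basis functions."""
    return _hidden(X, W, b) @ beta


# Toy usage on XOR, which is not linearly separable in the input space.
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
T_toy = np.array([[0.0], [1.0], [1.0], [0.0]])
W, b, beta = elm_fit(X_toy, T_toy)
predictions = elm_predict(X_toy, W, b, beta)
```

Because only `beta` is learned, training reduces to a single linear solve; for a classification task such as the hand-gesture setting mentioned in the abstract, one would typically use one-hot target rows and take the arg-max of the predicted row.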
In this paper we combine three simple refinements proposed recently to improve HMM/ANN hybrid models. The first refinement is to apply a hierarchy of two nets, where the second net models the contextual relations of the state posteriors produced by the first network. The second idea is to train the network on context-dependent units (HMM states) instead of context-independent phones or phone states...
This paper introduces the sparse multilayer perceptron (SMLP), which learns the transformation from the inputs to the targets as in a multilayer perceptron (MLP) while the outputs of one of the internal hidden layers are forced to be sparse. This is achieved by adding a sparse regularization term to the cross-entropy cost and learning the parameters of the network to minimize the joint cost. On the TIMIT...
In this paper, a novel collective network of binary classifiers (CNBC) framework is presented for content-based audio classification. The topic has been studied in several publications before, but in many cases the number of different classification categories is quite limited and needs to be fixed a priori. We focus our efforts on increasing both the classification accuracy and the number of classes,...