With the extensive application of machine learning algorithms in bioinformatics, more and more computer researchers are beginning to focus on this field. Polyadenylation of messenger RNA (mRNA) is one of the key steps of gene expression in eukaryotes. The polyadenylation site marks the end of transcription, so it is of great significance to explore prediction of the site of gene sequences encoding gene....
We present a novel approach for the quantization of large speech databases. It uses an unsupervised iterative process to regulate a similarity measure to set the number of clusters and their boundaries, thus overcoming the shortcomings of conventional clustering algorithms such as k-Means and Fuzzy C-Means, which require a priori knowledge of the number of clusters and a similarity measure that follows the...
Speech recognition systems are ubiquitous and find application in automated voice control, voice dialling and automated directory assistance. This paper aims at implementing a neural network based isolated spoken word recognition system on an embedded board, the Raspberry Pi, using the open source software Octave. Mel-Frequency Cepstral Coefficient (MFCC) features are extracted from speech signal...
Automatic drum transcription methods aim at extracting a symbolic representation of notes played by a drum kit in audio recordings. For automatic music analysis, this task is of particular interest as such a transcript can be used to extract high level information about the piece, e.g., tempo, downbeat positions, meter, and genre cues. In this work, an approach to transcribe drums from polyphonic...
In this paper, we tackle the continuous gesture recognition problem with a two-stream recurrent neural network (2S-RNN) for RGB-D data input. In our framework, a spotting-recognition strategy is used: the continuous gestures are first segmented into separate gestures, and then each isolated gesture is recognized using the 2S-RNN. Concretely, the gesture segmentation is based...
Inspired by Gustave Le Bon's idea of crowds as single-minded entities, we present a novel approach to describe the behavior of a crowd as a single entity, based on the global movement of the entire aggregate of people forming the crowd. The present work differs significantly from existing literature, where the behaviors of single individuals within the crowd are the building blocks to describe crowd...
Building applications that are cognizant of temporal and spatial changes in human behaviour under a one-class learning restriction is a requirement for many user-centric systems. We are particularly motivated to demonstrate the utility of algorithms for the self-identification of smartphones. A framework is designed to quantify: (i) the dissimilarity in behaviours between any two users, (ii)...
Deep neural network (DNN) acoustic models can be adapted to under-resourced languages by transferring the hidden layers. An analogous transfer problem is known as few-shot learning, in which scantily seen objects are recognised based on their meaningful attributes. In a similar way, this paper proposes a principled way to represent the hidden layers of a DNN in terms of attributes shared across languages. The diverse...
Spoken language understanding (SLU) is one of the important problems in natural language processing, especially in dialog systems. The Fifth Dialog State Tracking Challenge (DSTC5) introduced an SLU challenge task: automatic tagging of speech utterances by two speaker roles with speech act tags and semantic slot tags. In this paper, we focus on speech act tagging. We propose local coactivate...
In this paper, we elaborate on the advantages of combining two neural network methodologies, convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent neural networks, with the framework of hybrid hidden Markov models (HMM) for recognizing offline handwritten text. CNNs employ shift-invariant filters to generate discriminative features within neural networks. We show that CNNs are...
Monitoring the presence of occupants in a room in a timely manner is a fundamental step for effective building management. Environmental sensor networks are cost-efficient and do not intrude on privacy, which makes them well suited to room occupancy detection. Nonlinear discriminative models, e.g., support vector machines and neural networks, have shown good detection performance...
Non-intrusive appliance load monitoring is a technique to help power companies monitor and analyze residential energy usage. Aggregated power load measurements are disaggregated into individual appliance loads by examining the appliance-specific power consumption characteristics. This data can then be used to modify consumer behaviors via detailed billing data and/or demand-pricing tariffs. A number...
Efficient spectrum sensing can be realized by predicting the future idle times of primary users' activity in a cognitive radio network. In dynamic spectrum access, based on a reliable prediction scheme, a secondary user chooses the channel with the longest idle time for data transmission. In this paper, four supervised machine learning techniques, two from ANN, i.e., Multilayer Perceptron and Recurrent...
Recent breakthroughs in deep neural networks have led to the proliferation of their use in image and speech applications. Conventional deep neural networks (DNNs) are fully-connected multi-layer networks with hundreds or thousands of neurons in each layer. Such a network requires a very large weight memory to store the connectivity between neurons. In this paper, we propose a hardware-centric methodology...
The performance of speech classification tasks can be improved by accurate acoustic modeling. This modeling is responsible for establishing the relationship between the speech signal and the phonetic units that were produced by the speaker. In this paper, acoustic modeling (AM) is done using the reservoir computing (RC) technique, for which the input speech signal frames can be identified and classified...
This paper presents an investigation of the speech recognition accuracy of five models on unknown test patterns when the number of classes increases from three to five. GMM, SVM, MLP, RBPNN and LVQ are used and their speech recognition accuracy is studied on five isolated digits. In the first experiment, three isolated words from zero to two are used for training and testing...
In this paper, we introduce a simple but novel model to detect abnormal events in surveillance video using a sparse autoencoder and a recurrent neural network. In this model, we first train a sparse autoencoder to extract features and then use a sequence of temporally continuous features to train a recurrent neural network to predict the subsequent features. We classify the frame as normal or abnormal based...
This paper explores a novel hybrid approach for classifying sequential data such as isolated spoken words. The approach combines a hidden Markov model (HMM) with a spiking neural network (SNN). The HMM, consisting of states and transitions, forms a fixed backbone with nonadaptive transition probabilities. The SNN, however, implements a Bayesian computation by using an appropriately selected spike...
In recent years there has been increased interest in studies that explore integrative learning of language and other modalities by using neural network models. However, for practical application to human-robot interaction, the acquired semantic structure between language and meaning has to be available immediately and repeatably whenever necessary, just as in everyday communication. As a solution...
This paper presents an unsupervised approach for learning and classifying patterns that have spatio-temporal structure from a very small set of training samples, using a spike-timing neural network with axonal conductance delays. Spatio-temporal patterns are converted into spike trains, which can be used to train the network with spike-timing-dependent plasticity learning. A pattern is encoded as...