We present a contextual spoken language understanding (contextual SLU) method using Recurrent Neural Networks (RNNs). Previous work has shown that context information, specifically the previously estimated domain assignment, is helpful for domain identification. We further show that other context information, such as the previously estimated intent and slot labels, is useful for both intent classification...
Convolutional Neural Networks (CNNs) have demonstrated powerful acoustic modelling capabilities due to their ability to account for structural locality in the feature space; and in recent works CNNs have been shown to often outperform fully connected Deep Neural Networks (DNNs) on TIMIT and LVCSR. In this paper, we perform a detailed empirical study of CNNs under the low resource condition, wherein...
This paper compares the classification performance and training times of feed-forward neural networks with one hidden layer trained with two network weight optimisation methods. The first used the extreme learning machine (ELM) algorithm; the second used the back-propagation (BP) algorithm. Using identical network topologies, the two weight...
This paper introduces a method to produce high-quality transcriptions of speech data from only two crowd-sourced transcriptions. These transcriptions, produced cheaply by people on the Internet, for example through Amazon Mechanical Turk, are often of low quality. Often, multiple crowd-sourced transcriptions are combined to form one transcription of higher quality. However, the state of the art is...
In many spoken language understanding systems (SLUS), domain classification is the most crucial component, as system responses based on wrong domains often yield very unpleasant user experiences. In multi-lingual domain classification, the training data for some poor-resource languages often comes from machine translation. Some of the higher order n-gram features are distorted during machine translation...
In this paper, we present methods in deep multimodal learning for fusing speech and visual modalities for Audio-Visual Automatic Speech Recognition (AV-ASR). First, we study an approach where uni-modal deep networks are trained separately and their final hidden layers fused to obtain a joint feature space in which another deep network is built. While the audio network alone achieves a phone error...
Recently, a discrepancy in results has appeared in the literature concerning score fusion methods, classified into “combination methods” and “classification methods” [1]. Some works suggest that a simple Arithmetic Mean Rule (AMR) can outperform some training-based methods on multimodal data [2], while others favour, among other trained classifiers, a Support Vector Machine [3]. This paper makes a comparative...
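As an illustration of the Arithmetic Mean Rule mentioned in the abstract above, the following minimal sketch averages per-modality match scores into one fused score. It assumes the scores have already been normalised to a common range such as [0, 1]; the function names `amr_fuse` and `accept` and the threshold value are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def amr_fuse(scores: Sequence[float]) -> float:
    """Fuse match scores from several modalities with the arithmetic mean.

    Assumption: every score is already normalised to the same range.
    """
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


def accept(scores: Sequence[float], threshold: float = 0.5) -> bool:
    """Accept the claim when the fused score reaches a (hypothetical) threshold."""
    return amr_fuse(scores) >= threshold


if __name__ == "__main__":
    # Hypothetical normalised scores from three matchers.
    print(amr_fuse([0.2, 0.4, 0.9]))
    print(accept([0.2, 0.4, 0.9], threshold=0.6))
```

The rule needs no training, which is why it serves as a baseline against trained fusion classifiers such as a Support Vector Machine.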
The face is the most powerful biometric as far as the human recognition system is concerned, which is not the case for machine vision. Face recognition by machine is still incomplete owing to adverse, unconstrained environments. Of the several attempts made in the past few decades, subspace-based methods have appeared to be more accurate and robust. In the present proposal, a new subspace-based method is developed. It preserves...
Grid systems have emerged as a means of sharing computational resources and information. Providing services for accessing, sharing and modifying large databases is a crucial task for grid management systems. This paper proposes an artificial neural network (ANN) prediction mechanism that provides an enhancement to data replication solutions within grid systems. Current replication services often exhibit...
The purpose of this paper is to investigate heterogeneous multi-column ConvNets (MCCNN) and fusion methods for them. We first construct heterogeneous MCCNN by combining ConvNets with different structures. We then use different fusion methods to check their performances to find out the effect of fusion methods for MCCNN. We also propose a novel sliding window based fusion framework which defines a...
Classification is the task of identifying the class labels of records that are typically described by a set of features in a dataset. The paper describes a system that uses a set of data pre-processing activities, including Feature Selection and Discretization. Feature selection and dimension reduction are common data mining approaches for large datasets. Here the high data dimensionality...
This paper introduces a novel tree induction algorithm called sequential Random Forest (sRF) to improve the detection accuracy of a standard Random Forest classifier. Observations have shown that the overall performance of a forest is strongly influenced by the number of training samples. The main idea is to sequentially adapt the number of training samples per class so that each tree better complements...
In the field of image recognition, a high-dimensional feature vector is often used to construct a classifier. This presents a problem, however, since using a large number of features can slow down training and degrade model readability. To alleviate this problem, sequential backward selection (SBS) has come to be used as a method for selecting an effective number of features for classification. However,...
Vote count (VC) is a fast search algorithm originally designed for similarity search on large-scale data sets. VC can be efficiently implemented using a simple modification to Random Access Memory (RAM) or other memory structures such as NOR or NAND Flash memory, such that the search complexity reduces to O(1) regardless of the dimensionality of the data or the size of the data set. This paper proposes...
The amount of data in our society has been exploding in today's era of big data. In this paper, we address several open challenges of big data stream classification, including high volume, high velocity, high dimensionality, and high sparsity. Many existing studies in the data mining literature solve data stream classification tasks in a batch learning setting, which suffers from poor efficiency and...
As a derivative of the Restricted Boltzmann Machine (RBM), the classification RBM (Class RBM) has proved to be an effective classifier with a probabilistic interpretation. Several elegant learning methods/models related to Class RBM have been proposed. This paper proposes and analyzes a Rényi divergence based generalization of the discriminative learning objective of Class RBM. Specifically, we extend the Conditional...
Standard Symbolic Aggregation Approximation (SAX) is at the core of many effective time series data mining algorithms. Its combination with Bag-of-Patterns (BoP) has become the standard approach with state-of-the-art performance on standard datasets. However, standard SAX with the BoP representation might neglect internal temporal correlation embedded in the raw data. In this paper, we propose time...
The traditional k-NN classification rule predicts a label based on the most common label of the k nearest neighbors (the plurality rule). It is known that the plurality rule is optimal when the number of examples tends to infinity. In this paper we show that the plurality rule is sub-optimal when the number of labels is large and the number of examples is small. We propose a simple k-NN rule that...
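The traditional plurality rule described in the abstract above can be sketched as follows. This is a minimal illustration of the baseline only; the rule proposed in the paper is cut off in this listing and is not reproduced here. The function name `knn_plurality`, the use of Euclidean distance, and the toy data are assumptions made for the example.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence, Tuple

Example = Tuple[Sequence[float], Hashable]


def knn_plurality(train: List[Example], query: Sequence[float], k: int) -> Hashable:
    """Predict the most common label among the k nearest training examples.

    Assumption: Euclidean distance (math.dist, Python 3.8+). Ties in the vote
    go to the label seen first among the neighbours, i.e. the nearest one.
    """
    if k < 1 or k > len(train):
        raise ValueError("k must be between 1 and the number of training examples")
    # Sort training examples by distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical toy data: two clusters with labels "a" and "b".
    data = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
    print(knn_plurality(data, (0, 0.5), k=3))
```

With many labels and few examples, the k neighbours often carry k different labels, so the vote degenerates into a tie; this is the regime in which the abstract reports the plurality rule to be sub-optimal.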
Bolstered error estimation has been shown to perform better than cross-validation and competitively with bootstrap in small-sample settings. However, its performance can deteriorate in the high-dimensional settings prevalent in Genomic Signal Processing. We propose here a modification of Bolstered error estimation that is based on the principle of Naive Bayes. Rather than attempting to estimate a...
Statistical word alignment models need large amounts of training data and perform poorly on small corpora. This paper proposes a new unsupervised hybrid word alignment technique using an ensemble learning method. This algorithm uses three base alignment models in several rounds to generate alignments. The ensemble algorithm uses a weighted scheme for resampling training data and...