Noise robustness has long garnered much interest from researchers and practitioners of the automatic speech recognition (ASR) community due to its paramount importance to the success of ASR systems. This paper presents a novel approach to improving the noise robustness of speech features, building on top of the dictionary learning paradigm. To this end, we employ the K-SVD method and its variants...
Extractive speech summarization is intended to produce a condensed version of the original spoken document by selecting a few salient sentences from the document and concatenating them to form a summary. In this paper, we study a novel use of manifold learning techniques for extractive speech summarization. Manifold learning has experienced a surge of research interest in various domains concerned...
Because unprecedented volumes of multimedia data associated with spoken documents have been made available to the public, spoken document retrieval (SDR) has become an important research area in the past decades. Recently, representation learning has emerged as an active research topic in many machine learning applications owing largely to its excellent performance. In the context of natural language...
Representation learning has emerged as a newly active research subject in many machine learning applications because of its excellent performance. In the context of natural language processing, paragraph (or sentence and document) embedding learning is better suited to some tasks, such as information retrieval and document summarization. However, as far as we are aware, there is only a...
Extractive summarization systems attempt to automatically pick out representative sentences from a source text or spoken document and concatenate them into a concise summary so as to help people grasp salient information effectively and efficiently. Recent advances in applying nonnegative matrix factorization (NMF) on various tasks including summarization motivate us to extend this line of research...
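The abstract above mentions applying nonnegative matrix factorization (NMF) to summarization. A minimal sketch of the general idea, not the authors' specific method: factorize a term-by-sentence matrix into nonnegative topic factors with standard multiplicative updates, then score sentences by the weight of their dominant latent topic. The toy matrix `V` and the scoring rule are illustrative assumptions.

```python
import numpy as np

def nmf(V, rank, iters=200, seed=0):
    # Factorize V ~ W @ H with nonnegative factors via multiplicative updates
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + 1e-3
    H = rng.random((rank, n)) + 1e-3
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

# Toy term-by-sentence matrix: rows are terms, columns are sentences
V = np.array([[2.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [0.0, 3.0, 0.0]])
W, H = nmf(V, rank=2)

# One possible sentence score: weight of each sentence's dominant latent topic
scores = H.max(axis=0)
```

Sentences with the highest scores would be picked for the summary; real systems would use TF-IDF weights and a larger rank.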
Question answering (QA) is an important research issue in natural language processing, and most state-of-the-art question answering systems are based on statistical models. After witnessing recent achievements in artificial intelligence (AI), many businesses wish to apply those techniques to an automatic QA system that is capable of providing 24-hour customer services for their clients. However, one...
Extractive summarization aims at selecting a set of indicative sentences from a source document as a summary that can express the major theme of the document. A general consensus on extractive summarization is that both relevance and coverage are critical issues to address. The existing methods designed to model coverage can be characterized by either reducing redundancy or increasing diversity in...
Software vulnerabilities can be attributed to inherent bugs in the system. Several types of bugs introduce faults by failing to conform to system specifications, leading to failures such as crashes, hangs, and panics. In our work, we exploit security faults due to crash-type failures. It is difficult to reconstruct system failures after a program has crashed. Much research work has been focused on detecting program...
Extractive speech summarization refers to automatic selection of an indicative set of sentences from a spoken document so as to offer a concise digest covering the most salient aspects of the original document. The language modeling (LM) framework alongside the pseudo-relevance feedback (PRF) technique has emerged as a promising line of research for conducting extractive speech summarization in an...
Representation learning has emerged as a newly active research subject in many machine learning applications because of its excellent performance. As an instantiation, word embedding has been widely used in the natural language processing area. However, as far as we are aware, there are relatively few studies investigating paragraph embedding methods in extractive text or speech summarization. Extractive...
In the age of information explosion, efficiently categorizing the topic of a document can assist our organization and comprehension of the vast amount of text. In this paper, we propose a novel approach, named DKV, for document categorization using distributed real-valued vector representation of keywords learned from neural networks. Such a representation can project rich context information (or...
In this paper, we propose a novel approach for reader-emotion categorization using word embedding learned from neural networks and an SVM classifier. The primary objective of such word embedding methods involves learning continuous distributed vector representations of words through neural networks. It can capture semantic context and syntactic cues, and subsequently be used to infer similarity measures...
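The abstract above describes inferring similarity measures from word embeddings. A minimal illustration of that step, under assumptions of my own (the tiny two-dimensional `EMB` vectors are hypothetical stand-ins for learned embeddings): average the word vectors of a text into one document vector, then compare documents by cosine similarity, which is the usual measure for such representations.

```python
import math

# Hypothetical pre-trained word vectors; real systems learn these with neural networks
EMB = {
    "happy":  [0.9, 0.1],
    "joyful": [0.8, 0.2],
    "sad":    [0.1, 0.9],
}

def doc_vector(words):
    # Average the embeddings of known words into a single document vector
    vecs = [EMB[w] for w in words if w in EMB]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(a, b):
    # Cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

doc = doc_vector(["happy", "joyful"])
sim_pos = cosine(doc, EMB["happy"])   # similar emotion
sim_neg = cosine(doc, EMB["sad"])     # dissimilar emotion
```

In a pipeline like the one sketched in the abstract, such document vectors would then be fed to a classifier (e.g. an SVM) rather than compared directly.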
The task of extractive speech summarization is to select a set of salient sentences from an original spoken document and concatenate them to form a summary, facilitating users to better browse through and understand the content of the document. In this paper we present an empirical study of leveraging various supervised discriminative methods for effectively ranking important sentences of a spoken...
Extractive speech summarization, with the purpose of automatically selecting a set of representative sentences from a spoken document so as to concisely express the most important theme of the document, has been an active area of research and development. A recent school of thought is to employ the language modeling (LM) approach for important sentence selection, which has proven to be effective for...
Extractive speech summarization, aiming to automatically select an indicative set of sentences from a spoken document so as to concisely represent the most important aspects of the document, has become an active area for research and experimentation. An emerging stream of work is to employ the language modeling (LM) framework along with the Kullback-Leibler divergence measure for extractive speech...
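The abstract above pairs the language modeling framework with the Kullback-Leibler divergence measure. A common instantiation, sketched here under my own simplifying assumptions (unigram models, additive smoothing, toy tokenized sentences): score each sentence by KL(document || sentence) between smoothed unigram distributions, and rank sentences so that lower divergence means more representative.

```python
import math
from collections import Counter

def unigram_dist(words, vocab, smoothing=0.1):
    # Additively smoothed unigram distribution over a fixed vocabulary
    counts = Counter(words)
    total = len(words) + smoothing * len(vocab)
    return {w: (counts[w] + smoothing) / total for w in vocab}

def kl_divergence(p, q):
    # KL(p || q) over a shared vocabulary
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

def rank_sentences(doc_sentences):
    # Score each sentence by KL(document || sentence); lower is more representative
    all_words = [w for sent in doc_sentences for w in sent]
    vocab = set(all_words)
    doc_dist = unigram_dist(all_words, vocab)
    scores = []
    for i, sent in enumerate(doc_sentences):
        sent_dist = unigram_dist(sent, vocab)
        scores.append((kl_divergence(doc_dist, sent_dist), i))
    return sorted(scores)

sentences = [
    "the model ranks sentences by divergence".split(),
    "unrelated filler text here".split(),
    "the model selects salient sentences from the document".split(),
]
ranking = rank_sentences(sentences)
```

Sentences whose language model is closest to the document's would be concatenated into the summary, subject to a length budget.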
This paper considers training data selection for discriminative training of acoustic models for broadcast news speech recognition. Three novel data selection approaches were proposed. First, the average phone accuracy over all hypothesized word sequences in the word lattice of a training utterance was utilized for utterance-level data selection. Second, phone-level data selection based on the difference...
This paper considers minimum phone error (MPE) based discriminative training of acoustic models for Mandarin broadcast news recognition. A novel data selection approach based on the normalized frame-level entropy of Gaussian posterior probabilities obtained from the word lattice of the training utterance was explored. It has the merit of making the training algorithm focus much more on the training...