The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Audio Event Detection (AED) aims to recognize sounds within audio and video recordings. AED employs machine learning algorithms commonly trained and tested on annotated datasets. However, available datasets are limited in number of samples and hence it is difficult to model acoustic diversity. Therefore, we propose combining labeled audio from a dataset and unlabeled audio from the web to improve...
This paper addresses deep face recognition (FR) problem under open-set protocol, where ideal face features are expected to have smaller maximal intra-class distance than minimal inter-class distance under a suitably chosen metric space. However, few existing algorithms can effectively achieve this criterion. To this end, we propose the angular softmax (A-Softmax) loss that enables convolutional neural...
Polarity detection is a research topic of major interest, with many applications including detecting the polarity of product reviews. However, in some cases, the polarity of the product reviews might not be available while the polarity of the product itself might be, prohibiting the use of any form of fully supervised learning technique. This scenario, while different, is close to that of multiple...
We propose an autism spectrum disorder (ASD) prediction system based on machine learning techniques. Our work features the novel development and application of machine learning methods over traditional ASD evaluation protocols. Specifically, we are interested in discovering the latent patterns that possibly indicate the symptom of ASD underneath the observations of eye movement. A group of subjects...
This paper investigates a new privacy-preserving paradigm for the task of Query-by-Example Speech Search using Secure Binary Embeddings, a hashing method that converts vector data to bit strings through a combination of random projections followed by banded quantization. The proposed method allows performing spoken query search in an encrypted domain, by analyzing ciphered information computed from...
This paper describes an approach to the optimization of the nonlinear component of a physiologically motivated feature extraction system for automatic speech recognition. Most computational models of the peripheral auditory system include a sigmoidal nonlinear function that relates the log of signal intensity to output level, which we represent by a set of frequency dependent logistic functions. The...
Effective retrieval of multimodal data involves performing accurate segmentation and analysis of such data. With easy access to a number of audio and video sharing platforms online, user-generated content with considerably less than ideal recording conditions has increased rapidly. One major issue with such content is the presence of semantically irrelevant segments in such recordings. This leads...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.