In this paper, we propose to classify pathologic voices and, in particular, to discriminate between organic pathologies (polyp, edema and nodule) using new features. The principal contribution of this work is a new parameter that is more efficient than the classic MFCC. The MFCC are calculated not from the speech signal but from the speech multiscale...
Speech uttered by human beings contains information about speakers, languages and content. The language of uttered speech can be identified by extracting language-specific information from it. Identifying the language of speech is known as Language Identification (LID). Identifying the language of speech is helpful in its translation, speech recognition and speech activated automatic...
Recently, there has been an explosion of cloud-based services that enable developers to include a spectrum of recognition services, such as emotion recognition, in their applications. The recognition of emotions is a challenging problem, and research has been done on building classifiers to recognize emotion in the open world. Often, learned emotion models are trained on data sets that may not sufficiently...
The performance of speech emotion classifiers degrades greatly when the training conditions do not match the testing conditions. This problem is observed in cross-corpora evaluations, even when the corpora are similar. The lack of generalization is particularly problematic when the emotion classifiers are used in real applications. This study addresses this problem by combining active learning (AL)...
Accurate segmentation of retinal vessels plays an important role in the computer-aided diagnosis of eye diseases. Existing supervised methods extract features only from the green channel because its contrast between vessel and background is much higher than in the red and blue channels. However, the red and blue channels also contain useful information for distinguishing vessel from background. This work investigates...
The performance of a speaker verification system is severely degraded by spoofing attacks generated from artificial speech synthesizers. Recently, several approaches have been proposed for classifying natural and synthetic speech (spoof detection) which can be used in conjunction with a speaker verification system. In this paper, we attempt to develop a joint modelling approach which can detect the...
When emotion recognition systems are used in new domains, the classification performance usually drops due to mismatches between training and testing conditions. Annotating new data in the new domain is expensive and time-consuming. Therefore, it is important to design strategies that efficiently use a limited amount of new data to improve the robustness of the classification system. The use of...
In current studies, an extended subjective self-report method is generally used for measuring emotions. Even though it is commonly accepted that speech emotion perceived by the listener is close to the intended emotion conveyed by the speaker, research has indicated that there still remains a mismatch between them. In addition, individuals with different personalities generally have different...
Continuous prediction of dimensional emotions (e.g. arousal and valence) has attracted increasing research interest recently. When processing emotional speech signals, phonetic features have been rarely used due to the assumption that phonetic variability is a confounding factor that degrades emotion recognition/prediction performance. In this paper, instead of eliminating phonetic variability, we...
This paper addresses the problem of speech emotion recognition from movie audio tracks. The recently collected Acted Facial Expression in the Wild 5.0 database is used. The aim is to discriminate among angry, happy, and neutral. We extract a relatively small number of features, a subset of which is not commonly used for the emotion recognition task. Those features are fed as input to an ensemble classifier...
Discriminative least squares regression (DLSR) is a simple yet effective method for multi-class classification. One problem of DLSR is that it lacks robustness to outliers. To tackle this difficulty, we propose a novel Robust DLSR (RoDLSR) model. The core idea behind RoDLSR is to find and further ignore the outliers among the support vector set. Specifically, we modify...
Natural and affective handshakes of two participants define the course of dyadic interaction. Affective states of the participants are expected to be correlated with the nature of the dyadic interaction. In this paper, we extract two classes of the dyadic interaction based on temporal clustering of affective states. We use the k-means temporal clustering to define the interaction classes, and utilize...
A biometric system is a pattern recognition system that automatically identifies people according to their physiological and behavioral properties. Among the physiological properties, the hand has a special place because all of its features, such as palm lines, inner knuckles, external knuckles and geometry, can be used. More recently, the usage of blood vessels pattern in the palm, in addition to the high acceptability,...
We present a system for acoustic scene classification, which is the task of classifying an environment based on audio recordings. First, we describe a strong low-complexity baseline system using a compact feature set. Second, this system is improved with a novel class of audio features, which exploit the knowledge of sound behaviour within the scene - reverberation. This information is complementary...
We propose an image aesthetic quality assessment algorithm, which considers personal taste in addition to generally perceived preference. This problem is formulated by a combination of two different learning frameworks based on support vector machines—Support Vector Regression (SVR) and Ranking SVM (R-SVM), where SVR learns a general model based on public datasets and R-SVM adjusts the model to accommodate...
We address the problem of automatically recognizing artistic movement in digitized paintings. We make the following contributions: Firstly, we introduce a large digitized painting database that contains refined annotations of artistic movement. Secondly, we propose a new system for the automatic categorization that resorts to image descriptions by color structure and novel topographical features as...
In this paper we address the problem of human action recognition from video sequences. Inspired by the exemplary results obtained via automatic feature learning and deep learning approaches in computer vision, we focus our attention on learning salient spatial features via a convolutional neural network (CNN) and then map their temporal relationship with the aid of Long Short-Term Memory (LSTM)...
With the rise in air traffic flow across the world due to advancement in technology and developments in the field of aeronautical engineering, cases of emergency and panic situations on flights have also emerged at an exponential rate. Every single day, we hear of emergency situations in flights such as fires, birdstrikes, diversions, engine failures and emergency landings. Across the globe,...
Diagnosing liver disease is a challenging task for many public health physicians. In this study, we propose a framework to diagnose hepatitis. For this study, adaptive rule-based induction was formulated, and the adaptive rule was implemented in combined Robust BoxCox Transformation (RBCT) and Neural Network (NN) methods. The performance of the proposed model is compared and results are...
Identifying persons from their signatures, or verifying the genuineness of official documents such as bank cheques, certificates, contract forms and bonds, still remains a challenging task where accuracy and computation time are concerned. In this paper, a novel set of features based on the distribution of the quasi-straight line segments is presented for off-line signature verification. For the...