Automatic identification of individuals from their physical and behavioral traits, known as biometrics, is now needed in many fields, such as surveillance systems, access control systems, physical buildings, and many other applications. In this paper, we propose an efficient online personal identification system based on Multi-Spectral Palmprint (MSP) images using a Hidden Markov Model (HMM) and...
Application-specific intrusion detection methods are used to detect network intrusions targeted at applications. Normally such detection methods require payload or packet content analysis. One prominent method of payload modeling and analysis is sequence, or n-gram, modeling. Normally, n-grams generated from a packet are compared with a database of n-grams seen during the training phase. Depending on...
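The comparison step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's method: the function names (`byte_ngrams`, `train`, `unseen_fraction`), the n-gram length of 3, and the use of the unseen fraction as an anomaly score are all hypothetical choices, since the abstract is truncated before it says how the comparison result is used.

```python
def byte_ngrams(payload: bytes, n: int = 3) -> set:
    """Return the set of distinct byte n-grams in a payload."""
    return {payload[i:i + n] for i in range(len(payload) - n + 1)}


def train(payloads, n: int = 3) -> set:
    """Build the database of n-grams seen in (assumed benign) training payloads."""
    seen = set()
    for payload in payloads:
        seen |= byte_ngrams(payload, n)
    return seen


def unseen_fraction(payload: bytes, seen: set, n: int = 3) -> float:
    """Fraction of a payload's distinct n-grams absent from the training database.

    Using this fraction as an anomaly score is an assumption made for
    illustration; payloads shorter than n yield no n-grams and score 0.0.
    """
    grams = byte_ngrams(payload, n)
    if not grams:
        return 0.0
    return len(grams - seen) / len(grams)


if __name__ == "__main__":
    database = train([b"GET /index.html HTTP/1.1", b"GET /about.html HTTP/1.1"])
    print(unseen_fraction(b"GET /index.html HTTP/1.1", database))
    print(unseen_fraction(b"\x90\x90\x90\x90\x90\x90", database))
```

A payload made entirely of n-grams seen in training scores 0.0, while one sharing no n-grams with the training data scores 1.0. Any decision threshold between those extremes would have to be chosen empirically.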
This paper presents a robust and anticipative real-time gesture recognition system and its motion quality analysis module. By utilizing a motion capture device, the system recognizes gestures performed by a human, where the recognition process is based on skeleton analysis and motion feature computation. Gestures are collected from a single person. Skeleton joints are used to compute features which are stored...
Biometric authentication on devices such as smartphones and tablets has increased significantly in recent years. One of the most widely accepted and increasingly used traits is the handwritten signature, as it has been used in financial and legal agreements for over a century. Nowadays, signing on digitizing tablets is common in banking and commercial settings. For these reasons, it is necessary...
We have applied latent topic models to facial expression recognition. We showed that the latent topics learned from a topic model are very similar to the Action Units defined by psychologists in the Facial Action Coding System (FACS). Furthermore, we noted that the topics thus obtained may be correlated with each other, and we tried to model this with the correlated topic model (CTM). Preliminary results...
This paper presents the design and development of an Assamese text-to-speech (TTS) synthesis system. In particular, the work focused on designing language-specific rules, developing a quality database, segmenting data, and handling bilingual sound units. To date, no study has constructed grapheme-to-phoneme conversion rules for the Assamese language. In this work, grapheme-to-phoneme conversion rules...
Dealing with real-life databases often implies handling sets of heterogeneous variables. In this paper, we propose a methodology for exploring and analyzing such databases, with an application in the specific domain of healthcare data analytics. We thus propose a two-step heterogeneous finite mixture model, with a first step involving a joint mixture of Gaussian and multinomial distribution...
In this paper, we investigate how choosing keywords with or without plosive sounds affects an isolated speaker recognition system. For this study, a Turkish word database was created consisting of 48 words including plosives and 7 words without plosives. Recordings were acquired at a sampling frequency of 16 kHz in a sound-insulated professional recording studio. The...
The main purpose of this paper is to determine how well the anxiety/fear emotion can be differentiated. The analysis uses EmoDB, which contains seven emotions: happiness, fury, sadness, neutral tones, anxiety, boredom and disgust. We did not use the Romanian database SRoL because it does not currently include recordings of the anxiety state. The results are encouraging, the recognition...
The paper describes an experimental study on emotion recognition using a collection of emotional recordings from the SRoL corpus. Its goal is to study and obtain a simple tool that can be used to validate recordings when building large voice corpora. Such tools can assist or even replace human validation. In this study we used two classifiers, k-NN (k-Nearest Neighbors) and SVM...
Automatic analysis of semantic roles can be seen not only as one of the natural language processing steps in human-machine interfaces, but also as a tool to support linguistic analysis of written or spoken texts, which has many applications in education and in telecommunication services. No automatic semantic role labeling system exists for Slovak texts, mainly because of the...
The Support Vector Machine (SVM) is a statistical classification approach that has been successfully applied to various types of problems. However, it has remained largely unexplored for Arabic recognition. SVMs were originally designed for binary classification problems. For multi-class problems, several methods combine binary SVMs, while others solve the problem in one step. This...
We propose in this paper a simple, yet efficient multi-channel fusion framework for joint acoustic event detection and classification. The joint problem on individual channels is posed as a regression problem to estimate event onset and offset positions. As an intermediate result, we also obtain the posterior probabilities which measure the confidence that event onsets and offsets are present at a...
With the proliferation of digital cameras and mobile devices, people are taking many more photos than ever before. The explosive growth of personal photos leads to problems of photo organization and management. There is a growing need for tools to automatically manage photo collections. Recognizing events in photo collections is one efficient way to organize photos. The use of textual event labels...
Movie captions composed of pure text or a restricted selection of symbols, as produced today for movies and television, are not by themselves capable of conveying to the hearing impaired valuable information encoded in the soundtrack, namely dialogue tone, the liveliness of the music, and the atmosphere of background sounds. This drastically reduces the capacity of the hearing impaired to have a better understanding...
In this paper we compare two state-of-the-art speech synthesis techniques (corpus- and HMM-based) in terms of expressive speech synthesis. Two corpora were composed with different speaking styles (broadcast news and literature reading) from the same female speaker. Our aim was to determine to what extent the different technologies reproduce these styles. The corpora and the synthetic expressive speech...
Folder locking is one method used to ensure that nobody gains unauthorized access to your private and confidential information. Current password-based systems have a number of associated inconveniences and problems: users need to remember passwords, passwords can be guessed or broken by brute force, and there is also the problem of non-repudiation. Besides, the password authentication method...
Video-to-Text (V2T) fusion is an example of coordinating low-level information fusion (LLIF) with high-level information fusion (HLIF) through semantic descriptions of physical information. Using hard (e.g., video) and soft (i.e., text) data fusion affords Level 5 User Refinement of object characterization, target tracking, and situation assessment. Building on our previous video-to-text (V2T) Fusion2014...
Classification of speech signals is one of the most vital problems in speech perception and spoken word recognition. Although there have been many studies on the classification of speech signals, the results are still limited. In this paper, we propose an image-based approach for speech signal classification based on the combination of Local Naïve Bayes Nearest Neighbor (LNBNN) and Scale-invariant...
In this paper we propose synchronization rules between acoustic and visual laughter synthesis systems. Previous work has addressed acoustic and visual laughter synthesis separately, following an HMM-based approach. The need for synchronization rules comes from the constraint that in laughter, HMM-based synthesis cannot be performed using a unified system where common transcriptions may be used...