Search results

article

Dynamic Signature Verification System Based on One Real Signature

Moises Diaz, Andreas Fischer, Miguel A. Ferrer, Rejean Plamondon

IEEE Transactions on Cybernetics > 2018 > 48 > 1 > 228 - 239

The dynamic signature is a biometric trait widely used and accepted for verifying a person’s identity. Current automatic signature-based biometric systems typically require five, ten, or even more specimens of a person’s signature to learn intrapersonal variability sufficient to provide an accurate verification of the individual’s identity. To mitigate this drawback, this paper proposes a procedure...

chapter

Ivec-PLDA-AHC priors for VB-HMM speaker diarization system

Liang He, Xianhong Chen, Can Xu, Tianyu Liang, more

2017 IEEE International Workshop on Signal Processing Systems (SiPS) > 1 - 6

This paper proposes a hybrid speaker diarization system. The main body is a variational Bayes hidden Markov model (VB-HMM) speaker diarization system. The VB-HMM speaker diarization system avoids making premature hard decisions and takes advantage of soft speaker information in an iterative way. Thus, it outperforms most mainstream speaker diarization systems. Unfortunately, this system is sensitive...

chapter

A detail enhancement strategy for face sketch synthesis based on NSST

Weiguo Wan, Hyo Jong Lee

2017 International Conference on Information and Communication Technology Convergence (ICTC) > 784 - 788

Face sketch synthesis plays an important role in both law enforcement and digital entertainment. Existing methods for sketch synthesis always suffer from noise and blurring effects. To resolve these problems, a nonsubsampled Shearlet transform (NSST) based detail enhancement strategy is proposed. The exemplar-based method is first adopted to synthesize the primary sketch, then the final sketch...

chapter

Recognition and simulation of parachute action based on continuous hidden Markov model

Xuan Gong, Liang Han, Jiangyun Wang, Maopeng Ran

2017 Chinese Automation Congress (CAC) > 4108 - 4113

Building a human-computer interactive parachute simulator is an efficient way to avoid the high risk and high cost of field parachute training. In this paper, a novel dynamic recognition and simulation approach of parachute training is developed. First, we process the skeletal data acquired by Kinect and enforce the indication of the trainees' parachute posture, where principal component analysis...

chapter

Novel alignment method for DNN TTS training using HMM synthesis models

Sinisa Suzic, Tijana Delic, Darko Pekar, Vladimir Ostojic

2017 IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY) > 271 - 276

In order to train neural networks (NN) for text-to-speech synthesis (TTS), phonetic segmentation must be performed. The most accurate segmentation is performed manually, but the process of creating manual alignments is costly and time-consuming, so automatic procedures are preferable. In this paper, a simple alignment method based on models trained during hidden Markov Model (HMM) based TTS system...

chapter

Voice transformation using pitch and spectral mapping

Anisha Yathigiri, Meenalatha Bathula, Susmitha Kothapalli, Susmitha Vekkot, more

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1540 - 1544

This paper provides a voice transformation model that uses pitch data and Feed-forward Neural Networks on Line Spectral Frequency. The aim of this work is to achieve the transformation of a speech signal produced by a source speaker by modifying voice individuality parameters such that it appears to be spoken by a chosen target speaker, without modifying the message contents. Most of the previous...

chapter

Raga identification using Locality Sensitive Hashing

G Padmasundari, Hema A Murthy

2017 Twenty-third National Conference on Communications (NCC) > 1 - 6

Rāga is a quintessential component of Indian classical music. Rāgas are primarily characterised by melodic time-frequency (T-F) motifs. There have been several efforts made to determine the identity of a rāga, yet the techniques work only on subset(s) of rāgas, or perform poorly in terms of scalability. In this paper, we propose a rāga identification method for Carnatic music using Locality Sensitive...

chapter

Implicit language identification system based on random forest and support vector machine for speech

Manish Gupta, Shambhu Shankar Bharti, Suneeta Agarwal

2017 4th International Conference on Power, Control & Embedded Systems (ICPCES) > 1 - 6

Speech uttered by human beings contains information about speakers, languages and contents. The language of uttered speech can easily be identified by extracting the language-specific information from it. Identification of the language of speech is known as Language Identification (LID). Identification of language from speech is helpful in its translation, speech recognition and speech activated automatic...

chapter

A novel face recognition method based on one state of discrete Hidden Markov Model

Hameed R. Farhan, Mahmuod H. Al-Muifraje, Thamir R. Saeed

2017 Annual Conference on New Trends in Information & Communications Technology Applications (NTICT) > 252 - 257

For about twenty years, research on the number of states in the Hidden Markov Model (HMM) has mainly aimed at increasing it in order to ensure the robustness of the face recognition system. In this paper, a novel face recognition method is presented based on a one-state discrete HMM, which seemed impossible in the past. Contrary to other approaches that use the three parameters...

chapter

Lyric recognition in monophonic singing using pitch-dependent DNN

Dairoku Kawai, Kazumasa Yamamoto, Seiichi Nakagawa

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 326 - 330

One of the difficulties in sung speech recognition is the small distance in an acoustic space between phonemes in sung speech. Therefore we considered clustering the speech based on a pitch (fundamental frequency F0) and creating a larger distance between the phonemes. In addition, we considered a two-stage training method of DNN-HMM: the first stage is trained by using conventional acoustic features...

chapter

Exploiting sequence information for text-dependent Speaker Verification

Subhadeep Dey, Petr Motlicek, Srikanth Madikeri, Marc Ferras

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5370 - 5374

Model-based approaches to Speaker Verification (SV), such as Joint Factor Analysis (JFA), i-vector and relevance Maximum-a-Posteriori (MAP), have been shown to provide state-of-the-art performance for text-dependent systems with fixed phrases. The performance of i-vector and JFA models has been further enhanced by estimating posteriors from a Deep Neural Network (DNN) instead of a Gaussian Mixture Model (GMM)...

chapter

Melody extraction and detection through LSTM-RNN with harmonic sum loss

Hyunsin Park, Chang D. Yoo

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2766 - 2770

This paper proposes a long short-term memory recurrent neural network (LSTM-RNN) for extracting melody and simultaneously detecting regions of melody from polyphonic audio using the proposed harmonic sum loss. The previous state-of-the-art algorithms have not been based on machine learning techniques and certainly not on deep architectures. The harmonics structure in melody is incorporated in the...

chapter

On the impact of non-modal phonation on phonological features

Milos Cernak, Elmar Noth, Frank Rudzicz, Heidi Christensen, more

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5090 - 5094

Different modes of vibration of the vocal folds contribute significantly to the voice quality. The neutral mode phonation, often used in a modal voice, is one against which the other modes can be contrastively described, also called non-modal phonations. This paper investigates the impact of non-modal phonation on phonological posteriors, the probabilities of phonological features inferred from the...

chapter

Learning and inferring human actions with temporal pyramid features based on conditional random fields

Shih-Yao Lin, Yen-Yu Lin, Chu-Song Chen, Yi-Ping Hung

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 2617 - 2621

Finding an effective way to represent human actions is still an open problem because it usually requires taking into account evidence extracted from various temporal resolutions. A conventional way of representing an action employs temporally ordered fine-grained movements, e.g., key poses or subtle motions. Many existing approaches model actions by directly learning the transitional relationships...

chapter

Cascading BLSTM networks for handwritten word recognition

Bruno Stuner, Clement Chatelain, Thierry Paquet

2016 23rd International Conference on Pattern Recognition (ICPR) > 3416 - 3421

Handwritten word recognition is a tough task, mixing image and natural language processing. Recently new recurrent neural networks with LSTM cells allowed significant improvements in this field. These networks are generally coupled with lexical and linguistic knowledge in order to correct character misrecognitions, namely using a lexicon driven decoding. Yet the high performances of LSTM networks...

chapter

Combining Analytical and Holistic Strategies for Handwriting Recognition

Hesham M. Eraqi, Sherif Abdelazeem, Mohsen A. A. Rashwan

2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA) > 993 - 997

In this paper, a study is conducted on combining analytical and holistic strategies for handwriting recognition. Even though the large majority of recent high-recognition-rate systems adopt analytical strategies, physiological scientists suggest that the holistic strategy is the key to realizing near-human performance. In what we believe is a fresh perspective on handwriting recognition, combining...

chapter

Influence of corpus size and content on the perceptual quality of a unit selection MaryTTS voice

Florian Hinterleitner, Benjamin Weiss, Sebastian Moller

2016 IEEE Spoken Language Technology Workshop (SLT) > 680 - 685

State-of-the-art approaches on text-to-speech (TTS) synthesis like unit selection and HMM synthesis are data-driven. Therefore, they use a prerecorded speech corpus of natural speech to build a voice. This paper investigates the influence of the size of the speech corpus on five different perceptual quality dimensions. Six German unit selection voices were created based on subsets of different sizes...

chapter

Entropy-based pruning of hidden units to reduce DNN parameters

Gautam Mantena, Khe Chai Sim

2016 IEEE Spoken Language Technology Workshop (SLT) > 672 - 679

For acoustic modeling, the use of DNN has become popular due to its superior performance improvements observed in many automatic speech recognition (ASR) tasks. Typically, DNNs with deep (many layers) and wide (many hidden units per layer) architectures are chosen in order to achieve good gains. An issue with such approaches is that there is an explosion in the number of learnable parameters. Thus,...

chapter

Boosting performance on low-resource languages by standard corpora: An analysis

Frantisek Grezl, Martin Karafiat

2016 IEEE Spoken Language Technology Workshop (SLT) > 629 - 636

In this paper, we analyze the feasibility of using a single well-resourced language, English, as a source language for multilingual techniques in the context of a Stacked Bottle-Neck tandem system. The effect of the amount of data and the number of tied-states in the source language on the performance of the ported system is evaluated together with different porting strategies. Generally, increasing data amount and level-of-detail...

chapter

Code-switching detection using multilingual DNNs

Emre Yilmaz, Henk van den Heuvel, David van Leeuwen

2016 IEEE Spoken Language Technology Workshop (SLT) > 610 - 616

Automatic speech recognition (ASR) of code-switching speech requires careful handling of unexpected language switches that may occur in a single utterance. In this paper, we investigate the feasibility of using multilingually trained deep neural networks (DNN) for the ASR of Frisian speech containing code-switches to Dutch with the aim of building a robust recognizer that can handle this phenomenon...

INFONA - science communication portal
