Search results

Items from 1 to 20 out of 135 results

chapter

Rap music video generator: Write a script to make your rap music video with synthesized voice and CG animation

Masaki Hayashi, Steven Bachelder, Masayuki Nakajima, Yoshiaki Shishikui

2017 IEEE 6th Global Conference on Consumer Electronics (GCCE) > 1 - 2

We have made an application for creating a rap music video with CG animation by writing a simple script. Aquestalk and TVML (TV program Making Language) are used for synthesized voice and real-time CG generation, respectively. A user can easily make a rap music video by writing speech texts and character movements in the script, in time with the music beat.

chapter

Towards a Breakthrough Speaker Identification Approach for Law Enforcement Agencies: SIIP

Khaled Khelif, Yann Mombrun, Gerhard Backfried, Farhan Sahito, et al.

2017 European Intelligence and Security Informatics Conference (EISIC) > 32 - 39

This paper describes SIIP (Speaker Identification Integrated Project), a high-performance, innovative and sustainable Speaker Identification (SID) solution running over a large database of voice samples. The solution is based on the development, integration and fusion of a series of speech analytic algorithms, which include speaker model recognition, gender identification, age identification, language and accent...

chapter

Voice control for smart home automation: Evaluation of approaches and possible architectures

Tatjana Eric, Sandra Ivanovic, Suncica Milivojsa, Milica Matic, et al.

2017 IEEE 7th International Conference on Consumer Electronics - Berlin (ICCE-Berlin) > 140 - 142

In this paper, we explore the possibility of using existing voice recognition tools to add a voice control interface to an existing smart home automation system. The choice of the voice recognition engine influences the architecture of the voice command interface and determines its performance. We discuss the possible architectures of voice-enabled smart home automation systems....

chapter

Digital corpus of santali language

Amir Khusru Akhtar, Gadadhar Sahoo, Mohit Kumar

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 934 - 938

In corpus preparation we do part-of-speech (POS) tagging, where we add POS information to the corpus in the form of tags. The POS information contains a number of tags such as noun, pronoun, verb, adjective, adverb, preposition, conjunction etc. The literature shows a lack of corpora for the Santali language. In this paper we have created and described a Santali language corpus using Sketch Engine corpus...

chapter

Automatic language identification for seven Indian languages using higher level features

Chithra Madhu, Anu George, Leena Mary

2017 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES) > 1 - 6

This paper proposes an approach for automatic language identification (LID) for seven Indian languages. The proposed system uses language-dependent phonotactic features and prosodic information. A Phonetic Engine (PE), which serves as the front end of the phonotactic-based LID system, converts the input speech utterance to a sequence of phonetic symbols. Syllable boundaries are detected and phones within a...

chapter

Semantics driven intelligent front-end

Tamas Gergely, Edit Halmay, Miklos Szots, George Suciu, et al.

2017 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) > 1 - 8

This paper presents the work done in the context of the Speech2Process project on a Speech Dialogue System applied in call centers, specifically in the banking domain. In our proposed solution, the client communicates with the system in natural language sentences, which are automatically recognized and semantically analysed. The paper describes innovative features of the selected approach, which...

chapter

An algorithm for the better assessment of machine translation

Pooja Malik, Anurag Singh Baghel

2017 International Conference on Computing, Communication and Automation (ICCCA) > 395 - 399

Machine Translation, sometimes referred to by the acronym MT, is an important field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. At its basic level, MT performs simple substitution of atomic words in one natural language for words in another language. Around the world, numerous systems are...

chapter

The phoneme set influence for lithuanian speech commands recognition accuracy

Mindaugas Greibus, Zivile Ringeliene, Laimutis Telksnys

2017 Open Conference of Electrical, Electronic and Information Sciences (eStream) > 1 - 4

The influence of the phoneme set on Lithuanian speech command recognition accuracy is investigated. Four phoneme sets are discussed. The LIEPA speech corpus is used for training the Acoustic Model. The phonetic representation of corpus transcriptions is generated by grapheme-to-phoneme transformation rules. Rule-based transformations for the Lithuanian language are proposed. Recognition engine with CMU Pocketsphinx...

chapter

Improving speaker verification performance under spoofing attacks by fusion of different operational modes

Saeid Safavi, Hock Gan, Iosif Mporas

2017 IEEE 13th International Colloquium on Signal Processing & its Applications (CSPA) > 219 - 223

In this paper, we propose a methodology for the fusion of different modes of speaker verification (SV) operation (fixed-passphrase, text-dependent and text-independent mode), using regression fusion models. The experimental results with and without spoofing attack conditions and using different single mode speaker verification engines, GMM-UBM, HMM-UBM and i-vector, indicated improvement in all the...

chapter

Image text to speech conversion in the desired language by translating with Raspberry Pi

H Rithika, B. Nithya Santhoshi

2016 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) > 1 - 4

The main problem in communication is language bias between the communicators. This device can be used by people who do not know English and want English text translated into their native language. The novel component of this research work is the speech output, which is available in 53 different languages translated from English. This paper is based on a prototype which helps the user to hear the...

chapter

Fraud Detection in Voice-Based Identity Authentication Applications and Services

Saeid Safavi, Hock Gan, Iosif Mporas, Reza Sotudeh

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 1074 - 1081

Keeping track of the multiple passwords, PINs, memorable dates and other authentication details needed to gain remote access to accounts is one of modern life's less appealing challenges. The employment of a voice-based verification as a biometric technology for both children and adults could be a good replacement to the old fashioned memory dependent procedure. Using voice for authentication could be...

article

Microphone Array Processing Strategies for Distant-Based Automatic Speech Recognition

Soudeh A. Khoubrouy, John H. L. Hansen

IEEE Signal Processing Letters > 2016 > 23 > 10 > 1344 - 1348

Robust distant speech recognition (DSR) is necessary in many speech technology applications using multiple microphones but has received only limited treatment in the literature. In this paper, we work on communicating with a vehicle voice-controlled system, which is one of the applications of DSR. Two approaches for DSR are i) signal-level combination using beamforming followed by automatic speech recognition...

chapter

New birth of the Arabic phonetic dictionary

Mohamed Labidi, Mohsen Maraoui, Mounir Zrigui

2016 International Conference on Engineering & MIS (ICEMIS) > 1 - 9

The creation of a robust system for speech recognition requires great effort and resources. One of the crucial elements in the creation of such systems is the phonetic dictionary. The creation of such a dictionary requires expert phonologists and linguists to create the best possible dictionary. But Arabic phonologists have two theories about the Arabic phonemes. The first one says that vowels...

chapter

Development of multilingual phonetic engine for four Indian languages

Lincy Babykutty, Anu George, Leena Mary

2016 International Conference on Next Generation Intelligent Systems (ICNGIS) > 1 - 3

A Phonetic Engine (PE) is a system that is used to determine the sequence of phones in a spoken utterance. In order to transcribe the speech database, the International Phonetic Alphabet (IPA) is used. This work focuses on developing a multilingual PE for four Indian languages, namely Bengali, Hindi, Urdu and Telugu. The number of languages can be increased to any number. For developing the PE, read speech...

chapter

Deep neural networks for kannada phoneme recognition

R Pradeep, K. Sreenivasa Rao

2016 Ninth International Conference on Contemporary Computing (IC3) > 1 - 6

Deep neural network (DNN) based speech recognizers have recently replaced Gaussian Mixture Model (GMM) based systems as the state of the art. Developing a phonetic engine and enhancing its performance can lead to significant improvement in Automatic Speech Recognition (ASR). However, little work has been reported on developing a phonetic engine for a large-vocabulary Kannada speech corpus. In this...

chapter

A cloud-based framework for Thai Large Vocabulary Speech Recognition

Sila Chunwijitra, Chanchai Junlouchai, Kamthorn Krairaksa, Vataya Chunwijitra, et al.

2016 13th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON) > 1 - 6

This paper presents an improvement of a distributed Thai speech recognizer (SR). Two main objectives of the improvement are investigated: 1) the response time in terms of the real-time factor (RTF), and 2) the cloud computing deployment. The proposed framework adapts and migrates the baseline collaborative DSR system to the Docker platform. Multiple containers share system resources such as CPU, memory,...

chapter

A comprehensive text analysis for Bengali TTS using unicode

Sheikh Abujar, Mahmudul Hasan

2016 5th International Conference on Informatics, Electronics and Vision (ICIEV) > 547 - 551

Communication is a very natural characteristic of every creature. Sometimes we use different symbols or many formed languages to communicate with each other. Every language we use supports both oral and text communication. Writing symbols is a way to express our intentions using any physical material. As we have oral communication capability too which we could use exactly as we want to speak...

chapter

A 98.6μW acoustic signal processor for fully-implantable cochlear implants

Hao-Min Liu, Yung-Jen Lin, Yu-Chi Lee, Cheng-Yen Lee, et al.

2016 International Symposium on VLSI Design, Automation and Test (VLSI-DAT) > 1 - 4

This paper presents a low-power acoustic signal processor for fully-implantable cochlear implants. The developed processor supports adaptive beamforming, frequency-domain analysis, envelope detection, channel combination, and magnitude compression. Power and area are minimized by leveraging dedicated real-valued FFT, register count minimization, data allocation optimization, hardware complexity reduction,...

chapter

Analysis of long-term and large-scale experiments on robot dialogues using a cloud robotics platform

Komei Sugiura, Koji Zettsu

2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI) > 525 - 526

To build conversational robots, roboticists are required to have deep knowledge of both robotics and spoken dialogue systems. Unlike stand-alone speech recognition/synthesis toolkits, a cloud robotics platform for human-robot communication enables high-quality speech recognition and synthesis that is optimized for human-robot interactions. This is challenging because we need to build a wide...

chapter

Detection and Reduction of High Frequency Non-stationary Vehicular Engine Noise Using Single Microphone

Phani Kumar Nyshadham, Niranjan Avadhanam, Sreekanth Nakkala, D.R. Shivakumar

2016 IEEE 6th International Conference on Advanced Computing (IACC) > 376 - 381

In mobile speech communications, noise interfering with speech at one end degrades the intelligibility of speech at the other end. In this paper, we focus on suppressing noise produced by vehicular and automobile mechanical engines, in whose presence the intelligibility of speech deteriorates when it is transmitted to the other end or recorded. We propose a method using...

Data set: ieee
Keywords: ENGINES, SPEECH

Content availability

Available (134)
None (1)

Publication type

book (128)
article (7)

Keywords

SPEECH RECOGNITION (75)
HIDDEN MARKOV MODELS (38)
ACOUSTICS (24)
DATABASES (19)
SERVERS (16)
DATA MINING (15)
SPEECH PROCESSING (15)
CONTEXT (14)
SPEECH SYNTHESIS (14)
FEATURE EXTRACTION (13)
COMPUTERS (11)
TRAINING (11)
ACCURACY (9)
HUMANS (9)
NATURAL LANGUAGE PROCESSING (9)
GRAMMAR (8)
SEMANTICS (8)
ALGORITHM DESIGN AND ANALYSIS (7)
GAMES (7)
NATURAL LANGUAGES (7)
SOFTWARE (7)
SPEECH-BASED USER INTERFACES (7)
TEXT ANALYSIS (7)
ARTIFICIAL NEURAL NETWORKS (6)
COMPUTER ARCHITECTURE (6)
DICTIONARIES (6)
INTERNET (6)
ROBOTS (6)
SPEAKER RECOGNITION (6)
VOCABULARY (6)
ERROR ANALYSIS (5)
MOBILE COMMUNICATION (5)
PRAGMATICS (5)
REAL TIME SYSTEMS (5)
TESTING (5)
VISUALIZATION (5)
ADAPTATION MODEL (4)
ADAPTATION MODELS (4)
ARTIFICIAL INTELLIGENCE (4)
AUTOMATIC SPEECH RECOGNITION (4)
CLOUD COMPUTING (4)
COMPUTATIONAL MODELING (4)
CONFERENCES (4)
EDUCATIONAL INSTITUTIONS (4)
EMOTION RECOGNITION (4)
INTERACTIVE SYSTEMS (4)
KNOWLEDGE REPRESENTATION (4)
LANGUAGE TRANSLATION (4)
LOGIC GATES (4)
MONITORING (4)
SEARCH ENGINES (4)
SIGNAL PROCESSING ALGORITHMS (4)
TEXT TO SPEECH (4)
TEXT-TO-SPEECH (4)
THREE DIMENSIONAL DISPLAYS (4)
TIME FACTORS (4)
WRITING (4)
AUDITORY SYSTEM (3)
CAMERAS (3)
CLIENT-SERVER SYSTEMS (3)
COMPUTER GAMES (3)
CORRELATION (3)
DISTANCE MEASUREMENT (3)
DISTRIBUTED SPEECH RECOGNITION (3)
EDUCATION (3)
ENCODING (3)
ENTROPY (3)
GAUSSIAN DISTRIBUTION (3)
GOOGLE (3)
GRAPHICAL USER INTERFACES (3)
HUMAN COMPUTER INTERACTION (3)
INDEXING (3)
INFORMATION RETRIEVAL (3)
MACHINE LEARNING (3)
MEL FREQUENCY CEPSTRAL COEFFICIENT (3)
MOBILE HANDSETS (3)
NATURAL LANGUAGE (3)
NAVIGATION (3)
NIST (3)
OPTICAL CHARACTER RECOGNITION SOFTWARE (3)
PATTERN RECOGNITION (3)
PERFORMANCE EVALUATION (3)
PROTOCOLS (3)
RANDOM ACCESS MEMORY (3)
RESOURCE MANAGEMENT (3)
SIGNAL PROCESSING (3)
SMART PHONES (3)
SOFTWARE ARCHITECTURE (3)
SURVEILLANCE (3)
SYNTHESIZERS (3)
USABILITY (3)
USER INTERFACES (3)
VOICEXML (3)
WAVELET TRANSFORMS (3)
XML (3)
ACOUSTIC MEASUREMENTS (2)
AGRICULTURE (2)
ANDROID (2)

INFONA - science communication portal
