Search results for: Lijuan Wang

Items from 1 to 7 out of 7 results

chapter

Photo-real talking head with deep bidirectional LSTM

Bo Fan, Lijuan Wang, Frank K. Soong, Lei Xie

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4884 - 4888

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Long short-term memory (LSTM) is a specific recurrent neural network (RNN) architecture that is designed to model temporal sequences and their long-range dependencies more accurately than conventional RNNs. In this paper, we propose to use deep bidirectional LSTM (BLSTM) for audio/visual modeling in our photo-real talking head system. An audio/visual database of a subject's talking is firstly recorded...

chapter

Improved minimum converted trajectory error training for real-time speech-to-lips conversion

Wei Han, Lijuan Wang, Frank Soong, Bo Yuan

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4513 - 4516

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

Gaussian mixture model (GMM) based speech-to-lips conversion often operates in two alternative ways: batch conversion and sliding window-based conversion for real-time processing. Previously, Minimum Converted Trajectory Error (MCTE) training has been proposed to improve the performance of batch conversion. In this paper, we extend previous work and propose a new training criteria, MCTE for Real-time...

chapter

High quality lip-sync animation for 3D photo-realistic talking head

Lijuan Wang, Wei Han, Frank K. Soong

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4529 - 4532

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

We propose a new 3D photo-realistic talking head with high quality, lip-sync animation. It extends our prior high-quality 2D photo-realistic talking head to 3D. An a/v recording of a person speaking a set of prompted sentences with good phonetic coverage for ∼20-minutes is first made. We then use a 2D-to-3D reconstruction algorithm to automatically adapt a general 3D head mesh model to the person...

article

Computer-Assisted Audiovisual Language Learning

Lijuan Wang, Yao Qian, Matthew Scott, Gang Chen, more

Computer > 2012 > 45 > 6 > 38 - 47

Advances in speech-processing technology have enabled novel ways to learn a foreign language online. With Engkoo, researchers in China are working to turn any computer into a language learning assistant and make searching a language easier. The featured Web extra at http://youtu.be/_VHDMAKKLKo is a video discussion titled “Computer-Assisted Audiovisual Language Learning” that demonstrates Engkoo,...

chapter

Synthesizing visual speech trajectory with minimum generation error

Lijuan Wang, Yi-Jian Wu, Xiaodan Zhuang, Frank K. Soong

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4580 - 4583

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In this paper, we propose a minimum generation error (MGE) training method to refine the audio-visual HMM to improve visual speech trajectory synthesis. Compared with the traditional maximum likelihood (ML) estimation, the proposed MGE training explicitly optimizes the quality of generated visual speech trajectory, where the audio-visual HMM modeling is jointly refined by using a heuristic method...

chapter

Rendering a personalized photo-real talking head from short video footage

Lijuan Wang, Wei Han, Xiaojun Qian, Frank K Soong

2010 7th International Symposium on Chinese Spoken Language Processing > 129 - 134

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

In this paper, we propose an HMM trajectory-guided, real image sample concatenation approach to photo-real talking head synthesis. An audio-visual database of a person is recorded first for training a statistical Hidden Markov Model (HMM) of Lips movement. The HMM is then used to generate the dynamic trajectory of lips movement for given speech signals in the maximum probability sense. The generated...

chapter

Reinterpretations of visual cognition from view of decomposition and synthesis

Songshan Xiao, Lijuan Wang

2010 3rd International Congress on Image and Signal Processing > 6 > 2828 - 2829

3rd International Congress on Image and Signal Processing (CISP 2010)

A special visual phenomenon is found. Based on Spatial frequency characteristics, the visual phenomenon is analyzed by decompositing and synthesizing spatial frequency of origin image to testify the existing of visual spatial frequency channel.

Filter options

Keywords:
VISUALIZATION

Publication date

Set your own date range

Publication type

book (6)
article (1)

INFONA - science communication portal

Search results for: Lijuan Wang

Photo-real talking head with deep bidirectional LSTM

Improved minimum converted trajectory error training for real-time speech-to-lips conversion

High quality lip-sync animation for 3D photo-realistic talking head

Computer-Assisted Audiovisual Language Learning

Synthesizing visual speech trajectory with minimum generation error

Rendering a personalized photo-real talking head from short video footage

Reinterpretations of visual cognition from view of decomposition and synthesis

Add recipient

Sending message cancelled

Are you sure you want to cancel sending this message?

Send message

Filter options

Publication date

Date range setting

Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.

Publication type

Keywords

Reporting an error / abuse

Sending the report failed

Accessibility options