2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

book

Asia Pacific Signal and Information Processing Association

chapter

The alpha-trimming mean filter for Video stabilization

Jinju Lim, Min-Cheol Hong

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

This paper proposes a video stabilization technique based on undesired motion detection and an alpha-trimming mean filter. The proposed method consists of two steps: detecting undesired motions and filtering them. The limitation on undesired motions is defined using local motion information. The alpha of the alpha-trimming mean filter is controlled based on this limitation, so that regenerated...
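
As a rough illustration of the filter named in this abstract, the sketch below applies a textbook alpha-trimmed mean to a 1-D motion trajectory. It is not the authors' method: the alpha here is fixed, whereas the paper controls it from the undesired-motion limitation, and the names `alpha_trimmed_mean`, `smooth_motion`, and `window` are hypothetical.

```python
def alpha_trimmed_mean(values, alpha):
    """Mean of `values` after dropping the floor(alpha * n) smallest
    and the floor(alpha * n) largest samples (0 <= alpha < 0.5)."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(alpha * len(ordered))          # samples trimmed from each end
    kept = ordered[k:len(ordered) - k]     # never empty because alpha < 0.5
    return sum(kept) / len(kept)


def smooth_motion(motion, window, alpha):
    """Slide a window over a 1-D motion signal and replace each sample
    with the alpha-trimmed mean of its neighbourhood (assumed fixed alpha)."""
    half = window // 2
    smoothed = []
    for i in range(len(motion)):
        lo = max(0, i - half)
        hi = min(len(motion), i + half + 1)
        smoothed.append(alpha_trimmed_mean(motion[lo:hi], alpha))
    return smoothed
```

With alpha = 0 the filter reduces to a plain moving average; as alpha approaches 0.5 it behaves like a median filter, which is why a single jolt in the trajectory can be suppressed without blurring the rest of the motion.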

chapter

Sparse spatial filtering in frequency domain of multi-channel EEG for frequency and phase detection

Naoki Morikawa, Toshihisa Tanaka

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 7

A brain-computer interface (BCI) based on steady-state visual evoked potentials (SSVEPs) is one of the most practical BCIs because of its high recognition accuracy and short training time. To increase the number of commands of an SSVEP-based BCI, a frequency and phase mixed-coded SSVEP BCI has recently been proposed. However, in order to detect frequency and phase of SSVEPs accurately, it is required to...

chapter

Steering behavior model of drivers on driving simulator through visual information

Tomohito Suzaki, Takatomi Kubo, Toshihiro Hiraoka, Yuto Nakagawa, et al.

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

A driver is regarded as a system that receives visual information and that controls the steering wheel. To identify the system, we conducted experiments to get input-output data using a driving simulator and confirmed that the focus of expansion of optical flow has sufficient information to predict steering behaviors.

chapter

Thinning deep neural networks for sketch recognition

Pyunghwan Ahn, Dong Hoon Shin, Junmo Kim

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

In this paper, we propose that deep neural networks that are thinner than typical image recognition networks can perform sketch recognition effectively. As in other computer vision problems, convolutional neural networks (CNNs) outperform other feature extraction methods significantly in sketch recognition. To date, two CNN structures have been proposed for sketch recognition, as described in [4]...

chapter

Audio signal separation using supervised NMF with time-variant all-pole-model-based basis deformation

Hiroaki Nakajima, Daichi Kitamura, Norihiro Takamune, Shoichi Koyama, et al.

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 7

We address a novel nonnegative matrix factorization (NMF) with a new basis deformation method to handle various music sounds. Conventional supervised NMF has a critical problem that a mismatch between bases trained in advance and an actual target sound reduces the accuracy of separation. To solve this problem, we proposed an advanced supervised NMF that applies a single time-invariant filter to the...
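
For orientation, the sketch below shows only the conventional supervised NMF baseline that this abstract refers to: the bases W are trained in advance and held fixed, and only the activations H are fitted with the standard multiplicative update for the squared Euclidean cost. The basis deformation proposed in the paper is not reproduced, and all function names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def squared_error(v, w, h):
    """Squared Euclidean distance between V and its approximation W H."""
    approx = matmul(w, h)
    return sum((v[i][j] - approx[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def fit_activations(v, w, iterations=200, eps=1e-12):
    """Estimate nonnegative activations H with V ~= W H while W stays fixed.

    Uses the multiplicative rule H <- H * (W^T V) / (W^T W H), which keeps
    H nonnegative when V, W, and the initial H are nonnegative.
    """
    wt = transpose(w)
    wtv = matmul(wt, v)
    wtw = matmul(wt, w)
    h = [[1.0] * len(v[0]) for _ in range(len(w[0]))]   # all-ones start
    for _ in range(iterations):
        wtwh = matmul(wtw, h)
        h = [[h[k][j] * wtv[k][j] / (wtwh[k][j] + eps)
              for j in range(len(h[0]))] for k in range(len(h))]
    return h
```

Because W never changes in this baseline, any mismatch between the trained bases and the actual target sound stays in the residual, which is the problem the abstract describes.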

chapter

Automatic heart and lung sounds classification using convolutional neural networks

Qiyu Chen, Weibin Zhang, Xiang Tian, Xiaoxue Zhang, et al.

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

In this paper, we study the effectiveness of using convolutional neural networks (CNNs) to automatically detect abnormal heart and lung sounds and classify them into different classes. Heart and respiratory diseases have been affecting humankind for a long time. An effective and automatic diagnostic method is highly attractive since it can help discover potential threats at an early stage, even at...

chapter

Novel self-portrait enhancement via multi-photo fusing

Sifeng Xia, Shuai Yang, Jiaying Liu, Zongming Guo

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

In this paper, we present a novel multi-photo-based framework to solve a self-portrait enhancement problem we call the “1+2 problem”, in which a self-portrait photo is enhanced with the help of two additional photos that share the same scene and similar shooting time. The key idea is to exploit the extra information of these two photos to overcome the limited field of view and poor illumination of the target...

chapter

General expansion-shifting model for reversible data hiding

Xiaolong Li, Zongming Guo

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

Reversible data hiding (RDH) is a specific information hiding technique in which both the embedded data and the original cover medium can be exactly extracted from the marked data. In this paper, we present a general expansion-shifting model for RDH by introducing the so-called reversible embedding function (REF) which maps each point of Zⁿ to a nonempty subset of Zⁿ. Moreover, to guarantee the reversibility,...
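
To make the reversibility property concrete, the sketch below implements classic difference expansion on a pair of integers, a well-known expansion-based RDH scheme. It is offered only as a familiar special case for illustration, not as the reversible embedding function model of this paper; the function names are hypothetical, and overflow checks against a valid pixel range are deliberately omitted.

```python
def embed_bit(x, y, bit):
    """Hide one bit in the integer pair (x, y) by expanding their difference."""
    low = (x + y) // 2            # integer average, preserved by embedding
    diff = x - y
    expanded = 2 * diff + bit     # the bit becomes the least significant bit
    return low + (expanded + 1) // 2, low - expanded // 2


def extract_bit(x_marked, y_marked):
    """Recover the original pair and the hidden bit from a marked pair."""
    low = (x_marked + y_marked) // 2
    expanded = x_marked - y_marked
    bit = expanded % 2
    diff = expanded // 2          # floor division undoes the expansion
    return low + (diff + 1) // 2, low - diff // 2, bit
```

For example, embedding the bit 1 into the pair (100, 98) yields (102, 97), and extraction returns (100, 98, 1), so both the payload and the cover values are restored exactly.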

chapter

Improving BLSTM RNN based Mandarin speech recognition using accent dependent bottleneck features

Jiangyan Yi, Hao Ni, Zhengqi Wen, Jianhua Tao

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 5

This paper proposes an approach to accent adaptation that uses accent-dependent bottleneck (BN) features to improve the performance of a multi-accent Mandarin speech recognition system. The architecture of the adaptation uses two neural networks. First, a deep neural network (DNN) acoustic model acts as a feature extractor, which is used to extract accent-dependent BN (BN-DNN) features. The input...

chapter

Multi-feature based score fusion method for fingerprint recognition accuracy boosting

Qiongxiu Li, Changlong Jin, Weonjin Kim, Jungmin Kim, et al.

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

In fingerprint recognition systems, minutiae-based matching algorithms are the most intensively researched. However, in most minutiae-based methods, the similarity score is given based on the main score of matched minutiae, and the boosted information is not effectively used in the final similarity score computation. Based on this observation, we extract several features as the supplementary scores. And...

chapter

Efficient deep neural networks for speech synthesis using bottleneck features

Young-Sun Joo, Won-Suk Jun, Hong-Goo Kang

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

This paper proposes a cascading deep neural network (DNN) structure for a speech synthesis system that consists of text-to-bottleneck (TTB) and bottleneck-to-speech (BTS) models. Unlike the conventional single structure, which requires a large database to find complicated mapping rules between linguistic and acoustic features, the proposed structure is very effective even if the available training database...

chapter

Beamforming networks using spatial covariance features for far-field speech recognition

Xiong Xiao, Shinji Watanabe, Eng Siong Chng, Haizhou Li

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 6

Recently, a deep beamforming (BF) network was proposed to predict BF weights from phase-carrying features, such as generalized cross correlation (GCC). The BF network is trained jointly with the acoustic model to minimize automatic speech recognition (ASR) cost function. In this paper, we propose to replace GCC with features derived from input signals' spatial covariance matrices (SCM), which contain...

chapter

A fast CU partitioning algorithm in HEVC inter prediction for HD/UHD video

Ai Qing, Wei Zhou, Henglu Wei, Xin Zhou, et al.

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 5

The emerging High Efficiency Video Coding (HEVC) standard adopts a flexible and efficient quad-tree structure to partition the coding unit (CU). However, this structure causes enormous computational complexity. In this paper, a fast CU partitioning algorithm in the inter prediction of HEVC is proposed. Firstly, based on the visual saliency map detection, a fast CU partitioning depth prediction algorithm...

chapter

Mining user interests from social media by fusing textual and visual features

Fang-Yu Chao, Jia Xu, Chia-Wen Lin

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 8

In this paper, we propose a framework that fuses textual and visual features of user-generated social media data to mine the distribution of user interests. The proposed framework consists of three steps: feature extraction, model training, and user interest mining. We choose boards from popular users on Pinterest to collect training and test data. For each pin we extract the term-document matrices...

chapter

Deep neural network based voice conversion with a large synthesized parallel corpus

Zhengqi Wen, Kehuang Li, Jianhua Tao, Chin-Hui Lee

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 5

We propose a voice conversion framework to map the speech features of a source speaker to a target speaker based on deep neural networks (DNNs). Due to the limited availability of the parallel data needed for a pair of source and target speakers, speech synthesis and dynamic time warping are utilized to construct a large parallel corpus for DNN training. With a small corpus to train DNNs, a lower log...

chapter

Robust scalable video multicast using triangular network coding in LTE/LTE-Advanced

Phuc Chau, Yongwoo Lee, Toan Duc Bui, Jitae Shin, et al.

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 6

Recent studies have shown that inter-layered network coding is a promising approach to providing unequal error protection for scalable video multicast under channel heterogeneity. Selecting the optimal transmission distribution at the eNB increases system performance at the cost of time and computational complexity. In this paper, we propose an optimal transmission...

chapter

Sudden-noise suppression with strike-portion detection based on phase linearity for speech recognition

Terumi Umematsu, Shuji Komeiji, Masanori Tsujikawa, Ryosuke Isotani

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

We propose a sudden-noise suppression method for speech recognition using a phase linearity feature for noise detection. Our investigation of sound data recorded in actual retail stores shows that short, sudden noises are dominant in such environments. We also confirm the negative effect of such noises on speech recognition performance. Our method addresses this problem by focusing on sudden noises...

chapter

Content complexity based just noticeable difference estimation in DCT domain

Jinjian Wu, Wenfei Wan, Guangming Shi

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 5

Just noticeable difference (JND), which reveals the visibility of the human visual system (HVS), is useful for image/video coding. Due to the content complexity, it is hard to accurately estimate the JND thresholds for different image blocks (e.g., edge and texture). Research in cognitive science indicates that the HVS adaptively extracts the visual regularities for scene perception and understanding...

chapter

Enhancement of noisy low-light images via structure-texture-noise decomposition

Jaemoon Lim, Minhyeok Heo, Chul Lee, Chang-Su Kim

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 5

We propose a novel noisy low-light image enhancement algorithm via structure-texture-noise (STN) decomposition. We split an input image into structure, texture, and noise components, and enhance the structure and texture components separately. Specifically, we first enhance the contrast of the structure image, by extending a 2D histogram-based image enhancement scheme based on the characteristics...
