2016 IEEE International Conference on Multimedia and Expo (ICME)

book

IEEE

chapter

Large-scale vehicle re-identification in urban surveillance videos

Xinchen Liu, Wu Liu, Huadong Ma, Huiyuan Fu

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

Vehicles, as a significant object class in urban surveillance, attract considerable attention in the computer vision field, in tasks such as detection, tracking, and classification. Among these, vehicle re-identification (Re-Id) is an important yet frontier topic, which not only faces the challenges of enormous intra-class and subtle inter-class differences of vehicles across multiple cameras, but also suffers from the complicated...

chapter

Efficient MRF-based disocclusion inpainting in multiview video

Beerend Ceulemans, Shao-Ping Lu, Gauthier Lafruit, Peter Schelkens, et al.

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

View synthesis using depth image-based rendering generates virtual viewpoints of a 3D scene based on texture and depth information from a set of available cameras. One of the core components in view synthesis is image inpainting which performs the reconstruction of areas that were occluded in the available cameras but are visible from the virtual viewpoint. Inpainting methods based on Markov random...

chapter

A novel trignometric energy functional for image segmentation in the presence of intensity in-homogeneity

Sajid Hussain, Qi Chun, Muhammad Rizwan Asif, Muhammad Sohrab Khan, et al.

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

A novel region-based Active Contour Model (ACM) for image segmentation is presented that uses local image information to handle images with intensity inhomogeneity. A transcendental (trigonometric) energy functional based on Local Fitted Image (LFI) energy is suggested to extract the local image information. The difference between the original and fitted images is introduced as an angular constraint of the trigonometric...

chapter

New results in free-viewpoint television systems for horizontal virtual navigation

Marek Domanski, Maciej Bartkowiak, Adrian Dziembowski, Tomasz Grajek, et al.

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

The paper presents the concept of a practical free-viewpoint television system with purely optical depth estimation. The system consists of camera modules that contain pairs or triples of cameras together with the respective microphones. The camera modules can be sparsely located in arbitrary positions around a scene. Each camera module is equivalent to a video camera with a depth sensor and microphones...

chapter

DBLSTM-based multi-scale fusion for dynamic emotion prediction in music

Xinxing Li, Jiashen Tian, Mingxing Xu, Yishuang Ning, et al.

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

Dynamic Music Emotion Prediction is crucial to the emerging applications of music retrieval and recommendation. Considering the influence of temporal context and hierarchical structure on emotion in music, we propose a Deep Bidirectional Long Short-Term Memory (DBLSTM) based multi-scale regression method. In this method, a post-processing component is utilised for each individual DBLSTM output to further...

chapter

Robust online visual tracking via a temporal ensemble framework

Hao Guan, Xiangyang Xue

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

In this paper, we propose a robust visual tracking method based on a temporal ensemble framework. Unlike conventional ensemble-based trackers, which combine weak classifiers into a strong one using AdaBoost in a spatial fusion manner, our method adopts a powerful and efficient tracker integrated with its snapshots in different temporal windows of the online tracking process to construct a temporal...

chapter

Quality assessment of image patches distorted by image compression using crowdsourcing

Sebastian Bosse, Mischa Siekmann, Jennifer Rasch, Thomas Wiegand, et al.

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

Three experiments addressing the assessment of perceived image quality in a patch-based manner are compared for HEVC compression artifacts. It is shown that image patches as small as 128×128 pixels are large enough to evaluate the perceived image quality in a Degradation Category Rating (DCR) setting. Ratings obtained with 128×128-pixel image patches and 512×512-pixel images of...

chapter

Phonetic posteriorgrams for many-to-one voice conversion without parallel data training

Lifa Sun, Kun Li, Hao Wang, Shiyin Kang, et al.

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

This paper proposes a novel approach to voice conversion with non-parallel training data. The idea is to bridge between speakers by means of Phonetic PosteriorGrams (PPGs) obtained from a speaker-independent automatic speech recognition (SI-ASR) system. It is assumed that these PPGs can represent articulation of speech sounds in a speaker-normalized space and correspond to spoken content speaker-independently...

chapter

Robust latent poisson deconvolution from multiple imperfect features for web topic detection

Fei Tao, Junbiao Pang, Chunjie Zhang, Liang Li, et al.

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

In web topic detection, detecting “hot” topics from enormous User-Generated Content (UGC) on web data poses two main difficulties that conventional approaches can barely handle: 1) poor feature representations from noisy images and short texts; and 2) uncertain roles of modalities where visual content is either highly or weakly relevant to textual cues due to less-constrained data. In this paper,...

chapter

BCA: Bi-symmetric component analysis for temporal symmetry in human actions

Chenyang Zhang, Yingli Tian

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

In the past, much research effort has been invested in the discriminative action recognition task, but the general temporal structure of human actions has been overlooked. In this paper, we focus on a specific yet common structure of human actions: temporal symmetry. The key contribution is that we model the temporal symmetry property of human actions and separate this signal out of the original action sequences without...

chapter

Sparse two-dimensional singular value decomposition

Junhui Hou, Jie Chen, Lap-Pui Chau, Ying He

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

In this paper, we propose a new data-driven transform, called sparse two-dimensional singular value decomposition (S2DSVD). By leveraging the advantages of the discrete cosine transform and the conventional 2D SVD, we decompose a set of matrices into transform coefficient matrices with sparse and orthogonal basis functions. This sparsity can significantly reduce their overhead, hence being...

chapter

On-premise signs detection and recognition using fully convolutional networks

Yong-Xiang Wang, Chih-Hsin Hsueh, Hung-Yi Loo, Min-Chun Hu

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

Convolutional neural networks have recently been studied and used in many object recognition tasks. In this work, we employ fully convolutional networks (FCNs) to recognize On-Premise Signs (OPS) in real scenes. This technology can be used in many camera-enabled devices, such as smartphones, to develop practical commercial applications. The fully convolutional network technique is used to...

chapter

A pair hidden Markov support vector machine for alignment of human actions

Zhen Wang, Massimo Piccardi

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

Alignment of human actions in videos is an important task for applications such as action comparison and classification. While well-established algorithms such as dynamic time warping are available for this task, they still heavily rely on basic linear cost models and heuristic parameter tuning. In this paper we propose a novel framework that combines the flexibility of the pair hidden Markov model...

chapter

Bayesian relevance feedback based Chinese calligraphy character synthesis

Xueying Du, Jiangqin Wu, Yang Xia

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

Calligraphy lovers sometimes want to generate a calligraphic plaque in the style of a famous calligrapher, but some characters were never written or were damaged over the long history of Chinese calligraphy. It would therefore be valuable to use computer-aided synthesis technology to create calligraphic characters in a particular style. Though research of this kind has been done, the synthesized...

chapter

Example-based video color transfer

Chun-Han Yao, Chia-Yang Chang, Shao-Yi Chien

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

Color transfer is an image processing technique commonly used to fix images with wrong colors, enhance the lighting conditions, or produce special styles to express specific emotions. With the aid of a reference image, the intended color characteristics can be properly transferred to the source images or videos. As these applications have grown popular, many image color transfer methods have emerged;...

chapter

Crowd video retrieval via deep attribute-embedding graph ranking

Yanhao Zhang, Lei Qin, Sicheng Zhao, Rongrong Ji, et al.

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

As the number of surveillance cameras in public areas grows rapidly, massive numbers of crowd videos are captured and shared, which creates an urgent need to retrieve these videos efficiently and effectively. However, most recent research on crowd video has mainly focused on crowd behavior understanding and anomaly detection. In this study, as the very first attempt, we propose a crowd video retrieval method...

chapter

Graph-based web video search reranking through consistency analysis using spectral clustering

Soh Yoshida, Takahiro Ogawa, Miki Haseyama

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

This paper proposes a graph-based Web video search reranking method through consistency analysis using spectral clustering. Graph-based reranking is effective for refining text-based video search results. Generally, this approach constructs a graph where the vertices are videos and the edges reflect their pairwise similarities. Many reranking methods are built on a scheme that regularizes...

chapter

With one look: 3D face shape estimation from a single snapshot

Chia-Po Wei, Yu-Chiang Frank Wang

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

Estimating the 3D shape information of a face from a single image is a challenging task, especially when the input image is captured under unconstrained scenarios (e.g., variations of pose, illumination, expression, or even disguise). Previous approaches to this problem typically require careful initialization, registration, or segmentation of the face image regions. With the objective to match the...

chapter

Video saliency prediction with optimized optical flow and gravity center bias

Zhe Wu, Li Su, Qingming Huang, Bo Wu, et al.

2016 IEEE International Conference on Multimedia and Expo (ICME) > 1 - 6

Dynamic videos are viewed in a fundamentally different way from static images. Besides spatial features, the motion feature also plays an important role as a temporal factor. Most existing video saliency models employ optical flow to represent the motion feature. However, optical flow often suffers from the discontinuity problem. We also notice that human fixations in one single video frame are much...
