Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on

chapter

Multiple kernel active learning for image classification

Jingjing Yang, Yuanning Li, Yonghong Tian, Lingyu Duan, et al.

2009 IEEE International Conference on Multimedia and Expo > 550 - 553

2009 IEEE International Conference on Multimedia and Expo (ICME)

Recently, multiple kernel learning (MKL) methods have shown promising performance in image classification. As a form of supervised learning, training MKL-based classifiers relies on selecting and annotating an extensive dataset. In general, a large number of samples must be labeled manually to obtain a satisfactory MKL-based classifier. Moreover, MKL also incurs a high computational cost for kernel computation...

chapter

Robust copy detection by mining temporal self-similarities

Zhipeng Wu, Qingming Huang, Shuqiang Jiang

2009 IEEE International Conference on Multimedia and Expo > 554 - 557

2009 IEEE International Conference on Multimedia and Expo (ICME)

This paper introduces a self-similarity matrix (SSM) based video copy detection scheme and a visual character-string (VCS) descriptor for SSM matching. The SSM, which exploits the spatial and temporal information in a video clip, is computed by exhaustively calculating the distances between frames. The SSM based method treats the video clip as a whole and transforms the temporal self-similarity into...
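As an illustration of the self-similarity matrix described in the abstract above, the following is a minimal sketch that computes all pairwise distances between per-frame feature vectors. The function name, the use of Euclidean distance, and the toy feature vectors are assumptions made for the example; the abstract does not state which frame features or distance the authors use.

```python
import math


def self_similarity_matrix(frames):
    """Return the n x n matrix of pairwise distances between frame features.

    `frames` is a list of equal-length feature vectors, one per frame.
    Euclidean distance is an assumption; any frame distance could be used.
    """
    n = len(frames)
    ssm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(frames[i], frames[j])
            # The matrix is symmetric, so fill both halves at once.
            ssm[i][j] = d
            ssm[j][i] = d
    return ssm


# Toy example: three frames, where the first and third are identical.
frames = [[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]]
ssm = self_similarity_matrix(frames)
```

The diagonal is zero by construction, and repeated content in the clip shows up as small off-diagonal entries.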

chapter

Learning based thumbnail cropping

Xin Li, Haibin Ling

2009 IEEE International Conference on Multimedia and Expo > 558 - 561

2009 IEEE International Conference on Multimedia and Expo (ICME)

Thumbnail cropping helps improve thumbnail readability by cropping images before shrinking them. In this paper we propose a learning based method for automatic thumbnail cropping. To this end, we use a support vector machine to learn a discriminative model that simultaneously captures the saliency distribution and spatial priors. The model is then used to determine the best cropping rectangle. The...

chapter

Rotation invariant curvelet features for texture image retrieval

M.M. Islam, Dengsheng Zhang, Guojun Lu

2009 IEEE International Conference on Multimedia and Expo > 562 - 565

2009 IEEE International Conference on Multimedia and Expo (ICME)

An effective texture feature is an essential component of any content based image retrieval system. In the past, spectral features, such as Gabor and wavelet features, have shown better retrieval performance than many other statistical and structural features. Recent research on multi-resolution analysis has found that the curvelet captures texture properties, like curves, lines, and edges, more accurately...

chapter

An improved valence-arousal emotion space for video affective content representation and recognition

Kai Sun, Junqing Yu, Yue Huang, Xiaoqiang Hu

2009 IEEE International Conference on Multimedia and Expo > 566 - 569

2009 IEEE International Conference on Multimedia and Expo (ICME)

To understand video affective content automatically, the primary task is to transform the abstract concept of emotion into a form that a computer can handle easily. An improved V-A emotion space is proposed to address this problem. It unifies the discrete and dimensional emotion models by introducing the typical fuzzy emotion subspace. The fuzzy C-means (FCM) clustering algorithm is adopted...

chapter

Discriminant sparse nonnegative matrix factorization

Ruicong Zhi, Qiuqi Ruan

2009 IEEE International Conference on Multimedia and Expo > 570 - 573

2009 IEEE International Conference on Multimedia and Expo (ICME)

In this paper, a novel discriminant sparse non-negative matrix factorization (DSNMF) algorithm is proposed. We derive the DSNMF method from the original NMF algorithm by considering both a sparseness constraint and a discriminant information constraint. Furthermore, a projected gradient method is used to solve the optimization problem. DSNMF makes use of prior class information which is important in classification,...

chapter

Noise robust features for speech/music discrimination in real-time telecommunication

Zhong-Hua Fu, Jhing-Fa Wang, Lei Xie

2009 IEEE International Conference on Multimedia and Expo > 574 - 577

2009 IEEE International Conference on Multimedia and Expo (ICME)

While many efforts have been made in the audio signal classification field, the problem of noise interference has seldom been addressed so far, especially in many telecommunication applications, where a real-time and noise-robust approach is needed. This paper addresses this problem by proposing two novel robust features: average pitch density (APD) and relative tonal power density (RTPD). APD refers to the...

chapter

Sort-Merge feature selection and fusion methods for classification of unstructured video

M.J. Morris, J.R. Kender

2009 IEEE International Conference on Multimedia and Expo > 578 - 581

2009 IEEE International Conference on Multimedia and Expo (ICME)

We explore the problem of rapid automatic semantic tagging of video frames of unstructured (unedited) videos. We apply the sort-merge algorithm for feature selection on a large (>1000) heterogeneous feature set for videos showing lectures, to quickly locate low-level image features most predictive for concepts such as "key frame with text" or "key frame with computer source code"...

chapter

A Lie group based spatiogram similarity measure

Liyu Gong, Tianjiang Wang, Fang Liu, Gang Chen

2009 IEEE International Conference on Multimedia and Expo > 582 - 585

2009 IEEE International Conference on Multimedia and Expo (ICME)

Spatiograms are a generalization of histograms that can capture the spatial information of images. The similarity measure is important when applying spatiograms to various computer vision problems such as tracking and image retrieval. The originally proposed measures use the Mahalanobis distance of the coordinate mean to measure spatial information in spatiograms. However, spatial information which is described...

chapter

Sub-band feature statistics compensation techniques based on discrete wavelet transform for robust speech recognition

Hao-Teng Fan, Jeih-weih Hung

2009 IEEE International Conference on Multimedia and Expo > 586 - 589

2009 IEEE International Conference on Multimedia and Expo (ICME)

This paper proposes a novel scheme for performing feature statistics normalization techniques for robust speech recognition. In the proposed approach, the processed temporal-domain feature sequence is first decomposed into non-uniform sub-bands using the discrete wavelet transform (DWT), and then each sub-band stream is individually processed by well-known normalization methods, like mean and variance...

chapter

Learning super resolution with global and local constraints

Kai Guo, Xiaokang Yang, Rui Zhang, Songyu Yu

2009 IEEE International Conference on Multimedia and Expo > 590 - 593

2009 IEEE International Conference on Multimedia and Expo (ICME)

In learning based single image super-resolution (SR) approaches, the super-resolved image is usually found in or combined from a training database through patch matching. However, because the representation ability of a small patch is limited, it is difficult to guarantee that the super-resolved image is the best from a global view. To tackle this problem, we propose a statistical learning method for SR with both global...

chapter

Pseudo relevance feedback with incremental learning for high level feature detection

Shaoxi Xu, Sheng Tang, Jintao Li, Yongdong Zhang

2009 IEEE International Conference on Multimedia and Expo > 594 - 597

2009 IEEE International Conference on Multimedia and Expo (ICME)

Pseudo relevance feedback (PRF) has proven effective in information retrieval, but it has seldom been applied in the area of high level feature detection (HLF). In this paper, we propose to introduce PRF into HLF. Our contributions are two-fold: (1) proposing three novel PRF approaches to extract pseudo positive samples, i.e., nearest-neighbor (NN) based PRF, score-evaluation...

chapter

An automatic language identification method based on subspace analysis

Yan Song, Lirong Dai, Renhua Wang

2009 IEEE International Conference on Multimedia and Expo > 598 - 601

2009 IEEE International Conference on Multimedia and Expo (ICME)

Gaussian mixture models (GMM) have become one of the standard acoustic approaches for language identification. Furthermore, the GMM-SVM has been shown to work well by introducing a discriminative method into GMM-based acoustic systems. In these systems, the intersession variability within a language has become an important adverse factor that degrades system performance. To tackle this problem,...

chapter

Finding rows of people in group images

A.C. Gallagher, Tsuhan Chen

2009 IEEE International Conference on Multimedia and Expo > 602 - 605

2009 IEEE International Conference on Multimedia and Expo (ICME)

People are among the most popular subjects in photography, and in many social settings, images of groups of people are captured. People often arrange themselves in a very structured manner in these group images. For example, taller people might stand in a row behind smaller people. This structure is often exploited in captions that sequentially label the individuals in each row. We present an algorithm...

chapter

Constructing a landmark identification system for Geo-tagged photographs based on Web data analysis

K. Hoashi, T. Uemukai, K. Matsumoto, Y. Takishima

2009 IEEE International Conference on Multimedia and Expo > 606 - 609

2009 IEEE International Conference on Multimedia and Expo (ICME)

In this research, we propose a method to automatically generate a landmark identification system for geo-tagged photographs, based on analysis of various data collected from the Web. The method first conducts Web analysis based on three major procedures: (1) Automatic extraction of points-of-interest (POIs) based on geographical clustering of geo-tagged images, (2) Retrieval of landmark candidates...

chapter

User generated video annotation using Geo-tagged image databases

G. Abdollahian, E.J. Delp

2009 IEEE International Conference on Multimedia and Expo > 610 - 613

2009 IEEE International Conference on Multimedia and Expo (ICME)

In this paper we propose a system that annotates a user generated video based on the associated location metadata, by exploiting user-tagged image databases. An example of such a database is a photo sharing Web site such as Flickr, where users upload their images and annotate them with various tags. The goal is to find the tags that have a high probability of being relevant to the video without any complex...

chapter

Real-time pedestrian and vehicle detection in video using 3D cues

Ping-Han Lee, Tzu-Hsuan Chiu, Yen-Liang Lin, Yi-Ping Hung

2009 IEEE International Conference on Multimedia and Expo > 614 - 617

2009 IEEE International Conference on Multimedia and Expo (ICME)

Existing pedestrian and vehicle detection algorithms use 2D cues of objects, such as pixel values, color, texture, shape information or motion. The use of 3D cues in object detection, on the other hand, is not well studied in the literature. In this paper, we propose an efficient algorithm that detects pedestrians and vehicles using their 3D cues. The proposed algorithm first detects moving objects...

chapter

Exploiting genre for music emotion classification

Yu-Ching Lin, Yi-Hsuan Yang, H.H. Chen, I-Bin Liao, et al.

2009 IEEE International Conference on Multimedia and Expo > 618 - 621

2009 IEEE International Conference on Multimedia and Expo (ICME)

Genre and emotion have been applied to content-based music retrieval and organization; however, the intrinsic correlation between them has not been explored. In this paper we present a statistical association analysis to examine such intrinsic correlation and propose a two-layer scheme that exploits the correlation for emotion classification. Significant improvement of classification accuracy over...

chapter

Localizing and recognizing action unit using position information of local feature

Yan Song, Shouxun Lin, Yongdong Zhang, Lin Pang, et al.

2009 IEEE International Conference on Multimedia and Expo > 622 - 625

2009 IEEE International Conference on Multimedia and Expo (ICME)

Action recognition has attracted much attention for human behavior analysis in recent years. Local spatial-temporal (ST) features are widely adopted in many works. However, most existing works, which represent an action video by a histogram of ST words, fail to capture the fine structure of actions because of the local nature of these features. In this paper, we propose a novel method to simultaneously...

chapter

A two phase method for general audio segmentation

J.X. Zhang, J. Whalley, S. Brooks

2009 IEEE International Conference on Multimedia and Expo > 626 - 629

2009 IEEE International Conference on Multimedia and Expo (ICME)

This paper presents a model-free and training-free two-phase method for audio segmentation that separates monophonic heterogeneous audio files into acoustically homogeneous regions where each region contains a single sound. A rough segmentation separates audio input into audio clips based on silence detection in the time domain. Then a self-similarity matrix, based on selected audio features in the...
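The rough first phase described in the abstract above, splitting the input into clips at silences in the time domain, can be sketched as follows. This is only an illustration under stated assumptions: the function name, the short-time energy criterion, the frame length, the threshold, and the minimum silence length are all hypothetical choices for the example and are not taken from the paper.

```python
def split_on_silence(samples, frame_len=4, threshold=0.01, min_silence_frames=2):
    """Return (start, end) sample index pairs of non-silent clips.

    A frame counts as silent when its mean squared amplitude is below
    `threshold`. A clip is closed once `min_silence_frames` consecutive
    silent frames are seen. All parameter values are illustrative.
    """
    clips = []
    start = None      # sample index where the current clip began
    silent_run = 0    # consecutive silent frames seen inside a clip
    n_frames = len(samples) // frame_len
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy >= threshold:
            if start is None:
                start = f * frame_len
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # The clip ends where the silent run began.
                end = (f - silent_run + 1) * frame_len
                clips.append((start, end))
                start = None
                silent_run = 0
    if start is not None:
        # Close a clip that is still open at the end of the input.
        clips.append((start, (n_frames - silent_run) * frame_len))
    return clips


# Toy signal: 8 loud samples, 8 silent samples, 4 loud samples.
signal = [0.5] * 8 + [0.0] * 8 + [0.5] * 4
clips = split_on_silence(signal)
```

In the method the abstract outlines, each resulting clip would then be refined in a second phase based on a self-similarity matrix; that phase is not sketched here because the abstract is truncated.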
