2017 IEEE International Symposium on Multimedia (ISM)

chapter

Recurrent Visual Relationship Recognition with Triplet Unit

Kento Masui, Akiyoshi Ochiai, Shintaro Yoshizawa, Hideki Nakayama

2017 IEEE International Symposium on Multimedia (ISM) > 69 - 76

The task of visual relationship recognition (VRR) is recognizing multiple objects and their relationships in an image. A fundamental difficulty of this task is class-number scalability, since the number of possible relationships we need to consider causes combinatorial explosion. Another difficulty of this task is modeling how to avoid outputting semantically redundant relationships. To overcome these...

chapter

A Pre-Saliency Map Based Blind Image Quality Assessment via Convolutional Neural Networks

Zhengxue Cheng, Masaru Takeuchi, Jiro Katto

2017 IEEE International Symposium on Multimedia (ISM) > 77 - 82

In recent years, various approaches have been investigated towards blind image quality assessment (IQA) with high accuracy and low complexity. In this paper we develop a pre-saliency map based blind IQA method, which takes advantage of saliency information prior to quality prediction to enhance performance in two steps. 1) We split the image into patches and design a convolutional neural network...

chapter

Human Action Classification Using Temporal Slicing for Deep Convolutional Neural Networks

Nathan Henderson, Ramazan Aygun

2017 IEEE International Symposium on Multimedia (ISM) > 83 - 90

Artificial neural networks are widely used computing systems applied to a wide variety of tasks and problems. A common application of such networks is classification. However, a significant amount of this research focuses on one- and two-dimensional information, such as vectorized data and images. Limited research has been performed on three-dimensional media such as video clips. This...

chapter

Rate-Accuracy Optimization of Deep Convolutional Neural Network Models

Alessandro Filini, Joao Ascenso, Riccardo Leonardi

2017 IEEE International Symposium on Multimedia (ISM) > 91 - 98

Recently, deep learning has enjoyed a great deal of success in computer vision problems due to its capability to model highly complex tasks, such as image classification, object detection, and face recognition, among many others. Although these neural networks are nowadays very powerful, there is a huge number of parameters (i.e. the model) that need to be learned and require considerable storage space...

chapter

Automatic Classification of Microstructures in Thermal Barrier Coating Images

Wei-Bang Chen, Yongjin Lu, James Li, Ben Zimmerman

2017 IEEE International Symposium on Multimedia (ISM) > 99 - 106

Thermal plasma spraying is an important manufacturing technique that creates a thermal barrier coating to protect the surface underneath from wear, erosion, oxidation and corrosion. In this paper, we develop a new microstructure classification and quantification (MCQ) module that could fully automatically classify and quantify two types of microstructures, globular and interlamellar, in the top coat...

chapter

Sustained Attention Function Evaluation During Cooking Based on Egocentric Vision

Sho Ooi, Mutsuo Sano, Hajime Tabuchi, Fumie Saito, et al.

2017 IEEE International Symposium on Multimedia (ISM) > 107 - 113

The attention function has been classified into (i) sustained attention, (ii) selective attention, (iii) controlled attention, (iv) distributed attention, and (v) capacity for attention. Ordinarily, in order to evaluate the function of attention, the digital cancellation test (D-CAT) or trail making test (TMT) is employed. However, these evaluations are in the form of paper tests, and cannot effectively...

chapter

Detecting Good Surface for Improvisatory Visual Projection

Hoang Le, Thong Doan, Carl S. Marshall, Selvakumar Panneer, et al.

2017 IEEE International Symposium on Multimedia (ISM) > 114 - 121

A projector is usually coupled with a dedicated projection surface to properly display visual information. This prevents the application of projection in places where a dedicated projection surface is not readily available. This paper presents a method for automatically detecting a good surface in a daily living and working space to support improvisatory projection without a pre-installed projection...

chapter

Coherent Visual Description of Textual Instructions

Shashank Mujumdar, Nitin Gupta, Abhinav Jain, Sameep Mehta

2017 IEEE International Symposium on Multimedia (ISM) > 122 - 129

Text is the easiest means to record information but need not always be the best means for understanding a concept. In psychological theories, it is argued that when information is presented visually, it provides a better means to understand a concept. While techniques exist for generating text from a given image, the inverse problem that is to automatically fetch coherent images to represent a given...

chapter

QoE Studies on Interactive 3D Tele-Immersion

Kevin Desai, Suraj Raghuraman, Rong Jin, Balakrishnan Prabhakaran

2017 IEEE International Symposium on Multimedia (ISM) > 130 - 137

Users' Quality of Experience (QoE) in Interactive 3D Tele-Immersion (i3DTI) systems is influenced by several factors such as the quality of the "live" 3D avatars of the users, network latency, rendering methodology (head mounted display or regular TV type of display), etc. Hence, it becomes important to answer the question: "Is Visual Quality (VQ) the only factor to be considered or...

chapter

Spatio-Temporal Compositing of Video Elements for Immersive eLearning Classrooms

Uma Gopalakrishnan, P. Venkat Rangan, Ramkumar N, Balaji Hariharan

2017 IEEE International Symposium on Multimedia (ISM) > 138 - 145

Current live eLearning systems enable remote students to view the teaching environment comprising several information sources such as the teacher and the teaching aids. These information sources are presented as individual video and audio elements. As a result, spatial connections between these elements, such as the teacher using hand gestures to point to an area on the screen, become meaningless...

chapter

Deep Attribute Driven Image Similarity Learning Using Limited Data

Nitin Gupta, Ankush Gupta, Vikas Joshi, L. Venkata Subramaniam, et al.

2017 IEEE International Symposium on Multimedia (ISM) > 146 - 153

In this work, we propose to derive the attribute-specific similarity score for a pair of images using an existing parent deep model. As an example, given two facial images, we derive a similarity score for attributes like gender and complexion using an existing face recognition model. It is not always feasible to train a new model for each attribute, as training of deep neural network based model...

chapter

A Real-Time Annotation of Motion Data Streams

Petr Elias, Jan Sedmidubsky, Pavel Zezula

2017 IEEE International Symposium on Multimedia (ISM) > 154 - 161

Current motion-capture technologies produce continuous streams of 3D human joint trajectories. One of the challenges is to automatically annotate such streams of complex spatio-temporal data in real time. In this paper, we propose an efficient approach to label motion stream data in real time with a limited usage of main memory. Based on a set of user-defined motion profiles, each of them specified...

chapter

Heterogeneous Features Fusion with Collaborative Representation Learning for 3D Action Recognition

Chengwu Liang, Enqing Chen, Lin Qi, Ling Guan

2017 IEEE International Symposium on Multimedia (ISM) > 162 - 168

Human action recognition from depth sensors has drawn wide attention in the computer vision and multimedia processing areas. In contrast to simple periodic actions, irrelevant actions or sub-actions shared between different classes of two-person non-periodic interactions make this task challenging. This paper presents heterogeneous features fusion with Collaborative Representation (CR) to address this...

chapter

Towards Efficient 3D Pose Retrieval and Reconstruction from 2D Landmarks

Hashim Yasin

2017 IEEE International Symposium on Multimedia (ISM) > 169 - 176

In this paper, we deal with the most challenging task of recovering the 3D human pose from just a single monocular image, which may be a synthetic image or a real internet image. Retrieval and reconstruction of the articulated 3D pose are both prerequisites for the analysis of people in images/videos. We address both tasks together and propose an efficient framework for search & retrieval...

chapter

Kara1k: A Karaoke Dataset for Cover Song Identification and Singing Voice Analysis

Yann Bayle, Ladislav Marsik, Martin Rusek, Matthias Robine, et al.

2017 IEEE International Symposium on Multimedia (ISM) > 177 - 184

We introduce Kara1k, a new musical dataset composed of 2,000 analyzed songs thanks to a partnership with a karaoke company. The dataset is divided into 1,000 cover songs provided by the Recisio Karafun application, and the corresponding 1,000 songs by the original artists. Kara1k is mainly dedicated to cover song identification and singing voice analysis. For both tasks, it offers novel approaches,...

chapter

Computational and Perceptual Determinants of Film Mood in Different Types of Scenes

Jussi Tarvainen, Jorma Laaksonen, Tapio Takala

2017 IEEE International Symposium on Multimedia (ISM) > 185 - 192

Films seek to elicit emotions in viewers by infusing the story they tell with an affective character or tone - in a word, a mood. In content-based multimedia analysis, considerable effort has been made to develop methods to estimate film affect computationally. However, results have been hampered by a tendency to classify film scenes either by genre or not at all, while other potentially helpful classification...

chapter

Summarization of News Videos Considering the Consistency of Auditory and Visual Contents

Ichiro Ide, Ye Zhang, Ryunosuke Tanishige, Keisuke Doman, et al.

2017 IEEE International Symposium on Multimedia (ISM) > 193 - 199

Since news videos are valuable sources of multimedia information on real-world events, there is a demand for viewing them efficiently. However, summarization methods based on auditory contents do not take into account the visual contents. In the case of news videos, due to their presentation style where audio contents and visual contents do not necessarily come from the same...

chapter

An Iterative Feature-Pair Updating Framework for Rigid Template Matching with Outliers

Yang Yang, Qian Kou, Shaoyi Du, Shuang Luo, et al.

2017 IEEE International Symposium on Multimedia (ISM) > 200 - 207

To deal with the rigid template matching problem in real-world scenarios, we propose a novel iterative feature-pair updating framework which is also robust to high levels of outliers, such as background changes, complex nonrigid deformation and partial occlusion. Given a template image and a target image, we first extract a set of corresponding feature-pairs as candidates. Then, we propose...

chapter

Mining Culture-Specific Music Listening Behavior from Social Media Data

Martin Pichl, Eva Zangerle, Gunther Specht, Markus Schedl

2017 IEEE International Symposium on Multimedia (ISM) > 208 - 215

Incorporating user characteristics and contextual information has been shown to be essential when it comes to personalized music retrieval and recommendation. To this end, the current location of a user is often exploited. However, relying solely on GPS coordinates neglects the cultural background of users, which does not necessarily coincide with political borders. In this paper, we analyze culture-specific...

chapter

Static vs. Dynamic Content Descriptors for Video Retrieval in Laparoscopy

Bernd Muenzer, Manfred J. Primus, Sabrina Kletz, Stefan Petscharnig, et al.

2017 IEEE International Symposium on Multimedia (ISM) > 216 - 223

The domain of minimally invasive surgery has recently attracted attention from the Multimedia community due to the fact that systematic video documentation is on the rise in this medical field. The vastly growing volumes of video archives demand effective and efficient techniques to retrieve specific information from large video collections with visually very homogeneous content. One specific...
