Image quality assessment (IQA) is very important for many image and video processing applications, e.g. compression, archiving, restoration and enhancement. An ideal image quality metric should achieve consistency between image distortion prediction and the psychological perception of the human visual system (HVS). Inspired by the fact that the HVS is quite sensitive to local image orientation features, in this paper,...
To meet the demand for mining targets from large volumes of battlefield video, a mining method based on key frames is discussed. First, the key frames are extracted from the battlefield video through a content-based video retrieval process. Then, target recognition is performed on the key frames to mine the target information. Practical calculation shows that the mining method is feasible.
This paper presents a framework to extract a subset that contains only one person's video segments from a superfluous home video set. This is a semantic-level multimedia application. We propose a co-clustering method based on facial and body features, and define a new type of measurement to combine the two features more reasonably. A 2D-PCA detector is used to extract facial features, and the K-Means algorithm...
This paper presents a new algorithm based on boosting for interactive object retrieval in images. Recent works propose “online boosting” algorithms where weak classifier sets are iteratively trained from data. These algorithms are proposed for visual tracking in videos, and are not well adapted to “online boosting” for interactive retrieval. We propose in this paper to iteratively build weak classifiers...
A novel color-based, two-step, coarse-to-fine video replica detection system is proposed in this paper. The first step uses an R-tree to perform a coarse selection of the database (original) videos that potentially match the query video. A training procedure that utilizes attacked versions of the database videos and aims at achieving robustness to attacks is used. A frame-based voting...
Tracking is a major issue of virtual and augmented reality applications. Single object tracking on monocular video streams is fairly well understood. However, when it comes to multiple objects, existing methods lack scalability and can recognize only a limited number of objects. Thanks to recent progress in feature matching, state-of-the-art image retrieval techniques can deal with millions of images...
A fast duplicate video detection system based on camera transitional behavior and the suffix array data structure is proposed in this work. The main idea is to match video clips according to their temporal structures, and frames corresponding to unique events are marked as anchor frames. To simplify the detection process, we use the camera transitional behavior to indicate unique events. Specifically,...
We present a new method of computing invariants in videos captured from different views to achieve view-invariant action recognition. To avoid the constraints of collinearity or coplanarity of image points for constructing invariants, we consider several neighboring frames to compute cross ratios, namely cross ratios across frames (CRAF), as our invariant representation of action. For every five points...
In video based face recognition, faces typically experience challenging illumination conditions, blur, or localisation errors in several frames. To alleviate these challenges, quality measures can be used to remove the most severely degraded frames. Still, when the videos are taken in real life settings, degradations are likely to be present even in the highest quality frames, and robust recognition...
Multimedia fingerprinting, also known as robust/perceptual hashing and replica detection, is an emerging technology that can be used as an alternative to watermarking for the efficient Digital Rights Management (DRM) of multimedia data. Two fingerprinting approaches are reviewed in this paper. The first is an image fingerprinting technique that makes use of color and texture descriptors, R-trees and...
Video data modeling is an important issue for content-based retrieval. In this paper, we propose a semantic-based four-layer video data model for scenery documentaries, and our discussion focuses on the extraction and representation of the semantics of video data. The support vector machine (SVM) is used to bridge the gap between low-level visual features and high-level semantic concepts. Semantic concept...
Approximately 10^5 video clips are posted every day on the Web. The popularity of Web-based video databases poses a number of challenges to machine vision scientists: how do we organize, index and search such a large wealth of data? Content-based video search and classification have been proposed in the literature and applied successfully to analyzing movies, TV broadcasts and lab-made videos. We explore...
In this paper, we examine the problem of internet video categorization. Specifically, we explore the representation of a video as a “bag of words” using various combinations of spatial and temporal descriptors. The descriptors incorporate both spatial and temporal gradients as well as optical flow information. We achieve state-of-the-art results on a standard human activity recognition database...
The lack of publicly available annotated databases is one of the major barriers to research advances on emotional information processing. In this contribution we present a recently collected database of spontaneous emotional speech in German which is being made available to the research community. The database consists of 12 hours of audio-visual recordings of the German TV talk show “Vera am...
This paper presents a general method for segmenting a vector valued sequence into an unknown number of subsequences where all data points from a subsequence can be represented with the same affine parametric model. The idea is to cluster the data into the minimum number of such subsequences which, as we show, can be cast as a sparse signal recovery problem by exploiting the temporal correlation between...