Image inpainting in a cluttered scene containing objects at several depths can be more challenging than conventional image inpainting. The basic inpainting problem is to fill in the missing parts of an image left by removing undesired objects from the foreground layer. In this paper we propose a method to inpaint missing portions of an image in different depth layers. The...
Computing similarities between data samples is a fundamental step in most Pattern Recognition (PR) tasks. Better similarity measures lead to more accurate prediction of labels. Computing similarities between video sequences has long been a challenging problem for the PR community because videos have both spatial and temporal context, which are hard to capture. We describe a novel approach that...
This paper introduces an algorithm for dense motion segmentation of pedestrians in crowded video sequences. This algorithm realizes dense and temporally-consistent segmentation under severe occlusion conditions. To segment the whole appearance of each articulated object, the algorithm uses temporal invariance of geodesic distance (similarity) between segments as a criterion for motion segmentation...
To deal with the drifting issue in visual tracking, we propose an Online Transfer Boosting (OTB) algorithm that transfers knowledge from three different source domains to the target domain to improve the performance of the online classifier used in tracking-by-detection. In particular, the OTB algorithm integrates three types of knowledge by: (1) transferring prior knowledge from the first frame using...
Human actions can be represented as a bag of spatiotemporal visual words extracted from input video sequences. For human action categorization, labeled LDA (L-LDA) extends latent Dirichlet allocation (LDA) by providing action class labels to each video. To handle parameter uncertainty in L-LDA, this paper further extends L-LDA within the type-2 fuzzy set (T2 FS) framework, referred to as...
We propose a new framework for recognizing three-dimensional (3D) objects that needs few reference images. We tackle a key issue of 3D object recognition: the trade-off between the number of reference images and recognition accuracy. We assume that a reference image is a photo of the object from an arbitrary position, typical of those used to exhibit goods on the web. The framework first estimates...
This paper introduces CMV100, a new research dataset for people tracking and re-identification in sparse camera networks. Baseline methods for re-identification performance analysis are also proposed. The dataset consists of over 400 indoor video sequences in total. The number of visually distinctive human objects is 100, and each person appears in three different views on average and in five at most...
Key frame extraction aims to obtain a (very) compressed set of video frames that summarizes the essential content of a video sequence. In this paper, a well-known information-theoretic measure, the Jensen-Rényi divergence (JRD), is studied to estimate the frame-by-frame distance between consecutive video images, for segmenting shots/subshots and for choosing key frames. Our new key...
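As an illustration of the measure named in the abstract above, the following is a minimal sketch of the Jensen-Rényi divergence between the intensity histograms of two consecutive frames. It uses the standard definition (Rényi entropy of the weighted mixture minus the weighted sum of the individual Rényi entropies). The function names, the equal weights, and the order alpha = 0.5 are assumptions made for this example and are not taken from the paper, whose abstract is truncated.

```python
import math


def renyi_entropy(p, alpha):
    """Rényi entropy of order alpha (alpha > 0, alpha != 1) of a probability vector."""
    return math.log(sum(x ** alpha for x in p if x > 0)) / (1.0 - alpha)


def normalize(hist):
    """Turn a histogram of non-negative counts into a probability vector."""
    total = float(sum(hist))
    return [h / total for h in hist]


def jensen_renyi_divergence(hists, alpha=0.5, weights=None):
    """JRD of several histograms; equal weights are assumed when none are given."""
    dists = [normalize(h) for h in hists]
    n = len(dists)
    if weights is None:
        weights = [1.0 / n] * n
    bins = len(dists[0])
    # Weighted mixture of the input distributions, bin by bin.
    mixture = [sum(w * d[i] for w, d in zip(weights, dists)) for i in range(bins)]
    weighted_entropies = sum(w * renyi_entropy(d, alpha) for w, d in zip(weights, dists))
    return renyi_entropy(mixture, alpha) - weighted_entropies


# Hypothetical 4-bin intensity histograms of two consecutive frames.
frame_a = [10, 20, 30, 40]
frame_b = [40, 30, 20, 10]
distance = jensen_renyi_divergence([frame_a, frame_b])
```

For alpha between 0 and 1 the Rényi entropy is concave, so the divergence is non-negative and is zero when the histograms coincide. A large value between consecutive frames could then be read as a candidate shot boundary, although the thresholding rule is not given in the visible part of the abstract.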
In this study, a visual attention region determination approach for H.264 videos using spatiotemporal features is proposed. After Gaussian filtering in Lab color space, the phase spectrum of Fourier transform (PFT) is used to generate the spatial saliency map of each video frame. On the other hand, the motion vector fields from an H.264 video bitstream are backward accumulated and the phase spectrum...
In this paper, we propose an approach for categorizing human activities based on optical flow direction and magnitude features. The main contribution of this paper is the feature representation that mirrors the geometry of the human body and relationships between its moving regions when performing activities. The features are quantified using a quantization algorithm. We analyze the performance...
Accurate segmentation provides a useful contour constraint to alleviate drifting during online learning for tracking. Towards this end, we present a closed-loop method for object tracking that links Hough forests and alpha matting via an effective back-projection scheme for patches. A novel hybrid-Hough-forests-based method first estimates object location. Given the object location, the trimap of...
Design of video storyboards has emerged as a popular research area in the multimedia community. Different pattern clustering techniques are applied to extract the key frames from a video sequence to form a storyboard. In this paper, we propose an automatic method for the selection of key frames of a video sequence using Delaunay graphs. We prune certain edges from the Delaunay graph using an iterative...
For achieving efficient action recognition, some recent works propose to select a smaller number of frames in a video sequence instead of the entire sequence of frames. In this study, we propose to represent a frame by a combination of local and global descriptors instead of the silhouette used in our previous approach aiming at frame selection. Action recognition is then executed on the basis of...
Existing techniques for object tracking with Multiple Instance Learning take the approach of extracting low-level patches of fixed size and aspect ratios within each image, and employ many simplistic assumptions. In this work, we propose an approach that automatically utilizes image segments as input primitives to develop a multi-level segmentation-based system, and build a target model refinement...
This paper delves into the effectiveness of a gait recognition process depending on the length of the video sequence used. To this end, a well-known gait representation, the Gait Energy Image (GEI), is incrementally computed from gait cycles in the order they occur. The main objective is to assess the problem of the minimum number of gait cycles required to obtain discriminant GEIs. An experimental...
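The incremental computation mentioned in the abstract above can be sketched as a running per-pixel mean over binary silhouettes, which is the usual definition of a Gait Energy Image. This is only an illustrative sketch: the class name `IncrementalGEI`, the `add_cycle` method, and the assumption that silhouettes arrive already size-normalized and aligned are all hypothetical and not taken from the paper.

```python
class IncrementalGEI:
    """Running per-pixel mean of binary silhouettes (values 0 or 1)."""

    def __init__(self, height, width):
        self.count = 0  # number of silhouettes averaged so far
        self.mean = [[0.0] * width for _ in range(height)]

    def update(self, silhouette):
        """Fold one aligned silhouette into the running mean."""
        self.count += 1
        for r, row in enumerate(silhouette):
            for c, value in enumerate(row):
                self.mean[r][c] += (value - self.mean[r][c]) / self.count

    def add_cycle(self, frames):
        """Fold in every silhouette of one gait cycle and return the current GEI."""
        for frame in frames:
            self.update(frame)
        return self.mean


# Toy 2x2 silhouettes standing in for one (very short) gait cycle.
gei = IncrementalGEI(height=2, width=2)
result = gei.add_cycle([
    [[1, 0], [1, 1]],
    [[1, 1], [0, 1]],
])
```

Calling `add_cycle` once per gait cycle, in the order the cycles occur, gives a GEI after each cycle, so the effect of using more cycles can be compared without recomputing the average from scratch.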