Person reidentification is the problem of recognizing a person across non-overlapping camera views. Pose variations, illumination conditions, low-resolution images, and occlusion are the main challenges encountered in reidentification. Because the videos are captured in an uncontrolled environment, people can appear in different poses, and as a result the appearance of a person can vary significantly...
In this paper, we propose to learn object representations with inference from temporal correlation in videos to achieve effective visual tracking. Unlike traditional methods which perform feature learning either at image level or based on intuitive temporal constraint, we employ the recurrent network with Long Short Term Memory (LSTM) units to directly learn temporally correlated representations of...
In camera-equipped teleoperated robots, it is often tedious for the operator to manage both the viewpoint and the shaky/unstable navigation, leading to disorientation. Our proposal is to create a virtual, freely rotatable camera that is decoupled from the robot's rotation. It is implemented using a complete spherical camera and removing its rotation in-image with a novel algorithm based on aligning...
In distributed sensing systems that use compressed videos for video analysis tasks, the lossy compression of videos can damage the accuracy of object detection, which is an essential step for various vision applications. This paper aims at constructing a new quality model to predict the performance of object detection. To achieve this goal, a distorted video database is constructed by applying object...
Finding highlights relevant to a text query in unedited videos has become increasingly important due to their unprecedented growth. We refer to this task as semantic highlight retrieval and propose a query-dependent video representation for retrieving a variety of highlights. Our method consists of two parts: (1) “viralets”, a mid-level representation bridging between visual and semantic spaces; (2) a...
The emergence of the UHD video format brings larger screens and a wider stimulated visual angle. Its effect on visual attention is therefore worth examining, since it can impact quality assessment and metrics, as well as the whole chain of video processing and creation. Moreover, changes in visual attention under different viewing conditions challenge visual attention models. In this paper, we present...
In this study, we make use of brain activation data to investigate the perceptual plausibility of a visual and an auditory model for visual and auditory saliency in video processing. These models have already been successfully employed in a number of applications. In addition, we experiment with parameters, modifications and suitable fusion schemes. As part of this work, fMRI data from complex video...
Human activity prediction aims to enable early recognition of unfinished activities from videos that contain only their beginning parts, which is a challenging problem. Prediction of human activities is needed in particular settings (e.g. surveillance systems, human-computer interfaces). To solve this problem, we propose a novel framework which classifies videos into activity classes by using...
In this paper we introduce a novel decolorization strategy built on image fusion principles. Decolorization (color-to-grayscale) is an important transformation used in many monochrome image processing applications. We demonstrate that aside from color spatial distribution, local information plays an important role in maintaining the discriminability of the image conversion. Our strategy blends the...
In this paper, we propose an algorithm to remove rain streaks from a single color image. First, the guided filter, together with rain pixel detection, is used to separate a color image into low-frequency and high-frequency parts so that most rain components exist in the high-frequency part. Then, we focus on the high-frequency part to extract the non-rain details according to the characteristics...
If a low dynamic range (LDR) image or video is inverse tone mapped to a higher dynamic range, there can be banding artifacts in the output high dynamic range (HDR) image or video. We design a selective sparse filter to remove the banding artifacts and at the same time preserve edges and details. The filter is able to reduce other artifacts, such as blocky artifacts which are due to the compression...
This paper presents a framework for anomaly detection in videos which considers both motion and appearance features. For motion cues, we propose a new feature called 3D-HOF, which effectively extracts both velocity and orientation from the optical flow map. At the same time, we introduce the concept of the “depth of field” problem to make the detection more accurate when the velocity of an object may...
Most traditional video summarization methods are designed to generate effective summaries for single-view videos, and thus they cannot fully exploit the complicated intra- and inter-view correlations in summarizing multi-view videos. In this paper, we introduce a novel framework for summarizing multi-view videos in a way that takes into consideration both intra- and inter-view correlations in a joint...
Real-world CCTV footage often poses increased challenges in object tracking due to Pan-Tilt-Zoom operations, low camera quality and diverse working environments. The most relevant challenges are moving backgrounds, motion blur, and severe scale changes. Convolutional neural networks, which offer state-of-the-art performance in object detection, are increasingly utilized to pursue a more efficient tracking...
Today, mobile users struggle to access overloaded and unstructured social media feeds on the severely constrained mobile display. To overcome the challenges associated with browsing social media feeds on mobile devices, we are developing an innovative scheme to automatically create and synthesize the mixed social media digest (pictures, texts and videos) into a magazine-page-like social...
This paper investigates the discriminative capabilities of facial action units (AUs) exhibited by an individual while performing a task on a tablet computer in a semi-unconstrained environment. To that end, AUs are measured on a frame-by-frame basis from videos of 96 different subjects participating in a game-show-like quiz game that included a prize incentive. We propose a method that leverages the...
We propose a method for detecting obstacles by comparing input and reference train frontal view camera images. In the field of obstacle detection, most methods employ a machine learning approach, so they can only detect pre-trained classes, such as pedestrians and bicycles. This means that obstacles of unknown classes cannot be detected. To overcome this problem, we propose a background subtraction...
This paper presents a novel algorithm that aims at minimizing the required decoding energy by exploiting a general energy model for HEVC-decoder solutions. We incorporate the energy model into the HEVC encoder such that it is capable of constructing a bit stream whose decoding process consumes less energy than the decoding process of a conventional bit stream. To achieve this, we propose to extend...
This paper presents a method for hierarchical content group detection from different social media platforms, which can reveal the hierarchical structure of content groups. In this paper, content groups are defined as sets of contents with similar topics. Based on the revealed hierarchical structure, our method enables users to efficiently find the desired contents from a large amount of contents placed...
User-generated videos (UGVs) have dominated contemporary social networking sites (SNSs). Forecasting their popularity is of great relevance to a broad range of online services. All existing studies forecast popularity of UGVs using their popularity statistics that are accumulated for a period of time after they are uploaded. Hence, there is always a substantial time lag (days to weeks) before popularity...