Viewpoint variation is a major challenge in video-based human action recognition. We exploit the simultaneous RGB and Depth sensing of RGB-D cameras to address this problem. Our technique capitalizes on the complementary spatio-temporal information in the RGB and Depth frames of RGB-D videos to achieve viewpoint-invariant action recognition. We extract view-invariant features from the dense trajectories...
We propose a neural network architecture for depth map inference from monocular stabilized videos, with application to UAV videos in rigid scenes. Training is based on a novel synthetic dataset for navigation that mimics aerial footage from a gimbal-stabilized monocular camera in rigid scenes. Based on this network, we propose a multi-range architecture for unconstrained UAV flight, leveraging flight...
Automatic License Plate Recognition (ALPR) has been employed in many developed countries for traffic management, automatic speed control, tracking stolen cars and also in automatic toll systems for improving the traffic control. ALPR is a surveillance system that extracts the information from the vehicle license plate by capturing the images. Human intervention to recognize the license plates results...
In this paper we present a framework that is able to reliably and completely autonomously detect abnormal behavior in surveillance images. As input, we rely solely on a long-wave infrared (LWIR) image sensor. Our abnormal behavior detection pipeline consists of two consecutive stages. In a first stage, we perform efficient and fast pedestrian detection and tracking. In a second step, the detected...
Behavior or human action recognition is a hot research topic in real-time video surveillance systems. Dangerous accidents consist of dangerous actions by one or more persons. Thus, action recognition is very important for dangerous accident recognition. If videos captured by public cameras, especially videos of dangerous actions, can be processed and analyzed immediately to provide an early and...
Deep learning for human action recognition in videos is making significant progress, but is slowed down by its dependency on expensive manual labeling of large video collections. In this work, we investigate the generation of synthetic training data for action recognition, as it has recently shown promising results for a variety of other computer vision tasks. We propose an interpretable parametric...
The successful deep convolutional neural networks for visual object recognition typically rely on a massive number of training images that are well annotated by class labels or object bounding boxes with great human efforts. Here we explore the use of the geographic metadata, which are automatically retrieved from sensors such as GPS and compass, in weakly-supervised learning techniques for landmark...
Vehicle re-identification (re-id) plays an important role in the automatic analysis of the drastically increasing urban surveillance videos. Similar to the other image retrieval problems, vehicle re-id suffers from the difficulties caused by various poses of vehicles, diversified illuminations, and complicated environments. Triplet-wise training of convolutional neural network (CNN) has been studied...
Activity recognition is growing in importance due to two key factors: first, there is an increased need for human assistance and surveillance; and second, the increased availability of datasets and improved image recognition algorithms have allowed effective recognition of more sophisticated activities. In this paper we develop an activity recognition approach to support visually impaired...
Fine-grained activities are human activities involving small objects and small movements. Automatic recognition of such activities can prove useful for many applications, including detailed diarization of meetings and training sessions, assistive human-computer interaction and robotics interfaces. Existing approaches to fine-grained activity recognition typically leverage the combined use of multiple...
Energy saving is an effort to decrease and minimize unnecessary energy consumption. Energy saving and energy efficiency have been prominent issues over the last decade, as energy resources are being rapidly depleted. Unnecessary energy consumption can be reduced through architectural design or through automated systems. This paper proposes an idea to develop a model of energy usage in a room...
Real-world CCTV footage often poses increased challenges in object tracking due to Pan-Tilt-Zoom operations, low camera quality and diverse working environments. The most relevant challenges are moving backgrounds, motion blur and severe scale changes. Convolutional neural networks, which offer state-of-the-art performance in object detection, are increasingly utilized to pursue a more efficient tracking...
Motivated by the recent advances in human-robot interaction we present a new dataset, a suite of tools to handle it and state-of-the-art work on visual gestures and audio commands recognition. The dataset has been collected with an integrated annotation and acquisition web-interface that facilitates on-the-way temporal ground-truths for fast acquisition. The dataset includes gesture instances in which...
The paper describes a computer vision method for estimating the clinical gait metrics of walking patients in unconstrained environments. The method employs background subtraction to produce a silhouette of the subject and a randomized decision forest to detect their feet. Given the feet detections, the stride and step length, cadence, and walking speed are estimated. Validation of the system is presented...
A fascinating question in digital forensic investigation is whether, given a digital video, it is possible to identify the camera model that was used to capture it. In this paper we take a simplified form of this problem by attempting to recognize recordings captured by a predetermined number of camera models. We propose various features which could be utilized by a classifier to distinguish...
Soccer is a very popular sport but also has a high rate of injuries. In this paper, player falling events in soccer videos are classified into five major categories. These categories have been identified by soccer coaches as the major mechanisms behind player injuries. Automatic detection of these events will be useful to coaches to plan specific training modules and to impart individual training...
In this paper, a deep convolutional neural network based approach to the problem of automatically recognizing jersey numbers from soccer videos is presented. Two different jersey number vector encoding schemes are presented and compared to each other. The first treats every number as a separate class, while the second one treats each digit as a class. Additionally, the semi-automatic process for the...
In recent years, accurate pedestrian detection from in-vehicle camera images has attracted attention for developing driving safety assistance systems. Currently, successful methods are based on statistical learning. However, such methods require a large number of training images. Thus, a decrease in the number of training images degrades the detection accuracy. That is, in driving environments...
High efficiency video coding (HEVC), as an up-to-date video coding standard, is becoming widely used, due to its preeminent performance. We argue in this paper that the new HEVC encoder can be utilized to provide efficient features for video segmentation. Thus, we propose a novel method, which learns to segment videos in HEVC compressed domain. Specifically, three features are extracted from HEVC...
The Internet of Things (IoT) has triggered rapid advances in sensors, surveillance devices, wearables and body area networks with advanced Human-Computer Interfaces (HCI). One such application area is the adoption of Body Worn Cameras (BWCs) by law enforcement officials. The need to be ‘always-on’ puts heavy constraints on battery usage in these camera front-ends, thus limiting their widespread adoption...