Films seek to elicit emotions in viewers by infusing the story they tell with an affective character or tone - in a word, a mood. In content-based multimedia analysis, considerable effort has been made to develop methods to estimate film affect computationally. However, results have been hampered by a tendency to classify film scenes either by genre or not at all, while other potentially helpful classification...
In monocular vision systems, lack of knowledge about metric distances caused by the inherent scale ambiguity can be a strong limitation for some applications. We offer a method for fusing inertial measurements with monocular odometry or tracking to estimate metric distances in inertial-monocular systems and to increase the rate of pose estimates. As we performed the fusion in a loosely-coupled manner,...
The human visual system employs an information selection mechanism, visual attention, so that higher-level cognitive processes can be restricted to a potentially important subset of the incoming information. This mechanism is amenable to efficient computational implementation and, consequently, it has been incorporated into many technological applications. Among these applications is autonomous mobile...
The intensive annotation cost and the rich but unlabeled data contained in videos motivate us to propose an unsupervised video-based person re-identification (re-ID) method. We start from two assumptions: 1) different video tracklets typically contain different persons, given that the tracklets are taken at distinct places or with long intervals; 2) within each tracklet, the frames are mostly of the...
In this paper we introduce a novel Depth-Aware Video Saliency approach to predict human focus of attention when viewing videos that contain a depth map (RGBD) on a 2D screen. Saliency estimation in this scenario is highly important since in the near future 3D video content will be easily acquired yet hard to display. Despite considerable progress in 3D display technologies, most are still expensive...
Direct methods for visual odometry have gained popularity because they do not need to compute feature descriptors and instead use the raw camera sensor values directly, which makes them very fast. However, their accuracy and consistency are not satisfactory. Based on these considerations, we propose a tightly-coupled, optimization-based method to fuse inertial measurement unit (IMU) and visual measurements, in which uses...
This paper explores freehand physical interaction in egocentric Mixed Reality by performing a usability study on the use of hand posture estimation sensors. We report on precision, interactivity and usability metrics in a task-based user study, exploring the importance of additional visual cues when interacting. A total of 750 interactions were recorded from 30 participants performing 5 different...
In this paper, we describe the application TaRDIS, a visual analytics system for spatial and temporal data designed for the needs of archaeo-related disciplines that supports domain experts in analyzing their data. The temporal data is visualized in the form of an interactive Harris Matrix that illustrates the temporal position of the layers. The 2D and 3D visualization sketches the spatial position of...
At disaster sites, Micro Unmanned Aerial Vehicles (MUAVs) are expected to be used for human safety. One application is to support first-phase emergency restoration work conducted by teleoperated construction machines. To extend the operation time of a MUAV, the authors proposed a power-feeding tethered MUAV to provide an overhead view of the site to operators. The target application is to be used outdoors,...
This paper presents a new method for video preference estimation using functional near-infrared spectroscopy signals (fNIRS signals). The proposed method first computes fNIRS features from fNIRS signals recorded while users are watching videos and multiple visual features from these videos. Next, by applying Locality Preserving Canonical Correlation Analysis to fNIRS features and each visual feature,...
To obtain depth information from a stereo camera setup, a common way is to conduct disparity estimation between the two views; the disparity map thus generated may then also be used to synthesize arbitrary intermediate views. A straightforward approach to disparity estimation is block matching, which performs well with perspective data. When dealing with non-perspective imagery such as obtained from...
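The block matching mentioned in the abstract above can be illustrated with a minimal sketch. It assumes rectified perspective images given as lists of rows of grayscale values, and it uses a sum-of-absolute-differences cost; the function name `block_match_disparity` and its parameters are hypothetical and are not taken from the paper.

```python
def block_match_disparity(left, right, block=1, max_disp=4):
    """Estimate a disparity map by brute-force block matching.

    left, right: rectified grayscale images as lists of rows (same size).
    block: half-width of the square matching window.
    max_disp: largest horizontal shift (in pixels) that is tested.
    Border pixels that cannot hold a full window keep disparity 0.
    """
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(block, h - block):
        for x in range(block, w - block):
            best_cost, best_d = None, 0
            for d in range(max_disp + 1):
                # Stop once the shifted window would leave the right image.
                if x - d - block < 0:
                    break
                # Sum of absolute differences over the window.
                cost = 0
                for dy in range(-block, block + 1):
                    for dx in range(-block, block + 1):
                        cost += abs(left[y + dy][x + dx]
                                    - right[y + dy][x + dx - d])
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y][x] = best_d
    return disp


if __name__ == "__main__":
    # Synthetic pair: the left view is the right view shifted by 2 pixels.
    w, h = 10, 5
    right = [[x * x + y for x in range(w)] for y in range(h)]
    left = [[right[y][x - 2] if x >= 2 else 0 for x in range(w)]
            for y in range(h)]
    print(block_match_disparity(left, right)[2])
```

The search runs only along the horizontal scanline, which is valid for rectified perspective views; this is the assumption that stops holding for non-perspective imagery.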
Omnidirectional images describe the color information at a given position from all directions. Affordable 360° cameras have recently been developed, leading to an explosion of 360° data shared on social networks. However, an omnidirectional image does not contain interesting content everywhere. Some parts of an image are indeed more likely to be looked at by some users than others. Knowing these...
Research efforts have been devoted to extraction and visualization of vortices in an unsteady (turbulent) flow. Characterizing the behaviors of the flow, vortices are identifiable as regions using a vortex detector known as the lambda2-criterion. Isosurface visualization renders vortex regions based on a chosen isovalue. However, it is highly challenging to choose one isovalue suitable for visualizing...
An inertial-aided visual servo control approach for fully-actuated Autonomous Underwater Vehicles (AUVs) without relying on linear velocity measurements is proposed. The homography obtained from corresponding images of a locally planar scene is directly exploited as feedback information. A cascade inner-outer loop control architecture is adopted that facilitates both control implementation and gain...
In this paper, a method of estimating binding time is proposed for ophthalmological outpatients. Binding time equals the sum of transit time and waiting time at a hospital. It determines the number of outpatients virtually arriving and their arrival time intervals according to the gamma distribution and the exponential distribution. It then assigns examinations to outpatients virtually arriving, referring...
Aesthetic quality assessment plays an important role in how people organize large image collections. Many studies on aesthetic quality assessment are based on design of hand-crafted features without considering whether attributes conveyed by images can actually affect image aesthetics. This paper presents an aesthetic quality assessment method which uses new visual features. The proposed method utilizes...
UAVs (Unmanned Aerial Vehicles) have been widely used in power line inspections, but their low autonomous cruise capacity imposes strict requirements on operators and landing sites during these inspections. This paper presents an autonomous landing control technique for UAVs charging at electric towers, based on a vision positioning method. The proposed system consists of three...
In order to realize autonomous landing of the unmanned aerial vehicle (UAV) in power patrolling, a vision-based method for UAVs using Faster Regions with Convolutional Neural Network (Faster R-CNN) is studied. In this paper, we design a landing sign that combines concentric circles and a pentagon, and propose the Faster R-CNN recognition algorithm which can be used to identify the target...
This paper proposes an image dehazing model built with a convolutional neural network (CNN), called All-in-One Dehazing Network (AOD-Net). It is designed based on a re-formulated atmospheric scattering model. Instead of estimating the transmission matrix and the atmospheric light separately as most previous models did, AOD-Net directly generates the clean image through a light-weight CNN. Such a novel...
2D Gabor filters are sinusoidal waves of some frequency and orientation within a 2D Gaussian envelope and are used to model simple cells (S1) in the primate visual cortex. The visual system in primates extracts information in both the 2D spatial and frequency domains. In this paper, we use multiple spatial frequencies to model S1 cells in the primate visual cortex. Thereafter, we use MAX and STD operators to extract...
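The definition of a 2D Gabor filter given in the abstract above (a sinusoid of some frequency and orientation inside a 2D Gaussian envelope) can be written out as a small sketch. The function name `gabor_kernel`, its parameters, and the chosen wavelengths are illustrative assumptions and do not reproduce the paper's settings.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Return a size x size real-valued Gabor kernel as a list of rows.

    wavelength: period of the sinusoidal carrier in pixels (1 / spatial frequency).
    theta: orientation of the carrier in radians.
    sigma: standard deviation of the Gaussian envelope.
    gamma: spatial aspect ratio of the envelope.
    psi: phase offset of the carrier.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a centre pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope multiplied by a cosine carrier.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


if __name__ == "__main__":
    # A small bank over several spatial frequencies at one orientation
    # (the wavelengths here are arbitrary illustrative values).
    bank = [gabor_kernel(7, wl, 0.0, 2.0) for wl in (3.0, 4.0, 6.0)]
    print(len(bank), len(bank[0]), len(bank[0][0]))
```

Varying `wavelength` while holding `theta` fixed gives one kernel per spatial frequency, which is one simple way to build a multi-frequency bank.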