The main purpose of transfer learning is to address differing data distributions, typically when the training samples of the source domain differ from those of the target domain. Prediction of salient areas in natural video suffers from the lack of large video benchmarks with human gaze fixations. Different databases only provide dozens up to one or two hundred of...
An efficient stereo matching algorithm, which applies adaptive smoothness constraints using texture and edge information, is proposed in this work. First, we determine non-textured regions, on which an input image yields flat pixel values. In the non-textured regions, we penalize depth discontinuity and complement the primary CNN-based matching cost with a color-based cost. Second, by combining two...
Computationally transcribing historical document images to digital text often requires an initial, labor intensive recording of ground-truths by language experts to provide the OCR system with training text. This paper presents a framework for the automatic generation of training data, provided only with labeled character images and a digital font, thus removing the need for manually generated text...
Visual question answering (VQA) has emerged from major advances in computer vision and natural language processing; it requires a deep understanding of images and questions and their effective integration. Current work on VQA simply concatenates visual and textual features or compares them via dot product, which cannot eliminate the semantic difference between them. We argue to...
Ocular recognition in smartphone authentication applications is gaining popularity in academic research and in the commercial sector, where operators are requesting reliable and robust biometric authentication. The wide acceptance of such ocular-based authentication systems also depends on the verification performance in large-scale testing with different data subject ethnic groups and platforms....
We present COVERAGE — a novel database containing copy-move forged images and their originals with similar but genuine objects. COVERAGE is designed to highlight and address tamper detection ambiguity of popular methods, caused by self-similarity within natural images. In COVERAGE, forged-original pairs are annotated with (i) the duplicated and forged region masks, and (ii) the tampering factor/similarity...
In this work, we present a new multiple channel feature called Deep Compact Channel Feature (DCCF), which generates a compact, discriminative feature representation using a pre-trained deep encoder-decoder. By combining DCCF with boosted decision trees, we propose a new object detector that achieves outstanding performance on the standard pedestrian datasets INRIA and Caltech. Furthermore, a large...
Recent advances in salient object detection have exploited deep Convolutional Neural Networks (CNNs) to represent high-level semantics. However, due to the presence of convolutional and pooling layers, it is difficult for a CNN to generate a saliency map with sharp boundaries. In this paper, we propose a multi-scale mask-based Fast R-CNN framework which generates a saliency score for each region. Since the...
With the increased focus on visual attention (VA) in the last decade, a large number of computational visual saliency methods have been developed. These models are evaluated by using performance evaluation metrics that measure how well a predicted map matches eye-tracking data obtained from human observers. Though there are a number of existing performance evaluation metrics, there is no clear consensus...
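As an illustration of what such a metric can look like, the sketch below computes Normalized Scanpath Saliency (NSS), one commonly used measure of agreement between a predicted saliency map and recorded fixations. It is not taken from the paper above; the function name `nss`, the list-of-rows map layout, and the `(row, col)` fixation format are assumptions made for this example.

```python
from statistics import mean, pstdev


def nss(saliency, fixations):
    """Normalized Scanpath Saliency (illustrative sketch).

    saliency  -- 2D list of floats (rows of pixel saliency values); assumed layout
    fixations -- iterable of (row, col) pixel coordinates of human fixations
    Returns the mean z-scored saliency value at the fixated pixels.
    """
    values = [v for row in saliency for v in row]
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation over the whole map
    if sigma == 0:
        raise ValueError("saliency map is constant; NSS is undefined")
    # Average the standardized saliency over all fixation locations.
    return mean((saliency[r][c] - mu) / sigma for r, c in fixations)


if __name__ == "__main__":
    toy_map = [[0.0, 0.0],
               [0.0, 1.0]]
    # A fixation on the salient pixel scores higher than one on the background.
    print(nss(toy_map, [(1, 1)]), nss(toy_map, [(0, 0)]))
```

A positive score means fixated pixels are, on average, more salient than the map mean; a score near zero indicates chance-level agreement.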
Correlation filters have been extensively studied to address the online visual object tracking task, achieving favourable performance against state-of-the-art methods on various benchmark datasets. Nevertheless, undesired conditions, such as partial occlusions or abrupt deformations of the object appearance, severely degrade the performance of correlation filter based tracking methods. To this...
Correlation filters have recently made significant improvements in visual object tracking in both efficiency and accuracy. In this paper, we propose a sparse correlation filter, which combines the effectiveness of sparse representation and the computational efficiency of correlation filters. The sparse representation is achieved through solving an ℓ0 regularized least squares problem. The obtained...
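For reference, an ℓ0 regularized least squares problem of the kind mentioned in the abstract above generally takes the form below. The symbols are placeholders chosen for this note and are not the paper's notation: X is a data matrix, y a target vector, w the coefficient vector being solved for, and λ > 0 a weight on the sparsity penalty. The term ‖w‖₀ counts the nonzero entries of w.

```latex
\min_{w}\; \lVert X w - y \rVert_2^2 + \lambda \lVert w \rVert_0
```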
Multidimensional image data, i.e., images with three or more dimensions, are used in many areas of science. Multidimensional image processing is supported in Python and MATLAB. VisionGL is an open source library that provides a set of image processing functions and can help the programmer by automatically generating code. The objective of this work is to augment VisionGL by adding multidimensional...
Circle detection from digital images is a necessary operation in many robotics and computer vision tasks to facilitate shape and object recognition. We propose and analyze a novel method, based on line segment detection and circle completeness verification, to detect circles in images. The key idea is to use line segments instead of raw edge pixels to get the circle candidates followed by a verification...
Appearance models are widely used for image description and demonstrate impressive performance in object detection. However, most appearance models cannot be applied to objects with greater freedom in still images, especially when dealing with variant objects whose shapes are modified by warping, rotation, etc. In this article, a simple but effective method to build a regional rotation-invariant feature descriptor...
Convolutional Neural Network (CNN) based image representations have achieved high performance in image retrieval tasks. However, traditional CNN based global representations either provide high-dimensional features, which incur large memory consumption and computing cost, or inadequately capture discriminative information in images, which degrades the usefulness of CNN features. To address...
Document is unavailable: This DOI was registered to an article that was not presented by the author(s) at this conference. As per section 8.2.1.B.13 of IEEE's "Publication Services and Products Board Operations Manual," IEEE has chosen to exclude this article from distribution. We regret any inconvenience.
This paper explores a pragmatic approach to multiple object tracking where the main focus is to associate objects efficiently for online and realtime applications. To this end, detection quality is identified as a key factor influencing tracking performance, where changing the detector can improve tracking by up to 18.9%. Despite only using a rudimentary combination of familiar techniques such as...
Automatic object detection is a rapidly evolving area in surveillance and autonomous vehicles. The deformable part model (DPM) is an object detector well known both for its high precision and for its speed bottleneck. This paper proposes a very fast object detection pipeline based on complementary techniques to accelerate DPM. A recent fast feature pyramid technique is employed with look-up table HOG features, Fast...
Studying tissue structure in 3D is beneficial in many applications. Reconstructing the structure based on histological sections has the advantages of high resolution and compatibility with conventional staining and interpretation techniques. However, obtaining an accurate 3D reconstruction based on a sequence of 2D sections is a difficult task. Evaluating the accuracy of such reconstructions is also...
Although significant progress in pedestrian detection has been made in recent years, detecting pedestrians in crowded scenes remains a challenging problem. In this paper, we propose to use visual contexts based on scale and occlusion cues from nearby detections to better detect pedestrians for surveillance applications. Specifically, we first apply detectors based on full body and parts to generate...