Since the earliest civilizations, social relationships between individuals have formed the basis of social structure in our daily life. In the computer vision literature, much progress has been made in scene understanding, such as object detection and scene parsing. Recent research focuses on the relationships between objects based on their functionality and geometrical relations...
In recent years, automatic detection of on-shelf products has become an industrial need along with the technological improvements in the field of computer vision. In this respect, localization of on-shelf products and detection of the brands of these products have evolved into two main objectives. In this work, a hidden Markov model, which is commonly used in signal processing for post-processing...
In this paper, we propose patch-based deep Boltzmann shape priors for visual tracking. The shape priors are generated by a deep Boltzmann machine network. The network consists of three layers of hidden and visible units. The generated shapes not only maintain general shapes from a variety of poses, but also entail local modifications with high probability.
What defines a visual style? Fashion styles emerge organically from how people assemble outfits of clothing, making them difficult to pin down with a computational model. Low-level visual similarity can be too specific to detect stylistically similar images, while manually crafted style categories can be too abstract to capture subtle style differences. We propose an unsupervised approach to learn...
Deep embeddings answer one simple question: How similar are two images? Learning these embeddings is the bedrock of verification, zero-shot learning, and visual search. The most prominent approaches optimize a deep convolutional network with a suitable loss function, such as contrastive loss or triplet loss. While a rich line of work focuses solely on the loss functions, we show in this paper that...
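As a rough illustration of the two loss functions named in the abstract above, the following minimal sketch computes a contrastive loss and a triplet loss on plain Python lists. The function names, the Euclidean distance, and the margin values are assumptions chosen for illustration; they are not taken from the paper, which may use different formulations.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same, margin=1.0):
    """Pull matching pairs together; push non-matching pairs apart up to a margin.

    `same` is True when the two embeddings belong to the same class.
    The margin of 1.0 is an illustrative default, not a value from the paper.
    """
    d = euclidean(a, b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Require the anchor to be closer to the positive than to the negative by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

A triplet whose negative is already much farther from the anchor than its positive contributes zero loss, which is one reason the choice of training examples matters in practice.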
Recent ground-breaking works have shown that deep neural networks can be trained end-to-end to regress dense disparity maps directly from image pairs. Computer-generated imagery is used to gather the large data corpus required to train such networks, and additional fine-tuning allows the model to be adapted to work well in real and possibly diverse environments. Yet, besides a few public datasets...
Humans take advantage of real-world symmetries for various tasks, yet capturing their superb symmetry perception mechanism with a computational model remains elusive. Motivated by a new study demonstrating the extremely high inter-person accuracy of human-perceived symmetries in the wild, we have constructed the first deep-learning neural network for reflection and rotation symmetry detection (Sym-NET),...
Domain adaptation (DA) allows machine learning methods trained on data sampled from one distribution to be applied to data sampled from another. It is thus of great practical importance to the application of such methods. Despite the fact that tensor representations are widely used in Computer Vision to capture multi-linear relationships that affect the data, most existing DA methods are applicable...
In this paper, we first provide a new perspective that divides existing high-performance object detection methods into direct and indirect regression. Direct regression performs boundary regression by predicting the offsets from a given point, while indirect regression predicts the offsets from some bounding box proposals. In the context of multi-oriented scene text detection, we analyze the drawbacks...
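The distinction between the two regression styles described above can be sketched as two box-decoding functions. This is a simplified illustration for axis-aligned boxes only: the offset parameterizations (distances to the four sides for the direct case, additive corner deltas for the indirect case) and the function names are assumptions, not the formulation used in the paper, which targets multi-oriented text.

```python
def decode_direct(point, offsets):
    """Direct regression: offsets are measured from a single reference point.

    point   -- (x, y) location at which the prediction is made
    offsets -- (left, top, right, bottom) distances from the point to the box sides
               (hypothetical parameterization)
    Returns the box as (x1, y1, x2, y2).
    """
    x, y = point
    left, top, right, bottom = offsets
    return (x - left, y - top, x + right, y + bottom)


def decode_indirect(proposal, deltas):
    """Indirect regression: offsets refine an existing bounding box proposal.

    proposal -- (x1, y1, x2, y2) box proposal
    deltas   -- (dx1, dy1, dx2, dy2) corrections added to the proposal corners
                (hypothetical parameterization)
    Returns the refined box as (x1, y1, x2, y2).
    """
    return tuple(p + d for p, d in zip(proposal, deltas))
```

Both functions can describe the same final box; they differ in what the predicted offsets are relative to.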
This paper describes an industrial close-range photogrammetric system that is being developed by the authors. We describe the system's architecture, which is based on an iterative scheme that allows fast heuristic point matching algorithms to be used with follow-up verification of the results. We also describe the design of our multi-functional coded targets that allow using those algorithms. Besides, this design...
Human motion recognition is a trending topic that could be applied in many areas. Motion estimation for children with ASD is more challenging because of the high uncertainty of their activities. We therefore introduce a novel method designed for estimating the upper joints and recognising their special motions. We verified the proposed method on our recorded ASD children dataset and an adult dataset,...
Most state-of-the-art motion segmentation algorithms draw their potential from modeling motion differences of local entities such as point trajectories in terms of pairwise potentials in graphical models. Inference in instances of minimum cost multicut problems defined on such graphs allows the number of resulting segments to be optimized along with the segment assignment. However, pairwise potentials...
Although Deep Convolutional Neural Networks (CNNs) have demonstrated their power in various computer vision tasks, the most important components of a CNN, convolutional layers and fully connected layers, are still limited to linear transformations. In this paper, we propose a novel Factorized Bilinear (FB) layer to model the pairwise feature interactions by considering the quadratic terms in the transformations...
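To make the idea of adding quadratic terms to a linear transformation concrete, the sketch below computes a single output unit as a bias, a linear term, and a pairwise interaction term whose weight matrix is factorized as F^T F. This low-rank form, and every name in the code, is an assumption made for illustration; the abstract is truncated and does not state the exact formulation.

```python
def fb_unit(x, w, b, F):
    """One output of a hypothetical factorized bilinear unit.

    Computes y = b + w.x + x^T (F^T F) x, evaluated as b + w.x + ||F x||^2
    so that the full n-by-n interaction matrix is never built.

    x -- input features, length n
    w -- linear weights, length n
    b -- scalar bias
    F -- k-by-n factor matrix given as a list of k rows (assumed low rank, k < n)
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    projected = [sum(fij * xj for fij, xj in zip(row, x)) for row in F]
    quadratic = sum(p * p for p in projected)
    return b + linear + quadratic
```

With k rows in F, the quadratic term needs k*n parameters instead of n*n, which is the usual motivation for factorizing a pairwise interaction matrix.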
The rapid and irregular motion of semen cells makes counting them difficult in visual assessment. Therefore, computer-based techniques are necessary to evaluate the tests more accurately. In this paper, an alternative to the visual assessment technique in spermiogram tests is presented. Analyses are performed on the recorded microscope video images by computer, automatically...
Video image datasets play an essential role in the design and evaluation of traffic vision methods. However, there is a longstanding difficulty: manually collecting and annotating large-scale, diversified datasets from real scenes is time-consuming and prone to error. In 2016, we proposed the parallel vision methodology to tackle the issues of conventional vision computing approaches in data collection,...
The contribution of this paper is to bridge the gap between understanding the mathematical structure of a convolutional neural network (CNN) and its computational implementation, using a minimal model (Minimal CNN). The proposed minimal CNN is presented using a layering approach. This approach provides a concise and accessible understanding of the main mathematical operations of a CNN. Hence, it benefits...
In recent years, Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in image classification. Their architectures have largely drawn inspiration from models of the primate visual system. However, while recent neuroscience research results prove the existence of non-linear operations in the response of complex visual cells, little effort has been devoted to extending...
Numerous computer vision problems such as stereo depth estimation, object-class segmentation and foreground/background segmentation can be formulated as per-pixel image labeling tasks. Given one or many images as input, the desired output of these methods is usually a spatially smooth assignment of labels. The large number of such computer vision problems has led to significant research efforts,...
The problem of transferring a deep convolutional network trained for object recognition to the task of scene image classification is considered. An embedded implementation of the recently proposed mixture of factor analyzers Fisher vector (MFA-FV) is proposed. This enables the design of a network architecture, the MFAFVNet, that can be trained in an end-to-end manner. The new architecture involves...
We address an essential problem in computer vision, that of unsupervised foreground object segmentation in video, where a main object of interest in a video sequence should be automatically separated from its background. An efficient solution to this task would enable large-scale video interpretation at a high semantic level without costly manual labeling. We propose an efficient unsupervised...