The development of a deep (stacked) convolutional auto-encoder in the Caffe deep learning framework is presented in this paper. We describe simple principles which we used to create this model in Caffe. The proposed model of convolutional auto-encoder does not have pooling/unpooling layers yet. The results of our experimental research show comparable accuracy of dimensionality reduction in comparison...
Recently, kernelized correlation filter-based trackers have attracted the interest of many researchers and achieved good results in the field of tracking. However, current tracking models based on kernelized correlation filters cannot effectively handle changes in target appearance and scale. Therefore, in this paper, we intend to solve these two problems and improve the robustness of...
This paper proposes a method for generative learning of hierarchical random field models. The resulting model, which we call the hierarchical sparse FRAME (Filters, Random field, And Maximum Entropy) model, generalizes the original sparse FRAME model by decomposing it into multiple parts that are allowed to shift their locations, scales and rotations, so that the resulting model becomes...
Human motion modelling is a classical problem at the intersection of graphics and computer vision, with applications spanning human-computer interaction, motion synthesis, and motion prediction for virtual and augmented reality. Following the success of deep learning methods in several computer vision tasks, recent work has focused on using deep recurrent neural networks (RNNs) to model human motion,...
For safe, natural and effective human-robot social interaction, it is essential to develop a system that allows a robot to demonstrate perceivable, responsive behaviors to complex human behaviors. We introduce the Multimodal Deep Attention Recurrent Q-Network, with which the robot exhibits human-like social interaction skills after 14 days of interacting with people in an uncontrolled real world...
Attributes are defined as mid-level image characteristics shared among different categories. These characteristics are suitable for handling classification problems, especially when training data are scarce. In this paper, we design discriminative real-valued attributes by learning nonlinear inductive maps. Our method is based on solving a constrained optimization problem that mixes three criteria;...
This paper analyzes the characteristics of three-dimensional scene visualization and the regularity of scene change in a flight simulation system, which form the basis for implementing terrain reconstruction. A mathematical model of terrain visualization is given. To resolve the problem that a three-dimensional scene cannot be reconstructed quickly because of large-scale terrain data, we present a procedure...
In this paper, a multiple-scaling-factor-based semi-blind watermarking scheme for grayscale images using the Online Sequential Extreme Learning Machine (OS-ELM) is proposed. A four-level DWT is applied to three standard test images of size 512 × 512. LL4 sub-band coefficients are chosen for watermark embedding. OS-ELM is initially tuned with a fixed number of training data used in its initial...
Convolutional neural networks play an increasingly important role in computer vision tasks, especially in the field of visual object recognition. Many prominent models, such as Inception, Maxout, ResNet, and NIN, have been proposed to significantly improve recognition performance. Inspired by those models, we propose a novel module called the self-adaptive module (SAM). SAM consists of four passes and...
Learning-based partial differential equations (PDEs), which combine fundamental differential invariants into a nonlinear regressor, have been successfully applied to several computer vision and image processing problems. However, they cannot be applied directly to saliency detection. In this paper, we present a novel learning-based PDE model and learn the PDEs from training samples. We simplify the current...
Procedural textures have been widely used as they can be easily generated from various mathematical models. However, the model parameters are not perceptually meaningful or uniform for non-expert users; it is therefore difficult for general users to obtain a desired texture by tuning the parameters. To meet users' requirements, we propose a novel procedural texture retrieval scheme that...
Automatically generating descriptions for visual data (images and video) has been a complicated task in the fields of Computer Vision and Artificial Intelligence. This paper discusses the workings of, and improvements to, an algorithm called Neural Image Captioner (NIC) by Oriol Vinyals and his team, which uses a deep convolutional and recurrent architecture to generate natural language sentences to describe...
We study basic-level categories for describing visual concepts, and empirically observe context-dependent basic-level names across thousands of concepts. We propose methods for predicting basic-level names using a series of classification and ranking tasks, producing the first large-scale catalogue of basic-level names for hundreds of thousands of images depicting thousands of visual concepts. We...
The process of synthetically producing an image illustrating merged parts of multiple source images is usually known as image morphing. In this work a system is presented which morphs more than two source images to one output image. Its focus lies on using ancient coin images belonging to a common coin type. Nowadays, these coins can be worn or damaged. The goal of the presented morphing framework...
Online technical forums are valuable sources for mining useful software engineering information. LDA (Latent Dirichlet Allocation) is an unsupervised machine learning method which can be used for extracting underlying topics from such large forums. However, the main outputs of LDA forum learning are usually huge matrices that contain millions of numbers, which are impossible for researchers to directly...
During the past few years, there has been a massive explosion of multimedia content such as un-annotated images on the web. Automatic image annotation is an important task for multimedia retrieval. By automatically allocating semantic concepts to un-annotated images, image retrieval can be performed over annotation concepts. In this work, we address the problem of automatic image annotation, namely...
This paper reports the preliminary development of a water-ski simulator for indoor training. Compared to existing training systems, the proposed simulator is capable of recreating a more realistic and immersive simulation experience by providing both proprioceptive and visual feedback to the practicing skier. In addition, it makes it possible to test practically any desired skiing manoeuvre, since the ski...
In current biological image analysis, temporal stage information, such as the developmental stage in Drosophila in situ hybridization images, is important for biological knowledge discovery. Such information is usually gained through visual inspection by experts. However, as high-throughput imaging technology becomes increasingly popular, the demand for labor effort on annotating,...
In this paper, we present a biologically-inspired object recognition system for humanoid robots. Our approach is based on a hierarchical model of the visual cortex for feature extraction and rapid scene categorization of natural images. We enhanced the model to be entropy-aware and real-time capable, to be able to realize object recognition over time. We integrate time in our system to model uncertainty...
In this paper we present a new appearance-based localisation system that is able to deal with dynamic elements in the scene. By independently modelling the properties of local features observed in a scene over long periods of time, we show that feature appearances and geometric relationships can be learned more accurately than when representing a location by a single image. We also present a new dataset...