In this paper, we propose an effective face completion algorithm using a deep generative model. Different from well-studied background completion, the face completion task is more challenging as it often requires generating semantically new pixels for the missing key components (e.g., eyes and mouths) that contain large appearance variations. Unlike existing nonparametric algorithms that search for...
Convolutional neural networks have recently demonstrated high-quality reconstruction for single-image super-resolution. In this paper, we propose the Laplacian Pyramid Super-Resolution Network (LapSRN) to progressively reconstruct the sub-band residuals of high-resolution images. At each pyramid level, our model takes coarse-resolution feature maps as input, predicts the high-frequency residuals,...
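The pyramid arithmetic the LapSRN abstract describes can be illustrated with a minimal numpy sketch. In the actual network, each level's residual is predicted by learned convolutions; here, as a hedged stand-in, we compute the true sub-band residuals with fixed average-pooling and nearest-neighbour operators, so reconstruction is exact and only the coarse-to-fine structure is shown.

```python
import numpy as np

def downsample(img):
    # 2x2 average pooling as a stand-in for a learned feature encoder
    return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

def upsample(img):
    # nearest-neighbour x2 upsampling as a stand-in for a learned upsampler
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(hr, levels):
    """Return the coarsest image plus per-level high-frequency residuals."""
    residuals, cur = [], hr
    for _ in range(levels):
        coarse = downsample(cur)
        residuals.append(cur - upsample(coarse))  # sub-band residual at this level
        cur = coarse
    return cur, residuals[::-1]                   # coarse-to-fine order

def reconstruct(coarse, residuals):
    """Progressively upsample and add one residual per pyramid level."""
    cur = coarse
    for res in residuals:
        cur = upsample(cur) + res
    return cur

hr = np.arange(64, dtype=float).reshape(8, 8)
coarse, residuals = build_pyramid(hr, levels=2)
sr = reconstruct(coarse, residuals)
assert np.allclose(sr, hr)  # exact when the true residuals are available
```

In LapSRN the `residuals` would come from a CNN branch rather than from the ground-truth image, but the progressive add-and-upsample loop is the same.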
Reconstructing the detailed geometric structure of a face from a given image is a key to many computer vision and graphics applications, such as motion capture and reenactment. The reconstruction task is challenging as human faces vary extensively when considering expressions, poses, textures, and intrinsic geometries. While many approaches tackle this complexity by using additional data to reconstruct...
The 3D shapes of faces are well known to be discriminative. Yet despite this, they are rarely used for face recognition and always under controlled viewing conditions. We claim that this is a symptom of a serious but often overlooked problem with existing methods for single view 3D face reconstruction: when applied in the wild, their 3D estimates are either unstable and change for different photos...
We propose split-brain autoencoders, a straightforward modification of the traditional autoencoder architecture, for unsupervised representation learning. The method adds a split to the network, resulting in two disjoint sub-networks. Each sub-network is trained to perform a difficult task – predicting one subset of the data channels from another. Together, the sub-networks extract features...
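The cross-channel prediction task at the heart of the split-brain autoencoder can be sketched in a few lines. This is a toy illustration, not the paper's architecture: each "sub-network" here is a linear map fit by least squares on synthetic two-channel-subset data, standing in for a trained convolutional sub-network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples with 4 "channels"; channels 2-3 depend linearly
# on channels 0-1, so cross-channel prediction is learnable.
x_a = rng.normal(size=(200, 2))                      # first channel subset
mix = np.array([[1.0, -0.5], [0.3, 2.0]])
x_b = x_a @ mix + 0.01 * rng.normal(size=(200, 2))   # second channel subset

# Each "sub-network" predicts one subset from the other (linear stand-in).
w_a_to_b, *_ = np.linalg.lstsq(x_a, x_b, rcond=None)
w_b_to_a, *_ = np.linalg.lstsq(x_b, x_a, rcond=None)

# The concatenated predictions of the two disjoint sub-networks act as
# the extracted feature representation.
features = np.concatenate([x_a @ w_a_to_b, x_b @ w_b_to_a], axis=1)
assert features.shape == (200, 4)
```

The design point being illustrated: because each half must predict the other, neither half can collapse to a trivial identity feature.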
In most state-of-the-art hashing-based visual search systems, local image descriptors of an image are first aggregated as a single feature vector. This feature vector is then subjected to a hashing function that produces a binary hash code. In previous work, the aggregating and the hashing processes are designed independently. In this paper, we propose a novel framework where feature aggregating and...
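The conventional two-stage pipeline that this abstract says is usually designed independently (aggregate, then hash) can be sketched as follows. The mean pooling and random sign projection below are assumed simple stand-ins for the aggregation and hashing stages; the paper's contribution is to learn the two jointly instead.

```python
import numpy as np

rng = np.random.default_rng(1)

def hash_image(descriptors, projection):
    """Two independent stages: aggregate local descriptors, then binarize."""
    feature = descriptors.mean(axis=0)                   # aggregation (mean pooling)
    return (feature @ projection > 0).astype(np.uint8)   # hashing (sign of projection)

dim, bits = 64, 16
projection = rng.normal(size=(dim, bits))   # random projection for the hash stage

descs = rng.normal(size=(30, dim))          # 30 local descriptors for one image
code = hash_image(descs, projection)
assert code.shape == (bits,)
```

Each image ends up as a 16-bit binary code suitable for Hamming-distance search.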
This paper presents a novel method for detecting pedestrians under adverse illumination conditions. Our approach relies on a novel cross-modality learning framework and it is based on two main phases. First, given a multimodal dataset, a deep convolutional network is employed to learn a non-linear mapping, modeling the relations between RGB and thermal data. Then, the learned feature representations...
Unsupervised learning is an effective way to train neural networks, yet unsupervised learning algorithms remain scarce. Generative models, which build a probabilistic model of the input data and can generate new data resembling the samples, are well suited to unsupervised learning. The variational autoencoder is a typical generative model that differs from common...
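Two ingredients distinguish the variational autoencoder from a common autoencoder: sampling the latent code via the reparameterization trick, and regularizing it with a closed-form KL divergence to a standard normal prior. A minimal numpy sketch of both (the encoder outputs `mu` and `log_var` are placeholders here, not a trained model):

```python
import numpy as np

rng = np.random.default_rng(42)

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps with eps ~ N(0, I), keeping the
    sampling step differentiable with respect to mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_divergence(mu, log_var):
    """KL(q(z|x) || N(0, I)) in closed form, summed over latent dims."""
    return -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1)

mu = np.zeros((5, 3))        # stand-in encoder outputs for a batch of 5
log_var = np.zeros((5, 3))   # zero mean, unit variance -> KL is exactly zero
z = reparameterize(mu, log_var)
assert z.shape == (5, 3)
assert np.allclose(kl_divergence(mu, log_var), 0.0)
```

The VAE training loss is then the reconstruction error plus this KL term.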
Learning based methods have shown very promising results for the task of depth estimation in single images. However, most existing approaches treat depth prediction as a supervised regression problem and as a result, require vast quantities of corresponding ground truth depth data for training. Just recording quality depth data in a range of environments is a challenging problem. In this paper, we...
In this paper, we propose a novel generative model named Stacked Generative Adversarial Networks (SGAN), which is trained to invert the hierarchical representations of a bottom-up discriminative network. Our model consists of a top-down stack of GANs, each learned to generate lower-level representations conditioned on higher-level representations. A representation discriminator is introduced at each...
Magnetic Resonance Imaging (MRI) offers high-resolution in vivo imaging and rich functional and anatomical multimodality tissue contrast. In practice, however, there are challenges associated with considerations of scanning costs, patient comfort, and scanning time that constrain how much data can be acquired in clinical or research studies. In this paper, we explore the possibility of generating...
In this paper, we propose an unsupervised feature learning method called deep binary descriptor with multi-quantization (DBD-MQ) for visual matching. Existing learning-based binary descriptors such as the compact binary face descriptor (CBFD) and DeepBit utilize the rigid sign function for binarization regardless of the data distribution, thereby suffering from severe quantization loss. In order to address...
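The quantization loss the abstract attributes to the rigid sign function is easy to demonstrate: on a skewed feature dimension, fixed centres at -1 and +1 lose far more information than data-adaptive centres. The tiny 1-D k-means below is only an illustrative stand-in for DBD-MQ's learned multi-quantization, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def sign_quantize(x):
    """Rigid sign binarization: centres fixed at -1 and +1."""
    return np.where(x >= 0, 1.0, -1.0)

def kmeans_1d(x, k=2, iters=20):
    """Tiny 1-D k-means: quantization centres adapted to the data."""
    centres = np.quantile(x, np.linspace(0.25, 0.75, k))
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centres[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = x[labels == j].mean()
    return centres[np.argmin(np.abs(x[:, None] - centres[None, :]), axis=1)]

# Skewed descriptor dimension: almost all values are positive,
# so the sign function maps everything to +1.
x = rng.normal(loc=3.0, scale=0.5, size=1000)
err_sign = np.mean((x - sign_quantize(x)) ** 2)
err_kmeans = np.mean((x - kmeans_1d(x)) ** 2)
assert err_kmeans < err_sign   # adaptive centres lose much less information
```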
Endmember extraction is a fundamental task in spectral unmixing of remotely sensed hyperspectral images. In this work, we develop a new robust algorithm for endmember extraction which is based on a nonnegative sparse autoencoder. The proposed approach is based on two main steps. First, it uses an automatic sampler approach with local outlier factor and affinity propagation to intelligently gather...
An improved dictionary-learning-based super-resolution image reconstruction algorithm is studied, motivated by the time-consuming training process of existing dictionary-learning methods. In this paper, super-resolution image reconstruction is realized within the compressed sensing framework. Image patches are represented as sparse linear combinations over an over-complete dictionary. In the process...
One significant advantage of the deep convolutional neural networks (DCNN) is their representational ability for local complex structures. Inspired by this observation, a DCNN based residual learning model is proposed to learn a nonlinear mapping function between the high-resolution (HR) and low-resolution (LR) image patches. The DCNN is trained based on image patches, which are only sampled from...
Learning-based face super-resolution approaches rely on a representative dictionary learned from training samples as a self-similarity prior to estimate the relationship between low-resolution (LR) and high-resolution (HR) image patches. The most popular approaches learn a mapping function directly from LR patches to HR ones, but neglect the multi-layered nature of the image degradation process (resolution down-sampling)...
Several models based on deep neural networks have been applied to single-image super-resolution and obtained great improvements in terms of both reconstruction accuracy and computational performance. All these methods focus either on performing the super-resolution (SR) reconstruction operation in the high-resolution (HR) space after upscaling with a single filter, usually bicubic interpolation, or optimizing...
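The alternative to upscaling with a single bicubic filter before the network is to keep all computation in low-resolution space and rearrange learned feature channels into the HR image at the very end, as in sub-pixel convolution networks. A numpy sketch of that channel-to-space rearrangement (the "periodic shuffle" or depth-to-space step; the learned convolutions that produce the channels are omitted):

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange (C*r*r, H, W) feature maps into a (C, H*r, W*r) image.

    out[c, h*r + i, w*r + j] == x[c*r*r + i*r + j, h, w]
    """
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)        # split channel axis into (c, r, r)
    x = x.transpose(0, 3, 1, 4, 2)      # interleave: (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)

# 4 feature channels over a 2x2 LR grid become one 4x4 HR channel (r = 2).
x = np.arange(16, dtype=float).reshape(4, 2, 2)
y = pixel_shuffle(x, 2)
assert y.shape == (1, 4, 4)
```

Because every convolution runs at LR resolution, the network does roughly r^2 times less spatial work than an equivalent network operating after bicubic upscaling.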
Uniform test suites consist of test cases exclusively differing in test inputs - not in test goals. Intended to gain confidence that a given invariant holds, these inputs trigger particular behavior of the system under test. Equipped with a simulation of the system under test we are able to cheaply explore this behavior virtually. When changing over to reality, testing the system within its real context,...
Face hallucination, which refers to predicting a High-Resolution (HR) face image from an observed Low-Resolution (LR) one, is a challenging problem. Most state-of-the-art methods employ a local face structure prior to estimate the optimal representation for each patch from the training patches at the same position, and achieve good reconstruction performance. However, they do not take into account the contextual...
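The position-patch scheme the abstract describes can be sketched as follows: represent the input LR patch as a weighted combination of the training LR patches at the same position, then transfer those weights to the corresponding HR patches. The ridge-regularized least-squares estimator and all shapes below are assumptions for illustration, not the specific estimator of any one paper.

```python
import numpy as np

rng = np.random.default_rng(7)

def hallucinate_patch(lr_patch, lr_train, hr_train, eps=1e-3):
    """Fit weights so lr_train.T @ w approximates lr_patch, then apply
    the same weights to the aligned HR training patches.

    lr_train: (n, d_lr) training LR patches at this face position
    hr_train: (n, d_hr) corresponding HR patches
    """
    # Ridge-regularized least squares for the combination weights.
    gram = lr_train @ lr_train.T + eps * np.eye(lr_train.shape[0])
    w = np.linalg.solve(gram, lr_train @ lr_patch)
    return hr_train.T @ w

n, d_lr, d_hr = 50, 9, 36        # e.g. 3x3 LR patches and 6x6 HR patches
lr_train = rng.normal(size=(n, d_lr))
hr_train = rng.normal(size=(n, d_hr))
hr_patch = hallucinate_patch(lr_train[0], lr_train, hr_train)
assert hr_patch.shape == (d_hr,)
```

Running this independently per patch position is exactly the locality that the abstract's criticism targets: no information flows between neighbouring positions.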
Visual attention is a dynamic search process of acquiring information. However, most previous studies have focused on the prediction of static attended locations. Without considering the temporal relationship of fixations, these models usually cannot explain the dynamic saccadic behavior well. In this paper, an iterative representation learning framework is proposed to predict the saccadic scanpath...