Zero-shot learning, a special case of unsupervised domain adaptation where the source and target domains have disjoint label spaces, has become increasingly popular in the computer vision community. In this paper, we propose a novel zero-shot learning method based on discriminative sparse non-negative matrix factorization. The proposed approach aims to identify a set of common high-level semantic...
Most existing binary embedding methods prefer compact binary codes (b-dimensional) to avoid high computational and memory cost of projecting high-dimensional visual features (d-dimensional, b
This paper tackles the problem of efficient and effective object instance search in videos. To effectively capture the relevance between a query and video frames and precisely localize the particular object, we leverage the object proposals to improve the quality of object instance search in videos. However, hundreds of object proposals obtained from each frame could result in unaffordable memory...
Deep convolutional neural networks (CNNs) have proven highly effective for visual recognition, where learning a universal representation from the activations of a convolutional layer is a fundamental problem. In this paper, we present Fisher Vector encoding with Variational Auto-Encoder (FV-VAE), a novel deep architecture that quantizes the local activations of a convolutional layer in a deep generative...
We introduce Spatio-Temporal Vector of Locally Max Pooled Features (ST-VLMPF), a super vector-based encoding method specifically designed for local deep features encoding. The proposed method addresses an important problem of video understanding: how to build a video representation that incorporates the CNN features over the entire video. Feature assignment is carried out at two levels, by using the...
Traffic scene recognition is an important and challenging issue in Intelligent Transportation Systems (ITS). Recently, Convolutional Neural Network (CNN) models have achieved great success in many applications, including scene classification. The remarkable representational learning capability of CNN remains to be further explored for solving real-world problems. Vector of Locally Aggregated Descriptors...
A weighted free tree is an undirected connected graph with no cycles whose edges and nodes have weights (positive real numbers). The purpose of our research is to support the comparative analysis of weighted free trees. The fundamental categories of visualization techniques that support comparison are juxtaposition, superposition, and explicit encoding. Juxtaposition is often used for comparison....
The latest High Efficiency Video Coding (HEVC) standard has been increasingly used to generate video streams over the Internet. However, decoded HEVC video streams may suffer severe quality degradation, especially at low bit-rates. Thus, it is necessary to enhance the visual quality of HEVC videos at the decoder side. To this end, we propose in this paper a Decoder-side Scalable Convolutional Neural Network (DS-CNN)...
The ever-widening application of virtual reality requires ultra-high-resolution omnidirectional videos (OVs) to be transmitted over the wired and wireless Internet at low cost (i.e. bitrate). Various solutions have been proposed to intelligently reduce the bitrate, e.g. adapting the spatial resolution of the video for different directions of the panorama with regard to the current direction that the...
Several recent works interpret convolutional features produced by deep convolutional neural networks as local descriptors. Existing high-dimensional aggregation-based methods, e.g., Fisher Vector (FV), perform worse than pooling-based methods in most situations, and we observe that this is mainly caused by ignoring spatial weights. In this paper, we propose a novel method named spatial...
In video coding, quality evaluation is important for improving coding efficiency. Usually, Peak Signal-to-Noise Ratio (PSNR) is used to measure the performance of different coding techniques. During the video coding process in YCbCr color space, there are three PSNRs, one for each color component. Sometimes they may contradict each other, which poses a problem for evaluating the coding performance...
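As a minimal sketch of the per-component PSNR mentioned in this abstract, the Python below computes one PSNR value for each of Y, Cb, and Cr. It uses the standard PSNR definition, not the paper's own evaluation method; the function name `psnr`, the flat sample lists, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Standard PSNR in dB between two equally sized sample sequences.

    `peak` is the maximum sample value (255 assumes 8-bit video).
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all samples of one color component.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical toy samples: one short list per color component.
reference = {"Y": [100, 100, 100, 100], "Cb": [128, 128], "Cr": [128, 128]}
decoded = {"Y": [101, 99, 101, 99], "Cb": [120, 136], "Cr": [128, 128]}

# One PSNR per component; the three values need not agree in ranking.
per_component = {c: psnr(reference[c], decoded[c]) for c in reference}
```

In this toy data the luma PSNR is high while the Cb PSNR is much lower, which illustrates how the three values can point in different directions.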
Most local sparse representation models in visual tracking generally contain three components: 1) extracting local descriptors from the target region, 2) encoding the extracted local descriptors as mid-level features, 3) aggregating statistics of mid-level features into a signature. Since the last step aggregates only first-order statistics of mid-level features, it is called First-order Pooling (FP)...
High dropout and failure rates among computer science students in introductory programming courses tend to be the norm at many institutions. Years of evidence indicate that dropouts and failures persist in spite of advancements in pedagogy, technology, and teacher training. Most advancements have relied on summative assessments and, of late, formative assessments. This research explores assessments...
We propose a principled approach for the learning of causal conditions from actions and activities taking place in the physical environment through visual input. Causal conditions are the preconditions that must exist before a certain effect can ensue. We propose to consider diachronic and synchronic causal conditions separately for the learning of causal knowledge. Diachronic condition captures the...
We present VideoWhisper, a novel approach for unsupervised video representation learning, in which a video sequence is treated as a self-supervision entity based on the observation that the sequence encodes video temporal dynamics (e.g., object movement and event evolution). Specifically, for each video sequence, we use a pre-learned visual dictionary to generate a sequence of high-level semantics,...
In this paper, we propose a multiscale dictionary learning framework for hierarchical sparse representation of natural images. The proposed framework leverages an adaptive quadtree decomposition to represent structured sparsity in different scales. In dictionary learning, a tree-structured regularized optimization is formulated to distinguish and represent high-frequency details based on varying local...
Recent studies suggest that the ability to use memories flexibly emerges gradually with development; however, the mechanistic changes that underlie this shift remain unknown. Participants aged 7-30 years encoded a series of related associations during functional magnetic resonance imaging (fMRI) scanning. We hypothesized that the comparatively more rigid memory behaviors characteristic of children...
Red tides harm the ecological environment. To reduce their occurrence, it is necessary to study and analyze the causes of red tide formation, so a visualization system for the red tide phenomenon is of considerable value. The paper improves the inverse distance weighted interpolation algorithm based on research on the spatial interpolation...
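For orientation, the following is a minimal sketch of the standard inverse distance weighted (IDW) interpolation that this abstract names as its starting point. It is the textbook baseline, not the paper's improved variant, which the truncated abstract does not describe; the function name `idw_interpolate`, the 2-D sample layout, and the default power of 2 are assumptions.

```python
import math


def idw_interpolate(samples, query, power=2.0):
    """Estimate a value at `query` from scattered 2-D samples.

    samples: list of ((x, y), value) pairs (hypothetical layout).
    Each sample is weighted by 1 / distance**power.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in samples:
        dist = math.hypot(query[0] - x, query[1] - y)
        if dist == 0.0:
            return value  # query coincides with a sample point
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total


# Hypothetical measurements at two monitoring stations.
stations = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
midpoint_estimate = idw_interpolate(stations, (1.0, 0.0))
```

A query point midway between the two stations receives equal weights, so the estimate is the plain average of the two values; moving the query closer to one station pulls the estimate toward that station's value.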
This paper proposes a method based on the bag-of-words (BoW) model and softmax regression for microscopic image classification. Essentially, locality-constrained linear coding (LLC) is adopted for local feature encoding. Compared with the traditionally adopted vector quantization (VQ) in the BoW framework, the LLC encodes local structures of microscopic images with lower quantization errors and...
Optogenetic therapy holds the promise to restore visual function in patients affected by retinal degenerative diseases. However, the light sensitivity of the molecule mediating light responses is much lower than that of healthy retinal cells, so no photo-stimulation is expected under natural environmental conditions. In this work, we present a platform set up to stimulate optogenetically-engineered...