Rolling element bearings are among the most frequently encountered components in rotating machines. Thus, prognostics and health management (PHM) of rolling bearings plays an important role in the working status of the machine system. Remaining useful life (RUL) prediction is the core of PHM. It is well known that the original autoregressive (AR) model is suitable for the prediction of linear...
Zero-shot learning, a special case of unsupervised domain adaptation where the source and target domains have disjoint label spaces, has become increasingly popular in the computer vision community. In this paper, we propose a novel zero-shot learning method based on discriminative sparse non-negative matrix factorization. The proposed approach aims to identify a set of common high-level semantic...
Current object detection approaches predict bounding boxes that provide little instance-specific information beyond location, scale and aspect ratio. In this work, we propose to regress directly to objects' shapes in addition to their bounding boxes and categories. It is crucial to find an appropriate shape representation that is compact and decodable, and in which objects can be compared for higher-order...
Semantic image inpainting is a challenging task in which large missing regions have to be filled based on the available visual data. Existing methods, which extract information from only a single image, generally produce unsatisfactory results because they lack high-level context. In this paper, we propose a novel method for semantic image inpainting, which generates the missing content by conditioning...
We propose local binary convolution (LBC), an efficient alternative to convolutional layers in standard convolutional neural networks (CNNs). The design principles of LBC are motivated by local binary patterns (LBP). The LBC layer comprises a set of fixed, sparse, pre-defined binary convolutional filters that are not updated during the training process, a non-linear activation function and a set of...
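The first two ingredients named in the abstract above (fixed sparse binary filters followed by a non-linear activation) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the filter values in {-1, 0, +1}, the 50% sparsity, the choice of ReLU, and all function names are hypothetical, and the remaining part of the layer that the truncated abstract elides is deliberately left out.

```python
import random

def make_binary_filter(size=3, sparsity=0.5, rng=None):
    """Build one fixed sparse filter: each weight is 0 with probability
    `sparsity`, otherwise +1 or -1 (value set and sparsity are assumptions)."""
    rng = rng or random.Random(0)
    return [[0 if rng.random() < sparsity else rng.choice((-1, 1))
             for _ in range(size)] for _ in range(size)]

def convolve_valid(image, kernel):
    """Plain 2-D cross-correlation without padding."""
    k = len(kernel)
    rows, cols = len(image) - k + 1, len(image[0]) - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(cols)] for r in range(rows)]

def relu(feature_map):
    """Element-wise non-linear activation (ReLU chosen for illustration)."""
    return [[max(0, v) for v in row] for row in feature_map]

rng = random.Random(42)
# The filters are generated once and never updated during training.
filters = [make_binary_filter(rng=rng) for _ in range(4)]
image = [[(r * 5 + c) % 7 for c in range(5)] for r in range(5)]  # toy 5x5 input
responses = [relu(convolve_valid(image, f)) for f in filters]
```

Because the filters stay fixed, nothing in this part of the layer would need gradient updates; only whatever follows the activation would be trained.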
Deep convolutional neural networks (CNNs) have proven highly effective for visual recognition, where learning a universal representation from the activations of a convolutional layer is a fundamental problem. In this paper, we present Fisher Vector encoding with Variational Auto-Encoder (FV-VAE), a novel deep architecture that quantizes the local activations of a convolutional layer in a deep generative...
We propose a new self-supervised CNN pre-training technique based on a novel auxiliary task called odd-one-out learning. In this task, the machine is asked to identify the unrelated or odd element from a set of otherwise related elements. We apply this technique to self-supervised video representation learning where we sample subsequences from videos and ask the network to learn to predict the odd...
Face recognition has been an important task in pattern recognition and computer vision. Recently, sparse representation has become a popular data representation method in the face recognition field. Convolutional sparse coding, which replaces the linear combination of a set of dictionary atoms with the sum of a series of mapping terms convolved with the dictionary filters, was proposed to improve the...
Sparse representation (SR) based hyperspectral image (HSI) classification is a rapidly evolving research topic. How to construct an optimized dictionary to better characterize spectral-spatial features of HSI is an important problem. In this paper, a novel spectral-spatial online dictionary learning (SSODL) method for HSI classification is proposed. The main idea is to learn a complete and discriminative...
Classification of multisensor data offers potential accuracy advantages over classification of single-sensor data. In this paper, deep bimodal autoencoders are proposed for classification based on fusing synthetic aperture radar (SAR) and multispectral images. The proposed deep network based on autoencoders is trained to discover both independencies of each modality and correlations across the modalities. Specifically,...
Discriminative dictionary learning aims to learn a dictionary from training samples in order to improve the discriminative ability of their coding vectors. Gabor wavelets have recently been successfully applied for hyperspectral image (HSI) classification due to their ability to extract joint spatial and spectrum information. Due to the high discriminative power of Gabor features, an efficient method,...
The latest High Efficiency Video Coding (HEVC) standard has been increasingly used to generate video streams over the Internet. However, the decoded HEVC video streams may incur severe quality degradation, especially at low bit-rates. Thus, it is necessary to enhance the visual quality of HEVC videos at the decoder side. To this end, we propose in this paper a Decoder-side Scalable Convolutional Neural Network (DS-CNN)...
In many sports, it is useful to analyse video of an athlete in competition for training purposes. In swimming, stroke rate is a common metric used by coaches, but measuring it requires laborious labelling of each individual stroke. We show that using a Convolutional Neural Network (CNN) we can automatically detect discrete events in continuous video (in this case, swimming strokes). We create a CNN that learns...
Image captioning has recently received much attention. Existing approaches, however, are limited to describing images with simple contextual information, typically generating one sentence per image with only a single contextual emphasis. In this paper, we address this limitation from a user perspective with a novel approach. Given some keywords as additional inputs, the proposed method...
We present VideoWhisper, a novel approach for unsupervised video representation learning, in which a video sequence is treated as a self-supervision entity based on the observation that the sequence encodes video temporal dynamics (e.g., object movement and event evolution). Specifically, for each video sequence, we use a pre-learned visual dictionary to generate a sequence of high-level semantics,...
In many real-world scenarios, rewards extrinsic to the agent are extremely sparse, or absent altogether. In such cases, curiosity can serve as an intrinsic reward signal to enable the agent to explore its environment and learn skills that might be useful later in its life. We formulate curiosity as the error in an agent's ability to predict the consequence of its own actions in a visual feature space...
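The formulation in the abstract above (curiosity as the error in predicting the consequence of the agent's own actions in a feature space) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the identity encoder, the additive forward model, the 0.5 scale factor, and the function names are hypothetical and are not taken from the paper.

```python
def encode(observation):
    """Toy feature encoder (assumption): features are the observation itself."""
    return list(observation)

def forward_model(features, action):
    """Toy forward model (assumption): the action shifts every feature by `action`."""
    return [f + action for f in features]

def intrinsic_reward(obs, action, next_obs, scale=0.5):
    """Curiosity reward: scaled squared error between the predicted and the
    actual next-state features."""
    predicted = forward_model(encode(obs), action)
    actual = encode(next_obs)
    return scale * sum((p - a) ** 2 for p, a in zip(predicted, actual))

# A transition the model predicts well yields no curiosity reward...
familiar = intrinsic_reward([0.0, 1.0], 1.0, [1.0, 2.0])
# ...while a transition it predicts poorly yields a positive one.
surprising = intrinsic_reward([0.0, 1.0], 1.0, [3.0, 2.0])
print(familiar, surprising)  # prints "0.0 2.0"
```

In a learned system the encoder and forward model would both be trained, so the reward would shrink for transitions the agent has already experienced often.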
This paper proposes a novel deep convolutional neural network (CNN), called sparse coding convolutional neural network (SC-CNN), to address the task of sound event recognition and retrieval. Unlike the general framework of a CNN, in which the feature learning process is performed hierarchically, the proposed framework models the whole memorization procedure in the human brain, including encoding,...
Facial analysis plays a very important role in many vision applications, such as authentication and entertainment. The very early works in the 1990s mostly focused on estimating geometric deformations of facial landmarks to address this task. In the past several years, however, more and more effort has been made to directly learn an appearance regression for facial analysis. Though training regressions...
This work investigates a statistical technique for high-performance remote-sensing imagery compression. By exploiting existing remote-sensing data sets, useful structural and texture prior information can be learned. The main methodologies are Bayesian dictionary learning and stochastic approximation. A Bayesian network simulating the generation mechanism of remote-sensing images is modelled. The...
This study first points out that, from the perspective of knowledge learning, the essence of knowledge acquisition can be explained as the learners' process of decoding and encoding knowledge. This process includes the encoding and decoding of the knowledge transfer process from the perspective of communication, and knowledge information in different storage forms from the perspective of knowledge management...