Human action recognition is one of the most active research areas in computer vision. With the rapid development of deep learning, using neural networks for action recognition has become a popular topic. This paper proposes a self-learned action recognition method based on neural networks. The proposed method trains dictionaries with a sparse autoencoder (SAE) and extracts the key frames with sparse...
This paper introduces the use of representations based on nonnegative matrix factorization (NMF) to train deep neural networks, with applications to environmental sound classification. Deep learning systems for sound classification usually rely on the network to learn meaningful representations from spectrograms or hand-crafted features. Instead, we introduce an NMF-based feature learning stage before...
Porting state-of-the-art deep learning algorithms to resource-constrained compute platforms (e.g. VR, AR, wearables) is extremely challenging. We propose a fast, compact, and accurate model for convolutional neural networks that enables efficient learning and inference. We introduce LCNN, a lookup-based convolutional neural network that encodes convolutions with a few lookups to a dictionary that is trained...
The recently proposed Broad Learning System (BLS) [1] demonstrates efficient and effective learning capability. The model has also been shown to be suitable for incremental learning algorithms by taking advantage of random vector flat neural networks. In this paper, a modified BLS structure based on K-means feature extraction is developed. Compared with the original broad learning system, acceptable performance...
Several models based on deep neural networks have been applied to single image super-resolution and have obtained great improvements in both reconstruction accuracy and computational performance. All these methods focus either on performing the super-resolution (SR) reconstruction operation in the high resolution (HR) space after upscaling with a single filter, usually bicubic interpolation, or optimizing...
In this paper, balanced two-stage residual networks (BTSRN) are proposed for single image super-resolution. The deep residual design with constrained depth achieves the optimal balance between the accuracy and the speed for super-resolving images. The experiments show that the balanced two-stage structure, together with our lightweight two-layer PConv residual block design, achieves very promising...
Convolutional neural networks (CNNs) have achieved state-of-the-art performance in many different visual tasks. Learned from a large-scale training data set, CNN features are much more discriminative and accurate than handcrafted features. Moreover, CNN features are also transferable among different domains. On the other hand, traditional dictionary-based features (such as BoW and spatial pyramid...
In this work we propose a new framework for combined feature extraction and classification. The basic idea stems from sparse representation based classification, wherein the training samples from each class are assumed to form a basis for representing samples of that class. Later studies learned a basis for each class using dictionary learning; these were shallow techniques where only one level of dictionary...
To reduce the potential radiation risk, low-dose CT has attracted much attention. However, simply lowering the radiation dose will lead to significant deterioration of the image quality. In this paper, we propose a noise reduction method for low-dose CT via deep neural network without accessing original projection data. A deep convolutional neural network is trained to transform low-dose CT images...
In this paper, we propose a new supervised monaural source separation method based on autoencoders. We employ the autoencoder for dictionary training so that the nonlinear network can encode the target source with high expressiveness. The dictionary is trained on each target source without the mixture signal, which makes the system independent of the context in which the dictionaries will be used. In...
The last decade of John Cozzens's tenure at the NSF witnessed the advent of theory and methods at the heart of modern data science. These advances include (but are not limited to) compressed sensing, sparse coding, inference methods robust to outliers and missing data, and convex optimization tools that facilitate a host of novel inference methods. This paper describes how these methods evolved from...
Epithelium-stroma classification is considered an important preprocessing step for morphological quantitative analysis in image-based histological research on oncologic diseases. However, large-scale accurate ground-truth labeling is expensive in histopathological image analysis, so classification performance remains limited when labeled training samples are insufficient....
We present Deep Sparse-coded Network (DSN), a deep architecture based on multilayer sparse coding. It has been considered difficult to learn a useful feature hierarchy by stacking sparse coding layers in a straightforward manner. The primary reason is the modeling assumption for sparse coding that takes in a dense input and yields a sparse output vector. Applying a sparse coding layer on the output...
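The modeling assumption mentioned in the abstract above (a dense input vector mapped to a sparse code) can be illustrated with a minimal sketch of a single sparse coding step. This is not the DSN method itself; it uses the standard ISTA iteration for the lasso objective 0.5*||x - D z||^2 + lam*||z||_1, and the names `ista`, `soft_threshold`, `D`, `x`, and `lam` are illustrative assumptions rather than anything taken from the paper.

```python
def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal map of t * L1 norm)."""
    return [max(abs(x) - t, 0.0) * (1 if x > 0 else -1) for x in v]


def ista(D, x, lam=0.1, n_iter=200):
    """Approximate argmin_z 0.5*||x - D z||^2 + lam*||z||_1 with ISTA.

    D is a dictionary given as a list of rows (n_rows x n_atoms),
    x is a dense input vector of length n_rows.
    """
    n_rows, n_atoms = len(D), len(D[0])
    # Squared Frobenius norm: a safe upper bound on the Lipschitz constant
    # of the gradient (the largest eigenvalue of D^T D).
    L = sum(d * d for row in D for d in row)
    z = [0.0] * n_atoms
    for _ in range(n_iter):
        # Residual r = D z - x
        r = [sum(D[i][j] * z[j] for j in range(n_atoms)) - x[i]
             for i in range(n_rows)]
        # Gradient g = D^T r
        g = [sum(D[i][j] * r[i] for i in range(n_rows))
             for j in range(n_atoms)]
        # Gradient step followed by shrinkage
        z = soft_threshold([z[j] - g[j] / L for j in range(n_atoms)], lam / L)
    return z


if __name__ == "__main__":
    D = [[1.0, 0.0], [0.0, 1.0]]  # toy identity dictionary (assumption)
    x = [1.0, 0.05]               # dense input
    print(ista(D, x, lam=0.1))    # small entry is driven exactly to zero
```

With an identity dictionary the problem has the closed-form solution `soft_threshold(x, lam)`, which makes the toy case easy to check by hand: entries smaller than `lam` in magnitude become exactly zero, which is the sparsity the abstract refers to.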
When applied to phoneme recognition, the Connectionist Temporal Classification (CTC) objective function allows a neural network to be trained with the phoneme-level transcriptions of training utterances. A limitation of CTC is that it cannot be applied directly to network training with large speech corpora, since those corpora usually have only word-level transcriptions. This work extends the...
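As background for the CTC objective mentioned above: CTC relates a frame-level label path to a shorter transcription by merging repeated labels and then removing a special blank label. The sketch below shows only that standard collapsing map, not the extension proposed in the paper; the function name `ctc_collapse` and the use of `"-"` as the blank symbol are assumptions made for illustration.

```python
from itertools import groupby

BLANK = "-"  # hypothetical blank symbol chosen for this sketch


def ctc_collapse(path, blank=BLANK):
    """Map a frame-level CTC path to its label sequence.

    Step 1: merge runs of identical consecutive symbols.
    Step 2: drop the blank symbol.
    """
    merged = [symbol for symbol, _ in groupby(path)]
    return [symbol for symbol in merged if symbol != blank]


if __name__ == "__main__":
    # A blank between two identical labels keeps them as separate outputs.
    print(ctc_collapse("--hh-e-ll-lo"))
```

Many different frame-level paths collapse to the same transcription; the CTC objective sums the probabilities of all such paths, which is why no frame-level alignment is needed during training.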
In recent years, growing attention has been paid to recognizing text in natural scene images. Scene character recognition (SCR) is an important step in automating the process of reading text in natural scenes.
Few-view CT reconstruction is a promising way to reduce the patient's dose. The key to better reconstruction is handling the sparse-view artifacts. In recent years, deep learning (DL) has attracted a lot of attention because of its outstanding performance in image processing. We propose a deep learning method for few-view CT reconstruction. Our method directly learns an end-to-end mapping between the full-view/few-view...
To improve the performance of the Primi speech recognition system, a novel method based on a deep neural network is proposed. The deep neural network has two distinct characteristics: high capacity and a highly complex network structure. On the Kaldi platform, a neural network containing four hidden layers is used for Primi speech recognition. The...
This research proposes an approach to text classification that uses a simple neural network called the Dynamic Text Classifier Neural Network (DTCNN). The network takes as input variable-dimension word vectors, called Dynamic Token Vectors (DTV), that avoid information loss. The proposed neural network is designed for the classification of long and short texts into categories. The learning...
Image super-resolution aims to recover a fine-resolution image from one or more low-resolution images. In this paper, we propose a novel image super-resolution approach based on the recent development of coupled deep auto-encoders. In the training step, the vectors of local low-resolution (LR) and high-resolution (HR) image patches and the corresponding edge information are extracted to be the...
A novel fMRI classification method designed for rapid event-related fMRI experiments is described and applied to the classification of reading aloud isolated words in Hebrew. Three comparisons of different grammatical complexity were performed: (i) words versus asterisks, (ii) “with diacritics versus without diacritics”, and (iii) “with root versus no root”. We discuss the most difficult task and,...