Search results

chapter

Latent syntactic structure-based sentiment analysis

Viktor Hangya, Zsolt Szanto, Richard Farkas

2017 2nd IEEE International Conference on Computational Intelligence and Applications (ICCIA) > 248 - 254

2017 2nd IEEE International Conference on Computational Intelligence and Applications (ICCIA)

People share their opinions about things like products, movies and services using social media channels. The analysis of these textual contents for sentiments is a gold mine for marketing experts, thus automatic sentiment analysis is a popular area of applied artificial intelligence. We propose a latent syntactic structure-based approach for sentiment analysis which requires only sentence-level polarity...

chapter

Speech recognition features based on deep latent Gaussian models

Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP) > 1 - 6

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)

This paper constructs speech features based on a generative model using a deep latent Gaussian model (DLGM), which is trained using stochastic gradient variational Bayes (SGVB) algorithm and performs efficient approximate inference and learning with a directed probabilistic graphical model. The trained DLGM then generate latent variables based on Gaussian distribution, which is used as new features...

chapter

CNN-based transform index prediction in multiple transforms framework to assist entropy coding

Saurabh Puri, Sebastien Lasserre, Patrick Le Callet

2017 25th European Signal Processing Conference (EUSIPCO) > 798 - 802

2017 25th European Signal Processing Conference (EUSIPCO)

Recent work in video compression has shown that using multiple 2D transforms instead of a single transform in order to de-correlate residuals provides better compression efficiency. These transforms are tested competitively inside a video encoder and the optimal transform is selected based on the Rate Distortion Optimization (RDO) cost. However, one needs to encode a syntax to indicate the chosen...

chapter

Addressing unknown word problem for neural machine translation using distributee representations of words as input features

Tomoki Nishimura, Tomoyosi Akiba

2017 International Conference on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA) > 1 - 6

2017 International Conference on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA)

In recent years, the machine translation system based on neural network, called Neural Machie Translation, have attracted much attention, in which the entire translation steps are implemented in a single large neural network. In this framework, dealing with a large vocabulary size on its input (source) and output (target) often make the training computationally intractable. Therefore, the most frequent...

chapter

A deep convolutional encoder-decoder model for robust speech dereverberation

D. S. Wang, Y. X. Zou, W. Shi

2017 22nd International Conference on Digital Signal Processing (DSP) > 1 - 5

2017 22nd International Conference on Digital Signal Processing (DSP)

Research shows that speech dereverberation (SD) with Deep Neural Network (DNN) achieves the state-of-the-art results by learning spectral mapping, which, simultaneously, lacks the characterization of the local temporal spectral structures (LTSS) of speech signal and calls for a large storage space that is impractical in real applications. Contrarily, the Convolutional Neural Network (CNN) offers a...

chapter

A new algorithm for training sparse autoencoders

Ali Shahin Shamsabadi, Massoud Babaie-Zadeh, Seyyede Zohreh Seyyedsalehi, Hamid R. Rabiee, more

2017 25th European Signal Processing Conference (EUSIPCO) > 2141 - 2145

2017 25th European Signal Processing Conference (EUSIPCO)

Data representation plays an important role in performance of machine learning algorithms. Since data usually lacks the desired quality, many efforts have been made to provide a more desirable representation of data. Among many different approaches, sparse data representation has gained popularity in recent years. In this paper, we propose a new sparse autoencoder by imposing the power two of smoothed...

chapter

Soft decoding of JPEG 2000 compressed images using bit-rate-driven deep convolutional neural networks

Xiaohai He, Honggang Chen, Jingxu Chen, Linbo Qing

2017 IEEE International Conference on Information and Automation (ICIA) > 843 - 847

2017 IEEE International Conference on Information and Automation (ICIA)

Lossy image compression methods always introduce various unpleasant artifacts into the compressed results, especially at low bit-rates. In recent years, many effective soft decoding methods for JPEG compressed images have been proposed. However, to the best of our knowledge, very few works have been done on soft decoding of JPEG 2000 compressed images. Inspired by the outstanding performance of Convolution...

chapter

Unsupervised Video Summarization with Adversarial LSTM Networks

Behrooz Mahasseni, Michael Lam, Sinisa Todorovic

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2982 - 2991

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

This paper addresses the problem of unsupervised video summarization, formulated as selecting a sparse subset of video frames that optimally represent the input video. Our key idea is to learn a deep summarizer network to minimize distance between training videos and a distribution of their summarizations, in an unsupervised way. Such a summarizer can then be applied on a new video for estimating...

chapter

Deep Image Harmonization

Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2799 - 2807

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Compositing is one of the most common operations in photo editing. To generate realistic composites, the appearances of foreground and background need to be adjusted to make them compatible. Previous approaches to harmonize composites have focused on learning statistical relationships between hand-crafted appearance features of the foreground and background, which is unreliable especially when the...

chapter

StyleBank: An Explicit Representation for Neural Image Style Transfer

Dongdong Chen, Lu Yuan, Jing Liao, Nenghai Yu, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2770 - 2779

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

We propose StyleBank, which is composed of multiple convolution filter banks and each filter bank explicitly represents one style, for neural image style transfer. To transfer an image to a specific style, the corresponding filter bank is operated on top of the intermediate feature embedding produced by a single auto-encoder. The StyleBank and the auto-encoder are jointly learnt, where the learning...

chapter

Flexible Spatio-Temporal Networks for Video Prediction

Chaochao Lu, Michael Hirsch, Bernhard Scholkopf

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2137 - 2145

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

We describe a modular framework for video frame prediction. We refer to it as a Flexible Spatio-Temporal Network (FSTN) as it allows the extrapolation of a video sequence as well as the estimation of synthetic frames lying in between observed frames and thus the generation of slow-motion videos. By devising a customized objective function comprising decoding, encoding, and adversarial losses, we are...

chapter

Semantic Autoencoder for Zero-Shot Learning

Elyor Kodirov, Tao Xiang, Shaogang Gong

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4447 - 4456

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Existing zero-shot learning (ZSL) models typically learn a projection function from a feature space to a semantic embedding space (e.g. attribute space). However, such a projection function is only concerned with predicting the training seen class semantic representation (e.g. attribute prediction) or classification. When applied to test data, which in the context of ZSL contains different (unseen)...

chapter

Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks

Xiao Yang, Ersin Yumer, Paul Asente, Mike Kraley, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 4342 - 4351

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

We present an end-to-end, multimodal, fully convolutional network for extracting semantic structures from document images. We consider document semantic structure extraction as a pixel-wise segmentation task, and propose a unified model that classifies pixels based not only on their visual appearance, as in the traditional page segmentation task, but also on the content of underlying text. Moreover,...

chapter

Synthesizing Normalized Faces from Facial Identity Features

Forrester Cole, David Belanger, Dilip Krishnan, Aaron Sarna, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3386 - 3395

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

We present a method for synthesizing a frontal, neutral-expression image of a persons face, given an input face photograph. This is achieved by learning to generate facial landmarks and textures from features extracted from a facial-recognition network. Unlike previous generative approaches, our encoding feature vector is largely invariant to lighting, pose, and facial expression. Exploiting this...

chapter

Hallucinating Very Low-Resolution Unaligned and Noisy Face Images by Transformative Discriminative Autoencoders

Xin Yu, Fatih Porikli

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5367 - 5375

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Most of the conventional face hallucination methods assume the input image is sufficiently large and aligned, and all require the input image to be noise-free. Their performance degrades drastically if the input image is tiny, unaligned, and contaminated by noise. In this paper, we introduce a novel transformative discriminative autoencoder to 8X super-resolve unaligned noisy and tiny (16X16) low-resolution...

chapter

Hard Mixtures of Experts for Large Scale Weakly Supervised Vision

Sam Gross, Marc'Aurelio Ranzato, Arthur Szlam

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5085 - 5093

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Training convolutional networks (CNNs) that fit on a single GPU with minibatch stochastic gradient descent has become effective in practice. However, there is still no effective method for training large networks that do not fit in the memory of a few GPU cards, or for parallelizing CNN training. In this work we show that a simple hard mixture of experts model can be efficiently trained to good effect...

chapter

Deep Image Matting

Ning Xu, Brian Price, Scott Cohen, Thomas Huang

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 311 - 320

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Image matting is a fundamental computer vision problem and has many applications. Previous algorithms have poor performance when an image has similar foreground and background colors or complicated textures. The main reasons are prior methods 1) only use low-level features and 2) lack high-level context. In this paper, we propose a novel deep learning based algorithm that can tackle both these problems...

chapter

Lip Reading Sentences in the Wild

Joon Son Chung, Andrew Senior, Oriol Vinyals, Andrew Zisserman

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 3444 - 3453

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

The goal of this work is to recognise phrases and sentences being spoken by a talking face, with or without the audio. Unlike previous works that have focussed on recognising a limited number of words or phrases, we tackle lip reading as an open-world problem – unconstrained natural language sentences, and in the wild videos. Our key contributions are: (1) a Watch, Listen, Attend and Spell...

chapter

Learning Diverse Image Colorization

Aditya Deshpande, Jiajun Lu, Mao-Chuang Yeh, Min Jin Chong, more

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 2877 - 2885

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Colorization is an ambiguous problem, with multiple viable colorizations for a single grey-level image. However, previous methods only produce the single most probable colorization. Our goal is to model the diversity intrinsic to the problem of colorization and produce multiple colorizations that display long-scale spatial co-ordination. We learn a low dimensional embedding of color fields using a...

chapter

Learning Non-Lambertian Object Intrinsics Across ShapeNet Categories

Jian Shi, Yue Dong, Hao Su, Stella X. Yu

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) > 5844 - 5853

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

We focus on the non-Lambertian object-level intrinsic problem of recovering diffuse albedo, shading, and specular highlights from a single image of an object. Based on existing 3D models in the ShapeNet database, a large-scale object intrinsics database is rendered with HDR environment maps. Millions of synthetic images of objects and their corresponding albedo, shading, and specular ground-truth...

INFONA - science communication portal

Search results

Latent syntactic structure-based sentiment analysis

Speech recognition features based on deep latent Gaussian models

CNN-based transform index prediction in multiple transforms framework to assist entropy coding

Addressing unknown word problem for neural machine translation using distributee representations of words as input features

A deep convolutional encoder-decoder model for robust speech dereverberation

A new algorithm for training sparse autoencoders

Soft decoding of JPEG 2000 compressed images using bit-rate-driven deep convolutional neural networks

Unsupervised Video Summarization with Adversarial LSTM Networks

Deep Image Harmonization

StyleBank: An Explicit Representation for Neural Image Style Transfer

Flexible Spatio-Temporal Networks for Video Prediction

Semantic Autoencoder for Zero-Shot Learning

Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks

Synthesizing Normalized Faces from Facial Identity Features

Hallucinating Very Low-Resolution Unaligned and Noisy Face Images by Transformative Discriminative Autoencoders

Hard Mixtures of Experts for Large Scale Weakly Supervised Vision

Deep Image Matting

Lip Reading Sentences in the Wild

Learning Diverse Image Colorization

Learning Non-Lambertian Object Intrinsics Across ShapeNet Categories

Filter options

Publication date

Content availability

Keywords

INFONA - science communication portal

Search results

Add recipient

Sending message cancelled

Are you sure you want to cancel sending this message?

Send message

Filter options

Publication date

Date range setting

Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.

Content availability

Keywords

Reporting an error / abuse

Sending the report failed

Accessibility options