Search results

chapter

Sound event detection in synthetic audio: Analysis of the dcase 2016 task results

Gregoire Lafay, Emmanouil Benetos, Mathieu Lagrange

2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 11 - 15

As part of the 2016 public evaluation challenge on Detection and Classification of Acoustic Scenes and Events (DCASE 2016), the second task focused on evaluating sound event detection systems using synthetic mixtures of office sounds. This task, which follows the ‘Event Detection-Office Synthetic’ task of DCASE 2013, studies the behaviour of tested algorithms when facing controlled levels of audio...

chapter

Metric learning based data augmentation for environmental sound classification

Rui Lu, Zhiyao Duan, Changshui Zhang

2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 1 - 5

Deep neural networks have been widely applied in the field of environmental sound classification. However, due to the scarcity of carefully labeled data, their training process suffers from over-fitting. Data augmentation is a technique that alleviates this issue. It augments the training set with synthetic data that are created by modifying some parameters of the real data. However, not all kinds...

chapter

IMINET: Convolutional semi-siamese networks for sound search by vocal imitation

Yichi Zhang, Zhiyao Duan

2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 304 - 308

Searching sounds by text labels is often difficult, as text labels cannot always provide sufficient information for the sound content. Previously we proposed an unsupervised system called IMISOUND for sound search by vocal imitation. In this paper, we further propose a Convolutional Semi-Siamese Network (CSN) called IMINET. IMINET uses two towers of Convolutional Neural Networks (CNN) to extract features...

chapter

Comparative performance analysis of neural networks architectures on H2O platform for various activation functions

Yuriy Kochura, Sergii Stirenko, Yuri Gordienko

2017 IEEE International Young Scientists Forum on Applied Physics and Engineering (YSF) > 70 - 73

Deep learning (deep structured learning, hierarchical learning or deep machine learning) is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using multiple processing layers with complex structures or otherwise composed of multiple non-linear transformations. In this paper, we present the results of testing neural networks architectures...

chapter

Quality assessment for synthesized view based on variable-length context tree

Suiyi Ling, Patrick Le Pallet, Gene Cheung

2017 IEEE 19th International Workshop on Multimedia Signal Processing (MMSP) > 1 - 6

In free viewpoint television (FTV) application scenario, views that synthesized with depth image-based rendering (DIBR) techniques mainly contain special artifacts like geometric distortions. These artifacts may affect the structure of images/videos by changing the global contour characteristics and thus are annoying for human observers. Context tree based contour coding scheme can be a good tool...

chapter

Reliable Machine Learning for Networking: Key Issues and Approaches

Christian A. Hammerschmidt, Sebastian Garcia, Sicco Verwer, Radu State

2017 IEEE 42nd Conference on Local Computer Networks (LCN) > 167 - 170

Machine learning has become one of the go-to methods for solving problems in the field of networking. This development is driven by data availability in large-scale networks and the commodification of machine learning frameworks. While this makes it easier for researchers to implement and deploy machine learning solutions on networks quickly, there are a number of vital factors to account for when...

chapter

Automated audio captioning with recurrent neural networks

Konstantinos Drossos, Sharath Adavanne, Tuomas Virtanen

2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) > 374 - 378

We present the first approach to automated audio captioning. We employ an encoder-decoder scheme with an alignment model in between. The input to the encoder is a sequence of log mel-band energies calculated from an audio file, while the output is a sequence of words, i.e. a caption. The encoder is a multi-layered, bi-directional gated recurrent unit (GRU) and the decoder a multi-layered GRU with...

chapter

Training Deep Networks to be Spatially Sensitive

Nicholas Kolkin, Gregory Shakhnarovich, Eli Shechtman

2017 IEEE International Conference on Computer Vision (ICCV) > 5669 - 5678

In many computer vision tasks, for example saliency prediction or semantic segmentation, the desired output is a foreground map that predicts pixels where some criteria is satisfied. Despite the inherently spatial nature of this task commonly used learning objectives do not incorporate the spatial relationships between misclassified pixels and the underlying ground truth. The Weighted F-measure, a...

chapter

Weakly Supervised Learning of Deep Metrics for Stereo Reconstruction

Stepan Tulyakov, Anton Ivanov, Francois Fleuret

2017 IEEE International Conference on Computer Vision (ICCV) > 1348 - 1357

Deep-learning metrics have recently demonstrated extremely good performance to match image patches for stereo reconstruction. However, training such metrics requires large amount of labeled stereo images, which can be difficult or costly to collect for certain applications (consider, for example, satellite stereo imaging). The main contribution of our work is a new weakly supervised method for learning...

chapter

Pose Guided RGBD Feature Learning for 3D Object Pose Estimation

Vassileios Balntas, Andreas Doumanoglou, Caner Sahin, Juil Sock, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 3876 - 3884

In this paper we examine the effects of using object poses as guidance to learning robust features for 3D object pose estimation. Previous works have focused on learning feature embeddings based on metric learning with triplet comparisons and rely only on the qualitative distinction of similar and dissimilar pose labels. In contrast, we consider the exact pose differences between the training samples,...

chapter

DeepCD: Learning Deep Complementary Descriptors for Patch Representations

Tsun-Yi Yang, Jo-Han Hsu, Yen-Yu Lin, Yung-Yu Chuang

2017 IEEE International Conference on Computer Vision (ICCV) > 3334 - 3342

This paper presents the DeepCD framework which learns a pair of complementary descriptors jointly for image patch representation by employing deep learning techniques. It can be achieved by taking any descriptor learning architecture for learning a leading descriptor and augmenting the architecture with an additional network stream for learning a complementary descriptor. To enforce the complementary...

chapter

Deep Metric Learning with Angular Loss

Jian Wang, Feng Zhou, Shilei Wen, Xiao Liu, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 2612 - 2620

The modern image search system requires semantic understanding of image, and a key yet under-addressed problem is to learn a good metric for measuring the similarity between images. While deep metric learning has yielded impressive performance gains by extracting high level abstractions from image data, a proper objective loss function becomes the central issue to boost the performance. In this paper,...

chapter

Two-dimensional Linear discriminant analysis for low-resolution face recognition

Di Zhao, Zhenxue Chen, Chengyun Liu, Yanan Peng

2017 Chinese Automation Congress (CAC) > 703 - 707

Low-resolution (LR) is a challenging problem in the real world. In order to obtain better performance for low-resolution face recognition (LRFR), this paper employs a novel approach for matching low-resolution images with high resolution (HR) images based on two-dimensional linear discriminant analysis (2D-LDA) and metric learning method. The LR and HR images are transformed into a common space via...

chapter

Towards Diverse and Natural Image Descriptions via a Conditional GAN

Bo Dai, Sanja Fidler, Raquel Urtasun, Dahua Lin

2017 IEEE International Conference on Computer Vision (ICCV) > 2989 - 2998

Despite the substantial progress in recent years, the image captioning techniques are still far from being perfect. Sentences produced by existing methods, e.g. those based on RNNs, are often overly rigid and lacking in variability. This issue is related to a learning principle widely used in practice, that is, to maximize the likelihood of training samples. This principle encourages high resemblance...

chapter

Smart Mining for Deep Metric Learning

Ben Harwood, Vijay Kumar B. G, Gustavo Carneiro, Ian Reid, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 2840 - 2848

To solve deep metric learning problems and producing feature embeddings, current methodologies will commonly use a triplet model to minimise the relative distance between samples from the same class and maximise the relative distance between samples from different classes. Though successful, the training convergence of this triplet model can be compromised by the fact that the vast majority of the...

chapter

BIER — Boosting Independent Embeddings Robustly

Michael Opitz, Georg Waltner, Horst Possegger, Horst Bischof

2017 IEEE International Conference on Computer Vision (ICCV) > 5199 - 5208

Learning similarity functions between image pairs with deep neural networks yields highly correlated activations of large embeddings. In this work, we show how to improve the robustness of embeddings by exploiting independence in ensembles. We divide the last embedding layer of a deep network into an embedding ensemble and formulate training this ensemble as an online gradient boosting problem. Each...

chapter

Efficient Online Local Metric Adaptation via Negative Samples for Person Re-identification

Jiahuan Zhou, Pei Yu, Wei Tang, Ying Wu

2017 IEEE International Conference on Computer Vision (ICCV) > 2439 - 2447

Many existing person re-identification (PRID) methods typically attempt to train a faithful global metric offline to cover the enormous visual appearance variations, so as to directly use it online on various probes for identity matching. However, their need for a huge set of positive training pairs is very demanding in practice. In contrast to these methods, this paper advocates a different paradigm:...

chapter

Improved Image Captioning via Policy Gradient optimization of SPIDEr

Siqi Liu, Zhenhai Zhu, Ning Ye, Sergio Guadarrama, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 873 - 881

Current image captioning methods are usually trained via maximum likelihood estimation. However, the log-likelihood score of a caption does not correlate well with human assessments of quality. Standard syntactic evaluation metrics, such as BLEU, METEOR and ROUGE, are also not well correlated. The newer SPICE and CIDEr metrics are better correlated, but have traditionally been hard to optimize for...

chapter

No Fuss Distance Metric Learning Using Proxies

Yair Movshovitz-Attias, Alexander Toshev, Thomas K. Leung, Sergey Ioffe, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 360 - 368

We address the problem of distance metric learning (DML), defined as learning a distance consistent with a notion of semantic similarity. Traditionally, for this problem supervision is expressed in the form of sets of points that follow an ordinal relationship – an anchor point x is similar to a set of positive points Y, and dissimilar to a set of negative points Z, and a loss defined over these...

chapter

Show, Adapt and Tell: Adversarial Training of Cross-Domain Image Captioner

Tseng-Hung Chen, Yuan-Hong Liao, Ching-Yao Chuang, Wan-Ting Hsu, et al.

2017 IEEE International Conference on Computer Vision (ICCV) > 521 - 530

Impressive image captioning results are achieved in domains with plenty of training image and sentence pairs (e.g., MSCOCO). However, transferring to a target domain with significant domain shifts but no paired training data (referred to as cross-domain image captioning) remains largely unexplored. We propose a novel adversarial training procedure to leverage unpaired data in the target domain. Two...

INFONA - science communication portal

