In this paper, we present a novel perceptually-based optimization for the improvement of stereoscopic video coding efficiency. The main idea of this proposed scheme is to adaptively adjust the quantization parameter by taking into account the Human Visual System perceptual characteristics. For this, a saliency map is generated from both views and then segmented into salient and non-salient regions...
Text detection is typically the first step for any text processing such as hand-written text recognition, layout analysis, line detection, or writer identification. This paper describes a new method to detect text in images, particularly in historical document images. For robust detection, we propose the use of the vesselness filter as a new preprocessing step for text detection. We show that this...
Visual question answering (VQA) has emerged from major advances in computer vision and natural language processing; it requires a deep understanding of images and questions and an effective integration of the two. Current work on VQA simply concatenates visual and textual features or compares them via dot product, neither of which eliminates the semantic difference between them. We argue to...
In this paper we introduce a novel method for general semantic segmentation that can benefit from the general semantics of a Convolutional Neural Network (CNN). Our method proposes visually and semantically coherent image segments. We use binary encoding of CNN features to overcome the difficulty of clustering in the high-dimensional CNN feature space. These binary codes are very robust against...
Action recognition has been one of the challenging problems in the computer vision community. Most of the recent research work in this area exploits the motion features captured by dense trajectory descriptors. On the other hand, static image classification has seen the rise of deep learning architectures, with evidence that the output of intermediate layers could be successfully employed as a low...
Intra-frame prediction in the High Efficiency Video Coding (HEVC) standard can be empirically improved by applying sets of recursive two-dimensional filters to the predicted values. However, this approach does not allow (or complicates significantly) the parallel computation of pixel predictions. In this work we analyze why the recursive filters are effective, and use the results to derive sets of...
Cultivar identification is an important aspect of agriculture and also a typical task of fine-grained visual categorization (FGVC). Compared with other common topics in FGVC, studies on this problem have lagged behind and remain limited. In this paper, targeting four Chinese maize cultivars, Jundan No.20, Wuyue No.3, Nongda No.108, and Zhengdan No.958, we first consider the problem of identifying...
We present a novel video representation for human action recognition by considering temporal sequences of visual words. Based on state-of-the-art dense trajectories, we introduce temporal bundles of dominant (that is, most frequent) visual words. These are employed to construct a complementary action representation of ordered dominant visual word sequences that additionally incorporates fine-grained...
In contrast to still image analysis, motion information offers a powerful means to analyze video. In particular, motion trajectories determined from keypoints have become very popular in recent years for a variety of video analysis tasks, including search, retrieval and classification. Additionally, cloud-based analysis of media content has been gaining momentum, so efficient communication of salient...