The past decade has seen video conferencing applications such as FaceTime and Skype become popular. In video conferencing, almost every frame contains a human face. Hence, it is necessary to predict attention on face videos by saliency detection, as saliency can guide the choice of region of interest (ROI) for content-based applications. To this end, this paper proposes a novel approach for saliency...
Learning the dynamics of shape is at the heart of many computer vision problems: object tracking, change detection, longitudinal shape analysis, trajectory classification, etc. In this work we address the problem of statistical inference of diffusion processes of shapes. We formulate a general Itô diffusion on the manifold of deformable landmarks and propose several drift models for the evolution...
We propose a method for transferring an arbitrary style to only a specific object in an image. Style transfer is the process of combining the content of one image and the style of another image into a new image. Our results show that the proposed method can transfer style to a specific object.
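The abstract does not say how the style is confined to the object. One common way to get that effect, shown here purely as an assumption and not as the authors' method, is to stylize the whole image and then blend it with the original through an object mask. The function name `composite_with_mask` and the grayscale nested-list image format are hypothetical choices for illustration.

```python
def composite_with_mask(content, stylized, mask):
    """Blend two equally sized grayscale images pixel by pixel.

    mask = 1 takes the stylized pixel, mask = 0 keeps the content pixel,
    and values in between mix the two (a soft object boundary).
    This is an illustrative assumption, not the method of the paper.
    """
    return [
        [m * s + (1.0 - m) * c for c, s, m in zip(c_row, s_row, m_row)]
        for c_row, s_row, m_row in zip(content, stylized, mask)
    ]


# Hypothetical 2x2 example: only the masked pixels receive the stylized values.
content = [[0.0, 0.2], [0.4, 0.6]]
stylized = [[1.0, 1.0], [1.0, 1.0]]
mask = [[1, 0], [0, 1]]
result = composite_with_mask(content, stylized, mask)
```

In practice the mask would come from a segmentation step, which the sketch leaves out.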
This paper discusses a possible implementation of the integration of knowledge from a probabilistic ontology in the automatic description of images. This combination not only provides the relations existing between the different segments, but also improves the classification accuracy, as the context often gives cues suggesting the correct class of the segment.
In this paper, we propose a seam-carving-based refinement method to refine and produce superpixels. The proposed method can refine existing superpixels by repeating the splitting process. There are two major steps. The first is choosing a superpixel candidate by analyzing color variances; the second is splitting a superpixel into four by dynamic programming. The experimental results show that the proposed...
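The dynamic-programming step can be illustrated with the standard minimum-cost seam search used in seam carving. This is a minimal sketch and not the paper's implementation: the abstract does not specify the cost function or how the seams are combined to obtain four parts, so `min_vertical_seam` and the toy cost grid are hypothetical.

```python
def min_vertical_seam(cost):
    """Return (total_cost, seam), where seam[r] is the column chosen in row r.

    The seam runs from the top row to the bottom row and may shift by at
    most one column between consecutive rows.
    """
    rows, cols = len(cost), len(cost[0])
    acc = [list(cost[0])]  # acc[r][c]: cheapest seam cost ending at (r, c)
    back = []              # back[r - 1][c]: column in row r - 1 leading to (r, c)
    for r in range(1, rows):
        acc_row, back_row = [], []
        for c in range(cols):
            candidates = [c2 for c2 in (c - 1, c, c + 1) if 0 <= c2 < cols]
            best = min(candidates, key=lambda c2: acc[r - 1][c2])
            acc_row.append(cost[r][c] + acc[r - 1][best])
            back_row.append(best)
        acc.append(acc_row)
        back.append(back_row)

    # Start from the cheapest cell in the last row and walk back to the top.
    c = min(range(cols), key=lambda j: acc[-1][j])
    total = acc[-1][c]
    seam = [c]
    for r in range(rows - 2, -1, -1):
        c = back[r][c]
        seam.append(c)
    seam.reverse()
    return total, seam


# Hypothetical cost grid; low values mark where a cut is cheap.
grid = [
    [5, 1, 5],
    [5, 5, 1],
    [5, 1, 5],
]
total, seam = min_vertical_seam(grid)  # total == 3, seam == [1, 2, 1]
```

A horizontal seam can be found the same way on the transposed grid, which is one plausible route to a four-way split.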
Image matting is one of the most common image processing techniques, because it is often necessary to extract the desired foreground object from the original image and then composite the extracted foreground with another background. Over the years, many commercial image processing tools have supported this function, such as Photoshop, PhotoImpact...
The convolutional neural network (CNN) is increasingly popular in computer vision and is widely used in acoustic signal processing, image classification, and image segmentation. In this work, an architecture that combines the 3-D convolutional neural network and long short-term memory (LSTM) is proposed for action recognition. It stacks the consecutive video frames, extracts spatial...
This paper proposes a new spatio-temporal appearance feature named Phasic Maximal and Local Maximal Occurrence (PM-LOMO) representation for video-based person re-identification. To perform temporal alignment of the sequence, we select the optimal period of the walking cycle and divide frames into several phases based on the extreme points of the sequence's Flow Energy Profile (FEP). To describe the...
Automatic and accurate human upper-body detection and orientation estimation have great practical value in several computer vision applications. Most previous works on human upper-body orientation estimation assume that the human upper-body region is already detected and aligned. However, this is not the case in many real-world scenarios. An additional human detector is essential, which is usually much...
For low-density crowds, the statistical information of pixels and feature points can reflect the change in crowd density. Therefore, this paper fuses pixel and corner features and uses support vector regression (SVR) to learn the relationship between the features and the number of people. Particle swarm optimization (PSO) is used to choose the SVR parameters C and gamma. The experimental results show that the SVR optimized...
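The parameter search can be sketched with a basic particle swarm over (log10 C, log10 gamma). This is a generic PSO under stated assumptions, not the paper's configuration: the swarm size, coefficients, and search bounds are hypothetical, and `toy_validation_error` is a made-up stand-in for the cross-validation error of an SVR, which would be evaluated at that point in a real pipeline.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=200, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimize objective(params) inside box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # each particle's best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_validation_error(params):
    # Hypothetical error surface with its minimum at C = 10, gamma = 0.01.
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best, err = pso_minimize(toy_validation_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)])
best_c, best_gamma = 10.0 ** best[0], 10.0 ** best[1]
```

Searching in log space is a design choice of this sketch: C and gamma typically vary over several orders of magnitude, so a linear search box would spend most particles on large values.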
In recent years, Convolutional Neural Networks (CNNs) have shown great performance not only in image classification and image recognition but also in several other computer vision tasks. Many models with different numbers of layers and depths have been proposed. In this work, deep neural networks are used to identify the locations of leopards. To accomplish this task, two different...
This paper addresses the problem of maritime vessel identification by exploiting the state-of-the-art techniques of distance metric learning and deep convolutional neural networks since vessels are the key constituents of marine surveillance. In order to increase the performance of visual vessel identification, we propose a joint learning framework which considers a classification and a distance metric...
Hyperspectral imaging is based on the acquisition of a large number of narrowly spaced spectral band images in the electromagnetic spectrum. Hyperspectral images surpass other imaging techniques in the detection of objects, classification and the detection of the changes that occur in the scene. A recent approach for hyperspectral segmentation is the superpixel segmentation approach. In this work,...
Scene detection via processing of multimedia data is a significant research area for the advancement of video technologies and applications. Currently, scene detection is mostly performed manually, which makes it time-consuming and costly. Therefore, it is important to develop algorithms that can automatically segment scenes to support the advancement of these technologies and applications. With...
With the increasing use of biometric data, systems are expected to work robustly and give successful results in difficult conditions. In face recognition systems in particular, variables such as the direction of light, facial expression, and reflection make identification difficult. Thus, in recent years, Convolutional Neural Network (CNN) models, which are deep learning models...
This study aims to provide an overview of the intersection and interaction between the fields of architecture, urban modeling, and planning and the field of computer vision. The reflection of the methods and approaches of fields such as visual recognition, natural language processing, data mining and data visualization onto architecture and urban studies is investigated, and potentials of inter/transdisciplinary encounters...
With the increased use of smart devices and digital cameras and the abundance of memory in these devices, the same scenes are photographed several times, resulting in many images with the same or very similar content in memory. Manually selecting the good ones is time-consuming as well as error-prone. In this paper, the features of the images in the data sets were extracted and...
Producing high-quality wheat is of great importance, especially for solving nutrition problems. Determining quality requires separating the wheat into classes. Here, recognition of high-quality and unclassified wheat is performed. The most distinctive feature between high-quality and poor-quality wheat is the shape difference. In this study, Bag of Contour Fragments (BCF) was used as a shape...
The use of depth sensors in activity recognition is an emerging technology in human-computer interaction and motion recognition. In this study, an approach to identify single-person activities using deep learning on depth image sequences is presented. First, a 3D volumetric template is generated using skeletal information obtained from a depth video. The generated 3D volume is used for extracting...
This paper reviews the history of ChaLearn Looking at People (LAP) events. We started in 2011 (with the release of the first Kinect device) to run challenges related to human action/activity and gesture recognition. Since then we have regularly organized events in a series of competitions covering all aspects of visual analysis of humans. So far we have organized more than 10 international challenges...