We investigate the problem of representing an entire video using CNN features for human action recognition. End-to-end learning of CNN/RNNs is currently not possible for whole videos due to GPU memory limitations and so a common practice is to use sampled frames as inputs along with the video labels as supervision. However, the global video labels might not be suitable for all of the temporally local...
A ubiquitous problem in pattern recognition is that of matching an observed time-evolving pattern (or signal) to a gold standard in order to recognize or characterize the meaning of a dynamic phenomenon. Examples include matching sequences of images in two videos, matching audio signals in speech recognition, or matching framed trajectories in robot action recognition. This paper shows that all of...
The course control of a water-jet-propelled unmanned surface vehicle (USV) is addressed using a novel fast terminal sliding mode control approach based on system immersion and manifold invariance (FTSMC-I&I). With the novel fast terminal sliding mode controller, the control scheme ensures that all error signals globally exponentially converge to the origin in finite time. In addition, I&I method...
Semantic instance segmentation remains a challenge. We propose to tackle the problem with a discriminative loss function, operating at pixel level, that encourages a convolutional network to produce a representation of the image that can easily be clustered into instances with a simple post-processing step. Our approach of combining an off-the-shelf network with a principled loss function inspired...
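The abstract above is truncated and does not give the loss itself. As a rough illustration of the general idea (pixel embeddings of the same instance are pulled together, and different instances are pushed apart so that simple clustering can separate them), here is a minimal pure-Python sketch. The function name `pull_push_loss`, the margins `delta_v` and `delta_d`, and the exact hinge form are assumptions made for this example, not the authors' formulation.

```python
import math


def pull_push_loss(embeddings, labels, delta_v=0.5, delta_d=1.5):
    """Illustrative pull/push loss over per-pixel embeddings.

    embeddings: list of equal-length float lists, one per pixel.
    labels: instance id per pixel.
    delta_v, delta_d: hypothetical margins chosen for this sketch.
    """
    # Group pixel embeddings by instance and compute each instance mean.
    clusters = {}
    for emb, lab in zip(embeddings, labels):
        clusters.setdefault(lab, []).append(emb)
    means = {
        lab: [sum(col) / len(pts) for col in zip(*pts)]
        for lab, pts in clusters.items()
    }

    # Pull term: penalize pixels farther than delta_v from their own mean.
    pull = 0.0
    for lab, pts in clusters.items():
        pull += sum(
            max(0.0, math.dist(p, means[lab]) - delta_v) ** 2 for p in pts
        ) / len(pts)
    pull /= len(clusters)

    # Push term: penalize pairs of instance means closer than 2 * delta_d.
    keys = list(means)
    push, pairs = 0.0, 0
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            gap = math.dist(means[keys[i]], means[keys[j]])
            push += max(0.0, 2 * delta_d - gap) ** 2
            pairs += 1
    if pairs:
        push /= pairs

    return pull + push


# Two tight, well-separated instances incur no penalty.
separated = pull_push_loss([[0, 0], [0, 0.2], [10, 0], [10, 0.2]], [0, 0, 1, 1])
# Two instances whose means are only 1 apart are penalized by the push term.
crowded = pull_push_loss([[0, 0], [0, 0], [1, 0], [1, 0]], [0, 0, 1, 1])
```

With embeddings trained under a loss of this kind, the post-processing step can be as simple as thresholding distances around a cluster centre, which is the property the abstract emphasizes.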
Despite the rapid progress of the techniques for image classification, video annotation has remained a challenging task. Automated video annotation would be a breakthrough technology, enabling users to search within the videos. Recently, Google introduced the Cloud Video Intelligence API for video analysis. As per the website, the system can be used to "separate signal from noise, by retrieving...
The past decade has witnessed the popularity of video conferencing, such as FaceTime and Skype. In video conferencing, almost every frame has a human face. Hence, it is necessary to predict attention on face videos by saliency detection, as saliency can guide the choice of region-of-interest (ROI) for content-based applications. To this end, this paper proposes a novel approach for saliency...
We propose a method for transferring an arbitrary style to only a specific object in an image. Style transfer is the process of combining the content of one image and the style of another image into a new image. Our results show that the proposed method can realize style transfer to a specific object.
This paper discusses a possible way to integrate knowledge from a probabilistic ontology into the automatic description of images. This combination not only provides the relations existing between the different segments, but also improves the classification accuracy, as the context often gives cues suggesting the correct class of the segment.
Based on the scale-invariant feature transform, this paper presents an approach to keyboard recognition. The approach not only corrects a skewed keyboard but also locates the keys on it. Experimental results confirm the feasibility of the proposed method.
This paper proposes a new spatio-temporal appearance feature named Phasic Maximal and Local Maximal Occurrence (PM-LOMO) representation for video-based person re-identification. To perform temporal alignment of the sequence, we select the optimal period of the walking cycle and divide the frames into several phases based on the extreme points of the sequence's Flow Energy Profile (FEP). To describe the...
Synthetic aperture radar (SAR) is a powerful tool for remote sensing of the Earth's surface. In this paper, several applications of pattern detection and recognition algorithms for extracting information from SAR images are discussed. In particular, the use of optical flow techniques for automatic estimation of moving target displacements from a sequence of single-look SAR images is proposed...
This paper describes a universal approach to developing an intelligent automated digital signal processing system for acoustic testing devices that use the free-vibration method and artificial neural networks. The system solves the problem of defect recognition and classification, and improves testing performance in comparison with traditional instruments.
In this work, an image-processing-based lane-detection approach is proposed. In the proposed approach, candidate pixels for lane markings are detected using the 1-bit transform as a pre-processing step. Next, feature points are extracted with a Sobel filter, and candidate lane markings are selected using an approach based on correlation and the Hough transform. Finally, Kalman filter...
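Two of the stages named in the abstract above, Sobel feature extraction and Hough-transform line voting, can be sketched in a few lines of pure Python. This is only an illustration on a synthetic image under simplifying assumptions: the helper names, the threshold, and the one-degree angle resolution are hypothetical, and the 1-bit transform, correlation, and Kalman filter stages of the paper are not reproduced.

```python
import math


def sobel_magnitude(img):
    """Sobel gradient magnitude of a 2D list of numbers (border pixels stay 0)."""
    h, w = len(img), len(img[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out


def feature_points(mag, threshold):
    """Return (x, y) coordinates whose gradient magnitude exceeds the threshold."""
    return [(x, y) for y, row in enumerate(mag)
            for x, v in enumerate(row) if v > threshold]


def hough_votes(points, n_theta=180):
    """Accumulate votes for lines rho = x*cos(theta) + y*sin(theta).

    Keys are (rho rounded to an integer, theta index in degrees).
    """
    votes = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, t)] = votes.get((rho, t), 0) + 1
    return votes


# Synthetic 8x8 image with a bright vertical stripe in column 4,
# standing in for a lane marking.
img = [[1 if x == 4 else 0 for x in range(8)] for _ in range(8)]
mag = sobel_magnitude(img)
pts = feature_points(mag, threshold=2.0)
votes = hough_votes(pts)
```

The stripe produces strong Sobel responses on both of its sides (columns 3 and 5), and the accumulator cell for the vertical line through column 3, `(3, 0)`, collects one vote from every feature point on that edge. A real pipeline would pick the strongest accumulator cells as lane candidates and then track them over time.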
In recent years, Convolutional Neural Networks (CNNs) have shown great performance not only in image classification and image recognition but also in several other computer vision tasks. Many models with different numbers of layers and depths have been proposed. In this work, deep neural networks are used to identify the locations of leopards. To accomplish this task, two different...
Because vessels are the key constituents of marine surveillance, this paper addresses the problem of maritime vessel identification by exploiting state-of-the-art techniques in distance metric learning and deep convolutional neural networks. In order to increase the performance of visual vessel identification, we propose a joint learning framework which considers a classification and a distance metric...
Scene detection via processing of multimedia data is a significant research area for the advancement of video technologies and applications. Currently, scene detection is mostly performed manually, which is time-consuming and costly. Therefore, it is important to develop algorithms that can automatically segment scenes to support the advancement of these technologies and applications. With...
This study aims to provide an overview of the intersection and interaction between the fields of architecture, urban modeling and planning and the field of computer vision. The reflection of methods and approaches from fields such as visual recognition, natural language processing, data mining and data visualization onto architecture and urban studies is investigated and potentials of inter/transdisciplinary encounters...
Production of high-quality wheat is of great importance, especially for solving nutrition problems. To determine quality, the wheat must be sorted. Here, recognition of high-quality and unclassified wheat is performed. The most distinctive difference between high-quality and poor-quality wheat is shape. In this study, Bag of Contour Fragments (BCF) was used as a shape...
The use of depth sensors in activity recognition is an emerging technology in human-computer interaction and motion recognition. In this study, an approach to identify single-person activities using deep learning on depth image sequences is presented. First, a 3D volumetric template is generated using skeletal information obtained from a depth video. The generated 3D volume is used for extracting...
A novel extension to the Hızlı B-ESA object detection algorithm is proposed in order to learn convolutional context features that better determine object boundaries. For input images, the hypothesis windows and the context around those windows are learned through convolutional layers as two parallel networks. The resulting object and context feature maps are combined in such a way that they preserve...