Multimodal recognition has recently become a more attractive and common method in multimedia information retrieval. In many cases it yields better recognition results than unimodal methods alone. Most current multimodal recognition methods still depend on unimodal recognition results. Therefore, to obtain better recognition performance, it is important to choose suitable features and classification...
Recurring visual elements in videos commonly represent central content entities, such as main characters and dominant objects. The automated detection of such elements is crucial for various application fields ranging from compact video content summarization to the retrieval of videos sharing common visual entities. Recent approaches for content-based video analysis commonly require prior knowledge...
In this paper, we present a subclass-representation approach that predicts the probability of a social image belonging to one particular class. We explore the co-occurrence of user-contributed tags to find subclasses with a strong connection to the top level class. We then project each image onto the resulting subclass space, generating a subclass representation for the image. The advantage of our...
In this paper we propose an automatic marine life monitoring system. The first task in the monitoring process is to detect moving underwater objects such as fish. The second task is to identify the species of the detected fish. The third task is to track the detected fish to avoid multiple counting and to record their activities. Detection is performed using a GMM-based background subtraction method, classification is...
Currency duplication, also known as counterfeit currency, is a serious threat to the economy. It is now a common phenomenon due to advanced printing and scanning technology. Bangladesh has been facing a serious problem from the increasing rate of fake notes in the market. To address this problem, various fake note detection methods are available around the world, and most of these are hardware based and...
In medical information retrieval research, automatically classifying X-ray images by body part is a challenging problem. ImageCLEF's 2015 campaign included a contest in which participants were challenged to cluster X-ray images into different groups based on the presence of a particular body part in each image. In brief, the challenge was to classify the given X-ray images primarily into five...
Functional magnetic resonance imaging (fMRI) is one of the most popular and reliable modalities for measuring brain activity. The quality of fMRI data is the best among modalities such as Electroencephalography (EEG) and Magnetoencephalography (MEG). In fMRI, the number of features is normally larger than the number of instances, so it is necessary to select features and perform dimension reduction to remove...
In this work, we present a method of human action recognition based on the detection of interest points under spatial and temporal constraints. First, an improved Harris-Laplace algorithm is proposed to solve the multi-scale problem. Then, the bag-of-visual-features (BoV) model is used for feature extraction, and the visual dictionary is built with K-means clustering. We train the Support Vector Machine...
We present a method for voice activity detection of multiple concurrent speakers using a camera-assisted microphone array. The proposed method uses face detection to identify locations of potential speech sources, and uses this information in an adaptive beamforming procedure to form a spatially directed detection algorithm to identify voice activity for individual speakers. Voice activity is classified...
In this study, we combine a voxel selection method with a temporal mesh model to decode the discriminative information distributed in functional Magnetic Resonance Imaging (fMRI) data. We first employ one-way Analysis of Variance (ANOVA) feature selection to select the most informative voxels. Then, we form meshes around selected voxels with their spatial and functional neighbors by employing the Mesh...
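The one-way ANOVA voxel selection step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `anova_f` and `select_voxels`, the row-per-sample data layout, and the top-k selection rule are all assumptions made for illustration, and the abstract does not say how many voxels are kept.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for one voxel.

    `groups` holds one list of voxel values per class (hypothetical layout).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of class means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of samples around their own class mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    if ms_within == 0:
        return 0.0 if ms_between == 0 else float("inf")
    return ms_between / ms_within


def select_voxels(samples, labels, top_k):
    """Return indices of the `top_k` voxels with the largest F-statistic.

    `samples` is a list of rows, one row of voxel values per scan (assumed).
    """
    classes = sorted(set(labels))
    scores = []
    for v in range(len(samples[0])):
        groups = [[row[v] for row, y in zip(samples, labels) if y == c]
                  for c in classes]
        scores.append((anova_f(groups), v))
    # Highest F first; ties broken by lower voxel index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [v for _, v in scores[:top_k]]
```

A voxel whose values differ strongly between classes relative to its within-class spread receives a large F-statistic and is ranked first.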
In this paper, we present an approach that can detect the nudity level of a scene with high precision using different deep net configurations. For this purpose, a recent approach [1] which has intense and very deep convolution layers is used. During net modelling, we strive to obtain the most successful net configuration by comparing different Dropout models and image sizes (64 × 64 and 128 × 128). Additionally,...
Deep learning with neural networks handles visual recognition and classification tasks efficiently, and it can extract effective image features for classification. Recently, many approaches have exploited these characteristics for visual tracking in two ways. First, they treat the tracking problem as classifying each video and frame by learning the whole dataset. Second, they use the deep neural network...
Deep learning architectures are showing great promise in various computer vision domains including image classification, object detection, event detection and action recognition. In this study, we investigate various aspects of convolutional neural networks (CNNs) from the big data perspective. We analyze recent studies and different network architectures both in terms of running time and accuracy...
Recently, deep Convolutional Neural Networks (CNNs) have been used to achieve state-of-the-art performance on a wide range of visual learning tasks. However, when facing some imbalanced learning tasks where the training samples are unevenly distributed among different classes, CNNs tend to produce performance bias toward the majority class, making them not suitable for applications in which the recognition...
Although the Support Vector Machine (SVM) performs strongly on many practical classification problems, the algorithm is not directly applicable to multi-dimensional trajectories of different lengths. In this paper, a new class of SVM that is applicable to trajectory classification, such as action recognition, is developed by incorporating two efficient time-series distance measures into the kernel...
The codebook has been shown to be an effective image representation method. In this method, discriminative local features, e.g., SIFT, are extracted from images and then pooled together. All these local features are then clustered and the centers of all the clusters form a codebook. By counting the distribution of local features on these codes, we obtain a histogram of local features as the global feature...
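The codebook pipeline described in the abstract above (cluster pooled local features, take the cluster centers as codes, then count how an image's features fall on those codes) can be sketched as follows. This is only an illustration under stated assumptions: plain k-means stands in for the unspecified clustering method, short numeric tuples stand in for real descriptors such as SIFT, and the names `build_codebook` and `encode` are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_code(feature, codebook):
    """Index of the codebook entry closest to `feature`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(feature, codebook[i]))


def build_codebook(features, k, iterations=20, seed=0):
    """Cluster pooled local features with plain k-means (assumed here);
    the k cluster centers form the codebook."""
    rng = random.Random(seed)
    centers = [list(f) for f in rng.sample(features, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for f in features:
            clusters[nearest_code(f, centers)].append(f)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                dim = len(members[0])
                centers[i] = [sum(m[d] for m in members) / len(members)
                              for d in range(dim)]
    return centers


def encode(features, codebook):
    """Normalized histogram of one image's local features over the codes."""
    counts = [0] * len(codebook)
    for f in features:
        counts[nearest_code(f, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)
```

In use, `build_codebook` would be run once on features pooled from all training images, and `encode` would then be called per image to obtain its global feature vector.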
We propose and evaluate a method for learning deep-sea substrate types using video recorded with a remotely operated vehicle (ROV). The goal of this work is to create a labelled spatial map of substrate types from ROV video in order to support biological and geological domain research. The output of our method describes the mixtures of geological features such as sediment and types of lava flow in...
Attributes are semantic visual properties shared by objects. They have been shown to improve object recognition and to enhance content-based image search. While attributes are expected to cover multiple categories, e.g. a dalmatian and a whale can both have "smooth skin", we find that the appearance of a single attribute varies quite a bit across categories. Thus, an attribute model learned...
Fatigue during long periods of driving threatens the safety of drivers and transportation. In this paper, we provide an effective method based on multi-sensor signals collected from a Kinect 2.0 camera and a PPG pulse sensor to build a driver fatigue detection system. Unlike most previous work, we define the transitional process of fatigue and elaborate its effect on training classifiers. The simulation...
Image search techniques were traditionally based not on visual features but on the textual annotation of images. Images were first annotated with text and then searched using a text-based approach from traditional database management systems, which is time consuming and difficult to manage. To overcome this problem, CBIR (Content Based Image Retrieval) was introduced, which is becoming the hottest research...