To meet the growing demand for highly efficient image and video processing, the open source computer vision library OpenCV offers one way to handle the task. This paper therefore covers basic image processing algorithms and compares their CPU time consumption in Matlab with that in OpenCV. The algorithms are tested on images with resolutions of 3264×2448, 1920×1080, 1024×768 and 220×260. Multi-processors...
Foreground-background segmentation is an important problem in computer vision, and it has many applications. We propose a technique for automatic foreground-background segmentation based on depth from coded aperture. The method first calculates a coarse depth map using coded aperture depth extraction, then estimates the general area of the foreground. Finally, in order to get the foreground,...
Edge-preserving smoothing is widely used in image processing, and bilateral filtering is one way to achieve it. The bilateral filter is a nonlinear combination of domain and range filters. Implementing the classical bilateral filter is computationally intensive, owing to the nonlinearity of the range filter. In the standard form, the domain and range filters are Gaussian functions and the performance depends...
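The standard form mentioned in this abstract, with Gaussian domain and range filters, can be illustrated with a brute-force sketch. This is not the method proposed in the paper; the function name `bilateral_filter` and the parameters `radius`, `sigma_d` and `sigma_r` are assumptions chosen for illustration, and the image is taken to be a 2D list of floats.

```python
import math


def bilateral_filter(image, radius=2, sigma_d=2.0, sigma_r=0.1):
    """Brute-force bilateral filter on a 2D list of floats (illustrative only)."""
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = image[y][x]
            weighted_sum = 0.0
            weight_total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue  # ignore neighbors outside the image
                    neighbor = image[ny][nx]
                    # Domain filter: Gaussian of the spatial distance.
                    domain_w = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma_d ** 2))
                    # Range filter: Gaussian of the intensity difference.
                    range_w = math.exp(-((neighbor - center) ** 2) / (2.0 * sigma_r ** 2))
                    weight = domain_w * range_w
                    weighted_sum += weight * neighbor
                    weight_total += weight
            # The center pixel always contributes weight 1, so the total is nonzero.
            output[y][x] = weighted_sum / weight_total
    return output


# A step edge: with a small sigma_r, pixels across the edge barely mix.
step = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(4)]
smoothed = bilateral_filter(step)
```

The range weight depends on the pixel values themselves, so the kernel changes at every pixel. That data dependence is the nonlinearity the abstract cites as the source of the computational cost.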
Although many studies of facial expression analysis have been conducted, most previous work has focused on expression recognition. In contrast, this paper proposes a novel approach to learn the expression kernel for facial expression intensity estimation. The solution involves first aligning the optical flow to a neutral face to reduce inter-person variations in facial geometry,...
Classifier combination can be used to combine multiple classification decisions to improve object classification performance, and the weighted average is a popular method for this purpose. In this paper we propose to use a graph-theoretic clustering method to define the weights for SVM classifier decisions. Specifically, we use dominant set clustering to evaluate the difficulty of a kernel matrix...
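Weighted-average combination of classifier decisions, as named in this abstract, can be sketched as below. The weights here are hypothetical hand-picked numbers; the dominant set clustering that the paper uses to derive them is not reproduced, and the function names are assumptions for illustration.

```python
def weighted_average(scores, weights):
    """Weighted average of one score per classifier."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, scores)) / total


def combine_decisions(per_classifier_scores, weights):
    """Combine per-class scores from several classifiers.

    per_classifier_scores[k][c] is the score classifier k gives to class c.
    Returns the winning class index and the combined per-class scores.
    """
    num_classes = len(per_classifier_scores[0])
    combined = [
        weighted_average([scores[c] for scores in per_classifier_scores], weights)
        for c in range(num_classes)
    ]
    best_class = max(range(num_classes), key=combined.__getitem__)
    return best_class, combined


# Three classifiers, two classes; the first classifier is trusted most.
scores = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]]
label_weighted, combined_weighted = combine_decisions(scores, [3.0, 1.0, 1.0])
label_uniform, combined_uniform = combine_decisions(scores, [1.0, 1.0, 1.0])
```

With uniform weights the two weaker classifiers outvote the first one, while the weighted combination follows the trusted classifier. The outcome therefore depends directly on how the weights are chosen.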
Insect species recognition is more difficult than generic object recognition because of the similarity between different species. In this paper, we propose a hybrid approach called discriminative local soft coding (DLSoft), which combines local and discriminative coding strategies. Our method makes use of neighboring codewords to get a local soft coding and class specific codebooks (sets of codewords)...
In this paper, we present a new locally adaptive region detector called the Bilateral kernel-based Region Detector (BIRD). It detects stable regions in images by consecutively computing a multiscale decomposition based on the bilateral kernel. The BIRD regards a region as covariant if it exhibits predictability in its photometric distance over spatial distance. Distinctiveness...
A method for detecting a vanishing point in structured images is presented. The method relies on the detection of line segments from an edge map by representing clusters of edge points by the long axes of highly eccentric ellipses. The extracted lines provide a set of candidate vanishing points computed by their intersections, which are assigned weights proportional to the lengths of the line segments...
Studies on human faculties of scene recognition have led to two broad classifications of the perceived information: local and global. It has been shown that both are processed separately and combined towards final category assignment. Recently, it was suggested that the accuracy of computational models for local information closely matches human performance, while it is not so for current global representations...
In this paper, we address an interesting application of computer vision techniques, namely classification of Indian Classical Dance (ICD). To the best of our knowledge, the problem has not been addressed so far in the computer vision domain. To deal with this problem, we use a sparse-representation-based dictionary learning technique. First, we represent each frame of a dance video by a pose descriptor...
Recently, a new representation for recognizing instances and categories of scenes called spatial Principal component analysis of Census Transform histograms (PACT) has shown excellent performance in the scene image classification task. PACT captures local structures of an image through the Census Transform (CT), while large scale structures are captured by the strong correlation between neighboring...
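The Census Transform mentioned in this abstract can be sketched on a 3×3 neighborhood as follows. The bit convention (1 when the center is greater than or equal to the neighbor, neighbors read left to right and top to bottom) is an assumption, since conventions vary, and the function names are hypothetical. The spatial layout and PCA steps of PACT are not shown.

```python
def census_transform(image):
    """8-bit Census Transform of a 2D list of intensities; border pixels are skipped."""
    height, width = len(image), len(image[0])
    # The eight neighbors, read left to right and top to bottom.
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    codes = [[0] * (width - 2) for _ in range(height - 2)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            code = 0
            for dy, dx in offsets:
                # Assumed convention: bit is 1 if center >= neighbor.
                bit = 1 if image[y][x] >= image[y + dy][x + dx] else 0
                code = (code << 1) | bit
            codes[y - 1][x - 1] = code
    return codes


def census_histogram(codes):
    """256-bin histogram of Census Transform codes."""
    histogram = [0] * 256
    for row in codes:
        for code in row:
            histogram[code] += 1
    return histogram


patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
patch_codes = census_transform(patch)  # center 5 vs. 1,2,3,4,6,7,8,9 gives 0b11110000
```

Because each code depends only on the ordering of intensities within the neighborhood, it is unchanged by any monotonically increasing transformation of the image intensities.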
The automated detection of cell nuclei, which is an important step in the pipeline of quantitative histopathological analysis, has received considerable attention in recent years. However, biological variations, uneven staining and illumination, non-rigid deformations and touching or overlapping of the cell nuclei have made the detection procedure a major hurdle. In this paper, we consider the problem...
A large number of training samples is required in developing visual object recognition systems. However, the number of samples is sometimes limited. This paper investigates bagging of one-class support vector machines (OCSVM), which use only one class of objects for training. Experiments are performed on the Caltech101 database. Our findings show that the performance with the bagging method is better than single...
This paper describes a method of tracking multiple persons with occlusions using stereo. Many previous stereo-based systems track each person separately and do not explicitly handle such occlusions. We previously developed an accurate, stable tracking method using overlapping silhouette templates which considers how persons overlap in the image. However, because the method uses a particle filter,...
Local space-time features and bag-of-feature (BOF) representation are often used for action recognition in previous approaches. For complicated human activities, however, the limitations of these approaches become pronounced because of the local properties of features and the lack of context. This paper addresses the problem by exploiting the spatio-temporal context information between features. We first define...
We present a new Coprime Blurred Pair (CBP) theory that may benefit a number of computer vision applications. A CBP is constructed by blurring the same latent image with two unknown kernels, where the two kernels are co-prime when mapped to bivariate polynomials under the z-transform. We first show that the blurred contents in a CBP are difficult to restore using conventional blind deconvolution methods...
The recently proposed two-phase test sample sparse representation (TPTSR) method makes a great contribution to the field of face recognition. Although TPTSR uses a computationally very efficient algorithm, it can achieve better performance than the well-known sparse representation method. In the first phase of TPTSR, the determined M nearest neighbors for the test sample do not seem to be optimal in terms...
This paper presents a method for recognizing scene categories based on multiple channels of Pyramid Histogram Of Words (PHOW). The main difference among different channels lies in what kind of feature detector/descriptor pair is employed in the framework of Bag-of-Words (BoW) models. This technique works by obtaining the confidence scores of a test image belonging to each possible category based on...
This paper explores combining powerful local texture descriptors and the advantages this offers over single descriptors for texture classification. The proposed system is composed of three components: (i) highly discriminative and robust sorted random projections (SRP) features; (ii) a global Bag-of-Words (BoW) model; and (iii) the use of multiple kernel Support Vector Machines (SVMs) combining multiple...
This paper presents an active learning approach for recognizing human actions in videos based on a multiple kernel combination method. We design the classifier based on Multiple Kernel Learning (MKL) through Gaussian Processes (GP) regression. This classifier is then trained using an active learning approach. In each iteration, one optimal sample is selected to be interactively annotated and incorporated...