Seeded segmentation methods attempt to solve the segmentation problem in the presence of prior knowledge in the form of a partial segmentation, where a small subset of the image elements (seed-points) have been assigned correct segmentation labels. Common for most of the leading methods in this area is that they seek to find a segmentation where the boundaries of the segmented regions coincide with...
We propose a fast and efficient method for localization and rectification of a dominant rectangular region within an image, particularly suitable for mobile Augmented Reality applications. This approach can deal with perspective distortion and high-frequency structures such as text. The resulting image may be used for planar tracking or as input for subsequent image processing tasks. We demonstrate...
We study the problem of cross-media retrieval, where the query and the returned results are of different modalities. A novel method is proposed to measure the similarity between heterogeneous media objects for cross-media retrieval. While existing methods only focus on the original low level feature spaces or the third common space, our proposed tri-space explores both of the two kinds of spaces....
Unsupervised image segmentation is an important and difficult technique in pattern recognition. In this paper, we propose an interesting region merging algorithm for segmentation of natural images. It consists of two steps: first forming initial over-segmentation by the Connected Coherence Tree Algorithm (CCTA), and then merging the primitive regions in terms of their similarity and feature in the...
This paper explores the challenge of optimally categorizing regions in man-made environments. We propose using histogram of oriented gradients (HOG) features for characterizing image regions, and propose an algorithm based on the entropy of HOG to select relevant regions. We also propose a region-sensitive feature selection algorithm for image registration. The algorithms are applied to several...
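As a rough illustration of what "entropy of HOG" could mean for a single region, the sketch below builds a magnitude-weighted histogram of unsigned gradient orientations and computes its Shannon entropy. It is not the authors' algorithm: the function names, the 9-bin default and the central-difference gradient are all assumptions made for this example, and the abstract does not say how the entropy is used to select regions.

```python
import math

def orientation_histogram(patch, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    patch: 2-D list of floats (grayscale). The bin count and the
    central-difference gradient are illustrative choices.
    """
    hist = [0.0] * bins
    height, width = len(patch), len(patch[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat pixel, no orientation to vote for
            angle = math.atan2(gy, gx) % math.pi  # fold into [0, pi)
            index = min(int(angle / math.pi * bins), bins - 1)
            hist[index] += magnitude
    return hist

def histogram_entropy(hist):
    """Shannon entropy (in bits) of a non-negative histogram."""
    total = sum(hist)
    if total == 0:
        return 0.0
    return -sum((v / total) * math.log2(v / total) for v in hist if v > 0)
```

In this sketch, a region dominated by one edge direction concentrates its votes in few bins and yields a low entropy, whereas a region with many orientations yields a value closer to log2(bins).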
The well-known bilateral filter is used to smooth noisy images while keeping their edges. This filter is commonly used with Gaussian kernel functions without real justification. The choice of the kernel functions has a major effect on the filter behavior. We propose to use exponential kernels with L1 distances instead of Gaussian ones. We derive Stein's Unbiased Risk Estimate to find the optimal parameters...
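The kernel change described in this abstract can be sketched as follows: both the spatial and the photometric weight use an exponential of an L1 distance instead of a Gaussian. This is a minimal sketch under stated assumptions, not the authors' implementation; the function name and the parameters radius, sigma_s and sigma_r are hypothetical, and the parameter selection via Stein's Unbiased Risk Estimate mentioned in the abstract is not shown.

```python
import math

def bilateral_l1_exp(image, radius=2, sigma_s=2.0, sigma_r=0.1):
    """Bilateral filter with exponential kernels on L1 distances.

    image: 2-D list of floats (grayscale). radius, sigma_s and sigma_r
    are illustrative parameters, not values from the paper.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = image[y][x]
            weighted_sum = 0.0
            weight_total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue  # skip neighbours outside the image
                    neighbour = image[ny][nx]
                    # Spatial weight: exponential of the L1 pixel distance.
                    spatial = math.exp(-(abs(dx) + abs(dy)) / sigma_s)
                    # Photometric weight: exponential of the absolute
                    # intensity difference.
                    photometric = math.exp(-abs(neighbour - center) / sigma_r)
                    weight = spatial * photometric
                    weighted_sum += weight * neighbour
                    weight_total += weight
            out[y][x] = weighted_sum / weight_total
    return out
```

The centre pixel always contributes a weight of 1, so the normalisation never divides by zero. With a small sigma_r, neighbours across a strong edge receive almost no weight, which is what keeps the edge sharp.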
In this paper we present an algorithm to detect text on video frames consisting of lecture slides. We begin by performing a multi-channel wavelet transform and then merge the channel components for the high-frequency sub-bands to obtain a composite energy map. Thresholding the energy map results in an edge map consisting of candidate text pixels; some of these correspond to actual text and others...
In this paper, we present a new method for text extraction in real scene images. We first propose a skeleton-based descriptor to describe the strokes of the text candidates that compose a spatial relation graph. We then apply the graph cuts algorithm to label the nodes of the graph as text or non-text. We finally refine the resulting text line candidates by classifying them using a kernel SVM. To...
In this article, we present a novel set of features for detection of text in images of natural scenes using a multi-layer perceptron (MLP) classifier. An estimate of the uniformity in stroke thickness is one of our features and we obtain the same using only a subset of the distance transform values of the concerned region. Estimation of the uniformity in stroke thickness on the basis of sparse sampling...
The idea of a cost or similarity volume may date back to Marr and Poggio's cooperative algorithm. One of the challenging difficulties in applying their algorithm to recovering sub-pixel disparity is 1) how to take advantage of the edge locations in the images for computing reliable costs between the corresponding sub-pixel coordinates. In addition, it should be noted that 2) accidental correspondences...
We propose a novel method for stereo matching based on the Global Edge Constraint (GEC) and Graph Cuts. First, the GEC, composed of particular image edges, is employed to generate the initial disparity maps. Then the reliable disparity maps consistent with the observed data are extracted to construct the data term of the energy function. Finally, we incorporate the GEC as a soft constraint into...
In this paper, we propose a new framework for edge-preserving texture-smoothing filtering to improve the visibility of images in the presence of haze or fog. The proposed framework can effectively achieve strong texture smoothing while keeping edges sharp, and any low-pass filter can be directly integrated into the framework. Our experiments with three popular low-pass filters (Gaussian filter, median...
In this paper, we propose a video representation that is motivated by the problem of categorizing large web video collections. The representation focuses on capturing the properties of the temporal structure of a video and deploys low-level image features derived from the self-similarity matrix of the video. The bias of the representation towards the temporal structure is based on our hypothesis that...
Accurate vessel segmentation is the first step in retinal image analysis for medical diagnosis. In this paper we propose a novel method to segment the vessel network in fundus images. Vessel centerlines are first extracted by using a set of directional line detectors. Next, an Iterative Geodesic Time Transform (ItGTT) is designed to segment the entire vessel network. The idea of the ItGTT is to use centerline...
Reading text from scene images is a challenging problem that is receiving much attention, especially since the appearance of imaging devices in low-cost consumer products like mobile phones. This paper presents an easy and fast method to recognize individual characters in images of natural scenes that is applied after an algorithm that robustly locates text on such images. The recognition is based...
A document image matching approach making use of probabilistic graphical models is proposed. The document image is first represented by a tree, with the nodes in the tree corresponding to the regions in the image and the edges indicating the parent-child relationships between them, transforming the problem into tree matching. A graphical model, i.e. a pairwise Markov Random Field, is defined on the tree,...
We propose a new Iterative-Midpoint-Method (IMM) for video character gap filling based on end pixels and neighbor pixels in the extracted contour of a character. The method obtains the Enhanced Gradient Image (EGI) for the given gray character image to sharpen text pixels. Max-Min clustering and K-means clustering algorithm with K=2 are applied on the EGI to obtain text candidates. To clean up the...
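The K-means step with K=2 mentioned in this abstract can be illustrated on a flat list of scalar values, such as gradient magnitudes. The sketch below is only an assumption-laden example: the function name, the min/max initialisation of the two centroids and the iteration cap are choices made here, and it does not reproduce the paper's Max-Min clustering or the Enhanced Gradient Image computation.

```python
def two_means(values, iterations=50):
    """1-D K-means with K=2 on scalar values.

    Returns (labels, (low_centroid, high_centroid)), where label 1 marks
    the cluster with the higher centroid. Initialising the centroids at
    the minimum and maximum value is an illustrative choice.
    """
    low, high = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins the nearer centroid.
        labels = [1 if abs(v - high) < abs(v - low) else 0 for v in values]
        low_members = [v for v, label in zip(values, labels) if label == 0]
        high_members = [v for v, label in zip(values, labels) if label == 1]
        # Update step: recompute centroids, keeping the old one if empty.
        new_low = sum(low_members) / len(low_members) if low_members else low
        new_high = sum(high_members) / len(high_members) if high_members else high
        if new_low == low and new_high == high:
            break  # centroids stopped moving
        low, high = new_low, new_high
    return labels, (low, high)
```

Applied to gradient magnitudes, the values labelled 1 would be the high-response group; whether that group corresponds to text candidates depends on the rest of the pipeline, which the truncated abstract does not spell out.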
In this paper, we propose a novel text detection approach based on stroke width. First, a unique contrast-enhanced Maximally Stable Extremal Region (MSER) algorithm is designed to extract character candidates. Second, simple geometric constraints are applied to remove non-text regions. Then, by integrating stroke width generated from skeletons of those candidates, we reject the remaining false positives...
In this paper, we present a new locally adaptive region detector called the Bilateral kernel-based Region Detector (BIRD). It detects stable regions in images by consecutively computing a multiscale decomposition based on the bilateral kernel. The BIRD regards a region as covariant if it exhibits predictability in its photometric distance over spatial distance. Distinctiveness...
In this paper we propose a novel method of estimating the vanishing point by spherical gradient. In contrast with conventional methods, in which the vanishing point is estimated from lines, the proposed method does not necessarily extract lines, but employs the spherical gradient cues of edge points. Based on the observation that the spherical gradient is aligned with the normal vector of the projection plane...