Page segmentation is still a challenging problem due to the large variety of document layouts. Methods that examine both foreground and background regions are among the most effective for this problem. However, their performance is influenced by the implementation of two key steps: the extraction and selection of background regions, and the grouping of background regions into separators. This paper...
In this paper, we propose an efficient scene text localization method using gradient local correlation, which characterizes the density of pairwise edges and stroke width consistency to produce a text confidence map. Gradient local correlation is insensitive to the gradient direction and robust to noise, small character size and shadow. Based on the text confidence map, the regions with high confidence...
Text detection and localization in natural scene images is important for content-based image analysis. This problem is challenging due to complex backgrounds, non-uniform illumination, and variations in text font, size and line orientation. In this paper, we present a hybrid approach to robustly detect and localize texts in natural scene images. A text region detector is designed to estimate...
This paper presents a text query-based method for keyword spotting from online Chinese handwritten documents. The similarity between a text word and handwriting is obtained by combining the character similarity scores given by a character classifier. To overcome the ambiguity of character segmentation, multiple candidates of character patterns are generated by over-segmentation, and sequences of...
The alignment of text line images with text transcript is a crucial step of handwritten document annotation. Handwritten text alignment is prone to errors due to the difficulty of character segmentation and the variability of character shape, size and position. In this paper, we propose to incorporate the geometric context of character strings to improve the alignment accuracy for offline handwritten...
The splitting of touching characters remains a challenge in over-segmentation, which is crucial to the performance of integrated segmentation-recognition of handwritten character strings. In this paper, we propose a new method based on contour analysis for touching character splitting in Chinese handwriting. To reliably locate splitting points on the contour of touching pattern, we pair upper and...
Text line segmentation in unconstrained handwritten documents remains a challenge because handwritten text lines are multi-skewed and not obviously separated. This paper presents a new approach based on the variational Bayes (VB) framework for text line segmentation. Viewing the document image as a mixture density model, with each text line approximated by a Gaussian component, the VB method can automatically...
This paper proposes a novel hybrid method to robustly and accurately localize texts in natural scene images. A text region detector is designed to generate a text confidence map, based on which text components can be segmented by a local binarization approach. A conditional random field (CRF) model, considering the unary component property as well as the binary neighboring component relationship, is then...
Annotating the regions, text lines and characters of document images is an important but tedious and expensive task. A ground-truthing tool may largely alleviate the human burden in this process. This paper describes an automated recognition-based tool, GTLC, for finding the best alignment between the text transcript and the connected components of an unconstrained handwritten document image. The alignment...
This paper describes a system for handwritten Chinese text recognition that integrates a language model. On a text line image, the system generates character segmentation and word segmentation candidates, and the candidate paths are evaluated by character recognition scores and the language model. The optimal path, giving the segmentation and recognition result, is found using a pruned dynamic programming search...
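The path search described in this abstract can be illustrated with a small sketch. It is not the paper's implementation: the function name `best_path`, the candidate lattice layout, the bigram table, the unseen-pair penalty and the beam width are all assumptions chosen for illustration, and the score is a plain unnormalized sum of log-probabilities. Each candidate character pattern spans two segmentation points, and only the best few partial paths are kept at each point (the pruning).

```python
import math

def best_path(n_points, candidates, bigram_logp, beam=3, lm_weight=1.0):
    """Beam-pruned dynamic programming over a segmentation lattice.

    candidates: dict mapping (start, end) segmentation points to a list of
                (character, log recognition score) pairs (hypothetical layout).
    bigram_logp: dict mapping (previous char, char) to a log-probability;
                 "^" marks the start of the line (hypothetical convention).
    Returns (total log score, recognized text) of the best complete path.
    """
    unseen = math.log(1e-4)  # assumed penalty for bigrams missing from the table
    beams = {0: [(0.0, "")]}  # partial paths ending at each segmentation point
    for end in range(1, n_points):
        hyps = []
        for (start, stop), chars in candidates.items():
            if stop != end or start not in beams:
                continue
            for score, text in beams[start]:
                prev = text[-1] if text else "^"
                for ch, rec_logp in chars:
                    lm = bigram_logp.get((prev, ch), unseen)
                    hyps.append((score + rec_logp + lm_weight * lm, text + ch))
        if hyps:
            hyps.sort(reverse=True)
            beams[end] = hyps[:beam]  # pruning: keep only the top paths
    return beams.get(n_points - 1, [(float("-inf"), "")])[0]

# Toy lattice with 4 segmentation points: the first two primitive segments
# can be read separately as "c" + "l" or merged into a single "d".
candidates = {
    (0, 1): [("c", math.log(0.6))],
    (1, 2): [("l", math.log(0.6))],
    (0, 2): [("d", math.log(0.5))],
    (2, 3): [("o", math.log(0.9))],
}
bigram_logp = {
    ("^", "d"): math.log(0.3), ("d", "o"): math.log(0.5),
    ("^", "c"): math.log(0.2), ("c", "l"): math.log(0.1),
    ("l", "o"): math.log(0.2),
}
score, text = best_path(4, candidates, bigram_logp)
print(text, score)
```

In this toy lattice the merged reading wins because both the recognition scores and the bigram scores favor it. A real system would typically also normalize for path length and use a much larger candidate set.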