In this paper, we describe our approach for extracting salient information from US census form images. These forms present several challenges including variations in individual form templates, skew, writing device, writing style, etc. We describe an innovative registration algorithm that is robust to scale variations for segmenting the input image into cells. Following registration, the borders of...
We generalize recursive baseline extraction algorithms for symbol layout analysis in math expressions so that handwritten strokes may be provided as input. Specifically, baseline extraction is used for lexical analysis in a modified LL(1) parser, returning a set of candidate symbols when the leftmost or next symbol along the current baseline (from left-to-right) is requested by the parser. Candidate...
This paper reports a trial of handwritten text recognition by a part-based method. The part-based method recognizes individual characters by their parts without considering their whole shape. This yields strong robustness to severe deformations, which also benefits text recognition. In particular, for handwritten texts whose segmentation into individual characters is very difficult...
In keyword spotting from handwritten documents, word similarity is usually computed by combining character similarities. Converting similarity to probabilistic confidence is beneficial for context fusion and threshold selection. In this paper, we propose to directly estimate the posterior probability of candidate characters based on the N-best paths from the segmentation-recognition candidate...
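The abstract above is truncated, so the authors' exact estimator is not visible here. As a generic illustration only, one common way to approximate a candidate's posterior from an N-best list is to normalize the path scores and sum the mass of every path that contains the candidate. The names `Candidate`, `Path`, and `nbest_posteriors` are hypothetical, and treating path scores as log-probabilities is an assumption.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical types: a candidate is (start segment, end segment, label);
# a path is the list of candidates it uses plus an assumed log-score.
Candidate = Tuple[int, int, str]
Path = Tuple[List[Candidate], float]


def nbest_posteriors(paths: List[Path]) -> Dict[Candidate, float]:
    """Approximate candidate posteriors by summing normalized N-best path weights."""
    if not paths:
        raise ValueError("at least one path is required")
    # Subtract the best score before exponentiating for numerical stability.
    top = max(score for _, score in paths)
    weights = [math.exp(score - top) for _, score in paths]
    total = sum(weights)
    posteriors: Dict[Candidate, float] = {}
    for (candidates, _), weight in zip(paths, weights):
        # set() ensures a candidate is counted once per path.
        for cand in set(candidates):
            posteriors[cand] = posteriors.get(cand, 0.0) + weight / total
    return posteriors


paths = [
    ([(0, 1, "a"), (1, 2, "b")], math.log(0.6)),
    ([(0, 1, "a"), (1, 2, "h")], math.log(0.4)),
]
posteriors = nbest_posteriors(paths)
```

In this toy list, the candidate shared by both paths receives nearly all of the probability mass, while the two competing candidates split it in proportion to their path scores.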
Automatic extraction of date patterns from handwritten documents is challenging because of individual variation in writing style, touching characters, and confusion between letters and digits. In this paper, we propose a framework for retrieval of date patterns from handwritten documents. The method first classifies word components of each text line into month and non-month...
This paper presents a handwritten digit recognition method based on cascaded heterogeneous convolutional neural networks (CNNs). The reliability and complementarity of heterogeneous CNNs are investigated in our method. Each CNN recognizes a proportion of input samples with high confidence, and feeds the rejected samples into the next CNN. The samples rejected by the last CNN are recognized by a voting...
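The cascade-with-rejection idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the classifiers are stand-in callables rather than CNNs, the fixed confidence `threshold` is an assumption, and since the abstract is truncated at the voting step, a plain majority vote over all stages is assumed for the fallback.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a classifier maps a sample to (label, confidence in [0, 1]).
Classifier = Callable[[Sequence[float]], Tuple[int, float]]


def cascade_predict(
    sample: Sequence[float],
    stages: List[Classifier],
    threshold: float = 0.9,  # assumed rejection threshold, not from the paper
) -> Tuple[int, str]:
    """Return (label, source), where source names the stage that accepted the sample."""
    if not stages:
        raise ValueError("at least one stage is required")
    votes: List[int] = []
    for index, classify in enumerate(stages):
        label, confidence = classify(sample)
        if confidence >= threshold:
            # Accepted with high confidence: stop the cascade here.
            return label, f"stage {index}"
        # Rejected: remember the prediction and pass the sample on.
        votes.append(label)
    # Every stage rejected the sample; fall back to a majority vote (assumption).
    winner = Counter(votes).most_common(1)[0][0]
    return winner, "vote"


# Stand-in classifiers with fixed outputs, used only to exercise the control flow.
accepted = cascade_predict([0.0], [lambda s: (3, 0.5), lambda s: (7, 0.95)])
voted = cascade_predict(
    [0.0], [lambda s: (3, 0.5), lambda s: (7, 0.6), lambda s: (7, 0.4)]
)
```

In the first call the second stage accepts the sample; in the second call no stage is confident enough, so the label predicted by most stages is returned.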
Handwriting identification, a technique for automatic person identification, plays an important role in economic dispute cases. In this paper, we propose a new dataset system of Economic Dispute handwritten (DSEDH) based on stroke shape and structure features. The system consists of the CIMS, CPIS, HMS, and HIS subsystems, character segmentation, and handwriting recognition; this system with high efficiency and...
HMM is one of the most popular methods to model sequential signals and plays a significant role in the field of off-line handwritten Arabic word recognition research. However, the structure of an HMM including the number of states has to be determined initially and can hardly be updated during the training process. A novel analytic algorithm based on the information entropy of states in an HMM to...
Unconstrained handwritten text recognition systems maximize the combination of two separate probability scores. The first one is the observation probability that indicates how well the returned word sequence matches the input image. The second score is the probability that reflects how likely a word sequence is according to a language model. Current state-of-the-art recognition systems use statistical...
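The two-score combination described above can be illustrated with a small sketch. It assumes each hypothesis already carries an observation log-probability and a language-model log-probability, and it adds an `lm_weight` scaling factor, which is a common practice but is not stated in the abstract. The names `Hypothesis` and `best_hypothesis` and the toy probabilities are hypothetical.

```python
import math
from typing import List, Tuple

# Hypothetical record: (word sequence, log P(image | words), log P(words)).
Hypothesis = Tuple[str, float, float]


def best_hypothesis(hypotheses: List[Hypothesis], lm_weight: float = 1.0) -> str:
    """Pick the word sequence maximizing observation score + lm_weight * LM score."""
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")
    # In the log domain, the product of the two probabilities becomes a sum.
    return max(hypotheses, key=lambda h: h[1] + lm_weight * h[2])[0]


# Toy scores: the second hypothesis matches the image slightly better,
# but the language model considers it far less likely.
hypotheses = [
    ("the cat", math.log(0.30), math.log(0.020)),
    ("the cot", math.log(0.35), math.log(0.001)),
]
with_lm = best_hypothesis(hypotheses, lm_weight=1.0)
without_lm = best_hypothesis(hypotheses, lm_weight=0.0)
```

With the language model switched off, the hypothesis that best matches the image wins; with it enabled, the more plausible word sequence is selected.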
In this paper, we propose a novel method for extracting a set of baseline-independent features, which are based on the combination of global and local information. An HMM-based recognition system is developed with 161 models that include a space model and a blank model. All of the models are trained using the standard Baum-Welch algorithm with the state-tying technique, and are then decoded using the...
Handwritten text recognition systems commonly combine character classification confidence scores and context models for evaluating candidate segmentation-recognition paths, and the classification confidence is usually optimized at character level. On comparing the performance of class-dependent and class-independent confidence transformation (CT), this paper proposes two regularized class-dependent...
The purpose of this research is to improve the recognition rate of online Arabic handwriting recognition using HMM (Hidden Markov Model). Delayed strokes are removed from the online Arabic word to avoid the difficulty and the confusion caused by the delayed strokes in the recognition process. A new technique for extracting offline features by dividing the image into non-uniform horizontal segments...