Automatic recognition of Arabic handwritten text is a problem worth solving, and it has attracted growing interest in recent years. In this paper, we address the most frequently encountered problems when dealing with Arabic handwriting recognition, and we briefly present some lessons learned from several serious attempts. We show why morphological analysis of Arabic handwriting could...
This paper presents the history of the Persian (Farsi) script, as well as the development of different writing styles for the current Persian script. It also addresses the Arabic alphabet adopted and evolved for writing the Persian language as well as different writing styles. This evolution includes further extensions to the Arabic alphabet and altered shapes of some Arabic letters. The differences...
This paper summarizes techniques proposed for off-line Arabic word recognition. It takes the point of view of human reading, which favors an interactive mechanism between global memorization and local verification and thereby simplifies the recognition of complex scripts such as Arabic. With this in mind, specific papers are analyzed with comments on their strategies.
Searching handwritten documents is a relatively unexplored frontier for documents in any language. Traditional approaches use either image-based or text-based techniques. This paper describes a framework for versatile search where the query can be either text or image, and the retrieval method fuses text and image retrieval methods. A UNICODE and an image query are maintained throughout the search,...
In this paper we present a novel approach for the recognition of offline Arabic handwritten text motivated by the Arabic letters’ conditional joining rules. A lexicon of Arabic words can be expressed in terms of a new alphabet of PAWs (Part of Arabic Word). PAWs can be expressed in terms of letters. The recognition problem is decomposed into two problems to solve simultaneously. To find the best matching...
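As a rough illustration of the PAW idea mentioned in the abstract above, the following sketch splits an Arabic word into PAWs using the joining rules of the script. It is not the authors' method. The function name `split_into_paws` and the two letter sets are assumptions made for this example, the letter sets are simplified, and diacritics and ligatures are ignored.

```python
# Minimal sketch: split an Arabic word into PAWs (Parts of Arabic Words).
# Assumption: these simplified sets are adequate for illustration only.

# Letters that connect to the preceding letter but never to the following one.
RIGHT_JOINING_ONLY = set("اأإآدذرزوؤةى")
# Letters that connect to neither neighbor (the standalone hamza).
NON_JOINING = {"ء"}


def split_into_paws(word):
    """Return the list of connected components (PAWs) of `word`."""
    paws = []
    current = ""
    for ch in word:
        if ch in NON_JOINING:
            # A non-joining letter ends the current PAW and stands alone.
            if current:
                paws.append(current)
            paws.append(ch)
            current = ""
            continue
        current += ch
        if ch in RIGHT_JOINING_ONLY:
            # This letter cannot connect forward, so the PAW ends here.
            paws.append(current)
            current = ""
    if current:
        paws.append(current)
    return paws


if __name__ == "__main__":
    for w in ["كتاب", "دار", "محمد"]:
        print(w, split_into_paws(w))
```

Under these assumptions, a lexicon of words can be re-expressed as sequences of PAWs by applying the function to each entry, which mirrors the two-level word/PAW/letter view described in the abstract.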
The great success and high recognition rates of both OCR systems and recognition systems for handwritten words are inconceivable without the availability of huge datasets of real-world data. This chapter gives a short survey of datasets used for recognition with special focus on their application. The main part of this chapter deals with Arabic handwriting, datasets for recognition systems, and their...
The technology of handwritten Chinese character recognition (HCCR) has seen significant advances in the last two decades owing to the effectiveness of many techniques, especially those for character shape normalization and feature extraction. This chapter reviews the major methods of normalization and feature extraction and evaluates their performance experimentally. The normalization methods include...
Uncertainty and variability are two of the most important concepts at the center of pattern recognition. It is especially true when patterns to be recognized are complex in nature and not controlled by any artificial constraints. Handwritten postal address recognition is one such case. This paper presents five principles of dealing with uncertainty and variability, and discusses how to decompose the...
In this paper, we introduce an efficient clustering based coarse-classifier for a Chinese handwriting recognition system to accelerate the recognition procedure. We define a candidate-cluster-number for each character. The defined number indicates the within-class diversity of a character in the feature space. Based on the candidate-cluster-number of each character, we use a candidate-refining module...
Given the large number of categories, or class types, in the Chinese language, the challenge offered by character recognition involves dealing with such a large-scale problem in both training and testing phases. This paper addresses three techniques, the combination of which has been found to be effective in solving the problem. The techniques are: 1) a prototype learning/matching method that determines...
This paper discusses online handwriting recognition of Japanese characters, a mixture of ideographic characters (Kanji) of Chinese origin, and the phonetic characters made from them. Most Kanji character patterns are composed of multiple subpatterns, called radicals, which are shared among many (sometimes hundreds of) Kanji character patterns. This is common in Oriental languages of Chinese origin,...
The market of handwriting recognition applications is increasing rapidly due to continuous advancement in OCR technology. This paper summarizes our recent efforts on offline handwritten Chinese script recognition using a segmentation-driven approach. We address two essential problems, namely isolated character recognition and establishment of the probabilistic segmentation model. To improve the isolated...
Two methods, Symbolic Indirect Correlation (SIC) and Style Constrained Classification (SCC), are proposed for recognizing handwritten Arabic and Chinese words and phrases. SIC reassembles variable-length segments of an unknown query that match similar segments of labeled reference words. Recognition is based on the correspondence between the order of the feature vectors and of the lexical transcript...
This paper introduces a script-independent methodology for multi-lingual offline handwriting recognition (OHR) based on the use of Hidden Markov Models (HMM). The OHR methodology extends our script-independent approach for OCR of machine-printed text images. The feature extraction, training, and recognition components of the system are all designed to be script independent. The HMM training and recognition...
India is a multi-lingual, multi-script country. Considerably less work has been done towards handwritten character recognition of Indian languages than for other languages. In this paper we propose a quadratic classifier based scheme for the recognition of off-line handwritten characters of three popular south Indian scripts: Kannada, Telugu, and Tamil. The features used here are mainly obtained from...
This paper describes recent work on ensemble methods for offline handwritten text line recognition. We discuss techniques to build ensembles of recognizers by systematically altering the training data or the system architecture. To combine the results of the ensemble members, we propose to apply ROVER, a voting based framework commonly used in continuous speech recognition. Additionally, we extend...