A critical issue in the recognition of mathematical expressions is the identification of the spatial relations of the symbols and/or sub-expressions that comprise the entire mathematical formula. This paper addresses the problem of structural analysis of mathematical expressions by constructing appropriate feature vectors to represent the spatial affinity of the objects (mathematical symbols or sub-expressions)...
Arabic writer identification is a very active research field. However, no standard benchmark is available for researchers in this field. The aim of this competition is to gather researchers and compare recent advances in Arabic writer identification. The competition was hosted by Kaggle and attracted thirty participants from both academia and industry. This paper gives details on this competition,...
This paper proposes an enhancement of our previously presented word segmentation method (ILSP-LWseg) [1] by exploiting local spatial features. ILSP-LWseg is based on a gap metric that exploits the objective function of a soft-margin linear SVM that separates successive connected components (CCs). Then a global threshold for the gap metrics is estimated and used to classify the candidate gaps in "within"...
This paper deals with the problem of recognizing accented and non-accented characters in French handwriting. Accented characters increase the number of classes to be recognized. The performance of powerful classifiers such as SVMs is degraded by the presence of accents. In this paper, an accented character is segmented into two parts: the root character or letter and the accent. These two parts are...
Accurate computer recognition of handwritten mathematics offers to provide a natural interface for mathematical computing, document creation and collaboration. Mathematical handwriting, however, provides a number of challenges beyond what is required for the recognition of handwritten natural languages. For example, it is usual to use symbols from a range of different alphabets and there are many...
Since the Urdu language has more isolated letters than Arabic and Farsi, research on handwritten Urdu words is desirable. This paper presents a novel approach that uses compound features and a Support Vector Machine (SVM) for offline Urdu word recognition. Because of the cursive style of Urdu, classification using a holistic approach is adopted. Compound feature sets, which involve structural and...
Automatic identification of an individual based on his/her handwriting characteristics is an important forensic tool. In a computational forensic scenario, the presence of a large amount of text in a questioned document cannot always be ensured. Compromising on system reliability in such situations is also undesirable. We propose a system to cope with such adverse situations...
In this paper we investigate the use of linguistic information given by language models to deal with word recognition errors on handwritten sentences. We focus especially on errors due to out-of-vocabulary (OOV) words. First, word posterior probabilities are computed and used to detect error hypotheses on output sentences. An SVM classifier allows these errors to be categorized according to defined...
In this paper, we present a novel approach for incorporating structural information into the hidden Markov modeling (HMM) framework for offline handwriting recognition. Traditionally, structural features have been used in recognition approaches that rely on accurate segmentation of words into smaller units (sub-words or characters). However, such segmentation-based approaches do not perform well on...
The traditional weighting schemes used in text categorization for the vector space model (VSM) cannot exploit information intrinsic to texts obtained through online handwriting recognition or any OCR process. In particular, the top n (n > 1) recognition candidates cannot be used without flooding the resulting text with false occurrences of spurious terms. In this paper, an improved weighting scheme...
Transforming handwriting into digital text and recognizing handwritten patterns open a vast range of application opportunities, from searching handwritten notes and document management to triggering actions by writing symbols. Despite receiving great attention, a massive number of applications, and a huge research effort, recognition of handwritten text has still not reached the desired efficiency...
This paper presents a new approach to estimating the readability of handwritten text. The estimation task is posed as a regression problem. A novel support vector regression (SVR) system is used to estimate the recognition rate of a text recognizer on a given text. The estimated recognition rates are used to classify text as either readable or unreadable. Unreadable text can then be filtered out prior...
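The readable/unreadable decision described in this abstract can be sketched as a simple thresholding step. This is a minimal illustration under stated assumptions, not the paper's system: the estimated recognition rates are assumed to come from an already-trained regressor, and the function name `filter_readable`, the sample values, and the 0.5 threshold are hypothetical.

```python
from typing import List, Tuple


def filter_readable(
    estimates: List[Tuple[str, float]], threshold: float = 0.5
) -> Tuple[List[str], List[str]]:
    """Split text-line IDs into readable and unreadable by estimated recognition rate."""
    readable: List[str] = []
    unreadable: List[str] = []
    for line_id, rate in estimates:
        # A line counts as readable only if its estimated rate reaches the threshold.
        (readable if rate >= threshold else unreadable).append(line_id)
    return readable, unreadable


# Hypothetical estimates, standing in for the output of a regressor such as an SVR.
estimates = [("line-1", 0.91), ("line-2", 0.32), ("line-3", 0.50)]
readable, unreadable = filter_readable(estimates)
print(readable)
print(unreadable)
```

In practice the threshold would presumably be tuned on held-out data; the value used here is arbitrary.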
Offline identification of handwriting, like that of signatures, falls within the field of biometrics. It is used in particular in the banking and legal fields, where problems of imitation and forgery are often encountered. This paper presents an approach to personal identification based on the fusion of two offline modalities: handwritten signature and handwriting...
This article describes a sketch-based framework for semi-automatic annotation of historical document collections. It is motivated by the fact that fully automatic methods, while helpful for extracting metadata from large collections, have two main drawbacks in a real-world application: (i) they are error-prone and (ii) they only capture a subset of all the knowledge in the document base, both meaning...
In this paper we describe a model for classifying binary data using classifiers based on Bernoulli mixture models. We show how Bernoulli mixtures can be used for feature extraction and dimensionality reduction of raw input data. The extracted features are then used for training a classifier for supervised labeling of individual sample points. We have applied this method to two different types of datasets,...
In this paper, we present an approach for separating text and non-text ink strokes in online handwritten Japanese documents based on Markov random fields (MRFs), which effectively utilize the spatial relationship between strokes. Support vector machine (SVM) classifiers are trained for individual stroke and stroke pair classification, and on converting the SVM outputs to probabilities, the likelihood...
Writer recognition is considered a difficult problem because of the variations found in writing, even from the same writer. In this paper, steered Hermite features are used to identify the writer of a written document. We will show that steered Hermite features are highly useful for text images because they extract a lot of information, notably for data characterized by oriented features, curves...