This paper presents a syntactic approach based on Adjacency Grammars (AG) for sketch diagram modeling and understanding. Diagrams are a combination of graphical symbols arranged according to a set of spatial rules defined by a visual language. AG describe visual shapes by productions defined in terms of terminal and non-terminal symbols (graphical primitives and subshapes), and a set of functions describing...
Writer identification consists of determining the writer of a piece of handwriting from a set of writers. In this paper we present a system for writer identification in old handwritten music scores which uses only music notation to determine the author. The steps of the proposed system are the following. First, the music sheet is preprocessed to obtain a music score without the staff lines...
In this paper, we present a scheme for segmenting English multi-oriented touching strings into individual characters. When two or more characters touch, they generate a large cavity region in the background. Using Convex Hull information, we use this background information to find some initial points to segment a touching string into possible primitive segments (a primitive segment...
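The cavity idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: it only assumes that "cavity" means background pixels lying inside the convex hull of the foreground, and the names `convex_hull`, `inside_hull`, and `cavity_pixels`, as well as the `'#'` image encoding, are hypothetical choices made for this example.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull, p):
    """True if p lies inside or on the boundary of a counter-clockwise convex polygon."""
    n = len(hull)
    for i in range(n):
        a, b = hull[i], hull[(i + 1) % n]
        if (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) < 0:
            return False
    return True


def cavity_pixels(image):
    """Return background pixels that fall inside the convex hull of the foreground.

    Assumption: image is a list of equal-length strings, '#' marks foreground.
    """
    fg = [(x, y) for y, row in enumerate(image)
          for x, ch in enumerate(row) if ch == '#']
    hull = convex_hull(fg)
    if len(hull) < 3:  # degenerate (empty, single point, or collinear) shape
        return []
    fg_set = set(fg)
    return [(x, y) for y, row in enumerate(image) for x in range(len(row))
            if (x, y) not in fg_set and inside_hull(hull, (x, y))]


# A "U" shape: the two pixels of the opening are reported as cavity.
print(cavity_pixels(["#.#", "#.#", "###"]))
```

In a real touching-string image, large connected groups of such cavity pixels would be the candidates from which segmentation points are derived; how those points are chosen is not recoverable from the truncated abstract.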
Reliable indexing of documents having seal instances can be achieved by recognizing seal information. This paper presents a novel approach for detecting and classifying such multi-oriented seals in these documents. First, Hough Transform based methods are applied to extract the seal regions in documents. Next, isolated text characters within these regions are detected. Rotation and size invariant...
In this paper we present a method for document categorization which processes incoming document images such as invoices or receipts. The categorization of these document images is done in terms of the presence of a certain graphical logo detected without segmentation. The graphical logos are described by a set of local features and the categorization of the documents is performed by the use of a bag-of-words...
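As a rough illustration of the bag-of-words step mentioned above, the sketch below quantizes local feature vectors against a codebook and builds a normalized histogram. The function name `bow_histogram`, the Euclidean nearest-codeword assignment, and the toy data are assumptions for this example; the abstract does not specify the features or the codebook construction.

```python
def bow_histogram(features, codebook):
    """Assign each local feature to its nearest codeword (squared Euclidean
    distance) and return the normalized histogram of codeword counts."""
    counts = [0] * len(codebook)
    for f in features:
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(f, codebook[k])),
        )
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Hypothetical 2-D features and a two-word codebook.
codebook = [(0.0, 0.0), (10.0, 10.0)]
features = [(1.0, 0.0), (0.0, 1.0), (9.0, 9.0), (11.0, 10.0)]
print(bow_histogram(features, codebook))
```

Two documents can then be compared, or a document matched against a logo model, by comparing such histograms; the choice of comparison measure is left open here.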
The use of graphology in recruitment processes has become popular in many human resources companies. This paper presents a model that links features from handwritten images to a number of personality characteristics used to measure applicant aptitudes for the job in a particular hiring scenario. In particular, we propose a model for measuring the active personality and leadership of the writer. Graphological...
This article proposes a novel similarity measure between vector sequences. Recently, a model-based approach was introduced to address this issue. It consists of modeling each sequence with a continuous Hidden Markov Model (C-HMM) and computing a probabilistic measure of similarity between C-HMMs. In this paper we propose to model sequences with semi-continuous HMMs (SC-HMMs): the Gaussians of the SC-HMMs...
Touching characters are a major obstacle to achieving higher recognition rates in optical character recognition (OCR). Present OCR systems do not perform well when adjacent characters touch. If characters touch in graphical documents (e.g. maps), recognizing such touching strings is more difficult because the touching characters appear in multiple orientations. In this paper,...
We propose a novel approach for writer adaptation in a word spotting task. The method exploits the fact that a semi-continuous hidden Markov model separates the word model parameters into (i) a shared codebook of shapes and (ii) a set of word-specific parameters. Our main contribution is to derive writer-specific word models by statistically adapting an initial universal codebook to each document...
In this paper, we present a scheme for recognizing English characters in multi-scale and multi-oriented environments. Graphical documents such as maps contain text lines that appear in different orientations. Sometimes, the characters in a single word follow a curvilinear path to annotate curved graphical lines. For recognition of such multi-scale and multi-oriented characters a Support...
This article describes a sketch-based framework for semi-automatic annotation of historical document collections. It is motivated by the fact that fully automatic methods, while helpful for extracting metadata from large collections, have two main drawbacks in a real-world application: (i) they are error-prone and (ii) they only capture a subset of all the knowledge in the document base, both meaning...
In this paper we present a method to spot both text and graphical symbols in a collection of images of wiring diagrams. Word spotting and symbol spotting methods tend to use the most discriminative features to describe the objects to be located. As a result, textual and symbolic information cannot be handled at the same time. We propose a spotting architecture able to index both words...
The aim of writer identification is to determine the writer of a piece of handwriting from a set of writers. In this paper we present a system for writer identification in old handwritten music scores. Even though a significant number of compositions contain handwritten text in the music scores, the aim of our work is to use only music notation to determine the author. The steps of the system proposed...
In graphical documents (maps, engineering drawings), artistic documents, etc., there are many printed materials in which text lines are not parallel to each other but are multi-oriented and curved in nature. For the OCR of such documents we need to extract individual text lines from the documents. Extraction of individual text lines from multi-oriented and/or curved text documents is a difficult problem...
This paper deals with the performance evaluation of symbol recognition & spotting systems. It presents an overview resulting from the work and discussions undertaken by a working group on this subject. The paper starts by giving a general view of symbol recognition & spotting and performance evaluation. Next, the two main issues of performance evaluation are discussed: groundtruthing...
This paper presents a syntactic recognition approach for on-line drawn graphical symbols. The proposed method consists of an incremental on-line predictive parser based on symbol descriptions by an adjacency grammar. The parser analyzes input strokes as they are drawn by the user and is able to anticipate which symbols are likely to be recognized when a partial subshape is drawn in an intermediate...
In this paper we present a method to locate and recognize graphical symbols appearing in real images. A vectorial signature is defined to describe graphical symbols. It is formulated in terms of accumulated length and angular information computed from polygonal approximation of contours. The proposed method aims to locate and recognize graphical symbols in cluttered environments at the same time,...
In this paper a word spotting approach to index archival image documents is presented. Indices are constructed from keyword images. The spotting strategy is formulated on an indexing-by-shape basis. The well known shape context descriptor is used to compute word image signatures from the skeleton points. Afterwards, codewords are extracted from thresholded shape contexts. It is a simpler and more...
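For readers unfamiliar with the shape context descriptor mentioned above, the following is a minimal sketch of the descriptor for a single reference point: a log-polar histogram of where the other points lie relative to it. The bin counts, the radius limits, and the normalization by the mean distance from the reference point are simplifying assumptions made for this example, and `shape_context` is a hypothetical name; the thresholding and codeword extraction described in the abstract are not covered.

```python
import math


def shape_context(points, index, n_radial=5, n_angular=12, r_min=0.125, r_max=2.0):
    """Log-polar histogram (n_radial x n_angular) of the positions of all other
    points relative to points[index].

    Simplification: distances are normalized by the mean distance from the
    reference point to the other points, for scale invariance.
    """
    px, py = points[index]
    offsets = [(x - px, y - py) for i, (x, y) in enumerate(points) if i != index]
    hist = [[0] * n_angular for _ in range(n_radial)]
    dists = [math.hypot(dx, dy) for dx, dy in offsets]
    mean_d = sum(dists) / len(dists) if dists else 0.0
    if mean_d == 0.0:
        return hist

    log_min, log_max = math.log(r_min), math.log(r_max)
    for (dx, dy), d in zip(offsets, dists):
        r = d / mean_d
        if r == 0.0 or r > r_max:
            continue  # coincident point or outside the descriptor's support
        # Radial bin on a logarithmic scale; radii below r_min go to the first bin.
        t = (math.log(max(r, r_min)) - log_min) / (log_max - log_min)
        rb = min(int(t * n_radial), n_radial - 1)
        # Angular bin over [0, 2*pi).
        theta = math.atan2(dy, dx) % (2 * math.pi)
        ab = min(int(theta / (2 * math.pi) * n_angular), n_angular - 1)
        hist[rb][ab] += 1
    return hist


# Four points at equal distance around the origin fall in the same radial ring
# but in four different angular sectors.
pts = [(0, 0), (2, 1), (-1, 2), (-2, -1), (1, -2)]
for row in shape_context(pts, 0):
    print(row)
```

In a word spotting setting, the points would typically be sampled from the skeleton of a word image and one such histogram computed per sampled point.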
This article presents a pen-based framework for manual editing of digital documents on tablet computers. In this system, the user draws certain proofreading symbols on the text parts to edit; some symbols can be accompanied by handwritten text. The input is interpreted and the corresponding editing action is executed in real time. The possibility that the input contains handwritten text is a novelty...
Text/graphics separation in document image analysis is one of the main concerns in present research work. The complexity increases when text and graphics overlap in color map images. This paper discusses a number of improvements to text/graphics separation methods to make them suitable for maps. Emphasis is given to the overlapping regions of text and graphics. It also discusses...