Warped text lines often appear when bound documents are digitized using flatbed scanners or digital cameras. Compensating for such distortion is an important pre-processing step in, for instance, document transcription via OCR. This paper presents an efficient algorithm for text-line segmentation in document images. A typographic study and parameter tuning are done yielding into...
In this work, a system for recognition of printed mathematical expressions has been developed. To this end, a statistical framework based on two-dimensional stochastic context-free grammars has been defined. This formal framework allows the segmentation, symbol recognition and structural analysis of a mathematical expression to be tackled jointly by computing its most probable parse. In order to test this...
Optical Character Recognition (OCR) converts images of handwritten or printed text captured by camera or scanner into editable text. OCR has seen limited adoption in mobile platforms due to the performance constraints of these systems. Intel® Atom™ processors have enabled general purpose applications to be executed on handheld devices. In this paper, we analyze a reference implementation of the OCR...
Accurate segmentation of text lines from printed or handwritten documents is an important task in any document processing system. This becomes a challenging and complex problem for several reasons. Situations arise when the text from neighboring lines overlaps the white space area, or touches text of the current line. Complications may also arise when, due to varying skew, text lines curve along...
One of the major issues in document image processing is the efficient creation of ground truth in order to be used for training and evaluation purposes. Since a large number of tools have to be trained and evaluated in realistic circumstances, we need to have a quick and low cost way to create the corresponding ground truth. Moreover, the specific need for having the correct text correlated with the...
In this paper we present a novel CAPTCHA that is based on the current hard AI problem of mixed-text (handwriting and printed-text) segmentation. The proposed CAPTCHA overlays generated handwritten word images on a generated printed-text background. We first propose a modification that allows for character level perturbations on an existing synthetic handwriting generation technique. These perturbations...
Automatic Arabic handwritten text recognition is still an open research field; methods that offer a satisfactory solution are still lacking. This can be attributed to the cursive orthography and to the context sensitivity of letter shapes, which complicate the problem of character segmentation. This paper presents a heuristic rule-based analytical segmentation approach for handwritten Arabic text, which...
Scene text images feature an abundance of font style variety but a dearth of data in any given query. Recognition methods must be robust to this variety or adapt to the query data's characteristics. To achieve this, we augment a semi-Markov model, which integrates character segmentation and recognition, with a bigram model of character widths. Softly promoting segmentations that exhibit font metrics consistent...
Video texts are known to constitute an important source of information for semantic summaries of video archives. In this study, we propose a fully automated architecture for semantic annotation and later retrieval of Turkish news videos based on the corresponding video texts. At the core of the architecture is a named entity recognizer, the output of which on video texts is used as semantic annotations...
Separation of the text and graphics layers in maps with dense and overlapping sets of features (e.g. topographic maps) is a challenging problem. Multi Angled Parallelism (MAP) provides an efficient tool to detect miscellaneous linear features using directional morphological operations and higher order feature representation. However, in its original formulation sides of characters, short lines, and...
We propose a low complexity method for segmentation of text regions in natural images. This algorithm is designed for mobile applications (e.g. unmanned or hand-held devices) in which computational and energy resources are limited. No prior assumption is made regarding the text size, font, language, character set or the camera angle. However, the text is assumed to be located on a piecewise homogeneous...
In this paper, we define a new paradigm for eight-connection labeling, which employs a general approach to improve neighborhood exploration and minimizes the number of memory accesses. First, we exploit and extend the decision table formalism, introducing or-decision tables, in which multiple alternative actions are managed. An automatic procedure to synthesize the optimal decision tree from the decision...
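For readers unfamiliar with the eight-connection labeling task named in the abstract above, the following is a minimal sketch of a classic two-pass labeling scheme with union-find. It is not the decision-table or decision-tree method the paper proposes; the function name `label_8_connected` and the list-of-lists image representation are assumptions made purely for illustration.

```python
from typing import Dict, List


def label_8_connected(image: List[List[int]]) -> List[List[int]]:
    """Label non-zero pixels so that 8-connected pixels share one label.

    Hypothetical helper for illustration: a plain two-pass scheme,
    not the optimized neighborhood exploration described in the paper.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i] = union-find parent of provisional label i (0 = background)

    def find(x: int) -> int:
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: int, b: int) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # First pass: assign provisional labels and record equivalences.
    # Only the four already-visited neighbours need to be inspected.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            seen = []
            for dr, dc in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc]:
                    seen.append(labels[nr][nc])
            if not seen:
                parent.append(len(parent))  # new provisional label, its own root
                labels[r][c] = len(parent) - 1
            else:
                smallest = min(seen)
                labels[r][c] = smallest
                for other in seen:
                    union(smallest, other)

    # Second pass: resolve equivalences and renumber labels as 1, 2, 3, ...
    remap: Dict[int, int] = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = remap.setdefault(root, len(remap) + 1)
    return labels
```

In this sketch, diagonal neighbours count as connected, so the pixels at (0, 0) and (1, 1) of `[[1, 0], [0, 1]]` receive the same label, whereas a four-connected scheme would keep them apart.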
Amharic is the official language of Ethiopia and uses Ethiopic script for writing. In this paper, we present writer-independent HMM-based Amharic word recognition for offline handwritten text. The underlying units of the recognition system are a set of primitive strokes whose combinations form handwritten Ethiopic characters. For each character, possibly occurring sequences of primitive strokes and...
In this paper, we propose a comparative study between the affixal approach and the analytical approach for off-line Arabic decomposable word recognition. The analytical approach is based on the modeling of alphabetical letters. The affixal approach is based on the modeling of linguistic entities, namely prefix, infix, suffix and root. The experimental results obtained by these two approaches...
Offline handwriting recognition of free-flowing Arabic text is a challenging task due to the plethora of factors that contribute to the variability in the data. In this paper, we address some of these sources of variability, and present experimental results on a large corpus of handwritten documents. Specific techniques such as the application of context-dependent Hidden Markov Models (HMMs) for the...
This paper describes a system for handwritten Chinese text recognition that integrates a language model. On a text line image, the system generates character segmentation and word segmentation candidates, and the candidate paths are evaluated by character recognition scores and the language model. The optimal path, giving the segmentation and recognition result, is found using a pruned dynamic programming search...
This paper presents a graph based scheme for color text recognition in images and videos, which is particularly robust to complex background, low resolution or video coding artifacts. This scheme is based on a novel method named the image text recognition graph (iTRG) composed of five main modules: an image text segmentation module, a graph connection builder module, a character recognition module,...
This paper describes application-oriented text localization and character segmentation in images. Target texts in our application are often damaged by various kinds of noise and, as a result, contain many unclear characters. Therefore, some special treatments are necessary to recognize these texts. In this paper, new optimized text localization and character segmentation algorithms are proposed....
In this paper we introduce a framework for automated text recognition from images. We first describe a simple but efficient text detection and recognition method based on analysis of maximally stable extremal regions (MSERs) and simple template matching, which provides initial character recognition results. The main emphasis of the paper is on introducing a novel method for exploiting contextual...
In this paper, a system for automatic detection and recognition of Korean texts or shop names in outdoor signboard images is described. The system includes detection, binarization and extraction of text in a signboard image captured by a camera of a mobile phone for the recognition of the shop name. It can deal with different font styles and sizes as well as illumination changes. Individual characters...