Search results

Items from 1 to 20 out of 175 results

chapter

Curved document image rectification

Dhanya M Dhanalakshmy, Hema P Menon

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 783 - 786

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI)

Digitization of documents has gained prominence in the recent past for data preserving. Paper documents can be converted to digital form by using various modes of acquisition techniques. In this paper processing of data captured using normal digital camera has been considered. The camera captured document images may contain warped document due to perspective and geometric distortions. Curvature of...

chapter

A Novel OCR Approach Based on Document Layout Analysis and Text Block Classification

Weiheng Zhu, Yuanfeng Liu, Liang Hao

2016 12th International Conference on Computational Intelligence and Security (CIS) > 91 - 94

2016 12th International Conference on Computational Intelligence and Security (CIS)

Document layout helps users to focus on important content of the documents while neglecting the rest whenever possible. This paper presents a novel Optical Character Recognition (OCR) algorithm whose performance is enhanced by post-processing based on information collected from document layout analysis. Initial OCR results are used for text block classification, whose results are then used to fine-tune...

chapter

Similar handwritten Chinese character recognition based on adaptive discriminative locality alignment

Xiwen Qu, Ning Xu, Weiqiang Wang, Ke Lu

2015 14th IAPR International Conference on Machine Vision Applications (MVA) > 130 - 133

2015 14th IAPR International Conference on Machine Vision Applications (MVA)

Discriminative locality alignment (DLA) has been successfully applied in similar handwritten Chinese character recognition (SHCCR). But, the performance of DLA heavily depends on the choice of parameters and the optimal parameters among different groups of similar characters are not consistent. To address this problem, we present an improved method with few parameters, called adaptive discriminative...

chapter

A skew detection and correction technique for Arabic script text-line based on subwords bounding

Atallah M. Al-Shatnawi

2014 IEEE International Conference on Computational Intelligence and Computing Research > 1 - 5

2014 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC)

Text-line skew detection and correction is the first step in Arabic document recognition and analysis. It is a crucial pre-processing stage of Arabic Character Recognition (ACR). It has a direct effect on the dependability and efficiency of other system stages such as baseline detection, segmentation and feature extraction stages. In this paper an efficient skew detection and correction method for...

chapter

A Database of On-Line Handwritten Mixed Objects Named "Kondate"

Tomohisa Matsushita, Masaki Nakagawa

2014 14th International Conference on Frontiers in Handwriting Recognition > 369 - 374

2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR)

This paper describes a database of on-line handwritten patterns mixed of text, figures, tables, maps, diagrams and so on. Now, pen-based and touch-based interfaces are spreading into people and their surfaces are getting large. People can write and draw mixed objects without paying attention on the difference of objects or the mode change. Moreover, they may write text in any direction in combination...

chapter

The Influence of Language Orthographic Characteristics on Digital Word Recognition

Ofer Biller, Jihad El-Sana, Klara Kedem

2014 11th IAPR International Workshop on Document Analysis Systems > 131 - 135

2014 11th IAPR International Workshop on Document Analysis Systems (DAS)

We study the effect of language orthographic characteristics on the performance of digital word recognition in degraded documents such as historical documents. We provide a rigorous scheme for quantifying the influence of the orthographic characteristics on the quality of word recognition in such documents. We study and compare several orthographic characteristics for four natural languages and measure...

chapter

Graph Model Optimization Based Historical Chinese Character Segmentation Method

Jingning Ji, Liangrui Peng, Bohan Li

2014 11th IAPR International Workshop on Document Analysis Systems > 282 - 286

2014 11th IAPR International Workshop on Document Analysis Systems (DAS)

Historical Chinese document recognition technology is important for digital library. However, historical Chinese character segmentation remains a difficult problem due to the complex structure of Chinese characters and various writing styles. This paper presents a novel method for historical Chinese character segmentation based on graph model. After a preliminary over-segmentation stage, the system...

chapter

Neural net based complete character recognition scheme for Bangla printed text books

S. K. Alamgir Hossain, Tamanna Tabassum

16th Int'l Conf. Computer and Information Technology > 71 - 75

2013 16th International Conference on Computer and Information Technology (ICCIT)

In this paper we propose a neural net based characters recognition scheme for Bangla printed text books. There are a lot of scientific literature, novels, magazines and books etc that are written in Bangla language. More than 400 million people use Bangla language. Most of the library and educational institutions want to keep copy of the books in a digital format. For storing those books in digital...

chapter

Text line identification in Tagore's manuscript

Chandranath Adak, Bidyut B. Chaudhuri

Proceedings of the 2014 IEEE Students' Technology Symposium > 210 - 213

2014 IEEE Students' Technology Symposium (TechSym)

In this paper, a text line identification method is proposed. The text lines of printed document are easy to segment due to uniform straightness of the lines and sufficient gap between the lines. But in handwritten documents, the line is nonuniform and interline gaps are variable. We take Rabindranath Tagore's manuscript as it is one of the most difficult manuscripts that contain doodles. Our method...

chapter

Rejection Schemes in Multi-class Classification -- Application to Handwritten Character Recognition

Hubert Cecotti, Szilard Vajda

2013 12th International Conference on Document Analysis and Recognition > 445 - 449

2013 12th International Conference on Document Analysis and Recognition (ICDAR)

The recognition of handwritten characters is an almost solved problem thanks to efficient machine learning techniques. However, the evaluation and the choice of thresholds to meet a certain level of performance remains a challenge. In this paper, we compare different rejection techniques to determine if a character has been successfully detected or not. Whereas the evaluation of binary classifiers...

chapter

Using Harris Corners for the Retrieval of Graphs in Historical Manuscripts

Rainer Herzog, Arved Solth, Bernd Neumann

2013 12th International Conference on Document Analysis and Recognition > 1295 - 1299

2013 12th International Conference on Document Analysis and Recognition (ICDAR)

In recent years, several methods have been proposed for content-based retrieval from manuscripts, mostly based on character or word similarity. In this paper, we present a new segmentation-free method, called Harris Corner Matching (HCM), which accepts an arbitrary writing pattern as a model and allows to retrieve similar patterns from a possibly large database. Retrieval is performed in two steps...

chapter

A Two-Stage Approach for Word Spotting in Graphical Documents

Arundhati Tarafdar, Umapada Pal, Partha Pratim Roy, Nicolas Ragot, more

2013 12th International Conference on Document Analysis and Recognition > 319 - 323

2013 12th International Conference on Document Analysis and Recognition (ICDAR)

Presence of multi-oriented characters, connected characters with graphical lines, intersection of text and symbols with graphical lines/curves etc. are very common in graphical documents. As a result word spotting in graphical documents is still a challenging task that we try to solve (partially) in this paper. The proposed approach proceeds in two stages. In the first stage, recognition of isolated...

chapter

What Should We Be Comparing for Writer Identification?

Andrew J. Newell

2013 12th International Conference on Document Analysis and Recognition > 418 - 422

2013 12th International Conference on Document Analysis and Recognition (ICDAR)

Certain approaches to writer identification encode handwriting as texture, producing a single histogram of visual features, ignoring any information about the lexical content of the passage. In contrast, other approaches first segment elements of the text, such as characters or bigrams, so that they can be compared like-for-like with other instances of the same element. The difference between the...

chapter

Reading Activity Recognition Using an Off-the-Shelf EEG -- Detecting Reading Activities and Distinguishing Genres of Documents

Kai Kunze, Yuki Shiga, Shoya Ishimaru, Koichi Kise

2013 12th International Conference on Document Analysis and Recognition > 96 - 100

2013 12th International Conference on Document Analysis and Recognition (ICDAR)

The document analysis community spends substantial resources towards computer recognition of any type of text (e.g. characters, handwriting, document structure etc.). In this paper, we introduce a new paradigm focusing on recognizing the activities and habits of users while they are reading. We describe the differences to the traditional approaches of document analysis. We present initial work towards...

chapter

Power-law transformation for enhanced recognition of born-digital word images

Deepak Kumar, A G Ramakrishnan

2012 International Conference on Signal Processing and Communications (SPCOM) > 1 - 5

2012 International Conference on Signal Processing and Communications (SPCOM)

In this paper, we discuss the issues related to word recognition in born-digital word images. We introduce a novel method of power-law transformation on the word image for binarization. We show the improvement in image binarization and the consequent increase in the recognition performance of OCR engine on the word image. The optimal value of gamma for a word image is automatically chosen by our algorithm...

chapter

Word classification in bilingual printed documents

Sofiene Haboubi, Samia Maddouri, Hamid Amiri

2012 6th International Conference on Sciences of Electronics, Technologies of Information and Telecommunications (SETIT) > 502 - 506

2012 6th International Conference on Sciences of Electronics, Technologies of Information and Telecommunications (SETIT)

In this paper we propose a method of identifying Arabic words from Arabic and Latin scripts in printed documents. This method is based on a statistical and geometric analysis to separate between words of a printed document. Structural features are used to describe the words extracted in previous step. Among the features used: the jambs, the diacritical points, the connected components, the hamps…...

chapter

Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning

Adam Coates, Blake Carpenter, Carl Case, Sanjeev Satheesh, more

2011 International Conference on Document Analysis and Recognition > 440 - 445

2011 International Conference on Document Analysis and Recognition (ICDAR)

Reading text from photographs is a challenging problem that has received a significant amount of attention. Two key components of most systems are (i) text detection from images and (ii) character recognition, and many recent methods have been proposed to design better feature representations and models for both. In this paper, we apply methods recently developed in machine learning -- specifically,...

chapter

Automatic Estimation of the Legibility of Binarised Historic Documents for Unsupervised Parameter Tuning

M. Stommel, G. Frieder

2011 International Conference on Document Analysis and Recognition > 104 - 108

2011 International Conference on Document Analysis and Recognition (ICDAR)

Document enhancement tools are a valuable help in the study of historic documents. Given proper filter settings, many effects that impair the legibility can be evened out (e.g. washed out ink, stained and yellowed paper). However, because of differing authors, languages, handwritings, fonts and paper conditions, no single filter parameter set fits all documents. Therefore, the parameters are usually...

chapter

Mathematical Formula Identification in PDF Documents

Xiaoyan Lin, Liangcai Gao, Zhi Tang, Xiaofan Lin, more

2011 International Conference on Document Analysis and Recognition > 1419 - 1423

2011 International Conference on Document Analysis and Recognition (ICDAR)

Recognizing mathematical expressions in PDF documents is a new and important field in document analysis. It is quite different from extracting mathematical expressions in image-based documents. In this paper, we propose a novel method by combining rule-based and learning-based methods to detect both isolated and embedded mathematical expressions in PDF documents. Moreover, various features of formulas,...

chapter

Discovering Legible Chinese Typefaces for Reading Digital Documents

Bing Zhang, Ying Li, Ching Y. Suen, Xuemin Zhang

2011 International Conference on Document Analysis and Recognition > 962 - 966

2011 International Conference on Document Analysis and Recognition (ICDAR)

More and more fonts have sprung up in recent years in digital publishing industry and reading devices. In this paper, we focus on methods of evaluating digital Chinese fonts and their typeface characteristics to seek a good way to enhance the character recognition rate. To accomplish this, we combined psychological analysis methods with statistical analysis. It involved an extensive survey of distinctive...

Keywords:
CHARACTER RECOGNITION

Content availability

Available (171)
None (4)

Publication type

book (168)
article (7)

Keywords

FEATURE EXTRACTION (75)
IMAGE SEGMENTATION (57)
PIXEL (53)
TEXT RECOGNITION (48)
OPTICAL CHARACTER RECOGNITION SOFTWARE (47)
OPTICAL CHARACTER RECOGNITION (39)
DOCUMENT IMAGE PROCESSING (38)
DATA MINING (34)
HANDWRITING RECOGNITION (33)
IMAGE RECOGNITION (30)
HANDWRITTEN CHARACTER RECOGNITION (29)
NATURAL LANGUAGE PROCESSING (27)
IMAGE EDGE DETECTION (25)
TRAINING (22)
ALGORITHM DESIGN AND ANALYSIS (18)
DATABASES (18)
IMAGE COLOR ANALYSIS (18)
SHAPE (18)
CAMERAS (16)
HIDDEN MARKOV MODELS (15)
IMAGE PROCESSING (15)
NOISE (14)
TEXT DETECTION (14)
OCR (13)
WRITING (13)
ACCURACY (12)
HISTOGRAMS (12)
NATURAL LANGUAGES (12)
SUPPORT VECTOR MACHINES (12)
IMAGE CLASSIFICATION (11)
CHARACTER SEGMENTATION (10)
PATTERN RECOGNITION (10)
STATISTICAL ANALYSIS (10)
ARTIFICIAL NEURAL NETWORKS (9)
EDGE DETECTION (9)
LAYOUT (9)
LEARNING (ARTIFICIAL INTELLIGENCE) (9)
CONFERENCES (8)
EDUCATIONAL INSTITUTIONS (8)
TEXT EXTRACTION (8)
VIDEO SIGNAL PROCESSING (8)
CONTEXT (7)
DICTIONARIES (7)
FILTERING (7)
IMAGE CODING (7)
IMAGE COLOUR ANALYSIS (7)
IMAGE RESOLUTION (7)
TEXT SEGMENTATION (7)
TRANSFORMS (7)
VISUALIZATION (7)
CLASSIFICATION ALGORITHMS (6)
COMPUTER VISION (6)
CONNECTED COMPONENT ANALYSIS (6)
DYNAMIC PROGRAMMING (6)
ENCODING (6)
ENGINES (6)
ESTIMATION (6)
HANDICAPPED AIDS (6)
HUMANS (6)
IMAGE ANALYSIS (6)
MACHINE LEARNING (6)
MARKOV PROCESSES (6)
NEURAL NETS (6)
OPTICAL IMAGING (6)
PATTERN CLUSTERING (6)
ROBUSTNESS (6)
SEGMENTATION (6)
FILTERING THEORY (5)
GRAPHICS (5)
IMAGE ENHANCEMENT (5)
IMAGE MATCHING (5)
INTERNET (5)
LIGHTING (5)
PROBABILITY (5)
ROBOTS (5)
SPEECH (5)
VISUAL DATABASES (5)
WAVELET TRANSFORMS (5)
WORD PROCESSING (5)
BINARIZATION (4)
COLORED NOISE (4)
COMPUTERS (4)
DETECTORS (4)
DOCUMENT ANALYSIS (4)
EQUATIONS (4)
FILTERING ALGORITHMS (4)
HANDWRITTEN DOCUMENT (4)
HOUGH TRANSFORM (4)
IMAGE SEQUENCES (4)
INDEXING (4)
INFORMATION TECHNOLOGY (4)
INK (4)
KERNEL (4)
LABELING (4)
NATURAL SCENES (4)
PATTERN CLASSIFICATION (4)
PATTERN MATCHING (4)
RECOGNITION (4)

INFONA - science communication portal
