Search results

Items from 1 to 20 out of 159 results

chapter

Curved document image rectification

Dhanya M Dhanalakshmy, Hema P Menon

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 783 - 786

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI)

Digitization of documents has gained prominence in the recent past for data preservation. Paper documents can be converted to digital form using various acquisition techniques. In this paper, processing of data captured using a normal digital camera is considered. Camera-captured document images may be warped due to perspective and geometric distortions. Curvature of...

chapter

Adaptive method for multi colored text binarization

Arindam Das, Sandipan Chowdhury

2017 International Conference on Systems, Signals and Image Processing (IWSSIP) > 1 - 5

2017 International Conference on Systems, Signals and Image Processing (IWSSIP)

This article presents our recent study on multi-colored text binarization. In the output image, we represent foreground content as black and background as white regardless of the polarity of foreground and background in the original image. Here we apply a connected-component-analysis-based approach to group the words or characters within a bounding or edge box. The main novelty of this reported work includes...

chapter

A Novel OCR Approach Based on Document Layout Analysis and Text Block Classification

Weiheng Zhu, Yuanfeng Liu, Liang Hao

2016 12th International Conference on Computational Intelligence and Security (CIS) > 91 - 94

2016 12th International Conference on Computational Intelligence and Security (CIS)

Document layout helps users to focus on important content of the documents while neglecting the rest whenever possible. This paper presents a novel Optical Character Recognition (OCR) algorithm whose performance is enhanced by post-processing based on information collected from document layout analysis. Initial OCR results are used for text block classification, whose results are then used to fine-tune...

chapter

Document image classification using SEMCON

Zenun Kastrati, Ali Shariq Imran

2015 20th Symposium on Signal Processing, Images and Computer Vision (STSIVA) > 1 - 6

2015 20th Symposium on Signal Processing, Images and Computer Vision (STSIVA)

In this paper, we propose a new semantic and contextual document image classification framework. The framework is composed of two main modules. The first is the text analysis module (TAM), which processes document images and extracts words from the image, and the second is SEMCON, which is a semantic and contextual objective metric. From the list of words extracted by TAM, SEMCON...

chapter

A hybrid method for table detection from document image

Tran Tuan Anh, Na In-Seop, Kim Soo-Hyung

2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) > 131 - 135

2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR)

In this paper, we present a hybrid method consisting of three main stages for detecting tables in document images. Based on table structure, our system separates tables into two main categories: ruling-line tables and non-ruling-line tables. In the first stage, the text and non-text elements in the document are classified by a heuristic filter. Then, white space analysis is used to group the text elements...

chapter

Page-level script identification from multi-script handwritten documents

Pawan Kumar Singh, Santu Kumar Dalal, Ram Sarkar, Mita Nasipuri

Proceedings of the 2015 Third International Conference on Computer, Communication, Control and Information Technology (C3IT) > 1 - 6

2015 3rd International Conference on Computer, Communication, Control and Information Technology (C3IT)

Script identification has long been the forerunner of many Optical Character Recognition (OCR) processes in a multi-lingual document environment. Script identification has numerous applications in the field of document image analysis, such as document sorting, indexing, retrieval and translation, etc. In this paper, we have developed a page-level script identification technique for handwritten documents...

chapter

Degradation enhancement for the captured document image using retinex theory

Marian Wagdy, Ibrahima Faye, Dayang Rohaya

Proceedings of the 6th International Conference on Information Technology and Multimedia > 363 - 367

2014 International Conference on Information Technology and Multimedia (ICIMU)

State-of-the-art global thresholding techniques are fast and efficient at converting a gray-scale document image into a binary image. However, they are unsuitable for complex and degraded documents. Moreover, global thresholding techniques produce border noise when the illumination of the document is not uniform. Other methods that depend on local thresholding techniques are efficient in the case...

chapter

Extraction of arbitrary text in natural scene image based on stroke width transform

Jinjuli Jameson, Siti Norul Huda Sheikh Abdullah

2014 14th International Conference on Intelligent Systems Design and Applications > 124 - 128

2014 14th International Conference on Intelligent Systems Design and Applications (ISDA)

Text extraction plays an important role in numerous applications. Its methods still need to be improved in order to achieve better performance, to increase the reliability of text extraction systems and to deal with complex cases of text extraction. The majority of text extraction methods focus on horizontal and near-horizontal text lines; however, text in a natural scene might...

chapter

Visual Perception of Unitary Elements for Layout Analysis of Unconstrained Documents in Heterogeneous Databases

Baptiste Poirriez, Aurelie Lemaitre, Bertrand Couasnon

2014 14th International Conference on Frontiers in Handwriting Recognition > 35 - 40

2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR)

Document layout analysis is a complex task in the context of heterogeneous documents and remains a challenging problem. In this paper, we present our contribution to the layout analysis competition of the international Maurdor Campaign. Our method is based on a grammatical description of the content of elements. It consists of iteratively finding and then removing the most structuring elements...

chapter

Utilizing digital humanities methods for quantifying Howell's State Trials

Tracy Bergstrom, Donald Brower, Natalie Meyers

IEEE/ACM Joint Conference on Digital Libraries > 441 - 442

2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL)

In this paper we describe a quantitative, historically oriented analysis of the law of England between 1650 and 1700 as represented in Howell's State Trials. Our goal was to analyze cases over time to support investigation into whether a quantitative analysis of the content of the 1650–1700 State Trials would exhibit an upward trend of religious tolerance.

chapter

LectureKhoj: Automatic tagging and semantic segmentation of online lecture videos

Esha Baidya, Sanjay Goel

2014 Seventh International Conference on Contemporary Computing (IC3) > 37 - 43

2014 Seventh International Conference on Contemporary Computing (IC3)

Online educational lecture videos are very popular nowadays. However, effective search for relevant videos remains a difficult task. Text displayed in lecture video slides carries important information about the video content. Therefore, it can be used as a valuable source for content analysis and tagging. In this paper, we present an automated method for semantic segmentation and tag recommendation...

chapter

Forgery Detection Based on Intrinsic Document Contents

Amr Gamal Hamed Ahmed, Faisal Shafait

2014 11th IAPR International Workshop on Document Analysis Systems > 252 - 256

2014 11th IAPR International Workshop on Document Analysis Systems (DAS)

Nowadays, document forgery detection is becoming increasingly important as forgery techniques are becoming available even to untrained users. Hence, documents that do not contain any extrinsic security features (e.g. invoices) have become easier to forge. We previously presented a method to detect manipulated documents based on distortions introduced during the forgery creation process. In this paper,...

chapter

Tübıtak Turkish — Ottoman handwritten recognition system

M. Said Aydemir, Burak Aydin, Hamza Kaya, Ibrahim Karliaga, et al.

2014 22nd Signal Processing and Communications Applications Conference (SIU) > 1918 - 1921

2014 22nd Signal Processing and Communications Applications Conference (SIU)

In this study, two different Ottoman and Turkish handwriting recognition systems have been developed using Hidden Markov Model (HMM) and Recurrent Neural Network (RNN). The systems are tested on both public-use datasets and the Civil Registration and Nationality (CRN) dataset. As a public-use dataset, the IFN/ENIT dataset, which was created for the Arabic language, is used because of the similarity between Ottoman...

chapter

Real-Time Document Image Super-Resolution by Fast Matting

Yun Zheng, Xudong Kang, Shutao Li, Yuan He, et al.

2014 11th IAPR International Workshop on Document Analysis Systems > 232 - 236

2014 11th IAPR International Workshop on Document Analysis Systems (DAS)

From a single low resolution image, a real-time document image super-resolution algorithm is proposed to obtain high resolution document image with sharp text boundaries. First, a highly efficient document image matting algorithm based on local linear modeling is designed to decompose the input image into text, foreground and background layers, which contain the text edge information, the color information...

chapter

Document table detection and analysis using projection scale space

L. Ilham Kalyon, Yusuf Sinan Akgul

2014 22nd Signal Processing and Communications Applications Conference (SIU) > 1319 - 1322

2014 22nd Signal Processing and Communications Applications Conference (SIU)

Detection and analysis of tables on document images has been one of the most researched topics in document image processing. In this study, we define novel methods for the detection and analysis of tables from document images, and show their performance results on realistic table examples. The main method developed is projection-scale-space (PSS), where local and global constraints of the table in...

chapter

Neural net based complete character recognition scheme for Bangla printed text books

S. K. Alamgir Hossain, Tamanna Tabassum

16th Int'l Conf. Computer and Information Technology > 71 - 75

2013 16th International Conference on Computer and Information Technology (ICCIT)

In this paper we propose a neural-net-based character recognition scheme for Bangla printed text books. There is a large body of scientific literature, novels, magazines and books written in the Bangla language. More than 400 million people use the Bangla language. Most libraries and educational institutions want to keep copies of the books in a digital format. For storing those books in digital...

chapter

An approach for printed document labeling

Chandranath Adak

2014 First International Conference on Automation, Control, Energy and Systems (ACES) > 1 - 4

2014 First International Conference on Automation, Control, Energy and Systems (ACES)

A document image contains text and non-text regions; it may be printed, handwritten, or a hybrid of both. In this paper we deal with printed documents, where the textual regions consist of printed characters and the non-text regions are mainly photo images. Here we propose a model which performs labeling of different components of a printed document image, i.e. identification of heading, subheading, caption, article and photo...

chapter

Document layout analysis for Indian newspapers using contour based symbiotic approach

Vijay Singh, Bhupendra Kumar

2014 International Conference on Computer Communication and Informatics > 1 - 4

2014 International Conference on Computer Communication and Informatics (ICCCI)

Document layout analysis is a necessary process for automated document recognition systems. Document layout analysis identifies, categorizes and labels the semantics of text blocks for meaningful information retrieval from document images. Our primary target documents include various newspaper and magazine pages that have complex layouts and do not follow any static rules. We propose an effective...

chapter

The Significance of Reading Order in Document Recognition and Its Evaluation

C. Clausner, S. Pletschacher, A. Antonacopoulos

2013 12th International Conference on Document Analysis and Recognition > 688 - 692

2013 12th International Conference on Document Analysis and Recognition (ICDAR)

Reading order detection and representation is an important task in many digitisation scenarios involving the preservation of the logical structure of a document. The corresponding need for the evaluation of reading order results generated by layout analysis methods poses a particular challenge due to potential deviations between ground truth and actually detected segmentation of the page. To this...

chapter

Exploiting Stroke Orientation for CRF Based Binarization of Historical Documents

Xujun Peng, Huaigu Cao, Krishna Subramanian, Rohit Prasad, et al.

2013 12th International Conference on Document Analysis and Recognition > 1034 - 1038

2013 12th International Conference on Document Analysis and Recognition (ICDAR)

We present a novel binarization method that is especially effective on historical documents with the following characteristics: (a) the documents contain free-form cursive handwritten text with significant but consistent slant, (b) scanning artifacts resulting in the text and background pixels not having uniform intensity even within the same page, and (c) pages containing significant amount of bleeds...

Keywords:
OPTICAL CHARACTER RECOGNITION SOFTWARE

Content availability

Available (156)
None (3)

Publication type

book (154)
article (5)

Keywords

OPTICAL CHARACTER RECOGNITION (70)
FEATURE EXTRACTION (54)
IMAGE SEGMENTATION (50)
CHARACTER RECOGNITION (47)
DOCUMENT IMAGE PROCESSING (47)
PIXEL (47)
DATA MINING (35)
TEXT RECOGNITION (27)
OCR (26)
LAYOUT (23)
ACCURACY (21)
IMAGE COLOR ANALYSIS (18)
IMAGE RECOGNITION (18)
NOISE (18)
IMAGE EDGE DETECTION (17)
VIDEO SIGNAL PROCESSING (16)
NATURAL LANGUAGE PROCESSING (14)
SHAPE (14)
ALGORITHM DESIGN AND ANALYSIS (13)
CLASSIFICATION ALGORITHMS (12)
DATABASES (12)
HANDWRITTEN CHARACTER RECOGNITION (11)
PATTERN RECOGNITION (11)
TRAINING (11)
CAMERAS (10)
ENGINES (10)
HANDWRITING RECOGNITION (10)
IMAGE CLASSIFICATION (10)
TRANSFORMS (10)
HISTOGRAMS (9)
IMAGE RESOLUTION (9)
INDEXING (9)
OBJECT DETECTION (9)
PERFORMANCE EVALUATION (9)
TEXT EXTRACTION (9)
DEGRADATION (8)
DOCUMENT ANALYSIS (8)
HIDDEN MARKOV MODELS (8)
IMAGE ENHANCEMENT (8)
IMAGE PROCESSING (8)
OPTICAL IMAGING (8)
ROBUSTNESS (8)
VIDEOS (8)
WORD PROCESSING (8)
CONTEXT (7)
EDGE DETECTION (7)
IMAGE RESTORATION (7)
LEARNING (ARTIFICIAL INTELLIGENCE) (7)
MATHEMATICAL MODEL (7)
WRITING (7)
CLUSTERING ALGORITHMS (6)
COMPUTERS (6)
CONFERENCES (6)
ESTIMATION (6)
FILTERING THEORY (6)
IMAGE COLOUR ANALYSIS (6)
LAYOUT ANALYSIS (6)
PATTERN CLUSTERING (6)
BOOKS (5)
DISTANCE MEASUREMENT (5)
DOCUMENT LAYOUT ANALYSIS (5)
FILTERING (5)
HUMANS (5)
IMAGE ANALYSIS (5)
IMAGE CODING (5)
IMAGE TEXTURE (5)
LANGUAGE IDENTIFICATION (5)
NATURAL LANGUAGES (5)
OPTICAL DISTORTION (5)
SEGMENTATION (5)
SEMANTICS (5)
SIGNAL PROCESSING (5)
VIDEO RETRIEVAL (5)
VISUALIZATION (5)
ARTIFICIAL NEURAL NETWORKS (4)
BINARIZATION (4)
CHARACTER SEGMENTATION (4)
CORRELATION (4)
DICTIONARIES (4)
DOCUMENT IMAGE (4)
DOCUMENT IMAGES (4)
HISTORICAL DOCUMENTS (4)
IMAGE MATCHING (4)
IMAGE REPRESENTATION (4)
IMAGE RETRIEVAL (4)
INFORMATION HIDING (4)
INFORMATION TECHNOLOGY (4)
KNOWLEDGE BASED SYSTEMS (4)
LABELING (4)
NEURAL NETS (4)
OBJECT RECOGNITION (4)
OPTICAL CHARACTER RECOGNITION SYSTEM (4)
PIPELINES (4)
SECURITY OF DATA (4)
SKEW CORRECTION (4)
SOFTWARE (4)
STEGANOGRAPHY (4)
TESTING (4)