Search results

Items from 1 to 20 out of 101 results

chapter

VML-HD: The historical Arabic documents dataset for recognition systems

Majeed Kassis, Alaa Abdalhaleem, Ahmad Droby, Reem Alaasam, et al.

2017 1st International Workshop on Arabic Script Analysis and Recognition (ASAR) > 11 - 14

In this paper we present a new database with handwritten Arabic script. It is based on five books written by different writers from the years 1088–1451. We took 680 pages from these five books, and fully annotated them on the sub-word level. For each page we manually applied bounding boxes on the different sub-words and annotated the sequence of characters. It consists of 121,636 sub-word appearances...

article

Age Groups Classification in Social Network Using Deep Learning

Rita Georgina Guimaraes, Renata L. Rosa, Denise De Gaetano, Demostenes Z. Rodriguez, et al.

IEEE Access > 2017 > 5 > 10805 - 10816

Social networks have a large amount of data available, but often, people do not provide some of their personal data, such as age, gender, and other demographics. Although the sentiment analysis uses such data to develop useful applications in people’s daily lives, there are still failures in this type of analysis, either by the restricted number of words contained in the word dictionaries or because...

chapter

Feature extraction from handwritten documents for personality analysis

Salankara Mukherjee, Ishita De

2016 International Conference on Computer, Electrical & Communication Engineering (ICCECE) > 1 - 8

Handwriting can be used to predict or analyze a person's behavioral or personality traits by studying its characteristics. In this work, various characteristics such as size, spacing, slant, skew, and pressure are studied. Since a signature reflects important characteristics of a person, we analyze signatures as well. As our objective is to build an automated or computerized handwriting...

chapter

Segmentation of highly unstructured handwritten documents using a neural network technique

Rathin Radhakrishnan Nair, Bharagava Urala Kota, Ifeoma Nwogu, Venu Govindaraju

2016 23rd International Conference on Pattern Recognition (ICPR) > 1291 - 1296

In recent years there has been a growing interest in digitizing the extensive amounts of books and documents that existed preceding the widespread adoption of digital technologies. Many of these digitizing initiatives deal with huge collections of handwritten documents, for which document image analysis techniques (page segmentation, keyword-spotting, optical character recognition (OCR), etc) are...

chapter

A robust text line detection in complex handwritten documents

Jakub Leszek Pach, Piotr Bilski

2015 IEEE 8th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS) > 1 > 271 - 275

In this paper, we present a modified method for detecting text lines in handwritten documents based on the Block-Based Hough Transform. The algorithm has a practical application in manuscript author identification. The proposed technique consists of three steps: preprocessing, detecting potential text lines, and eliminating the false ones. The first step covers the following operations: image...

chapter

Framework for human identification through offline handwritten documents

Shehzad Khalid, Uzma Naqvi, Imran Siddiqi

2015 International Conference on Computer, Communications, and Control Technology (I4CT) > 54 - 58

Identification of individuals from handwritten documents using automated recognition systems has gained significant research interest due to the wide variety of applications it offers for forensic analysis, signature verification, classification of historical writings and other document analysis tasks. In this paper, we present a framework that combines different feature space representations of handwriting...

chapter

Interactive Visual Text Analysis for Corpus-Based Language Learning

Ying Zhu, Eric Friginal

2015 IEEE First International Conference on Big Data Computing Service and Applications (BigDataService) > 462 - 467

A corpus is a large collection of texts that can be automatically analyzed for linguistic patterns and structures using interactive tools. Corpus-based language learning has gained prominence in recent years thanks to the advances in computing technologies, such as text mining, searching, and natural language processing. The size and variety of corpora have also grown significantly in recent years...

chapter

A Database of On-Line Handwritten Mixed Objects Named "Kondate"

Tomohisa Matsushita, Masaki Nakagawa

2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR) > 369 - 374

This paper describes a database of on-line handwritten patterns that mix text, figures, tables, maps, diagrams and so on. Pen-based and touch-based interfaces are becoming widespread and their surfaces are getting larger. People can write and draw mixed objects without paying attention to the differences between objects or to mode changes. Moreover, they may write text in any direction in combination...

chapter

LAMIS-MSHD: A Multi-script Offline Handwriting Database

Chawki Djeddi, Abdeljalil Gattal, Labiba Souici-Meslati, Imran Siddiqi, et al.

2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR) > 93 - 97

This paper introduces a new offline handwriting database that was developed to be employed in performance evaluation, result comparison and development of new methods related to handwriting analysis and recognition. The database can particularly be used for signature verification, writer recognition and writer demographics classification. In addition, the database also supports isolated digit recognition,...

chapter

Data Sufficiency for Online Writer Identification: A Comparative Study of Writer-Style Space vs. Feature Space Models

Arti Shivram, Chetan Ramaiah, Venu Govindaraju

2014 22nd International Conference on Pattern Recognition (ICPR) > 3121 - 3125

A key factor in building effective writer identification/verification systems is the amount of data required to build the underlying models. In this research we systematically examine data sufficiency bounds for two broad approaches to online writer identification -- feature space models vs. writer-style space models. We report results from 40 experiments conducted on two publicly available datasets...

chapter

Javanese character image segmentation of document image of Hamong Tani

Agustinus Rudatyo Himamunanto, Anastasia Rita Widiarti

2013 Digital Heritage International Congress (DigitalHeritage) > 1 > 641 - 644

Script image segmentation of a document image is the most decisive step to the success of the process of transliteration of the script image into another script, such as automatically transliterating a printed Javanese manuscript image into a Latin manuscript. This paper gives an example of the application of profile projection modification to the segmentation of Javanese script document image of...

chapter

A Coarse to Fine Skew Estimation Technique for Handwritten Words

A. Papandreou, B. Gatos

2013 12th International Conference on Document Analysis and Recognition (ICDAR) > 225 - 229

The estimation and correction of handwritten word skew is a difficult and challenging task since it has to be independent of the variations due to handwriting style and writing conditions. In this paper, a coarse-to-fine technique that integrates core-region information is presented. At first, a rough estimation and correction of the skew is accomplished by cutting vertically the word in two overlapping...

chapter

Codebook for Writer Characterization: A Vocabulary of Patterns or a Mere Representation Space?

Chawki Djeddi, Imran Siddiqi, Labiba Souici-Meslati, Abdellatif Ennaji

2013 12th International Conference on Document Analysis and Recognition (ICDAR) > 423 - 427

Codebook-based representations have been effectively employed for writer identification. Most of the codebook-based methods generate a codebook by clustering a set of patterns extracted from an independent data set. The probability of occurrence of the codebook patterns in a given writing is then used to characterize its author. This study investigates the hypothesis that the codebook is merely a...

chapter

Discriminating Features for Writer Identification

Zachary A. Daniels, Henry S. Baird

2013 12th International Conference on Document Analysis and Recognition (ICDAR) > 1385 - 1389

This paper investigates highly discriminating features for writer identification for off-line handwritten text lines and passages. Five categories of features are tested: slant and slant energy, skew, pixel distribution, curvature, and entropy. Four experiments are run utilizing the IAM Handwriting Database and the ICDAR 2011 Writer Identification Contest dataset: the first, on 10 writers from the...

chapter

Using Harris Corners for the Retrieval of Graphs in Historical Manuscripts

Rainer Herzog, Arved Solth, Bernd Neumann

2013 12th International Conference on Document Analysis and Recognition (ICDAR) > 1295 - 1299

In recent years, several methods have been proposed for content-based retrieval from manuscripts, mostly based on character or word similarity. In this paper, we present a new segmentation-free method, called Harris Corner Matching (HCM), which accepts an arbitrary writing pattern as a model and allows similar patterns to be retrieved from a possibly large database. Retrieval is performed in two steps...

chapter

A Comprehensive Representation Model for Handwriting Dedicated to Word Spotting

Peng Wang, Veronique Eglin, Christophe Garcia, Christine Largeron, et al.

2013 12th International Conference on Document Analysis and Recognition (ICDAR) > 450 - 454

In this paper, we propose an original representation model for handwriting document images. Most state-of-the-art handwriting representation models only use separately textural properties, selective dominant features (such as stroke orientation or gradient orientation) or structural properties. To avoid the drawbacks of using the properties from a single aspect, we design a comprehensive model that...

chapter

Generalized Eigen Cooccurrence: Application to Palaeography

Ikram Moalla, Frank Lebourgeois, Adel Alimi

2013 12th International Conference on Document Analysis and Recognition (ICDAR) > 555 - 559

This paper introduces the Generalized Eigen Cooccurrence Matrix (GECM) as a new feature to describe complex structures like images of handwritings for palaeographic expertise. It measures the spatial dependency between two features in the image. It generalizes the popular grey level cooccurrence Dependencies (SGLD) which uses the luminance for the two features. 2nd order statistics generate high dimensional...

chapter

IBM_UB_1: A Dual Mode Unconstrained English Handwriting Dataset

Arti Shivram, Chetan Ramaiah, Srirangaraj Setlur, Venu Govindaraju

2013 12th International Conference on Document Analysis and Recognition (ICDAR) > 13 - 17

In this paper we present a new dual mode, twin-folio structured English handwriting dataset IBM_UB_1. IBM_UB_1 is our first major release from a large multilingual handwriting corpus. Containing over 6000 pages of handwritten matter, this dataset can not only be used for unconstrained handwriting recognition, more importantly, the dataset's unique twin-folio structure presents a natural fit for research...

chapter

Text Line Detection for Heterogeneous Documents

Markus Diem, Florian Kleber, Robert Sablatnig

2013 12th International Conference on Document Analysis and Recognition (ICDAR) > 743 - 747

Text line detection is a pre-processing step for automated document analysis such as word spotting or OCR. It is additionally used for document structure analysis or layout analysis. Considering mixed layouts, degraded documents and handwritten documents, text line detection is still challenging. We present a novel approach that targets torn documents having varying layouts and writing. The proposed...

chapter

A Technique for Skew Detection of Printed Arabic Documents

Irfan Ahmad

2013 10th International Conference Computer Graphics, Imaging and Visualization (CGIV) > 62 - 67

Document skew correction is one of the core preprocessing steps in document analysis systems. In this paper, the author proposes a new multi-step skew detection technique for printed Arabic documents. The technique exploits the unique property of the writing line of Arabic script and is based on connected component analysis and projection profiles. The proposed technique works for different types...

Data set:
ieee
Keywords:
TEXT ANALYSIS

INFONA - science communication portal
