Scene text recognition with CNN classifier and WFST-based word labeling

Xinhao Liu; Takahito Kawanishi; Xiaomeng Wu; Kunio Kashino

doi:10.1109/ICPR.2016.7900259

Source

2016 23rd International Conference on Pattern Recognition (ICPR), pp. 3999-4004

Abstract

Natural scene text recognition remains challenging because of the unconstrained conditions found in the wild. In this paper, we propose a method that first detects and recognizes characters with a high-performance Convolutional Neural Network (CNN). Then, for post-processing, we employ an efficient and flexible Weighted Finite-State Transducer (WFST) based word labeling model, inspired by its success in speech recognition, to incorporate a lexicon or a high-order language model. Experiments show that the proposed approach recognizes text in scene images correctly and robustly, and the results on several public datasets (ICDAR 2003, SVT and IIIT5K) show comparable or superior performance to state-of-the-art algorithms.
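
The idea of combining per-character classifier scores with a lexicon can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the characters are already segmented, that a classifier (here hard-coded, hypothetical numbers) yields one posterior distribution per character position, and that the lexicon is a plain word list. The lexicon is stored as a trie, which plays the role of a deterministic acceptor, and the decoder finds the lowest-cost path (sum of negative log-probabilities) that ends on a complete word. The names `build_trie` and `decode` are made up for illustration.

```python
import math


def build_trie(lexicon):
    """Build a trie over the lexicon; a node marks a full word via 'word'."""
    root = {"next": {}, "word": None}
    for word in lexicon:
        node = root
        for ch in word:
            node = node["next"].setdefault(ch, {"next": {}, "word": None})
        node["word"] = word
    return root


def decode(char_probs, lexicon):
    """Return (best_word, cost) for the lexicon word with the lowest total
    negative log-probability, or (None, inf) if no word fits.

    char_probs: list of dicts, one per character position, mapping a
    character to its (assumed) classifier posterior.
    """
    frontier = [(0.0, build_trie(lexicon))]  # (accumulated cost, trie node)
    for probs in char_probs:
        advanced = []
        for cost, node in frontier:
            for ch, child in node["next"].items():
                p = probs.get(ch, 0.0)
                if p > 0.0:  # skip characters the classifier rules out
                    advanced.append((cost - math.log(p), child))
        frontier = advanced
    finals = [(cost, node["word"]) for cost, node in frontier
              if node["word"] is not None]
    if not finals:
        return None, math.inf
    cost, word = min(finals)
    return word, cost


# Hypothetical posteriors for a three-character word image.
char_probs = [
    {"c": 0.4, "e": 0.6},
    {"a": 0.9, "o": 0.1},
    {"t": 0.7, "r": 0.3},
]
best_word, best_cost = decode(char_probs, ["cat", "car", "ear"])
print(best_word, round(math.exp(-best_cost), 3))
```

In this toy input, picking the most likely character at each position independently would spell "eat", which is not in the lexicon; the constrained search selects "cat" instead. A full WFST formulation generalizes this by composing weighted automata, which also allows a higher-order language model in place of a fixed word list.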