We present a recognition-based digitization method for building digital library of large amount of historical archives. Because the most of archives are manually transcribed in ancient Chinese characters, their digitization present unique academic and pragmatic challenges. By integrating the layout analysis and the recognition into single probabilistic framework, our system achieved 95.1% character recognition rates on test data set, despite the obsolete characters and unique variants used in the archives. Compared with intuitive verification and correction interface, the system freed the operators from repetitive typing tasks and improved the overall throughput significantly.