In multilingual communities, non-native speakers may produce speech sounds that either belong to their native language or arise from merging characteristics of native and non-native pronunciation. Recently, a Two-pass phone clustering method based on Confusion Matrix (TCM) was proposed to address one-to-one phone mappings between Chinese syllables and English phones using standard Chinese and English data. In this paper, we extend TCM to the one-to-many phone mapping problem, since native and non-native pronunciations merge in bilingual speech. By supplying TCM with a knowledge-based phone set as a supplement for phone clustering, we propose a novel method termed TCM with Initialization and Updating of the Phone Set (TCM-IUPS). The pronunciation dictionary is then built from the information learned by TCM-IUPS together with canonical pronunciations. Experiments show that, compared with TCM, TCM-IUPS reduces the Phrase Error Rate (PhrER) by 5.27% on the bilingual test corpora and by 26.09% on the mono-English test corpora, while maintaining the same performance on the mono-Mandarin test corpora.