Distance metric learning (DML) is an effective similarity learning tool that learns a distance function from examples to enhance model performance in applications such as classification, regression, and ranking. Most DML algorithms learn a Mahalanobis matrix, a positive semidefinite matrix whose size scales quadratically with the dimensionality of the input data. This incurs a huge computational cost in the learning procedure and makes existing algorithms infeasible for extremely high-dimensional data, even with low-rank approximation. In this paper, we instead exploit the power of parallel computation and propose a novel distributed distance metric learning algorithm based on a state-of-the-art DML algorithm, Information-Theoretic Metric Learning (ITML). More specifically, we utilize the property that every positive semidefinite matrix can be decomposed into a combination of rank-one, trace-one matrices, and convert the original sequential training procedure into a parallel one. In most cases, the communication cost of the proposed method is also reduced from O(d^2) to O(cd), where d is the dimensionality of the data and c is the number of constraints in DML, which can be made smaller than d by appropriate selection. More importantly, we present a rigorous theoretical analysis that upper bounds the Bregman divergence between the sequential and parallel algorithms, guaranteeing the correctness and performance of the proposed algorithm. Our experiments on datasets with O(10^5) features demonstrate competitive scalability and performance compared with the original ITML algorithm.
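
The decomposition property mentioned above can be illustrated with a minimal sketch (an assumption-level example, not the paper's implementation): the eigendecomposition of a positive semidefinite matrix M writes it as a nonnegative combination of rank-one, trace-one matrices, since each unit eigenvector v gives trace(v v^T) = ||v||^2 = 1.

```python
import numpy as np

# Build a random positive semidefinite matrix M = A A^T.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
M = A @ A.T

# Eigendecomposition: M = sum_i lambda_i * (v_i v_i^T),
# where each v_i v_i^T is rank-one and trace-one (v_i is a unit vector).
eigvals, eigvecs = np.linalg.eigh(M)
parts = [lam * np.outer(v, v) for lam, v in zip(eigvals, eigvecs.T)]

# Each component has trace one, and the weighted sum recovers M.
assert all(abs(np.trace(np.outer(v, v)) - 1.0) < 1e-12 for v in eigvecs.T)
assert np.allclose(sum(parts), M)
```

Because the rank-one components are independent of one another, they can be distributed across workers, which is the property the parallelization above relies on.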