The alternating direction method of multipliers (ADMM) has been recognized as an efficient approach for solving many large-scale learning problems over a computer cluster. However, traditional synchronized computation does not scale well with the problem size, as the speed of the algorithm is limited by the slowest workers. In this paper, we propose an asynchronous distributed ADMM (AD-ADMM) that can effectively improve the time efficiency of distributed optimization. Our main interest lies in characterizing the convergence conditions of the AD-ADMM under the popular partially asynchronous model, which is defined based on a maximum tolerable delay in the network. Specifically, by considering general and possibly non-convex cost functions, we show that the AD-ADMM converges to the set of Karush-Kuhn-Tucker (KKT) points as long as the algorithm parameters are chosen appropriately according to the network delay. We also show that the asynchrony of ADMM has to be handled with care, as a slightly different implementation can significantly jeopardize the convergence of the algorithm.