Policy iteration, as an adaptive/approximate dynamic programming approach to optimal control, is investigated. The context is optimal control of discrete-time nonlinear dynamics with undiscounted cost functions. Convergence of the learning iterations and uniqueness of the solution to the corresponding Bellman equation are established, leading to the optimality of the limit function to which the learning converges. Moreover, motivated by the empirically faster convergence of learning under policy iteration compared with value iteration-based algorithms, theoretical results are developed proving that, starting from the same initial guess, policy iteration does not converge more slowly than value iteration. Finally, numerical analyses are presented to demonstrate the theoretical results in practice.
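
The comparison between the two schemes can be illustrated on the linear-quadratic special case, where both Bellman recursions reduce to matrix updates. The sketch below is an assumption-laden illustration, not the paper's algorithm: the dynamics x_{k+1} = A x_k + B u_k, the cost matrices Q and R, and the choice of the zero policy as the common (admissible) initial guess are all placeholders chosen for demonstration.

```python
# Minimal sketch: policy iteration (PI) vs. value iteration (VI) on a
# discrete-time linear-quadratic problem with undiscounted cost
# sum_k (x_k^T Q x_k + u_k^T R u_k).  For quadratic value functions
# V(x) = x^T P x, both recursions become updates of the matrix P.
# All matrices below are illustrative assumptions, not from the paper.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov, solve_discrete_are

A = np.array([[0.9, 0.2],
              [0.0, 0.8]])        # Schur stable, so u = 0 is admissible
B = np.array([[0.0],
              [1.0]])
Q, R = np.eye(2), np.array([[1.0]])

def greedy_gain(P):
    """Policy improvement: minimizing feedback gain for value matrix P."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def vi_step(P):
    """Value iteration: one Bellman backup of the value matrix."""
    K = greedy_gain(P)
    Ac = A - B @ K
    return Q + K.T @ R @ K + Ac.T @ P @ Ac

def pi_step(P):
    """Policy iteration: improve the policy, then evaluate it exactly by
    solving the discrete Lyapunov equation for the closed-loop system."""
    K = greedy_gain(P)
    Ac = A - B @ K
    return solve_discrete_lyapunov(Ac.T, Q + K.T @ R @ K)

# Common initial guess: the exact cost of the admissible policy u = 0.
P0 = solve_discrete_lyapunov(A.T, Q)
P_star = solve_discrete_are(A, B, Q, R)   # optimal value matrix, for reference

P_vi, P_pi = P0.copy(), P0.copy()
for k in range(1, 8):
    P_vi, P_pi = vi_step(P_vi), pi_step(P_pi)
    print(k, np.linalg.norm(P_vi - P_star), np.linalg.norm(P_pi - P_star))
# The PI error column shrinks at least as fast as the VI column, consistent
# with PI converging no more slowly than VI from the same initial guess.
```

In this setting the policy-evaluation step of PI solves for the closed-loop cost exactly (a Lyapunov equation), whereas VI performs only a single Bellman backup per iteration, which is the mechanism behind the relative convergence speeds the abstract refers to.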