Phase Change Memory (PCM) has emerged as a promising candidate for future memories. PCM has high cell density, zero cell leakage, and high stability in deep sub-micron technologies. Although PCM has limited endurance, recent work has shown that its lifetime can be improved by orders of magnitude. However, a major hurdle for PCM is its long write latency and high write power. For this reason, PCM cannot deliver satisfactory memory bandwidth for high-end computing environments such as multi-processing and server systems. In this paper, we develop a non-blocking PCM bank design in which subsequent reads or writes can be carried out in parallel with an on-going write. This is effective in removing the long blocking time caused by serialized operations. Moreover, we propose novel memory request scheduling algorithms that exploit the intra-bank parallelism enabled by our non-blocking hardware. Our non-blocking hardware with the scheduling enhancement improves PCM memory throughput by 51% on average. Finally, we propose a fine-grained power budgeting scheme to achieve further throughput improvement under power budgets. Experiments show that our scheduler, enhanced with the power budgeting scheme, achieves a throughput improvement of 118% on average.
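
To illustrate why serial operation is costly when writes are slow, the following is a minimal toy model, not the bank design or scheduler proposed in this paper. It assumes a bank divided into hypothetical independent partitions, arbitrary latencies (1 time unit per read, 8 per write), in-order issue of one command per time unit, and no power constraint. The names `blocking_finish_time` and `non_blocking_finish_time` are introduced here for illustration only.

```python
# Toy timing model (illustrative assumptions only, not the paper's design).
READ_LATENCY = 1   # assumed, arbitrary time units
WRITE_LATENCY = 8  # assumed; PCM writes are much slower than reads


def latency(op):
    """Return the assumed service time of a read ("R") or write ("W")."""
    return WRITE_LATENCY if op == "W" else READ_LATENCY


def blocking_finish_time(requests):
    """Blocking bank: every request waits for the previous one to finish."""
    t = 0
    for op, _partition in requests:
        t += latency(op)
    return t


def non_blocking_finish_time(requests):
    """Non-blocking bank: requests to different (hypothetical) partitions
    may overlap; requests to the same partition are still serialized.
    Commands are issued in order, at most one per time unit."""
    busy_until = {}  # partition id -> time at which it becomes free
    next_issue = 0   # earliest time the next command may be issued
    finish = 0
    for op, partition in requests:
        start = max(next_issue, busy_until.get(partition, 0))
        end = start + latency(op)
        busy_until[partition] = end
        finish = max(finish, end)
        next_issue = start + 1
    return finish


if __name__ == "__main__":
    # (operation, partition) pairs; the trace is made up for illustration.
    trace = [("W", 0), ("R", 1), ("R", 2), ("W", 1)]
    print("blocking:    ", blocking_finish_time(trace))
    print("non-blocking:", non_blocking_finish_time(trace))
```

Under these assumptions, the four-request trace finishes at time 18 on the blocking bank and at time 11 on the non-blocking one, because the two reads and the second write proceed while the first write is still in progress. The model ignores power limits and request reordering, both of which the paper addresses separately.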