Content providers of P2P-Video-on-Demand (P2P-VoD) services aim to offer users a high-quality, scalable service while keeping operating costs manageable. Given the volume-based charging model used by ISPs, it is in the best interest of P2P-VoD content providers to reduce peers' access to the content server and thereby lower the operating cost. In this paper, we address an important open problem: what is the “optimal replication ratio” in a P2P-VoD system, such that peers receive service from each other and, at the same time, traffic to the content server is reduced? We consider two fundamental questions: (1) what is the optimal replication ratio of a movie given its popularity, and (2) how can the optimal ratios be achieved in a distributed and dynamic fashion? We formally show how movie popularity affects the server's workload, and we formulate video replication as an optimization problem. We show that the conventional wisdom of using the proportional replication strategy is suboptimal, and we expand the design space to include both a passive replacement policy and an active push policy to achieve the optimal replication ratios. We consider practical implementation issues, evaluate the performance of P2P-VoD systems, and show that our algorithms can greatly reduce the server's workload and improve streaming quality.