This paper presents a motion compensation architecture for a Quad-HD H.264/AVC video decoder. To meet the high throughput requirement, reduce power consumption, and tolerate memory latency, three optimization schemes are applied in this work. Firstly, a quarter-pel interpolator based on Horizontal-Vertical Expansion and Luma-Chroma Parallelism (HVE-LCP) is proposed, which increases throughput by at least four times over previous designs. Secondly, a novel cache memory organization (4S×4) is adopted to improve on-chip memory utilization, which saves both memory area and power. Finally, a Split Task Queue (STQ) architecture improves the memory system's tolerance of latency, which reduces the overall processing time. The design requires 108.8k logic gates and 3.1 kB of on-chip memory. The proposed architecture supports real-time processing of 3840×2160@60fps at 166 MHz.
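To give a sense of the throughput requirement, the stated operating point (3840×2160@60fps at 166 MHz) can be converted into a rough clock-cycle budget per macroblock. The following is only a back-of-the-envelope sketch, not part of the proposed design: the function name `cycles_per_macroblock` is hypothetical, and it assumes 16×16 macroblocks (as in H.264/AVC) and that every clock cycle is available to motion compensation.

```python
def cycles_per_macroblock(width, height, fps, clock_hz, mb_size=16):
    """Estimate the average clock cycles available per macroblock.

    Assumption: frames are tiled with mb_size x mb_size macroblocks and
    all clock cycles are usable (no stalls or other overhead).
    """
    # Ceiling division, so partially covered macroblocks are counted.
    mbs_per_row = -(-width // mb_size)
    mb_rows = -(-height // mb_size)
    mbs_per_second = mbs_per_row * mb_rows * fps
    return clock_hz / mbs_per_second


# Operating point quoted in the abstract.
budget = cycles_per_macroblock(3840, 2160, 60, 166_000_000)
print(f"{budget:.1f} cycles per macroblock")  # prints "85.4 cycles per macroblock"
```

Under these assumptions a frame holds 240 × 135 = 32,400 macroblocks, so roughly 85 cycles are available on average for each macroblock. That figure covers reference-data fetching as well as luma and chroma interpolation, which is why interpolator throughput and memory latency tolerance both matter at this resolution.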