High external memory bandwidth is a major challenge for efficient hardware implementation of motion estimation (ME). A large double-buffered on-chip search window (SW) buffer is usually employed to increase throughput. In this paper, we focus on optimizing the SW buffer structure from a systematic viewpoint. An efficient buffer sharing mechanism is proposed to minimize memory consumption while also reducing the external memory access bandwidth. Moreover, variable block size ME (VBSME) and the large SW required by high definition (HD) video encoders are both supported, with a good trade-off among throughput, data regularity, and rate-distortion performance. The simplified algorithm saves nearly 50% of the SW buffer with less than 0.15 dB PSNR degradation in the worst case compared with full-search VBSME.