Reconfigurable computing arrays combine flexibility with high performance for the regular, computation-intensive algorithms of multimedia processing. However, irregular, control-intensive algorithms map poorly onto such arrays and become the performance bottleneck of reconfigurable multimedia systems. In this paper, we propose the design and VLSI implementation of a novel memory-efficient macroblock prediction and boundary strength (Bs) calculation engine. The control-intensive algorithms, namely intra mode prediction, motion vector prediction, and Bs calculation, are implemented with a 4×4 block-level pipeline to achieve real-time decoding for the H.264/AVC High Profile and the Chinese AVS Jizhun Profile. Compared with existing designs, our design reduces the registers needed for neighboring block load and update by 60%. Implementation results indicate that the proposed architecture can support H.264 and AVS decoding of 1920×1088 video at 30 fps with an 86 MHz clock.
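To put the reported operating point in perspective, the following minimal sketch estimates the average cycle budget per macroblock implied by 1920×1088 at 30 fps and 86 MHz. It assumes 16×16 macroblocks (as in H.264 and AVS), a single clock domain, and no stall or memory overhead; the function name `cycle_budget` and the resulting figures are illustrative and are not taken from the paper.

```python
MB_SIZE = 16  # luma macroblock width and height in samples (H.264 and AVS)


def cycle_budget(width, height, fps, clock_hz):
    """Return (macroblocks per frame, macroblocks per second,
    average clock cycles available per macroblock).

    Assumption: the frame dimensions are multiples of 16, as 1920x1088 is.
    """
    mbs_per_frame = (width // MB_SIZE) * (height // MB_SIZE)
    mbs_per_second = mbs_per_frame * fps
    cycles_per_mb = clock_hz / mbs_per_second
    return mbs_per_frame, mbs_per_second, cycles_per_mb


mbs_per_frame, mbs_per_second, cycles_per_mb = cycle_budget(
    1920, 1088, 30, 86_000_000
)
print(f"macroblocks per frame : {mbs_per_frame}")
print(f"macroblocks per second: {mbs_per_second}")
print(f"cycles per macroblock : {cycles_per_mb:.1f}")
print(f"cycles per 4x4 luma block (16 per MB): {cycles_per_mb / 16:.1f}")
```

Under these assumptions the engine has roughly 351 cycles per macroblock on average, or about 22 cycles per 4×4 luma block, which gives a sense of how tight the 4×4 block-level pipeline has to be to sustain real-time decoding.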