This paper presents a low-power implementation of the A8051 processor. It employs an adaptive pipeline structure that allows to skip a redundant stage operation and to combine with the neighboring empty stage. The processor has three features to reduce the power dissipation as well as to improve performance: multilooping control for multicycle instructions, branch predictor for unconditional branches, and a single threading in the EXE stage. The experimental results show that A8051 runs about 1.8 times faster than the synchronous counterpart, CIP51 [reported in the HC8051F0xx Family Datasheet (2002)]. In terms of Et2 , our implementation shows 15 times higher efficiency than that of asynchronous counterpart developed by the Nanyang University [Chang and Gwee (2006)].