This paper presents a new rotator architecture for FFT computation. The proposed architecture consists of cascaded multiplier-less cells, and each cell stage performs a partial twiddle factor multiplication using low-complexity adders and multiplexers. For further area reduction, each cell is optimized by common subexpression sharing. Because the twiddle factors involved in the computation are realized with multipliers generated on the fly by a coefficient selection scheme, the proposed architecture does not require memory to store any twiddle factors. Variable FFT lengths ranging from 64 to 32768 points can be supported by adding or removing cell stages, depending on the FFT length. Compared to CORDIC-based architectures, the proposed architecture has lower latency. The implementation results show that the proposed architecture is area-efficient and is suitable for either pipelined or memory-based FFT architectures.
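To make the idea of a multiplier-less rotation concrete, the following minimal Python sketch rotates a fixed-point complex sample by multiples of the 8-point twiddle factor using only shifts, additions, and a selection step. It is an illustration under stated assumptions, not the cell design proposed in this paper: the function names (`mul_181`, `rotate_pi_over_4`, `rotate_w8`), the 8-bit coefficient precision (cos(π/4) ≈ 181/256), and the restriction to eight angles are all hypothetical choices made for the example.

```python
def mul_181(x):
    """Return 181 * x using shifts and adds only (181/256 approximates cos(pi/4)).

    181 = 0b10110101. The bit pattern 101 (= 5) occurs twice, so 5*x is
    computed once and reused: 181*x = (5*x << 5) + (x << 4) + 5*x.
    This needs three additions instead of four.
    """
    t = (x << 2) + x                 # shared subexpression: 5 * x
    return (t << 5) + (x << 4) + t   # 160*x + 16*x + 5*x


def rotate_pi_over_4(re, im):
    """Multiply (re + j*im) by W_8^1 = (1 - j)/sqrt(2) in fixed point.

    (re + j*im) * c * (1 - j) = c*(re + im) + j*c*(im - re), with c ~ 181/256.
    The final right shift by 8 removes the coefficient scaling.
    """
    return mul_181(re + im) >> 8, mul_181(im - re) >> 8


def rotate_w8(re, im, k):
    """Multiply (re + j*im) by W_8^k for integer k, without a multiplier.

    k is split as 2*q + r. The q quarter turns are trivial (swap and negate),
    and the bit r selects whether the shift-and-add pi/4 stage is applied,
    which plays the role of a multiplexer in hardware.
    """
    k %= 8
    for _ in range(k >> 1):          # multiply by -j: (re, im) -> (im, -re)
        re, im = im, -re
    if k & 1:                        # select the pi/4 stage
        re, im = rotate_pi_over_4(re, im)
    return re, im


if __name__ == "__main__":
    for k in range(8):
        print(k, rotate_w8(256, 0, k))
```

The sketch shows the two ingredients named above in their simplest form: a constant multiplication reduced to adders through a shared subexpression, and a selection signal that decides which partial rotation is applied. A full rotator would cascade several such stages with progressively finer angles.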