The Multi-Scale Retinex (MSR) image enhancement algorithm generally produces high-quality results, but its computational load is very high, especially for large images. In this paper, an efficient approach is proposed to accelerate MSR image enhancement on the GPU using CUDA (Compute Unified Device Architecture). The time-consuming modules, namely multi-scale Gaussian filtering, logarithmic-domain differencing, and dynamic range compression, are analyzed and implemented on the GPU. Experimental results show that the proposed method reduces execution time significantly, achieving a maximum speedup of more than 100x.
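To make the three stages named above concrete, the following is a minimal CPU reference sketch of single-channel MSR in plain Python. It is not the CUDA implementation described in this paper. The scale values in `sigmas`, the equal per-scale weights, the `+ 1.0` offset inside the logarithm, the clamped borders, and the linear min-max stretch used for dynamic range compression are all illustrative assumptions.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel, truncated at 3*sigma (assumed cutoff)."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(img[0])
    return [
        [sum(k * row[min(max(x + j - radius, 0), width - 1)]
             for j, k in enumerate(kernel))
         for x in range(width)]
        for row in img
    ]

def transpose(img):
    return [list(col) for col in zip(*img)]

def gaussian_blur(img, sigma):
    """Separable Gaussian filter: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))

def msr(img, sigmas=(1.0, 2.0, 4.0)):
    """Multi-Scale Retinex on a single-channel image (list of rows of floats)."""
    height, width = len(img), len(img[0])
    weight = 1.0 / len(sigmas)  # equal weights per scale (assumption)
    retinex = [[0.0] * width for _ in range(height)]
    for sigma in sigmas:
        # Stage 1: Gaussian filtering at this scale.
        blurred = gaussian_blur(img, sigma)
        # Stage 2: logarithmic-domain differencing, accumulated over scales.
        for y in range(height):
            for x in range(width):
                retinex[y][x] += weight * (
                    math.log(img[y][x] + 1.0) - math.log(blurred[y][x] + 1.0)
                )
    # Stage 3: dynamic range compression, here a linear stretch to [0, 255].
    lo = min(min(row) for row in retinex)
    hi = max(max(row) for row in retinex)
    span = hi - lo
    if span < 1e-12:  # flat response: nothing to stretch
        return [[0] * width for _ in range(height)]
    return [[int(round(255.0 * (v - lo) / span)) for v in row] for row in retinex]
```

In this sketch, the filtering and differencing stages compute each output pixel from a fixed local neighborhood of the input, independently of every other output pixel. That per-pixel independence is what makes such stages natural candidates for a one-thread-per-pixel GPU mapping. The min-max search in the last stage is a global reduction and would need a different parallel pattern.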