In this paper, we propose a novel formulation for building accurate pixel-wise alignments between remote sensing images under non-rigid distortions. Our formulation involves two variables. The first is a discrete displacement flow field, similar to optical flow, which controls the pixel-wise correspondence and allows piecewise smoothness. The second is a continuous spatial transformation, which is fitted to a small set of confident sparse feature correspondences. An additional term enforces coherence between the two variables, so that the continuous spatial transformation serves as an anchor when optimizing the discrete displacement flow field. Experiments on real remote sensing images demonstrate that our approach greatly outperforms state-of-the-art methods.
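
To make the role of the coherence term concrete, the following is a minimal sketch of how a penalty coupling a per-pixel flow field to a global transformation might be evaluated. It is an illustration under stated assumptions, not the formulation used in the paper: the function name `coherence_energy`, the squared-distance penalty, and the use of an affine map as a stand-in for the continuous spatial transformation are all hypothetical choices made for simplicity.

```python
def coherence_energy(flow, affine):
    """Sum of squared distances between where the flow field and the
    global transformation send each pixel.

    flow   -- nested list, flow[y][x] = (u, v) displacement at pixel (x, y)
    affine -- ((a, b, tx), (c, d, ty)); an affine map is an ASSUMPTION here,
              standing in for a generic continuous (possibly non-rigid) transformation
    """
    (a, b, tx), (c, d, ty) = affine
    total = 0.0
    for y, row in enumerate(flow):
        for x, (u, v) in enumerate(row):
            # Position predicted by the discrete displacement flow field.
            fx, fy = x + u, y + v
            # Position predicted by the continuous transformation.
            gx = a * x + b * y + tx
            gy = c * x + d * y + ty
            total += (fx - gx) ** 2 + (fy - gy) ** 2
    return total


# Hypothetical example: a pure translation by one pixel along x.
shift_right = ((1.0, 0.0, 1.0), (0.0, 1.0, 0.0))
agreeing_flow = [[(1.0, 0.0)] * 2 for _ in range(2)]  # flow matches the transform
zero_flow = [[(0.0, 0.0)] * 2 for _ in range(2)]      # flow ignores the transform

energy_agree = coherence_energy(agreeing_flow, shift_right)
energy_zero = coherence_energy(zero_flow, shift_right)
```

In this sketch the penalty vanishes when the flow field agrees with the transformation and grows as the two diverge, which is the sense in which the transformation can act as an anchor for the flow field during optimization.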