The distributed video coding (DVC) paradigm is based on two well-known information theory results: the Slepian-Wolf and Wyner-Ziv theorems. In a DVC codec, the video signal correlation is mostly exploited at the decoder, which allows a flexible distribution of the computational complexity between the encoder and the decoder and provides robustness to channel errors. To exploit the temporal correlation, an estimate of the original frame to be coded, known as side information, is typically created at the decoder. One popular approach to side information creation is to perform frame interpolation using a translational motion model derived from already decoded frames. However, this translational model fails to capture complex camera motions, such as zooms and rotations, and is not accurate enough to estimate the true trajectories of scene objects. In this paper, a new side information creation framework integrating perspective transform motion modeling is proposed. This solution tracks the local trajectories and deformations of each object more accurately and thereby increases the accuracy of the overall side information estimation process. Experimental results show peak signal-to-noise ratio (PSNR) gains of up to 1 dB in side information quality and up to 0.5 dB in rate-distortion performance for some video sequences compared with state-of-the-art alternative solutions.
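To make the contrast between the two motion models concrete, the following minimal sketch compares how a translational model and a perspective (3x3 homography) model move the four corners of an 8x8 block. It is an illustration under stated assumptions, not the framework proposed in the paper: the helper names (`apply_perspective`, `translation`, `zoom_about`, `corner_displacements`), the block size, and the parameter values are all hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix3 = Sequence[Sequence[float]]


def apply_perspective(h: Matrix3, p: Point) -> Point:
    """Map point p through the 3x3 perspective matrix h (homogeneous divide)."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this transform")
    xp = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    yp = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (xp, yp)


def translation(dx: float, dy: float) -> List[List[float]]:
    """Translational motion written as a special case of a perspective matrix."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


def zoom_about(cx: float, cy: float, s: float) -> List[List[float]]:
    """Zoom by factor s about the centre (cx, cy), also a perspective matrix."""
    return [[s, 0.0, cx * (1.0 - s)], [0.0, s, cy * (1.0 - s)], [0.0, 0.0, 1.0]]


def corner_displacements(h: Matrix3, corners: Sequence[Point]) -> List[Point]:
    """Displacement vector of each corner under the transform h."""
    out = []
    for p in corners:
        q = apply_perspective(h, p)
        out.append((q[0] - p[0], q[1] - p[1]))
    return out


# Hypothetical 8x8 block, corners listed clockwise from the top-left.
block = [(0.0, 0.0), (8.0, 0.0), (8.0, 8.0), (0.0, 8.0)]

# Translation: every corner shares one motion vector.
shift = corner_displacements(translation(2.0, -1.0), block)

# Zoom: each corner moves in a different direction, so no single
# translational vector can describe the block's motion.
zoom = corner_displacements(zoom_about(4.0, 4.0, 1.5), block)

print("translation:", shift)
print("zoom:       ", zoom)
```

Under the translational model all four corners receive the same displacement, whereas under the zoom the corners move apart from the block centre. This is the kind of deformation that a single motion vector per block cannot represent and that a perspective model can.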