This paper presents a method for robustly stabilising omnidirectional video in the presence of significant rotations and translations by creating a virtual camera and using a combination of sensor fusion and scene tracking. Real-time rotational movements of the camera are measured by an Inertial Measurement Unit (IMU), which provides an initial estimate of the ego-motion of the camera platform; image registration is then used to refine this estimate. The refined ego-motion is used to adjust an extract of the omnidirectional video, forming a virtual camera that remains focused on the scene. Experiments show that the technique is effective under challenging ego-motions and overcomes the deficiencies associated with unimodal approaches, making it robust and well suited to many surveillance applications.
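As a rough illustration of the virtual-camera idea described above, the sketch below shows how a stabilised perspective view could be extracted from an equirectangular omnidirectional frame given an ego-motion rotation estimate (e.g. an IMU prior refined by image registration). This is a minimal sketch under assumed conventions, not the authors' implementation; all names, the equirectangular input format, and parameters such as the field of view are illustrative assumptions.

```python
import numpy as np

def rotation_from_euler(yaw, pitch, roll):
    """Build a rotation matrix from Z-Y-X Euler angles (radians)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    return Rz @ Ry @ Rx

def virtual_camera_view(equirect, R, fov_deg=90.0, out_w=640, out_h=480):
    """Sample a perspective view from an equirectangular frame.

    Each output pixel is cast as a ray in the virtual camera, counter-rotated
    by the estimated platform rotation R, and looked up in the panorama, so the
    extracted view stays locked on the scene while the platform rotates.
    """
    H, W = equirect.shape[:2]
    f = 0.5 * out_w / np.tan(0.5 * np.radians(fov_deg))  # focal length in pixels
    u, v = np.meshgrid(np.arange(out_w), np.arange(out_h))
    # Rays in the virtual-camera frame (x right, y down, z forward).
    rays = np.stack([u - 0.5 * out_w,
                     v - 0.5 * out_h,
                     np.full(u.shape, f)], axis=-1).astype(float)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)
    # Counter-rotate by the ego-motion estimate (row-vector form applies R^T = R^{-1}).
    rays = rays @ R
    lon = np.arctan2(rays[..., 0], rays[..., 2])           # [-pi, pi]
    lat = np.arcsin(np.clip(rays[..., 1], -1.0, 1.0))      # [-pi/2, pi/2]
    px = ((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
    py = np.clip(((lat / np.pi + 0.5) * H).astype(int), 0, H - 1)
    return equirect[py, px]                                # nearest-neighbour lookup

# Example: stabilise one frame given a fused rotation estimate (placeholder data).
frame = np.zeros((1024, 2048, 3), dtype=np.uint8)          # stand-in panorama
R_est = rotation_from_euler(np.radians(10), np.radians(-5), np.radians(2))
stabilised = virtual_camera_view(frame, R_est)
```

In a full system the rotation fed to this step would come from fusing the IMU measurement with an image-registration refinement per frame; nearest-neighbour sampling is used here only to keep the sketch short.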