Transmitting compressed video signals over error-prone networks exposes the information to losses and errors. To reduce their effects, this paper presents a joint spatial-temporal estimation method that takes advantage of data correlation in both domains to better recover the lost information. The method is designed for hybrid multiple description coding, which splits video signals along the spatial and temporal dimensions. In particular, the proposed method includes a fixed and a content-adaptive approach to selecting the estimation method: the fixed approach selects it based on the description loss case, while the adaptive approach selects it according to pixel gradients. Experimental results demonstrate that the proposed estimation method improves error resilience.
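
As a rough illustration of gradient-based selection, the sketch below estimates one lost pixel from its two spatial neighbours and its two temporal neighbours, interpolating in whichever domain shows the smaller gradient. It is an assumption made for illustration only, not the selection rule defined in the paper; the function name `estimate_lost_pixel`, its parameters, and the absolute-difference gradient measure are all hypothetical.

```python
def estimate_lost_pixel(left, right, prev, nxt):
    """Estimate a lost pixel value (hypothetical sketch, not the paper's rule).

    left, right: received neighbours of the lost pixel in the same frame.
    prev, nxt:   co-located pixels in the previous and next frames.
    Returns a (domain, estimate) tuple.
    """
    # Assumed gradient measure: absolute difference between the two neighbours.
    spatial_gradient = abs(left - right)
    temporal_gradient = abs(prev - nxt)

    # A smaller gradient suggests stronger correlation in that domain,
    # so interpolate there; ties fall back to the spatial estimate.
    if spatial_gradient <= temporal_gradient:
        return "spatial", (left + right) / 2
    return "temporal", (prev + nxt) / 2


# Smooth region with motion between frames: spatial interpolation is chosen.
print(estimate_lost_pixel(100, 102, 50, 150))
# Sharp spatial edge in a static scene: temporal interpolation is chosen.
print(estimate_lost_pixel(10, 200, 120, 122))
```

A fixed approach would instead choose the domain from which descriptions were received, without inspecting pixel values.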