Visual saliency detection has recently attracted considerable interest. Because most video saliency detection models are built on a spatiotemporal mechanism, we first give a brief introduction to that mechanism. After discussing the open issues, we present a novel framework for video saliency detection based on the 3D discrete shearlet transform. Instead of measuring saliency by fusing separate spatial and temporal saliency maps, the proposed model treats video as three-dimensional data. By decomposing the video with the 3D discrete shearlet transform and reconstructing it at multiple scales, the model obtains a set of feature blocks that describe the video. Within each feature block, every group of several successive feature maps is taken as a whole, and the global contrast is computed to obtain saliency maps. Fusing the saliency maps from all levels yields the saliency map for each video frame. The framework is simple, and experimental results on ten videos show that the proposed model outperforms many existing models.
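
The flow of the pipeline described above (multi-scale feature blocks, global contrast over groups of successive feature maps, fusion across levels) can be illustrated with a minimal Python sketch. It does not implement the 3D discrete shearlet transform: as a loudly stated assumption, a radial band-pass split of the 3D Fourier spectrum stands in for the multi-scale reconstruction, and the absolute deviation from the group mean stands in for the global contrast measure. The function names and the parameters `n_scales` and `group` are hypothetical and chosen only for illustration.

```python
import numpy as np


def bandpass_blocks(video, n_scales=3):
    """Stand-in for multi-scale reconstruction (NOT a shearlet transform).

    Splits the 3D spectrum of a (T, H, W) video into radial frequency
    bands and inverts each band, giving one feature block per scale.
    """
    T, H, W = video.shape
    spectrum = np.fft.fftn(video)
    # Radial frequency of every 3D FFT coefficient.
    axes = np.meshgrid(np.fft.fftfreq(T), np.fft.fftfreq(H),
                       np.fft.fftfreq(W), indexing="ij")
    radius = np.sqrt(sum(f ** 2 for f in axes))
    edges = np.linspace(0.0, radius.max(), n_scales + 1)
    blocks = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (radius > lo) & (radius <= hi)  # the DC term is excluded
        blocks.append(np.real(np.fft.ifftn(spectrum * mask)))
    return blocks


def global_contrast(block, group=4):
    """Assumed contrast measure: |value - mean| over each group of frames."""
    saliency = np.empty_like(block)
    for start in range(0, block.shape[0], group):
        chunk = block[start:start + group]  # successive maps taken as a whole
        saliency[start:start + group] = np.abs(chunk - chunk.mean())
    return saliency


def video_saliency(video, n_scales=3, group=4):
    """Fuse per-scale saliency by summation and rescale to [0, 1]."""
    video = np.asarray(video, dtype=float)
    fused = sum(global_contrast(b, group)
                for b in bandpass_blocks(video, n_scales))
    span = fused.max() - fused.min()
    if span == 0:  # no contrast anywhere: return an all-zero map
        return np.zeros_like(fused)
    return (fused - fused.min()) / span


# Synthetic example: a bright 6x6 square moving across a dark background.
video = np.zeros((8, 32, 32))
for t in range(8):
    video[t, 10:16, 2 + 2 * t: 8 + 2 * t] = 1.0

saliency = video_saliency(video)
print(saliency.shape)
print(saliency[video > 0].mean() > saliency[video == 0].mean())
```

The result has one saliency map per frame, and in this synthetic example the moving square receives higher average saliency than the background. Summation is only one possible fusion rule; the sketch makes no claim about the rule or the contrast measure used by the proposed model.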