Spatio-temporal data is characterized by large volume and high redundancy, and therefore requires substantial storage space and computing power for analysis. In this paper, inspired by the sparse, multi-scale, and hierarchical characteristics of visual perception in the Human Vision System (HVS), we propose a new approach to spatio-temporal data redundancy reduction based on Hierarchical Representation Learning (HRL). In our method, the most informative and representative data are identified in a cascaded manner via a hierarchical and sparse self-representation model. A parallelized implementation of the proposed scheme is also discussed. The proposed method is evaluated on large-volume spatio-temporal data, and the experimental results demonstrate its efficiency and its advantage over several state-of-the-art methods.