This paper presents a new method of human activity recognition based on transform and a non-linear SVM Decision Tree (NSVMDT). For a key binary human silhouette, transform is employed to represent low-level features. The advantages of the transform lie in its low computational complexity and geometric invariance. We use NSVMDT to train on and classify video sequences, and demonstrate its usability on many sequences. Compared with other methods, ours is superior in that the descriptor is robust to frame loss in activity recognition and offers a simple representation, low computational complexity, and template generalization. Experiments confirm the efficiency of the method.