This paper presents a new method for detecting human actions in video that combines sparse appearance features and dense motion features in a unified random forest framework. We compute sparse appearance features to capture the main appearance changes and dense motion features to capture subtle motion changes in the video. We exploit the randomized channel selection in random trees to combine these two complementary types of features. In addition, linear classification is used to grow each tree efficiently. Each leaf in these trees stores the class distribution and location information of its training samples, and action detection in a test video is accomplished by Hough voting from the leaves of each tree. Experimental results demonstrate that our method achieves state-of-the-art performance on two datasets.
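As a rough illustration of the two mechanisms named above, the following minimal sketch shows a node split that picks one feature channel at random and separates samples with a linear classifier, followed by Hough voting from leaves into a spatio-temporal accumulator. It is not the paper's implementation: the function names `split_node` and `hough_vote`, the channel names, the least-squares classifier, and the leaf layout (an action probability plus a list of integer offsets to the action centre) are all assumptions made for the example.

```python
import numpy as np


def split_node(features, labels, rng):
    """Split samples using one randomly chosen channel and a linear classifier.

    features: dict mapping a channel name to an (n_samples, dim) array.
    labels:   (n_samples,) array of 0/1 class labels.
    Assumption: a least-squares fit stands in for the linear classifier.
    """
    names = sorted(features)
    channel = names[rng.integers(len(names))]  # random channel selection
    x = features[channel]
    xb = np.hstack([x, np.ones((len(x), 1))])  # append a bias column
    # Fit weights so that class 0 maps to -1 and class 1 maps to +1.
    w, *_ = np.linalg.lstsq(xb, 2.0 * labels - 1.0, rcond=None)
    go_left = xb @ w < 0  # negative response goes to the left child
    return channel, w, go_left


def hough_vote(leaves, positions, shape):
    """Accumulate votes for the action centre.

    leaves:    one (p_action, offsets) pair per test patch, where offsets is a
               (k, d) integer array of displacements to the action centre
               (hypothetical leaf layout).
    positions: (n, d) integer array of test patch positions.
    shape:     size of the d-dimensional voting space, e.g. (x, y, t).
    """
    acc = np.zeros(shape)
    bounds = np.asarray(shape)
    for (p_action, offsets), pos in zip(leaves, positions):
        offsets = np.asarray(offsets, dtype=int)
        if len(offsets) == 0:
            continue
        centres = pos + offsets
        inside = np.all((centres >= 0) & (centres < bounds), axis=1)
        # Each stored offset carries an equal share of the leaf's probability.
        np.add.at(acc, tuple(centres[inside].T), p_action / len(offsets))
    return acc
```

In this sketch a detection would correspond to a peak of the accumulator, for example `np.unravel_index(acc.argmax(), acc.shape)`. In a forest, the accumulators of the individual trees would typically be summed before looking for peaks.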