This work presents a novel method for human action recognition based on feature-level fusion and random projection. The proposed method exploits both spatio-temporal gradient features and Gabor features of the action in video, which helps represent the action more accurately after feature-level fusion. Meanwhile, random projection is employed to reduce the dimensionality of the fused features effectively. In addition, Bayesian parameter estimation is applied to the Latent Dirichlet Allocation (LDA) topic model, which reflects the action distribution across different videos and reduces the complexity of parameter estimation. Experimental results on the publicly available KTH dataset indicate that the proposed method not only outperforms single-local-descriptor approaches but also improves recognition performance compared with the baseline classifier under the same experimental settings.
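To illustrate the dimensionality-reduction step mentioned above, the following is a minimal sketch of Gaussian random projection; the feature dimensions and sample counts are hypothetical, and this is a generic Johnson-Lindenstrauss-style projection, not the authors' exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_projection(features, target_dim):
    """Project row-wise feature vectors to target_dim with a Gaussian random matrix."""
    d = features.shape[1]
    # Entries drawn i.i.d. from N(0, 1/target_dim) approximately preserve
    # pairwise distances between the original high-dimensional vectors.
    R = rng.normal(0.0, 1.0 / np.sqrt(target_dim), size=(d, target_dim))
    return features @ R

# Hypothetical example: 50 fused descriptors reduced from 1000-D to 64-D
fused = rng.normal(size=(50, 1000))
reduced = random_projection(fused, 64)
print(reduced.shape)
```

Because the projection matrix is data-independent, it is far cheaper to apply than learned reductions such as PCA, which is why it suits high-dimensional fused descriptors.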