This paper proposes a novel feature extraction method for speech recognition based on gradient features computed on a 2D time-frequency matrix. The widely used MFCC features do not capture temporal dynamics, and ΔMFCC is only an indirect expression of temporal frequency changes. To extract the temporal dynamics more directly, we propose local gradient features computed in an area around a reference position. Gradient-based features were originally proposed as HOG (histograms of oriented gradients) and applied to human body detection in image recognition. In this paper, we extend their application to acoustic features for speech recognition. The proposed acoustic features were evaluated on a word-speech recognition task, and the results showed a significant improvement for clean speech, and even for noisy speech, when coupled with MFCC.
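
As a rough illustration of the idea of local gradient features on a time-frequency matrix, the following minimal sketch computes a HOG-style orientation histogram in a small patch around a reference position. It is not the implementation evaluated in this paper: the function name `local_gradient_histogram`, the patch half-width, the number of orientation bins, and the normalization are all assumptions chosen for illustration.

```python
import numpy as np

def local_gradient_histogram(spec, t, f, half_width=2, n_bins=8):
    """HOG-style orientation histogram in a patch centred on (f, t).

    spec is a 2D time-frequency matrix of shape (n_freq, n_time).
    half_width and n_bins are illustrative assumptions, not the paper's settings.
    """
    # Finite-difference gradients along frequency (axis 0) and time (axis 1).
    d_freq, d_time = np.gradient(spec)
    magnitude = np.hypot(d_time, d_freq)
    orientation = np.arctan2(d_freq, d_time)  # angle in [-pi, pi]

    # Clip the patch to the matrix boundaries.
    f0, f1 = max(f - half_width, 0), min(f + half_width + 1, spec.shape[0])
    t0, t1 = max(t - half_width, 0), min(t + half_width + 1, spec.shape[1])
    mag = magnitude[f0:f1, t0:t1].ravel()
    ori = orientation[f0:f1, t0:t1].ravel()

    # Each cell votes for its orientation bin, weighted by gradient magnitude.
    hist, _ = np.histogram(ori, bins=n_bins, range=(-np.pi, np.pi), weights=mag)
    total = hist.sum()
    return hist / total if total > 0 else hist

# Usage on a synthetic matrix (assumed 20 frequency bands by 50 frames).
rng = np.random.default_rng(0)
spec = rng.random((20, 50))
features = local_gradient_histogram(spec, t=25, f=10)
```

In such a sketch, one histogram per reference position could be concatenated with the MFCC vector of the same frame, which is one possible reading of "coupled with MFCC"; how the features are actually combined is not specified here.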