We propose a learned non-intrusive quality assessment metric for enhanced speech signals. High-dimensional spectro-temporal features are extracted from the speech signals with a Gabor filter bank. Principal Component Analysis (PCA) is then applied to reduce the dimensionality of these features. Given the resulting feature vector, Support Vector Regression (SVR) is used to learn the quality metric for enhanced speech. Experimental results on the NOIZEUS dataset demonstrate that the proposed non-intrusive metric based on spectro-temporal features achieves better performance in assessing enhanced speech signals.
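
The pipeline described above (Gabor filtering, PCA, then SVR) can be illustrated with a minimal sketch. This is not the authors' implementation: the kernel shape (Hann envelope times a cosine carrier), the modulation frequencies, the 23-band by 50-frame spectrogram size, the number of principal components, and the SVR hyperparameters are all assumptions chosen for illustration, and the spectrograms and quality scores are random placeholders rather than NOIZEUS data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def gabor_kernel(omega_t, omega_f, size=9):
    """Simplified real-valued 2D Gabor kernel (assumed form, for illustration)."""
    n = np.arange(size) - size // 2
    envelope = np.outer(np.hanning(size), np.hanning(size))
    # Cosine carrier with spectral (rows) and temporal (columns) modulation.
    carrier = np.cos(omega_f * n[:, None] + omega_t * n[None, :])
    return envelope * carrier


def gabor_features(spec, kernels):
    """Filter a (freq x time) spectrogram with each kernel, pool over time."""
    spec_fft = np.fft.fft2(spec)
    feats = []
    for k in kernels:
        # FFT-based circular 2D convolution; adequate for a sketch.
        resp = np.real(np.fft.ifft2(spec_fft * np.fft.fft2(k, s=spec.shape)))
        # Mean absolute response per frequency band.
        feats.append(np.abs(resp).mean(axis=1))
    return np.concatenate(feats)


rng = np.random.default_rng(0)

# Hypothetical 3 x 3 grid of temporal and spectral modulation frequencies.
kernels = [gabor_kernel(wt, wf) for wt in (0.0, 0.5, 1.0) for wf in (0.0, 0.5, 1.0)]

# Placeholder data: 60 random "spectrograms" and random scores in [1, 5].
specs = rng.random((60, 23, 50))
y = rng.uniform(1.0, 5.0, size=60)

# High-dimensional feature matrix: 9 kernels x 23 bands = 207 features.
X = np.stack([gabor_features(s, kernels) for s in specs])

# Standardize, reduce dimensionality with PCA, then regress with SVR.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),
    SVR(kernel="rbf", C=1.0, epsilon=0.1),
)
model.fit(X[:40], y[:40])
pred = model.predict(X[40:])
```

Because the inputs here are random, the predictions carry no meaning; the sketch only shows how the three stages connect. With real data, `specs` would hold spectrograms of enhanced speech and `y` the corresponding subjective quality ratings.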