An optimal nonlinear feature extractor is proposed for extracting energy features from two different classes of patterns. It simultaneously diagonalizes the two signal covariance matrices in a high-dimensional kernel-transformed space, and thus promises to find more discriminative features, especially when the original data have nonlinear structure. The method involves two operations in the kernel space: a whitening transform and a projection transform. The mechanism of the feature extractor and its effectiveness are demonstrated on simulated data and on a classification task with real electroencephalographic (EEG) signals.
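
To make the two operations concrete, the following is a minimal sketch of simultaneous diagonalization in the ordinary (linear) input space, not the kernel-space method proposed here. It assumes two symmetric covariance matrices whose sum is positive definite. The function name `simultaneous_diagonalization` and the synthetic data are illustrative assumptions. A kernel version would apply the same two steps to covariance operators expressed through a kernel matrix, which this sketch does not attempt.

```python
import numpy as np

def simultaneous_diagonalization(c1, c2):
    """Return (w, lam) with w @ c1 @ w.T = diag(lam) and w @ c2 @ w.T = I - diag(lam).

    Linear, input-space illustration only (assumption: c1 + c2 is positive definite).
    """
    # Whitening transform: map the composite covariance c1 + c2 to the identity.
    evals, evecs = np.linalg.eigh(c1 + c2)
    whiten = np.diag(evals ** -0.5) @ evecs.T

    # Projection transform: rotate onto the eigenvectors of the whitened c1.
    # Because whitened c1 + whitened c2 = I, both share these eigenvectors.
    s1 = whiten @ c1 @ whiten.T
    lam, u = np.linalg.eigh(s1)
    return u.T @ whiten, lam

# Synthetic two-class data (hypothetical): class 2 has rescaled channel variances.
rng = np.random.default_rng(0)
x1 = rng.standard_normal((200, 4))
x2 = rng.standard_normal((200, 4)) @ np.diag([3.0, 1.0, 0.5, 0.2])
c1 = x1.T @ x1 / len(x1)
c2 = x2.T @ x2 / len(x2)

w, lam = simultaneous_diagonalization(c1, c2)
d1 = w @ c1 @ w.T  # diagonal, entries lam
d2 = w @ c2 @ w.T  # diagonal, entries 1 - lam
```

In this sketch the rows of `w` whose eigenvalues in `lam` lie closest to 0 or 1 give projections with large variance for one class and small variance for the other, which is why the variance (energy) of the projected signals can serve as a discriminative feature.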