This paper proposes a hardware/software co-processing speech recognizer for embedded applications. The system consists mainly of a softcore processor and a hardware accelerator. The accelerator computes the Gaussian mixture model (GMM) emission probabilities, which are the major computational bottleneck. To alleviate the memory bandwidth limitation, the accelerator uses double buffering, which allows data retrieval and GMM computation to proceed in parallel. The accelerator is synthesized on an Altera Stratix II FPGA together with a Nios II softcore processor running at 100 MHz. The proposed system is compared with a pure software implementation using test utterances from the Resource Management (RM1) corpus. For a speech utterance of 2.49 s, the decoding time falls from 6.64 s to 2.48 s, improving the real-time factor from 2.67 to 1.00. The word accuracy rate of the proposed system on the RM corpus is 93.42%.
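
The reported real-time factors can be checked from the timings given above, assuming the usual definition of the real-time factor as decoding time divided by utterance duration. The following is a minimal sketch of that arithmetic; the function name `real_time_factor` and the variable names are illustrative and do not come from the paper.

```python
def real_time_factor(decode_time_s: float, utterance_s: float) -> float:
    """Real-time factor: decoding time divided by utterance duration (assumed definition)."""
    if utterance_s <= 0:
        raise ValueError("utterance duration must be positive")
    return decode_time_s / utterance_s


# Timings quoted in the abstract (seconds).
UTTERANCE_S = 2.49
SOFTWARE_DECODE_S = 6.64      # pure software system
COPROCESSED_DECODE_S = 2.48   # softcore processor plus hardware accelerator

rtf_software = real_time_factor(SOFTWARE_DECODE_S, UTTERANCE_S)
rtf_coprocessed = real_time_factor(COPROCESSED_DECODE_S, UTTERANCE_S)
# Ratio of the two decoding times, i.e. the overall speed-up.
speedup = SOFTWARE_DECODE_S / COPROCESSED_DECODE_S

print(f"software RTF:    {rtf_software:.2f}")
print(f"coprocessed RTF: {rtf_coprocessed:.2f}")
print(f"speed-up:        {speedup:.2f}x")
```

A real-time factor at or below 1 means an utterance is decoded in no more time than it takes to speak, which is why the move from 2.67 to 1.00 matters for embedded use.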