This paper presents a robust keyword detection system for criminal scene analysis. The system follows the classical keyword spotting framework. A universal background model is designed to serve as the filler model in keyword recognition and as the anti-word model in keyword verification. Specifically, we analyze how the pitch variation patterns of keywords spoken in criminal scenarios differ from those of their homophones spoken in normal conditions. These pitch variation characteristics are incorporated into the system to reduce the false alarm rate. Simulated experiments on audio-based criminal scene analysis demonstrate the effectiveness of the proposed system and indicate its potential for real-world deployment.
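
To make the verification stage concrete, the following is a minimal sketch of how a keyword hypothesis might be accepted or rejected using an anti-word score and a pitch variation check. It is an illustration under stated assumptions, not the system's actual implementation: the function names, the frame-normalized log-likelihood ratio, the semitone pitch-range measure, and both thresholds are hypothetical choices that do not come from the paper.

```python
import math


def log_likelihood_ratio(keyword_loglik, antiword_loglik, num_frames):
    """Frame-normalized log-likelihood ratio between the keyword model
    and the anti-word model (here assumed to be the background model)."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return (keyword_loglik - antiword_loglik) / num_frames


def pitch_range_semitones(pitch_hz):
    """Pitch range over voiced frames, in semitones.

    Frames with a value of 0 or less are treated as unvoiced and skipped.
    This range is one hypothetical measure of pitch variation.
    """
    voiced = [f for f in pitch_hz if f > 0]
    if len(voiced) < 2:
        return 0.0
    return 12.0 * math.log2(max(voiced) / min(voiced))


def accept_keyword(keyword_loglik, antiword_loglik, num_frames, pitch_hz,
                   llr_threshold=0.5, min_pitch_range=4.0):
    """Accept a detected keyword only if it passes both checks.

    Both thresholds are placeholder values chosen for illustration.
    """
    llr = log_likelihood_ratio(keyword_loglik, antiword_loglik, num_frames)
    if llr < llr_threshold:
        return False  # acoustically closer to the anti-word model
    # Reject flat-pitched candidates, e.g. homophones in normal speech.
    return pitch_range_semitones(pitch_hz) >= min_pitch_range
```

In this sketch, a candidate with a strong acoustic score but nearly flat pitch is rejected, which mirrors the idea of using pitch variation to suppress false alarms from homophones spoken in normal conditions.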