Real-time, low-latency online inference and decoding in sequential probabilistic models are important in many interactive systems, including automatic speech recognition (ASR) and other streaming applications. We study total inference latency (TL) in such systems, defined as the sum of two components: the context-window latency (CWL) arising from the inherent look-ahead of a deep neural network's (DNN) input context window in a DNN-HMM hybrid system, and the model-smoothing latency (MSL) incurred by Kalman-style smoothing in a dynamic probabilistic model (hence, TL = CWL + MSL). For a fixed TL, the best accuracy can occur with a strictly positive MSL, often substantially so, a surprising result given the representational power of the DNN. Furthermore, we find that accuracy is often improved with a smaller TL and a larger MSL, i.e., when the latency budget shifts away from the DNN's context window toward model smoothing. These results suggest that for optimal low-latency real-time decoding, the size of the DNN context window and the degree of model smoothing should be chosen jointly.
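
To make the latency accounting concrete, the following is a minimal sketch of the TL = CWL + MSL decomposition. The 10 ms frame period, the function name `total_latency_ms`, and the specific frame counts are illustrative assumptions, not values from the paper.

```python
# Illustrative sketch of the additive latency decomposition TL = CWL + MSL.
# FRAME_MS is an assumed acoustic frame period (10 ms is common in ASR
# front ends, but the paper does not specify this value).

FRAME_MS = 10  # assumed frame period in milliseconds

def total_latency_ms(right_context_frames: int, smoothing_lag_frames: int) -> int:
    """Total inference latency: DNN look-ahead plus fixed-lag smoothing delay."""
    cwl = right_context_frames * FRAME_MS  # context-window latency (CWL)
    msl = smoothing_lag_frames * FRAME_MS  # model-smoothing latency (MSL)
    return cwl + msl

# Two ways to spend the same 100 ms budget: all of it on the DNN's
# look-ahead, or part of it reallocated to Kalman-style smoothing.
print(total_latency_ms(right_context_frames=10, smoothing_lag_frames=0))  # 100
print(total_latency_ms(right_context_frames=4, smoothing_lag_frames=6))   # 100
```

The two calls above spend an identical latency budget in different ways; the abstract's claim is that, for a fixed budget, the second style of allocation (strictly positive MSL) can yield better accuracy.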