Video surveillance systems enable the monitoring of complex events in many settings, such as airports, banks, streets, schools, and industrial facilities. Due to the massive amount of multimedia data acquired by video cameras, traditional visual inspection by human operators is a tedious and time-consuming task whose performance is degraded by fatigue and stress. A key challenge is to develop intelligent video systems capable of automatically analyzing long video sequences from a large number of cameras. This work describes and evaluates the use of CENTRIST-based features to identify violent content in video scenes. Experimental results demonstrate the effectiveness of our method on two public benchmarks, the Violent Flows and Hockey Fights datasets.
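As background, CENTRIST (CENsus TRansform hISTogram) describes an image patch by census-transforming each pixel against its eight neighbors into an 8-bit code and then histogramming those codes. The sketch below illustrates that idea only; the bit convention (`>=`), function names, and normalization are assumptions for illustration, not the implementation used in this work:

```python
import numpy as np

def census_transform(img):
    """8-bit Census Transform of a grayscale image.

    Border pixels are skipped; a bit is set when the center pixel is
    >= the corresponding neighbor (one common convention).
    """
    h, w = img.shape
    ct = np.zeros((h - 2, w - 2), dtype=np.uint8)
    center = img[1:-1, 1:-1]
    # Eight neighbor offsets, each contributing one bit of the code.
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        ct |= (center >= neigh).astype(np.uint8) << bit
    return ct

def centrist(img):
    """CENTRIST descriptor: normalized 256-bin histogram of CT codes."""
    ct = census_transform(img)
    hist, _ = np.histogram(ct, bins=256, range=(0, 256))
    return hist / hist.sum()
```

In a video setting, such descriptors would typically be computed per frame (or per spatio-temporal block) and fed to a classifier.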