Vision-based interfaces offer a tempting alternative to physical interfaces. Intuitive and multi-purpose, they could allow people to interact with computers naturally and effortlessly. Existing vision-based interfaces, however, are hard to apply in practice because they impose many environmental constraints. In this paper, we introduce a vision-based game interface that is robust in varying environments. The interface consists of three main modules: body-part localization, pose classification, and gesture recognition. First, the body-part localization module automatically determines the locations of body parts such as the face and hands. To do so, we extract body parts using an SCI-color model, the physical characteristics of the human body, and heuristic information. Next, the pose classification module assigns the positions of the detected body parts in a frame to a pose according to the Euclidean distance between the input positions and a set of predefined poses. Finally, the gesture recognition module extracts the sequence of poses corresponding to a gesture from successive frames and translates that sequence into game commands using a hidden Markov model (HMM). To assess the effectiveness of the proposed interface, we tested it with a popular computer game, Quake II. The results confirm that the vision-based interface facilitates more natural and friendly communication while controlling the game.
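
As an illustration of the pose classification step, the following minimal sketch assigns a set of detected body-part positions to the nearest predefined pose by Euclidean distance. It is not the implementation evaluated in this paper: the pose names, the template coordinates, the use of normalized image coordinates, and the function names `pose_distance` and `classify_pose` are all assumptions made for the example.

```python
import math

# Hypothetical pose templates (assumed for illustration only).
# Each pose lists (x, y) positions for the face, left hand, and right hand,
# in image coordinates normalized to the range [0, 1].
POSES = {
    "neutral": [(0.5, 0.2), (0.3, 0.7), (0.7, 0.7)],
    "left_up": [(0.5, 0.2), (0.2, 0.2), (0.7, 0.7)],
    "right_up": [(0.5, 0.2), (0.3, 0.7), (0.8, 0.2)],
}


def pose_distance(parts, template):
    """Euclidean distance between two poses, treating the concatenated
    body-part coordinates as a single vector."""
    return math.sqrt(
        sum((px - tx) ** 2 + (py - ty) ** 2
            for (px, py), (tx, ty) in zip(parts, template))
    )


def classify_pose(parts, poses=POSES):
    """Return the name of the predefined pose closest to the detected parts."""
    return min(poses, key=lambda name: pose_distance(parts, poses[name]))


# Detected positions for one frame: face, left hand, right hand.
detected = [(0.5, 0.22), (0.22, 0.25), (0.68, 0.72)]
print(classify_pose(detected))
```

Applied to every frame, a classifier of this kind yields the sequence of pose labels that the gesture recognition module would take as its observation sequence.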