In this paper, we propose a hybrid model-based tracker that simultaneously tracks 3D head pose and facial actions in sequences of texture and depth frames. The tracker uses a generic wireframe model, Candide-3, to represent facial deformations. This wireframe model is first fitted to the initial frame with an Iterative Closest Point (ICP) algorithm. Starting from this initialization, the tracking algorithm combines the ICP technique with an appearance model to track head pose and facial actions in subsequent frames. The tracker adapts online to changes in the target's appearance, so no prior training is required, and it runs fully automatically, without intervention from a human operator.
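To illustrate the ICP step used for the initial fit, the following is a minimal sketch of rigid point-set ICP in Python/NumPy. It is not the authors' implementation: it estimates only a rigid rotation and translation via nearest-neighbour matching and the Kabsch (SVD) alignment, whereas fitting Candide-3 would additionally estimate the model's non-rigid shape and action parameters.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst
    (Kabsch algorithm); src and dst are (N, 3) arrays of matched points."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)          # cross-covariance of centred sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:               # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cd - R @ cs
    return R, t

def icp(src, dst, iters=50, tol=1e-9):
    """Rigid ICP: iteratively match each src point to its nearest dst point,
    solve for the best rigid transform, and re-apply until convergence.
    Returns the accumulated rotation R and translation t."""
    cur = src.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    prev_err = np.inf
    for _ in range(iters):
        # brute-force nearest-neighbour correspondences (fine for small clouds)
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
        nn = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(cur, nn)
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
        err = np.sqrt(d2.min(axis=1)).mean()
        if abs(prev_err - err) < tol:      # stop when the error plateaus
            break
        prev_err = err
    return R_total, t_total
```

In a tracker of this kind, `src` would be the model's 3D vertices and `dst` the point cloud from the depth frame; a k-d tree would replace the brute-force matching for realistic point counts.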