Temporal segmentation of facial expressions in video sequences is an important and relatively unexplored problem in facial image analysis. Its main difficulties are irregular facial behavior, large variability in facial gestures, and moderate to large head motion. To address these difficulties, we propose a two-step method for temporally segmenting facial expressions, consisting of a rough segmentation stage followed by a fine segmentation stage. The rough segmentation combines kernel k-means and spectral clustering to partition a facial sequence into distinct facial behaviors. The fine segmentation then subdivides the rough segments, using the similarity between segments to produce the final segmentation. We conduct experiments on the MMI Facial Expression Database. The promising results demonstrate the potential of the proposed approach.
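The two-stage structure described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the rough stage uses a standard normalized spectral clustering (RBF affinity, eigenvector embedding, plain k-means) as a stand-in for the combination of kernel k-means and spectral clustering, and the fine stage uses a hypothetical rule that splits a rough segment once, at the point where the two resulting halves are least similar. The per-frame feature matrix `X`, the function names, and the parameters `gamma`, `min_len`, and `threshold` are all assumptions made for illustration.

```python
import numpy as np

def rbf_affinity(X, gamma=1.0):
    """Pairwise RBF affinity between per-frame feature vectors."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def kmeans(Z, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def rough_segmentation(X, k, gamma=1.0):
    """Assumed rough stage: normalized spectral clustering of frames,
    then contiguous runs of equal labels become (start, end) segments."""
    W = rbf_affinity(X, gamma)
    deg = W.sum(axis=1)
    L = W / np.sqrt(np.outer(deg, deg))        # D^{-1/2} W D^{-1/2}
    _, vecs = np.linalg.eigh(L)                # eigenvalues in ascending order
    Z = vecs[:, -k:]                           # top-k eigenvectors
    Z = Z / np.maximum(np.linalg.norm(Z, axis=1, keepdims=True), 1e-12)
    labels = kmeans(Z, k)
    cuts = [0] + [i for i in range(1, len(labels))
                  if labels[i] != labels[i - 1]] + [len(labels)]
    return list(zip(cuts[:-1], cuts[1:]))      # end index is exclusive

def fine_segmentation(X, segments, min_len=3, threshold=0.5):
    """Assumed fine stage: split a rough segment once, at the frame where
    the similarity between the two sub-segment means is lowest, but only
    if that similarity falls below `threshold`."""
    final = []
    for start, end in segments:
        best_t, best_sim = None, threshold
        for t in range(start + min_len, end - min_len + 1):
            diff = X[start:t].mean(axis=0) - X[t:end].mean(axis=0)
            sim = float(np.exp(-diff @ diff))
            if sim < best_sim:
                best_t, best_sim = t, sim
        if best_t is None:
            final.append((start, end))
        else:
            final.extend([(start, best_t), (best_t, end)])
    return final

# Synthetic stand-in for per-frame facial features: one behavior for
# 20 frames, then a second behavior with two sub-phases of 10 frames each.
rng = np.random.default_rng(0)
means = np.repeat([[0.0, 0.0], [5.0, 5.0], [5.0, 6.5]], [20, 10, 10], axis=0)
X = means + 0.05 * rng.standard_normal(means.shape)

rough = rough_segmentation(X, k=2)
final = fine_segmentation(X, rough)
print("rough:", rough)
print("final:", final)
```

In this toy setting the rough stage is meant to separate the two coarse behaviors, while the fine stage recovers the boundary between the two sub-phases that the clustering merged. With real video, `X` would hold whatever per-frame descriptors are available (for example, tracked landmark positions), and the number of clusters and the threshold would need to be tuned.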