Intelligent environments equipped with audio-visual sensors provide suitable means for automatically monitoring and tracking the behavior, strategies and engagement of the participants in multi-person meetings. In this paper, high-level features are computed from active-speaker segmentations, automatically annotated by our smart room system, to infer the interaction dynamics between the participants. These features include the number of turns and their average duration, turn-taking statistics such as time spent as the active speaker, and turn-taking transition patterns between participants. The results show that it is possible to accurately estimate in real time not only the flow of the interaction, but also how dominant and engaged each participant was during the discussion. These high-level features, which cannot be inferred from any individual modality alone, can be useful for summarization, classification, retrieval and (after-action) analysis of meetings.
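
To illustrate the kind of feature computation the abstract describes, the sketch below derives per-participant turn counts, average turn durations, speaking-time shares (a crude dominance proxy) and a speaker-to-speaker turn-transition tally from a toy active-speaker segmentation. The `(speaker, start, end)` segment format and all names here are illustrative assumptions, not the paper's actual implementation.

```python
from collections import Counter, defaultdict

# Hypothetical active-speaker segmentation, ordered by start time,
# as a smart-room annotation system might produce: (speaker, start_s, end_s).
segments = [
    ("A", 0.0, 12.5), ("B", 12.5, 15.0), ("A", 15.0, 22.0),
    ("C", 22.0, 40.0), ("B", 40.0, 44.0), ("C", 44.0, 60.0),
]

turn_count = Counter(spk for spk, _, _ in segments)   # number of turns
speak_time = defaultdict(float)                       # total time as active speaker
for spk, start, end in segments:
    speak_time[spk] += end - start

total = sum(speak_time.values())
for spk in sorted(turn_count):
    avg_dur = speak_time[spk] / turn_count[spk]       # average turn duration
    share = speak_time[spk] / total                   # fraction of floor time
    print(f"{spk}: turns={turn_count[spk]}, avg={avg_dur:.1f}s, share={share:.2f}")

# Turn-taking transition patterns: counts of who takes the floor from whom.
transitions = Counter(
    (a[0], b[0]) for a, b in zip(segments, segments[1:]) if a[0] != b[0]
)
print(dict(transitions))
```

Normalizing each row of the transition counts would yield a first-order transition matrix over speakers, one plausible way to summarize the flow of interaction that the abstract refers to.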