Hidden Markov models (HMMs) provide joint segmentation and classification of sequential data through efficient inference algorithms and have therefore been employed in fields as diverse as speech recognition, document processing, and genomics. However, conventional HMMs are poorly suited to action segmentation in video because the measurements are often irregular in space and time, high-dimensional, and affected by outliers. For this reason, in this paper we present a joint action segmentation and classification approach based on an extended model: the hidden Markov model for multiple, irregular observations (HMM-MIO). Experiments on a concatenated version of the popular KTH action dataset and on the challenging CMU multi-modal activity dataset (CMU-MMAC) report accuracies comparable to or higher than those of a bag-of-features approach, showing the usefulness of improved sequential models for joint action segmentation and classification tasks.