We propose a method for classifying actions in which people interact with objects. Our method combines motion and appearance information in a unified framework. We extract motion information, in the form of trajectories, from the sparse component of the video as provided by robust principal component analysis (RPCA). While motion is the main cue for classification, we also incorporate implicit object information: the object with which the person interacts is represented by a probability distribution over object classes, learned using probabilistic Latent Semantic Analysis (pLSA). We evaluate our classification method on a publicly available dataset and compare it with related work; the results obtained by our method are promising.
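The sparse component referred to above can be obtained by decomposing the stacked video frames into a low-rank part (quasi-static background) and a sparse part (moving foreground). The following is a minimal sketch of that decomposition using Principal Component Pursuit solved by inexact ALM, a standard RPCA algorithm; it is illustrative only and not the authors' exact implementation, and the function names and parameter choices (`rpca`, `mu`, `lam`) are assumptions.

```python
import numpy as np

def shrink(X, tau):
    # elementwise soft-thresholding (shrinkage) operator
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_shrink(X, tau):
    # singular value thresholding: shrink the singular values of X
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def rpca(M, max_iter=500, tol=1e-7):
    """Decompose M ~ L + S with L low-rank and S sparse.

    Principal Component Pursuit via inexact ALM. For video, each
    column of M is a vectorized frame; S then carries the moving
    foreground from which trajectories could be extracted.
    """
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))          # standard PCP weight
    mu = m * n / (4.0 * np.abs(M).sum())    # common ALM step size
    norm_M = np.linalg.norm(M)
    Y = np.zeros_like(M)                    # dual variable
    S = np.zeros_like(M)
    for _ in range(max_iter):
        L = svd_shrink(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        R = M - L - S                       # primal residual
        Y += mu * R
        if np.linalg.norm(R) / norm_M < tol:
            break
    return L, S
```

On a synthetic matrix built as a rank-3 term plus a few large sparse corruptions, this recovers both components to small relative error, mirroring how the sparse video component isolates moving people and objects.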