Task recognition and prediction of future human activity are essential for safe and productive human-robot cooperation. In real scenarios, the robot must extract this information by merging knowledge of the task with contextual information from its sensors, minimizing possible misunderstandings. In this paper, we focus on tasks that can be represented as a sequence of manipulated objects and performed actions. The task is modelled with a Dynamic Bayesian Network (DBN), which takes manipulated objects and performed actions as input. Objects and actions are classified separately from raw RGB-D data. The DBN is responsible for estimating the current task, predicting the most probable future action-object pairs, and correcting possible misclassifications. The effectiveness of the proposed approach is validated on a case study consisting of three typical tasks in a kitchen scenario.
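To make the inference loop concrete, the following is a minimal sketch (not the paper's actual DBN) of discrete Bayesian filtering over a set of tasks: a belief over tasks is updated from observed (action, object) pairs, and the next most probable pair is predicted by marginalizing the observation model over that belief. All task names, pairs, and probabilities below are hypothetical illustrations.

```python
# Hypothetical task set and observation model P((action, object) | task).
# Values are illustrative only, not from the paper.
TASKS = ["make_coffee", "make_tea", "prepare_sandwich"]

OBS_MODEL = {
    "make_coffee":      {("grasp", "mug"): 0.5, ("pour", "kettle"): 0.4, ("cut", "bread"): 0.1},
    "make_tea":         {("grasp", "mug"): 0.4, ("pour", "kettle"): 0.5, ("cut", "bread"): 0.1},
    "prepare_sandwich": {("grasp", "mug"): 0.1, ("pour", "kettle"): 0.1, ("cut", "bread"): 0.8},
}

def update_belief(belief, observation):
    """One Bayes update over tasks: posterior proportional to likelihood * prior."""
    posterior = {t: OBS_MODEL[t].get(observation, 1e-6) * p for t, p in belief.items()}
    z = sum(posterior.values())
    return {t: p / z for t, p in posterior.items()}

def predict_next_pair(belief):
    """Predict the most probable next (action, object) pair by marginalizing
    the observation model over the current task belief."""
    pairs = {}
    for t, p in belief.items():
        for pair, q in OBS_MODEL[t].items():
            pairs[pair] = pairs.get(pair, 0.0) + p * q
    return max(pairs, key=pairs.get)

# Start from a uniform prior over tasks, then observe one action-object pair.
belief = {t: 1.0 / len(TASKS) for t in TASKS}
belief = update_belief(belief, ("cut", "bread"))
print(max(belief, key=belief.get))  # -> prepare_sandwich
print(predict_next_pair(belief))    # -> ('cut', 'bread')
```

A full DBN would additionally model temporal transitions between actions within a task; this sketch keeps only the per-observation Bayes update and the prediction step to show how task estimation and action-object prediction share one belief.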