Top-down image semantics play a major role in predicting where people attend in images. State-of-the-art computational models of human visual attention incorporate high-level object detections, which signify top-down image semantics, as a separate channel alongside other bottom-up saliency channels. The co-occurrence of objects in a scene also attracts our attention, yet this interaction is ignored in recent computational models. This paper presents an attention model that combines low-level, high-level, and scene-context features to understand how their joint presence affects visual attention. The context-based scene features are extracted using a cause-effect mechanism. The model is evaluated on the MIT benchmark dataset, and the resulting saliency map is compared with existing models using the ROC curve and the area under the ROC curve (AUC) as performance metrics. The scene-context-based saliency map gives promising results compared to state-of-the-art models.
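As background on the evaluation metric, the AUC for saliency compares predicted saliency values at human fixation locations (positives) against the remaining pixels (negatives). A minimal sketch of a rank-based AUC computation is shown below; the function name, toy arrays, and the simple whole-image negative set are illustrative assumptions, not the paper's exact evaluation protocol (benchmark variants such as shuffled AUC sample negatives differently).

```python
import numpy as np

def saliency_auc(saliency_map, fixation_map):
    """Rank-based AUC: fixated pixels are positives, all others negatives.

    saliency_map: 2D float array, higher values = more salient.
    fixation_map: 2D boolean array, True at human fixation locations.
    (Illustrative sketch; benchmark AUC variants differ in negative sampling.)
    """
    s = saliency_map.ravel().astype(float)
    f = fixation_map.ravel().astype(bool)
    pos = s[f]    # saliency values at fixated pixels
    neg = s[~f]   # saliency values everywhere else
    # 0-based ranks of all values; AUC is the probability that a random
    # positive outranks a random negative (Mann-Whitney U statistic).
    ranks = np.argsort(np.argsort(np.concatenate([pos, neg])))
    n_pos, n_neg = len(pos), len(neg)
    return (ranks[:n_pos].sum() - n_pos * (n_pos - 1) / 2) / (n_pos * n_neg)

# Toy example: a saliency map that peaks exactly at the single fixated pixel
sal = np.array([[0.1, 0.2], [0.3, 0.9]])
fix = np.zeros((2, 2), dtype=bool)
fix[1, 1] = True
print(saliency_auc(sal, fix))  # perfect ranking -> 1.0
```

A score of 1.0 means every fixated pixel outranks every non-fixated pixel, while 0.5 corresponds to chance-level prediction.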