Semantic segmentation is an image labeling process for each pixels according to defined objects class and its presence in an image. Labeling process consists of recognizing, detecting location and labeling pixels that defines the object in the image. Annotation result of semantic segmentation needs ground truth to verify accuracy of score prediction. Therefore, this research propose a model to predict score of annotation accuracy. By casting the problem into constraining object boundary recognition, we described the annotation using foreground mask. To extract the feature, we used convolution neural network. We only used CNN trained on a image level annotation. In order to be able to infer the pixel instance, we adapt CNN architecture into weakly supervised learning. Experiments were conducted by finetuning Convolution Neural Network for object recognition using weakly supervised architecture for multilabel classification. In this paper we proposed to score semantic segmentation based on bag level information without the availability of pixel level annotation.