Urban areas contain many man-made objects, such as buildings and roads, that produce a large number of salient features in Very High Resolution (VHR) images. These features are often used as Control Points (CPs) in state-of-the-art co-registration approaches. However, a large number of CPs, especially when they are clustered together, may result in poor matching between multitemporal images and thus in poor co-registration performance. To effectively reduce the number of CPs while achieving good co-registration performance, we propose a context-based CP selection approach. Context-based CPs are extracted by applying a segmentation method, and their correspondences are established by considering local misalignment, also called Registration Noise (RN). The approach thus achieves fine co-registration performance even in complex scenarios such as urban areas. Experiments on both a simulated and a real dataset confirm the effectiveness of the proposed approach.