The study of external pages (pages that link to a site) is important because search engines use them to rank their results. It is therefore useful to have a “snapshot” of the content of the external pages in order to understand the surroundings of a particular site within the vast topical space of the Internet. To this end, a graph is built in which the external pages are nodes and a measure of similarity between them defines the edges. This is a novel way of building the edges, since usually only the hyperlinks are used. The analysis shows that properties related to keywords are useful for explaining the structure of the graph, whereas the isolated components of the graph represent undesirable (spam) links. The results suggest promising ideas in three directions: an approach to understanding the neighbourhood of a site, a methodology for detecting spam links, and an automatic procedure for validating the importance of external pages.
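The graph construction described above can be sketched as follows. This is a minimal illustration, not the paper's actual method: the keyword sets, the Jaccard similarity measure, and the edge threshold are all assumptions introduced here for the example.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity between two keyword sets (assumed measure)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def build_similarity_graph(pages, threshold=0.3):
    """Nodes are page names; an edge joins two pages whose keyword
    sets are at least `threshold`-similar under Jaccard."""
    adj = {name: set() for name in pages}
    for (u, ku), (v, kv) in combinations(pages.items(), 2):
        if jaccard(ku, kv) >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def connected_components(adj):
    """Depth-first traversal of the adjacency map; small isolated
    components would be candidates for spam links."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps

# Hypothetical external pages with extracted keyword sets
pages = {
    "blog.example": {"python", "graphs", "search"},
    "wiki.example": {"graphs", "search", "ranking"},
    "spam.example": {"pills", "casino"},
}
adj = build_similarity_graph(pages)
comps = connected_components(adj)
```

With these toy keyword sets, the two topically related pages end up in one component while the unrelated page is isolated, mirroring the idea that isolated components may flag spam links.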