The study of external pages (pages that link to a site) is important because search engines use them to rank their results. It is therefore useful to have a “snapshot” of the content of the external pages in order to understand the surroundings of a particular site within the vast topical space of the Internet. To this end, a graph is built in which the external pages are nodes and a measure of similarity between them defines the edges. This is a novel way of building the edges, since usually only the hyperlinks are used. The analysis shows that properties related to keywords are useful for explaining the structure of the graph, whereas the isolated components of the graph represent undesirable (spam) links. The results suggest promising ideas in three directions: an approach to understanding the neighbourhood of a site, a methodology for detecting spam links, and an automatic procedure for validating the importance of external pages.
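The graph construction described above can be sketched as follows. This is a minimal illustration, not the paper's actual method: the keyword sets, the Jaccard similarity measure, and the edge threshold are all assumptions introduced here for the example.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity between two keyword sets (assumed measure)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def build_similarity_graph(pages, threshold=0.3):
    """Nodes are page names; an edge joins two pages whose keyword
    sets are at least `threshold`-similar under Jaccard."""
    adj = {name: set() for name in pages}
    for (u, ku), (v, kv) in combinations(pages.items(), 2):
        if jaccard(ku, kv) >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def connected_components(adj):
    """Depth-first traversal of the adjacency map; small isolated
    components would be candidates for spam links."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps

# Hypothetical external pages with extracted keyword sets
pages = {
    "blog.example": {"python", "graphs", "search"},
    "wiki.example": {"graphs", "search", "ranking"},
    "spam.example": {"pills", "casino"},
}
adj = build_similarity_graph(pages)
comps = connected_components(adj)
```

With these toy keyword sets, the two topically related pages end up in one component while the unrelated page is isolated, mirroring the idea that isolated components may flag spam links.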