Online advertising is a rapidly growing industry currently dominated by the search engine 'giant' Google. In an attempt to tap into this huge market, Internet Service Providers (ISPs) started deploying deep packet inspection techniques to track and collect user browsing behavior. However, such techniques violate wiretap laws that explicitly prevent intercepting the contents of communication without gaining consent from consumers. In this paper, we show that it is possible for ISPs to extract user browsing patterns without inspecting contents of communication. Our contributions are threefold. First, we develop a methodology and implement a system that is capable of extracting web browsing features from stored non-content based records of online communication, which could be legally shared. When such browsing features are correlated with information collected by independently crawling the Web, it becomes possible to recover the actual web pages accessed by clients. Second, we systematically evaluate our system on the Internet and demonstrate that it can successfully recover user browsing patterns with high accuracy. Finally, our findings call for a comprehensive legislative reform that would not only enable fair competition in the online advertising business, but more importantly, protect the consumer rights in a more effective way.
Financed by the National Centre for Research and Development under grant No. SP/I/1/77065/10 by the strategic scientific research and experimental development program:
SYNAT - “Interdisciplinary System for Interactive Scientific and Scientific-Technical Information”.