Thunderbeam-Lightbeam for Chrome


Chrome Extension Introduction

    Our extension can be installed from the Chrome Web Store: https://tiny.cc/lightbeam-chrome-plugin
  • Thunderbeam starts recording connections as soon as it's installed.
  • To start visualizing your online interactions, open a new tab, navigate to a site, and then switch back to the Thunderbeam tab.
  • Thunderbeam is used for our research study. If you consent, and only if you consent, your browser data will be used for research. Please note that even for those who consent, no personal data belonging to the users will be used (and data is hashed to prevent re-identification); any data collected will be used only for the purposes of non-commercial research, and all data will be deleted after the research project completes.
  • The code behind our plugin is completely open source on GitHub for your inspection. We are happy to receive your feedback (pull requests, issues, etc.): https://github.com/socsys/Lightbeam_Chrome






Our papers:


Characterising Third Party Cookie Usage in the EU after GDPR

The recently introduced General Data Protection Regulation (GDPR) requires that when obtaining information online that could be used to identify individuals, their consent must be obtained. Among other things, this affects many common forms of cookies, and users in the EU have been presented with notices asking for their approval of data collection. This paper examines the prevalence of third party cookies before and after GDPR by using two datasets: accesses to the top 500 websites according to Alexa.com, and weekly data of cookies placed in users' browsers by websites accessed by 16 UK and China users across one year. We find that on average the number of third parties dropped by more than 10% after GDPR, but when we examine real users' browsing histories over a year, we find that there is no material reduction in long-term numbers of third party cookies, suggesting that users are not making use of the choices offered by GDPR for increased privacy. Also, among websites which offer users a choice in whether and how they are tracked, accepting the default choices typically ends up storing more cookies on average than on websites which provide a notice of cookies stored but without giving users a choice of which cookies, or those that do not provide a cookie notice at all. We also find that top non-EU websites have fewer cookie notices, suggesting higher levels of tracking when visiting international sites. Our findings have deep implications both for understanding compliance with GDPR and for understanding the evolution of tracking on the web.



What a Tangled Web We Weave: Understanding the Interconnectedness of the Third Party Cookie Ecosystem

In this paper, we develop a metric called “tangle factor” that measures how a set of first party websites may be interconnected or tangled with each other based on the common third parties used. Our insight is that the interconnectedness can be calculated as the chromatic number of a graph where the first party sites are the nodes, and edges are induced based on shared third parties. We use this technique to measure the interconnectedness of the browsing patterns of over 100 users in 25 different countries, through a Chrome browser plugin which we have deployed. The users of our plugin consist of a small, carefully selected set of 15 test users in the UK and China, and 1000+ in-the-wild users, of whom 124 have shared data with us. We show that different countries have different levels of interconnectedness; for example, China has a lower tangle factor than the UK. We also show that when visiting the same sets of websites from China, the tangle factor is smaller, due to blocking of major operators like Google and Facebook...
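The idea of the tangle factor as a chromatic number can be illustrated with a minimal Python sketch. This is not the code used in the paper: the function names, the example domains, and the rule that any single shared third party induces an edge are assumptions made for illustration. The search is exact but exponential in the worst case, so it only suits small graphs.

```python
from itertools import combinations


def build_graph(third_parties_by_site):
    """Link two first-party sites when they share at least one third party.

    Assumption: one shared third party is enough to induce an edge.
    """
    graph = {site: set() for site in third_parties_by_site}
    for a, b in combinations(third_parties_by_site, 2):
        if third_parties_by_site[a] & third_parties_by_site[b]:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def can_color(graph, order, k, colors, idx=0):
    """Backtracking check: can the nodes in `order` be coloured with k colours?"""
    if idx == len(order):
        return True
    node = order[idx]
    used = {colors[n] for n in graph[node] if n in colors}
    for color in range(k):
        if color not in used:
            colors[node] = color
            if can_color(graph, order, k, colors, idx + 1):
                return True
            del colors[node]
    return False


def chromatic_number(graph):
    """Smallest k for which a proper k-colouring exists (0 for an empty graph)."""
    if not graph:
        return 0
    # Colouring high-degree nodes first tends to prune the search earlier.
    order = sorted(graph, key=lambda n: len(graph[n]), reverse=True)
    for k in range(1, len(graph) + 1):
        if can_color(graph, order, k, {}):
            return k


def tangle_factor(third_parties_by_site):
    """Tangle factor sketch: chromatic number of the shared-third-party graph."""
    return chromatic_number(build_graph(third_parties_by_site))


# Hypothetical browsing data: first-party site -> third parties observed on it.
example_sites = {
    "news.example": {"ads.example", "analytics.example"},
    "shop.example": {"ads.example"},
    "blog.example": {"ads.example", "cdn.example"},
    "forum.example": {"cdn.example"},
    "bank.example": set(),
}

print(tangle_factor(example_sites))  # prints 3
```

In this example the three sites that share ads.example form a triangle, so three colours are needed. The site with no third parties is an isolated node and does not raise the value.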



Multi-country Study of Third Party Trackers from Real Browser Histories

This paper aims to understand how third-party ecosystems have developed in four different countries: UK, China, AU, US. We are interested in how wide a view a given third-party player may have, of an individual user’s browsing history over a period of time, and of the collective browsing histories of a cohort of users in each of these countries. We study this by utilizing two complementary approaches: the first uses lists of the most popular websites per country, as determined by Alexa.com. The second approach is based on the real browsing histories of a cohort of users in these countries. Our larger continuous user data collection spans over a year. Some universal patterns are seen, such as more third parties on more popular websites, and a specialization among trackers, with some trackers present in some categories of websites but not others. However, our study reveals several unexpected country-specific patterns: China has a home-grown ecosystem of third-party operators in contrast with the UK, whose trackers are dominated by players hosted in the US. UK trackers are more location sensitive than Chinese trackers. One important consequence of these is that users in China are tracked lesser than users in the UK. Our unique access to the browsing patterns of a panel of users provides a realistic insight into third party exposure, and suggests that studies which rely solely on Alexa top ranked websites may be over estimating the power of third parties, since real users also access several niche interest sites with lesser numbers of many kinds of third parties, especially advertisers.
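One simple way to quantify "how wide a view" a third party has of a user's browsing history is the fraction of distinct first-party sites on which it appears. The sketch below assumes this definition for illustration only; the paper's actual measure may differ, and the function name and example domains are hypothetical.

```python
from collections import defaultdict


def third_party_coverage(history):
    """Fraction of distinct first-party sites on which each third party was seen.

    `history` is a list of (first_party, third_parties) visits, where
    `third_parties` is the set of third-party domains observed on that visit.
    This coverage definition is an illustrative assumption.
    """
    sites_seen = set()
    sites_by_third_party = defaultdict(set)
    for first_party, third_parties in history:
        sites_seen.add(first_party)
        for third_party in third_parties:
            sites_by_third_party[third_party].add(first_party)
    if not sites_seen:
        return {}
    return {
        third_party: len(sites) / len(sites_seen)
        for third_party, sites in sites_by_third_party.items()
    }


# Hypothetical history: repeat visits to a site count once.
example_history = [
    ("news.example", {"ads.example", "cdn.example"}),
    ("shop.example", {"ads.example"}),
    ("blog.example", {"ads.example", "cdn.example"}),
    ("news.example", {"ads.example"}),
    ("bank.example", set()),
]

coverage = third_party_coverage(example_history)
for domain in sorted(coverage):
    print(domain, coverage[domain])
```

The same function can be applied to the pooled histories of a cohort to estimate a third party's view of the collective browsing in one country.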


Third-party Dataset

All collected data have been obtained with agreement from participants and under the Research Ethics Minimal Risk Registration process at our university, which covers the approvals relevant to this research (Ethics approval no. MRS-1718-6539). If you are interested in using this data, please send us an email to request it and indicate which parts you need. Example screencast videos of non-compliant websites in the Top 500 are also available.


Contact Us


If you are interested in using this data, please e-mail us at xuehui.hu[at]kcl.ac.uk

We are sharing the video dataset under the terms and conditions specified below and following GDPR's Terms of Usage. In the email, please indicate which part of the dataset you need and how you intend to use it. If you do not receive an email notification for your logged request within 24 hours, please e-mail us at xuehui.hu[at]kcl.ac.uk.


Dataset Terms and Conditions

  1. You will use the data solely for the purpose of non-profit research or non-profit education.

  2. You will respect the privacy of end users and organizations that may be identified in the data. You will not attempt to reverse engineer, decrypt, de-anonymize, derive or otherwise re-identify anonymized information.

  3. You will not distribute the data beyond your immediate research group.

  4. If you create a publication using our datasets, please cite our papers as follows.

Xuehui Hu and Nishanth Sastry. 2019. Characterising Third Party Cookie Usage in the EU after GDPR. In Proceedings of the 11th ACM Conference on Web Science (WebSci ’19). Association for Computing Machinery, New York, NY, USA, 137–141. DOI:https://doi.org/10.1145/3292522.3326039
Xuehui Hu and Nishanth Sastry. 2020. What a Tangled Web We Weave: Understanding the Interconnectedness of the Third Party Cookie Ecosystem. 12th ACM Web Science Conference 2020. Southampton, UK.
Xuehui Hu, Guillermo Suarez-Tangil, and Nishanth Sastry. 2020. Multi-country Study of Third Party Trackers from Real Browser Histories. IEEE European Symposium on Security and Privacy (Euro S&P). Genova, Italy.