Written By Bryan Jung 

Researchers from KU Leuven, Radboud University, and the University of Lausanne conducted a study that found tens of thousands of websites captured—without permission—every word typed into an online form, even if users left a site without submitting their info, according to a May 12 article in Fortune.

“Considering its scale, intrusiveness, and unintended side effects, the privacy problem we investigate deserves more attention from browser vendors, privacy tool developers, and data protection agencies,” warned the authors of the study.

Many users were led to believe that their personal data was safe while entering an email address on a website, registering an account, buying a ticket, or subscribing to a newsletter.

The joint university study analyzed more than 100,000 websites, according to Fortune.

The research groups built software that imitated a live user, visiting thousands of websites and filling in login or registration information without clicking the submit button.
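One way such a test could check for leaks is to search the network requests recorded before submission for the typed email address, either as typed or in a common encoding or hash. The sketch below is illustrative only and is not the researchers' code: the function names, the list of encodings, and the example URLs are assumptions made for this example.

```python
import base64
import hashlib
from urllib.parse import quote


def email_variants(email: str) -> dict[str, str]:
    """Return forms a tracker might send instead of the raw address (assumed list)."""
    raw = email.strip().lower()
    data = raw.encode("utf-8")
    return {
        "plaintext": raw,
        "url-encoded": quote(raw, safe=""),
        "base64": base64.b64encode(data).decode("ascii"),
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def find_leaks(email: str, requests: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return (url, encoding) for each captured request that contains the email.

    `requests` holds (url, body) pairs recorded before any submit click.
    Matching is case-sensitive, which keeps the sketch simple.
    """
    variants = email_variants(email)
    leaks = []
    for url, body in requests:
        haystack = url + " " + body
        for name, value in variants.items():
            if value in haystack:
                leaks.append((url, name))
                break  # one match per request is enough
    return leaks


# Hypothetical captured traffic; the hosts are placeholders, not real trackers.
captured = [
    ("https://tracker.example/collect?e=test%40example.com", ""),
    ("https://cdn.example/app.js", ""),
    ("https://ads.example/sync", "h=" + hashlib.sha256(b"test@example.com").hexdigest()),
]
leaks = find_leaks("test@example.com", captured)
```

In this example, `leaks` flags the first and third requests and leaves the script download alone. A real measurement would also need to handle compressed or multiply encoded payloads, which this sketch ignores.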

They found that 1,844 websites in the European Union had gathered individual email addresses without user consent, while 2,950 U.S.-based sites did exactly the same.

The top U.S. websites by user volume where tracker software collected personal data such as email addresses included USA Today, Time, Fox News, and Trello, according to the study, while Newsweek, Shopify, and Marriott topped the EU list.

“It certainly exceeded our expectations by a lot,” said Güneş Acar, a professor and researcher at Radboud University, who explained that his team initially thought they would find just a couple hundred sites taking user data.

“Based on our findings, users should assume that the personal information they enter into web forms may be collected by trackers—even if the form is never submitted,” said the study’s authors.

The results showed that in some cases websites collected the data themselves before submission, but most of it was collected by third-party advertising and marketing services such as Taboola, Bizible, and Glassbox digital, which were built into websites to monetize content.

The technique the third parties used to collect data closely resembles “keylogging,” which malware programs use to record a user’s keystrokes, often to steal passwords and other confidential information, though rarely to collect email addresses.

In addition, the researchers “found incidental password collection on 52 websites by third-party session replay scripts,” which were also collecting password data before submission.

The study group informed the affected sites’ operators, and the password collection issues have since been resolved.

In a follow-up investigation, they found that Meta and TikTok had used in-house invisible marketing trackers to collect personal information from web forms without consent.

Websites that used Meta’s Pixel or TikTok’s Pixel software, which allows a website to track visitor activity, would trigger an “automatic advanced matching” feature that let the two social media giants grab data from an advertiser’s site.

Any email address or other data entered into a form on a site using Pixel software, even partially, could be taken by Meta or TikTok once the user clicked through to another page, whether or not the form was submitted.

“Documentation we looked together with Asuman claims that [Meta] only collect this data when users click Submit, but we’ve looked into their code and they were collecting all clicks to any button, any link on the page,” said Acar.

The professor found that 8,438 U.S. sites may have been leaking data to Meta through Pixel, while 7,379 sites were compromised for EU users.