Data exposure spotlights ongoing questions about web scrapers and face biometrics
Sections of the web-scraping industry closely resemble the sketchier corners of the search engine optimization community, with a couple of big differences.
Practitioners of bad SEO hide within densely nested corporate hierarchies like overcrowded spiders waiting for unprepared search bots to boost their clients’ traffic.
In contrast, the more unsavory web scrapers brazenly sell their harvesting services, especially from social media sites and without the owners’ permission. Or, apparently, the sites’ permission.
The biggest difference, however, is that unscrupulous SEO service firms leave behind defrauded advertisers and other companies.
Scrapers not only profit from ordinary people’s personal information, including biometric data, but can also leave their vast data stores insufficiently protected.
That is what has happened with 235 million public social media profiles collected by the defunct Deep Social, an indiscriminate data miner that closed in 2018 after Instagram (and its owner Facebook) forbade it from using its marketing APIs. Instagram representatives have said scraping violates the pair’s legally binding terms of service, which Deep Social agreed to.
Comparitech, a consumer technology research publisher, reportedly found the profiles exposed and unprotected in three identical databases last week. The data included names, images, contact information and other data tied to accounts not only on Instagram but also on YouTube and TikTok. The publication notes that the scraped data could be used for biometric face recognition purposes, such as training data or spoof attacks.
After a little detective work, Comparitech researchers traced the databases to Deep Social, and from there, to another scraper, Social Data. According to Comparitech, Social Data’s chief technology officer “acknowledged the exposure,” and the relevant servers were taken offline about three hours later.
Social Data told the publication that it is not connected to Deep Social.
An email from the company to Comparitech includes a full-throated defense of scraping public profiles on social media, without referencing the need to abide by terms of service.
About all that is known about either company’s structure is that Social Data has a CTO and that Pavel Maurus, founder and CEO of the defunct Deep Social, moved on after the closure.
Maurus, a Russian native, has since been involved in AppQuantum, a game maker, and AdQuantum. The latter busies itself “acquiring premium quality traffic.”