Breach exposes privacy risk from de-anonymization of location data

Gravy Analytics, a prominent location data broker, has disclosed a significant data breach that may have exposed the precise location data of millions of individuals, data that could be de-anonymized to identify and track them. The breach was executed using a “misappropriated key” to access Gravy Analytics’ Amazon Web Services (AWS) cloud storage environment.
Gravy Analytics’ parent company, Norway-based Unacast, disclosed the breach last week to the Norwegian Data Protection Authority (NDPA) as required by law. Unacast said it “identified unauthorized access to its AWS cloud storage environment. The unauthorized person obtained some files, but the contents of those files and whether they contain personal data remains under investigation. Gravy Analytics is informing Datatilsynet at this time for your awareness, as speculation about this incident has started to appear on social media and in news media.”
“The investigation is still in progress, but the unauthorized person appears to have gained access to the Gravy Analytics AWS environment through a misappropriated access key. Gravy Analytics became aware of this incident through communication from the unauthorized person,” the company told NDPA.
According to 404 Media, “The hackers said they have stolen a massive amount of data, including customer lists, information on the broader industry, and even location data harvested from smartphones which show peoples’ precise movements, and they are threatening to publish the data publicly.”
The breach underscores significant privacy concerns, as the exposed data could potentially lead to the de-anonymization of individuals, enabling malicious actors to track personal movements and behaviors.
“It can be embarrassing and a violation of privacy. For some, it can be used to influence and manipulate them for fraud or blackmail,” said NDPA’s Tobias Judin.
The exposure of individuals’ location data in the Gravy Analytics hack highlights the vulnerabilities inherent in the data brokerage ecosystem, where vast amounts of personal information are collected, stored, and monetized without adequate oversight or user consent.
The breach also exposes the risks associated with the real-time bidding (RTB) process in the advertising ecosystem, which allows data brokers to harvest location information during ad placements. In the Gravy Analytics case, sensitive location data collected through RTB was exposed, demonstrating how such systems can compromise user privacy.
The breach highlights the critical need for robust cybersecurity measures, particularly when it comes to the protection of sensitive personal data. The incident underscores the importance of transparency in data collection and the necessity for obtaining explicit user consent, especially when handling sensitive information such as location data. Users should be informed about how their data is collected, stored, and utilized, empowering them to make informed decisions regarding their privacy.
The compromised data includes sensitive location records harvested from various smartphone applications, revealing individuals’ movements to places such as the White House, military bases, and other sensitive locations. A small data sample leaked on a Russian forum included over 30 million location points, indicating the extensive nature of the breach.
The breach also revealed that popular applications such as Candy Crush, Tinder, and MyFitnessPal were exploited to collect users’ location data without their explicit consent via the RTB process. This data collection occurred without the direct involvement or awareness of the app developers.
Real-Time Bidding is a programmatic advertising technology that facilitates the rapid auctioning of digital ad space to advertisers. When a user visits a website or opens an app, their data, including location, browsing behavior, and demographic details, is transmitted to an ad exchange. Advertisers then bid in real time to display their ads to that specific user, with the highest bidder’s ad appearing almost instantly.
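To make the data flow concrete, here is a simplified, hypothetical bid request sketched in Python and loosely modeled on the OpenRTB format used by many ad exchanges; the app bundle, advertising identifier, and coordinates are invented for illustration and are not drawn from Gravy Analytics’ systems.

```python
# Hypothetical, simplified ad-exchange bid request, loosely modeled on the
# OpenRTB format. All identifiers, coordinates, and values are invented.
bid_request = {
    "id": "auction-8f3a2c",                        # unique auction ID
    "app": {"bundle": "com.example.fitness"},      # app requesting the ad
    "device": {
        "os": "Android",
        "ifa": "38400000-8cf0-11bd-b23e-10b96e40000d",  # advertising identifier
        "geo": {
            "lat": 38.8977,                        # precise GPS coordinates,
            "lon": -77.0365,                       # forwarded to every bidder
            "type": 1,                             # 1 = GPS-derived in OpenRTB
        },
    },
    "user": {"yob": 1990, "gender": "F"},          # demographic hints
}

# Every company that receives this request can log the device ID and location,
# whether or not it wins the auction, which is how brokers accumulate movement
# histories at scale.
print(bid_request["device"]["geo"])
```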
While RTB enables highly targeted advertising, it also raises significant privacy concerns. The process involves sharing vast amounts of user data with advertisers and third parties – data that often includes sensitive information such as precise locations and device identifiers. Many users are unaware that their data is being auctioned, as consent is often buried in lengthy terms and conditions that are rarely read. Furthermore, even anonymized data can be de-anonymized when combined with other datasets, exposing individuals to potential tracking and surveillance.
RTB also facilitates continuous tracking of users across devices and platforms, creating detailed behavioral profiles. This form of surveillance advertising raises concerns about the erosion of online privacy. Moreover, the extensive data sharing inherent in RTB increases the risk of data breaches. If one entity in the RTB ecosystem is compromised, millions of users’ data could be exposed. These practices often conflict with privacy laws like the General Data Protection Regulation (GDPR) in Europe and the California Consumer Privacy Act in the United States, which require transparency and explicit consent for data collection.
De-anonymization occurs when data that has been stripped of direct identifiers, such as names or email addresses, is combined with other datasets to re-identify individuals. This process relies on quasi-identifiers like location, gender, birth date, or device IDs, which, when cross-referenced with other information, can uniquely identify people. For example, a combination of birth date, gender, and ZIP code can be enough to pinpoint an individual, even in anonymized datasets.
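A minimal sketch, using entirely invented records, shows how a simple join on those three quasi-identifiers can link an “anonymized” record back to a named individual:

```python
# Hypothetical sketch: linking "anonymized" records to named records that share
# the same quasi-identifiers (ZIP code, birth date, gender). All data is invented.

anonymized_records = [
    {"zip": "02138", "birth_date": "1945-07-31", "gender": "M", "diagnosis": "X"},
]

public_records = [
    {"name": "J. Doe", "zip": "02138", "birth_date": "1945-07-31", "gender": "M"},
]

def reidentify(anon_rows, named_rows):
    """Return (name, record) pairs where exactly one named record matches."""
    linked = []
    for anon in anon_rows:
        key = (anon["zip"], anon["birth_date"], anon["gender"])
        matches = [
            p for p in named_rows
            if (p["zip"], p["birth_date"], p["gender"]) == key
        ]
        if len(matches) == 1:  # a unique match re-identifies the record
            linked.append((matches[0]["name"], anon))
    return linked

print(reidentify(anonymized_records, public_records))
```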
Location-based re-identification is a common method, as precise GPS data often reveals unique movement patterns. A person’s home address (their location at night) and workplace (their daytime location) can easily lead to their identification. Behavioral data, such as shopping habits or browsing histories, also poses a risk. Patterns in this data can correlate with identifiable attributes in other datasets, linking anonymous records back to specific individuals.
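The same idea can be sketched for location traces: cluster nighttime points to estimate a likely home and daytime points to estimate a workplace. The coordinates, hours, and roughly 100-meter grid below are illustrative assumptions, not a description of how any particular broker processes data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sketch: inferring likely home and work locations from
# timestamped GPS points by finding the most-visited grid cell at night
# and during working hours. All points and thresholds are invented.

points = [
    ("2025-01-06T02:14:00", 40.7282, -73.7949),  # overnight, likely home
    ("2025-01-06T03:40:00", 40.7281, -73.7951),
    ("2025-01-06T14:05:00", 40.7580, -73.9855),  # workday afternoon, likely work
    ("2025-01-07T15:20:00", 40.7579, -73.9853),
]

def most_common_cell(pts, hour_range):
    """Return the most-visited grid cell (coords rounded to ~100 m) in the hour range."""
    lo, hi = hour_range
    cells = Counter(
        (round(lat, 3), round(lon, 3))
        for ts, lat, lon in pts
        if lo <= datetime.fromisoformat(ts).hour < hi
    )
    return cells.most_common(1)[0][0] if cells else None

print("likely home cell:", most_common_cell(points, (0, 6)))   # overnight hours
print("likely work cell:", most_common_cell(points, (9, 17)))  # working hours
```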
There have been notable cases of de-anonymization. In 2008, Netflix released an anonymized dataset of movie ratings for a competition to improve its recommendation algorithm. Researchers were able to re-identify users by cross-referencing the Netflix dataset with publicly available IMDb reviews, exposing private preferences. Another example occurred in 1997 when Latanya Sweeney re-identified the governor of Massachusetts in an anonymized medical dataset by comparing it with voter registration records using attributes like ZIP code, birth date, and gender.
In 2013, MIT researchers demonstrated how anonymized mobile phone location data could be re-identified. They found that just four spatiotemporal points – specific times and locations – were enough to uniquely identify 95% of individuals in the dataset.
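The principle behind that finding can be illustrated with a toy check: given a few observed (time, place) points, count how many traces in a dataset contain them all; a unique match means the trace, and the person behind it, is effectively identified. The traces and observations below are invented.

```python
# Hypothetical sketch of trace uniqueness: a handful of (time, place)
# observations can single out one trace in a dataset. Data is invented.

traces = {
    "user_a": {("Mon 08h", "cell_12"), ("Mon 19h", "cell_40"), ("Tue 13h", "cell_07")},
    "user_b": {("Mon 08h", "cell_12"), ("Mon 19h", "cell_41"), ("Tue 13h", "cell_07")},
}

def matching_users(observations, trace_db):
    """Return users whose trace contains every observed (time, place) point."""
    return [user for user, trace in trace_db.items() if observations <= trace]

# Two leaked observations already narrow the field to a single trace.
observed = {("Mon 08h", "cell_12"), ("Mon 19h", "cell_40")}
print(matching_users(observed, traces))  # -> ['user_a']
```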
Similarly, during the Cambridge Analytica scandal in 2018, data from Facebook was de-anonymized to build detailed psychographic profiles of voters, showing how behavioral patterns could be used for invasive targeting. The COVID-19 pandemic also revealed vulnerabilities in anonymized data, as mobility data shared for public health research could still identify individuals’ movements through pattern analysis.
Regulatory bodies like the Belgian Data Protection Authority have also criticized RTB frameworks for violating privacy regulations, as seen in the 2022 GDPR ruling against IAB Europe.
In that ruling, the Belgian DPA found that IAB Europe’s “Transparency and Consent Framework” (TCF) violated the GDPR: the TCF string encoding user consent preferences constitutes personal data, and IAB Europe acts as a data controller responsible for managing it. The decision raised concerns about excessive user tracking and non-compliant consent mechanisms, and was later upheld by the European Court of Justice, which further solidified IAB Europe’s responsibility under the GDPR for the TCF system.
De-anonymization has profound implications for privacy, as re-identified data can expose sensitive information and personal behaviors. This exposure may lead to discrimination, exploitation, or other forms of harm. Organizations also face legal and reputational risks when anonymized data is compromised, especially if re-identification violates privacy laws like the GDPR. Additionally, the use of de-anonymized data for profiling, targeted advertising, or surveillance further erodes public trust.
To mitigate these risks, organizations often employ methods such as differential privacy, which adds statistical noise to data to prevent individual records from being isolated while preserving overall patterns.
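As a rough illustration of that approach, the sketch below answers a count query with Laplace noise calibrated to a privacy budget epsilon; the dataset, query, and epsilon value are illustrative assumptions rather than a recommended configuration.

```python
import math
import random

# Minimal sketch of differential privacy for a count query: noise drawn from a
# Laplace distribution scaled to 1/epsilon is added, so any single person's
# record changes the published answer only slightly.

def laplace_sample(scale):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_count(records, predicate, epsilon=0.5):
    """Return a noisy count of records matching predicate (sensitivity = 1)."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_sample(1.0 / epsilon)

# Example: how many devices in a fictional dataset were seen near a clinic?
visits = [{"device": i, "near_clinic": i % 7 == 0} for i in range(1000)]
print(private_count(visits, lambda r: r["near_clinic"]))
```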
Aggregating data instead of sharing raw records is another approach, along with implementing strict access controls to limit who can view sensitive information. Transparency is also critical, ensuring users are aware of how their data is collected, anonymized, and potentially used. However, as datasets grow and analytical tools become more sophisticated, the risk of de-anonymization remains an ongoing challenge.
To address these concerns, platforms should prioritize transparency and provide users with clear information about RTB practices. Users should have accessible options to opt out of targeted advertising. Governments must enforce privacy regulations more rigorously to ensure compliance.
Additionally, privacy-preserving technologies, such as on-device ad targeting, can minimize data exposure while still supporting advertising needs. Educating users on managing app permissions and using ad-blocking tools can also help mitigate the risks associated with RTB.
While RTB offers efficiency and precision in advertising, its implications for user privacy demand careful scrutiny and systemic changes to balance technological capabilities with ethical practices.
Several app developers and companies have denied any knowledge of or involvement with Gravy Analytics. Tinder stated that it has no relationship with Gravy Analytics and no evidence that data was obtained from its app. Similarly, Muslim Pro, a popular prayer app, expressed unawareness of Gravy’s activities.
The pervasive nature of RTB makes it challenging for app developers to fully control or even be aware of how user data is exploited within the advertising ecosystem. Users are encouraged to be vigilant by limiting app permissions and blocking advertisements to reduce exposure to such data collection practices.
The Federal Trade Commission (FTC) has been actively scrutinizing data brokers for their handling of sensitive information. In December, the FTC took action against Gravy Analytics and its subsidiary Venntel for unlawfully selling location data that tracked consumers to sensitive sites, including health-related locations and places of worship.
To mitigate the risks, it is imperative for organizations to implement stringent access controls, regularly audit data collection practices, and ensure compliance with data protection regulations. Additionally, there is a pressing need for greater transparency in data collection methods and for obtaining explicit user consent, particularly when handling sensitive information such as location data.