Senators move to block unauthorized use of PII in AI models

Senators Josh Hawley and Richard Blumenthal have reintroduced the AI Accountability and Personal Data Protection Act, a bipartisan measure designed to curb the growing practice of using personal information to train AI models without explicit permission.
At its core, the bill treats the training of AI on personally identifiable information (PII) as a regulated exploitation of “covered data,” placing it under the same legal scrutiny as the sale or commercial use of private records and copyrighted material.
It comes at a moment of heightened public scrutiny over AI’s hunger for data, fueled by lawsuits from news organizations, creative guilds, and individual rights holders challenging the unauthorized use of personal or copyrighted material in model training.
While those cases often center on intellectual property, Hawley and Blumenthal’s bill extends the principle of consent into the realm of private, uniquely identifiable information.
By framing AI training as a legally actionable act when conducted without permission, the measure aims to close what lawmakers describe as a significant privacy gap.
The legislation defines “covered data” in unusually expansive terms, capturing far more than names, phone numbers, or email addresses. It encompasses government-issued identifiers such as Social Security numbers, driver’s license and passport numbers, as well as any unique personal identifier, including device IDs, advertising IDs, and persistent browser cookies that can be linked to a specific individual.
The bill also extends to biometric identifiers, including fingerprints, iris scans, facial recognition templates, and voiceprints; precise geolocation data that could reveal a person’s home address, place of work, or habitual routes; and behavioral or activity patterns, such as purchase histories, browsing activity, or app usage logs.
Even inferred profiles – data points or characteristics that an AI system or data broker deduces about an individual from other information – fall under its scope, as does data recovered through “de-anonymization.” This means that everything from a recorded voice sample to a GPS breadcrumb trail could be considered off-limits for AI training without the subject’s advance, explicit approval.
De-anonymization occurs when datasets that have been stripped of direct identifiers such as names or email addresses are combined with other datasets to re-identify individuals.
In privacy and data protection contexts, this falls under the broader concept of data linkage or record linkage, where different data points such as gender, birth date, device IDs, purchase history, location traces, demographic details, or online behavior are correlated to uniquely identify people.
Even behavioral data, such as shopping habits or browsing histories, poses a risk. Patterns in this data can correlate with identifiable attributes in other datasets, linking anonymous records back to specific individuals.
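The linkage risk described above can be sketched with a toy example. All of the records, field names, and quasi-identifiers below are hypothetical; real re-identification attacks work the same way, just at scale:

```python
# Toy illustration of record linkage ("de-anonymization").
# All records and field names here are hypothetical.

# "Anonymized" dataset: direct identifiers removed, but
# quasi-identifiers (ZIP code, birth date, gender) remain.
anonymized = [
    {"zip": "02138", "dob": "1990-07-31", "gender": "F", "purchases": ["books"]},
    {"zip": "60601", "dob": "1985-01-15", "gender": "M", "purchases": ["tools"]},
]

# A second, public dataset that still carries names alongside
# the same quasi-identifiers (e.g. a voter roll or people-search site).
public = [
    {"name": "Alice Example", "zip": "02138", "dob": "1990-07-31", "gender": "F"},
    {"name": "Bob Example", "zip": "60601", "dob": "1985-01-15", "gender": "M"},
]

def link(anon, pub):
    """Re-identify anonymized records by joining on quasi-identifiers."""
    index = {(p["zip"], p["dob"], p["gender"]): p["name"] for p in pub}
    return [
        {**a, "name": index[(a["zip"], a["dob"], a["gender"])]}
        for a in anon
        if (a["zip"], a["dob"], a["gender"]) in index
    ]

reidentified = link(anonymized, public)
print(reidentified[0]["name"])  # -> Alice Example: the "anonymous" record is now named
```

The join key here is exactly the kind of "seemingly innocuous fragment" the bill's broad definition of covered data is meant to capture.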
Under Hawley and Blumenthal’s legislation, consent must be an unambiguous, informed, and freely given agreement before any data is used. The bill stipulates that the request for consent must clearly identify all intended uses, including AI model training, and disclose at the time of collection the specific third parties who will receive or are privy to the data.
These disclosures must be presented separately from dense privacy policies or terms of service and cannot be relegated to fine print or hidden behind hyperlinks. Any consent obtained through coercion, deceptive design, or as a condition for accessing unrelated services would be considered invalid.
By doing so, the measure targets long-standing industry tactics where broad data permissions are buried in legalese or bundled with unrelated user agreements.
This legislation did not emerge in a vacuum. In September 2023, Hawley and Blumenthal unveiled a bipartisan AI oversight framework that emphasized personal agency, corporate liability for misuse, and transparency in algorithmic systems. That earlier framework laid the groundwork for the current bill, which has been referred to the Senate Committee on the Judiciary.
The bill’s enforcement provisions are aggressive. The act creates a federal cause of action allowing individuals to sue in federal or state court when their covered data is exploited without valid consent.
Remedies include compensatory damages – calculated as the greater of actual losses, triple the profits earned from the misuse, or a statutory minimum of $1,000 – along with punitive damages, injunctive relief, and attorney’s fees.
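The compensatory damages formula – the greater of actual losses, treble profits, or a $1,000 floor – can be sketched as follows. This is a simplified reading of the provision for illustration only, not legal advice:

```python
def compensatory_damages(actual_losses: float, violator_profits: float) -> float:
    """Greater of actual losses, 3x profits from the misuse, or the $1,000 floor."""
    return max(actual_losses, 3 * violator_profits, 1000.0)

print(compensatory_damages(500, 0))      # floor applies -> 1000.0
print(compensatory_damages(500, 2000))   # treble profits dominate -> 6000
print(compensatory_damages(10000, 100))  # actual losses dominate -> 10000
```

The statutory floor matters because privacy harms are often hard to monetize: even a plaintiff who cannot prove concrete losses recovers at least $1,000 per violation.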
The legislation invalidates predispute arbitration agreements and class-action waivers for these claims, ensuring that groups of individuals can pursue collective action against violators. In court, companies could invoke consent as an affirmative defense but would bear the burden of proving it met the law’s stringent requirements.
Supporters see the measure as a long-overdue check on AI developers who have treated personal data as free, unregulated raw material. The Authors Guild and other advocacy groups have praised its clear, enforceable consent rules, noting that the bill would give individuals and creators the leverage they have lacked in negotiations with technology firms.
Privacy advocates point to its broad definition of covered data as a recognition of how modern AI systems thrive on combining seemingly innocuous fragments of information into highly detailed personal profiles.
Industry, however, warns that the bill’s reach could disrupt established AI development practices. Because so many datasets used in training large models contain personal identifiers, even if only in fragmentary form, compliance could require extensive data cleansing or costly licensing.
Some AI companies argue that such restrictions could slow innovation and reduce U.S. competitiveness in a rapidly moving global AI race. Hawley and Blumenthal have countered that argument, saying the bill’s intent is not to halt AI progress, but to ensure it develops without trampling on the rights of individuals to control their own information.
Importantly, the legislation would also allow states to enact even stronger privacy protections, mirroring other areas of privacy and consumer protection law and ensuring that the baseline standard set in Washington could be built upon rather than undermined. This provision is in line with other bipartisan efforts in Congress to let states regulate AI, something the Trump administration opposes.
Last month, the Senate voted to strike down a controversial provision in President Trump’s sweeping tax and spending package that would have blocked states and localities from regulating AI for the next ten years.
A national reckoning on data brokers and digital rights in the U.S. was further spurred by the targeted attacks on Minnesota lawmakers in June, which left two dead.
Investigators found that the alleged assassin had used commercial data brokers and people-search websites to compile detailed dossiers on the victims and more than 70 other public figures.
Should the AI Accountability and Personal Data Protection Act become law, it would reframe the conversation about AI training data. It would transform consent from passive acceptance to an active, enforceable shield against the unauthorized use of personally identifiable information.