FB pixel

GAO warns of privacy, transparency issues in commercial generative AI development

Categories Biometric R&D  |  Biometrics News
GAO warns of privacy, transparency issues in commercial generative AI development
 

A new technology assessment report to US lawmakers on generative AI training, development, and deployment cautions that despite the efforts of commercial developers of generative AI technologies to continuously monitor their AI models after deployment, developers are finding that “their models may be susceptible to attacks or may produce outputs that are factually incorrect or exhibit bias.”

In addition, the Government Accountability Office (GAO) said, “stakeholders have raised trust, safety, and privacy concerns over the use of training data for models and the potential for harmful outputs,” and that developers are not always upfront with the public about these matters [and] face some limitations in responsibly developing and deploying generative AI technologies to ensure that they are safe and trustworthy.”

GAO performed its technology assessment from June 2024 to October 2024, interviewed AI developers at Amazon, Anthropic, Google, Meta, Microsoft, Nvidia Corporation, OpenAI, and Stability AI, “to learn what safeguards they are using to protect sensitive data.” The federal auditor said it believes that the information and data it obtained, and its analysis, “provide a reasonable basis for any findings and conclusions in” its report, which was requested by Rep. Gary Peters, chair of the House Committee on Homeland Security and Governmental Affairs, and Sen. Edward Markey.

Peters and Markey are advocates in Congress for greater transparency and controls on commercial AI technologies. Last month, Markey – a member of the Senate Committee on Commerce, Science and Transportation – introduced his Artificial Intelligence Civil Rights Act, which would put strict guardrails on companies’ use of algorithms for consequential decisions, ensure algorithms are tested before and after deployment, help eliminate and prevent bias, and renew Americans’ faith in the accuracy and fairness of complex algorithms.

And last year, Peters introduced the Artificial Intelligence Leadership Training Act, which would require the Office of Personnel Management to develop and implement an annual training program on AI for federal managers, supervisors, and other employees designated to participate in the program, and would include training on the benefits offered and risks posed by AI, as well as ways to mitigate the risks of AI.

GAO told Peters and Markey that “developers recognize that their models are not fully reliable, and that user judgment should play a role in accepting model outputs. However, they may not advertise these limitations and instead focus on capabilities and improvements to models when new iterations are released. Furthermore, generative AI models may be more reliable for some applications over others and a user may use a model in a context where it may be particularly unreliable.”

GAO reported finding that despite developers’ mitigation efforts, “their models may produce incorrect outputs, exhibit bias, or be susceptible to attacks.”

“For example,” GAO reported that AI models “can produce ‘confabulations’ and ‘hallucinations,’” which it described as “confidently stated but erroneous content that may mislead or deceive users.” And “such unintended outputs may have significant consequences, such as the generation and publication of explicit images of an unwilling subject or instructions on how to create weapons.”

It’s long been known that Large Language AI models – especially the ones that are easily accessed by the public – are prone to stuttering, freezing, hallucinating, or making stuff up, including conflating facts with conspiracy theories to provide nonsensical and inaccurate answers to users’ complex prompting.

“In addition,” GAO stated, “malicious users are constantly looking for methods to circumvent model safeguards. According to experts, these attacks do not require advanced programming knowledge or technical savvy. Rather, attackers may only need to rely on the ability to craft text prompts to achieve their goals. Commercial developers are aware of these realities and the limitations they impose on the responsible deployment of AI models.”

To make its point, GAO noted a that the National Institute of Standards and Technology (NIST) reported that there are multiple methods of attacking a generative AI model that focus on compromising the model’s availability (its ability to operate correctly), integrity, privacy, and susceptibility to abuse.”

“Those interested in unintended or malicious use of generative AI technologies to generate harmful outputs may employ several methods to achieve their goals,” GAO said, noting that “one such method is prompt injection, which occurs when a user inputs text that may change the behavior of a generative AI model. Prompt injection attacks enable users to perform unintended or unauthorized actions. For example, rather than asking a large language model to provide instructions on developing a bomb (which the model will likely not answer because it violates safety policies), a user may reframe the input in a way that circumvents the model’s safeguards by asking it to tell a story about how a bomb is built. A prompt injection attack can be used to steal sensitive data, conduct misinformation campaigns, or transmit malware, among other malicious activities.”

GAO said commercial developers of generative AI take steps that are intended to prevent such attacks while at the same time conceding “that these risks may occur at any time and that malicious users are continuously looking for new methods to attack generative AI models.”

Congress’ investigative arm said, “commercial developers are taking measures to safeguard sensitive information by undergoing privacy evaluations at various stages of training and development,” but pointed out that “proprietary training datasets may contain sensitive data, such as a user’s name, address, and other personally identifiable information.” GAO said one expert it interviewed said that “the ability to successfully remove personal information may depend on the type of information. For example, it may be relatively easy to find and remove an e-mail address as compared to an identification number.”

GAO was particularly critical of AI developers’ transparency regarding the training data that they collect. The federal auditor said, “information regarding the specifics of training datasets is not entirely available to the public” and that the “commercial developers we met with did not disclose detailed information about their training datasets beyond high-level information identified in model cards and other relevant documentation.”

“For example,” GAO said, “many stated that their training data consist of information publicly available on the internet. However, without access to detailed information about the processes by which they curate their data to abide with internal trust, privacy, and safety policies, we cannot evaluate the efficacy of those processes. According to documentation that describes their models, developers did not share these processes and maintain that their models’ training data are proprietary. According to an expert, the transparency of training data for generative AI models has worsened over time and information contained in model cards on training data does not meet guidelines proposed by researchers.”

GAO told lawmakers who requested its report that “commercial developers have created privacy and safety policies that guide the development of their generative AI technologies,” and that “these policies include general internal guidance on usage of data, how to curate data, or prevent harmful outputs,” such as one developer implementing “policies on how to curate training data … that emphasize diversity across gender, race, and ethnicity. Such measures may reduce the likelihood that a model will generate harmful or discriminatory outputs. Another developer noted that it embeds principles into its development lifecycle to ensure compliance with privacy, security, and ethical guidelines.”

Related Posts

Article Topics

 |   |   |   |   |   | 

Latest Biometrics News

 

Emerging biometrics markets draw a crowd

Biometrics startups and giant multinationals collide as each tries to navigate emerging markets in the most-read stories of the week…

 

Laxton to supply hundreds of biometric kits to Honduras under $1.9M UNDP contract

The United Nations Development Programme has selected Laxton to provide hundreds of Biometric Citizen Registration (BCR) kits for Honduras. The…

 

Leadership change at IBIA follows layoffs at Thales

A major leadership change has been kicked off at Thales Digital Identity & Security and the International Biometrics and Identity…

 

Reusable ID for AML acquired by global fintech as compliance costs rise

Global fintech platform iCapital has entered a definitive agreement to acquire U.S.-based Parallel Markets, which provides reusable identity tools for…

 

Services Australia to run Trust Exchange pilot with largest Australian bank

A pilot with Commonwealth Bank will test the Australian government’s digital identity exchange scheme, Trust Exchange (TEx), using digital medical…

 

COPPA changes specify children’s biometrics and government IDs for protection

The Federal Trade Commission (FTC) Thursday issued notice that it finalized substantial changes to the Children’s Online Privacy Protection Act…

Comments

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Most Viewed This Week

Featured Company

Biometrics Insight, Opinion

Digital ID In-Depth

Biometrics White Papers

Biometrics Events