OpenAI setting the pace for EU regulators, as GPAIs throw a wrench into AI Act
The boom in artificial intelligence has regulators scrambling to catch up with technology whose capabilities advance by the day. As OpenAI’s ChatGPT and its cousins in General Purpose Artificial Intelligence (GPAI) make waves across industries and governments, the technology perhaps supplies its own best analogy in the self-driving car: impressive, until it runs over a pedestrian.
Barely four months into its public life, OpenAI’s ChatGPT has upended European efforts to finalize the Artificial Intelligence Act, the flagship legislation meant to create safeguards and standards for AI. According to Politico, the current draft of the AI Act aims to curtail potential harm; it addresses what it calls “high-risk” AI and restricts certain applications, such as social scoring, manipulation and some uses of facial recognition. But large language models such as ChatGPT, which have a broad range of applications, are categorized as GPAI, and are therefore harder to pin down at the regulatory level.
On March 14, EU lawmakers put forth an amendment that would require GPAI providers to comply with standards initially designed for high-risk applications deemed likely to cause harm. According to Euractiv, the new draft requires large language models to undergo “external audits testing their performance, predictability, interpretability, corrigibility, safety and cybersecurity in line with the AI Act’s strictest requirements.”
Response to the proposal has been mixed, with opponents expressing concern about overreach, and activists arguing that the law does not go far enough and needs to apply not just to text-generating systems, but to other kinds of GPAI as well.
GPT-4: image in, text out
OpenAI continues to push development forward with the release of GPT-4, the newest generation of its foundational AI model. As reported in Techmonitor, unlike its predecessor (GPT-3.5, which powers the current version of ChatGPT), the new system can accept both text and image inputs, and generate text based on pictures.
“GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks,” reads a post on OpenAI’s blog.
GPT-4’s image capabilities will be available to only one client for now: Be My Eyes, an app that uses AI and video to assist people with visual impairments. Its language component, however, will be integrated into ChatGPT, and companies such as Duolingo plan to incorporate it into language-learning apps.
“As we continue to focus on reliable scaling,” says the company’s blog, “we aim to hone our methodology to help us predict and prepare for future capabilities increasingly far in advance — something we view as critical for safety.”
If OpenAI’s current pace is any indication, the question of oversight will quickly become urgent. With a fresh $10 billion investment from Microsoft behind it, the San Francisco-based research lab, which was founded in 2015 by, among others, Elon Musk and Peter Thiel, and has both non-profit and for-profit arms, is speeding ahead with new technological leaps. There was speculation that GPT-4 might come with video capability; while that did not happen, such a feature seems sure to arrive eventually.
Human oversight is a key piece of the European AI Act’s regulatory framework. But, writing in The Parliament magazine, Johannes Walter, a German researcher and academic, argues that the evidence shows human oversight too often fails. He points to studies showing that people cannot always discern poor algorithmic recommendations, and to evidence from policing and healthcare, where human oversight has not prevented technological bias.
“The problem with the AI Act is that it presupposes that AI systems have always been designed in such a way that human oversight will be effective,” says Walter. “As evidenced by self-driving cars that allow their operators to become distracted for extended periods, this is clearly not the case. It is therefore critical for each high-risk application to test whether humans can successfully exploit algorithmic advice when it is of high quality and ignore or correct it when the advice is poor.”
Walter says changes to the AI Act could address the issue, and suggests randomized controlled trials to determine the effectiveness of human oversight.
“Ensuring that human oversight works as intended is key to prevent future discriminatory decisions or further casualties.”