AI: harnessing evaluations for innovation and accountability

Much of the EU AI Act’s treatment of general-purpose AI models with systemic risk hinges on evaluations; for successful implementation, measures to mature the sector are essential, writes Marius Hobbhahn.

Marius Hobbhahn is the CEO and co-founder of Apollo Research, a non-profit research organisation focused on model evaluations and interpretability. He has a background in machine learning and AI forecasting. 

Evaluations increasingly underpin governance frameworks advanced by governments and frontier AI companies. Given the types of systemic risk associated with advanced AI models, everyone stands to benefit from evaluating early, often, and comprehensively. Ideally, the results of evaluation suites will provide deeper insight into an AI model’s potential risks and dangerous capabilities, and allow these to be mitigated before high-stakes usage and deployment.

At the most basic level, evaluations identify and measure an AI system’s capabilities, meaning what it “can do,” as well as its propensities, meaning how likely it is “to do something.”

Over the last year, evaluations rapidly established themselves as a key lever for AI governance, as reflected in coordinated efforts such as the Bletchley Declaration or the Hiroshima Process International Code of Conduct. Evaluations also inform the mission of institutes set up across the world aiming to support governments’ decision-making on AI, such as the European Union’s (EU) AI Office, the United Kingdom’s AI Safety Institute, or the United States’ AI Safety Institute.  

Evaluations in the European Union’s ambitious AI governance framework 

In the EU’s recently passed AI Act, running evaluations is a key obligation for providers of general-purpose AI models with systemic risk. Providers may comply with this obligation to run, satisfactorily pass, and detail evaluations by adhering to Codes of Practice, which will specify the evaluation procedures alluded to in the AI Act.

The clock is ticking: the Codes of Practice need to be finalised within nine months and the EU AI Office recently closed a hiring round for technical experts, including those with evaluation expertise. Given the rapid speed of AI progress and the need for a functional governance framework within the next twelve months, it makes sense to reflect on the measures needed to support a flourishing evaluations ecosystem and successful implementation of the AI Act. 

As an evaluator for general-purpose AI models with systemic risk, I recommend three overarching areas to focus on to ensure that the evaluations ecosystem is able to sustain the trust placed in it. 

Towards a thriving evaluations ecosystem 

First, we need to lay adequate groundwork for the field. Evaluations are a nascent discipline whose scientific rigour is still developing while, at the same time, high-stakes governance decisions increasingly hinge on them.

It is necessary to establish a better science of evaluations and to create a network that supports an ongoing information exchange between the EU AI Office and the evaluations community. There is a need for a shared understanding of what evaluations can and cannot currently achieve, so that companies and governments can make accurate and informed decisions. Moreover, it will be important to require a defence-in-depth approach throughout the maturation of the field, to ensure that high-stakes decisions do not rely solely on evaluations. Right now, evaluations are necessary but not sufficiently developed.

Second, we need to empower the EU AI Office with sufficient oversight and the ability to quickly update measures in accordance with the latest technical developments and in light of changing AI capabilities.  

The establishment of adequate Codes of Practice is only the first step in a long and iterative process. The EU AI Office will need to be able to keep pace with technological progress and to request updated evaluation regimes in the future.   

I envisage three low-hanging fruits to support the EU AI Office here. First, enable the Scientific Panel of independent experts to investigate and flag when certain technical measures have become outdated in light of novel AI capabilities. Second, review and update the Codes of Practice on a rolling basis, leveraging and inviting external expertise and input from the research community, to ensure that adherence to them remains meaningful as AI progresses. Third, establish a rigorous incident-reporting infrastructure that includes details of the evaluations run on the AI models involved; this could provide valuable insight into what worked and which safety measures were insufficient.

Finally, we should start planning for the future and raise the bar on safety. The potential risks are too high not to. This includes requiring independent evaluations of general-purpose AI models with systemic risk to provide external verification, overseeing the evaluators themselves as the field professionalises, and ensuring that evaluations matching future AI capabilities are developed quickly.




