OpenAI Whistleblowers Describe Reckless and Secretive Culture
A group of OpenAI insiders is blowing the whistle on what they say is a culture of recklessness and secrecy at the San Francisco artificial intelligence company, which is racing to build the most powerful A.I. systems ever created.
The group, which includes nine current and former OpenAI employees, has rallied in recent days around shared concerns that the company has not done enough to prevent its A.I. systems from becoming dangerous.
The members say OpenAI, which started as a nonprofit research lab and burst into public view with the 2022 release of ChatGPT, is putting a priority on profits and growth as it tries to build artificial general intelligence, or A.G.I., the industry term for a computer program capable of doing anything a human can.
They also claim that OpenAI has used hardball tactics to prevent workers from voicing their concerns about the technology, including restrictive nondisparagement agreements that departing employees were asked to sign.
“OpenAI is really excited about building A.G.I., and they are recklessly racing to be the first there,” said Daniel Kokotajlo, a former researcher in OpenAI’s governance division and one of the group’s organizers.
The group published an open letter on Tuesday calling for leading A.I. companies, including OpenAI, to establish greater transparency and more protections for whistle-blowers.
Other members include William Saunders, a research engineer who left OpenAI in February, and three other former OpenAI employees: Carroll Wainwright, Jacob Hilton and Daniel Ziegler. Several current OpenAI employees endorsed the letter anonymously because they feared retaliation from the company, Mr. Kokotajlo said. One current and one former employee of Google DeepMind, Google’s central A.I. lab, also signed.
A spokeswoman for OpenAI, Lindsey Held, said in a statement: “We’re proud of our track record providing the most capable and safest A.I. systems and believe in our scientific approach to addressing risk. We agree that rigorous debate is crucial given the significance of this technology, and we’ll continue to engage with governments, civil society and other communities around the world.”
A Google spokesman declined to comment.
The campaign comes at a rough moment for OpenAI. It is still recovering from an attempted coup last year, when members of the company’s board voted to fire Sam Altman, the chief executive, over concerns about his candor. Mr. Altman was brought back days later, and the board was remade with new members.
The company also faces legal battles with content creators who have accused it of stealing copyrighted works to train its models. (The New York Times sued OpenAI and its partner, Microsoft, for copyright infringement last year.) And its recent unveiling of a hyper-realistic voice assistant was marred by a public spat with the Hollywood actress Scarlett Johansson, who claimed that OpenAI had imitated her voice without permission.
But nothing has stuck like the charge that OpenAI has been too cavalier about safety.
Last month, two senior A.I. researchers — Ilya Sutskever and Jan Leike — left OpenAI under a cloud. Dr. Sutskever, who had been on OpenAI’s board and voted to fire Mr. Altman, had raised alarms about the potential risks of powerful A.I. systems. His departure was seen by some safety-minded employees as a setback.
So was the departure of Dr. Leike, who along with Dr. Sutskever had led OpenAI’s “superalignment” team, which focused on managing the risks of powerful A.I. models. In a series of public posts announcing his departure, Dr. Leike said he believed that “safety culture and processes have taken a back seat to shiny products.”
Neither Dr. Sutskever nor Dr. Leike signed the open letter written by former employees. But their exits galvanized other former OpenAI employees to speak out.
“When I signed up for OpenAI, I did not sign up for this attitude of ‘Let’s put things out into the world and see what happens and fix them afterward,’” Mr. Saunders said.
Some of the former employees have ties to effective altruism, a utilitarian-inspired movement that has become concerned in recent years with preventing existential threats from A.I. Critics have accused the movement of promoting doomsday scenarios about the technology, such as the notion that an out-of-control A.I. system could take over and wipe out humanity.
Mr. Kokotajlo, 31, joined OpenAI in 2022 as a governance researcher and was asked to forecast A.I. progress. He was not, to put it mildly, optimistic.
In his previous job at an A.I. safety organization, he predicted that A.G.I. might arrive in 2050. But after seeing how quickly A.I. was improving, he shortened his timelines. Now he believes there is a 50 percent chance that A.G.I. will arrive by 2027 — in just three years.
He also believes that the probability that advanced A.I. will destroy or catastrophically harm humanity — a grim statistic often shortened to “p(doom)” in A.I. circles — is 70 percent.
At OpenAI, Mr. Kokotajlo saw that even though the company had safety protocols in place — including a joint effort with Microsoft known as the “deployment safety board,” which was supposed to review new models for major risks before they were publicly released — they rarely seemed to slow anything down.
For example, he said, in 2022 Microsoft began quietly testing in India a new version of its Bing search engine that some OpenAI employees believed contained a then-unreleased version of GPT-4, OpenAI’s state-of-the-art large language model. Mr. Kokotajlo said he was told that Microsoft had not gotten the safety board’s approval before testing the new model, and after the board learned of the tests — via a series of reports that Bing was acting strangely toward users — it did nothing to stop Microsoft from rolling it out more broadly.
A Microsoft spokesman, Frank Shaw, disputed those claims. He said the India tests hadn’t used GPT-4 or any OpenAI models. The first time Microsoft released technology based on GPT-4 was in early 2023, he said, and it was reviewed and approved by a predecessor to the safety board.
Eventually, Mr. Kokotajlo said, he became so worried that, last year, he told Mr. Altman that the company should “pivot to safety” and spend more time and resources guarding against A.I.’s risks rather than charging ahead to improve its models. He said that Mr. Altman had claimed to agree with him, but that nothing much changed.
In April, he quit. In an email to his team, he said he was leaving because he had “lost confidence that OpenAI will behave responsibly” as its systems approach human-level intelligence.
“The world isn’t ready, and we aren’t ready,” Mr. Kokotajlo wrote. “And I’m concerned we are rushing forward regardless and rationalizing our actions.”
OpenAI said last week that it had begun training a new flagship A.I. model, and that it was forming a new safety and security committee to explore the risks associated with the new model and other future technologies.
On his way out, Mr. Kokotajlo refused to sign OpenAI’s standard paperwork for departing employees, which included a strict nondisparagement clause barring them from saying negative things about the company, on pain of losing their vested equity.
Many employees could lose out on millions of dollars if they refused to sign. Mr. Kokotajlo’s vested equity was worth roughly $1.7 million, he said, which amounted to the vast majority of his net worth, and he was prepared to forfeit all of it.
(A minor firestorm ensued last month after Vox reported news of these agreements. In response, OpenAI claimed that it had never clawed back vested equity from former employees, and would not do so. Mr. Altman said he was “genuinely embarrassed” not to have known about the agreements, and the company said it would remove nondisparagement clauses from its standard paperwork and release former employees from their agreements.)
In their open letter, Mr. Kokotajlo and the other former OpenAI employees call for an end to using nondisparagement and nondisclosure agreements at OpenAI and other A.I. companies.
“Broad confidentiality agreements block us from voicing our concerns, except to the very companies that may be failing to address these issues,” they write.
They also call for A.I. companies to “support a culture of open criticism” and establish a reporting process for employees to anonymously raise safety-related concerns.
They have retained a pro bono lawyer, Lawrence Lessig, the prominent legal scholar and activist. Mr. Lessig also advised Frances Haugen, a former Facebook employee who became a whistle-blower and accused that company of putting profits ahead of safety.
In an interview, Mr. Lessig said that while traditional whistle-blower protections typically applied to reports of illegal activity, it was important for employees of A.I. companies to be able to discuss risks and potential harms freely, given the technology’s importance.
“Employees are an important line of safety defense, and if they can’t speak freely without retribution, that channel’s going to be shut down,” he said.
Ms. Held, the OpenAI spokeswoman, said the company had “avenues for employees to express their concerns,” including an anonymous integrity hotline.
Mr. Kokotajlo and his group are skeptical that self-regulation alone will be enough to prepare for a world with more powerful A.I. systems. So they are calling for lawmakers to regulate the industry, too.
“There needs to be some sort of democratically accountable, transparent governance structure in charge of this process,” Mr. Kokotajlo said. “Instead of just a couple of different private companies racing with each other, and keeping it all secret.”