A Devil’s Bargain With OpenAI
Earlier today, The Atlantic’s CEO, Nicholas Thompson, announced in an internal email that the company has entered into a business partnership with OpenAI, the creator of ChatGPT. (The news was made public via a press release shortly thereafter.) Editorial content from this publication will soon be directly referenced in response to queries in OpenAI products. In practice, this means that users of ChatGPT, say, might type in a question and receive an answer that briefly quotes an Atlantic story; according to Anna Bross, The Atlantic’s senior vice president of communications, it will be accompanied by a citation and a link to the original source. Other companies, such as Axel Springer, the publisher of Business Insider and Politico, have made similar arrangements.
It does all feel a bit like publishers are making a deal with—well, can I say it? The red guy with a pointy tail and two horns? Generative AI has not exactly felt like a friend to the news industry, given that it is trained on loads of material without permission from those who made it in the first place. It also enables the distribution of convincing fake media, not to mention AI-generated child-sexual-abuse material. The rapacious growth of the technology has also dovetailed with a profoundly bleak time for journalism, as several thousand people have lost their jobs in this industry over just the past year and a half. Meanwhile, OpenAI itself has behaved in an erratic, ethically questionable manner, seemingly casting caution aside in search of scale. To put it charitably, it’s an unlikely hero swooping in with bags of money. (Others see it as an outright villain: A number of newspapers, including The New York Times, have sued the company over alleged copyright infringement. Or, as Jessica Lessin, the CEO of The Information, put it in a recent essay for this magazine, publishers “should protect the value of their work, and their archives. They should have the integrity to say no.”)
This has an inescapable sense of déjà vu. For media companies, the defining question of the digital era has simply been How do we reach people? There is much more competition than ever before—anyone with an internet connection can self-publish and distribute writing, photography, and videos, drastically reducing the power of gatekeepers. Publishers have to fight tooth and nail for their audiences. The clearest path forward has tended to be aggressively pursuing strategies based on the scope and power of tech platforms that have actively decided not to bother with the messy and expensive work of determining whether something is true before enabling its publication on a global scale. This dynamic has changed the nature of media—and in many cases degraded it. Certain types of headlines turned out to be more provocative to audiences on social media: thus the rise of “clickbait.” Google has filtered material according to many different factors over the years, resulting in spammy “search-engine-optimized” content that strives to climb to the top of the results page.
At times, tech companies have put their thumb directly on the scale. You might remember when, in 2016, BuzzFeed used Facebook’s livestreaming platform to show staffers wrapping rubber bands around a watermelon until it exploded; BuzzFeed, like other publishers, was being paid by the social-media company to use this new video service. That same year, BuzzFeed was valued at $1.7 billion. Facebook eventually tired of these news partnerships and ended them. Today, BuzzFeed trades publicly and is worth about 6 percent of that 2016 valuation. Facebook, now Meta, has a market cap of about $1.2 trillion.
“The problem with Facebook Live is publishers that became wholly dependent on it and bet their businesses on it,” Thompson told me when I reached out to ask about this. “What are we going to do editorially that is different because we have a partnership with OpenAI? Nothing. We are going to publish the same stories, do the same things—we will just ideally, I hope, have more people read them.” (The Atlantic’s editorial team does not report to Thompson, and corporate partnerships have no influence on stories, including this one.) OpenAI did not respond to questions about the partnership.
The promise of working alongside AI companies is easy to grasp. Publishers will get some money—Thompson would not disclose the financial elements of the partnership—and perhaps even contribute to AI models that are higher-quality or more accurate. Moreover, The Atlantic’s product team will develop its own AI tools using OpenAI’s technology through a new experimental website called Atlantic Labs. Visitors will have to opt in to using any applications developed there. (Vox is doing something similar through a separate partnership with the company.)
But it’s just as easy to see the potential problems. So far, generative AI has not resulted in a healthier internet. Arguably quite the opposite. Consider that in recent days, Google has aggressively pushed an “AI Overview” tool in its Search product, presenting answers written by generative AI atop the usual list of links. The bot has suggested that users eat rocks or put glue in their pizza sauce when prompted in certain ways. ChatGPT and other OpenAI products may perform better than Google’s, but relying on them is still a gamble. Generative-AI programs are known to “hallucinate,” confidently asserting things that aren’t true. They are black boxes, producing outputs that even their creators cannot fully explain. And they work by making inferences based on huge data sets containing a mix of high-quality material and utter junk. Imagine a situation in which a chatbot falsely attributes made-up ideas to journalists. Will readers make the effort to check? Who could be harmed? For that matter, as generative AI advances, it may destroy the internet as we know it; there are already signs that this is happening. What does it mean for a journalism company to be complicit in that act?
Given these problems, several publishers are making the bet that the best path forward is to forge a relationship with OpenAI and ostensibly work toward being part of a solution. “The partnership gives us a direct line and escalation process to OpenAI to communicate and address issues around hallucinations or inaccuracies,” Bross told me. “Additionally, having the link from ChatGPT (or similar products) to our site would let a reader navigate to source material to read the full article.” Asked about whether this arrangement might interfere with the magazine’s subscription model—by giving ChatGPT users access to information in articles that are otherwise paywalled, for example—Bross said, “This is not a syndication license. OpenAI does not have permission to reproduce The Atlantic’s articles or create substantially similar reproductions of whole articles or lengthy excerpts in ChatGPT (or similar products). Put differently, OpenAI’s display of our content cannot exceed their fair-use rights.”
I am no soothsayer. It is easy to pontificate and catastrophize. Generative AI could turn out to be fine—even helpful or interesting—in the long run. Advances such as retrieval-augmented generation—a technique that has a model look up relevant outside sources at query time and ground its answer in them, rather than relying on whatever it absorbed during training—might relieve some of the most immediate concerns about accuracy. (You would be forgiven for not recently using Microsoft’s Bing chatbot, which runs on OpenAI technology, but it’s become pretty good at summarizing and citing its sources.) Still, the large language models powering these products are, as the Financial Times wrote, “not search engines looking up facts; they are pattern-spotting engines that guess the next best option in a sequence.” There are clear reasons not to trust their outputs. For this reason alone, the apparent path forward offered by this technology may well be a dead end.
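For readers curious what retrieval augmentation actually involves, here is a bare-bones sketch in Python. Everything in it is hypothetical: the tiny corpus, the naive keyword-overlap scoring (production systems use embedding-based vector search), and the prompt wording are all invented for illustration, and no actual model is called.

```python
# A toy retrieval-augmented-generation (RAG) pipeline. The corpus, the
# overlap scoring, and the prompt wording below are invented for
# illustration; real systems use embedding search and a real model.

def retrieve(query: str, corpus: list[dict], k: int = 2) -> list[dict]:
    """Rank documents by naive keyword overlap with the query; keep the top k."""
    terms = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(terms & set(doc["text"].lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str, docs: list[dict]) -> str:
    """Assemble a prompt that asks the model to answer only from the
    retrieved sources and to cite them, the step meant to curb hallucination."""
    sources = "\n".join(f"[{d['source']}] {d['text']}" for d in docs)
    return (
        "Answer using only the sources below, citing each one you rely on.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {query}\nAnswer:"
    )

if __name__ == "__main__":
    corpus = [  # hypothetical documents standing in for licensed articles
        {"source": "example.com/ai-deal",
         "text": "Publishers signed licensing deals with AI companies."},
        {"source": "example.com/search",
         "text": "Search engines rank pages by many signals."},
    ]
    query = "What deals did publishers sign with AI companies?"
    prompt = build_prompt(query, retrieve(query, corpus))
    print(prompt)  # in a real pipeline, this prompt goes to a language model
```

The design point is the second step: the model is handed vetted source text and instructed to cite it, rather than asked to recall facts from its training data. Whether a given model reliably obeys that instruction is, of course, exactly the open question raised above.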