Stability AI’s new music generator shows why musicians are angry about generative AI


Hello and welcome to Eye on AI.

Things are heating up in the world of AI-generated music. Stability AI—whose founder and CEO resigned late last month amid increasing turbulence at the company—unveiled Stable Audio 2.0, the latest version of its AI music generation model. The model lets users create songs using their own audio samples or by writing simple text prompts, and unlike the initial version that could only create 90-second clips, it can create full-length songs up to three minutes. The other difference with the 2.0 model, Stability AI told The Verge, is that its outputs actually sound like complete songs with differentiated intros, progressions, and outros.

If anyone’s excited about this, it’s not musical artists. The release comes just days after more than 200 artists came together to sign an open letter urging tech companies to cease creating AI technologies that “sabotage creativity and undermine artists, songwriters, musicians, and rightsholders.” Signatories include Billie Eilish, Nicki Minaj, Elvis Costello, Katy Perry, Smokey Robinson, Sheryl Crow, Pearl Jam, and the estates of Bob Marley and Frank Sinatra, among many other notable artists. 

“Unchecked, AI will set in motion a race to the bottom that will degrade the value of our work and prevent us from being fairly compensated for it,” reads the letter, which my colleague Chloe Berger also wrote about in Fortune prior to the news of Stable Audio 2.0’s release. “This assault on human creativity must be stopped. We must protect against the predatory use of AI to steal professional artists’ voices and likeness, violate creators’ rights, and destroy the music ecosystem.”

The music industry has a history of sometimes resisting new technologies. The electric guitar was initially met with skepticism from some musicians and audiences—partly because it had technical challenges at first, and partly because some questioned the untraditional sound it produced. The instrument, of course, grew to become wildly popular and ignited the creation of new genres like rock and roll. More recently, some electric guitar-playing rock musicians criticize EDM artists—who create music using Digital Audio Workstation software and other digital technologies—for not playing “actual instruments.”

Aside from the reception to new tools for making music, this moment has parallels to the Napster era, where file-sharing technology enabled people to download songs en masse for free and without any regard for copyright or compensation. This didn’t last for long, but the streaming model that followed further entrenched a system where individual artists make very little from the digital distribution of their music. Most musicians are now paid less than a penny per stream—and in some cases, they don’t make anything at all. Now generative AI seeks to take this a step further, using artists’ own music to design tools to do the very thing they do without them.

The issue of generative AI models being trained on copyrighted material without consent or compensation has been at the center not only of debates about AI, but also of some of Stability AI’s own high-level departures. Last November, the company’s former vice president for audio, Ed Newton-Rex, resigned after the release of Stable Audio over disagreement with the company’s stance that training generative AI models on copyrighted works constitutes “fair use.”

“Companies worth billions of dollars are, without permission, training generative AI models on creators’ works, which are then being used to create new content that in many cases can compete with the original works,” he wrote at the time in an op-ed explaining his resignation. “I don’t see how this can be acceptable in a society that has set up the economics of the creative arts such that creators rely on copyright.”

Copyright aside, musicians are rightfully worried the artistry of music will be lost to a sea of junk, as is already happening with various types of AI-generated content online. As I wrote earlier this year, in reaction to the widespread fear around OpenAI’s text-to-video model Sora, creating works like films and music is something people immensely enjoy. Many would go as far as to say music is what makes them feel alive, is their calling, or is their reason for living. It’s one thing to delegate organizing our emails or supercharging our spreadsheets to AI, but it’s a whole other thing to let AI into the driver’s seat of our passions.

Now, this technology generally isn’t very good (yet). The “rock song with a chorus that gets stuck in your head, a guitar solo, and lots of keys” I prompted Stable Audio 2.0 to make sounded like nails on a chalkboard. Its output for “a reggae song with slow verses and a more energized chorus” resembled a reggae tune, but it sounded as if it was being played on a warped vinyl record during a windstorm. Both sounded kind of disturbing and lacked any soul or feeling. Neither had anything resembling distinguishable parts. But, as we’ve seen, these models only tend to get better, and they’ve already come a long way in a short time.

While some pieces of AI-generated content will likely capture the public consciousness, I’m betting there will always be more demand for music we create to share with each other and that tells stories about our experiences as humans. The question becomes how to ensure AI doesn’t lead to the exploitation of artists and the further deterioration of the business model that supports their work.

And with that, here’s more AI news.

Sage Lazzaro
sage.lazzaro@consultant.fortune.com
sagelazzaro.com

AI IN THE NEWS

Google is considering charging for “premium” search features powered by AI. Company executives have engineers working on the technology to make generative AI search a paid service but have not yet made a final decision on whether or when to launch it, according to the Financial Times. The move to put part of the search experience behind a paywall would be a first for the company, which has always offered its search product for free and made $175 billion in revenue from search and related ads last year. The premium tier for Google One currently gives users access to the Gemini AI assistant in other Google products such as Gmail and Docs.

Israel has used AI to select individuals to target in air and missile strikes. That’s according to an investigative story by Israeli publications +972 and Local Call, which cited unnamed intelligence and military sources. The AI system, which is reportedly codenamed “Lavender,” identifies and tracks individuals believed to be militants and recommends places and times to launch strikes that are likely to kill them, often, according to the news reports, when they are at home with their families. Sources told the publications that the Israeli military made the decision to allow even individuals suspected of being low-level militants to be targeted using Lavender and that it dispensed with what used to be a lengthy human legal review process for strikes intended to kill individuals, instead allowing officers to approve strikes suggested by Lavender with minimal checks. The publications attributed the high civilian death toll, especially in the Gaza war’s early weeks, to the use of Lavender in this manner. The story would seem to confirm the worst fears of human rights campaigners and technologists who have for years argued against the increased use of AI in warfare and have worried that, despite claims the technology will lessen civilian casualties, it will actually have the opposite effect. Using AI in the manner described by the sources the publications spoke to could violate international law, according to experts. But an Israeli military spokesperson denied that the Israeli army had used AI in this way, saying that in all cases, a human intelligence analyst had to verify the targets were legitimate and that the conditions for an attack complied with Israeli military doctrine and international law.

Opera lets users download and run over 150 LLMs locally. That includes Meta’s Llama and Google’s Gemma, among models from more than 50 families, according to TechCrunch. The feature is first rolling out to users of Opera One, referring to the latest version of the Opera browser from Norway-based company Opera. To run the models locally on users’ computers, Opera said it’s using the Ollama open-source framework. Just be careful with your storage space if you want to download a bunch of the different offerings—Opera said each variant would take up more than 2GB of space on your local system.

OpenAI unveils new editing tools for DALL-E in ChatGPT. Available for paid users, the new editing capabilities allow users to refine images created by the model using a selection tool and text prompts. An example posted by the company on X depicts the user circling a poodle’s ears and typing “add bows” as an additional prompt, which causes bows to appear where indicated. For the DALL-E GPT, the company has also added the ability to select aspect ratios or styles like “motion blur” or “solarpunk.”

Cohere unveils new LLM geared towards enterprise adoption of AI. Cohere, the Canadian AI startup that has received funding from Nvidia, Oracle, and Salesforce, among others, has announced its latest large language model, called Command R+. Cohere says the model is optimized for RAG, or retrieval augmented generation. That’s the process in which an LLM takes a prompt, runs a search across a particular database (such as a business’s own proprietary data), and then bases its answer on the results of that search, rather than simply responding based on the correlations it learned during initial training. RAG is one way to reduce LLM hallucinations, and it is increasingly popular with enterprise developers. Cohere says the new model has been specifically trained to understand business terminology, and some users have claimed it beats both OpenAI’s GPT-4 Turbo model and Anthropic’s Claude 3.0 LLM on certain business-relevant benchmarks, according to a story in VentureBeat.
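
For readers who want to see the RAG flow in miniature, here is a rough Python sketch of the retrieve-then-answer loop described above. It is not Cohere's implementation: the documents, the word-overlap search, and the `echo_model` stand-in for an LLM are all hypothetical, chosen only so the example runs on its own.

```python
import re
from collections import Counter

# Hypothetical in-memory "database" standing in for a business's own data.
DOCUMENTS = [
    "Refunds are issued within 14 days of a returned purchase.",
    "Support hours are 9am to 5pm on weekdays.",
    "Enterprise plans include a dedicated account manager.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, top_k=1):
    """Toy search: rank documents by how many words they share with the query."""
    query_words = Counter(tokenize(query))

    def score(doc):
        return sum((query_words & Counter(tokenize(doc))).values())

    return sorted(documents, key=score, reverse=True)[:top_k]


def build_prompt(query, passages):
    """Put the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


def answer(query, documents, generate):
    """Retrieve first, then hand the grounded prompt to the model."""
    passages = retrieve(query, documents)
    return generate(build_prompt(query, passages))


def echo_model(prompt):
    # Placeholder for a real LLM call (an assumption, not a real API);
    # it returns the prompt unchanged so the flow is visible.
    return prompt


if __name__ == "__main__":
    print(answer("When are refunds issued?", DOCUMENTS, echo_model))
```

In a real deployment the word-overlap search would typically be swapped for a proper search index or embedding lookup, and `echo_model` for a call to an actual model. The point is only that the model answers from retrieved text rather than from memory alone.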

OpenAI, not to be outdone, releases fine-tuning tools for its GPT models, also aiming for business customers. OpenAI meanwhile has unveiled new tools to make it easier to fine-tune its GPT models on enterprise customers’ own data and to create custom GPTs. It says the new tools have enabled customers to achieve significant real-world results, reducing the number of tokens required for a response while also boosting quality, and thus helping to lower costs. OpenAI’s blog on the announcement contains a number of interesting case studies.

TrueMedia releases free tools for identifying fake and doctored media. That’s according to the New York Times. The organization was founded by AI researcher Oren Etzioni, a longtime AI optimist who a few years ago became one of the first to warn that a new breed of AI would turbocharge disinformation online. It sees the tools as an improvement over the current patchwork approach to identifying AI-generated images, audio, and video, while admitting detection tools aren’t foolproof. The organization plans to distribute the tools among journalists and fact-checkers, as well as anyone else trying to discern real from fake online in the lead-up to the election. TrueMedia’s homepage displays some of the latest and more notable deepfakes circulating online, as well as a countdown clock to the U.S. presidential election.

FORTUNE ON AI

The ‘Meta AI mafia’ brain drain continues with at least 3 more high-level departures —Sharon Goldman

$2.2 trillion Nvidia is colliding with Taiwan’s biggest earthquake in 25 years as its key chip supplier grapples with factory fallout —Sasha Rogelberg

TSMC shrugs off Taiwan’s biggest earthquake in 25 years, showing its massive chip foundry mega-complexes are nearly quake-proof —Sasha Rogelberg

Legacy TV enlists AI to figure out a show’s emotional vibe and add commercials that fit the mood —Rachyl Jones

‘Gone are the days of manually searching and scrolling through a list of applicants’: Indeed doubles down on AI to save hundreds of hours on recruiting —Irina Ivanova

The cost of training AI could soon become too much to bear —David Meyer

AI CALENDAR

April 15-16: Fortune Brainstorm AI London (Register here)

May 7-11: International Conference on Learning Representations (ICLR) in Vienna

May 21-23: Microsoft Build in Seattle

June 5: FedScoop’s FedTalks 2024 in Washington, D.C.

June 25-27: 2024 IEEE Conference on Artificial Intelligence in Singapore

July 15-17: Fortune Brainstorm Tech in Park City, Utah (Register here)

Aug. 12-14: Ai4 2024 in Las Vegas

EYE ON AI NUMBERS

37

That’s how many startups crossed the billion-dollar valuation mark in Q1. It’s the highest quarterly figure in a year and a 48% increase over Q4 2023, and we have AI to thank for the rebound, according to new data from PitchBook. Several of the newly minted unicorns are AI companies, including generative AI startups Perplexity, Mistral AI, and Celestial AI.




