ElevenLabs Launches Generative AI Text-to-Sound-Effects Tool



Generative AI audio startup ElevenLabs has released a new tool that turns text prompts into sound effects. Users describe the sound they want to hear, including short instrumental pieces of music, and the tool generates audio to fit an accompanying video or audio track. The new feature augments ElevenLabs’ existing toolkit, known mainly for realistic voice cloning that has been used for both benign and deceptive purposes.

“In the last year, we revolutionized AI Voices by producing the first truly emotive, human-like Text to Speech platform. Text to Sound Effects marks another major step forward as we equip creators with all of the audio tools they need to produce high quality content,” ElevenLabs head of growth Sam Sklar explained in a blog post. “The tool has been designed to help creators—including film and television studios, video game developers, and social media content creators—to generate rich and immersive soundscapes quickly, affordably and at scale.”

ElevenLabs is offering the new sound effect creation tool to all of its users, with a monthly limit of 10,000 characters for those at the free level. The company’s usage page counts each second of a sound effect as about 40 characters, with a default-duration clip using 200 characters. At that rate, free-tier users could produce about 50 default-length sound effects per month. The other caveat is that free-tier users have to credit elevenlabs.io for the sound in anything published using it.
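The free-tier arithmetic can be checked with a short Python sketch. The function and constant names are hypothetical and used only for illustration; the figures are the ones quoted from the usage page, and the implied clip length is derived from them rather than stated by ElevenLabs.

```python
# Figures quoted in the article; the names are illustrative assumptions.
FREE_TIER_MONTHLY_CHARS = 10_000  # free-tier monthly character limit
CHARS_PER_SECOND = 40             # approximate cost of one second of sound
DEFAULT_CLIP_CHARS = 200          # cost of a default-duration clip


def clips_per_month(monthly_chars: int, chars_per_clip: int) -> int:
    """Number of whole clips that fit in the monthly character allowance."""
    return monthly_chars // chars_per_clip


def clip_seconds(chars_per_clip: int, chars_per_second: int) -> float:
    """Clip length in seconds implied by its character cost."""
    return chars_per_clip / chars_per_second


if __name__ == "__main__":
    print(clips_per_month(FREE_TIER_MONTHLY_CHARS, DEFAULT_CLIP_CHARS))
    print(clip_seconds(DEFAULT_CLIP_CHARS, CHARS_PER_SECOND))
```

With the quoted numbers this gives 50 clips per month, and it implies a default clip of roughly five seconds, assuming the 40-characters-per-second rate applies uniformly.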

ElevenLabs utilized Shutterstock’s audio library, which contains licensed tracks, to train its model for this sound generation tool. The tool has already been tested in an alpha phase by various professionals, including video game developers, film producers, social media content creators, and marketers. This group of early adopters has provided valuable feedback, helping refine the tool before its release. The startup made a point of emphasizing that sound effects cannot violate the content and uses policy, though any audio guardrails will likely need regular updates.

“We’re excited to be partnering with ElevenLabs to fuel yet another significant innovation in AI, Text to Sound Effects, with our ethically-sourced data,” Shutterstock chief enterprise officer Aimee Egan said. “The combined power of our rich and immersive library of tracks and this cutting-edge audio technology has enabled the creation of a true market first. We’re thrilled by the positive feedback from the early access community and look forward to seeing the wide array of projects they will create.”

The new feature will likely only raise the demand for ElevenLabs’ services, which led to an $80 million funding round at the beginning of the year. The startup is already famous for its fidelity to real voices. Pakistan’s former Prime Minister Imran Khan used ElevenLabs to deliver speeches from prison, both during the campaign and in a victory speech. Robocalls to New Hampshire voters earlier this year also used ElevenLabs to make a deepfake version of President Biden in an attempt to suppress turnout in the state’s primary election, a use that is against ElevenLabs’ own rules. An investigation traced the calls back to a telecom provider, Lingo, which transmitted them on behalf of Life Corporation. The FCC issued a cease-and-desist order over the calls, followed by an outright ban on deepfake robocalls. The incident has also prompted a rush to build deepfake detectors, both from companies like Pindrop and internally at ElevenLabs and others.
