Artificial Intelligence Has Read Everything On The Internet But Remains Hungry For More Data
The powers that be have tried their best to convince us that AI tech is nothing like what we’ve seen in science fiction movies and that there is nothing to fear.
When you hear that it’s consumed everything we have and is still hungry for more, well…the parallels draw themselves.
AI companies are getting worried that, as they build bigger and better models, there won’t be enough data out there on the web to train them.
Some companies are looking for alternative sources of training data, with things like video transcripts and “synthetic data” on the list.
That last one is generated by AI, and no one knows what will happen if we let the technology basically train itself.
I have to think it’s nothing good.
Early research suggests that training an AI model on AI-generated data would ultimately lead to “model collapse,” where the quality of the model’s output steadily degrades as errors compound with each generation.
Some companies have claimed they can create higher-quality synthetic data, but haven’t been forthcoming on what that would actually look like.
One company, DatologyAI (formed by ex-Meta and Google DeepMind researcher Ari Morcos), is looking into ways to train larger and smarter models with less data.
Other approaches are more controversial, like training on transcriptions of public YouTube videos.
Researchers have seen the specter of AI running out of data looming for some time. Pablo Villalobos estimated that AI will run out of usable data within the next year or two, but he doesn’t seem worried.
“The biggest uncertainty is what breakthrough you’ll see.”
Or, you know, companies could stop trying to create those bigger and better models, since there’s a training data shortage, along with other issues like excessive energy use.
But I really don’t see that happening.
If you enjoyed that story, check out what happened when a guy gave ChatGPT $100 to make as much money as possible, and it turned out exactly how you would expect.