AI

Scientists create ‘toxic AI’ that is rewarded for thinking up the worst possible questions we could imagine


The newest tool in the battle to prevent an artificial intelligence (AI) agent from being dangerous, discriminatory and toxic is another AI that is itself dangerous, discriminatory and toxic, scientists say.

The new training approach, based on machine learning, is called curiosity-driven red teaming (CRT) and relies on using an AI to generate increasingly dangerous and harmful prompts that you could ask an AI chatbot. These prompts are then used to identify how to filter out dangerous content.



Source

Related Articles

Back to top button