Artificial Intelligence is already deceiving us and could have unforeseen consequences for humans
Artificial Intelligence (AI) is already deceiving us to get its own way, a study has found.
AI is a rapidly growing field that holds promise for revolutionising modern technologies, but scientists are concerned the powerful tools could develop unforeseen skills with severe consequences for human society.
MIT scientists reviewed data and studies on a range of AI models and found computers are adept at bluffing in poker, deceiving people and using underhanded methods to get the upper hand in financial negotiations.
The authors warn that regulation is needed to stop the burgeoning technologies from developing these skills, which are unintended consequences of many programmes.
If makers do not address the deception, it is possible AI could be used to commit fraud, alter elections, interfere with politics, and aid terrorist recruitment.
“AI systems are already capable of deceiving humans,” the authors wrote in their study, published in the journal Patterns.
“Deception is the systematic inducement of false beliefs in others to accomplish some outcome other than the truth.
“Large language models and other AI systems have already learned, from their training, the ability to deceive via techniques such as manipulation, sycophancy, and cheating.”
The study found that Meta’s AI system called Cicero, which ranks among the top 10 per cent of human players of the strategy game Diplomacy, was adept at deploying furtive tactics.
Study author Dr Peter S Park called the technology, built by Facebook’s parent company, “a master of deception”.
Win honestly
“While Meta succeeded in training its AI to win in the game of Diplomacy … [it] failed to train it… to win honestly,” he added.
The review article also found that different AI models were able to bluff in Texas hold ’em poker, to fake attacks in another strategy game, and to misrepresent their true preferences to come out on top when bartering.
“AI developers do not have a confident understanding of what causes undesirable AI behaviours like deception,” said Dr Park, an AI existential safety postdoctoral fellow at MIT.
“But generally speaking, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well at the given AI’s training task. Deception helps them achieve their goals.”
The scientists caution that humans do not yet have ample protections in place to prevent AI from going rogue, and that the threat of deception from AI will only increase as the technology matures.
The study identifies four types of societal risk from AI: persistent false beliefs, as AI reinforces misconceptions; political polarisation; enfeeblement, as humans cede ever more power and authority to AI; and nefarious management decisions, if AI is given managerial responsibilities within companies.
Meta was contacted for comment.