What’s on bankers’ wish lists for generative AI?
When Google earlier this month began rolling out AI Overview, a search feature that places AI-generated answers above the actual results, people immediately began testing it. Among the answers that quickly went viral were advice to eat rocks and a suggestion to put glue in pizza.
In a more serious example, in March Public Citizen released a report warning that AI-powered mushroom identification apps can mislead foragers about which wild mushrooms are safe to eat.
Cases like these explain why one of the top items on many bankers’ wish lists as they assess advanced AI models is reliability. In a recent survey conducted by Arizent, American Banker’s parent company, 80% of bank executives said they were very or somewhat concerned about the risk of nonsensical or inaccurate information (e.g., hallucinations, misinformation) in their generative AI deployments.
If he could have a generative AI model built from scratch to his specifications, Sathish Muthukrishnan, chief information, data and digital officer at Ally Financial, said it would have system prompts telling the model to behave like a bank, and it would be able to access only data generated specifically for Ally.
“We would tell the model not to create answers, but just say, ‘No, I don’t know the answer’ if it doesn’t know,” he said. “For a period of time, you always have to have a human in the middle that ensures the output coming from the model is not made up or coming from a place of hallucination.”
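In practice, that wish list translates into familiar engineering patterns: a restrictive system prompt, an explicit refusal path and a human review gate. The Python sketch below is purely illustrative; the prompt wording and the call_model and reviewer callables are placeholders, not a description of Ally's actual system.

```python
# Illustrative guardrails: a bank-specific system prompt, an explicit
# refusal path and a human-in-the-middle review step. `call_model` and
# `reviewer` are hypothetical stand-ins for whatever LLM API and review
# workflow a bank actually uses.

SYSTEM_PROMPT = """You are an assistant for a regulated U.S. bank.
Answer only from the approved documents supplied in the context.
If the context does not contain the answer, reply exactly:
"I don't know the answer." Never guess or invent information."""

def answer_with_review(question: str, context: str, call_model, reviewer) -> str:
    """Generate a draft answer, then route it through a human before release."""
    draft = call_model(system=SYSTEM_PROMPT,
                       user=f"Context:\n{context}\n\nQuestion: {question}")
    if draft.strip() == "I don't know the answer.":
        return draft  # an honest refusal needs no review
    # Human in the middle: a person approves every substantive answer.
    return draft if reviewer(question, draft) else "Escalated to a human agent."
```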
In the examples above, Google AI Overview (based on Google’s Gemini large language model) did not make up answers, but drew them from dubious sources on the internet.
The advice to eat rocks came from an April 2021 satirical article from The Onion: “Geologists recommend eating at least one small rock per day.” The suggestion to put glue in pizza came from an 11-year-old Reddit post that was clearly a joke.
“Some of these models, including the non-OpenAI ones, offer a response that looks good, and then you realize it’s just copied something off the internet nearly verbatim, and it’s not necessarily a great source,” said Gilles Ubaghs, strategic advisor at Datos Insights. “It’s important for any organization to be able to cross check that: Where is this coming from? Is this verifiable data? Is this just lifted from some salesy website somewhere for some vaporware company? Having that traceability is important.”
One thing several banks, including Ally, insist on to prevent hallucinations and nonsense answers is that large language models be programmed to provide the source for every piece of information they supply.
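That requirement can be enforced mechanically. Here is a minimal sketch, assuming a retrieval-augmented setup in which every approved passage carries a document ID; the retrieve and call_model functions are hypothetical stand-ins for a bank's retrieval layer and LLM API.

```python
# Minimal source-attribution enforcement: answers that don't cite a
# known document ID are rejected outright. `retrieve` and `call_model`
# are placeholders, not a specific vendor's API.
from dataclasses import dataclass

@dataclass
class Passage:
    doc_id: str   # identifier of an approved internal document
    text: str

def grounded_answer(question: str, retrieve, call_model) -> str:
    passages = retrieve(question)  # searches the bank-approved corpus only
    context = "\n".join(f"[{p.doc_id}] {p.text}" for p in passages)
    answer = call_model(
        f"Answer using ONLY the passages below, citing the [doc_id] of "
        f"every passage you rely on.\n{context}\n\nQuestion: {question}")
    # Reject any answer that cites no verifiable source.
    if not any(f"[{p.doc_id}]" in answer for p in passages):
        raise ValueError("Answer rejected: no verifiable source cited.")
    return answer
```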
The apps serving up AI-generated mushroom identifications, meanwhile, are apparently making mistakes.
“Experienced local foragers know there is no substitution for finding, seeing, smelling, touching, and, sometimes, tasting wild mushrooms where and when they appear,” Public Citizen wrote in its report.
Customization
In addition to rock-solid reliability, Muthukrishnan’s dream generative AI model would be tailor-made to Ally Bank.
“Today, banks are struggling to differentiate from each other,” he said. “Bank A is similar to Bank B, which is similar to Bank C, providing the same set of financial tools for customers. I would love a generative AI model built specifically for Ally customers that understands how they have engaged with Ally, what their expectations of their future would be, and be able to serve them in a manner that everybody feels as if they are our only customer.”
Such a model would be trained on all of the interactions, behaviors, experiences, and transactions that every customer has had with Ally.
“The superpower resides in understanding how clusters of customers engage with Ally, but being able to create an experience that is very specific to one single customer,” said Muthukrishnan. “So the data set would enable us to build a model to provide an experience to that single customer that doesn’t match what we provide to someone else.”
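The "clusters of customers, experience of one" idea has a straightforward shape in code: segment customers on engagement features, then condition the model on both the segment and the individual's own history. The sketch below uses synthetic data and assumed feature names; it illustrates the pattern, not Ally's model.

```python
# Segment customers on engagement features, then build a per-customer
# context that blends segment-level patterns with individual history.
# The features and cluster count are assumptions for illustration.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic engagement features: [logins per month, products held, support calls]
features = rng.random((500, 3))
segments = KMeans(n_clusters=5, n_init=10, random_state=0).fit(features)

def personalization_context(customer_idx: int, history: list) -> str:
    """Combine what similar customers do with this customer's own history."""
    seg = segments.labels_[customer_idx]
    return (f"Customer segment: {seg} (patterns shared with similar customers)\n"
            f"This customer's recent interactions: {history[-5:]}")
```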
Many banks would like to train large language models on their own data, Ubaghs at Datos Insights said, but much of that data sits in separate silos. At the same time, the banks don't want to share their sensitive data with a company like OpenAI.
“This is why they’re working with companies like Microsoft, Oracle and Amazon Web Services,” which protect data within their cloud environments, Ubaghs said.
Data privacy and security
Data security and data privacy are, unsurprisingly, top concerns for bankers as they deploy generative AI. In Arizent’s research, 76% of bankers said they were very or somewhat concerned about exposing customer data through the use of generative AI; 73% said they were very or somewhat concerned about providing another point of vulnerability for hackers and cyberattacks.
“Is the bank enterprise data going to be absorbed by OpenAI? Are they going to be using that very, very sensitive bank information? This is something I keep hearing time and time again from all the banks I’m speaking to,” Ubaghs said. “Everyone’s extremely nervous about sharing their data with the large language models and OpenAI always comes up as the prime example. I don’t think people trust it, which may be problematic from a banking point of view.”
The deals OpenAI has made with
As a result, some banks are not interested in working with OpenAI directly, but want to go through infrastructure and cloud providers like Microsoft, Amazon or Oracle, Ubaghs said.
“They want that intermediary and that guidance,” Ubaghs said.
Muthukrishnan said his ideal generative AI model would not be allowed to share Ally data with the rest of the world.
“We would protect the customer data, making sure there is no data loss or exposure of privacy data to the external world,” he said.
The right to be forgotten
Another coveted feature for Muthukrishnan and others is the ability to digitally erase all prompts and interactions with the system.
“We want the model to take the input and give us an output, and then forget it ever happened,” Muthukrishnan said. “So there is no persistence within the model and the model is not learning from Ally data. We would take it upon ourselves to create the fabric and stitch the story behind how these interactions happened, and use it for further interactions to refine the output, but not allow the model to become more intelligent with what we are doing with it.”
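Architecturally, that amounts to calling the model statelessly and keeping any memory in the bank's own systems. A sketch of the pattern follows, with a hypothetical call_model function and an illustrative store flag:

```python
# "Forget it ever happened": each call sends only the current prompt,
# and any record of the interaction lives in the bank's own store, not
# with the model provider. `call_model` and its `store` flag are
# illustrative placeholders, not a specific vendor's API.

def stateless_query(prompt: str, call_model, bank_store: list) -> str:
    # No conversation history is sent and no training signal is retained.
    output = call_model(prompt, store=False)
    # The bank stitches the story itself, to refine later interactions.
    bank_store.append({"prompt": prompt, "output": output})
    return output
```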
At EY, which has built the largest private, secure generative AI environment in the world (it's used by more than 275,000 employees), one requirement the company's risk management team had was that the generative AI system could not retain any information about prompts, queries or responses.
“Their view was that we can’t have hundreds of thousands of people doing things and leaving data laying around all over the place,” said John Thompson, global head of artificial intelligence.
At one point, Thompson realized that Microsoft's abuse monitoring system was storing data. (EY's generative AI environment runs within Microsoft Azure.) He had to ask Microsoft to delete that data. Otherwise, he said, he and his team were able to rely on Microsoft's data privacy and security protections.
Protection against bias
Large language models can also pick up racist or misogynistic attitudes from the sites they were trained on, such as Reddit and X (formerly Twitter). They could potentially pick up similar attitudes from a bank's own data as well, such as transcripts of past customer service calls and records of past loan decisions.
But generative AI can also be used to detect bias, for instance while summarizing customer service calls. “It can do a live transcription, pick out the highlights, do those summaries very quickly, so you can get a real-time view of what’s going on,” Ubaghs said. That real-time view could surface bias or unfair treatment as it happens.
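A sketch of what such a check might look like, with the prompt wording and the call_model function as assumptions rather than any vendor's actual product:

```python
# Summarize a call transcript and flag possible unfair treatment.
# The prompt text is illustrative; `call_model` is a placeholder.

REVIEW_PROMPT = """Summarize this customer service call in three bullet points.
Then answer: did the agent treat the customer in a way that could indicate
bias or unfair treatment? Reply YES or NO, with a one-sentence reason.

Transcript:
{transcript}"""

def review_call(transcript: str, call_model) -> str:
    return call_model(REVIEW_PROMPT.format(transcript=transcript))
```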
Ally has a model governance and testing team that vets its models for bias.
“We ensure bias doesn’t creep in and there is no model drifting that happens,” Muthukrishnan said. “From a generative AI standpoint, you can ensure that happens by looking at the outcomes and the behavior of the model as opposed to the data that is input into it.”
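Outcome-based testing of that kind can be quite simple in principle. The sketch below applies a four-fifths-rule-style check to model decisions; the threshold and data shape are illustrative, and a production governance test would be far more rigorous.

```python
# Outcome-based bias check: compare approval rates across groups in the
# model's decisions, flagging any group whose rate falls below 80% of
# the highest (in the spirit of the four-fifths rule). Illustrative only.
from collections import defaultdict

def approval_rates(decisions):
    """`decisions` is a list of (group_label, approved_bool) pairs."""
    totals, approvals = defaultdict(int), defaultdict(int)
    for group, approved in decisions:
        totals[group] += 1
        approvals[group] += approved
    return {g: approvals[g] / totals[g] for g in totals}

def flags_disparity(decisions, threshold=0.8) -> bool:
    rates = approval_rates(decisions)
    return min(rates.values()) < threshold * max(rates.values())
```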
Flexibility
Another item on many banks’ wish lists is flexibility, so they don’t get locked into one provider.
One example is Amazon Bedrock, a managed service for generative AI that offers a choice of foundation models rather than tying customers to a single one.
“So you’re not locked into OpenAI or Cohere or whatever it may be, into perpetuity, and you actually have the ability to swap those out and change to a different model,” Ubaghs said. “I think that sort of structure is going to become increasingly critical.”
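With Bedrock's Converse API, for instance, swapping foundation models is a one-argument change. The model IDs below are examples whose availability varies by region and account; this is a sketch of the pattern, not a recommendation of particular models.

```python
# One interface, many foundation models: Amazon Bedrock's Converse API.
import boto3

bedrock = boto3.client("bedrock-runtime")

def ask(model_id: str, prompt: str) -> str:
    response = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]

# Switching providers means changing an argument, not re-architecting:
# ask("anthropic.claude-3-haiku-20240307-v1:0", "Summarize this policy.")
# ask("cohere.command-r-v1:0", "Summarize this policy.")
```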