Paul Weiss Assessing Value of AI, But Not Yet on Bottom Line
For a world that bills by the hour, generative AI is still getting a lot of leeway before having to prove a return on the investment.
The artificial intelligence boom has many predicting a transformation in how lawyers work. But law firms are largely still testing the software and moving carefully as they figure out how to navigate generative AI’s well-publicized risks, including the accuracy problems known as hallucinations and threats to confidential client data.
At Wall Street firm Paul Weiss, AI testing has included nearly a year and a half of using the legal assistant tool known as Harvey. Even now, the firm isn’t using hard metrics like time saved to evaluate the program. The importance of reviewing and verifying the accuracy of the output, including checking the AI’s answers against other sources, makes any efficiency gains difficult to measure.
Instead, the firm is looking at a more subjective measure: how much lawyers like using it.
Harvey, for example, is good at helping experienced lawyers come up with ideas, said Gina Lynch, Paul Weiss’ chief knowledge and innovation officer. It’s not the tool the firm tells its attorneys to turn to for factual answers, yet in the hands of a seasoned attorney—someone who already knows the right answer to the question—the software helps brainstorm new ways to approach a problem.
The stronger guardrails that legal research tools need to avoid hallucinations, for example, also mean they can’t be as creative, she said. And what some of the firm’s lawyers liked most about Harvey “was its value for ideation and open-ended nature.”
That’s a fitting description of how the rubber is meeting the road more than a year into the hype that began when ChatGPT put “generative AI” on every lawyer’s mind. There are a host of programs: some tailored for discrete legal tasks that might show a clear ROI quickly, and some that, even after months of testing, leave the profession largely unsure of how to measure the value they bring to a law firm or a corporate customer.
In many cases, businesses are looking to well-resourced firms—Paul Weiss had revenue of $2 billion last year, according to The American Lawyer, and counts over 1,000 lawyers on staff—to get some understanding of how to use and evaluate the new software, and which products are worth using. The legal tech market will expand to $50 billion by 2027, according to researcher Gartner Inc., and venture capitalists have flooded dozens of companies with cash for marketing campaigns.
Harvey has been among the most notable. The startup, founded in 2022, has received more than $100 million in funding to date, including from OpenAI. Its customers include PwC and A&O Shearman. Paul Weiss offers one example of how a firm is putting the tool to use, and what lawyers are still figuring out about how generative AI will become part of their daily work.
“Clients don’t have a way to properly evaluate firms right now on the basis of their gen AI products,” said Chris Audet, chief of research in Gartner’s legal, risk and compliance leaders practice. Most firms are still testing out the technology, and aren’t yet ready to market their generative AI offerings to clients based on return on investment, or the effects on costs or volume of work.
“Most clients understand that Gen AI right now is really just something that they’re looking for that firm or that vendor to have on a roadmap,” Audet said.
Bloomberg Law sells legal research tools and software, including some that make use of GenAI.
One Firm’s Take
Paul Weiss Rifkind Wharton & Garrison started testing Harvey in January 2023 and became a client later that year. The firm has also piloted CoCounsel and Lexis+ AI, and licenses AI drafting tool Draftwise and review platforms Macro and Kira. It is currently testing tools that can help with AI data extraction, said Iris Skornicki, director of knowledge solutions.
Skornicki said that when she first started using legal AI, she was surprised to find the models are fairly inconsistent: The same query, asked a few minutes apart, might get a slightly different response. For some use cases, that’s the biggest limiting factor. Lawyers don’t like unpredictability, Lynch said.
Katherine Forrest, a partner and chair of the firm’s digital technology group, asked Harvey to help her come up with the questions that a panel of judges on the Ninth Circuit was likely to ask her in an upcoming argument, based on public documents. She gave the tool the names of the judges on the appellate panel, and asked it for the most likely questions each judge would ask, the hardest questions the tool could come up with, and fact-based answers.
The model gave her “remarkably accurate questions” that would have taken a mid-level associate weeks to compile, she told Bloomberg Law in March.
For now, the value from AI will largely come from “low-hanging fruit” non-billable work and from applying it to practice groups that benefit from efficiency, Lynch said. The tools are useful for more straightforward tasks such as summarizing documents, including complaints, decisions, settlement agreements, legislation, public merger agreements, credit agreements, and SEC filings, Skornicki said.
The AI can create summaries that mimic the style and format of documents written by lawyers, and the ability to summarize legal documents without using legalese has been helpful in preparing documents for clients. The firm has used Harvey for tasks its clients request, such as summarizing specific governance terms in deal documents in a tone aimed at business professionals.
The firm also collaborated with Harvey on creating custom workflows—for example, generating the first markup of a document or uploading the document and asking the tool for an issues list, Skornicki said.
The technology isn’t replacing the human in a due diligence process, but it is making the process faster and more thorough. For example, clients used to ask for a review of a large sample set of contracts. With technology, the firm can now review all the contracts instead of just a sample.
“Historically, legal tech has often been seen as most relevant for associates doing research and document review,” and Harvey does some of those tasks, said Winston Weinberg, the company’s co-founder and CEO. But it was also designed to “surface new ideas,” he added.
The company said it’s rolling out the product to smaller and mid-size firms as it moves out of beta and toward broader commercial availability.
Accuracy Problems
Many legal tech vendors are addressing generative AI’s accuracy problems with a technology called retrieval-augmented generation, or RAG, which points the AI to a specific, limited set of information.
But RAG isn’t perfect, Skornicki said. In its experiments, the firm found that tools using RAG sometimes pull an answer from the document when asked to rely on their general knowledge, or vice versa. And sometimes the AI still hallucinates, even when given a discrete data set.
RAG also allows client work to be separated into discrete data sets, which protects clients’ confidentiality. For security, everything is deleted from the model within 24 hours.
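As a rough illustration of the idea, and not a description of how Harvey or any vendor actually implements it, the Python sketch below shows the two properties described above: retrieval is limited to one client's document set, and the model is told to answer only from the retrieved passages. The client names, document text, and function names are all hypothetical, and simple word overlap stands in for the more sophisticated search a real product would use.

```python
import re
from collections import Counter

# Hypothetical per-client document sets; all names and text are invented.
CLIENT_DOCS = {
    "client_a": [
        "The credit agreement requires quarterly financial reporting to lenders.",
        "The merger agreement includes a termination fee payable by the buyer.",
    ],
    "client_b": [
        "The settlement agreement releases all claims arising before the closing date.",
    ],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def retrieve(client_id, question, top_k=1):
    """Rank only this client's documents by word overlap with the question."""
    question_words = Counter(tokenize(question))
    scored = []
    for doc in CLIENT_DOCS[client_id]:
        overlap = sum((question_words & Counter(tokenize(doc))).values())
        scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no words with the question.
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(client_id, question):
    """Assemble the text that would be sent to a language model."""
    passages = retrieve(client_id, question)
    context = "\n".join(passages) if passages else "(no matching passage found)"
    return (
        "Answer using only the passages below. "
        "If they do not contain the answer, say so.\n\n"
        f"Passages:\n{context}\n\nQuestion: {question}"
    )
```

Calling build_prompt("client_a", "What is the termination fee in the merger agreement?") would place only the merger agreement passage in front of the model, and nothing from client_b's files. The sketch stops short of calling a model, which is where the hallucination risk Skornicki describes would still apply.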
Generative AI tools also have a limited context window, the amount of information they can consider at once as part of a query. A larger number of documents in the context window means the prompt will have to be simpler. It’s likely that these limitations will ease over time, Lynch said.
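The trade-off comes down to simple arithmetic, sketched below in Python under loud assumptions: the window size is an arbitrary number, the function names are hypothetical, and counting words is only a crude stand-in for the way real models count tokens.

```python
def rough_token_count(text):
    """Crude estimate: real tokenizers split text differently than whitespace."""
    return len(text.split())


def prompt_budget(context_window, documents, reserved_for_answer=0):
    """Return how many tokens remain for the instructions after documents are loaded."""
    used = sum(rough_token_count(doc) for doc in documents)
    # Never report a negative budget; zero means the documents alone overflow the window.
    return max(context_window - used - reserved_for_answer, 0)
```

Every document added shrinks the number returned by prompt_budget, which is why a query over many documents leaves room for only a short, simple instruction.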
Clients have asked the firm to demo legal tech products for them and are interested in the software the firm uses itself. Clients also tap into the firm’s experience running pilots of technology and are asking for advice on how to test and license tools, Skornicki said.
“This is the toughest environment we’ve ever had in terms of evaluating technology,” because the tech is changing so quickly, Lynch said.