Artificial Intelligence is making waves in the legal industry by offering new ways to handle legal services. But so far there’s been a noticeable lack of research on how well AI, especially Generative AI and Large Language Models, can handle the actual work that lawyers and law firms regularly do.
To fill this gap, a new study has taken a look at how well these AI technologies – specifically LLMs (Large Language Models) such as OpenAI’s ChatGPT, Anthropic’s Claude, and Google’s PaLM – perform in real legal scenarios compared to human lawyers, especially on routine tasks that are often outsourced or given to less experienced attorneys.
The goal was to see whether AI can match, or even surpass, the work of junior lawyers in both quality and efficiency.
And the TL;DR answer is that yes, it can.
The study, called “Better Call GPT, Comparing Large Language Models Against Lawyers,” was written by a team of researchers from Onit’s AI Center of Excellence in Auckland, New Zealand (Onit is a company that provides software and advice for legal, compliance, and other corporate departments).
The paper was published on January 24, 2024, on arXiv, a repository maintained by Cornell University that hosts preprints that have not yet been peer-reviewed.
Looking at accuracy, speed, and cost
The study focused on three key questions: whether AI can beat junior lawyers and legal outsourcers in terms of accuracy and thoroughness at finding and identifying issues in contracts; whether AI can review contracts faster than its human counterparts; and whether AI can review contracts more affordably than humans.
Ten complex, real-world procurement contracts were anonymized for the study. The contracts spanned both the US and New Zealand legal systems to ensure broad relevance.
Senior lawyers established “ground truth data” by assessing those contracts against predefined standards, and the amount of time they spent reviewing the contracts was recorded.
These benchmark figures were then compared to the performance of AI models, junior lawyers, and Legal Process Outsourcers (LPOs); LPOs are firms that handle routine legal tasks, often based in regions with lower labor costs.
The study also took into account the hourly rates for various legal professionals, as well as the costs of using the various AI models.
Results: cost reduction of about 99.97% for some routine legal tasks
The findings showed that the AI models (especially OpenAI’s GPT-4-1106 model) and LPOs were the most precise in identifying possible problems with the contracts, such as incorrect contract terms, ambiguities, or compliance issues that could result in disputes or liabilities. Both the AI models and the LPOs outperformed the junior lawyers.
In terms of speed, LLMs were significantly faster than both junior lawyers and LPOs. The junior lawyers took about 56 minutes per document, and LPOs more than three hours per document.
In contrast, GPT-4-1106 took an average of 4.7 minutes, and PaLM 2 less than one minute. “This speed difference suggests a major shift in contract review efficiency,” the authors write, “where LLMs can analyze hundreds of contracts in the time it takes a human to review one.” And the AI can also work on multiple contracts at once.
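To put those review times side by side, here is a small back-of-the-envelope sketch. It uses only the figures quoted above; because the article gives "more than three hours" for LPOs and "less than one minute" for PaLM 2, those two values are treated as bounds (180 minutes and 1 minute), so the ratios that involve them are minimums rather than exact results. The function and variable names are illustrative and do not come from the study.

```python
def speedup(human_minutes: float, model_minutes: float) -> float:
    """How many times faster the model is than the human reviewer."""
    return human_minutes / model_minutes

# Per-document review times quoted in the article (minutes).
JUNIOR_LAWYER = 56.0
GPT_4_1106 = 4.7
# Bounds only: "more than three hours" and "less than one minute".
LPO_LOWER_BOUND = 180.0
PALM_2_UPPER_BOUND = 1.0

print(f"Junior lawyer vs GPT-4-1106: about {speedup(JUNIOR_LAWYER, GPT_4_1106):.1f}x")
print(f"Junior lawyer vs PaLM 2: at least {speedup(JUNIOR_LAWYER, PALM_2_UPPER_BOUND):.0f}x")
print(f"LPO vs PaLM 2: at least {speedup(LPO_LOWER_BOUND, PALM_2_UPPER_BOUND):.0f}x")
```

On these numbers, the "hundreds of contracts" comparison in the authors' quote corresponds to the fastest model measured against the slowest human reviewers, not to every pairing.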
And the cost savings are even more extreme. Based on the amount of time needed and the hourly rate, the “benchmark” senior lawyers would cost about $76 per document, and the junior lawyers $74 (their hourly rates are lower, but they are also considerably slower than their senior colleagues).
But using GPT-4-1106 costs only $0.25 per document, and Claude 2.0 only about $0.02 per document. In other words, using a lower-cost AI instead of a junior lawyer results in a cost reduction of about 99.97%.
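The 99.97% figure can be checked from the per-document costs quoted above. The sketch below is only a worked example of that arithmetic, assuming the rounded dollar amounts given in this article; the function and variable names are illustrative and are not taken from the study.

```python
def cost_reduction_pct(baseline_cost: float, new_cost: float) -> float:
    """Percentage saved when moving from baseline_cost to new_cost."""
    return (1 - new_cost / baseline_cost) * 100

# Per-document costs quoted in the article (US dollars).
JUNIOR_LAWYER = 74.00
GPT_4_1106 = 0.25
CLAUDE_2 = 0.02

print(f"Claude 2.0 vs junior lawyer: {cost_reduction_pct(JUNIOR_LAWYER, CLAUDE_2):.2f}%")
print(f"GPT-4-1106 vs junior lawyer: {cost_reduction_pct(JUNIOR_LAWYER, GPT_4_1106):.2f}%")
```

With these inputs, the 99.97% reduction corresponds to the cheapest model cited here (Claude 2.0); the same calculation for GPT-4-1106 gives roughly 99.66%.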
What does this mean for the legal profession in the future?
Of course, these developments represent a potential goldmine for law firms, which can use AI to scale up massively without a major increase in expenses. But for the general population, AI will also make legal help more affordable and accessible.
The study suggests that AI is poised to significantly disrupt the legal industry, in particular for LPOs and junior lawyers.
This could lead firms to shift their business model, focusing less on traditional bread-and-butter jobs like contract review and more on managing AI technology and quality control. In this way, AI could give early adopters a competitive advantage and trigger an industry-wide race to integrate this technology.
As the authors conclude, this study shows that “LLMs are not only viable but are superior tools for legal contract review over junior lawyers and LPOs. They can deliver accurate results at a fraction of the time and cost required by traditional human-based review.”
“This advantage,” they write, “is a game-changer in terms of productivity and turnaround times for contract review.”
Study details
- Title: “Better Call GPT, Comparing Large Language Models Against Lawyers”
- Authors: Lauren Martin, Nick Whitehouse, Stephanie Yiu, Lizzie Catterson, Rivindu Perera
- Available online: January 24, 2024
- ePrint: 2401.16212