Deploying AI at a business costs money, and businesses obviously intend to get something positive from using it. Often, the goal is to make employees more efficient – to enable them to crank out more work in the same time. In some cases, businesses might also hope that using AI will improve their products or make the business easier for customers to interact with.
But what if AI makes things worse? Is that likely? If so, what could go wrong? And what are the legal risks arising from AI mistakes?
Generative AIs (“GenAIs” in tech lingo) such as ChatGPT, Google’s Gemini, or Anthropic’s Claude are prone to two kinds of errors – hallucinations and omissions.
GenAI and Hallucinations: The Two Certainties in Life (Alongside Spam, Death, and Taxes)
“Hallucinating” is a charitable way of describing when GenAI produces erroneous output. You’ve heard the old saw about know-it-all people: “often wrong but never in doubt.” Well, generative AIs are sometimes wrong but rarely in doubt. Occasionally they’ll admit, “I don’t know about that,” but more often they’ll give a confidently wrong answer. In contrast, most people will either admit when they don’t know the answer or hedge in a way others will recognize as uncertainty.
Due to the nature of GenAI technology, it will hallucinate sometimes. This largely happens because of the “outlier problem”: some prompts require knowledge of a subject in which the AI has had little training, or they string together subjects the AI has rarely, if ever, seen combined in training. When that happens, the GenAI can get wrongly “creative.” Current popular GenAIs don’t have sanity checkers – separate technology that checks the accuracy of the AI’s output, such as a linked calculator that would verify any math performed. That means the GenAI user – the human – must be the sanity checker. That person must find and fix hallucinations.
Errors of Omission: Employee Expertise Needed (And Now for Something Completely Forgotten!)
GenAIs also sometimes omit critical stuff from their outputs. For example, for the past few months, I’ve been querying both ChatGPT and Microsoft Copilot about a hypothetical copyright law scenario. Those systems have been identifying the right statutes and older case law, but they keep failing to mention a May 2023 Supreme Court case that changed the law. Their answers have not been hallucinations – they haven’t stated anything untrue – they just leave out that critical 2023 case.
Omissions are a big AI risk for businesses. Many people now know you need to check the accuracy of AI output because it can “hallucinate.” You can do so by asking the AI to cite its sources (sometimes it provides them without asking) and checking them. But checking those sources might not reveal that the GenAI omitted something important.
The way to guard against this is to allow people to use GenAI only when they have subject matter expertise in the task at hand, so their education and experience give them a shot at spotting important omissions. In other words, don’t try to use GenAI to work in areas where you have no competence. For example, I’m an IP attorney. Because I don’t know tax law, I shouldn’t try to use GenAI to provide tax law advice. But if you must restrict GenAI use to people who are subject matter experts in the relevant area, that probably will raise your personnel cost, making the use of GenAI less profitable.
Overreliance: Expecting Employees to Act in Their Self-Interest (It's Not Just a Flesh Wound!)
So, because GenAI will sometimes hallucinate or omit important things in producing output, what should a business do? Ideally, the people using the GenAI will vet the output before the company adopts it. But what if they don’t?
Scientific studies have shown that humans sometimes over-rely on AI, trusting it to produce sound output. While this overreliance can happen for various reasons, it generally boils down to the AI output looking good enough to the human that the human decides it isn’t worth the effort to check it.
This gets into people’s analysis of their own self-benefit, which economists call “utility.” When employees think about whether it’s worthwhile to do the work to verify an AI’s output, they implicitly weigh three things:
a positive reward for getting the work done,
a positive reward for producing good (enough) work, and
a negative reward for spending time and effort to do the verification.
When the employee perceives that the negative outweighs the positives, that person may decide to skip the verification out of self-interest (personal utility maximization). “It ain’t worth the time and effort.”
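For the quantitatively inclined, here is a minimal sketch of that calculation in Python. The function, variable names, and numbers are hypothetical illustrations, not figures from any study; they simply show how an employee weighing those three rewards can rationally conclude that skipping verification pays.

```python
# Hypothetical sketch of an employee's verification decision.
# All names and numbers are illustrative assumptions, not empirical figures.

def expected_utility(reward_done, reward_quality, p_ai_correct,
                     verification_cost, verify):
    """Rough expected utility of submitting AI output, with or without verifying it."""
    if verify:
        # Verifying catches errors, so the quality reward is effectively assured,
        # but the employee pays the time-and-effort cost.
        return reward_done + reward_quality - verification_cost
    # Skipping verification: the quality reward arrives only if the AI happened to be right.
    return reward_done + p_ai_correct * reward_quality

# Illustrative numbers: finishing the task is worth 10, doing it well is worth 5 more,
# the AI is right 90% of the time, and checking its output "costs" 2 in time and effort.
u_verify = expected_utility(10, 5, 0.9, 2, verify=True)   # 10 + 5 - 2 = 13
u_skip   = expected_utility(10, 5, 0.9, 2, verify=False)  # 10 + 0.9 * 5 = 14.5
print(u_verify, u_skip)  # Skipping wins -- "it ain't worth the time and effort."
```

With those placeholder numbers, skipping verification looks like the rational choice. The strategies discussed below work by changing the inputs to that calculation.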
Studies show that, sometimes, human overreliance on AI (trusting an output that is detectably wrong or uses flawed analysis) causes the person to produce a worse work product than if he had never used AI and had done the work himself. (True, in some applications, an AI may consistently do better work than a human could, such as playing chess. We’re talking here about situations where humans and AI must work together – where AI can’t be trusted to go it alone based on proven results and where using an AI offers the possibility of doing better than the human working alone.)
Seeking Complementarity: Managing Employees for Maximum Man-Machine Symbiosis (Now with 42% More Silly Walks!)
Rather than suffering bad outcomes from AI overreliance, businesses should pursue the elusive holy grail computer scientists call “complementarity” – when the human-AI team produces a better output than either the human or the AI would produce working alone. (Complementarity is also achieved when the human-AI team does the work as well as a human acting alone but more cheaply.)
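If it helps to see that definition pinned down, here is a short, purely illustrative Python check of those two conditions. The quality scores and cost figures are hypothetical placeholders for whatever metrics a business actually tracks.

```python
# Hypothetical check of the two "complementarity" conditions described above.
# Quality scores and costs are placeholders, not real measurements.

def is_complementary(team_quality, human_quality, ai_quality,
                     team_cost=None, human_cost=None):
    """True if the human-AI team beats both the human alone and the AI alone,
    or matches the human alone on quality at a lower cost."""
    beats_both = team_quality > max(human_quality, ai_quality)
    matches_human_cheaper = (
        team_cost is not None and human_cost is not None
        and team_quality >= human_quality
        and team_cost < human_cost
    )
    return beats_both or matches_human_cheaper

# Example: the team only matches the solo human's quality (8 vs. 8),
# but it gets the work done more cheaply (50 vs. 80), so complementarity holds.
print(is_complementary(team_quality=8, human_quality=8, ai_quality=6,
                       team_cost=50, human_cost=80))  # True
```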
Computer scientists have been conducting social science experiments to ascertain how human-AI interactions can be structured to achieve complementarity. Some strategies for achieving it are emerging from these experiments. At a conceptual level, the strategies involve changing how employees evaluate their personal utility in working with AI – making changes to the cost-benefit situation the employees confront so that they feel it’s personally worthwhile to verify the AI’s output.
Here are some of the emerging strategies. Many are managerial common sense:
Use monetary incentives. Pay the employee well. Studies show that paying more encourages employees to take work more seriously, but this reduces the profitability of using AI. Pay bonuses for producing good work by verifying and fixing AI output. This requires that the company be able to detect when an unvetted, erroneous AI work product gets through, at least often enough to catch slackers. On the flip side, impose financial penalties on employees who fail to vet outputs. An employer can withhold raises, bonuses, or promotions and, in the worst case, fire the employee.
Require employees to disclose when and how they used AI and what they did to verify its output.
Prohibit employees from using AI to produce company work products in areas where the employee lacks subject matter expertise.
Invest in better AI. Use AI that cites its sources, and make it easier for employees to check those sources when verifying. This won’t solve the AI omissions problem, though; employee subject-matter expertise remains necessary to spot omissions. An AI tailored to the task at hand will hallucinate and omit key material less often, so a failure to verify output won’t be damaging as frequently.
Make the AI less human. Studies show humans are more likely to assume an AI is accurate when it interacts like a human.
Make work more enjoyable. People who like their work are more likely to work hard.
Potential Legal Consequences of Not Fixing Employee Overreliance on AI (Cue the Lawyers and the Spanish Inquisition!)
What about the law? Why is human overreliance on AI a legal issue for companies? Hopefully, it’s obvious that defective output can create liability. If a business produces a defective work product – one that is wrong or omits essential information – liability can arise in various ways, including:
The work product may not meet the standards required in a contract, which would be a breach of contract.
Allowing the defect to go through may be negligence, which could subject the company to tort liability.
The defect might cause the work product to not meet the standard set by an applicable law or regulation.
The defect could lead to public embarrassment, hurting the company’s reputation and sales.
Outside of AI, business owners know that producing shoddy or defective products or services is the road to ruin and that they must hire, manage, and monitor employees to keep errors in check. That’s common sense.
Don’t set aside that common sense when managing and monitoring how employees use AI. Employees will act in their self-interest when using it, just as they do without it. Employers must understand how that self-interest shapes employees’ use of AI and structure incentives so that employees use it responsibly.
Written on September 19, 2024
by John B. Farmer
© 2024 Leading-Edge Law Group, PLC. All rights reserved.