An AI-driven insurance company that claimed it could detect fraud by analyzing “non-verbal cues” in videos of people speaking has walked back those claims after being called out by AI experts on Twitter, raising questions not only about the company’s dystopian aims but also about what its AI can actually do and what it is used for.
Lemonade is an insurance startup that lets people file claims through videos submitted on an app. In a Twitter thread Monday that the company later deleted and called “awful,” Lemonade announced that the customer service AI chatbots it uses collect as many as 1,600 data points from a single video of a customer answering 13 questions. “Our AI carefully analyzes these videos for signs of fraud. It can pick up non-verbal cues that traditional insurers can’t, since they don’t use a digital claims process,” the company said in a now-deleted tweet. The thread implied that Lemonade was able to detect whether a person was lying in their video and could thus decline insurance claims if its AI believed a person was lying.
AI experts on Twitter immediately mocked and contested the claim, pointing out that the entire premise of so-called “emotion recognition” systems, which claim to detect a person’s mood or mental state, is highly suspect. They also raised the well-established point that these systems are inherently biased.
“These kinds of physiognomic systems don’t work period. It’s increasingly embarrassing [for companies] to talk about them … and yet somehow they keep bubbling up,” Luke Stark, a professor at Western University who studies physiognomic AI, told Motherboard. “There always seems to be the temptation to brag that you’re doing some new fancy thing, as this company did in their tweets.”
On Wednesday, Lemonade deleted the Twitter thread, saying it “caused more confusion than anything else.” The company also said it doesn’t approve or reject insurance claims based solely on AI analysis: “We do not use, and we’re not trying to build AI that uses physical or personal features to deny claims (phrenology/physiognomy).”
“The term non-verbal cues was a bad choice of words to describe the facial recognition technology we’re using to flag claims submitted by the same person under different identities. These flagged claims then get reviewed by our human investigators,” the company wrote in a blog post after deleting its tweets. “AI is non-deterministic and has been shown to have biases across different communities. That’s why we never let AI perform deterministic actions such as rejecting claims or canceling policies.”
In attempting to clarify the situation, Lemonade has still left widespread confusion about how the technology at the foundation of its business works. The post says the company uses facial recognition technology, for example, but its privacy policy claims it will never collect customers’ biometric information. It’s also unclear how the company extracts 1,600 data points from a video of a person answering 13 questions without collecting biometric information.
Lemonade did not immediately respond to questions from Motherboard about its process. It is worth noting that many so-called “artificial intelligence” startups actually rely on human labor behind the scenes. Many hope that human workers can train artificial intelligence systems that will ultimately replace them. It is not clear to what extent actual AI is involved in Lemonade’s process at all; the blog post says AI “flags” certain claims which are then reviewed by a “human investigator.”
When Lemonade went public in 2020, it did so with the promise of being a classic artificial intelligence-backed industry disrupter, but with a twist—it would be a public benefit corporation, or B Corp, with a dual mission of creating profit and social good.
But AI experts question how a commitment to social good can include using machine learning systems that the company itself admits are prone to bias and discrimination.
In its S-1 form filed with the U.S. Securities and Exchange Commission before going public, and in filings since, the company states that its proprietary AI algorithms are at the core of its business and that it could not function without them, but also that they could lead to lost profits should regulators ever crack down on their weaknesses.
“Our proprietary artificial intelligence algorithms may not operate properly or as we expect them to, which could cause us to write policies we should not write, price those policies inappropriately or overpay claims that are made by our customers,” the company wrote in the filing. “Moreover, our proprietary artificial intelligence algorithms may lead to unintentional bias and discrimination.”
At the same time, Lemonade claims that it can “vanquish bias” by using “algorithms we can’t understand.”
As Motherboard has previously reported, experts say the process of using AI to determine which non-verbal cues—such as eye movements, twitches, or even pauses in speech—are evidence of fraud or trustworthiness is far from a perfect science.
Research has shown that expressions of common emotions, and the behaviors most often associated with trustworthiness, vary not just across cultures and between individuals, but even in how a single person acts at different times and in different situations.
Chris Gilliard, a professor of English at Macomb Community College who has studied tech-based discrimination, says the use of behavior-detection algorithms can be especially harmful in the insurance industry, where racial bias is already rampant.
“They are taking something that already has a long history of racism and discrimination—insurance—and increasing the likelihood that it will be even more discriminatory, given the well established research on how these kinds of systems often (mis)read Black people, disabled people, and trans and non-binary people,” Gilliard told Motherboard.
He added that these algorithmic systems can cause harm regardless of whether or not they make the final determination, because human intermediaries tend to take their assessments as fact.
“I think part of the problem is that the tech claims to universalize something that is not universal, and in doing so assigns a designation based on the values of the people who designed the system—and then those values get hard coded into the system regardless of whether they make sense to/for the people being measured,” said Gilliard.
“In short—if a computer says someone is lying, cheating, or guilty, people will tend to believe the system whether or not it’s accurate.”