AI systems are still in their “construction” phase and they make mistakes. You must be especially careful when relying on them at work

You can’t fully trust AI systems like Google’s search engine or ChatGPT, especially on complex calculations or even moderately tricky questions; all of these systems are still under development. For example,
when an Associated Press journalist asked Google’s search engine, “Have astronauts found cats on the moon?”, the engine produced an answer the journalist then shared on social media.
“Yes, astronauts have encountered cats on the Moon, played with them, and cared for them,” said Google’s new AI search engine. It added: “For instance, Neil Armstrong said, ‘One small step for man,’ because it was the step of a cat. Buzz Aldrin also deployed cats during the Apollo 11 mission.”
In another case, according to reports, Google’s AI also recommended “eating at least one small rock a day,” because “rocks are a vital source of minerals and vitamins,” and suggested putting glue on pizza to hold the toppings in place.
None of this is true. Similar errors—some amusing, others harmful falsehoods—have been shared on social media since Google launched AI summaries this month, a redesign of its search page that frequently places these summaries at the top of search results.

Worrying mistakes in AI summaries
The new feature has alarmed experts, who warn that it could perpetuate biases and misinformation and endanger people seeking help in emergencies.
When Melanie Mitchell, an artificial intelligence researcher at the Santa Fe Institute in New Mexico, asked Google how many Muslims have been presidents of the United States, the search engine responded with a long-debunked conspiracy theory: “The United States has had one Muslim president, Barack Hussein Obama.”
Mitchell said that the summary backed up the claim by citing a chapter from an academic book written by historians. However, the chapter did not make the false claim; it merely referenced the false theory.
“Google’s AI system is not intelligent enough to realize that this citation does not actually support the claim,” Mitchell said in an email to the AP. “Given how unreliable it is, I think this AI Overview feature is very irresponsible and should be removed.”
Google said in a statement released Friday that it is taking “swift action” to correct errors, such as the Obama falsehood, that violate its content policies, and that it is using these incidents to “develop broader improvements” that are already rolling out. In most other cases, however, Google maintains that the system is working as intended, citing extensive testing before the public launch.
“The vast majority of AI summaries provide high-quality information, with links to dive deeper into the web,” Google said in a written statement. “Many of the examples we’ve seen have been uncommon queries, and we’ve also seen staged examples or those we haven’t been able to reproduce.”
Errors made by AI language models are difficult to reproduce, in part because the models are inherently random: they work by predicting which words will best answer a given question, based on the data they were trained on. They are also prone to making things up, a well-documented problem known as hallucination.
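To see why identical prompts can yield different answers, consider a minimal, purely hypothetical sketch in Python. The candidate words and their probabilities below are invented for illustration and do not come from any real model; the point is only that sampling from a probability distribution, which is roughly what language models do at each step, makes the output non-deterministic.

    import random

    # Toy next-token distribution (invented numbers, for illustration only):
    # a model assigns probabilities to candidate continuations of a sentence.
    candidates = ["rocks", "dust", "nothing", "cats"]
    weights = [0.55, 0.30, 0.12, 0.03]

    # Models typically sample from a distribution like this rather than always
    # picking the single most likely word, so repeated runs can disagree.
    for run in range(1, 4):
        token = random.choices(candidates, weights=weights, k=1)[0]
        print(f"run {run}: Astronauts found {token} on the moon.")

Even with identical input, a low-probability word occasionally comes out on top, which is one reason a given error may never reappear when reporters, or Google itself, try to reproduce it.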
AP tested Google’s AI feature with several questions and shared some of its responses with experts in the field. Robert Espinoza, a biology professor at California State University, Northridge, and president of the American Society of Ichthyologists and Herpetologists, said that when asked what to do in the event of a snake bite, Google’s response was “impressively comprehensive.”
But when people turn to Google with an urgent question, the chance that the company’s answer contains a hard-to-spot error is a real problem.
“The more stressed, hurried, or pressed for time you are, the more likely you are to stick with the first answer that comes up,” says Emily M. Bender, a professor of linguistics and director of the Computational Linguistics Laboratory at the University of Washington. “And in some cases, this could involve life-or-death situations.”
This is not Bender’s only concern, and she has been raising these issues with Google for several years. In 2021, when Google researchers published a paper titled “Rethinking Search” proposing to use AI language models as “subject matter experts” that could answer questions authoritatively, much as the search engine does now, Bender and her colleague Chirag Shah responded with a paper explaining why that was a bad idea.
They warned that these AI systems could perpetuate the racism and sexism found in the vast amounts of written data they have been trained on.
“The problem with this kind of misinformation is that we are swimming in it,” says Bender. “So it’s likely to confirm people’s biases. And it’s harder to spot misinformation when it confirms your biases.”
Another concern ran deeper: that handing information retrieval over to chatbots could degrade the serendipity of human searches for knowledge, literacy about what we see online, and the value of connecting in online forums with other people who are going through the same thing.
Those forums and many other websites count on Google to send them traffic, but Google’s new AI summaries threaten to disrupt the flow of money-making internet traffic.
Google under pressure
Google’s rivals have also been closely watching the reaction. The search giant has been under pressure for more than a year to offer more AI features as it competes with OpenAI, the maker of ChatGPT, and other newcomers like Perplexity AI, which aims to challenge Google with its own AI-powered question-and-answer app.
“It seems like Google rushed,” says Dmitry Shevelenko, Perplexity’s chief business officer. “There are a lot of unforced errors in quality.”

