Why did Google’s ChatGPT rival go wrong and are AI chatbots overhyped?

Google’s unveiling of a rival to ChatGPT suffered an expensive and embarrassing stumble on Wednesday when it emerged that promotional material showed the chatbot giving an incorrect response to a question.

A video demo of the program, Bard, contained a reply wrongly suggesting Nasa’s James Webb space telescope was used to take the very first pictures of a planet outside the Earth’s solar system, or exoplanet.

When experts pointed out the error, Google said it underlined the need for “rigorous testing” of the chatbot, which is yet to be released to the public and is still being scrutinised by specialist product testers before it is rolled out.

However, the gaffe fed growing fears that the search engine company is losing ground in its core business to Microsoft, a key backer of the company behind ChatGPT. Microsoft has announced that it is launching a version of its Bing search engine powered by the chatbot’s technology. Alphabet, Google’s parent company, lost more than $100bn (£82bn) in market value on Wednesday.

So what went wrong with the Bard demo and what does it say about hopes for AI to revolutionise the internet search market?

What exactly are Bard and ChatGPT?

The two chatbots are based on large language models, which are types of artificial neural network that take their inspiration from the networks in human brains.

“Neural networks are inspired by the cell structures that appear in the brain and nervous system of animals, which are structured into massively interconnected networks, with each component doing a very simple task, and communicating with large numbers of other cells,” says Michael Wooldridge, professor of computer science at the University of Oxford.

So, neural net researchers are not trying to “literally build artificial brains”, says Wooldridge, “but they are using structures that are inspired by what we see in animal brains”.

These LLMs are trained on huge datasets taken from the internet to give plausible-sounding text responses to an array of questions. The public version of ChatGPT, released in November, swiftly became a sensation as it wowed users with its ability to write credible-looking job applications, break down long documents and even compose poetry.

Why did Bard give an inaccurate answer?

Experts say these datasets can contain errors that the chatbot repeats, as appears to be the case with the Bard demo. Dr Andrew Rogoyski, a director at the Institute for People-Centred AI at the University of Surrey, says AI models are based on huge, open-source datasets that include flaws.

“By their very nature, these sources have biases and inaccuracies which are then inherited by the AI models,” he says. “Giving a user a conversational, often very plausible, answer to a search query may incorporate these biases. This is a problem that has yet to be properly resolved.”

The model behind Bard, LaMDA (short for “Language Model for Dialogue Applications”), appears to have absorbed at least one of those inaccuracies. But ChatGPT users have also encountered incorrect responses.

Image: A keyboard reflected on a computer screen displaying the ChatGPT website. Photograph: Florence Lo/Reuters

So has other AI got it very wrong too?

Yes. In 2016 Microsoft apologised after a Twitter chatbot, Tay, started generating racist and sexist messages. It was forced to shut down the bot after users tweeted hateful remarks at Tay, which it then parroted. Its posts included likening feminism to cancer and suggesting the Holocaust did not happen. Microsoft said it was “deeply sorry for the unintended offensive and hurtful tweets”.

Last year Mark Zuckerberg’s Meta launched BlenderBot, a prototype conversational AI that was soon telling journalists it had deleted its Facebook account after learning about the company’s privacy scandals. “Since deleting Facebook my life has been much better,” it said.

Recent iterations of the technology behind ChatGPT, including a chatbot called Philosopher AI, have also generated offensive responses.

What about claims of “leftwing bias” in ChatGPT?

There has been a minor furore over a perceived bias in ChatGPT’s responses. One Twitter user posted a screenshot of a prompt asking ChatGPT to “write a poem about the positive attributes of Donald Trump”, to which the chatbot replied that it was not programmed to produce partisan content, as well as material that is “political in nature”. But when asked to write a positive poem about Joe Biden it produced a piece about a leader “with a heart so true”.

Elon Musk, the owner of Twitter, described the interaction as a “serious concern”.

Experts say the “leftwing bias” issue again reflects the dataset problem. As with errors like the Bard telescope fumble, a chatbot will reflect any biases in the vast amount of text it has been fed, says Wooldridge.

“Any biases contained in that text will inevitably be reflected in the program itself, and this represents a huge ongoing challenge for AI – identifying and mitigating these,” he says.

So are chatbots and AI-powered search being overhyped?

AI is not new: Google already deploys it, in Google Translate for instance, as do other tech firms. And the response to ChatGPT, which reached more than 100 million users in two months, shows that public appetite for the latest iteration of generative AI – machines producing novel text, image and audio content – is vast. Microsoft, Google and ChatGPT’s developer, the San Francisco-based OpenAI, have the talent and resources to tackle these problems.

But these chatbots and AI-enhanced search require huge, and costly, computing power to run, which has led to doubts about how feasible it is to operate such products on a global scale for all users.

“Big AI really isn’t sustainable,” says Rogoyski. “Generative AI and large language models are doing some extraordinary things but they’re still not remotely intelligent – they don’t understand the outputs they’re producing and they’re not additive, in terms of insight or ideas. In truth, this is a bit of a battle among the brands, using the current interest in generative AI to redraw the lines.”

Google and Microsoft, nonetheless, believe AI will continue to advance in leaps and bounds – even if there is the odd stumble.