Major study reveals AI chatbots struggle with delivering accurate news

A major study involving 22 public service media organizations has found that AI chatbots misrepresent news content 45% of the time, an issue that persists across languages and regions. The assistants analyzed included popular tools such as OpenAI's ChatGPT, Microsoft's Copilot, Google's Gemini, and Perplexity AI.

Study Overview and Findings

The research focused on the accuracy and reliability of responses from these AI assistants. Four key criteria were evaluated: accuracy, sourcing, context provision, and the ability to differentiate fact from opinion.

  • 45% of all AI-generated answers contained at least one significant issue.
  • 31% of responses had serious sourcing errors.
  • 20% contained major factual inaccuracies.

For example, one error identified by the study named Olaf Scholz as German Chancellor, even though Friedrich Merz had assumed the role a month earlier. Another incorrectly identified Jens Stoltenberg as NATO's Secretary General, despite Mark Rutte already holding the position.

AI Chatbots and Public Perception

According to the Reuters Institute’s Digital News Report 2025, 7% of online news consumers use AI chatbots for news. This number rises to 15% among those under 25. The findings raise concerns about the potential erosion of public trust in news media.

Jean Philip De Tender, deputy director general of the European Broadcasting Union (EBU), stressed that these issues are not isolated incidents but systemic failures that pose a risk to democratic engagement. When people cannot tell what to trust, De Tender warned, they may end up trusting nothing at all, which discourages democratic participation.

Methodology of the Study

This research is one of the largest of its kind to date and used a methodology similar to a previous BBC study conducted in February 2025. Journalists put common news questions to the four AI assistants and evaluated the responses without knowing which assistant had produced each answer.

Compared with the BBC's earlier findings, the accuracy of AI responses showed only minimal improvement, and significant errors remained prevalent across all four AI systems.

  • Gemini had the highest rate of sourcing issues, with 72% of its responses deemed problematic.
  • Microsoft’s Copilot and Gemini were highlighted as the underperformers in both studies.

Calls for Regulation and Accountability

In light of these findings, broadcasters and media organizations are urging governments to take action. The EBU emphasized the need for enforcement of existing laws related to information integrity and digital services.

In addition, the EBU and other media groups have launched a collaborative campaign called "Facts In: Facts Out," which demands that AI companies take greater accountability for how their products handle news.

The joint statement asserts that when AI systems distort or misrepresent trusted news sources, they compromise public trust. The campaign’s core message is clear: “If facts go in, facts must come out.”