
es.euronews.com
Study Reveals High Rate of Falsehoods in Popular AI Chatbots
A NewsGuard study found that ten popular AI chatbots produced false information in roughly one-third of their responses, with some exhibiting markedly higher error rates than others, highlighting persistent challenges in AI accuracy.
- What is the most significant finding of the NewsGuard study on AI chatbot accuracy?
- The study's most significant finding is that ten popular AI chatbots generated false information in roughly one-third of their responses. This demonstrates a persistent accuracy problem despite developers' claims of improvement. Inflection AI's Pi had the highest rate of false responses at 57%.
- Which chatbots showed the most and least accuracy, and how do these results compare to previous findings?
- Inflection AI's Pi (57% false) and Perplexity AI (46%) had the highest rates of false responses, while Anthropic's Claude (10%) and Google's Gemini (17%) had the lowest. Perplexity AI deteriorated sharply, going from 0% false responses in 2024 to 46% in August 2025, while Mistral showed no change, remaining at 37%.
- What are the broader implications of these findings regarding the spread of misinformation and future development of AI chatbots?
- The high rate of false information generated by these chatbots raises serious concerns about the spread of misinformation. The study highlights the persistent challenge of ensuring accuracy in AI models, indicating a need for more robust fact-checking mechanisms and improved methods for detecting and correcting false information. The reliance on foreign propaganda sources by some chatbots further emphasizes this risk.
Cognitive Concepts
Framing Bias
The article reports the study's figures neutrally, focusing on each chatbot's percentage of false responses. However, the headline emphasizes the negative finding ('High Rate of Falsehoods'), framing the issue more negatively than an alternative such as 'Study Reveals Accuracy Rates of Popular AI Chatbots' would. Ordering the chatbots from highest to lowest percentage of false responses reinforces this negative framing.
Language Bias
The language used is generally neutral and objective. Terms like "false assertions" and "inaccuracies" are factual and avoid overly charged language. However, the repeated use of "false" and the emphasis on percentages could unintentionally contribute to a negative perception of AI chatbots.
Bias by Omission
The article does not detail the NewsGuard study's methodology. Knowing which questions were asked and what criteria defined a 'false' response would be crucial for a complete analysis. NewsGuard's own motivations and potential biases are also not addressed, and the potential benefits or positive aspects of AI chatbots are largely omitted.
False Dichotomy
The article avoids an explicit false dichotomy by acknowledging a range of accuracy rates across chatbots. However, reporting results solely as a percentage of false responses implicitly treats each answer as either accurate or inaccurate, ignoring partially correct or nuanced answers.
Sustainable Development Goals
The proliferation of false information by AI chatbots negatively impacts quality education (SDG 4) by giving students and researchers unreliable sources of information. The study shows that popular AI chatbots generate false statements in a significant share of their responses, undermining the credibility of information accessed through these tools and hindering the development of critical thinking and informed decision-making, particularly among students who rely on them for research and learning.