For more than forty years, technology has quietly reshaped how humans reach for knowledge. What once required shelves of books, expert guides, or long hours of searching now fits into a pocket. Phones, laptops, tablets, and smartwatches have merged into a single, vast library where answers appear almost instantly. A question about where dinosaurs lived or how fast a pulse is racing can be resolved in seconds. With the rise of generative artificial intelligence, that speed has accelerated even further, promising clarity at the tap of a screen.
Yet speed and confidence are not the same as truth. As these systems grow more fluent and visually persuasive, a central question lingers beneath the convenience. How accurate is the knowledge being delivered, and where does it come from? To explore that uncertainty, researchers turned to a subject already wrapped in scientific debate and historical revision: the Neanderthals.
A Species Shaped by Uncertainty
Neanderthals were first formally described as a species in 1864, and from that moment forward, their image has never been settled. Over the decades, scientists have argued over how Neanderthals lived, how they dressed, how they hunted, and even how they moved through the world. Interpretations shifted as new discoveries emerged and older assumptions were challenged. That long history of disagreement made Neanderthals an ideal test case for probing how generative AI handles complex, evolving scientific knowledge.
Because there is no single, uncontested picture of Neanderthal life, any system attempting to reconstruct them must choose which version of the past to trust. In this study, researchers Magnani and Clindaniel asked generative AI systems to do exactly that, not by evaluating their speed or creativity, but by examining their accuracy and sources.
When the Past Freezes in Time
The images generated during the study were striking, but not in the way modern science would expect. Neanderthals appeared as they were imagined more than a century ago, frozen in outdated interpretations. They were portrayed as primitive human relatives, bearing archaic features that aligned them more closely with chimpanzees than with modern humans. Their bodies were shown covered in excessive hair, their posture hunched forward, reinforcing a long-abandoned stereotype of brutishness and simplicity.
Equally telling was who did not appear. The images largely excluded women and children, presenting Neanderthal life as a narrow, male-centered existence. This absence quietly echoed older scientific narratives that failed to consider social complexity or diversity, revealing how deeply those earlier ideas still linger in accessible datasets.
The visuals were persuasive, detailed, and confident, yet they carried the weight of interpretations that contemporary scholarship has spent decades revising.
Stories That Missed the Full Picture
The written narratives produced by ChatGPT revealed similar issues. Rather than reflecting the variability and cultural sophistication recognized in modern research, the stories consistently underplayed Neanderthal culture. Their lives were described in simplified, almost caricatured terms, as though complexity had never entered the discussion.
When researchers compared these narratives with current scholarly understanding, the gaps were stark. About half of the narrative content generated by ChatGPT failed to align with established academic knowledge. For one specific prompt, that mismatch rose to over 80 percent. The language sounded authoritative, yet the substance often traced back to interpretations that science has since re-evaluated or abandoned.
The issue was not hesitation or uncertainty. It was confidence built on incomplete foundations.
Technology That Arrived Too Early
In both images and text, another inconsistency emerged. The AI systems frequently included technological elements that did not belong in the Neanderthal time period. References to basketry, thatched roofs, ladders, and even glass and metal appeared where they should not have. These details were not minor embellishments. They reshaped how Neanderthal life was imagined, projecting technological sophistication that exceeded what the time period supports.
Such errors did not stem from creativity alone. They reflected a blending of sources across eras, where distinctions between historical periods blurred. The AI systems stitched together fragments of information without fully respecting their chronological boundaries, creating a past that felt coherent but was historically misplaced.
Tracing the Ghosts in the Machine
To understand why these inaccuracies appeared so consistently, Magnani and Clindaniel investigated where the AI systems were drawing their information from. By cross-referencing the generated images and narratives with scientific literature from different decades, they were able to identify patterns in sourcing.
ChatGPT produced content most consistent with research from the 1960s, while DALL-E 3 aligned more closely with literature from the late 1980s and early 1990s. These eras were not chosen deliberately by the systems. Instead, they reflected what information was most available and accessible within the datasets used to train them.
The findings revealed a quiet but powerful limitation. Generative AI does not simply summarize the best or newest science. It reproduces what it can most easily access.
When Access Shapes Imagination
The roots of this problem stretch beyond technology and into policy. Because copyright protections extend back to works published in the 1920s, access to much scholarly research was restricted for decades. As a result, large portions of the scientific literature remained locked away until the rise of open access in the early 2000s. Even today, not all contemporary research is equally available for AI systems to learn from.
Clindaniel emphasized that one path toward more accurate AI output lies in ensuring that anthropological datasets and scholarly articles are AI-accessible. Without that access, AI systems are left to rely on older, more readily available interpretations, no matter how outdated they may be.
The way knowledge is gated, archived, and shared directly influences how machines reconstruct the past.
Teaching Caution in an Age of Confidence
Magnani pointed to another essential layer of the challenge: education. Generative AI speaks with remarkable certainty, and that confidence can easily be mistaken for correctness. Teaching students to approach these tools cautiously is not about discouraging their use, but about fostering critical engagement.
According to Magnani, cultivating skepticism and technical literacy will help build a society that understands both the power and the limits of AI-generated knowledge. Rather than accepting outputs at face value, users must learn to ask where the information comes from and what might be missing.
This study is part of a broader series in which Magnani and Clindaniel are examining how AI intersects with archaeological research and interpretations of the past. Each investigation adds another layer to the same central insight: AI reflects human knowledge, but also human blind spots.
Why This Research Matters
This research matters because generative AI is no longer a novelty. It is becoming a primary gateway to information, shaping how people imagine history, science, and humanity itself. When AI presents outdated or incomplete views of Neanderthals, it does more than make factual errors. It reinforces long-abandoned assumptions and quietly reshapes public understanding.
The study shows that accuracy in AI is not just a technical challenge. It is a cultural and policy issue tied to access, education, and responsibility. As AI continues to define how knowledge is shared, ensuring that it draws from the most accurate and inclusive sources will determine whether it becomes a tool for deeper understanding or a mirror reflecting the past’s misunderstandings back at us.
In revealing how easily the past can be misremembered by machines, this research offers a clear reminder. The future of knowledge depends not only on faster answers, but on wiser ones.
Study Details
Matthew Magnani et al., "Artificial Intelligence and the Interpretation of the Past," Advances in Archaeological Practice (2025). DOI: 10.1017/aap.2025.10110