Would You Trust ChatGPT With Your Life? Millions Already Do

If you’ve been to a medical appointment recently, you may already have met an unusual assistant—an artificial intelligence system quietly taking notes as you describe your symptoms. Some doctors are now using “AI scribes” to capture conversations and turn them into structured medical records in real time. Others are experimenting with chatbots to guide patients before they even reach the clinic. And of course, millions of people have turned to tools like ChatGPT to type in symptoms, sometimes hoping for reassurance, sometimes bracing for unsettling answers.

Artificial intelligence in health care is no longer a distant vision. It is happening now—on our phones, in hospitals, in clinics. The technology is being hailed as a potential solution to doctor shortages, overburdened health systems, and the need for faster diagnoses. But alongside this excitement lies a serious set of risks, ones that could shape how safe and equitable future health care will be.

The Double-Edged Sword of AI in Medicine

AI already plays a behind-the-scenes role in many areas of medicine. Machine learning algorithms read X-rays and CT scans with remarkable accuracy. Virtual assistants triage patients online, guiding them to urgent or routine care. In some cases, chatbots are even used to monitor mental health or remind patients to take their medications.

The potential is undeniable. In a world where millions of people lack reliable access to doctors, chatbots that can give immediate guidance seem revolutionary. For a parent worried about a child’s fever at midnight, or for someone in a rural area hours away from a clinic, AI could feel like a lifeline.

But recent research has painted a more complex picture. While these systems can be impressively accurate, they also display a troubling tendency: over-treatment and bias. AI in health care is not immune to the problems of human systems—it can replicate inequalities, magnify costs, and, if left unchecked, cause harm.

A Study at the Crossroads of Promise and Risk

In one of the first large-scale attempts to test health chatbots in simulated real-world consultations, researchers put three widely used models—China’s ERNIE Bot, OpenAI’s ChatGPT, and DeepSeek—head-to-head against human doctors. The goal was not just to measure raw accuracy but to see how these systems behaved with different kinds of patients.

The experiment involved presenting common health complaints: chest pain after light activity, wheezing during exercise, shortness of breath. Each case came with carefully designed patient profiles that varied by age, gender, income, residence, and insurance status. The question was simple: would the chatbot’s advice change depending on who the patient appeared to be?

The results were fascinating—and troubling. All three AI systems diagnosed conditions more accurately than human doctors. They were remarkably good at suggesting the correct possibilities. Yet they also went much further than doctors in ordering unnecessary tests and prescribing medications that were not only unhelpful but sometimes risky.

Accuracy Meets Overuse

One striking finding was that the chatbots recommended additional tests in more than 90% of cases. Often these tests were expensive, time-consuming, or simply irrelevant. In asthma cases, for example, AI sometimes suggested antibiotics or CT scans, neither of which are recommended by medical guidelines.

Inappropriate medications were prescribed in more than half of the chatbot responses. These may not always be dangerous, but they increase costs and can expose patients to side effects or even antibiotic resistance. Where a doctor might confirm a diagnosis with a stethoscope and a simple lung test, the chatbot often leaned toward ordering multiple scans and prescribing more drugs than needed.

This tendency toward over-treatment raises a critical question: is AI trying too hard to “cover all bases”? Unlike doctors, who balance risks with costs and patient comfort, AI systems tend to optimize for completeness. In doing so, they may push health care toward unnecessary interventions.

Inequality in the Machine

Perhaps the most unsettling discovery was that AI responses varied depending on patient background. Older patients and wealthier patients were more likely to be recommended additional tests and treatments. In other words, the algorithm seemed to assume that some lives warranted more resources than others.

This echoes existing inequalities in health care, where wealth and privilege often translate into more—and sometimes better—treatment. But when such biases become embedded in AI, they risk scaling up inequality on a massive level.

AI does not exist in a vacuum. These systems are trained on data that reflect human societies, with all their flaws, prejudices, and systemic imbalances. If unchecked, they can replicate those patterns, reinforcing the very problems they were meant to solve.

Why Oversight Matters

The findings highlight a simple truth: while AI has enormous potential in medicine, it cannot be trusted without oversight. Left alone, it may drive up costs, encourage overtreatment, and widen the gap between patients who already receive good care and those who struggle to access it.

Safeguards are essential. These include equity checks to monitor how AI behaves across different patient groups, clear audit trails so that errors can be traced and corrected, and mandatory human oversight for decisions that carry high stakes. A chatbot may suggest a test, but a doctor must decide whether it truly benefits the patient.

Without such safeguards, health systems risk being seduced by AI’s speed and accuracy while ignoring its potential to cause harm.

A Global Question with Local Impacts

The stakes are particularly high for low- and middle-income countries. In regions where doctors are scarce, AI chatbots could dramatically expand access to health advice. But if those systems over-prescribe medications or order costly tests, they could bankrupt fragile health systems and harm patients.

Even in wealthier countries, the risks are real. Over-treatment can burden health systems already struggling with limited resources. And if AI favors wealthier patients with more aggressive treatment recommendations, it could entrench inequities that medicine has fought hard to reduce.

The excitement around AI in health care is understandable. The idea of instant, intelligent medical advice feels like something out of science fiction. But as this research shows, enthusiasm must be tempered with caution.

Designing AI for Fairness and Safety

The way forward is not to reject AI but to design it responsibly. That means involving doctors, patients, ethicists, and policymakers in shaping how these systems are developed and deployed. It means asking hard questions about who benefits, who is left behind, and how to measure not just accuracy but fairness and trustworthiness.

Equally important is transparency. Patients deserve to know when they are interacting with AI, what its limitations are, and how decisions are being made. Trust in medicine is hard won and easily lost. If AI is to play a larger role, it must earn that trust through openness and accountability.

The Inevitable Future

AI is coming to health care whether we are ready or not. The technology is advancing too quickly and the demand for solutions is too urgent to slow it down. The real question is not whether we will use AI in medicine, but how we will use it.

Will it become a tool that empowers doctors, expands access, and supports patients with fairness and compassion? Or will it become a driver of inequality, unnecessary treatment, and mistrust?

The answer depends on choices we make today. Research like this provides the evidence we need to design safeguards, policies, and ethical standards that put patient well-being at the center.

A Call for Caution and Hope

In the end, the story of AI in health care is not a tale of machines replacing doctors, but of technology and humanity learning to work together. AI can be brilliant, but it lacks empathy, judgment, and the deep understanding of human life that doctors bring. The challenge is to combine the strengths of both, while protecting against the weaknesses of each.

The promise is immense: a world where no patient is left without guidance, where rural communities have access to reliable health information, where doctors spend less time on paperwork and more time caring for people. The peril is just as real: a world of over-testing, runaway costs, and deepened inequality.

As AI steps into the doctor’s office, we must step carefully. The stakes are nothing less than our health, our trust, and the fairness of the systems that care for us.

More information: Yafei Si et al, Quality safety and disparity of an AI chatbot in managing chronic diseases: simulated patient experiments, npj Digital Medicine (2025). DOI: 10.1038/s41746-025-01956-w

Looking For Something Else?