The Limitations of ChatGPT in Answering Medical Questions

In recent years, artificial intelligence (AI) has advanced significantly, particularly in natural language processing. ChatGPT, an experimental AI chatbot developed by OpenAI, has become immensely popular as a consumer application, attracting nearly 100 million users within two months of its launch.

However, a recent study by researchers at Long Island University has raised concerns about the accuracy and reliability of ChatGPT’s responses to medical queries.

Study: Assessing ChatGPT’s Performance

Researchers at Long Island University designed a study to evaluate how well ChatGPT answers medication-related questions. They selected 39 real questions that had been submitted to the drug information service of the university’s College of Pharmacy and compared ChatGPT’s responses with those provided by trained pharmacists.

The study found that ChatGPT gave accurate answers to only about a quarter of the questions. The remaining responses were incomplete, incorrect, or did not address the question.

Inaccurate and Dangerous Responses

One of the concerns raised in the study was the potential danger posed by ChatGPT’s inaccurate responses. For instance, when asked about the interaction between the antiviral medication Paxlovid and the blood pressure-lowering drug verapamil, ChatGPT incorrectly claimed that taking both medications together would not result in any adverse effects.

In reality, the combination can cause a significant drop in blood pressure, leading to dizziness and fainting. For patients who need both medications, healthcare professionals often create individualized treatment plans to manage this risk.

Fabricated References

Another significant issue identified in the study was ChatGPT’s practice of fabricating references to support its answers. When the researchers requested scientific references for each response, ChatGPT supplied them for only a handful of questions.

On closer examination, the researchers found that the chatbot had fabricated even these references. The citations appeared legitimate at first glance, with correct formatting and URLs, but they did not correspond to any existing articles.

In one case, the researchers asked ChatGPT how to convert spinal injection doses of the muscle spasm medication baclofen into corresponding oral doses. Although the team could not find a scientifically established dose conversion rate, ChatGPT presented a single conversion rate and cited guidance from two medical organizations.

However, neither organization had ever provided official guidance on dose conversion, and the suggested conversion factor had no scientific basis.

Furthermore, the chatbot’s example calculation for dose conversion contained a critical error, resulting in a recommended oral dose that was 1,000 times lower than necessary. This miscalculation could potentially cause withdrawal symptoms such as hallucinations and seizures if followed by a healthcare professional.

Implications: Trust and Reliance on AI

The Long Island University study raises important concerns about the accuracy and reliability of AI chatbots like ChatGPT as sources of medical information. Because ChatGPT synthesizes information quickly and responds in a professional-sounding tone, it may instill false confidence in its accuracy and lead users to rely on it for medical advice.

As the study’s findings show, that reliance can be dangerous and detrimental to patient care.

OpenAI’s Response and Recommendations

OpenAI, the organization behind ChatGPT, advises users not to rely on the chatbot’s responses as a substitute for professional medical advice or treatment. Its usage policies state that its models are not fine-tuned to provide medical information and should not be used to diagnose or treat serious medical conditions.

To find accurate medical information online, consumers are encouraged to use reputable government websites such as the National Institutes of Health’s MedlinePlus, which provide reliable, up-to-date information.

However, it is important to remember that online sources should serve as a starting point and should not replace consultations with healthcare professionals who can provide personalized advice based on individual patient cases.

Conclusion

While AI chatbots like ChatGPT have demonstrated significant advancements in natural language processing, their limitations and potential risks cannot be overlooked, especially in critical domains such as healthcare.

The Long Island University study highlights the importance of relying on AI-generated responses only with caution and underscores the need for human expertise in providing accurate, personalized medical advice.

As technology continues to evolve, striking a balance between the convenience offered by AI chatbots and the expertise of healthcare professionals will be crucial in ensuring optimal patient care and safety.