Artificial Intelligence ‘better than doctors’ at accurately judging eye problems
The clinical knowledge and reasoning skills of the ever-improving technology are already approaching the level of specialist eye doctors, say University of Cambridge scientists
Artificial Intelligence is better than doctors at accurately assessing eye problems, according to new research.
GPT-4 – a ‘large AI language model’ – was tested against medics at different stages in their careers, including unspecialised junior doctors, as well as trainee and expert eye doctors.
Each was presented with a series of 87 patient scenarios involving a specific eye problem, and asked to give a diagnosis or advise on treatment by selecting from four options. GPT-4 scored “significantly better” in the test than unspecialised junior doctors, who are comparable to general practitioners (GPs) in their level of specialist eye knowledge.
The findings, published in the journal PLOS Digital Health, also showed that GPT-4 gained similar scores to trainee and expert eye doctors – although the top-performing doctors scored higher. The Cambridge research team say that large language models aren’t likely to replace healthcare professionals, but have potential to improve healthcare as part of the clinical workflow.
The researchers believe that state-of-the-art large language models such as GPT-4 could be useful for providing eye-related advice, diagnosis, and management suggestions in “well-controlled contexts” such as triaging patients, or where access to specialist healthcare professionals is limited. Study lead author Dr Arun Thirunavukarasu said: “We could realistically deploy AI in triaging patients with eye issues to decide which cases are emergencies that need to be seen by a specialist immediately, which can be seen by a GP, and which don’t need treatment.
“The models could follow clear algorithms already in use, and we’ve found that GPT-4 is as good as expert clinicians at processing eye symptoms and signs to answer more complicated questions. With further development, large language models could also advise GPs who are struggling to get prompt advice from eye doctors. People in the UK are waiting longer than ever for eye care.
“Large volumes of clinical text are needed to help fine-tune and develop these models, and work is ongoing around the world to facilitate this.” The team say the research is “superior” to previous studies because they compared the abilities of AI to practicing doctors, rather than to sets of examination results.
Dr Thirunavukarasu, now an Academic Foundation Doctor at Oxford University Hospitals NHS Foundation Trust, said: “Doctors aren’t revising for exams for their whole career. We wanted to see how AI fared when pitted against the on-the-spot knowledge and abilities of practicing doctors, to provide a fair comparison.”
He added: “We also need to characterise the capabilities and limitations of commercially available models, as patients may already be using them – rather than the internet – for advice.”
The test included questions about several eye health issues – including extreme light sensitivity, decreased vision, lesions, and itchy and painful eyes – taken from a textbook used to test trainee eye doctors. The textbook is not freely available on the internet, making it unlikely that its content was included in GPT-4’s training datasets.
Dr Thirunavukarasu said: “Even taking the future use of AI into account, I think doctors will continue to be in charge of patient care. The most important thing is to empower patients to decide whether they want computer systems to be involved or not. That will be an individual decision for each patient to make.”
GPT-4 and GPT-3.5 – or ‘Generative Pre-trained Transformers’ – are trained on datasets containing hundreds of billions of words from articles, books, and other internet sources.
GPT-4 powers the online chatbot ChatGPT to provide “bespoke” responses to human queries. ChatGPT has recently attracted significant attention in medicine for attaining passing-level performance in medical school examinations, and for providing more accurate and empathetic messages than human doctors in response to patient queries.
The researchers pointed out that the field of artificially intelligent large language models is moving “very rapidly” and, since the study was conducted, more advanced models have been released – which may be even closer to the level of expert eye doctors.