The promise and peril of ChatGPT diet plans
In 2003, the Human Genome Project, a groundbreaking international scientific endeavor that decoded the 3 billion DNA base pairs making up the human genome, was officially completed. The project had begun 13 years earlier and promised valuable insights into human biology, disease and evolution, though enterprising corporations saw another realm in which the findings could prove useful (and lucrative): dieting.
At the time, American culture was saturated with diet talk. Former Surgeon General David Satcher had declared obesity an epidemic in the United States in 2001, which led to an onslaught of fitness and nutrition-focused news segments, documentaries and television programs, ranging from “The Biggest Loser” and “You Are What You Eat” to “Super Size Me” and MTV’s “Fat Camp.” Not all of these pieces of media have aged well in the ensuing two decades, but their existence speaks to the relentless societal interest at the time in how we should be feeding our bodies.
When companies like Nutrigenomix, DNAfit and Habit began offering pricey nutrition plans based on genetic testing and biomarkers, it was just one example of how new scientific technology and knowledge tend to be floated as personal health solutions. For instance, digital watches quickly started to double as heart monitors, while our smartphones now count our steps, sleep and menstrual cycles.
Now, there are questions as to whether language-based artificial intelligence models, like the popular ChatGPT, could serve as a tool for creating specialized nutrition plans that are potentially both cheaper and quicker than visiting a nutritionist.
Last year, researchers published a paper in the “Journal of Nutrition and Metabolism” that compared answers from ChatGPT with those of human dieticians to common nutrition questions.
“Dieticians were asked to provide their most commonly asked nutrition questions and their own answers to them. We then asked the same questions to ChatGPT and sent both sets of answers to other dieticians or nutritionists and experts in the domain of each question to be graded based on scientific correctness, actionability and comprehensibility,” the study authors wrote. “The grades were also averaged to give an overall score, and group means of the answers to each question were compared using permutation tests.”
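The permutation test the authors mention works by repeatedly shuffling the pooled grades between the two groups and asking how often a mean difference as large as the observed one arises by chance. A minimal sketch in Python, using made-up grade values purely for illustration (these are not the study's data, and the function name is hypothetical):

```python
import random

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Shuffles the pooled scores and reassigns them to two groups of the
    original sizes; the p-value is the fraction of shuffles whose mean
    difference is at least as large as the observed difference.
    """
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = group_a + group_b  # fresh list, safe to shuffle in place
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a
                   - sum(pooled[n_a:]) / (len(pooled) - n_a))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations

# Hypothetical 1-5 grades for one question, purely illustrative:
chatgpt_grades = [5, 4, 5, 4, 5]
dietician_grades = [3, 4, 3, 4, 3]
p = permutation_test(chatgpt_grades, dietician_grades)
```

Because the test makes no assumptions about how the grades are distributed, it suits small samples of expert ratings like the handful of graded answers per question described above.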
Surprisingly, ChatGPT’s responses often outperformed those of the dieticians across various criteria.
“The overall grades for ChatGPT were higher than those from the dieticians for the overall scores in five of the eight questions we received,” they continued. “ChatGPT also had higher grades on five occasions for scientific correctness, four for actionability, and five for comprehensibility. In contrast, none of the answers from the dieticians had a higher average score than ChatGPT for any of the questions, both overall and for each of the grading components.”
These findings were underscored by a more recent paper in “Frontiers in Nutrition.” This study aimed to assess the feasibility of personalized AI-generated weight-loss diet plans for clinical use through a survey-based evaluation by experts in obesity medicine and clinical nutrition. Similarly, the researchers used ChatGPT and graded the plans on effectiveness, balance, comprehensiveness, flexibility and applicability.
Results from 67 participants showed no significant differences among the plans, with AI-generated plans often indistinguishable from human-created ones. While some experts identified the AI plan, scores for AI-generated personalized plans were generally positive.
“Distinguishing AI-generated outputs from human writing, particularly those created by ChatGPT, presents a significant challenge,” the study authors wrote. “Our study reinforced this observation as only 5 out of 67 experts were able to accurately identify and select the AI-generated diet plan. These experts highlighted characteristics such as the broad comprehensiveness of the diet plan and the inclusion of atypical recommendations.”
They continued: “Moreover, an intriguing finding emerged in which 24 experts who initially reported that they could not identify the AI-generated plan correctly selected the AI plan. Their reasoning revolved around nonspecific characteristics, such as the absence of brand names and meal preparations perceived as unrealistic. Therefore, although the task of identifying AI-generated diet plans is complex, some experts were able to pinpoint them, typically because of factors not directly related to the quality of the diet plan.”
For all the promise of AI-generated diet plans, the technology currently has definite drawbacks that would need to be addressed to improve the plans' safety and efficacy, beyond concerns about a lack of specificity and unrealistic preparation suggestions.
For instance, when the researchers assessed the plans ChatGPT created, they noticed tomatoes were frequently recommended. While tomatoes are a key part of the Spanish diet, which the prompt specified the test subject desired, they may conflict with dietary restrictions for conditions like gastroesophageal reflux disease (GERD) and chronic kidney disease (CKD). Similarly, the plans ChatGPT created often emphasized protein consumption for weight loss, despite the fact that excessive amounts of protein can negatively affect CKD patients.
This underscores the challenge AI faces in balancing diverse considerations for patients with multiple, potentially conflicting chronic health issues. ChatGPT also seemed to struggle with providing specific portion sizes, macro and micronutrient breakdowns and serving suggestions (though as dietician Eliza Savage astutely pointed out, “It’s not very good at math or science. It’s a language model, after all”).
Researchers remain optimistic, however, while noting the need for an extra layer of expertise before recommending or implementing these plans.
“Current AI models, like ChatGPT, lack the capability to fact-check their outputs,” they wrote. “Therefore, it remains the responsibility of human experts to validate these outputs.”