
Machine learning unlocks insights into physician fatigue
A machine learning model identified physician fatigue through clinical notes, linking it to worse clinical decision-making in a new study.
New research shows that a machine learning, or ML, model can identify clinical notes written by physicians experiencing fatigue, offering insights into the quality of physicians' clinical decision-making.
Published in Nature Communications, the study aimed to measure fatigue through clinical notes and examine the effects of physician fatigue.
Researchers from the University of Chicago and the University of California, Berkeley, gathered physician notes from 2010 to 2012 from Mass General Brigham. Most of the notes collected were written on the same day as the patient encounter.
For the analysis, the researchers used data from 129,228 consecutive emergency department (ED) encounters. They identified the attending physician who wrote the clinical note for each visit, a total of 60 emergency physicians working across 11,592 shifts. ED shifts are "psychologically and physically demanding" for physicians, researchers noted.
The researchers calculated a physician's workload by counting the number of days worked over a rolling seven-day period ending with the current shift. They defined 'high-workload' physicians as those who worked at least four days prior to the current shift (14.8%) and compared them to physicians whose current shift was their first in seven days (19%), termed 'low-workload' physicians. They then trained an ML model to pinpoint notes written by high-workload physicians.
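The workload definition above can be illustrated with a minimal Python sketch. It is not the researchers' code. It assumes the seven-day window includes the day of the current shift, that "days prior" means distinct calendar days within that window, and that shifts are represented as plain dates; the function names and the "intermediate" label are hypothetical.

```python
from datetime import date, timedelta


def prior_days_worked(shift_dates, current, window_days=7):
    """Count distinct days worked before `current` within the rolling window.

    Assumption: the window spans `window_days` calendar days and ends on
    the day of the current shift, so it covers the six days before it.
    """
    start = current - timedelta(days=window_days - 1)
    return len({d for d in shift_dates if start <= d < current})


def workload_label(shift_dates, current):
    """Label a shift using the thresholds described in the article."""
    prior = prior_days_worked(shift_dates, current)
    if prior >= 4:
        return "high"          # at least four days worked before this shift
    if prior == 0:
        return "low"           # first shift in the seven-day window
    return "intermediate"      # hypothetical label for everything in between


if __name__ == "__main__":
    shifts = [date(2011, 3, d) for d in (1, 2, 3, 4, 6)]
    print(workload_label(shifts, date(2011, 3, 6)))
```

Under these assumptions, only the "high" and "low" groups would feed the comparison the study describes; shifts in between would be set aside.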
The study shows that the model accurately identified the notes written by high-workload physicians. It also identified notes written in situations associated with high fatigue, such as overnight shifts and periods of high patient volumes.
Notably, model-identified signs of fatigue in a clinical note correlated with worse physician decision-making. To assess the correlation, the researchers used a previously developed ED quality measure based on the decision to test a patient for acute coronary syndrome (ACS). They evaluated 'yield of testing,' which gauges the clinical value of testing patients: a higher yield means a higher rate of diagnosing ACS among those tested, while a lower yield suggests testing with no clear patient benefit.
The researchers found that with each standard deviation increase in model-identified fatigue, the yield of testing for heart attack was 19% lower.
"This result indicates that fine-grained measures of fatigue, like the one we use here, are a promising way to measure and elucidate the consequences of physician fatigue," the researchers wrote.
Further, the model flagged signs of fatigue more often in clinical notes generated by large language models (LLMs) than in those written by physicians. Researchers observed that the rate of model-identified fatigue for LLM-written notes was 74% higher than that of physician-written ones.
This indicates "the possibility that LLMs may introduce distortions in generated text that are not yet fully understood," the researchers noted.
Anuja Vaidya has covered the healthcare industry since 2012. She currently covers the virtual healthcare landscape, including telehealth, remote patient monitoring and digital therapeutics.