3 use cases for generative AI in healthcare documentation

Generative AI in healthcare offers promise for tasks such as clinical documentation, but clear regulations and standards are needed to maximize benefits and minimize risks.

Rapid advancements in large language models (LLMs) have brought generative AI tools to practically every business sector, and healthcare is no exception.

As defined by the Government Accountability Office, generative AI is "a technology that can create content, including text, images, audio or video, when prompted by a user."

Generative AI systems learn patterns and relationships from massive amounts of data, allowing them to create new content that might be similar, but not identical, to training data. The technology processes and produces content through the use of machine learning algorithms and statistical models.

The healthcare sector is using generative AI for various use cases, including clinical documentation, patient communication and clinical text summarization.

Clinical documentation

The leading cause of clinician burnout is excessive documentation requirements, according to a 2022 athenahealth survey conducted by the Harris Poll. However, early research has shown promise for generative AI to streamline clinical documentation workflows, which could help mitigate burnout and improve clinician satisfaction.

A 2024 study published in NEJM Catalyst examines the adoption of ambient AI scribes within The Permanente Medical Group (TPMG).

Ambient AI scribe technology uses smartphone microphones and generative AI to transcribe patient encounters as they occur, providing clinicians with draft clinical documentation for review.

In October 2023, TPMG deployed ambient AI technology for 10,000 physicians and staff across diverse settings.

According to the study, physicians who used the ambient AI scribe service gave positive feedback, citing the technology's ability to facilitate more personal, meaningful and effective patient interactions. Physicians also reported spending less time on after-hours EHR documentation.

Early assessments of patient feedback were also positive, with some individuals mentioning improved provider interactions. Additionally, early evaluation metrics found that ambient AI produced high-quality clinical documentation for clinician review.

While the promise of ambient AI to streamline clinical documentation is substantial, a 2023 study published in the Journal of the American Medical Informatics Association (JAMIA) indicates that the technology might fall short in documenting non-lexical conversational sounds (NLCSes), such as "mm-hm" and "uh-uh."

Researchers evaluated the performance of two clinical ambient AI tools across 36 primary care encounters. Patients and providers commonly use NLCSes to convey information. For instance, a patient might say, "Mm-hm," to indicate yes in response to the question, "Do you have any allergies to antibiotics?"

The ambient AI tools had a word error rate of about 12% for all words. However, the word error rate for NLCSes fell between 40% and 57%, and the word error rate for NLCSes that conveyed clinically relevant information was even higher, at 94.7% and 98.7%.

"Some of these NLCSes were used to communicate clinically relevant information that, if not properly captured, could result in inaccuracies in clinical documentation and possibly adverse patient safety events," the researchers emphasized.
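
To make the word error rate figures above concrete, the sketch below computes the metric in its standard form: the minimum number of word substitutions, deletions and insertions needed to turn the tool's transcript into the reference transcript, divided by the number of words in the reference. The article does not describe how the researchers scored their transcripts, so this is an assumption about the general technique and not the study's method. The function name and the sample transcripts are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard word error rate: word-level edit distance / reference length.

    Hypothetical helper for illustration; not the scoring code used in the study.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Invented example: the tool drops the patient's "mm-hm" answer.
    spoken = "do you have any allergies mm-hm"
    transcribed = "do you have any allergies"
    print(f"{word_error_rate(spoken, transcribed):.1%}")  # prints 16.7%
```

In this invented exchange, one dropped word out of six yields a modest overall error rate, yet the patient's entire answer is lost. That is consistent with the pattern the researchers describe, in which an acceptable overall rate can mask near-total failure on NLCSes.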

Patient communication

As digital health transformation has progressed, the volume of patient portal inbox messages has risen significantly. According to a 2021 study published in JAMIA, patient portal inbox messages had increased by 157% since 2020.

Considering this, some healthcare organizations are looking to generative AI to draft replies to patient portal messages.

A 2024 quality improvement study published in JAMA Network Open evaluated the adoption and usability of AI-generated draft replies to patient messages at an academic medical center.

After five weeks, providers used the AI-generated draft replies 20% of the time. The researchers noted that this is "remarkable" given that they did not fine-tune the LLMs for replying to patient messages. Additionally, the study authors said minimal end-user education was necessary for adoption.

Post-implementation of the generative AI tool, clinicians reported decreased task load and emotional exhaustion scores, suggesting that generated draft replies could help mitigate clinician burnout.

Still, despite improvements in burden, the study found no changes in overall reply time, read time or write time when comparing pre-pilot and pilot periods.

The study authors suggested that switching from writing to editing message replies might be less cognitively taxing despite taking the same amount of time.

However, survey respondents were optimistic about time saved, suggesting that both clinicians' perceptions of time and the time captured via EHR metadata might differ from the actual time spent on message responses.

Clinical data summarization

Clinicians spend substantial time summarizing various types of information within patient records, and errors in this process can harm clinical decision support.

Generative AI has shown promise for summarizing clinical data. For instance, a 2023 study found that LLM summaries can outperform human expert summaries in terms of conciseness, completeness and correctness.

However, using generative AI for clinical data summarization brings risks, as LLMs used for this purpose are unlikely to fall under FDA medical device oversight, according to a viewpoint published in JAMA.

The authors explained that LLMs performing summarization tasks do not clearly qualify as medical devices because they provide language-based outputs rather than predictions or numeric estimates of disease. Without statutory changes, the FDA's authority to regulate most LLMs for clinical summaries is unclear.

The authors noted that differences in summary length, organization, and tone could all influence clinician interpretations and subsequent clinical decision-making. LLM summaries can also exhibit biases like sycophancy, i.e., tailoring responses to user expectations.

The authors said that to address these issues, the industry needs comprehensive standards for LLM-generated summaries, including testing for biases and errors. Additionally, clinical testing is necessary to quantify harms and benefits.

LLMs should allow open-ended clinician prompting, and the FDA should clarify regulatory criteria to recognize them as medical devices, the authors suggested.

The path forward

Generative AI holds promise in transforming various aspects of healthcare and mitigating clinician burnout. However, stakeholders must work toward creating comprehensive standards and regulatory clarity to maximize the benefits of generative AI while minimizing risks.

A 2024 study published in npj Digital Medicine emphasizes that delivering on the promise of generative AI in healthcare relies on defined leadership, adoption incentives, and ongoing regulation.

Leadership should focus on creating guidelines for LLM performance and finding the optimal clinical settings for trials of generative AI tools. The article notes that a subcommittee within the FDA with leadership from physicians, healthcare administrators, developers and investors could be well-positioned to undertake this responsibility.

Additionally, widespread deployment of generative AI will require payer incentives, as most providers will likely consider generative AI tools a capital expense.

With leadership, incentivization and regulation, the healthcare industry can make generative AI feasible for implementation across the care continuum to streamline clinical workflows.

Hannah Nelson has been covering news related to health information technology and health data interoperability since 2020.
