Teaching AI to ask medical questions

Physicians often query a patient’s electronic health record for information that helps them make treatment decisions, but the cumbersome nature of these records hampers the process. Research has shown that even when a doctor has been trained to use an electronic health record (EHR), finding an answer to just one question can take, on average, more than eight minutes.

The more time physicians must spend navigating an oftentimes clunky EHR interface, the less time they have to interact with patients and provide treatment.

Researchers have begun developing machine-learning models that can streamline the process by automatically finding the information physicians need in an EHR. However, training effective models requires huge datasets of relevant medical questions, which are often hard to come by due to privacy restrictions. Existing models struggle to generate authentic questions (those that would be asked by a human doctor) and are often unable to successfully find correct answers.

To overcome this data shortage, researchers at MIT partnered with medical experts to study the questions physicians ask when reviewing EHRs. Then they built a publicly available dataset of more than 2,000 clinically relevant questions written by these medical experts.

When they used their dataset to train a machine-learning model to generate clinical questions, they found that the model asked high-quality and authentic questions, as compared with real questions from medical experts, more than 60 percent of the time.

With this dataset, they plan to generate vast numbers of authentic medical questions and then use those questions to train a machine-learning model that would help doctors find sought-after information in a patient’s record more efficiently.

“Two thousand questions may sound like a lot, but when you look at machine-learning models being trained nowadays, they have so much data, maybe billions of data points. When you train machine-learning models to work in health care settings, you have to be really creative because there is such a lack of data,” says lead author Eric Lehman, a graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL).

The senior author is Peter Szolovits, a professor in the Department of Electrical Engineering and Computer Science (EECS) who heads the Clinical Decision-Making Group in CSAIL and is also a member of the MIT-IBM Watson AI Lab. The research paper, a collaboration between co-authors at MIT, the MIT-IBM Watson AI Lab, IBM Research, and the doctors and medical experts who helped create questions and participated in the study, will be presented at the annual conference of the North American Chapter of the Association for Computational Linguistics.

“Realistic data is critical for training models that are relevant to the task yet difficult to find or create,” Szolovits says. “The value of this work is in carefully collecting questions asked by clinicians about patient cases, from which we are able to develop methods that use these data and general language models to ask further plausible questions.”

Data deficiency

The few large datasets of medical questions the researchers were able to find had a number of issues, Lehman explains. Some were composed of medical questions asked by patients on web forums, which are a far cry from physician questions. Other datasets contained questions produced from templates, so they are mostly identical in structure, making many questions unrealistic.

“Collecting high-quality data is really important for doing machine-learning tasks, especially in a health care context, and we’ve shown that it can be done,” Lehman says.

To build their dataset, the MIT researchers worked with practicing physicians and medical students in their last year of training. They gave these medical experts more than 100 EHR discharge summaries and told them to read through a summary and ask any questions they might have. The researchers didn’t put any restrictions on question types or structures in an effort to gather natural questions. They also asked the medical experts to identify the “trigger text” in the EHR that led them to ask each question.

For instance, a medical expert might read a note in the EHR that says a patient’s past medical history is significant for prostate cancer and hypothyroidism. The trigger text “prostate cancer” could lead the expert to ask questions like “date of diagnosis?” or “any interventions done?”
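To make that structure concrete, here is one plausible way to represent a single entry in such a dataset, sketched in Python; the field names and offsets are illustrative assumptions, not the dataset’s actual schema.

```python
# Hypothetical record for one expert-written question and its trigger text.
# Field names and values are illustrative, not the dataset's actual schema.
from dataclasses import dataclass

@dataclass
class ClinicalQuestion:
    summary_id: str     # which discharge summary the expert was reading
    trigger_text: str   # the EHR span that prompted the question
    trigger_start: int  # character offset of the trigger within the summary
    trigger_end: int    # end offset (exclusive)
    question: str       # the free-form question the expert asked

example = ClinicalQuestion(
    summary_id="discharge-0042",
    trigger_text="prostate cancer",
    trigger_start=41,
    trigger_end=56,
    question="What was the date of diagnosis?",
)
print(example.question)
```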

They found that most questions focused on symptoms, treatments, or the patient’s test results. While these findings weren’t surprising, quantifying the number of questions about each broad topic will help them build an effective dataset for use in a real, clinical setting, says Lehman.

Once they had compiled their dataset of questions and accompanying trigger text, they used it to train machine-learning models to ask new questions based on the trigger text.
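The article doesn’t name the generator’s architecture, so the sketch below stands in with an off-the-shelf sequence-to-sequence model from Hugging Face’s transformers library; the “t5-small” checkpoint and the input template are assumptions for illustration, and a real system would first be fine-tuned on the question dataset.

```python
# Minimal sketch: condition a generic seq2seq model on trigger text to
# generate a question. Checkpoint and prompt format are illustrative only;
# without fine-tuning on the question dataset, the output will be poor.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

context = ("Past medical history is significant for prostate cancer "
           "and hypothyroidism.")
trigger = "prostate cancer"

# Mark the trigger inside the context so the model knows what to ask about.
inputs = tokenizer(
    f"generate question: trigger: {trigger} context: {context}",
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```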

Then the medical experts determined whether those questions were “good” using four metrics: understandability (Does the question make sense to a human physician?), triviality (Is the question too easily answerable from the trigger text?), medical relevance (Does it make sense to ask this question based on the context?), and relevancy to the trigger (Is the trigger related to the question?).
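The article doesn’t say how the four judgments combine into a single verdict; one plausible aggregation, written as a small Python helper, might look like this:

```python
# Assumed aggregation rule: a question counts as "good" only if it is
# understandable, non-trivial, medically relevant, and trigger-relevant.
def is_good_question(understandable: bool, trivial: bool,
                     medically_relevant: bool, trigger_relevant: bool) -> bool:
    return (understandable and not trivial
            and medically_relevant and trigger_relevant)

# Understandable and relevant, but trivially answerable from the trigger:
print(is_good_question(True, True, True, True))  # False
```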

Cause for concern

The researchers found that when a model was given trigger text, it was able to generate a good question 63 percent of the time, whereas a human physician would ask a good question 80 percent of the time.

They also trained models to recover answers to clinical questions using the publicly available datasets they had found at the outset of this project. Then they tested these trained models to see if they could find answers to “good” questions asked by human medical experts.
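In outline, that test resembles the sketch below, which poses one physician-style question to an off-the-shelf extractive question-answering model; the checkpoint and the sample context are illustrative assumptions, not the paper’s actual setup.

```python
# Hedged sketch: can a QA model trained on public data answer a
# physician-style question about a discharge summary? The checkpoint and
# context are stand-ins, not the paper's actual models or data.
from transformers import pipeline

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

context = ("Past medical history is significant for prostate cancer, "
           "diagnosed in 2015 and treated with radiation, and "
           "hypothyroidism.")
result = qa(question="What interventions were done for the prostate cancer?",
            context=context)
print(result["answer"], round(result["score"], 3))
```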

The models were only able to recover about 25 percent of answers to physician-generated questions.

“That result is really concerning. What people thought were good-performing models were, in practice, just awful because the evaluation questions they were testing on were not good to begin with,” Lehman says.

The team is now applying this work toward its initial goal: building a model that can automatically answer physicians’ questions in an EHR. As a next step, they will use their dataset to train a machine-learning model that can automatically generate thousands or millions of good clinical questions, which can then be used to train a new model for automated question answering.

While there is still much work to do before that model could become a reality, Lehman is encouraged by the strong initial results the team demonstrated with this dataset.

This research was supported, in part, by the MIT-IBM Watson AI Lab. Additional co-authors include Leo Anthony Celi of the MIT Institute for Medical Engineering and Science; Preethi Raghavan and Jennifer J. Liang of the MIT-IBM Watson AI Lab; Dana Moukheiber of the University of Buffalo; Vladislav Lialin and Anna Rumshisky of the University of Massachusetts at Lowell; Katelyn Legaspi, Nicole Rose I. Alberto, Richard Raymund R. Ragasa, Corinna Victoria M. Puyat, Isabelle Rose I. Alberto, and Pia Gabrielle I. Alfonso of the University of the Philippines; Anne Janelle R. Sy and Patricia Therese S. Pile of the University of the East Ramon Magsaysay Memorial Medical Center; Marianne Taliño of the Ateneo de Manila University School of Medicine and Public Health; and Byron C. Wallace of Northeastern University.
