Clinical trials are vital for advancing care. However, a systematic approach to tracking trial participation across different facilities and sponsors has been lacking. We developed natural language processing (NLP) methods to extract study enrollment history (enrollment status, consent date, and study title) from documentation of clinical trial participation in clinical notes, using national Veterans Affairs electronic health record data. The methods achieved high test-set precision for enrollment status (0.94), consent date (0.97), and study title (0.87), with acceptably high recall (0.76, 0.70, and 0.84, respectively). At a single center, the classifier correctly identified 111 of 125 trial participants (88.8%) across 12 distinct trials. Our study demonstrates the feasibility of using NLP to capture trial enrollment across a nationwide healthcare system. The algorithm creates a novel data resource for analyzing and tracking trial enrollment at the population level.
Goryachev et al. studied this question.