The reconstruction of patient paths (ie, their temporally ordered diagnoses, diagnostic procedures, and the resulting treatments) has shown great potential for gaining new insights in real-world medical practice. Reconstructing patient trajectories from electronic health records (EHRs) is not a new discipline; early attempts go as far back as the late 1990s.[1] Since then, the field has developed rapidly. Prognosis of symptoms, diagnoses, treatment response, death, or other events has become a field of active research. The application of machine-learning methods and modern artificial intelligence (AI) approaches in particular (eg, transformers such as the Bidirectional Encoder Representations from Transformers [BERT] and generative pretrained transformer [GPT] models) has opened new avenues for modelling patient trajectories;[2] however, the large number of semantically harmonised yet representative datasets needed for model training poses a serious obstacle to the generation of high-quality models.

Although there is a large body of work regarding time-series modelling and prediction based on structured EHRs (consisting of codes for diagnoses, prescriptions, and diagnostic procedures), the Article presented by Zeljko Kraljevic and colleagues[3] in The Lancet Digital Health reports on a model that uses additional information derived from natural language processing (NLP) annotation of the unstructured (ie, textual) parts of EHRs. In their Article, Kraljevic and colleagues present a GPT approach (Foresight) for modelling patient trajectories that uses a considerable number of EHRs as input and forecasts the next medical event. They not only report high prediction performance within a typical split of training and testing data but also have five clinicians validate model predictions on 34 synthetic patient trajectories.

In contrast to previously published work, the authors make extensive use of the unstructured information section within EHRs through a pre-annotation of the EHRs using an established NLP pipeline. Consequently, the GPT is trained on the temporal order of Systematized Nomenclature of Medicine (SNOMED) concepts rather than on the text of the EHRs. The approach covers a wide spectrum of entity types and makes use of a large and diverse set of descriptors for timeline reconstruction and training. Neither the NLP approach nor the AI modelling strategy used in the Article goes beyond what is currently established technology. The training material and the way it has been preprocessed are much more important than the use of a unique technical approach.

Others have also used large collections of EHRs for reconstructing patient trajectories and making forecasts. For example, the authors of the BEHRT,[4] G-BERT,[5] CEHR-BERT,[6] and Med-BERT[7] models have used millions of structured EHRs. The Article discussed here must be seen as part of a growing number of related publications that use slightly different input sources for patient trajectory prediction. We will hopefully see many such time-series forecasting models from different health-care systems and different regions of the world, resulting in a diverse selection of generative models that allow for the representation of a wide spectrum of individual patient journeys.

However, the main questions remain: what does the model tell us now? What can we learn? Kraljevic and colleagues refrain from making bold claims and present Foresight as a model with potential, both clinical and educational. But what can we learn beyond the use of Foresight in the training of medical professionals?

Although there are always time and resource constraints in the development of predictive models, there are some limitations to Kraljevic and colleagues' approach. It would have been desirable to see the trajectory reconstruction and prediction engine challenged by the reality of lifelong trajectories that, for instance, Siggaard and colleagues have established for the Danish population.[8] That would have been a truly relevant validation, and it would have, at least partially, addressed the generalisability aspect of the model. Another obvious validation scenario would have been the comparison of best practice according to clinical guidelines with the reality of the patient paths that are encoded in the model.[9]

Nonetheless, we should be very pleased with the work of Kraljevic and colleagues. If more models such as these are developed and validated, we might develop a larger ecosystem of patient timelines based on many models and will have plenty of material to address the above limitations.

We declare no competing interests.

References
1. Dionisio JDN, Cárdenas AF, Taira RK, et al. A unified timeline model and user interface for multimedia medical databases. Comput Med Imaging Graph. 1996; 20: 333-346.
2. Lentzen M, Linden T, Veeranki S, et al. A transformer-based model trained on large scale claims data for prediction of severe COVID-19 disease progression. IEEE J Biomed Health Inform. 2023; 27: 4548-4558.
3. Kraljevic Z, Bean D, Shek A, et al. Foresight—a generative pretrained transformer for modelling of patient timelines using electronic health records: a retrospective modelling study. Lancet Digit Health. 2024; 6: e281-e290.
4. Li Y, Rao S, Solares JRA, et al. BEHRT: transformer for electronic health records. Sci Rep. 2020; 10: 7155.
5. Shang J, Ma T, Xiao C, Sun J. Pre-training of graph augmented transformers for medication recommendation. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence AI for Improving Human Well-being; Aug 10–16, 2019 (pp 5953–59).
6. Pang C, Jiang X, Kalluri KS, et al. CEHR-BERT: Incorporating temporal information from structured EHR data to improve prediction tasks. Proc Mach Learn Health. 2021; 158: 239-260.
7. Rasmy L, Xiang Y, Xie Z, Tao C, Zhi D. Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. NPJ Digit Med. 2021; 4: 86.
8. Siggaard T, Reguant R, Jørgensen IF, et al. Disease trajectory browser for exploring temporal, population-wide disease progression patterns in 7.2 million Danish patients. Nat Commun. 2020; 11: 4952.
9. Canonica GW, Agache I, Schünemann HJ, Roche N, Price D, Del Giacco S. Next generation health guidelines: the role of real-life data in evidence-based medicine. Allergy. 2024; 79: 12-14.

Foresight—a generative pretrained transformer for modelling of patient timelines using electronic health records: a retrospective modelling study
Foresight is a general-purpose model for biomedical concept modelling that can be used for real-world risk forecasting, virtual trials, and clinical research to study the progression of disorders, to simulate interventions and counterfactuals, and for educational purposes.
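The idea discussed above, training on the temporal order of annotated concepts rather than on raw text and then forecasting the next medical event, can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the patient identifiers, dates, and concept labels are invented placeholders (not real SNOMED codes), the function names are hypothetical, and a simple bigram frequency count stands in for the transformer. It is not the Foresight implementation.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical NLP output: (patient id, date of mention, concept label).
# Labels are illustrative placeholders, not real SNOMED codes.
annotations = [
    ("p1", date(2020, 3, 1), "chest_pain"),
    ("p1", date(2020, 1, 5), "hypertension"),
    ("p1", date(2020, 3, 2), "ecg"),
    ("p2", date(2021, 6, 9), "chest_pain"),
    ("p2", date(2021, 6, 10), "ecg"),
    ("p2", date(2021, 7, 1), "myocardial_infarction"),
]


def build_timelines(rows):
    """Group concept mentions by patient and order them by date."""
    by_patient = defaultdict(list)
    for patient_id, when, concept in rows:
        by_patient[patient_id].append((when, concept))
    return {pid: [c for _, c in sorted(events)]
            for pid, events in by_patient.items()}


def next_event_pairs(timeline):
    """Return (history, next concept) pairs, the kind of target a
    next-event forecasting model is trained on."""
    return [(timeline[:i], timeline[i]) for i in range(1, len(timeline))]


def fit_bigram(timelines):
    """Count which concept follows which (a toy stand-in for a GPT)."""
    counts = defaultdict(Counter)
    for timeline in timelines.values():
        for prev, nxt in zip(timeline, timeline[1:]):
            counts[prev][nxt] += 1
    return counts


def forecast(counts, last_concept):
    """Return the most frequent follower of last_concept, or None if unseen."""
    followers = counts.get(last_concept)
    return followers.most_common(1)[0][0] if followers else None


timelines = build_timelines(annotations)
model = fit_bigram(timelines)
print(timelines["p1"])
print(forecast(model, "chest_pain"))
```

A bigram count only looks at the most recent concept, whereas a transformer conditions on the whole preceding timeline; the sketch is meant to show the data shape (ordered concept sequences and next-event targets), not the modelling power.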
Martin Hofmann‐Apitius
University of Bonn
Holger Fröhlich
Fraunhofer Institute for Algorithms and Scientific Computing
The Lancet Digital Health
Bonn Aachen International Center for Information Technology
synapsesocial.com/papers/68e73091b6db6435876a9f03 — DOI: https://doi.org/10.1016/s2589-7500(24)00045-1