Abstract
The temporal sequence of clinical events is crucial in outcomes research, yet standard machine learning (ML) approaches often ignore it when modeling electronic health records (EHRs), which limits predictive accuracy. We introduce Temporal Learning with Dynamic Range (TLDR), a time-sensitive ML framework, and apply it to identify risk factors for post-acute sequelae of SARS-CoV-2 infection (PASC). Using longitudinal EHR data from over 85,000 patients in the Precision PASC Research Cohort (P2RC) at a large integrated academic medical center, we compare TLDR against a conventional atemporal ML model. TLDR achieved a mean AUROC of 0.791 versus 0.668 for the benchmark, an 18.4% relative improvement, and a mean PRAUC of 0.590 versus 0.421, a 40.1% relative improvement. TLDR also generalized better, with a lower mean overfitting index (−0.028), indicating greater robustness. Beyond these predictive gains, TLDR's time-stamped features enhanced interpretability by characterizing individual patient records more precisely. TLDR captures exposure–outcome associations, supports flexible time-stamping strategies to suit diverse clinical research needs, and provides a simple yet effective way to integrate dynamic temporal windows into predictive modeling. It is available within the MLHO R package to support further exploration of recurrent treatment and exposure patterns in various clinical settings.
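The percentage gains quoted in the abstract are relative to the benchmark score, not absolute differences in AUROC or PRAUC. As a minimal check of that arithmetic, the Python sketch below recomputes both figures from the four reported means. The helper name `relative_improvement` is an illustrative choice and is not part of the MLHO package.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Percent change of `new` relative to `baseline` (hypothetical helper)."""
    return (new - baseline) / baseline * 100.0


# Mean scores as reported in the abstract.
TLDR_AUROC, BENCHMARK_AUROC = 0.791, 0.668
TLDR_PRAUC, BENCHMARK_PRAUC = 0.590, 0.421

auroc_gain = relative_improvement(TLDR_AUROC, BENCHMARK_AUROC)
prauc_gain = relative_improvement(TLDR_PRAUC, BENCHMARK_PRAUC)

# Absolute differences, shown for contrast with the relative gains.
auroc_diff = TLDR_AUROC - BENCHMARK_AUROC
prauc_diff = TLDR_PRAUC - BENCHMARK_PRAUC

print(f"AUROC: +{auroc_diff:.3f} absolute, {auroc_gain:.1f}% relative")
print(f"PRAUC: +{prauc_diff:.3f} absolute, {prauc_gain:.1f}% relative")
```

The absolute gains are 0.123 for AUROC and 0.169 for PRAUC. The larger relative gain for PRAUC partly reflects its lower benchmark value.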
Data availability
Due to patient privacy regulations, the dataset is not publicly available. The R package is available at https://github.com/clai-group/MLHO.
Acknowledgements
We acknowledge that a large language model (LLM) was used solely for grammar improvement and language editing of the manuscript.
Funding
Supported by NIAID R01AI165535. J. Hügel was partially funded by DAAD IFI, BMBF, and DFG (426671079).
Author information
Contributions
H.E., J.C., J.G.K., and J.H. conceived, designed, and planned the study. H.E., J.C., and J.G.K. collected and acquired the data. H.E., J.C., and J.G.K. performed data preparation. J.C. and H.E. analyzed the data. H.E., J.C., A.A., S.N.M., J.H., and J.G.K. interpreted the data. H.E., J.C., J.G.K., and J.H. drafted the paper. All authors critically reviewed and revised the final paper. All authors approved the decision to submit for publication.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
Use of patient data in this study was approved by the Mass General Brigham Institutional Review Board (protocol 2020P001063).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Cheng, J., Hügel, J., Tian, J. et al. Temporal Learning with Dynamic Range (TLDR) for modeling recurrent exposure and treatment outcomes. Sci Rep (2026). https://doi.org/10.1038/s41598-026-45346-y

