Abstract
Osteoradionecrosis of the jaw (ORNJ) is a radiation-induced late toxicity that can dramatically decrease patients’ quality of life. Recent increases in survival rates of head and neck cancers associated with human papillomavirus (HPV) infection have resulted in a higher frequency of radiation-induced toxicities, particularly ORNJ. Recent work with Normal Tissue Complication Probability (NTCP) models and a Weibull Accelerated Failure Time (WAFT) model has furthered our understanding of ORNJ clinical/dosimetric risk factors and longitudinal features, respectively. This data descriptor presents 1129 head and neck cancer (HNC) patients who received curative-intent radiotherapy (RT) at MD Anderson Cancer Center and were followed for development of ORNJ with clinical and radiological assessments at 3–6, 12, 18, and 24 months, and then annually, following the conclusion of RT. These data, together with the patients’ demographic, supplementary clinical, and dosimetric information, were recorded in a comma-separated value file embedded within this data descriptor. This large, longitudinal dataset is a significant resource for further systematic analysis of post-RT normal tissue outcomes in HNC.
Background & Summary
Head and neck cancers (HNC) affect over 58,000 Americans annually, with a growing proportion attributed to human papillomavirus (HPV) infection1. HPV-associated HNC are notably diagnosed in younger populations and are associated with higher survival rates in comparison to HPV-negative HNC2. Radiation therapy (RT) remains the mainstay of treatment for HPV-positive HNC, but the combination of RT with extended survival has led to an increased incidence of RT-induced late toxicities in normal tissues. One such complication is osteoradionecrosis of the jaw (ORNJ), a severe sequela following RT with an incidence ranging from 4 to 15%3. The mechanism of ORNJ is believed to be first instigated by compromised vascularity through hypoxic, hypovascular, and hypocellular tissue (Marx’s 3 H’s)4 followed by progressive loss in cortical bone integrity, ultimately impairing oral function and quality of life5,6. Due to the favorable RT response and prognosis of HPV-associated HNC and the subsequent number of patients transitioning to survivorship, there is a need to better understand the timing and progressive risk of ORNJ in relation to radiation treatment of HNC in order to optimize prevention efforts of this often debilitating condition.
Previous cross-sectional statistical analyses7,8,9,10, including Normal Tissue Complication Probability (NTCP) models of ORNJ11,12, have identified clinical and dosimetric risk factors associated with this sequela. Additionally, some studies have explored statistical correlations in longitudinal ORNJ data13,14. In recent work, we developed a fully parametric multivariable Weibull Accelerated Failure Time (WAFT) model to predict patient-specific ORNJ risk over time based on longitudinal data.
This data descriptor presents the underlying dataset used for the development of the ORNJ WAFT model15. The dataset is comprised of a large, longitudinal cohort of HNC patients and includes detailed demographic, clinical, and dosimetric variables, along with structured follow-up data and time-to-ORNJ events. The availability of this dataset offers a valuable resource for modeling ORNJ and supports the development of predictive tools for personalized survivorship care in HNC.
Methods
IRB protocol
After the University of Texas MD Anderson Cancer Center Institutional Review Board approval, data were extracted from a philanthropically funded observational cohort at the University of Texas MD Anderson Cancer Center (Stiefel Oropharynx Cancer Cohort, PA14-0947). A waiver of informed consent was approved through the MD Anderson RCR030800 protocol, allowing for retrospective analysis. All patients included were consented RT cases. We followed the formal reporting guidance of the Enhancing the QUAlity and Transparency Of health Research (EQUATOR) Network, using the RECORD Statement16, attached as a supplement.
Patient population
1129 HNC patients from an internal MD Anderson Cancer Center cohort were treated with curative intent RT from 2005 to 2022. Patients were closely followed via clinical and radiological assessments at 3, 6, 12, 18, and 24 months, and then approximately annually, following the conclusion of RT. As this cohort derives from a single institution, generalizability to institutions with different patient demographics and treatment practices may be limited. The WAFT model developed from this dataset was externally validated on an independent cohort in the parent study15. The patient data were stored in and accessed via the Epic Electronic Health Record System.
Demographic data
All demographic, clinical, and dosimetric variables are summarized in Table 1. The patients’ demographic data included: gender (male or female), age (in years), smoking status (current, former, never), and smoking pack-years. Smoking pack-years were calculated as the product of tobacco packs smoked per day and number of years smoked (Table 2).
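As a brief illustration of the pack-year definition above, the calculation can be sketched as follows. The function and argument names are hypothetical and are not taken from the dataset or its accompanying code.

```python
def pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Return smoking pack-years: packs smoked per day times years smoked.

    Argument names are illustrative assumptions, not dataset column names.
    """
    if packs_per_day < 0 or years_smoked < 0:
        raise ValueError("packs_per_day and years_smoked must be non-negative")
    return packs_per_day * years_smoked


# Example: 1.5 packs per day over 20 years.
example_pack_years = pack_years(1.5, 20)
print(example_pack_years)
```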
Clinical data
The patients’ clinical data included: overall survival, ORNJ status (binary, yes or 1 vs. no or 0), time to event, ORNJ grade, pre-RT dental extractions, T stage, N stage, chemotherapy (induction vs. induction and concurrent vs. concurrent vs. no chemotherapy), post-operative RT vs. definitive RT, HPV/p16 + Ve status (yes vs. no or unknown), tumor site group (oropharynx vs. oral cavity vs. nasopharynx/nasal cavity/paranasal sinuses vs. larynx/hypopharynx vs. major salivary glands vs. other), and mandible volume (in cubic centimeters, cc). 916 patients (81%) were coded with an HPV/p16 + Ve status of ‘Unknown.’ While this reflects practical limitations17, it may bias future analyses, as HPV/p16 status has been shown to impact survival rates and quality of life2,18. Multiple imputation is a potential strategy for deriving missing HPV/p16 statuses19,20. Patients with missing data were not included in the original analysis15. Overall survival status was binarily coded as 0 for no survival and 1 for survival, and overall survival time represents the time in months between the RT start date and the time of death or last follow-up. As this dataset covers a wide range of years preceding ORNJ grading consensus21, ORNJ status was binarily coded to account for any variability across staging systems22: 0 indicated no ORNJ detected and 1 indicated an active ORNJ diagnosis (of any grade) at time of last follow-up. To reflect current clinical standards, ORNJ grade was also specified using a numeric value of 0–4 following the Tsai staging system23. For patients with active ORNJ, time to event was calculated in months from the RT start date to the time of ORNJ diagnosis. For patients without an active ORNJ diagnosis, the time to event was censored at the time in months from the RT start date to either time of death or last follow-up. Pre-RT dental extractions were binarily coded: 0 indicated negative and 1 indicated positive for pre-RT dental extractions.
T stage and N stage indicate the cancer stage, following the standard TNM staging system by the American Joint Committee on Cancer (AJCC, 7th/8th edition) and the International Union Against Cancer. Chemotherapy and post-operative RT vs. definitive RT indicate if RT was combined with another treatment; ‘concurrent chemotherapy’ indicates chemotherapy occurred simultaneously with RT while ‘induction chemotherapy’ indicates chemotherapy was completed before RT. Likewise, ‘post-op RT’ indicates RT was completed following surgery while ‘definitive’ indicates RT was completed without surgery. HPV/p16 + Ve indicates positive expression of HPV/p16 via ‘yes’, ‘no’, or ‘unknown.’ Mandible volume was reported (in cc) from delineated mandible contours; mandible bone was auto-segmented with a previously validated multiatlas-based auto-segmentation using commercial software ADMIRE (research version 1.1; Elekta AB, Stockholm, Sweden).
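The time-to-event coding described above (event time for patients with ORNJ, right-censored time at death or last follow-up otherwise) can be illustrated with a minimal sketch. The function names, the use of calendar dates as inputs, and the conversion factor of 365.25/12 days per month are assumptions made for illustration; the source does not state how days were converted to months.

```python
from datetime import date
from typing import Optional, Tuple

# Assumption: average month length; the source does not specify the conversion.
DAYS_PER_MONTH = 365.25 / 12


def months_between(start: date, end: date) -> float:
    """Elapsed time between two dates, expressed in months."""
    return (end - start).days / DAYS_PER_MONTH


def time_to_ornj(
    rt_start: date,
    ornj_diagnosis: Optional[date],
    last_contact: date,
) -> Tuple[int, float]:
    """Return (ORNJ status, time to event in months).

    Status 1: ORNJ diagnosed, time runs from RT start to diagnosis.
    Status 0: no ORNJ, time is censored at death or last follow-up
    (passed here as the hypothetical argument last_contact).
    """
    if ornj_diagnosis is not None:
        return 1, months_between(rt_start, ornj_diagnosis)
    return 0, months_between(rt_start, last_contact)


# A patient diagnosed with ORNJ one year after starting RT.
event_case = time_to_ornj(date(2015, 1, 1), date(2016, 1, 1), date(2018, 6, 1))
# A patient without ORNJ, censored at last follow-up two years after RT start.
censored_case = time_to_ornj(date(2015, 1, 1), None, date(2017, 1, 1))
print(event_case, censored_case)
```

Keeping the status flag and the time together in one return value mirrors the pairing that survival models expect, where a censored time is only interpretable alongside its status.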
Dosimetric data
The patients’ dosimetric data included the following dose-volume metrics: volume of the mandible receiving at least a specified dose (V5-V80 Gy in 5 Gy increments), and dose received by a specified volume of mandible (D0.5%, D1%, D2%, D3%, D5-D95% in 5% increments, D97%, D98%, D99%, D99.5%). These metrics were calculated directly from the radiation dose distribution DICOM files using Python-based software that was developed from core standards and software24,25,26,27,28,29,30, notably pydicom and the RT Dose Module Attributes specified in DICOM PS3.3, and tested in-house.
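To make the two families of dose-volume metrics concrete, the following is a minimal sketch of how a Vx and a Dx% value can be derived from the doses of the voxels inside a structure. It is not the authors' software. It assumes equally sized voxels, takes a plain list of voxel doses rather than DICOM input, uses no interpolation between voxels, and reports Vx as a percentage of the structure volume (the source does not state whether Vx is relative or absolute). All names are hypothetical.

```python
import math
from typing import Sequence


def volume_at_dose(doses_gy: Sequence[float], threshold_gy: float) -> float:
    """Vx: percentage of the structure receiving at least threshold_gy.

    Assumes every voxel has the same volume.
    """
    if len(doses_gy) == 0:
        raise ValueError("doses_gy must contain at least one voxel dose")
    n_at_or_above = sum(1 for dose in doses_gy if dose >= threshold_gy)
    return 100.0 * n_at_or_above / len(doses_gy)


def dose_at_volume(doses_gy: Sequence[float], volume_percent: float) -> float:
    """Dx%: minimum dose received by the hottest volume_percent of the structure."""
    if len(doses_gy) == 0:
        raise ValueError("doses_gy must contain at least one voxel dose")
    if not 0 < volume_percent <= 100:
        raise ValueError("volume_percent must be in the interval (0, 100]")
    ranked = sorted(doses_gy, reverse=True)  # hottest voxel first
    # Number of voxels making up the requested volume, at least one voxel.
    n_voxels = max(1, math.ceil(volume_percent * len(ranked) / 100))
    return ranked[n_voxels - 1]


# Toy structure of ten equally sized voxels with doses in Gy.
mandible_doses = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
v50 = volume_at_dose(mandible_doses, 50.0)   # percent of volume with >= 50 Gy
d50 = dose_at_volume(mandible_doses, 50.0)   # dose covering the hottest 50%
print(v50, d50)
```

In practice the voxel doses would come from resampling the RT Dose grid onto the mandible contour, a step omitted here.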
Data Record
The complete comma-separated value (CSV) file containing demographic, clinical, and dosimetric data for the aforementioned patient population is publicly available on figshare31. This CSV file provides the unique opportunity for analysis of a large HNC cohort with detailed treatment-related information related to prevalence and timing of ORNJ.
The authors acknowledge the tension between open science and patient confidentiality; this is particularly important with cohorts of long-term survivors. As such, patients were anonymized through a randomly assigned subject ID independent from their medical record number (MRN). The dataset contains no other patient identifiers (Figs. 1, 2).
Diagram showing the data collection workflow and input into the final CSV file. Patients underwent RT at MD Anderson Cancer Center (left), in which dosimetric data was generated and acquired from a treatment planning system (top middle). Patient demographic and clinical data were also acquired from initial and follow-up visits (bottom middle). These data were then inputted into the CSV file included within this data descriptor (right).
Technical Validation
Patient demographic and clinical data were manually extracted by post-doctoral fellows with radiation oncology training from the University of Texas MD Anderson Cancer Center’s Epic Electronic Health Record System server and imported into REDCap electronic data capture tools hosted at the University of Texas MD Anderson Cancer Center32,33. The dataset was curated by multiple observers over time using a standardized template and variable dictionary. When discrepancies were suspected, records were double-checked against the original sources and corrected if inconsistencies were identified. Although formal inter-rater reliability statistics were not calculated, this approach provided additional quality assurance during the curation process to minimize misclassification and confirmation biases34.
Dosimetric data were obtained from clinical radiotherapy treatment plans using the RayStation treatment planning system (RaySearch Laboratories AB, Stockholm, Sweden). These data were first exported in standardized DICOM-RT (Digital Imaging and Communications in Medicine – Radiation Therapy) format and then analyzed to calculate dose-volume metrics to be used in the model.
A total of 1471 patients were examined for eligibility for this analysis. Of these, 342 were excluded, either for clinical reasons such as prior irradiation or for incomplete or missing data. The final dataset comprised 1129 HNC patients from MD Anderson Cancer Center.
Usage Notes
The WAFT-based time-to-ORNJ online calculator graphical user interface (GUI) is available at https://uic-evl.github.io/OsteoradionecrosisVis/.
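For readers unfamiliar with the model family behind the calculator, the general form of a Weibull AFT model can be sketched as follows. In the standard parameterization, the probability of remaining event-free at time t is S(t | x) = exp(-(t / λ(x))^k), where the scale λ(x) = exp(β0 + β·x) depends on the covariates and k is the shape parameter. The sketch below implements only this textbook form; the function name is hypothetical and the numeric values are placeholders, not the coefficients fitted in the parent study.

```python
import math
from typing import Sequence


def weibull_aft_event_free_probability(
    t_months: float,
    covariates: Sequence[float],
    coefficients: Sequence[float],
    intercept: float,
    shape: float,
) -> float:
    """Event-free probability S(t | x) under a Weibull AFT model.

    scale = exp(intercept + sum(coefficient * covariate)); S = exp(-(t/scale)^shape).
    """
    if len(covariates) != len(coefficients):
        raise ValueError("covariates and coefficients must have the same length")
    if t_months < 0 or shape <= 0:
        raise ValueError("t_months must be non-negative and shape positive")
    linear_predictor = intercept + sum(
        beta * x for beta, x in zip(coefficients, covariates)
    )
    scale = math.exp(linear_predictor)
    return math.exp(-((t_months / scale) ** shape))


# Placeholder parameters chosen only to exercise the formula (NOT fitted values).
placeholder_intercept = math.log(100.0)
placeholder_coefficients = [0.5]
baseline = weibull_aft_event_free_probability(
    60.0, [0.0], placeholder_coefficients, placeholder_intercept, shape=1.0
)
shifted = weibull_aft_event_free_probability(
    60.0, [1.0], placeholder_coefficients, placeholder_intercept, shape=1.0
)
print(baseline, shifted)
```

In this parameterization a positive coefficient stretches the time scale, so the event-free probability at a fixed time is higher; a negative coefficient accelerates time to event.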
Data availability
The dataset is available on figshare31 and is publicly accessible at https://doi.org/10.6084/m9.figshare.26240435.v1. In accordance with NOT-OD-21-013, final NIH Policy for Data Management and Sharing, anonymized/de-identified data that support the findings of this study are openly available in an NIH supported generalist scientific data repository (figshare) no later than the time of an associated publication.
Code availability
The script used for analyzing this dataset and for training and testing the WAFT model is available in this repository: https://github.com/LaiaHV-MDACC/ORN-time-to-event-prediction-modelling.
References
Siegel, R. L., Giaquinto, A. N. & Jemal, A. Cancer statistics, 2024. CA. Cancer J. Clin. 74, 12–49, https://acsjournals.onlinelibrary.wiley.com/doi/full/10.3322/caac.21820 (2024).
Young, D. et al. Increase in head and neck cancer in younger patients due to human papillomavirus (HPV). Oral Oncol 51, 727–730 (2015).
Frankart, A. J. et al. Osteoradionecrosis: Exposing the Evidence Not the Bone. Int. J. Radiat. Oncol. 109, 1206–1218 (2021).
Marx, R. E. Osteoradionecrosis: a new concept of its pathophysiology. J. Oral Maxillofac. Surg. Off. J. Am. Assoc. Oral Maxillofac. Surg. 41, 283–288 (1983).
Laraway, D. C. & Rogers, S. N. A structured review of journal articles reporting outcomes using the University of Washington Quality of Life Scale. Br. J. Oral Maxillofac. Surg. 50, 122–131 (2012).
O’Dell, K. & Sinha, U. Osteoradionecrosis. Oral Maxillofac. Surg. Clin. N. Am. 23, 455–464 (2011).
Dose-volume correlates of mandibular osteoradionecrosis in oropharynx cancer patients receiving intensity-modulated radiotherapy: Results from a case-matched comparison. Radiother. Oncol. J. Eur. Soc. Ther. Radiol. Oncol. 124, 232–239 (2017).
Aarup-Kristensen, S. et al. Osteoradionecrosis of the mandible after radiotherapy for head and neck cancer: risk factors and dose-volume correlations. Acta Oncol. Stockh. Swed. 58, 1373–1377 (2019).
Kubota, H. et al. Risk factors for osteoradionecrosis of the jaw in patients with head and neck squamous cell carcinoma. Radiat. Oncol. Lond. Engl. 16, 1 (2021).
Möring, M. M. et al. Osteoradionecrosis after postoperative radiotherapy for oral cavity cancer: A retrospective cohort study. Oral Oncol 133, 106056 (2022).
van Dijk, L. V. et al. Normal Tissue Complication Probability (NTCP) Prediction Model for Osteoradionecrosis of the Mandible in Patients With Head and Neck Cancer After Radiation Therapy: Large-Scale Observational Cohort. Int. J. Radiat. Oncol. Biol. Phys. 111, 549–558 (2021).
Humbert-Vidan, L. et al. Multi-institutional Normal Tissue Complication Probability (NTCP) Prediction Model for Mandibular Osteoradionecrosis: Results from the PREDMORN Study. 2025.02.10.25322026 Preprint at https://doi.org/10.1101/2025.02.10.25322026 (2025).
Treister, N. S. et al. Exposed bone in patients with head and neck cancer treated with radiation therapy: an analysis of the Observational Study of Dental Outcomes in Head and Neck Cancer Patients (OraRad). Cancer 128, 487–496 (2022).
Goldwaser, B. R., Chuang, S.-K., Kaban, L. B. & August, M. Risk Factor Assessment for the Development of Osteoradionecrosis. J. Oral Maxillofac. Surg. 65, 2311–2316 (2007).
Humbert-Vidan, L. et al. Externally validated digital decision support tool for time-to-osteoradionecrosis risk-stratification using right-censored multi-institutional observational cohorts. Radiother. Oncol. 207 (2025).
Benchimol, E. I. et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med 12, e1001885 (2015).
Sijtsema, N. D. et al. Development of a local dose-response relationship for osteoradionecrosis within the mandible. Radiother. Oncol. 186, 109736 (2023).
Margalit, D. N. et al. Radiation Therapy for HPV-Positive Oropharyngeal Squamous Cell Carcinoma: An ASTRO Clinical Practice Guideline. Pract. Radiat. Oncol. 14, 398–425 (2024).
Ren, J. et al. Multiple imputation and clinico-serological models to predict human papillomavirus status in oropharyngeal carcinoma: An alternative when tissue is unavailable. Int. J. Cancer 146, 2166–2174 (2020).
Habbous, S. et al. Human papillomavirus in oropharyngeal cancer in Canada: analysis of 5 comprehensive cancer centres using multiple imputation. CMAJ Can. Med. Assoc. J. 189, E1030–E1040 (2017).
Peterson, D. E. et al. Prevention and Management of Osteoradionecrosis in Patients With Head and Neck Cancer Treated With Radiation Therapy: ISOO-MASCC-ASCO Guideline. J. Clin. Oncol. 42, 1975–1996 (2024).
Watson, E. E. et al. Development and Standardization of an Osteoradionecrosis Classification System in Head and Neck Cancer: Implementation of a Risk-Based Model. J. Clin. Oncol. 42, 1922–1933 (2024).
Tsai, C. J. et al. Osteoradionecrosis and radiation dose to the mandible in patients with oropharyngeal cancer. Int. J. Radiat. Oncol. Biol. Phys. 85, 415–420 (2013).
Mason, D. et al. pydicom/pydicom: pydicom 3.0.1. Zenodo https://doi.org/10.5281/zenodo.13824606 (2024).
Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).
Lowekamp, B. C., Chen, D. T., Ibanez, L. & Blezek, D. The Design of SimpleITK. Front. Neuroinformatics 7 (2013).
van der Walt, S. et al. scikit-image: image processing in Python. PeerJ 2, e453 (2014).
The pandas development team. pandas-dev/pandas: Pandas. Zenodo https://doi.org/10.5281/zenodo.16918803 (2025).
Hunter, J. D. Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 9, 90–95 (2007).
NEMA PS3 / ISO 12052, Digital Imaging and Communications in Medicine (DICOM) Standard, National Electrical Manufacturers Association, Rosslyn, VA, USA (available free at http://www.dicomstandard.org/).
Humbert-Vidan, L., Kamel, S., Fuller, C. D., Moreno, A. & Lai, S. MDACC ORN Time-to-event anonymized clinical dataset. figshare https://doi.org/10.6084/m9.figshare.26240435.v1 (2025).
Harris, P. A. et al. Research Electronic Data Capture (REDCap) - A metadata-driven methodology and workflow process for providing translational research informatics support. J. Biomed. Inform. 42, 377–381 (2009).
Harris, P. A. et al. The REDCap Consortium: Building an International Community of Software Platform Partners. J. Biomed. Inform. 95, 103208 (2019).
Althubaiti, A. Information bias in health research: definition, pitfalls, and adjustment methods. J. Multidiscip. Healthc. 9, 211–217 (2016).
Acknowledgements
NAW is supported by a training fellowship from UTHealth Houston Center for Clinical and Translational Sciences T32 Program (Grant No. T32 TR004905), a NIH National Institute of Dental and Craniofacial Research (NIDCR) Academic Industrial Partnership Grant (R01DE028290), and the American Legion Auxiliary Fellowship in Cancer Research. ZK is supported by a doctoral fellowship from the Cancer Prevention Research Institute of Texas grant RP210042. GEM acknowledges funding support from NIH UG3 TR004501, NSF CNS-2320261, the University Scholar Award from the University of Illinois System, and NIH/NCI R01CA258827. JR received salary support from the NIDCR Diversity Supplement Grant R01DE028290-02S1. KAW was supported by the Image-Guided Cancer Therapy T32 Training Program Fellowship from T32CA261856. KKB acknowledges support from the Image Guided Cancer Therapy Research Program at The University of Texas MD Anderson Cancer Center, which was partially funded by the National Institutes of Health/NCI under award number P30CA016672. MAN received funding from the National Institutes of Health/National Institute of Dental and Craniofacial Research (NIH/NIDCR) through grant R03DE033550. ASRM received funding from NIDCR (U01DE032168, 1R01DE028290-01A1) and NCI (R01CA258827). LVVD received funding and salary support from KWF Dutch Cancer Society through a Young Investigator Grant (KWF-13529) and from NWO ZonMw through the VENI grant (NWO-09150162010173). ACM receives funding from the NIH/NIDCR via grants K12CA088084, R21DE031082, and K01DE030524. CDF, SYL, and KAH receive related funding support from the NIH/NIDCR (U01DE032168). CDF and SYL also receive funding support from the NIH/NIDCR (R01DE025248). CDF also receives infrastructure and salary support through the NIH/NCI MD Anderson Cancer Center Core Support Grant (CCSG) Image-Driven Biologically-informed Therapy (IDBT) program (P30CA016672-47). SYL is supported through the CCSG Head and Neck Program (P30CA016672-48). 
This work was supported directly or in part by effort, funding, resources or infrastructure from the NIH NCI OPC SURVIVOR: Optimizing OroPharyngeal Cancer SURVIVORship Program Project Grant (NCI P01CA285249); and the Charles and Daneen MD Anderson Oropharyngeal Cancer Fund. In accordance with NOT-OD-25-049, Supplemental Guidance to the 2024 NIH Public Access Policy: Government Use License and Rights: “This manuscript is the result of funding in whole or in part by the National Institutes of Health (NIH). It is subject to the NIH Public Access Policy. Through acceptance of this federal funding, NIH has been given a right to make this manuscript publicly available in PubMed Central upon the Official Date of Publication, as defined by NIH.”
Author information
Contributions
Conceptualization: N.A.W., S.K., Z.K., M.A., G.C., X.Z., K.K.B., R.H., M.A.N., K.A.H., L.V.V.D., A.C.M., C.D.F., L.H.V. Methodology: N.A.W., S.K., Z.K., M.A., M.M.C., R.H., M.A.N., K.A.H., A.C.M., C.D.F., L.H.V. Software: S.K., A.W., Z.K., G.E.M., C.D.F., L.H.V. Validation: S.K., C.D.F., L.H.V. Formal analysis: S.K., Z.K., A.C.M., C.D.F., L.H.V. Investigation: N.A.W., S.K., A.W., Z.K., M.A., G.E.M., G.C., X.Z., M.M.C., K.A.W., J.R., K.K.B., M.C., A.O.O., R.A.W., R.H., M.A.N., K.A.H., A.S.R.M., L.V.V.D., A.C.M., S.Y.L., C.D.F., L.H.V. Resources: N.A.W., S.K., Z.K., G.E.M., M.A.N., K.A.H., S.Y.L., C.D.F., L.H.V. Data curation: N.A.W., S.K., Z.K., M.A., M.M.C., C.D.F., L.H.V. Writing - original draft: N.A.W., C.D.F., L.H.V. Writing - review and editing: N.A.W., S.K., A.W., Z.K., M.A., G.E.M., G.C., X.Z., M.M.C., K.A.W., J.R., K.K.B., M.C., A.O.O., R.A.W., R.H., M.A.N., K.A.H., A.S.R.M., L.V.V.D., A.C.M., S.Y.L., C.D.F., L.H.V. Visualization: N.A.W., L.H.V. Supervision: N.A.W., S.K., Z.K., A.C.M., C.D.F., L.H.V. Project administration: N.A.W., S.K., Z.K., A.C.M., C.D.F., L.H.V. Funding: K.A.H., A.C.M., S.Y.L., C.D.F.
Ethics declarations
Competing interests
CDF has received unrelated grant support from Elekta AB and holds unrelated patents licensed to Kallisio, Inc. (US PTO 11730561) through the University of Texas, from which they receive patent royalties. CDF has also received unrelated travel and honoraria from Elekta AB, Philips Medical Systems, Siemens Healthineers/Varian, and Corewell Health. Additionally, CDF has served in an unpaid advisory capacity for Siemens Healthineers/Varian and has served on the guidelines/scientific committee for Osteoradionecrosis for the American Society of Clinical Oncology. VCS is a consultant and equity holder in Femtovox Inc and a consultant for PDS Biotechnology. KAW serves as an Associate Editor for Physics and Imaging in Radiation Oncology. The authors declare that no other competing interests exist.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
West, N.A., Kamel, S., Wentzel, A. et al. Clinical and dosimetric dataset of time-to-event normal tissue complication probability for osteoradionecrosis. Sci Data 13, 16 (2026). https://doi.org/10.1038/s41597-025-06321-w
DOI: https://doi.org/10.1038/s41597-025-06321-w




