IdentifiHR predicts homologous recombination deficiency in high-grade serous ovarian carcinoma using gene expression

Weir, Ashley L.; Lee, Samuel C.; Li, Mengbo; Pandey, Ahwan; Tan, Chin Wee; Garsed, Dale W.; Ramus, Susan J.; Davidson, Nadia M.

doi:10.1038/s43856-026-01387-y

Article
Open access
Published: 14 January 2026

Communications Medicine (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

Background

Approximately half of all high-grade serous ovarian carcinomas (HGSCs) have a therapeutically targetable defect in homologous recombination (HR) DNA repair. While there are genomic and transcriptomic methods, developed for other cancers, to identify HR deficient (HRD) samples, there are no gene expression-based tools to predict HR status in HGSC specifically. We have built a HGSC-specific model to predict HR status using gene expression.

Methods

We separated The Cancer Genome Atlas (TCGA) cohort of HGSCs into training (n = 288) and testing (n = 73) sets and labelled each case as HRD or HR proficient (HRP) based on the clinical standard for classification. Using the training set, we performed differential gene expression analysis between HRD and HRP cases. The 2604 significantly differentially expressed genes were used to train a penalised logistic regression model.
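
To make the modelling step concrete, the following is a minimal sketch of penalised logistic regression, not the authors' code. It assumes an L1 (lasso) penalty, which the abstract does not specify, and it fits the model by proximal gradient descent on simulated data in which two hypothetical "genes" carry signal and eight are noise. All function names, parameter values and data here are illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a coefficient towards zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, step=0.5, n_iter=300):
    """Fit L1-penalised logistic regression by proximal gradient descent.

    X is a list of samples (lists of expression values), y holds 0/1 labels.
    The intercept is left unpenalised.
    """
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    intercept = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * p
        grad_b = 0.0
        for row, label in zip(X, y):
            z = intercept + sum(w * x for w, x in zip(weights, row))
            err = sigmoid(z) - label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        intercept -= step * grad_b
        # Gradient step followed by soft-thresholding gives exact zeros.
        weights = [soft_threshold(w - step * g, step * lam)
                   for w, g in zip(weights, grad_w)]
    return intercept, weights


def predict(intercept, weights, row):
    """Return 1 (e.g. HRD) if the predicted probability is at least 0.5."""
    z = intercept + sum(w * x for w, x in zip(weights, row))
    return 1 if sigmoid(z) >= 0.5 else 0


# Simulated data (assumption): features 0 and 1 differ by class, 2-9 are noise.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    label = rng.randint(0, 1)
    shift = 1.0 if label == 1 else -1.0
    row = [rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)]
    row += [rng.gauss(0.0, 1.0) for _ in range(8)]
    X.append(row)
    y.append(label)

intercept, weights = fit_l1_logistic(X, y)
accuracy = sum(predict(intercept, weights, r) == t for r, t in zip(X, y)) / len(y)
selected = [j for j, w in enumerate(weights) if w != 0.0]
```

The penalty drives uninformative coefficients to exactly zero, so `selected` lists the features the model retains. This is the general mechanism by which a penalised model can reduce a large candidate gene set to a smaller set of predictors.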

Results

IdentifiHR uses the expression of 209 genes to predict HR status in HGSC. These genes preserve the genomic damage signal, capturing known regions of HR-specific copy number alteration which impact gene expression. IdentifiHR is 85% accurate in the TCGA test set and 86% accurate in an independent cohort of 99 samples, taken from primary tumours, ascites and normal fallopian tubes. Further, IdentifiHR is 84% accurate in pseudobulked single-cell HGSC sequencing from 37 patients and outperforms existing expression-based methods for predicting HR status, namely BRCAness, MultiscaleHRD and expHRD.
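
As an illustration of what pseudobulking means, the sketch below sums per-cell gene counts into a single profile per patient. Summation of raw counts is one common way to pseudobulk and is an assumption here, as the abstract does not state how the aggregation was done. The function name, cell identifiers, patient identifiers and count values are all hypothetical.

```python
from collections import defaultdict


def pseudobulk(cell_counts, cell_to_patient):
    """Sum per-cell gene counts into one expression profile per patient."""
    bulk = defaultdict(lambda: defaultdict(int))
    for cell_id, gene_counts in cell_counts.items():
        patient = cell_to_patient[cell_id]
        for gene, count in gene_counts.items():
            bulk[patient][gene] += count
    # Convert nested defaultdicts to plain dicts for a predictable result.
    return {patient: dict(genes) for patient, genes in bulk.items()}


# Hypothetical toy input: three cells from two patients.
cells = {
    "c1": {"BRCA1": 2, "TP53": 1},
    "c2": {"BRCA1": 3},
    "c3": {"TP53": 4},
}
mapping = {"c1": "P1", "c2": "P1", "c3": "P2"}
profiles = pseudobulk(cells, mapping)
```

Each resulting profile resembles a bulk RNA sequencing sample, which is what allows a model trained on bulk data to be applied to single-cell cohorts.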

Conclusions

IdentifiHR is an accurate model to predict HR status in HGSC. It is available as an open source R package, empowering researchers to robustly classify HR status when only transcriptomic sequencing data is available.

Plain language summary

High-grade serous ovarian cancer (HGSC) is a type of ovarian cancer with very poor outcomes. However, half of HGSCs have faulty DNA repair that can be targeted for treatment if it is identified. Existing methods look at changes in DNA that arise when repair is faulty, but do not consider which genes are actively being used, or are “expressed”, by the cancer. We developed IdentifiHR, a machine learning method to predict DNA repair status using the expression of 209 genes. We tested IdentifiHR on 209 patient samples and found it correctly predicts repair status in about 85–86% of cases, performing better than existing tools on the same patient data. IdentifiHR is released as a software package for public use.

Data availability

The results published here are in whole or part based upon data generated by The Cancer Genome Atlas, managed by the NCI and NHGRI. Information about TCGA can be found at http://cancergenome.nih.gov. RNA sequencing, gene-level copy number, methylation, SNP and structural variant data collected on the TCGA HGSC cohort, with associated clinical data, are available from the Genomic Data Commons (TCGA project) data portal (https://portal.gdc.cancer.gov/, https://www.cancer.gov/tcga, dbGaP Study Accession: phs000178.v11.p8). AOCS gene expression counts were accessed at the Gene Expression Omnibus (accession: GSE209964). Previously published WGS data are available from the European Genome-phenome Archive (accession: EGAD00001000877). MSKCC gene expression counts were available as a Seurat object on Synapse (SynID: syn51091849). The source data for Fig. 2A, D, E can be found in Supplementary Data 2, for Fig. 3A, B in Supplementary Data 5, for Fig. 3C in Supplementary Data 6, for Fig. 3D, E in Supplementary Data 7, for Fig. 3F in Supplementary Data 9 and for Fig. 4A–C in Supplementary Data 5. These data are available in the supplementary information and in the IdentifiHR repository, https://github.com/DavidsonGroup/IdentifiHR. All other data supporting the findings of this study, including the source data for all figures, are publicly available.

Code availability

All analyses were carried out in R v4.2.1. Code to reproduce the analysis can be found in the IdentifiHR repository, https://github.com/DavidsonGroup/IdentifiHR.

Acknowledgements

A.L.W. is supported by a Research Training Program scholarship and is partially funded by a CSL PhD top-up scholarship and a Tour De Cure PhD grant. N.M.D. is funded by NHMRC Investigator Grant [GNT2016547 to N.M.D.] and the Estate of Judith Corrie Philpots. S.J.R. is funded by NHMRC Investigator Grant [GNT2009840 to S.J.R.]. We thank Dr Matthew Wakefield for offering insight and expertise in ovarian carcinoma biology. We acknowledge the contributions of Dr Ksenija Nesic and the entire laboratory of Professor Clare Scott at the Walter and Eliza Hall Institute for offering feedback on the complete IdentifiHR model. We thank Professor James Brenton and members of the Brenton laboratory for discussions surrounding HR and the training of our model. Figures 1 and 3 were created, in part, in BioRender. Weir, A. (2025); https://BioRender.com/isms2aw. We thank the many patients who contributed to the data used in this research, and our cancer consumer advisers. We also acknowledge the Wurundjeri people of the Kulin nation as the traditional owners and guardians of the land on which the work was performed.

Author information

Authors and Affiliations

The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
Ashley L. Weir, Samuel C. Lee, Mengbo Li, Chin Wee Tan & Nadia M. Davidson
Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Parkville, VIC, Australia
Ashley L. Weir, Samuel C. Lee, Mengbo Li, Chin Wee Tan & Nadia M. Davidson
Olivia Newton-John Cancer Research Institute, Heidelberg, VIC, Australia
Samuel C. Lee
School of Cancer Medicine, La Trobe University, Bundoora, VIC, Australia
Samuel C. Lee
Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
Ahwan Pandey & Dale W. Garsed
The Sir Peter MacCallum Department of Oncology, The University of Melbourne, Melbourne, VIC, Australia
Ahwan Pandey & Dale W. Garsed
Frazer Institute, Faculty of Medicine, The University of Queensland, Woolloongabba, Brisbane, QLD, Australia
Chin Wee Tan
School of Clinical Medicine, UNSW Medicine and Health, University of NSW Sydney, Sydney, NSW, Australia
Susan J. Ramus

Contributions

A.L.W. and N.M.D. conceived and designed the study. A.L.W. collected, processed, and curated all data, developed the methodology, validated the method and results, wrote the original draft and all subsequent iterations, and produced all tables and visualisations in the study. N.M.D., S.J.R. and C.W.T. supervised the research and revised the manuscript. D.W.G. and A.P. processed and analysed WGS data in the AOCS cohort. S.C.L. and M.L. advised on method development and analysis. All authors contributed to the review of the manuscript. All authors approved the manuscript for submission.

Corresponding authors

Correspondence to Ashley L. Weir or Nadia M. Davidson.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Medicine thanks Michael Menzel and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Transparent Peer Review file

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Supplementary Data 6

Supplementary Data 7

Supplementary Data 8

Supplementary Data 9

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Weir, A.L., Lee, S.C., Li, M. et al. IdentifiHR predicts homologous recombination deficiency in high-grade serous ovarian carcinoma using gene expression. Commun Med (2026). https://doi.org/10.1038/s43856-026-01387-y

Received: 19 September 2024
Accepted: 06 January 2026
Published: 14 January 2026
DOI: https://doi.org/10.1038/s43856-026-01387-y