Uncovering the role of LINE-1 in the evolution of lung adenocarcinoma

Abstract

Understanding lung cancer evolution can identify tools for intercepting its growth (refs. 1,2). Here, in a landscape analysis of 1,024 lung adenocarcinomas (LUADs) with deep whole-genome sequencing integrated with multiomic data, we identified 542 LUADs with a diverse clonal architecture. In this group, we observed divergent evolutionary trajectories based on tobacco smoking exposure, ancestry and sex. LUAD from smokers showed an abundance of tobacco-related C:G>A:T driver mutations (ref. 3) in KRAS and short subclonal diversification. LUAD in people who have never smoked (hereafter, never-smokers) showed early occurrence of copy-number alterations and EGFR mutations associated with SBS5 and SBS40a mutational signatures. Tumours containing EGFR mutations exhibited long latency, particularly in female individuals of European ancestry. Tumours from Asian never-smokers showed a short clonal evolution. Importantly, we found that the mutational signature ID2 (ref. 4) is a marker of a previously unrecognized mechanism for LUAD evolution. Tumours with ID2 showed short latency and high long interspersed nuclear element-1 (LINE-1, hereafter L1) retrotransposon activity linked to L1 promoter demethylation. These tumours exhibited an aggressive phenotype with genomic instability, elevated hypoxia scores, low neoantigen burden, metastasis propensity and poor overall survival. Reactivated L1-retrotransposition-induced mutagenesis probably contributes to the mutational signature ID2, including through the regulation of the transcription factor ZNF695, a member of the KZFP family (ref. 5). The complex nature of LUAD evolution creates both challenges and opportunities for screening and treatment plans.

Fig. 1: Evolutionary dynamics of lung cancer.
Fig. 2: Features associated with lung tumour latency.
Fig. 3: Characterization of tumours with the mutational signature ID2.
Fig. 4: The association between L1 retrotransposition and the mutational signatures ID2 and ID1.
Fig. 5: Activation of germline L1 retrotransposition due to DNA demethylation of L1 promoter.
Fig. 6: ZNF695 upregulation in tumours and its association with mutational signature ID2.

Data availability

Normal and tumour-paired CRAM files, as well as raw methylation intensity (IDAT) files for the WGS participants from the Sherlock-Lung study, have been deposited in dbGaP under accession numbers phs001697.v2.p1 and phs002992.v1.p1. RNA-seq FASTQ files for the same individuals are available through dbGaP under the accession number phs002346.v1.p1. The human reference genome (GRCh38) was obtained from the GATK resource repository (https://github.com/broadinstitute/gatk/blob/master/src/test/resources/large/Homo_sapiens_assembly38.fasta.gz). Publicly available LUAD multi-omics datasets can be accessed through dbGaP (accessions phs000178.v9.p8 and phs000488.v1.p1) and the European Genome-phenome Archive (EGA) (EGAS00001001757, EGAS00001002801 and EGAS00001003830). Detailed dataset descriptions and accession information are also provided in Supplementary Table 1.

Code availability

The WGS bioinformatics pipelines can be accessed at GitHub (https://github.com/xtmgah/Sherlock-Lung). The Battenberg SCNA calling algorithm can be found at GitHub (https://github.com/Wedge-lab/battenberg). The Dirichlet process-based method for the subclonal reconstruction of tumours can be found at GitHub (https://github.com/Wedge-lab/dpclust). The bioinformatic pipeline for identifying TE insertions is available at GitLab (https://gitlab.com/mobilegenomesgroup/TraFiC).

References

  1. Yates, L. R. & Campbell, P. J. Evolution of the cancer genome. Nat. Rev. Genet. 13, 795–806 (2012).

  2. Jamal-Hanjani, M. et al. Tracking the evolution of non-small-cell lung cancer. N. Engl. J. Med. 376, 2109–2121 (2017).

  3. Alexandrov, L. B. et al. Mutational signatures associated with tobacco smoking in human cancer. Science 354, 618–622 (2016).

  4. Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).

  5. Yang, P., Wang, Y. & Macfarlan, T. S. The role of KRAB-ZFPs in transposable element repression and mammalian evolution. Trends Genet. 33, 871–881 (2017).

  6. Zhang, Y. et al. Global variations in lung cancer incidence by histological subtype in 2020: a population-based study. Lancet Oncol. 24, 1206–1218 (2023).

  7. Gerstung, M. et al. The evolutionary history of 2,658 cancers. Nature 578, 122–128 (2020).

  8. Frankell, A. M. et al. The evolution of lung cancer and impact of subclonal selection in TRACERx. Nature 616, 525–533 (2023).

  9. Burns, K. H. Transposable elements in cancer. Nat. Rev. Cancer 17, 415–424 (2017).

  10. Rodriguez-Martin, B. et al. Pan-cancer analysis of whole genomes identifies driver rearrangements promoted by LINE-1 retrotransposition. Nat. Genet. 52, 306–319 (2020).

  11. Mendez-Dorantes, C. et al. Chromosomal rearrangements and instability caused by the LINE-1 retrotransposon. Preprint at bioRxiv https://doi.org/10.1101/2024.12.14.628481 (2024).

  12. Rodić, N. et al. Retrotransposon insertions in the clonal evolution of pancreatic ductal adenocarcinoma. Nat. Med. 21, 1060–1064 (2015).

  13. Nguyen, T. H. M. et al. L1 retrotransposon heterogeneity in ovarian tumor cell evolution. Cell Rep. 23, 3730–3740 (2018).

  14. Zhang, T. et al. Genomic and evolutionary classification of lung cancer in never smokers. Nat. Genet. 53, 1348–1359 (2021).

  15. Landi, M. T. et al. Tracing lung cancer risk factors through mutational signatures in never-smokers. Am. J. Epidemiol. 190, 962–976 (2021).

  16. Zhu, B. et al. The genomic and epigenomic evolutionary history of papillary renal cell carcinomas. Nat. Commun. 11, 3096 (2020).

  17. Senkin, S. et al. Geographic variation of mutagenic exposures in kidney cancer genomes. Nature 629, 910–918 (2024).

  18. Díaz-Gay, M. et al. The mutagenic forces shaping the genomes of lung cancer in never smokers. Nature 644, 133–144 (2025).

  19. McGranahan, N. et al. Clonal status of actionable driver events and the timing of mutational processes in cancer evolution. Sci. Transl. Med. 7, 283ra54 (2015).

  20. Otlu, B. et al. Topography of mutational signatures in human cancer. Cell Rep. 42, 112930 (2023).

  21. Stamatoyannopoulos, J. A. et al. Human mutation rate associated with DNA replication timing. Nat. Genet. 41, 393–395 (2009).

  22. Whitfield, M. L., George, L. K., Grant, G. D. & Perou, C. M. Common markers of proliferation. Nat. Rev. Cancer 6, 99–106 (2006).

  23. Emami Nejad, A. et al. The role of hypoxia in the tumor microenvironment and development of cancer stem cell: a novel approach to developing treatment. Cancer Cell Int. 21, 62 (2021).

  24. Bhandari, V., Li, C. H., Bristow, R. G., Boutros, P. C. & PCAWG Consortium. Divergent mutational processes distinguish hypoxic and normoxic tumours. Nat. Commun. 11, 737 (2020).

  25. Liu, N. et al. Selective silencing of euchromatic L1s revealed by genome-wide screens for L1 regulators. Nature 553, 228–232 (2018).

  26. Li, X. et al. LINE-1 transcription activates long-range gene expression. Nat. Genet. 56, 1494–1502 (2024).

  27. Scott, E. C. & Devine, S. E. The role of somatic L1 retrotransposition in human cancers. Viruses 9, 131 (2017).

  28. Tubio, J. M. C. et al. Mobile DNA in cancer. Extensive transduction of nonrepetitive DNA mediated by L1 retrotransposition in cancer genomes. Science 345, 1251343 (2014).

  29. McKerrow, W. et al. LINE-1 expression in cancer correlates with p53 mutation, copy number alteration, and S phase checkpoint. Proc. Natl Acad. Sci. USA 119, e2115999119 (2022).

  30. Kazazian, H. H. Jr & Moran, J. V. Mobile DNA in health and disease. N. Engl. J. Med. 377, 361–370 (2017).

  31. Petljak, M. et al. Characterizing mutational signatures in human cancer cell lines reveals episodic APOBEC mutagenesis. Cell 176, 1282–1294 (2019).

  32. Nam, C. H. et al. Widespread somatic L1 retrotransposition in normal colorectal epithelium. Nature 617, 540–547 (2023).

  33. Levin, H. L. & Moran, J. V. Dynamic interactions between transposable elements and their hosts. Nat. Rev. Genet. 12, 615–627 (2011).

  34. Gasior, S. L., Wakeman, T. P., Xu, B. & Deininger, P. L. The human LINE-1 retrotransposon creates DNA double-strand breaks. J. Mol. Biol. 357, 1383–1393 (2006).

  35. Morrish, T. A. et al. DNA repair mediated by endonuclease-independent LINE-1 retrotransposition. Nat. Genet. 31, 159–165 (2002).

  36. Farkash, E. A. & Luning Prak, E. T. DNA damage and L1 retrotransposition. J. Biomed. Biotechnol. 2006, 37285 (2006).

  37. Suzuki, J. et al. Genetic evidence that the non-homologous end-joining repair pathway is involved in LINE retrotransposition. PLoS Genet. 5, e1000461 (2009).

  38. Baldwin, E. T. et al. Structures, functions and adaptations of the human LINE-1 ORF2 protein. Nature 626, 194–206 (2024).

  39. Freeman, B. et al. Analysis of epigenetic features characteristic of L1 loci expressed in human cells. Nucleic Acids Res. 50, 1888–1907 (2022).

  40. Rodgers, K. & McVey, M. Error-prone repair of DNA double-strand breaks. J. Cell. Physiol. 231, 15–24 (2016).

  41. Wangsri, S., Subbalekha, K., Kitkumthorn, N. & Mutirangura, A. Patterns and possible roles of LINE-1 methylation changes in smoke-exposed epithelia. PLoS ONE 7, e45292 (2012).

  42. Stueve, T. R. et al. Epigenome-wide analysis of DNA methylation in lung tissue shows concordance with blood studies and identifies tobacco smoke-inducible enhancers. Hum. Mol. Genet. 26, 3014–3027 (2017).

  43. Caliri, A. W., Caceres, A., Tommasi, S. & Besaratinia, A. Hypomethylation of LINE-1 repeat elements and global loss of DNA hydroxymethylation in vapers and smokers. Epigenetics 15, 816–829 (2020).

  44. Camila, B. et al. Genotoxicity and hypomethylation of LINE-1 induced by electronic cigarettes. Ecotoxicol. Environ. Saf. 256, 114900 (2023).

  45. Joehanes, R. et al. Epigenetic signatures of cigarette smoking. Circ. Cardiovasc. Genet. 9, 436–447 (2016).

  46. Imbeault, M., Helleboid, P.-Y. & Trono, D. KRAB zinc-finger proteins contribute to the evolution of gene regulatory networks. Nature 543, 550–554 (2017).

  47. Han, G. et al. An atlas of epithelial cell states and plasticity in lung adenocarcinoma. Nature 627, 656–663 (2024).

  48. Long, E. et al. Context-aware single-cell multiomics approach identifies cell-type-specific lung cancer susceptibility genes. Nat. Commun. 15, 7995 (2024).

  49. Rosspopoff, O. & Trono, D. Take a walk on the KRAB side. Trends Genet. 39, 844–857 (2023).

  50. Hill, W. et al. Lung adenocarcinoma promotion by air pollutants. Nature 616, 159–167 (2023).

  51. Haga, Y. et al. Whole-genome sequencing reveals the molecular implications of the stepwise progression of lung adenocarcinoma. Nat. Commun. 14, 8375 (2023).

  52. Huang, Z. et al. Single-cell analysis of somatic mutations in human bronchial epithelial cells in relation to aging and smoking. Nat. Genet. 54, 492–498 (2022).

  53. Colom, B. et al. Mutant clones in normal epithelium outcompete and eliminate emerging tumours. Nature 598, 510–514 (2021).

  54. Jardim, D. L., Goodman, A., de Melo Gagliato, D. & Kurzrock, R. The challenges of tumor mutational burden as an immunotherapy biomarker. Cancer Cell 39, 154–173 (2021).

  55. Klein, S. L. & Flanagan, K. L. Sex differences in immune responses. Nat. Rev. Immunol. 16, 626–638 (2016).

  56. Vaz, M. et al. Chronic cigarette smoke-induced epigenomic changes precede sensitization of bronchial epithelial cells to single-step transformation by KRAS mutations. Cancer Cell 32, 360–376 (2017).

  57. Mengs, U. Tumour induction in mice following exposure to aristolochic acid. Arch. Toxicol. 61, 504–505 (1988).

  58. Ambatipudi, S. et al. Tobacco smoking-associated genome-wide DNA methylation changes in the EPIC study. Epigenomics 8, 599–618 (2016).

  59. Kobayashi, S. et al. EGFR mutation and resistance of non-small-cell lung cancer to gefitinib. N. Engl. J. Med. 352, 786–792 (2005).

  60. Shah, N. M. et al. Pan-cancer analysis identifies tumor-specific antigens derived from transposable elements. Nat. Genet. 55, 631–639 (2023).

  61. Bergmann, E. A., Chen, B.-J., Arora, K., Vacic, V. & Zody, M. C. Conpair: concordance and contamination estimator for matched tumor-normal pairs. Bioinformatics 32, 3196–3198 (2016).

  62. Pedersen, B. S. et al. Somalier: rapid relatedness estimation for cancer and germline studies using efficient genome sketches. Genome Med. 12, 62 (2020).

  63. Nik-Zainal, S. et al. The life history of 21 breast cancers. Cell 149, 994–1007 (2012).

  64. Dentro, S. C. et al. Characterizing genetic intra-tumor heterogeneity across 2,658 human cancer genomes. Cell 184, 2239–2254 (2021).

  65. Sadedin, S. P. & Oshlack, A. Bazam: a rapid method for read extraction and realignment of high-throughput sequencing data. Genome Biol. 20, 78 (2019).

  66. Martínez-Jiménez, F. et al. A compendium of mutational cancer driver genes. Nat. Rev. Cancer 20, 555–572 (2020).

  67. Yuan, K., Macintyre, G., Liu, W., PCAWG-11 Working Group & Markowetz, F. Ccube: a fast and robust method for estimating cancer cell fractions. Preprint at bioRxiv https://doi.org/10.1101/484402 (2018).

  68. Yang, L. et al. Diverse mechanisms of somatic structural variations in human cancer genomes. Cell 153, 919–929 (2013).

  69. Chen, X. et al. Manta: rapid detection of structural variants and indels for germline and cancer sequencing applications. Bioinformatics 32, 1220–1222 (2016).

  70. Muiños, F., Martínez-Jiménez, F., Pich, O., Gonzalez-Perez, A. & Lopez-Bigas, N. In silico saturation mutagenesis of cancer genes. Nature 596, 428–432 (2021).

  71. Chakravarty, D. et al. OncoKB: a precision oncology knowledge base. JCO Precis. Oncol. 1, 1–16 (2017).

  72. Bailey, M. H. et al. Comprehensive characterization of cancer driver genes and mutations. Cell 173, 371–385 (2018).

  73. Cheng, J. et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381, eadg7492 (2023).

  74. Bergstrom, E. N. et al. SigProfilerMatrixGenerator: a tool for visualizing and exploring patterns of small mutational events. BMC Genom. 20, 685 (2019).

  75. Islam, S. M. A. et al. Uncovering novel mutational signatures by de novo extraction with SigProfilerExtractor. Cell Genom. 2, 100179 (2022).

  76. Sondka, Z. et al. COSMIC: a curated database of somatic variants and clinical data for cancer. Nucleic Acids Res. 52, D1210–D1217 (2024).

  77. Díaz-Gay, M. et al. Assigning mutational signatures to individual samples and individual somatic mutations with SigProfilerAssignment. Bioinformatics 39, btad756 (2023).

  78. Degasperi, A. et al. Substitution mutational signatures in whole-genome-sequenced cancers in the UK population. Science 376, abl9283 (2022).

  79. Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543–550 (2014).

  80. Grossman, R. L. et al. Toward a shared vision for cancer genomic data. N. Engl. J. Med. 375, 1109–1112 (2016).

  81. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  82. Putri, G. H., Anders, S., Pyl, P. T., Pimanda, J. E. & Zanini, F. Analysing high-throughput sequencing data in Python with HTSeq 2.0. Bioinformatics 38, 2943–2945 (2022).

  83. Zhang, Y., Parmigiani, G. & Johnson, W. E. ComBat-seq: batch effect adjustment for RNA-seq count data. NAR Genom. Bioinform. 2, lqaa078 (2020).

  84. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).

  85. Liberzon, A. et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 27, 1739–1740 (2011).

  86. Li, T. et al. TIMER2.0 for analysis of tumor-infiltrating immune cells. Nucleic Acids Res. 48, W509–W514 (2020).

  87. de Tribolet-Hardy, J. et al. Genetic features and genomic targets of human KRAB-zinc finger proteins. Genome Res. 33, 1409–1423 (2023).

  88. Del Toro, N. et al. The IntAct database: efficient access to fine-grained molecular interaction data. Nucleic Acids Res. 50, D648–D653 (2022).

  89. Oleksiewicz, U. et al. TRIM28 and interacting KRAB-ZNFs control self-renewal of human pluripotent stem cells through epigenetic repression of pro-differentiation genes. Stem Cell Rep. 9, 2065–2080 (2017).

  90. McKerrow, W. & Fenyö, D. L1EM: a tool for accurate locus specific LINE-1 RNA quantification. Bioinformatics 36, 1167–1173 (2020).

  91. Reyes-Gopar, H. et al. A single-cell transposable element atlas of human cell identity. Cell Rep. Methods 5, 101086 (2025).

  92. Müller, F. et al. RnBeads 2.0: comprehensive analysis of DNA methylation data. Genome Biol. 20, 55 (2019).

  93. Leek, J. T., Johnson, W. E., Parker, H. S., Jaffe, A. E. & Storey, J. D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).

  94. Kawaguchi, S., Higasa, K., Shimizu, M., Yamada, R. & Matsuda, F. HLA-HD: an accurate HLA typing algorithm for next-generation sequencing data. Hum. Mutat. 38, 788–797 (2017).

  95. Reynisson, B., Alvarez, B., Paul, S., Peters, B. & Nielsen, M. NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data. Nucleic Acids Res. 48, W449–W454 (2020).

  96. Schenck, R. O., Lakatos, E., Gatenbee, C., Graham, T. A. & Anderson, A. R. A. NeoPredPipe: high-throughput neoantigen prediction and recognition potential pipeline. BMC Bioinform. 20, 264 (2019).

  97. Thorsson, V. et al. The immune landscape of cancer. Immunity 48, 812–830 (2018).

  98. Buffa, F. M., Harris, A. L., West, C. M. & Miller, C. J. Large meta-analysis of multiple cancers reveals a common, compact and highly prognostic hypoxia metagene. Br. J. Cancer 102, 428–435 (2010).

  99. Winter, S. C. et al. Relation of a hypoxia metagene derived from head and neck cancer to prognosis of multiple cancers. Cancer Res. 67, 3441–3449 (2007).

  100. Ragnum, H. B. et al. The tumour hypoxia marker pimonidazole reflects a transcriptional programme associated with aggressive prostate cancer. Br. J. Cancer 112, 382–390 (2015).

  101. Elvidge, G. P. et al. Concordant regulation of gene expression by hypoxia and 2-oxoglutarate-dependent dioxygenase inhibition: the role of HIF-1α, HIF-2α, and other pathways. J. Biol. Chem. 281, 15215–15226 (2006).

  102. Sørensen, B. S., Toustrup, K., Horsman, M. R., Overgaard, J. & Alsner, J. Identifying pH independent hypoxia induced genes in human squamous cell carcinomas in vitro. Acta Oncol. 49, 895–905 (2010).

Acknowledgements

This work was supported by the Intramural Research Program of the National Cancer Institute, US National Institutes of Health (NIH) (project ZIACP101231 to M.T.L.); by the NIH grants R01ES032547-01, R01CA269919-01 and 1U01CA290479-01 to L.B.A. as well as by L.B.A.’s Packard Fellowship for Science and Engineering. The research performed in L.B.A.’s laboratory was also supported by UC San Diego Sanford Stem Cell Institute. The funders had no roles in study design, data collection and analysis, decision to publish or preparation of the manuscript. The M.D.-G. fellowship, within the “Generación D” initiative, Red.es, Ministerio para la Transformación Digital y de la Función Pública, for talent attraction (C005/24-ED CV1), is funded by the European Union NextGenerationEU funds, through PRTR. The computational analyses reported in this manuscript have used the NIH high-performance Biowulf Cluster and the Triton Shared Computing Cluster at the San Diego Supercomputer Center of UC San Diego. We thank the study participants; P. Kraft for reading the manuscript and providing comments; and the staff at Westat for their assistance in collecting samples and corresponding clinical data. Where authors are identified as personnel of the International Agency for Research on Cancer/World Health Organization, the authors alone are responsible for the views expressed in this article and they do not necessarily represent the decisions, policy or views of the International Agency for Research on Cancer/World Health Organization. This research was supported by the Intramural Research Program of the NIH. The contributions of the NIH authors were made as part of their official duties as NIH federal employees, are in compliance with agency policy requirements, and are considered works of the United States Government. However, the findings and conclusions presented in this paper are those of the authors and do not necessarily reflect the views of the NIH or the US Department of Health and Human Services.
We want to acknowledge the patients and the INCLIVA Biobank (PT17/0015/0049) integrated in the Spanish National Biobanks Network and in the Valencian Biobanking Network for their collaboration. This study was supported by the Health and Medical Research Fund of Hong Kong SAR, HMRF 03142856. The related studies at the Taiwan site were supported by grants from the Ministry of Health and Welfare, Taiwan DOH97-TD-G-111-026 (C.A.H.), DOH98-TD-G-111-015 (C.A.H.), DOH99-TD-G-111-028 (C.A.H.); DOH97-TD-G-111-029 (C.Y.C.), DOH98-TD-G-111-018 (C.Y.C.), DOH99-TD-G-111-015 (C.Y.C.), DOH97-TD-G-111-028 (I.S.C.), DOH98-TD-G-111-017 (I.S.C.), DOH99-TD-G-111-014 (I.S.C.), and the Ministry of Science and Technology, Taiwan MOST109-2740-B-400-002 (C.A.H.), MOST110-2740-B-400-002 (C.A.H.), MOST111-2740-B-400-002 (C.A.H.). This work has been supported in part by the Tissue Core at the H. Lee Moffitt Cancer Center & Research Institute, a comprehensive cancer center designated by the National Cancer Institute and funded in part by a Moffitt Cancer Center Support Grant (no. P30-CA076292), and in part by NIH (NCI) grant no. U01CA209414 to the Boston Lung Cancer Survival Study of the Dana-Farber/Harvard Cancer Center (D.C.C.). The authors would like to thank the team at the IUCPQ site of the Quebec Respiratory Health Network Biobank of the FRQS for their valuable assistance, and would like to thank the staff at Harvard University, Yale University, Roswell Park Cancer Institute and Roswell PI, Instituto Nacional de Cancerologia, Nice University Hospital Centre (Nice UHC) - University Côte d’Azur and the Nice Biobank CRB, Toronto University Health Network, and Mayo Clinic for their assistance in providing samples and corresponding clinical data.

Author information

Contributions

T.Z. and M.T.L. conceived and designed the study. T.Z., L.Y., B.Z., J. Shi, L.B.A., D.C.W. and M.T.L. developed the methodology. T.Z., W.Z., C.W., M.D.-G., P.H.H., J.M.S., J.P.M., A. Klein, C.H., L.Y., B.Z., A. Khandekar and J. Sang carried out the formal analyses. J.Y., M.A., M. Cardelli, M. Cecati, F.M., K.B., J.C. and K.M.J. performed validation of findings. C.L., M.K.B., W.D.T., L.M.S., P.J., R.H. and S.-R.Y. conducted pathology review and analysis. M.P.W., K.C.L., C.A.H., C.-Y.C., N.E.C., A.C.P., D.C., E.S.E., J.M.S., M.B.S., S.S.Y., M. Manczuk, J.L., B.Ś., A.M., O.S., D.Z., I.H., D.M., S.M., M. Savic, Y.B., B.E.G.R., D.C.C., V.G., P.B., G.L., P.H., N.R., Q.L., M.T.L. and S.J.C. provided resources and essential materials. T.Z., J.R., M. Miraftab, F.J.C.-M., M. Saha, P.H.H. and O.W.L. undertook data curation. The original manuscript draft was written by T.Z. and M.T.L. All of the authors, including L.Y., B.Z., J. Shi, J.C., T.Z., M.A.N., D.C.W., S.J.C., L.B.A. and M.T.L., contributed to the review and editing of the manuscript. T.Z. performed visualization. M.T.L. supervised the study. All of the authors read and approved the final version of the manuscript.

Corresponding author

Correspondence to Maria Teresa Landi.

Ethics declarations

Competing interests

L.B.A. is a co-founder, CSO, scientific advisory member and consultant for io9, has equity and receives income. The terms of this arrangement have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies. L.B.A. is also a compensated member of the scientific advisory board of Inocras. L.B.A.’s spouse is an employee of Biotheranostics. L.B.A. declares US provisional applications filed with UCSD with serial numbers: 63/269,033, 63/366,392, 63/289,601, 63/483,237, 63/412,835 and 63/492,348. L.B.A. is also listed as an inventor of US patent 10,776,718 for source identification by non-negative matrix factorization. S.-R.Y. has received consulting fees from AstraZeneca, Sanofi, Amgen and AbbVie; and received speaking fees from AstraZeneca, Medscape, PRIME Education and Medical Learning Institute. The other authors declare no competing interests.

Peer review

Peer review information

Nature thanks David Adams and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Power analysis for detecting a diverse clonal architecture in 1,024 LUADs.

a, The scatter plot illustrates the relationship between the number of reads per chromosome copy (NRPCC) and the total detected single nucleotide variants (SNVs). Our ability to detect subclones relies not on the number of identified SNVs, but on the number of reads per tumour chromosomal copy. NRPCC accounts for tumour purity, ploidy, and sequencing coverage. b, The minimum cancer cell fraction (CCF) of the detected clusters in each tumour is plotted against NRPCC. To mitigate biases, we exclusively considered tumours with NRPCC ≥ 10. In these tumours, our analysis is sufficiently powered to identify a subclone with a CCF ≥ 30%. The suggested NRPCC threshold is denoted by the dashed line.
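
The legend states that NRPCC accounts for tumour purity, ploidy and sequencing coverage but does not give the formula. The sketch below assumes the commonly used definition (coverage multiplied by purity, divided by the average number of chromosome copies per cell in the tumour-normal mixture). The function names and the example values are hypothetical; only the NRPCC ≥ 10 threshold comes from the legend.

```python
def nrpcc(coverage: float, purity: float, ploidy: float) -> float:
    """Number of reads per tumour chromosome copy.

    Assumed formula (not stated in the legend):
        coverage * purity / (purity * ploidy + (1 - purity) * 2)
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    if coverage <= 0 or ploidy <= 0:
        raise ValueError("coverage and ploidy must be positive")
    # Average chromosome copies per cell: tumour cells carry `ploidy`
    # copies, admixed normal cells are assumed to be diploid.
    mean_copies = purity * ploidy + (1.0 - purity) * 2.0
    return coverage * purity / mean_copies


def passes_power_filter(value: float, threshold: float = 10.0) -> bool:
    """Apply the NRPCC >= 10 inclusion threshold described in the legend."""
    return value >= threshold


if __name__ == "__main__":
    # Hypothetical tumours: (mean coverage, purity, ploidy)
    for cov, pur, plo in [(80.0, 0.5, 2.0), (30.0, 0.3, 3.0)]:
        value = nrpcc(cov, pur, plo)
        print(f"coverage={cov} purity={pur} ploidy={plo} "
              f"NRPCC={value:.2f} keep={passes_power_filter(value)}")
```

Under this definition, a low-purity or high-ploidy tumour can fail the filter even at high nominal coverage, which is why the legend frames power in terms of NRPCC and not SNV count.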

Extended Data Fig. 2 Overview of the molecular timing distribution of copy number gains per chromosome.

a, Pie charts depict the distribution of the inferred mutation time for a given copy number gain in a group. Green denotes early clonal gains, with a gradient to purple for late gains. The size of each pie chart is proportional to the frequency of recurrence of each event. b, Comparison of the molecular timing of copy number gains on chromosomes 21 and 22 versus other chromosomes across all three groups. c, Comparison of the molecular timing of copy number gains on chromosome X versus other chromosomes in tumours from the EU_S group. All box plots show the median (centre line), IQR (box), and whiskers extending to 1.5× the IQR by default in ggplot2. P values from two-sided Wilcoxon rank-sum tests are shown above the box plots.

Extended Data Fig. 3 Contribution of mutational signatures to major driver genes.

a, Single base substitution (SBS) mutational patterns for the driver mutations of EGFR, KRAS, and TP53. b, Distribution of all driver mutations and hotspot mutations (recurrence > 5) in EGFR, KRAS, and TP53 attributed to each mutational signature.

Extended Data Fig. 4 Dynamic SBS mutational processes during early and late tumour evolution.

Fold changes between the relative proportions of early and late clonal mutations attributed to individual SBS mutational signatures. Box plots display the median (centre line), interquartile range (box), and whiskers extending to 1.5× the IQR. Each point represents a tumour sample (n = 542) and is coloured by mutational signature. P values from two-sided Wilcoxon rank-sum tests are shown below the box plots.

Extended Data Fig. 5 Associations between tumour latency and EGFR mutation status (a), KRAS mutation status (b) and sex (c), stratified by ancestry and tobacco smoking group.

All box plots display the median (centre line), interquartile range (box), and whiskers extending to 1.5× the IQR. Sample size: n = 542. P values were calculated using two-sided Wilcoxon rank-sum tests.

Extended Data Fig. 6 Association between tumours with mutational signature ID2 and increased tumour proliferation.

a, Ingenuity pathway analysis (IPA) of differentially expressed genes in tumours with and without mutational signature ID2. The x-axis represents the Z-score, and the y-axis indicates the corresponding canonical pathways. Suppressed pathways are shown in blue, while activated pathways are shown in orange. Circle size reflects the level of significance. b–d, Gene set enrichment analysis (GSEA) highlights key proliferation-related pathways significantly associated with tumours harbouring mutational signature ID2, including hallmark mitotic spindle (b), hallmark MYC targets V2 (c), and hallmark G2M checkpoint (d). The normalized enrichment score (NES) and FDR-adjusted q-value are displayed at the bottom of each GSEA plot. e,f, Enrichment of tumour metastasis in tumours with mutational signature ID2, stratified by TP53 mutation status. Odds ratios and p-values from the Fisher’s exact test are shown above the plot. g,h, Comparison of ID2 deletion burden between tumours with and without metastasis, stratified by TP53 mutation status (g) or across all tumours (n = 294) (h). All box plots show the median (centre line), IQR (box), and whiskers extending to 1.5× the IQR by default in ggplot2. The P-values derived from the two-sided Wilcoxon rank-sum test are shown above the box plots.

Extended Data Fig. 7 Box plot illustrating neoantigen predictions across all mutational signatures.

Neoantigen burden is normalized as the number of neoantigens per 1000 mutations. All box plots show the median (centre line), IQR (box), and whiskers extending to 1.5× the IQR by default in ggplot2. Each data point is shown as an individual sample.

Extended Data Fig. 8 Validation of the association between L1 retrotransposition and mutational signatures ID2 using WGS data from normal and matched colorectal cancer clones32.

Scatter plots show Pearson correlations between the number of somatic L1 insertions and the number of ID1/ID2 mutations in normal colorectal epithelium (red), matched colorectal cancer (blue), and MUTYH-associated adenomatous polyps (orange). Pearson correlation coefficients (R) and p-values (P) are indicated in the plots. The fitted regression line (blue) and 95% confidence intervals (grey) are shown.

Extended Data Fig. 9 ZNF695 expression in single cell RNA-seq studies.

a, Identification of lung cell types with detectable ZNF695 expression across multiple cell types from single-cell RNA-seq data in the CZ CELLxGENE database. The numbers on the right y-axis represent the number of cells identified from the single-cell studies. b, Enrichment of ZNF695 expression in alveolar type 2 proliferating (AT2pro) cells from a normal lung single-nucleus multiome dataset. The dot plot visualizes the normalized RNA expression of ZNF695 by cell type. The colour and size of each dot correspond to the scaled average expression level and fraction of expressing cells, respectively. c, Enrichment of L1 RNA expression in cells with detectable ZNF695 expression. P-values from the Wilcoxon rank-sum test are shown above the box plot.

Extended Data Fig. 10 Schematic model of canonical and non-canonical ZNF695 function in L1 regulation.

This schematic, adapted from ref. 49, illustrates ZNF695-mediated regulation of L1 retrotransposons. In normal cells, canonical KZFP transcripts encode proteins with an N-terminal KRAB domain and a C-terminal zinc finger array. The zinc finger domain binds transposable elements (TEs), while the KRAB domain recruits a heterochromatin-inducing complex—including TRIM28 (KAP1), SETDB1, the NuRD complex, HP1, and DNMTs. This complex mediates H3K9 trimethylation and directs DNMTs to methylate CpG sites in L1 promoter regions, repressing L1 retrotransposition. In tumour cells, however, non-canonical ZNF695 transcripts, lacking the zinc finger domain, predominate. This impairs TE binding and DNMT recruitment, leading to L1 promoter hypomethylation and subsequent L1 activation. Between the first and second strand synthesis by activated ORF2p (ref. 38), intermediates may be cleaved and integrated into the genome. Coupled with DNA repair mechanisms, this process may generate single-base-pair indels (e.g., ID1/ID2). This suggests a dominant-negative effect, where the balance between canonical and non-canonical transcripts governs L1 methylation status.

Supplementary information

Supplementary Information

Guide to the Supplementary Tables 1–7 and Supplementary Figs 1–36.

Reporting Summary

Supplementary Tables 1–7

Supplementary Tables 1–7.

About this article

Cite this article

Zhang, T., Zhao, W., Wirth, C. et al. Uncovering the role of LINE-1 in the evolution of lung adenocarcinoma. Nature (2025). https://doi.org/10.1038/s41586-025-09825-y
