Abstract
Neuroendocrine and tuft cells are rare chemosensory epithelial lineages defined by the expression of ASCL1 and POU2F3 transcription factors, respectively. Neuroendocrine cancers, including small cell lung cancer (SCLC), frequently display tuft-like subsets, a feature linked to poor patient outcomes1,2,3,4,5,6,7,8,9. The mechanisms driving neuroendocrine–tuft tumour heterogeneity and the origins of tuft-like cancers are unknown. Using multiple genetically engineered animal models of SCLC, we demonstrate that a basal cell of origin (but not the accepted neuroendocrine origin) generates neuroendocrine–tuft-like tumours that highly recapitulate human SCLC. Single-cell clonal analyses of basal-derived SCLC further uncovered unexpected transcriptional states, including an Atoh1+ state, and lineage trajectories underlying neuroendocrine–tuft plasticity. Uniquely in basal cells, the introduction of genetic alterations enriched in human tuft-like SCLC, including high MYC, PTEN loss and ASCL1 suppression, cooperates to promote tuft-like tumours. Transcriptomics of 944 human SCLCs revealed a basal-like subset and a tuft–ionocyte-like state that altogether demonstrate notable conservation between cancer states and normal basal cell injury response mechanisms10,11,12,13. Together, these data indicate that the basal cell is a probable origin for SCLC and other neuroendocrine–tuft cancers that can explain neuroendocrine–tuft heterogeneity, offering new insights for targeting lineage plasticity.
Main
Tumour cell plasticity, a hallmark of cancer, promotes cell fate diversity, tumour progression and therapy resistance14. However, the intrinsic and extrinsic cues driving plasticity remain largely undefined, and the role of tumour cell of origin in influencing plasticity is still unclear.
Neuroendocrine (NE) cancers are aggressive tumours found throughout the body, including in the lung, prostate, pancreas and gastrointestinal tract15,16. A tuft-like subset of these tumours depends on the lineage-specific oncogene POU2F3 (refs. 2,3,4,5,6,7,17). Although tuft-like cancers are not yet clinically distinguished, they are transcriptionally distinct from other NE types and may respond uniquely to therapies5,6,7,18,19. The origins of tuft-like cancers and the genetic alterations that drive them are poorly understood, partly because of a lack of representative models.
Small cell lung cancer (SCLC) is among the deadliest NE cancers, with a 5-year survival rate below 7% (refs. 20,21). SCLC comprises distinct molecular subtypes: SCLC-A (ASCL1+), SCLC-N (NEUROD1+) and SCLC-P (POU2F3+), as well as a controversial fourth group that lacks these markers and has been described as YAP1+, mesenchymal and/or inflammatory8,9,20,21,22,23,24,25. Human SCLC displays intratumoural subtype heterogeneity and plasticity, especially among the A, N and Y (YAP1+) states8,26,27,28,29,30,31. Whether SCLC-P arises through plasticity remains unknown.
The accepted SCLC cell of origin is the pulmonary neuroendocrine cell (PNEC)32, a rare chemosensory population expressing ASCL1. However, PNECs with SCLC mutations fail to generate SCLC-P in animal models26,27. SCLC-P, which has the worst prognosis, is driven by tuft-lineage transcription factors and is suspected to arise from tuft/brush cells, so named for their apical microvilli that sense and respond to pathogens5,8,9,17,33. Stem-like lung basal cells can differentiate during injury repair into ASCL1+ PNECs, POU2F3+ tuft cells and FOXI1+ ionocytes10,11,12,13,34,35, suggesting that they may bridge NE and tuft tumour fates. Although basal cells can give rise to SCLC36,37,38, their role in enabling NE–tuft plasticity remains undefined.
Basal cells permit SCLC subtype diversity
To investigate the NE–tuft plasticity in SCLC, we analysed the intratumoural heterogeneity of subtype-defining transcription factors in POU2F3+ human tumours. In a study of 70 POU2F3+ biopsies, more than 80% co-expressed at least one other SCLC subtype marker31. Our immunohistochemistry analysis of 119 human SCLC biopsies showed that approximately 19% were POU2F3+, and among these, more than 82% also expressed ASCL1 and/or NEUROD1 (Extended Data Fig. 1a,b). Co-immunofluorescence of human biopsies and patient-derived xenografts (PDXs) demonstrated intratumoural heterogeneity of ASCL1 (A), NEUROD1 (N) and POU2F3 (P), often in mutually exclusive populations (Extended Data Fig. 1c,d). Rare double-positive A/N and A/P cells were also observed, suggesting transitional states (Extended Data Fig. 1c,d). Given the probable monoclonal origin of SCLC, co-expression of subtype markers within tumours supports the existence of plasticity between subtypes, including between SCLC-P and others.
SCLC nearly always harbours RB1 and TP53 loss and mutually exclusive MYC family activation (typically MYCL in NE-high and MYC in NE-low subsets)39. SCLC-P is associated with MYC amplification and overexpression33. Yet, MYC-high GEMMs (Rb1fl/flTrp53fl/flMycT58A (RPM) mice) do not express POU2F3 when tumours arise from NE (CGRP–Cre) or alveolar/club cells (CCSP–Cre)26,27,40. Rare POU2F3+ tumours occur in RPM mice initiated with broadly active Ad–CMV–Cre26, suggesting that tuft-like SCLC can arise from an undefined cell of origin.
Basal cells generate both NE and tuft lineages during injury repair10,13. These lineages comprise less than 1% of the lung and are typically post-mitotic, whereas basal cells account for approximately 34% of lung epithelial cells and approximately 80% of proliferating lung cells, which are present throughout the airway epithelium10,11,12,13,34, including the main bronchi where SCLC arises clinically. Tobacco smoke, the primary SCLC risk factor, increases basal cell proliferation and metaplastic potential41. Therefore, we tested whether basal cells can generate both NE and tuft tumour fates.
In mice, basal cells are largely restricted to the trachea10,12. To initiate SCLC from basal cells, we combined naphthalene injury (which ablates club cells and expands basal cells) with basal-specific Ad–KRT5 (K5)–Cre (Fig. 1a and Extended Data Fig. 1e,f). RPM tumours formed after K5–Cre with an average latency of 53 days, similar to tumours induced by CMV–Cre, CGRP–Cre or CCSP–Cre viruses27 (Extended Data Fig. 2a). Tumours localized to the trachea and main airways, exhibited SCLC histopathology and retained YAP1 and KRT5 expression in in situ lesions, with reduced expression in invasive tumours (Fig. 1b and Extended Data Fig. 2b). KRT5-initiated tumours showed extensive SCLC subtype heterogeneity, including POU2F3+ tumours, which were enriched in the trachea and main airways and were more abundant than in tumours from other cells of origin (Fig. 1c–e and Extended Data Fig. 2c).
a, Schematic of SCLC induction in RPM GEMMs with basal-specific Cre. b, RPM K5–Cre tumour haematoxylin and eosin (H&E): whole lobes (top) and tumour morphology (bottom). c, Immunohistochemistry for indicated proteins in RPM tumours from indicated Ad–Cre. K5–Cre mice pretreated with naphthalene injury. d, POU2F3+ tumours (H-score above 50) per lung per mouse (n indicated) by Ad–Cre. One-way analysis of variance (ANOVA) with post hoc Tukey’s P values shown. Error bars = mean ± s.e.m. e, H-score quantification of RPM tumours for indicated markers by Ad–Cre. Tumour number indicated; n = 4–18 mice per group. f, Schematic of RPM basal organoid and allograft generation. g, Bright-field images of RPM organoids pre- (wild type) and post-CMV–Cre (transformed). h, H&E of RPM allografts with classic (top) or variant (bottom) histology. i, scRNA-seq UMAP from wild type (n = 1) and RPM basal organoids (n = 2) and the resulting RPM allografts (n = 2 independent experiments) with basal versus NE cell signature enrichment. Cell number analysed per sample is indicated. j, Left, Leiden clusters in UMAP of downsampled RPM allografts (n = 2 independent experiments; n = 4,435 cells; Supplementary Table 1). Right, expression of indicated genes in UMAP (top) and by Leiden clusters (bottom). Red circle, Pou2f3+ cluster 20. k, UMAP in j annotated by SCLC fate. l–n, UMAP in j of NE score (l) and violin plots of NE score (l), NE/tuft/basal cell signatures (m) and ChIP–seq targets and YAP1 activity by SCLC fate (n) (Supplementary Table 2). o, Co-immunofluorescence for DAPI (nuclei) and indicated proteins in RPM allografts. Yellow arrows mark co-expressing cells. Violin plots show median and upper/lower quartiles. Unless otherwise noted, statistics are Kruskal–Wallis (KW) tests with post hoc uncorrected (e) or Bonferroni-corrected (l–n) Dunn’s pairwise comparisons. ****P < 2 × 10^−16; NS, not significant (P > 0.05); other P values indicated. Scale bars: 1 mm (b, top), 50 μm (b, bottom, c,h), 10 μm (c, insets), 650 μm (g, left), 275 μm (g, right), 75 μm (o). Schematics in a and f were created using BioRender (https://biorender.com).
Compared with CMV-initiated and CGRP-initiated RPM tumours, KRT5-initiated tumours had increased POU2F3, slightly reduced but prominent ASCL1 and NEUROD1 and decreased YAP1 expression (Fig. 1c,e and Extended Data Fig. 2d). Co-immunofluorescence revealed intratumoural heterogeneity, with A, N and P largely marking distinct cells, with rare co-expressing cells (most frequently A/N, followed by A/P) (Extended Data Fig. 2e,f). To assess transcriptional heterogeneity at the single-cell level, we profiled multiple KRT5-initiated and CGRP-initiated RPM tumours using single-cell RNA sequencing (scRNA-seq) and visualized them using uniform manifold approximation and projection (UMAP) (Extended Data Fig. 2g). Both groups showed comparable expression of SCLC subtype markers, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP–seq) targets18,27,42, NE score and human-derived SCLC archetype29 and scRNA-seq signatures (Extended Data Fig. 2g–m and Supplementary Table 2).
Leiden clustering revealed broader transcriptional heterogeneity in basal-derived versus NE-derived tumours (Extended Data Fig. 2i). KRT5-derived tumours were enriched in eight of 11 clusters, including clusters 7 and 8, which retained basal/epithelial markers (Extended Data Fig. 2i,n,o). Clusters 7 and 8 were enriched for Ascl1 and ‘A2’ (‘NEv2’) signatures (Extended Data Fig. 2n,o), associated with ASCL1+HES1+ epithelial, drug-resistant and immune-modulatory states29. These clusters also expressed the Krt13+ basal hillock and luminal hillock states, which are similarly drug-resistant and immune-modulatory10,43. Thus, basal cells of origin expand the transcriptional and subtype diversity of RPM tumours. KRT5–RPM tumours recapitulate the A > N > P subtype distribution seen in human SCLC more accurately than previous GEMMs.
Basal organoids model all SCLC subtypes
To test the basal cell of origin more directly, we isolated normal tracheal basal cells from RPM GEMMs using surface ITGA6 expression12 and cultured them as organoids (Fig. 1f). ITGA6+ cells expressed basal markers NGFR and KRT5, with a subset co-expressing the hillock marker KRT13 (refs. 10,43) (Extended Data Fig. 3a). Cells were transformed ex vivo with Ad5–CMV–Cre and analysed by co-immunofluorescence and scRNA-seq (Extended Data Fig. 3b–f). Transformed RPM basal organoids retained morphology and transcriptomes similar to wild-type controls (Fig. 1g and Extended Data Fig. 4a). By scRNA-seq and co-immunofluorescence, transformed organoids showed increased expression of LoxP–Stop–LoxP (LSL)-recombination-associated genes (such as Myc and Firefly luciferase) and proliferation markers (such as Mki67) (Extended Data Figs. 3c and 4b,c and Supplementary Table 1). In wild-type and transformed organoids, basal/stem markers (P63, KRT8, KRT5 and suprabasal KRT13)44 remained high, whereas non-basal lineage markers were low (CCSP and FOXJ1) (Extended Data Figs. 3c,d,f and 4b and Supplementary Table 1). Neither wild-type nor transformed basal organoids expressed SCLC subtype markers (Extended Data Figs. 3c–e and 4b), thus retaining a basal identity in vitro.
After subcutaneous implantation into severe combined immunodeficiency (SCID)/beige mice, transformed basal organoids formed tumours resembling SCLC within 4–6 weeks (Fig. 1f,h). Unlike organoids, allografts lost basal markers and acquired NE and proliferation markers (Fig. 1i and Extended Data Fig. 4d), highlighting the influence of growth environment. Subsampled allograft cells were reclustered (Fig. 1j). Most clusters expressed NE/neuronal genes, cluster 20 expressed Pou2f3 and tuft cell markers and cluster 6 retained basal genes (Fig. 1j and Extended Data Fig. 4e). Clusters 4, 7 and 19 were enriched for Atoh1 and inner ear genes (Fig. 1j and Extended Data Fig. 4e), marking an aggressive, pro-metastatic SCLC subset observed in PDX and human tumours45 but not yet in mouse models. SCLC-A, SCLC-N, SCLC-P, hybrid Ascl1+Neurod1+ (A/N), ATOH1 (At) and basal (B) fates were assigned to clusters on the basis of lineage markers, SCLC subtype and normal lung cell signatures and differentially expressed genes (Fig. 1k–n, Extended Data Fig. 4e,f and Supplementary Table 1). The A/N hybrid state, also found in RPM K5–Cre tumours (Extended Data Fig. 2f,h,i), was present in allografts (Fig. 1k), aligning with A/N co-expression in human tumours (Extended Data Fig. 1b). NE scoring showed that most tumour fates were NE-high, whereas the basal state remained NE-low (Fig. 1k,l). Normal lung cell type signatures10, ChIP–seq signatures (ASCL1, NEUROD1, ATOH1 and POU2F3; refs. 18,27,42,45) and YAP1 activity46 supported these identities (Fig. 1j,m,n). Similar to K5–Cre RPM tumours, allografts showed mutually exclusive A/N/P expression, with rare co-positive cells (Fig. 1o). These results indicate that basal cells can give rise to SCLC heterogeneity, including tuft-like states, without a tuft cell of origin.
In GEMMs, MYC promotes SCLC-A, SCLC-N and SCLC-Y subtypes from NE cells, whereas non-MYC models (such as Rb1fl/flTrp53fl/flRbl2fl/fl (RPR2)) yield only SCLC-A (ref. 26). To assess whether MYC is required for subtype heterogeneity and tuft fate from basal cells, we generated basal organoids from RPR2 mice (Extended Data Fig. 4g), which express and occasionally amplify Mycl (refs. 26,40). Similar to RPM organoids, transformed RPR2 organoids maintained basal identity in vitro (Extended Data Fig. 3c–e). After transplantation, RPR2 allografts developed after approximately 6 months, and unlike RPM allografts, RPR2 tumours were uniformly ASCL1+, with little or no NEUROD1 or POU2F3 expression (Extended Data Fig. 4h). scRNA-seq confirmed that RPR2 tumours expressed Mycl but not Myc (Extended Data Fig. 4i), consistent with RPR2 autochthonous tumours. RPR2 tumours showed reduced transcriptional diversity; they occupied fewer Leiden clusters (Supplementary Table 1), lacked non-NE states (such as cluster 8) and clustered near SCLC-A RPM cells in UMAP (Extended Data Fig. 4j–l). RPR2 allografts were enriched for SCLC-A and SCLC-A2 archetypes29, with reduced SCLC-N and SCLC-P scores (Extended Data Fig. 4m). RPR2 cells strongly expressed ASCL1 ChIP–seq targets but showed significantly lower NEUROD1, POU2F3 and MYC target gene signatures (Extended Data Fig. 4n). Together, these findings indicate that MYC promotes transcriptional diversity and tuft-like fates from basal cells, consistent with its enrichment in human SCLC-P.
ASCL1 loss promotes POU2F3 tuft-like SCLC
ASCL1 and POU2F3 are typically mutually exclusive in cancer and development2,3,10,30,33, and ASCL1 is required for NE phenotypes in multiple cancers3,27,42,47,48. Because RPM and RPR2 basal allografts predominantly adopt an ASCL1+ fate, we investigated whether ASCL1 represses tuft fate and whether its loss alters SCLC subtypes. Notably, previous studies deleting Ascl1 in non-basal GEMMs did not induce POU2F3 (refs. 27,42).
Basal organoids were generated from RPM Ascl1fl/fl (RPMA) GEMMs3,27, transformed ex vivo and allografted (Extended Data Figs. 3c–e and 5a). RPMA allografts formed in 12–15 weeks, nearly double the latency of RPM tumours (Extended Data Fig. 5b). Most RPMA tumours exhibited SCLC histopathology, with some harbouring adeno-squamous non-small cell lung cancer (NSCLC) features (Fig. 2a). ASCL1 was absent at both the RNA and protein levels; however, some SCLC regions retained NEUROD1 (Fig. 2a–c and Extended Data Fig. 5c), unlike non-basal models in which ASCL1 loss prevents NEUROD1 expression27, indicating that basal cells confer broader subtype potential.
a, H&E of RPMA allograft tumours from SCID/beige mice showing SCLC (top) or NSCLC (bottom) histology. b, Immunohistochemistry (left) and H-score quantification (right) of indicated proteins in RPM (SCLC only) and RPMA (SCLC-dominant or NSCLC-dominant; more than 50% area); n = 8–13 first-passage tumours from n = 4–10 mice per genotype. One-way ANOVA with post hoc Fisher’s least significant difference (LSD) pairwise comparisons. Error bars, mean ± s.d. c, Immunoblot of RPM (n = 2) versus RPMA allografts (n = 3 tumours); HSP90, loading control. For gel source data, see Supplementary Fig. 1. d, Co-immunofluorescence for DAPI (nuclei) and indicated proteins in RPMA allografts. Merge, NEUROD1 and POU2F3. e, scRNA-seq UMAP from RPM (n = 3 tumours in n = 3 samples) and RPMA allografts (n = 3 pooled tumours in n = 1 sample) with cell number indicated. f, UMAP in e annotated by Leiden cluster (Supplementary Table 3); percentage of total cells per sample per cluster (right). g, Dot plot of indicated marker genes grouped by cluster in f; dot colour, expression; dot size, percentage-expressing. Pie charts show RPM and RPMA proportions per cluster. h, UMAP in e coloured by SCLC fate (left); percentage of cells per sample per fate (right). i, UMAP in e by NE score (left); violin plots by fate or genotype (right). j, UMAPs in e and violin plots (insets) for ChIP targets or activity scores. Two-sided Wilcoxon rank-sum tests. k, UMAPs in e and violin plots for normal NE, tuft and basal cell signatures by fate. l, Violin plots of SCLC archetype signatures by fate (Supplementary Table 2). Violin plots show median ± quartiles. Unless otherwise stated, statistics are by Kruskal–Wallis tests with post hoc Dunn’s and Bonferroni correction for pairwise comparisons. ****P < 0.0001; NS (P > 0.05); other exact P values indicated. Max, maximum; Mes., mesenchymal; Min, minimum; Prolif., proliferating. Scale bars, 50 μm (a,b), 75 μm (d).
Compared with RPM allografts, ASCL1 loss in RPMA tumours increased non-NE phenotypes and POU2F3 expression (Fig. 2b–d and Extended Data Fig. 5c). Lineage-defining factors were largely mutually exclusive (Fig. 2d), as in human SCLC. Non-NE markers YAP1 and HES1, typically antagonistic to ASCL1 (refs. 26,27,49,50), were enriched in basal and mesenchymal/stem compartments (Fig. 2b and Extended Data Fig. 5d,f). KRT5 and P63 expression in RPMA NSCLC regions further distinguished them from RPM tumours (Extended Data Fig. 5d).
Compared with RPM allografts, scRNA-seq and Leiden clustering of RPMA cells revealed depletion of NE and neuronal-enriched clusters and emergence of a larger Pou2f3+ tuft-like population (cluster 16; Fig. 2e–g and Extended Data Fig. 5e,f). One cluster (15) retained Neurod1 and other neuronal markers (Fig. 2e–g and Extended Data Fig. 5e,f), aligning with NEUROD1 protein expression (Fig. 2b–d). Clusters were categorized into NE, neuronal, hybrid NE/neuronal, ATOH1, tuft, basal or subtype-low states (Fig. 2h and Supplementary Table 3). Most RPMA cells occupied NE-low Pou2f3+ or subtype-low states, whereas a minority retained NE-high/neuronal signatures, differing starkly from the RPM fate distributions (Fig. 2h,i).
RPMA cells showed reduced ASCL1, NEUROD1 and ATOH1 ChIP target genes and enrichment of POU2F3 targets and YAP1 activity (Fig. 2j). Enrichment of normal lung and human SCLC archetype signatures matched assigned fates (Fig. 2k,l). Thus, ASCL1 loss shifts the SCLC landscape towards non-NE and SCLC-P states. Although ASCL1 genetic alterations are not reported in human SCLC39, microenvironmental cues, such as chemotherapy or altered WNT, Notch or YAP1/TAZ signalling8,22,26,49,50,51, can suppress ASCL1 and drive non-NE phenotypes, mirroring the RPMA model. These findings support a role for the basal origin in enabling diverse SCLC states, consistent with its differentiation potential. Blocking NE fate through ASCL1 loss permits the emergence of POU2F3+, YAP1+ and subtype-low tumours, establishing the first in vivo SCLC model with robust POU2F3+ populations.
Lineage tracing reveals SCLC trajectories
The coexistence of mutually exclusive SCLC fates in individual tumours implies lineage plasticity; however, the underlying mechanisms remain unclear. To directly test whether transitions occur between subtypes, we used lentiviral CellTagging52 to trace individual tumour clones from basal-derived RPM and RPMA organoids and allografts at single-cell resolution (Fig. 3a,b). Organoids were infected with more than three CellTags to uniquely barcode each cell, either before (RPM and RPMA) or after (RPM only) CMV–Cre transformation, and then propagated for clonal expansion before transplantation (Fig. 3a). This enabled clonal tracing from either a normal (‘CellTag pre-Cre’) or transformed basal cell origin (‘CellTag post-Cre’). Analysis of CellTagged pre-Cre organoids revealed heterogeneity within starting basal states that did not impact the resulting in vivo tumour clonal dynamics (Extended Data Fig. 7). Because transformed organoids remain basal before implantation (Extended Data Figs. 3–5), we focused specifically on allograft tumour clones to assess fate transitions.
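The logic of combinatorial barcoding can be illustrated with a minimal sketch: cells carrying near-identical sets of CellTags are grouped into the same clone. The Jaccard threshold (0.7), the minimum tag count, the barcode strings and the function names below are illustrative assumptions, not the parameters or pipeline used in this study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two collections of CellTag barcodes."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def call_clones(cell_tags, min_tags=2, threshold=0.7):
    """Group cells whose CellTag signatures overlap above `threshold`.

    cell_tags: dict mapping cell ID -> iterable of CellTag barcodes.
    Returns a sorted list of clones (sorted lists of cell IDs); singletons are dropped.
    Both cutoffs are assumed values chosen for illustration.
    """
    # Keep only cells with enough distinct tags to form a usable signature.
    cells = {c: set(t) for c, t in cell_tags.items() if len(set(t)) >= min_tags}
    parent = {c: c for c in cells}

    def find(c):
        # Union-find lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    # Link every pair of cells whose signatures are sufficiently similar.
    for a, b in combinations(sorted(cells), 2):
        if jaccard(cells[a], cells[b]) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for c in cells:
        groups.setdefault(find(c), []).append(c)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Toy barcodes (hypothetical), not data from the study.
example = {
    "cell1": ["AAGT", "CCTA", "GGAC", "TTGA"],
    "cell2": ["AAGT", "CCTA", "GGAC", "TTGA"],
    "cell3": ["AAGT", "CCTA", "GGAC"],  # one tag not detected
    "cell4": ["ACGT", "TGCA", "GATC", "CTAG"],
    "cell5": ["ACGT", "TGCA", "GATC", "CTAG"],
    "cell6": ["TTTT"],  # too few tags, filtered out
}
clones = call_clones(example)
```

Linking cells through pairwise similarity, rather than requiring identical tag sets, tolerates dropout of individual tags in single-cell data; stricter or looser thresholds trade clone purity against clone size.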
a, Schematic of CellTagging in RPM and RPMA basal organoids and allografts. CellTagged organoids (pre-Cre and post-Cre) were collected for scRNA-seq at implant time; n = 1 RPMA, n = 1 RPM (pre-Cre) and n = 2 RPM (post-Cre) tumours (representing independent experimental replicates) were sequenced for clonal analyses. b, Fluorescence images of transformed CellTagged RPM organoids (similar results obtained in n = 4 independent CellTagging experiments). c, ForceAtlas2 (FA) map of RPM and RPMA tumours (from Fig. 2f,h) by Leiden cluster per genotype (top, Supplementary Table 3) or fate (bottom). d, Leiden cluster frequencies per clone. Each bar represents one clone. Clonal patterns, genotype and CellTag method are shown on the x axis. Individual clones are shown in Extended Data Fig. 6b,c and CellTag annotations in Supplementary Table 4. e, ForceAtlas2 maps of main RPM and RPMA clonal patterns in d. f, ForceAtlas2 maps of main clonal patterns in e, annotated by SCLC fate in c. g, ForceAtlas2 map coloured by pseudotime (start = basal-enriched cluster 17). h, ForceAtlas2 maps of main clonal patterns in e, annotated by pseudotime in g and fate in c. Straight arrows denote state transitions; circular arrows denote self-renewal within a fate. i, CellRank plots of fate probabilities in RPM and RPMA tumour cells in c, annotated by SCLC fate (left) or Leiden cluster (right). Cells arranged inside the circle according to fate probability, with fate-biased cells next to their corresponding edge and naive cells in the middle. j, CellRank-predicted expression trends of putative driver genes (Supplementary Table 5), plotted along pseudotime trajectories from basal to indicated fates/Leiden clusters. Scale bars: 650 μm (b, left), 275 μm (b, right). Schematic in a was created using BioRender (https://biorender.com).
scRNA-seq data from CellTagged RPM and RPMA allograft cells were visualized with ForceAtlas2 for lineage trajectory analysis (Extended Data Fig. 6a–c). The composition of each CellTagged clone by Leiden cluster was determined (n = 46 RPM; n = 40 RPMA) (Fig. 3c,d and Supplementary Table 4), and unbiased analysis on clone composition revealed six main clonal patterns (patterns 1–6) and one minor pattern (pattern 7), suggesting that SCLC undergoes non-random plasticity trajectories (Fig. 3c–e and Extended Data Fig. 6b).
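As a minimal sketch of how the per-clone composition by Leiden cluster might be tabulated, the snippet below converts per-cell (clone, cluster) assignments into cluster frequencies for each clone. The function names and the toy assignments are hypothetical; the pattern-discovery step applied to these frequencies in the study is not reproduced here.

```python
from collections import Counter, defaultdict


def clone_composition(assignments):
    """Return {clone_id: {cluster: fraction of that clone's cells}}.

    assignments: iterable of (clone_id, leiden_cluster) pairs, one per cell.
    """
    counts = defaultdict(Counter)
    for clone_id, cluster in assignments:
        counts[clone_id][cluster] += 1
    return {
        clone: {cl: n / sum(c.values()) for cl, n in sorted(c.items())}
        for clone, c in counts.items()
    }


def dominant_cluster(composition):
    """Return the most frequent cluster for each clone."""
    return {clone: max(fr, key=fr.get) for clone, fr in composition.items()}


# Toy assignments (hypothetical): one clone spanning two clusters, one confined to a single cluster.
cells = [
    ("clone_1", 17), ("clone_1", 1), ("clone_1", 1), ("clone_1", 1),
    ("clone_2", 12), ("clone_2", 12),
]
composition = clone_composition(cells)
dominant = dominant_cluster(composition)
```

Each clone's frequency vector corresponds to one bar in a stacked bar plot of cluster frequencies per clone; clones with similar vectors can then be grouped into shared patterns.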
We examined how clonal dynamics relate to SCLC fate (Fig. 3c–f and Extended Data Fig. 6d). Patterns 1, 2, 5 and 6 were exclusive to RPM; patterns 3 and 4 occurred only in RPMA (Fig. 3c–f and Extended Data Fig. 6b,c). Pattern 1 clones transitioned between basal and Ascl1+ NE states (Fig. 3c–f), whereas pattern 6 was rarer and enriched for a distinct, neuronally biased NE population with reduced adhesion/junction gene expression (Supplementary Table 3). Pattern 2 clones showed broader plasticity, spanning basal, hybrid NE/neuronal and neuronal fates (Fig. 3c–f). Pattern 5 clones occupied the ATOH1 state with some cells in tuft and neuronal clusters. These data suggest that RPM cells favour SCLC-A and, less frequently, a hybrid NE/neuronal fate with access to neuronal, ATOH1 and tuft states. Similar clonal patterns emerged from CellTag pre-Cre and post-Cre samples (Fig. 3d), indicating pattern stability whether tracing from a normal or transformed basal origin.
RPMA clones exhibited plasticity patterns distinct from those of RPM clones (Fig. 3c–f). Pattern 3 clones were largely confined to subtype-low clusters with epithelial and hillock-like features and lacked NE clusters, consistent with Ascl1 loss (Fig. 3f and Extended Data Fig. 6e). Pattern 4 clones occupied distinct subtype-low states enriched for proliferative/stem-like or mesenchymal signatures and were the RPMA clones that most often harboured tuft or neuronal cells (Fig. 3f, Extended Data Fig. 6c–e and Supplementary Table 3).
Although CellTag analysis revealed a clonal structure, it did not resolve the directionality of subtype plasticity. To infer fate trajectories, we implemented diffusion pseudotime analysis starting from the basal state (cluster 17; Fig. 3g). Pseudotime predicted the following: patterns 1 and 6 progressed basal → NE; pattern 2: basal → hybrid NE/neuronal → neuronal; pattern 3: basal → subtype low (clusters 10 and 12); pattern 4: basal → subtype low (cluster 12) → distinct subtype low (clusters 2, 3 and 13), tuft or neuronal; and pattern 5: basal → early tuft → ATOH1 or neuronal (Fig. 3h).
To estimate fate probabilities and identify drivers, we applied CellRank, which revealed varying commitment levels across lineages, with some cells remaining naive/uncommitted (Fig. 3i and Extended Data Fig. 6e). Cells in NE, neuronal, tuft and ATOH1 states showed strong fate bias, whereas subtype-low clusters and some NE cells with A2-like traits (clusters 1, 7 and 14) remained relatively naive (Fig. 3i). Cluster 12 cells were the most unbiased, lying centrally in fate space (Fig. 3i), and were common across clones (Fig. 3c,d). Marker analysis indicated that this cluster reflects a cycling, fetal-like and lineage-naive basal state (Supplementary Table 3), possibly resembling plastic progenitor states in other cancers.
CellRank was further used to identify putative genes driving differentiation trajectories (Fig. 3j and Supplementary Table 5). NE fate trajectories showed early Ascl1 and Foxa2 induction, followed by Runx1t1 and Prox1. Supporting this, FOXA2 is enriched at ASCL1-bound enhancers in SCLC42 and regulates NE fate with ASCL1 in prostate cancer. Neuronal fates were driven by Neurod1, Tubb3, Nhlh2 and neuronal transcription factors Sox4 and Sox11. Hybrid NE/neuronal fates were linked to Prox1 and Nkx2-1 (both interact with ASCL1) and brain epigenetic regulator Kdm7a. Tuft state drivers included Pou2f3, Ascl2, Gng13, Avil and Rgs13; these tuft markers are also linked to prostate cancer2. ATOH1-driven fates involved Atoh1, Ush2a, Lhx3 and Pou4f3, which are key regulators of inner ear hair cell development, some of which are ATOH1 target genes in SCLC45.
Subtype-low states showed distinct predicted drivers (Extended Data Fig. 6e,f and Supplementary Tables 3 and 5). Wnt-related genes (Tcf7l2 and Sox9), Vim and Sparc drove the mesenchymal state (cluster 5); Elf3, Krt18/19, Epcam and ionocyte gene Atp6v0e drove cluster 10 (a state enriched for epithelial and luminal hillock43,44 markers); and Twist1, Fzd2 and ribosomal genes drove proliferative/stem cluster 13. Together, these data uncovered fate-specific transcriptional drivers and highlighted notable plasticity among SCLC-A, SCLC-N, SCLC-At and SCLC-P states at single-cell resolution.
PTEN loss promotes tuft-like SCLC
Our data indicate that genetic alterations (such as Myc gain and Ascl1 loss) and cell of origin shape SCLC fate. Human POU2F3high tumours exhibit increased MYC, decreased ASCL1 and enrichment for PTEN loss33 (Fig. 4a). In a cohort of 112 tumours23, PTEN deletion occurred in 63% of POU2F3high cases versus 27% of POU2F3low cases (Fisher’s exact test; P < 0.009) (Fig. 4a), indicating a link between PTEN loss and SCLC-P.
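For illustration, the association between PTEN deletion and POU2F3 status can be framed as a two-sided Fisher's exact test on a 2 × 2 table. The counts below are assumptions back-calculated from the reported percentages and group sizes (about 63% of 16 POU2F3high and about 27% of 96 POU2F3low tumours), not values taken from the underlying dataset, and the pure-Python implementation is a sketch rather than the software used for the analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same margins)
    that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    tail = sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, tail)


# Assumed counts: rows = POU2F3-high, POU2F3-low; columns = PTEN deleted, PTEN intact.
high_deleted, high_intact = 10, 6    # ~63% of 16 (assumption)
low_deleted, low_intact = 26, 70     # ~27% of 96 (assumption)
p_value = fisher_exact_two_sided(high_deleted, high_intact, low_deleted, low_intact)
odds_ratio = (high_deleted * low_intact) / (high_intact * low_deleted)
```

An exact test is a natural choice here because one group contains only 16 tumours, where large-sample approximations such as the chi-squared test can be unreliable.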
a, Expression of POU2F3 or ASCL1 (log2[TPM + 1]) and MYC or PTEN copy number (log2 ratio) in n = 112 human SCLC tumours grouped by POU2F3 status (n = 96 low; n = 16 high). Data were from ref. 23. The median (dashed red) and quartiles (dotted lines) are shown. Two-tailed Mann–Whitney U-test with the exact P values indicated. b, Schematic of RPM and RPMA basal organoid/allograft generation with Pten loss. c, Immunohistochemistry images (left) and H-score quantification (right) of RPM and RPMA allografts (including multi-passage) with LCV2–sgControl or sgPten. Ctrl, parental + sgControl. All tumours are SCLC-dominant (50% or more), except the NSCLC-dominant (only from RPMA) on the far right of the bar graph. A total of 10–17 tumours quantified from n = 5–12 mice per genotype. One-way ANOVA with post hoc Fisher’s LSD multiple comparisons. Error bars, mean ± s.d. d, Linear regression of pAKT versus POU2F3 H-scores in RPM (n = 13) and RPMA (n = 13) allografts. Control, parental + sgControl. The goodness of fit (R2) and P value are shown. e, Schematic of SCLC induction in RPP GEMMs through K5–Cre. f, Immunohistochemistry for SCLC subtype markers in RPP tumours grouped by Ad–Cre. K5–Cre mice pretreated with naphthalene ‘injury’. g, POU2F3 H-scores in RPM versus RPP tumours by indicated Ad–Cre. Exact tumour number indicated from n = 4–18 mice per group. Median, red bar; quartiles, solid lines. Statistics are by Kruskal–Wallis test with post hoc Dunn’s pairwise comparisons. h, Immunohistochemistry on serial K5–Cre RPP tumour sections for indicated proteins in high-MYC, medium-MYC and low-MYC regions. i, Linear regression of MYC versus POU2F3 H-scores in n = 21 tumours from n = 4 K5–Cre RPP mice; R2 and P value are shown. ****P < 0.0001; NS (P > 0.05); other exact P values are indicated. Scale bars: 50 µm unless stated (c,f,h) and 10 µm (f, insets). Schematics in b and e were created using BioRender (https://biorender.com).
To test this, we introduced CRISPR-mediated Pten knockout or control single-guide RNAs (sgRNAs) into RPM and RPMA basal organoids (Fig. 4b), validated phospho-AKT induction (Extended Data Fig. 8a,b) and implanted organoids into mice. Pten loss accelerated tumour growth in both backgrounds (Extended Data Fig. 8c). As expected, RPMA tumours lacked ASCL1 and expressed YAP1, which were not changed with Pten loss (Fig. 4c and Extended Data Fig. 8d). However, Pten loss markedly increased POU2F3 in both RPM and RPMA allografts, with near-uniform POU2F3 in some RPMA tumours (Fig. 4c). POU2F3 levels correlated with phospho-AKT and inversely with NEUROD1, particularly in RPMA (Fig. 4c,d and Extended Data Fig. 8d), indicating that Pten loss promotes SCLC-P, seemingly at the expense of SCLC-N.
Compared with controls, Pten-deleted RPMA tumours showed increased histological heterogeneity, with regions of adeno-NSCLC, adeno-squamous NSCLC and squamous NSCLC enriched for KRT5 and P63 (Extended Data Fig. 8e,f). Although other genotypes remained predominantly SCLC, RPMA–sgPten tumours comprised approximately 40% SCLC and 60% NSCLC, primarily adeno-squamous (Extended Data Fig. 8f). Squamous tumours often arise from basal cells and show MYC and PI3K/AKT activation53, suggesting that ASCL1 status may govern lineage choice between NE and squamous fates under MYC/AKT signalling. Tuft-like and squamous-like histologies co-occur in RPMA–sgPten tumours, suggesting a possible lineage relationship. Human SCLC-P has been observed adjacent to squamous cell carcinoma in combined SCLC54, suggesting a potential transition between these histologies. Thus, Pten loss in ASCL1-deficient basal cells yields a model for studying transitions between main lung cancer subtypes.
PTEN loss and MYC cooperate to drive SCLC-P
Because Pten loss promoted SCLC-P in allografts, we tested if it drives SCLC-P from a basal origin in autochthonous models. In Rb1fl/flTrp53fl/flPtenfl/fl (RPP) GEMMs, tumours initiated in NE cells resemble MYCL-driven SCLC-A lacking NEUROD1 and POU2F340. KRT5-initiated RPP tumours from basal cells developed with comparable latency (145 days versus 164 days), still lacked NEUROD1, but expressed significantly higher POU2F3 than CGRP–Cre or CMV–Cre controls (Fig. 4e–g and Extended Data Fig. 9a–c). Some K5–Cre RPP tumours co-expressed ASCL1 and POU2F3 (Fig. 4h), supporting NE–tuft plasticity. Similar to RPM models, POU2F3+ RPP tumours were enriched in the trachea and primary bronchi (Extended Data Fig. 9c). Thus, Pten loss promotes SCLC-P in basal cells across genetic contexts in both allograft and autochthonous systems.
PTEN loss and PI3K/AKT pathway activation can upregulate MYC through multiple mechanisms in cancer. Consistent with previous reports33, public SCLC datasets23 show MYC amplification in approximately 56% of POU2F3high versus 19% of POU2F3low tumours (Fisher’s exact test; P < 0.003) and co-occurrence with PTEN loss in 25% versus 7.3% (Fisher’s exact test; P = 0.05) (Fig. 4a). MYC was absent in ASCL1-only tumours, but it was detected in all POU2F3+ RPP tumours (Fig. 4h,i). Compared with MYC-driven RPM tumours, RPP basal tumours had even higher POU2F3 (Fig. 4g), suggesting PI3K/AKT–MYC cooperativity. Supporting this, AKT activation through Pten loss or myristoylated AKT expression increased MYC in POU2F3+ human SCLC cells (Extended Data Fig. 9d,e). Altogether, PI3K/AKT signalling acts upstream of MYC to drive tuft fate in SCLC, revealing a potential targetable axis for SCLC-P.
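The subtype comparisons above rely on Fisher's exact test applied to 2 × 2 contingency tables. The following is a minimal sketch of a two-sided test, not the authors' analysis code. The counts in the example (9 of 16 POU2F3high and 18 of 96 POU2F3low tumours with MYC amplification) are an assumption, back-calculated from the reported percentages and group sizes, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so floating-point ties count as equal
    p = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(p, 1.0)


# Assumed counts reconstructed from the reported percentages (illustrative only):
# rows = POU2F3-high, POU2F3-low; columns = MYC amplified, not amplified
p_value = fisher_exact_two_sided(9, 7, 18, 78)
print(f"two-sided P = {p_value:.4f}")
```

For routine use, an established implementation such as `scipy.stats.fisher_exact` would normally be preferred; the standard-library version here only illustrates the calculation.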
Inflammatory basal state of human SCLC
We next investigated whether a basal-like state exists in human SCLC and how it compares to basal-derived murine tumours. ‘Lineage-negative’ (Lin−) SCLC lacking A, N or P, often described as YAP1+, inflamed or mesenchymal8,9,22,23,24,25, remains controversial, and its relationship to other subtypes is unclear. Whether Lin− tumours are enriched for basal-derived signatures is unknown. We analysed bulk transcriptomes from 944 SCLC biopsies (Caris Life Sciences; Supplementary Table 8). Tumours were classified as A, N, P, mixed or Lin− (Fig. 5a). Basal markers were found in small subsets of A/N/P/mixed tumours, but Lin− tumours exhibited strong enrichment for a normal basal cell signature10 (Fig. 5a, Extended Data Fig. 10a and Supplementary Table 2), suggesting that basal-like phenotypes are minor but present across SCLC subtypes, consistent with mouse models.
a, Heat map of bulk RNA sequencing (RNA-seq) expression in n = 944 human SCLC biopsies (Supplementary Table 8) for indicated genes or signatures, grouped by subtype and ordered by basal signature score. b, As in a, with YAP1 subtype included. c, Spearman correlation matrix of genes and gene signatures in n = 944 SCLC tumours, including TIP genes and mouse (M) and human (H) ionocyte (iono) signatures (Supplementary Table 2). Yellow box highlights tuft/iono correlations. d, Expression of human SCLC subtype signatures (from data in a and Supplementary Table 7) in RPM/RPMA basal-derived tumours (from Fig. 2e) by UMAP or in violin plots grouped by SCLC fate. e, Pearson correlation matrix comparing enrichment scores of human SCLC subtype signatures (from four independent datasets: Caris, Liu et al.23, George et al.39 and Lissa et al.61), normal lung cell type signatures (basal, tuft and NE) and ChIP–seq target signatures (A, N or P_targets) in RPM/RPMA tumour data (from Fig. 2e). f, GSEA for ‘antigen presentation’ and ‘T cell inflamed’ signatures per indicated SCLC subtype versus ‘all’ other subtypes (data from a). NES and P values were determined using Kolmogorov–Smirnov and permutation testing. g, Expression of ‘antigen presentation’ and ‘inflamed’ human SCLC signatures8,23 in RPM and RPMA tumours (from Fig. 2e) by UMAP or in violin plots grouped by fate. h, Expression of therapeutic targets in RPM/RPMA tumours (from Fig. 2e) by UMAP or in violin plots grouped by fate. i, Graphical abstract: Waddington landscapes depict SCLC fate trajectories from NE (left) versus basal (right) origins. SCLC-Y from NE origins are transcriptionally similar to A/N cells from basal origins and are therefore omitted. The arrow thickness reflects frequency of trajectories in RPM GEMM. MYC, PTEN and ASCL1 concentrations vary by fate. All violin plots show median and upper/lower quartiles. 
Statistics are by Kruskal–Wallis tests and post hoc Dunn’s pairwise comparisons with Bonferroni correction. ****P < 0.0001; NS (P > 0.05); other exact P values indicated. NES, normalized enrichment score. Schematic in i was created using BioRender (https://biorender.com).
The basal signature strongly correlated with YAP1 in human tumours, particularly in Lin− and across subtypes (Fig. 5a and Extended Data Fig. 10a). In an independent cohort with transcriptomic and proteomic data (n = 107)23, YAP1 mRNA and protein also correlated with the basal signature and basal markers (Extended Data Fig. 10b). Thus, real-world tumours were reclassified to include YAP1 as a subtype, revealing basal signature enrichment in YAP1 and mixed tumours and a relative depletion in the remaining Lin− samples (Fig. 5b). ATOH1 expression correlated with SCLC-P and mixed tumours, aligning with murine lineage-tracing data showing plasticity between ATOH1, tuft and neuronal states (Fig. 3h). The presence of a basal-like identity in YAP1 and mixed tumours suggests that much of the YAP1+ human SCLC subtype may be enriched for basal-like tumour cells. The basal signature in mixed tumours further implies that the basal state may promote subtype diversity, consistent with A/N hybrid states in basal-derived murine tumours (Figs. 2h and 3c and Extended Data Fig. 2h,i).
In addition to NE and tuft cells, basal cells can generate rare CFTR+ ionocytes and tuft–ionocyte progenitors (TIPs) marked by POU2F3 and FOXI1 or ASCL310,13,55. Ionocyte markers such as FOXI1 are detected in tuft-like tumours across tissues6,7,23,39. In human SCLC, tuft and ionocyte gene expression signatures are strongly correlated (Fig. 5c, Extended Data Fig. 10c and Supplementary Table 2). Gene set enrichment analyses (GSEA) showed that SCLC-A was enriched for NE signatures, SCLC-Y for basal signatures and SCLC-P for both tuft and ionocyte signatures (Extended Data Fig. 10d). Subtype-specific signatures from the Caris dataset and three independent human SCLC datasets showed strong concordance with murine basal-derived fates and their respective normal lung cell counterparts (Fig. 5d,e and Extended Data Fig. 10e). These data support the idea that the basal cell can generate diverse SCLC subtypes, recapitulating normal basal-derived lineages10,13. Future studies should examine tuft or TIP cells as possible origins for SCLC and the extent of plasticity from these lineages.
In human SCLC, YAP1 was observed in limited-stage disease and may predict better prognosis owing to increased inflammation8,9,22,23,24. Yet, YAP1 was also enriched at relapse and linked to chemotherapy resistance50. Basal-like SCLC-Y tumours were enriched for ‘antigen presentation’ and ‘T cell inflamed’ signatures (Fig. 5f), and inflammatory signatures8,23 were enriched in the murine basal tumour state (Fig. 5g), suggesting that the basal-like state has an inflammatory nature. The association of YAP1 with good and poor outcomes may reflect its expression across basal, mesenchymal and stromal cell types (Extended Data Figs. 5f and 6d,e) and/or basal state heterogeneity, emphasizing the need to link specific cell fates to prognosis and therapy response.
Key SCLC therapeutic targets (such as DLL3, NCAM1 (also known as CD56), SEZ6 and TACSTD2 (also known as TROP2)) are enriched in distinct cell fates56, with Dll3, Ncam1 and Sez6 in NE or neuronal states and Tacstd2 in the basal state (Fig. 5h). ASCL1 loss mimics therapeutic pressure on NE-high tumours and predicts resistance through the emergence of POU2F3+ and non-NE states. Thus, SCLC may require combination and/or sequential therapies to counter subtype plasticity.
Discussion
The strong concordance of human SCLC subtype signatures within basal-derived mouse models (Fig. 5d,e) underscores basal cells and their differentiation capacity as central drivers of SCLC development and plasticity. Our findings support a model whereby basal cells give rise to all SCLC subtypes, including SCLC-A, SCLC-N, SCLC-P, mixed, TIP-like and ATOH1+ phenotypes (Fig. 5i). Given the abundance, distribution and differentiation potential of basal cells, these data strongly implicate them as a previously underappreciated but probable cell of origin for SCLC and possibly other tumours with NE–tuft heterogeneity. Concordantly, the computational predictions of cell of origin have leveraged the correlation between mutational density and chromatin state57,58. A recent pan-cancer preprint combining single-cell chromatin accessibility with whole-genome sequencing surprisingly predicted the basal cell, rather than NE or tuft/TIP cells, as the origin of SCLC57.
Emerging evidence of basal cell heterogeneity, including a KRT13+ hillock state10,41,43,59, raises the possibility that specific basal states may drive distinct tumour fates. In our study, a subset of KRT13+ hillock-like basal cells recombined in naphthalene plus K5–Cre reporter mice (Extended Data Fig. 11a–e) and a fraction of primary basal cells expressed KRT13 (Extended Data Fig. 3a). Although organoid-derived tumours with hillock-like basal origins did not show unique clonal dynamics in our limited dataset (Extended Data Fig. 7), future studies can target this population more precisely.
It remains to be determined if other tumours with NE–tuft heterogeneity arise from basal cells or acquire basal-like features to enable plasticity. We recently identified globose basal cells as the origin for olfactory neuroblastomas with both NE and tuft/microvillar phenotypes3. Basal-like intermediate states also occur in lineage plasticity driving therapeutic resistance in lung36 and prostate adenocarcinoma2,60. These findings support the notion that SCLC can arise de novo from basal cells and/or transition to a basal-like state under treatment pressure.
Altogether, we established the basal cell as a probable origin for SCLC plasticity. We introduced new POU2F3-enriched lung tumour models, including immune-competent GEMMs, suitable for genetic, microenvironmental and therapeutic studies. The mammalian SWI/SNF complex has emerged as a selective dependency in SCLC-P18,19, and further work is needed to understand the role of PI3K/AKT signalling in driving or sustaining this subtype. These immune-competent models will be key for studying the inflammatory basal state and for preclinical testing.
Additionally, basal-derived organoid allografts closely resemble autochthonous tumours but offer enhanced flexibility for lineage tracing, genetic manipulation and drug studies. These new models provide valuable tools to dissect NE–tuft plasticity and assess plasticity-directed therapies. Insights gained from these systems are expected to advance understanding of SCLC and other cancers characterized by NE–tuft heterogeneity1,2,3,4,7,16.
Methods
Resource availability
Materials availability
There are limitations to the availability of basal-derived organoid lines generated in this study owing to their derivation from primary cells in the tracheal epithelium. The human SCLC tissue used in this study was not available because of sample scarcity. Human transcriptomic data from Caris Life Sciences used for this study are not publicly available but can be made available upon reasonable request. The de-identified sequencing data are owned by Caris Life Sciences, and qualified researchers can apply for access by signing a data usage agreement.
Experimental models and study participant details
Mice
Rb1fl/fl;Trp53fl/fl;H11b–LSL–MycT58A/T58A–Ires–Luciferase (RPM; The Jackson Laboratory (JAX) no. 029971)40, RPM–R26–LSL–Cas9–Ires–Gfp (RPM–Cas9)26,27, RPM;Ascl1fl/fl (RPMA)27, Rb1fl/fl;Trp53fl/fl;Rbl2fl/fl (RPR2)62 and Rb1fl/fl;Trp53fl/fl;Ptenfl/fl (RPP)63,64 mice have been previously described. SCID/beige mice (CBSCGB) were purchased from Taconic and Charles River Laboratories. Ai9-tdTomato Cre-reporter mice (B6.Cg-Gt(ROSA)26Sortm9(CAG-tdTomato)Hze/J) were a generous gift from C.-L. Lee at Duke University and are available for purchase through JAX (strain no. 007909; RRID:IMSR_JAX:007909).
All mice were housed and treated according to regulations set by the Institutional Animal Care and Use Committee (IACUC) of Duke University. Specifically, the mice were housed under conditions described by the Guide for the Care and Use of Laboratory Animals with a 12-h light/12-h dark cycle, at temperatures of 18–24 °C and in a 40–60% humidity range. Sample groups were randomly allocated but with the intention to equally distribute sex among cohorts. Viral infections were performed in a Biosafety Level 2+ room following the guidelines from the Duke University Institutional Biosafety Committee. Male and female mice were distributed equally for all experiments. Mice with symptoms including but not limited to inability to ambulate, eat or drink; weight loss in excess of 15% of body weight; tumours exceeding 10% of body weight; tumours with necrosis or ulceration of skin surface; laboured breathing; or other signs of poor body condition were killed before study end points to ensure humane end points, as defined by Duke University’s Policy on Tumor Burden in Rodents and as permitted by IACUC protocol no. A057-22-03(2025). Tumour volume or end points, as defined by our IACUC protocol, were not exceeded in any of our studies.
Basal-derived organoid cultures and cell lines
Basal-derived organoid cultures from RPM, RPMA and RPR2 mice were obtained and transformed ex vivo. Organoid lines were determined to be free of pathogens by IDEXX 18-panel mouse pathogen testing and were confirmed to be Mycoplasma-negative before implantation to SCID/beige hosts. The cell lines used in this study include HEK-293T/17 cells (American Type Culture Collection (ATCC) CRL-11268) to produce lentivirus and H1048 SCLC cells (ATCC CRL-5853). Cell lines were tested for Mycoplasma every 3 months and were negative. Cell line identities were confirmed through short tandem repeat profiling within 6 months of usage, last performed in July 2024. No commonly misidentified cell lines were used in this study.
Patient tissue for immunostaining
Biopsies for the establishment of PDX models were performed after obtaining written informed consent from patients under an Institutional Review Board-approved protocol at Memorial Sloan Kettering (IRB14-091). Models were established and characterized, as previously described65. As previously described26 for human biopsies from Huntsman Cancer Institute, all patients provided informed consent for the collection of specimens, approved by the University of Utah Institutional Review Board (IRB_00010924 and IRB_00089989) in accordance with the US Common Rule. For tissue microarrays, human biopsies collected at Washington University in St. Louis were acquired with approval under IRB_202008098.
Caris Life Sciences patient cohort
Caris real-world data were derived from a retrospective review of patient tumour specimens (n = 944) with a diagnosis of SCLC (on the basis of pathological confirmation by local pathologists) submitted to a Clinical Laboratory Improvement Amendments-certified laboratory (Caris Life Sciences) for molecular profiling. This study was conducted in accordance with the guidelines of the Declaration of Helsinki, Belmont Report and US Common Rule and in compliance with policy 45 CFR 46.101(b). This study was conducted using retrospective, de-identified clinical data, and patient consent was therefore not required. Human samples were derived from individuals in the age range of 65–70 years old and included approximately 49% male and 51% female samples. Population characteristics are reported in Supplementary Table 8. De-identified patient demographics and treatment information can also be found in Supplementary Table 8.
Method details
Naphthalene injury model and tumour initiation in mice
Mice at 6–8 weeks of age were treated intraperitoneally with 275 mg kg−1 naphthalene in corn oil before 9 AM, as described66, 72 h before administration of adenoviral Cre, a time point at which KRT5+ basal cells are shown to be abundant and proliferative67, which we have validated (Extended Data Fig. 1e,f). After naphthalene treatment, the mice were infected by intratracheal (RPM and RPP) or intranasal (RPP) instillation with 1 × 10^8 plaque-forming units (pfu) of Ad5–K5–Cre adenovirus (University of Iowa VVC-Berns-1547) using established methods68,69. No differences in latency or tumour phenotype were observed in RPP mice between intratracheal and intranasal inoculation; therefore, both were included in the results. In brief, the mice were anaesthetized with isoflurane at a flow rate of 20–25 ml h−1, depending on the size and sex of the mouse. The optimal breathing rate was approximately one breath every 2–3 s. For intratracheal instillation, the mice were positioned on a platform with their chest hanging vertically beneath them. A steel feeding tube or Exel Safelet IV catheter (needle removed) was slid into the trachea, and 63 μl of viral cocktail consisting of 10 mM CaCl2 (Sigma; C5670), 1 × 10^8 pfu adenovirus and MEM (Thermo Fisher Scientific; 11095080) up to 63 μl was administered through a P200 pipette to the catheter opening. The mice were maintained in this position until the entire volume was dispensed and then monitored until they regained full motility and recovered from anaesthesia. For intranasal instillation, the mice were held in a supine position and administered 63 μl of identical viral cocktail through a P20 pipette, alternating between the left and right nares for each drop.
Administration of other Ad–Cre viruses (CGRP, VVC-U of Iowa-1160; SPC, VVC-U of Iowa-1168; CCSP, VVC-U of Iowa-1166; CMV, VVC-U of Iowa-5) also occurred in mice 6–8 weeks of age with identical methods, intratracheally, but without naphthalene injury, as previously described27,40,70.
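The composition of the 63 μl viral cocktail described above reduces to simple dilution arithmetic. The sketch below illustrates one way to compute the component volumes. The stock titre and CaCl2 stock concentration in the example are hypothetical values chosen for illustration, and the function name is not from the source.

```python
def viral_cocktail_volumes(
    stock_titre_pfu_per_ul,
    cacl2_stock_mM,
    target_pfu=1e8,
    total_ul=63.0,
    cacl2_final_mM=10.0,
):
    """Return the volumes (in microlitres) of virus, CaCl2 stock and MEM.

    The virus volume delivers target_pfu, the CaCl2 volume gives the final
    concentration in total_ul, and MEM makes up the remaining volume.
    """
    virus_ul = target_pfu / stock_titre_pfu_per_ul
    cacl2_ul = cacl2_final_mM * total_ul / cacl2_stock_mM  # C1 * V1 = C2 * V2
    mem_ul = total_ul - virus_ul - cacl2_ul
    if mem_ul < 0:
        raise ValueError("Stock is too dilute to fit the dose in the total volume")
    return {"virus_ul": virus_ul, "cacl2_ul": cacl2_ul, "mem_ul": mem_ul}


# Hypothetical stocks: 1e7 pfu per microlitre of virus and 1 M (1000 mM) CaCl2
volumes = viral_cocktail_volumes(stock_titre_pfu_per_ul=1e7, cacl2_stock_mM=1000.0)
for name, value in volumes.items():
    print(f"{name}: {value:.2f}")
```

The check on the MEM volume flags stocks whose titre is too low to deliver the full dose within 63 μl.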
Micro-computed tomography imaging
To monitor tumour development in autochthonous models, mice were imaged beginning 4 weeks after Ad–Cre administration for RPM mice, 8 weeks for RPP mice and every 2 weeks thereafter. The mice were anaesthetized with isoflurane and imaged using a small animal Quantum GX2 micro-computed tomography (PerkinElmer). Quantum GX2 images were acquired with 18-s scans at 45-μm resolution, 90 kV, with 88 mA of current. The mice were killed when the tumour burden resulted in any difficulty in breathing or significant weight loss, as permitted by IACUC.
Immunohistochemistry
For immunohistochemistry of autochthonous mouse models, lungs were inflated with 1× PBS and extracted, and individual lung lobes and trachea were collected for fixation. Tissues were fixed in 10% neutral buffered formalin for 24 h at room temperature, washed in PBS and transferred to 70% ethanol. Formalin-fixed paraffin-embedded (FFPE) sections at 4–5 μm were dewaxed, rehydrated and subjected to high-temperature antigen retrieval by boiling for 20 min in a pressure cooker in 0.01 M citrate buffer at pH 6.0. Slides were quenched of endogenous peroxide in 3% H2O2 for 15 min, blocked in 5% goat serum in PBS/0.1% Tween 20 (PBS-T) for 1 h and then stained overnight with primary antibodies in blocking buffer (5% goat serum or SignalStain antibody diluent (Cell Signaling Technology (CST); 8112)). For non-CST primary antibodies, an HRP-conjugated secondary antibody (Vector Laboratories) was used at 1:200 dilution in PBS-T, incubated for 45 min at room temperature and followed by DAB staining (Vector Laboratories). Alternatively, CST primary antibodies were detected using 150 μl of SignalStain Boost IHC Detection Reagent (CST; 8114). All staining was performed using the Sequenza cover plate technology. The primary antibodies included ASCL1 (Abcam; ab211327) 1:300, NEUROD1 (Abcam; ab213725; using Tris/EDTA buffer (pH 9.0) instead of citrate buffer for antigen retrieval) 1:300, POU2F3 (Sigma; HPA019652) 1:300, YAP1 (CST; 14074) 1:300, HES1 (CST; 11988) 1:300, DNP63 (R&D Systems; AF1916) 1:400, phospho-AKT Ser473 (CST; 4060) 1:100, MYC (Abcam; ab32072; Tris/EDTA buffer (pH 9.0)) 1:300, NKX2-1 (Abcam; ab76013) 1:250 and KRT5 (BioLegend; 905501) 1:1,000. For manual H-score quantification, whole slides were scanned using a Pannoramic MIDI II automatic digital slide scanner (3DHISTECH), and images were acquired using SlideViewer software (3DHISTECH). Immunohistochemistry quantification from primary tumour models included tumours from both the trachea and lung lobes.
H-score was quantified on stained slides on a scale of 0–300, taking into consideration the percentage of positive cells and staining intensity, as previously described71, where H-score = positive cells (%) × intensity score of 0–3. For example, a tumour with 80% positive cells with high intensity of 3 has a 240 H-score.
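The H-score formula stated above can be written as a short function. This is a minimal sketch; the function name and input validation are illustrative additions rather than part of the published workflow.

```python
def h_score(percent_positive, intensity):
    """H-score = positive cells (%) x intensity score (0-3), giving a 0-300 scale."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    if not 0 <= intensity <= 3:
        raise ValueError("intensity must be between 0 and 3")
    return percent_positive * intensity


# Worked example from the text: 80% positive cells at intensity 3
example = h_score(80, 3)
print(example)  # 240
```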
Immunofluorescence
Lung and tumour tissue was collected and fixed for at least 24 h in 10% neutral buffered formalin and then transferred to 70% ethanol before embedding in paraffin. Wild-type and transformed organoids (more than 1 × 10^6 cells) were collected in approximately 0.5–1 ml of organoid medium using a P1000 pipette tip and then transferred to a conical tube containing 10 ml of 10% formalin. The organoids were fixed at room temperature in formalin for 24 h. After fixation, the organoids were spun down at 500g for 5 min and then washed in 70% ethanol. Ethanol was removed, and the organoids were resuspended in approximately 300 μl of 3% low-melting agarose gel (microwaved to melt and then incubated in a 50 °C water bath for 30 min) using a wide-bore P1000 pipette tip and then transferred to one well of a 96-well V-bottom plate. When the agarose solidified (approximately 3–5 min at room temperature), agarose plugs containing organoids were transferred from the well plate to histology cassettes, placed in 70% ethanol and then subjected to FFPE and sectioning for slides. Before staining, the slides were rehydrated in CitriSolv (2 × 3 min), 100% ethanol (2 × 3 min), 90% ethanol (1 × 3 min), 70% ethanol (1 × 3 min), 40% ethanol (1 × 3 min) and dH2O (1 × 5 min). Rehydrated tissue was subjected to high-temperature antigen retrieval by boiling for 15 min in a pressure cooker containing 0.01 M citrate buffer at pH 6.0. The slides were cooled at room temperature for 2 h and positioned for staining in Sequenza staining racks (Thermo Fisher Scientific; 10129-584). The slides were blocked at room temperature for 1 h in 10% donkey serum in PBS/0.2% Tween 20 (PBS-T). For primary mouse-on-mouse (M.O.M.) tissue staining, M.O.M. IgG Blocking Reagent (VectorLabs; PK-2200) was also added according to the manufacturer’s protocol. The primary antibody was diluted in 10% donkey serum in PBS-T and added to slides, and the slides were incubated overnight at 4 °C.
The following day, the slides were washed three times with PBS-T and then stained with the secondary antibody diluted in 10% donkey serum. For M.O.M. staining, M.O.M. protein concentrate (VectorLabs; PK-2200) was added to the secondary antibody solution, according to the manufacturer’s protocol. The slides were then subjected to three extra washes with PBS-T, followed by DAPI staining (1 μg ml−1 in PBS-T) for 20 min. After three extra washes in PBS-T, the slides were coverslipped with Aqua-Poly/Mount mounting medium (Polysciences; 18606-20). The primary antibodies included anti-mouse ASCL1 (BD Pharmingen; 556604) 1:25, anti-rabbit ASCL1 (Abcam; ab211327) 1:100, anti-goat NEUROD1 (R&D Systems; AF2746) 1:50, anti-rabbit NEUROD1 (Abcam; ab213725) 1:200, POU2F3 (Sigma; HPA019652) 1:100, anti-rabbit CCSP/SCGB1A1 (MilliporeSigma; 07-623) 1:75, anti-rat KRT8 (Developmental Studies Hybridoma Bank; TROMA-I) 1:100, anti-mouse FOXJ1 (eBioscience; 14-9965-80) 1:100, anti-mouse KI67 (BD Pharmingen; BDB556003) 1:100, anti-goat P63 (R&D Systems; AF1916) 1:40, anti-chicken mCherry/tdTomato (Sigma; AB356481) 1:100, anti-rabbit KRT13 (Abcam; ab92551) 1:200, anti-guinea-pig KRT13 (Origene; BP5076) 1:100, anti-rabbit KRT5 (BioLegend; 905501) 1:200, anti-mouse KRT5 (GeneTex; GTX60580) 1:100 and anti-rabbit CGRP (Sigma; C8198) 1:100. The secondary antibodies for immunofluorescence were all used at a concentration of 10 μg ml−1 and included donkey anti-rabbit AF488 (Invitrogen; A21206), donkey anti-rat AF568 (Invitrogen; A78946), donkey anti-rat AF647 (Invitrogen; A78947), donkey anti-mouse AF647 (Invitrogen; A31571), goat anti-mouse IgG2a AF647 (Invitrogen; A21241), donkey anti-goat AF594 (Invitrogen; A11058), donkey anti-chicken AF594 (Invitrogen; A78951) and donkey anti-guinea-pig AF594 (Jackson ImmunoResearch; 706-585-148).
The slides were imaged on an EVOS M5000 (Invitrogen; AMF5000) digital inverted benchtop microscope or on a Leica STELLARIS SP8 FALCON confocal microscope on an upright DM6 stand using a ×20 objective and laser illumination. Fluorescence signal was collected with two photomultiplier tubes and three HyS detectors (plus one HyX detector for four-colour staining). Images were acquired using Leica LAS X Microscope Software, including the Navigator function for imaging whole cross sections of the mouse trachea.
Analysis of K5–Cre-based reporter activity in mice
R26R–Ai9 mice were infected intratracheally with 1 × 10^8 pfu of Ad5–K5–Cre adenovirus 72 h after naphthalene injury, as described above. Tissues from tracheas and lungs were collected at 0, 3, 4 or 7 days post-infection and subjected to FFPE using the methods described above, with tracheas and lung lobes embedded in separate cassettes. Tissues from the tracheas and all lung lobes of four mice were subjected to co-immunofluorescent staining using the methods described above, and one cross section of the lungs and trachea per animal was assessed for co-expression of tdTom with KRT5, P63 and/or KRT13. In total, 160 tdTom+ cells were manually assessed from the lung and tracheal tissues for co-expression of KRT13 and/or KRT5, and 90 tdTom+ cells were manually assessed for co-expression of KRT5 and/or P63, on the basis of overlapping fluorescent signal from images of all tdTom+ cells (acquired with a Leica STELLARIS SP8 confocal microscope and LAS X software; see ‘Immunofluorescence’ section). The results of manual quantification are included in Extended Data Fig. 11e and represent the percentage of tdTom+ cells in the tracheal epithelium, lung airway and total airway (trachea + lung airway epithelium) that co-express the indicated basal markers or are within two cell distances from KRT5+, P63+ or KRT13+ cells. TdTom+ cells lacking co-expression of a basal marker but within two cell distances from cells with high basal marker expression were included in the quantification because their proximity to basal cells suggests that they may have been targeted by K5–Cre in a basal state but basal marker expression was downregulated during rapid differentiation/regeneration after naphthalene injury.
Semi-automated image quantification
Quantification of cell types present in the murine lung and tracheal epithelium on days 0, 1.5, 2, 3, 5 and 10 after naphthalene injury (Extended Data Fig. 1e,f) was semi-automated using QuPath open software for bioimage analysis (v.0.5.1-x64). At each time point, stained tissues from one to four mice were imaged using a Leica STELLARIS SP8 confocal microscope and LAS X software (see ‘Immunofluorescence’). For tracheal quantification, whole tracheal cross sections were captured per set of co-stains using the LAS X Navigator software with a ×20 objective. For lung airway quantification, approximately three to ten images of distinct airways were captured per animal per time point. Images of tracheas and lung airways were then exported with scale bars as multi-page TIFFs for import to QuPath with ‘image type’ set as ‘fluorescence’. The lung and tracheal epithelium per image were manually annotated in QuPath, and then all annotations were subjected to automated analysis for positive cells per fluorescent marker using the appropriate channel (analyse > cell detection > cell detection). Cell detection parameters were selected for each individual stain on the basis of the positive and negative control regions of the images. After cell detection, the number of positive cells per annotation was exported to a .tsv file (measure > export measurements; export type = ‘annotations’) that lists the ‘perimeter’ and number of cells detected in each annotation. The ‘perimeter’ measurements of each annotation were divided by 2 to estimate the length of the epithelium quantified (approximate length = perimeter/2) and to ultimately obtain the estimated number of positive cells per millimetre of lung or tracheal epithelium. For lung airways, approximately 5–35 total millimetres of epithelium was quantified per cell-type marker per time point (CCSP, KRT5, KRT13 and CGRP).
For tracheal airways, approximately 18–50 total millimetres of epithelium was quantified per cell-type marker per time point (CCSP, KRT5, KRT13, CGRP and P63).
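The conversion from exported annotation measurements to positive cells per millimetre of epithelium (approximate length = perimeter/2) can be sketched as follows. The column names `perimeter_um` and `num_positive` are hypothetical placeholders and should be replaced with the headers of the actual export. Pooling all annotations before dividing is likewise an assumption, not a documented detail of the analysis.

```python
import csv
import io


def positive_cells_per_mm(tsv_text, perimeter_col="perimeter_um", count_col="num_positive"):
    """Estimate positive cells per millimetre of epithelium from a TSV export.

    Each row is one annotation. Epithelial length is approximated as half the
    annotation perimeter, and cells and lengths are pooled across annotations.
    """
    total_cells = 0
    total_length_um = 0.0
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        total_cells += int(float(row[count_col]))
        total_length_um += float(row[perimeter_col]) / 2  # length = perimeter / 2
    if total_length_um == 0:
        raise ValueError("No epithelial length was measured")
    return total_cells / (total_length_um / 1000)  # convert micrometres to millimetres


# Illustrative export with two annotations (values are made up)
example_tsv = (
    "name\tperimeter_um\tnum_positive\n"
    "airway_1\t4000\t30\n"
    "airway_2\t6000\t20\n"
)
density = positive_cells_per_mm(example_tsv)
print(density)  # 10.0
```

In this example the two annotations contribute 2 mm and 3 mm of epithelium, so 50 positive cells over 5 mm gives 10 cells per millimetre.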
Quantification of co-expression of subtype markers ASCL1, NEUROD1 and POU2F3 in RPM K5–Cre-initiated tumours (Extended Data Fig. 2f) was semi-automated and performed using QuPath (v.0.5.1). RPM K5–Cre-initiated tumours were subjected to immunofluorescence for SCLC subtype markers and DAPI (to mark nuclei) (see ‘Immunofluorescence’), and images of ten unique tumours from five mice were captured on an EVOS M5000 (Invitrogen; AMF5000) digital inverted benchtop microscope with a ×10 or ×20 objective per channel. Images from all four channels per tumour were exported with scale bars and imported individually to QuPath with image type set as fluorescence. The DAPI image per tumour was first subjected to manual annotation of the tumour region in QuPath (whole image annotation if the region only included tumour, or custom annotation if the image included normal regions). Next, the DAPI image was subjected to automated detection of all cells (analyse > cell detection), and then the identified cells were converted to ‘annotations’ by first creating a single measurement classifier on the basis of DAPI cell detections (classify > object classifier > create single measurement classifier). Settings included ‘object filter’ set to ‘detections (all)’, ‘channel filter’ set to ‘blue’, ‘measurement’ set to ‘cell: blue mean’ and ‘classifier name’ set to ‘blue’. Live preview was checked to manually ensure that the classifier captured all cells. After creating the single measurement classifier, all objects (‘cells’) were converted to annotations using the QuPath script editor. After running the command, all resulting annotations (each should highlight one cell in the image) in the ‘annotations’ pane in QuPath were copied (select all with cursor then edit > copy to clipboard > selected objects) and then pasted onto each image from the other three channels. 
For each of the other three images, automatic detection of positive cells for all annotations was performed (analyse > cell detection > positive cell detection) using appropriate parameters per channel to detect positive cells on the basis of the positive/negative control regions of images. Once positive cells were identified for all channels (all three images), annotation measurements were exported (measure > export measurements; ‘export type’ set to ‘annotations’). The resulting .tsv file contained ordered annotations representing individual tumour cells in each image, with corresponding columns of whether that annotation ‘cell’ was positive or not for each channel/image. Positive detections from each channel per annotation ‘cell’ were assessed to obtain the number of co-expressing cells per imaged region of ten distinct tumours.
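The final step, tallying co-expression from per-cell positivity calls in each channel, can be sketched as follows. The input structure (one dictionary of marker calls per annotated cell) is an assumed, simplified representation of the merged export, and the example values are invented for illustration.

```python
from collections import Counter


def coexpression_counts(cells, markers=("ASCL1", "NEUROD1", "POU2F3")):
    """Count cells by their combination of positive subtype markers.

    cells is a list of dicts mapping marker name to True/False, one dict per
    annotated cell. Cells positive for no marker are labelled 'negative'.
    """
    counts = Counter()
    for cell in cells:
        positive = [m for m in markers if cell.get(m, False)]
        label = "+".join(positive) if positive else "negative"
        counts[label] += 1
    return counts


# Invented per-cell calls for six annotated cells
cells = [
    {"ASCL1": True, "NEUROD1": False, "POU2F3": False},
    {"ASCL1": True, "NEUROD1": True, "POU2F3": False},
    {"ASCL1": False, "NEUROD1": False, "POU2F3": True},
    {"ASCL1": True, "NEUROD1": False, "POU2F3": True},
    {"ASCL1": False, "NEUROD1": False, "POU2F3": False},
    {"ASCL1": True, "NEUROD1": True, "POU2F3": False},
]
counts = coexpression_counts(cells)
for label, n in sorted(counts.items()):
    print(f"{label}: {n} ({100 * n / len(cells):.1f}%)")
```

Because the same annotations are pasted onto every channel image, the rows of each export can be matched by annotation order before building the per-cell dictionaries.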
Human SCLC cell infections
Human SCLC cell line H1048 was obtained from ATCC and cultured in RPMI medium supplemented with 10% fetal bovine serum (FBS), 1% l-glutamine and 1% penicillin–streptomycin antibiotic cocktail. To generate H1048 sgNTC, sgPTEN and myristoylated AKT cell lines, cells were infected with a non-targeting sgRNA (sgNTC) or an sgRNA against PTEN (sgPTEN: 5′-GAC TGG GAA TAG TTA CTC CC-3′) in the LCV2-hygro backbone (Addgene plasmid no. 98291) or infected with the pHRIG–AKT1 lentiviral construct (Addgene; 53583). In brief, high-titre virus (approximately 1–5 × 10⁷ pfu) was produced using HEK-293T cells transfected with a three-plasmid system, including the targeting construct and lentiviral packaging plasmids pCMV delta R8.2 (Addgene plasmid no. 8455) and pCMV–VSVG (Addgene plasmid no. 8454). Viruses were collected at 48 h and 72 h post-transfection, concentrated by means of ultracentrifugation (25,000 rpm for 1.45 h), resuspended in 1× sterile PBS and stored at –80 °C until use. H1048 cells were subjected to spinoculation at 37 °C, 900g, for 30 min. During spinoculation, 0.5–1 million cells per well of a six-well plate were cultured with 2-ml RPMI, 25-μl HEPES buffer (Thermo Fisher Scientific; 15630080), 8 μg ml−1 of Polybrene (Santa Cruz Biotechnology; sc-134220) and 25-μl high-titre virus. Cells were selected 48 h after spinoculation with hygromycin (for LCV2-hygro-infected cells) or sorted for GFP to enrich for cells infected with pHRIG–AKT1.
Immunoblotting
For human cell line and mouse tumour western blots, protein lysates were prepared, as described40,72, separated by means of SDS–PAGE and transferred to polyvinylidene fluoride (PVDF) membranes (Bio-Rad; 1704157) using the Trans-Blot Turbo Transfer System (Bio-Rad; 1704150). Membranes were blocked for 1 h in 5% milk, followed by overnight incubation with primary antibodies at 4 °C. The membranes were washed for 3 × 10 min at room temperature in Tris-buffered saline with Tween 20 (TBS-T). Mouse and rabbit HRP-conjugated secondary antibodies (Jackson ImmunoResearch; 1:10,000) were incubated for 1 h in 5% milk at room temperature, followed by washing 3 × 10 min at room temperature in Tris-buffered saline with Tween 20. The membranes were exposed to WesternBright Quantum HRP substrate (Advansta; K-12045-D50) and detected on HyBlot CL film (Denville Scientific). The primary antibodies included ASCL1 (1:1,000; Abcam; ab211327), NEUROD1 (1:1,000; CST; 62953), MYC (1:1,000; CST; 5605), PTEN (1:1,000; CST; 9559), POU2F3 (1:1,000; Sigma; HPA019652), pAKT (Ser473) (1:1,000; CST; 4060), pAKT (Thr308) (1:1,000; CST; 13038), total AKT (1:1,000; CST; 9272) and HSP90 (1:1,000; CST; 4877) as loading control.
Basal organoids
Tracheal basal cell isolation and organoid culture
Live, normal tracheal basal cells were isolated from RPM, RPR2 and RPMA mice (not treated with Ad–Cre) and grown as organoids, as described previously12,73,74. In brief, the mice were euthanized, and three to four tracheas per genotype were isolated in cold DMEM/F12-Advanced media (Thermo Fisher Scientific; 12-634-238 + 10% FBS, 1% l-glutamine and 1% penicillin–streptomycin). Tracheas were opened to expose the lumen using a razor blade and forceps. Each trachea was placed in a 1.5-ml Eppendorf tube in 500-μl dispase (50 U ml−1; Corning; 354235) diluted in HBSS-free medium (Thermo Fisher Scientific; 14175-095) to 16 U ml−1 and incubated at room temperature for 30 min. After incubation, tracheas were transferred to new Eppendorf tubes containing 500 μl of 0.5 mg ml−1 of DNAse (Thermo Fisher Scientific; NC9709009) diluted in HBSS-free medium and incubated for an extra 40 min at room temperature. Tracheas from each genotype were pooled in a 10-cm dish containing DMEM/F12-Advanced media, and forceps were used to gently pull apart the epithelial layers/sheets from the cartilage of each trachea. The media containing all tracheal epithelial sheets per genotype were transferred to a 15-ml conical tube and centrifuged at 4 °C, 2,000 rpm, for 5 min. The supernatant was removed. The remaining cell pellet was resuspended in 1 ml of TrypLE Express (Invitrogen; 12604013) and then incubated at 37 °C for 5 min. TrypLE was quenched through the addition of 10-ml DMEM/F12-Advanced media and then transferred through a 100-μm cell strainer into a 50-ml conical tube. Excess tissue was pushed through the cell strainer gently using a plunger from a syringe. Filtered cells were spun down at 2,000 rpm for 5 min at 4 °C, and the supernatant was removed. 
The remaining cell pellet was resuspended in 1 ml of fluorescence-activated cell sorting (FACS) buffer (20-ml PBS, 400-μl FBS and 80-μl 0.5 M EDTA), spun down and then stained in 100 μl of FACS buffer containing 1-μg anti-rat ITGA6/CD49 (eBioscience; 14-0495-85) primary antibody for 30–60 min on ice. The samples were washed three times in FACS buffer and then resuspended in 100 μl of FACS buffer with 1 μg of secondary antibody goat anti-rat APC (BioLegend; 405407) and incubated for 30 min in the dark on ice. The samples were washed three times with FACS buffer and then stained for 15 min with 1 μg ml−1 of DAPI. After three extra washes, the cells were subjected to FACS, and DAPI−ITGA6+ cells were isolated. The resulting basal cells were resuspended in 100% Matrigel (Corning/Fisher; CB-40234C or homemade), 20,000–100,000 cells per one 50-μl Matrigel dome, and then 50 μl of Matrigel was plated per well of a pre-warmed 24-well plate. After Matrigel solidified at 37 °C, 500 μl of organoid culture medium (OCM) consisting of 50% L-WRN conditioned medium75 (Sigma; SCM105 or homemade), 50% DMEM/F12-Advanced media (supplemented with 10% FBS, 1% l-glutamine, 1% penicillin–streptomycin, 10 ng ml−1 EGF (Thermo Fisher Scientific; PHG0311), 10 ng ml−1 FGF (Thermo Fisher Scientific; PHG0369) and 10 μM Y-27632 Rho kinase/ROCK inhibitor (MedchemExpress; HY-10071)) was added per well. OCM was changed every 2–3 days, and organoids were split and expanded using TrypLE upon confluence.
Lentiviral transduction of organoids with CellTag Library V1
The CellTag V1 plasmid library was purchased from Addgene (plasmid no. 124591) and amplified according to the published protocol for this technology52. In brief, the plasmid library was transformed using Stellar Competent Cells at an efficiency of approximately 220 colony-forming units per unique CellTag in the V1 library. The library was isolated from Escherichia coli culture through the Plasmid Plus Mega Kit (QIAGEN; 12981) and assessed for complexity through high-throughput DNA sequencing with the Illumina MiSeq (75-cycle paired-end sequencing v.3). Generation of the CellTag whitelist from sequencing data resulted in 13,836 unique CellTags in the 90th percentile of detection frequency. High-titre lentivirus was generated from the CellTag V1 library following published protocols26,76 and titred on the basis of GFP fluorescence with 293T cells (ATCC; CRL11-268).
To generate ‘CellTagged pre-Cre’ organoids and allografts, normal (no Cre-mediated recombination) RPM, RPMA and RPR2 basal organoids were expanded for approximately 3.5 weeks post-isolation, and then approximately 1 × 10⁶ cells were transduced with the CellTag V1 lentiviral library. To generate ‘CellTagged post-Cre’ organoids and allografts, previously transformed RPM basal organoids (approximately 6–8 weeks before) were expanded and then subjected to CellTag V1 lentiviral transduction. For transduction, normal or transformed organoids were dissociated into single cells with TrypLE (Invitrogen; 12604013) for 30 min and subjected to mechanical dissociation every 10 min of TrypLE incubation. TrypLE was quenched, and cells were pelleted and resuspended in 500 μl of OCM plus 8 μg ml−1 of Polybrene (Santa Cruz Biotechnology; sc-134220) and 25 μl of CellTag V1 high-titre lentivirus and then plated in one well of a 24-well plate. The cells were spinoculated at 300g for 30 min at room temperature to increase the transduction efficiency, incubated immediately after spinoculation for 3–6 h at 37 °C and then pelleted and replated in 50 μl of Matrigel and 500 μl of fresh viral supernatant. The organoids were incubated for 24 h, and then viral medium was replaced with normal OCM for organoid expansion. GFP was visible in more than 50% of cells as soon as 24 h after viral transduction.
Basal organoid Cre administration
CellTagged pre-Cre normal basal organoids (RPM, RPMA and RPR2) were expanded for approximately 8–10 weeks after CellTagging to allow clonal expansion and then were subjected to Cre-mediated transformation. CellTagged post-Cre normal basal organoids (RPM) were subjected to Cre-mediated transformation approximately 3–6 weeks after generation before CellTagging occurred. Because TAT–Cre treatment resulted in unreliable levels of recombination (Extended Data Figs. 3b, 4g and 5a), we used high-titre adenoviral CMV–Cre (University of Iowa; VVC-U of Iowa-5) to recombine all genotypes, including those for CellTagging experiments. For all samples, successful recombination with Ad–CMV–Cre occurred by (1) dissociating organoids into single cells (approximately 500,000–1 million cells) using TrypLE (Invitrogen; 12604013) for 30 min at 37 °C with mechanical dissociation every 10 min; (2) spinoculating (300g; room temperature; 30 min) organoids in 2.5–5 × 10⁷ pfu CMV–Cre in 500-μl OCM + 10 μg ml−1 of Polybrene in a 24-well plate; (3) incubating 4–6 h at 37 °C; and (4) seeding in Matrigel and propagating as normal organoid cultures, as described above. Full recombination of all alleles for each genotype was confirmed using recombination PCR approximately 4 weeks after Cre treatment.
PCR validation of recombination efficiency
The QIAGEN DNeasy kit (69506) was used to isolate genomic DNA from basal-derived organoids after exposure to Cre. Fully recombined tumour-derived cell lines from each genotype were used for positive recombination controls. DNA concentrations were measured on a BioTek Synergy HT plate reader. Equal quantities of tumour genomic DNA (100 ng) were amplified by PCR with GoTaq (Promega; M7123) using primers to detect Rb1 recombination: D1 5′-GCAGGAGGCAAAAATCCACATAAC-3′, 1lox5′ 5′-CTCTAGATCCTCTCATTCTTCCC-3′ and 3′ lox 5′-CCTTGACCATAGCCCAGCAC-3′. The PCR conditions were 94 °C for 3 min, 30 cycles of (94 °C for 30 s, 55 °C for 1 min and 72 °C for 1.5 min), 72 °C for 5 min and held at 4 °C. The expected band sizes were approximately 500 bp for the recombined Rb1 allele and 310 bp for the unrecombined/floxed allele. Primers to detect Trp53 recombination included the following: A 5′-CACAAAAACAGGTTAAACCCAG-3′, B 5′-AGCACATAGGAGGCAGAGAC-3′ and D 5′-GAAGACAGAAAAGGGGAGGG-3′. The PCR conditions were 94 °C for 2 min, 30 cycles of (94 °C for 30 s, 58 °C for 30 s and 72 °C for 50 s), 72 °C for 5 min and held at 4 °C. The expected band sizes were 612 bp for the Trp53 recombined allele and 370 bp for the unrecombined/floxed allele. Primers to detect MycT58A recombination included the following: CAG-F2 5′-CTGGTTATTGTGCTGTCTCATCAT-3′ and MycT-R 5′-GCAGCTCGAATTTCTTCCAGA-3′. The PCR conditions used were 94 °C for 2 min, 35 cycles of (95 °C for 30 s, 60 °C for 30 s and 72 °C for 1.5 min), 72 °C for 7 min and held at 4 °C. The expected band sizes were approximately 350 bp for the recombined allele and approximately 1,239 bp for the unrecombined/floxed allele. Primers to detect Ascl1 recombination included the following: Sense Ascl1 5′UTR: 5′-AACTTTCCTCCGGGGCTCGTTTC-3′ (for Cre recombined fwd), VR2: 5′-TAGACGTTGTGGCTGTTGTAGT-3′ (for Cre recombined rev), MF1 5′-CTCTGTCCAAACGCAAAGTGG-3′ (for floxed fwd) and VR2 5′-TAGACGTTGTGGCTGTTGTAGT-3′ (for floxed rev). 
The PCR conditions were 94 °C for 5 min, 30 cycles of (94 °C for 1 min, 64 °C for 1.5 min and 72 °C for 1 min), 72 °C for 10 min and held at 4 °C. The expected band sizes were approximately 700–850 bp for the Ascl1 recombined allele and approximately 857 bp for the unrecombined/floxed allele. Recombination PCR to detect Rbl2 (also known as p130) recombined (approximately 350 bp) and floxed (more than 1,500 bp) alleles was performed under conditions and with primers as previously described62. The PCR products were run on 1.2% agarose/Tris–acetate–EDTA (TAE) gels containing ethidium bromide or SYBR Safe, and images were acquired using a Bio-Rad Gel Doc XR imaging system.
Generation of basal-organoid derived allografts
After validating the recombination in basal organoids, fully recombined RPM, RPR2 and RPMA basal organoids were implanted as whole or partially digested organoids into flanks of SCID/beige mice that were between 6 and 12 weeks old (Taconic/Charles River Laboratories). Subcutaneous implants of approximately 0.5–3 × 10⁶ cells per flank in 50 μl of 50:50 Matrigel:OCM mix were performed. After implantation, basal organoid allografts were measured once to thrice weekly with calipers and collected when tumours reached an average of 1 cm³ but no greater than 2 cm³ or upon ulceration, loss of more than 10% of the baseline animal body weight or interference with animal eating, drinking or moving, whichever was earlier, in accordance with Duke University’s Policy on Tumor Burden in Rodents and to ensure a humane end point. A tumour volume of 2 cm³ is the maximum, as allowed by our IACUC protocol, and permitted end points were not exceeded in any study. Tumour tissue was then subjected to FFPE and/or dissociation for scRNA-seq experiments and/or reimplantation.
CRISPR editing of organoids and validation
To generate Pten knockout RPM and RPMA organoids, basal organoids transformed with CMV–Cre were infected with either a non-targeting sgRNA (sgCtrl) or an sgRNA targeting Pten (sgPten: 5′-TCATCAAAGAGATCGTTAGC-3′), both cloned into the LCV2-puro backbone (Addgene plasmid no. 52961). In brief, a high-titre virus (approximately 1–5 × 10⁷ pfu) was produced using HEK-293T cells transfected with a three-plasmid system, including LCV2–sgCtrl or sgPten and lentiviral packaging plasmids pCMV delta R8.2 (Addgene plasmid no. 8455) and pCMV–VSVG (Addgene plasmid no. 8454). Viruses were collected at 48 h and 72 h after transfection, concentrated by means of ultracentrifugation (25,000 rpm for 1.45 h), resuspended in 1× sterile PBS and stored at −80 °C until use. Fully recombined RPM and RPMA organoids were subjected to spinoculation with the high-titre virus using methods described in ‘Lentiviral transduction of organoids with CellTag Library V1’. Successful editing of Pten was validated through T7 endonuclease genome mismatch assays using the following primers: Fwd: 5′-CTCTCGTCGTCTGTCTA-3′ and Rev: 5′-CGAACACTCCCTAGGTGAATAC-3′. In brief, an approximately 1,000-bp region containing the sgPten site was amplified using Q5 High-Fidelity DNA Polymerase (New England Biolabs; M0492). The PCR conditions used were 98 °C for 30 s, 35 cycles of (98 °C for 10 s, 65 °C for 20 s and 72 °C for 30 s), 72 °C for 2 min and held at 4 °C. The PCR product was purified using a PCR DNA Clean & Concentrator Kit (ZYMO; D4030), and 200 ng of the annealed PCR product was subjected to T7 Endonuclease I digestion for 15 min at 37 °C. The digestion was quenched with 0.25 M EDTA. The products (digested and undigested controls) were run on agarose/Tris–acetate–EDTA gels containing ethidium bromide or SYBR Safe, and images were acquired using a Bio-Rad Gel Doc XR imaging system.
Immunoblotting was performed to validate PTEN loss through downstream induction of phospho-AKT (Ser473) in RPM and RPMA basal organoids, using methods described above for human cell lines. The primary antibodies included pAKT (Ser473) (1:1,000; CST; 4060S), total AKT (1:1,000; CST; 9272) and HSP90 (1:1,000; CST; 4877) as loading control.
Single-cell transcriptomics
scRNA-seq sample information
Samples sequenced for this study included (1) n = 1 wild-type basal organoid sample (no CellTag data); (2) a series of basal organoids and resulting allograft tumours that were ‘CellTagged pre-CMV–Cre’ (to trace the lineage from a single normal basal cell of origin) (n = 1 RPM organoid sample, n = 1 RPM allograft tumour, n = 1 RPMA organoid sample, n = 1 RPMA allograft (pool of three tumours), n = 1 RPR2 organoid sample and n = 1 RPR2 allograft tumour); (3) a series of organoids and resulting allograft tumours that were ‘CellTagged post-CMV–Cre’ (to trace the lineage from a single transformed basal cell of origin) (n = 1 RPM organoid sample and n = 2 RPM allograft tumours); and (4) primary lung tumours from autochthonous GEMMs (n = 2 RPM Cgrp–Cre initiated tumours and n = 2 RPM K5–Cre initiated tumour samples, each of which was a pool of two distinct lung tumours from one mouse). RPR2 transformed organoids were not analysed in this study, but data were included in the associated Gene Expression Omnibus (GEO) deposition. N = 5 extra RPM Cgrp–Cre initiated primary tumours were used in the analyses in this study but were previously published (GEO: GSE149180 and GSE1555692)26,27.
RPM, RPMA and RPR2 organoid samples CellTagged pre-Cre were prepared for single-cell transcriptomics approximately 3 months after the initial CellTagging and approximately 1 month after Cre treatment (for transformed organoids). RPM organoids CellTagged post-Cre were prepared for single-cell transcriptomics approximately 6–8 weeks after transformation and approximately 2–3 weeks after CellTagging. All transformed organoid samples were prepared for scRNA-seq on the same day of implant to SCID/beige hosts (see Fig. 3a for experimental timeline).
Preparation of single-cell suspensions for scRNA-seq
Organoid samples were dissociated into single-cell suspensions using TrypLE Express for 30 min at 37 °C with mechanical disruption approximately every 10 min. Allografts and primary tumours used for scRNA-seq were isolated fresh from the lung or flank of mice and immediately subjected to digestion into single-cell suspensions and preparation for sequencing. Tumour tissue was mechanically dissociated into small clumps using scissors in 1 ml of an enzymatic digestion cocktail per sample and then incubated for 30 min at 37 °C. The digestion cocktail consisted of 4,200 μl of HBSS-free medium (Thermo Fisher Scientific; 14175), 600 μl of TrypLE Express (Invitrogen; 12604013), 600 μl of 10 mg ml−1 of collagenase type 4 (Worthington Biochemical; LS004186) prepared in HBSS-free medium (Thermo Fisher Scientific; 14175-095) and 600 μl of dispase (50 U ml−1; Corning; 354235) and sterilized using a 0.22-μm syringe filter. Enzymatic digestion was quenched on ice with 500 μl of quench medium containing 7.2 ml of Leibovitz’s L-15 medium (Thermo Fisher Scientific; 11415), 800 μl of FBS and 30 μl of 5 mg ml−1 of DNase (Thermo Fisher Scientific; NC9709009) in HBSS-free medium. Tissue suspension was passed through a 100-μm cell strainer. Cells were spun at 2,500g for 5 min at 4 °C. The supernatant was removed and replaced with 500 μl of ammonium chloride–potassium lysis buffer per sample to remove red blood cell contamination (3 min incubation at 37 °C; VWR; 10128-808). The reaction was quenched with 10 ml of cold DMEM/F12-Advanced media supplemented with 10% FBS, 1% l-glutamine and 1% penicillin–streptomycin. The cells were spun at 500g for 5 min at 4 °C and resuspended in cold, filtered medium or cold, filtered 0.04% BSA in PBS at a concentration of 1–2 × 10⁶ cells ml−1 and counted manually with a haemocytometer. Single-cell organoid or tumour cell suspensions were immediately subjected to multiplexing with 10x CellPlex or 10x Genomics library preparation, as described below.
RNA-seq library preparation
Multiplexed samples included transformed organoids CellTagged pre-Cre (RPM, RPMA and RPR2), one K5–Cre RPM sample containing one tumour more central in the lung and one tumour nearer the trachea and the two primary RPM Cgrp–Cre-initiated tumours. The samples were multiplexed before library preparation using 10x Genomics 3′ CellPlex Kit Set A (1000261) and Feature Barcodes (1000262) and following 10x Genomics Demonstrated Protocol (CG000391) following suggestions for dissociated tumour cells. After CellPlexing, the samples were loaded onto a Chromium X series controller (10x Genomics; 1000331), targeting 10,000 cells per sample. Samples not subjected to multiplexing were immediately loaded onto a 10x Chromium X controller targeting 10,000–20,000 cells per sample. Library preparation was performed following manufacturer’s protocols using the 10x Chromium Next GEM Single Cell 3′ Kit, v.3.1 (10x Genomics; PN-1000268). Completed libraries were sequenced on an Illumina NovaSeq 6000, Illumina NextSeq 1000 or a NovaSeq X Plus to target more than 30,000 reads per cell with the 10x Genomics-recommended paired-end sequencing mode for dual indexed samples. The individual sample details, including CellPlex oligo information, are provided as metadata with the GEO submission.
Demultiplexing and data alignment
scRNA-seq data were demultiplexed and processed into FASTQ files through the Cell Ranger v.7.2.0 pipeline (10x Genomics). The primary tumour samples were aligned to a custom mouse genome (GRCm38-mm10-2020-A), including eGFP, Cas9, Firefly luciferase (fLuc) and Venus, to detect recombined alleles in our various mouse models. RPM tumours in this publication express fLuc40 following recombination of the MycT58A-Ires-LucLSL/LSL allele, and RPMA tumours express fLuc and Venus following recombination of the MycT58A-Ires-LucLSL/LSL and Ascl1fl/fl alleles, respectively. Some RPM primary tumours were derived from RPM–Cas9–GFP mice and express eGFP and Cas9 in addition to fLuc after recombination. CellTagged basal allograft tumour samples were aligned to a custom mouse genome (GRCm38-mm10-2020-A) with fLuc and Venus to aid in detecting recombined tumour cells and include GFP.CDS and CellTag.UTR transcripts to allow detection of CellTags, according to the published workflow52,77 (https://github.com/morris-lab/CellTagR). Sequences used for custom genome builds are included in this publication’s GitHub repository. Count barcodes and unique molecular identifiers were generated using Cell Ranger count or Cell Ranger multi-pipelines for CellPlexed samples.
Initial quality control and normalization
Quality control and downstream analysis were performed in Python (v.3.8.8) using Scanpy (v.1.10.0), according to current expert recommendations for single-cell best practices78. Anndata objects were created from filtered feature matrices with sc.read_10x_mtx(), and quality metrics were calculated using sc.pp.calculate_qc_metrics(). Low-quality cells and potential doublets were initially excluded by selecting for cells with 15% or lower mitochondrial content (‘pct_counts_mito’), greater than 1,000–2,000 total counts (‘total_counts’; sample dependent) and greater than 500–2,000 but less than 7,000–9,000 genes detected (‘n_genes_by_counts’; sample dependent). Normalized counts were calculated with sc.pp.normalize_total() and a target sum of 10,000. Integrated anndata objects containing multiple scRNA-seq datasets were combined using adata.concatenate() with join=‘outer’.
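The filtering rule and the total-count normalization described above can be sketched in plain Python. This is an illustration of the logic only, not the Scanpy implementation. The default thresholds are one assumed choice from the sample-dependent ranges given in the text, and the gene names and counts are toy values.

```python
def passes_qc(cell, min_counts=1000, min_genes=500, max_genes=7000,
              max_pct_mito=15.0):
    """Return True if a cell meets all three quality thresholds.

    Thresholds are assumptions picked from the sample-dependent ranges
    in the text; they would be tuned per sample in practice.
    """
    return (
        cell["pct_counts_mito"] <= max_pct_mito
        and cell["total_counts"] > min_counts
        and min_genes < cell["n_genes_by_counts"] < max_genes
    )


def normalize_total(counts, target_sum=10_000):
    """Scale one cell's raw counts so that they sum to target_sum."""
    total = sum(counts.values())
    return {gene: c * target_sum / total for gene, c in counts.items()}


# Toy cells for illustration only (not data from the study).
good_cell = {"pct_counts_mito": 4.2, "total_counts": 5200, "n_genes_by_counts": 2100}
high_mito = {"pct_counts_mito": 32.0, "total_counts": 5200, "n_genes_by_counts": 2100}
likely_doublet = {"pct_counts_mito": 4.2, "total_counts": 60000, "n_genes_by_counts": 9800}

normalized = normalize_total({"GeneA": 30, "GeneB": 70})
```

Here `good_cell` passes, `high_mito` fails the mitochondrial cut-off and `likely_doublet` fails the upper bound on detected genes.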
Further quality control and clustering
Data were further processed using Scanpy (v.1.10.0), scvi-tools (v.0.17.4) and benchmarking standards to minimize batch effects while maintaining true biological variability, particularly across integrated objects79. First, highly variable genes were determined with sc.pp.highly_variable_genes() using 5,000 (for organoids and allograft tumours) or 10,000 (for primary tumours) top genes, flavour set to ‘seurat_v3’ and batch key set to the name of the specific scRNA-seq batch. Poisson gene selection was then calculated with scvi.data.poisson_gene_selection() with the same n_top_genes and batch_key used for highly variable gene selection. The probabilistic deep learning model was set up through scvi.model.SCVI.setup_anndata() to initialize the integration and clustering of datasets from anndata objects containing only the top genes identified by the highly variable Poisson gene selection. Continuous covariate keys were set as percentage of mitochondrial counts, and batch keys were set to distinguish samples that were prepared or sequenced at different times. The model was trained with default parameters, an early stopping patience of 20 and a maximum of 500 epochs using the model.train() function. The latent representation of the model was obtained with model.get_latent_representation() and added to the .obsm of each full anndata object (including all genes, not just the highly variable genes). Neighbours were then calculated with sc.pp.neighbors() with use_rep set to the .obsm category added from the latent representation. UMAP embedding was performed using sc.tl.umap() with min_dist=0.5. Finally, Leiden clusters were generated with sc.tl.leiden() with resolution set to 1.0 for initial steps. As required throughout the scvi pipeline, we used raw counts for all of the steps described above.
Extra rounds of quality control and data filtering were performed per dataset by assessing n_genes_by_counts, total counts and percentage of mitochondrial counts per cluster. In general, the model tends to cluster low-quality and doublet cells together; therefore, clusters with exceptionally high or low average n_genes_by_counts, total counts or mitochondrial content were labelled as low quality and considered for removal from the dataset. Removal was performed after assessing gene expression on the basis of known markers of tumour and normal lung cells and marker genes per cluster, as determined by sc.tl.rank_genes_groups(), to help ensure that biological cells that normally have higher or lower n_genes_by_counts, total_counts or percentage of mitochondrial content were not aberrantly filtered out. In addition, in tumour samples from autochthonous or allograft models, we removed non-tumour cells by assessing common immune and normal lung cell type gene expression. Each time a cluster was removed, we ran the scvi pipeline on the new anndata object iteratively through this quality control step until there were no longer any low-quality or non-tumour cell clusters in the anndata object. Additionally, each time clusters were removed from a larger anndata object, the pipeline was re-run for optimal clustering. Final Leiden clusters were determined with sc.tl.leiden() resolution set to 0.5–1.0 for all datasets (sample dependent, on the basis of heterogeneity in marker expression determined to be biologically relevant). The full source code used to reproduce all scRNA-seq analysis methods is on GitHub (https://github.com/TGOliver-lab/Ireland_Basal_SCLC_2025) and is publicly accessible upon publication.
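One way to shortlist clusters for the manual review described above is to compare per-cluster means of a quality metric. The sketch below is an assumption for illustration: the text does not state a numerical rule, and the median ± 3 MAD cut-off, the function name and the toy values are all hypothetical. Flagged clusters would still need marker-based inspection before removal.

```python
from statistics import mean, median


def flag_outlier_clusters(cells, metric, n_mads=3.0):
    """Return clusters whose mean metric is far from the median cluster mean.

    The MAD-based rule is an assumed stand-in for 'exceptionally high or
    low'; it only proposes candidates for manual review.
    """
    by_cluster = {}
    for cell in cells:
        by_cluster.setdefault(cell["cluster"], []).append(cell[metric])
    cluster_means = {c: mean(v) for c, v in by_cluster.items()}
    centre = median(cluster_means.values())
    # Median absolute deviation of the cluster means.
    mad = median(abs(m - centre) for m in cluster_means.values())
    return {c for c, m in cluster_means.items() if abs(m - centre) > n_mads * mad}


# Toy per-cell metadata for illustration only (not data from the study).
cells = [
    {"cluster": "0", "pct_counts_mito": 2.0},
    {"cluster": "0", "pct_counts_mito": 4.0},
    {"cluster": "1", "pct_counts_mito": 4.0},
    {"cluster": "2", "pct_counts_mito": 5.0},
    {"cluster": "3", "pct_counts_mito": 38.0},
    {"cluster": "3", "pct_counts_mito": 42.0},
]
flagged = flag_outlier_clusters(cells, "pct_counts_mito")
```

The same function can be applied to `n_genes_by_counts` or `total_counts` by changing the `metric` argument.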
Plot generation
UMAP plots showing clustering, sample information and/or expression of specific genes were generated with sc.pl.umap() with vmin = 0, vmax = ‘p99.5’ and the normalized count layer as input. Dot plots of normalized counts were generated with sc.pl.dotplot(), and, if clustered, the dendrogram was set to ‘True’. For violin plots, data were first converted from anndata objects to Seurat objects using the readH5AD() function in the zellkonverter package and the CreateSeuratObject() function in the SeuratObject package and then plotted in R using Seurat’s VlnPlot() function or the plotColData function in the scater package. Signature scores in UMAP were also generated in Seurat using the FeaturePlot() function.
Transcriptomic gene signatures and differential gene expression analysis
For signature score assessment and differential gene expression analysis, anndata objects were converted to Seurat objects in R, and normalized and log-transformed counts were used for visualization after the raw count data were subjected to Seurat’s NormalizeData() function. Cell cycle scores were assigned to Seurat objects using the CellCycleScoring() function and Seurat’s cc.genes gene lists, converted to mouse homologues.
MYC26, ASCL142 and NEUROD127 target gene signatures were previously generated and represent conserved transcriptional targets identified through ChIP–seq ± RNA-seq on mouse and human SCLC cell lines and/or tumours. To generate a POU2F3 target gene signature, published .bed files from POU2F3 ChIP on two human POU2F3+ SCLC cell lines (H526 and H1048) were downloaded (GEO: GSE247951)18. Peaks were called and annotated using ChIPseeker in R. Genes with peaks in the promoter region (less than 2 kb from gene TSS) were considered target genes. Conserved target genes between H1048 and H526 were identified, and only conserved target genes also enriched by log2FC > 0.5 in POU2F3-high versus POU2F3-low human SCLC tumours by published bulk RNA-seq23 were included in the final POU2F3 target gene score. The ATOH1 target gene score was derived from the binding and expression target analysis (BETA) of ChIP–seq data from SCLC patient-derived xenograft cells, as previously published45. The YAP1 activity score was derived from a published 22-gene YAP/TAZ target signature derived from RNA-seq and ChIP–seq data from 891 cancer cell lines and including genes exhibiting a twofold or greater decrease in expression following YAP/TAZ knockdown or upregulated in YAP1 overexpression/activation46. ASCL1, NEUROD1, ATOH1, POU2F3 and MYC target gene scores and YAP1 activity score were assigned to the metadata of converted Seurat objects using the AddModuleScore() function. All target gene and activity scores are included in Supplementary Table 2.
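The POU2F3 signature construction reduces to a set intersection followed by an expression filter. The following sketch shows that logic under stated assumptions: peak annotation (done with ChIPseeker in R in the text) is taken as already complete, and the function name, placeholder gene names and log2FC values are hypothetical.

```python
def build_target_signature(promoter_targets_by_line, log2fc, min_log2fc=0.5):
    """Keep genes bound at the promoter in every cell line and enriched in tumours.

    promoter_targets_by_line: dict of cell line -> iterable of genes with a
        promoter peak (assumed to be annotated upstream of this step).
    log2fc: dict of gene -> log2 fold change, marker-high versus marker-low tumours.
    """
    conserved = set.intersection(
        *(set(genes) for genes in promoter_targets_by_line.values())
    )
    # Genes missing from the expression table are treated as not enriched.
    return sorted(
        g for g in conserved if log2fc.get(g, float("-inf")) > min_log2fc
    )


# Placeholder inputs for illustration only (not data from the study).
targets = {
    "H526": ["GeneA", "GeneB", "GeneC", "GeneD"],
    "H1048": ["GeneA", "GeneB", "GeneC", "GeneE"],
}
fold_changes = {"GeneA": 3.1, "GeneB": 1.2, "GeneC": 0.3, "GeneE": 2.0}
signature = build_target_signature(targets, fold_changes)
```

In this toy case GeneC is conserved but not sufficiently enriched, and GeneE is enriched but bound in only one line, so neither enters the signature.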
Normal NE, tuft and basal cell scores were previously published as consensus gene lists derived from mouse scRNA-seq data on normal lung cell types10, but proliferation genes, which included ‘Rpl’ and ‘Rps’ genes, were removed to eliminate cell cycle differences and focus on fate-specific markers. The published ionocyte consensus signature was limited to only 19 genes10; therefore, we applied an expanded ionocyte signature derived from genes >0.5 log2FC enriched in droplet-based scRNA-seq data on mouse trachea/lungs from the same study10 (63 genes in total). In addition, we used a human ionocyte signature derived from the top 100 human ionocyte markers established in the same study from scRNA-seq on human bronchial epithelium10. Normal basal hillock and luminal hillock signatures were derived from mouse scRNA-seq data43, and methods to generate these gene signatures are previously described44. The normal cell type signatures are listed in Supplementary Table 2. Gene signatures to assess basal cell heterogeneity in organoid samples (Extended Data Fig. 7) represent the top 100 genes in published enrichment signatures of cell types identified in scRNA-seq data of human tracheal epithelium41 (listed in Supplementary Table 6).
SCLC archetype signatures (A, A2, N and P) are previously published29,80,81 and were derived from archetype assignments of human SCLC cell lines. The archetype signatures per subtype are listed in Supplementary Table 2. Gene sets for human SCLC subtypes A, N and P (Extended Data Fig. 2m) were obtained from scRNA-seq data on human tumours and have been previously published82. Inflamed SCLC tumour signatures (Fig. 5g) were derived from published non-negative matrix factorization (NMF) studies on bulk RNA-seq data from n = 81 human SCLC tumours (Gay et al.8 and George et al.39) or mRNA, protein and phosphorylation data from n = 107 human SCLC tumours (Liu et al.23), where distinct inflammatory subsets were identified (annotated as NMF3 in both studies). We generated the Gay et al. inflammatory signature from the publication’s NMF-derived gene signature (n = 1,300 total genes)8 by taking genes greater than 1 log2FC enriched in SCLC-I versus other SCLC subtypes, resulting in a signature with n = 379 human genes converted to mouse homologues (Supplementary Table 7). The Liu et al.23 inflammatory signature comprised the top 100 genes enriched in NMF3 versus other NMF groups (greater than 1 log2FC)23, converted to mouse homologues (Supplementary Table 7).
Human SCLC subtype signatures were generated from the real-world Caris dataset and three extra bulk RNA-seq datasets of human SCLC tumours (Liu et al.23, George et al.39 and Lissa et al.61). All tumours were initially subtyped as hSCLC-A, hSCLC-N, hSCLC-Y, hSCLC-P, hSCLC-mixed or Lin− according to the methods described below in Gene expression profiling and SCLC subtype classification. Caris subtype signatures represent the top 100 enriched genes per human subtype compared with all other samples. Liu et al.23, George et al.39 and Lissa et al.61 subtype signatures represent genes showing a log2FC of 2 or greater in each subtype compared with samples from other subtypes (further detailed in Supplementary Table 7). Seurat’s AddModuleScore() function was used to apply human SCLC subtype signatures (after converting human genes to mouse homologues) from the real-world tumour data and the Liu et al.23, George et al.39 and Lissa et al.61 datasets to mouse tumour scRNA-seq data. The resulting subtype scores in our mouse data were then compared for similarity by means of the Pearson correlation matrix in Fig. 5e. The subtyping results and all human SCLC subtype signatures are listed in Supplementary Table 7.
Finally, the NE score was determined on the basis of Spearman correlation with an established expression vector, in which approximately 41 NE and 87 non-NE human cell lines were used to identify a core 50-gene signature that comprised 25 NE genes and 25 non-NE genes that robustly predicted NE phenotype83. Seurat objects were converted to SingleCellExperiment84 objects with the as.SingleCellExperiment() function, and then NE score was added as metadata and visualized using the Scater plotColData() function. Other signatures were visualized similarly with Scanpy or using FeaturePlot() and/or VlnPlot() functions in Seurat.
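A correlation-based NE score of the kind described above can be sketched as a Spearman correlation between a cell's expression of the signature genes and a reference vector. This is one simplified reading of the approach and may differ from the published scoring procedure; the reference weights and expression values below are made-up toy numbers, and the six genes stand in for the 50-gene signature.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined when one vector is constant
    return cov / sqrt(var_x * var_y)


def ne_score(cell_expr, reference_vector):
    """Correlate a cell's expression with the reference over shared genes."""
    genes = [gene for gene in reference_vector if gene in cell_expr]
    return spearman(
        [cell_expr[gene] for gene in genes],
        [reference_vector[gene] for gene in genes],
    )


# Hypothetical reference weights: positive = NE genes, negative = non-NE genes.
reference = {"Chga": 1.0, "Ascl1": 0.9, "Syp": 0.8, "Myc": -0.5, "Rest": -0.7, "Yap1": -0.9}
ne_cell = {"Chga": 9.0, "Ascl1": 8.0, "Syp": 7.0, "Myc": 2.0, "Rest": 1.0, "Yap1": 0.0}
non_ne_cell = {"Chga": 0.0, "Ascl1": 1.0, "Syp": 2.0, "Myc": 7.0, "Rest": 8.0, "Yap1": 9.0}

ne_high = ne_score(ne_cell, reference)
ne_low = ne_score(non_ne_cell, reference)
```

Scores near 1 indicate an NE-like expression pattern and scores near −1 indicate a non-NE pattern.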
Marker genes of Leiden clusters in UMAPs throughout the study were calculated using the sc.tl.rank_genes_groups() function in Scanpy on normalized count data with Wilcoxon rank-sum test and the number of genes set to 500.
CellTag analysis and clone calling
CellTags were identified in the RPM and RPMA basal organoid and allograft scRNA-seq samples using processed binary alignment/map (BAM) files (from Cell Ranger count output; see above methods) and following the CellTagR pipeline documentation (https://github.com/morris-lab/CellTagR). In short, BAM files were filtered to exclude unmapped reads and include reads that align to the GFP.CDS transgene or CellTag.UTR. CellTag objects were created in R, and CellTags were extracted from the filtered BAM files to generate the matrices of cell barcodes, 10x unique molecular identifiers and CellTags. The matrix was further filtered to include only barcodes identified as cells by the Cell Ranger pipeline and then subjected to error correction through Starcode. CellTags not detected in our whitelist with representation above the 75th percentile (generated from assessment of our lentiviral library complexity, as described above) were also removed. Clones were assigned as cells expressing more than two but fewer than 20 CellTags with similar combinations of CellTags (Jaccard similarity greater than 0.8). For scRNA-seq analysis and CellTag visualization, we used Scanpy (v.1.10.0) for initial quality control and scvi-tools (v.0.17.4) for integration and clustering in Python, following the methods described above. The resulting anndata objects were converted to Seurat objects in R for further CellTag analyses. CellTag-based clonal information was added as metadata by 10x-assigned barcode to visualize clone distribution and cell identity per clone using standard visualization functions in R. The final clonal analyses only considered clones with more than five cells to ensure sufficient sampling of each clone. CellTag metadata for RPM and RPMA organoid and allograft tumour samples CellTagged pre-Cre and RPM organoid and allograft samples CellTagged post-Cre are included in Supplementary Table 4.
CellTagging was also performed on RPR2 basal organoids and allografts, but clonal representation was limited in the allograft tumour because of the long latency of this model (approximately 6 months) and strong bottlenecks; thus, no CellTag information on RPR2 samples is included in this study.
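The clone-assignment rule described above (cells carrying more than two but fewer than 20 CellTags, grouped when their CellTag combinations have a Jaccard similarity above 0.8) can be sketched in a few lines. This is an illustrative approximation rather than the CellTagR implementation: grouping cells as connected components of a similarity graph is an assumption of this sketch, and the cell and tag names are hypothetical.

```python
def jaccard(tags_a, tags_b):
    """Jaccard similarity between two sets of CellTags."""
    union = tags_a | tags_b
    return len(tags_a & tags_b) / len(union) if union else 0.0


def call_clones(celltags_by_cell, min_tags=2, max_tags=20, threshold=0.8, min_clone_size=2):
    """Group cells whose CellTag sets are similar (Jaccard > threshold)."""
    # Keep cells with more than min_tags and fewer than max_tags CellTags.
    eligible = {
        cell: frozenset(tags)
        for cell, tags in celltags_by_cell.items()
        if min_tags < len(tags) < max_tags
    }
    cells = sorted(eligible)
    parent = {cell: cell for cell in cells}

    def find(cell):
        # Union-find lookup with path compression.
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]
            cell = parent[cell]
        return cell

    # Link every pair of cells that passes the similarity cutoff.
    for index, cell_a in enumerate(cells):
        for cell_b in cells[index + 1:]:
            if jaccard(eligible[cell_a], eligible[cell_b]) > threshold:
                parent[find(cell_a)] = find(cell_b)

    groups = {}
    for cell in cells:
        groups.setdefault(find(cell), []).append(cell)
    return sorted(sorted(members) for members in groups.values() if len(members) >= min_clone_size)


# Toy input with hypothetical cell barcodes and CellTag identifiers.
toy_cells = {
    "cell1": {"t1", "t2", "t3"},
    "cell2": {"t1", "t2", "t3"},
    "cell3": {"t1", "t2", "t3", "t4", "t5", "t6"},  # Jaccard 0.5 with cell1
    "cell4": {"t7", "t8", "t9"},
    "cell5": {"t7", "t8", "t9"},
    "cell6": {"t1", "t2"},  # only two CellTags, so excluded
}
clones = call_clones(toy_cells)
```

Setting min_clone_size=6 would correspond to restricting the analysis to clones with more than five cells, as in the final clonal analyses.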
ForceAtlas2 mapping, diffusion pseudotime and CellRank analyses
To visualize clonal trajectories, scRNA-seq data from RPM and RPMA basal-derived allografts were projected through force-directed graphing with ForceAtlas285 in Scanpy using the sc.pl.draw_graph() function with default settings. We chose to model the combined RPM and RPMA allograft cells on one trajectory because we observed that cells from each genotype occupied all Leiden clusters, although at variable frequencies, suggesting that the cells in each model have the potential to reach all transcriptional phenotypes. Diffusion pseudotime was calculated with sc.tl.dpt() with default settings (n_dcs=10) after setting root cells as cluster 17 basal-like cells (the phenotypic starting point of the experiment, that is, basal organoids) with adata.uns['iroot']. Next, diffusion pseudotime was used as input to perform CellRank2 analysis86, which computes a transition matrix of cellular dynamics and uses estimators to calculate subsequent fate probabilities, driver genes and gene expression trends. First, we set a PseudotimeKernel (pk) and then calculated a cell–cell transition matrix using pk.compute_transition_matrix(). Next, the Generalized Perron Cluster Cluster Analysis (GPCCA) estimator, a Markov chain-based estimator in CellRank87, was initialized with g=GPCCA(pk). To assign macrostates, the estimator was fit using g.fit() with n_states set to 9 (determined after assessment of the top 20 eigenvalues) and cluster_key set to assigned Leiden clusters. Setting n_states to 9 allowed for all Leiden clusters with distinct SCLC fate phenotypes to be picked up as macrostates by the estimator. Cluster 17, the basal-like cluster, was assigned to the estimator as an initial state using g.set_initial_states(), consistent with the assigned pseudotime root state. Terminal states were predicted by the estimator on the basis of the highest stability values using default settings with the g.predict_terminal_states() function and allow_overlap set to True.
The predicted terminal states are shown in Extended Data Fig. 6e. Next, g.compute_fate_probabilities() was used to assign probability values to all cells on the basis of how probable each cell is to reach each terminal state. Fate probabilities per cell, annotated by Leiden cluster or SCLC fate, were visualized using cr.pl.circular_projection(). The resulting graphs were manually annotated with the assigned cell fate groupings for Fig. 3i. The predicted lineage drivers of each terminal state were computed with the g.compute_lineage_drivers() function that predicts driver genes by correlating gene expression with fate probability values. The predicted CellRank lineage drivers for each terminal state are available in Supplementary Table 5. Finally, we focused on individual trajectories of interest to visualize temporal expression patterns of predicted driver genes along pseudotime using the generalized additive models (GAMs)88. To initialize a model for GAM fitting, the cr.models.GAM() function was run with max_iter set to 6,000, spline_order set to 2 and n_knots set to 10. The top 50–100 predicted driver genes were fit to the GAM model per lineage. Gene expression changes over pseudotime were visualized using the cr.pl.heatmap() function and sorted according to peak expression in pseudotime (Fig. 3j and Extended Data Fig. 6f). The key parameters for the cr.pl.heatmap() function included data_key set to normalized counts, show_fate_probabilities set to True, time_key set to diffusion pseudotime and show_all_genes set to True. Source code to reproduce these analyses is deposited at GitHub (https://github.com/TGOliver-lab/Ireland_Basal_SCLC_2025).
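To make the notion of a fate probability concrete, the toy example below treats terminal states as absorbing states of a small Markov chain and computes, for each state, the probability of eventually reaching each terminal state. This is a conceptual sketch only: it is not CellRank's implementation, the four-state transition matrix is made up, and the state labels in the comments are hypothetical.

```python
def fate_probabilities(transition, terminal_states, n_iter=1000):
    """Absorption probabilities into each terminal state, by fixed-point iteration."""
    n_states = len(transition)
    terminals = list(terminal_states)
    # Terminal states reach themselves with probability 1; others start at 0.
    probs = [[1.0 if state == t else 0.0 for t in terminals] for state in range(n_states)]
    for _ in range(n_iter):
        updated = []
        for state in range(n_states):
            if state in terminals:
                updated.append(probs[state])
            else:
                # One-step look-ahead: average the neighbours' fate probabilities.
                updated.append([
                    sum(transition[state][nxt] * probs[nxt][k] for nxt in range(n_states))
                    for k in range(len(terminals))
                ])
        probs = updated
    return probs


# Toy row-stochastic transition matrix (made-up values).
# State 0: basal-like start, 1: intermediate, 2: NE-like terminal, 3: tuft-like terminal.
toy_transition = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.3, 0.2],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
fates = fate_probabilities(toy_transition, terminal_states=[2, 3])
```

In this toy chain, the starting state reaches the NE-like terminal state with probability 0.6 and the tuft-like terminal state with probability 0.4. Correlating gene expression with such per-cell probabilities is the idea behind the lineage-driver analysis described above.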
Human SCLC data from Caris Life Sciences
Whole transcriptome sequencing sample preparation and data alignment
Whole transcriptome sequencing uses a hybrid capture method to pull down the full transcriptome from FFPE tumour samples using the SureSelect Human All Exon V7 bait panel (Agilent Technologies) and the NovaSeq platform (Illumina). The FFPE specimens underwent pathology review to discern the percentage of tumour content and tumour size; a minimum of 10% tumour content in the area selected for microdissection was required to enable enrichment and extraction of tumour-specific RNA. The QIAGEN RNA FFPE tissue extraction kit was used for extraction, and the RNA quality and quantity were determined using the Agilent TapeStation. Biotinylated RNA baits were hybridized to the synthesized and purified complementary DNA targets, and the bait–target complexes were amplified in a post-capture PCR. The resultant libraries were quantified and normalized, and the pooled libraries were denatured, diluted and sequenced. Raw data were demultiplexed using the Illumina DRAGEN FFPE accelerator. FASTQ files were aligned with STAR aligner (release 2.7.4a at GitHub). Expression data were produced using Salmon, which provides fast and bias-aware quantification of transcript expression89. BAM files from the STAR aligner were further processed for RNA variants using a proprietary custom detection pipeline. The reference genome used was GRCh37/hg19, and analytical validation of this test demonstrated 97% or higher positive percent agreement, 99% or higher negative percent agreement and 99% or higher overall percent agreement with a validated comparator method.
Gene expression profiling and SCLC subtype classification
For stratification of patient samples into subgroups, RNA expression values for established and putative lineage-defining transcription factors ASCL1 (A), NEUROD1 (N), POU2F3 (P) and YAP1 (Y) were standardized to Z scores. Samples with a single positive Z score among A/N/P/Y were assigned to the respective gene-associated subgroups, samples with multiple positive Z scores were classified as ‘mixed’ and samples with negative Z scores for all four genes were classified as ‘lineage-negative’ (Lin−). This method was also used to ‘subtype’ tumours from the George et al.39, Liu et al.23 and Lissa et al.61 datasets (results in Supplementary Table 7).
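The Z-score-based stratification described above can be sketched as follows. The function name and the toy expression values are assumptions made for illustration; treating a Z score of exactly zero as not positive is also an assumption, because the text does not specify that edge case. The label 'Lin-' uses an ASCII hyphen in code.

```python
from statistics import mean, stdev

# Lineage-defining transcription factors and their subgroup labels.
LINEAGE_GENES = {
    "ASCL1": "hSCLC-A",
    "NEUROD1": "hSCLC-N",
    "POU2F3": "hSCLC-P",
    "YAP1": "hSCLC-Y",
}


def classify_samples(expr_by_sample):
    """Assign each sample a subgroup from per-gene Z scores across the cohort."""
    samples = sorted(expr_by_sample)
    z_scores = {sample: {} for sample in samples}
    for gene in LINEAGE_GENES:
        values = [expr_by_sample[sample][gene] for sample in samples]
        centre, spread = mean(values), stdev(values)
        for sample, value in zip(samples, values):
            # A gene with no variation across the cohort gets Z = 0 everywhere.
            z_scores[sample][gene] = (value - centre) / spread if spread > 0 else 0.0
    labels = {}
    for sample in samples:
        positive = [gene for gene in LINEAGE_GENES if z_scores[sample][gene] > 0]
        if len(positive) == 1:
            labels[sample] = LINEAGE_GENES[positive[0]]
        elif len(positive) > 1:
            labels[sample] = "hSCLC-mixed"
        else:
            labels[sample] = "Lin-"
    return labels


# Toy cohort (made-up expression values in arbitrary units).
toy_expression = {
    "s1": {"ASCL1": 10, "NEUROD1": 1, "POU2F3": 1, "YAP1": 1},
    "s2": {"ASCL1": 1, "NEUROD1": 10, "POU2F3": 1, "YAP1": 1},
    "s3": {"ASCL1": 1, "NEUROD1": 1, "POU2F3": 10, "YAP1": 1},
    "s4": {"ASCL1": 1, "NEUROD1": 1, "POU2F3": 1, "YAP1": 10},
    "s5": {"ASCL1": 10, "NEUROD1": 10, "POU2F3": 1, "YAP1": 1},
    "s6": {"ASCL1": 1, "NEUROD1": 1, "POU2F3": 1, "YAP1": 1},
}
subtypes = classify_samples(toy_expression)
```

Because Z scores are computed relative to the cohort, the same tumour could receive a different label in a cohort with a different subtype composition.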
Gene set enrichment analyses
GSEA was performed using human homologues of normal tuft, basal, NE and ionocyte cell signatures, previously established from mouse and/or human scRNA-seq datasets10, as described above in the scRNA-seq-related methods and included in Supplementary Table 2. The input included rank-ordered gene lists for each subtype classification on the basis of log-transformed fold change expression (log-transformed fold change of ASCL1 subtype versus all other samples, NEUROD1 subtype versus all other samples, and so on).
Quantification and statistical analyses
All statistical analyses were performed in R or using GraphPad Prism (v.6). Detailed statistical methods, including those for immunostaining quantification, bulk and single-cell transcriptomics analyses, are described in relevant figure legends and Methods. In general, for data with normal distributions, pairwise comparisons were subjected to Student’s two-tailed unpaired t-tests, whereas multi-group comparisons were subjected to one-way ANOVA tests followed by post hoc Tukey’s or Fisher’s LSD multiple comparison tests. Non-parametric data, as determined by normality testing (Anderson–Darling, Shapiro–Wilk and Kolmogorov–Smirnov), were subjected to two-tailed Mann–Whitney or Wilcoxon rank-sum tests for pairwise comparisons or Kruskal–Wallis tests followed by post hoc Dunn’s multiple comparison tests (uncorrected or with Bonferroni correction) for comparisons of more than two groups. For Kaplan–Meier survival analysis, P values were calculated using log-rank (Mantel–Cox) testing. P values < 0.05 were considered statistically significant for all tests, unless otherwise specified. No statistical methods were used to predetermine sample sizes. All violin plots with box–whisker overlays show median and quartiles. Error bars for all data shown represent mean ± s.d. or s.e.m., as indicated in figure legends. Manual immunostaining and quantification, allograft measurements and quantification of histopathologies were performed blinded to treatment status and/or genotype to prevent bias from skewing the results. Blinding was not reported for other experiments given that no subjective measurements were considered to be involved.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
All scRNA-seq data have been deposited at the GEO (GSE279200) and are available at Zenodo (https://doi.org/10.5281/zenodo.15857303)90. Source data are provided with this paper.
Code availability
The original code used to reproduce the analyses in this study has been deposited on GitHub (https://github.com/TGOliver-lab/Ireland_Basal_SCLC_2025) and is available at Zenodo (https://doi.org/10.5281/zenodo.15865663)91. Further analyses comparing SCLC-Y cells from an NE origin26 with those from a basal origin are available on GitHub.
References
Koh, J. et al. Molecular classification of extrapulmonary neuroendocrine carcinomas with emphasis on POU2F3-positive tuft cell carcinoma. Am. J. Surg. Pathol. 47, 183–193 (2023).
Chen, C. C. et al. Temporal evolution reveals bifurcated lineages in aggressive neuroendocrine small cell prostate cancer trans-differentiation. Cancer Cell 41, 2066–2082 (2023).
Finlay, J. B. et al. Olfactory neuroblastoma mimics molecular heterogeneity and lineage trajectories of small-cell lung cancer. Cancer Cell 42, 1086–1105 (2024).
Akbulut, D. et al. Differential NEUROD1, ASCL1, and POU2F3 expression defines molecular subsets of bladder small cell/neuroendocrine carcinoma with prognostic implications. Mod. Pathol. 37, 100557 (2024).
Huang, Y. H. et al. POU2F3 is a master regulator of a tuft cell-like variant of small cell lung cancer. Genes Dev. 32, 915–928 (2018).
Yamada, Y. et al. Pulmonary cancers across different histotypes share hybrid tuft cell/ionocyte-like molecular features and potentially druggable vulnerabilities. Cell Death Dis. 13, 979 (2022).
Yamada, Y. et al. Tuft cell-like carcinomas: novel cancer subsets present in multiple organs sharing a unique gene expression signature. Br. J. Cancer 127, 1876–1885 (2022).
Gay, C. M. et al. Patterns of transcription factor programs and immune pathway activation define four major subtypes of SCLC with distinct therapeutic vulnerabilities. Cancer Cell 39, 346–360 (2021).
Nabet, B. Y. et al. Immune heterogeneity in small-cell lung cancer and vulnerability to immune checkpoint blockade. Cancer Cell 42, 429–443 (2024).
Montoro, D. T. et al. A revised airway epithelial hierarchy includes CFTR-expressing ionocytes. Nature 560, 319–324 (2018).
He, P. et al. A human fetal lung cell atlas uncovers proximal-distal gradients of differentiation and key regulators of epithelial fates. Cell 185, 4841–4860 (2022).
Rock, J. R. et al. Basal cells as stem cells of the mouse trachea and human airway epithelium. Proc. Natl Acad. Sci. USA 106, 12771–12775 (2009).
Plasschaert, L. W. et al. A single-cell atlas of the airway epithelium reveals the CFTR-rich pulmonary ionocyte. Nature 560, 377–381 (2018).
Mehta, A. & Stanger, B. Z. Lineage plasticity: the new cancer hallmark on the block. Cancer Res. 84, 184–191 (2024).
Kawasaki, K., Rekhtman, N., Quintanal-Villalonga, Á. & Rudin, C. M. Neuroendocrine neoplasms of the lung and gastrointestinal system: convergent biology and a path to better therapies. Nat. Rev. Clin. Oncol. 20, 16–32 (2023).
Oronsky, B., Ma, P. C., Morgensztern, D. & Carter, C. A. Nothing but NET: a review of neuroendocrine tumors and carcinomas. Neoplasia 19, 991–1002 (2017).
Wu, X. S. et al. OCA-T1 and OCA-T2 are coactivators of POU2F3 in the tuft cell lineage. Nature 607, 169–175 (2022).
He, T. et al. Targeting the mSWI/SNF complex in POU2F-POU2AF transcription factor-driven malignancies. Cancer Cell 42, 1336–1351 (2024).
Duplaquet, L. et al. Mammalian SWI/SNF complex activity regulates POU2F3 and constitutes a targetable dependency in small cell lung cancer. Cancer Cell 42, 1352–1369 (2024).
Rudin, C. M. et al. Molecular subtypes of small cell lung cancer: a synthesis of human and mouse model data. Nat. Rev. Cancer 19, 289–297 (2019).
Poirier, J. T. et al. New approaches to SCLC therapy: from the laboratory to the clinic. J. Thorac. Oncol. 15, 520–540 (2020).
Sutherland, K. D., Ireland, A. S. & Oliver, T. G. Killing SCLC: insights into how to target a shapeshifting tumor. Genes Dev. 36, 241–258 (2022).
Liu, Q. et al. Proteogenomic characterization of small cell lung cancer identifies biological insights and subtype-specific therapeutic strategies. Cell 187, 184–203 (2024).
Owonikoko, T. K. et al. YAP1 expression in SCLC defines a distinct subtype with T-cell-inflamed phenotype. J. Thorac. Oncol. 16, 464–476 (2021).
Ng, J. et al. Molecular and pathologic characterization of YAP1-expressing small cell lung cancer cell lines leads to reclassification as SMARCA4-deficient malignancies. Clin. Cancer Res. 30, 1846–1858 (2024).
Ireland, A. S. et al. MYC drives temporal evolution of small cell lung cancer subtypes by reprogramming neuroendocrine fate. Cancer Cell 38, 60–78 (2020).
Olsen, R. R. et al. ASCL1 represses a SOX9+ neural crest stem-like state in small cell lung cancer. Genes Dev. 35, 847–869 (2021).
Gopal, P. et al. Multivalent state transitions shape the intratumoral composition of small cell lung carcinoma. Sci. Adv. 8, eabp8674 (2022).
Groves, S. M. et al. Archetype tasks link intratumoral heterogeneity to plasticity and cancer hallmarks in small cell lung cancer. Cell Syst. 13, 690–710 (2022).
Baine, M. K. et al. SCLC subtypes defined by ASCL1, NEUROD1, POU2F3, and YAP1: a comprehensive immunohistochemical and histopathologic characterization. J. Thorac. Oncol. 15, 1823–1835 (2020).
Qi, J., Zhang, J., Liu, N., Zhao, L. & Xu, B. Prognostic implications of molecular subtypes in primary small cell lung cancer and their correlation with cancer immunity. Front. Oncol. 12, 779276 (2022).
Sutherland, K. D. et al. Cell of origin of small cell lung cancer: inactivation of Trp53 and Rb1 in distinct cell types of adult mouse lung. Cancer Cell 19, 754–764 (2011).
Baine, M. K. et al. POU2F3 in SCLC: clinicopathologic and genomic analysis with a focus on its diagnostic utility in neuroendocrine-low SCLC. J. Thorac. Oncol. 17, 1109–1121 (2022).
Hogan, B. L. et al. Repair and regeneration of the respiratory system: complexity, plasticity, and mechanisms of lung stem cell function. Cell Stem Cell 15, 123–138 (2014).
Vaughan, A. E. et al. Lineage-negative progenitors mobilize to regenerate lung epithelium after major injury. Nature 517, 621–625 (2015).
Gardner, E. E. et al. Lineage-specific intolerance to oncogenic drivers restricts histological transformation. Science 383, eadj1415 (2024).
Lázaro, S. et al. Differential development of large-cell neuroendocrine or small-cell lung carcinoma upon inactivation of 4 tumor suppressor genes. Proc. Natl Acad. Sci. USA 116, 22300–22306 (2019).
Park, J. W. et al. Reprogramming normal human epithelial tissues to a common, lethal neuroendocrine cancer lineage. Science 362, 91–95 (2018).
George, J. et al. Comprehensive genomic profiles of small cell lung cancer. Nature 524, 47–53 (2015).
Mollaoglu, G. et al. MYC drives progression of small cell lung cancer to a variant neuroendocrine subtype with vulnerability to Aurora kinase inhibition. Cancer Cell 31, 270–285 (2017).
Goldfarbmuren, K. C. et al. Dissecting the cellular specificity of smoking effects and reconstructing lineages in the human airway epithelium. Nat. Commun. 11, 2485 (2020).
Borromeo, M. D. et al. ASCL1 and NEUROD1 reveal heterogeneity in pulmonary neuroendocrine tumors and regulate distinct genetic programs. Cell Rep. 16, 1259–1272 (2016).
Lin, B. et al. Airway hillocks are injury-resistant reservoirs of unique plastic stem cells. Nature 629, 869–877 (2024).
Izzo, L. T. et al. KLF4 promotes a KRT13+ hillock-like state in squamous lung cancer. Preprint at bioRxiv https://doi.org/10.1101/2025.03.10.641898 (2025).
Catozzi, A. et al. Functional characterization of the ATOH1 molecular subtype indicates a pro-metastatic role in small cell lung cancer. Cell Rep. 44, 115603 (2025).
Wang, Y. et al. Comprehensive molecular characterization of the Hippo signaling pathway in cancer. Cell Rep. 25, 1304–1317 (2018).
Rodarte, K. E. et al. Neuroendocrine differentiation in prostate cancer requires ASCL1. Cancer Res. 84, 3522–3537 (2024).
Romero, R. et al. The neuroendocrine transition in prostate cancer is dynamic and dependent on ASCL1. Nat. Cancer 5, 1641–1659 (2024).
Lim, J. S. et al. Intratumoural heterogeneity generated by Notch signalling promotes small-cell lung cancer. Nature 545, 360–364 (2017).
Wu, Q. et al. YAP drives fate conversion and chemoresistance of small cell lung cancer. Sci. Adv. 7, eabg1850 (2021).
Wagner, A. H. et al. Recurrent WNT pathway alterations are frequent in relapsed small cell lung cancer. Nat. Commun. 9, 3787 (2018).
Kong, W. et al. CellTagging: combinatorial indexing to simultaneously map lineage and identity at single-cell resolution. Nat. Protoc. 15, 750–772 (2020).
Quintanal-Villalonga, A. et al. Comprehensive molecular characterization of lung tumors implicates AKT and MYC signaling in adenocarcinoma to squamous cell transdifferentiation. J. Hematol. Oncol. 14, 170 (2021).
Jimbo, N. et al. The expression of YAP1 and other transcription factors contributes to lineage plasticity in combined small cell lung carcinoma. J. Pathol. Clin. Res. 10, e70001 (2024).
Shah, V. S. et al. Single cell profiling of human airway identifies tuft-ionocyte progenitor cells displaying cytokine-dependent differentiation bias in vitro. Nat. Commun. 16, 5180 (2025).
Ajay, A. et al. Assessment of targets of antibody drug conjugates in SCLC. NPJ Precis. Oncol. 9, 1 (2025).
Bairakdar, M. D. et al. Learning the cellular origins across cancers using single-cell chromatin landscapes. Nat. Commun. https://doi.org/10.1038/s41467-025-63957-3 (2025).
Polak, P. et al. Cell-of-origin chromatin organization shapes the mutational landscape of cancer. Nature 518, 360–364 (2015).
Watson, J. K. et al. Clonal dynamics reveal two distinct populations of basal cells in slow-turnover airway epithelium. Cell Rep. 12, 90–101 (2015).
Chan, J. M. et al. Lineage plasticity in prostate cancer depends on JAK/STAT inflammatory signaling. Science 377, 1180–1191 (2022).
Lissa, D. et al. Heterogeneity of neuroendocrine transcriptional states in metastatic small cell lung cancers and patient-derived models. Nat. Commun. 13, 2023 (2022).
Schaffer, B. E. et al. Loss of p130 accelerates tumor development in a mouse model for human small-cell lung carcinoma. Cancer Res. 70, 3877–3883 (2010).
Cui, M. et al. PTEN is a potent suppressor of small cell lung cancer. Mol. Cancer Res. 12, 654–659 (2014).
McFadden, D. G. et al. Genetic and clonal dissection of murine small cell lung carcinoma progression by genome sequencing. Cell 156, 1298–1311 (2014).
Caeser, R. et al. Genomic and transcriptomic analysis of a library of small cell lung cancer patient-derived xenografts. Nat. Commun. 13, 2144 (2022).
Hong, K. U., Reynolds, S. D., Giangreco, A., Hurley, C. M. & Stripp, B. R. Clara cell secretory protein-expressing cells of the airway neuroepithelial body microenvironment include a label-retaining subset and are critical for epithelial renewal after progenitor cell depletion. Am. J. Respir. Cell Mol. Biol. 24, 671–681 (2001).
Hsu, H.-S. et al. Repair of naphthalene-induced acute tracheal injury by basal cells depends on β-catenin. J. Thorac. Cardiovasc. Surg. 148, 322–332 (2014).
DuPage, M., Dooley, A. L. & Jacks, T. Conditional mouse lung cancer models using adenoviral or lentiviral delivery of Cre recombinase. Nat. Protoc. 4, 1064–1072 (2009).
Ferone, G. et al. SOX2 is the determining oncogenic switch in promoting lung squamous cell carcinoma from different cells of origin. Cancer Cell 30, 519–532 (2016).
Chalishazar, M. D. et al. MYC-driven small-cell lung cancer is metabolically distinct and vulnerable to arginine depletion. Clin. Cancer Res. 25, 5107–5121 (2019).
Flowers, J. L. et al. Use of monoclonal antiestrogen receptor antibody to evaluate estrogen receptor content in fine needle aspiration breast biopsies. Ann. Surg. 203, 250–254 (1986).
Oliver, T. G. et al. Caspase-2-mediated cleavage of Mdm2 creates a p53-induced positive feedback loop. Mol. Cell 43, 57–71 (2011).
Liu, Y. et al. Chromosome 3q26 gain is an early event driving coordinated overexpression of the PRKCI, SOX2, and ECT2 oncogenes in lung squamous cell carcinoma. Cell Rep. 30, 771–782 (2020).
Tata, P. R. et al. Dedifferentiation of committed epithelial cells into stem cells in vivo. Nature 503, 218–223 (2013).
Miyoshi, H. & Stappenbeck, T. S. In vitro expansion and genetic modification of gastrointestinal stem cells in spheroid culture. Nat. Protoc. 8, 2471–2482 (2013).
Mollaoglu, G. et al. The lineage-defining transcription factors SOX2 and NKX2-1 determine lung cancer cell fate and shape the tumor immune microenvironment. Immunity 49, 764–779 (2018).
Biddy, B. A. et al. Single-cell mapping of lineage and identity in direct reprogramming. Nature 564, 219–224 (2018).
Heumos, L. et al. Best practices for single-cell analysis across modalities. Nat. Rev. Genet. 24, 550–572 (2023).
Luecken, M. D. et al. Benchmarking atlas-level data integration in single-cell genomics. Nat. Methods 19, 41–50 (2022).
Wooten, D. J. et al. Systems-level network modeling of small cell lung cancer subtypes identifies master regulators and destabilizers. PLoS Comput. Biol. 15, e1007343 (2019).
Groves, S. M. et al. Involvement of epithelial-mesenchymal transition genes in small cell lung cancer phenotypic plasticity. Cancers 15, 1477 (2023).
Chan, J. M. et al. Signatures of plasticity, metastasis, and immunosuppression in an atlas of human small cell lung cancer. Cancer Cell 39, 1479–1496 (2021).
Zhang, W. et al. Small cell lung cancer tumors and preclinical models display heterogeneity of neuroendocrine phenotypes. Transl. Lung Cancer Res. 7, 32–49 (2018).
Amezquita, R. A. et al. Orchestrating single-cell analysis with Bioconductor. Nat. Methods 17, 137–145 (2020).
Jacomy, M., Venturini, T., Heymann, S. & Bastian, M. ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software. PLoS ONE 9, e98679 (2014).
Weiler, P., Lange, M., Klein, M., Pe’er, D. & Theis, F. CellRank 2: unified fate mapping in multiview single-cell data. Nat. Methods 21, 1196–1205 (2024).
Reuter, B., Fackeldey, K. & Weber, M. Generalized Markov modeling of nonreversible molecular kinetics. J. Chem. Phys. 150, 174103 (2019).
Servén, D., Brummitt, C. & Abedi, H. pyGAM: generalized additive models in Python. Zenodo https://doi.org/10.5281/zenodo.1476122 (2018).
Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods 14, 417–419 (2017).
Oliver, T., Ireland, A. & Tyson, D. Basal cell of origin resolves neuroendocrine–tuft lineage plasticity in cancer. Zenodo https://doi.org/10.5281/zenodo.15857303 (2025).
Ireland, A. & Tyson, D. TGOliver-lab/Ireland_Basal_SCLC_2025: Nature manuscript acceptance. Zenodo https://doi.org/10.5281/zenodo.15865663 (2025).
Acknowledgements
We thank the members of the Oliver lab for their technical assistance, mouse colony management and administrative support (C. Cheng). We appreciate MedGenome and Duke High Throughput Genomics Core for sequencing services, Duke Cancer Institute (DCI) Flow Cytometry core for cell sorting, Duke Human Vaccine Institute for use of the Agilent TapeStation, Preclinical Research Resource Core at Huntsman Cancer Institute and University of Utah for organoid culture reagents, and J. Johnson for Ascl1-conditional mice. We thank B. Goldstein for use of the 10x Chromium X for scRNA-seq library preparation and B. Goldstein, Z. Hartman, C.-L. Lee, A. Tata and P. Tata for their generous sharing of SCID/beige and Cre-reporter mice. We thank the BioRepository & Precision Pathology Center, a shared resource of the Duke University School of Medicine and DCI. The BioRepository & Precision Pathology Center receives support from the P30 Cancer Center Support Grant (P30 CA014236) and Cooperative Human Tissue Network (UM1CA239755). We thank the Light Microscopy Core Facility at Duke University for use of the Leica STELLARIS 8 confocal microscope, which is funded by the NIH Shared Instrumentation Grant (1S10OD034340-01A1). We acknowledge the funding sources, including the NIH National Cancer Institute for awards R01CA262134 (T.G.O.), U24CA213274 (T.G.O. and C.M.R.), U01CA231844 (T.G.O. and R.G.), F31 CA275295-01 (A.S.I.), R50 CA243783 (D.R.T.), ZIA BC 011793 (A.T.), 5K08CA259161 (J.M.C.) and R35 CA263816 (C.M.R.). We thank DCI for the pilot funds from the P30 Cancer Center Support Grant P30 CA014236 (T.G.O.).
Author information
Authors and Affiliations
Contributions
A.S.I. and T.G.O. conceptualized the study. A.S.I., D.A.X., S.B.H., M.W.B., L.Y.Z., S.L.-R., D.R.T., B.L.W., A.E. and T.G.O. conceived the methodology and performed the investigation. A.S.I. designed, carried out and analysed most of the experiments. D.A.X. and S.B.H. performed the tissue immunostaining. M.W.B., L.Y.Z. and B.E.H. performed the human cell line experiments. S.L.-R. and S.B.H. conducted animal infections and in vivo imaging. A.S.I., D.R.T., J.M.C. and A.E. performed the computational analyses of scRNA-seq. R.G., A.D., J.C.M., A.T., S.P., C.M.R., J.M.C. and A.E. acquired, prepared and provided critical human tissue and datasets. A.S.I., D.R.T., R.G., C.M.R. and T.G.O. acquired funding. T.G.O. administered the project and supervised this study. A.S.I. and T.G.O. wrote the paper with inputs from all authors.
Corresponding author
Ethics declarations
Competing interests
T.G.O. has a patent related to SCLC subtypes (US12188095-B2), a sponsored research agreement with Auron Therapeutics, has consulted for Nuage Therapeutics and Light Horse Therapeutics and served on the Scientific Advisory Board (SAB) for Lung Cancer Research Foundation and as a consulting editor for Cancer Research and Genes & Development. C.M.R. has consulted regarding oncology drug development with AbbVie, Amgen, AstraZeneca, Boehringer Ingelheim and Jazz, and receives licensing fees for DLL3-directed therapies. He serves on the SABs of Auron Therapeutics, DISCO, EARLI and Harpoon Therapeutics. A.D. serves on SABs for Jazz, AstraZeneca and Amgen. A.E. reports employment and stock options with Caris Life Sciences. J.M.C. has consulted for Sonata Therapeutics. A.T. received grants to the NCI from EMD Serono Research and Development, AstraZeneca, Gilead Sciences and ProLynx. J.C.M. has consulted for IQVIA, Genome Insights, Incyte, Novotech, Red Arrow Therapeutics, Pfizer, Vilya, Replimune and Iovance, and received honoraria from and advised for Caris Life Sciences. D.R.T. is on the SAB of Vrise Therapeutics. S.P. has consulted for Amgen, Bristol Myers Squibb, Daiichi Sankyo, Johnson & Johnson, Novocure, OncoHost and Takeda Pharmaceuticals. The other authors declare no competing interests. Diversity, equity, ethics, and inclusion: We included sex balance in the selection of human and non-human participants. One or more authors of this paper self-identify as people from sexual and gender minorities.
Peer review
Peer review information
Nature thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 POU2F3+ tumours exhibit intratumoural subtype heterogeneity indicative of plasticity. Related to Fig. 1.
a, Representative IHC staining on human SCLC biopsies for given markers with sample name indicated. One row = one tumour. Scale bar, 25 μm. b, Venn diagram depicting number of human SCLC biopsies (n = 119 total) staining positive for ASCL1, NEUROD1, POU2F3, or lacking all markers (“Subtype-Neg”). c, Representative co-immunofluorescence (co-IF) staining on human SCLC biopsies (n = 28 total) for DAPI (nuclei, blue), ASCL1 (yellow), NEUROD1 (purple) or POU2F3 (green). Individual channels (top) and an overlay without DAPI (bottom) are shown. Scale bars, 50 μm. Yellow arrows indicate colocalization of markers. d, Representative co-IF staining on patient-derived xenografts (n = 2 distinct models) for DAPI (nuclei, blue), ASCL1 (green), NEUROD1 (purple) or POU2F3 (red). Individual channels (left) and an overlay without DAPI (right) are shown. Yellow arrows and insets (a-b) emphasize co-expressing cells (bottom). Scale bars, 75 μm. e, Co-IF staining of tracheal and lung epithelium pre- (Day 0) and post-naphthalene (Day 3, 5, 10) for indicated markers. DAPI marks nuclei (blue), P63 and KRT5 (basal), CCSP (club), and KRT13 (hillock). K5-Cre was not administered to these mice, but the timepoint of typical K5-Cre administration in autochthonous GEMMs is indicated for ease of interpretation of cell types present at the biologically relevant timepoint for tumour experiments. Scale bars, 10 μm. f, Quantification of positive cells per mm of epithelium in the trachea or lung epithelium, with dashed line at Day 3 (timepoint K5-Cre is typically administered in autochthonous GEMMs), showing the frequency of cells present at the theoretical time of K5-Cre administration. Statistics performed on frequency values from n = 12–58 regions of tracheal epithelium or n = 13–50 regions of lung epithelium per timepoint. Error represents mean ± SEM.
Quantification of n = ~4–12 total mm lung or n = ~13–51 total mm tracheal epithelium per timepoint per stain from n = 1–2 mice at Day 0, n = 2 mice at Day 1.5, n = 2 mice at Day 2, n = 4 mice at Day 3, n = 2 mice at Day 5, and n = 2 mice at Day 10.
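As a minimal sketch of the kind of quantification described for panel f, assuming each imaged region contributes a positive-cell count and a measured epithelial length, the per-region frequency and its mean ± SEM could be computed as follows. The function names and the example values are hypothetical and are not taken from the study.

```python
import math
import statistics


def cells_per_mm(positive_cells, length_mm):
    """Frequency of marker-positive cells per mm of epithelium for one region."""
    if length_mm <= 0:
        raise ValueError("epithelium length must be positive")
    return positive_cells / length_mm


def mean_sem(values):
    """Return (mean, SEM), where SEM = sample SD / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two regions to estimate SEM")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


# Hypothetical regions: (positive cell count, epithelium length in mm)
regions = [(12, 0.8), (9, 0.6), (20, 1.0), (15, 1.0)]
freqs = [cells_per_mm(count, length) for count, length in regions]
mean, sem = mean_sem(freqs)
print(f"{mean:.2f} ± {sem:.2f} positive cells per mm")
```

Normalizing each count by the measured length before averaging keeps regions of different size comparable, which is presumably why the legend reports frequencies per mm rather than raw counts.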
Extended Data Fig. 2 Basal cells give rise to SCLC with expansive subtype heterogeneity. Related to Fig. 1 and Supplementary Tables 1 and 2.
a, Survival of RPM mice infected with indicated Ad-Cre viruses. Mouse numbers in figure. Mantel-Cox log-rank test vs. K5-Cre + naphthalene (purple); exact p-values in figure. b, IHC images for indicated proteins from in situ (3–5 weeks post-infection) or invasive (> 6 weeks post-infection) RPM K5-Cre tumours. Invasive tumours are most often negative for basal markers (middle row), but rare tumours have sporadic basal marker expression (bottom row). Representative of tumours from n = 10 mice. c, POU2F3 IHC H-scores by airway location and Ad-Cre virus. Each dot = 1 tumour. Kruskal-Wallis (KW) with post-hoc Dunn’s test (p-values in figure). n = 11 Cgrp, 43 Cmv, 101 K5-Cre tumours quantified (as in Fig. 1e), but split by airway location. Error represents mean ± SD. d, YAP1 IHC in RPM tumours by indicated Ad-Cre viruses (left) and H-score quantification (right). Median (red bar) and quartiles (dotted lines) indicated. KW test with post-hoc Dunn’s multiple comparisons (exact p-values in figure). Number of tumours quantified indicated on figure. e, Co-immunofluorescent (co-IF) staining for DAPI (nuclei, blue), ASCL1 (green), NEUROD1 (purple), or POU2F3 (red) in RPM K5-Cre tumours, representative of n = 5 mice. Tumour regions outlined with dashed white line. Yellow arrows indicate co-expressing cells. Scale bars, 75 μm. f, Quantification of co-expression by immunofluorescence staining for SCLC subtype markers ASCL1 (A), NEUROD1 (N), or POU2F3 (P) from n = 10 K5-Cre-initiated RPM tumours from n = 5 mice, where tumours have 1 or >1 subtype-defining transcription factor(s) (TF) detected (left). Student’s unpaired two-tailed t-test. Box and whiskers represent min, median, and max values. Percent of RPM tumour cells co-expressing the indicated subtype markers (right). One-way ANOVA with post-hoc Tukey’s. Exact p-values in figure. Each dot represents one field of a tumour. Error, mean with SEM. 
g, UMAP of scRNA-seq from RPM tumours: NE-derived (Cgrp-Cre, purple, n = 6 mice, 7 tumours) vs basal-derived (K5-Cre, orange, n = 2 mice, 4 tumours). Cells coloured by sample on right with number of cells analyzed indicated. h, UMAP (top) and violin plots (bottom) of selected gene expression in Cgrp- vs K5-Cre–derived tumour cells. Each dot is one cell. Two-sided Wilcoxon rank sum tests. Exact p-values indicated in figure. i, UMAP in (g) annotated by Leiden cluster (left) (Supplementary Table 1). Proportion of cells from Cgrp- vs K5-Cre tumours per Leiden cluster, as % of all cells per sample (right). j–m, Signature scores in tumour cells from (g): ChIP-seq targets (j), NE score (k), SCLC archetypes (l), or human SCLC subtype-specific signatures (m). Yellow/red diamond = mean. Two-sided Wilcoxon rank sum tests. Exact p-values indicated in figure. See Supplementary Table 2 for gene signatures. n, Stacked bar graph depicting percent of Leiden clusters as in (i) dominated by Cgrp- (purple) vs. K5-Cre (orange) initiated RPM tumour cells (left). Dot plot of top 10 differentially-expressed marker genes per Leiden cluster (Supplementary Table 1) (right). o, Violin plots of SCLC-A2 archetype or basal- or luminal-hillock signatures by Leiden cluster as in (i). Red diamond = mean. KW tests (p-value indicated on figure) and post-hoc Dunn’s with Bonferroni correction; ****p < 2e-16, other exact p-values in figure. Scale bars, 50 μm unless otherwise noted.
Extended Data Fig. 3 Basal-derived GEMM organoids maintain basal identity in vitro. Related to Figs. 1 and 2.
a, Representative flow cytometry data from tracheal basal cells analyzed with a live/dead marker, ITGA6, indicated basal markers (KRT5, NGFR), and/or hillock marker (KRT13). Gating strategy includes progressive gating on cell size, singlets, and live cells (top row), followed by analysis of ITGA6 on Y-axis and indicated control (no primary) or basal marker on X-axis. Percentage of total live cells indicated in each quadrant. Quantification in upper right indicates the percentage of ITGA6+ cells specifically that co-express another basal or hillock marker based on mean fluorescence of that marker relative to all live cells. b, Recombination PCR for RPM basal organoids for indicated alleles pre- and post-treatment with TAT-Cre or Ad-CMV-Cre at two concentrations (2.5e7 or 5e7 pfu). *Organoids subject to spinoculation with CMV-Cre virus. Red font = condition selected for subsequent allografting. For gel source data, see Supplementary Fig. 1. c, Co-immunofluorescence staining (Co-IF) of wildtype “WT” basal organoids (collected within ~2 weeks of CMV-Cre transformation) and CMV-Cre-transformed basal-derived organoids (RPM, RPR2, RPMA) for DAPI (nuclei, blue), KI67 (proliferation, orange), and ASCL1 (NE, green). Positive controls (bottom panel) are an RPR2 SCLC lung tumour. Quantification via CellProfiler of KI67 positivity per organoid per genotype (bottom). Number of organoids quantified is labeled. One-way ANOVA with post-hoc Tukey’s correction. Exact p-values in figure. Error is mean ± SD. d, Co-IF of WT basal organoids and CMV-Cre-transformed basal-derived organoids (RPM, RPR2, RPMA): Left panel) DAPI (nuclei, blue), NEUROD1 (neuronal, green), P63 (basal, yellow) and KRT8 (luminal basal, red). Positive control (+) for NEUROD1 is a murine olfactory neuroblastoma tumour and for basal markers is a murine squamous lung tumour. Right panel) DAPI (nuclei, blue), FOXJ1 (ciliated, purple), CCSP (SCGB1A1, club, green) and KRT8 (luminal basal, red). 
Positive control (+) for FOXJ1 and CCSP is airway from a normal mouse lung, and for KRT8 is a murine squamous lung tumour. e, Co-IF of WT basal organoids and CMV-Cre-transformed basal-derived organoids (RPM, RPR2, RPMA) for DAPI (nuclei, blue) and POU2F3 (tuft, green). Positive control (+) is a murine olfactory neuroblastoma tumour. f, Co-IF staining of WT basal organoids and RPM basal-derived organoids for DAPI (nuclei) or indicated basal markers. Positive control is murine squamous lung tumour (+). All co-IF scale bars, 150 μm. Staining results in (c-f) are representative of organoids collected from three or more independent experiments per genotype, all with similar results.
Extended Data Fig. 4 Basal-derived GEMM organoids give rise to neuroendocrine, neuronal, and tuft-like SCLC in vivo. Related to Fig. 1 and Supplementary Tables 1 and 2.
a, UMAP of scRNA-seq data from wildtype (orange, n = 1 sample) and transformed RPM basal organoids (purple, n = 2 independent samples) (left) and by Leiden cluster (right) (Supplementary Table 1), as in Fig. 1i. Number of cells analyzed in figure. b, Dot plot expression of indicated marker genes in wildtype (WT) versus transformed RPM organoids, grouped by Leiden cluster as in (a). c, UMAP of RPM organoids pre- (wildtype = “WT”) and post-CMV-Cre (“RPM”) by cell cycle phase. Proportion of cells in each phase (“frequency”), represented as % of cells per sample (right). d, UMAP as in Fig. 1i of RPM organoids pre- (“WT”) and post-CMV-Cre (“RPM”) and RPM basal allograft tumour (“Allo”) by cell cycle phase (left). Proportion of cells in each phase (“frequency”), represented as % of cells per sample (right). e, Dot plot of indicated marker genes in RPM basal allograft tumours by Leiden cluster (from Fig. 1j) (Supplementary Table 1). f, Violin plot with expression of SCLC archetype signatures where A = ASCL1, A2 = ASCL1-A2, N = NEUROD1, and P = POU2F3 (Supplementary Table 2) in RPM allograft tumour cells by Leiden cluster (from Fig. 1j). Kruskal-Wallis (KW) test (p-value in figure). g, Recombination PCR for RPR2 basal organoids for indicated alleles pre- (“None”) and post-treatment with TAT-Cre or Ad-CMV-Cre at two concentrations (2.5e7 or 5e7 pfu). *Organoids subject to spinoculation with CMV-Cre virus. Red font = condition for subsequent allografting. For gel source data, see Supplementary Fig. 1. h, Representative H&E and IHC of RPR2 basal-derived allograft tumours for indicated SCLC subtype markers (left) with H-score quantification compared to RPM basal allograft tumours (right). N = 6 RPR2 tumours per stain and n = 11 (ASCL1, POU2F3) and 8 (NEUROD1) RPM tumours quantified. Median (black line) and quartiles (dotted black lines) indicated. Scale bar, 50 μm. Welch’s two-tailed t-test. Exact p-values in figure. 
i, UMAP of scRNA-seq from basal-organoid-derived RPM (turquoise, n = 2) and RPR2 (maroon, n = 1 independent samples) allografts; number of cells analyzed per genotype indicated (left). UMAP and corresponding violin plots with expression of indicated SCLC subtype markers and Myc family oncogenes (right). Yellow diamond = mean expression. j, UMAP in (i) by Leiden cluster (Supplementary Table 1) (left). Proportion of cells from RPM vs RPR2 allograft tumours per Leiden cluster, as % of all cells per sample (right). k, UMAP of scRNA-seq data in (i) by NE score (left). Violin plot of NE score in RPM vs RPR2 basal allograft tumour cells, compared to Cluster 8 in (j) (right) with median and quartiles indicated. KW test with post-hoc Dunn’s (Bonferroni correction) (p-values in figure). l, UMAP of SCLC fate in RPM allograft tumours (from Fig. 1k) compared to location of RPR2 tumour cells (maroon) as in (i). m,n, Violin plots of SCLC archetype signatures (m) or ChIP target gene signatures (n) (Supplementary Table 2) in RPM vs RPR2 basal allograft samples from scRNA-seq in (i). Unless otherwise noted, statistics are two-sided Wilcoxon rank-sum tests; **** p < 2.2e-16 and other exact p-values in figure.
Extended Data Fig. 5 ASCL1 loss promotes POU2F3+ tuft-like SCLC. Related to Fig. 2 and Supplementary Table 3.
a, Recombination PCR for RPMA basal organoids for indicated alleles pre- (None) and post-treatment with TAT-Cre recombinase or Ad-CMV-Cre at two concentrations (2.5e7 or 5e7 pfu). *Organoids subject to spinoculation with CMV-Cre virus. Red font indicates condition used for subsequent allografting. For gel source data, see Supplementary Fig. 1. b, Quantification of tumour volume (mm^3) over time (weeks) in RPM (purple) vs RPMA (orange) basal allografts. Number of tumours quantified is indicated in legend. No suffix = passage 1 (solid line); p2/p3 = passage 2 or 3 (dashed line). Red “X” indicates animals censored due to early tumour ulceration. c, Bar charts indicating fraction of Ascl1-, Neurod1-, or Pou2f3-high cells (count expression >0.01) in RPM vs RPMA basal-organoid-derived allografts from scRNA-seq data in Fig. 2e. Red = positive (“High”); blue = negative (“Low”). d, Representative IHC images from RPM and RPMA basal-organoid-derived tumours for indicated markers (left). H-score quantification for indicated proteins (right) in RPM (SCLC only) and RPMA (SCLC- and NSCLC-dominant tumours, >50% of tumour region). Each dot represents one tumour. Only first-passage tumours included. For each marker, n = 11 RPM and 12 RPMA (n = 7 SCLC, n = 5 NSCLC) tumours quantified per genotype from n = 7 RPM and n = 9 RPMA mice. One-way ANOVA with post-hoc Fisher’s LSD pairwise comparisons (exact p-values in figure). Error bars = mean ± s.d. Scale bar, 50 μm. e, Expression of indicated genes in UMAP in RPM and RPMA basal allograft tumours (scRNA-seq data from Fig. 2e) (top), and corresponding violin plots by Leiden cluster (assigned in Fig. 2f) (bottom). Kruskal-Wallis (KW) test (p-value indicated in figure). f, Expression of indicated Neuroendocrine, Neuronal, Mesenchymal/Stem, Basal, SCLC-A2 archetype, Tuft, Ionocyte, Tuft-Ionocyte-Progenitor (TIP), Hillock basal, and SCLC-Atoh1 markers in UMAP of RPM and RPMA basal-derived allograft tumours (from scRNA-seq in Fig. 2e).
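The thresholding used for the bar charts in panel c can be illustrated with a short sketch, assuming a cell is called “High” for a gene when its expression value exceeds 0.01 and “Low” otherwise. The helper name, sample labels and per-cell values below are hypothetical placeholders, not data from the study.

```python
def fraction_high(expression, threshold=0.01):
    """Fraction of cells whose expression value exceeds the threshold."""
    if not expression:
        raise ValueError("no cells provided")
    return sum(1 for value in expression if value > threshold) / len(expression)


# Hypothetical per-cell expression values for one gene in two samples
samples = {
    "sample_a": [0.0, 0.0, 1.2, 0.5, 0.0, 2.1, 0.0, 0.0],
    "sample_b": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.3, 0.0],
}
for name, values in samples.items():
    high = fraction_high(values)
    print(f"{name}: {high:.1%} High, {1 - high:.1%} Low")
```

The comparison is strict, so a cell sitting exactly at the threshold is counted as “Low” in this sketch.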
Extended Data Fig. 6 Lineage-tracing reveals distinct SCLC evolutionary trajectories. Related to Fig. 3 and Supplementary Tables 4 and 5.
a, UMAP (left) and FA projection (right) of scRNA-seq data from RPM and RPMA basal-organoid derived allograft tumours (from Fig. 3a), coloured by individual sample with method of CellTagging indicated. b, FA projection of individual CellTag clones from RPM basal allograft tumours, coloured by in vivo clonal dynamic “Pattern”, identified in Fig. 3d. See Supplementary Table 4. c, FA projection of individual CellTag clones from RPMA basal allograft tumours, coloured by in vivo clonal dynamic “Pattern”, identified in Fig. 3d. See Supplementary Table 4. d, Expression of indicated SCLC subtype or fate markers in RPM and RPMA basal allograft tumour cells in FA map (from scRNA-seq data in Fig. 3c). e, Top: FA map by Leiden cluster of RPM and RPMA basal allograft tumour cells (from Fig. 3c) (left) with corresponding CellRank-predicted terminal states (middle) and assigned SCLC fate with added annotations defining phenotypic variation within SL clusters (right). Predicted CellRank trajectories had an assigned start of Cluster 17, the ‘Basal’ cluster. Bottom: Violin plots by Leiden cluster of SCLC-A2 archetype and Luminal and Basal hillock signatures used to inform SL phenotypic variation. f, CellRank trajectory-specific expression trends of putative drivers (Supplementary Table 5).
Extended Data Fig. 7 Initial basal state of RPM organoids does not determine clonal dynamics. Related to Fig. 3 and Supplementary Tables 2 and 6.
a, UMAP of basal organoid cells from wild-type (WT) organoids pre-Cre, or “CellTag Pre-Cre” RPM/RPMA organoids following transformation with CMV-Cre, with samples collected for scRNA-seq ~4 weeks after the time of Cre administration as in Fig. 3a. b, UMAP annotated by Leiden cluster and genotype corresponding to samples in (a) (Supplementary Table 6). c, Expression of signatures associated with normal lung cell types (e.g., ciliated, neuroendocrine, ionocyte/tuft) or various basal states in UMAP as in (a). Basal state signatures derived from Goldfarbmuren et al., Nat Comm, 2020 and Lin et al., Nature, 2024 (see Supplementary Table 6). d, UMAP as in (a) annotated by assigned basal state and genotype. e, CellTagged clones include those only from “CellTag Pre-Cre” organoids, as described in Fig. 3a, that were present and matching in both transformed organoids and subsequent allografts. Each bar represents one individual clone [comprising between 5–178 cells (RPM) or 6-1,241 cells (RPMA)] with assigned basal state as in (d). The corresponding in vivo clonal pattern (from Fig. 3d) that in vitro clones yield is annotated on the x-axis. f, The location and pattern of in vitro clones in UMAP as in (a) (top row), annotated by basal state, do not differ significantly from each other, but give rise to distinct in vivo RPM patterns 1, 2, and 5 (bottom row).
Extended Data Fig. 8 PTEN loss promotes POU2F3 in basal-derived SCLC. Related to Fig. 4.
a, T7 endonuclease assay on transformed RPM and RPMA basal organoids after lentiviral infection of LCV2 with sgCtrl or sgPten. Expected products of digestion with editing are 671 and 239 bp. b, Immunoblot of pAKT (Ser473) and total AKT with HSP90 as loading control on transformed RPM and RPMA basal organoids with LCV2-sgCtrl or -sgPten. c, Quantification of tumour volume (mm^3) over time (weeks) in RPM (top) and RPMA (bottom) sgCtrl (passage 1=orange, solid; passage 2=green, dashed), and sgPten (passage 1=blue, solid; passage 2=purple, dashed, passage 3=purple, dotted), basal organoid allografts. Exact number of tumours quantified indicated in figure. d, IHC of RPM and RPMA basal-organoid-derived tumours infected with LCV2-sgControl (sgCtrl) or sgPten for indicated markers (left). H-score quantification for indicated proteins (right) in RPM and RPMA “Ctrl” (parental and sgCtrl-infected tumours, orange) and “sgPten” tumours (purple) with SCLC-dominant histopathology (> 50% of tumour region analyzed is SCLC). Quantification of NSCLC-dominant tumours (only in RPMA) included on far right (both “Control” and “sgPten” tumours). Exact number of tumours quantified from n = 5–6 mice per genotype indicated in figure. Multi-passage tumours included. e, Representative H&E and IHC for indicated markers in RPM and RPMA sgCtrl and sgPten tumours with SCLC histopathology versus regions of NSCLC histology including adenocarcinoma (Adeno), adeno-squamous carcinoma (Adeno-squamous), or Squamous differentiation. H-score quantification for indicated proteins (right) in RPM and RPMA “Ctrl” (parental and sgControl-infected tumours, orange) and “sgPten” tumours (purple) with SCLC-dominant histopathology (> 50% of tumour region analyzed). Quantification of NSCLC-dominant tumours (only found in RPMA) included on the far right (both “Control” and “sgPten” tumours). Exact number of tumours quantified from n = 2–6 mice per genotype indicated in figure. Multi-passage tumours included. 
f, Stacked bar chart showing average proportions of indicated histopathologies in individual RPM and RPMA control or sgPten tumours. Exact number of tumours analyzed indicated above bars. Error bars, mean ± SEM. Histopathologies determined via analysis of H&E and NKX2-1, P63, KRT5, and SCLC subtype marker staining. LCNEC is large-cell neuroendocrine carcinoma. For gel source data for (a,b), see Supplementary Fig. 1. All scale bars, 50 μm. All statistical tests are one-way ANOVA tests with Fisher’s LSD multiple comparisons; **** p < 0.0001, ns = not significant (p > 0.05), and other exact p-values indicated in figure.
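The legend does not state how H-scores were computed. As a hedged illustration only, the sketch below assumes the conventional H-score definition (1 × % weakly stained + 2 × % moderately stained + 3 × % strongly stained cells, giving a range of 0–300) together with the >50% rule used above to call a tumour SCLC-dominant. The function names and example percentages are hypothetical.

```python
def h_score(pct_weak, pct_moderate, pct_strong):
    """Conventional H-score: 1*%weak + 2*%moderate + 3*%strong (range 0-300)."""
    for pct in (pct_weak, pct_moderate, pct_strong):
        if pct < 0:
            raise ValueError("percentages must be non-negative")
    if pct_weak + pct_moderate + pct_strong > 100:
        raise ValueError("stained percentages cannot exceed 100")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


def is_sclc_dominant(sclc_fraction):
    """True when more than 50% of the analysed tumour region is SCLC."""
    return sclc_fraction > 0.5


# Hypothetical tumour: 20% weak, 30% moderate, 10% strong staining
print(h_score(20, 30, 10))
print(is_sclc_dominant(0.65))
```

Because the score weights intensity as well as the fraction of positive cells, two tumours with the same percentage of positive cells can still receive different H-scores.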
Extended Data Fig. 9 PTEN loss and MYC cooperate to drive POU2F3+ SCLC. Related to Fig. 4.
a, Survival of RPP and RPM mice treated with naphthalene + K5-Cre three days later versus RPP mice infected with Cgrp-Cre. Number of mice indicated in the figure. Mantel-Cox log-rank test with exact p-values in figure. b, H-score quantification of ASCL1 (top) and NEUROD1 (bottom) in RPP GEMM tumours initiated by indicated Ad-Cre viruses. Each dot is one tumour. Exact tumour number quantified from n = 5–10 mice per group in figure. Error, mean +/- SD. c, Bar graph depicting H-score for POU2F3 IHC from RPP lung tumours grouped by location in the airways and cell type-specific Ad-Cre virus. Each dot represents one tumour. Exact tumour number quantified from n = 4–18 mice/group in figure. Error, mean +/- SD. d, Immunoblot analysis of human SCLC cell line, H1048, for indicated markers after LCV2 infection with non-targeting control (sgNTC) or sgPTEN sgRNAs with HSP90 as a loading control. e, Immunoblot analysis of H1048 for indicated markers in parental cells versus cells with ectopic myristoylated-AKT (myrAKT) with HSP90 as a loading control. For gel source data for (d,e), see Supplementary Fig. 1. Unless otherwise noted, statistics represent Kruskal-Wallis (KW) tests (p-value indicated on figures). If KW was significant, post-hoc Dunn’s pairwise comparisons were performed (exact p-value on figures with ns=not significant=p > 0.05).
Extended Data Fig. 10 Human SCLC harbours a basal-like subset. Related to Fig. 5 and Supplementary Tables 2, 7, and 8.
a, Spearman correlation matrix of individual genes or gene signature correlations by bulk RNA-seq in n = 944 human SCLC biopsies (see Supplementary Table 2 for gene signatures). b, Heatmap with expression of SCLC subtype markers via bulk RNA-seq in n = 107 human tumours from Liu et al., Cancer Cell, 2024 (top). Tumours are annotated by published NMF groups, and sorted from left to right on “Basal score” expression (mean expression of n = 41 basal fate-related genes). Expression of indicated basal markers captured by proteomic profiling of the same n = 107 tumours23 are included to assess correlation of “Basal score” signature at the mRNA level with protein expression of basal markers. Individual rows are scaled from 0 to 1 with legends included in figure (right). c, Heatmap with expression by bulk RNA-seq of normal tuft, ionocyte, and tuft-ionocyte-progenitor (“TIP”) marker genes or gene signatures in n = 944 human SCLC biopsies, sorted left to right by expression of POU2F3. Samples are annotated by classified SCLC subtypes (including the 5-group and 6-group classifications as in Fig. 5a,b). d, Gene set enrichment analysis (GSEA) of normal neuroendocrine (NE), tuft, basal, and ionocyte cell signatures (Supplementary Table 2) in each human SCLC subtype (A, N, P or Y) versus “All” other subtypes in the real-world bulk RNA-seq dataset. Normalized enrichment score (NES) and p-values determined by Kolmogorov-Smirnov and permutation testing are shown. e, Expression of ionocyte signatures derived from mouse (M) or human (H) scRNA-seq studies (Supplementary Table 2) applied to RPM and RPMA basal allograft tumour cells (from Fig. 2e) in UMAP (top) or by SCLC fate in violin plots (bottom). Box-whisker overlays on violin plots indicate median and upper and lower quartile. Kruskal-Wallis (KW) tests (p-value in figure) with post-hoc Dunn’s multiple comparison tests (****p < 0.0001, ns=not significant=p > 0.05, and other exact p-values in figure).
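The “Basal score” in panel b is described as the mean expression of a set of basal fate-related genes, with tumours sorted on that score and heatmap rows scaled from 0 to 1. A minimal sketch of those three steps follows, assuming an ascending sort and min-max scaling per row; the two-gene list and the expression values are hypothetical stand-ins for the 41-gene signature, which is not reproduced here.

```python
def basal_score(expr_by_gene, genes):
    """Per-tumour mean expression across the given gene list."""
    rows = [expr_by_gene[gene] for gene in genes]
    n_tumours = len(rows[0])
    return [sum(row[i] for row in rows) / len(rows) for i in range(n_tumours)]


def scale_row(values):
    """Min-max scale one heatmap row to the range 0-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant row: no dynamic range to show
    return [(value - lo) / (hi - lo) for value in values]


# Hypothetical expression matrix: gene -> values for three tumours
expr = {"KRT5": [1.0, 4.0, 2.0], "TP63": [3.0, 6.0, 2.0]}
scores = basal_score(expr, ["KRT5", "TP63"])

# Column order for the heatmap: tumours sorted by ascending score (assumed)
order = sorted(range(len(scores)), key=lambda i: scores[i])
scaled = {gene: scale_row(values) for gene, values in expr.items()}
print(scores, order, scaled)
```

Scaling each row independently makes genes with very different absolute expression levels visually comparable, at the cost of hiding differences in magnitude between rows.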
Extended Data Fig. 11 A subset of basal cells targeted by K5-Cre express KRT13 hillock cell marker. Related to Fig. 1.
a-d, Representative tdTom+ cells in naphthalene-injured airway epithelium from Ai9 reporter mice at 3–7 days post-K5-Cre administration stained for indicated basal (KRT5, P63) or hillock (KRT13) markers (upper insets). Bottom panel is merged high magnification co-IF from white box inset indicated in top panel. a, Arrows indicate co-expressing cells in the tracheal epithelium. b, Arrows indicate co-expressing cells in the lung epithelium. c, Arrow indicates tdTom+ cells lacking KRT13 (left) or co-expressing KRT13 in a KRT13+ “hillock” structure (right) in the tracheal epithelium. d, Arrows indicate co-expressing cells in the lung epithelium. e, Percent of tdTom+ cells in the trachea vs lung airway vs total (all airway epithelium) co-expressing or within two cells distance from KRT5 (green), P63 (purple), or KRT13 (turquoise) cells. Quantification reflects all detected tdTom+ cells in mouse airways 3–7 days post K5-Cre administration (n = 4 mice, 160 tdTom+ cells were analyzed for KRT13/tdTom co-stain; n = 90 tdTom+ cells were analyzed for KRT5/P63/tdTom co-stain). Scale bars, 50 µm.
Supplementary information
Supplementary Fig. 1
Raw, uncropped gels.
Supplementary Table 1
Top 500 marker genes of Leiden clusters from scRNA-seq data in Fig. 1j and Extended Data Figs. 2j and 4a,j.
Supplementary Table 2
Related to Figs. 1m,n, 2j,k and 5a–c,e and Extended Data Figs. 2j,l,m,o, 4f,m,n and 10a–e. ASCL1, NEUROD1, POU2F3 and MYC target gene signatures, conserved in mouse and human datasets. ATOH1 predicted target genes derived from SCLC patient-derived xenograft cells. YAP1 activity score established previously from RNA-seq and ChIP-seq data from 891 cancer cell lines. Normal NE, tuft, basal and ionocyte cell signatures derived from mouse or human scRNA-seq and previously published. Human SCLC archetype gene signatures established previously from bulk RNA-seq data on human SCLC cell lines. Established basal and luminal hillock cell signatures derived from mouse scRNA-seq data.
Supplementary Table 3
Top 500 marker genes of Leiden clusters in Figs. 2f and 3c from RPM and RPMA basal-organoid-derived allograft tumours.
Supplementary Table 4
Related to Fig. 3 and Extended Data Figs. 6 and 7. CellTag metadata for RPM and RPMA basal organoids and resulting allograft tumours (including both CellTag pre-Cre and CellTag post-Cre experiments).
Supplementary Table 5
Related to Fig. 3g–j and Extended Data Fig. 6e,f. Putative lineage drivers per CellRank-predicted lineage trajectory in RPM and RPMA basal-derived allograft tumours, predicted with a GPCCA estimator. The estimator predicts lineage drivers per trajectory by correlating gene expression changes with calculated fate probability values; thus, higher correlation values are stronger predicted drivers.
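The driver-ranking idea described for this table, correlating each gene's expression with the fate probability of a trajectory and treating higher correlations as stronger predicted drivers, can be sketched in a few lines. This is not the CellRank implementation or its API; it is a plain Pearson-correlation illustration, and the gene names, expression values and fate probabilities are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def rank_drivers(expr_by_gene, fate_prob):
    """Rank genes by correlation with fate probability, strongest first."""
    corr = {gene: pearson(values, fate_prob) for gene, values in expr_by_gene.items()}
    return sorted(corr.items(), key=lambda item: item[1], reverse=True)


# Hypothetical fate probabilities for four cells along one trajectory
fate_prob = [0.1, 0.3, 0.6, 0.9]
expr = {
    "gene_up": [1.0, 3.0, 6.0, 9.0],    # rises with fate probability
    "gene_flat": [2.0, 2.0, 2.0, 2.0],  # unrelated to fate probability
    "gene_down": [9.0, 7.0, 4.0, 1.0],  # falls with fate probability
}
ranked = rank_drivers(expr, fate_prob)
for gene, r in ranked:
    print(f"{gene}: r = {r:.2f}")
```

In this toy input the gene that rises with fate probability ranks first and the one that falls ranks last, which mirrors how the correlation values in the table are meant to be read.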
Supplementary Table 6
Related to Extended Data Fig. 7. Basal cell heterogeneity signatures derived from published scRNA-seq data on human tracheal epithelium. Top 500 marker genes per Leiden cluster from wild-type and CellTagged pre-Cre RPM and RPMA organoid scRNA-seq data as in Extended Data Fig. 7b.
Supplementary Table 7
Related to Fig. 5d–g. Human SCLC subtype classifications and signatures derived from three distinct and published bulk RNA-seq datasets and the real-world Caris dataset. T cell inflamed, antigen presentation and SCLC-‘inflamed’ signatures.
Supplementary Table 8
Related to Fig. 5 and Extended Data Fig. 10. Demographics and treatment information available from human SCLC tumours included in the real-world Caris bulk RNA-seq dataset.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ireland, A.S., Xie, D.A., Hawgood, S.B. et al. Basal cell of origin resolves neuroendocrine–tuft lineage plasticity in cancer. Nature 647, 257–267 (2025). https://doi.org/10.1038/s41586-025-09503-z
DOI: https://doi.org/10.1038/s41586-025-09503-z