Introduction

Idiopathic Pulmonary Fibrosis (IPF) is the most common adult interstitial lung disease, with rising incidence and prevalence1,2 and a clinical course marked by progressive loss of lung function resulting in lung transplant or death within 3-7 years of diagnosis3. Despite the substantial and growing public health burden of pulmonary fibrosis (PF), there are only two FDA-approved medications, which only modestly slow lung function decline4,5. A major barrier to developing better PF therapies is the failure to model and study the proximal cellular dysfunction that initiates the disease6. While a dominant hypothesis in the field is that intrinsic alveolar type 2 epithelial cell (AT2) dysfunction is the proximal pathogenic event in PF3,7,8,9,10,11, how dysfunctional AT2s participate in the complex multicellular orchestration of lung fibrosis is largely unknown.

Defining histopathologic features of human PF and murine fibrotic lungs include aberrant alveolar epithelial cells and an extracellular matrix-producing fibroblast population. We and multiple other groups have shown that in murine modeling of lung injury, AT2s enter a transitional state between AT2 and AT1 identified by enriched expression of cytokeratins (Krt8, Krt19) and Cldn4 (referred to as cell cycle arrest state, ADI, PATS, DAPTs, ABI or Krt8hi cells), and that in severe fibrotic injuries these cells become stuck in an intermediate state12,13,14,15,16,17,18,19,20,21. These cells share a transcriptional signature with aberrant basaloid cells identified in the lungs of PF patients14,16. However, functional characterization of the disease-associated epithelial and fibroblast cell states has been limited by the difficulty of isolating them to interrogate their interactions.

Here, we used multiple translational PF platforms to functionally characterize the alveolar niche signaling that leads AT2s to enter this pathologic state and the contribution of aberrant intermediate AT2s to profibrotic signaling. To accomplish this, we leveraged our published SftpcC121G model of spontaneous progressive lung fibrosis, where knock-in of a clinical PF-associated Sftpc BRICHOS mutation (C121G) to the adult mouse lung12,22,23 generates a misfolded aggregation-prone proprotein (proSP-C) that induces unfolded protein response (UPR) stress—a specific AT2 dysfunction that is hypothesized to be an upstream mechanistic driver of sporadic PF8,9,11,24,25,26. Using the SftpcC121G model, we identified a cell-surface receptor (CD44) that is expressed on AT2s in the aberrant intermediate epithelial cell state during lung fibrosis, which enabled us to sort and enrich for these aberrant alveolar cells. We then tested the hypothesis that aberrant intermediate epithelial cells enriched in profibrotic mediators can directly promote fibrogenic activation in alveolar fibroblasts. We coupled this murine model with the first patient-specific induced pluripotent stem cell (iPSC)-derived AT2 (iAT2) model of SFTPC BRICHOS mutation-associated PF. Using our murine and human cell models, we identified that loss of canonical epithelial progenitor signals coupled with TGF-β stimulation can cause AT2 entry to the aberrant intermediate state. We thus demonstrate a previously suggested, but unestablished, profibrotic role of the lung alveolar epithelium and the niche signaling alterations that cause AT2s to enter this pathologic state.

Results

Identification of an aberrant intermediate alveolar epithelial cell subset in the Sftpc C121G model of pulmonary fibrosis

As previously reported, SftpcC121G mice receiving weekly tamoxifen (TMX) by oral gavage developed spontaneous and progressive lung fibrosis without the need for a secondary extrinsic lung injury (Fig. 1a, b)12. To identify fibrotic cell-cell signaling pathways that may contribute to lung fibrosis, we characterized the epithelial and mesenchymal cell populations in SftpcC121G mice at the end-stage (week 7) of the model through single-cell RNA sequencing (scRNAseq) on whole lung epithelial (EpCAM+) and mesenchymal (EpCAM−;CD31−;CD45−) cells from two SftpcC121G and two SftpcWT mice (Fig. 1a, c). After quality filtering, we obtained 29,303 cell transcriptomic profiles that showed representation from the lung mesenchyme (Pdgfra, Pdgfrb, Acta2), proximal epithelium (Epcam, Krt5, Scgb1a1, Muc5b, Foxj1), and distal epithelium (Epcam, Sftpc, Abca3, Hopx, Ager) (Fig. 1c; Supplementary Fig. 1). Similar to previous scRNAseq studies on end-stage human PF patient lung samples, we observed significant transcriptomic alterations in all three cell compartments as evidenced by shifts in clustering between SftpcWT and SftpcC121G samples (Fig. 1c; Supplementary Fig. 1)14,16.

Fig. 1: Identification of an aberrant intermediate alveolar epithelial population in the SftpcC121G model of pulmonary fibrosis.

a Schematic of chronic tamoxifen-induced SftpcC121G mutation model and subsequent cell sorting for single-cell RNA sequencing (scRNAseq). b Representative (n = 4 biological replicates) Trichrome staining with fibrosis severity scoring (normal = blue, moderate = green, and severe = red) and bar graph showing relative amounts of injury per lung lobe. c scRNAseq UMAP representation of mesenchymal, proximal (airway), and distal (alveolar) epithelial subsets in both SftpcWT and SftpcC121G mice. Insets depict re-clustered subset analysis of distal epithelium. d Violin plots of relative gene expression in the distal epithelium comparing expression between the two genotypes. Two-way Student’s t test. p values are shown. e Cluster annotation based on marker gene expression in the alveolar epithelium. f UMAP representation of marker gene scores of distal epithelial populations. UPR, unfolded protein response. g Dot plot of individual gene expression in distal epithelial cell populations. h Gene score representation by UMAP and dot plots of signature marker genes described in murine basaloid-like cells19. i Gene score dot plots from human aberrant basaloid populations14,16. j UMAP representation of profibrotic genes enriched within the aberrant intermediate epithelial cell population. Schematics Created in BioRender. Rodriguez, L. (2025) https://BioRender.com/19kfeg6.

To further characterize the alveolar epithelial cell populations, we re-clustered the distal (alveolar) lung epithelium from both SftpcWT and SftpcC121G mice (Fig. 1c). We analyzed the top 150 differentially expressed genes between the two genotypes and discovered that there was a decrease in AT2 marker gene (Sftpb, Abca3, Slc34a2) expression and an enrichment of transitional cell-associated genes (Krt18, Krt8, Sfn, Cldn4) in the SftpcC121G alveolar epithelium (Fig. 1d; Supplementary Fig. 2a)12,14,15,16,17,18,22,23. Consistent with the misfolded proSP-C that initiates AT2 dysfunction in the model, we identified enrichment in UPR genes (Hspa5, Ddit3) in the SftpcC121G alveolar epithelia (Fig. 1d). In our re-clustered alveolar epithelium, we identified 5 clusters (Fig. 1e; Supplementary Fig. 2b). Based on marker gene expression, we identified an AT2 cluster comprised almost entirely (91%) of SftpcWT cells and an AT1 cluster comprised largely of SftpcWT cells (73%), while the majority (81.5%) of SftpcC121G cells segregated into 3 separate clusters (Supplementary Fig. 2c). We identified a cluster enriched in UPR genes (UPR active AT2) and a cluster expressing both AT2 and AT1 marker genes as well as transitional state genes (transitional epithelial cells) (Fig. 1e–g; Supplementary Fig. 2b, d, e). Trajectory inference analysis starting from either the SftpcWT AT2 cluster or the SftpcC121G UPR active AT2 cluster demonstrated two trajectories (Supplementary Fig. 2f). One trajectory marked the AT2-to-AT1 progression seen in functional epithelial repair through the transitional epithelial cell cluster terminating in the AT1 cluster. A second trajectory terminated in a cluster that we termed aberrant intermediate epithelial cells based on high expression of transitional cell markers (Krt8, Cldn4, Krt19) coupled with lower expression of AT2 and AT1 marker genes compared to the transitional epithelial cluster (Fig. 1g; Supplementary Fig. 2e). 
We found that 19.3% of SftpcC121G AT2s were within this cluster and that this cell population closely resembled the transcriptomic profile of a previously described basaloid-like subset of mouse transitional cells, including the expression of defining gene markers Pdlim7, Fblim1, and Palld (Fig. 1h; Supplementary Fig. 2d, e)19. Further, this population shared a similar transcriptomic profile to aberrant basaloid cells described in human PF (Fig. 1i)14,16, and expressed a number of profibrotic gene markers (Tgfb1, Tgfb2, Itgb6, Fn1, Ctgf) (Fig. 1g, j). Together, this analysis of the SftpcC121G alveolar epithelium revealed the presence of an aberrant intermediate epithelial cell state that was enriched in profibrotic genes and shared transcriptomic features with the mouse and human PF aberrant basaloid cells.
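The per-genotype cluster percentages reported above (for example, the share of SftpcC121G cells falling in the aberrant intermediate cluster) amount to a simple tabulation of cell labels. The following is a minimal sketch of that tabulation, not the analysis code used in this study; the function name and the toy labels are hypothetical and were chosen only to illustrate the arithmetic.

```python
from collections import Counter

def genotype_fraction_per_cluster(cells):
    """Return {genotype: {cluster: percent of that genotype's cells}}.

    `cells` is an iterable of (genotype, cluster) pairs, one per cell.
    """
    cells = list(cells)  # allow generators; we iterate twice
    per_genotype = Counter(genotype for genotype, _ in cells)
    per_pair = Counter(cells)
    result = {}
    for (genotype, cluster), n in per_pair.items():
        result.setdefault(genotype, {})[cluster] = 100.0 * n / per_genotype[genotype]
    return result

# Hypothetical toy labels (NOT data from the study).
toy_cells = (
    [("C121G", "aberrant intermediate")] * 2
    + [("C121G", "UPR active AT2")] * 6
    + [("C121G", "AT2")] * 2
    + [("WT", "AT2")] * 9
    + [("WT", "AT1")] * 1
)
fractions = genotype_fraction_per_cluster(toy_cells)
print(fractions["C121G"]["aberrant intermediate"])
```

With these toy labels, 2 of 10 C121G cells sit in the aberrant intermediate cluster, so the reported share is 20%. The same tabulation with the denominator taken per cluster instead of per genotype would give the cluster composition figures (for example, the percentage of an AT2 cluster made up of SftpcWT cells).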

Chronic Sftpc C121G lungs harbor mesenchymal cell populations found in human PF with altered epithelial-to-mesenchymal signaling

We next sought to characterize the mesenchymal cell populations present at end-stage fibrosis in the SftpcC121G lungs (Supplementary Fig. 3), based on the known heterogeneity of lung fibroblasts within healthy adult murine lungs, as well as shifts in gene expression and cell states within fibrotic lung tissues17,27,28,29,30,31. To accomplish this, we re-clustered the lung mesenchyme from our scRNAseq dataset and identified 8 mesenchymal cell populations with distinct gene profiles (Supplementary Fig. 3a, b). This analysis included clusters of alveolar (Col13a1, Npnt, Inmt) and adventitial (Col14a1, Pi16, Ly6a) fibroblasts, as well as fibrotic (Cthrc1, Col1a1, Tgfb1, Fst, Spp1) and a recently described transitional/inflammatory (Lcn2, Hp, Sfrp1) fibroblast population (Supplementary Fig. 3c, d)27,29,30,31. We first compared expression of these marker genes between the SftpcC121G fibroblasts and the SftpcWT fibroblasts in the mesenchymal dataset and discovered that there was a decrease in alveolar marker gene expression and an enrichment of both transitional/inflammatory and fibrotic fibroblast marker genes, notably Cthrc1, in the SftpcC121G fibroblasts (Supplementary Fig. 3e). Analysis of population distribution showed that compared to SftpcWT fibroblasts, SftpcC121G fibroblasts were less frequent in the alveolar fibroblast cluster (9.8% vs. 35.2%) and more frequent in the Cthrc1+ fibrotic fibroblast cluster (30.3% vs. 0.7%) (Supplementary Fig. 3b). This analysis showed alterations in lung mesenchymal transcriptome and cell state composition in the fibrotic SftpcC121G lungs, with a shift from a homeostatic to a pathogenic population.

We next examined the emergence of fibrogenic signaling between alveolar epithelial and mesenchymal cell populations in SftpcC121G fibrotic lungs. We utilized a recently developed software package, NICHES, to generate UMAPs based on ligand–receptor interactions predicted from epithelial and mesenchymal gene expression patterns32. As NICHES accounts for directionality of signaling, we were able to analyze and generate separate UMAPs of epithelial-to-mesenchymal (Fig. 2; Supplementary Fig. 4) and mesenchymal-to-epithelial (Supplementary Fig. 5) signaling. Analysis of epithelial-to-mesenchymal signaling revealed that the cells segregated into 11 clusters, numbered by size (cluster 0 containing the most cells, cluster 11 the fewest) (Fig. 2a, b). SftpcWT and SftpcC121G cells displayed minimal clustering overlap, suggesting the emergence of a divergent interactome in the fibrotic lung (Fig. 2b). Clusters 2 and 3 were predominantly composed of SftpcWT cells and were defined by known ligand–receptor pairs between alveolar fibroblasts and either AT1s (cluster 2) or AT2s (cluster 3). Clusters 0, 1, 4, and 5 were composed almost exclusively of SftpcC121G cells (Fig. 2b, c). In particular, the sending epithelial cells in cluster 5 were largely aberrant intermediate cells (84.0%) and the receiving mesenchymal cells were fibrotic (32.0%) and transitional/inflammatory (22.5%) fibroblasts (Fig. 2c). Further analysis of cluster 5 showed it was defined by profibrotic ligand–receptor signaling pairs, including Tgfb1-Tgfbr2, Tgfb2-Tgfbr2, Pdgfb-Pdgfrb, Ctgf-Lrp1, and Fn1-Cd44 (Fig. 2d–f; Supplementary Fig. 4)33,34,35,36,37, suggesting a potential role of aberrant intermediate epithelial cells in promoting a fibrotic phenotype in lung mesenchymal cells through direct ligand–receptor signals.
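To illustrate why a directional ligand–receptor analysis yields different maps for epithelial-to-mesenchymal and mesenchymal-to-epithelial signaling, the sketch below scores each ligand–receptor pair for one sender cell and one receiver cell as the product of ligand expression in the sender and receptor expression in the receiver. This is our simplified reading of the general approach and an assumption of the sketch; NICHES itself is an R package, and nothing here reproduces its API. All expression values are hypothetical.

```python
def ligand_receptor_scores(sender_expr, receiver_expr, pairs):
    """Score ligand-receptor pairs for a single sender-receiver cell pair.

    Each score is ligand expression in the sender multiplied by receptor
    expression in the receiver (assumed scoring rule for this sketch).
    Genes missing from a cell's profile are treated as not expressed.
    """
    return {
        f"{ligand}-{receptor}": sender_expr.get(ligand, 0.0) * receiver_expr.get(receptor, 0.0)
        for ligand, receptor in pairs
    }

# Ligand-receptor pairs named in the text; expression values are invented.
pairs = [("Tgfb1", "Tgfbr2"), ("Fn1", "Cd44"), ("Pdgfb", "Pdgfrb")]
epithelial_cell = {"Tgfb1": 2.0, "Fn1": 3.0, "Pdgfb": 0.5}
fibroblast_cell = {"Tgfbr2": 1.5, "Cd44": 2.0, "Pdgfrb": 4.0}

# Epithelial cell as sender, fibroblast as receiver.
epi_to_mes = ligand_receptor_scores(epithelial_cell, fibroblast_cell, pairs)
# Swapping the roles gives a different result: direction matters.
mes_to_epi = ligand_receptor_scores(fibroblast_cell, epithelial_cell, pairs)
print(epi_to_mes)
print(mes_to_epi)
```

In this toy case the epithelial-to-mesenchymal scores are nonzero while the reversed direction scores zero for every pair, because the toy fibroblast expresses only receptors. Repeating the scoring over many sender–receiver cell pairs gives a pair-by-interaction matrix that can then be clustered and embedded like any other single-cell feature matrix.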

Fig. 2: Epithelial-to-mesenchymal signaling is characterized by a gain of profibrotic signaling from aberrant intermediate epithelial cells.

a Schematic demonstrating single-cell ligand–receptor mapping using NICHES. b NICHES ‘cell-to-cell’ ligand–receptor signaling UMAP generated using epithelial cells as the principal sender and mesenchyme as receiver. Clusters are numbered based on the number of cells they contain (0 = most cells, 11 = fewest cells). c (Top) NICHES UMAPs labeled by epithelial sender and mesenchymal receiver cell types with cluster 5 circled. (Bottom) Ratios of cell types within each cluster. d Heatmap of individual ligand–receptor pairs defining each cluster archetype. e NICHES UMAPs showing known profibrotic ligand–receptor pairs altered in fibrosis and enriched in cluster 5 (aberrant intermediate epithelial cell sender cluster). f Violin plots of Tgfb1-Tgfbr2 signaling showing increased signaling in SftpcC121G lungs with aberrant intermediate epithelial cell sending. Two-way Student’s t test, p value is shown. Schematics Created in BioRender. Rodriguez, L. (2025) https://BioRender.com/72q4ak7.

NICHES analysis of the mesenchymal-to-epithelial signaling also demonstrated altered signaling in SftpcC121G lungs (Supplementary Fig. 5). UMAP representation of mesenchymal-to-epithelial ligand–receptor signaling revealed 8 separate clusters, with clusters 2, 3, and 8 primarily associated with SftpcWT cells and clusters 0, 1, 4, and 5 primarily associated with SftpcC121G cells. Aberrant intermediate epithelial cells primarily clustered with fibrotic and transitional/inflammatory fibroblasts (clusters 0, 4, and 5) (Supplementary Fig. 5a–c), suggesting they received ligand signals from pathogenic fibroblast populations. These clusters were defined by enrichment of ligand–receptor pairs indicative of dysregulated TGF-β signaling and increased extracellular matrix interactions (Col1a1-Itgb3, Col1a1-Itga5, Tgfb1-Itgav, and Fn1-Cd44) (Supplementary Fig. 5d, e). We also discovered a loss of canonical epithelial maintenance signaling cues in SftpcC121G lungs, including decreased Fgf7-Fgfr2 and Bmp4-Bmpr1a signaling, suggesting impaired alveolar epithelial niche maintenance in SftpcC121G lungs (Supplementary Fig. 5d, e)28,38. Together, the ligand–receptor mapping analysis revealed that the aberrant intermediate epithelial cells were implicated in bidirectional fibrogenic signaling with pathological fibroblasts.

Sorted CD44Hi aberrant alveolar epithelial cells exhibit increased expression of profibrotic mediators

While our scRNAseq analysis of SftpcC121G lungs and studies performed by others in both human PF and mouse fibrosis models14,16,19,20,21 describe a subset of intermediate epithelial cells that express profibrotic markers, the isolation and ex vivo analysis of these cell populations has remained elusive39,40. Our scRNAseq analysis revealed that SftpcC121G aberrant intermediate cells were enriched in expression of Cd44, which encodes the transmembrane glycoprotein CD44 (Fig. 3a). We had previously shown in the SftpcC121G model that AT2s expressing the mutant SP-C become KRT8+ as they lose AT2 identity and enter transitional states, and that this can occur in lung regions both with and without injury or fibrosis12,22. While Cd44 is known to be upregulated in immune cells and fibroblasts during lung fibrosis35,36, immunofluorescent chemistry (IFC) of lung sections for KRT8, as a marker of these dysfunctional SftpcC121G AT2s, and CD44 demonstrated that within fibrotic regions of SftpcC121G peripheral lung tissue (identified by dense collagen deposition in picrosirius red staining; Supplementary Fig. 6a), the majority of KRT8+ cells (55.93% ± 5.35 SD) were CD44+. In contrast, in less fibrotic regions only 17.43% (± 5.615 SD) of KRT8+ cells were CD44+ (Fig. 3b). Based on the localization of CD44+/KRT8+ cells to fibrotic regions, we developed a FACS strategy to isolate CD44 high vs low alveolar epithelial cells (Fig. 3c; Supplementary Fig. 6b). Similar to previous studies41, we found that 8.12% (± 1.86 SD) of AT2s in SftpcWT lungs are CD44Hi. In SftpcC121G lungs, we discovered an increase in the percentage of CD44Hi alveolar epithelial cells to 35.83% (± 4.38 SD) (Fig. 3d). We found no change in the total number of sorted AT2s from the SftpcC121G lungs compared to SftpcWT lungs, suggesting that CD44Hi SftpcC121G AT2s do not emerge due to an AT2 proliferative response (Supplementary Fig. 6c, d). 
We have previously found in the SftpcC121G model that at 7 weeks 82.2% of AT2s undergo Cre-mediated recombination of the Sftpc allele to express the mutant SP-C12. We next asked whether the CD44Hi SftpcC121G AT2s had undergone Sftpc recombination, which would indicate that these cells express the mutant Sftpc transcript. We found that 78.1% (±5.5 SD) of the CD44Lo and 95.9% (±1.7 SD) of the CD44Hi SftpcC121G AT2s had undergone recombination (Supplementary Fig. 6e). This suggests that un-recombined AT2s are not likely to enter the CD44Hi cell state and that antecedent mutant SP-C expression occurs in the CD44Hi SftpcC121G AT2s. To further validate our sorting strategy, we prepared cytospins of sorted AT2s from SftpcWT lungs and of CD44Hi/Lo AT2s sorted from SftpcC121G lungs, and stained the cells for CD44 and KRT8 (Supplementary Fig. 6f). This demonstrated that in SftpcWT lungs 5–9% of sorted AT2s were CD44+, affirming our flow cytometry gating percentages for SftpcWT AT2s. Among the CD44Hi SftpcC121G sorted AT2s, 94.9% (range 93.4–98.2%) were CD44+ and 90.6% (range 86.2–95.4%) were CD44+/KRT8+ double positive, supporting our lung tissue analysis showing CD44+/KRT8+ double positive cells in the fibrotic SftpcC121G lung. Based on these analyses we concluded that we had successfully developed a sorting strategy in the SftpcC121G model to enrich for CD44Hi AT2s, which in our single-cell transcriptomic analysis appeared to represent an aberrant and profibrogenic cell population.

Fig. 3: Sorted CD44Hi SftpcC121G AT2s are enriched in profibrotic mediators.

a UMAP representation of Cd44 expression in aberrant intermediate cells. b (left) Representative (n = 3 biological replicates) immunofluorescence staining of SftpcWT and non-fibrotic and fibrotic regions of SftpcC121G lungs (identified by picrosirius red (PSR) staining; see Supplementary Fig. 6) showing CD44+/KRT8+ cells in fibrotic regions (scale bar = 50 μm). (right) Quantification of KRT8+/CD44− and KRT8+/CD44+ cells per 350 μm × 550 μm high-power field. Ordinary one-way ANOVA with adjustments (Tukey’s) made for multiple comparisons for total KRT8+ cells per HPF (*) and CD44+/KRT8+ cells per HPF (#). Data are presented as mean values ± SEM. c Flow cytometry approach to CD44 AT2s. Total lung epithelial cells (EpCAM+/CD45−/PECAM−) were gated for AT2s (CD200+/CD104−) before isolating CD44Hi/Lo cells from SftpcWT and SftpcC121G lungs. d Percent AT2s that are CD44Hi in SftpcWT and SftpcC121G mice. Two-way Student’s t test; n = 4 biological replicates. Data are presented as mean values ± SEM. e PCA demonstrating bulk transcriptomic differences between CD44Hi/Lo AT2s from SftpcWT and SftpcC121G mice. f Heatmap of all differentially expressed genes across all conditions (>1.5 fold change, adjusted p value < 0.05, adjustments were made for multiple comparisons), grouped by k-means clustering. Select genes are highlighted within each cluster. g Heatmap of murine aberrant basaloid and AT2 marker genes grouped by k-means clustering. h Gene expression by RT-qPCR from freshly sorted AT2s. Ordinary one-way ANOVA with adjustments (Tukey’s) made for multiple comparisons; n = 3 biological replicates from SftpcWT and n = 4 from SftpcC121G. Data are presented as mean values ± SEM. 
i Volcano plot showing differentially upregulated (red) and downregulated (blue) genes (>1.5 fold change (vertical lines), adjusted p value < 0.05 (horizontal line), adjustments were made for multiple comparisons) between CD44Hi and CD44Lo SftpcC121G AT2s annotated with AT2 marker genes, profibrotic marker genes, and mouse aberrant basaloid genes. j Gene ontology analysis for upregulated biological processes in CD44Hi SftpcC121G compared to CD44Lo SftpcC121G AT2s. k Heatmap of the KRT5-/KRT17+ gene list grouped by k-means clustering. Source data are provided as a Source Data file. p value is shown if p < 0.05.

To further validate that the CD44 sorting strategy enriched for AT2s in the aberrant intermediate cell state, we performed RNA sequencing on flow-sorted CD44Hi/Lo alveolar epithelial cells from SftpcWT and SftpcC121G lungs. This analysis showed transcriptional distinctions among the four AT2 populations (Fig. 3e, f). Analysis of the CD44Hi SftpcWT AT2s showed increased expression of inflammatory genes (Il-1β, Tnf, Ccl20) and upregulation of the GO inflammatory response signaling pathway (Fig. 3f and Supplementary Fig. 7), corroborating a recent finding that CD44Hi AT2s from normal murine lungs are a proinflammatory population of AT2s42. However, there was little transcriptional similarity between the CD44Hi SftpcWT AT2s and the CD44Hi SftpcC121G AT2s. CD44Hi SftpcC121G AT2s expressed marker genes of the murine aberrant basaloid intermediate state, had decreased expression of AT2 marker genes, and were enriched for profibrotic genes (Tgfb1, Tgfb2, Fn1, Ctgf, Pdgfb) that we had identified as expressed by the aberrant intermediate epithelial cluster in our scRNAseq analysis (Fig. 3f–h). Further comparison of the CD44Hi and CD44Lo SftpcC121G AT2s showed CD44Hi SftpcC121G AT2s were enriched for profibrotic genes and extracellular matrix organization pathways (Fig. 3i, j). Finally, the CD44Hi SftpcC121G AT2s had increased expression of the human aberrant basaloid cell KRT5-/KRT17+ gene signature previously described in human PF (Fig. 3k)14. Overall, these data suggest that our CD44 sorting strategy in the SftpcC121G model successfully enriches for AT2s in the aberrant intermediate cell state expressing profibrotic genes.
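The differential expression calls summarized in Fig. 3f and 3i use two thresholds stated in the figure legend: a fold change greater than 1.5 and an adjusted p value below 0.05. The sketch below shows one way such a rule could be applied per gene. It is not the pipeline used in this study; the function name is hypothetical, the toy fold changes and p values are invented, and treating the 1.5-fold cutoff symmetrically for downregulated genes is an assumption of the sketch.

```python
def classify_gene(fold_change, adj_p, fc_cutoff=1.5, p_cutoff=0.05):
    """Label one gene as 'up', 'down', or 'ns' (not significant).

    fold_change is a linear ratio (e.g. CD44Hi / CD44Lo), so a 1.5-fold
    decrease corresponds to a ratio below 1 / 1.5 (assumed symmetric cutoff).
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive linear ratio")
    if adj_p >= p_cutoff:
        return "ns"
    if fold_change > fc_cutoff:
        return "up"
    if fold_change < 1.0 / fc_cutoff:
        return "down"
    return "ns"

# Hypothetical (fold change, adjusted p) values, NOT results from the study.
toy_results = {
    "Fn1": (4.0, 0.001),
    "Sftpb": (0.4, 0.01),
    "Actb": (1.1, 0.8),
    "Tgfb1": (1.8, 0.2),
}
labels = {gene: classify_gene(fc, p) for gene, (fc, p) in toy_results.items()}
print(labels)
```

In the toy table, the last gene exceeds the fold change cutoff but fails the adjusted p value cutoff and is therefore labeled not significant, which is why both thresholds appear as separate lines on a volcano plot.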

To determine whether Cd44 expression in aberrant intermediate epithelial cells was unique to the SftpcC121G model, we analyzed CD44Hi alveolar epithelial cells from the bleomycin murine fibrosis model. Utilizing a previously published scRNAseq dataset17, we found that a subset of alveolar cells in the single-dose bleomycin mouse model also had increased expression of Cd44 (Supplementary Fig. 8a, b). These Cd44Hi alveolar cells emerged during the fibrogenic phase (days 9–14) of the bleomycin model and expressed similar profibrotic genes as CD44Hi SftpcC121G AT2s, most notably Fn1 (Supplementary Fig. 8b, c). To test whether the CD44Hi sorting strategy enriched for aberrant intermediate cells in this model, we administered bleomycin to lineage traceable SftpcCreERT2;Rosa26tdTomato(tdT) mice and harvested lungs during the fibrotic phase, at day 14 post-bleomycin (Supplementary Fig. 8d). In fibrotic lungs, we observed tdT+/CD44+ cells by IFC as well as a significant increase in tdT+/CD44Hi cells by flow cytometry: 32.60% (±3.32 SD) compared to 3.57% (±2.32 SD) in non-bleomycin lungs (Supplementary Fig. 8e, f). Flow-sorted tdT+/CD44Hi cells from fibrotic lungs had increased expression of Cldn4 and profibrotic markers (Fn1, Tgfb1, Tgfb2) (Supplementary Fig. 8g), suggesting that CD44 sorting is a useful strategy to enrich for aberrant epithelial cells in this model. We were thus able to develop a strategy to enrich for a subset of alveolar epithelial cells that emerge during fibrosis modeling and are enriched in profibrotic signaling ligands.

CD44Hi Sftpc C121G AT2s contribute to profibrotic epithelial-mesenchymal crosstalk

Previous studies and our transcriptomic analysis suggested that aberrant alveolar epithelial cells may contribute to fibrosis through bidirectional profibrotic signaling with lung fibroblasts14,16,19,20,21,43. In the SftpcC121G lungs, we identified CD44+/KRT8+ double positive cells in proximity to parenchymal smooth muscle actin (SMA) positive cells, suggesting the opportunity for niche cell-cell signaling between fibrotic parenchymal fibroblasts and the aberrant AT2s (Fig. 4a). To model these interactions, we performed ex vivo mixed organoid cultures in which AT2s from SftpcWT or SftpcC121G lungs were co-cultured with either SftpcWT or SftpcC121G PDGFRA+ lung fibroblasts (Fig. 4b and Supplementary Fig. 11a). Our strategy of sorting bulk PDGFRA+ fibroblasts (PDGFRA+;CD45−;EpCAM−;CD31−;MCAM−) from SftpcWT or SftpcC121G fibrotic lungs did not specifically enrich for the most fibrotic lung fibroblasts but rather captured the heterogeneity of the PDGFRA+ population in the SftpcC121G lung. Comparison of these sorted populations demonstrated that SftpcC121G fibroblasts had increased expression of both transitional/inflammatory (Lcn2, Sfrp1, Hp) and fibrotic (Cthrc1, Col1a1) marker genes compared to SftpcWT fibroblasts (Supplementary Fig. 9). It has been previously reported that CD44Hi SftpcWT AT2s have increased progenitor capacity and represent a stem cell pool in response to injury. While we observed differential expression of the GO cell proliferation signaling pathway (Supplementary Fig. 7) in CD44Hi SftpcWT AT2s compared to CD44Lo SftpcWT AT2s, we saw no increase in the colony-forming efficiency (CFE) of CD44Hi SftpcWT AT2s when cultured with either fibrotic or normal lung fibroblasts (Supplementary Fig. 10a–c). 
Upon co-culture of SftpcWT AT2s (combined CD44Hi and CD44Lo populations) and of CD44Hi and CD44Lo SftpcC121G AT2s with SftpcWT or SftpcC121G PDGFRA+ lung fibroblasts, we found that SftpcWT AT2s had increased CFE and organoid size when co-cultured with SftpcC121G fibroblasts compared to SftpcWT fibroblasts, suggesting SftpcC121G fibroblasts could promote SftpcWT AT2 progenitor capacity (Supplementary Fig. 11a–d). However, no increase in CFE or organoid size was observed when CD44Hi SftpcC121G AT2s were cultured with SftpcC121G fibroblasts compared to SftpcWT fibroblasts. These findings suggest the effect of fibroblast signaling on AT2 proliferation may be dependent on AT2 cell state and capacity of AT2s to respond to external cell-cell signaling.

Fig. 4: Mixed organoid culture demonstrates CD44Hi SftpcC121G AT2s induce a pathologic response in fibroblasts through bidirectional signaling.

a Representative (n = 3 biological replicates) immunofluorescence staining of fibrotic regions from SftpcC121G lungs shows KRT8+/CD44+ cells in proximity to SMA+ parenchymal cells (scale bar = 100 μm). b Schematic depicting mixed organoid cultures derived from various combinations of SftpcWT and SftpcC121G CD44Hi/Lo AT2s and PDGFRA+ fibroblasts from SftpcWT and SftpcC121G mice and magnetic bead sorting of day 14 organoids prior to gene expression analysis. Gene expression analysis from enriched (c) epithelial or (d) fibroblast populations derived from day 14 organoids. Two-way ANOVA with uncorrected Fisher’s LSD multiple comparisons (* denotes differences between co-culture conditions with the same fibroblast genotype, # denotes differences between co-culture conditions with the same epithelial genotype); n = 3 biological replicates. Data are presented as median value ± range. e Schematic of bidirectional signaling in mixed organoid culture. Source data are provided as a Source Data file. p value is shown if p < 0.05. Schematics Created in BioRender. Rodriguez, L. (2025) https://BioRender.com/8j8bqzx.

To investigate the signaling mechanisms that may drive the differences seen in mixed organoid cultures, we developed an isolation assay using EpCAM+ magnetic beads to isolate cells from day 14 organoid cultures (Fig. 4b), allowing us to reliably distinguish epithelial cell and fibroblast populations for gene expression analysis (Supplementary Fig. 11e). As expected, based on our scRNAseq data, AT2 markers (Abca3, Etv5, Lamp3) were downregulated in both CD44Hi and CD44Lo SftpcC121G AT2s compared to SftpcWT AT2s, regardless of fibroblast genotype in co-culture (Supplementary Fig. 11f). Compared to SftpcWT AT2s, we observed decreased expression of AT1 markers (Pdpn, Akap5, Vegfa) in both CD44Hi and CD44Lo SftpcC121G AT2s when cultured with SftpcC121G fibroblasts, suggesting that SftpcC121G AT2s have impaired AT2 to AT1 differentiation in the presence of the fibrotic lung fibroblasts (Supplementary Fig. 11f). Conversely, we observed increased expression of Akap5 and Vegfa when CD44Hi and CD44Lo SftpcC121G AT2s were cultured with SftpcWT fibroblasts, suggesting homeostatic lung fibroblasts could improve impaired SftpcC121G AT2 differentiation capacity. Further, CD44Hi SftpcC121G AT2s had increased expression of profibrotic markers (Tgfb1, Tgfb2, Itgb6, Ctgf, Pdgfb, Fn1), as well as Cdkn1a and Cd44, when cultured with SftpcC121G fibroblasts compared to SftpcWT fibroblasts (Fig. 4c). These data revealed a differential effect of fibrotic lung fibroblasts on homeostatic SftpcWT AT2s and CD44Hi SftpcC121G AT2s, and showed that fibrotic lung fibroblasts amplify CD44Hi SftpcC121G AT2 expression of the genes associated with the profibrotic aberrant intermediate cell phenotype.

Based upon the expression of profibrotic genes in CD44Hi SftpcC121G AT2s in organoid co-culture, we next determined whether these cells induced a profibrotic response in the fibroblasts in the co-culture. We found that co-culture with CD44Hi SftpcC121G AT2s, compared to SftpcWT AT2s, resulted in increased expression of transitional (Lcn2, Sfrp1) and fibrotic (Cthrc1, Col1a1, Fst) marker genes in SftpcWT fibroblasts (Fig. 4d; Supplementary Fig. 11f). When co-cultured with SftpcC121G fibrotic lung fibroblasts, CD44Hi SftpcC121G AT2s also induced increased expression of transitional (Sfrp1) and fibrotic (Col1a1, Tgfb1, Fst) marker genes. In contrast, CD44Hi SftpcWT AT2s from homeostatic lungs, which are not enriched for profibrotic mediators or aberrant basaloid gene programs (Fig. 3, Supplementary Fig. 7), did not induce pathological fibroblast gene expression changes in either fibrotic or SftpcWT lung fibroblasts (Supplementary Fig. 10d). We thus found that in organoid co-culture aberrant intermediate epithelial CD44Hi SftpcC121G AT2s could directly impact the alveolar fibroblast phenotype, driving expression of pathogenic fibroblast marker genes in SftpcWT fibroblasts and amplifying the pathologic phenotype of SftpcC121G lung fibroblasts (Fig. 4e).

Altered FGF7 signaling contributes to impaired CD44Hi Sftpc C121G AT2s progenitor capacity

Having established that CD44Hi SftpcC121G AT2s can evoke cell state alterations in alveolar fibroblasts, we next sought to investigate whether CD44Hi SftpcC121G AT2s have a differential response to reciprocal signaling known to maintain AT2 progenitor capacity. We had identified loss of canonical epithelial maintenance niche signaling archetypes between pathogenic fibroblasts and CD44Hi SftpcC121G AT2s in our scRNAseq ligand–receptor analysis (Supplementary Fig. 5), in particular Fgf7-Fgfr2 ligand–receptor signaling, which is critical for AT2 homeostasis and injury repair (Fig. 5a)28,38,44. Further, we had found that SftpcC121G fibroblasts exhibited decreased Fgf7 expression compared to SftpcWT fibroblasts (Fig. 5b), CD44Hi and CD44Lo SftpcC121G AT2s had decreased Fgfr2 expression compared to SftpcWT AT2s (Fig. 5c), and fibroblasts in organoid co-culture with SftpcC121G AT2s decreased their Fgf7 expression (Fig. 4d). Based on these findings, we asked what response CD44Hi and CD44Lo SftpcC121G AT2s had to FGF7 signaling. To address this question, we utilized a previously published AT2-alone organoid model, which allows AT2s to be cultured independently of feeder cells45,46. Using this model, we assessed the dose response to FGF7 (Fig. 5d) and observed genotype-specific differences in CFE. Specifically, while all AT2 conditions required a basal level of FGF7 for organoid growth (Supplementary Fig. 12), only SftpcWT AT2s demonstrated an increase in CFE with escalating FGF7 concentrations (4, 10, and 25 ng/mL), whereas CFE in CD44Hi and CD44Lo SftpcC121G AT2s did not increase (Fig. 5e, f). Interestingly, CD44Lo SftpcC121G AT2s had an increased CFE compared to SftpcWT AT2s in the lowest FGF7 condition suggesting this subset of SftpcC121G AT2s had an FGF7 independent progenitor capacity. 
Analysis of the day 14 AT2s from organoid culture demonstrated that SftpcC121G AT2s (CD44 Lo and Hi) maintained the decreased expression of AT2 marker genes and increased Krt8 expression compared to SftpcWT AT2s that we observed in the freshly sorted AT2s (Fig. 5g). Additionally, CD44Hi SftpcC121G AT2s maintained increased expression of Fn1. IFC staining of AT2s from the organoid culture for proSP-C demonstrated the normal punctate proSP-C protein pattern consistent with post-ER processing in SftpcWT AT2s, but the reticular ER staining pattern with perinuclear aggregates in the CD44Hi and CD44Lo SftpcC121G AT2s (Fig. 5h) that we observed in the in vivo AT2s from the SftpcC121G model12. This proSP-C pattern and the increase in IFC staining for KRT8 in the CD44Hi and CD44Lo SftpcC121G AT2s affirmed that the SftpcC121G AT2s maintain key features of their phenotype in the ex vivo model (Fig. 5h, i). Overall, these data suggest that in part, the CD44Hi SftpcC121G AT2 aberrant intermediate alveolar epithelial cell progenitor defect is mediated by altered epithelial-mesenchymal crosstalk through reduced FGF signaling and that the SftpcC121G AT2s maintain their aberrant phenotype in the ex vivo AT2-only model.
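As an illustration of the CFE readout compared across FGF7 doses above, the following minimal Python sketch assumes the conventional definition of colony forming efficiency (organoids formed per cells seeded, expressed as a percentage) and summarizes replicates as median and range, as in the figure legends. The function names and the replicate counts are hypothetical and are not taken from the study's source data.

```python
from statistics import median


def colony_forming_efficiency(n_organoids: int, n_cells_seeded: int) -> float:
    """Return CFE as the percentage of seeded cells that formed an organoid.

    Assumes the conventional definition: 100 * organoids / cells seeded.
    """
    if n_cells_seeded <= 0:
        raise ValueError("n_cells_seeded must be positive")
    return 100.0 * n_organoids / n_cells_seeded


def summarize(values):
    """Return (median, (min, max)), mirroring 'median value and range' reporting."""
    return median(values), (min(values), max(values))


# Hypothetical (organoid count, cells seeded) pairs for three biological
# replicates at a single FGF7 concentration.
counts = [(150, 5000), (175, 5000), (125, 5000)]
cfe = [colony_forming_efficiency(o, s) for o, s in counts]
cfe_median, cfe_range = summarize(cfe)
print(cfe, cfe_median, cfe_range)
```

Running the same calculation for each genotype and FGF7 dose would give the values needed for a dose-response comparison; the statistical testing reported in the figure legend is not reproduced here.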

Fig. 5: SftpcC121G AT2s have reduced proliferation capacity through decreased response to FGF7 stimulation.

a Violin plot comparing Fgf7-Fgfr2 signaling expression between SftpcWT and SftpcC121G cells in the NICHES analysis of mesenchymal to epithelial ligand–receptor signaling. Two-way Student’s t test. b RT-qPCR gene expression of Fgf7 in freshly sorted PDGFRA+ fibroblasts from SftpcWT and fibrotic SftpcC121G lungs (n = 3 biological replicates). Two-way Student’s t test. Data are presented as median value ± range. c RT-qPCR gene expression of Fgfr2 in freshly sorted SftpcWT and SftpcC121G CD44Hi/Lo AT2s (n = 3 biological replicates). Ordinary one-way ANOVA with adjustments (Tukey’s) made for multiple comparisons. Data are presented as median value ± range. d Schematic depicting AT2-alone 3D culture with a gradient of FGF7 dosing. e Representative images of organoid cultures. Scale bar = 1 mm. f Organoid diameter (µm) and colony forming efficiency (CFE; %) at day 14 of culture (n = 3 biological replicates per condition). Two-way ANOVA with uncorrected Fisher’s LSD multiple comparisons (* denotes differences between samples of the same genotype, # denotes differences between samples treated with the same FGF7 concentration). Data are presented as median value ± range. g RT-qPCR gene expression analysis of AT2s cultured with 10 ng/mL FGF7 for 14 days. Two-way ANOVA with uncorrected Fisher’s LSD multiple comparisons; n = 3 biological replicates. Data are presented as median value ± range. h Immunofluorescence staining of AT2s from day 14 organoid cultures for SP-C and KRT8 (scale bar = 10 μm). i Per-cell KRT8 mean fluorescence intensity. Ordinary one-way ANOVA with adjustments (Tukey’s) made for multiple comparisons. Violin plots display the median (center line) and the first and third quartiles (outer lines). Source data are provided as a Source Data file. p value is shown if p < 0.05. Schematics Created in BioRender. Rodriguez, L. (2025) https://BioRender.com/r96uyab.

CD44Hi SftpcC121G AT2s pathologically activate alveolar fibroblasts through TGF-β signaling

With our AT2-fibroblast organoid cultures showing CD44Hi SftpcC121G AT2s can drive fibroblasts toward a fibrotic phenotype, we next sought to determine whether this process was mediated by soluble factors. We discovered that CD44Hi SftpcC121G AT2s in AT2-only culture retain enrichment for TGF-β signaling genes (Fig. 6a, b), and that supernatant from these cultures had increased levels of active TGFB1 (Fig. 6c). To test whether soluble factors from CD44Hi SftpcC121G AT2s can drive fibroblasts toward a fibrotic phenotype, we exposed sorted SftpcWT alveolar fibroblasts (PDGFRA+; SCA1) in 2D culture (Fig. 6d) to conditioned media from the AT2-alone cultures from the FGF7 10 ng/mL condition. Conditioned media from CD44Hi SftpcC121G AT2s significantly increased fibroblast expression of transitional/inflammatory marker Sfrp1 and fibrotic marker genes Cthrc1 and Col1a1 (Fig. 6e–g), affirming that a profibrotic secretome from CD44Hi SftpcC121G AT2s can induce pathological activation of alveolar fibroblasts.

Fig. 6: CD44Hi SftpcC121G AT2s stimulate a fibrotic response in alveolar fibroblasts through TGF-β signaling.

a RT-qPCR gene expression of Tgfb1, Tgfb2, and Itgb6 in freshly sorted and (b) 3D AT2-only alveolospheres from SftpcWT and SftpcC121G CD44Hi/Lo AT2s (n = 3 biological replicates). Ordinary one-way ANOVA with uncorrected Fisher’s LSD test. Data are presented as median value ± range. c ELISA quantification of active TGFB1 in media supernatants from 3D AT2-only alveolosphere cultures from SftpcWT and SftpcC121G CD44Hi/Lo AT2s (n = 3 biological replicates SftpcWT and SftpcC121G CD44Lo and n = 4 biological replicates SftpcC121G CD44Hi). Ordinary one-way ANOVA with adjustments (Tukey’s) made for multiple comparisons. Data are presented as median value ± range. d (left) Schematic depicting exposure of healthy lung alveolar fibroblasts to conditioned media from AT2 alveolosphere culture, with and without ALK-5 inhibitor or anti-CTGF monoclonal antibody added to conditioned media. (right) Fibroblast cell state with marker genes. RT-qPCR gene expression of alveolar fibroblasts exposed for 2 days to AT2 conditioned media for (e) alveolar, (f) transitional/inflammatory, and (g) fibrotic state marker genes, showing CD44Hi SftpcC121G AT2 conditioned media stimulates fibrotic marker gene (Cthrc1, Col1a1) expression, which is attenuated by ALK-5 inhibition. Ordinary two-way ANOVA with uncorrected Fisher’s LSD multiple comparisons (* denotes differences between samples of the same epithelial genotype, # denotes differences between samples of separate genotypes either with or without ALK-5 inhibition). n = 3 biological replicates. Data are presented as median value ± range. Source data are provided as a Source Data file. p value is shown if p < 0.05. Schematics Created in BioRender. Rodriguez, L. (2025) https://BioRender.com/sqohbxr.

Given the enrichment of TGF-β signaling marker genes and Ctgf in CD44Hi SftpcC121G AT2s, we hypothesized that signaling through these pathways mediates the CD44Hi SftpcC121G AT2 effect on alveolar fibroblasts. To test this, we added either an ALK-5 inhibitor (SB-525334, 1 µM) to suppress TGF-β signaling or a monoclonal antibody against CTGF (Pamrevlumab/FG-3019, 100 ng/mL) to the conditioned media (Fig. 6d). While CTGF inhibition reduced the expression of transitional/inflammatory markers (Lcn2, Sfrp1) and Col1a1, ALK-5 inhibition decreased the expression of both Col1a1 and Cthrc1 to a greater extent (Fig. 6e–g). These findings revealed that a profibrogenic secretome from CD44Hi SftpcC121G AT2s activated alveolar fibroblasts through TGF-β signaling.

To further validate findings using a second AT2 culture method, SftpcWT AT2s and CD44Hi/Lo SftpcC121G AT2s were cultured in 2D on transwells as epithelial-only cultures without fibroblasts and supernatants were collected (Supplementary Fig. 13). This 2D assay has been traditionally used to assess AT2-to-AT1 differentiation, and as expected, we observed in all conditions a decrease in AT2 gene expression at day 5 in culture (Supplementary Fig. 13b)20,47,48. However, we found that CD44Hi SftpcC121G AT2s had reduced expression of AT1 markers (Pdpn, Hopx) at day 5 compared to SftpcWT AT2s, further suggesting impaired AT1 differentiation potential. While 2D culture led to an increase in profibrotic markers in all alveolar epithelial cell conditions, CD44Hi SftpcC121G AT2s expressed significantly higher levels of Tgfb1, Tgfb2, and Itgb6 at day 5 compared to SftpcWT and CD44Lo SftpcC121G AT2s (Supplementary Fig. 12c). At day 5, supernatants were collected from all these alveolar epithelial cell cultures and used as conditioned media for exposure experiments (Supplementary Fig. 14). Similar to our supernatant exposure experiment from AT2-alone organoid culture (Fig. 6d–g), we observed significant upregulation of fibrotic (Cthrc1, Col1a1) marker genes in alveolar fibroblasts exposed to supernatant from CD44Hi SftpcC121G AT2s that was attenuated fully by ALK-5 inhibition but not the anti-CTGF antibody. To confirm this phenomenon was not unique to the chronic SftpcC121G mouse model, we performed a parallel experiment using conditioned media from alveolar epithelial cells isolated from bleomycin-treated lungs. We observed that conditioned media from CD44Hi AT2 cultures from bleomycin-injured lung similarly elicited increased expression of Lcn2, Hp, and Col1a1 in alveolar fibroblasts (Supplementary Fig. 15). 
Taken together, these data affirm a profibrotic secretome from CD44Hi SftpcC121G AT2s that pathologically activates alveolar fibroblasts toward a fibrotic phenotype through TGF-β signaling.

Expression of SFTPC BRICHOS mutant in patient-derived iAT2s results in UPR activation and emergence of an intermediate epithelial state

To examine the mechanisms by which expression of the clinical SFTPC mutation, which was the basis for the murine lung fibrosis model, results in human AT2 dysfunction, we sought to develop a comprehensive disease model using patient-specific iAT2s. We generated patient-specific iPSCs by reprogramming peripheral blood mononuclear cells from a previously reported human subject harboring a heterozygous SFTPC mutation at cysteine residue 121 (SFTPCC121Y) who developed interstitial lung disease with fibrosis in childhood (Fig. 7a)23. Using a gene editing strategy we have previously deployed49, we generated syngeneic iPSC clones expressing corrected (SFTPCWT/tdT) and mutant (SFTPCC121Y/tdT) SFTPC (Supplementary Fig. 16a, b). We then took these clones through our previously published directed differentiation protocol to generate corrected and mutant patient-specific iAT2s for disease modeling50,51. Corrected and mutant SFTPCtdT+ iAT2s were sorted to purity for culture expansion as self-renewing epithelial-only spheres that share the ultrastructural features and gene expression profile of human AT2s (Supplementary Fig. 16c, d)50,52. During initial culture passages, mutant iAT2s demonstrated a significantly lower yield of SFTPCtdT+ cells, suggesting diminished self-renewal capacity at early passages (Fig. 7b). To examine the effects of the SFTPCC121Y mutation on human surfactant protein processing, we analyzed the iAT2 alveolospheres by IFC for proSP-C and observed a reticular proSP-C pattern in SFTPCC121Y/tdT iAT2s, which lacked the punctate proSP-C pattern seen in SFTPCWT/tdT iAT2s that is indicative of lamellar body processing (Fig. 7c). Confirming that the mutant pro-protein caused a defect in SP-C processing, we found that SFTPCC121Y/tdT iAT2s generated a proSP-C protein that in western blot analysis failed to undergo post-ER processing to intermediate isoforms and to the mature SP-C peptide secreted in pulmonary surfactant (Fig. 7d).
Thus, this human cell system faithfully reproduced the aberrant SP-C processing we have identified in our murine model.

Fig. 7: A patient-specific iPSC BRICHOS SFTPC mutation AT2 model reveals AT2 unfolded protein response and emergence of an intermediate epithelial state gene signature.

a Schematic of mutant (SFTPCC121Y/tdT) and corrected (SFTPCWT/tdT) syngeneic iPSC clone development. b Graph showing iAT2 yield per input cell at each passage for mutant and corrected clones. Two-way Student’s t test at each passage; n = 4 biological replicates at P1 and n = 6 biological replicates at P2-4. c Immunofluorescent staining of mutant and corrected iAT2 alveolospheres for proSP-C shows reticular pattern in mutant proSP-C and punctate pattern in corrected proSP-C (representative of n = 3 biological replicates). d Western blotting of iAT2 lysates for proSP-C, mature SP-C and UPR marker proteins BIP and CHOP (n = 3 biological replicates). e UMAP projection of iAT2 scRNAseq from mutant and corrected iAT2s showing genotype-based segregation. f Violin plots of UPR genes DDIT3 (CHOP) and HSPA5 (BIP) in mutant and corrected iAT2s. Two-way Student’s t test. g (left) UMAP projection and (right) violin plot of UPR gene score. Welch two-way Student’s t test. h Densitometry analysis from (d) of BIP and CHOP protein (n = 3 biological replicates). Two-way Student’s t test. i Violin plots of intermediate cell state genes KRT7, CLDN4, and KRT19 in mutant and corrected iAT2s. Two-way Student’s t test. j (left) UMAP projection and (right) violin plot of KRT5-/KRT17+ aberrant basaloid gene score14. Welch two-way Student’s t test. k (left) UMAP projection and (right) violin plot of TGF-β gene score17. Welch two-way Student’s t test. l RT-qPCR gene expression of TGFB1, CDKN1A and (m) FGFR2 in corrected and mutant iAT2s (n = 3 biological replicates). Two-way Student’s t test. b, h, l, m Data are presented as median value ± range. g, j, k Box plots within violins display the median (center line), the first and third quartiles (box limits), and whiskers indicating the minimum and maximum values within 1.5 times the interquartile range. Source data are provided as a Source Data file. p value is shown if p < 0.05. Schematics Created in BioRender. Rodriguez, L. (2025) https://BioRender.com/deg4vtp.

To understand the transcriptional alterations that occur with expression of human BRICHOS mutant SFTPC, we next profiled the syngeneic mutant and corrected iAT2s by scRNAseq. We found that the two genotypes segregated from each other in UMAP space (Fig. 7e). Among the top 50 differentially expressed genes in mutant iAT2s were the UPR marker genes DDIT3 (encoding CHOP) and HSP90B1 (Fig. 7f; Supplementary Fig. 16e). Additionally, HSPA5 (encoding BIP) was among the top 100 differentially expressed genes in mutant iAT2s and was significantly enriched in SFTPCC121Y/tdT iAT2s compared to SFTPCWT/tdT iAT2s (Fig. 7f, Supplementary Fig. 16f). Further, expression of UPR signaling, assessed as a gene score, was significantly higher in mutant iAT2s (Fig. 7g). We validated the upregulated UPR in mutant iAT2s by western blots which confirmed increased expression of the UPR markers BIP and CHOP (Fig. 7d, h). This confirmed that the effect of mutant pro-protein expression on the human iAT2s mirrored the induction of UPR signaling we previously reported in the AT2s in our in vivo model12.

In addition, differential gene expression analysis revealed that among the most enriched transcripts in the SFTPCC121Y/tdT iAT2s were genes associated with the recently described human alveolar-basal intermediate (ABI) 1 state (CLDN4, KRT7, KRT19) (Fig. 7i; Supplementary Fig. 16e, g)13. Affirming that SFTPCC121Y/tdT iAT2s developed a gene signature identified in IPF lungs, we found that they were enriched in the aberrant basaloid KRT5-/KRT17+ gene set (Fig. 7j)14. Together these data showed that the patient-derived mutant iAT2s in our epithelial-only culture system developed a gene signature characteristic of a pathologic cell state found in human IPF lungs. We next asked whether the mutant iAT2s, like the aberrant intermediate epithelial cells in the SftpcC121G mouse model, were enriched for TGF-β signaling genes. Indeed, SFTPCC121Y/tdT iAT2s were enriched for the TGF-β gene score17 and specifically had increased expression of TGFB1 and CDKN1A (Fig. 7k, l). Given that SftpcC121G CD44Hi AT2s downregulated Fgfr2 expression as part of their pathologic state, we analyzed FGFR2 expression in SFTPCC121Y/tdT iAT2s and found this was significantly lower than in SFTPCWT/tdT iAT2s (Fig. 7m). We thus demonstrated that not only do SFTPCC121Y/tdT iAT2s develop the IPF aberrant basaloid phenotype through intrinsic epithelial dysfunction but are also enriched for TGF-β signaling and lose expression of the canonical alveolar niche FGF signaling receptor.

Human iAT2s develop an alveolar-basal intermediate gene signature in response to loss of canonical progenitor signaling and TGF-β activation

Recent work has implicated autocrine TGF-β in the maintenance of pathological AT2 cell states43. To test the impact of TGF-β signaling in the SFTPCC121Y/tdT iAT2s, we cultured them in the presence of ALK-5 inhibitor (SB-431542, 10 μM) and found decreased expression of ABI marker genes KRT17 and KRT7 (Fig. 8a, b). However, the ALK-5 inhibitor did not restore FGFR2 expression. Based on this, we next asked whether altered alveolar niche signaling cues (specifically loss of canonical alveolar epithelial maintenance signals, TGF-β stimulation, or both) could also contribute to the pathological cell state change we identified occurring spontaneously in the SFTPCC121Y/tdT iAT2s. To test this, we cultured SFTPCWT/tdT iAT2s in different media conditions: (1) our standard media (CK + DCI), which includes the WNT activator CHIR and recombinant KGF (also known as FGF7), together abbreviated CK; (2) CK + DCI with the addition of recombinant TGF-β (5 ng/mL) (CK + DCI + TGF-β); (3) media without CHIR and recombinant KGF (DCI); and (4) DCI with recombinant TGF-β (5 ng/mL) (DCI + TGF-β) (Fig. 8c). We found that SFTPCWT/tdT iAT2s cultured for 24 h in either standard media with the addition of TGF-β (CK + DCI + TGF-β) or media without CHIR and KGF (DCI) showed slight alterations in AT2 and ABI1 gene expression: CK + DCI + TGF-β decreased AT2 gene expression and DCI increased expression of KRT7 (Fig. 8d, f). However, the combination of TGF-β stimulation and CK removal (DCI + TGF-β) both decreased AT2 marker gene expression and increased expression of alveolar-basal intermediate 2 (ABI2) genes, including KRT17, which we validated at the protein level by IFC, and TGF-β signaling genes in the iAT2s (Fig. 8d–g). This occurred without transdifferentiation to basal cells, as confirmed by negligible expression of KRT5 in these cells compared to iPSC-derived basal cells (median 0.0014 fold change, range 0.0006 to 0.057, n = 3 biological replicates)53.
These data suggested that while loss of canonical signaling or stimulation with TGF-β alone can shift the iAT2 phenotype toward an ABI1 state, it is the loss of canonical epithelial maintenance signals in combination with TGF-β stimulation that promotes iAT2s to enter the ABI2 state identified in IPF. Thus, the human iAT2 model affirmed the altered alveolar niche signaling that we hypothesize caused SftpcC121G AT2s to enter and persist in the profibrotic aberrant intermediate cell state.

Fig. 8: Human iAT2s develop an alveolar-basal intermediate gene signature in response to loss of canonical progenitor signaling and TGF-β activation.

a Schematic depicting mutant (C121Y) iAT2s exposed to ALK-5 inhibitor (SB-431542, 10 μM). b RT-qPCR gene expression of KRT17, KRT7, and FGFR2 in mutant iAT2s with or without ALK-5i (n = 3 biological replicates). Two-way Student’s t test. Data are presented as median value ± range. c Schematic depicting WT iAT2s exposed to different media conditions: (1) standard media with CHIR and KGF (CK + DCI), (2) standard media with recombinant TGF-β (5 ng/mL) (CK + DCI + TGF-β), (3) media without CK (DCI), and (4) media without CK with recombinant TGF-β (DCI + TGF-β). d RT-qPCR for aberrant basaloid gene expression in corrected iAT2s exposed to different media conditions shows increased expression with DCI + TGF-β (n = 3 biological replicates). Two-way ANOVA with adjustments (Tukey’s) made for multiple comparisons. Data are presented as median value ± range. e Immunofluorescence staining representative of n = 3 biological replicates of WT iAT2 alveolospheres in CK + DCI and DCI + TGF-β conditions shows KRT17+ alveolospheres in DCI + TGF-β condition (scale bar = 20 μm). f RT-qPCR for AT2 marker genes and (g) TGF-β signaling and senescence gene (CDKN1A) in WT iAT2s exposed to different media conditions (n = 3 biological replicates). Two-way ANOVA with adjustments (Tukey’s) made for multiple comparisons. Data are presented as median value ± range. Source data are provided as a Source Data file. p value is shown if p < 0.05. Schematics Created in BioRender. Rodriguez, L. (2025) https://BioRender.com/qw2mkzc.

Discussion

In this study, we aimed to uncover key mechanisms driving lung fibrogenesis by investigating how alveolar epithelial cells signal to lung fibroblasts, promoting pathogenic activation, and how reciprocal signaling leads alveolar epithelial cells to enter and persist in a profibrogenic aberrant intermediate state. To accomplish this, we utilized two translationally relevant models that do not rely upon exogenous lung injury, stimulation with bleomycin, or AT1 ablation to activate lung injury-repair processes. The chronic SftpcC121G mutation mouse model, which we have previously demonstrated recapitulates the clinical course of human PF and produces spontaneous and progressive end stage pathology, leverages a clinical mutation in the BRICHOS domain of the Sftpc gene to generate intrinsic AT2 UPR signaling12. That AT2s in this progressive fibrosis model enter transitional states not in response to epithelial repair of lung injury as has been previously modeled and studied15,17,18,19,20,21,39, but through intrinsic stress better conforms with the prevailing conceptual model of aberrant alveolar epithelial states in human PF7,12,13,22. We coupled this model with a novel human cell translational platform we have developed through generation of iAT2s from a patient with PF carrying a heterozygous SFTPC BRICHOS mutation. This patient-derived cell model produced a distinct BRICHOS mutant induced cellular defect, UPR signaling, which mechanistically differs from the mistrafficking and blocked autophagy mechanisms associated with non-BRICHOS SFTPC mutations in our previously published iAT2 and mouse models49,54. In both our mouse and iAT2 BRICHOS SFTPC mutation models, AT2s developed an aberrant intermediate phenotype. Given the iAT2 culture is without fibroblasts, this model shows that AT2s can enter this cell state through intrinsic signaling in a cell autonomous manner. 
We then harnessed these models to understand the signaling through which aberrant intermediate AT2s arise and persist and how they contribute to lung fibrosis.

A barrier to understanding the contribution of aberrant intermediate epithelial cells to lung fibrosis has been the inability to model their pathogenic interactions with fibroblasts ex vivo. Recent work showing that bleomycin stimulation of AT2s in culture activated aberrant cell programs and triggered profibrotic signaling to fibroblasts has partially overcome that barrier43. However, much of the in vivo contribution of these aberrant intermediate epithelial cells has been inferred from genomics. While our initial single-cell ligand–receptor mapping using the NICHES package provided insights into the epithelial-to-fibroblast signaling that promotes pathogenic activation, our identification that the transmembrane glycoprotein CD44 is an effective marker for FACS enrichment of a profibrotic aberrant intermediate subset of alveolar epithelial cells at the end stage of the SftpcC121G PF model allowed us to model the pathogenic signaling. We note that CD44, as a surface marker, does not delineate a dichotomous population in the SftpcC121G PF model. We contextualize this as reflecting that the aberrant cell state represents the termination of a continuum from homeostatic to aberrant AT2. Using this enrichment method, we found that CD44Hi SftpcC121G AT2s induced a fibrogenic shift in alveolar fibroblasts both in organoid co-cultures and through secreted supernatant exposure assays. Based on our NICHES analysis, we hypothesized that signaling through TGF-β or CTGF mediated this effect. Our inhibitor interventions identified that the CD44Hi SftpcC121G AT2 secretome activates alveolar fibroblasts to express Cthrc1 primarily through TGF-β signaling, with limited contribution from CTGF. This finding is particularly relevant in light of Pamrevlumab’s failure in a recent IPF clinical trial55, supporting a limited role for CTGF in the epithelial contribution to the fibrotic niche.

Additionally, we uncovered how disruptions in canonical mesenchymal-to-epithelial signaling archetypes impact AT2 entry and persistence in the fibrogenic cell state. While previous work found a proliferative CD44Hi perivascular AT2 subset in lung injury42, we found no evidence that the CD44Hi AT2s in the SftpcC121G fibrosis model came from an expansion of a basal CD44Hi population or that the SftpcC121G CD44Hi AT2s were proliferative. Rather, we hypothesize that similar to fibroblasts56 and immune cells57, Cd44 is upregulated during pathologic activation of AT2s, a finding supported by Cd44 enrichment in AT2s stimulated with bleomycin in organoid culture43. Through gene expression analysis and ligand–receptor mapping, we identified altered FGF7 signaling as a contributor to defective proliferation of CD44Hi SftpcC121G AT2s. Further, using our iAT2 model we found that it is the combined loss of homeostatic signaling (WNT/KGF) and the addition of TGF-β stimulation that drove the aberrant intermediate differentiation. Having identified that CD44Hi SftpcC121G intermediate cells are a source of TGF-β signaling ligands that can stimulate fibroblasts and are positioned in a TGF-β enriched niche, the iAT2 modeling conforms with a conceptual model where loss of canonical alveolar maintenance signals coupled with autocrine and paracrine sources of TGF-β create a feedforward loop that promotes AT2 entry into an aberrant, and profibrotic, intermediate cell state (Supplementary Fig. 17).

Pulmonary fibrosis remains a significant clinical challenge, with an ongoing need for the development of effective treatments through a deeper understanding of fibrogenic alveolar niche biology. With the recent identification of complex heterogeneity within the mesenchymal compartment during fibrosis and distinct mesenchymal to epithelial signaling archetypes from alveolar, transitional/inflammatory, and fibrotic fibroblasts, future studies are needed in the SftpcC121G model to disentangle how each fibroblast state signals to pathological epithelium and responds to reciprocal cues. While our study focused on aberrant alveolar epithelial-mesenchymal crosstalk during PF, we acknowledge that other cells within the epithelial and mesenchymal compartments, including distal airway epithelium and pericytes, as well as other cellular compartments, such as immune and endothelial cells, have critical roles in alveolar regeneration following injury and likely amplify the TGF-β signaling circuit we identified58,59. As such, future studies will incorporate characterization of all lung cellular compartments in the SftpcC121G model to better compare findings with human PF. We conclude that while the epithelial paradigm during lung fibrosis has focused on upstream injury and defective progenitor capacity resulting in aberrant lung repair, strategies that modulate the direct epithelial contribution to fibrogenic alveolar niche signaling may provide innovative and effective therapeutics.

Methods

Ethical approval

All experiments involving animals conformed to the Guide for the Care and Use of Laboratory Animals published by the US National Institutes of Health and were carried out in compliance with all ethical regulations under Institutional Animal Care and Use Committee (IACUC) approved protocols (Protocol 807399). The studies adhered to ARRIVE guidelines (Animals in Research: Reporting In Vivo Experiments). The Institutional Review Board of Boston University (protocol H-32506) approved procurement of peripheral blood mononuclear cells (PBMCs) from an individual with a heterozygous SFTPC variant, for reprogramming into iPSCs with written informed consent. All experiments involving the differentiation of human iPSC lines were performed with the approval of the Institutional Review Board of Boston University (protocol H-33122).

Animals and mutation induction

Both male and female animals (aged 12-16 weeks) were utilized in all protocols. Mice were housed in an AAALAC approved barrier facility at the Perelman School of Medicine (University of Pennsylvania). Room temperature ranged from 68 to 78 °F and humidity from 20 to 70%. Lights were on a 12-h cycle (on at 7 a.m., off at 7 p.m.) year-round. Mice were provided with unrestricted access to food (LabDiet, 5053 - PicoLab Rodent Diet 20) and water. All mice were maintained on a C57BL/6 genetic background. Homozygous SftpcC121G mice were generated by crossing a SftpcC121Gneo founder line (C57BL/6 genetic background, Genoway Inc.) with a Rosa26ERT2-Cre strain, B6.129-Gt(ROSA)26Sortm1(cre/ERT2)Tyj/J (stock no. 008463; The Jackson Laboratory) as previously published12,23. Induction of SftpcC121G was achieved through oral gavage (OG) of tamoxifen (TMX; Millipore Sigma, T5648-5G; dissolved in corn oil at 20 mg/mL) at 200 mg/kg in female mice and 230 mg/kg in male mice once weekly, as previously published12. Homozygous Rosa26ERT2-Cre mice expressing wild-type Sftpc (hereafter, SftpcWT mice) treated with TMX were used as controls. All mice were weighed once a week for the duration of the study and the weekly TMX dosing was adjusted accordingly. For bleomycin induced injury, lineage tracing of AT2 was initiated by dosing SftpcCreERT2;Rosa26tdT mice (B6.129S-Sftpctm1(cre/ERT2)Blh/J stock no. 028054; The Jackson Laboratory and B6.Cg-Gt(ROSA)26Sortm14(CAG-tdTomato)Hze/J stock no. 007914; The Jackson Laboratory) with 200 mg/kg TMX (in corn oil) by intraperitoneal injections at 14, 12, and 10 days prior to bleomycin administration. Bleomycin (Fresenius Kabi, 103610) was administered at 1.5 U/kg intratracheally and mice were monitored daily.
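The weight-adjusted weekly TMX dosing described above reduces to simple arithmetic, sketched below in Python. The stock concentration (20 mg/mL) and the sex-specific doses (200 and 230 mg/kg) are taken from the text; the function name and the example body weights are hypothetical, and this is an illustrative calculation rather than the dosing worksheet used in the study.

```python
TMX_STOCK_MG_PER_ML = 20.0  # TMX dissolved in corn oil, as stated in the text
DOSE_MG_PER_KG = {"female": 200.0, "male": 230.0}  # weekly oral gavage doses


def tmx_gavage_volume_ml(body_weight_g: float, sex: str) -> float:
    """Return the gavage volume (mL) for one weekly TMX dose.

    volume = (dose in mg/kg * body weight in kg) / stock concentration in mg/mL
    """
    if body_weight_g <= 0:
        raise ValueError("body_weight_g must be positive")
    if sex not in DOSE_MG_PER_KG:
        raise ValueError("sex must be 'female' or 'male'")
    dose_mg = DOSE_MG_PER_KG[sex] * body_weight_g / 1000.0
    return dose_mg / TMX_STOCK_MG_PER_ML


# Hypothetical example: a 25 g female mouse receives 5 mg TMX in 0.25 mL.
print(tmx_gavage_volume_ml(25.0, "female"))
# Hypothetical example: a 30 g male mouse (dose 6.9 mg).
print(round(tmx_gavage_volume_ml(30.0, "male"), 3))
```

Recomputing the volume from each week's measured weight is what "adjusted accordingly" amounts to in this sketch.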

Histology, H&E severity scoring, and immunofluorescence chemistry

Whole lungs were fixed by tracheal instillation of 10% neutral buffered formalin (Sigma, HT501640) at a constant pressure of 25 cm H2O. The Penn Vet Pathology Core Laboratory performed H&E and Trichrome staining on 6 μm sections of lung tissues and slides were imaged using an Aperio ScanScope Model: CS2 (Leica) at 40× magnification. Whole lung lobe Trichrome images were color coded blue (normal), green (moderate), and red (severe), and scored for fibrosis severity as the percent of total lung lobe area using a published algorithm60. Picrosirius red (PSR) staining of fibrillar collagen was performed on 6 μm sections of lung tissues using a PSR Stain Kit (Polysciences, PI34580). Immunofluorescence chemistry (IFC) staining was performed on 6 μm paraffin embedded lung sections using a combination of commercially available primary antibodies and fluorescent AlexaFluor anti-IgG secondary antibodies (Supplementary Table 1). Images were obtained using an Eclipse Ti2 Series inverted microscope (Nikon) and analyzed using Nikon NIS-Elements software and Fiji 2.15.1.

Mouse lung single-cell RNA sequencing and analysis

Single cell suspensions from SftpcC121G or SftpcWT mice were obtained utilizing our published protocol22. Perfused whole lungs were diced using surgical scissors and digested in a Phosphate Buffered Saline (PBS; MgCa) solution supplemented with 5 mg/mL Collagenase Type I (Thermo, 17100017) and 50 units of DNase (Millipore Sigma, D5025-150KU) at 37 °C for 1 h with frequent mechanical perturbation. Single cell suspensions were subsequently passed through a 70 μm nylon mesh filter and red blood cells were removed using an ACK Lysis Buffer (Thermo, A1049201). To capture a cell suspension containing an equal mixture of strictly epithelial and mesenchymal cells, each lung was divided into two samples and coated with either CD45, CD31, and EpCAM antibody conjugated microbeads (Miltenyi Biotec, 130-052-301, 130-097-418, 130-105-958) to negatively select for mesenchymal cells or EpCAM antibody conjugated microbeads to positively select for epithelial cells. Microbead-coated cells were then passed through magnetic LS columns (Miltenyi Biotec, 130-042-401), and both mesenchymal and epithelial cell populations from each lung were pooled back together at a 50:50 ratio before being loaded individually onto the GemCode instrument (10× Genomics; 2 for each genotype).

ScRNAseq was performed and analyzed as previously described by our group30. Single-cell barcoded droplets were produced using 10X Single Cell 3′ v3 chemistry. The resulting libraries were sequenced using the HiSeq Rapid SBS kit across 2 lanes of an Illumina HiSeq2500 instrument in high-output mode. scRNA-Seq reads were aligned to the mouse genome (mm10/GRCm38) using STARSolo (version 2.7.5b). After initial quality control and processing, we analyzed the scRNAseq data using the Scanpy pipeline61. Genes expressed in fewer than 3 cells were removed, and cells with fewer than 200 genes and a mitochondrial fraction greater than 20% were excluded. Counts were log normalized using scanpy.pp.normalize_per_cell (counts_per_cell_after = 1 × 10⁴), followed by scanpy.pp.log1p. To integrate data from multiple samples, we used Scvi-tools62. We applied scvi.model.SCVI.setup_anndata to establish the model parameters for integration, including layer, categorical_covariate_keys, and continuous_covariate_keys. We then performed a principal component analysis (PCA) and generated a K-nearest neighbor (KNN) graph using scanpy.pp.neighbors with n_neighbors = 15. The resulting KNN graph was used to perform Uniform Manifold Approximation and Projection (UMAP) dimension reduction to visualize the cells in 2 dimensions using scanpy.tl.umap. Clustering was performed using the Leiden algorithm with scanpy.tl.leiden63. See code availability below.
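The filtering and normalization thresholds stated above can be illustrated with a minimal plain-Python sketch. This is not the Scanpy code used in the study. The function name, the toy data layout (a dictionary of per-cell gene counts), the "mt-" prefix for mitochondrial genes, the order of the filtering steps, and the choice to drop a cell when it fails either cell-level criterion are all assumptions made for illustration.

```python
import math

def filter_and_normalize(counts, min_genes=200, max_mito_frac=0.20,
                         min_cells=3, target_sum=1e4, mito_prefix="mt-"):
    """Hypothetical helper. counts maps cell id -> {gene: raw count}."""
    # Gene filter: count the cells in which each gene is detected.
    cells_per_gene = {}
    for genes in counts.values():
        for gene, c in genes.items():
            if c > 0:
                cells_per_gene[gene] = cells_per_gene.get(gene, 0) + 1
    kept_genes = {g for g, n in cells_per_gene.items() if n >= min_cells}

    normalized = {}
    for cell, genes in counts.items():
        total = sum(genes.values())
        if total == 0:
            continue
        n_detected = sum(1 for c in genes.values() if c > 0)
        # Assumption: mitochondrial genes are recognised by name prefix.
        mito = sum(c for g, c in genes.items() if g.startswith(mito_prefix))
        # Assumption: a cell is dropped if it fails EITHER criterion.
        if n_detected < min_genes or mito / total > max_mito_frac:
            continue
        kept_total = sum(c for g, c in genes.items() if g in kept_genes)
        if kept_total == 0:
            continue
        # Scale each cell to target_sum counts, then apply log(1 + x).
        normalized[cell] = {
            g: math.log1p(c * target_sum / kept_total)
            for g, c in genes.items() if g in kept_genes
        }
    return normalized
```

With the default arguments the thresholds match those reported in the text (3 cells, 200 genes, 20% mitochondrial fraction, 1 × 10⁴ counts per cell); smaller values are convenient for toy inputs.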

We identified cell populations using known canonical marker genes (Supplementary Table 2) or by assessing cluster-defining genes based on differential expression. Trajectory analysis was performed on the UMAP reduction using the R Slingshot package with both AT2s and the UPR active AT2s as a starting point without any assigned end points. The lineages were visualized using a custom function in R. Analysis of ligand–receptor signaling was performed using NICHES v1.0.032. Seurat v4.3.0, ggplot2, and ComplexHeatmap v2.14.0 were used to plot selected significant findings, trends, and patterns of interest.

Fluorescence-activated cell sorting (FACS) of epithelial and mesenchymal populations

Single cell suspensions of blood-free whole lung cells were obtained as described above. Fluorophore-conjugated antibodies were used to stain lung cell populations (Supplementary Table 1). All cells were sorted using a CytoFlex SRT (Beckman) and cells were captured in ice-cold FACS buffer (0.1% BSA, 2 mM EDTA, and PBS pH 7.4). SftpcWT AT2s and SftpcC121G AT2s were defined as (CD45−;EpCAM+;CD104−;CD200+). SftpcWT and SftpcC121G bulk fibroblasts were isolated as (CD45−;EpCAM−;CD31−;MCAM−;PDGFRA+). Fluorescent IgG isotype controls were used to determine staining positivity. Flow cytometry gating and quantification were performed using FlowJo v10.10 software.

Mouse cell RNA and DNA isolation, bulk RNA sequencing, and RT-qPCR

RNA was isolated from freshly sorted mouse lung cells, or cells derived from cell culture assays, using the RNeasy Mini Kit (Qiagen, 74106) following the manufacturer’s protocol and RNA quantification was performed using a NanoDrop One (Thermo Fisher Scientific). RNA was either processed for bulk RNA sequencing or reverse transcribed using a Verso cDNA Synthesis Kit (Thermo Fisher Scientific, AB1453B). See Supplementary Table 3 for RT-qPCR primers.

For RNA sequencing, library preps were performed by the Penn NextGen Sequencing Core. Fastq files were evaluated for quality control with the FastQC program and then aligned against the mouse reference genome (mm10) using the STAR aligner87. Duplicate reads were flagged with the MarkDuplicates program from Picard tools and excluded from analysis. Per gene read counts for Ensembl (v67) gene annotations were computed using the R package Rsubread. Gene counts, represented as counts per million (CPM), were normalized using the TMM method in the edgeR R package, and genes with 25% of samples with a CPM < 1 were considered low expressed and removed. The data were transformed with the voom function from the limma R package to generate a linear model and perform differential gene expression analysis. We employed the empirical Bayes procedure as implemented in limma to adjust the linear fit and to calculate p values given the small sample size of the experiment. We adjusted p values for multiple comparisons using the Benjamini-Hochberg procedure. Heatmaps were generated using Morpheus (https://software.broadinstitute.org/morpheus). Gene ontology analysis was performed using the Database for Annotation, Visualization, and Integrated Discovery (https://david.ncifcrf.gov/home.jsp) based on differentially enriched genes (FC > 1.5, adjusted p value < 0.05). Key pathway analyses were performed on gene lists identified from the GSEA molecular signature database.
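Three of the steps named above (CPM scaling, removal of low-expressed genes, and Benjamini-Hochberg adjustment) are simple enough to sketch in plain Python. The sketch below is illustrative only: it does not reproduce TMM normalization, voom, or the limma empirical Bayes fit, and all function names are hypothetical. It also assumes that a gene is removed when at least 25% of samples have CPM < 1, which is one reading of the criterion stated in the text.

```python
def cpm_matrix(counts, library_sizes):
    """counts maps gene -> list of raw counts, one per sample."""
    return {
        gene: [c * 1e6 / n for c, n in zip(row, library_sizes)]
        for gene, row in counts.items()
    }

def drop_low_expressed(cpm, threshold=1.0, max_low_fraction=0.25):
    """Keep genes where fewer than max_low_fraction of samples fall below threshold.

    Assumption: 'at least 25% of samples below CPM 1' triggers removal.
    """
    kept = {}
    for gene, values in cpm.items():
        low = sum(1 for v in values if v < threshold)
        if low / len(values) < max_low_fraction:
            kept[gene] = values
    return kept

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then be called differentially expressed when its adjusted p value is below 0.05 and its fold change exceeds 1.5, matching the cutoffs reported for the downstream enrichment analysis.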

Genomic DNA was extracted using DNeasy Blood and Tissue Kit (Qiagen, 69504). Neomycin excision was calculated based on an RT-qPCR strategy using primers for Rpp30 and Neomycin Cassette (Neo) as previously accomplished12,23.

AT2 cytospins from freshly sorted AT2s were fixed with 4% Paraformaldehyde in PBS (Thermo Fisher, J61899.AP) for 5 min and washed in PBS, permeabilized with 0.5% Triton X-100 (Sigma-Aldrich, X100), blocked with 5% bovine serum albumin (BSA) for 1 h, and incubated in CD44 and KRT8 primary antibody overnight at 4 °C at concentrations as per Supplementary Table 1, followed by secondary antibodies for 1 h at room temperature. Stained cytospins were mounted using a Vectashield antifade mounting medium with DAPI (Vector laboratories, H1200) for imaging. Images were obtained using an Eclipse Ti2 Series inverted microscope (Nikon) and analyzed in the Nikon NIS-Elements software.

Primary Mouse Organoid AT2-fibroblast Culture and Analysis

Freshly sorted alveolar epithelial or mesenchymal cell populations were isolated from whole lungs and sorted by FACS as described above. Mouse lung co-culture organoids were formed according to previously established protocols64. For AT2-fibroblast co-culture organoids, various combinations of SftpcWT AT2s and CD44Hi/Lo SftpcC121G AT2s and PDGFRA+ fibroblasts were utilized. Regardless of cell origin, each technical replicate contained 5000 alveolar epithelial cells and 50,000 PDGFRA+ fibroblasts. Cell suspensions were resuspended in a mix of 50% growth factor reduced (GFR)-Matrigel (Corning, CB40230C) and 50% Small Airway Epithelial Cell Growth Media (SAGM; Lonza, CC-3118) and plated in a 24-well Falcon Cell Culture Insert (Falcon, 353095). The cell/Matrigel culture solidified for 30 min at 37 °C before 500 µl SAGM was added to the basal layer of the transwell insert. SAGM was made fresh for each experiment in accordance with the manufacturer’s instructions, using a Lonza BulletKit (with the exception of excluding Hydrocortisone, BSA, Triiodothyronine, and Epinephrine). For the first two days of culture SAGM was additionally supplemented with 10 μM ROCK inhibitor (Y-27632 dihydrochloride, Sigma, 125410). Subsequently, SAGM was refreshed every other day for the duration of the experiment. All organoid imaging was performed using an inverted Leica Thunder microscope and images were quantified for organoid size and colony forming efficiency (CFE) using Fiji 2.15.1.
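As a worked illustration of the CFE readout, the short sketch below assumes the conventional definition of colony forming efficiency: organoids counted per well as a percentage of epithelial cells seeded. The text does not state the formula, so both the definition and the function name are assumptions; the default denominator is the 5000 alveolar epithelial cells per technical replicate given above.

```python
def colony_forming_efficiency(n_organoids, n_epithelial_seeded=5000):
    """Hypothetical helper: CFE as a percentage of seeded epithelial cells."""
    if n_epithelial_seeded <= 0:
        raise ValueError("number of seeded cells must be positive")
    return 100 * n_organoids / n_epithelial_seeded
```

Under this definition, a well in which 150 organoids are counted has a CFE of 3%.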

To enrich epithelial and mesenchymal cells out of organoid cultures for gene expression analysis, Matrigel droplets containing organoids were initially dissociated in 2 mg/mL Dispase (Fisher, 50-112-8831) for 30 min at 37 °C. Isolated organoids were subsequently resuspended in Accutase (Stemcell Technologies, 07920) for 15 min at 37 °C to achieve a single cell suspension. Single cell suspensions were resuspended in FACS Buffer with biotinylated anti-mouse CD326 (EpCAM; Biolegend, 118203) for 30 min at 4 °C. Excess antibody was washed off in FACS Buffer and the cell pellet was resuspended in FACS Buffer containing Dynabeads M-280 Streptavidin (Invitrogen, 11205D) for 30 min at 4 °C. Tubes containing cell and bead suspensions were then placed on a MagJet Rack (Thermo, MR04) to separate EpCAM+ (on-bead) and EpCAM- (in suspension) fractions. EpCAM- suspensions were pelleted before RNA isolation as described above. EpCAM+ (on-bead) fractions were lysed in RLT Buffer (Qiagen, 74106) and centrifuged at 300g for 5 min to pellet beads. Lysed supernatant was then processed using an RNeasy Mini Kit.

Primary mouse AT2-only culture and analysis

AT2 alone organoid culture was performed using previously described protocols43,45,46. Freshly flow sorted AT2s were resuspended in GFR-Matrigel at 100 cells/μL and 50 μL Matrigel droplets were plated in a 12-well plate (Falcon, 353043). Matrigel droplets were allowed to solidify for 30 min at 37 °C. For media, a previously published AT2 alone maintenance media was utilized, composed of 3 μM CHIR99021 (Tocris, 442310), 1 μM BIRB796 (Stemcell, 598910), 1x Insulin-Transferrin-Selenium (ITS) (Thermo, 51500056), 200 ng/mL Primocin (InvivoGen, NC9141851), 15 mM HEPES (Thermo, 15630080), 1x Glutamax (Thermo, 35050061), 50 ng/mL hEGF (Gibco, 35050061), 5 μg/mL Heparin (Sigma, H3149), 1.25 mM N-Acetyl Cysteine (Sigma, A9165), and varying concentrations of FGF7 (Tocris, 251KG010) as described, in Advanced DMEM/F12 (Gibco, 11330032)46. For the first 2 days of culture AT2 maintenance media was additionally supplemented with 10 μM ROCK inhibitor (Y-27632 dihydrochloride, Sigma, 125410) and 10 ng/mL mouse TNFα (Biolegend, 575204). Subsequently, AT2 maintenance media was refreshed every other day for the duration of the experiment. All organoid imaging was performed using an inverted Leica Thunder microscope and images were quantified for organoid size and colony forming efficiency (CFE) using Fiji 2.15.1.

For AT2 RNA analysis, on day 14 the AT2-only organoid Matrigel was dissociated with 2 mg/mL Dispase (Fisher, 50-112-8831) for 15 min. Organoids were then pelleted and resuspended in Accutase Solution (Stemcell Technologies, 07920) for 15 min and re-pelleted, followed by RNA collection. For IFC, on day 14 the AT2-only organoid Matrigel was dissociated with Cell Recovery Solution (Corning, 354253) for 30 min. Organoids were then pelleted and resuspended in 4% Paraformaldehyde in PBS for 30 min at room temperature. Organoids were then pelleted and resuspended in 20 μL Matrigel, which was then coated on a coverslip. Adherent cells on the coverslip were then permeabilized with 0.5% Triton X-100, blocked with 5% bovine serum albumin (BSA, Sigma, A8806) for 1 h, and incubated in proSP-C and primary antibody overnight at 4 °C at concentrations as per Supplementary Table 1, followed by secondary antibodies for 1 h at room temperature. The stained coverslip with organoids was mounted over a glass slide using a Vectashield antifade mounting medium with DAPI (Vector laboratories, H1200) for imaging. Images were obtained using an Eclipse Ti2 Series inverted microscope (Nikon) and analyzed using Nikon NIS-Elements software.

AT2 alone organoid culture media supernatant was collected at day 14. TGFB1 was quantified in the supernatant using the Mouse TGF-beta 1 DuoSet ELISA kit (R&D, DY1679-05). Alveolar fibroblasts were cultured for 2 days before media was changed to conditioned media derived from AT2 culture with or without the ALK-5 inhibitor SB-525334 (Med Chem Express HY-12043) at 1 μM or anti-CTGF monoclonal antibody Pamrevlumab/FG-3019 (Med Chem Express HY-P99288) at 100 μg/mL. At 2 days post-supernatant exposure, alveolar fibroblasts were lysed for RNA isolation as per above.

Primary mouse 2D culture

Alveolar epithelial cells were sorted by FACS as described above, and healthy alveolar fibroblasts from SftpcWT mice were sorted as (Pdgfra+;Sca1;CD45;CD31;EpCAM;MCAM). Cells were plated on 24-well Falcon Cell Culture Inserts (Falcon, 353504) that had previously been coated with hESC-qualified Matrigel (Corning, 354277) at room temperature for 2 h. The apical compartment of each transwell was loaded with 500 µl DMEM:F12 (10% FBS, Glutamax, and Primocin) containing 1.0 × 10⁵ alveolar epithelial cells or 2.0 × 10⁵ alveolar fibroblasts. The basal compartment was additionally loaded with 500 µl of the same media. For alveolar epithelial cells, media supernatant was collected, and cell monolayers were lysed for RNA isolation at day 5. Alveolar fibroblasts were cultured for 2 days before media was changed to conditioned media derived from alveolar epithelial cell cultures with or without ALK-5 inhibitor SB-525334 at 1 μM or anti-CTGF monoclonal antibody Pamrevlumab/FG-3019 at 100 μg/mL.

SFTPC variant iPSC line generation, maintenance, and directed differentiation to iAT2s

Peripheral blood mononuclear cells from an individual with a heterozygous SFTPC variant in the BRICHOS domain, c.362G>A (p.C121Y) and clinically phenotyped childhood interstitial lung disease (chILD)23, were reprogrammed to generate iPSCs using the CytoTune-iPS 2.0 Sendai Reprogramming Kit (Invitrogen, Thermofisher, A16517) according to the manufacturer’s instructions, resulting in the generation of the iPSC line. Detailed methods are available for free download from the CReM’s protocols page at www.bumc.bu.edu/stemcells.

All iPSC lines used in this study demonstrated a normal karyotype (46XY) when analyzed by G-banding and/or array Comparative Genomic Hybridization (aCGH) both before and after gene-editing (Cell Line Genetics) (Supplementary Fig. 16). Pluripotency was confirmed by staining for pluripotency markers65. Culture conditions used for maintenance and editing of undifferentiated iPSCs were as previously published49,50,51. Further details of iPSC derivation, characterization, and culture are available for free download at http://www.bu.edu/dbin/stemcells/protocols.php.

Generation of gene-edited syngeneic iPSC lines

To generate syngeneic iPSC lines expressing corrected (SPC8-ST-Corr; SFTPCWT/tdT) and mutant (SPC8-ST-Mut; SFTPCC121Y/tdT) SFTPC, TALENs were used to insert a tdTomato (tdT) fluorescent reporter at the translation initiation (ATG) site of the endogenous SFTPC locus of the parental iPSC line, as previously described49,51. To deliver the donor template to the SFTPC locus, we generated a donor vector (p1303 DV-SFTPC-tdT) containing the tdT coding sequence and a floxed PGK promoter-driven puromycin resistance cassette, flanked by left and right arms of homology to the human endogenous SFTPC locus, and the TALENs and donor vector plasmids were co-transfected into the iPSC line using a lipofectamine based transfection protocol as previously described49. After 10 days individual colonies were picked and screened for targeting using the following primer pairs: 5’ GGG TGA GTG AGC TGA TTC GAG 3’, 5’ TGA CCT CCT CGC CCT TGC TCA CCA TG 3’. DNA sequencing using a primer binding in the genome outside any regions included in targeting plasmids confirmed targeting into the endogenous SFTPC locus (5’ GGG TGA GTG AGC TGA TTC GAG 3’). To confirm heterozygous targeting, colonies were screened for a remaining intact SFTPC allele using the following primers: 5’ CTA CGG ACA CAT ATA AGA CCC TGG TC 3’, 5’ CAA ATA GCC ACC ATT CGG ACA CTC 3’. DNA sequencing of the amplified portion of the intact SFTPC allele was performed using the following primer 5’ TAT GGG GAT GGG TAC CAC TGG CTG 3’ to pick colonies of either corrected (SFTPCWT/tdT) or mutant (SFTPCC121Y/tdT) iPSC clones as the tdT cassette is followed by a stop/polyA cassette, preventing expression of the subsequent SFTPC coding sequence from the targeted allele (Supplementary Fig. 16). 
Cre-mediated excision of the floxed puromycin resistance cassette was performed using a plasmid containing Cre-recombinase and neomycin resistance (PHAGE2 EF1a-Cre-IRES-NeoR-W; www.kottonlab.com) using the same lipofectamine-based protocol described above, with 4 days of 200 ng/μl geneticin-based (Life Tech, 11811-023) selection for clones that were transfected with Cre-containing plasmid. Excision of the puromycin cassette was confirmed by PCR using primers: 5’ ATG ACC GAG TAC AAG CCC ACG 3’, 5’ TCA GGC ACC GGG CTT GC 3’.

iPSC directed differentiation into alveolar epithelial type 2 cells (iAT2s) and flow cytometry and fluorescence activated cell sorting (FACS)

iAT2s were generated through iPSC-directed differentiation via definitive endoderm into NKX2-1 lung progenitors using methods we have previously published49,50,51. On days 15-17 of differentiation, live cells were sorted on a high-speed cell sorter (MoFlo Legacy or MoFlo Astrios EQ) to isolate NKX2-1+ lung progenitors based on CD47hi/CD26neg gating66. Immunostaining of cells in single cell suspension for FACS with antibodies against CD47 (PerCP-Cy5.5 conjugated (clone CC2C6) BioLegend Cat# 323110; RRID:AB_940461) and CD26 (PE conjugated (clone BA5b) BioLegend Cat# 302705; RRID:AB_314289) and FACS gating of the CD47hi/CD26neg population for purification of lung progenitors was performed as previously described49. Gating was based on isotype-stained controls or in the case of SFTPCtdT on non-lung endoderm outgrowths (day 15–16 sorted CD47low cells). Flow cytometry staining was quantified using the Stratedigm S1000EXI and analyzed with FlowJo v10.6.2 (FlowJo, Tree Star Inc). Flow cytometry plots shown represent single cells after forward-scatter/side-scatter gating to remove debris as previously detailed51.

Sorted lung progenitors were resuspended in undiluted GFR-3D Matrigel (Corning, 356231) at a density of 400 cells/µl and distal/alveolar differentiation of cells was performed in CK + DCI + Y medium, consisting of complete serum-free differentiation medium (cSFDM) base supplemented with 3 µM CHIR99021 (Tocris, 4423), 10 ng/ml rhKGF (R&D Systems, 251-KG-010) (CK), 50 nM dexamethasone (Sigma-Aldrich, D4902), 0.1 mM 8-Bromoadenosine 3’,5’-cyclic monophosphate sodium salt (Sigma, B7880), 0.1 mM 3-Isobutyl-1-methylxanthine (IBMX; Sigma, I5879) (DCI), and 10 μM Y-27632 (Tocris, 1254) (Y). The resulting epithelial spheres were passaged without further sorting on approximate day 30 (day 29-31) of differentiation and a brief period (4–5 days) of CHIR99021 withdrawal followed by one week of CHIR99021 addback was performed to achieve iAT2 maturation, as previously shown50,51. After this 14-day period, SFTPCtdT+ cells were purified by FACS and resuspended in undiluted GFR-3D Matrigel at a concentration of approximately 400 cells/µl in CK + DCI + Y media to establish long-term iAT2 cultures. iAT2s were then maintained through serial passaging as self-renewing monolayered epithelial spheres (alveolospheres) by plating in 3D Matrigel droplets at a density of 400 cells/μl, with refeeding every other day with CK + DCI + Y medium, according to our previously published protocol51, unless otherwise specified. iAT2 culture quality and purity were monitored at each passage by flow cytometry and iAT2s were periodically repurified by FACS based on SFTPCtdT reporter expression to maintain pure iAT2 cultures. For TGF-β inhibition of the SFTPCC121Y/tdT iAT2s, cells were cultured with or without ALK-5 inhibitor (SB-431542, Tocris, 1614) at 10 μM for 8 days (Fig. 8a).

iAT2 western blot analysis

Alveolospheres were harvested by incubating with 2 mg/ml dispase (Thermo Fisher Scientific, 17105-041) for 30–60 min at 37 °C, followed by resuspension in cell lysis buffer (RIPA buffer, Sigma, 89901 and 1x Roche Complete Protease Inhibitor cocktail, Sigma, P8340). Cellular debris was cleared by centrifugation (15,000g for 10 min) and supernatants were harvested for further analysis. Protein concentration was measured using the Lowry assay and 10 μg of lysates were resolved by SDS-PAGE using Novex Bis-Tris gels (Thermo Fisher Scientific, NP0301), transferred to PVDF membranes (Bio-Rad, 1620177), and blotted with primary antibodies [pro-SFTPC (NPRO-SFTPC, a rabbit polyclonal antibody against the Met [10]–Glu [23] domain of rat proSFTPC; 1:3000, in-house, NA), mature SFTPC (1:2500, Seven Hills, WRAB-76694, lot 12.13), BIP (1:1000, Cell Signaling, 3177, lot 11), CHOP (1:500, Cell Signaling, 5554, lot 7), β-actin (1:10000, Sigma, A1978, lot 088M4804V)] followed by HRP conjugated secondary antibodies [Goat Anti-Mouse IgG (H + L), HRP conjugated (1:5000, BioRad, 0300-0108P), Goat Anti-Rabbit IgG (H + L), HRP conjugated (1:5000, BioRad, 5196-2504)]. Bands detected by enhanced chemiluminescence (ECL2 80196 Thermo Fisher Scientific or WesternSure 926-95000, LI-COR) were acquired by direct scanning using an LI-COR Odyssey Fc Imaging Station and quantitated using the manufacturers’ software. Full blots are shown in Source Data.

iAT2 immunofluorescence microscopy

For formalin fixation and paraffin embedding, alveolospheres were harvested by incubating with 2 mg/ml dispase as above and fixed with 4% paraformaldehyde (Fisher, AAJ19943K2) overnight at 4 °C, combined in low melting agarose to form a pellet and then paraffin embedded. For immunocytochemistry, sections on slides were rehydrated and antigen retrieval was performed at 95 °C for 40 min in Dako Target Retrieval Solution (DAKO, S-1699). Sections were then permeabilized with 0.5% Triton X-100, blocked with 4% normal donkey serum (NDS, Jackson, 017-000-121) for 1 h, and incubated in pro-SFTPC primary antibody in 0.5% Triton X-100 and 4% NDS overnight at 4 °C. The staining was detected with secondary Alexa Fluor-conjugated anti-IgG antibodies purchased from Jackson Immunoresearch (goat anti-rabbit at 1:400 dilution for 1 h at room temperature). Nuclear counterstaining was performed with DAPI (300 nM, Invitrogen, D21490). Coverslips with cells were mounted on glass slides using Prolong Diamond Anti-Fade Mounting Reagent (Invitrogen, P36965) and imaged on an Eclipse Ti2 Series inverted microscope (Nikon) and analyzed using Fiji 2.15.1.

For cryo-embedding, alveolospheres were fixed in 4% paraformaldehyde for 20 min at room temperature. Samples were dehydrated in sucrose, frozen in OCT and cryosectioned at 10 μm. Sections were permeabilized with 0.5% Triton X-100 (Sigma-Aldrich), blocked with 4% normal donkey serum (NDS) for 1 h, and incubated overnight at 4 °C with Alexa Fluor 488-conjugated KRT17 antibody diluted in 0.5% Triton X-100 and 4% NDS. Nuclei were counterstained with Hoechst (1:500, Invitrogen, 62249). Coverslips were mounted using ProLong Diamond Antifade Mounting Reagent (Thermo Fisher Scientific, P36961). Confocal imaging was performed using a Zeiss LSM 710-Live Duo system and processed with Fiji.

iAT2 single-cell RNA sequencing and analyses

For scRNAseq of corrected (SFTPCtdT/WT) and mutant (SFTPCC121Y/tdT) iAT2s on day 194, cells were dissociated for sorting as described previously. All live cells (calcein blue+) irrespective of SFTPCtdT expression were included. iAT2s had been previously purified by FACS for tdT on days 45, 87, and 182. Single cells were captured for sequencing library preparation using a 10X Chromium (10X Genomics, Pleasanton, CA) instrument. ScRNAseq libraries were prepared according to the Single Cell 3’ v3 Reagent Kits User Guide (10X Genomics). Cellular suspensions were loaded on a Chromium Controller instrument (10X Genomics) to generate single-cell Gel Bead-In-Emulsions (GEMs). Reverse transcription (GEM-RT) was performed in a Veriti 96-well thermal cycler (Thermo Fisher Scientific). After RT, GEMs were harvested and the cDNAs were amplified and cleaned with SPRIselect Reagent Kit (Beckman Coulter). Indexed sequencing libraries were constructed using the Chromium Single-Cell 3’ Library Kit (10X Genomics) for enzymatic fragmentation, end-repair, A-tailing, adapter ligation, ligation cleanup, sample index PCR, and PCR cleanup. The barcoded sequencing libraries were quantified by quantitative PCR using the KAPA Library Quantification Kit (KAPA Biosystems, Wilmington, MA). Sequencing libraries were loaded on a NextSeq500 (Illumina) with a custom sequencing setting (26 bp for Read 1 and 98 bp for Read 2), to obtain a sequencing depth of ~50 K reads/cell. Fastq files were generated using bcl2fastq v.2.2 and Cellranger v.3.0.2. The sequence files were mapped to the human genome reference (GRCh37) including the tdTomato reporter. We used Seurat v.3 to further process the data. We estimated the doublet rate according to the 10X Chromium guidelines, in proportion to the density of cells loaded. These rates were used to flag potential doublets based on their gene and UMI counts.
Cells with fewer than 800 genes detected were also filtered out, as well as cells with high percentage of counts mapping to mitochondrial genes (thresholds set by manual inspection for each dataset at 25%). For each experiment, the mutant and corrected samples were merged and then normalized using SCTransform, with cell degradation effect regressed out. After an initial linear dimensionality reduction (PCA), we used UMAP projections to represent the data and the Louvain algorithm was used for clustering. Differential expression tests were done with MAST, applying independent prior to gene testing (minimum detection percentage of 10% in at least one of the populations, minimum average log-fold change of 0.15 between the two populations). An FDR-adjusted p value cutoff of 0.05 was used to determine statistically significant differential gene expression. Gene signature enrichment scores for KRT5-/KRT17+ cell state, UPR activation, and TGF-β signaling were computed using Seurat. The significance of the differences in these scores between corrected and mutant samples was tested using Welch Two Sample t-tests.
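The Welch two-sample comparison of signature scores can be made concrete with a short stdlib-only sketch of the test statistic and the Welch-Satterthwaite degrees of freedom. This is not the analysis code used in the study; the function name and the example inputs are hypothetical, and the p value (which requires the t-distribution CDF) is deliberately left out.

```python
import math
from statistics import mean, variance

def welch_t(scores_a, scores_b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test."""
    na, nb = len(scores_a), len(scores_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(scores_a) / na
    se2_b = variance(scores_b) / nb
    t = (mean(scores_a) - mean(scores_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df
```

Unlike Student's t-test, this form does not assume equal variances in the corrected and mutant populations, which is why the degrees of freedom are generally non-integer.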

iAT2 response to altered niche signals RT-qPCR analysis

To test the effect of loss of homeostatic signaling (WNT and KGF/FGF7) in the emergence of the aberrant intermediate cell state in human AT2s, human iAT2s (clone SPC2-ST-B2; day 127) were incubated with TGF-β (5 ng/ml) for 24 h in the presence or absence of CHIR and KGF (CK). RNA was extracted by first lysing cells in Qiazol (Qiagen, 79306) and subsequently using the RNeasy Mini Kit according to the manufacturer’s protocol. Complementary DNA (cDNA) was generated using MultiScribe Reverse Transcriptase (Applied Biosystems, 4311235). RT-qPCR was performed using TaqMan Fast Universal PCR Master Mix (Thermo Fisher Scientific, 4352042) and TaqMan (Applied Biosystems) reagents. Relative gene expression was calculated as fold change in 18S-normalized gene expression compared to baseline, using the 2^(−ΔΔCt) method. Baseline expression, defined as fold change = 1, was set to iAT2s cultured in CK + DCI medium. Undetected probes were assigned a cycle number of 40 to allow for fold change calculations. Primers were all TaqMan probes purchased from Applied Biosystems. Specifics of primers used are detailed in Supplementary Table 3.
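The fold change calculation described above can be written out as a brief sketch. It assumes 18S as the reference gene and a cycle number of 40 for undetected probes, both as stated in the text; the function name, argument names, and the use of None to mark an undetected probe are illustrative choices.

```python
def fold_change(ct_target, ct_18s, ct_target_baseline, ct_18s_baseline,
                undetected_ct=40.0):
    """Hypothetical helper implementing the 2^(-ddCt) calculation.

    Pass None for a target probe that was not detected.
    """
    def fill(ct):
        # Undetected probes are assigned a fixed cycle number.
        return undetected_ct if ct is None else ct

    delta_sample = fill(ct_target) - ct_18s                      # dCt, treated
    delta_baseline = fill(ct_target_baseline) - ct_18s_baseline  # dCt, baseline
    return 2 ** -(delta_sample - delta_baseline)
```

For example, a target detected at cycle 24 in treated cells and cycle 26 at baseline, with 18S at cycle 10 in both, gives ΔΔCt = −2 and a fold change of 4; the baseline compared with itself returns 1 by construction.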

Statistics

All data are presented with dot plots and group median and range, unless otherwise indicated. Statistical analyses were performed with GraphPad Prism. Two-tailed Student’s t tests were used for comparisons between 2 groups. Multiple comparisons were done by 1-way ANOVA, which was performed with post hoc testing as indicated. In all cases, statistical significance was considered at P ≤ 0.05. Sample sizes for murine experiments were based on prior publications using the SftpcC121G model12,22,23, and for organoid modeling based on prior publications using primary mouse or iAT2 modeling22,49.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.