A molecular cell atlas of the human lung from single-cell RNA sequencing

Abstract

Although single-cell RNA sequencing studies have begun to provide compendia of cell expression profiles (refs. 1–9), it has been difficult to systematically identify and localize all molecular cell types in individual organs to create a full molecular cell atlas. Here, using droplet- and plate-based single-cell RNA sequencing of approximately 75,000 human cells across all lung tissue compartments and circulating blood, combined with a multi-pronged cell annotation approach, we create an extensive cell atlas of the human lung. We define the gene expression profiles and anatomical locations of 58 cell populations in the human lung, including 41 out of 45 previously known cell types and 14 previously unknown ones. This comprehensive molecular atlas identifies the biochemical functions of lung cells and the transcription factors and markers for making and monitoring them; defines the cell targets of circulating hormones and predicts local signalling interactions and immune cell homing; and identifies cell types that are directly affected by lung disease genes and respiratory viruses. By comparing human and mouse data, we identified 17 molecular cell types that have been gained or lost during lung evolution and others with substantially altered expression profiles, revealing extensive plasticity of cell types and cell-type-specific gene expression during organ evolution including expression switches between cell types. This atlas provides the molecular foundation for investigating how lung cell identities, functions and interactions are achieved in development and tissue engineering and altered in disease and evolution.

Fig. 1: Identities and locations of lung epithelial, endothelial, and stromal cell types.
Fig. 2: Identity and residency of lung immune cells.
Fig. 3: Chemokine signalling predicts immune cell homing in lung.
Fig. 4: Evolutionary divergence of lung cell types and expression patterns.

Data availability

Counts/UMI tables, cellular metadata, Seurat objects, and scanpy objects are available on Synapse (https://www.synapse.org/#!Synapse:syn21041850). The data can be explored in a browser using cellxgene at https://hlca.ds.czbiohub.org/. Human sequencing data are available by data access agreement on the European Genome-phenome Archive (EGA) under accession EGAS00001004344. Use of human sequencing data is restricted to not-for-profit research and requires approval or a waiver from the requesting investigator’s institutional review board. Mouse sequencing data are available on the National Institutes of Health’s Sequence Read Archive (SRA) under BioProject accession PRJNA632939. Source data are provided with this paper.

Code availability

The code for demultiplexing counts/UMI tables, clustering, annotation, downstream analyses, and obtaining source data/generating figures that include single-cell expression data is available on GitHub (https://github.com/krasnowlab/HLCA).

References

  1. Enge, M. et al. Single-cell analysis of human pancreas reveals transcriptional signatures of aging and somatic mutation patterns. Cell 171, 321–330.e14 (2017).

  2. Tabula Muris Consortium. Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris. Nature 562, 367–372 (2018).

  3. Han, X. et al. Mapping the mouse cell atlas by microwell-seq. Cell 173, 1307 (2018).

  4. Zeisel, A. et al. Molecular architecture of the mouse nervous system. Cell 174, 999–1014.e22 (2018).

  5. Saunders, A. et al. Molecular diversity and specializations among the cells of the adult mouse brain. Cell 174, 1015–1030.e16 (2018).

  6. Vento-Tormo, R. et al. Single-cell reconstruction of the early maternal-fetal interface in humans. Nature 563, 347–353 (2018).

  7. Young, M. D. et al. Single-cell transcriptomes from human kidneys reveal the cellular identity of renal tumors. Science 361, 594–599 (2018).

  8. Aizarani, N. et al. A human liver cell atlas reveals heterogeneity and epithelial progenitors. Nature 572, 199–204 (2019).

  9. Han, X. et al. Construction of a human cell landscape at single-cell level. Nature 581, 303–309 (2020).

  10. Young, J. Malpighi’s “De pulmonibus”. Proc. R. Soc. Med. 23, 1–11 (1929).

  11. Gehr, P., Bachofen, M. & Weibel, E. R. The normal human lung: ultrastructure and morphometric estimation of diffusion capacity. Respir. Physiol. 32, 121–140 (1978).

  12. Balis, J. U., Paterson, J. F., Paciga, J. E., Haller, E. M. & Shelley, S. A. Distribution and subcellular localization of surfactant-associated glycoproteins in human lung. Lab. Invest. 52, 657–669 (1985).

  13. Hermans, C. & Bernard, A. Lung epithelium-specific proteins: characteristics and potential applications as markers. Am. J. Respir. Crit. Care Med. 159, 646–678 (1999).

  14. Franks, T. J. et al. Resident cellular components of the human lung: current knowledge and goals for research on cell phenotyping and function. Proc. Am. Thorac. Soc. 5, 763–766 (2008).

  15. Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nat. Methods 6, 377–382 (2009).

  16. Gawad, C., Koh, W. & Quake, S. R. Single-cell genome sequencing: current state of the science. Nat. Rev. Genet. 17, 175–188 (2016).

  17. Treutlein, B. et al. Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq. Nature 509, 371–375 (2014).

  18. Reyfman, P. A. et al. Single-cell transcriptomic analysis of human lung provides insights into the pathobiology of pulmonary fibrosis. Am. J. Respir. Crit. Care Med. 199, 1517–1536 (2019).

  19. Braga, F. A. V. et al. A cellular census of human lungs identifies novel cell states in health and in asthma. Nat. Med. 25, 1153–1163 (2019).

  20. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).

  21. Blondel, V. D. et al. Fast unfolding of communities in large networks. J. Stat. Mech. 2008, P10008 (2008).

  22. Howitt, M. R. et al. Tuft cells, taste-chemosensory cells, orchestrate parasite type 2 immunity in the gut. Science 351, 1329–1333 (2016).

  23. Rock, J. R. et al. Notch-dependent differentiation of adult airway basal stem cells. Cell Stem Cell 8, 639–648 (2011).

  24. Garcia, S. R. et al. Single-cell RNA sequencing reveals novel cell differentiation dynamics during human airway epithelium regeneration. Preprint at https://doi.org/10.1101/451807 (2018).

  25. Nabhan, A. N., Brownfield, D. G., Harbury, P. B., Krasnow, M. A. & Desai, T. J. Single-cell Wnt signaling niches maintain stemness of alveolar type 2 cells. Science 359, 1118–1123 (2018).

  26. Zacharias, W. J. et al. Regeneration of the lung alveolus by an evolutionarily conserved epithelial progenitor. Nature 555, 251–255 (2018).

  27. Stan, R. V. et al. The diaphragms of fenestrated endothelia: gatekeepers of vascular permeability and blood composition. Dev. Cell 23, 1203–1218 (2012).

  28. Gillich, A. et al. Capillary cell-type specialization in the alveolus. Nature 586, 785–789 (2020).

  29. Tan, S. Y. S. & Krasnow, M. A. Developmental origin of lung macrophage diversity. Development 143, 1318–1327 (2016).

  30. van den Brink, S. C. et al. Single-cell sequencing reveals dissociation-induced gene expression in tissue subpopulations. Nat. Methods 14, 935–936 (2017).

  31. Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).

  32. Shiow, L. R. et al. CD69 acts downstream of interferon-α/β to inhibit S1P1 and lymphocyte egress from lymphoid organs. Nature 440, 540–544 (2006).

  33. Mackay, L. K. et al. Hobit and Blimp1 instruct a universal transcriptional program of tissue residency in lymphocytes. Science 352, 459–463 (2016).

  34. Moffitt, J. R. & Zhuang, X. RNA imaging with multiplexed error-robust fluorescence in situ hybridization (MERFISH). Methods Enzymol. 572, 1–49 (2016).

  35. Wang, X. et al. Three-dimensional intact-tissue sequencing of single-cell transcriptional states. Science 361, eaat5691 (2018).

  36. Eng, C. L. et al. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH. Nature 568, 235–239 (2019).

  37. Huang, C. et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 395, 497–506 (2020).

  38. Limjunyawong, N., Fallica, J., Horton, M. R. & Mitzner, W. Measurement of the pressure-volume curve in mouse lungs. J. Vis. Exp. 52376, 52376 (2015).

  39. Seeley, R. R., Stephens, T. D. & Tate, P. Essentials of Anatomy and Physiology 7th edn (2005).

  40. Tabula Muris Consortium. A single cell transcriptomic atlas characterizes aging tissues in the mouse. Nature 583, 590–595 (2020).

  41. van Amerongen, R., Bowman, A. N. & Nusse, R. Developmental stage and time dictate the fate of Wnt/β-catenin-responsive stem cells in the mammary gland. Cell Stem Cell 11, 387–400 (2012).

  42. Greif, D. M. et al. Radial construction of an arterial wall. Dev. Cell 23, 482–493 (2012).

  43. Muzumdar, M. D., Tasic, B., Miyamichi, K., Li, L. & Luo, L. A global double-fluorescent Cre reporter mouse. Genesis 45, 593–605 (2007).

  44. Madisen, L. et al. A robust and high-throughput Cre reporting and characterization system for the whole mouse brain. Nat. Neurosci. 13, 133–140 (2010).

  45. Moraga, I. et al. Tuning cytokine receptor signaling by re-orienting dimer geometry with surrogate ligands. Cell 160, 1196–1208 (2015).

  46. Desai, T. J., Brownfield, D. G. & Krasnow, M. A. Alveolar progenitor and stem cells in lung development, renewal and cancer. Nature 507, 190–194 (2014).

  47. Butler, A. et al. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).

  48. Finak, G. et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 16, 278 (2015).

  49. Amberger, J. S. et al. OMIM.org: Online Mendelian Inheritance in Man (OMIM), an online catalog of human genes and genetic disorders. Nucleic Acids Res. 43, D789–D798 (2014).

  50. Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47 (D1), D1005–D1012 (2019).

Acknowledgements

We are grateful to the tissue donors and the clinical staff at Stanford Medical Center who made tissue collection possible, especially J. Benson and E. Chen. We are especially grateful to Jim Spudich who spurred this study. We also thank the Stanford Shared FACS Facility for their expertise and sorting services, especially L. Nichols and M. Weglarz; members of Chan Zuckerberg Biohub and Quake laboratory who supported this work, particularly A. McGeever, B. Yu, B. Jones and S. Kolluru; M. Kumar for discussions on annotation of stromal cells; and M. Petersen for illustrating the lung schematic (Fig. 1b) and C. Kao for help with figure formatting. Some computing for this project was performed on the Sherlock cluster; we thank Stanford University and the Stanford Research Computing Center for providing computational resources and support that contributed to the results. We thank J. Spudich and members of the Krasnow laboratory for discussions and comments on the manuscript, and A. Lozano for discussions on bioinformatic analyses. This work was supported by funding from the Chan Zuckerberg Biohub (S.R.Q.), the Howard Hughes Medical Institute, National Institutes of Health, and the Vera Moulton Wall Center for Pulmonary Vascular Disease (M.A.K.), and the Ludwig Cancer Center at Stanford (I.L.W.). K.J.T. was supported by a Paul and Mildred Berg Stanford Graduate Fellowship. M.A.K. is an investigator of the Howard Hughes Medical Institute.

Author information

Contributions

K.J.T., A.N.N., L.P., R.S., A.G., C.S.K., R.J.M. and M.A.K. conceived the project and designed the lung and blood cell isolation strategy; J.B.S. and C.S.K. designed clinical protocols, reviewed clinical histories and coordinated patient care teams to obtain profiled tissues; G.B. provided expert clinical evaluation and micrographs of donor tissue histology; K.J.T., A.N.N., R.S. and A.G. processed tissue to single-cell suspensions; K.J.T., A.N.N., L.P., A.G., R.S. and S.D.C. sorted cells for SS2; A.N.N., L.P., S.C. and R.V.S. prepared sequencing libraries; and K.J.T., R.V.S. and L.P. processed and aligned sequencing data. R.S., J.S. and Y.M. performed and supervised bulk mRNA sequencing on defined immune populations. K.J.T., A.N.N., R.S., A.G. and R.J.M. provided tissue expertise and annotated cell types. K.J.T., A.N.N. and M.A.K. designed and implemented bioinformatic methods and interpreted results. K.J.T., A.N.N. and A.G. performed follow-up stains. M.A.K., S.R.Q., N.F.N., I.L.W., C.S.K. and R.J.M. supervised and supported the work. K.J.T., A.N.N. and M.A.K. wrote the manuscript, and all authors reviewed and edited the manuscript.

Corresponding authors

Correspondence to Stephen R. Quake or Mark A. Krasnow.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Shalev Itzkovitz and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Strategy for scRNA-seq and annotation of human lung and blood cells.

a, Workflow for capture and mRNA sequencing of single cells from the healthy unaffected regions indicated (D, distal; M, medial; P, proximal lung tissue; see d) of fresh, surgically resected lungs with focal tumours from three participants (1, 2 and 3) and their matched peripheral blood. Cell representation was balanced among the major tissue compartments (endothelial, immune, epithelial and stromal) by magnetic and fluorescence activated cell sorting (MACS and FACS) using antibodies for the indicated surface markers (CD31, CD45, EPCAM). Cell capture and scRNA-seq were done using 10x droplet technology or SS2 analysis of plate-sorted cells. Numbers of profiled cells from each compartment are shown in parentheses. For blood, immune cells were isolated on a high density Ficoll gradient, and unsorted cells were profiled by 10x and sorted cells (using canonical markers for the indicated immune populations) by SS2. Total cell number (all three participants) and median number of expressed genes per cell are indicated for each method. b, Cell clustering and annotation pipeline. Cell expression profiles were computationally clustered by nearest-neighbour relationships and clusters were then separated into tissue compartments based on expression of compartment-specific markers (EPCAM (blue), CLDN5 (red), COL1A2 (green), and PTPRC (purple)), as shown for t-distributed stochastic neighbour embedding (t-SNE) plot of lung and blood cell expression profiles obtained by 10x from participant 3. Cells from each tissue compartment were then iteratively re-clustered until differentially-expressed genes driving clustering were no longer biologically meaningful. Cell cluster annotation was based on expression of canonical marker genes from the literature, markers found through RNA sequencing of purified cell populations (bulk RNA markers), ascertained tissue location, and inferred molecular function from differentially-expressed genes.
c, Heat map of pairwise Pearson correlations of the average expression profile of each cluster in the combined 10x dataset plus SS2 analysis of neutrophils. n values are in Supplementary Table 2. Tissue compartment and identification number of each of the 58 clusters are indicated. For more details on statistics and reproducibility, see Methods. d, Representative micrographs of donor lungs from formalin-fixed, paraffin-embedded sections stained with haematoxylin and eosin showing bronchi, bronchioles, submucosal glands, arteries, veins and alveoli near regions used for scRNA-seq. Staining repeated on at least five sections (encompassing different anatomical regions) from each participant used for scRNA-seq. Scale bar, 100 μm.
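
The pairwise correlation heat map in panel c can be illustrated with a minimal sketch: average the expression vectors of the cells in each cluster, then compute the Pearson correlation between every pair of cluster averages. The cluster names, gene count and expression values below are made up for illustration, and the function names are hypothetical; this is not taken from the authors' code.

```python
import math

def mean_profile(cells):
    """Average expression per gene across a list of per-cell expression vectors."""
    n_cells = len(cells)
    return [sum(values) / n_cells for values in zip(*cells)]

def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def cluster_correlations(clusters):
    """Pairwise Pearson correlations between cluster-average profiles."""
    names = sorted(clusters)
    profiles = {name: mean_profile(clusters[name]) for name in names}
    matrix = [[pearson(profiles[a], profiles[b]) for b in names] for a in names]
    return names, matrix

# Toy, made-up data: each row is one cell, each column one hypothetical gene.
toy_clusters = {
    "A": [[5.0, 0.0, 1.0, 0.0], [4.0, 1.0, 0.0, 0.0]],
    "B": [[0.0, 6.0, 0.0, 2.0], [1.0, 5.0, 0.0, 3.0]],
    "C": [[4.0, 0.0, 2.0, 0.0], [5.0, 1.0, 1.0, 0.0]],
}
names, corr = cluster_correlations(toy_clusters)
for name, row in zip(names, corr):
    print(name, [round(r, 2) for r in row])
```

In this toy input, clusters A and C have similar average profiles and correlate positively, whereas A and B correlate negatively. Extended Data Fig. 2b makes the same kind of comparison between bulk and single-cell profiles.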

Extended Data Fig. 2 Selectively expressed RNA markers of human immune cell types from bulk mRNA sequencing of FACS-purified immune cells.

a, Heat map of RNA expression of the most selectively-expressed genes from bulk mRNA sequencing of the indicated FACS-sorted immune populations (Supplementary Table 3). This dataset provided RNA markers for human immune cell populations that have been classically defined by their cell surface markers. b, Heat map of pairwise Pearson correlation scores between the average expression profiles of the immune cell types indicated that were obtained from bulk mRNA sequencing (BulkSeq, a) and the average scRNA-seq profiles of human blood immune cells in the SS2 dataset annotated by canonical markers and enriched RNA markers from the bulk RNA-seq analysis. The highest correlation in overall gene expression (white dot) of each annotated immune cell cluster in the SS2 dataset (columns) was to the bulk RNA-seq of the same FACS-purified immune population (rows), supporting the scRNA-seq immune cluster annotations (red squares). Cell numbers are in Supplementary Table 2. For more details on statistics and reproducibility, see Methods.

Extended Data Fig. 3 Expression differences and localization of lung cell states and canonical epithelial and endothelial subtypes.

a, Proliferative signature score (based on expression of indicated genes in cells from 10x dataset; cell numbers are in Supplementary Table 2) of each cluster of basal cells, T and natural killer cells, and macrophages. Three clusters had high scores: proliferating basal cells (Bas-p), proliferating natural killer/T cells (NK/T-p), and proliferating macrophages. b, Dot plot of mean level of expression (dot intensity, grey scale) of indicated basal cell markers and percent of cells in population with detected expression (dot size) for 10x dataset. Note partial overlap of markers among different basal populations. c, Immunostaining of adult human pseudostratified airway for differentiation marker HES1 (green) in basal cells (marked by KRT5, red) with DAPI (nuclear) counter stain (blue). Scale bars, 10 μm. Note apical processes extending from HES1+ basal cells (arrowheads) indicating migration away from basal lamina as they differentiate. Other HES1+ cells have turned off basal marker KRT5. Dashed outlines, basal cell nuclei. Quantification shows fraction of basal cells (cuboidal KRT5+ cells on basement membrane) and differentiating basal (Bas-d) cells (KRT5+ cells with apical processes) that were HES1+. n denotes KRT5+ cells scored in sections of two human lungs with staining repeated on four participants. d, Immunostaining of adult human pseudostratified airway for proliferation marker MKI67 (green) in basal cells (marked by KRT5, red) with DAPI counter stain (blue). Scale bars, 5 μm. Quantification shows abundance of proliferating (MKI67-expressing) basal cells in pseudostratified (pseudo) and simple epithelial airways; n denotes KRT5+ cells scored in sections of two human lungs with staining repeated on four participants. e, Relative abundance of epithelial and stromal cell types in scRNA-seq analysis of human lung samples obtained from proximal (blue; 10x cells from P3) and distal (red; 10x cells from D1a, D1b, D2, D3) lung sites. 
In addition to the expected proximal enrichment of some airway cell types (goblet cells, ionocytes, neuroendocrine cells) and distal enrichment of alveolar cell types (AT1, AT2, AT2-signalling, myofibroblasts), note three bracketed pairs of related cell types (ciliated and proximal ciliated; basal and proximal basal (Bas-px) cells; myofibroblasts and fibromyocytes) with one of them proximally enriched. Relative enrichment values are provisional because they can be influenced by efficiency of collection during cell dissociation and isolation. Cell numbers for proximal cells are (from left to right): 357, 275, 73, 175, 153, 191, 39, 145, 57, 24, 20, 10, 328, 1,505, 235, 25 and 70; and for distal cells are: 537, 806, 15, 197, 4, 58, 6, 14, 336, 0, 2, 1, 467, 2,095, 434, 198 and 28. f, RNAscope smFISH and quantification for general basal marker KRT5 (red) and proximal basal cell marker SERPINB3 (white) with DAPI counter stain (blue) and ECM autofluorescence (green) on proximal, pseudostratified bronchi and distal, simple bronchioles. Scale bars, 20 μm (inset, 10 μm). Note enrichment of proximal basal cells (KRT5 SERPINB3 double positive, yellow arrowhead and box) at base of pseudostratified airways. SERPINB3 was not detected in simple airways, indicating that basal cells (but not proximal basal cells) are present there. Staining repeated on two participants. g, Dot plot of expression in ciliated and proximal ciliated cells of canonical (general) ciliated cell markers and specific proximal ciliated markers (in 10x dataset). h, smFISH and quantification of human pseudostratified epithelial (left) and simple epithelial (right) airways for general ciliated marker C20orf85 (white) and proximal ciliated marker DHRS9 (red) with DAPI counterstain (blue) and ECM autofluorescence (green). Note restriction of proximal ciliated cells to pseudostratified airways. Scale bars, 10 μm. Staining repeated on two participants.
i, Heat map of expression of representative general AT2, AT2 selective, and AT2-signalling selective marker genes in AT2 and AT2-signalling human lung cells (SS2 data). AT2 selective markers include negative regulators of Hedgehog and Wnt signalling pathways (for example, HHIP and WIF1, highlighted red) and AT2-signalling selective markers include Wnt ligands, receptors and transcription factors (for example, WNT5A, LRP5 and TCF7L2 highlighted green). Values shown are ln(CPM + 1) for 50 randomly selected cells in each cluster (SS2 data). j, Dot plot of expression of endothelial markers (10x dataset). k, Micrograph (low magnification, left) of bronchial vessel (boxed region) showing vessel location near airway (dotted outline). smFISH for general endothelial marker CLDN5 (red, centre), bronchial vessel-specific marker MYC (green) and Bro1-specific marker ACKR1 (red, right) on serial sections of bronchial vessel cells (arrowheads), co-stained for DAPI (blue). Scale bar, 10 μm. Quantification shows relative abundance of Bro1 and Bro2 cells. Staining repeated on two participants. l–n, smFISH and quantification of vessel types indicated (dotted outlines) showing vein marker ACKR1 (red; l), artery marker GJA5 (red; m), lymphatic marker CCL21 (red; n), and general endothelial marker CLDN5 with DAPI counter stain (blue) and ECM autofluorescence (green). Scale bars, 50 μm (l), 30 μm (m) and 40 μm (n). Staining repeated on two participants. For more details on statistics and reproducibility, see Methods.
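
The ln(CPM + 1) values in panel i can be illustrated with a small worked example: each gene's count in a cell is scaled to counts per million of that cell's total, and the natural logarithm of the scaled value plus one is taken. The sketch below assumes one vector of raw counts per cell. The function name `log_normalize` is hypothetical, and treating the ln(UP10K + 1) values quoted for the 10x data as the same formula with a scale of 10,000 is an assumption.

```python
import math

def log_normalize(counts, scale=1_000_000):
    """Return ln(count / total * scale + 1) for each gene in one cell."""
    total = sum(counts)
    if total == 0:
        # A cell with no counts stays at zero for every gene.
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale) for c in counts]

# Toy cell with made-up counts for three hypothetical genes.
toy_counts = [0, 10, 90]
ln_cpm = log_normalize(toy_counts)                  # counts per million
ln_up10k = log_normalize(toy_counts, scale=10_000)  # assumed per-10,000 scaling
print([round(v, 3) for v in ln_cpm])
print([round(v, 3) for v in ln_up10k])
```

Adding one before taking the logarithm keeps undetected genes at exactly zero, which is why they appear as the baseline colour in heat maps of this kind.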

Extended Data Fig. 4 Markers and lung localization of stromal and dendritic subtypes.

a–d, smFISH for RNA of indicated marker genes of alveolar fibroblasts (a, b) and adventitial fibroblasts (c, d) in adult human (a, c) and mouse (b, d) alveolar (a, b) and pulmonary artery (c, d) sections. ECM autofluorescence (green; a, c) to show blood vessels; Elastin (green, b, d); DAPI counterstain (blue, all panels). Staining repeated on two human participants or three mice. a, smFISH probes: general fibroblast marker COL1A2 (white) and alveolar fibroblast-selective marker GPC3 (red). Arrowheads denote alveolar fibroblasts. Inset, close-up of boxed region showing merged (top) and split channels of an alveolar fibroblast. Scale bars, 20 μm (inset 60 μm). b, smFISH probes: alveolar fibroblast-selective markers Slc7a10 (white) and Fgfr4 (red). Elastin (green) shows alveolar entrance ring. Arrowheads denote alveolar fibroblasts. Scale bar, 5 μm. c, smFISH probes: general fibroblast marker COL1A2 (white) and adventitial fibroblast-selective marker SERPINF1 (red). Adventitial fibroblasts (some indicated by arrowheads) localize around blood vessels (ECM, green). Inset, close-up of boxed region showing merged (top) and split channels of an adventitial fibroblast. Dashed line denotes the artery boundary. Scale bars, 30 μm (inset 90 μm). d, smFISH probes: adventitial fibroblast-selective markers Pi16 (white) and Serpinf1 (red). Adventitial fibroblasts (arrowheads) surround artery (marked by elastin, green). Scale bar, 10 μm. e, Heat map of expression of representative general, adventitial-selective, and alveolar-selective fibroblast markers in 50 randomly selected cells from adventitial (left) and alveolar (right) fibroblast clusters (SS2 dataset). Note specialization (highlighted red) in growth factors (AdvF: PDGFRL, IGFBP4; AlvF: FGFR4, VEGFD) and morphogen (AdvF: SFRP2; AlvF: NKD1, DKK3) signalling or regulation.
f, g, smFISH and quantification of cell abundance in human alveolar (f) and pseudostratified epithelial airway (g) sections probed for myofibroblast and fibromyocyte marker ASPN (red), and for fibromyocyte and airway smooth muscle markers COX4I2 (white; f) and ACTG2 (white; g). ECM autofluorescence, green; DAPI counterstain, blue. Inset (f), boxed region showing close-up of merged (top) and split channels of ASPN+ COX4I2− myofibroblast. Myofibroblasts and fibromyocytes (see below) probably make up remaining cells in Fig. 1f quantification. Inset (g), boxed regions showing close-up of merged (top) and split channels of fibromyocyte (white box) and airway smooth muscle (yellow box) cells. Fibromyocytes (white arrowheads) and airway smooth muscle (yellow arrowheads) are intermingled in wall of pseudostratified airway (dotted outline). Staining repeated on two participants. h, i, smFISH of human alveolar sections probed for general stromal marker COL1A2 (white), pericyte marker COX4I2 (red; h), lipofibroblast marker APOE (red; i). ECM autofluorescence, green; DAPI counterstain, blue. Inset (h), boxed region showing close-up of pericyte. Inset (i), boxed region showing close-up of COL1A2 APOE double-positive LipF. LipF cells are intermingled among other stromal cells (single-positive COL1A2) and macrophages (single-positive APOE). Quantification in Fig. 1f. Scale bars, 20 μm. Staining repeated on two participants. j, Dot plot of COX4I2 expression in alveolar stromal cell types (10x dataset). k, Heat map of expression of dendritic cell marker genes in 50 randomly selected cells from indicated dendritic cell clusters (human blood and lung 10x datasets). Cells in all clusters express general dendritic markers including antigen presenting genes but each cluster also has its own selective markers.
Red highlighted markers distinguishing the newly identified dendritic cell clusters (IGSF21+, EREG+, TREM2+) suggest different roles in asthma (IGSF21+), growth factor regulation (EREG+), and lipid handling (TREM2+). l–n, smFISH of adult human lung proximal and alveolar (Alv) sections as indicated probed for IGSF21+ dendritic cell markers IGSF21 (red) and GPR34 (white) (l), EREG+ dendritic cell marker EREG (red) and general dendritic cell marker GPR183 (white) (m), and TREM2+ dendritic cell markers TREM2 (red) and CHI3L1 (white) (n). DAPI counterstain, blue. Non-punctate signal in red channel (l, n) is erythrocyte autofluorescence. Insets, boxed regions showing merged and split channels of close-up of single dendritic cell of indicated type. Scale bars, 20 μm. Arrowheads denote double-positive cells. Quantification shows distribution of each dendritic type; note IGSF21+ and EREG+ dendritic cells show strong proximal enrichment. Staining repeated on two participants. o, t-SNE of expression profile clusters of monocytes and B, T and natural killer cells (10x dataset, participant 1, 2,622 cells). Note separate cell clusters of each immune cell type isolated from lung (no outline) and blood (dashed outline). Asterisk denotes small number of B cells isolated from the lung that cluster next to blood B cells. For more details on statistics and reproducibility, see Methods.

Extended Data Fig. 5 Markers and transcription factors that distinguish human lung cell types.

a, Violin plots of expression levels (ln(UP10K + 1)) of the most sensitive and specific markers (gene symbols) for each human lung cell type in its tissue compartment (10x dataset). Cell numbers given in Supplementary Table 2. b, Scheme for selecting the most sensitive and specific marker genes for each cell type using Matthews correlation coefficient (MCC). Box-and-whisker plots below show MCCs, true positive rates (TPR), and false discovery rates (FDR) for each cell type (n = 58) using indicated number (nGene) of the most sensitive and specific markers (10x dataset). Note all measures saturate at approximately 2–4 genes, hence simultaneous in situ probing of a human lung for the approximately 100–200 optimal markers would assign identity to nearly every cell. c, Alveolar section of human lung probed by smFISH for AT1 marker AGER and transcription factor MYRF. MYRF is selectively expressed in AT1 cells (arrowheads; 97% of MYRF+ cells were AGER+, n = 250 scored cells). Inset, boxed region showing merged and split channels of AT1 cell. Scale bar, 10 μm. Staining repeated on two participants. d, Alveolar section of human lung probed by smFISH for pericyte marker COX4I2 and transcription factor TBX5. TBX5 is enriched in pericytes (arrowheads, 92% of TBX5+ cells were COX4I2+, n = 250). Inset, boxed region showing merged and split channels of pericyte. Scale bar, 5 μm. Staining repeated on two participants. e, Dot plot of expression of enriched transcription factors in each lung cell type (SS2 dataset). Red text, genes not previously associated with the cell type. Red shading, transcription factors including MYRF that are highly enriched in AT1 cells, and TBX5 and others highly enriched in pericytes. For more details on statistics and reproducibility, see Methods.
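
The marker-selection measures named in panel b (MCC, TPR and FDR) can be illustrated for a single candidate gene. The sketch below assumes that each cell is reduced to two booleans, whether the gene is detected and whether the cell belongs to the target cell type. How the authors combine several markers (nGene) is not described in this legend and is not modelled here. The function name `marker_metrics` and the toy data are hypothetical.

```python
import math

def marker_metrics(detected, in_type):
    """Return (MCC, TPR, FDR) for one gene treated as a binary classifier."""
    tp = sum(d and t for d, t in zip(detected, in_type))
    fp = sum(d and not t for d, t in zip(detected, in_type))
    fn = sum(t and not d for d, t in zip(detected, in_type))
    tn = sum(not d and not t for d, t in zip(detected, in_type))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity
    fdr = fp / (tp + fp) if (tp + fp) else 0.0  # fraction of detections outside the type
    return mcc, tpr, fdr

# Toy data for ten cells: the first four belong to the target cell type,
# and the gene is detected in three of them plus one cell outside the type.
in_type = [True, True, True, True, False, False, False, False, False, False]
detected = [True, True, True, False, True, False, False, False, False, False]
mcc, tpr, fdr = marker_metrics(detected, in_type)
print(round(mcc, 3), tpr, fdr)
```

For this toy gene the confusion counts are TP = 3, FP = 1, FN = 1 and TN = 5, giving MCC = 14/24 ≈ 0.583, TPR = 0.75 and FDR = 0.25. A single score that rewards both sensitivity and specificity is what makes MCC convenient for ranking candidate markers.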

Extended Data Fig. 6 Lung cell targets of circulating hormones and local signals.

a, Dot plot of hormone receptor gene expression in lung cells (SS2 dataset). Type and name of cognate hormones for each receptor are shown at top. Teal, broadly-expressed receptors in lung; other colours, selectively-expressed receptors (<3 lung cell types). Small coloured dots next to cell type names show selectively targeted cell types. AA, amino acid; AM, adrenomedullin; CGRP, calcitonin gene-related peptide; EPO, erythropoietin; GCCT, glucocorticoid; GH, growth hormone; GIP, gastric inhibitory peptide; IGF, insulin-like growth factor; MCCT, mineralocorticoid; RA, retinoic acid; SST, somatostatin. b, Schematic of inferred pericyte cell contractility pathway and its regulation by circulating hormones (AGT, PTH) and capillary expressed signals (EDN, NO). Dots show expression of indicated pathway genes: values at left (outlined red) in each pair of dots in capillary diagram (top) show expression in Cap-a cells (aerocytes) and at right (outlined blue) show expression in general Cap cells (SS2 dataset). Note most signal genes are preferentially expressed in Cap relative to Cap-a cells. c, Heat maps showing number of interactions predicted by CellPhoneDB software between human lung cell types located in proximal lung regions (left panel in each pair) and distal regions (right) based on expression patterns of ligand genes (‘sending cell’) and their cognate receptor genes (‘receiving cell’) (SS2 dataset). The pair of heat maps at the top left show values for all predicted signalling interactions (‘all interactions’), and other pairs show values for the indicated types of signals (growth factors, cytokines, integrins, WNT, Notch, BMP, FGF and TGFβ). Predicted interactions between cell types range from 0 (lymphocyte signalling to neutrophils) to 136 (AdvF signalling to Cap-i1).
Note expected relationships, such as immune cells expressing integrins to interact with endothelial cells and having higher levels of cytokine signalling relative to their global signalling, and unexpected relationships, such as fibroblasts expressing most growth factors and lack of Notch signalling originating from immune cells. For more details on statistics and reproducibility, see Methods.
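The sender-by-receiver counts summarized in the panel c heat maps can be sketched as follows. This is a deliberately simplified illustration, not the CellPhoneDB procedure: CellPhoneDB applies its own statistical framework to a curated pair database, whereas this sketch only counts ligand–receptor pairs whose ligand is expressed in the sending cell type and whose receptor is expressed in the receiving cell type. The expressed-gene sets assigned to each cell type are hypothetical.

```python
from itertools import product

# Hypothetical expressed-gene sets per cell type (illustrative only).
expressed = {
    "AdvF": {"FGF7", "WNT5A", "NOTCH3"},
    "AT2": {"FGFR2", "FZD5"},
    "Cap": {"NOTCH1", "DLL4"},
}

# Small illustrative list of (ligand, receptor) pairs; a real analysis
# would draw these from the curated CellPhoneDB database.
pairs = [
    ("FGF7", "FGFR2"),
    ("WNT5A", "FZD5"),
    ("DLL4", "NOTCH1"),
    ("DLL4", "NOTCH3"),
]

def interaction_counts(expressed, pairs):
    """Count ligand-receptor pairs for every ordered (sender, receiver) pair."""
    counts = {}
    for sender, receiver in product(expressed, repeat=2):
        counts[(sender, receiver)] = sum(
            lig in expressed[sender] and rec in expressed[receiver]
            for lig, rec in pairs
        )
    return counts

counts = interaction_counts(expressed, pairs)
```

The matrix is not symmetric: `counts[("AdvF", "AT2")]` and `counts[("AT2", "AdvF")]` differ because sending and receiving roles are distinguished, as in the heat maps.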

Extended Data Fig. 7 Lung cell expression patterns of genes implicated in lung disease.

Dot plots of expression (in SS2 dataset) of 233 lung disease genes curated from genome-wide association studies (GWAS; genome-wide association genes ≥ 10⁻²⁰ significance) and Online Mendelian Inheritance in Man (OMIM). For more details on statistics and reproducibility, see Methods.

Extended Data Fig. 8 Mapping cellular origins of lung disease by cell-selective expression of disease genes.

a, Dot plots of expression of lung disease genes (numbered, associated disease shown above) enriched in specific lung cell types (SS2 datasets). Red, novel cell type association of gene or disease; grey, diseases with developmental phenotype. AWS, Alagille–Watson syndrome; BBS, Bardet–Biedl syndrome; CF, cystic fibrosis; Dys, dysplasia; EDS, Ehlers-Danlos syndrome; Fam Med, familial Mediterranean; IPF, idiopathic pulmonary fibrosis; PH, pulmonary hypertension; SGB, Simpson-Golabi-Behmel; SM, smooth muscle; SMD, surfactant metabolism dysfunction; TB, tuberculosis; VDES, Van den Ende-Gupta syndrome. b, Dot plot of expression (SS2 dataset) of all genes implicated in pulmonary hypertension, tuberculosis and COPD or emphysema (OMIM, Mendelian disease genes from OMIM database; GWAS, genome-wide association genes ≥ 10⁻²⁰ significance). Note canonical AT2 cells (red shading) express all and AT2-signalling cells (blue shading) express most. c, smFISH of alveolar section of adult human lung probed for pulmonary hypertension disease gene KCNK3 (red) and pericyte marker COX4I2 (white) with DAPI counterstain (blue) and ECM autofluorescence (green). Note pericyte-specific expression (arrowheads, 91% of COX4I2+ pericytes were KCNK3+, n = 77). Scale bar, 5 μm. Cell numbers for each type given in Supplementary Table 2. d, smFISH of alveolar section of adult human lung probed for atrioventricular (AV) dysplasia gene ACVRL1 (red), endothelial marker CLDN5 (white) with DAPI counterstain. Note ACVRL1 CLDN5 double-positive capillaries (white arrowheads, 70% of CLDN5+ capillaries were ACVRL1+, n = 102) and some CLDN5 single-positive capillaries (yellow arrowheads). Scale bar, 5 μm. e, smFISH of alveolar section of adult human lung probed for COPD or emphysema gene SERPINA1 and AT2 marker SFTPC, and DAPI. Note AT2-specific expression (arrowheads; 93% of AT2 cells were SERPINA1+, n = 176). Scale bar, 5 μm. For more details on statistics and reproducibility, see Methods.

Extended Data Fig. 9 Lung cell expression patterns of respiratory virus receptors.

a, Dot plot showing expression in human lung cell types of entry receptors (indicated at left) for respiratory viruses (indicated at right, numbers indicate viral families) (SS2 dataset). Red shading, cell types inhaled viruses could directly access (epithelial cells and macrophages); darker red shading shows expression values for measles receptor NECTIN4 and rhinovirus C receptor CDHR3. b, Violin plots (left) and dot plots (immediately above violin plots) showing expression of coronavirus receptors ACE2, DPP4, and ANPEP in lung cell types (10x dataset, cell numbers given in Supplementary Table 2). Grey shading, cell types inhaled viruses can directly access. Doughnut plots (right) showing relative number of receptor-expressing cells of cell types viruses can directly access (shaded grey in a), normalized by their abundance values from Supplementary Table 1 (and refined by the relative abundance values in Fig. 1 and Extended Data Figs. 3 and 4). Note prevalence of AT2 alveolar cells for ACE2, receptor for SARS-CoV and SARS-CoV-2, and for DPP4, receptor for MERS-CoV, in contrast to prevalence of macrophages for ANPEP, receptor for common cold causing coronavirus 229E. For more details on statistics and reproducibility, see Methods.
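One reading of the doughnut plots in panel b is that each cell type's share of receptor-expressing cells is its fraction of expressing cells weighted by its abundance in the lung, then rescaled so the shares sum to 1. The sketch below works through that arithmetic under this assumption. All numbers are hypothetical placeholders, not values from Supplementary Table 1, and the function name is illustrative.

```python
def abundance_weighted_shares(frac_expressing, abundance):
    """Share of receptor-expressing cells contributed by each cell type.

    frac_expressing[ct]: fraction of cells of type ct expressing the receptor.
    abundance[ct]:       relative abundance of type ct among accessible cells.
    """
    weighted = {ct: frac_expressing[ct] * abundance[ct] for ct in frac_expressing}
    total = sum(weighted.values())
    if total == 0:
        return {ct: 0.0 for ct in weighted}
    return {ct: w / total for ct, w in weighted.items()}

# Hypothetical inputs for one receptor.
frac_expressing = {"AT2": 0.02, "Ciliated": 0.05, "Macrophage": 0.0}
abundance = {"AT2": 0.5, "Ciliated": 0.1, "Macrophage": 0.4}
shares = abundance_weighted_shares(frac_expressing, abundance)
```

In this toy case the abundant cell type dominates the doughnut even though a smaller fraction of its cells express the receptor, which is why abundance weighting matters when asking which cells a virus is most likely to encounter.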

Extended Data Fig. 10 Lung cell expression patterns of non-respiratory virus receptors.

Dot plot of expression of entry receptors for non-respiratory viruses in human lung cell types (compare with Extended Data Fig. 9a showing expression of receptors for respiratory viruses). For more details on statistics and reproducibility, see Methods.

Extended Data Fig. 11 Comparison of mouse and human gene expression profiles in homologous lung cell types and across age.

a, Scatter plots showing median expression levels (ln(CPM + 1)) in indicated cell types of each expressed human gene and mouse orthologue (mouse and human SS2 datasets, human and mouse cell numbers given in Supplementary Tables 2 and 6, respectively). Note tens to hundreds of genes that show a 20-fold or greater expression difference (and P < 0.05, MAST) between species (red dots, gene names indicated for some and total number given above). Basophil/mast cell 1 (Bas/Ma 1) cells have the most differentially expressed genes (343), and CD4+ M/E T cells have the fewest (79). Pearson correlation scores (R values) between the average mouse and human gene expression profiles for each cell type are indicated. ‘Mm()’ and ‘Hs()’ denote genes in which duplications between mouse and human were collapsed to HomologyID. b, Heat map showing global transcriptome Pearson correlation between indicated human and mouse epithelial cells (SS2 dataset, human and mouse cell numbers given in Supplementary Tables 2 and 6, respectively). Red outline denotes homologous cell types based on classical markers described in Supplementary Table 6. White dot denotes human-to-mouse correlation. c, Dot plot of expression of canonical goblet cell markers MUC5B and MUC5AC and transcription factor SPDEF in mouse (left) and human (right) goblet cells. d, Scatter plot showing average expression levels (dots) across all cells (‘pseudo-bulk’ lung expression) of each expressed human gene and mouse orthologue (mouse and human SS2 datasets). Scale, ln(CPM + 1). Pearson correlation (R values) between the average mouse and human gene expression profiles are indicated. e, Scatter plots comparing median expression levels (ln(CPM + 1)) in indicated mouse lung cell types of each expressed gene at age 3 months (x axis) and 24 months (y axis) in SS2 datasets from Tabula Muris Senis40 (cell numbers given in Supplementary Table 6).
Pearson correlation scores between average gene expression profile for each cell type at each age are indicated (R values), along with number of genes (red dots) showing 20-fold or greater expression difference (and P < 0.05, MAST) between ages. Names of some genes are given next to the corresponding red dot. For more details on statistics and reproducibility, please see Methods.
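The quantities plotted in these scatter plots can be sketched in a few lines. The example assumes per-gene CPM values for one cell type in each species, applies the ln(CPM + 1) transform, computes a Pearson correlation, and flags genes differing by 20-fold or more. Two simplifications are assumptions of this sketch: the fold threshold is applied to the (CPM + 1) values, and the MAST significance test (P < 0.05) used in the figure is omitted. Gene names and values are hypothetical.

```python
import math

def ln_cpm_plus_one(cpm):
    """Map each gene's CPM value to ln(CPM + 1)."""
    return {gene: math.log1p(value) for gene, value in cpm.items()}

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def fold_changed_genes(human, mouse, fold=20.0):
    """Genes whose log-scale values differ by at least ln(fold)."""
    threshold = math.log(fold)
    return sorted(
        g for g in human if g in mouse and abs(human[g] - mouse[g]) >= threshold
    )

# Hypothetical CPM values for four orthologous genes in one cell type.
human_cpm = {"GENE_A": 100.0, "GENE_B": 50.0, "GENE_C": 2000.0, "GENE_D": 10.0}
mouse_cpm = {"GENE_A": 120.0, "GENE_B": 40.0, "GENE_C": 5.0, "GENE_D": 12.0}

human = ln_cpm_plus_one(human_cpm)
mouse = ln_cpm_plus_one(mouse_cpm)
genes = sorted(human)
r = pearson_r([human[g] for g in genes], [mouse[g] for g in genes])
divergent = fold_changed_genes(human, mouse)
```

On the log scale a 20-fold difference is a fixed vertical or horizontal offset of ln(20) from the diagonal, which is why the red dots in the scatter plots sit outside a band around the identity line.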

Extended Data Fig. 12 Patterns of conserved and divergent gene expression across human and mouse lung cell types.

a, Dot plots of PTPRC and MYL6 expression in mouse and human lung cell types (SS2 datasets) showing two examples of conserved (type 0) expression pattern. Blue shading, homologous cell types with conserved expression. b, Dot plots showing gain of expression (type 1 change) in several human cell types of RNASE1 (left) and all human cell types of TRIM38 (right). Red shading, cell types with divergent (gained) expression. c, Alveolar section of adult mouse lung probed by smFISH for general alveolar epithelial marker Nkx2-1, AT2 marker Sftpc, and transcription factor Myrf. Note Myrf is selectively expressed in mouse AT1 cells (Nkx2-1+ Sftpc− cells), as it is in humans (Extended Data Fig. 5c). Scale bar, 5 μm. Staining repeated on three mice. d, Dot plots of expression of CGRP and ADM hormone receptor genes showing expansion of expression (type 2 change) in human endothelial cells (10x datasets). e, Dot plots of expression of emphysema-associated gene SERPINA1 showing switched expression (type 3 change) from mouse pericytes (top) to human AT2 cells (bottom) (SS2 datasets). f, Dot plots comparing expression and conservation of HHIP with those of other Hedgehog pathway genes including ligands (SHH, DHH, IHH), receptors (PTCH1, PTCH2, SMO), and transducers (GLI1, GLI2, GLI3) (SS2 datasets). g, Dot plots of expression of serous cell markers LTF, LYZ, BPIFB1 and HP showing switched expression (type 3 change) from mouse airway epithelial cells to human serous cells, which mice lack (asterisk). Dot plots of expression of lipid handling genes APOE, PLIN2 and FST show switched expression (type 3 change) from mouse alveolar stromal cells to human lipofibroblasts, which mice lack (asterisks). ‘Mm()’ or ‘Hs()’, genes in which duplications between mouse and human were collapsed to HomologyIDs (10x and SS2 datasets). h, Pie chart of fraction of expressed genes in lung showing each of the four types of evolutionary changes in cellular expression patterns from mouse to human.
Histogram below shows number of lung cell types that the 602 genes with perfectly conserved cellular expression patterns (type 0) are expressed in; note that almost all are expressed in either a single cell type (67%) or nearly all cell types (33%). For more details on statistics and reproducibility, see Methods.

Supplementary information

Supplementary Data

Supplementary Figure 1. FACS gating strategies for human lung and peripheral blood cells. a, Sequential FACS data and sorting gates (red) for dissociated human lung cells from subject sample D1b (plate B001223) following MACS depletion of highly abundant immune (CD45+) and endothelial (CD31+) cells. The final sort (right) was of viable single cells from the lung epithelial (EPCAM+CD45-), immune (CD45+EPCAM-), and stromal/endothelial (EPCAM-CD45-) compartments into 384-well plates for SS2 scRNAseq. Plots are representative of FACS repeated on 3 subjects used for scRNAseq. b, Sequential FACS data and sorting gates (red) for white blood cells isolated on a Ficoll gradient of matched subject peripheral blood (subject 1, plate BP1). Viable, single CD235a- (non-RBC) cells were captured without additional gating (panel 4), or further sorted as CD8 T (CD8+; panel 8), CD4 T (CD4+; panel 7), B (CD19+CD3-; panel 6), NK (CD19-CD3-CD56+CD14-; panel 9), or CD14+ monocytes (CD19-CD3-CD56-CD14+; panel 9) for SS2 scRNAseq. Contours, 5% increments in cell density. Sorting on blood cells was done only for subject 1.

Reporting Summary

Supplementary Table

Supplementary Table 1. Canonical cell types (45) in the human lung and their abundances, markers, and available expression data. a, Numbers of each type were calculated with their abundances and the total number of lung cells (estimated by comparing volume of lungs to the whole body). b, Canonical markers were obtained from referenced expression data or commonly used markers in the literature. c, Expression profiles captured immediately following tissue dissociation are considered primary. d, Alveoli were assumed to occupy ~90% of the total lung volume for all estimations. e, Inferred from mean relative abundance in proximal, medial and distal airway epithelium. f, Calculated by stereology. g, Resin casts showed similar surface area of arteries and veins. h, Vascular smooth muscle is estimated to be slightly more abundant than airway smooth muscle. i, Abundance of a more general cell type was split evenly. j, Inferred from impression of light or electron microscopy. k, Inferred from histological abundance in nonperfused healthy tissue. l, Inferred from abundance among immune cells with FACS. m, Calculated using microfluidic capture.

Supplementary Table 2. Human lung cell cluster identities, abundances, and locations. Cell numbers are stratified by type, subject, and sequencing technology. Cell abbreviations indicated are used throughout.

Supplementary Table 3. Surface markers used to isolate canonical immune cell types in bulk mRNA sequencing. See Methods for details on antibodies.

Supplementary Table 4. Enriched markers found in each cluster, with transcription factors, receptors/ligands, and disease associated genes annotated. Includes all enriched genes for each type (p-val < 0.05, MAST, downsampled cells per group to 100 for SS2 clusters or 500 for 10x clusters). Abbreviations: avg_logFC, the natural log of the average fold change between the cell type and other cell types in its tissue compartment; pct_in_cluster, percentage of cells within the cluster that express the gene; pct_out_cluster, percentage of cells outside cluster that express the gene; p_val_adj, p-value with Bonferroni correction applied; TF, transcription factor; OMIM, Online Mendelian Inheritance in Man; GWAS, genome wide association study.
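The columns defined above can be illustrated with a minimal sketch for a single gene. It assumes expression values for cells inside and outside a cluster, a raw p-value supplied by some differential expression test (MAST in the table, not reimplemented here), and the number of genes tested. The pseudocount of 1 in the fold change and the function name are assumptions of this sketch; the exact formula used to produce the table is not stated in this section.

```python
import math

def summarize_gene(in_vals, out_vals, p_val, n_tests):
    """Compute table-style summary columns for one gene in one cluster."""
    mean_in = sum(in_vals) / len(in_vals)
    mean_out = sum(out_vals) / len(out_vals)
    return {
        # Natural log of the average fold change (pseudocount of 1 assumed).
        "avg_logFC": math.log((mean_in + 1) / (mean_out + 1)),
        # Percentage of cells with non-zero expression inside the cluster.
        "pct_in_cluster": 100 * sum(v > 0 for v in in_vals) / len(in_vals),
        # Percentage of cells with non-zero expression outside the cluster.
        "pct_out_cluster": 100 * sum(v > 0 for v in out_vals) / len(out_vals),
        # Bonferroni correction: multiply by number of tests, cap at 1.
        "p_val_adj": min(1.0, p_val * n_tests),
    }

# Hypothetical values for one gene.
row = summarize_gene(
    in_vals=[3, 5, 0, 4],
    out_vals=[0, 0, 1, 0, 0, 0, 0, 3],
    p_val=1e-6,
    n_tests=20000,
)
```

Because Bonferroni correction scales with the number of genes tested, a gene needs a raw p-value far below 0.05 to remain significant after adjustment.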

Supplementary Table 5. P-value and scores of each CellPhoneDB Receptor-Ligand interaction from each cluster. Expression scores are given for significant (p < 0.1, CellPhoneDB statistical framework) pairwise interactions between all SS2 cells (cell numbers given in Supplementary Table 2). Table includes all 1,085 receptor-ligand pairs and all pairwise cell type combinations. Abbreviations: partner_a, cellphonedb ID for the first interaction partner protein; partner_b, cellphonedb ID for the second interaction partner protein; source, reference from cellphonedb; secreted, whether the ligand is secreted or membrane bound; is_integrin, whether the interaction includes an integrin complex.

Supplementary Table 6. Mouse lung cell cluster identities and their abundances in each dataset. Cell numbers are stratified by type, mouse, and sequencing technology. Mouse-specific molecular markers are indicated.

Supplementary Table 7. Genes specific to mouse and human in each cluster and lung wide. Includes all differentially expressed genes (p-val < 0.05, MAST) for all SS2 cells in indicated mouse and human clusters (human and mouse cell numbers given in Supplementary Tables 2 and 6, respectively). Abbreviations: avg_logFC, the natural log of the average fold change between the mouse and human cell type indicated; pct_mouse, percentage of mouse cells within the cluster that express the gene; pct_human, percentage of human cells within the cluster that express the gene; p_val_adj, p-value with Bonferroni correction applied; enriched, gene is enriched in cluster in mouse or human.

Supplementary Table 8. Evolutionary changes in cellular patterns of lung gene expression between mouse and human SS2 cells. Numbers of genes and percentages are mean (±SEM) from using a median expression cutoff minus 0 to 2 standard deviations (0.25 increments) and a median percent cutoff plus 0 to 2 standard deviations (0.25 increments). Human and mouse cell numbers given in Supplementary Tables 2 and 6, respectively.
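The cutoff sweep described above can be sketched as follows. The example assumes a single list of per-gene expression values, lowers the expression cutoff from the median by 0 to 2 standard deviations in 0.25 steps, counts the genes passing each cutoff, and reports the mean and SEM of those counts. The toy values are hypothetical, only the expression cutoff is swept, and how the expression and percent cutoffs were combined in the actual analysis is not specified here.

```python
import math
import statistics

# Offsets of 0 to 2 standard deviations in 0.25 increments (9 settings).
offsets = [0.25 * i for i in range(9)]

def mean_sem(values):
    """Mean and standard error of the mean of a list of numbers."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem

# Hypothetical per-gene expression values.
toy_expr = [0.1, 0.4, 0.8, 1.2, 1.5, 2.0, 2.6, 3.1, 4.0, 5.5]
median = statistics.median(toy_expr)
sd = statistics.stdev(toy_expr)

# Number of genes counted as expressed at each progressively looser cutoff.
counts = [sum(v >= median - k * sd for v in toy_expr) for k in offsets]
mean_count, sem_count = mean_sem(counts)
```

Reporting the mean with its SEM over the sweep indicates how sensitive the gene counts are to the choice of cutoff, rather than tying the result to one arbitrary threshold.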

Supplementary Table 9. Evolutionary and functional classes of genes. Includes genes conserved between mouse and human. Abbreviations: Evo type, evolutionary scenario gene falls into, with NA for genes not expressed; Gene class, type of gene (receptor, ligand, enzyme, transcription factor, etc.); conserved clusters, number of homologous clusters where the gene is expressed in both mouse and human; Human specific clusters, number of homologous clusters where the gene is only expressed in human; Mouse specific clusters, as above for mouse.

Cite this article

Travaglini, K.J., Nabhan, A.N., Penland, L. et al. A molecular cell atlas of the human lung from single-cell RNA sequencing. Nature 587, 619–625 (2020). https://doi.org/10.1038/s41586-020-2922-4

  • DOI: https://doi.org/10.1038/s41586-020-2922-4
