A spatial transcriptomics dataset of pancreas sections in normal glucose tolerance and type 2 diabetic donors

Howell, Nick; Weiss, Zoe; Bonnycastle, Lori L.; Grenko, Caleb M.; Randazzo, Davide; Dampier, Christopher H.; Sinha, Neelam; Narisu, Narisu; Swift, Amy J.; Erdos, Michael R.; Biesecker, Leslie G.; Collins, Francis S.; Robertson, Catherine C.; Taylor, D. Leland

doi:10.1038/s41597-025-05450-6

Data Descriptor
Open access
Published: 01 September 2025

Nick Howell ORCID: orcid.org/0009-0004-2255-073X¹^na1,
Zoe Weiss¹^na1,
Lori L. Bonnycastle¹^na1,
Caleb M. Grenko^1,2,
Davide Randazzo³,
Christopher H. Dampier⁴,
Neelam Sinha¹,
Narisu Narisu¹,
Amy J. Swift¹,
Michael R. Erdos¹,
Leslie G. Biesecker¹,
Francis S. Collins¹^na2,
Catherine C. Robertson^1,5^na2 &
D. Leland Taylor¹^na2

Scientific Data volume 12, Article number: 1526 (2025)

Abstract

Understanding the spatial distribution of gene expression in the pancreas is essential for establishing the molecular basis of pancreatic function in healthy and disease contexts. Recent platforms offer a robust method for quantifying gene expression within a spatial context. Here, we report spatial transcriptomic profiling from pancreas samples obtained from three donors with type 2 diabetes (T2D) and three donors with normal glucose tolerance (NGT). Our analysis identified a major technical challenge: substantial transcript bleed of highly abundant genes (e.g., INS and GCG) into adjacent tissue regions. We demonstrate that this bleed can be computationally corrected using probabilistic models. Our analysis highlights the importance of incorporating bleed-correction techniques in the preprocessing of spatial transcriptomic profiling data. In summary, this study provides a dataset, methods, and resources to investigate the spatial regulation of gene expression in normal and T2D-affected human pancreas.

Background & Summary

The pancreas is a dual-purpose organ with exocrine (digestion) and endocrine functions (hormone secretion for blood glucose regulation)¹. Endocrine cell types and functions are localized to pancreatic islets, mini-organs that control glucose homeostasis through tightly regulated secretion of endocrine hormones including insulin and glucagon². Dysregulated hormone secretion in pancreatic islets is a hallmark of T2D. Spatial mapping of gene expression within the pancreas could improve our understanding of the molecular basis of islet dysfunction leading to T2D.

Historically, studies of spatial trends in gene expression relied on techniques like fluorescence in situ hybridization³. However, such methods have inherent limitations, such as low throughput capacity and a limited number of markers. Recent advances in spatial transcriptomics have greatly increased the resolution and scale for studying spatial gene expression (reviewed in Rao et al.⁴). These spatial transcriptomics technologies—pairing traditional histology with spatial RNA sequencing⁵—have enabled exploration into the spatially resolved transcriptional patterns simultaneously across many genes and tissues^6,7. To date, few studies using spatial platforms have considered the human pancreas in the normal and diabetic context⁸. To investigate how spatial patterns change between donors with normal glucose tolerance (NGT) and type 2 diabetes (T2D) (Table 1), we generated a dataset of three NGT and three T2D donors, with replication, using the 10x Genomics Visium Spatial Gene Expression v1 platform.

Table 1 Donor metadata.

Methods

Source of human pancreas

Human pancreas samples were obtained from Prodo Laboratories (Aliso Viejo, CA). These samples were isolated from cadaveric donors whose organs were consented for research. As per the National Institutes of Health (NIH) Office of Human Subjects Research Protection (OHSRP) policy, tissue obtained from deceased individuals does not fall under the guidelines of human subject research. All experimental protocols performed for this study were approved under NIH guidelines. Approximately 1.5 cm³ of pancreas tissue was excised from the tail of each cadaveric pancreas. The tissue was dipped in 1% chlorhexidine and washed with saline to remove contaminants. The tissue was then rinsed in Krebs-Ringer solution containing RNase inhibitor (Sigma R7397) and prepared for cryosectioning by embedding it in optimal cutting temperature (OCT) compound, using isopentane cooled to liquid nitrogen temperature. The frozen tissue was placed into a cryovial cooled on dry ice and stored at −80 °C for up to 72 hours before shipping to our laboratory on dry ice. Upon receipt, we placed the frozen sample in a liquid nitrogen tank for long-term storage.

Sample preparation, fixation, and staining

We embedded frozen pancreatic tissue samples into OCT compound, solidified them in an isopentane ice bath to form individual OCT blocks, and maintained the blocks at −15 °C to −20 °C for processing or at −80 °C for storage. We sectioned tissue OCT blocks into 10 µm sections and placed them onto Visium Spatial Tissue Optimization Slides (PN-3000394) or Visium Spatial Gene Expression Slides (PN-2000233) as recommended by the manufacturer (10x Genomics, Visium Spatial Protocols - Tissue Preparation Guide CG000240). We fixed and stained Visium slides with hematoxylin and eosin (H&E) according to the manufacturer’s instructions (10x Genomics, Methanol Fixation, H&E Staining & Imaging for Visium Spatial Protocols CG000160).

Microscopy for spatial transcriptomics analysis

Following fixation and H&E staining, we imaged the slides using a Leica DMi8 microscope equipped with a CMOS DMC6200 color camera with pixel shift technology (Leica), driven by the Leica LAS X software. We acquired images with an HC Plan Apochromatic CS2 10x/0.4 dry lens (Leica), with the camera exposure time set to 10 ms for all images and the digital resolution set to 9.2 megapixels (3,840 × 2,400 px, 8 bit). The navigator module of the LAS X software used a tile scanning approach to acquire the entire area of each section, including the fiducial markers. We captured images as .lef files and exported them as .tif files with lossless compression.

Optimization of tissue permeabilization conditions

After fixation, staining, and imaging, we performed a permeabilization time course followed by fluorescent cDNA synthesis and imaging. We determined an optimal incubation time of 18 minutes for tissue permeabilization (10x Genomics, Visium Spatial Gene Expression Reagent Kits - Tissue Optimization User Guide CG000238) (Fig. 1).

Spatial RNA capture, library construction, and sequencing

After staining and imaging, we processed Visium slides to generate spatially barcoded cDNA libraries as per the manufacturer’s instructions (10x Genomics, Visium Spatial Gene Expression Reagent Kits User Guide CG000239; Fig. 2). Briefly, we permeabilized the tissue, releasing mRNA for capture by primers on the Visium slide capture areas. Each capture area was 6.5 × 6.5 mm and contained 4,992 spots of spatial barcodes, where each spot is 55 μm in diameter. We reverse transcribed captured mRNAs into cDNA and coupled them to spatial barcodes during second strand synthesis. We transferred spatially barcoded cDNA sequences from the Visium slide to a tube for library construction. We pooled and sequenced dual-indexed libraries on the NovaSeq platform (Illumina, San Diego, CA, USA) with the following read lengths: Read1 + Index1 + Index2 + Read2 (28 + 10 + 10 + 90), where Index1 and Index2 are the sample indices, Read1 contains the 16 bp Spatial Barcode and 12 bp UMI, and Read2 contains the cDNA insert.
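The Read 1 layout described above (28 bp = 16 bp spatial barcode + 12 bp UMI) can be illustrated with a minimal Python sketch. This is not part of the processing pipeline, which relies on Space Ranger; the function name `split_read1` and the example sequence are hypothetical, and the sketch assumes the barcode precedes the UMI within the read.

```python
# Read 1 layout as described in the text: 16 bp spatial barcode + 12 bp UMI.
# Assumption: the barcode comes first, followed by the UMI.
BARCODE_LEN = 16
UMI_LEN = 12


def split_read1(read1: str) -> tuple[str, str]:
    """Split a Read 1 sequence into (spatial_barcode, umi)."""
    expected = BARCODE_LEN + UMI_LEN
    if len(read1) < expected:
        raise ValueError(f"Read 1 must be at least {expected} bp, got {len(read1)}")
    barcode = read1[:BARCODE_LEN]
    umi = read1[BARCODE_LEN:expected]
    return barcode, umi


# Made-up 28 bp sequence, for illustration only.
example_read1 = "ACGTACGTACGTACGT" + "TTGCAAGGCCTA"
barcode, umi = split_read1(example_read1)
print(barcode, umi)
```

The barcode assigns a read to one capture spot, while the UMI allows duplicate reads of the same captured molecule to be collapsed during counting.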

Data Records

We deposited images from the Visium experiments into the Gene Expression Omnibus (GEO) under accession ID GSE264331⁹. We deposited sequencing data (FASTQs from Illumina NovaSeq) and 10x Genomics Visium Space Ranger output (feature, barcode, and raw count matrices as h5 files) as part of the RNA-seq molecular dataset in the database of Genotypes and Phenotypes (dbGaP) under accession ID phs001188.v3.p1¹⁰. We deposited cloupe files compatible with 10x Genomics’ Loupe Browser software to Zenodo¹¹.

Sample naming conventions vary across data repositories. To address this, we have provided a key (Table 2) mapping sample IDs to donor/replicate numbering used in this data descriptor.

Table 2 Sample mapping across data repositories.

Technical Validation

Spatial transcriptomic data processing, islet annotation, and quality control procedures

We used Space Ranger (version 2.1.1) with default parameters to process the raw Visium spatial gene expression data and generate gene expression values associated with each spatial location¹². The gene expression data consist of gene-specific unique molecular identifier (UMI) counts, measured across thousands of 55 μm diameter spots with spot-specific barcodes on 6.5 × 6.5 mm Visium capture areas. Because sequencing depth varies from spot to spot in spatial transcriptomics assays, we normalized spot UMI counts by the total counts per spot and scaled them to counts per 10,000 (CP10k) using the normalize_total function in scanpy v1.10.1.
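To make the CP10k normalization concrete, the following NumPy sketch reproduces the arithmetic on a toy spot-by-gene matrix. It is an illustration only, not the code used for this dataset; the function name `normalize_cp10k` and the toy counts are hypothetical. With scanpy, the roughly equivalent call would be `sc.pp.normalize_total(adata, target_sum=1e4)`.

```python
import numpy as np


def normalize_cp10k(counts, target_sum=1e4):
    """Scale each spot (row) so that its counts sum to target_sum."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    # Spots with zero total counts are left as zeros instead of dividing by 0.
    safe_totals = np.where(totals == 0, 1.0, totals)
    return counts / safe_totals * target_sum


# Toy matrix: rows are spots, columns are genes (made-up values).
# Spots 0 and 1 have the same composition but a 10-fold depth difference.
toy_counts = np.array([
    [90, 10, 0],
    [900, 100, 0],
    [0, 0, 0],
])
cp10k = normalize_cp10k(toy_counts)
print(cp10k)
```

After normalization the first two spots are identical, which is the intended effect: differences in sequencing depth no longer masquerade as differences in expression.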

To verify tissue integrity and accuracy of spatial capture, we conducted a histological review of H&E stained images to identify major histological features, including islets, vasculature, and ducts (Fig. 3a). We compared the identified islet regions to the expression patterns of islet marker genes (e.g., INS for beta cells that reside in islets). We found a high overlap of the islet region labelled from H&E image review and the regions with the highest INS expression, validating the accurate identification of islets within the pancreatic tissue samples (Fig. 3a,b).
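The text does not state a numerical overlap metric. One possible way to quantify the agreement between H&E-based islet labels and high INS expression is a Jaccard index between annotated spots and the top-expressing spots, sketched below. The function name, the `top_fraction` threshold, and the toy values are all assumptions made for illustration.

```python
import numpy as np


def jaccard_with_top_expression(is_islet, expression, top_fraction=0.1):
    """Jaccard index between annotated spots and the top-expressing spots."""
    is_islet = np.asarray(is_islet, dtype=bool)
    expression = np.asarray(expression, dtype=float)
    # Spots at or above the (1 - top_fraction) quantile count as "high".
    threshold = np.quantile(expression, 1.0 - top_fraction)
    is_high = expression >= threshold
    intersection = np.sum(is_islet & is_high)
    union = np.sum(is_islet | is_high)
    return float(intersection / union) if union else 0.0


# Toy example: 10 spots, the first two annotated as islet (made-up values).
islet_labels = [True, True] + [False] * 8
ins_cp10k = [500, 450, 20, 15, 10, 12, 8, 9, 11, 14]
jaccard = jaccard_with_top_expression(islet_labels, ins_cp10k, top_fraction=0.2)
print(jaccard)
```

A value near 1 indicates that the annotated islet spots and the INS-high spots largely coincide; the appropriate `top_fraction` would depend on the islet area in each section.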

For each sample, our dataset contained two to four tissue slices. We evaluated the number of spots per tissue as an indirect measure of the area of each section (Fig. 3c). We found consistent results across replicates, in line with our expectations as these samples were sequential or near-sequential sections (Fig. 4).

Identification and cleaning of transcript bleed artifacts

While the spatial distribution of INS expression was largely concordant with manually annotated islets from H&E images (Fig. 3a,b), visual inspection of INS expression revealed substantial levels of insulin transcripts across the entire tissue section, including exocrine regions. This observation suggested a degree of transcript diffusion during sample preparation (Fig. 3b). To correct for such transcript bleed, we used SpotClean¹³ (version 1.4.1), a computational approach that uses probabilistic models to adjust spatial gene expression data for transcript bleed (Fig. 5).
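The idea of transcript bleed can be illustrated with a deliberately simplified toy model in Python. This is not SpotClean's algorithm: SpotClean is an R package that estimates its parameters from the data, whereas the sketch below assumes a known bleed rate and a known Gaussian kernel on a hypothetical one-dimensional row of spots. All names and numbers are illustrative assumptions.

```python
import numpy as np


def bleed_matrix(positions, bleed_rate, bandwidth):
    """Mixing matrix: a fraction bleed_rate of each spot's transcripts is
    redistributed to all spots with Gaussian weights on distance."""
    positions = np.asarray(positions, dtype=float)
    dist = np.abs(positions[:, None] - positions[None, :])
    kernel = np.exp(-0.5 * (dist / bandwidth) ** 2)
    # Normalize each source column so that bled transcripts are conserved.
    kernel /= kernel.sum(axis=0, keepdims=True)
    return (1.0 - bleed_rate) * np.eye(len(positions)) + bleed_rate * kernel


# Hypothetical row of 7 spots; only the central "islet" spot expresses INS.
positions = np.arange(7) * 100.0          # spot centres in µm (assumed spacing)
true_counts = np.array([0, 0, 0, 1000, 0, 0, 0], dtype=float)

mixing = bleed_matrix(positions, bleed_rate=0.3, bandwidth=150.0)
observed = mixing @ true_counts            # INS now appears in every spot
recovered = np.linalg.solve(mixing, observed)  # invert the known toy model

print(np.round(observed, 1))
print(np.round(recovered, 1))
```

In this toy setting the correction is exact because the mixing matrix is known and there is no noise. With real count data neither holds, which is why a probabilistic model fitted to the data is used instead.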

After transcript bleed-correction, cell type marker genes for endocrine and exocrine cell types were more cleanly divided into distinct compartments, which more faithfully reflected established understanding of human islets as distinct endocrine cell clusters, delineated by a basement membrane, surrounded by primarily exocrine tissue (dominated by acinar cells) (Fig. 6)¹⁴. These findings indicated that bleed correction can validate the quality of spatial transcriptomic datasets, and users who access the deposited raw data can replicate this result by following our methods.

Concluding remarks

In summary, this dataset offers spatial transcriptomic profiles of the human pancreas from both NGT and T2D donors, at 55 μm resolution. Our initial observations of these profiles identified substantial bleed of abundant islet genes, but probabilistic computational correction mitigated this technical artifact effectively.

Limitations of these data include the limited spatial resolution of the Visium Spatial Gene Expression v1 assay and the limited number of donors profiled. The Visium v1 platform spatially barcodes transcripts in a 6.5 × 6.5 mm capture area covered by 4,992 spots, each 55 μm in diameter. At this resolution, data from each spot represented transcriptomes from multiple cells. Recently released assays improve resolution to a near single-cell level¹².
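The geometry quoted above also implies that the spots cover only part of the capture area. The short calculation below is a sketch using only the numbers given in the text; it assumes the spots are non-overlapping circles and that the full 6.5 × 6.5 mm square is the reference area, which ignores the fiducial frame.

```python
import math

# Values taken from the text.
CAPTURE_SIDE_MM = 6.5
N_SPOTS = 4992
SPOT_DIAMETER_UM = 55.0

# Area of one circular spot, converting µm to mm.
spot_radius_mm = SPOT_DIAMETER_UM / 1000.0 / 2.0
spot_area_mm2 = math.pi * spot_radius_mm ** 2

covered_mm2 = N_SPOTS * spot_area_mm2
capture_area_mm2 = CAPTURE_SIDE_MM ** 2
covered_fraction = covered_mm2 / capture_area_mm2

print(f"Spot-covered area: {covered_mm2:.2f} mm^2 of {capture_area_mm2:.2f} mm^2")
print(f"Covered fraction: {covered_fraction:.1%}")
```

Under these assumptions a bit more than a quarter of the capture area lies directly under a spot, so transcripts from tissue between spots are not assigned to any barcode.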

While exploratory analyses of transcriptional changes in T2D pancreas are a primary application of these data, a larger sample size will be required to overcome donor-to-donor variation when making these comparisons. Because we have demonstrated the integrity of these data, they may be combined and meta-analyzed with future datasets towards this aim. Moreover, these data alone may be suitable for targeted analyses, including validating the presence or spatial distribution of genes of interest. Notably, since these spatial profiles represent direct transfer of RNA from fresh frozen pancreas and are not dependent on pre-defined human transcriptome probe sets, this resource may be used to detect unannotated and/or non-coding transcripts of interest, which may not be detectable with probe-based RNA assays or protein-based assays (e.g., immunofluorescence imaging) of fixed tissue. Third-party software and frameworks enable continued exploration of spatial patterns across tissues in Visium v1 data. Deep learning models¹⁵ can improve resolution by predicting transcript counts in inter-spot space on a capture slide. Computational methods, like those provided by the Spatial-eXpression-R (spacexr) library^16,17, can deconvolute cell types in spots and predict differential expression across spatial axes. Paired with tools like these, the spatially resolved transcriptional profiles of human pancreas presented here may yield additional insights about healthy and T2D pancreas biology.

Code availability

The code used for data processing and bleed correction is available at https://github.com/CollinsLabBioComp/publication-visium_preprocessing.

References

1. Leung, P. S. Overview of the pancreas. Adv. Exp. Med. Biol. 690, 3–12 (2010).
2. Campbell, J. E. & Newgard, C. B. Mechanisms controlling pancreatic islet cell function in insulin secretion. Nat. Rev. Mol. Cell Biol. 22, 142–158 (2021).
3. Simonis, M. & de Laat, W. FISH-eyed and genome-wide views on the spatial organisation of gene expression. Biochim. Biophys. Acta 1783, 2052–2060 (2008).
4. Rao, A., Barkley, D., França, G. S. & Yanai, I. Exploring tissue architecture using spatial transcriptomics. Nature 596, 211–220 (2021).
5. Spatial Gene Expression - 10x Genomics. https://www.10xgenomics.com/products/spatial-gene-expression.
6. Rao, N., Clark, S. & Habern, O. Bridging genomics and tissue pathology. Genetic Engineering & Biotechnology News 40, 50–51 (2020).
7. Ståhl, P. L. et al. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science 353, 78–82 (2016).
8. Cui Zhou, D. et al. Spatially restricted drivers and transitional cell populations cooperate with the microenvironment in untreated and chemo-resistant pancreatic cancer. Nat. Genet. 54, 1390–1405 (2022).
9. Bonnycastle, L. L. et al. GEO. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE264331 (2024).
10. Boehnke, B. et al. The Finland-United States Investigation of NIDDM Genetics (FUSION) Study - Islet Expression and Regulation by RNAseq and ATACseq. dbGaP https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001188.v3.p1.
11. Howell, N. et al. A spatial transcriptomics dataset of pancreas sections in normal glucose tolerance and type 2 diabetic donors. Zenodo https://doi.org/10.5281/zenodo.15177311 (2025).
12. Spatial Transcriptomics - 10x Genomics. https://www.10xgenomics.com/spatial-transcriptomics.
13. Ni, Z. et al. SpotClean adjusts for spot swapping in spatial transcriptomics data. Nat. Commun. 13, 2971 (2022).
14. van Gurp, L. et al. Generation of human islet cell type-specific identity genesets. Nat. Commun. 13, 2020 (2022).
15. Monjo, T., Koido, M., Nagasawa, S., Suzuki, Y. & Kamatani, Y. Efficient prediction of a spatial transcriptomics profile better characterizes breast cancer tissue sections without costly experimentation. Sci. Rep. 12, 4133 (2022).
16. Cable, D. M. et al. Robust decomposition of cell type mixtures in spatial transcriptomics. Nat. Biotechnol. 40, 517–526 (2022).
17. Cable, D. M. et al. Cell type-specific inference of differential expression in spatial transcriptomics. Nat. Methods 19, 1076–1087 (2022).

Acknowledgements

The NIH NIAMS Genomic Technology Section supported sequencing efforts for this study. This research was supported by the US National Institutes of Health grants ZIAHG000024 (to F.S.C. and L.G.B.) and K99DK13917501 (to D.L.T.).

Funding

Open access funding provided by the National Institutes of Health.

Author information

These authors contributed equally: Nick Howell, Zoe Weiss, Lori L. Bonnycastle.
These authors jointly supervised this work: Francis S. Collins, Catherine C. Robertson, D. Leland Taylor.

Authors and Affiliations

Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
Nick Howell, Zoe Weiss, Lori L. Bonnycastle, Caleb M. Grenko, Neelam Sinha, Narisu Narisu, Amy J. Swift, Michael R. Erdos, Leslie G. Biesecker, Francis S. Collins, Catherine C. Robertson & D. Leland Taylor
Graduate School of Biomedical Sciences, Mayo Clinic, Rochester, MN, 55905, USA
Caleb M. Grenko
Office of Science and Technology, Light Imaging Section, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
Davide Randazzo
Laboratory of Pathology, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
Christopher H. Dampier
Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
Catherine C. Robertson

Contributions

L.L.B., C.C.R., D.L.T. and F.S.C. designed the research. L.L.B., A.J.S., C.M.G., M.R.E. and D.R. generated data for this study. N.H., Z.W., N.N., N.S., C.D., C.C.R. and D.L.T. analyzed data. L.L.B., C.C.R., D.L.T., L.G.B. and F.S.C. supervised the study. N.H., Z.W., C.C.R. and D.L.T. wrote the paper. All authors reviewed and approved the paper.

Corresponding authors

Correspondence to Michael R. Erdos or Francis S. Collins.

Ethics declarations

Competing interests

L.G.B. is a member of the Illumina Medical Ethics Board, receives research support from Merck, Inc, and royalties from Wolters-Kluwer.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Dataset 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Howell, N., Weiss, Z., Bonnycastle, L.L. et al. A spatial transcriptomics dataset of pancreas sections in normal glucose tolerance and type 2 diabetic donors. Sci Data 12, 1526 (2025). https://doi.org/10.1038/s41597-025-05450-6

Received: 29 January 2025
Accepted: 24 June 2025
Published: 01 September 2025
Version of record: 01 September 2025
DOI: https://doi.org/10.1038/s41597-025-05450-6