Extended Data Fig. 2: Data processing and quality control of the STARmap PLUS data analysis pipeline. | Nature Neuroscience

Extended Data Fig. 2: Data processing and quality control of the STARmap PLUS data analysis pipeline.

From: Integrative in situ mapping of single-cell transcriptional states and tissue histopathology in a mouse model of Alzheimer’s disease

a, Examples of the final imaging cycle, which detects cell nuclei, cDNA amplicons, and protein signals, in 13-month control (left, N = 2 independent animals) and TauPS2APP (right, N = 2 independent animals) mouse brain samples. Blue, propidium iodide (PI) staining of cell nuclei. Green, fluorescent DNA probe staining of all cDNA amplicons. White, X-34 staining of amyloid-β plaques. Red, immunofluorescent staining of p-tau (AT8 primary antibody followed by fluorescent goat anti-mouse secondary antibody). b, Flowchart of the STARmap PLUS data analysis pipeline. c, Violin plot showing the accuracy (correct rate) of SEDAL sequencing for each FOV across all samples (96.87% ± 5.00%). d, Histograms showing the ln-transformed number of transcripts (left) and genes (right) per cell in the 2,766-gene dataset before quality control. Red vertical lines represent median values. e, Histogram showing the number of transcripts after logarithmic transformation in the 2,766-gene dataset. Red vertical lines represent the filtering thresholds estimated by median absolute deviation (MAD). f, Number of transcripts and genes across samples. Violin plots showing the distribution of the number of reads per cell (top) and genes per cell (bottom) detected in each sample after quality control (N = 72,165 cells). Box plots depict the median (center) and interquartile range (IQR, bounds of the box), with whiskers extending to either the maxima/minima or to the median ± 1.5× IQR, whichever is nearest. g, Number of transcripts and genes across major cell types. Violin plots showing the distribution of the number of reads per cell (top) and genes per cell (bottom) detected in each major cell type (N = 72,165 cells). Box plots as in (f).
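
The MAD-based filtering described in panel e can be illustrated with a minimal sketch. This is not the authors' code: the function names `mad_thresholds` and `filter_cells` are hypothetical, and the multiplier `n_mads = 3` is an assumption, since the legend does not state how many MADs from the median were used. The sketch takes per-cell transcript counts, applies a natural-log transform, and keeps cells whose ln-transformed count lies within median ± `n_mads` × MAD.

```python
import numpy as np


def mad_thresholds(counts, n_mads=3.0):
    """Return (lower, upper) bounds on ln(counts) at median +/- n_mads * MAD.

    n_mads is an illustrative assumption; the figure legend does not give it.
    """
    counts = np.asarray(counts, dtype=float)
    if np.any(counts <= 0):
        raise ValueError("counts must be positive for the ln transform")
    log_counts = np.log(counts)
    median = np.median(log_counts)
    # Median absolute deviation of the ln-transformed counts (unscaled).
    mad = np.median(np.abs(log_counts - median))
    return median - n_mads * mad, median + n_mads * mad


def filter_cells(counts, n_mads=3.0):
    """Return a boolean mask of cells whose ln(count) lies within the bounds."""
    counts = np.asarray(counts, dtype=float)
    lower, upper = mad_thresholds(counts, n_mads)
    log_counts = np.log(counts)
    return (log_counts >= lower) & (log_counts <= upper)


if __name__ == "__main__":
    # Toy per-cell transcript counts with one very low and one very high cell.
    toy_counts = [100, 110, 120, 130, 140, 1, 100000]
    print(mad_thresholds(toy_counts))
    print(filter_cells(toy_counts))
```

With the toy counts above, the five cells near the median are retained and the two extreme cells are excluded. A median-based spread such as the MAD is commonly chosen for this kind of thresholding because, unlike the standard deviation, it is not inflated by the outliers it is meant to detect.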
