Extended Data Fig. 2: Machine learning model trained with the mouse early postnatal hippocampal scRNA-seq dataset. | Nature

From: Molecular landscapes of human hippocampal immature neurons across lifespan

a, b, Unsupervised clustering and t-distributed stochastic neighbor embedding (t-SNE) visualization of all cells from the mouse postnatal (P5) hippocampus (ref. 9), colored by cluster (a) and marker gene expression (b). imGC: immature dentate granule cell; GC: dentate granule cell; IPC: intermediate progenitor cell; OPC: oligodendrocyte precursor cell; RGL: radial glia-like cell; VLMC: vascular and leptomeningeal cell. c, Schematic illustration of the machine learning-aided analysis using the mouse hippocampal scRNA-seq datasets (ref. 9), mirroring our analysis pipeline in human studies (Fig. 1a). In brief, Dcx+Calb1−Prox1+ imGCs in the P5 mouse dentate gyrus were selected as prototypes to train a scoring model that learns their gene features. The trained model, an aggregate of weighted features (“gene weights”), was then used to quantify the similarity of each cell to the imGC prototype in query (test) datasets of the early postnatal (P5; self-scoring), juvenile (P12–35) and adult (P120–132) hippocampus (ref. 9). To assess the efficacy of our method, we classified cells with high similarity scores to the imGC prototype as imGCs and compared our model classifications to the published annotations based on unsupervised clustering (ref. 9; shown in Extended Data Fig. 3). d, Performance of the machine learning model. Line plot showing the cross-validated accuracy score of the classifier as regularization strength decreases. The red line shows the 95% confidence interval of the accuracy estimate. #Sum abs (coeffs): sum of the absolute values of the regression coefficients. e, Heatmap showing expression of top-weighted genes in the top-scoring cells of each prototype, as determined by the machine learning model. Genes listed are the 25 top-weighted genes defining mouse imGCs. f, Wheel plot visualizing the score of each cell for each prototype. Dots represent individual cells whose distance to each prototype is proportional to the score of that prototype. Red and lime green dots represent the prototypical imGCs and all other GCs, respectively. The dotted line indicates a similarity score of 0.85 to each prototypical cell type. Note that, unlike in the human system (Fig. 1c), no mature oligodendrocyte (mOli) cluster was present in the P5 mouse hippocampus.
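The scoring and wheel-plot steps described in panels c and f can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prototype list, the example scores and the layout rule (each cell placed at the score-weighted average of prototype anchors spaced evenly on a unit circle) are assumptions made for illustration only. The single value taken from the legend is the 0.85 similarity cutoff.

```python
import math

# Hypothetical prototype names; the full prototype set of the model is not given in the legend.
PROTOTYPES = ["imGC", "GC", "IPC", "RGL"]
THRESHOLD = 0.85  # similarity-score cutoff quoted in the legend


def classify(scores, threshold=THRESHOLD):
    """Return the best-scoring prototype if its score reaches the threshold, else None."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


def wheel_position(scores, prototypes=PROTOTYPES):
    """Assumed layout: score-weighted average of prototype anchors on a unit circle."""
    total = sum(scores[p] for p in prototypes)
    if total == 0:
        return (0.0, 0.0)  # no similarity to any prototype: centre of the wheel
    x = y = 0.0
    for i, p in enumerate(prototypes):
        angle = 2 * math.pi * i / len(prototypes)  # anchor angle of prototype i
        x += scores[p] * math.cos(angle)
        y += scores[p] * math.sin(angle)
    return (x / total, y / total)


# Made-up scores for three illustrative cells (not data from the paper).
cells = {
    "cell_1": {"imGC": 0.95, "GC": 0.03, "IPC": 0.01, "RGL": 0.01},
    "cell_2": {"imGC": 0.40, "GC": 0.55, "IPC": 0.03, "RGL": 0.02},
    "cell_3": {"imGC": 0.02, "GC": 0.01, "IPC": 0.07, "RGL": 0.90},
}

labels = {name: classify(s) for name, s in cells.items()}
positions = {name: wheel_position(s) for name, s in cells.items()}
```

Under these assumptions, a cell whose imGC score reaches 0.85 is labeled imGC and lies close to the imGC anchor, while a cell with no score at or above the cutoff stays unassigned and sits between anchors.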
