Extended Data Fig. 4: Sources of variation in the gene expression data. | Nature

From: Developmental convergence and divergence in human stem cell models of autism

(a) Hierarchical clustering of all samples based on gene expression, showing that samples cluster by differentiation day and by the genetic form of ASD. Annotation colour bars represent day (shades of blue) and form (multiple colours). (b) Association of the top 20 gene expression principal components (PCs) with the different covariates. Numbers in brackets on the y-axis are the percent of variance explained by each PC. Differentiation time is highly associated with the first PC (adjusted R² = 0.92). (c) Variance explained by each of the covariates for each gene (n = 17,963 genes). Distributions represent the density of the percent of variance explained. The median percent of variance explained by each variable is shown below the plots. Boxplots in c and e show: centre, median; lower hinge, 25% quantile; upper hinge, 75% quantile; lower whisker, smallest observation greater than or equal to lower hinge − 1.5 × interquartile range; upper whisker, largest observation less than or equal to upper hinge + 1.5 × interquartile range. (d) Scatter plot of the first two gene expression PCs from each form of ASD. Colour represents the differentiation day (top) or form (bottom). PC1 explains 24% of the variation and is highly associated with differentiation day. PC2 explains 7.9% of the variation. Sample numbers: Day 25: Controls – 46, 15q13.3del – 7, 16p11.2del – 5, 16p11.2dup – 4, 22q11.2del – 13, 22q13.3del – 11, Idiopathic – 12, PCDH19 – 2, SHANK3 – 2, Timothy Syndrome – 3. Day 50: Controls – 53, 15q13.3del – 6, 16p11.2del – 6, 16p11.2dup – 4, 22q11.2del – 15, 22q13.3del – 12, Idiopathic – 15, PCDH19 – 2, SHANK3 – 2, Timothy Syndrome – 3. Day 75: Controls – 54, 15q13.3del – 7, 16p11.2del – 6, 16p11.2dup – 4, 22q11.2del – 23, 22q13.3del – 12, Idiopathic – 15, PCDH19 – 2, SHANK3 – 2, Timothy Syndrome – 2. Day 100: Controls – 50, 15q13.3del – 6, 16p11.2del – 6, 16p11.2dup – 4, 22q11.2del – 15, 22q13.3del – 12, Idiopathic – 15, PCDH19 – 2, SHANK3 – 2, Timothy Syndrome – 2.
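The boxplot convention described for panels c and e (hinges at the 25% and 75% quantiles, whiskers at the most extreme observations within 1.5 × the interquartile range of the hinges) can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name `boxplot_stats` and the toy data are hypothetical, and NumPy's default linear-interpolation quantiles are assumed, which may differ slightly from the quantile rule used by the original plotting software.

```python
import numpy as np

def boxplot_stats(values):
    """Return the five boxplot summary values for a 1-D sample.

    Assumption: quantiles use NumPy's default linear interpolation.
    """
    x = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    iqr = q3 - q1
    # Whiskers stop at the most extreme observations inside the fences.
    lower_whisker = x[x >= q1 - 1.5 * iqr].min()
    upper_whisker = x[x <= q3 + 1.5 * iqr].max()
    return {
        "lower_whisker": float(lower_whisker),
        "lower_hinge": float(q1),
        "median": float(median),
        "upper_hinge": float(q3),
        "upper_whisker": float(upper_whisker),
    }

# Toy data (hypothetical): the value 50 lies beyond the upper fence,
# so it would be drawn as an outlier rather than reached by the whisker.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
print(stats)
```

Observations outside the whiskers are the points conventionally drawn individually as outliers.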
(e) Correlation of gene expression among control samples and among samples of the same form (Affected) shows no difference between within-control and within-form correlation across development. Correlation by age shows a slight effect (ANOVA η² = 0.04), with a slight decrease in correlation across development. Total number of pairwise correlations: Day 25: Control – 1035, Affected – 241; Day 50: Control – 1378, Affected – 317; Day 75: Control – 1431, Affected – 469; Day 100: Control – 1225, Affected – 315. (f) Pseudotime testing of forms using a linear mixed model and contrasts with a Wald test shows no significant FDR-corrected differences in pseudotime (PC1).
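The pairwise correlation totals in panel e can be cross-checked against the sample numbers listed for panel d. The legend does not state how pairs were counted; the sketch below assumes that each total is the number of unordered sample pairs within a group, n(n − 1)/2, with the Affected total summed over forms so that only samples of the same form are paired. The variable and function names are hypothetical.

```python
from math import comb

# Sample numbers per day, copied from the legend for panel d.
samples = {
    "Day 25": {"Controls": 46, "15q13.3del": 7, "16p11.2del": 5, "16p11.2dup": 4,
               "22q11.2del": 13, "22q13.3del": 11, "Idiopathic": 12,
               "PCDH19": 2, "SHANK3": 2, "Timothy Syndrome": 3},
    "Day 50": {"Controls": 53, "15q13.3del": 6, "16p11.2del": 6, "16p11.2dup": 4,
               "22q11.2del": 15, "22q13.3del": 12, "Idiopathic": 15,
               "PCDH19": 2, "SHANK3": 2, "Timothy Syndrome": 3},
    "Day 75": {"Controls": 54, "15q13.3del": 7, "16p11.2del": 6, "16p11.2dup": 4,
               "22q11.2del": 23, "22q13.3del": 12, "Idiopathic": 15,
               "PCDH19": 2, "SHANK3": 2, "Timothy Syndrome": 2},
    "Day 100": {"Controls": 50, "15q13.3del": 6, "16p11.2del": 6, "16p11.2dup": 4,
                "22q11.2del": 15, "22q13.3del": 12, "Idiopathic": 15,
                "PCDH19": 2, "SHANK3": 2, "Timothy Syndrome": 2},
}

# Totals reported in the legend for panel e: (Control, Affected).
reported = {
    "Day 25": (1035, 241),
    "Day 50": (1378, 317),
    "Day 75": (1431, 469),
    "Day 100": (1225, 315),
}

def within_group_pairs(counts):
    """Count unordered within-group pairs (assumed counting rule)."""
    control = comb(counts["Controls"], 2)
    # Affected pairs are counted within each form, then summed across forms.
    affected = sum(comb(n, 2) for form, n in counts.items() if form != "Controls")
    return control, affected

for day, counts in samples.items():
    computed = within_group_pairs(counts)
    print(day, "computed:", computed, "reported:", reported[day])
```

Under this assumption the computed counts equal the reported totals for all four days, which is consistent with correlations being taken only between samples of the same group at the same time point.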
