Fig. 1: Overview of the study.

A Workflow of data collection and analysis. Of the 238 datasets identified by the search query, 188 included at least six samples and had NCBI-generated RNA sequencing count data available. Of these, datasets without a case-control design, with fewer than three cases and/or controls, or without NDD cases were excluded, leaving 115 transcriptomic datasets for analysis. Datasets were stratified by mutation type and/or cell type/tissue before differential expression analysis, provided that each stratum contained at least three cases and three controls. This resulted in a total of 151 distinct datasets that were used to identify common, disorder-associated, and phenotype-specific changes. B Distribution of the number of cases and controls among the 151 datasets. The x-axis shows the number of cases and controls, and the y-axis shows the number of datasets with that number. Most datasets include only three or four cases and controls, while only a few include more than ten. C Donut chart of the number of datasets for Rett syndrome, Duchenne muscular dystrophy, Fragile X syndrome, Down syndrome, and others. D Principal coordinate analysis (PCoA) of the 151 datasets. Distances between datasets were calculated from the Spearman correlation of gene-level P values, which were obtained for each dataset from the differential expression analysis of NDD cases versus controls. The first component is associated with cell type/tissue. In particular, T cell receptor gamma variable 4 (TRGV4) reaches higher levels of significance (i.e., higher P value ranks) in immune cells, while cholinergic receptor nicotinic alpha 4 subunit (CHRNA4) has higher P value ranks in neural cells.
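
For readers who want to see how an ordination like the one in panel D can be computed, the following is a minimal sketch, not the authors' pipeline. It assumes a genes-by-datasets matrix of P values, converts the Spearman correlation between datasets into a distance as 1 - rho (the caption does not state the exact conversion, so this is an assumption), and then applies classical PCoA by eigendecomposition of the double-centered squared distance matrix. The function names `spearman_distance` and `pcoa` and the toy data are hypothetical.

```python
import numpy as np


def spearman_distance(pvals):
    """Distance between datasets (columns) from Spearman correlation.

    pvals: array of shape (n_genes, n_datasets).
    Assumption: distance is defined as 1 - rho; the source does not say.
    """
    # Ordinal ranks per column; ties are broken by position, not averaged,
    # which is adequate for continuous P values but is a simplification.
    ranks = np.argsort(np.argsort(pvals, axis=0), axis=0).astype(float)
    # Spearman's rho is the Pearson correlation of the ranks.
    rho = np.corrcoef(ranks, rowvar=False)
    dist = 1.0 - rho
    np.fill_diagonal(dist, 0.0)
    return dist


def pcoa(dist, n_components=2):
    """Classical PCoA (metric multidimensional scaling) of a distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-center the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # eigh returns ascending eigenvalues; keep the largest ones.
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale eigenvectors; negative eigenvalues (non-Euclidean part) are clipped.
    coords = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return coords, eigvals


# Toy data: 500 hypothetical genes, 6 hypothetical datasets in two groups.
rng = np.random.default_rng(0)
group_a = rng.uniform(size=(500, 1)) + 0.05 * rng.normal(size=(500, 3))
group_b = rng.uniform(size=(500, 1)) + 0.05 * rng.normal(size=(500, 3))
toy_pvals = np.hstack([group_a, group_b])

dist = spearman_distance(toy_pvals)
coords, eigvals = pcoa(dist)
print(coords)
```

Each row of `coords` is one dataset and each column one principal coordinate. Because the distance depends only on ranks, any monotone transformation of the P values (for example, -log10) gives the same ordination up to a reflection of the correlation sign.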