Fig. 8

Scores generated for different subjects by the CSS/NCBRS classification model. An SVM classifier generates a score for every subject, representing the probability of having a DNA methylation profile similar to that observed in CSS/NCBRS. The y axis represents scores from 0 to 1, with higher scores indicating a higher chance of carrying a methylation profile related to CSS/NCBRS; the x axis stratifies the tested subjects into groups. Every point represents a single sample. By default, the SVM classifier uses a cut-off of 0.5 for assigning the class (dashed line); however, the vast majority of the tested individuals received a score <0.2 or >0.8, so the points are jittered to improve visualization. CSS/NCBRS T: CSS/NCBRS patients from the training set (n = 21); Control T: control samples used to train the model (n = 126); CSS/NCBRS V: samples from the testing set (n = 8) that were not used for feature selection or model training; Control V: healthy subjects from one internal cohort (n = 122) and two external cohorts (n = 48 and 186) used to measure the specificity of the model (total n = 356); Other syndromes: subjects diagnosed with various syndromic diseases presenting with DD/ID (n = 531, details in Methods); Chr6q25del: four patients with interstitial deletions in Chr6q25 (Table 2); VUS: subjects with variants of unknown clinical significance (VUS) in a CSS/NCBRS-related gene, or with clinical suspicion for CSS/NCBRS (Table 2); Screening: subjects with various presentations of developmental delay and intellectual disability, but with no diagnosis, used for case finding (n = 508). The first two categories show the classification of the subjects used to train the algorithm. The third and fourth categories validate the performance of the classification model on the testing dataset. In the fifth category, the model demonstrates the ability to accurately distinguish other DD/ID cases from CSS/NCBRS.
All of the Chr6q25del patients (sixth category) receive high scores for having a profile related to CSS/NCBRS. The model classifies four of the subjects with VUS or clinical suspicion for CSS/NCBRS as cases of CSS/NCBRS and assigns non-CSS/NCBRS status to the rest of the subjects in the seventh category. In the last category, screening of a DD/ID cohort identifies two subjects as potential cases of CSS/NCBRS.
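
The class assignment and jittering described in this caption can be illustrated with a minimal sketch. It is not the authors' code: the function names, the jitter width, the treatment of a score of exactly 0.5, and the example scores are all assumptions made for illustration; only the 0.5 cut-off comes from the caption.

```python
import random

CUTOFF = 0.5  # default cut-off stated in the caption (dashed line)


def assign_class(score, cutoff=CUTOFF):
    """Label one sample from its score (a probability between 0 and 1).

    Assumption: a score exactly at the cut-off is labelled CSS/NCBRS;
    the caption does not say how ties are handled.
    """
    return "CSS/NCBRS" if score >= cutoff else "non-CSS/NCBRS"


def jitter_x(group_index, width=0.3, rng=random):
    """Return a horizontal plotting position for a sample in a given group.

    A small random offset is added so that points with near-identical
    scores do not overlap. The width of 0.3 is a hypothetical choice.
    """
    return group_index + rng.uniform(-width / 2, width / 2)


# Made-up scores for illustration only; these are not data from the study.
example_scores = [0.03, 0.12, 0.91, 0.97]
example_labels = [assign_class(s) for s in example_scores]

# Place all four example samples in group 0 with jittered x positions.
example_x = [jitter_x(0) for _ in example_scores]
```

Jittering changes only the horizontal position of each point, so the scores read from the y axis and the resulting class labels are unaffected.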