Extended Data Fig. 3: External validation on Seattle Barrett’s Study SNP data.
From: Genomic copy number predicts esophageal cancer years before transformation

Predicting the Seattle Barrett’s Study SNP data using our sWGS CN model results in a lower AUC of 0.77 for all samples (including blood/gastric normals as non-progressor controls). a, Restricting to BE samples only (that is, excluding normals) and using our higher-sensitivity threshold results in an AUC of 0.71 (sensitivity = 0.82, specificity = 0.34). b, Overall, the progressor samples show the same pattern of risk classification as the sWGS samples, with high-risk classifications occurring at a higher rate in progressor patients independent of pathology. The HGD group among the non-progressor patients also indicates that our model would classify most of these as progressive. c, Comparison of ROC values for the SNP data using various additional criteria: defining patients with HGD as progressed; excluding those with less than 1% of the genome altered (low SCA) and the whole-genome duplicated non-progressor patients (NP WGD); and restricting to the baseline (T1) and penultimate endoscopy (T2) groups, respectively. The comparison demonstrates that the model improves as samples are taken nearer to EAC diagnosis. All error bars denote the 95% confidence interval for the sensitivity, specificity and AUC at a threshold of Pr = 0.3. d, Mean ratio of the genome altered (y axis) versus the computationally derived purity value (x axis) for all timepoint-merged biopsies versus the blood/gastric normal samples. None of the normal samples has more than 1% of the genome altered, and all have >90% purity. Given the issues with assessing very pure, mostly diploid samples, the samples shown in blue are excluded from the ROC analyses as indicated.
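
For readers unfamiliar with the metrics quoted in this legend, the following is a minimal sketch of how sensitivity, specificity and AUC can be computed from per-sample predicted probabilities at a fixed threshold such as Pr = 0.3. It is not the authors' analysis code: the function names, the toy probabilities and the convention that a sample is called high risk when its probability is at or above the threshold are all assumptions made for illustration, and the confidence intervals shown in the figure are not reproduced here.

```python
def sensitivity_specificity(probs, labels, threshold=0.3):
    """Sensitivity and specificity at a fixed probability threshold.

    labels: 1 = progressor, 0 = non-progressor.
    Assumption: a sample is called high risk when prob >= threshold.
    """
    pairs = list(zip(probs, labels))
    tp = sum(1 for p, y in pairs if y == 1 and p >= threshold)
    fn = sum(1 for p, y in pairs if y == 1 and p < threshold)
    tn = sum(1 for p, y in pairs if y == 0 and p < threshold)
    fp = sum(1 for p, y in pairs if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(probs, labels):
    """AUC as the fraction of (progressor, non-progressor) pairs in which
    the progressor receives the higher probability; ties count as half."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
toy_probs = [0.9, 0.6, 0.35, 0.2, 0.5, 0.25, 0.1, 0.05]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

print(sensitivity_specificity(toy_probs, toy_labels))  # (0.75, 0.75)
print(auc(toy_probs, toy_labels))                      # 0.8125
```

Lowering the threshold trades specificity for sensitivity, which is the pattern reported for the higher-sensitivity threshold in panel a (sensitivity = 0.82, specificity = 0.34).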