Supplementary Figure 9: Cancer patients without established driver mutations are enriched for deleterious mutations in NetSig5000 genes

a) We compared the fraction of genes with damaging (i.e., probably damaging and possibly damaging pooled into one set) versus benign mutations, as determined by PolyPhen, in the NetSig5000 genes against the background of all genes in the genome. Damaging mutations are significantly enriched in the NetSig5000 set (P = 0.016, Fisher’s exact test; NetSig5000 is shown in dark red and all genes in the genome in light red). b) Using PolyPhen2, we transformed all mutations observed in the NetSig5000 set into continuous normalized scores that quantify how strongly each mutation is predicted to impair gene function (less damaging to more damaging, oriented left to right on the x-axis). Compared to all genes in the genome, mutations in the NetSig5000 genes are significantly depleted for less damaging PolyPhen scores and significantly enriched for more damaging PolyPhen scores (P = 0.046, non-parametric two-sample Kolmogorov-Smirnov test; histograms show the binned proportions and lines show the cumulative distributions of scores). For comparison, panels c) and d) show the results of the same two tests run on the Cancer5000 set (Cancer5000 significant genes in dark blue and the background genes in light blue). The trends and proportions of deleterious versus benign mutations observed in the Cancer5000 genes are similar to those observed for the NetSig5000 genes, which supports the cancer relevance of the NetSig5000 set. The levels of statistical significance are higher for the Cancer5000 set because it contains more genes and because the effect size is, as expected, larger.
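
For readers who want to reproduce this kind of comparison, the two tests named in the caption can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used to produce the figure. The function names, the one-sided form of Fisher's exact test, the asymptotic approximation for the Kolmogorov-Smirnov P value, and all counts and scores in the toy example are assumptions made for illustration; the caption does not state which test variants or software were used.

```python
from bisect import bisect_right
from math import comb, exp, sqrt


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Rows: gene set vs. background; columns: damaging vs. benign.
    Returns P(X >= a) under the hypergeometric null (assumed variant).
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d


def ks_pvalue_asymptotic(d, n, m, terms=100):
    """Asymptotic two-sided P value for a KS statistic d (approximation).

    Uses the Kolmogorov series; it is rough for small samples.
    """
    lam = sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here; the true value is ~1
    q = 2.0 * sum((-1) ** (j - 1) * exp(-2.0 * j * j * lam * lam)
                  for j in range(1, terms + 1))
    return min(1.0, max(0.0, q))


if __name__ == "__main__":
    # Hypothetical counts, NOT the data behind the figure:
    # gene set: 30 damaging, 10 benign; background: 500 damaging, 400 benign.
    print("Fisher P:", fisher_exact_greater(30, 10, 500, 400))

    # Hypothetical normalized damage scores in [0, 1].
    set_scores = [0.91, 0.85, 0.77, 0.98, 0.64, 0.88]
    background_scores = [0.12, 0.45, 0.80, 0.33, 0.57, 0.95, 0.21, 0.68]
    d = ks_statistic(set_scores, background_scores)
    p = ks_pvalue_asymptotic(d, len(set_scores), len(background_scores))
    print("KS D:", d, "KS P (asymptotic):", p)
```

In practice a statistics library would normally be used for both tests; the hand-written versions above only make explicit what each test computes: a hypergeometric tail probability for panel a) and c), and the maximum distance between two cumulative distributions for panel b) and d).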