Extended Data Fig. 7: Comparison of different methods for cancer-gene identification.
From: Identification of cancer driver genes based on nucleotide context

We benchmarked our method against seven previously published methods for cancer-gene identification using sequencing data from 11,873 samples spanning 28 cancer types. For each method, we ranked genes by the significance values (adjusted for multiple testing) that the method returned. To conservatively approximate the true-positive rate, we treated Cancer Gene Census (CGC) genes (a, c, e) and OncoKB genes (b, d, f) as true positives when deriving ROC and precision-recall curves. We quantified the performance of each method as the area under the ROC curve (AUC) for the top 150 (a, b) or top 1,000 (c, d) non-CGC/OncoKB genes, respectively. We also determined the precision at 5% recall for each method (e, f). Box plots show the distribution of these performance measures for each method across cancer types; each dot represents one cancer type. Boxes indicate the interquartile range (25th to 75th percentile), and whiskers extend to the 5th and 95th percentiles. The median of each distribution is indicated as a vertical line.
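The two ranking-based measures in this legend can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the tie-breaking by gene name, the normalisation of the truncated AUC (so that a perfect ranking scores 1.0) and the toy q-values are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Set


def ranked_labels(qvalues: Dict[str, float], reference: Set[str]) -> List[bool]:
    """Sort genes by ascending q-value and flag membership in the reference set.

    Ties are broken by gene name (an assumption; the legend does not say).
    """
    ranked = sorted(qvalues, key=lambda gene: (qvalues[gene], gene))
    return [gene in reference for gene in ranked]


def truncated_auc(labels: List[bool], max_fp: int) -> float:
    """Area under the ROC curve up to the first `max_fp` non-reference genes.

    Normalised by (reference genes x non-reference genes traversed), so a
    perfect ranking scores 1.0 (assumed normalisation).
    """
    n_pos = sum(labels)
    limit = min(max_fp, len(labels) - n_pos)
    if n_pos == 0 or limit <= 0:
        raise ValueError("need at least one reference and one non-reference gene")
    tp = fp = area = 0
    for is_ref in labels:
        if is_ref:
            tp += 1
        else:
            fp += 1
            area += tp  # one unit step along the false-positive axis
            if fp == limit:
                break
    return area / (n_pos * limit)


def precision_at_recall(labels: List[bool], recall: float = 0.05) -> float:
    """Precision at the first rank where the target recall is reached."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one reference gene")
    needed = max(1, math.ceil(recall * n_pos))
    tp = 0
    for rank, is_ref in enumerate(labels, start=1):
        tp += is_ref
        if tp >= needed:
            return tp / rank
    raise RuntimeError("unreachable: every reference gene appears in the ranking")


# Toy data (made up): adjusted significance values for six genes from one
# hypothetical method, with three of the genes in the reference set.
toy_qvalues = {"TP53": 0.001, "GENE_A": 0.01, "KRAS": 0.02,
               "GENE_B": 0.2, "PTEN": 0.5, "GENE_C": 0.9}
toy_reference = {"TP53", "KRAS", "PTEN"}

labels = ranked_labels(toy_qvalues, toy_reference)
auc_top2 = truncated_auc(labels, max_fp=2)          # analogue of "top 150 non-CGC genes"
prec_5pct = precision_at_recall(labels, recall=0.05)
```

In a benchmark of this kind, the functions would be applied once per method and cancer type, and the resulting values would form the distributions summarised by the box plots.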