Fig. 3: The distribution of optimal and greedy individual target set (ITS) size values in four different cancer types.

We study both our baseline parameter setting (top row) and a markedly more stringent one (middle row). For the more stringent parameter setting, we compare the ITS sizes obtained using MadHitter (middle row) with those obtained using a greedy algorithm that tries to add pairs of genes at a time (bottom row). In each plot, the patients are identified by letters on the x-axis and the ITS size is shown on the y-axis. In all 12 plots, the patients are sorted from left to right in increasing order of their mean ITS size in the optimal stringent (\({lb}\) = 0.9, \({ub}\) = 0.05) regime. For each patient and each parameter setting, we generated 20 replicates by sampling 500 cells, and the box plot shows the 20 data points (circles) and their mean. For most patients in the head and neck and melanoma data sets, there is no variation in ITS size among the 20 replicates, so the 20 circles are overplotted. For two patients each in the brain and lung cancer data sets, however, the ITS size varies considerably between replicates. Additional comparisons between ITS sizes at different parameter settings can be found in Supplementary Note 5. A description of the greedy algorithm and further comparisons between the optimal and greedy algorithms are provided in Supplementary Note 8. All box plots are drawn with the ggplot2 function geom_boxplot, which shows the median as a horizontal line, places the hinges at the 25th and 75th percentiles, and extends each whisker to the most extreme data point within 1.5× the interquartile range of the nearest hinge. Source data are provided as a Source data file.
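
The box plot convention described above can be illustrated with a minimal Python sketch. It is not the code used to produce the figure (which was drawn with ggplot2); the function name `boxplot_stats` and the replicate values are hypothetical, and the sketch assumes linearly interpolated quartiles, computed here with `statistics.quantiles(..., method="inclusive")`.

```python
from statistics import quantiles


def boxplot_stats(values, coef=1.5):
    """Box plot summary: hinges at the quartiles, whiskers at the most
    extreme data points lying within coef * IQR of the nearest hinge.

    Assumes at least two values (required by statistics.quantiles).
    """
    data = sorted(values)
    # Inclusive method: linear interpolation between order statistics.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - coef * iqr
    upper_fence = q3 + coef * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "lower_whisker": min(inside),
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_whisker": max(inside),
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# Hypothetical ITS sizes for the 20 replicates of one patient
# (illustrative values only, not taken from the source data).
its_sizes = [3] * 14 + [4] * 5 + [9]
stats = boxplot_stats(its_sizes)
print(stats)
```

In this made-up example the single replicate with an ITS size of 9 lies more than 1.5× the interquartile range above the upper hinge, so it is reported as an outlier rather than stretching the whisker.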