Fig. 4: Results of the benchmark of feature selection methods.
From: Feature selection methods affect the performance of scRNA-seq data integration and querying

a, Summary of method performance by metric type. Points show scores for individual datasets and diamonds show the mean values (Extended Data Fig. 5a). Methods are sorted by mean overall score, and baseline methods are indicated by gray shading. Shaded areas show scores less than (red) or greater than (blue) the baseline range (0–1). Average rankings for each metric type are shown on the right, with color indicating the mean rank and size indicating the s.d. (smaller is more variable) (Extended Data Fig. 5b). b, Overlap of features selected by different methods. The heatmap shows the mean Jaccard index (JI) between feature sets selected by different methods (excluding random gene sets) (Extended Data Fig. 6). Sizes of squares indicate the s.d. (smaller is more variable). Mean JI values greater than 0.5 are highlighted with white borders. c, The number of features (on a log10 scale) selected by at least n methods (n = 25, 20, 15, 10 and 5) for each dataset. Colors indicate the number of methods. d, The number of features selected by different methods. Points are colored by dataset, and blue bars show the mean for each method. Only methods that automatically determine the number of features are shown. Most other methods were set to select 2,000 features, as indicated by the red line, except scPNMF, which uses 200 features. e, Heatmap of the relative performance of batch-aware variants of scanpy methods. Colors show the difference in score for each metric type on each dataset, with negative values (purple) indicating that the batch-aware variant performed worse than the standard approach and positive values (green) indicating that it performed better.
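
As an illustration of the quantities summarized in panels b and c, the following is a minimal sketch of how a mean Jaccard index between feature sets and the number of features selected by at least n methods could be computed. It is not the authors' code: the function names, the nested-dictionary layout and the toy gene, method and dataset names are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from statistics import mean


def jaccard_index(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def mean_pairwise_jaccard(per_dataset):
    """Mean JI for each pair of methods, averaged over datasets.

    per_dataset maps dataset name -> {method name -> selected features}
    (a hypothetical layout used only for this sketch).
    """
    scores = {}
    for selections in per_dataset.values():
        for m1, m2 in combinations(sorted(selections), 2):
            ji = jaccard_index(selections[m1], selections[m2])
            scores.setdefault((m1, m2), []).append(ji)
    return {pair: mean(values) for pair, values in scores.items()}


def n_selected_by_at_least(selections, n):
    """Number of features selected by at least n methods in one dataset."""
    counts = Counter(f for feats in selections.values() for f in set(feats))
    return sum(1 for c in counts.values() if c >= n)


# Toy data: made-up genes, methods and datasets, for illustration only.
toy = {
    "dataset_1": {
        "method_a": {"g1", "g2", "g3"},
        "method_b": {"g2", "g3", "g4"},
        "method_c": {"g3", "g5"},
    },
    "dataset_2": {
        "method_a": {"g1", "g2"},
        "method_b": {"g1", "g2"},
        "method_c": {"g2", "g6"},
    },
}

mean_ji = mean_pairwise_jaccard(toy)
shared_by_two = n_selected_by_at_least(toy["dataset_1"], 2)
```

In this toy example, `mean_ji[("method_a", "method_b")]` is 0.75 (0.5 in the first dataset and 1.0 in the second), and two features in `dataset_1` are selected by at least two methods.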