Fig. 2: FUSIL categories and human gene features.
From: Human and mouse essentiality screens as a resource for disease gene discovery

a Notched box plots showing the distribution of recombination rates for the different FUSIL bins. Human recombination rates (ref. 58) were mapped to the closest gene and average recombination rates per gene were computed. b Distribution of human gene expression values for different tissues. Median logTPM expression values from the GTEx database for selected non-correlated tissues are shown. c Protein–protein interaction network parameters. Notched box plots showing the distribution of degree and topological coefficient computed from human protein–protein interaction data extracted from STRING. Only high-confidence interactions, defined as those with a combined score of >0.7, were kept. d Protein complexes. Bar plots showing the percentage of genes in each FUSIL bin that are part of a human protein complex. e Paralogues. The bar plot shows the percentage of genes without a protein-coding paralogue gene in each FUSIL bin. Paralogues of human genes were obtained from Ensembl Genes 95. A cut-off of 30% amino acid similarity was used. f Probability of mutation. Distribution of gene-specific probabilities of mutation from Samocha et al. (ref. 65). g Transcript length. Maximum transcript length among all transcripts associated with each gene (Ensembl Genes 95, hsapiens data set). h GIMS Selection Score. Distribution of Gene-level Integrated Metric of negative Selection (GIMS; ref. 66) scores across the different FUSIL bins. i Probability of loss-of-function intolerance (pLI) retrieved from gnomAD 2.1. Notched box plots and density plots showing the bimodal distribution of this score, with higher values indicating more intolerance to variation. j Distribution of gnomAD o/e LoF scores. Upper bound fraction of the confidence interval around the observed versus expected (o/e) LoF ratio (gnomAD 2.1). A score <0.35 (dashed line) has been suggested to identify genes intolerant to LoF variation (ref. 56).
For a–c and f–j: centre line, median; notch, CI around the median; box edges, interquartile range (75th and 25th percentiles); whiskers, 1.5 times the interquartile range; outliers not shown. Significance for pairwise comparisons for all features is shown in Supplementary Tables 4 and 5. CL cellular lethal (pink), DL developmental lethal (orange), SV subviable (yellow), VP viable with phenotypic abnormalities (light blue), VN viable with normal phenotype (dark blue).
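
The network filtering described for panel c can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `degrees`, the toy protein identifiers and the assumption that combined scores are already on a 0–1 scale are all hypothetical choices made for illustration. The sketch keeps only interactions with a combined score strictly above 0.7 and counts the distinct partners of each protein (its degree).

```python
from collections import defaultdict


def degrees(edges, min_score=0.7):
    """Return the degree (number of distinct partners) of each protein.

    `edges` is an iterable of (protein_a, protein_b, combined_score) tuples.
    Scores are assumed to be on a 0-1 scale (an assumption of this sketch).
    Only interactions with a score strictly above `min_score` are kept.
    """
    neighbours = defaultdict(set)
    for a, b, score in edges:
        # Skip low-confidence interactions and self-interactions.
        if score > min_score and a != b:
            # Sets make the count robust to duplicated or reversed edges.
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {protein: len(partners) for protein, partners in neighbours.items()}


# Toy data, invented for illustration only.
edges = [
    ("P1", "P2", 0.95),
    ("P1", "P3", 0.80),
    ("P2", "P3", 0.40),  # below the cut-off, dropped
    ("P3", "P4", 0.70),  # equal to the cut-off, dropped (strict >)
    ("P2", "P1", 0.95),  # same interaction listed in reverse
]
degree = degrees(edges)
print(degree)
```

Proteins left without any high-confidence interaction (here `P4`) do not appear in the result, so an analysis comparing FUSIL bins would need to decide whether to treat them as degree 0 or to exclude them.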