Fig. 4: The COSMIS score quantifies constraint at amino acid resolution and provides structural insights into variant pathogenicity.
From: The 3D mutational constraint on amino acid sites in the human proteome

a COSMIS score distributions differ significantly between loss-of-function (LoF) tolerant, unsure, and intolerant genes (as defined by pLI). The COSMIS scores of amino acid sites in intolerant genes are lower than those in tolerant genes (median −1.1 vs −0.12). Statistical test: two-sided Mann–Whitney U test. In the boxplots, the center line indicates the median, the bounds of the box indicate the 25th and 75th percentiles, and the whiskers indicate the minimum and maximum. b COSMIS scores of UBA5 sites mapped onto the structure of one subunit of a dimerized UBA5 bound to the UFM1 target protein (PDB ID: 6H77). UBA5 is predicted to be LoF tolerant, but it exhibits substantial constraint in specific spatial regions. All subunits of the complex are rendered as surfaces. c Locations of the top 10% most constrained sites in UBA5, ranked by COSMIS score. Sites are rendered as spheres and colored according to their likely functional roles. The location of variant M57V, implicated in early-onset encephalopathy, is indicated. Proteins are rendered as cartoons. We note that because the COSMIS score of a site is directly informed by the genetic variability of its contact set, it is natural to interpret the scores in the context of a 3D structure. pLI probability of being loss-of-function intolerant, UBA5 ubiquitin-like modifier-activating enzyme 5, UFM1 ubiquitin-fold modifier 1, ATP adenosine triphosphate. Source data are provided as a Source Data file.
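
The two analyses named in this legend, a two-sided Mann–Whitney U test between score distributions (panel a) and selection of the top 10% most constrained sites (panel c), can be illustrated with a minimal sketch. This is not the authors' code. The scores are synthetic, the function names `mann_whitney_u` and `top_constrained` are hypothetical, the p-value uses a tie-corrected normal approximation rather than an exact test, and the sketch assumes that lower COSMIS scores indicate stronger constraint, consistent with the medians reported above.

```python
import math
import random
from statistics import median


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns the U statistic for sample x and an approximate p-value.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        avg_rank = (i + 1 + j) / 2     # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t**3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p


def top_constrained(scores, fraction=0.10):
    """Return the site IDs with the lowest scores (assumed most constrained)."""
    k = max(1, round(len(scores) * fraction))
    return sorted(scores, key=scores.get)[:k]


# Synthetic stand-ins for per-site scores; NOT the published data.
rng = random.Random(0)
intolerant = [rng.gauss(-1.1, 1.0) for _ in range(200)]
tolerant = [rng.gauss(-0.12, 1.0) for _ in range(200)]

u, p = mann_whitney_u(intolerant, tolerant)
print("medians:", median(intolerant), median(tolerant))
print("U =", u, "p =", p)

# Hypothetical site identifiers for a single protein.
site_scores = {f"site_{i}": rng.gauss(0.0, 1.0) for i in range(20)}
print("top 10% most constrained:", top_constrained(site_scores))
```

For real analyses, an established implementation such as `scipy.stats.mannwhitneyu(x, y, alternative="two-sided")` is likely preferable to this hand-rolled approximation.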