Fig. 1: Conceptual overview of MAVERICK. | Nature Communications

From: Deep structured learning for variant prioritization in Mendelian diseases

a Diagram of MAVERICK inputs and outputs. MAVERICK takes as inputs the reference and altered protein sequences, the evolutionary conservation of each amino acid in the protein, and structured data including genetic constraint and allele frequency information. These inputs are processed by MAVERICK’s ensemble of transformer-based neural networks to produce the output: a three-class prediction giving the probabilities that the input variant is benign, pathogenic with dominant inheritance, or pathogenic with recessive inheritance. The three output probabilities always sum to one. b MAVERICK training and testing datasets. MAVERICK’s training and validation datasets were created from variants present in ClinVar before 2020. The known genes and novel genes test datasets were created from variants added to ClinVar during 2020, following the same construction rules as the training set.
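
The three-class output described in panel a can be illustrated with a minimal sketch. The caption does not say how the ensemble members are combined, so this example assumes each member emits three logits, applies a softmax to each member, and averages the resulting probabilities. The names `CLASSES`, `softmax`, and `ensemble_predict`, and the example logits, are hypothetical and are not taken from MAVERICK's code.

```python
import math
from typing import List, Sequence

# Hypothetical class labels mirroring the three outputs named in the caption.
CLASSES = ("benign", "pathogenic_dominant", "pathogenic_recessive")


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities that sum to one."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(member_logits: Sequence[Sequence[float]]) -> List[float]:
    """Average per-member softmax outputs (an assumed combination rule).

    The mean of probability vectors that each sum to one also sums to one,
    which matches the property stated in the caption.
    """
    member_probs = [softmax(logits) for logits in member_logits]
    n_members = len(member_probs)
    return [
        sum(probs[i] for probs in member_probs) / n_members
        for i in range(len(CLASSES))
    ]


# Made-up logits from two hypothetical ensemble members for one variant.
logits = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]]
probs = ensemble_predict(logits)
prediction = dict(zip(CLASSES, probs))
```

Averaging probabilities rather than logits is only one possible design choice; it is used here because it keeps the sum-to-one property without any renormalization step.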
