Extended Data Fig. 6: Improved prediction of 3’ UTR polyadenylation and paQTLs with AlphaGenome.
From: Advancing regulatory variant effect prediction with AlphaGenome

(a) Predicted vs observed polyadenylation coverage ratio (COVR) for each gene’s most distal and proximal polyadenylation sites (PAS) for Borzoi and AlphaGenome. (b) Polyadenylation variant scoring scheme. (c,d) Example of a variant disrupting a polyadenylation motif in the TP53 gene. (c) Distribution of scores for the indicated variant across GTEx tissues (gray swarm plot), with the highest- and lowest-scoring tissues and the average score highlighted (red dots). (d) Observed and predicted RNA-seq for the tissues highlighted in (c), and in silico saturation mutagenesis (ISM) scores of the 72 bp flanking the variant. ISM highlights the relevance of the polyadenylation motif exclusively in the reference background, where the variant does not compromise the motif. (e,f) As (c) and (d) but for a variant generating a new polyadenylation motif in the RIGI gene. (g,h) As (c) and (d) but for a paQTL with a low variant score. The variant disrupts the PAS, but a cryptic one emerges nearby, with limited effect on gene expression. (i,j) As (c) and (d) but for a failure case. AlphaGenome correctly identifies the emergence of a novel PAS but fails to correctly predict the effect on gene expression.
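
For readers unfamiliar with ISM, the general idea behind panels (d), (f), (h) and (j) can be illustrated with a minimal sketch: every base in a window around the variant is substituted in turn, and the change in a model score relative to the unmutated sequence is recorded. This is not the authors' implementation. The function names, the windowing convention and the scoring function are assumptions made for illustration; in particular, `toy_pas_score` is a hypothetical stand-in that only counts canonical AATAAA hexamers, whereas the figure uses AlphaGenome predictions.

```python
from typing import Callable, Dict, Tuple

BASES = "ACGT"


def ism_scores(
    seq: str,
    center: int,
    flank: int,
    score_fn: Callable[[str], float],
) -> Dict[Tuple[int, str], float]:
    """Score every single-base substitution within `flank` bases of `center`.

    Returns a mapping (position, alternative base) -> score(mutant) - score(seq).
    The symmetric window around `center` is an assumed convention.
    """
    ref_score = score_fn(seq)
    start = max(0, center - flank)
    end = min(len(seq), center + flank + 1)
    effects: Dict[Tuple[int, str], float] = {}
    for pos in range(start, end):
        for alt in BASES:
            if alt == seq[pos]:
                continue  # skip the base already present at this position
            mutant = seq[:pos] + alt + seq[pos + 1:]
            effects[(pos, alt)] = score_fn(mutant) - ref_score
    return effects


def toy_pas_score(seq: str) -> float:
    """Hypothetical scoring function: number of AATAAA hexamers in `seq`."""
    return float(sum(seq[i:i + 6] == "AATAAA" for i in range(len(seq) - 5)))


# Toy reference sequence with one polyadenylation hexamer at positions 5-10.
reference = "GCGTC" + "AATAAA" + "GCGTC"
effects = ism_scores(reference, center=7, flank=4, score_fn=toy_pas_score)
```

In this toy setting, substitutions inside the hexamer lower the score while substitutions in the flanking bases leave it unchanged, which is the kind of motif footprint an ISM track makes visible.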