Fig. 4: Genetic analysis of G1 alleles.

a Heatmap of linkage disequilibrium (LD) between variants, ordered by position in the canonical Mamu-E transcript from left (5′) to right (3′). Each entry indicates the LD between two variants, with darker entries indicating higher LD values. b Number of variants (left) and single amino-acid polymorphisms (SAPs, i.e., non-synonymous SNPs) in Mamu-E transcript regions. Variants are colored by type (gray = single nucleotide polymorphism (SNP), coral = deletion, purple = insertion). c Hierarchical clustering of variants based on their correlations, with correlated groups of SNPs numbered. Two large variant clusters are shown in blue (1) and red (2). d Maximum likelihood tree of the final G1 allele sequences, including 3′ UTRs from haplotigs. Major G1 allele subgroups are colored based on their overall grouping. Tips are colored based on the protection outcome of the animal from which the allele was recovered. e Number of protected (blue) and not protected (red) animals from vaccine groups O, S, and X carrying homozygous dominant (00) versus heterozygous (01) and homozygous recessive (11) variant clusters. Groups are ordered and labeled by their number of variants (k, i.e., SNPs and indels) as shown in panel (c), beginning with the largest in the top left.