Main

The continuous arms races between viruses and their hosts have driven the evolution of several immune defences across all life forms1,2. In humans, an emerging actor of cell-autonomous antiviral immunity is the gene family composed of the paralogues sterile alpha motif (SAM) domain-containing proteins 9 and 9L (SAMD9/9L). These interferon-stimulated genes are in tandem on human chromosome 7 and encode large, multidomain proteins with antiviral properties against poxviruses and lentiviruses3,4,5. SAMD9 and SAMD9L inhibit cellular and viral protein translation6,7,8, acting through an essential ribonuclease site in the AlbA_2 domain5,9. They are also modulators of endosomal trafficking10,11 and SAMD9 was recently identified as a nucleic acid sensor12. Deleterious germline mutations in human SAMD9 or SAMD9L dysregulate their activity leading to severe life-threatening genetic syndromes, such as MIRAGE (myelodysplasia, infection, restriction of growth, adrenal hypoplasia, genital phenotypes and enteropathy)11, SAMD9L-associated autoinflammatory disease13, ataxia-pancytopaenia14 and normophosphataemic familial tumoural calcinosis15. Although the SAMD9/9L gene family extends beyond humans, its evolutionary history and functional diversification remain largely unexplored.

Recently, some human immune genes were shown to have a deep evolutionary origin in bacterial defences against phages16. For example, major eukaryotic antiviral immune sensors or effectors, such as cGAS or Viperin, have originated from prokaryotic antiviral systems16,17,18,19. Ancestral immunity therefore broadly defines immune defences shared between prokaryotes and eukaryotes through the presence of conserved immune modules (domains or proteins)18, which result from horizontal gene transfer, convergent evolution or vertical inheritance.

SAMD9 and SAMD9L are large multidomain proteins, which are presumed members of the STAND (signal transduction ATPases with numerous domains) superfamily20. SAMD9/9Ls are composed of (from N to C terminus): a SAM, a Schlafen (SLFN)-like AlbA_2 domain with nuclease site, a SIR2, a P-loop NTPase, tetratricopeptide repeats (TPRs) and an oligonucleotide/oligosaccharide-binding OB-fold. All are predicted domains, except the AlbA_2 for which the structure was recently solved21. Computational analyses suggested the presence of homologues of SAMD9/9L domains in other animals and bacteria20. However, the ancient evolutionary and functional history of SAMD9/9Ls and its potential link to immunity outside mammals remain unknown.

In mammals, the gene family consists of two paralogues, SAMD9 and SAMD9L, which originally duplicated in placentals and have evolved under positive selection22, suggesting past molecular arms races with pathogens. Interestingly, these mammalian paralogues exhibit both functional redundancy and divergence in their antiviral functions. Both are restriction factors against poxviruses, but with species-specific variations in their susceptibility to poxviral countermeasures4. In humans, they have different functions in human immunodeficiency virus (HIV) and lentiviral infections: SAMD9L is antiviral, while SAMD9 has no, or a proviral, effect5.

Lentiviruses naturally infect most African non-human primate species (simian immunodeficiency viruses (SIVs)), with the notable exception of some species, such as bonobos23,24,25,26. SIVcpz from chimpanzees is at the origin of the HIV-1 group M, responsible for the AIDS pandemic27. Lentiviruses have co-evolved with primates for millions of years and have been selective drivers of adaptation in primate antiviral defence factors, such as APOBEC3G, Tetherin/BST-2 or TRIM5 (refs. 28,29,30). These species- and lineage-specific adaptations have further shaped host molecular barriers to cross-species transmissions31,32.

In this study, we combined AI-predicted structural similarity searches, phylogenetics and population genomics with experimental assays of SAMD9/9L functions in prokaryotes and great apes to resolve the evolutionary and functional dynamics of this antiviral system across scales. We notably found that SAMD9/9Ls and some prokaryotic antiviral STAND (Avs) originated from convergent evolution and depend on AlbA_2 domain for their activity, and that SAMD9/9Ls have evolved under recurrent diversifying evolution in mammals by genomic structural variation, notably gene losses and adaptive polymorphisms to lentiviruses in great apes.

Results

SAMD9/9L bacterial structural analogues with conserved multidomain architecture and predicted antiphage activity

Benefiting from recent advances in structural similarity methods, we investigated the evolutionary history of SAMD9/9Ls across the tree of life. We first used Foldseek33, which allows fast protein structure alignment and search, with the AlphaFold-predicted SAMD9/9L structures and amino acid sequences as inputs (Methods). We identified 238 hits that share high structural similarity, mainly belonging to bacteria and metazoa (30% template modelling (TM) score and 80% coverage cut-off; Supplementary Table 2). Analysis of bacterial hits with DefenseFinder34, which performs hidden Markov model (HMM) searches against a database of prokaryotic antiviral systems, showed that some of these hits belong to the Avs family of antiphage immune proteins35,36,37,38. Avs proteins are nucleotide-binding oligomerization domain-like receptor (NLR)-like proteins that sense phage infections through their C-terminal sensor domains and perform specific antiviral functions through their N-terminal effector domains. The effector domain may have diverse specific activities, such as ATP degradation via a PNP domain or NAD+ depletion via a SIR2 domain17,37,39,40. Here we found that SAMD9/9L shares strong structural similarity with Avs5 from Desulfobacula sp. and with a newly identified protein, which we named ‘Avs6’, from Labedella endophytica (UniProt IDs A0A1F9N8W4 and A0A3S0VH60, respectively) (Fig. 1a). However, Avs5 and Avs6 lack the N-terminal AlbA_2 domain from SAMD9/9L, with Avs6 instead encoding a PNP domain predicted to degrade ATP molecules (Fig. 1a). Remarkably, our analyses also identified other bacterial proteins sharing strong structural similarity with human SAMD9/9L on up to 88% of the protein length (for example, 1,407 of 1,589 amino acids total in SAMD9 and 1,400 of 1,584 in SAMD9L for A0A7T4VS34 from Pseudomonas fluorescens). This similarity covers all its functional domains, except the first N-terminal SAM, which seems exclusive to eukaryotes41. Despite the low amino acid sequence identity (around 15%), the structural similarity is highly significant (TM score up to 0.56; Supplementary Fig. 1a)42. We therefore propose ‘Avs9’ as a name for bacterial Avs proteins harbouring the same domain composition as SAMD9/9L with the exception of the SAM domain, referring to the latter gene family name (Fig. 1a). Furthermore, our analysis identified hundreds of uncharacterized proteins across various domains of life: metazoa, bacteria and archaea. It is however notable that, although STAND proteins are also present in Plantae and fungi43,44, we did not retrieve any significant hits in these kingdoms of life, nor in algae or protozoa. Further searches with various inputs, including Avs9, and more relaxed parameters did not recover any.
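As a quick arithmetic check of the coverage figures quoted above, the following minimal Python sketch recomputes the aligned fraction for the P. fluorescens hit and applies the stated cut-offs. The function names are hypothetical, the 30% TM-score cut-off is assumed to correspond to 0.30 on the usual 0 to 1 scale, and pairing the maximum reported TM score (0.56) with this particular hit is for illustration only.

```python
def coverage(aligned_residues: int, protein_length: int) -> float:
    """Fraction of the query protein covered by the structural alignment."""
    return aligned_residues / protein_length


def passes_cutoffs(tm_score: float, cov: float,
                   min_tm: float = 0.30, min_cov: float = 0.80) -> bool:
    """Apply the TM-score and coverage cut-offs (assumed to be inclusive)."""
    return tm_score >= min_tm and cov >= min_cov


# Residue counts taken from the text (hit A0A7T4VS34 against human proteins).
samd9_cov = coverage(1407, 1589)
samd9l_cov = coverage(1400, 1584)

print(f"SAMD9 coverage:  {samd9_cov:.1%}")
print(f"SAMD9L coverage: {samd9l_cov:.1%}")
# Illustrative only: 0.56 is the maximum TM score reported in the text.
print("Passes cut-offs:", passes_cutoffs(0.56, samd9_cov))
```

Both values round to about 88%, consistent with the figure given in the text.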

Fig. 1: Structural homology analyses show strong similarity between SAMD9/9L gene family and Avs antiphage proteins.

a, Structural homology searches of human SAMD9/9L in prokaryotes identify known and predicted Avs antiviral systems. Linear representation of the multidomain organization of proteins with strong SAMD9/9L structural similarity showing a common conserved organization, on more than 1,300 amino acids for Avs9. Avs3, Avs4 and Avs5 were previously identified35,37, while Avs6 and Avs9 are proposed names of newly identified Avs members (in parentheses, representative members with protein UniProt ID). b, Circular ultrametric tree representing structural clustering of SAMD9/9Ls and Avs from FoldMason MSTA showing widespread and diverse SAMD9/9L structural analogues in bacteria (red) or metazoan (blue) SAMD9/9Ls. Tree rooting is only for representation purposes. Statistical supports are from 1,000 bootstrap replicates (values above 90 are represented by thick lines). Proteins with an absent domain have either a square, a cross or a filled circle at the tip of the corresponding branch, for the absence of SAM, SIR2 or OB, respectively. Credit: Species silhouettes from phylopic (https://www.phylopic.org) under a Creative Commons license: Homo sapiens sapiens, Edwin Price (CC0 1.0); Rattus tunneyi, Carly Monks (CC0 1.0); Methylococcus capsulatus and Acidimicrobium ferrooxidans, Matthew Crook (CC BY-SA 3.0); Streptomyces, Guillaume Dera (CC0 1.0); Enterobacteria phage T2, T. Michael Keesey (CC BY-SA 3.0); Euoticus pallidus, Joseph Wolf and T. Michael Keesey (CC0 1.0); Homo sapiens sapiens, by Carlo De Rito (CC0 1.0); Danio rerio, Ian Quigley (CC BY 3.0); Carassius auratus, Andrew Hoadley (CC0 1.0); Luciobarbus graellsii, Carlo Cano-Barbacil (CC0 1.0); Salmo salar, Domino Joyce (CC0 1.0); Perca flavescens, NOAA Great Lakes Environmental Research Laboratory (illustration) and Timothy J. Bartley (silhouette) (CC BY-SA 3.0); Nothobranchius sp., Ryan Cupo (CC0 1.0); Rattus norvegicus, Mozillian (CC0 1.0); Mus musculus, Jiro Wada (CC0 1.0); Acidobacterium capsulatum, Poribacteria morphotype 4, Chlorobium and Bacillus subtilis, Matthew Crook (CC BY-SA 3.0).

To investigate the evolutionary history of the structural analogues, we performed maximum likelihood phylogenetic analyses, using IQ-TREE, of both the sequence and the structural alignments of these proteins (from Muscle and FoldMason, respectively) (Supplementary Fig. 1b). We found that SAMD9/9Ls and Avs9s were in two distinct clades falling within the same lineage. However, because the proteins bear different domain compositions, we next performed analyses for specific domains, individually (Supplementary Fig. 1c). Starting with the central P-loop NTPase domain, which is a domain common to all hits, we similarly found that Avs9 and SAMD9/9L clades fell into the same lineage in the phylogeny (Supplementary Fig. 1d). Next, we only kept proteins encoding an AlbA_2 effector domain, resulting in 23 hits (the small number of hits is due to the AlphaFold Database being clustered at 50% identity): 13 eukaryotic SAMD9/9Ls and 10 bacterial Avs9s. The resulting tree topology also showed two clades corresponding to SAMD9/9Ls and Avs9s (Fig. 1b). We further observed scattered absence of SAM, OB-fold or SIR2 domains (Fig. 1b). While this suggests that these proteins have a propensity for domain modularity and that these domain losses could be tolerated, the functional impact of such modulation would require further investigations. Overall, the extensive structural similarity of SAMD9/9L- and Avs9-like proteins observed across such a vast evolutionary range is remarkable. It suggests a strong selective advantage driving the structural and functional conservation of this putative immune antiviral system from bacteria to mammals.

Eukaryotic SAMD9/9Ls and bacterial Avs9s result from convergent evolution

To determine whether eukaryotic SAMD9/9Ls result from the ancestral acquisition of a full-length bacterial analogue (such as Avs9) or emerged through convergent evolution, we performed new structural similarity searches for AlbA_2-containing proteins and studied the phylogenetic distribution of this effector domain across the tree of life (Fig. 2a,b). First, we found that AlbA_2 was not only present in Avs9-like prokaryotic proteins, but was widely represented in bacteria. Concerning eukaryotes, well-known AlbA_2-containing proteins included members of SAMD9/9L and SLFN antiviral immune factors, which, in our analysis, fell in distinct clades (except for SLFNL1, which is distantly related to other SLFNs45 and whose evolutionary history was uncertain), suggesting different evolutionary origins (Fig. 2b and Supplementary Fig. 2b,c). As for the ten identified Avs9s, they all clustered within a single clade that was distant from eukaryotic SAMD9/9Ls (Fig. 2b and Supplementary Fig. 2b,c), showing that full-length Avs9- and SAMD9/9L-like proteins did not originate from a single common ancestor. Overall, the presence of evolutionarily distant AlbA_2 domains, together with structural similarity and a similar domain organization, strongly suggests that bacterial Avs9s and eukaryotic SAMD9/9Ls evolved similar domain architectures through convergent evolution.

Fig. 2: Avs9 is part of prokaryotic defence systems and induces cell death in bacteria, through its SAMD9/9L/SLFN-analogous active site in the AlbA_2 domain.

a, Linear representation of the multidomain organization of key proteins bearing an AlbA_2 domain: prokaryotic Avs9 and proteins from the SAMD9/9Ls and SLFNs. b, Unrooted phylogenetic tree of an MSA of AlbA_2 domains detected in proteins across kingdoms of life, visualized on iTOL. The scale bar represents the number of substitutions per site. Shown for each clade is the defence score, that is, the fraction of bacterial AlbA_2 domains encoded in the vicinity of known defence systems, as well as a P value representing the significance of this association (Supplementary Table 1; Methods). Statistics were performed using two-sided binomial tests with FDR correction. c, Predicted structures of AlbA_2 domains of P. fluorescens A0A7T4VS34 (Avs9) and human SAMD9L, showing conserved SLFN-like nuclease catalytic site. Residues forming the catalytic sites are in light grey with their coordinates in red (Bacteria) or blue (Eukaryota). d, Tenfold serial dilutions of E. coli cells transformed with plasmids encoding Avs9, Avs9-D45N or RFP as a control, with induction (100 µM IPTG) or without induction (1% glucose). Shown are photographs of bacterial drops. Credit: Species silhouettes are from phylopic (https://www.phylopic.org) under a Creative Commons license: Homo sapiens sapiens, Edwin Price (CC0 1.0); Rattus tunneyi, Carly Monks (CC0 1.0); Acidimicrobium ferrooxidans, Matthew Crook (CC BY-SA 3.0).

Antiphage defence systems tend to physically cluster in defence islands of bacterial genomes46. To investigate the possible immune function of bacterial AlbA_2 domains, we used DefenseFinder to evaluate their propensity to be encoded in genomic neighbourhoods of known defence systems. We found a significant association of bacterial AlbA_2 domains with defence systems in all tested clades, including Avs9 (Fig. 2b), strongly suggesting that their primary function is immune defence.
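The association test described here (a defence score per clade, a two-sided binomial test and FDR correction) can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' pipeline: the clade names, counts and background rate are hypothetical, the exact binomial test sums the probabilities of all outcomes no more likely than the observed one, and Benjamini-Hochberg is assumed as the FDR procedure because the text does not name one.

```python
from math import comb


def binom_two_sided(k: int, n: int, p: float) -> float:
    """Exact two-sided binomial P value: sum of the probabilities of all
    outcomes that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-7)  # small tolerance for float ties
    return min(1.0, sum(x for x in pmf if x <= threshold))


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical data: (domains near a known defence system, domains in clade).
clades = {"clade_A": (18, 40), "clade_B": (9, 12), "clade_C": (30, 200)}
background = 0.05  # hypothetical genome-wide rate of defence neighbourhoods

scores = {name: k / n for name, (k, n) in clades.items()}
raw = [binom_two_sided(k, n, background) for k, n in clades.values()]
adj = benjamini_hochberg(raw)

for (name, score), q in zip(scores.items(), adj):
    print(f"{name}: defence score = {score:.2f}, adjusted P = {q:.3g}")
```

In practice, the background rate would have to be estimated from the same genomes, and the choice of that rate largely determines the resulting P values.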

AlbA_2 domain is a shared effector determinant of the activity of SAMD9/9Ls and Avs9s

To investigate the function of bacterial Avs9 proteins, we first looked for the conservation of the AlbA_2 catalytic site. This active ribonuclease site is composed of three to four negatively charged residues, which are necessary for SAMD9/9L and SLFN11/12/13/14 antiviral activities5,9,47,48,49,50,51,52. Here we found similar residues and structure in Avs9 AlbA_2 (E14, D19, D45, E65; Fig. 2c and Supplementary Fig. 2a), suggesting the presence of a similar catalytically active site and function.

Second, to investigate Avs9 activity in vivo, we selected and synthesized an Avs9 from P. fluorescens (A0A7T4VS34), and we cloned its full-length gene, or the AlbA_2 domain alone, under an inducible promoter into Escherichia coli. Expression of full-length Avs9 (Fig. 2d), or the AlbA_2 domain alone (Supplementary Fig. 2d), led to major cell death upon mild induction at 25 °C. Interestingly, the P. fluorescens Avs9 cell death induction was abolished by introducing a single-residue mutation in the AlbA_2 predicted catalytic site (D45N; Fig. 2d and Supplementary Fig. 2d). Of note, an equivalent mutation in human SAMD9/9L and SLFN AlbA_2 also abolishes their activity. Therefore, our results strongly suggest a cell-killing defence function of bacterial Avs9, dependent on AlbA_2 and its predicted nuclease active site.

Ancient duplication followed by frequent CNVs of the SAMD9/9L gene family in mammals

Although SAMD9/9Ls are present in several vertebrate hosts (Fig. 1b), the SAMD9/9L duplication at the origin of the human paralogous SAMD9 and SAMD9L genes seems more recent. This duplication was previously described in placental eutherians, after the divergence of marsupials from placentals22. To address the evolutionary history of the duplicated SAMD9/9L, we first performed genomic analyses of the SAMD9/9L gene family locus in various vertebrate species focusing on mammals: 38 ungulates, 22 chiropterans, 45 carnivores, 34 primates, 41 rodents and 27 additional species from other mammalian and non-mammalian vertebrate orders (Fig. 3a). We found that all these analysed vertebrate species, except monotremes (n = 2), exhibited at least one gene from the SAMD9/9L gene family (Fig. 3a).

Fig. 3: Ancient duplication followed by frequent CNVs of the SAMD9/9L gene family in mammals.

a, Representation of the SAMD9/9L gene family locus in each mammalian order and other vertebrates. Order cladogram is presented on the left with diamonds and circles on the branches representing events of gene gain and loss, respectively, in the major lineages only. Coloured rectangles represent the gene members of the SAMD9/9L gene family with orthologues to human SAMD9L and SAMD9 in blue and pink, respectively. Grey rectangles represent adjacent syntenic genes. b, Maximum likelihood phylogenetic tree (IQ-TREE, GTR + F + I + G4 substitution model) generated with selected SAMD9/9L homologues from eutherians and marsupials, as well as aves and amphibians, used as outgroups. Bootstraps are from 1,000 replicates. The complete tree is shown in Supplementary Fig. 3a. The scale bar represents the number of substitutions per site. c, Schematic diagrams of the origin of SAMD9/9L duplication in mammals. Top, grey tree in the background represents the species evolution of the three mammalian groups. Black tree inside the grey one represents the evolution of the SAMD9/9L gene tree. Bottom, alternative representation with the gene tree cladogram. Legend is embedded for a and c. d, For each order, CNVs in the SAMD9/9L gene family are indicated by histograms. Alignments and trees are given in Data availability. Bar colours (grey scale) represent the number of SAMD9/9L copies (legend embedded). Bar lengths indicate the proportion of genomes with the indicated number of copies for each order, with the total number of analysed genomes presented in parentheses for each order (aligned to a). Credit: Species silhouettes are from phylopic (https://www.phylopic.org) under a Creative Commons license: Ovis aries, Andrés Delgado (CC0 1.0); Pliohippus, Zimices (CC BY-SA 3.0); Plecotus austriacus, Andy Wilson (CC0 1.0); Canis lupus monstrabilis, Tracy A. Heath (CC0 1.0); Gorilla gorilla, Andy Wilson (CC0 1.0); Apodemus sylvaticus, Anthony Caravaggi (CC0 1.0); Elephas maximus, Andy Wilson (CC0 1.0); Phascolarctos cinereus, Gavin Prideaux (CC0 1.0); Ornithorhynchus anatinus, Sarah Werning (CC0 1.0); Accipiter gentilis, Andy Wilson (CC0 1.0).

Furthermore, thanks to genomic sequence advances and contrary to previous findings22, our observations indicated that marsupials also possess two copies and therefore the gene duplication of SAMD9/9Ls was not restricted to eutherians (Fig. 3a,b). To elucidate the origin of SAMD9/9L in mammals, we performed phylogenetic analyses from 301 homologue sequences of 189 mammalian species spanning 160 million years of divergence (using the National Center for Biotechnology Information (NCBI) blastn implemented in DGINN53) and from 18 non-mammalian vertebrate species (12 aves and 6 amphibians) as outgroups (Fig. 3b with selected species and IQ-TREE tree, Supplementary Fig. 3a,b for the complete IQ-TREE and PhyML trees using sequences from Supplementary Table 3; of note, PhyML trees were similar to IQ-TREE trees at key branches). Despite the presence of two copies in marsupial genomes (Fig. 3a), one of them, named SAMD9m, did not group with the eutherian SAMD9 or SAMD9L clade in the homologous gene tree, but branched outside (Fig. 3b, statistical significance assessed from 1,000 bootstrap replicates). Therefore, it is likely that the marsupial SAMD9m resulted from an ancestral independent duplication predating the divergence of marsupials and eutherians (Fig. 3b). Following this divergence, one copy was potentially lost in eutherians, followed by a subsequent duplication event (Fig. 3c). Alternatively, other hypotheses may involve a single duplication event followed by gene conversion within the two eutherian copies, or loss of a copy through incomplete lineage sorting.

In placentals, following the duplication event that gave rise to SAMD9 and SAMD9L orthologues, we identified at least ten independent losses of one of the two paralogues throughout mammalian evolution (five losses in the main represented lineages of Fig. 3a, as well as additional losses within orders: Fig. 3d and Supplementary Fig. 3c). Notably, artiodactyls experienced the loss of SAMD9L, while carnivores lost SAMD9. Furthermore, although the synteny and the copy numbers remained largely conserved during the diversification of these last two groups, the radiations of primates, bats and rodents exhibited many SAMD9/9L gene losses and some duplication events (Fig. 3d and Supplementary Fig. 3c; Data availability).

Ancient and recent unfixed gene losses during primate SAMD9/9L evolution, through different genetic and genomic mechanisms

Genomic structural changes, such as gene duplication and loss (copy number variations (CNVs)), alongside mutation and recombination events, are an important source of genetic diversity upon which natural selection can act, and thus have strong adaptive potential during virus–host arms races54,55. Such genomic adaptations during mammalian evolution have been rampant in response to past lentiviral or poxviral epidemics30,56. Because of (1) the very high rate of CNVs in primate SAMD9/9Ls, (2) the co-evolution of primates with lentiviruses for millions of years32 and (3) the potential lentivirus-driven adaptation of SAMD9/9L (ref. 5), we performed in-depth phylogenomic, genetic and positive selection analyses in primates.

We identified at least four independent gene loss events during primate evolution (Fig. 4a). SAMD9L was lost in prosimians, while bonobos and the common ancestors of Platyrrhini and Colobinae experienced the independent loss of SAMD9. Genomic analyses showed that SAMD9 loss in Platyrrhini resulted from the complete loss of the SAMD9 genomic locus in the common ancestor. In contrast, SAMD9 loss in Colobinae occurred through a different genetic mechanism by single nucleotide changes introducing several premature stop codons in the coding sequence (Fig. 4a and Supplementary Fig. 4a).

Fig. 4: Ancient and recent unfixed gene losses in primates, through different genetic and genomic mechanisms.

a, Representation of the SAMD9/9L genomic locus from primate genomes. Species cladogram is presented on the left. SAMD9L and SAMD9 are in blue and pink, respectively. Adjacent syntenic genes are in grey. Genes containing early stop codons are hatched. b, Africa map with inset representing the current geographic ranges of Pan populations and their status regarding natural SIV infections. Numbers of individuals studied for their whole-genome sequences are indicated for each (sub)species. Populations naturally infected by SIVcpz are highlighted by a virus symbol. c, SAMD9/9L genomic locus alignment among Pan individuals showing recent unfixed loss of SAMD9 in the bonobo population.

We next specifically tested whether episodes of positive selection occurred in SAMD9 or SAMD9L in some primate lineages that experienced paralogue loss. We performed targeted branch-specific analyses, using the adaptive branch-site random effects likelihood (aBSREL), testing specifically lineages associated with gene loss or paralogue retention (‘tested’ branches, also known as ‘foreground’ branches) in the SAMD9 or the SAMD9L gene trees (other branches were set as ‘background’ branches)57 (Supplementary Fig. 4b,c). Our analysis suggested that several primate lineages that lost either SAMD9 or SAMD9L might have been the targets of episodic positive selection (Supplementary Fig. 4b,c). These losses could result from adaptation to ancient pathogen challenges, the cost of maintaining the two copies or relaxation of functional constraints that occurred on specific branches.

Bonobos and chimpanzees are humans’ closest living relatives, with a genetic divergence of approximately 1.3% from humans and only 0.4% between themselves58. Despite this strong proximity, bonobos possess a unique genetic distinction, with a 41.46 kilobase (kb) deletion in the SAMD9/9L locus. In fact, they stand out as the sole hominid with a single copy of the SAMD9/9L gene family, retaining only SAMD9L (ref. 59). To characterize the recent loss of SAMD9, we investigated the prevalence of the 41.46 kb deletion in the SAMD9/9L locus in bonobos at a population level (Pan paniscus; n = 13) in a joint analysis that additionally included all currently recognized chimpanzee subspecies or populations (Pan troglodytes spp.; n = 59) (Fig. 4b). We found that, among 13 individuals, 10 bonobos exhibited the same deletion with the complete absence of the SAMD9 gene, indicating a common homozygous genomic deletion (Fig. 4c). However, the remaining three bonobos presented this deletion on a single chromosome, representing a heterozygous genomic absence of SAMD9 (Fig. 4c). Importantly, the analysed bonobo individuals are not related60, suggesting that, despite the small sample size, the heterogeneous genomic makeup in SAMD9/9L may be representative of the population. This was in sharp contrast with chimpanzees, which had SAMD9+/+ present in all 59 individuals, a pattern shared with humans (Fig. 4c). This suggests a recent SAMD9 genomic loss event specific to the bonobo lineage still segregating in the population, potentially impacting the immunity of bonobos.
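The genotype counts reported above translate directly into a deletion allele frequency. The short Python sketch below makes that arithmetic explicit, assuming a diploid autosomal locus and treating the sampled individuals as unrelated; the function name is hypothetical.

```python
def deletion_allele_frequency(hom_del: int, het: int, hom_present: int) -> float:
    """Frequency of the deletion allele from diploid genotype counts."""
    individuals = hom_del + het + hom_present
    return (2 * hom_del + het) / (2 * individuals)


# Counts from the text: 10 SAMD9-/- and 3 SAMD9+/- bonobos (n = 13),
# and 59 SAMD9+/+ chimpanzees.
bonobo_freq = deletion_allele_frequency(hom_del=10, het=3, hom_present=0)
chimp_freq = deletion_allele_frequency(hom_del=0, het=0, hom_present=59)

print(f"Bonobo deletion allele frequency:     {bonobo_freq:.3f}")  # 23/26
print(f"Chimpanzee deletion allele frequency: {chimp_freq:.3f}")
```

With 23 of 26 sampled bonobo chromosomes carrying the deletion, the allele is at high frequency but not fixed, which matches the description of a loss still segregating in the population.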

Chimpanzee and bonobo (Pan) SAMD9Ls have an increased anti-HIV-1 activity compared with human SAMD9L

Beyond the genomic loss that occurred in the SAMD9/9L locus during hominid evolution, we investigated the evolutionary divergence of SAMD9 and SAMD9L at the genetic level. We therefore analysed the non-synonymous single nucleotide polymorphisms (SNPs) within the coding sequences of both genes for the 72 Pan individuals, using panTro6 genome as reference, as well as for more than 4,000 humans (Fig. 5a and Supplementary Fig. 5a,b). For SAMD9, we found very few non-synonymous SNPs amongst Pan. This amino acid conservation was particularly exemplified in the 12 Western chimpanzees (Pan troglodytes verus) and in the 3 bonobos encoding a single SAMD9 copy, as they encoded an identical protein sequence (Fig. 5a). In SAMD9L, we found a more widespread distribution of missense SNPs in chimpanzees and bonobos (Fig. 5a). Remarkably, we found two specific variants in SAMD9L that stood out because of their high frequency in the bonobo population. All bonobos (n = 13) encoded a homozygous serine (S) at position 90 and 9 out of 13 bonobos encoded an arginine (R) at position 1,446 (either homozygous or heterozygous) (Fig. 5a and Supplementary Fig. 5a). Of note, there were no SAMD9L SNPs differentiating SAMD9+/− and SAMD9−/− bonobos. Intriguingly, chimpanzees and all other primates analysed to date, including the 4,099 human genomes, encoded a leucine L90 and a lysine K1446 (Fig. 5a and Supplementary Fig. 5a,c), suggesting that these variants are specific to bonobos amongst primates. Furthermore, compared with bonobos, Eastern and Central chimpanzees showed a more widespread distribution of SNPs within SAMD9L, and no missense polymorphisms appeared at high frequency in the chimpanzees (Fig. 5a and Supplementary Fig. 5a).

Fig. 5: Chimpanzee and bonobo (Pan) SAMD9L major variants have an increased restrictive activity against HIV-1 pWITO, but not SIVcpz EK505, compared with human SAMD9L.

a, Polymorphisms impacting SAMD9 and SAMD9L amino acid coding sequences among the Pan populations, with chimpanzee PanTro6 as reference. Highly frequent bonobo-specific SNPs in SAMD9L are highlighted by orange triangles at the top. See Supplementary Fig. 5 for details, human polymorphisms and comparative analyses with other primate species. b, Location of the chimpanzee- and bonobo-specific variants as compared with human SAMD9L on the two-dimensional predicted protein domain structure (n = 8 for chimpanzees versus humans and n = 9 or 10 for bonobos versus humans). c, Experimental setup to investigate chimpanzee and bonobo SAMD9L restriction on replication of full-length HIV-1 and SIVcpz IMCs. d, Relative infectious virus yields of HIV-1 pWITO in indicated SAMD9L conditions, normalized to the empty control (experimental setup in c). The two most frequent variants at position 1,446 in the bonobo populations were functionally tested. Results are from five independent biological replicates. Data are presented as mean ± s.d. Statistics were performed using the two-sided ratio paired t-test versus the human SAMD9L condition (**P < 0.005; *P < 0.05). Of note, all SAMD9Ls significantly restricted HIV-1 pWITO as compared with the empty control (P < 0.005). Below, western blot analysis in the producer cells showing similar protein expression of SAMD9Ls. Loading control is from total proteins (prestained gels). e, Similar experiment to c and d, with SIVcpz EK505 strain. Statistics were performed using the two-sided ratio paired t-test versus the human SAMD9L condition (NS, not significant) and versus the empty control (**P < 0.005). Luc, Luciferase; RLU, Relative light units.

Human SAMD9L inhibits cellular protein synthesis and restricts lentiviral HIV-1 infection5,8,61,62. This is particularly interesting in the context of natural lentiviral infections in hominids, where humans and two chimpanzee subspecies are infected by HIV-1 and SIVcpz, respectively. Yet, the two other chimpanzee subspecies (Western and Nigerian-Cameroon chimpanzees) and bonobos have no evidence of modern natural lentiviral infections25,26,27,63 (Fig. 4b). Furthermore, pandemic HIV-1 in humans originated from cross-species transmission of SIVcpzPtt from Central chimpanzees (Pan troglodytes troglodytes)27,64.

We therefore determined the functional consequences on lentiviral infections of the genotypic differences between human, chimpanzee and bonobo SAMD9Ls. We cloned the native chimpanzee SAMD9L and the two major variants of bonobo SAMD9L (bonobo R1446 and bonobo K1446) in an expression plasmid to compare them with human SAMD9L (Fig. 5b,c). Of note, the chimpanzee SAMD9L has eight amino acid changes compared with the human one (Fig. 5b). We investigated their functions in the context of HIV-1 replication, as in ref. 5, as well as SIVcpzPtt replication (SIVcpz EK505, a kind gift from B. Hahn27,65). Briefly, we cotransfected 293T cells with the mCherry-SAMD9L plasmids along with an infectious molecular clone (IMC) encoding a transmitted/founder HIV-1 natural strain (pWITO, as in ref. 5) or an SIVcpzPtt strain (EK505). Two days later, we measured viral and cell protein expression in the producer cells (Fig. 5c–e and Supplementary Fig. 5d) and the infectious virus yield by TZM-bl reporter assay (Fig. 5c–e). Remarkably, although all ectopic SAMD9Ls were expressed at similar levels, we found that chimpanzee and bonobo SAMD9Ls had a significant increase in anti-HIV-1 pWITO activity compared with human SAMD9L (Fig. 5d, P < 0.005), suggesting some species-specificity.
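The figure legends report a two-sided ratio paired t-test against the human SAMD9L condition. A ratio paired t-test is commonly computed as a paired t-test on log-transformed values, and the sketch below follows that convention. It is an illustration only: the replicate values are hypothetical, the function name is invented for the example, and the code stops at the t statistic and degrees of freedom because converting them to a P value requires the Student t distribution, which is not in the Python standard library.

```python
from math import exp, log, sqrt
from statistics import mean, stdev


def ratio_paired_t(sample: list[float], reference: list[float]):
    """Paired t-test on log ratios. Returns (t statistic, degrees of
    freedom, geometric mean of the sample/reference ratios)."""
    if len(sample) != len(reference) or len(sample) < 2:
        raise ValueError("need at least two matched replicate pairs")
    log_ratios = [log(s / r) for s, r in zip(sample, reference)]
    n = len(log_ratios)
    standard_error = stdev(log_ratios) / sqrt(n)
    t_stat = mean(log_ratios) / standard_error
    return t_stat, n - 1, exp(mean(log_ratios))


# Hypothetical relative infectious yields (normalized to the empty control),
# one value per independent replicate, paired by experiment.
variant_yield = [0.10, 0.20, 0.10]
human_yield = [0.20, 0.40, 0.40]

t_stat, dof, geo_ratio = ratio_paired_t(variant_yield, human_yield)
print(f"t = {t_stat:.2f}, df = {dof}, geometric mean ratio = {geo_ratio:.3f}")
```

Working on log ratios treats a twofold decrease and a twofold increase symmetrically, which suits yields that are normalized within each experiment. If SciPy is available, a two-sided P value could be obtained from the t statistic with `2 * scipy.stats.t.sf(abs(t_stat), dof)`.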

By testing effects on SIVcpzPtt EK505, we first showed that human SAMD9L was also restrictive against this SIVcpzPtt strain (Fig. 5e). Yet, chimpanzee and bonobo SAMD9Ls did not show increased anti-SIVcpz activity compared with human SAMD9L (Fig. 5e), suggesting possible lentiviral-strain specificity. Therefore, the increased antiviral activity of chimpanzee and bonobo SAMD9Ls, as compared with human SAMD9L, seems specific to HIV-1 (Fig. 5d,e). It is further possible that SIVcpz, naturally circulating in chimpanzee populations, adapted to the increased antiviral function of Pan SAMD9L.

Bonobo-specific polymorphisms confer an increased anti-HIV-1 activity to human SAMD9L without compromising cellular protein synthesis

We wondered whether minimal changes in human SAMD9L, informed by some of these natural Pan variants, could impact human SAMD9L functions. We specifically tested the bonobo SNPs, which could constitute species-specific adaptations with functional implications. We therefore cloned the L90S and/or K1446R variants in the context of the human SAMD9L plasmid and investigated their effects on two key functions of human SAMD9L: antilentiviral function and cellular protein synthesis shutdown (Fig. 6).

Fig. 6: Bonobo-specific polymorphisms enhance human SAMD9L anti-HIV-1 activity, without affecting its translation shutdown effect.

a,b, Relative infectious virus yields of HIV-1 pWITO (a) and SIVcpz EK505 (b) in indicated SAMD9L conditions (hSAMD9L, human SAMD9L), normalized to the empty control, with four independent biological replicates. Data are presented as mean ± s.d. Statistics were performed using the two-sided ratio paired t-test versus the human SAMD9L condition (***P < 0.0005; *P < 0.05). Below, western blot analysis in the producer cells showing similar protein expression of SAMD9Ls. Loading control is from total proteins (prestained gels). In b, empty and hSAMD9L wild-type conditions are identical to Fig. 5e. c, Bonobo-specific SNPs do not modify human SAMD9L restriction of cellular translation. Quantification of protein synthesis assay was performed in two independent biological replicates in the context of ectopic expression of SAMD9Ls. Two doses of input DNA plasmids per condition were tested. HPG MFI ratio was calculated within each experimental condition using the MFI of the mCherry+ cells (expressing mCherry-SAMD9L) normalized to the MFI of the mCherry− cells (not expressing SAMD9L). MFI, median fluorescence intensity.


First, we investigated the effect of these bonobo-specific variants on human SAMD9L function in the context of lentiviral replication, as in Fig. 5. Interestingly, we found that ectopic SAMD9L-L90S/K1446R and the single variants were expressed at a similar level to wild-type SAMD9L, but had a significant twofold increase in anti-HIV-1 pWITO activity (Fig. 6a). The double-mutant SAMD9L-L90S/K1446R appeared the most restrictive, while the single mutants had intermediate effects, suggesting additive functions (Fig. 6a). The increased effects of the human SAMD9L mutant with L90S and/or K1446R seemed independent of HIV-1 protein translation shutdown (Supplementary Fig. 6a).

Second, we tested the activity of SAMD9L-L90S/K1446R on SIVcpzPtt EK505 replication and found that, similarly to wild-type Pan SAMD9Ls, the specific SNPs in the context of the human SAMD9L did not increase its anti-SIVcpz function (Fig. 6b).

Lastly, we assessed the extent of whole cellular translation in human 293T cells transfected with high doses of wild-type human mCherry-SAMD9L, mCherry-SAMD9L-L90S/K1446R or the single mutants. We used Click-iT l-homopropargylglycine (HPG) synthesis assays, which measure by flow cytometry the incorporation of HPG, an analogue of methionine, into newly synthesized proteins (Supplementary Fig. 6b). As previously shown5,8,61,62, we found a dose-dependent shutdown of cellular translation in cells ectopically expressing human SAMD9L (mCherry+) normalized to the control mCherry− cells (Fig. 6c and Supplementary Fig. 6b). Importantly, we found that hSAMD9L-L90S/K1446R and the single variants showed no differences in cellular translation repression compared with the human wild-type SAMD9L (Fig. 6c). This suggests that the non-synonymous bonobo variants in the SAM and TPR domains do not change the activity of human SAMD9L on cellular protein synthesis.

Overall, bonobo-specific polymorphisms specifically enhance human SAMD9L antiviral function against HIV-1, without affecting the translation shutdown function.

Discussion

This study unveils key aspects of the functional evolution of SAMD9/9Ls at different timescales, highlighting multidomain and functional convergence between metazoan SAMD9/9Ls and prokaryotic Avs9s, as well as recurrent genetic and genomic adaptations in mammals from ancient to very recent times. On one hand, we identified SAMD9/9L structural analogues in bacterial defence systems that induce cell death with similar AlbA_2 effector determinants, suggesting a remarkable shared immune factor across billions of years. On the other hand, our analyses of mammalian SAMD9/9L revealed dynamic and episodic adaptations, notably in primates, probably in response to epidemics, including from lentiviruses. By an ‘immuno-evo’ framework, we bring key insights to understand the duality between the maintenance of a key antiviral shared immunity and its constant adaptations to pathogens.

SAMD9/9L as shared immunity in prokaryotes and metazoans

The SAMD9/9L gene family may be part of the ‘shared immunity’, in which antiviral mechanisms are similar between life kingdoms, resulting from (1) horizontal gene transfer from bacteria, (2) vertical inheritance originating from LUCA (last universal common ancestor) or (3) convergent evolution16,18,66. The list of shared immune defence systems is rapidly expanding with currently about a dozen identified antiviral systems, including cGAS, Viperin and TLRs18. Here we revealed striking structural similarity between human antiviral SAMD9/9L and prokaryotic Avs proteins37, specifically with a new Avs protein family (Avs9). Notably, they both share the key nuclease domain, AlbA_2, which, in human SAMD9/9L, is responsible for phenylalanine transfer RNA (tRNAPhe) cleavage and viral and cellular translational inhibition5,9. In Avs9 from P. fluorescens, we uncovered cell-killing activity, which also depended on AlbA_2 and its predicted nuclease site. Future biochemical and functional investigations will determine whether Avs9 has bona fide nuclease activity (and on which substrate) and whether it has specific antiphage functions. Furthermore, similar to the Avs system, in which infected bacteria use an altruistic self-killing mechanism for the benefit of the colony37, tight regulation of SAMD9/9L is probably essential and is probably enabled by the intermediate and C-terminal domains8.

The phylogenetic relationships of the prokaryotic Avs9 and metazoan SAMD9/9L protein families strongly suggest that they result from evolutionary convergence, possibly driven by analogous viral selective pressure. Moreover, domain shuffling is an important driver of protein evolution and can give rise to similar advantageous multidomain combinations. This work exemplifies a surprising evolutionary and functional parallel through the independent emergence of analogous antiviral proteins across vast evolutionary timescales.

Extensive, and probably adaptive, SAMD9/9L CNVs

Despite its simultaneous presence in evolutionarily distant organisms, we globally found a patchy distribution of SAMD9/9L homologues and structural analogues across domains of life. For example, plants, fungi, algae and protozoa do not seem to harbour complete SAMD9/9L homologues. Furthermore, we found a very rapid evolution of SAMD9/9L at the genomic and genetic levels during mammalian evolution. Therefore, this ancient conservation is concomitant with rapid evolution, probably as the result of virus–host arms races. Gene loss and duplication may, for example, provide an advantage, similar to observations in other innate immune genes, such as the APOBEC3 family31,67,68,69 and many other factors1. It is noteworthy that most genomic variations were gene losses, rather than extensive duplications, at least in most analysed mammalian species. These may be the result of evolutionary trade-offs, where the benefits of losing a gene (potentially escaping viral antagonism or hijacking) outweighed the costs.

At the mechanistic level, while most SAMD9/9L losses occurred by genomic loss of a chromosomal region, Colobinae primates seem to have lost SAMD9 by early stop codons and pseudogenization. However, it cannot be excluded that, in some species, this may lead to the expression of a truncated SAMD9 retaining the AlbA_2 effector function but lacking the crucial regulatory intermediate and C-terminal domains, similar to some SAMD9L autoinflammatory gain-of-function variants, for example8,61. Further study on its selective pressure, and on messenger RNA transcript and protein expression in natural tissues from Colobinae species under immune stimulation would resolve this question.

The specific case of SAMD9 unfixed loss in bonobos and adaptive chimpanzee and bonobo SAMD9Ls: modern implications, functions and potential past lentiviral drivers

Bonobos harbour a recent unfixed loss of SAMD9, which occurred through a large chromosomal deletion. Despite bonobos, chimpanzees and humans being closely related, the variability observed at this locus is intriguing. Two bonobo-specific missense polymorphisms in SAMD9L that confer an increased antiviral activity against HIV-1 are located in the SAM and TPR domains, which could be involved in protein–protein or protein–RNA interactions37,70,71,72. One possibility is that the variants may modulate SAMD9L sensing, especially for the TPR domains, which have been reported to act as (viral) sensors in Avs and IFIT proteins (interferon-induced protein with TPRs)37,73. Further, although most human deleterious gain-of-function mutations in SAMD9/9L-associated diseases are in the P-loop NTPase domain, some are described in the TPR or SAM domains74. However, we did not observe a gain-of-function phenotype on global cellular translation for the bonobo-specific variants. This therefore suggests that the variants may not destabilize the inactive closed form of the protein, nor change SAMD9L basal activities. Instead, they might modify its specificity and sensitivity in viral sensing, potentially adapting its interface with viruses. Alternatively, they may impact other potential functions of SAMD9L, for example in endosomal trafficking, increasing specific anti-HIV functions5,10,11. It would also be interesting to determine if, and how, chimpanzee and bonobo SAMD9Ls restrict other infections from poxviruses or other RNA viruses3,12, and whether those may have driven some of the adaptations.

Our data show that chimpanzee and bonobo SAMD9Ls have an increased anti-HIV-1 phenotype compared with human. The additional loss of the pro-HIV-1 SAMD9 (ref. 5) may have been particularly advantageous (overall increased fitness) during past lentiviral infections in bonobo ancestors (that is, increased antiviral SAMD9L and loss of prolentiviral SAMD9). The presence of 3 of 13 unrelated bonobos with 1 SAMD9/SAMD9L allele and 1 SAMD9L-only allele suggests that SAMD9 loss is unfixed, and probably recent. If selection favours individuals without SAMD9, the gene could eventually be completely lost in the bonobo population. The modern genomic makeup of the SAMD9/9L locus in chimpanzees, and even more so in bonobos, therefore suggests adaptation to lentiviral-like epidemics that occurred in Pan, as well as since the bonobo–chimpanzee divergence. In the future, performing this evo-functional study with a larger bonobo sample size, genomic phased data and long-read sequencing would enable robust determination of the exact selective pressures shaping antiviral defence mechanisms in this species. It would also help in determining the haplotype structures on which the SAMD9 gene was lost and the SAMD9L substitutions occurred, potentially providing further insight into the epistatic interactions between these genetic changes.

Unlike some chimpanzees and humans that are infected by SIVcpz and HIVs, respectively, and suffer from AIDS symptoms75,76,77, modern bonobos are not known to be naturally infected by any lentiviruses25,26. Overall, SAMD9/9L adaptation may nowadays participate, with other factors78, in bonobo population resistance against lentiviral/SIV infections.

Finally, it is noteworthy that in this study SIVcpzPtt EK505 did not show an increased sensitivity to chimpanzee and bonobo SAMD9Ls, or to Pan SNPs in the context of human SAMD9L. This may be the result of virus–host co-evolution1,30, where SIVcpz has adapted to the natural genetic makeup of its host, particularly of chimpanzee antiviral innate immunity.

Altogether, our findings highlight the strength of evo-immuno approaches in unravelling links between the evolutionary history of innate immunity and contemporary challenges in human health. The identification of SAMD9/9L homologues and structural–functional analogues across diverse taxa, such as prokaryotes and primates, shows a shared convergent immunity. Common challenges, such as fighting viral infections, drive both the conservation or convergence of key immune systems and their rapid evolution through arms races. In this regard, using diverse models (human, diverse eukaryotic cells and bacteria) and natural variants in closely related species for functional studies can bring valuable insights with broader medical applications, such as the incorporation of potentiator mutations in antiviral factors (protein engineering) or the use of bacterial antiviral proteins that could act against human viruses.

Methods

Comparative genomics, phylogenetics and positive selection analyses in mammals

To obtain the coding sequences of the SAMD9 and SAMD9L homologues in bats, rodents, primates, ungulates and carnivores, we used the detection of genetic innovations (DGINN) pipeline53 with, respectively, Myotis myotis, Rattus norvegicus, Homo sapiens, Hippopotamus amphibius and Phoca vitulina Refseq SAMD9 and SAMD9L, as queries. Briefly, the coding sequences from each group were automatically retrieved with NCBI blastn79,80, cleaned and aligned with MAFFT81. Homologous sequences from marsupials, aves and amphibians were retrieved using NCBI blastn. Of note, these analyses are based on publicly available genome annotations (not necessarily genes annotated as ‘SAMD9/9L-like’ but regions annotated as coding regions), so it is not excluded that some unannotated genes were not analysed. The species and accession numbers are presented in Supplementary Table 2. Nucleotide alignments of SAMD9 or SAMD9L from each mammalian group were manually curated before being used as input in DGINN for automatic codon alignment using the probabilistic alignment kit (PRANK)82 and phylogenetic tree building using PhyML83 (with default settings in DGINN).

Furthermore, all codon alignments, as well as outgroup sequences, were realigned in three steps to obtain a high-quality mammalian-wide codon alignment: (1) alignment with MUSCLE84, (2) manual curation of the sequences and (3) codon alignment with PRANK. A phylogenetic tree was inferred from this alignment using the IQ-TREE webserver (GTR + F + I + G4 identified as the best substitution model by ModelFinder implemented in IQ-TREE)85. We also performed analyses with PhyML (best model estimated from Smart model selection, SMS: GTR + R). These analyses allowed us to attribute the phylogenetically aware ‘SAMD9’ or ‘SAMD9L’ nomenclature.

We next tested for positive selection occurring at specific branches using the aBSREL program on the DataMonkey webserver57. We used as inputs the codon alignments of (1) primate SAMD9 with Cricetulus griseus (criGri) and Tupaia chinensis (tupChi) SAMD9s as outgroups and (2) primate SAMD9L with Cricetulus griseus (criGri) SAMD9L as outgroup. Each branch that we tested for evidence of positive selection was defined as ‘tested branch’ (that is, ‘foreground’) and the remaining as ‘background branches’. Two models were fit to each tested branch: one that allows for episodic diversifying selection (with ω > 1) and one that does not. A likelihood ratio test is then used to compare these models and assess whether the tested branch shows evidence of positive selection. For branches where selection is detected, aBSREL also estimates the proportion of codon sites that are subject to positive selection.
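The per-branch comparison described above reduces to a likelihood ratio test between two nested fits. The following is a minimal sketch, assuming the two log-likelihoods of each tested branch are already available. The function names, the example log-likelihood values, the χ² reference distribution with 2 degrees of freedom and the Holm-Bonferroni step are illustrative assumptions; aBSREL applies its own reference distribution and multiple-testing correction, so this is not the exact procedure run on the DataMonkey webserver.

```python
import math

def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null), floored at 0."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))

def chi2_sf_df2(x: float) -> float:
    """Survival function of a chi-square distribution with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

def holm_bonferroni(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * p_values[idx]))
        adjusted[idx] = running_max
    return adjusted

# Hypothetical (null, alternative) log-likelihoods for three tested branches.
branches = {
    "branch_A": (-10234.6, -10226.1),
    "branch_B": (-10234.6, -10234.2),
    "branch_C": (-10234.6, -10230.0),
}
raw_p = {name: chi2_sf_df2(lrt_statistic(l0, l1)) for name, (l0, l1) in branches.items()}
adjusted_p = dict(zip(raw_p, holm_bonferroni(list(raw_p.values()))))
```

A branch would be reported as showing evidence of episodic diversifying selection when its corrected p-value falls below the chosen significance threshold.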

For cases in which we suspected gene losses, we confirmed the absence of coding genes by several methods. We analysed the SAMD9/9L genomic locus (between HEPACAM2 and CDK6 syntenic genes) on NCBI genome data viewer of specific species. We verified that there were no missing data or low sequence quality in this genomic region. Pseudogenes, here identified by several and very early stop codons, were only analysed systematically in primates. For the two monotreme species, in which no SAMD9/9L homologues could be retrieved by genome-wide blast, the HEPACAM2-CDK6 genomic locus was retrieved. We found no missing data (no ‘N’) and no homology using blast or alignments (with relaxed parameters) of non-annotated regions with human SAMD9/9L.
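As an illustration of the pseudogene criterion above (several very early stop codons), the sketch below scans an in-frame coding sequence for premature stops. It assumes an ungapped nucleotide sequence that starts in frame; the function name and the toy sequences are hypothetical and are not taken from the study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def premature_stops(cds: str):
    """Return 1-based codon positions of in-frame stop codons before the last codon."""
    seq = cds.upper()
    n_codons = len(seq) // 3
    return [
        i + 1
        for i in range(n_codons - 1)          # the final codon is the expected stop
        if seq[3 * i: 3 * i + 3] in STOP_CODONS
    ]

# Toy examples: the first sequence has stops at codons 2 and 4, the second has none.
disrupted = premature_stops("ATGTAAGGGTGACCCTAA")
intact = premature_stops("ATGGGGCCCTAA")
```

Positions close to the start of the coding sequence, and present in several copies, would match the "several and very early stop codons" criterion used for primates.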

Genome alignment and SNPs analyses in hominids

The genomic sequences of 13 bonobos, 59 chimpanzees and 4,099 humans were retrieved from public online databases: 1000 Genomes Project, Human Genome Diversity Project (HGDP), and NCBI bioprojects PRJNA189439, SRP018689 and PRJEB15083 (refs. 60,86; The 1000 Genomes Project Consortium 2015). DNA sequences from Pan individuals were aligned to the Clint_PTRv2/panTro6 reference genome using BWA-MEM87. Then, variant calling was done using FreeBayes88 to obtain a vcf file. The SAMD9 and SAMD9L locus region (chromosome 7: 88599930–89350079 in panTro6) was extracted and parsed using an ad hoc R script based on the VariantAnnotation, GenomicFeatures, AnnotationHub, org.Pt.eg.db and ggplot2 R packages. This script was used to identify non-synonymous variants among SNPs and to visualize them. The equivalent genomic region in human was retrieved by using the LiftOver tool of the UCSC genome browser (chromosome 7: 92540798–93289621 in the human reference GRCh38/hg38). These coordinates were then used to subset the HGDP + 1000 Genomes Project vcf for this region.
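The region extraction described above can be sketched as a simple filter over VCF records. This is a minimal Python illustration, not the ad hoc R script used in the study; the function names, the contig name "chr7" and the toy records are assumptions, while the coordinates are those given in the text for panTro6.

```python
def in_region(vcf_line: str, chrom: str, start: int, end: int) -> bool:
    """True if a VCF data line falls inside chrom:start-end (1-based, inclusive)."""
    fields = vcf_line.rstrip("\n").split("\t")
    return fields[0] == chrom and start <= int(fields[1]) <= end

def subset_vcf(lines, chrom, start, end):
    """Yield header lines and the records located inside the requested region."""
    for line in lines:
        if line.startswith("#") or in_region(line, chrom, start, end):
            yield line

# Coordinates from the text; the contig name "chr7" is an assumption about the VCF.
SAMD9_LOCUS_PANTRO6 = ("chr7", 88599930, 89350079)

# Toy records (hypothetical): one just outside, one inside, one on another contig.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr7\t88599929\t.\tA\tG\t50\tPASS\t.\n",
    "chr7\t88600000\t.\tC\tT\t50\tPASS\t.\n",
    "chr8\t88600000\t.\tC\tT\t50\tPASS\t.\n",
]
kept = list(subset_vcf(example, *SAMD9_LOCUS_PANTRO6))
```

For genome-scale files, an indexed query with a dedicated VCF tool would be preferable to a linear scan.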

Structure similarity search, structurally aware alignment and phylogenetic analyses across kingdoms of life

Protein structures were obtained from AlphaFold DB89,90 and RCSB PDB91. Foldseek33 was used for detection of structural similarity. We used a sequential strategy. First, we queried Foldseek v.427df8a with SAMD9 (UniProt ID Q5K651) and SAMD9L (Q8IVG5) against the AlphaFold database clustered at 50% sequence identity (AFDB50), using a 30% TM-score threshold and a maximum E-value of 0.0001 (default settings). Then, we constructed a query database consisting of SAMD9 (Q5K651), SAMD9L (Q8IVG5) and the bacterial top hits, namely AVAST type V (A0A1F9N8W4) and Avs9 (A0A100VJR7, A0A2S4Y961 and A0A7T4VS34), and used it for searches under the same settings. Search hits were subsequently filtered for a minimum 80% query coverage to cover at least 1,000 amino acids of the query structures, resulting in 238 analogue structures. FoldMason v.333d54c (ref. 92) was used to generate a multiple structure alignment (MSTA) of the identified structural analogues and MUSCLE v.5 was used to generate a multiple amino acid sequence alignment (MSA). The domain coordinates used for presence/absence analysis of each domain or for the extraction of a given domain MSTA are presented in Supplementary Fig. 1c and are based on the predicted SAMD9 three-dimensional structure. Additionally, using FoldMason, an MSTA was generated on the human SLFNs (SLFNL1, SLFN5 and SLFN11–14, respectively, corresponding to UniProt IDs Q499Z3, Q08AF3, Q7Z7L1, Q8IYM2, Q68D06 and P0C7P3) with the 23 of the 238 hits containing an AlbA_2 domain. Phylogenetic trees were constructed from both the MSA and MSTA using IQ-TREE 2.3.0 (ref. 93) with the LG + F + G4 substitution model and 1,000 bootstrap replicates, and visualized using the ggtree R library94 or interactive tree of life (iTol)95.
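The hit-filtering step above (minimum 80% query coverage, maximum E-value of 0.0001) can be sketched as follows. The `Hit` record, its field names and the example values are hypothetical; they stand in for whatever columns are exported from the search results, and the coverage is simply computed from the aligned query coordinates.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    query: str
    target: str
    qstart: int    # first aligned query residue (1-based, inclusive)
    qend: int      # last aligned query residue (inclusive)
    qlen: int      # full length of the query structure
    evalue: float

def query_coverage(hit: Hit) -> float:
    """Fraction of the query covered by the alignment."""
    return (hit.qend - hit.qstart + 1) / hit.qlen

def filter_hits(hits, min_cov=0.8, max_evalue=1e-4):
    """Keep hits that satisfy both the coverage and the E-value thresholds."""
    return [h for h in hits if query_coverage(h) >= min_cov and h.evalue <= max_evalue]

# Hypothetical hits against a 1,500-residue query.
hits = [
    Hit("query_1", "target_a", 1, 1400, 1500, 1e-20),   # kept
    Hit("query_1", "target_b", 1, 900, 1500, 1e-20),    # coverage too low
    Hit("query_1", "target_c", 1, 1400, 1500, 1e-2),    # E-value too high
]
kept = filter_hits(hits)
```

With query structures of roughly 1,500 residues, an 80% coverage threshold corresponds to alignments spanning more than 1,000 amino acids, in line with the stated aim of the filter.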

Phylogenetic analysis of prokaryotic and eukaryotic AlbA_2 domains

The HMM profile of the Pfam AlbA_2 (PF04326) domain was retrieved from the Pfam database96. This profile was searched against a custom protein database combining: (1) 41,150 complete bacterial genomes downloaded from Refseq in August 2024, filtered for redundancy using the clusthash function of MMseqs2 (v.13.45111) using default parameters97; (2) 455 complete archaeal genomes downloaded from Refseq in August 2024; (3) 993 representative eukaryotic genomes from the EukProt database98, filtered for redundancy using the clusthash function of MMseqs2 (v.13.45111) using default parameters. The search was performed using hmmsearch (v.3.3.2) with default parameters99. Hits with at least 90 covered profile residues were selected, and the amino acid sequences of the aligned regions, extended by ten residues on each side, were extracted. Sequences were clustered using the easy-cluster function of MMseqs2 (v.13.45111) with parameter --min-seq-id 0.8 (ref. 97). Cluster representatives were aligned with Clustal-Omega100 with default parameters and the alignment was trimmed using ClipKit101. The trimmed alignment was used to compute a tree using IQ-TREE with parameters -m L -bb 10000 -nm 10000, which was then visualized using iTOL.
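The selection and extraction of domain hits described above can be illustrated with a short sketch. It assumes each hit provides the covered profile interval and the aligned interval on the protein, all as 1-based inclusive coordinates; the tuple layout, function names and toy sequence are hypothetical, and only the thresholds (90 profile residues, ten flanking residues) come from the text.

```python
def profile_coverage(hmm_from: int, hmm_to: int) -> int:
    """Number of profile positions covered by a hit."""
    return hmm_to - hmm_from + 1

def extract_with_flanks(seq: str, ali_from: int, ali_to: int, flank: int = 10) -> str:
    """Return the aligned region plus `flank` residues on each side, clipped to the protein."""
    start = max(1, ali_from - flank)
    end = min(len(seq), ali_to + flank)
    return seq[start - 1:end]

def select_regions(seq: str, hits, min_profile_residues: int = 90, flank: int = 10):
    """hits: iterable of (hmm_from, hmm_to, ali_from, ali_to) tuples for one protein."""
    return [
        extract_with_flanks(seq, ali_from, ali_to, flank)
        for hmm_from, hmm_to, ali_from, ali_to in hits
        if profile_coverage(hmm_from, hmm_to) >= min_profile_residues
    ]

# Toy protein of 300 residues with one long and one short domain hit (hypothetical).
protein = "M" + "A" * 299
regions = select_regions(protein, [(1, 120, 5, 130), (1, 40, 200, 240)])
```

Clipping at the sequence ends matters for domains located near a terminus, where fewer than ten flanking residues are available.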

For each genome, defence systems were detected using DefenseFinder (v.1.3)34. For each clade, we calculated a defence score as the fraction of AlbA_2-containing bacterial genes found within ten genes upstream or downstream of a defence protein, as detected by DefenseFinder in their genome of origin. As a control, we calculated for each clade the defence score expected by chance as the fraction of all genes (from the same genomes) that are found within ten genes upstream or downstream of a defence protein. This value was used to assess for each clade whether AlbA_2-containing genes colocalize with antiphage systems more frequently than expected by chance, using a binomial test and false discovery rate (FDR) correction (Supplementary Table 1).
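The enrichment test described above can be sketched in a few lines. This is an illustration under stated assumptions: the text specifies only a binomial test and an FDR correction, so the one-sided alternative and the Benjamini-Hochberg procedure used here are assumptions, as are the function names and the example counts.

```python
import math

def binomial_sf(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p); adequate for small n (use a statistics library otherwise)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, idx in enumerate(order):
        rank = m - pos                      # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical clades: (genes near a defence protein, total genes in clade, score expected by chance).
clades = {"clade_1": (40, 100, 0.10), "clade_2": (6, 50, 0.10)}
defence_scores = {c: k / n for c, (k, n, _) in clades.items()}
raw_p = [binomial_sf(k, n, p0) for k, n, p0 in clades.values()]
fdr = dict(zip(clades, benjamini_hochberg(raw_p)))
```

A clade whose observed defence score greatly exceeds the chance expectation receives a small corrected p-value, whereas a clade close to the expectation does not.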

Bacterial strains and plasmids

The codon-optimized open reading frame encoding Avs9 from P. fluorescens (UniProt ID A0A7T4VS34) was ordered as a gene fragment from Twist Bioscience, cloned into the pBbA6c vector102 by T5 exonuclease-dependent assembly cloning103 and transformed into E. coli DH5α λpir. Constructs were verified by Sanger sequencing (Microsynth). The complete Avs9 gene fragment sequence synthesized and used in this study is provided in Data availability. We further made the Avs9-D45N mutant by site-directed mutagenesis using polymerase chain reaction amplification with Q5 polymerase (New England Biolabs) and KLD cloning (New England Biolabs).

Bacterial drop assays

E. coli DH5α λpir cells carrying pBbA6c, pBbA6c-Avs9 or pBbA6c-Avs9-D45N were grown for 6 h at 37 °C and 180 rpm in Luria–Bertani (LB) medium supplemented with chloramphenicol at 20 µg ml−1 and glucose at 1 g ml−1. After tenfold serial dilutions, 5 µl of each dilution were spotted on LB agar plates supplemented with 20 µg ml−1 of chloramphenicol and either 1% glucose or 100–500 µM isopropyl β-d-1-thiogalactopyranoside (IPTG). Plates were incubated at 37 and 25 °C for 24 and 48 h, respectively. Strong toxicity of pBbA6c-Avs9 was observed upon incubation at 25 °C.

Plasmids for expression in human cells

HIV-1 T/F pWITO (human immunodeficiency virus 1 pWITO.c/2474, ARP-11739) encoding a full-length IMC was contributed by J. Kappes and C. Ochsenbauer through the National Institutes of Health (NIH) AIDS repository programme. IMC for SIVcpz EK505 was a gift from B. Hahn27,65. The pMT06-Flag-mCherry-SAMD9L plasmid was constructed by cloning the synthesized human SAMD9L gene into a pMT06-Flag-mCherry backbone, from the original RRL.sin.cPPT.CMV/Flag-E2-crimson.IRES-puro.WPRE (MT06, a gift from C. Goujon: Addgene plasmid no. 139448; http://n2t.net/addgene:139448; RRID: Addgene_139448)104. The pMT06-Flag-mCherry-chimp SAMD9L plasmid (chimpanzee SAMD9L) was synthesized and cloned by Azenta Genewiz. The bonobo SAMD9L plasmids were generated through site-directed mutagenesis of the pMT06-Flag-mCherry-chimp SAMD9L plasmid. Human SAMD9L-L90S/K1446R, SAMD9L-L90S and SAMD9L-K1446R (double and single mutant) plasmids were generated from the pMT06-Flag-mCherry-SAMD9L plasmid, using the QuikChange Lightning Site-Directed Mutagenesis Kit (Agilent) following the manufacturer’s instructions. Sequences were confirmed through full-length plasmid and/or Sanger sequencing (Microsynth).

Cell lines and culture

Human embryonic kidney 293T cells (ATCC, catalogue no. CRL-3216) and TZM-bl cells (NIH AIDS Research and Reference Reagent Program, catalogue no. 8129) were grown in Dulbecco modified Eagle medium containing 10% fetal calf serum (Sigma catalogue no. F7524) and 100 U ml−1 of penicillin/streptomycin. TZM-bl cells express the cell-surface proteins CD4, CCR5 and CXCR4 and encode luciferase and β-galactosidase under the control of the long-terminal repeat (LTR) promoter. They are commonly used for lentiviral titration of culture supernatants (TZM-bl assays).

Production and quantification of replication-competent lentivirus

HEK 293T cells were initially seeded in 6-well plates at a density of 0.2 M cells ml−1 (400,000 cells total per well). After 24 h, the cells were cotransfected using TransIT-LT1 (Mirus) with a plasmid encoding a fully replication-competent lentivirus (IMC), alongside either a plasmid encoding SAMD9L or an empty control. The quantity of DNA used was 250 ng for host plasmids and 1,200 ng for virus plasmids. Subsequently, 48 h post-transfection, cells were harvested for western blot. The supernatants were collected and stored at −80 °C for further titration of infectious virus yield via TZM-bl cells. For titration, TZM-bl cells were plated in 96-well plates and exposed to serial dilutions of viral supernatant. Following 48 h of infection, cell lysis was performed using BrightGlow Lysis Reagent (Promega E2620) and relative light units were measured using the Tecan Spark Luminometer. Infectious virus yields under various conditions were consistently expressed as fold-change compared with paired viral infection conditions in the absence of SAMD9L.
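The normalization described in the last sentence can be sketched as a per-replicate division by the paired empty-control value. The function name, condition labels and RLU values below are hypothetical; only the principle (fold-change relative to the paired condition without SAMD9L) comes from the text.

```python
from statistics import mean, stdev

def relative_yields(rlu_by_condition, control="empty"):
    """Divide each replicate's RLU by the RLU of the paired control replicate."""
    ctrl = rlu_by_condition[control]
    return {
        cond: [value / c for value, c in zip(values, ctrl)]
        for cond, values in rlu_by_condition.items()
    }

# Hypothetical relative light units; position i in every list is the same biological replicate.
rlu = {
    "empty":   [100000.0, 200000.0, 150000.0],
    "hSAMD9L": [50000.0, 80000.0, 60000.0],
}
rel = relative_yields(rlu)
summary = {cond: (mean(values), stdev(values)) for cond, values in rel.items()}
```

Pairing by replicate removes experiment-to-experiment variation in absolute luciferase signal before conditions are compared.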

Western blot analysis

Cells were harvested and lysed using ice-cold RIPA buffer (composed of 50 mM Tris pH 8, 150 mM NaCl, 2 mM EDTA and 0.5% NP40) supplemented with protease inhibitors (Roche), followed by sonication. Proteins from cell lysates or supernatants were separated by electrophoresis and transferred onto a PVDF membrane via overnight wet transfer at 4 °C. Stain-Free gel (BioRad) was used for loading and protein transfer controls. Following blocking in TBS-T 1X solution (Tris buffer saline, consisting of Tris HCl 50 mM pH 8, NaCl 30 mM and 0.05% Tween 20) with 5% powdered milk, the membranes were incubated with primary antibodies for 1 h to overnight, followed by a 1-h incubation with secondary antibodies. Detection was carried out using SuperSignal West Pico Chemiluminescent Substrate (ThermoFisher Scientific) and imaged using the Chemidoc Imaging System (BioRad). Antibodies used included anti-SAMD9L (Proteintech, 25173-1-AP), anti-Gag (NIH HIV Reagent Program, 183-H12-5C), anti-HIV-1-gp120 (Aalto, D7324; NIH HIV Reagent Program, 16H3), as well as secondary IgG-peroxidase conjugated anti-mouse (Sigma, catalogue no. A9044) and anti-rabbit (Sigma, catalogue no. AP188P). Total proteins were used as a loading control with BioRad stain-free gel.

Protein synthesis assay

HEK 293T cells were seeded at 0.2 M cells ml−1 in 12-well plates (200,000 cells total per well). At 24 h after seeding, cells were transfected with 250 or 500 ng of host DNA plasmid, using TransIT-LT1 (Mirus) following the manufacturer’s instructions. At 48 h post-transfection, cells were incubated in HPG for 30 min at 37 °C. Medium was discarded and cells were harvested and fixed with 4% PFA. Cells were then washed with PBS containing 3% BSA and permeabilized in PBS containing 0.5% Triton X-100 for 15 min. Click-iT Plus Alexa Fluor Picolyl Azide assay was then performed and cells were analysed on a MACSQuant VYB Cytometer (Miltenyi Biotec, SFR BioSciences).
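The readout of this assay is the HPG MFI ratio defined in the legend of Fig. 6c: the median HPG fluorescence of mCherry-positive cells divided by that of mCherry-negative cells from the same sample. A minimal sketch follows, assuming per-cell fluorescence values have been exported from the cytometer; the function name, the gate value and the toy events are hypothetical.

```python
from statistics import median

def hpg_mfi_ratio(cells, mcherry_gate: float) -> float:
    """cells: iterable of (mcherry_signal, hpg_signal) pairs from one sample.

    Returns median HPG of mCherry-positive cells divided by median HPG of
    mCherry-negative cells, using `mcherry_gate` as the positivity threshold.
    """
    positive = [hpg for mcherry, hpg in cells if mcherry >= mcherry_gate]
    negative = [hpg for mcherry, hpg in cells if mcherry < mcherry_gate]
    if not positive or not negative:
        raise ValueError("both mCherry-positive and mCherry-negative cells are required")
    return median(positive) / median(negative)

# Hypothetical events: three untransfected-like cells and three mCherry-SAMD9L expressing cells.
events = [(10, 100), (12, 120), (11, 110), (900, 40), (950, 60), (1000, 50)]
ratio = hpg_mfi_ratio(events, mcherry_gate=500)
```

A ratio below 1 indicates reduced HPG incorporation, that is, translation shutdown, in the cells expressing SAMD9L.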

Other software and statistical analyses

DefenseFinder34 was used to analyse sequences from Foldseek analyses33. Sequencing analyses and representations were conducted using Geneious (Biomatters), ESPript 3.0 (https://espript.ibcp.fr; ref. 105) and UGENE v.52.0 (ref. 106). R scripts were used to conduct analyses of genomic data. Graphic representations and statistical analyses were carried out using GraphPad Prism 9 and R scripts. In the figures, data are presented as mean ± s.d. and each point corresponds to an independent biological replicate. Statistics were performed using the two-sided ratio paired t-test, except where indicated in the figure legends. P values are according to figure legends and exact P values are reported in Supplementary Table 1 and Source data.
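A ratio paired t-test is equivalent to a paired t-test performed on the logarithm of the paired ratios, testing whether the mean log ratio differs from zero. The sketch below illustrates this in plain Python and is not the GraphPad Prism implementation; the function names are hypothetical, and the p-value is obtained by numerical (Simpson) integration of the Student's t density rather than from a statistics library.

```python
import math
from statistics import mean, stdev

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with `df` degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t: float, df: int, steps: int = 20000) -> float:
    """Two-sided p-value via Simpson integration of the density on [0, |t|] (steps must be even)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * total * h / 3)

def ratio_paired_t_test(treated, reference):
    """Return (geometric mean ratio, t statistic, two-sided p) for paired positive values."""
    logs = [math.log(x / y) for x, y in zip(treated, reference)]
    n = len(logs)
    t = mean(logs) / (stdev(logs) / math.sqrt(n))   # requires non-identical ratios
    return math.exp(mean(logs)), t, two_sided_p(t, n - 1)

# Hypothetical paired yields for two replicates.
gm_ratio, t_stat, p_value = ratio_paired_t_test([2.0, 4.0], [1.0, 4.0])
```

Working on log ratios makes the test sensitive to consistent fold-changes between paired conditions rather than to absolute differences, which suits data normalized to a control.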

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.