Extended Data Fig. 1: Conservation, structural modeling and purification of SPARDA.

a, Alignment of the MID and PIWI domains in short pAgos from the SPARDA (NbaAgo, CmeAgo), SPARTA (CrtAgo) and SPARSA (CcAgo) systems and in long pAgos (RsAgo, TtAgo, CbAgo, MjAgo, PfAgo). Residues involved in interactions with the guide 5′-end in the MID pocket are shown with green dots (YKQK in RsAgo, RKQK in NbaAgo and CmeAgo, see Fig. 1e). The active site residues in the PIWI domain of active long pAgo nucleases (TtAgo, CbAgo, MjAgo, PfAgo) are shown with blue dots. Sources of the pAgo proteins: NbaAgo – N. baekryungensis (WP_022673743.1), CmeAgo – C. metallidurans (WP_011516870.1), CrtAgo – Thermoflavifilum thermophilum (formerly Crenotalea thermophila) (WP_092459742.1), CcAgo – Caballeronia cordobensis (WP_053571899.1), RsAgo – Rhodobacter sphaeroides (A4WYU7.1), TtAgo – Thermus thermophilus (WP_011174533.1), CbAgo – Clostridium butyricum (WP_058142162.1), MjAgo – Methanocaldococcus jannaschii (WP_010870838.1), PfAgo – Pyrococcus furiosus (WP_011011654.1). The amino acid numbering is shown for NbaAgo. b, Residues of the nuclease active site in DREN in SPARDA systems from various species: Nba – N. baekryungensis (WP_033317603.1), Cme – C. metallidurans (WP_011516871.1), Rhod – Rhodoplanes elegans (WP_111355734.1), Rph – Rhizobium phaseoli (WP_126906501.1), Mma – Mycobacterium marinum (WP_117407168.1), M.leaf – Methylobacterium sp. Leaf113 (WP_056186121.1). The amino acid numbering is shown for Nba DREN. Predicted active site residues are indicated with blue dots. Predicted elements of the secondary structure are indicated (α-helices, π-helices and β-strands; strict β-turns are shown as TT letters). The alignment was generated with ESPript 3.0. c–d, Confidence of the AlphaFold prediction for NbaSPARDA. c, pLDDT (predicted local distance difference test) – the per-residue estimate of prediction confidence, shown for the five top-ranked models of the NbaAgo and DREN-APAZ complex.
Positions 1–485 correspond to NbaAgo, while the remaining positions correspond to DREN-APAZ (separated by a vertical line). A pLDDT value > 80 reflects high confidence in the backbone prediction. d, PAE (Predicted Aligned Error) – the expected position error at residue x if the predicted and true structures were aligned on residue y, shown for the best-ranked model. Positions 1–485 (top on the y-axis) correspond to NbaAgo, the rest (bottom on the y-axis) correspond to DREN-APAZ. The color gradient from blue to red indicates the error value, ranging from low to high. The graph shows that the relative positions of NbaAgo and the APAZ domain are predicted with high confidence, whereas the position of the DREN domain is flexible. Both graphs were created with ColabFold v1.5.3. e–g, Purification of SPARDA complexes. e, Co-elution of NbaAgo and DREN-APAZ during the Ni-chelating, heparin and anion-exchange (MonoQ column) chromatography steps. Individual fractions and the final protein sample are shown for the wild-type complex. f, Final purified samples of wild-type (WT) NbaSPARDA and its mutants with substitutions in the active site of DREN (CD, catalytically dead) and in the MID pocket of NbaAgo (MID). g, Purification of CmeSPARDA. The elution fractions from the final chromatography step (MonoQ) are shown.
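
The per-chain reading of the pLDDT and PAE plots in panels c and d can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used to prepare the figure: it assumes the per-residue pLDDT values and the square PAE matrix have already been loaded into plain lists, the function names are hypothetical, and the numbers below are synthetic toy values for a 3-residue and a 2-residue chain rather than data from the NbaSPARDA models. For the real complex the chain boundary would be 485, as stated in the legend.

```python
from statistics import mean


def split_plddt(plddt, first_chain_len):
    """Split a per-residue pLDDT list at the chain boundary."""
    return plddt[:first_chain_len], plddt[first_chain_len:]


def confident_fraction(plddt, threshold=80.0):
    """Fraction of residues whose pLDDT exceeds the threshold."""
    return sum(1 for value in plddt if value > threshold) / len(plddt)


def mean_block_pae(pae, rows, cols):
    """Mean PAE over one rectangular block of the matrix, pae[i][j]."""
    return mean(pae[i][j] for i in rows for j in cols)


# Synthetic toy input (NOT real model output): chain A = 3 residues, chain B = 2.
toy_plddt = [90.0, 85.0, 70.0, 95.0, 60.0]
toy_pae = [
    [0.0, 1.0, 2.0, 4.0, 6.0],
    [1.0, 0.0, 1.0, 4.0, 6.0],
    [2.0, 1.0, 0.0, 4.0, 6.0],
    [5.0, 5.0, 5.0, 0.0, 1.0],
    [7.0, 7.0, 7.0, 1.0, 0.0],
]
boundary = 3  # hypothetical chain boundary for the toy data

chain_a, chain_b = split_plddt(toy_plddt, boundary)
n = len(toy_plddt)

# Off-diagonal blocks hold the inter-chain error: low values mean the
# relative placement of the two chains is predicted confidently.
inter_ab = mean_block_pae(toy_pae, range(boundary), range(boundary, n))
inter_ba = mean_block_pae(toy_pae, range(boundary, n), range(boundary))
```

The PAE matrix is not symmetric, so the two off-diagonal blocks are averaged separately; a low value in both corresponds to the blue inter-chain regions described for panel d.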