Extended Data Table 4 Evaluation of the PR classification and reconstruction tasks for camelid VHH sequences

From: Assessing antibody and nanobody nativeness for hit selection and humanization with AbNatiV

  1. The assessment is carried out for AbNatiV trained on camelid VHH sequences (first row) and for the AbLSTM model retrained on the same training set as AbNatiV (second row; see Methods). The first eight columns report the area under the precision-recall (PR) curves (shown in Fig. 4c and Supplementary Fig. 12), which assess the ability of the models to separate sequences in the Camelid Test (T) or Camelid Diverse >5% (D) sets from human, mouse, rhesus, and PSSM-generated sequences (see column headers). The Camelid Diverse >5% dataset is used as a control to specifically assess the ability to generalize to sequences distant from those in the training set. The last two columns quantify the ability of each model to reconstruct camelid sequences in each dataset (see column headers). Corresponding ROC results are in Supplementary Table 4.
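
As an illustration of the metric reported in the first eight columns, the area under a PR curve can be estimated from per-sequence scores and binary labels. The sketch below is a minimal, assumed implementation using the average-precision estimator; the function name `pr_auc` and the toy inputs are hypothetical, and the caption does not state which estimator or library the authors used.

```python
def pr_auc(scores, labels):
    """Estimate the area under the precision-recall curve (average precision).

    scores: one number per sequence; higher means "more likely positive"
            (for example, a nativeness score). Assumed input, not from the paper.
    labels: 1 for the positive class (for example, camelid), 0 otherwise.
    Tied scores are not treated specially; ties keep their input order.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")

    # Rank sequences from the highest to the lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            precision = true_positives / rank
            # Each positive raises recall by 1 / n_pos at the current precision.
            area += precision / n_pos
    return area


# Toy example: perfectly separated scores give an area of 1.0.
perfect = pr_auc([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 0])
# One negative ranked above a positive lowers the area to (1 + 2/3) / 2.
mixed = pr_auc([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

A value of 1.0 corresponds to perfect separation of the two sets, while values near the fraction of positives in the data correspond to a classifier with no discriminative power.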