Fig. 2: PLMsearch reaches similar sensitivity as structure search methods. | Nature Communications

From: PLMSearch: Protein language model powers accurate and fast sequence search for remote homology

a–c All-versus-all search test on SCOPe40-test. For family-level, superfamily-level, and fold-level recognition, true positives (TPs) were defined as hits from the same family, the same superfamily but a different family, and the same fold but a different superfamily, respectively. Hits from different folds are false positives (FPs). After sorting each query's search results by similarity, we calculated the sensitivity as the fraction of TPs in the sorted list up to the first FP. a The mean sensitivity over all queries is reported as AUROC. The legend also shows the total search time for the all-versus-all search test on a server with a 56-core Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40 GHz and 256 GB RAM. b Precision-recall curve. c MAP, P@1, and P@10. d Evaluation on new proteins (see the “New protein search test” section). Supplementary Tables 2 and 4 record the specific values of each metric. Source data are provided as a Source Data file.
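
The sensitivity described in the caption can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The function names and the label strings `"TP"`, `"FP"`, and `"IGNORE"` are hypothetical, and the sketch assumes that each query's ranked list contains every target, so the number of TPs in the list is the denominator, and that hits which are neither TP nor FP at the chosen level are skipped.

```python
from typing import Sequence


def sensitivity_up_to_first_fp(labels: Sequence[str]) -> float:
    """Fraction of a query's TPs that are ranked before its first FP.

    `labels` holds one label per hit, sorted by decreasing similarity.
    Each label is "TP", "FP", or "IGNORE" (hypothetical names; "IGNORE"
    marks hits that count as neither TP nor FP at the chosen level).
    """
    total_tp = sum(1 for label in labels if label == "TP")
    if total_tp == 0:
        # Assumption: a query with no TPs contributes a sensitivity of 0.
        return 0.0

    found = 0
    for label in labels:
        if label == "FP":
            break  # stop counting at the first false positive
        if label == "TP":
            found += 1
    return found / total_tp


def mean_sensitivity(per_query_labels: Sequence[Sequence[str]]) -> float:
    """Mean of the per-query sensitivities, the value the caption reports as AUROC."""
    if not per_query_labels:
        raise ValueError("at least one query is required")
    scores = [sensitivity_up_to_first_fp(labels) for labels in per_query_labels]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    ranked_hits = ["TP", "IGNORE", "TP", "FP", "TP"]
    print(sensitivity_up_to_first_fp(ranked_hits))
```

In this example, two of the query's three TPs are ranked before the first FP, so the per-query sensitivity is 2/3. How the published benchmark treats queries without any TPs is not stated in the caption, so the zero returned here is a placeholder choice.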
