Fig. 4: Performance comparison of CataPro and UniKP in enzyme engineering scenarios. | Nature Communications

From: Robust enzyme discovery and engineering with deep learning using CataPro

a-c illustrate the models' ability to rank mutants within a specific reaction on the kcat, Km, and kcat/Km datasets. To ensure statistically meaningful results, only enzyme-substrate reactions with at least N mutants (including the wild-type), where N is 20 or 30, were used from each dataset for this test. d-f show the accuracy of the models in identifying the better-performing mutant of any two mutants across these three datasets. The box plots depict the distribution of the metrics achieved by each model across all reactions with at least N mutants. In each box plot, the lower and upper boundaries of the box represent the first quartile (Q1) and the third quartile (Q3), respectively. The whiskers extend from the quartiles to the most extreme values within 1.5 times the interquartile range. The white circle represents the mean and the black line inside the box represents the median. In the kcat, Km, and kcat/Km datasets, the number of reactions with at least 20 mutants (including the wild-type) is 39, 57, and 30, respectively, and the number with at least 30 mutants is 10, 16, and 7, respectively. Source data are provided as a Source Data file.
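
The pairwise accuracy in d-f and the box-plot statistics described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the input layout (a dict mapping a reaction ID to measured and predicted values), the choice to skip tied measurements, and the quartile interpolation method are all assumptions made for the example.

```python
from itertools import combinations
from statistics import mean, quantiles


def pairwise_accuracy(y_true, y_pred):
    """Fraction of mutant pairs whose predicted order matches the measured order."""
    correct = 0
    total = 0
    for i, j in combinations(range(len(y_true)), 2):
        true_diff = y_true[i] - y_true[j]
        if true_diff == 0:
            continue  # assumption: pairs with identical measurements are skipped
        total += 1
        if true_diff * (y_pred[i] - y_pred[j]) > 0:
            correct += 1
    return correct / total if total else float("nan")


def per_reaction_accuracy(reactions, n_min):
    """Score only reactions with at least n_min variants (wild-type included)."""
    return {
        rid: pairwise_accuracy(y_true, y_pred)
        for rid, (y_true, y_pred) in reactions.items()
        if len(y_true) >= n_min
    }


def box_summary(values):
    """Box-plot statistics: quartiles, mean, and whiskers within 1.5 x IQR."""
    # assumption: "inclusive" quartile interpolation; plotting libraries may differ
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "mean": mean(values),
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }
```

Calling `per_reaction_accuracy(reactions, 20)` and passing the resulting values to `box_summary` would give one box per model and dataset under these assumptions; values outside the whiskers would be drawn as outliers.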
