Fig. 1: Illustration of the ESM-Ezy workflow.

a In the fine-tuning stage, the ESM-1b model was fine-tuned by binary classification on positive and negative datasets. In the searching stage, the Binary Classification Head was omitted and the ESM-1b Backbone from the fine-tuning stage was retained as the Fine-tuned ESM-1b Backbone. This backbone was used to generate query embeddings and candidate embeddings, and the candidate sequences closest to the queries by Euclidean distance in the embedding space were selected for further validation. b After fine-tuning, the MCO cluster (positive) became distinctly separated from the non-MCO cluster (negative). c The embeddings of the selected sequences generated by the fine-tuned model clustered closely with those of the QEs. d Sequence and structure similarity matrix of the MCOs and QEs. The newly discovered enzymes exhibit low sequence similarity but are structurally conserved. Source data are provided as a Source Data file.
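
The searching stage in panel a can be illustrated with a minimal sketch of nearest-neighbour retrieval by Euclidean distance. This is not the authors' implementation: the function name `nearest_candidates`, the `top_k` parameter, and the toy 3-dimensional vectors are assumptions made for illustration, and in practice the embeddings would come from the Fine-tuned ESM-1b Backbone rather than being written by hand.

```python
import math


def nearest_candidates(query_embeddings, candidate_embeddings, top_k=1):
    """For each query, return the indices of the top_k candidates
    with the smallest Euclidean distance in the embedding space.

    Hypothetical helper for illustration only.
    """
    results = []
    for query in query_embeddings:
        # Pair each distance with its candidate index so sorting keeps track of it.
        distances = [
            (math.dist(query, candidate), index)
            for index, candidate in enumerate(candidate_embeddings)
        ]
        distances.sort()
        results.append([index for _, index in distances[:top_k]])
    return results


# Toy vectors standing in for real query and candidate embeddings (assumed values).
queries = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
candidates = [
    [0.1, 0.0, 0.0],
    [5.0, 5.0, 5.0],
    [0.9, 1.0, 1.1],
    [2.0, 2.0, 2.0],
]

hits = nearest_candidates(queries, candidates, top_k=2)
print(hits)  # [[0, 2], [2, 0]]
```

Each inner list holds the candidate indices ranked from closest to farthest for one query. A brute-force scan like this is adequate for a small example; a large candidate pool would normally call for vectorized or indexed search.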