Fig. 9: Distribution of rxnfp fingerprints for the reactions in the combined space of ECREACT (grey) and RetroBioCat test set reactions (blue), embedded with TMAP.
From: Biocatalysed synthesis planning using data-driven learning

a The reactions from the RetroBioCat test set form distinct clusters in the combined reaction space. b For RetroBioCat test set reactions (blue), the fraction of the k = 10 nearest neighbours that come from the test set itself is consistently higher than for reactions from ECREACT (grey).
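
The neighbour statistic in panel b can be illustrated with a minimal sketch. It is not the authors' code. It assumes Euclidean distance on the fingerprint vectors, and it reads "from the set itself" as "belonging to the RetroBioCat test set"; both are assumptions. The function name and the synthetic data are hypothetical stand-ins for real rxnfp fingerprints.

```python
import numpy as np

def target_set_neighbour_fraction(X, in_target_set, k=10):
    """For each row of X, return the fraction of its k nearest neighbours
    (Euclidean distance, the point itself excluded) that belong to the
    target set. Distance metric and function name are assumptions."""
    X = np.asarray(X, dtype=float)
    in_target_set = np.asarray(in_target_set, dtype=bool)
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    nearest = np.argsort(d2, axis=1)[:, :k]
    return in_target_set[nearest].mean(axis=1)

if __name__ == "__main__":
    # Synthetic stand-in data: a broad background set and a compact,
    # shifted target set (NOT real ECREACT or RetroBioCat fingerprints).
    rng = np.random.default_rng(0)
    background = rng.normal(0.0, 1.0, size=(200, 16))
    target = rng.normal(3.0, 0.5, size=(30, 16))
    X = np.vstack([background, target])
    in_target = np.array([False] * len(background) + [True] * len(target))

    frac = target_set_neighbour_fraction(X, in_target, k=10)
    print("mean fraction, target set:", frac[in_target].mean())
    print("mean fraction, background:", frac[~in_target].mean())
```

If the target set forms its own clusters, as panel a suggests for the RetroBioCat test set, the per-reaction fractions for that set should sit well above those of the background set.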