Fig. 2: Perturbagen CLass (PCL) analysis for reference-based mechanism of action (MOA) prediction.

a Visualization using Uniform Manifold Approximation and Projection (UMAP) of all reference-set compound-dose chemical-genetic interaction (CGI) profiles reveals MOA-based clustering of compound-dose conditions. Three exemplary MOAs are highlighted: DprE1-DprE2 complex (purple), HadABC (light blue), and QcrB (green). Grey circles represent reference CGI profiles from all other MOAs. The UMAP representation of the data is shown here for illustration purposes only; none of the steps in the PCL analysis method depend on this representation. b Schematic of the results of spectral clustering of each MOA category. Circles represent CGI profiles from two MOAs, X (blue) and Y (orange), each yielding two clusters (connected circles) and some singleton CGI profiles. c (left) Schematic of a high-confidence prediction region (light blue shaded circle). The blue circles all share MOA X, which differs from every MOA represented by the light grey circles. The blue circles connected by lines mark the cluster for which the high-confidence region is drawn. The radius of the high-confidence region for a cluster is defined as the largest distance (lowest similarity) between a CGI profile and the cluster such that all profiles contained within that radius belong to the same MOA X. Clusters for which such a high-confidence region exists are called Perturbagen CLass (PCL) clusters. The similarity score between a given CGI profile and a cluster is defined as the median of the correlations between the CGI profile and all the cluster profiles. (right) Schematic example of a non-predictive cluster, for which the most similar CGI profile is out-of-MOA. Such clusters are considered unreliable for MOA prediction and are discarded. d Performance statistics of the PCL analysis method for MOA prediction on 337 active reference compounds in leave-one-out cross-validation (LOOCV). Source data are provided as a Source Data file.
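
The definitions in panel c can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `cluster_similarity` and `high_confidence_threshold` are hypothetical, Pearson correlation is assumed (the caption says only "correlations"), the candidate profiles are assumed to be the reference profiles outside the cluster, and the toy profiles are invented for illustration.

```python
import numpy as np


def cluster_similarity(profile, cluster_profiles):
    """Median correlation between one CGI profile and every cluster member.

    Pearson correlation is an assumption; the caption does not name the type.
    """
    corrs = [np.corrcoef(profile, member)[0, 1] for member in cluster_profiles]
    return float(np.median(corrs))


def high_confidence_threshold(cluster_profiles, cluster_moa,
                              other_profiles, other_moas):
    """Lowest similarity such that every profile at or above it shares the MOA.

    Returns None when the most similar outside profile is out-of-MOA,
    i.e. the cluster is non-predictive and would be discarded.
    """
    sims = [cluster_similarity(p, cluster_profiles) for p in other_profiles]
    threshold = None
    # Walk outward from the cluster: most similar profile first.
    for i in np.argsort(sims)[::-1]:
        if other_moas[i] != cluster_moa:
            break  # first out-of-MOA profile bounds the region
        threshold = sims[i]
    return threshold


# Toy data (invented): one cluster of MOA "X" and three outside profiles.
cluster = np.array([[1.0, 2.0, 3.0, 4.0],
                    [1.0, 2.0, 3.0, 5.0]])
others = np.array([[2.0, 4.0, 6.0, 8.0],   # MOA X, very similar
                   [1.0, 2.0, 4.0, 3.0],   # MOA X, moderately similar
                   [4.0, 3.0, 2.0, 1.0]])  # MOA Y, anti-correlated
moas = ["X", "X", "Y"]

threshold = high_confidence_threshold(cluster, "X", others, moas)
print("high-confidence similarity threshold:", threshold)
```

In this sketch, a cluster whose threshold is `None` corresponds to the non-predictive case in the right-hand schematic, and any cluster with a numeric threshold corresponds to a PCL cluster.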