Figure 1
From: Utilizing yeast chemogenomic profiles for the prediction of pharmacogenomic associations in humans

(A) Algorithmic pipeline. Step 1: genetic datasets (e.g. PFam), drug datasets (e.g. DrugBank) and chemogenetic datasets (e.g. Lee) are used to construct gene similarity measurements, drug similarity measurements and HIP/HOP scores, respectively. Step 2: the three measurements (gene similarity, drug similarity and HIP/HOP score) are combined into one feature score. Step 3: eight feature scores are generated for each of the three main HIP/HOP data sources, resulting in three three-dimensional feature matrices. Step 4: the three feature matrices are united into one matrix with 24 features. True PGx associations extracted from PharmGKB serve as the positive training set, and a Random Forest classifier is applied to predict PGx associations. (B) Feature construction example, demonstrating step 2 of the algorithmic pipeline. The feature score for a given PGx association between a drug D and a human gene G is the maximal geometric mean of three measurements, taken across all drugs and genes in a chemogenomic database. In this example the maximal score (marked in red) is achieved by the geometric mean of the following three measurements: (i) the chemical similarity between the query drug and the drug marked in blue, (ii) the domain similarity between the human query gene and the yeast gene marked in green, and (iii) the HIP chemogenomic association between the drug marked in blue and the yeast knock-out gene marked in green.
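The feature score described in panel (B) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `feature_score`, the dictionary-based inputs and the toy values are all hypothetical. The sketch also assumes that every measurement is non-negative and that a drug or gene pair missing from a table has similarity zero, neither of which is stated in the caption.

```python
def feature_score(query_drug, query_gene, drug_sim, gene_sim, hip_score):
    """Return (best_score, (drug, yeast_gene)) for one drug-gene query.

    drug_sim  : {(query_drug, db_drug): chemical similarity}
    gene_sim  : {(human_gene, yeast_gene): domain similarity}
    hip_score : {(db_drug, yeast_gene): chemogenomic association}
    All three tables are hypothetical stand-ins with non-negative values.
    """
    best_score, best_pair = 0.0, None
    # Scan every drug / yeast-gene pair in the chemogenomic table.
    for (db_drug, yeast_gene), assoc in hip_score.items():
        # Assumption: pairs missing from a similarity table count as 0.
        d = drug_sim.get((query_drug, db_drug), 0.0)
        g = gene_sim.get((query_gene, yeast_gene), 0.0)
        # Geometric mean of the three measurements.
        score = (d * g * assoc) ** (1.0 / 3.0)
        if score > best_score:
            best_score, best_pair = score, (db_drug, yeast_gene)
    return best_score, best_pair


# Toy data, invented purely for illustration.
drug_sim = {("D", "d1"): 0.9, ("D", "d2"): 0.4}
gene_sim = {("G", "y1"): 0.5, ("G", "y2"): 0.8}
hip_score = {("d1", "y1"): 0.2, ("d1", "y2"): 0.6, ("d2", "y2"): 0.9}

score, pair = feature_score("D", "G", drug_sim, gene_sim, hip_score)
print(pair, round(score, 3))
```

In this toy case the pair `("d1", "y2")` wins because the product 0.9 × 0.8 × 0.6 = 0.432 is the largest of the three candidates. A geometric mean is zero whenever any one of its three factors is zero, so a candidate only scores well if the drug similarity, the gene similarity and the chemogenomic association are all non-negligible.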