Table 1 Summary of lessons learned and corresponding best practices, for the generation and validation of GPCR models for the purpose of prospective use with FEP

From: AI meets physics in computational structure-based drug discovery for GPCRs

Lesson learned

Best practices

Small structural details such as ECL2 conformation, helical tilts, side-chain orientation, or protonation state can significantly impact FEP accuracy but are difficult to pinpoint a priori [61] (Fig. 8). While structural features close to the binding site are more likely to matter, distant regions can also allosterically affect the binding pocket and thereby FEP accuracy.

Pay attention to structural details and prepare the structure rigorously. Carefully analyze the structure's relevance (e.g. the activation state and experimental conditions), quality (resolution, B-factors), and completeness (e.g. unresolved regions).

Use an empirical trial-and-error approach to assess models.

Use an ensemble approach to increase the chances of identifying a predictive model.

Evaluate side-by-side models that are seemingly very similar.

ICLs can impact the dynamics around the TM orthosteric site, and cannot be ignored in model building.

Remove chimeric constructs inserted into intracellular loops.

For discontinuous ICLs, rebuild the missing regions using different approaches (e.g. grafts or crosslinks) or apply constraints.

Water molecules and their displacement are particularly important in the ligand binding site [149], and cannot be ignored in model building.

Use an FEP implementation that handles water molecules well [137].

Alternatively, carefully consider binding site water molecules and use a trial-and-error approach.

Retrospective FEP validation is only as robust as the SAR used in the validation. Identifying a good model for prospective FEP relies completely on retrospective validation.

The validation dataset should ideally consist of potency data spanning three or more orders of magnitude, with data points well distributed across the potency range.

Make sure there is no intrinsic bias in the retrospective SAR data, such as potency correlating with MW or LogP.

If the model is intended to explore modifications in a particular region of the ligand, make sure the validation set includes modifications in that region. Ideally, a model would be validated for different regions of the ligand before prospective use.

Induced-fit effects of the compound series may not be accurately modeled in the available models.

Use a physics-based approach like IFD-MD to locally refine the ligand binding site for a particular compound series [154].
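
The checks recommended for the retrospective validation dataset (potency span, distribution across the range, and absence of correlation with MW or LogP) can be sketched in a few lines of Python. This is only an illustrative sketch, not a method from the source: the function name `check_validation_set`, the use of pIC50 as the potency scale, the three equal-width bins, and the correlation cutoff of 0.5 are all assumptions chosen for the example.

```python
import math
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def check_validation_set(pic50, mw, logp, min_span=3.0, max_abs_r=0.5, n_bins=3):
    """Screen a retrospective SAR set before using it to validate a model.

    pic50, mw, logp: equal-length lists, one entry per compound.
    min_span, max_abs_r, n_bins: hypothetical thresholds, not from the source.
    """
    lo = min(pic50)
    span = max(pic50) - lo  # potency range in log units (orders of magnitude)

    # Count compounds in equal-width potency bins to see how evenly
    # the data cover the range.
    width = span / n_bins if span > 0 else 1.0
    counts = [0] * n_bins
    for p in pic50:
        idx = min(int((p - lo) / width), n_bins - 1)
        counts[idx] += 1

    r_mw = pearson_r(pic50, mw)
    r_logp = pearson_r(pic50, logp)
    return {
        "span_log_units": span,
        "span_ok": span >= min_span,
        "bin_counts": counts,
        "well_distributed": all(c > 0 for c in counts),
        "r_mw": r_mw,
        "r_logp": r_logp,
        # Flag the set if potency tracks a bulk property too closely.
        "bias_flag": abs(r_mw) > max_abs_r or abs(r_logp) > max_abs_r,
    }


if __name__ == "__main__":
    report = check_validation_set(
        pic50=[5.0, 6.0, 7.0, 8.5],
        mw=[350, 420, 380, 400],
        logp=[2.0, 3.5, 2.5, 3.0],
    )
    for key, value in report.items():
        print(f"{key}: {value}")
```

A flagged correlation does not by itself invalidate a dataset; it only signals that good retrospective agreement might reflect a bulk-property trend rather than the quality of the model, so the set deserves a closer look before it is used to select a model for prospective FEP.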