Table 1 The major phases of the drug discovery and development value chain, with approximate timescales in the absence of AI or data science tools

From: Addressing infectious diseases in Africa by accelerating drug discovery through data science

Target identification and validation (1–2 years)

Hit identification and lead optimisation (2–5 years)

Preclinical testing (1–2 years)

Clinical trials (6–10 years)

Regulatory approval and post-marketing surveillance (1–3 years)

• Examine genomic datasets to uncover factors contributing to diseases prevalent in Africa, to identify novel therapeutic targets and personalised treatments tailored to the genetic diversity of African populations.

• Integrate analyses of biological interaction networks and chemical information to identify key genes or proteins involved in disease pathways, supporting both target discovery and validation.

• Apply predictive models to estimate the ‘druggability’ of a target (i.e. the likelihood of it being successfully modulated by a small molecule) to streamline drug discovery efforts.

• Virtual screening of large compound libraries to identify molecules with promising biological activity, particularly in the case of natural products for which scaffold-hopping strategies can support synthetic accessibility.

• Explore opportunities for repurposing and repositioning existing drugs for diseases that disproportionately affect African populations.

• Devise synthetic routes tailored to available reagents in order to reduce costs and avoid delays linked to reagent procurement.

• Employ predictive models for absorption, distribution, metabolism, excretion and toxicity (AMDET) to prioritise molecules with favourable profiles.

• Generate new chemical structures that are computationally optimised for desired drug-like properties to accelerate the transition from a hit to a lead compound.

• Predict the binding affinity between potential drugs and their target proteins to support optimisation efforts.

• Anticipate potential toxicities early in the development process to reduce the need for costly animal studies.

• Identify biomarkers linked to treatment outcomes or disease progression, aiding preclinical testing and patient stratification.

• Predict patient responses to treatments in order to enable adaptive trial designs and optimal dosing regimens.

• Assess patient datasets to match individuals to clinical trials according to genetic profiles, disease characteristics and other factors, leading to more efficient recruitment.

• Track real-world health data to detect adverse reactions or drug interactions, strengthening post-approval monitoring and ensuring patient safety.

  1. Examples of ways in which AI and data science tools are expected to accelerate this pipeline are listed, with those that are particularly relevant to the African context highlighted in bold.