A generative framework for enhancing drug target interaction prediction in drug discovery

Kotkondawar, Roshan R.; Sutar, Sanjay R.; Kiwelekar, Arvind W.; Kadam, Vinod J.; Jadhav, Shivajirao M.

doi:10.1038/s41598-025-01589-9

Article
Open access
Published: 13 October 2025

Scientific Reports volume 15, Article number: 35588 (2025)

Abstract

In silico drug-target interaction (DTI) prediction plays a key role in accelerating drug discovery and understanding molecular mechanisms. Traditional methods often struggle with the complexity and scale of biochemical data, thus limiting prediction accuracy. This study presents VGAN-DTI, a generative AI framework that combines generative adversarial networks (GANs), variational autoencoders (VAEs), and multilayer perceptrons (MLPs) to improve DTI predictions. The framework precisely encodes molecular features, uncovers underlying molecular mechanisms, and enhances predictive capabilities. GANs generate diverse molecular candidates, whereas VAEs optimize the feature representations. MLPs trained on BindingDB classify interactions and predict binding affinities. The model achieves 96% accuracy, 95% precision, 94% recall, and 94% F1 score, outperforming existing methods. Rigorous ablation studies validated the robustness of the framework. The proposed system enhances drug discovery by optimizing innovation, synthetic feasibility, and predictive accuracy, ensuring reliable DTI predictions and advancing data-driven pharmaceutical research.

Introduction

Drug development is a prolonged and resource-intensive endeavor, often taking over ten years and costing approximately USD 1.4 billion, with clinical trials representing 80% of these costs^1,2. The challenge of identifying effective compounds stems from the vastness of the chemical space and complexity of biological interactions. Traditional approaches such as high-throughput screening and ligand-based design are limited by inefficiency and scalability^3,4. In contrast, recent advances in machine learning (ML) and deep learning (DL) have shown great promise in addressing these limitations, accelerating chemical space exploration, and enhancing drug discovery workflows^5,6.

Generative artificial intelligence (AI) models, particularly Generative Adversarial Networks (GANs)⁷ and Variational Autoencoders (VAEs)⁸, have emerged as transformative tools for drug discovery. These models generate novel molecular data that replicate the properties of existing compounds, enabling efficient exploration of the chemical space and more accurate drug-target interaction (DTI) prediction^9,10. VAEs primarily focus on producing synthetically feasible molecules, whereas GANs generate structurally diverse compounds with desirable pharmacological characteristics^11,12.

While VAEs effectively capture latent molecular representations, they may generate overly smooth distributions, limiting structural diversity. GANs complement this by introducing adversarial learning, enhancing molecular variability, mitigating mode collapse, and generating novel chemically valid molecules. This synergy ensures precise interaction modeling, optimizing both feature extraction and molecular diversity, and ultimately improving DTI prediction accuracy¹³. Integrating GANs with multilayer perceptrons (MLPs) further improves the accuracy of DTI predictions by leveraging generated features.

Generative AI has the potential to streamline the drug discovery process by reducing costs, accelerating timelines, and improving overall efficiency. For example, it can reduce clinical development costs by up to 50%, shorten the trial duration by over 12 months, and raise the net present value by at least 20% through automation, regulatory optimization, and enhanced quality control. The McKinsey Global Institute reported that generative AI could contribute between USD 60 billion and USD 110 billion annually to the pharmaceutical sector, underscoring its transformative role in drug development and therapeutic advancement^14,15. Integrating generative models with simplified molecular input line entry system (SMILES)-based molecular data can optimize drug discovery, improve DTI predictions, and facilitate the development of novel therapies^16,17.

Problem identification and research objectives

Drug discovery involves extensive laboratory experiments and clinical trials, which limit the exploration of vast chemical spaces¹⁸. This study presents an advanced computational framework for precise DTI prediction designed to streamline drug discovery, while reducing costs and timelines. The primary objectives of this study are as follows:

- Utilize VAEs to generate latent representations of molecular structures and novel molecules for target protein interactions.
- Employ GANs to generate realistic molecular structures, enhancing compound efficacy.
- Integrate an MLP to refine the DTI predictions using a labeled dataset.
- Improve predictive accuracy while ensuring effective molecule-target interactions.

Methodology

This section outlines the procedures and essential aspects of the experiments.

Model development

The proposed framework for DTI prediction is illustrated in Fig. 1, which highlights its architecture and workflow. The subsequent sections provide a detailed discussion of the methodologies and components integral to framework design. The VGAN-DTI framework integrates GANs, VAEs, and MLPs to enhance the DTI prediction. Its components include:

- VAEs for refining small molecular representations
- GANs for generating diverse drug-like molecules
- MLPs for predicting binding affinities

VAE architecture

VAEs use a probabilistic encoder-decoder structure that encodes data into a distribution to generate diverse, smooth samples.

Encoder network

The encoder input layer receives molecular features as fingerprint vectors, whereas the hidden layers consist of fully connected units activated by a Rectified Linear Unit (ReLU). Typical configurations include two to three hidden layers, each with 512 units. The encoder function $f_{\theta }$ is represented by Eq. (1).
$$\begin{aligned} z = f_{\theta }(x) \end{aligned}$$
(1)
where $x$ is the input molecular structure, and $z$ is the latent representation.

The latent-space layer generates the mean ($\mu$) and log-variance ($\log \sigma ^2$) of the latent-space distribution. This was achieved using an initial dense layer followed by two distinct and separately parameterized dense layers for ($\mu$) and ($\log \sigma ^2$). This process is described by Eq. (2).
$$\begin{aligned} q(z|x) = {\mathcal {N}}(z|\mu (x), \sigma ^2(x)) \end{aligned}$$
(2)
where $\mu (x)$ and $\sigma ^2(x)$ denote the mean and variance output of the encoder, respectively.

Decoder network

The decoder input layer receives a sample $z$ drawn from the latent space. The hidden layers consist of fully connected layers with ReLU activation and are typically designed to mirror the encoder configuration. The output layer generates molecular representations such as SMILES strings through a final dense layer. The decoder function is given by Eq. (3).
$$\begin{aligned} p(x|z) = \textrm{Bernoulli}\big (x|\textrm{Decoder}(z)\big ) \end{aligned}$$
(3)
where $\textrm{Decoder}(z)$ is the decoder network output.

The decoder reconstructs the original molecular structure by using the latent representation described in Eq. (4).

$$\begin{aligned} {\hat{x}} = g_{\phi }(z) \end{aligned}$$

(4)

where ${\hat{x}}$ denotes the reconstructed molecular structure and $g_{\phi }$ is the decoder function.

The VAE loss function combines the reconstruction loss with the Kullback-Leibler (KL) divergence¹⁹ between the learned latent distribution and prior distribution p(z) as described in Eq. (5).

$$\begin{aligned} {\mathcal {L}}_{\text {VAE}} = -{\mathbb {E}}_{q_{\theta }(z|x)}[\log p_{\phi }(x|z)] + D_{\text {KL}}[q_{\theta }(z|x) || p(z)] \end{aligned}$$

(5)

The reconstruction loss measures the accuracy of the decoder in reconstructing the input from the latent space. The KL divergence penalizes deviations between the learned latent distribution and the prior distribution p(z), which is typically a standard normal distribution.
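
As a minimal sketch of how the reconstruction and KL terms can be evaluated for a single sample, written as a quantity to minimize, the snippet below assumes a diagonal Gaussian encoder, a standard normal prior, and a Bernoulli decoder over binary fingerprint bits. The function names (`reparameterize`, `kl_to_standard_normal`, `bernoulli_nll`, `vae_loss`) are illustrative and are not taken from the original implementation.

```python
import math
import random


def reparameterize(mu, log_var, rng=random):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), elementwise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL[N(mu, sigma^2) || N(0, I)] for a diagonal Gaussian."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def bernoulli_nll(x, x_hat, eps=1e-7):
    """Negative Bernoulli log-likelihood (binary cross-entropy), summed over bits."""
    total = 0.0
    for xi, pi in zip(x, x_hat):
        pi = min(max(pi, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= xi * math.log(pi) + (1 - xi) * math.log(1.0 - pi)
    return total


def vae_loss(x, x_hat, mu, log_var):
    """Loss to minimize: reconstruction term plus KL regularizer."""
    return bernoulli_nll(x, x_hat) + kl_to_standard_normal(mu, log_var)
```

The KL term is zero exactly when the encoder outputs $\mu = 0$ and $\log \sigma ^2 = 0$, which is why it pulls the latent distribution toward the standard normal prior.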

GAN architecture

GANs are among the most widely used generative architectures⁷. They comprise two neural networks, the generator and the discriminator, which are trained adversarially. Together, these two modules generate realistic molecular structures that support DTI prediction.

Generator network

The generator input layer receives a random latent vector $z$ sampled from the prior distribution $p_z(z)$. The hidden layers are fully connected, with activation functions such as rectified linear units (ReLUs), and the output layer produces molecular representations. The generator function is expressed in Eq. (6).
$$\begin{aligned} x = G(z) \end{aligned}$$
(6)
where $G$ denotes the generator network parameterized by $\theta _g$.
Discriminator network

The discriminator input layer receives molecular representations, either real or generated. The hidden layers are fully connected, with activation functions such as leaky ReLU. The output layer provides a probability that indicates whether an input molecule is authentic. The discriminator function is given by Eq. (7).
$$\begin{aligned} D(x) = \sigma \big (\ell _{\theta _d}(x)\big ) \end{aligned}$$
(7)
where $\sigma$ is the sigmoid function and $\ell _{\theta _d}(x)$ is the pre-activation output (logit) of the discriminator network parameterized by $\theta _d$.
Loss function

The discriminator loss is expressed as Eq. (8).
$$\begin{aligned} {\mathcal {L}}_D = -{\mathbb {E}}_{x \sim p_{\text {data}}(x)} \left[ \log D(x) \right] - {\mathbb {E}}_{z \sim p_z(z)} \left[ \log \left( 1 - D(G(z)) \right) \right] \end{aligned}$$
(8)
where $p_{\text {data}}(x)$ represents the distribution of real molecules and $p_z(z)$ is the prior distribution of the latent vectors.

Similarly, the generator loss is expressed by Eq. (9).
$$\begin{aligned} {\mathcal {L}}_G = -{\mathbb {E}}_{z \sim p_z(z)} \left[ \log D(G(z)) \right] \end{aligned}$$
(9)
The loss function prompts the generator to produce molecules that the discriminator classifies as real.
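
The two adversarial losses can be illustrated numerically on batches of discriminator outputs. The sketch below is an assumption-laden illustration, not the authors' code: it takes lists of probabilities $D(x)$ for real molecules and $D(G(z))$ for generated ones, writes the discriminator loss in the form that is minimized, and uses the generator loss of Eq. (9). The function names are hypothetical.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-7):
    """Mean of -log D(x) over real samples plus mean of -log(1 - D(G(z))) over fakes."""
    real_term = -sum(math.log(max(p, eps)) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(max(1.0 - p, eps)) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake, eps=1e-7):
    """Generator loss of Eq. (9): mean of -log D(G(z)) over generated samples."""
    return -sum(math.log(max(p, eps)) for p in d_fake) / len(d_fake)


# Toy discriminator outputs (made-up values for illustration only).
confident = discriminator_loss(d_real=[0.99, 0.98], d_fake=[0.01, 0.02])
undecided = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

A discriminator that separates the two classes well has a lower loss than an undecided one, while the generator loss falls as the discriminator assigns higher probabilities to generated molecules.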

Multilayer perceptron (MLP)

MLPs are essential for improving DTI predictions. After generating molecules using VAEs and GANs, MLPs predict the interactions between these molecules and target proteins. As universal function approximators, MLPs capture complex nonlinear relationships, enabling accurate predictions from labeled DTI datasets. The MLP DTI prediction model includes an input layer, three hidden layers, and an output layer. The input layer merges the features of the drug molecule and the target protein into a vector, which is processed through the hidden layers using linear transformations and nonlinear activation functions. The output layer produces a scalar that indicates the probability of interaction. During the forward pass, the model computes the predicted interaction, and the loss function measures the prediction error that training seeks to reduce.

Forward pass

During the forward pass, the output of each layer is computed by applying a linear transformation followed by a nonlinear activation function. The output layer then predicts the interaction from the learned features. Consider the following notation:
- x is the input feature vector,
- $W_i$ is the weight matrix of the i-th layer,
- $b_i$ is the bias vector of the i-th layer,
- $h_i$ is the output of the i-th hidden layer,
- $\sigma$ is the activation function (e.g., ReLU for hidden layers, and sigmoid for the output layer).
Equations (10) and (11) define the computations for the hidden layers and output layer, respectively.
$$\begin{aligned} h_i&= \sigma (W_i h_{i-1} + b_i), \quad i = 1, 2, 3, \quad h_0 = x \end{aligned}$$
(10)
$$\begin{aligned} {\hat{y}}&= \sigma (W_4 h_3 + b_4) \end{aligned}$$
(11)
where ${\hat{y}}$ represents the predicted interaction between the drug and the target protein.

Loss function

The MLP model is trained using the Mean Squared Error (MSE) loss, which calculates the average squared difference between the predicted interaction (${\hat{y}}$) and true interaction (y). The loss function $L_{MLP}$ is defined by Eq. (12).
$$\begin{aligned} L_{MLP} = {\mathbb {E}}[(y - {\hat{y}})^2] \end{aligned}$$
(12)
where y denotes the true interaction label, ${\hat{y}}$ is the predicted interaction label, and ${\mathbb {E}}$ denotes the expectation (average) for all the training examples.
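
The forward pass and MSE loss described above can be sketched in plain Python. This is a toy illustration under stated assumptions: the layer widths (8 units), the random initialization, and the example feature vectors are placeholders chosen for readability, and the helper names (`dense`, `init_layer`, `forward`, `mse`) are hypothetical rather than part of the published implementation.

```python
import math
import random


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def dense(W, b, v):
    """Affine map W v + b, with W stored as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) + bi for row, bi in zip(W, b)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an arbitrary choice for the sketch)."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


def forward(layers, x):
    """ReLU hidden layers followed by a single sigmoid output unit."""
    h = x
    for W, b in layers[:-1]:
        h = relu(dense(W, b, h))
    W, b = layers[-1]
    return sigmoid(dense(W, b, h)[0])


def mse(y_true, y_pred):
    """Mean squared error over a batch of labels and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


rng = random.Random(0)
x_drug, x_target = [1.0, 0.0, 1.0, 1.0], [0.2, 0.5, 0.3]  # toy feature vectors
x = x_drug + x_target                                      # concatenated input
sizes = [len(x), 8, 8, 8, 1]                               # three hidden layers
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
y_hat = forward(layers, x)
loss = mse([1.0], [y_hat])
```

In practice the weights would be learned by backpropagation; the sketch only shows how one prediction and its loss are computed.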

GAN algorithm

The GAN framework consists of two networks: a generator, which creates synthetic samples, and a discriminator, which distinguishes real data from generated data. Both networks improve through adversarial training: the generator learns to produce more realistic samples, and the discriminator learns to detect them more accurately. The discriminator loss measures how well real and synthetic data are separated, whereas the generator is updated so that its samples are more often classified as real. Each network is optimized using its own loss function. Algorithm 1 presents the execution steps for the GAN model.

VAE algorithm

The VAE generates new data using a probabilistic model composed of an encoder and decoder. Initially, the encoder and decoder weights are randomly initialized, and optimization is performed with a fixed learning rate. The encoder projects the input into the latent space, and the decoder recreates the original input from the samples within this space. The VAE loss combines the reconstruction loss and KL divergence, regularizing the latent space to follow a standard normal distribution. Minimizing this loss function enhances reconstruction accuracy and optimizes the latent-space structure. Algorithm 2 outlines the execution steps of the VAEs.

MLP algorithm

MLP training starts by initializing the model parameters and optimizer. In each epoch, the molecular and protein features are concatenated and passed through the MLP. The forward pass computes the activations through the hidden layers to produce the predicted interaction. The mean squared error (MSE) loss is then calculated, and the weights are updated using backpropagation and gradient descent. This process is outlined in Algorithm 3.

Experiments

This section outlines the experimental configurations used for training and evaluating the VGAN-DTI framework.

Dataset

The dataset used in this study was sourced from the BindingDB repository, which contains extensive data on small molecule-protein binding affinities. A subset of approximately 1.3 million records was selected, ensuring that each entry included a PubChem CID, SMILES string, UniProt ID, sequence data, and Gene Ontology annotations. From this collection, only records containing IC₅₀ values were considered, as IC₅₀ is the most consistently reported and widely utilized binding affinity metric in DTI classification tasks^20,21.

Entries with only Ki or Kd values were excluded from this study. While Ki (inhibition constant) and Kd (dissociation constant) provide valuable insights into binding affinity, they lack standardized binary classification thresholds and are highly context-dependent, making them unsuitable for consistent label assignment across large-scale datasets.

To ensure high-quality labeling and clear binary classification, we adopted a well-established thresholding strategy commonly used in prior DTI studies^22,23, where strong and weak interactions are defined as follows:

- Strong (positive) interaction: IC₅₀ below 100 nM
- Weak (negative) interaction: IC₅₀ greater than 10,000 nM

Entries with IC₅₀ values between 100 nM and 10,000 nM or missing any essential fields were removed to maintain binary label precision and eliminate ambiguity. The dataset was further filtered to retain only drug-like small molecules (molecular weight below 1000 Da), based on extended thresholds beyond Lipinski’s Rule of Five, to include structurally diverse bioactive compounds with potential therapeutic relevance^24,25.
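
The thresholding and filtering rules above can be expressed as a short function. The sketch below assumes each record is a dictionary with hypothetical keys `ic50_nm` and `mol_weight`; these field names are illustrative and do not correspond to BindingDB column names.

```python
def label_interaction(ic50_nm, strong_below=100.0, weak_above=10_000.0):
    """Return 1 (strong), 0 (weak), or None for missing or ambiguous values."""
    if ic50_nm is None:
        return None
    if ic50_nm < strong_below:
        return 1
    if ic50_nm > weak_above:
        return 0
    return None  # 100 nM to 10,000 nM: excluded from the binary dataset


def curate(records, max_mol_weight=1000.0):
    """Keep labeled records whose molecular weight is below the cutoff (in Da)."""
    kept = []
    for rec in records:
        label = label_interaction(rec.get("ic50_nm"))
        if label is None or rec.get("mol_weight", float("inf")) >= max_mol_weight:
            continue
        kept.append({**rec, "label": label})
    return kept


# Made-up example records, for illustration only.
examples = [
    {"ic50_nm": 12.0, "mol_weight": 350.0},     # strong binder, kept
    {"ic50_nm": 500.0, "mol_weight": 420.0},    # ambiguous range, dropped
    {"ic50_nm": 25000.0, "mol_weight": 610.0},  # weak binder, kept
    {"ic50_nm": 8.0, "mol_weight": 1400.0},     # too heavy, dropped
]
curated = curate(examples)
```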

Data processing

Data quality significantly affects the accuracy of DL-based frameworks²⁶. Preparing the data for model training involves converting molecular structures into the SMILES notation, which is a compact and linear representation. Morgan fingerprints were computed to capture essential chemical properties and to identify structural features and substructures²⁷.

The dataset was preprocessed to ensure standardized molecular and target representations. Canonical SMILES strings were used for drug molecules, and UniProt IDs were utilized to identify protein targets. All IC₅₀ values were converted to nanomolar (nM) units to maintain uniformity across the dataset. Records lacking essential information such as SMILES, protein sequences, or annotations were excluded. Additionally, duplicate entries and those with inconsistent or ambiguous activity values were removed to minimize noise and improve label consistency.

A systematic partitioning scheme was applied, following established protocols in the DTI literature²³, to evaluate model generalizability. The curated dataset was divided into four experimental settings:

1. Both seen: Drug-target pairs in which both the drug and protein were present during training.
2. Drug unseen: Test pairs included novel drugs not encountered during training, with known proteins.
3. Protein unseen: Test pairs contained novel proteins, with drugs previously seen during training.
4. Both unseen: Drug-target pairs in which neither the drug nor the protein appeared during training.

This partitioning approach enables comprehensive evaluation across varying levels of difficulty and biological relevance, simulating real-world DTI prediction scenarios where novel compounds or targets may be encountered.
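
The four settings can be assigned to a test pair with a simple membership check. This is a sketch under the assumption that drugs and proteins are identified by hashable IDs (for example, PubChem CIDs and UniProt IDs); the function name `split_setting` and the returned labels are hypothetical.

```python
def split_setting(pair, train_drugs, train_proteins):
    """Classify a (drug_id, protein_id) test pair into one of the four settings."""
    drug, protein = pair
    drug_seen = drug in train_drugs
    protein_seen = protein in train_proteins
    if drug_seen and protein_seen:
        return "both_seen"
    if protein_seen:
        return "drug_unseen"
    if drug_seen:
        return "protein_unseen"
    return "both_unseen"


# Placeholder identifiers, for illustration only.
train_drugs = {"D1", "D2"}
train_proteins = {"P1"}
settings = [split_setting(p, train_drugs, train_proteins)
            for p in [("D1", "P1"), ("D9", "P1"), ("D2", "P7"), ("D9", "P7")]]
```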

Feature extraction

Feature extraction from the BindingDB dataset utilizes techniques that effectively represent the molecular structures. Morgan fingerprints encode structural fragments as vectors, whereas physicochemical properties such as octanol-water partition coefficient, denoted as log P, molecular weight, hydrogen bond acceptor counts, and topological polar surface area enrich the chemical and biological profiles of compounds²⁸. The molecular structures in the SMILES notation²⁹ are transformed into numerical formats, such as embeddings, for compatibility with DL models. Graph-based features, representing atoms as nodes and bonds as edges, capture structural details including connectivity, bond lengths, and ring sizes. These descriptors form the robust foundation for DTI prediction and generative drug design³⁰.

Feature representation

A key strength of the VGAN-DTI framework is its emphasis on precise feature representation. Small molecules encoded as SMILES strings were converted into molecular fingerprints using the RDKit module³¹ to represent structural and chemical properties for model training. Fingerprints were converted into numerical vectors that were compatible with the predictive model. Protein sequences were encoded as numeric vectors by integrating the amino acid composition and physicochemical properties to represent key biochemical features. This comprehensive feature representation enables the model to detect complex DTIs with enhanced predictive accuracy.
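
One ingredient of the protein encoding, the amino acid composition, can be sketched as follows. This is only an illustration of that single component: the physicochemical descriptors mentioned above are not included, and the handling of non-standard residues (they are ignored) is an assumption rather than a documented choice of the framework.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector with the fraction of each standard residue."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in sequence.upper():
        if residue in counts:  # skip gaps and non-standard symbols
            counts[residue] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Toy sequence, for illustration only.
vector = amino_acid_composition("MKTAYIAKQR")
```

The resulting vector has a fixed length regardless of sequence length, which makes it straightforward to concatenate with a drug fingerprint.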

Model setup and training

The VGAN-DTI framework incorporates the VAE, GAN, and MLP components to predict drug-target interactions (DTIs). The setup and key training aspects are as follows.

VAE setup

The VAE optimizes the latent representations of molecular structures by minimizing a loss function that combines reconstruction loss and KL divergence. The reconstruction loss is defined in Eq. (13).

$$\begin{aligned} L_{\text {recon}} = -{\mathbb {E}}_{q(z|x)} \left[ \log p(x|z) \right] \end{aligned}$$

(13)

KL divergence, as given in Eq. (14), regularizes the latent space to approximate a standard normal distribution.

$$\begin{aligned} L_{\text {KL}} = D_{KL} \left[ q(z \mid x) \parallel p(z) \right] \end{aligned}$$

(14)

The total VAE loss, which combines both terms, is expressed by Eq. (15).

$$\begin{aligned} L_{\text {VAE}} = L_{\text {recon}} + L_{\text {KL}} \end{aligned}$$

(15)

GAN setup

The GAN framework comprises a generator and a discriminator. The generator loss, expressed in Eq. (16), drives the generation of realistic molecular samples.

$$\begin{aligned} L_G = -{\mathbb {E}}_{z \sim p(z)} \left[ \log D(G(z)) \right] \end{aligned}$$

(16)

The discriminator loss, given in Eq. (17), enables the discriminator to differentiate real from generated data.

$$\begin{aligned} L_D = -{\mathbb {E}}_{x \sim p_{\text {data}}(x)} \left[ \log D(x) \right] - {\mathbb {E}}_{z \sim p(z)} \left[ \log (1 - D(G(z))) \right] \end{aligned}$$

(17)

MLP setup

The MLP predicts DTIs by minimizing the Mean Squared Error (MSE) loss, as shown in Eq. (18).

$$\begin{aligned} L_{\text {MLP}} = {\mathbb {E}} \left[ (y - {\hat{y}})^2 \right] \end{aligned}$$

(18)

where $y$ denotes the true interaction value, and ${\hat{y}}$ denotes the predicted value. Minimizing this loss drives the predicted values toward the observed labels.

By combining these components, the VGAN-DTI framework learns molecular representations and predicts DTIs.

Optimizing GAN, VAE, and MLP

The GAN module was trained using the Adam optimizer, with a learning rate of 0.0002. The generator and discriminator were alternately updated to minimize their respective losses, with a batch size of 64 and a latent dimension of 100. Similarly, the VAE was trained for 150 epochs using the same optimizer settings to ensure effective latent-space learning and accurate molecular representations.

The MLP model consists of three hidden layers, each with 512 units and ReLU activation. The training utilized the Adam optimizer at a learning rate of 0.0002, with the output layer employing either sigmoid or softmax activation, depending on the task. The key optimization parameters are listed in Table 1.
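
For convenience, the settings stated in this subsection can be collected in a single configuration object. The sketch below records only values given in the text; the class and field names are hypothetical, and settings that the text does not state (such as the number of GAN or MLP epochs) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters quoted in the text; field names are illustrative."""
    optimizer: str = "adam"
    learning_rate: float = 2e-4      # used for the GAN, VAE, and MLP
    gan_batch_size: int = 64
    gan_latent_dim: int = 100
    vae_epochs: int = 150
    mlp_hidden_layers: int = 3
    mlp_hidden_units: int = 512
    mlp_hidden_activation: str = "relu"


config = TrainingConfig()
mlp_layer_widths = [config.mlp_hidden_units] * config.mlp_hidden_layers
```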

Table 1 Hyperparameters for the proposed VGAN-DTI model.

DTI prediction

The DTI prediction process integrates the molecules generated by the VAE and GAN models as inputs for the MLP, facilitating accurate predictions based on their learned representations. These models generate feature vectors that capture the structural properties of drug molecules. These vectors are combined with the target protein feature vectors to form inputs for the MLP, which then predicts the likelihood of interaction between drugs and target proteins.

Let ${\textbf{x}}_d$ represent the feature vector of a drug molecule and ${\textbf{x}}_t$ represent the feature vector of the target protein. The input to the MLP is a concatenated vector ${\textbf{x}} = [{\textbf{x}}_d, {\textbf{x}}_t]$.

The forward pass through the MLP is mathematically defined by Eqs. (19) through (22).

$$\begin{aligned} h_1&= \sigma (W_1 {\textbf{x}} + b_1) \end{aligned}$$

(19)

$$\begin{aligned} h_2&= \sigma (W_2 h_1 + b_2) \end{aligned}$$

(20)

$$\begin{aligned}&\vdots \end{aligned}$$

(21)

$$\begin{aligned} {\hat{y}}&= \sigma (W_n h_{n-1} + b_n) \end{aligned}$$

(22)

Here, $W_i$ and $b_i$ are the weight and bias for the $i$-th layer, respectively, $\sigma$ is the activation function (ReLU for the hidden layers and sigmoid or softmax for the output layer), and ${\hat{y}}$ is the predicted interaction score.

Evaluation

The proposed model was rigorously evaluated using key performance metrics, with validation loss monitored during training to mitigate overfitting. Early stopping was implemented to optimize performance, and an independent test set was used to ensure unbiased assessment. The evaluation framework included precision, recall, accuracy, and F1 score. Accuracy reflects the proportion of correct predictions; precision indicates the reliability of positive predictions; recall quantifies the model’s ability to identify true positives; and the F1 score, as the harmonic mean of precision and recall, offers a comprehensive measure of predictive performance.
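
The four metrics can be computed directly from the confusion-matrix counts. The sketch below assumes binary labels with 1 for a strong interaction and 0 for a weak one; the function name and the convention of returning 0.0 when a denominator is zero are illustrative choices.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels and predictions, for illustration only.
metrics = classification_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 0],
    y_pred=[1, 1, 0, 1, 0, 0, 0, 0],
)
```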

DTI prediction distinguished between strong (IC₅₀ < 100 nM) and weak interactions (IC₅₀ > 10,000 nM). The performance metrics were analyzed across the four configurations by varying the inclusion of drugs and proteins in the training set. This comprehensive approach ensured thorough evaluation and identified areas for potential improvement.

Results

This section presents the experimental results of the VGAN-DTI approach for DTI prediction and drug design, emphasizing the effectiveness of integrating VAEs, GANs, and MLPs to achieve accurate and reliable outcomes.

Model performance

The VGAN-DTI framework achieved excellent performance across key evaluation metrics, as illustrated in Fig. 2. It achieved an AUC-ROC of 0.9523, precision of 0.9542, recall of 0.9412, F1 score of 0.9442, and an accuracy of 0.9635. These results highlight the robustness and precision of the framework, which makes it a valuable tool for drug discovery.

Comparative assessment with baseline methods

The VGAN-DTI framework advances DTI predictions by leveraging the combined strengths of the VAEs, GANs, and MLPs. VAEs explore the latent chemical space to identify novel drug candidates, whereas GANs generate diverse, high-quality molecules to augment the training data. The MLP trained on this enriched dataset achieved significant improvements in prediction accuracy. Three popular baseline methods are considered to evaluate the overall performance of the framework. As shown in Table 2, VGAN-DTI demonstrated superior predictive performance, outperforming the existing methods.

The ELECTRA-DTA model (Wang et al., 2022) reported an AUC-ROC of 0.8745, precision of 0.8321, recall of 0.8154, F1-score of 0.8236, and accuracy of 0.8452³². The MDCT-DTA model (Zhu et al., 2024) had an AUC-ROC of 0.8543, precision of 0.8215, recall of 0.7987, F1-score of 0.8099, and accuracy of 0.8368³³. The HoTS model (Lee et al., 2022) delivered an AUC-ROC of 0.8891, precision of 0.8454, recall of 0.8327, F1-score of 0.8390, and accuracy of 0.8675³⁴. The VGAN-DTI framework delivers promising results, demonstrating significant improvements over the state-of-the-art methods.

Table 2 Comparison of VGAN-DTI approach with state-of-the-art methods.

Figure 3 compares the AUC-ROC scores of the VGAN-DTI, ELECTRA-DTA³², MDCT-DTA³³, and HoTS³⁴ models, highlighting the superior performance of VGAN-DTI for DTI predictions.

Ablation study

Ablation studies were conducted to assess the individual impact of the VAE, GAN, and MLP components in the VGAN-DTI framework, with each module evaluated separately to determine its contribution to the overall predictive performance. Figure 5 illustrates the training loss curves for the VAE, GAN, and MLP components, showcasing their convergence and stability throughout the training process.

Molecule reconstruction by VAE

The VAE model encodes key features related to biological activity in the latent space, as shown in Fig. 4, enabling the generation of diverse molecular structures. The target protein interactions of these novel molecules were evaluated using the MLP model.

The VAE training optimizes the reconstruction loss and KL divergence to ensure high-quality molecule generation, as shown in Fig. 5a. This approach enabled the generated molecules to closely mirror the original molecules while retaining their potential for target-protein interactions. The model demonstrated balanced performance, achieving 92% accuracy with precision, recall, and F1 scores of 90%, 91%, and 91%, respectively, underscoring its ability to generate novel and relevant molecular structures for DTI.

Molecule generation by GAN

The GAN model generates realistic molecular structures under the guidance of a discriminator, and the generated structures are subsequently used for DTI prediction, thereby improving accuracy. The steady improvement in the training loss curve, illustrated in Fig. 5b, reflects the optimization process of the model. The GAN achieved 94% accuracy, with precision, recall, and F1 scores of 93%, 92%, and 93%, respectively. These results are attributed to efficient adversarial training, in which the generator produces artificial molecules and the discriminator assesses their authenticity, thereby enhancing the overall molecular quality.

DTI prediction by MLP

The MLP module serves as the final predictor, leveraging the molecular representations generated by the VAE and GAN models. The model training and optimization processes are illustrated in Fig. 5c, which shows a steady improvement. The model achieved 96% accuracy in distinguishing between interacting and non-interacting drug-target pairs. It also achieved a precision of 95%, indicating few false positives, and a recall of 94%, validating its strength in identifying true positives. An F1 score of 94% indicated a balanced performance across precision and recall. These findings underscore the efficacy of the model in capturing complex relationships and affirm its reliability for DTI predictions.

Discussion

This study highlights the integration of VAEs, GANs, and MLPs as a robust framework to improve DTI predictions. The following section examines the impact of the key parameters on model optimization.

Impact of training strategies

The efficacy of deep generative models depends on optimized training strategies. The VAE reconstruction loss ensures accurate molecule generation, whereas GAN adversarial training pits the generator against the discriminator so that the generator produces realistic molecules. The strong performance of the MLP is attributed to its deep architecture, dropout layers that prevent overfitting, and diverse training data. Hyperparameter optimization, including learning rate and batch size adjustments, further improved the model efficiency.

Cross-validation and generalization analysis

A 5-fold cross-validation method was used to assess the stability and reliability of the models. The dataset was split into five segments, with four segments used for training and one segment for validation in each fold. The performance metrics were recorded for each fold, along with their average and standard deviation. The results in Table 3 highlight the consistency and robustness of the proposed model.

Since VGAN-DTI is the proposed and implemented model, its performance metrics are derived from five-fold cross-validation, allowing us to report mean ± standard deviation. In contrast, baseline models are state-of-the-art methods with performance metrics reported in prior studies or standard implementations, where standard deviations are often unavailable. This distinction ensures a fair comparison without extrapolating unreported variations from existing literature.
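
The fold construction and the mean ± standard deviation summary can be sketched with the standard library. The snippet assumes samples are shuffled at random before being split, which is an assumption of this illustration; the function names `k_fold_indices` and `summarize` are hypothetical, and the per-fold scores shown are placeholders, not results from the study.

```python
import random
import statistics


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def summarize(scores):
    """Return the mean and sample standard deviation of per-fold scores."""
    return statistics.mean(scores), statistics.stdev(scores)


splits = list(k_fold_indices(n_samples=23, k=5))
placeholder_scores = [1.0, 2.0, 3.0]  # stand-in values, not reported metrics
mean_score, std_score = summarize(placeholder_scores)
```

Each sample appears in exactly one validation fold, so every record contributes once to the reported per-fold metrics.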

Table 3 5-fold cross-validation performance metrics.

The cross-validation results shown in Fig. 6 confirm the robustness and generalizability of the integrated model. Low standard deviations indicate consistent performance across data subsets, highlighting the reliability of the model. This consistency is crucial to ensure the applicability of the model to real-world data.

Error analysis

A detailed error analysis was conducted to examine misclassification patterns in VGAN-DTI. The model exhibited a misclassification rate of 12%, primarily owing to overlapping molecular features, such as molecular weight, hydrophobicity, and polar surface area, between the interacting and non-interacting classes. This feature overlap can lead to ambiguous predictions, impacting the classification performance. Figure 7 illustrates the distribution of key features in the misclassified samples, highlighting the significant overlaps that contribute to the errors. To address these challenges, the framework leverages adversarial training to enhance feature distinctions and reduce misclassifications. Additionally, the SHAP-based sensitivity analysis provides insights into feature contributions and improves interpretability and robustness. These findings indicate that the generative framework in VGAN-DTI enhances molecular representation, minimizing error propagation and improving prediction reliability over conventional DTI models.

Feature importance and interpretability

A SHapley Additive exPlanations (SHAP) summary and a sensitivity analysis were performed to assess the performance and interpretability of the proposed model, as illustrated in Fig. 8. Table 4 records observations from the SHAP analysis, highlighting key features, such as molecular weight and hydrophobicity, that influence predictions. SHAP values offer an interpretable framework that is essential for real-world applications and can inform future improvements. Figure 8a shows the SHAP summary plot, emphasizing the importance of features and their impact on predictions.
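To make the idea of ranking features by mean SHAP value concrete, the sketch below computes exact Shapley values for a tiny model by enumerating every feature ordering, then averages their absolute values over samples. It is an illustration only and does not use the SHAP library: the scoring function toy_model, its coefficients, the baseline, and the sample values are hypothetical, and exhaustive enumeration is practical only for a handful of features.

```python
from itertools import permutations
from statistics import mean


def shapley_values(model, sample, baseline):
    """Exact Shapley values for one sample by enumerating feature orderings.

    Features that are not yet "present" are held at their baseline value.
    """
    n = len(sample)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_output = model(current)
        for feature in order:
            current[feature] = sample[feature]
            output = model(current)
            contributions[feature] += output - previous_output
            previous_output = output
    return [c / len(orderings) for c in contributions]


def toy_model(x):
    """Hypothetical score over (molecular weight, hydrophobicity, polar surface area)."""
    weight, hydrophobicity, psa = x
    return 0.004 * weight + 0.5 * hydrophobicity - 0.002 * psa


# Illustrative values only (not taken from the study).
baseline = [300.0, 2.0, 80.0]
samples = [[400.0, 3.0, 60.0], [250.0, 1.0, 120.0]]

per_sample = [shapley_values(toy_model, s, baseline) for s in samples]
# Mean absolute Shapley value per feature, the usual basis for an importance ranking.
mean_abs = [mean(abs(row[j]) for row in per_sample) for j in range(3)]
print(mean_abs)
```

For each sample, the Shapley values sum to the difference between the model output for that sample and the output at the baseline, which is the additivity property that makes the per-feature contributions interpretable.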

Table 4 Feature importance based on mean SHAP values.

Sensitivity analysis showed minimal variation with changes in the input data. Figure 8b highlights the robustness and reliability of the model, ensuring consistent predictions, even in the presence of noise or slight data fluctuations. This robustness is crucial for real-world applications where data may not always be perfectly accurate or clean.
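One simple way to run a sensitivity analysis of this kind is to add small random noise to the inputs and measure how much the predictions move. The sketch below assumes Gaussian noise and uses a hypothetical logistic scoring function (toy_predict) in place of the trained classifier; the noise scales, sample values, and function names are illustrative and are not taken from the study.

```python
import math
import random


def toy_predict(x):
    """Hypothetical stand-in for a trained classifier: logistic score in (0, 1)."""
    z = 0.8 * x[0] - 0.5 * x[1] + 0.1
    return 1.0 / (1.0 + math.exp(-z))


def sensitivity(predict, samples, noise_scale, n_repeats=50, seed=0):
    """Mean absolute change in prediction under Gaussian input noise."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for x in samples:
        clean = predict(x)
        for _ in range(n_repeats):
            noisy = [v + rng.gauss(0.0, noise_scale) for v in x]
            total += abs(predict(noisy) - clean)
            count += 1
    return total / count


# Illustrative standardized feature vectors (not taken from the study).
samples = [[0.2, 1.0], [1.5, -0.3], [-0.7, 0.4]]
for scale in (0.0, 0.01, 0.1):
    print(scale, round(sensitivity(toy_predict, samples, scale), 4))
```

A model that is robust in the sense described above would show a small mean change at realistic noise levels, with the change growing smoothly as the noise scale increases.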

Implications of findings

The optimized integration of VAEs, GANs, and MLPs has significantly advanced drug discovery by expanding chemical space and improving DTI prediction. The ability of this framework to make precise DTI predictions, even for structurally diverse candidates, underscores its potential to streamline drug discovery, enhance diversity, and accelerate novel drug identification.

Limitations and future work

Despite these promising results, this study has notable limitations that require further investigation. The performance of VAEs and GANs is heavily influenced by the quality and diversity of the training data, with biases and limited diversity restricting chemical space exploration and reducing the prediction accuracy. The computational complexity of training these models requires significant resources, thereby highlighting the need for more efficient algorithms and hardware accelerators.

Interpretability remains a challenge because the understanding of the biological mechanisms behind predictions is limited. Employing explainable AI methods is essential to enhance model transparency and trust. Incorporating additional data modalities, such as genomic or proteomic data, could further improve DTI predictions by offering a more holistic view of drug-target interactions. Ethical concerns, including data privacy, model bias, and responsible AI deployment in healthcare, must also be addressed. Ensuring transparency, fairness, and safety in AI-driven drug discovery is critical for broader adoption.

Collaboration between computer scientists and biologists is vital to overcome these challenges. Combining domain expertise with technological innovation can enhance model performance, interpretability, and ethical practices, thereby advancing AI-driven drug discovery.

Related Work

The integration of generative AI models, particularly GAN-based architectures, VAEs, transformer-based large language models (LLMs) such as generative pretrained transformers (GPT)³⁵,³⁶, and diffusion models³⁷, has gained significant attention in drug discovery and presents new opportunities for enhancing DTI predictions and optimizing critical drug development processes. Research in this domain has used vast biochemical resources to explore innovative applications and to advance drug discovery³⁸. Generative models, such as GANs and VAEs, analyze patterns within chemical databases to design novel molecules from scratch, thereby facilitating key tasks such as quantitative structure-activity relationship (QSAR) modeling and molecular optimization³⁹,⁴⁰. This section reviews pioneering studies in this field, emphasizing their contributions in optimizing drug discovery pipelines.

Lin et al. explored recent advancements in drug design using GANs and introduced the FL-DISCO framework, which combines GANs with GNNs in a federated learning context. This approach demonstrates the ability to generate novel compounds with desirable drug-like properties⁴¹. Abbasi et al. developed the Feedback GAN framework, which utilizes an encoder-decoder architecture to convert SMILES into latent-space vectors and train a WGAN-GP network. This framework effectively generates realistic, diverse, and unique molecules, exploring new chemical spaces optimized for receptor-binding affinities and achieving 99% reconstruction accuracy⁴². Xu et al. introduced DeepGAN, a generative model trained on DeepSMILES data that addresses the limitations of traditional SMILES representations. Reinforcement learning principles were used to optimize the rewards and the adversarial loss, allowing the model to generate diverse molecules while improving validity and related evaluation metrics⁴³.

Li et al. presented advanced quantum machine learning (ML) methods for drug molecule generation and protein binding site classification by integrating GANs, CNNs, and VAEs. In this study, a qubit-efficient quantum GAN (QGAN-HG) and an image-based method were used to advance quantum techniques for drug discovery⁴⁴,⁴⁵. Surana et al. developed PandoraGAN, a DL approach that uses GANs to accelerate the development of novel antiviral peptides (AVPs). Using a dataset of 130 highly active peptides, PandoraGAN efficiently generated novel peptide backbones with properties similar to those of known active peptides, thereby presenting significant potential for drug development against pathogenic viruses⁴⁶. Song et al. introduced DNMG, a deep GAN architecture that utilizes transfer learning to enhance de novo molecular design by incorporating 3D spatial information and atomic physicochemical properties. This model generates novel drug-like ligands with enhanced binding affinities and physicochemical properties for specific targets, thereby advancing drug discovery⁴⁷.

Li et al. addressed the challenges in de novo drug molecule design by developing a scalable quantum generative autoencoder (SQ-VAE) for molecule reconstruction and sampling. This study explored hybrid quantum-classical networks that generated high-dimensional molecular structures with superior drug properties in various dimensions⁴⁸. Joo et al. proposed a conditional variational autoencoder (CVAE) architecture to address the challenge of generating syntactically invalid molecules in DL-based generative models. The CVAE framework, trained on molecular fingerprints and GI50 results for breast cancer cell lines, generates valid fingerprints with the desired properties and enhances database search capabilities⁴⁹.

Accurate prediction of protein-ligand binding affinity is essential for optimizing compounds and enhancing their interactions with target proteins⁵⁰. This study applied generative AI frameworks to streamline DTI predictions, improving prediction accuracy using benchmark datasets and binding affinity measurements.

Conclusion

The VGAN-DTI framework presented in this study combines variational autoencoders (VAEs), generative adversarial networks (GANs), and multilayer perceptrons (MLPs) to advance drug-target interaction (DTI) prediction. The framework achieved 96% accuracy, 95% precision, 94% recall, and a 94% F1 score, outperforming the baseline methods considered. The focus on data quality and accurate feature representation enables scalable and efficient prediction, which helps optimize molecular interaction strategies and discover novel drug candidates. This study also outlined the potential of generative AI models to streamline drug discovery by expanding the chemical space available for critical tasks. Leveraging generative AI-based computational methods can significantly reduce both the timelines and the costs of drug discovery. Future research should validate this framework across diverse datasets and integrate additional biological data to enhance its applicability and impact on personalized medicine and drug discovery.

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Code availability

The code used and developed in this study is available from the corresponding author upon reasonable request.

References

1. DiMasi, J. A. Research and development costs of new drugs. JAMA 324(5), 517–517 (2020).
2. Hughes, J. P., Rees, S., Kalindjian, S. B. & Philpott, K. L. Principles of early drug discovery. Br. J. Pharmacol. 162(6), 1239–1249 (2011).
3. Berdigaliyev, N. & Aljofan, M. An overview of drug discovery and development. Future Med. Chem. 12(10), 939–947 (2020).
4. Özçelik, R., van Tilborg, D., Jiménez-Luna, J. & Grisoni, F. Structure-based drug discovery with deep learning. ChemBioChem 24(13), e202200776 (2023).
5. Johansson, S. et al. AI-assisted synthesis prediction. Drug Discov. Today Technol. 32, 65–72 (2019).
6. Korn, M., Ehrt, C., Ruggiu, F., Gastreich, M. & Rarey, M. Navigating large chemical spaces in early-phase drug discovery. Curr. Opin. Struct. Biol. 80, 102578 (2023).
7. Goodfellow, I. et al. Generative adversarial networks. Commun. ACM 63(11), 139–144 (2020).
8. Kingma, D. P. et al. An introduction to variational autoencoders. Found. Trends. Mach. Learn. 12(4), 307–392 (2019).
9. Azlim Khan, A. K. & Ahamed Hassain Malim, N. H. Comparative studies on resampling techniques in machine learning and deep learning models for drug-target interaction prediction. Molecules 28(4), 1663 (2023).
10. Cheng, Y., Gong, Y., Liu, Y., Song, B. & Zou, Q. Molecular design in drug discovery: A comprehensive review of deep generative models. Brief. Bioinform. 22(6), bbab344 (2021).
11. Grebner, C. et al. Application of deep neural network models in drug discovery programs. ChemMedChem 16(24), 3772–3786 (2021).
12. Wei, R. & Mahmood, A. Recent advances in variational autoencoders with representation learning for biomedical informatics: A survey. IEEE Access. 9, 4939–4956 (2020).
13. Gupta, A. et al. Generative recurrent networks for de novo drug design. Mol. Inf. 37(1–2), 1700111 (2018).
14. Bordukova, M., Makarov, N., Rodriguez-Esteban, R., Schmich, F. & Menden, M. P. Generative artificial intelligence empowers digital twins in drug discovery and clinical trials. Expert Opin. Drug Discov. 19(1), 33–42 (2024).
15. Viswa, C. A., Bleys, J., Leydon, E., Shah, B. & Zurkiya, D. Generative AI in the Pharmaceutical Industry: Moving from Hype to Reality (McKinsey & Company, 2024).
16. Prabhod, K. J. Leveraging generative AI for personalized medicine: applications in drug discovery and development. J. AI-Assist. Sci. Discov. 3(1), 392–434 (2023).
17. Zeng, X. et al. Deep generative molecular design reshapes drug discovery. Cell Rep. Med. 3(12) (2022).
18. Husnain, A., Rasool, S., Saeed, A. & Hussain, H. K. Revolutionizing pharmaceutical research: Harnessing machine learning for a paradigm shift in drug discovery. Int. J. Multidiscip. Sci. Arts 2(2), 149–157 (2023).
19. Raiber, F. & Kurland, O. Kullback-Leibler divergence revisited. In Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval 117–124 (2017).
20. Gao, K. Y. et al. Interpretable drug target prediction using deep neural representation. In IJCAI. vol. 2018, 3371–3377 (2018).
21. Zhang, Y. et al. A survey of drug-target interaction and affinity prediction methods via graph neural networks. Comput. Biol. Med. 163, 107136 (2023).
22. Pahikkala, T. et al. Toward more realistic drug-target interaction predictions. Brief. Bioinform. 16(2), 325–337 (2015).
23. Öztürk, H., Özgür, A. & Ozkirimli, E. DeepDTA: Deep drug-target binding affinity prediction. Bioinformatics 34(17), i821–i829 (2018).
24. Ghose, A. K., Viswanadhan, V. N. & Wendoloski, J. J. A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery. 1. A qualitative and quantitative characterization of known drug databases. J. Combinatorial Chem. 1(1), 55–68 (1999).
25. Lipinski, C. A. Lead- and drug-like compounds: The rule-of-five revolution. Drug Discov. Today Technol. 1(4), 337–341 (2004).
26. López del Río, Á. et al. Data preprocessing and quality diagnosis in deep learning-based in silico bioactivity prediction. (2021).
27. Askr, H. et al. Deep learning in drug discovery: An integrative review and future challenges. Artif. Intell. Rev. 56(7), 5975–6037 (2023).
28. Stephenson, N. et al. Survey of machine learning techniques in drug discovery. Curr. Drug Metab. 20(3), 185–193 (2019).
29. Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28(1), 31–36 (1988).
30. Monteiro, N. R., Ribeiro, B. & Arrais, J. P. Drug-target interaction prediction: End-to-end deep learning approach. IEEE/ACM Trans. Comput. Biol. Bioinf. 18(6), 2364–2374 (2020).
31. Landrum, G. RDKit documentation. Release 1(1–79), 4 (2013).
32. Wang, J., Wen, N., Wang, C., Zhao, L. & Cheng, L. ELECTRA-DTA: A new compound-protein binding affinity prediction model based on the contextualized sequence encoding. J. Cheminform. 14(1), 14 (2022).
33. Zhu, Z. et al. Drug-target binding affinity prediction model based on multi-scale diffusion and interactive learning. Expert Syst. Appl. 255, 124647 (2024).
34. Lee, I. & Nam, H. Sequence-based prediction of protein binding regions and drug-target interactions. J. Cheminform. 14(1), 5 (2022).
35. Gillioz, A., Casas, J., Mugellini, E. & Abou Khaled, O. Overview of the transformer-based models for NLP tasks. In 2020 15th Conference on Computer Science and Information Systems (FedCSIS), 179–183 (IEEE, 2020).
36. Wolf, T. et al. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 38–45 (2020).
37. Yang, L. et al. Diffusion models: A comprehensive survey of methods and applications. ACM Comput. Surv. 56(4), 1–39 (2023).
38. Deng, J., Yang, Z., Ojima, I., Samaras, D. & Wang, F. Artificial intelligence in drug discovery: Applications and techniques. Brief. Bioinform. 23(1), bbab430 (2022).
39. Bian, Y. & Xie, X. Q. Generative chemistry: drug discovery with deep learning generative models. J. Mol. Model. 27, 1–18 (2021).
40. Walters, W. P. & Barzilay, R. Critical assessment of AI in drug discovery. Expert Opin. Drug Discov. 16(9), 937–947 (2021).
41. Lin, E., Lin, C. H. & Lane, H. Y. Relevant applications of generative adversarial networks in drug design and discovery: Molecular de novo design, dimensionality reduction, and de novo peptide and protein design. Molecules 25(14), 3250 (2020).
42. Abbasi, M. et al. Designing optimized drug candidates with Generative Adversarial Network. J. Cheminform. 14(1), 40 (2022).
43. Xu, M., Cheng, J., Liu, Y. & Huang, W. DeepGAN: Generating molecule for drug discovery based on generative adversarial network. In IEEE Symposium on Computers and Communications (ISCC). vol. 2021, 1–6 (IEEE, 2021).
44. Li, J. et al. 58th ACM/IEEE Design Automation Conference (DAC). vol. 2021, 1356–1359 (IEEE, 2021).
45. Schuld, M., Sinayskiy, I. & Petruccione, F. An introduction to quantum machine learning. Contemp. Phys. 56(2), 172–185 (2015).
46. Surana, S., Arora, P., Singh, D., Sahasrabuddhe, D. & Valadi, J. PandoraGAN: Generating antiviral peptides using generative adversarial network. SN Comput. Sci. 4(5), 607 (2023).
47. Song, T. et al. DNMG: Deep molecular generative model by fusion of 3D information for de novo drug design. Methods 211, 10–22 (2023).
48. Li, J. & Ghosh, S. Scalable variational quantum circuits for autoencoder-based drug discovery. In Design, Automation & Test in Europe Conference & Exhibition (DATE). vol. 2022, 340–345 (IEEE, 2022).
49. Joo, S., Kim, M. S., Yang, J. & Park, J. Generative model for proposing drug candidates satisfying anticancer properties using a conditional variational autoencoder. ACS Omega 5(30), 18642–18650 (2020).
50. Rezaei, M. A., Li, Y., Wu, D., Li, X. & Li, C. Deep learning in drug design: Protein-ligand binding affinity prediction. IEEE/ACM Trans. Comput. Biol. Bioinf. 19(1), 407–417 (2020).

Author information

These authors contributed equally to this work: Sanjay R. Sutar and Arvind W. Kiwelekar.

Authors and Affiliations

Department of Information Technology, Dr. Babasaheb Ambedkar Technological University, Lonere, Raigad, Maharashtra, 402103, India
Roshan R. Kotkondawar, Sanjay R. Sutar, Vinod J. Kadam & Shivajirao M. Jadhav
Department of Computer Engineering, Dr. Babasaheb Ambedkar Technological University, Lonere, Raigad, Maharashtra, 402103, India
Arvind W. Kiwelekar

Contributions

RRK: Conceptualization, formal analysis, methodology, data curation, visualization, model development, result analysis and validation, writing-original draft. SRS: Conceptualization, methodology, validation, supervision, project administration, writing-review & editing. AWK: Conceptualization, methodology, supervision, project administration, writing-review & editing. VJK: Data curation, validation, writing-review & editing. SMJ: Validation, project administration, writing-review & editing.

Corresponding author

Correspondence to Roshan R. Kotkondawar.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Kotkondawar, R.R., Sutar, S.R., Kiwelekar, A.W. et al. A generative framework for enhancing drug target interaction prediction in drug discovery. Sci Rep 15, 35588 (2025). https://doi.org/10.1038/s41598-025-01589-9

Received: 31 January 2025
Accepted: 07 May 2025
Published: 13 October 2025
DOI: https://doi.org/10.1038/s41598-025-01589-9
