Exploring the macrocyclic chemical space for heuristic drug design with deep learning models

Hu, Feng; Jia, Xiaotong; Liao, Wenjie; Chen, Ziqi; Bi, Hongjie; Ge, Huan; Liu, Dandan; Zhang, Rongrong; Hu, Yuting; Mei, Wenyi; Zhao, Zhenjiang; Zhang, Kai; Zhu, Lili; Diao, Yanyan; Li, Honglin

doi:10.1038/s42004-025-01686-w

Article
Open access
Published: 07 October 2025

Feng Hu ORCID: orcid.org/0009-0008-1654-1246¹^na1,
Xiaotong Jia¹^na1,
Wenjie Liao¹^na1,
Ziqi Chen¹,
Hongjie Bi¹,
Huan Ge¹,
Dandan Liu¹,
Rongrong Zhang¹,
Yuting Hu¹,
Wenyi Mei¹,
Zhenjiang Zhao¹,
Kai Zhang²,
Lili Zhu¹,
Yanyan Diao² &
…
Honglin Li ORCID: orcid.org/0000-0003-2270-1900^1,2

Communications Chemistry volume 8, Article number: 299 (2025)

Abstract

Macrocyclic compounds hold great promise as therapeutic agents. However, their structural optimization remains constrained by the limited availability of bioactive candidates, which in turn hampers the systematic exploration of structure-activity relationships. Here we introduce CycleGPT, a generative chemical language model designed specifically to address these challenges. CycleGPT is characterized by a progressive transfer learning paradigm that incrementally transfers knowledge from pre-trained chemical language models to specialized macrocycle generation, thereby overcoming the data shortage issue. Meanwhile, it adopts an innovative probabilistic sampling strategy that effectively improves the structural novelty of generated macrocycles while ensuring domain-specific adaptability. In a prospective drug design based on CycleGPT and a JAK2 activity prediction model, we successfully developed a new JAK2 drug candidate with a good selectivity profile (inhibiting 17 wild-type kinases) and promising potential for treating polycythemia in vivo, demonstrating the practicality of deep learning methods in macrocyclic drug design.

Introduction

Innovative drug research, propelled by advances in chemistry, aims to identify effective molecular agents for disease treatment¹. While small molecules, especially those following Lipinski’s rule, remain central to drug development, their limited binding interfaces and insufficient affinity for desired targets often result in poor stability², short half-life, and off-target effects^3,4. To address these limitations, researchers are exploring distinct modalities to expand the druggable chemical space, with macrocycles emerging as promising candidates for overcoming the constraints of traditional small-molecule drugs^1,5. Macrocycles are typically defined as cyclic molecules or peptides containing a ring of 12 or more atoms, bridging the gap between small molecules and antibodies^1,6,7. These structures can form large contact interfaces with proteins⁸, yielding higher binding affinity and improved selectivity. Meanwhile, their chameleonic properties⁷ enhance drug stability and confer favorable pharmacokinetic profiles^9,10,11. Macrocycles have now been successfully developed as therapeutic agents against a variety of drug targets, such as kinases, proteases, and G-protein-coupled receptors, achieving excellent outcomes in the treatment of many diseases.

Rational macrocyclic drug design typically involves the macrocyclization of bioactive linear compounds and the subsequent modification of the resulting macrocycles¹². In the macrocyclization step, there are already some computational methods^13,14 that leverage geometrically constrained linker searching and connection strategy from a prebuilt 3D linker database to derive distinct macrocyclic skeletons. In addition, our team previously introduced Macformer, a substructure-aligned SMILES augmentation strategy with a Transformer architecture to automatically generate macrocyclic linkers¹⁵. However, given a bioactive macrocyclic compound as a starting point, how to further modify and optimize its structure (e.g., through macrocyclic scaffold hopping or substituent modifications on the macrocyclic ring) to enhance druggability or expand the pool of candidates for early-stage clinical screening remains an unresolved challenge. So far, such modifications primarily depend on the expert experience of pharmaceutical chemists or the utilization of iterative methods such as pharmacophore replacement, which are time-consuming and labor-intensive. A representative case is the discovery of macrocyclic compound Z11, a selective CDK9 inhibitor, designed to overcome resistance to Osimertinib in non-small cell lung cancer. Beginning with the linear precursor Z0, the researchers first performed macrocyclization to obtain Z1, followed by three iterative rounds of structural modification, ultimately yielding the optimized macrocyclic compound Z11¹⁶. Despite such progress, effective computational strategies for the structural optimization of macrocycles remain scarce in the literature.

Thoroughly dissecting the intricate chemical space of macrocycles provides fundamental guidance for their structural modification and optimization. Because macrocycles are an emerging and highly underexplored class of structures, knowledge about them is limited, and researchers have been struggling to elucidate what properties make a macrocyclic compound suitable for drug development. Viarengo-Baker et al.¹⁷ conducted a principal component analysis of oral and non-oral macrocyclic compounds and identified 13 properties that could be used to design macrocyclic compounds in the same chemical space as oral macrocyclic drugs. Jimenez et al.¹ systematically analyzed the FDA-approved macrocyclic drugs and proposed a simple bi-descriptor model, i.e., hydrogen bond donors less than or equal to 7 in combination with either molecular weight less than 1000 or cLogP greater than 2.5, as filtering criteria for screening oral macrocyclic drugs. However, due to the structurally complex nature of macrocycles and the scarcity of available macrocyclic therapeutics, explicit descriptors offer only limited guidance for the design of macrocyclic drugs.
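As an illustration of how simple such an explicit descriptor rule is, the bi-descriptor criterion quoted above can be written as a one-line predicate. This is a minimal sketch: the function and argument names are hypothetical, and the descriptor values are assumed to have been computed elsewhere.

```python
def passes_oral_macrocycle_filter(hbd: int, mol_weight: float, clogp: float) -> bool:
    """Bi-descriptor rule as quoted in the text:
    HBD <= 7 AND (molecular weight < 1000 OR cLogP > 2.5).
    Descriptor values are assumed to be precomputed by a cheminformatics toolkit."""
    return hbd <= 7 and (mol_weight < 1000 or clogp > 2.5)
```

A compound with too many hydrogen bond donors fails regardless of the other two descriptors, whereas a heavy compound can still pass if it is sufficiently lipophilic.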

In this paper, we explore the compounds situated in the immediate chemical neighborhood of lead macrocycles, which could have great potential to serve as drug candidates. We achieve this by introducing CycleGPT, a molecular generative model tailored for systematic exploration and expansion of privileged macrocycle chemical space. CycleGPT is characterized by a progressive transfer learning paradigm that incrementally transfers knowledge from pre-trained chemical language models to specialized macrocycle generation, overcoming critical data scarcity in the target domain. In particular, it can effectively sample macrocycles from the neighboring chemical space of privileged macrocyclic candidates, converting the problem of macrocycle structural modification into the exploration of the chemical space of macrocycles. In addition, we designed a generative sampling scheme named HyperTemp that lets CycleGPT dynamically balance the exploitation of high-probability tokens with the exploration of alternative pathways, thus achieving a superior equilibrium between the novelty and validity of macrocycles. Using CycleGPT, we designed a number of macrocycles specifically for the JAK2 target, starting from the macrocycles designed by Macformer in our previous research¹⁵. In this CycleGPT-driven prospective drug design, three potent macrocyclic JAK2 inhibitors were identified, with IC₅₀ values of 1.65 nM, 1.17 nM, and 5.41 nM, respectively. One of them exhibits an even better kinase selectivity profile than the marketed drugs Fedratinib and Pacritinib. Furthermore, the discovered macrocycle can inhibit rhEPO-mediated polycythemia and splenomegaly in BALB/c mice at a lower dose than Fedratinib and Pacritinib. These therapeutic candidates demonstrate the significant potential of CycleGPT for advancing macrocyclic drug discovery.

Results

Model overview

GPT-based models have demonstrated exceptional performance in sequence processing tasks in recent years¹⁸. To explore the macrocyclic chemical space, we developed a model called CycleGPT based on the progressive transfer learning strategy. The model interprets and generates chemical language representations at a concise and efficient character level, which allows it to effectively sample the adjacent chemical space of privileged macrocycles and explore potential alternative macrocyclic drug candidates. To address the challenge of insufficient macrocyclic compounds, CycleGPT was first pre-trained using 365,063 active compounds with biological activity labels to grasp SMILES semantics. These compounds were extracted from the ChEMBL¹⁹ database with IC₅₀/EC₅₀/K_d/K_i values lower than 1 μM and SMILES strings shorter than 140 tokens in length. Subsequently, we collected 19,920 macrocyclic molecules with SMILES lengths of less than 140 characters from the ChEMBL and DrugBank²⁰ databases. These macrocycles were utilized for transfer learning on the pre-trained CycleGPT model, aiming to adapt the model’s knowledge from the chemical space of bioactive linear molecules to that of macrocyclic compounds. The model can be further fine-tuned with macrocyclic hits for designing target-specific drug candidates (Fig. 1a, b). We use the Lion²¹ optimizer to adjust the network parameters.

In molecular generation, the sampling algorithm and network architecture jointly determine the quality of the generated SMILES. However, existing sampling algorithms remain unsatisfactory for macrocyclic SMILES, especially in balancing the validity and novelty of generated compounds. In this paper, we propose a heuristic sampling algorithm, HyperTemp, which applies a transformation on top of tempered sampling to enable fine-grained adjustment of the token probabilities. HyperTemp appropriately reduces the probability of optimal tokens while increasing the probability of suboptimal tokens, improving novelty while maintaining the validity of the generated macrocycles. We combine HyperTemp sampling with the CycleGPT architecture to realize a complete macrocyclic generation model, and systematically compare HyperTemp with existing sampling algorithms. To verify the validity and effectiveness of the introduced transformation, we also selected three other transformations that adjust the probabilities differently and compared them with HyperTemp (Supplementary Formula 1–3). To the best of our knowledge, this work is among the first to comprehensively explore sampling algorithms in molecular generation for macrocycles, an underexplored yet crucial subclass.

Model performance

The performance of CycleGPT-HyperTemp was evaluated and compared with other molecular generation methods^{22,23,24,25,26,27,28,29,30}, with results shown in Table 1. In addition to validity and macrocycle_ratio, greater emphasis is placed on novel_unique_macrocycles, a comprehensive metric quantifying the proportion of generated valid and unique macrocycles that are absent from the training dataset. Char_RNN can generate enough valid macrocycles but with a very low novel_unique_macrocycles value (11.76%), while the GPT-based models MolGPT and cMolGPT failed to capture the semantics of macrocycles. Llamol and MTMol-GPT demonstrate advantages over other models in terms of the novel_unique_macrocycles metric (38.13% and 31.09%, respectively), yet there remains a notable margin compared to our CycleGPT-HyperTemp model (55.80%).

Table 1 Comparison of CycleGPT-HyperTemp and other models

To have a more comprehensive understanding of the characteristics and performance of the HyperTemp sampling, we took CycleGPT as the base model and replaced HyperTemp with 13 different sampling strategies. As shown in Supplementary Table 1, when employing the MaxEntropy or Noised Top-K algorithm, CycleGPT undergoes a cliff-like decline across all metrics. Overall, with respect to the novel_unique_macrocycles metric, the HyperTemp sampling method performs best among all algorithms. We further analyzed the effect of the HyperTemp sampling algorithm on the generated tokens. As illustrated in Fig. 2a–c, through finer probability adjustment based on tempered sampling, HyperTemp further reduces the preference for optimal tokens and enhances the exploration of suboptimal tokens, which promotes the diversity of token sampling and improves the novelty.

**Fig. 2: Behavior of HyperTemp sampling and chemical space exploration of Lorlatinib using CycleGPT-HyperTemp.**

In addition, we conducted a systematic evaluation of MOSES⁹ metrics for macrocycles generated by different methods, excluding those with poor performance in novel_unique_macrocycles (<20%). Considering the numerous comparison methods and their subtle variations, we summarized only the top three methods for each property and displayed them in Supplementary Tables 2–3. Across the 10 properties, CycleGPT combined with either HyperTemp or Top-p sampling ranks in the top three for six, outperforming all other methods. Moreover, the molecular properties analyses (Supplementary Figs. 1–3) reveal that the generated macrocycles from CycleGPT-HyperTemp possess a similar distribution compared with the training dataset. The above results demonstrate that our CycleGPT-HyperTemp model could generate a higher proportion of valid and unique macrocycles while maintaining molecule diversity and property quality.

To evaluate the effectiveness of our method in downstream target applications, we further expanded, as an example, the chemical space of the macrocyclic compound Lorlatinib using our CycleGPT-HyperTemp method. The visualization of the generated molecular chemical space is shown in Fig. 2d. After fine-tuning with Lorlatinib, the generated macrocycles migrated to the nearby chemical space of Lorlatinib, demonstrating that the model explores the intended chemical space. Furthermore, the results illustrate that our method can achieve structural modification of macrocycles, including (1) macrocyclic scaffold hopping and (2) peripheral substituent modifications. These structural modifications are consistent with the common modification methods in medicinal chemistry, reflecting the practicality of structural modifications implemented by our method.

Modification of macrocyclic JAK2 inhibitors using CycleGPT

The Janus kinases (JAKs), a family of intracellular tyrosine kinases, play a pivotal role in mediating the signaling of many cytokines and are implicated in the pathogenesis of various diseases, such as myeloproliferative neoplasms and rheumatoid arthritis^31,32,33. We previously designed three macrocyclic Janus kinase 2 (JAK2) inhibitors using Macformer through macrocyclization of Fedratinib, with compound M3 (renamed for clarity) demonstrating potential as a drug candidate¹⁵. Having alternative drug candidates provides contingency options that mitigate risks and ensure continuity in the drug development process. In this study, CycleGPT was utilized to explore the neighboring chemical space of these three inhibitors to obtain potential alternative drug candidates targeting JAK2 for further investigation.

Ten-fold augmentation with randomized SMILES strings was performed on the three macrocyclic JAK2 inhibitors, and the augmented set was used to fine-tune the CycleGPT model toward this specific chemical space. The HyperTemp algorithm was employed during inference, generating 5058 macrocycles distributed around the starting macrocycles, indicating that CycleGPT-HyperTemp learned and explored the space of the JAK2 macrocycles correctly (Fig. 1d). To further enrich the macrocycles generated by CycleGPT, we employed CyclePred, a heterogeneous graph transformer model, to predict JAK2 inhibitory activity (Fig. 3a). The RMSE of CyclePred over five repeats of 5-fold cross-validation is 0.6717 and drops to 0.5776 when predicting JAK2 macrocycles outside the training set (lower is better). The predicted and experimental -pIC₅₀ values exhibit a good correlation, with an R² value of 0.7036 (Fig. 3c).

**Fig. 3: Workflow and performance of CyclePred when predicting the activity of JAK2 inhibitors and in vitro activities of picked 6 generated macrocycles.**

After predicting JAK2 activity with CyclePred, the top 70 macrocycles were selected for docking simulation using both Glide³⁴ and Rosetta³⁵. The Fedratinib-JAK2 kinase domain complex (PDB code 6VNE)³⁶ was used for molecular docking of these 70 macrocyclic compounds. According to the docking results (Fig. 3d) and the synthetic experience of pharmaceutical chemists, six macrocycles (Fig. 3e) were finally selected for chemical synthesis (Supplementary Figs. 5–19) and biological activity evaluation. The binding modes of these six macrocyclic compounds with the target are shown in Supplementary Fig. 4 (visualized using ChimeraX)³⁷. All six compounds interact well with the target.

Activities of designed macrocyclic JAK2 inhibitors

Kinase assays were performed using the Z′-LYTE™ kinase assay kit to determine the inhibitory activities of these six macrocyclic compounds, with compound M3 from our previous study and Fedratinib as positive controls (Fig. 3f and Supplementary Fig. 21). Three of the six macrocycles demonstrated strong JAK2 inhibition with single-digit nanomolar IC₅₀ values, and the most potent, compound 2 (IC₅₀ = 1.17 nM), showed superior activity compared to Fedratinib. These compounds were further evaluated for their antiproliferative activities against HEL and SET-2 cells (Fig. 3f), which are human erythroleukemia cells and primary thrombocytosis cells, respectively, both carrying the JAK2 V617F mutation. Compound 2 exhibited significant antiproliferative activities against HEL and SET-2 cells carrying the JAK2^V617F mutation by inhibiting the JAK2-STAT signaling pathway in a dose-dependent manner (Fig. 4a and Supplementary Fig. 22). Furthermore, compound 2 displayed a kinome selectivity profile superior to those of Fedratinib and Pacritinib (Fig. 4b). At a concentration of 100 nM, compound 2 inhibited 17 wild-type kinases (percent control <35%), whereas Pacritinib and Fedratinib inhibited 55 and 34, respectively. The kinase selectivity profile of Fedratinib can be obtained from our previous article¹⁵.

**Fig. 4: Compound 2 potently inhibited JAK2-mediated signaling and inhibited rhEPO-mediated polycythemia and splenomegaly in BALB/c mice.**

Building upon the superior inhibition and the kinase profiling results of compound 2, subsequent in vivo efficacy studies were conducted to further validate its biological effects. Studies have shown that treating rodents with recombinant human erythropoietin (rhEPO) can simulate the main symptoms of human polycythemia vera (PV)^38,39, including increased reticulocyte count and hematocrit and splenomegaly caused by extramedullary hematopoiesis. We treated BALB/c mice with daily s.c. injections of 10 units of rhEPO along with once-daily oral administration of either compound 2 (dosed at 25, 50, and 75 mg/kg) or vehicle for 4 consecutive days. Compared to the control group, the rhEPO-induced model group showed elevated reticulocyte counts and hematocrit, expansion of Ter119/CD71 erythroblasts in spleen and bone marrow, and splenomegaly. Notably, compound 2 suppressed rhEPO-induced hematocrit expansion, reticulocytosis, and splenomegaly (Fig. 4c–f), with the effects of 100 mg/kg compound 2 being comparable to those of 120 mg/kg Fedratinib and superior to those of 120 mg/kg Pacritinib. Moreover, compound 2 inhibited the expansion of Ter119/CD71 erythroblasts in both the spleen and bone marrow (Fig. 4g, h and Supplementary Figs. 20 and 23a–b). Additionally, it had no significant impact on white blood cell counts or body weights (Fig. 4i and Supplementary Fig. 23c). Overall, these results indicate that compound 2 has potential for the treatment of polycythemia.

Discussion

Macrocycles are a promising class of drugs whose properties could address the shortcomings of small-molecule drugs. Structural modification of privileged macrocycles is an important route for drug design. To handle macrocycle structural modification more effectively, we proposed CycleGPT, which builds on a progressive transfer learning strategy to transform macrocycle structural modification into macrocyclic chemical space exploration. For this specific task, the HyperTemp sampling algorithm was designed to improve the diversity and quality of generated macrocycles. The HyperTemp sampling algorithm in conjunction with CycleGPT can achieve structural modification of macrocycles, including macrocyclic scaffold hopping and modification of substituents on the macrocyclic ring, which is in line with the modification ideas of medicinal chemists.

From the molecules generated by CycleGPT-HyperTemp, six compounds were selected for synthesis based on JAK2 activity prediction and molecular docking scores. Subsequent biological tests revealed that compounds 1, 2, and 6 showed high inhibitory activity against JAK2 at both enzyme and cellular levels, with compound 2 displaying an improved selectivity profile against 468 kinases compared to Pacritinib and Fedratinib. In addition, compound 2 showed potential to treat polycythemia. This prospective case highlights the practicality of CycleGPT combined with HyperTemp sampling for exploring macrocyclic chemical space. Moreover, it provides potential macrocyclic drug candidates for further development of JAK2-targeted therapies. We expect that CycleGPT will significantly enhance macrocyclic drug design as an advanced extension of current computational methods, enabling efficient exploration and modification of emerging candidate compounds.

Methods

Datasets

Bioactive molecules with activity values (IC₅₀, EC₅₀, K_d, K_i) < 1 μM were collected from the ChEMBL¹⁹ database. The molecules were converted into canonical SMILES strings using RDKit, with only those no longer than 140 tokens retained. After removing stereochemistry, salts, and duplicate molecules, 365,063 active molecules were eventually obtained as unique SMILES strings.
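The activity and length filters described above can be sketched in a few lines. This is a minimal illustration rather than the authors' pipeline: the record layout (`type`, `value_nm`, `smiles`) is hypothetical, tokens are assumed to be single characters (the model is described as character-level), and canonicalization and removal of stereochemistry and salts with RDKit are assumed to have been done beforehand.

```python
ACTIVITY_TYPES = {"IC50", "EC50", "Kd", "Ki"}
MAX_ACTIVITY_NM = 1000.0  # 1 uM expressed in nM
MAX_SMILES_LEN = 140      # assumes one token per character


def filter_bioactive(records):
    """Keep unique SMILES with a qualifying activity below 1 uM
    and a length of at most 140 characters.

    `records` is an iterable of dicts with hypothetical keys
    'type', 'value_nm', and 'smiles' (already canonicalized)."""
    seen = set()
    kept = []
    for rec in records:
        if rec["type"] not in ACTIVITY_TYPES:
            continue
        if rec["value_nm"] >= MAX_ACTIVITY_NM:
            continue
        if len(rec["smiles"]) > MAX_SMILES_LEN:
            continue
        if rec["smiles"] in seen:  # drop duplicate molecules
            continue
        seen.add(rec["smiles"])
        kept.append(rec["smiles"])
    return kept
```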

The macrocycles used for progressive transfer learning mainly came from two databases, ChEMBL and Drugbank²⁰. Only macrocycles with SMILES strings shorter than 140 tokens in length were retained. We integrated and removed duplicate molecules from both datasets, resulting in a total of 19,920 macrocycles.

CycleGPT model

CycleGPT was implemented based on the nanoGPT architecture (https://github.com/karpathy/nanoGPT), which uses PyTorch to construct the GPT network. In CycleGPT, we used the auto-regressive approach to generate macrocycle SMILES. For a sequence $\mathbf{x}=(x_{1},x_{2},x_{3},\ldots ,x_{T})$, the auto-regressive model decomposes the joint probability distribution through the chain rule:

$$p_{\theta }(\mathbf{x})=\prod_{t=1}^{T}p_{\theta }(x_{t}\mid x_{<t}),$$

(1)

where $x_{t}$ represents the $t$-th token and $\theta$ represents the model parameters. When training the auto-regressive model, we optimized the model parameters $\theta$ by minimizing the cross-entropy loss.
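The link between the chain rule in Eq. (1) and the cross-entropy loss can be made concrete with a small sketch. It assumes the per-step conditional probabilities of the observed tokens are already available; the function names are illustrative and not taken from the CycleGPT code.

```python
import math


def sequence_log_prob(step_probs):
    """Log of Eq. (1): sum over t of log p(x_t | x_<t).

    `step_probs[t]` is the probability the model assigned to the
    token actually observed at position t."""
    return sum(math.log(p) for p in step_probs)


def cross_entropy(step_probs):
    """Average negative log-likelihood per token, i.e. the quantity
    minimized when training an auto-regressive model on one sequence."""
    return -sequence_log_prob(step_probs) / len(step_probs)
```

Working in log space turns the product in Eq. (1) into a sum, which is what makes the loss numerically stable for sequences of up to 140 tokens.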

CycleGPT used 12 blocks, with the hidden dimension set to 768. The maximum length of input and generated SMILES was set to 140 tokens, and the embedding vector size of each token was set to 768. It employed the causal self-attention mechanism with 12 attention heads. The outputs of these heads are concatenated and projected into the final output. The scaled dot-product attention layer takes three matrices as inputs: a matrix $\mathbf{Q}$ containing a set of queries, a matrix $\mathbf{K}$ containing keys, and a matrix $\mathbf{V}$ containing values. The attention calculation is as follows:

$${{\rm{attention}}}({{\bf{Q}}},{{\bf{K}}},{{\bf{V}}})={{\rm{softmax}}}\left(\frac{{{\bf{Q}}}\cdot {{{\bf{K}}}}^{{{\bf{T}}}}}{\sqrt{{d}_{k}}}\right){{\bf{V}}},$$

(2)

where $d_{k}$ is the dimension of the key vectors, so that $\sqrt{d_{k}}$ acts as the scaling factor. We employed the Lion optimizer²¹, introduced by Google researchers in 2023, to optimize the parameters.
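For readers who prefer code to matrix notation, Eq. (2) with a causal mask can be sketched for a single head in plain Python. This is an illustrative toy under stated assumptions (list-of-lists matrices, no learned projections, no batching), not the PyTorch implementation used in nanoGPT.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(Q, K, V):
    """Single-head scaled dot-product attention, Eq. (2), where
    position t may only attend to positions 0..t (causal mask)."""
    d_k = len(K[0])
    out = []
    for t, q in enumerate(Q):
        # Scores against the current and earlier positions only.
        scores = [
            sum(qi * ki for qi, ki in zip(q, K[s])) / math.sqrt(d_k)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        out.append([
            sum(weights[s] * V[s][j] for s in range(t + 1))
            for j in range(len(V[0]))
        ])
    return out
```

Restricting each position to its prefix is what makes the attention consistent with the factorization in Eq. (1): the prediction for token t never sees tokens after it.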

Sampling strategy

The sampling strategies of language models play a key role in determining generative quality. There are many popular sampling schemes in the literature, including MaxEntropy sampling, Random mask sampling, Top-K sampling, Noised top-K sampling, Top-p sampling, Tempered sampling, and Tempered top-K sampling⁴⁰. However, these sampling schemes were mostly designed for generating text sequences. When they are used directly for generating SMILES sequences, the quality of the generated molecules leaves room for improvement, especially in terms of the balance between the validity and the novelty of the molecules (see Supplementary Table 3).

We designed a sampling algorithm named HyperTemp for generating high-quality SMILES sequences. The main idea of HyperTemp is to introduce a nonlinear transform that adaptively modulates the output probabilities computed by the language model for each token, and then to apply multinomial sampling on the rectified probabilities. Specifically, the nonlinear transform is chosen as the hyperbolic tangent (tanh) function, and the rectified probabilities are defined as:

$$\hat{p}_{i}=\frac{p^{\prime}_{i}}{\sum_{j=1}^{|N|}p^{\prime}_{j}},\quad \text{where}\quad p^{\prime}_{i}=\tanh \left(\frac{\exp (\log (p_{i})/T)}{\sum_{j=1}^{|N|}\exp (\log (p_{j})/T)}\right),$$

(3)

where $N$ is the vocabulary (of size $|N|$), $p_{i}$ represents the output probability of the $i$-th token in the vocabulary, and $T$ is an adjustable temperature. The rectified probabilities $\hat{p}_{i}$ are then used for sampling the next token.

The HyperTemp sampling strategy employs a nonlinear transformation to recalibrate token probabilities. This transformation moderately attenuates the top-ranked token probability while proportionally enhancing suboptimal token probabilities, thereby expanding the exploration of molecular candidates at each step. Crucially, the tanh function preserves the original relative ranking of tokens due to its monotonicity, ensuring that sequence generation adheres to chemically valid and syntactically regular patterns. By dynamically balancing exploitation of high-probability tokens with exploration of alternative pathways, HyperTemp achieves superior equilibrium between molecular novelty and structural validity compared to conventional sampling approaches (Supplementary Table 1).
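A direct transcription of Eq. (3) into plain Python makes the behavior described above easy to check. This is a minimal sketch, not the authors' code: the function names are illustrative, and the handling of zero-probability tokens is an assumption added to avoid taking the logarithm of zero.

```python
import math
import random


def tempered(probs, temperature):
    """Tempered distribution: exp(log(p_i)/T), renormalized."""
    weights = [
        math.exp(math.log(p) / temperature) if p > 0 else 0.0  # assumption: zeros stay zero
        for p in probs
    ]
    total = sum(weights)
    return [w / total for w in weights]


def hypertemp(probs, temperature):
    """Eq. (3): apply tanh to the tempered probabilities, then renormalize."""
    squashed = [math.tanh(q) for q in tempered(probs, temperature)]
    total = sum(squashed)
    return [s / total for s in squashed]


def sample_token(probs, temperature, rng=random):
    """Draw one token index from the rectified distribution."""
    p_hat = hypertemp(probs, temperature)
    return rng.choices(range(len(p_hat)), weights=p_hat, k=1)[0]
```

For example, with `probs = [0.7, 0.2, 0.1]` and `T = 1`, `hypertemp` returns approximately `[0.67, 0.22, 0.11]`: the top token loses a little mass, the others gain, and the ordering is unchanged. This follows because tanh(x)/x decreases with x on (0, 1], so larger probabilities are shrunk relatively more before renormalization.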

It is noteworthy that increasing the temperature during the sampling process can enhance the novelty of the generated samples, a technique widely used in the AI literature. However, this adjustment is restricted to a narrow range, since excessively high temperatures substantially compromise the validity (or normality) of the generated samples. Therefore, within the usable temperature range, the capacity of elevated temperatures to improve sampling novelty remains limited. In comparison, the proposed HyperTemp sampling scheme offers a robust way to balance novelty and validity. Empirical results demonstrate its efficacy in shifting probability mass toward suboptimal tokens to improve the novelty of generation (Fig. 2a–c and Supplementary Table 1).

JAK2 activity prediction

CyclePred, implemented based on MacFrag⁴¹ and PharmHGT⁴², was used to predict the JAK2 activity of generated macrocycles for further screening. The training data were the experimental IC₅₀ values of 8157 JAK2 inhibitors from PubChem⁴³, ChEMBL¹⁹, and BindingDB⁴⁴. CyclePred uses MacFrag to segment each molecule into its smallest building blocks, combines atomic information with the MacFrag building-block information to construct a heterogeneous graph of the macrocycle, and then applies multi-view message-passing and readout blocks.

Model evaluation metrics

CycleGPT was evaluated using metrics widely used in previous work on molecule generation²², including the following:

Validity refers to the percentage of generated valid molecules in the sampled molecules.

Macrocycle_ratio evaluates the proportion of compounds that are macrocycles in the generated molecules.

Novel_unique_macrocycles refers to the percentage of generated molecules that are macrocycles, unique (after removing duplicate macrocycles), and not present in the training dataset. It is a comprehensive metric that considers the novelty, uniqueness, and validity (macrocycle) of generated molecules.
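The three metrics above can be sketched as follows. This is an illustrative reading of the definitions, not the authors' evaluation script: the sample layout is hypothetical, validity and ring-size checks are assumed to have been done by a toolkit such as RDKit, and all three percentages are assumed to use the total number of sampled molecules as the denominator.

```python
def generation_metrics(samples, training_smiles):
    """Compute validity, macrocycle_ratio, and novel_unique_macrocycles (in %).

    `samples` is a list of dicts with hypothetical keys:
      'smiles'        canonical SMILES, or None if the string was invalid
      'is_macrocycle' True if the molecule contains a ring of >= 12 atoms
    Assumption: every metric is relative to len(samples)."""
    n = len(samples)
    valid = [s for s in samples if s["smiles"] is not None]
    macrocycles = [s for s in valid if s["is_macrocycle"]]
    # A set removes duplicates; subtracting the training set enforces novelty.
    novel_unique = {s["smiles"] for s in macrocycles} - set(training_smiles)
    return {
        "validity": 100.0 * len(valid) / n,
        "macrocycle_ratio": 100.0 * len(macrocycles) / n,
        "novel_unique_macrocycles": 100.0 * len(novel_unique) / n,
    }
```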

In addition, we evaluated the properties in MOSES, including internal diversity (IntDiv, IntDiv2), Fréchet ChemNet Distance (FCD), Similarity to the nearest neighbor (SNN), Fragment similarity (Frag), and Scaffold similarity (Scaff), together with molecular weight, octanol-water partition coefficient (log P), quantitative estimation of drug-likeness (QED), and synthetic accessibility score (SA).

CyclePred was evaluated using the root mean square error (RMSE) and R-squared (R²), metrics commonly used in machine learning and deep learning for regression models. We report four decimal places to maintain the precision and consistency of our results.
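Both regression metrics are short enough to state in code. The sketch below assumes R² denotes the coefficient of determination; if the reported value is instead a squared Pearson correlation, the second function would differ.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between experimental and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```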

Molecular docking

The crystal structure of JAK2 bound to Fedratinib (PDB code 6VNE) was selected as the reference structure for our docking simulations. Maestro and Rosetta were used to evaluate the potential of generated macrocycles against JAK2.

In Maestro docking, we prepared the protein using the Protein Preparation Wizard in Maestro v11.5. A grid-enclosing box was placed at the center of the crystal ligand, and the van der Waals radius scaling factor was set to 0.8 for atoms with partial atomic charges less than 0.15, to soften the non-polar parts of the receptor. The three-dimensional structures of the compounds were generated and minimized using the LigPrep v3.3 module. The molecules were docked to the binding site using the Glide standard precision (SP) method with default parameters, and only the top pose of each molecule was retained. Ligand binding energies for the docked poses were calculated using Prime MM-GBSA.

In Rosetta docking, we initially generated a set of low-energy conformations of the macrocycles using Maestro. The molfile_to_params.py script of Rosetta was employed to process the conformation file and subsequently incorporate the small molecule into the protein file. Finally, docking was carried out using the rosetta_scripts function.

Enzyme assay

Recombinant proteins of JAK1/2/3 and TYK2 were produced by the baculovirus expression system, and JAK kinase activity was measured using the Z′-LYTE™ Kinase Assay Platform⁴⁵. In short, the reaction mixture contained 5 μL enzyme and Z′-LYTE Tyr6/3 peptide substrate (4 μM) and 2.5 μL ATP (10 μM for JAK1, 25 μM for JAK2, 20 μM for JAK3, and 25 μM for TYK2), and was preincubated with various concentrations of tested compounds at room temperature for 1 h, after which the development reagent was added to each well. After incubating at room temperature for another hour, a stop reagent was added to stop the reaction. Fluorescence was measured under 400 nm excitation and 445/520 nm emission. The IC₅₀ value was calculated using a sigmoidal curve fit in GraphPad Prism 8.0.

Cell proliferation assay

HEL 92.1.7 and SET-2 cells were seeded in 96-well plates at 5000 cells/well in 70 μL of RPMI medium containing 10% FBS and incubated overnight at 37 °C with 5% CO₂. Test compounds (30 μL) were then added as a 3-fold serial dilution starting from 25 μM. After 72 h, 10 μL of Cell Counting Assay Kit-8 solution was added to each well and the plates were incubated for 2 h at 37 °C. Absorbance was measured at 450 nm on a microplate reader, with 630 nm as the reference wavelength. All experiments were performed in triplicate, and the data were plotted in GraphPad Prism 8.0 to determine the half-maximal inhibitory concentration (IC₅₀) values.
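The arithmetic behind the dilution series and the absorbance read-out can be sketched as follows. The number of dilution points and the use of medium-only (blank) and vehicle-treated control wells for normalization are assumptions made for illustration. The paper states only the starting concentration, the 3-fold step, and the two wavelengths.

```python
def dilution_series(top_conc_um=25.0, fold=3.0, n_points=9):
    """Concentrations (in uM) of a serial dilution; n_points is hypothetical."""
    return [top_conc_um / fold ** k for k in range(n_points)]


def percent_viability(a450, a630, blank, vehicle):
    """Viability (%) from a reference-corrected CCK-8 absorbance reading.

    blank and vehicle are the reference-corrected readings (A450 - A630) of
    medium-only and vehicle-treated wells (assumed controls).
    """
    corrected = a450 - a630
    return 100.0 * (corrected - blank) / (vehicle - blank)


if __name__ == "__main__":
    for conc in dilution_series():
        print(f"{conc:.4f} uM")
    # Hypothetical well: A450 = 1.0, A630 = 0.1, blank = 0.1, vehicle = 1.7
    print(percent_viability(1.0, 0.1, blank=0.1, vehicle=1.7))
```

Percent inhibition is then 100 minus the viability, and those values are what a dose-response fit takes as input.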

Western blot analysis

HEL 92.1.7 cells were seeded in 6-well plates (1 × 10⁶ cells/well) and placed in the incubator. After overnight growth, the cells were treated with or without compound 2 for 4 h, then collected and lysed, and the protein concentration was determined. For immunoblotting, the protein samples were separated by SDS-PAGE and transferred to a PVDF membrane. The membrane was blocked with 5% skim milk in TBST for 2 h at room temperature and incubated overnight at 4 °C with a primary antibody against p-JAK2, p-STAT5, p-STAT3, p-AKT, p-ERK, JAK2, STAT5, STAT3, ERK, AKT, or β-actin. The membrane was then washed with TBST, incubated with HRP-conjugated secondary antibody at room temperature for 2 h, and visualized by chemiluminescence using an enhanced ECL immunoblotting system (Tanon, China). All experiments were performed in triplicate and analyzed using ImageJ and GraphPad Prism 8.0.

In vivo efficacy study

Female BALB/c mice aged 8–10 weeks (17–20 g body weight) were injected subcutaneously with 10 units of recombinant human erythropoietin (rhEPO) daily for 4 consecutive days and orally administered either vehicle or compounds. Control mice were injected with corresponding volumes of saline buffer. On day 5, within 24 h of the last administration, the mice were euthanized. Blood was collected by orbital sampling into microtubes containing EDTA-K2 for routine blood tests on an IDEXX ProCyte Dx® analyzer. The spleens and bone marrow were isolated and placed in 1× PBS on ice for subsequent tests. For flow cytometric analysis, the spleens and bone marrow were homogenized and filtered through a 70 µm cell strainer to obtain single cells, red blood cells were lysed with RBC Lysis Buffer, and the cells were washed in 1× PBS. Fc receptors were then blocked with TruStain FcX™ (anti-mouse CD16/32) to reduce non-specific staining. Cells were stained with APC-conjugated Ter119 and PE-conjugated CD71 antibodies in the dark at room temperature for 30 min. After two washes, the cells were resuspended in Cell Staining Buffer and analyzed by flow cytometry. Data were analyzed using GraphPad Prism software. The results of the in vivo experiments are expressed as mean ± SEM, and group means were compared by one-way ANOVA followed by Dunnett’s test. P < 0.05 was considered statistically significant. All animal experiments were conducted in accordance with the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health and were approved by the Animal Ethics Committee of East China University of Science and Technology (Permit Number: ECUST‑2022‑035).
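The summary statistics named above can be illustrated with a short sketch that computes mean ± SEM per group and the one-way ANOVA F statistic using only the Python standard library. The group labels and values are made up for illustration and are not data from the study. The ANOVA p-value and the Dunnett post hoc comparison against the control group need a statistics package (for example `scipy.stats.dunnett` in recent SciPy releases) and are not reproduced here.

```python
import math
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for one group."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical read-outs for three groups (not data from the paper).
    groups = {
        "saline control": [45.1, 46.3, 44.8, 45.9],
        "rhEPO + vehicle": [58.2, 60.1, 57.5, 59.4],
        "rhEPO + compound": [49.0, 50.2, 48.1, 51.3],
    }
    for label, values in groups.items():
        m, sem = mean_sem(values)
        print(f"{label}: {m:.2f} +/- {sem:.2f}")
    print(one_way_anova_f(list(groups.values())))
```

The F statistic compares between-group with within-group variance. Dunnett's test then compares each treated group with a single control while accounting for the multiple comparisons.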

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All data used in this paper, including the molecule dataset and the JAK2 activity dataset, are from publicly accessible databases: ChEMBL (https://www.ebi.ac.uk/chembl/), DrugBank (https://go.drugbank.com/), PubChem (https://pubchem.ncbi.nlm.nih.gov/), and BindingDB (https://www.bindingdb.org/bind/index.jsp). The crystallographic structure used in this paper can be accessed at the PDB (https://www.rcsb.org). Supplementary Data 1 contains all the source data of this paper.

Code availability

The source code of CycleGPT is available at https://github.com/fengalbert/CycleGPT.

References

Garcia Jimenez, D., Poongavanam, V. & Kihlberg, J. Macrocycles in drug discovery─ learning from the past for the future. J. Med. Chem. 66, 5377–5396 (2023).
Beck, H., Härter, M., Haß, B., Schmeck, C. & Baerfacker, L. Small molecules and their impact in drug discovery: a perspective on the occasion of the 125th anniversary of the Bayer Chemical Research Laboratory. Drug Discov. Today 27, 1560–1574 (2022).
Edmondson, S. D., Yang, B. & Fallan, C. Proteolysis targeting chimeras (PROTACs) in ‘beyond rule-of-five’ chemical space: recent progress and future challenges. Bioorg. Med. Chem. Lett. 29, 1555–1564 (2019).
Passioura, T. The road ahead for the development of macrocyclic peptide ligands. Biochemistry 59, 139–145 (2019).
Giordanetto, F. & Kihlberg, J. Macrocyclic drugs and clinical candidates: what can medicinal chemists learn from their properties? J. Med. Chem. 57, 278–295 (2014).
Lipinski, C. A. Rule of five in 2015 and beyond: target and ligand structural limitations, ligand chemistry structure and drug discovery project decisions. Adv. Drug Del. Rev. 101, 34–41 (2016).
Whitty, A. et al. Quantifying the chameleonic properties of macrocycles and other high-molecular-weight drugs. Drug Discov. Today 21, 712–717 (2016).
Mallinson, J. & Collins, I. Macrocycles in new drug discovery. Future Med. Chem. 4, 1409–1438 (2012).
Doak, B. C., Zheng, J., Dobritzsch, D. & Kihlberg, J. How beyond rule of 5 drugs and clinical candidates bind to their targets. J. Med. Chem. 59, 2312–2327 (2016).
Over, B. et al. Structural and conformational determinants of macrocycle cell permeability. Nat. Chem. Biol. 12, 1065–1074 (2016).
Dougherty, P. G., Sahni, A. & Pei, D. Understanding cell penetration of cyclic peptides. Chem. Rev. 119, 10241–10287 (2019).
Zhang, C., Liu, F., Zhang, Y. & Song, C. Macrocycles and macrocyclization in anticancer drug discovery: important pieces of the puzzle. Eur. J. Med. Chem. 268, 116234 (2024).
Wagner, V. et al. Computational macrocyclization: from de novo macrocycle generation to binding affinity estimation. ChemMedChem 12, 1866–1872 (2017).
Sindhikara, D. et al. Automated design of macrocycles for therapeutic applications: from small molecules to peptides and proteins. J. Med. Chem. 63, 12100–12115 (2020).
Diao, Y. et al. Macrocyclization of linear molecules by deep learning to facilitate macrocyclic drug candidates discovery. Nat. Commun. 14, 4552 (2023).
Wu, T. et al. Discovery of selective and potent macrocyclic CDK9 inhibitors for the treatment of osimertinib-resistant non-small-cell lung cancer. J. Med. Chem. 66, 15340–15361 (2023).
Viarengo-Baker, L. A., Brown, L. E., Rzepiela, A. A. & Whitty, A. Defining and navigating macrocycle chemical space. Chem. Sci. 12, 4309–4328 (2021).
Radford, A., Narasimhan, K., Salimans, T. & Sutskever, I. Improving language understanding by generative pre-training. Preprint at OpenAI. https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf (2018)
Zdrazil, B. et al. The ChEMBL database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Res. 52, D1180–D1192 (2024).
Knox, C. et al. Drugbank 6.0: the drugbank knowledgebase for 2024. Nucleic Acids Res. 52, D1265–D1275 (2024).
Chen, X. et al. Symbolic discovery of optimization algorithms. Adv. Neural Inf. Process. Syst. 36, 49205–49233 (2024).
Polykovskiy, D. et al. Molecular sets (MOSES): a benchmarking platform for molecular generation models. Front. Pharmacol. 11, 565644 (2020).
Segler, M. H., Kogej, T., Tyrchan, C. & Waller, M. P. Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent. Sci. 4, 120–131 (2018).
Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4, 268–276 (2018).
Polykovskiy, D. et al. Entangled conditional adversarial autoencoder for de novo drug discovery. Mol. Pharm. 15, 4398–4405 (2018).
Guimaraes, G. L., Sanchez-Lengeling, B., Outeiral, C., Farias, P. L. C. & Aspuru-Guzik, A. Objective-reinforced generative adversarial networks (organ) for sequence generation models. Preprint at https://arxiv.org/abs/1705.10843 (2017).
Bagal, V., Aggarwal, R., Vinod, P. & Priyakumar, U. D. MolGPT: molecular generation using a transformer-decoder model. J. Chem. Inf. Model. 62, 2064–2076 (2021).
Dobberstein, N., Maass, A. & Hamaekers, J. Llamol: a dynamic multi-conditional generative transformer for de novo molecular design. J. Cheminform. 16, 73 (2024).
Ai, C. et al. MTMol-GPT: de novo multi-target molecular generation with transformer-based generative adversarial imitation learning. PLoS Comput. Biol. 20, e1012229 (2024).
Wang, Y., Zhao, H., Sciabola, S. & Wang, W. cMolGPT: a conditional generative pre-trained transformer for target-specific de novo molecular generation. Molecules 28, 4430 (2023).
O’Shea, J. J. & Plenge, R. JAK and STAT signaling molecules in immunoregulation and immune-mediated disease. Immunity 36, 542–550 (2012).
Taylor, P. C. Clinical efficacy of launched JAK inhibitors in rheumatoid arthritis. Rheumatology 58, i17–i26 (2019).
Hobbs, G. S., Rozelle, S. & Mullally, A. The development and use of Janus kinase 2 inhibitors for the treatment of myeloproliferative neoplasms. Hematol. Oncol. Clin. North Am. 31, 613–626 (2017).
Friesner, R. A. et al. Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J. Med. Chem. 47, 1739–1749 (2004).
Fleishman, S. J. et al. RosettaScripts: a scripting language interface to the Rosetta macromolecular modeling suite. PLoS ONE 6, e20161 (2011).
Davis, R. R. et al. Structural insights into JAK2 inhibition by ruxolitinib, fedratinib, and derivatives thereof. J. Med. Chem. 64, 2228–2241 (2021).
Pettersen, E. F. et al. UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Sci. 30, 70–82 (2021).
Grinfeld, J. et al. Classification and personalized prognosis in myeloproliferative neoplasms. N. Engl. J. Med. 379, 1416–1430 (2018).
Greenfield, G., McMullin, M. F. & Mills, K. Molecular pathogenesis of the myeloproliferative neoplasms. J. Hematol. Oncol. 14, 1–18 (2021).
Nadeem, M., He, T., Cho, K. & Glass, J. A systematic characterization of sampling algorithms for open‑ended language generation. In Proc. 1st Conference of the Asia‑Pacific Chapter of the Association for Computational Linguistics (AACL-IJCNLP 2020), 334–346 (AACL, 2020).
Diao, Y., Hu, F., Shen, Z. & Li, H. MacFrag: segmenting large-scale molecules to obtain diverse fragments with high qualities. Bioinformatics 39, btad012 (2023).
Jiang, Y. et al. Pharmacophoric-constrained heterogeneous graph transformer model for molecular property prediction. Commun. Chem. 6, 60 (2023).
Kim, S. et al. PubChem 2019 update: improved access to chemical data. Nucleic Acids Res. 47, D1102–D1109 (2019).
Gilson, M. K. et al. BindingDB in 2015: a public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res. 44, D1045–D1053 (2016).
Ge, H. et al. Efficacy of WWQ-131, a highly selective JAK2 inhibitor, in mouse models of myeloproliferative neoplasms. Biomed. Pharmacother. 156, 113884 (2022).

Acknowledgements

This work has been supported in part by the National Key Research and Development Program of China (2022YFC3400501), the National Natural Science Foundation of China (82425104, 82404517), the Science and Technology Commission of Shanghai Municipality (No. 24JS2830200), and Fundamental Research Funds for the Central Universities. H.L. was also sponsored by the National Program for Special Supports of Eminent Professionals and the National Program for Support of Top-notch Young Professionals.

Author information

These authors contributed equally: Feng Hu, Xiaotong Jia, Wenjie Liao.

Authors and Affiliations

Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
Feng Hu, Xiaotong Jia, Wenjie Liao, Ziqi Chen, Hongjie Bi, Huan Ge, Dandan Liu, Rongrong Zhang, Yuting Hu, Wenyi Mei, Zhenjiang Zhao, Lili Zhu & Honglin Li
Innovation Center for AI and Drug Discovery, School of Pharmacy, East China Normal University, Shanghai, China
Kai Zhang, Yanyan Diao & Honglin Li

Contributions

F.H. and X.J. wrote the manuscript. F.H. developed CycleGPT, HyperTemp sampling, and the CyclePred method. Y.H. helped F.H. test the model. Y.D. and K.Z. guided the computational experiments. W.L., H.B., D.L., R.Z., and W.M. synthesized the macrocyclic compounds. Z.Z. guided the synthesis of the macrocyclic compounds. X.J., Z.C., and H.G. evaluated the activities of the compounds against JAK2 at the enzymatic and cellular levels. X.J. and Z.C. carried out the in vivo experiments. L.Z. guided the in vitro and in vivo experiments. K.Z. and L.Z. helped to revise the manuscript. Y.D. and H.L. designed the whole project and revised the manuscript.

Corresponding authors

Correspondence to Kai Zhang, Lili Zhu, Yanyan Diao or Honglin Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Chemistry thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Hu, F., Jia, X., Liao, W. et al. Exploring the macrocyclic chemical space for heuristic drug design with deep learning models. Commun Chem 8, 299 (2025). https://doi.org/10.1038/s42004-025-01686-w

Received: 14 July 2025
Accepted: 01 September 2025
Published: 07 October 2025
DOI: https://doi.org/10.1038/s42004-025-01686-w