Development of an AI model for DILI-level prediction using liver organoid brightfield images

Tan, Shiyi; Ding, Yan; Wang, Wei; Rao, Jianhua; Cheng, Feng; Zhang, Qiuyin; Xu, Tingting; Hu, Tianmu; Hu, Qinyi; Ye, Ziliang; Yan, Xiaopeng; Wang, Xiaowei; Li, Mingyue; Xie, Peng; Chen, Zaozao; Liang, Geyu; Pu, Yuepu; Zhang, Juan; Gu, Zhongze

doi:10.1038/s42003-025-08205-6

Article
Open access
Published: 07 June 2025

Communications Biology volume 8, Article number: 886 (2025)

Abstract

AI image processing techniques hold promise for clinical applications by enabling analysis of complex status information from cells. Importantly, real-time brightfield imaging has advantages over fluorescence imaging in informativeness, non-destructiveness, and low cost. Human liver organoids (HLOs) currently offer an alternative to animal models because they closely recapitulate liver physiology, including basic functions and drug metabolism. Because drug-induced liver injury (DILI) is a major cause of drug withdrawals, here we present DILITracer, a DILI-level prediction model based on HLO brightfield images. Specifically, we utilize the BEiT-V2 model, pretrained on 700,000 cell images, to enhance 3D feature extraction. A total of 30 compounds from FDA DILIrank (classified as Most-, Less-, and No-DILI) are selected to treat HLOs, and the corresponding brightfield images are collected at different time points and z-axis positions. Our computer vision model, based on an image-spatial-temporal coding layer, fully exploits the spatiotemporal information in continuously captured images, links HLO morphology with DILI severity, and finally outputs the DILI level of each compound. DILITracer achieves an overall accuracy of 82.34%. To our knowledge, this is the first model to output a ternary classification of hepatotoxicity. Overall, DILITracer, which uses clinical data as the endpoint categorization label, offers a rapid and effective approach for screening hepatotoxic compounds.

Introduction

Computer vision (CV) models have shown significant potential in clinical applications by enabling detailed analysis of complex visual information from cell images¹. Recently, the vision transformer (ViT) has made breakthrough progress in the field of CV, signaling a transition from convolutional neural network backbones to Transformer backbones^2,3. ViT uses a self-attention mechanism to capture long-distance relationships in an image, enabling it to model global dependencies in the data⁴. This ability to capture the overall structure of biomedical images is why we selected ViT for this work. CV models using two-dimensional (2D) biomedical images have delivered impressive predictive capabilities in tasks such as detecting cell death⁵, segmenting cell nuclei⁶, and localizing subcellular proteins⁷, achievements that are difficult with manual analysis. However, the emergence of physiologically relevant three-dimensional (3D) models such as spheroids^8,9 and organoids¹⁰ underscores the urgent need for advanced 3D imaging techniques and novel cell morphology analysis algorithms in CV. Although drug screening based on phenotypes or statuses from cell images has gradually been adopted, identifying drug-induced liver injury (DILI) has received less attention. This may be due to the substantial metabolic differences between humans and animals¹¹, which make it difficult to reflect the actual effects of compounds. For example, traditional preclinical safety trials of 150 drugs reported predictive accuracies of only 63% and 43% in non-rodent and rodent animals, respectively, with the lowest accuracy observed in the hepatobiliary system¹². Preclinical identification of DILI-risk compounds remains a challenge in drug discovery^13,14, emphasizing the need for alternative in vitro strategies to assess hepatotoxic compounds and generate reliable data for CV model development. The aim of this work is to develop an expeditious CV-based tool for preclinical and even clinical drug safety assessment.

Organoid culture technology provided new experimentally tractable, physiologically relevant models of human pathologies and subsequent drug screening¹⁵. Human liver organoids (HLOs) offer distinct advantages over HepG2 spheroids, as they comprise both hepatic parenchymal and non-parenchymal cells, reflecting accurate intercellular interactions. As 3D multicellular clusters, HLOs carry a cytochrome P450 system involved in drug metabolism, and preserve the phenotype and function of hepatocytes longer than primary human hepatocytes (PHHs)^16,17. Recent studies highlight the potential of artificial intelligence (AI)-driven image processing to explore the strong correlation between organoid morphology and compound toxicity or disease status^18,19. Therefore, the organoid model is not only a viable alternative to the animal model but also a promising tool primed for assessing DILI risks through morphological analysis. In image analysis, dyes are frequently used to highlight cell features, and CV techniques are then utilized to identify any changes²⁰. Notably, brightfield imaging surpasses fluorescence imaging in several aspects: real-time capabilities, non-destructive nature, and the absence of additional sample processing requirements. Furthermore, brightfield imaging excels in information retrieval due to its high capacity, richness, and depth, all while being cost- and time-effective. To capture the 3D features of the organoid model, we applied 3D video processing principles and developed a CV model based on image-spatial-temporal coding layers to extract spatiotemporal information from high-content screening (HCS). Herein, we developed an evaluation system, named DILITracer, capable of predicting the clinical DILI of compounds based on the HLO technology platform and an AI-assisted algorithm for data analysis. The model achieved an impressive overall accuracy of 82.34%, with particularly high accuracy (90.16%) in identifying non-DILI compounds.

To our knowledge, DILITracer is the first model able to categorize hepatotoxicity levels (no, less, or most DILI) rather than merely indicating whether hepatotoxicity is present. It is simple, non-destructive, and low-cost while extracting rich information, making it well suited to high-throughput DILI risk evaluation. Our work also advances compliance with the principles of the 3Rs (Replacement, Reduction, and Refinement). In summary, our AI model uses clinical data as the endpoint categorization label, providing a rapid and simple approach to screening compounds with potential clinical liver injury effects.

Results

The strategy for the DILI-level evaluation system based on the morphology of HLO under brightfield

As shown in Fig. 1, our approach consists of two stages, system construction and system application. (1) System construction: we ensured that the HLOs were in a “drug-ready” state. We selected 30 structurally and functionally representative compounds with known DILI levels from the DILIrank database²¹, including four pairs of toxic drugs and their non-toxic structural analogs (troglitazone & pioglitazone, tolcapone & entacapone, nefazodone & buspirone, trovafloxacin & levofloxacin), as well as 22 drugs covering known DILI mechanisms^22,23 (such as mitochondrial injury, reactive metabolites, biliary transport inhibition, and immune responses). Notably, the Food and Drug Administration (FDA) DILIrank database categorizes compounds into different degrees of hepatotoxicity based on clinical data, which keeps our model closely aligned with clinical reality during prediction. Throughout testing, we continuously collected brightfield images of the dosed HLOs using an HCS imager. We then attached the DILI level (No, Less, or Most) of each tested compound as a label to the series of brightfield images of the corresponding HLOs, generating paired image-sequence and DILI-level data. Finally, we trained an AI model on these pairs to learn the relationship between image sequences and DILI severity. (2) System application: HLOs, also in the “drug-ready” state, are exposed to compounds of unknown DILI severity, and HLO brightfield images are continuously collected during testing. The corresponding sequence of brightfield images is input into the AI model to obtain the predicted DILI level. The model exhibited an overall accuracy of 82.34% and performed particularly well in the vNo-DILI-Concern category, where it achieved an accuracy of 90.16% (Table 1). This highlights the model’s ability to identify compounds with no DILI risk and to distinguish non-hepatotoxic compounds reliably. More detailed comparisons are discussed in the subsequent sections.

**Fig. 1: The workflow of the development of the DILI-level prediction model.**

Table 1 Predictive performance metrics comparison between two in vitro 3D platforms

Stable establishment of the DILI toxicity testing platform using two distinct 3D liver models

To explore the suitability of liver models with varying levels of complexity for DILI toxicity testing platforms, we established a single-cell-type 3D model (HepG2 spheroid) and a multi-cell-type 3D model (liver organoid). Specifically, HepG2 spheroids and HLOs were each exposed to 30 compounds with or without hepatotoxicity. Albumin (ALB) levels in the supernatant and cellular activity (ATP) of the spheres were assessed at the end of Day 3 to validate the reliability of the system (Figs. 2b, d, f and 3b, d, f). For detailed results on the changes in ALB and ATP levels of the two in vitro 3D models under treatment with the 30 compounds, please refer to Supplementary Figs. 1–4. Brightfield images at different time points and z-axis positions were collected daily to generate image data for morphological analysis (Supplementary Figs. 5 and 6). Taking the HLO-based DILI toxicity platform as an example, clear differences could be observed among compounds classified at different levels of hepatotoxic potency. When treated with chlorpheniramine, labeled “No-DILI” by DILIrank, HLOs continued to increase in diameter and developed into typical translucent hollow spheres with clear boundaries. In contrast, HLOs treated with gefitinib, labeled “Most-DILI”, underwent cell death, failed to maintain their original spherical structure, and disintegrated by the end of Day 3. The state of HLOs treated with simvastatin (labeled “Less-DILI”) lay between No- and Most-DILI: the HLOs showed growth inhibition, but their morphology remained that of a complete sphere. Overall, these observations provided a robust biological basis for the subsequent development of DILI risk prediction models (Fig. 2a, c, e).

**Fig. 2: DILI toxicity testing platform based on human liver organoid models.**

**Fig. 3: DILI toxicity testing platform based on HepG2 spheroid models.**

Comparison of DILI-level model using two distinct 3D liver models

On this basis, we comprehensively compared the performance of the image-only model across the two experimental platforms (Table 1). As shown in Fig. 4a–d, the DILI classifier exhibited commendable predictive performance on the HLO dataset, with an accuracy of 82.34%, exceeding that obtained with HepG2 spheroids (an average accuracy of 77.41%). Notably, the HLO-based model correctly identified 90.16% of the actual vNo-DILI-Concern cases. Furthermore, for the vNo- and vLess-DILI-Concern cases, the recall of the model on the HLO dataset exceeded that on the HepG2 spheroid dataset, indicating higher predictive power when using the HLO platform. The HLO-based model also showed robust specificity for vLess- and vMost-DILI-Concern, that is, it was better than the HepG2 spheroid model at correctly identifying samples that did not belong to these classes. The superior precision of the organoid model further supports the notion that it outperforms the HepG2 spheroid model.

**Fig. 4: Evaluation of the predictive performance of human liver organoids and HepG2 spheroids image-only models.**

However, some limitations remain in the model trained on the HLO dataset. Compared with the HepG2 spheroid model, it had lower recall for vMost-DILI-Concern, that is, it identified a smaller share of the actual vMost-DILI-Concern samples. Another caveat was its lower specificity for vNo-DILI-Concern: it was less likely than the HepG2 spheroid model to correctly label samples as non-vNo-DILI-Concern (85.57% vs. 94.16%). As an indicator of the balance between precision and recall, the F1 score was slightly lower for HLOs in the vMost-DILI-Concern classification. This reflected the lower recall mentioned above and indicated a minor imbalance in the model that favored minimizing false positives at the expense of potentially missing true positives. The AUC values for vNo-, vLess-, and vMost-DILI-Concern in the prediction models using HLOs and HepG2 spheroids are shown in Fig. 4e, f, indicating reliable performance in classifying the different labels. Overall, judging by the five model evaluation criteria, the DILI-classification model performed relatively better with HLOs than with HepG2 spheroids. The findings also revealed a commendable ability of the HLO-based model to distinguish vNo-DILI-Concern instances from the others.
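The per-class recall, specificity, precision, and F1 score discussed above can all be derived from a three-class confusion matrix by treating each label in turn as the positive class. The following is a minimal sketch under that assumption; the function name and the toy counts are hypothetical and are not taken from Table 1.

```python
LABELS = ["vNo", "vLess", "vMost"]


def one_vs_rest_metrics(confusion, labels=LABELS):
    """Return overall accuracy and per-class metrics.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, name in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                  # true class k, predicted otherwise
        fp = sum(row[k] for row in confusion) - tp   # predicted k, true class differs
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[name] = {
            "recall": recall,
            "specificity": specificity,
            "precision": precision,
            "f1": f1,
        }
    accuracy = sum(confusion[k][k] for k in range(len(labels))) / total
    return accuracy, metrics


# Made-up counts for illustration only (rows: true class, columns: predicted).
toy_confusion = [
    [9, 1, 0],
    [1, 7, 2],
    [0, 2, 8],
]
accuracy, per_class = one_vs_rest_metrics(toy_confusion)
```

In this framing, the "likelihood of correctly labeling samples as non-vNo-DILI-Concern" is the specificity of the vNo class, and a low vMost recall paired with high vMost precision corresponds to the false-positive-averse imbalance described in the text.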

Superiority analysis of the DILI prediction model using HLOs from in vitro and in silico perspectives

Next, we sought to demonstrate the superiority of our AI model from the perspective of the in vitro biological model. Immunofluorescence (Fig. 5a) confirmed a liver-specific “bile duct-like structure”, as indicated by the bile salt export pump (BSEP) marker. Staining for the tight junction protein zonula occludens-1 (ZO-1) also suggested a multi-cellular-type 3D hollow body, including hepatocytes rich in hepatocyte nuclear factor 4-alpha (HNF4a), CD31-expressing liver sinusoidal endothelial cells, CD68⁺ Kupffer cells, and desmin (DES)-containing hepatic stellate cells. Importantly, the HLO model showed significantly higher expression of the metabolic enzymes CYP3A4 (1.80-fold), CYP1A2 (88.96-fold), CYP2D6 (4.95-fold), CYP2E1 (10.79-fold), CYP2C9 (8.16-fold), and CYP2C19 (8.62-fold) than HepG2 spheroids (Fig. 5b). Therefore, we reasoned that liver organoids, as a more physiologically relevant in vitro liver model, would generate more realistic toxicological responses and thus provide more reliable image data for the development of DILI models.

**Fig. 5: Superiority analysis and ability validation of human liver organoids image-only model.**

We further conducted ablation experiments to investigate the impact of temporal and spatial modalities on the DILI prediction model (Table 2). First, we used Day 0–Day 1 image data instead of Day 0–Day 3 images to evaluate the role of temporal dimension information. The results showed a 12.09% decrease in prediction accuracy compared to the original model, with lower recall, specificity, precision, and F1 scores for each label (Fig. 5c). Second, we replaced multiple separate images of the same sample taken at different heights with 3D composite images generated by a fusion algorithm (provided by the HCS instrument), which combines images from different focal planes. After removing the spatial coding layer from the model, the prediction accuracy dropped by 6.24% compared to the original model. Recall, precision, and F1 scores for each label did not surpass the original model’s metrics (Fig. 5d). Overall, both temporal and spatial modalities exerted a substantial positive influence on the development of DILI prediction models, thus not only validating the soundness and effectiveness of the modeling approach but also emphasizing the pivotal role of temporal and spatial dimensions in replicating intricate biological mechanisms.
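The two ablations can be pictured as operations on the input of a single sample, viewed as a stack of images indexed by day and z-plane. The sketch below is an illustration only: the nested-list layout, the function names, and the pixel-wise maximum used as a stand-in for the instrument's proprietary fusion algorithm are all assumptions, not the authors' implementation.

```python
def make_toy_sample(days=4, planes=3, height=2, width=2):
    """Build a toy sample where sample[day][z] is a 2D image (list of rows).

    Pixel values are synthetic and only encode their own position.
    """
    return [
        [
            [[day * 100 + z * 10 + r * width + c for c in range(width)]
             for r in range(height)]
            for z in range(planes)
        ]
        for day in range(days)
    ]


def temporal_ablation(sample, keep_days=2):
    """Keep only the first `keep_days` time points (Day 0 and Day 1 by default)."""
    return sample[:keep_days]


def spatial_ablation(sample):
    """Collapse all z-planes of each day into one composite image.

    A pixel-wise maximum is used here as a placeholder for the
    instrument-provided fusion of focal planes.
    """
    fused = []
    for planes in sample:
        height, width = len(planes[0]), len(planes[0][0])
        composite = [
            [max(plane[r][c] for plane in planes) for c in range(width)]
            for r in range(height)
        ]
        fused.append([composite])  # a single "plane" per day remains
    return fused


sample = make_toy_sample()
short_sample = temporal_ablation(sample)  # 2 days x 3 planes
flat_sample = spatial_ablation(sample)    # 4 days x 1 composite image
```

Read this way, the temporal ablation removes later time points while keeping the z-stack, and the spatial ablation keeps all four days but leaves the model only one composite image per day, so the spatial coding layer has nothing to encode.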

Table 2 Predictive performance metrics in ablation experiments

Attention mechanism visualization for DILI prediction model based on HLOs

To further verify that the model effectively learned the morphological features of HLOs before and after drug exposure, we visualized the model’s output. Typical samples from the Most- and No-DILI categories were input into the STViT model, and attention heat maps were generated from the attention weights of the corresponding samples to determine the importance of each part of the HLO images in the model decision. Figure 5e, f show two sets of images (overlays of the heat map and the brightfield image) at different z-axis positions after drug exposure from Day 0 to Day 3. We found that, at a given z-axis position, high attention weights were predominantly concentrated on the HLOs that were in focus within the current visual field. Meanwhile, attention weights grew markedly over time in response to substantial changes in organoid activity (e.g., disintegration or growth). For example, in the gefitinib-treated positive samples, the HLOs in the area marked by the small red box (Day 2) started to undergo pronounced cell death and structural disintegration, whereas in the chlorpheniramine-treated negative samples, the HLOs in the marked area (Day 2 or Day 3) continued to grow. Correspondingly, both of these areas were assigned higher attention weights by the model. These results indicate that the model can accurately locate HLOs and capture morphological changes in organoid status.

Discussion

In this study, we developed a DILI prediction model based on organoids, which we named “DILITracer” to highlight its ability to “trace” the DILI level (Most-, Less-, or No-DILI). Our model achieved an overall accuracy of 82.34%, demonstrating improved predictive performance compared to HepG2 spheroids and animal models. Almost all of the indicators (recall, specificity, precision, and F1 score) for each classification label were better in the prediction model using HLO imaging than in the one using HepG2 spheroids. Our organoid experimental platform effectively mimics cell-cell interactions and exhibits higher levels of functional cytochrome P450 enzymes, suggesting that organoids serve as a more physiologically relevant in vitro 3D liver model than HepG2 spheroids. Furthermore, generating a comprehensive series of image data that captures detailed morphological features of organoids could provide a convenient and effective way to reflect more realistic toxicological responses, thereby facilitating the establishment of robust AI models. Importantly, our model has the potential to identify certain “clinically specific toxic drugs” that induce liver toxicity clinically despite having passed standard preclinical toxicology evaluations in animal models prior to first-in-human administration. Specifically, our model successfully identified simvastatin and stavudine as “non-No-DILI” cases, which had been poorly predicted by hepatic spheroids in a previous study²⁴. This may be partly attributed to the clinical relevance of the labels used in our model, as we employed clinical data-based drug classifications from the FDA DILIrank database for model training. This approach keeps our model closely aligned with clinical reality, highlighting its practical value in clinical drug development and safety assessment. Overall, we believe it qualifies as a suitable tool for predicting DILI during preclinical drug development.

A previous study developed a DILI prediction strategy based on features of fluorescence images of PHHs analyzed by a random forest algorithm²⁵. However, fluorescent dyes or probes are invasive and were therefore limited to detection at toxicity endpoints. We instead collected non-destructive brightfield images at successive time points (once a day for a total of 4 days) to achieve “dynamic monitoring” of DILI toxicity. Describing the structure of organoids with high phenotypic complexity using traditional morphological features such as radius, area, and perimeter is challenging. Deep learning, however, offers a viable solution by effectively capturing the intricate patterns and features of organoids^19,26,27. To date, neither 2D nor 3D imaging technologies have been combined with deep learning-based CV techniques to construct a DILI prediction model. In this study, we drew on the technical principles of video processing to fully exploit the spatial and temporal features of brightfield images. Both features are important for generating a DILI image-only model with favorable predictive performance, as evidenced by the ablation experiments: the accuracy of our established model (82.34%) is considerably higher than that of the model without spatial features (76.10%) and the model without temporal features (70.25%). We also categorized input labels into three DILIrank groups (vMost-, vLess-, and vNo-DILI concern) based on confirmed clinical causal evidence linking a drug to liver injury, providing a more nuanced assessment of hepatotoxicity. To the best of our knowledge, this is the first model to output a ternary classification of hepatotoxicity rather than simply indicating whether or not hepatotoxicity is present.

Integrating our HLO-based DILI prediction model into preclinical testing workflows has the potential to substantially improve drug safety assessment. By providing an early-stage, in vitro platform for hepatotoxicity evaluation, our model might significantly reduce reliance on animal models, which often struggle to predict DILI, particularly idiosyncratic DILI²⁸. Compared to previous DILI prediction models that rely on chemical structure^29,30,31 or gene expression^32,33 as data modalities, our approach offers a more convenient data acquisition process. Moreover, the early identification of clinically relevant toxic drugs during preclinical testing enables the detection of compounds with a high risk of liver toxicity before they advance to human trials, ultimately reducing costly late-stage failures. In this study, our model’s high accuracy in identifying vNo-DILI cases (90.16%) supports the prioritization of safe drugs for clinical trials, which could reduce DILI risk and improve the likelihood of success in clinical trials and new drug projects. This approach may help lower drug development costs, provide further insights into liver toxicity risks, and offer a more reliable reference for clinical decision-making. Interestingly, the attention mechanism employed in this study revealed that our model is capable of identifying critical time points for distinguishing drug effects on organoids. This might provide valuable insights into the time window of clinical toxicity and serve as a reference for optimizing clinical monitoring and intervention strategies, an area that warrants further investigation in future studies. By integrating dynamic brightfield imaging, machine learning, and clinical data from the FDA DILIrank database, our model offers an opportunity to enhance the predictive reliability of early-stage toxicity screening. Additionally, its non-invasive nature and real-time monitoring capabilities allow it to be incorporated into existing drug safety pipelines, facilitating more efficient drug development and the early elimination of hepatotoxic compounds. Overall, we hope to accelerate the transition of the DILI prediction model using organoids “from the bench to the bedside.”

Some limitations remain. The imbalance among the vMost-, vLess-, and vNo-DILI-Concern sample sizes may contribute to the substantial discrepancy in accuracy across categories. In addition, HLOs remain simplified and lack immune components in comparison to organ-on-a-chip systems¹⁰, potentially leading to false-negative results for certain compounds. In the future, we will focus on further modifications of organoid or organoid-on-a-chip platforms to predict DILI arising from different mechanisms, particularly immune-mediated DILI.

In this study, we successfully developed DILITracer, a DILI prediction model that analyzes spatiotemporal features from continuously captured brightfield images of liver organoids under various DILI conditions. The model correlates organoid morphology with DILI severity, providing risk assessments for compounds categorized as most-, less-, or no-DILI. DILITracer demonstrates impressive accuracy in predicting DILI levels and incorporates clinical data as outcome variables, ensuring strong clinical relevance. This AI-driven system offers a rapid and reliable tool for predicting hepatotoxicity in early-stage drug development and provides valuable insights for clinical drug screening.

Methods

Culture of HLOs and HepG2 spheroids

HLOs derived from the adjacent normal tissue of a liver cancer patient were donated by Avatarget Co., Ltd (Suzhou, China), under informed consent and ethical approval (the Ethics Committee of the First Affiliated Hospital of Nanjing Medical University, 2021-SR-575). HLOs were embedded in Matrigel (Corning, 356231) and cultured in the corresponding medium (Avatarget, KLV0010101). All experiments were performed using HLOs derived from a single donor tissue. For each assay, three technical replicates were analyzed, corresponding to three independently cultured wells seeded with organoids. For passaging, the following embedding method was used: HLOs were enzymatically or mechanically fragmented (with pre-chilled phosphate-buffered saline (PBS)) and reseeded in new Matrigel. Specifically, the pre-chilled Matrigel/cell mixture was seeded in 24-well plates at 30 μL/well to form dome-shaped structures. The plates were incubated in the cell incubator (37 °C, 5% CO₂) for 15 min to allow Matrigel polymerization. After solidification, 500 μL of the specific medium was added and later renewed at specific intervals.

The HepG2 cells were donated by Avatarget Co., Ltd (Suzhou, China). The cells were cultured in DMEM (Gibco, 10567014) containing 10% fetal bovine serum (Gibco, A3161001C) and 1× penicillin/streptomycin (Gibco, 15140148). HepG2 cells (70–80% confluency) were washed twice in sterile PBS and then dissociated with 1 mL of 0.25% EDTA-Trypsin. The trypsin was neutralized with 3× volume of complete medium, and the cells were counted and adjusted to 10,000 cells/well. The required volume of cell suspension was inoculated at 200 μL/well into an ultra-low attachment microplate (Corning, 7007). The plate was centrifuged at 1500 rpm for 10 min and then incubated at 37 °C in 5% CO₂ for 4 days. The medium was changed every other day at a 1:1 ratio.

Compounds screening

Based on the DILI classification by DILIrank²¹ and the liver microphysiological systems development guidelines²², we selected 30 compounds for testing. All selected compounds had undergone extensive Liver-Chip-based DILI testing by Emulate Co., Ltd³⁴. The specific list of DILI-related drugs is shown in Table 3.

Table 3 List of drugs tested in HepG2 spheroids or liver organoids

DILI toxicity test

The peak serum concentration (Cmax) reflects the actual level of a drug in the body’s circulatory system. Therefore, we used 1× Cmax as a reference for the drug dosage to better approximate in vivo conditions. On this basis, we selected 10× Cmax as the reference concentration, taking into account the coefficient of variation in toxicology (individual variability: tenfold). Also, based on existing literature^25,34, we included 100× Cmax to account for the possibility of higher drug concentrations in extreme or therapeutic scenarios, thereby ensuring a more comprehensive assessment of drugs that may exhibit toxicity under elevated exposure conditions. In summary, for HepG2 spheroids, concentration gradients of 0, 1, 10, and 100× Cmax were selected to cover a range of drug exposure scenarios. Each drug was tested in triplicate. Plates were dosed with the drug for 72 h in each cycle (referred to as Day 0 through Day 3). The image dataset from the HepG2 spheroid-based DILI test was further used to train multimodal ViLT and iRENE models, which showed that the concentration of 10× Cmax yielded the highest average test-set accuracy (Supplementary Tables 1 and 2). Consequently, for the subsequent DILI assay on HLOs, only the concentration of 10× Cmax was used. At the end of Day 3, the supernatant from each well was collected to determine ALB secretion, and the spheroids from each well were used for the cell viability assay. The Cmax information of the drugs is shown in Supplementary Table 3.
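As a small worked illustration of the dosing scheme, the concentration series for one compound follows directly from its Cmax and the multipliers 0, 1, 10, and 100. The sketch below assumes a hypothetical Cmax value and unit; the actual values are listed in Supplementary Table 3 and are not reproduced here.

```python
CMAX_MULTIPLIERS = (0, 1, 10, 100)  # gradient used for HepG2 spheroids


def dosing_concentrations(cmax, multipliers=CMAX_MULTIPLIERS):
    """Return the tested concentrations, in the same unit as `cmax`."""
    if cmax <= 0:
        raise ValueError("Cmax must be positive")
    return {f"{m}x Cmax": m * cmax for m in multipliers}


# Hypothetical compound with Cmax = 2.5 (for example, in µM).
doses = dosing_concentrations(2.5)

# For the HLO assay only the 10x Cmax condition was used.
hlo_dose = doses["10x Cmax"]
```

The vehicle control corresponds to the 0× entry, and the same function can be reused for every compound once its Cmax is known.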

Albumin, urea, and alanine transaminase assay

The culture supernatants were collected and stored at −80 °C until use. The supernatant was assayed with a Human Albumin ELISA Kit (PROTEINTECH, KE00076). Assays were performed according to the manufacturer’s instructions.

Cellular viability determination

CellTiter-Glo (CTG, G9681, PROMEGA) reagent and culture medium were added to each well in a 1:1 ratio. The contents were mixed for 2 min to induce cell lysis. The plate was incubated at room temperature for 10 min to stabilize the luminescent signal. Recording of luminescence was carried out with BioTek Microplate Reader Synergy H1 (Vermont, USA).

RNA isolation, reverse transcription (RT), and RT–quantitative polymerase chain reaction (qPCR)

Total RNA was extracted from the HepG2 spheroids or HLOs using RNA-easy Isolation Reagent (VAZYME, R701-01) according to the manufacturer’s protocol. Reverse transcription was carried out using the HiScript III Reverse Transcriptase (VAZYME, R302-01) according to the manufacturer’s protocol. qPCR was conducted using the SYBR Green Master Mix Kit (VAZYME, Q331-02) on a QuantStudio™ 5 Real-Time PCR Instrument (THERMOFISHER). The data were normalized to GAPDH as the endogenous control. Relative expression was calculated using the 2^−ΔΔCt method. The specific primer sequences are shown in Supplementary Table 4.
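For reference, the 2^−ΔΔCt calculation first normalizes the target gene Ct to GAPDH within each group (ΔCt), then subtracts the ΔCt of the comparison group (ΔΔCt), and finally converts the result to a fold change. The sketch below uses made-up Ct values, and treating HepG2 spheroids as the comparison group is an assumption based on how the fold changes are reported in the Results.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    `ref` is the endogenous control gene (GAPDH in this study);
    `control` is the comparison group.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Made-up Ct values: the target amplifies 3 cycles earlier in the sample
# than in the control after GAPDH normalization.
fold_change = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,
    ct_target_control=27.0, ct_ref_control=18.0,
)
```

A ΔΔCt of −3 therefore corresponds to an 8-fold higher expression in the sample, assuming roughly 100% amplification efficiency for both primer pairs.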

Immunofluorescence

The HLOs were collected from Matrigel and fixed with 4% paraformaldehyde for 30 min, permeabilized with PBS containing 0.2% Triton X-100 (BEYOTIME, P0096) and blocked with PBS containing 5% bovine serum albumin (BIOSHARE, BS114) for 30 min at room temperature. HLOs were then incubated with primary antibodies overnight at 4 °C, followed by secondary antibodies for 1 h. Finally, the antifade mounting medium with DAPI (BEYOTIME, P0131) was used for nuclear staining. The dilutions for primary and secondary antibodies were: anti-ZO-1 (1:500, SERVICEBIO, GB111402-100), anti-ABCB11/BSEP (1:300, SERVICEBIO, GB113909), anti-HNF4a (1:100, SERVICEBIO, GB115549-100), anti-Desmin (1:500, SERVICEBIO, GB12088-100), anti-CD31 (1:300, SERVICEBIO, GB12064-100), anti-CD68 (1:500, SERVICEBIO, GB113150-100). Images were taken with an Olympus IX83 microscope (Tokyo, Japan).

Statistics and reproducibility

All statistical analyses were performed using GraphPad Prism 9 software. Unpaired Student’s t tests (two-tailed) were used for comparisons between two groups. Data are presented as mean ± SD, and P < 0.05 was considered statistically significant. For each condition, experiments were conducted on biologically independent organoid samples, defined as organoids derived from the same donor but established and cultured independently across different batches and wells.

Data collecting, labeling, and pre-processing

In total, 478 and 158 samples were obtained from the HepG2 spheroid- and HLO-based DILI assays, respectively. Each sample contains 4 days (Day 0 through Day 3) of 3D images, with 12–20 images taken equidistantly along the spatial (z-axis) dimension using the high-content imaging instrument (Avatarget). HLOs were imaged at ×10 magnification, while HepG2 spheroids were imaged at ×4 magnification. The training and testing datasets were split in an 80:20 ratio.

DILIrank is the largest reference drug list ranked by the risk of developing DILI in humans. The DILIrank dataset includes DILI risk classification, which is determined based on FDA-approved drug label information and literature-reported causality assessments. It consists of four categories:

(1) vMost-DILI concern: drugs withdrawn due to DILI, or drugs with black box warnings or labels containing warnings, precautions, or descriptions of severe or moderate liver injury, as validated through causality assessment;
(2) vLess-DILI concern: drugs assessed as low risk based on drug labels, confirmed through causality validation;
(3) Ambiguous DILI concern: drugs evaluated as high or low risk based on drug labels but lacking sufficient evidence of causality;
(4) vNo-DILI concern: drugs with no literature reports confirming their role in causing DILI.

Drug labeling serves as a critical basis for DILI risk classification and is derived from a systematic evaluation of preclinical toxicology data, clinical trials, post-marketing surveillance, and literature data. It provides essential drug safety information, including DILI risk, and is regarded as the “most reliable data source”²¹. Herein, we categorized the data into three groups based on the DILIrank classification, i.e., vMost-, vLess-, and vNo-DILI concern. Notably, from the perspective of practical application, we excluded the “Ambiguous DILI concern” category to reduce the uncertainty caused by the lack of clear causal evidence.

The original images are resized to a uniform size of 224 × 224 pixels to align with the network model and optimize dataset utilization. Subsequently, we apply pixel normalization and standardization, resulting in input parameters $x\in {R}^{H\times W\times C}$ $(H=224,{W}=224,{C}=3)$.
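
The pre-processing step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the nearest-neighbour resize and the per-channel mean and standard deviation of 0.5 are assumptions, since the text does not state the interpolation method or the normalization statistics.

```python
import numpy as np


def preprocess(image, size=224, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)):
    """Resize an H x W x 3 uint8 image to size x size and standardize it."""
    h, w, _ = image.shape
    # Nearest-neighbour resize via index sampling (assumed stand-in for a library resize)
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = image[rows][:, cols]
    # Pixel normalization: map 0..255 to 0..1
    x = resized.astype(np.float32) / 255.0
    # Standardization: per-channel (x - mean) / std (mean/std values are assumed)
    return (x - np.asarray(mean, dtype=np.float32)) / np.asarray(std, dtype=np.float32)


# Synthetic image standing in for a brightfield frame
raw = np.random.default_rng(0).integers(0, 256, size=(512, 640, 3), dtype=np.uint8)
x = preprocess(raw)
print(x.shape)  # (224, 224, 3)
```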

Model architecture, training, and validation

The CV classification model employed in this study consisted of four layers: (1) Image Encoder Layer, (2) Spatial Encoder Layer, (3) Temporal Encoder Layer, and (4) the Classification Layer. The architecture is inspired by the Video Vision Transformer (ViViT) model², which captures temporal dynamics in video data by treating it as sequences of images over time. Furthermore, we augment the image encoder layer with a spatial encoder layer to capture spatial relationships among images positioned at different levels along the z-axis in three-dimensional space. This multi-layered approach strikes an optimal balance between model complexity and performance.

(1)
Image Encoder Layer: the BEiT-V2³⁵ model is used to comprehensively capture image features. We initialize this layer with weights pretrained on a dataset of 700,000 cell images, leveraging the model’s ability to extract deep features from cell images. The pre-training of the BEiT-V2 model occurs in two stages: the first stage involves Vector-quantized Knowledge Distillation (VQ-KD)³⁵, and the second stage trains the BEiT-V2 model itself, using the visual tokens generated by VQ-KD in the first stage as training targets.

Contrastive language-image pre-training (CLIP)³⁶ is employed as the teacher model in the VQ-KD model of the first stage. In the second stage, for a given input image $x\in {R}^{H\times W\times C}$, it is reshaped to $N={HW}/{P}^{2}$ patches ${\{{x}_{i}^{p}\}}_{i=1}^{N}$, where ${x}^{p}\in {R}^{N\times ({P}^{2}C)}$ and $(P,P)$ is the patch size. In this experiment, each 224 × 224 image is divided into a 14 × 14 grid of image patches, with each patch being 16 × 16, and approximately 40% of the patches are randomly selected for masking. These masked positions are denoted as $M$.
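
The patch arithmetic above can be checked with a short sketch. The helper names are hypothetical, and uniform random selection of masked positions is an assumption made for simplicity; the text only states that roughly 40% of patches are selected at random.

```python
import numpy as np


def patchify(x, patch=16):
    """Reshape an H x W x C image into N x (P*P*C) flattened patches."""
    h, w, c = x.shape
    gh, gw = h // patch, w // patch
    x = x.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (gh, gw, P, P, C)
    return x.reshape(gh * gw, patch * patch * c)


def sample_mask(n_patches, ratio=0.4, seed=0):
    """Boolean mask with round(ratio * N) positions set, chosen uniformly at random."""
    rng = np.random.default_rng(seed)
    n_masked = int(round(ratio * n_patches))
    mask = np.zeros(n_patches, dtype=bool)
    mask[rng.choice(n_patches, size=n_masked, replace=False)] = True
    return mask


image = np.zeros((224, 224, 3), dtype=np.float32)
patches = patchify(image)
mask = sample_mask(len(patches))
print(patches.shape, int(mask.sum()))  # (196, 768) 78
```

With $H=W=224$ and $P=16$, $N=196$ patches of dimension ${P}^{2}C=768$ are obtained, and a 40% ratio corresponds to 78 masked positions.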

To handle the masked patches, a shared learnable embedding ${e}_{[M]}$ is used to replace the original image patch embeddings ${e}_{i}^{p}$ if $i\in M$, as shown in the following Eq. 1:

$${x}_{i}^{M}=\delta \left(i\in M\right)\odot {e}_{\left[M\right]}+\left(1-\delta \left(i\in M\right)\right)\odot {x}_{i}^{p}$$

(1)

where $\delta (\cdot )$ is the indicator function and $\odot$ denotes element-wise multiplication.

Subsequently, a learnable CLS token is prepended to the input, which now becomes $[{e}_{{CLS}},\,{\{{{x}}_{i}^{M}\}}_{i=1}^{N}]$. This sequence is then fed to the ViT⁴ block, where the final encoding vectors are denoted as ${\{{h}_{i}\}}_{i=0}^{N}$, with ${h}_{0}$ corresponding to the CLS token. The visual tokens of the masked positions are then predicted based on the corrupted image ${x}^{M}$, and a simple fully connected layer is used for this prediction.
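
The masking rule of Eq. 1 and the prepending of the CLS token can be sketched in a few lines. The embedding dimension of 8 and the constant values used for ${e}_{[M]}$ and ${e}_{CLS}$ are toy assumptions; in the actual model both are learnable vectors.

```python
import numpy as np


def corrupt_and_prepend_cls(patch_emb, mask, mask_emb, cls_emb):
    """Apply Eq. 1 to every patch, then prepend the CLS embedding."""
    # Indicator delta(i in M) as a column vector of 0/1 values, shape (N, 1)
    delta = mask[:, None].astype(patch_emb.dtype)
    x_m = delta * mask_emb + (1.0 - delta) * patch_emb
    return np.concatenate([cls_emb[None, :], x_m], axis=0)


n, d = 196, 8  # d = 8 is a toy embedding size
rng = np.random.default_rng(0)
patch_emb = rng.normal(size=(n, d))
mask = np.zeros(n, dtype=bool)
mask[:78] = True              # first 78 patches masked, for illustration
mask_emb = np.full(d, -1.0)   # stands in for the learnable e_[M]
cls_emb = np.zeros(d)         # stands in for the learnable e_CLS
seq = corrupt_and_prepend_cls(patch_emb, mask, mask_emb, cls_emb)
print(seq.shape)  # (197, 8)
```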

When the BEiT-V2 model is used as the image encoder layer, the handling of image patches differs from the pre-training stage. Specifically, after the input image $x$ is segmented into patches, these patches are not masked. Instead, a learnable CLS token, denoted as ${e}_{{CLS}}$, is prepended directly to the patch sequence, as shown in Eq. 2:

$${z}_{0}=\left[{e}_{{CLS}},\,{\left\{{\,x}_{i}^{p}\right\}}_{i=1}^{N}\right]\,$$

(2)

where ${x}^{p}\in {R}^{N\times \left({P}^{2}C\right)}$.

The image feature vector $y={\{{H}_{i}\}}_{i=0}^{N}$ (where ${H}_{0}$ corresponds to the CLS token) is derived through the ViT block (Eqs. 3–5), which is equivalent to the Transformer encoder layer. This layer alternates between multi-headed self-attention (MSA) and Multi-Layer Perceptron (MLP) blocks. Specifically, Eq. 3 describes the MSA operation, Eq. 4 the MLP operation, and Eq. 5 provides the final output after the last layer:

$${z}_{l}^{{\prime} }={MSA}\left({LN}\left({z}_{l-1}\right)\right)+\,{z}_{l-1},\,l=1,\,\ldots \,,\,L\,$$

(3)

$${z}_{l}={MLP}\left({LN}\left({z}_{l}^{{\prime} }\right)\right)+\,{z}_{l}^{{\prime} },\,l=1,\,\ldots \,,\,L\,$$

(4)

$$y={{ViT}}_{L}(x)={LN}\left({z}_{L}\right)\,$$

(5)
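
Eqs. 3–5 describe a pre-norm residual stack, which can be sketched as below. The sublayers passed in are deliberately simple placeholders (mean token mixing and a one-layer tanh map), not real MSA or MLP blocks, and the layer normalization omits the learnable gain and bias; these simplifications are assumptions made to keep the example short.

```python
import numpy as np


def layer_norm(z, eps=1e-6):
    """LN over the feature axis, without learnable gain and bias."""
    mu = z.mean(axis=-1, keepdims=True)
    var = z.var(axis=-1, keepdims=True)
    return (z - mu) / np.sqrt(var + eps)


def vit_encoder(z0, layers):
    """layers is a list of (msa_fn, mlp_fn) pairs, one pair per layer l = 1..L."""
    z = z0
    for msa, mlp in layers:
        z_prime = msa(layer_norm(z)) + z        # Eq. 3
        z = mlp(layer_norm(z_prime)) + z_prime  # Eq. 4
    return layer_norm(z)                        # Eq. 5


rng = np.random.default_rng(0)
n_tokens, dim = 197, 32
w = rng.normal(size=(dim, dim)) * 0.1


def toy_msa(z):
    # Placeholder for MSA: every token receives the mean of all tokens
    return np.repeat(z.mean(axis=0, keepdims=True), len(z), axis=0)


def toy_mlp(z):
    # Placeholder for the MLP: a single tanh layer
    return np.tanh(z @ w)


y = vit_encoder(rng.normal(size=(n_tokens, dim)), [(toy_msa, toy_mlp)] * 2)
print(y.shape)  # (197, 32)
```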

The MSA block extends self-attention (SA, Eqs. 6 and 7) by running k self-attention operations (or “heads”) and concatenating their outputs (Eq. 8). Layer normalization (LN) is applied before each block, and a residual connection is added after each block.

$$\left[{\boldsymbol{q}},\,{\boldsymbol{k}},\,{\boldsymbol{v}}\right]={\boldsymbol{x}}{U}_{{qkv}}\,,\,{U}_{{qkv}}\in {R}^{D\times 3{D}_{h}},\,{\boldsymbol{x}}\in {R}^{N\times D}\,$$

(6)

$${SA}\left({\boldsymbol{x}}\right)={softmax}\left(\frac{{\boldsymbol{q}}{{\boldsymbol{k}}}^{T}}{\sqrt{{D}_{h}}}\right){\boldsymbol{v}}\,$$

(7)

$${MSA}\left({\boldsymbol{x}}\right)=\left[{{SA}}_{1}\left({\boldsymbol{x}}\right);{{SA}}_{2}\left({\boldsymbol{x}}\right);\ldots ;{{SA}}_{k}\left({\boldsymbol{x}}\right)\right]{U}_{{msa}},\,{U}_{{msa}}\in {R}^{k\cdot {D}_{h}\times D}\,$$

(8)
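
Eqs. 6–8 can be written out directly in NumPy. The sketch below uses random weights and toy sizes ($D=64$, $k=4$) that are assumptions for illustration only; it shows the shapes involved rather than the trained model.

```python
import numpy as np


def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, u_qkv):
    """Eqs. 6-7: x is (N, D) and u_qkv is (D, 3 * D_h)."""
    q, k, v = np.split(x @ u_qkv, 3, axis=-1)
    d_h = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d_h))  # (N, N), each row sums to 1
    return weights @ v                          # (N, D_h)


def multi_head_self_attention(x, u_qkv_heads, u_msa):
    """Eq. 8: concatenate the heads along the feature axis, then project with u_msa."""
    heads = [self_attention(x, u) for u in u_qkv_heads]
    return np.concatenate(heads, axis=-1) @ u_msa


rng = np.random.default_rng(0)
n, d, n_heads = 197, 64, 4
d_h = d // n_heads
x = rng.normal(size=(n, d))
u_qkv_heads = [rng.normal(size=(d, 3 * d_h)) * 0.1 for _ in range(n_heads)]
u_msa = rng.normal(size=(n_heads * d_h, d)) * 0.1
out = multi_head_self_attention(x, u_qkv_heads, u_msa)
print(out.shape)  # (197, 64)
```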

(2)
Spatial Encoder Layer: we acquire the image CLS token ${H}_{0}$, a representation of the global image, after the original image has passed through the image encoder layer. We employ a two-layer ViT block (Eq. 5, $L=2$) as the spatial encoding layer to further extract spatial information from the sample. Specifically, we use the multi-dimensional vector of the image CLS token as input, obtained by encoding the same sample at different spatial heights. We then prepend a new learnable CLS token to this input before feeding it into the spatial encoder layer. This process yields the spatial image CLS token enriched with spatial image features.
(3)
Temporal Encoder Layer: we utilize a Bidirectional Long Short-Term Memory (Bi-LSTM) layer, which processes the input sequence in both forward and backward directions simultaneously. This bidirectional approach allows the Bi-LSTM to capture information from both past (previous time steps) and future (following time steps), effectively capturing long-term dependencies in the input sequence. After obtaining the multi-dimensional vector of CLS token from the spatial encoding layer across different time points of the same sample, we derive a vector containing both temporal and spatial features, suitable for the subsequent classification task.
(4)
Classification Layer: we employ a simple MLP to perform the three-class classification, i.e., vMost-, vLess-, and vNo-DILI concern.
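
The temporal encoding and classification steps in items (3) and (4) can be sketched as a bidirectional LSTM pass over the four daily spatial CLS vectors, followed by a linear classifier. This is a simplified NumPy illustration rather than the released PyTorch implementation: the feature and hidden sizes, the random weights, the single linear layer standing in for the MLP, and the choice to summarize the sequence by concatenating the final forward and backward states are all assumptions.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def lstm_pass(xs, w, u, b):
    """Run one LSTM direction over xs (T, D_in); returns hidden states (T, H)."""
    hidden = b.shape[0] // 4
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    states = []
    for x in xs:
        gates = x @ w + h @ u + b            # stacked input, forget, cell, output gates
        i, f, g, o = np.split(gates, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        states.append(h)
    return np.stack(states)


def bilstm_classify(xs, fwd, bwd, w_cls, b_cls):
    h_fwd = lstm_pass(xs, *fwd)              # Day 0 -> Day 3
    h_bwd = lstm_pass(xs[::-1], *bwd)[::-1]  # Day 3 -> Day 0, re-aligned to time order
    # Each of these two states has seen the whole sequence in its own direction
    feat = np.concatenate([h_fwd[-1], h_bwd[0]])
    logits = feat @ w_cls + b_cls
    e = np.exp(logits - logits.max())
    return e / e.sum()                       # probabilities over the three DILI classes


rng = np.random.default_rng(0)
t_steps, d_in, hidden, n_classes = 4, 16, 8, 3  # toy sizes


def make_params():
    return (rng.normal(size=(d_in, 4 * hidden)) * 0.1,
            rng.normal(size=(hidden, 4 * hidden)) * 0.1,
            np.zeros(4 * hidden))


xs = rng.normal(size=(t_steps, d_in))        # one spatial CLS vector per day
w_cls = rng.normal(size=(2 * hidden, n_classes)) * 0.1
probs = bilstm_classify(xs, make_params(), make_params(), w_cls, np.zeros(n_classes))
print(probs.shape)  # (3,)
```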

We use cross-entropy as the loss function and Adaptive Moment Estimation with decoupled weight decay (AdamW) as the optimizer, with a learning rate of $2\times {10}^{-4}$, a batch size of 6, and 100 training epochs. To reduce the model’s dependence on any single test split and to ensure the reliability of the performance indicators, we employ K-fold cross-validation (K = 5) during training; in each fold, a random 80% of the data forms the training dataset and the remaining 20% forms the test dataset. This approach allows all data to be used as test data for validation, and the average result across folds serves as the model evaluation indicator. The entire DILI classification prediction model is implemented using the PyTorch framework with CUDA 11.6 and trained on a Tesla V100 GPU (32 GB).
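
The 5-fold partitioning can be sketched with the standard library alone. The helper below is hypothetical and splits at the level of individual samples; whether the released code groups samples (for example by compound) before splitting is not stated here, so that choice is an assumption.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample lands in exactly one test fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


splits = list(k_fold_indices(158))  # 158 HLO samples, as reported in the data section
print([len(test) for _, test in splits])  # [32, 32, 32, 31, 31]
```

Each fold holds out roughly 20% of the samples, and averaging the per-fold metrics gives the reported evaluation indicator.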

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The source data can be obtained from Supplementary Data 1–3. Supplementary Data 4 contains the CSV index file for the organoid image dataset, including metadata and classification labels for all images used in the study. The corresponding compressed image archive has not been deposited in a public repository, but the full dataset is available from the corresponding author upon reasonable request.

Code availability

The code involved in this article is available on GitHub (https://github.com/js-ish/dilipredict).

References

1. Zhou, S., Chen, B., Fu, E. S. & Yan, H. Computer vision meets microfluidics: a label-free method for high-throughput cell analysis. Microsyst. Nanoeng. 9, 116 (2023).
2. Arnab, A. et al. ViViT: a video vision transformer. In Proc. IEEE/CVF International Conference on Computer Vision 6836–6846 (IEEE, 2021).
3. Matsoukas, C., Haslum, J. F., Söderberg, M. & Smith, K. Is it time to replace CNNs with transformers for medical images? Preprint at https://arxiv.org/abs/2108.09038 (2021).
4. Dosovitskiy, A. et al. An image is worth 16×16 words: Transformers for image recognition at scale. Int. Conf. Learn. Represent. (2021).
5. Linsley, J. W. et al. Superhuman cell death detection with biomarker-optimized neural networks. Sci. Adv. 7, eabf8142 (2021).
6. Kostrykin, L., Schnörr, C. & Rohr, K. Globally optimal segmentation of cell nuclei in fluorescence microscopy images using shape and intensity information. Med. Image Anal. 58, 101536 (2019).
7. Wen, J. W., Zhang, H. L. & Du, P. F. Vislocas: vision transformers for identifying protein subcellular mis-localization signatures of different cancer subtypes from immunohistochemistry images. Comput. Biol. Med. 174, 108392 (2024).
8. Bell, C. C. et al. Characterization of primary human hepatocyte spheroids as a model system for drug-induced liver injury, liver function and disease. Sci. Rep. 6, 25187 (2016).
9. Liu, T. et al. Squaramide-based supramolecular materials drive HepG2 spheroid differentiation. Adv. Health. Mater. 10, e2001903 (2021).
10. Zhang, C. J. et al. A human liver organoid screening platform for DILI risk prediction. J. Hepatol. 78, 998–1006 (2023).
11. Messelmani, T. et al. Liver organ-on-chip models for toxicity studies and risk assessment. Lab Chip 22, 2423–2450 (2022).
12. Jang, K.-J. et al. Reproducing human and cross-species drug toxicities using a Liver-Chip. Sci. Transl. Med. 11, eaax5516 (2019).
13. Lin, S., Schorpp, K., Rothenaigner, I. & Hadian, K. Image-based high-content screening in drug discovery. Drug Discov. Today 25, 1348–1361 (2020).
14. Krentzel, D., Shorte, S. L. & Zimmer, C. Deep learning in image-based phenotypic drug discovery. Trends Cell Biol. 33, 538–554 (2023).
15. Zhao, Z. et al. Organoids. Nat. Rev. Methods Prim. 2, 94 (2022).
16. Brooks, A. et al. Liver organoid as a 3D in vitro model for drug validation and toxicity assessment. Pharmacol. Res. 169, 105608 (2021).
17. Hu, Y. et al. Liver organoid culture methods. Cell Biosci. 13, 197 (2023).
18. Powell, R. T. et al. deepOrganoid: a brightfield cell viability model for screening matrix-embedded organoids. SLAS Discov. 27, 175–184 (2022).
19. Park, T. et al. Development of a deep learning based image processing tool for enhanced organoid analysis. Sci. Rep. 13, 19841 (2023).
20. Walters, W. P. & Barzilay, R. Critical assessment of AI in drug discovery. Expert Opin. Drug Discov. 16, 937–947 (2021).
21. Chen, M. et al. DILIrank: the largest reference drug list ranked by the risk for developing drug-induced liver injury in humans. Drug Discov. Today 21, 648–653 (2016).
22. Baudy, A. R. et al. Liver microphysiological systems development guidelines for safety risk assessment in the pharmaceutical industry. Lab Chip 20, 215–225 (2020).
23. Dragovic, S. et al. Evidence-based selection of training compounds for use in the mechanism-based integrated prediction of drug-induced liver injury in man. Arch. Toxicol. 90, 2979–3003 (2016).
24. Proctor, W. R. et al. Utility of spherical human liver microtissues for prediction of clinical drug-induced liver injury. Arch. Toxicol. 91, 2849–2863 (2017).
25. Xu, J. J. et al. Cellular imaging predictions of clinical drug-induced liver injury. Toxicol. Sci. 105, 97–105 (2008).
26. Du, X. et al. Organoids revealed: morphological analysis of the profound next generation in-vitro model with artificial intelligence. Biodes. Manuf. 6, 319–339 (2023).
27. Bai, L. et al. AI-enabled organoids: construction, analysis, and application. Bioact. Mater. 31, 525–548 (2024).
28. Fernandez-Checa, J. C. et al. Advanced preclinical models for evaluation of drug-induced liver injury - consensus statement by the European Drug-Induced Liver Injury Network [PRO-EURO-DILI-NET]. J. Hepatol. 75, 935–959 (2021).
29. Liu, A. et al. Prediction and mechanistic analysis of drug-induced liver injury (DILI) based on chemical structure. Biol. Direct 16, 6 (2021).
30. Chen, Z. et al. ResNet18DNN: prediction approach of drug-induced liver injury by deep neural network with ResNet18. Brief. Bioinform. 23, bbab503 (2021).
31. Lim, S. et al. Supervised chemical graph mining improves drug-induced liver injury prediction. iScience 26, 105677 (2023).
32. Kohonen, P. et al. A transcriptomics data-driven gene space accurately predicts liver cytopathology and drug-induced liver injury. Nat. Commun. 8, 15932 (2017).
33. Feng, C. et al. Gene expression data based deep learning model for accurate prediction of drug-induced liver injury in advance. J. Chem. Inf. Model. 59, 3240–3250 (2019).
34. Ewart, L. et al. Performance assessment and economic analysis of a human Liver-Chip for predictive toxicology. Commun. Med. 2, 154 (2022).
35. Peng, Z., Dong, L., Bao, H., Ye, Q. & Wei, F. BEiT v2: masked image modeling with vector-quantized visual tokenizers. Preprint at https://arxiv.org/abs/2208.06366 (2022).
36. Radford, A. et al. Learning transferable visual models from natural language supervision. Proc. 38th Int. Conf. Mach. Learn. 139, 8748–8763 (2021).

Acknowledgements

This research was funded by the Natural Science Foundation of Jiangsu Province, Major Project (BK20222008), Jiangsu Province Hospital (the First Affiliated Hospital with Nanjing Medical University) Clinical Capacity Enhancement Project (JSPH-MB-2023-9), and Open Project of Key Laboratory of Environmental Medicine Engineering of Ministry of Education (2024EME003).

Author information

These authors contributed equally: Shiyi Tan, Yan Ding.

Authors and Affiliations

Key Laboratory of Environmental Medicine Engineering of Ministry of Education, School of Public Health, Southeast University, Nanjing, China
Shiyi Tan, Wei Wang, Geyu Liang, Yuepu Pu & Juan Zhang
Jiangsu Institute for Sport and Health (JISH), Nanjing, China
Shiyi Tan, Yan Ding, Wei Wang, Qiuyin Zhang, Tingting Xu, Tianmu Hu, Qinyi Hu, Ziliang Ye, Xiaopeng Yan, Xiaowei Wang, Juan Zhang & Zhongze Gu
Hepatobiliary Center of The First Affiliated Hospital, Nanjing Medical University; Research Unit of Liver Transplantation and Transplant Immunology, Chinese Academy of Medical Sciences, Nanjing, China
Jianhua Rao & Feng Cheng
State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
Qinyi Hu, Ziliang Ye, Mingyue Li, Peng Xie, Zaozao Chen & Zhongze Gu

Contributions

J.Z. and Z.G. designed the study and supervised the work. S.T., T.X., and W.W. performed organoid culture, toxicity detection, and experimental data analysis. J.R. and F.C. participated in the liver organoid methods and provided clinical insights. X.Y. and X.W. selected a test list of drugs. Y.D., Q.Z., T.H., Q.H., and Z.Y. established the AI model. M.L., P.X., and Z.C. reviewed and commented on the paper from theoretical viewpoints. G.L. and Y.P. provided intellectual input in project development. S.T., Y.D., and Q.Z. wrote the manuscript that was reviewed and edited by all authors.

Corresponding authors

Correspondence to Juan Zhang or Zhongze Gu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Inclusion and ethics

This study obtained informed consent and received ethical approval from the Ethics Committee of the First Affiliated Hospital of Nanjing Medical University (2021-SR-575).

Peer review

Peer review information

Communications Biology thanks Juhi Tayal and the other anonymous reviewer(s) for their contribution to the peer review of this work. Primary handling editors: Dr Debarka Sengupta and Dr Ophelia Bu. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Transparent Peer Review file

Supplementary information

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Reporting summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Tan, S., Ding, Y., Wang, W. et al. Development of an AI model for DILI-level prediction using liver organoid brightfield images. Commun Biol 8, 886 (2025). https://doi.org/10.1038/s42003-025-08205-6

Received: 15 September 2024
Accepted: 12 May 2025
Published: 07 June 2025
Version of record: 07 June 2025
DOI: https://doi.org/10.1038/s42003-025-08205-6
