Abstract
The hematopoietically-expressed homeobox protein (HHEX), an evolutionarily conserved regulator of endodermal organogenesis, remains uncharacterized in intestinal epithelial differentiation. To address this gap, we developed a Drosophila enterocyte-specific HHEX knockdown strain (NP1-Gal4 > UAS-HHEX RNAi) and generated a comprehensive cellular atlas of Drosophila third-instar larval (L3) midgut by single-cell RNA sequencing to precisely dissect HHEX’s regulation. Our analysis delineated major midgut lineages including adult midgut progenitors (AMPs), enteroendocrine cells (EEs), and functionally distinct enterocyte (EC) subtypes defined by metabolic genes. HHEX depletion significantly reduced EC abundance while preserving subtype diversity. This inaugural single-cell transcriptomic comparison delivers a precision transcriptional atlas of HHEX-deficient Drosophila midgut for studying HHEX’s critical role in regulating midgut epithelial maturation, while providing a roadmap to explore potential HHEX-regulated endodermal patterning conservation in mammalian systems.
Background & Summary
HHEX demonstrates evolutionary conservation in directing endodermal specification, with versatile regulatory functions observed across species1,2. Vertebrate studies reveal its essential role in both anterior-posterior axis establishment and lineage commitment of hepatic and pancreatic lineages through spatiotemporal modulation of endodermal progenitor niches3,4,5,6,7. HHEX plays a pivotal role in modulating epithelial biology, influencing hepatic ductal morphogenesis, biliary epithelial functional development, and endothelial cell regulation8,9,10. For example, loss of HHEX expression induces embryonic lethality at embryonic day 10.5 (E10.5) in mice and manifests lethality phenotypes at day 18 in porcine embryos, confirming its indispensable role in endodermal organogenesis11.
The intestine, a quintessential endodermal organ, cooperates with the liver and pancreas to regulate systemic metabolism12,13. During vertebrate embryogenesis, pancreatic progenitors originate from intestinal endodermal epithelia, subsequently forming dorsal and ventral pancreatic buds14,15. Notably, HHEX is indispensable for ventral bud specification: HHEX-knockout mice exhibit complete ventral pancreas agenesis16,17. Moreover, recent studies have identified that CK2-mediated interactions between HHEX and the YAP-TEAD complex promote colorectal carcinogenesis18. These findings underscore HHEX’s roles in regulating intestinal endodermal development. Nevertheless, the precise cellular and molecular mechanisms by which HHEX modulates gut epithelial differentiation remain poorly characterized. Accounting for approximately 60% of midgut cells, ECs represent the predominant epithelial population, thus establishing HHEX-mediated regulation of EC development as an essential determinant of intestinal morphogenesis and homeostasis19,20,21.
The Drosophila midgut shares structural and functional homology with mammalian intestines and exhibits a conserved cellular composition, including intestinal stem cells (ISCs), ECs, and EEs. Furthermore, the Drosophila GAL4-UAS system offers unparalleled precision for EC-specific HHEX perturbation, enabling direct investigation of its cell-autonomous effects on EC differentiation22,23,24. These attributes collectively establish the Drosophila midgut as an ideal model for dissecting HHEX’s role in gut biology25,26,27.
We generated an NP1-Gal4 > UAS-HHEX RNAi strain to achieve EC-specific knockdown of HHEX (HHEX-KD). We then performed single-cell RNA sequencing (scRNA-seq) to investigate the functional impact of HHEX on the Drosophila midgut. First, HHEX-KD larvae exhibited shortened midguts without significant changes in body weight (Fig. 1c–e). Subsequent scRNA-seq analysis revealed that HHEX-KD disrupted EC differentiation, as evidenced by reduced mature EC populations, impaired differentiation of midgut primordium (MP) into ECs, and the emergence of HHEX-knockdown-induced cells (HICs).
Experimental Design and Drosophila Stock Construction. (a) Schematic workflow of the study. Ctrl group: NP1-Gal4 > UAS-mCherry RNAi; KD group: NP1-Gal4 > UAS-HHEX RNAi. (b) RT-qPCR validation of HHEX knockdown efficiency in dissected midguts. HHEX mRNA levels were significantly reduced in KD compared to Ctrl (unpaired two-tailed t-test; P value < 0.05 was considered statistically significant (*)). (c,d) Midgut length comparison between KD and Ctrl groups. KD midguts (n = 30) were shorter than Ctrl (n = 30). Scale bars: 1 mm. (e) No significant difference in larval body weight between groups (t-test, P value > 0.05). (f) UMAP visualization of HHEX expression from scRNA-seq data. Enterocytes (ECs) in Ctrl showed higher HHEX expression levels compared to KD. Each point represents one cell; cells are colored from white (low) to blue (high) by HHEX expression; annotated ECs are demarcated with solid blue outlines.
Our study establishes the first single-cell transcriptomic atlas of HHEX perturbation in Drosophila L3 midgut epithelium. The data suggest that midgut homeostasis during this developmental stage may involve adaptive mechanisms that preserve digestive capacity under HHEX-associated differentiation constraints. These findings suggest that HHEX may play a role in coordinating midgut epithelial differentiation dynamics, and may also participate in establishing endodermal organ differentiation paradigms. This dataset serves as a foundational resource for exploring conserved regulatory principles of intestinal homeostasis across species.
Methods
The overall research pipeline of this study is illustrated in Fig. 1a. We constructed a Drosophila EC-specific HHEX KD strain and performed scRNA-seq, followed by analysis using a well-established analytical workflow for single-cell transcriptomics. This approach enabled comprehensive annotation of major cell types in the third-instar larval midgut of Drosophila. We systematically compared differences in subtype proportions, population sizes, and gene expression profiles between the control group and the HHEX-KD group.
EC-specific HHEX knockdown strain generation and third-instar larval cultivation
The following strains were used: (1) White Dahomey (wDah); NP1-Gal4/CyO (a gift from Dr. Yuxuan Lyu, Southern University of Science and Technology); (2) UAS-HHEX-shRNA-attP40/CyO (TH04118.N; Tsinghua Fly Center); (3) yv; UAS-mCherry-shRNA-attP2 (BDSC #35785; Bloomington Drosophila Stock Center). To investigate the role of HHEX in regulating larval midgut EC development, we knocked down HHEX using a gut EC-specific Gal4 driver, NP1-Gal4 (also known as Myo31DF-Gal4)22,25,28. Flies carrying NP1-Gal4 were crossed with either UAS-HHEX-shRNA-attP40/CyO or yv; UAS-mCherry-shRNA-attP2 lines to generate experimental groups:
KD group: wDah; NP1-Gal4/CyO (male) × UAS-HHEX-shRNA-attP40/CyO (female).
Ctrl group: wDah; NP1-Gal4/CyO (male) × yv; UAS-mCherry-shRNA-attP2 (female).
Flies were maintained at 25 °C on standard cornmeal-agar medium under 12 h light/dark cycles. Synchronized wandering third-instar larvae were collected based on their characteristic wandering behavior (migration from food substrate to vial walls for pupariation) at ~120 hours post-egg laying.
RNA Isolation and RT-qPCR analysis
Midguts from the Ctrl and KD groups were microdissected for RNA extraction using RNA Isolator (Vazyme R401). Reverse transcription of 1 μg total RNA per sample was performed using HiScript II Q Select RT Supermix (Vazyme R233). cDNA templates were diluted 10-fold prior to quantitative PCR analysis with ChamQ SYBR Mastermix (Vazyme Q311). Primer pairs (designed using FlyPrimerBank29) were validated through standard curve generation with serial cDNA dilutions. Gene expression was quantified based on cycle threshold (Ct) values normalized to the αTub84B reference gene. Primer sequences are as follows:
HHEX-F: 5′-GTTCAGCCAACAGCCTATTGT-3′
HHEX-R: 5′-GGAGGCAGGATTGGGGAAT-3′
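The Ct-based normalization described above can be illustrated with a short calculation. The sketch below assumes the common 2^-ΔΔCt approach, which the text does not name explicitly; the function name `relative_expression` and all Ct values are hypothetical and are not data from this study.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct, calibrator_delta_ct):
    """Return 2^-(ddCt) for one sample (assumed 2^-ddCt method)."""
    # Normalize the target gene (HHEX) to the reference gene (alphaTub84B).
    delta_ct = target_ct - reference_ct
    # Express the sample relative to the calibrator (mean control dCt).
    delta_delta_ct = delta_ct - calibrator_delta_ct
    return 2.0 ** (-delta_delta_ct)


# Hypothetical (HHEX Ct, alphaTub84B Ct) pairs, one per replicate.
ctrl = [(26.0, 18.0), (26.2, 18.1), (25.9, 18.0)]
kd = [(27.6, 18.0), (27.9, 18.2), (27.5, 17.9)]

# Calibrator: mean dCt of the control group.
ctrl_delta = mean(t - r for t, r in ctrl)

ctrl_rel = [relative_expression(t, r, ctrl_delta) for t, r in ctrl]
kd_rel = [relative_expression(t, r, ctrl_delta) for t, r in kd]

print("Ctrl mean relative expression:", round(mean(ctrl_rel), 3))
print("KD mean relative expression:", round(mean(kd_rel), 3))
```

A higher Ct in the KD samples at a comparable reference Ct translates into a relative expression value below 1, which is the direction reported in Fig. 1b.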
Tissue dissociation and single-cell suspension preparation
Midgut dissociation was performed following established protocols26. Third-instar larvae of KD group and Ctrl group were dissected in ice-cold PBS containing 1% BSA (w/v). For each experimental group, >30 midguts were collected after removal of the crop, midgut-hindgut junction (with associated Malpighian tubules), and residual peritrophic matrix.
Midguts were minced into fragments and digested in 400 μL elastase solution (1 mg/mL in PBS) in 1.5 mL Eppendorf tubes with orbital shaking at 27 °C (300 rpm, 30 min). Reactions were stopped by adding 1% BSA. Cell suspensions were sequentially filtered through 100 μm and 40 μm nylon meshes, then subjected to density gradient centrifugation using OptiPrep (Axis-Shield) at 1.12 g/mL. After centrifugation (800 × g, 20 min, 4 °C), viable cells were harvested from the upper interface layer. Cell viability (>90%) was confirmed by trypan blue exclusion.
scRNA-seq library preparation, sequencing, and genomic alignment
scRNA-seq libraries were prepared using the DNBelab C Series Single-Cell Library Prep Kit (MGI, 940-001924-00), with each library capturing approximately 20,000 cells. To ensure experimental reproducibility, two biological replicate libraries were generated for each experimental group, resulting in a total of four libraries (two each for KD group and Ctrl group).
The experimental workflow consisted of sequential phases: microfluidic droplet generation, reverse transcription within emulsion droplets, emulsion destabilization for product recovery, magnetic bead-based cDNA purification, and PCR amplification of cDNA libraries. Following library preparation, DNA quantification was conducted using the Qubit ssDNA Assay Kit (Thermo Fisher Scientific, Q10212). Libraries subsequently underwent paired-end sequencing on the MGI DNBSEQ-T1 platform at the China National GeneBank30.
Raw sequencing data were aligned to the Drosophila melanogaster reference genome (BDGP6.28 assembly, GenBank accession: GCA_000001215.4) through the standardized DNBelab C Series analysis pipeline (https://github.com/MGI-tech-bioinformatics/DNBelab_C_Series_HT_scRNA-analysis-software)31,32,33. Read mapping parameters included specification of an expected cell count (expect cells = 20000) corresponding to library preparation parameters. Detailed alignment statistics and quality control metrics are summarized in Table 1.
scRNA-seq data processing
Raw expression matrices were processed into AnnData objects (anndata v0.9.2) using Python v3.9, followed by rigorous quality control: cells with <200 UMIs or <100 detected genes were filtered (Scanpy.pp.filter_cells), genes expressed in <3 cells or with >20,000 counts were excluded (Scanpy.pp.filter_genes), and mitochondrial gene contamination (genes with mt: prefix) was restricted to <8% per cell34,35. Doublets were computationally removed using OmicVerse.pp.qc with an automated threshold score of 0.5336. Batch-corrected latent embeddings were generated via scVI (n_latent = 15), trained for 200 epochs on GPU-accelerated hardware using default neural network architecture (2 hidden layers, 128 nodes each), enabling downstream uniform manifold approximation (Scanpy.tl.umap) and Leiden clustering (resolution = 1.6)37.
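The filtering thresholds above can be made concrete with a small NumPy sketch. This is not the Scanpy/OmicVerse code used in the study: it is a stand-in that applies the same cutoffs to a dense cells × genes count matrix, and the function name `qc_filter` and the order of operations (cells first, then genes on the retained cells) are assumptions.

```python
import numpy as np


def qc_filter(counts, gene_names, min_umis=200, min_genes=100,
              min_cells=3, max_gene_counts=20_000, max_pct_mt=8.0):
    """Return boolean masks (keep_cells, keep_genes) for a cells x genes matrix."""
    counts = np.asarray(counts)
    # Mitochondrial genes are identified by the "mt:" prefix, as in the text.
    is_mt = np.array([g.startswith("mt:") for g in gene_names])

    umis_per_cell = counts.sum(axis=1)
    genes_per_cell = (counts > 0).sum(axis=1)
    # Guard against division by zero for empty barcodes.
    pct_mt = 100.0 * counts[:, is_mt].sum(axis=1) / np.maximum(umis_per_cell, 1)

    keep_cells = (
        (umis_per_cell >= min_umis)
        & (genes_per_cell >= min_genes)
        & (pct_mt < max_pct_mt)
    )

    # Gene-level filters are evaluated on the retained cells only (assumption).
    filtered = counts[keep_cells]
    cells_per_gene = (filtered > 0).sum(axis=0)
    counts_per_gene = filtered.sum(axis=0)
    keep_genes = (cells_per_gene >= min_cells) & (counts_per_gene <= max_gene_counts)
    return keep_cells, keep_genes


# Toy example with lowered thresholds so that the effect is visible.
toy_genes = ["mt:CoI", "a", "b", "c"]
toy_counts = [
    [1, 5, 5, 5],  # passes all cell filters
    [5, 5, 5, 0],  # too much mitochondrial signal
    [0, 1, 0, 0],  # too few UMIs
    [0, 6, 6, 0],  # too few detected genes
    [0, 4, 4, 4],  # passes all cell filters
]
toy_keep_cells, toy_keep_genes = qc_filter(
    toy_counts, toy_genes, min_umis=10, min_genes=3, min_cells=2, max_gene_counts=100
)
print(toy_keep_cells, toy_keep_genes)
```

With the default arguments the function applies the thresholds stated in the text; on real data a sparse matrix and the library functions named above would be the practical choice.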
Clustering and marker genes calculation
Single-cell transcriptomic analysis was performed using Omicverse v1.5.0 and Scanpy v1.9.335,36. Raw count matrices were preprocessed through Pearson residual-based normalization and variance stabilization via Omicverse.pp.preprocess, retaining 3,000 highly variable genes (HVGs) followed by zero-centered scaling using Omicverse.pp.scale. Dimensionality reduction leveraged scVI-derived principal components (n_pcs = 15) to construct a neighborhood graph (Scanpy.pp.neighbors, n_neighbors = 15), with subsequent UMAP embedding (Scanpy.tl.umap)37. Clustering was performed via Leiden algorithm at resolution 1.6 (Scanpy.tl.leiden), identifying 32 transcriptionally distinct clusters. Marker genes were computationally prioritized using Wilcoxon rank-sum tests (Scanpy.tl.rank_genes_groups) and refined through COSG optimization (Omicverse.single.cosg, top 30 genes per cluster). Final annotations integrated orthology mapping from FlyBase (https://flybase.org/) and conserved expression patterns from published Drosophila intestinal cell atlases26,38.
Statistical comparison of cell population abundance
Cell population changes between the KD and Ctrl groups were examined using milopy v0.1.139. The analysis started from the scVI-processed latent representations, on which a k-nearest neighbor graph (k = 100, 15 PCs) was built. Sample-level cell counts were derived from the metadata identifiers in the dataset. A generalized linear model was used to test for abundance changes between the KD and Ctrl conditions (p_value < 0.05). Log-fold change (log2FC) values represent directional changes in the KD group: positive log2FC values indicate an increase in the KD group, and negative values indicate a decrease. Results are shown as violin plots colored by cell type. This method quantifies population dynamics while reducing technical noise.
Data Records
All raw sequencing data (paired-end reads in FASTQ format) generated in this study have been deposited under restricted access in the China National GeneBank Nucleotide Sequence Archive (CNSA; Project accession: CNP0007162; https://doi.org/10.26036/CNP0007162) at https://db.cngb.org/data_resources/project/CNP0007162/40,41,42. This controlled access facilitates data sharing according to established protocols. The dataset comprises four paired libraries, grouped as follows: Ctrl Group (Ctrl-1, Ctrl-2) and KD Group (KD-1, KD-2). Detailed library metadata, including unique library identifiers, experimental group assignments, and associated raw data file names (X_1.fastq.gz, X_2.fastq.gz), are listed in Table 2 and are also accessible via the corresponding sample records on the CNSA project page40. Processed single-cell gene expression data, including cell-type annotations, are publicly available via Figshare (https://doi.org/10.6084/m9.figshare.28927709) at https://figshare.com/articles/dataset/10_6084_m9_figshare_2892770943. This repository contains: (1) Per-sample raw count matrices, packaged in ZIP archives named after each library (e.g., Ctrl-1.zip). Each archive contains the standard compatible files: barcodes.tsv.gz, features.tsv.gz, and matrix.mtx.gz. (2) The integrated processed dataset, stored as an AnnData object (file: NP1_HHEX_anno.h5ad). This file contains the expression matrix, computed dimensionality reduction coordinates, cluster assignments, and curated cell-type annotations.
Technical Validation
Functional validation of HHEX knockdown efficacy
We systematically validated the establishment of the KD group at both experimental and analytical levels. In the KD group’s larval midgut, qPCR analysis revealed a significant reduction in HHEX expression compared with the Ctrl group (Fig. 1b). Phenotypic observations demonstrated a marked shortening of midgut length in the knockdown group (Fig. 1c,d), while overall larval body weight remained unaffected (Fig. 1e). These results indicate that HHEX EC knockdown specifically impairs midgut development without compromising essential intestinal functions.
Consistent with these findings, single-cell transcriptomic analysis showed significantly reduced HHEX expression in ECs annotated within the KD group dataset, aligning with experimental validation of knockdown efficacy (Fig. 1f).
Quality control of single-cell transcriptomic data
We employed the GAL4-UAS system to knock down HHEX in Drosophila midgut enterocytes, investigating its role in intestinal development and function. Midguts from third-instar larvae of the Ctrl and KD groups were dissected and dissociated into single-cell suspensions. Libraries were prepared with two biological replicates per group (Ctrl and KD), as described in Methods. Raw sequencing data were processed through standard pipelines.
Following stringent quality control, we retained 23,100 high-quality cells from Ctrl group (11,209 and 11,891 cells per library) and 18,958 cells from KD group (7,876 and 11,082 cells per library). The mean UMI count per cell was 1,920, with 1,055 genes detected per cell on average (Fig. 2a,b). Technical reproducibility was validated through Pearson correlation analysis of average gene expression profiles (Fig. 2c). UMAP visualization revealed minimal batch effects between replicates, with substantial overlap between Ctrl and KD populations. However, distinct non-overlapping cell clusters emerged specifically in KD samples (Fig. 2d,e), suggesting potential HHEX-dependent transcriptional divergence.
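The reproducibility check described above, a Pearson correlation of average gene expression profiles between libraries, can be sketched as follows. The data here are simulated and the function name `mean_profile_correlation` is hypothetical; the sketch only illustrates the computation, not the study's actual values or normalization.

```python
import numpy as np


def mean_profile_correlation(expr_a, expr_b):
    """Pearson r between per-gene mean expression of two libraries.

    Both inputs are cells x genes arrays with genes in the same order.
    """
    mean_a = np.asarray(expr_a, dtype=float).mean(axis=0)
    mean_b = np.asarray(expr_b, dtype=float).mean(axis=0)
    return float(np.corrcoef(mean_a, mean_b)[0, 1])


# Simulated replicates drawn from the same underlying per-gene expression rates.
rng = np.random.default_rng(0)
base_rates = rng.gamma(2.0, 1.0, size=50)
lib1 = rng.poisson(base_rates, size=(200, 50))
lib2 = rng.poisson(base_rates, size=(150, 50))

r = mean_profile_correlation(lib1, lib2)
print("Pearson r between replicate mean profiles:", round(r, 3))
```

Computing this value for every pair of libraries yields a correlation matrix of the kind shown in Fig. 2c.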
Quality Control and Dataset Integration for Drosophila midgut scRNA-seq Datasets. (a) Boxplot distributions of detected genes, UMI counts and mitochondrial gene percentage across 4 sequencing libraries. Dashed lines indicate quality thresholds: genes > 100, UMIs > 200, pct_counts_mt < 8%. (b) Group-wise comparisons of cellular gene/UMI counts (Ctrl vs KD) using the same metrics as in (a). (c) Pearson correlation matrix of normalized expression profiles. Color intensity scales from red (high similarity) to blue (low similarity). (d) UMAP visualization of integrated datasets after batch-effect correction with scVI (see Methods). (e) UMAP projection of batch-corrected single-cell profiles. Cells are colored by sequencing library (Ctrl-1: n = 11,209; Ctrl-2: n = 11,891; KD-1: n = 7,876; KD-2: n = 11,082), with gray background indicating the combined cell distribution.
Single-cell transcriptomic annotation defines functional cell-type diversity in the Drosophila larval midgut
The Drosophila larval midgut single-cell transcriptomes were clustered into 32 initial groups, with clusters containing fewer than 10 cells in either experimental condition (KD group or Ctrl group) excluded to ensure analytical reliability. The remaining 31 robust clusters exhibited balanced cellular representation across experimental groups, as detailed in Table 3. Cluster annotation was performed based on cell-type-specific marker genes listed in Supplementary Table 1, enabling precise identification of distinct cellular subpopulations. UMAP visualization of these annotated clusters revealed clear segregation of cell types (Fig. 3a), with consistent topological organization between biological replicates.
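The cluster-exclusion rule above (drop any cluster with fewer than 10 cells in either condition) is simple to express in code. The following is a minimal sketch using only the standard library; the function name `clusters_to_keep` and the toy labels are hypothetical.

```python
from collections import Counter


def clusters_to_keep(cluster_labels, group_labels, groups=("Ctrl", "KD"), min_cells=10):
    """Return clusters that have at least `min_cells` cells in every group."""
    # Count cells for each (cluster, group) pair; missing pairs count as 0.
    counts = Counter(zip(cluster_labels, group_labels))
    clusters = sorted(set(cluster_labels))
    return [c for c in clusters if all(counts[(c, g)] >= min_cells for g in groups)]


# Toy example: C00 is well represented in both groups, C01 has only 3 KD cells.
toy_clusters = ["C00"] * 25 + ["C01"] * 15
toy_groups = ["Ctrl"] * 12 + ["KD"] * 13 + ["Ctrl"] * 12 + ["KD"] * 3

kept = clusters_to_keep(toy_clusters, toy_groups)
print(kept)
```

Requiring the minimum in every group, rather than in total, avoids retaining clusters that are present in only one condition purely by chance.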
Single-Cell Transcriptomic Atlas of Drosophila Larval Midgut. (a) UMAP visualization of 31 annotated cell clusters after batch-effect correction. Colors denote distinct cell types, including Adult Midgut Progenitors (AMPs), Enteroendocrine Cells (EEs), Enterocytes (ECs), HHEX-knockdown induced cells (HICs), Copper cells, Gastric caecum and Plasmatocytes. (b) Dot plot of conserved marker genes across cell types (minimum 25% expression per cluster). Classic lineage markers are highlighted: AMPs (esg, Dl), EEs (pros), and ECs (Chs2, gas). Dot size indicates expression prevalence; color intensity reflects mean log-normalized counts. (c) UMAP projections showing expression patterns of key functional genes.
Annotation using cluster-specific marker genes cross-referenced with FlyBase and literature identified 31 midgut cell types, including MP, AMP/EE progenitors, AMPs-like, ECs (including 18 metabolically specialized EC subtypes), two EE subtypes, HHEX-KD Induced Cells (HICs), copper cells, and gastric caecum (Fig. 3b). The SNAIL family transcription factor Escargot (esg) and the stem cell marker Delta (Dl) serve as marker genes for AMPs in Drosophila44. Enteroendocrine cells (EEs) specifically express Prospero (Pros) and secrete diverse gut hormone peptides, including Allatostatins (AstA, AstB, AstC) and Tachykinin (Tk). Enterocytes (ECs) are characterized by the expression of Chitin synthase 2 (Chs2), gustatory receptor (gas), and hydrolase-encoding genes such as Jon99Aii25,27.
Clusters 20 and 24 were classified as midgut progenitors (C20_MP_1 and C24_MP_2) based on selective expression of Dl (encoding a Notch ligand critical for progenitor maintenance) and absence of esg expression. Cluster 30, designated C30_AMP/EE_progenitors, exhibited co-expression of Dl and Pros, a transcription factor required for enteroendocrine cell specification, suggesting lineage priming toward enteroendocrine differentiation. Two distinct AMPs-like populations (C08_AMPs-like_1 and C26_AMPs-like_2) were identified in Clusters 8 and 26, both maintaining esg and Sox100B expression45,46,47. Notably, Cluster 26 specifically expressed Chs2 (chitin synthase 2), indicating functional divergence from other AMPs-like clusters.
Clusters 5 and 15 displayed co-expression patterns of stemness markers (esg, Sox100B), enterocyte marker gas, and proliferation marker PCNA, suggestive of transitional states during differentiation from midgut progenitors to enterocytes25,48. These observations led us to tentatively classify these clusters as putative transitional populations (C05_MP > ECs_1 and C15_MP > ECs_2) (Fig. 3c).
Two EE subtypes were resolved: Cluster 18 (pros+/AstA+) and Cluster 29 (pros+/vn+). Cluster 29 additionally expressed the stress-response genes eEF1alpha2 and Hsp68 and exhibited a reduced proportion in the KD group (Fig. 4a,b).
Cell Population Changes in Larval Midgut. (a) Stacked bar plot showing KD (cluster-matched colors) vs Ctrl (gray) proportions within each cell type, calculated as KD cells in X / total cells in X and Ctrl cells in X / total cells in X. Bars sorted by descending KD proportion values. Exact proportions (0.00–1.00 scale) shown above bars (two decimal places). X-axis: annotated cell types; Y-axis: proportion. (b) Violin plots showing log2FC for abundance in KD group. Sorted high to low. Dashed line at 0. log2FC > 0: KD abundance increase; log2FC < 0: KD abundance decrease. Colors match cell types. X-axis: annotated cell types; Y-axis: log2FC.
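The per-cell-type proportions defined in panel (a) (KD cells in X / total cells in X, and likewise for Ctrl) can be computed as in the sketch below. It uses only the standard library; the function name `group_proportions` and the toy labels are hypothetical and do not reproduce values from the dataset.

```python
from collections import Counter


def group_proportions(cell_types, groups):
    """Return {cell_type: {group: fraction of that cell type's cells}}."""
    per_pair = Counter(zip(cell_types, groups))
    per_type = Counter(cell_types)
    group_names = sorted(set(groups))
    return {
        ct: {g: per_pair[(ct, g)] / total for g in group_names}
        for ct, total in per_type.items()
    }


# Toy annotation: 4 ECs (3 Ctrl, 1 KD) and 2 EEs (1 Ctrl, 1 KD).
toy_types = ["EC"] * 4 + ["EE"] * 2
toy_groups = ["Ctrl", "Ctrl", "Ctrl", "KD", "Ctrl", "KD"]

props = group_proportions(toy_types, toy_groups)
for cell_type, fractions in props.items():
    print(cell_type, {g: round(f, 2) for g, f in fractions.items()})
```

These raw fractions follow the caption's definition and are not adjusted for the different total cell numbers recovered from each group, which is one reason the neighborhood-level test in panel (b) is reported alongside them.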
Region-specific populations included Copper cells (Cluster 25, CAH1+/Vha100-4+) localized to the mid-midgut and Gastric caecum (Cluster 28, Acbp4+) in the anterior region (Fig. 3c). Metabolic specialization divided ECs into anterior populations (Clusters 12, 19, 28) expressing Jon65Ai and alphaTry, posterior groups (Clusters 14, 27) marked by iotaTry and zetaTry, detoxification-focused ECs (Cluster 7) enriched in Aldh, and chitin-producing ECs (Clusters 4, 11) with Chs2 expression. Within this functional landscape, Cluster 22 emerged as a ubiquitin ligase-enriched EC subtype exhibiting distinctively high E(spl)m6-BFM expression levels.
Clusters C16 and C17 were characterized by pronounced expression of Hemolectin (Hml) and Hemese (He), two established markers for plasmatocytes and hemocytes (Fig. 3c)49,50,51. Based on these distinct transcriptional profiles, we annotated these clusters as C16_Plasmatocytes_1 and C17_Plasmatocytes_2. Notably, eater (et), a hemocyte-specific gene known to be involved in phagocytosis of Gram-positive bacteria and hemocyte adhesion to sessile niches, was further enriched in C1652,53,54. In addition, C16 abundance appeared to be reduced in the KD group compared with the Ctrl group (Figs. 3b, 4a,b).
Clusters C00, C02, and C13 exhibited elevated expression of Antp and pb, two Hox family transcription factors implicated in midgut development and endoderm-derived cell differentiation, beyond their established roles in embryonic body patterning (Fig. 3c). Intriguingly, the human homolog of pb (HOXA2) has been reported to be upregulated in inflammatory bowel disease, while Antp’s human homolog (HOXA7) has been linked to colorectal cancer metastasis55,56. These parallels raise the possibility that pb and Antp might contribute to midgut inflammatory regulation. Notably, the co-occurrence of Myo81F (a myosin family member hypothesized to facilitate collective cell migration via cytoskeletal remodeling) with the stemness marker Dl in Cluster 13 suggests that Cluster 13 may be a precursor to Clusters 00 and 02 (Fig. 3c). Previous studies have demonstrated that upon infection or damage to ECs, the midgut epithelium undergoes a marked expansion of small progenitors expressing Dl22,57,58,59. Considering their increased abundance in the KD group (Fig. 4a,b) and partial retention of stem-like properties (e.g., Dl expression), these clusters were provisionally designated as HHEX-KD induced cells (C00-HICs_1, C02-HICs_2, C13-HICs_3), emerging specifically following HHEX knockdown.
Data availability
All raw sequencing data (paired-end FASTQ files) generated in this study are available under restricted access at the China National GeneBank Nucleotide Sequence Archive (CNSA) under project accession CNP0007162 (https://doi.org/10.26036/CNP0007162; Project https://db.cngb.org/data_resources/project/CNP0007162/)40,41,42. Four paired sequencing libraries (Ctrl-1, Ctrl-2, KD-1, KD-2) are provided, with metadata (library identifiers and file names) documented in Table 2 and on the CNSA project page40,41,42. Processed single-cell gene expression matrices with cell-type annotations are publicly available via Figshare (https://doi.org/10.6084/m9.figshare.28927709; Dataset https://figshare.com/articles/dataset/10_6084_m9_figshare_28927709)43.
Code availability
All code used for the analyses in this study, including data processing and downstream analyses, is available on Figshare at https://doi.org/10.6084/m9.figshare.2892770943.
References
Crompton, M. R. et al. Identification of a novel vertebrate homeobox gene expressed in haematopoietic cells. Nucleic Acids Res. 20, 5661–5667 (1992).
Newman, C. S., Chia, F. & Krieg, P. A. The XHex homeobox gene is expressed during development of the vascular endothelium: overexpression leads to an increase in vascular endothelial cell number. Mech. Dev. 66, 83–93 (1997).
Martinez Barbera, J. P. et al. The homeobox gene Hex is required in definitive endodermal tissues for normal forebrain, liver and thyroid formation. Dev. Camb. Engl. 127, 2433–2445 (2000).
Bergmann, S. et al. Spatial profiling of early primate gastrulation in utero. Nature 609, 136–143 (2022).
Zhang, J. et al. Tel2 regulates redifferentiation of bipotential progenitor cells via Hhex during zebrafish liver regeneration. Cell Rep. 39, 110596 (2022).
Yang, D. et al. CRISPR screening uncovers a central requirement for HHEX in pancreatic lineage commitment and plasticity restriction. Nat. Cell Biol. 24, 1064–1076 (2022).
Inamura, M. et al. Efficient generation of hepatoblasts from human ES cells and iPS cells by transient overexpression of homeobox gene HEX. Mol. Ther. J. Am. Soc. Gene Ther. 19, 400–407 (2011).
Gauvrit, S. et al. HHEX is a transcriptional regulator of the VEGFC/FLT4/PROX1 signaling axis during vascular development. Nat. Commun. 9, 2704 (2018).
Kershaw, R. M., Siddiqui, Y. H., Roberts, D., Jayaraman, P.-S. & Gaston, K. PRH/HHex inhibits the migration of breast and prostate epithelial cells through direct transcriptional regulation of Endoglin. Oncogene 33, 5592–5600 (2014).
Kitchen, P. et al. A Runaway PRH/HHEX-Notch3-Positive Feedback Loop Drives Cholangiocarcinoma and Determines Response to CDK4/6 Inhibition. Cancer Res. 80, 757–770 (2020).
Ruiz-Estevez, M. et al. Liver development is restored by blastocyst complementation of HHEX knockout in mice and pigs. Stem Cell Res. Ther. 12, 292 (2021).
Lewis, S. L. & Tam, P. P. L. Definitive endoderm of the mouse embryo: formation, cell fates, and morphogenetic function. Dev. Dyn. Off. Publ. Am. Assoc. Anat. 235, 2315–2329 (2006).
Ober, E. A., Verkade, H., Field, H. A. & Stainier, D. Y. R. Mesodermal Wnt2b signalling positively regulates liver specification. Nature 442, 688–691 (2006).
Slack, J. M. Developmental biology of the pancreas. Dev. Camb. Engl. 121, 1569–1580 (1995).
Koike, H. et al. Modeling human hepato-biliary-pancreatic organogenesis from the foregut-midgut boundary. Nature 574, 112–116 (2019).
Zhao, H., Han, D., Dawid, I. B., Pieler, T. & Chen, Y. Homeoprotein hhex-induced conversion of intestinal to ventral pancreatic precursors results in the formation of giant pancreata in Xenopus embryos. Proc. Natl. Acad. Sci. USA. 109, 8594–8599 (2012).
Bort, R., Martinez-Barbera, J. P., Beddington, R. S. P. & Zaret, K. S. Hex homeobox gene-dependent tissue positioning is required for organogenesis of the ventral pancreas. Dev. Camb. Engl. 131, 797–806 (2004).
Guo, Y. et al. CK2-induced cooperation of HHEX with the YAP-TEAD4 complex promotes colorectal tumorigenesis. Nat. Commun. 13, 4995 (2022).
Hickey, J. W. et al. Organization of the human intestine at single-cell resolution. Nature 619, 572–584 (2023).
Gehart, H. & Clevers, H. Tales from the crypt: new insights into intestinal stem cells. Nat. Rev. Gastroenterol. Hepatol. 16, 19–34 (2019).
Peterson, L. W. & Artis, D. Intestinal epithelial cells: regulators of barrier function and immune homeostasis. Nat. Rev. Immunol. 14, 141–153 (2014).
Jiang, H. et al. Cytokine/Jak/Stat signaling mediates regeneration and homeostasis in the Drosophila midgut. Cell 137, 1343–1355 (2009).
Duffy, J. B. GAL4 system in Drosophila: a fly geneticist’s Swiss army knife. Genes. N. Y. N 2000 34, 1–15 (2002).
Singh, S. R., Mishra, M. K., Kango-Singh, M. & Hou, S. X. Generation and Staining of Intestinal Stem Cell Lineage in Adult Midgut. Methods Mol. Biol. Clifton NJ 879, 47–69 (2012).
Buchon, N. et al. Morphological and molecular characterization of adult midgut compartmentalization in Drosophila. Cell Rep. 3, 1725–1738 (2013).
Hung, R.-J. et al. A cell atlas of the adult Drosophila midgut. Proc. Natl. Acad. Sci. USA. 117, 1514–1523 (2020).
Marianes, A. & Spradling, A. C. Physiological and stem cell compartmentalization within the Drosophila midgut. eLife 2, e00886 (2013).
Morgan, N. S., Skovronsky, D. M., Artavanis-Tsakonas, S. & Mooseker, M. S. The Molecular Cloning and Characterization of Drosophila melanogaster Myosin-IA and Myosin-IB. J. Mol. Biol. 239, 347–356 (1994).
Hu, Y. et al. FlyPrimerBank: an online database for Drosophila melanogaster gene expression analysis and knockdown evaluation of RNAi reagents. G3 Bethesda Md 3, 1607–1616 (2013).
Huang, J. et al. A reference human genome dataset of the BGISEQ-500 sequencer. GigaScience 6, 1–9 (2017).
Adams, M. D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185–2195 (2000).
Drosophila 12 Genomes Consortium. et al. Evolution of genes and genomes on the Drosophila phylogeny. Nature 450, 203–218 (2007).
Hunt, S. E. et al. Ensembl variation resources. Database 2018, bay119 (2018).
Virshup, I., Rybakov, S., Theis, F. J., Angerer, P. & Wolf, F. A. anndata: Access and store annotated data matrices. J. Open Source Softw. 9, 4371 (2024).
Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
Zeng, Z. et al. OmicVerse: a framework for bridging and deepening insights across bulk and single-cell sequencing. Nat. Commun. 15, 5983 (2024).
Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for single-cell transcriptomics. Nat. Methods 15, 1053–1058 (2018).
Thurmond, J. et al. FlyBase 2.0: the next generation. Nucleic Acids Res. 47, D759–D765 (2019).
Dann, E., Henderson, N. C., Teichmann, S. A., Morgan, M. D. & Marioni, J. C. Differential abundance testing on single-cell data using k-nearest neighbor graphs. Nat. Biotechnol. 40, 245–253 (2022).
CNGB Nucleotide Sequence Archive. https://doi.org/10.26036/CNP0007162 (2025).
Guo, X. et al. CNSA: a data repository for archiving omics data. Database J. Biol. Databases Curation 2020, baaa055 (2020).
Chen, F. Z. et al. CNGBdb: China National GeneBank DataBase. Yi Chuan Hered. 42, 799–809 (2020).
Tu, Z. et al. A single-cell transcriptomic dataset of enterocyte-specific HHEX knockdown in the Drosophila larval midgut. Figshare https://doi.org/10.6084/m9.figshare.28927709 (2025).
Micchelli, C. A. & Perrimon, N. Evidence that stem cells reside in the adult Drosophila midgut epithelium. Nature 439, 475–479 (2006).
Zettervall, C.-J. et al. A directed screen for genes involved in Drosophila blood cell activation. Proc. Natl. Acad. Sci. USA 101, 14192–14197 (2004).
Jin, Z. et al. The Drosophila Ortholog of Mammalian Transcription Factor Sox9 Regulates Intestinal Homeostasis and Regeneration at an Appropriate Level. Cell Rep. 31, 107683 (2020).
Meng, F. W., Rojas Villa, S. E. & Biteau, B. Sox100B Regulates Progenitor-Specific Gene Expression and Cell Differentiation in the Adult Drosophila Intestine. Stem Cell Rep. 14, 226–240 (2020).
Plygawko, A. T. et al. The Drosophila adult midgut progenitor cells arise from asymmetric divisions of neuroblast-like cells. Dev. Cell 60, 429–446.e6 (2025).
Goto, A. et al. A Drosophila haemocyte-specific protein, hemolectin, similar to human von Willebrand factor. Biochem. J. 359, 99–108 (2001).
Stephenson, H. N., Streeck, R., Grüblinger, F., Goosmann, C. & Herzig, A. Hemocytes are essential for Drosophila melanogaster post-embryonic development, independent of control of the microbiota. Development 149, dev200286 (2022).
Kurucz, E. et al. Hemese, a hemocyte-specific transmembrane protein, affects the cellular immune response in Drosophila. Proc. Natl. Acad. Sci. USA 100, 2622–2627 (2003).
Anderl, I. et al. Transdifferentiation and Proliferation in Two Distinct Hemocyte Lineages in Drosophila melanogaster Larvae after Wasp Infection. PLoS Pathog. 12, e1005746 (2016).
Kocks, C. et al. Eater, a transmembrane protein mediating phagocytosis of bacterial pathogens in Drosophila. Cell 123, 335–346 (2005).
Kroeger, P. T., Tokusumi, T. & Schulz, R. A. Transcriptional regulation of eater gene expression in Drosophila blood cells. Genesis 50, 41–49 (2012).
Zhang, B. & Sun, T. Transcription Factors That Regulate the Pathogenesis of Ulcerative Colitis. BioMed Res. Int. 2020, 7402657 (2020).
Dang, Y., Yu, J., Zhao, S., Cao, X. & Wang, Q. HOXA7 promotes the metastasis of KRAS mutant colorectal cancer by regulating myeloid-derived suppressor cells. Cancer Cell Int. 22, 88 (2022).
Guo, Z. & Ohlstein, B. Bidirectional Notch signaling regulates Drosophila intestinal stem cell multipotency. Science 350, aab0988 (2015).
Perdigoto, C. N., Schweisguth, F. & Bardin, A. J. Distinct levels of Notch activity for commitment and terminal differentiation of stem cells in the adult fly intestine. Development 138, 4585–4595 (2011).
Patel, P. H., Dutta, D. & Edgar, B. A. Niche Appropriation by Drosophila Intestinal Stem Cell Tumors. Nat. Cell Biol. 17, 1182–1192 (2015).
Acknowledgements
This work was supported by the National Key R&D Program of China (Grant No. 2022YFC3400300 to M.W.), the Shenzhen Science and Technology Innovation Program (Grant No. KQTD20180411143432337, China) and the Shenzhen Key Laboratory of Gene Regulation and Systems Biology (Grant No. ZDSYS20200811144002008) (to Y.H. and Q.H.), the Shenzhen Medical Research Fund (Grant No. D2401016 to Y.H.), the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2024A1515012343 to Q.H.), and the National Key R&D Program of China (Grant No. 2022YFC3400405 to X.X. and L.L.). We thank BioRender.com for graphical elements used in Fig. 1.
Author information
Contributions
Z.T., Q.H., M.W. and Y.H. conceived the idea; Z.T., Q.H., Z.J., Y.W., T.Y., L.L., X.X., M.W. and Y.H. supervised the work; Q.H., Y.W. and T.Y. prepared the samples; Z.T., Q.H., Z.J., Y.W. and T.Y. prepared the sequencing library and performed sequencing; Z.T. and Z.J. performed computational analysis; Z.T., Q.H. and M.W. interpreted the data; Q.H., M.W., L.L., X.X. and Y.H. acquired funding; Z.T., Q.H. and M.W. wrote the manuscript. All authors reviewed and edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Tu, Z., Hu, Q., Jia, Z. et al. A single-cell transcriptomic dataset of enterocyte-specific HHEX knockdown in the Drosophila larval midgut. Sci Data 12, 1983 (2025). https://doi.org/10.1038/s41597-025-06290-0
DOI: https://doi.org/10.1038/s41597-025-06290-0






