Introduction

The liver is a primary target or off-target organ for gene therapy using recombinant adeno-associated viral (rAAV) vectors [1]. The liver is composed of lobes, which are further divided into repeating structural units called lobules. Each lobule is subdivided into zones—periportal, intermediate, and pericentral—each with distinct metabolic functions. These zones exhibit a gradient of enzyme expression and metabolic activities, allowing different regions of the lobule to specialize in various biochemical processes such as gluconeogenesis, glycolysis, and detoxification [2]. In mice, the spatial distribution of rAAV gene therapy vectors is related to liver zonation and is influenced by both serotype and sex. Whereas rAAV2-based vectors preferentially transduce periportal regions, rAAV9 favors pericentral zones [3]. Overall, liver transduction is generally higher in male mice [4, 5].

These sex- and serotype-specific differences may stem from androgen-regulated pathways that influence rAAV entry, trafficking, genome processing, or immune modulation [5]. Additionally, differences in liver zonation-specific metabolism and AAV receptor/co-receptor expression may further contribute to these variations. The differences in liver transduction result in varying transgene expression levels in male and female mice, potentially causing distinct downstream effects and potential toxicities. In this regard, methods that link transgene expression with genome-wide transcriptomic changes in situ could help to better characterize AAV transduction and its associated transcriptomic alterations.

Here, we apply 10x Genomics Visium spatial transcriptomics (ST) and single-nucleus RNA sequencing (snRNA-seq) to comprehensively map rAAV2 and rAAV9 transduction in the mouse liver. Unlike classical methods for mapping rAAV biodistribution, such as immunohistochemistry (IHC) or in situ hybridization (ISH), ST uniquely enables spatial mapping of transgene expression to tissue structures and analysis of associated transcriptomic signatures. Furthermore, compared to panel-based, multiplex approaches in rAAV research, such as USeqFISH [6] or STARmap PLUS [7], which analyze 50 or 1000 genes, 10x Genomics Visium ST enables unbiased, whole-transcriptome profiling. We reasoned that this combination offers novel opportunities for hypothesis generation, specifically for understanding both the cellular responses to rAAV transduction and factors influencing rAAV entry.

In our proof-of-concept study, we applied Visium ST to delineate differentially expressed genes, pathway activity, and receptor expression in the liver of male and female mice transduced with rAAV2 or rAAV9 encoding enhanced green fluorescent protein (EGFP) under a cytomegalovirus (CMV) promoter (rAAV2-CMV-EGFP and rAAV9-CMV-EGFP). To cross-validate the findings and resolve them at the single-cell level, we performed snRNA-seq on the same liver samples.

Our study establishes 10x Genomics Visium as a powerful platform for spatially mapping transgene expression in the mouse liver. We demonstrate that liver transduction with rAAV2- and rAAV9-CMV-EGFP induces broad, zonation- and sex-independent modulation of the circadian clock in liver regions with high EGFP transgene expression (i.e., the alterations were broadly consistent across the periportal, intermediate, and pericentral zones of the liver lobule). Additionally, we uncover deregulation of lipid metabolism, replication stress responses, and innate antiviral immunity, with distinct sex-specific patterns. Through a comprehensive ST analysis, we also validate known rAAV2 and rAAV9 co-receptors and propose putative novel entry factors. These findings provide a proof-of-concept for using spatially resolved transcriptomics to elucidate molecular signatures relevant to gene therapy, establishing it as a powerful hypothesis-generating tool for understanding vector tropism, host responses, and identifying potential targets for optimizing AAV-based liver-directed therapies.

Material and methods

Mice

All studies were conducted with the approval of the cantonal veterinary authority of Basel-Stadt under license BS-3082 and in strict adherence to the Swiss federal regulations on animal protection. Mice were housed in groups of up to 5 in polycarbonate cages in a temperature- (22 ± 2 °C) and humidity-controlled (55 ± 15%) environment on a 12 h light cycle (06:00–18:00 h). The ventilation rate was more than 10 air exchanges per hour. Prior to study start, mice were acclimatized for at least 7 days. Twelve B6 Albino mice (B6N-Tyrc-Brd/Brd CrCrl; 6 male and 6 female) and 26 C57BL/6J mice (13 male and 13 female), 6–8 weeks of age, were supplied by Charles River Laboratories (Saint-Germain-Nuelles, France) and Charles River Laboratories (Sulzfeld, Germany), respectively, and randomly grouped (n = 2–3) in IVC cages. Mice were housed with ad libitum access to food (Kliba Nafag 3439 pelleted) and water.

Power calculation

The sample size rationale is supported by a power analysis for snRNA-seq [8]. Our calculations indicate that n = 3–4 per group yields a detection power of 0.93 at a significance level of 0.05. Importantly, differential gene expression analysis was performed using pseudobulk profiles per animal, region, and treatment (see sections for ST and snRNA-seq data analysis). By aggregating multiple spatial measurements from the same sample, this approach minimizes noise and increases reliability, ensuring that n = 3–4 is sufficient for robust statistical inference.

rAAV vectors and administration

All vectors (Table 1) were custom-produced by SignaGen (Frederick, MD). For luminescence In Vivo Imaging System (IVIS) readouts, rAAV2 (Cat #SL100801) and rAAV9 (Cat #SL100801), carrying EGFP linked to firefly luciferase (fLuc) via the Thosea asigna virus 2A peptide (T2A) (CMV-EGFP-T2A-fLuc), were delivered by the vendor at concentrations of 1.10 × 10¹³ vector genomes (VG)/ml and 3.68 × 10¹³ VG/ml, respectively. For snRNA-seq and histopathology, rAAV2 and rAAV9 carrying CMV-EGFP transgenes were delivered at 1.22 × 10¹³ VG/ml and 3.65 × 10¹³ VG/ml, respectively. All batches were pre-tested for endotoxin and confirmed to contain less than 20% empty capsids. The EGFP transgene contains 70 CpG dinucleotides within its 720 nucleotides. For i.v. injection, vectors were formulated in sterile PBS (Cat #10010023, Gibco, Thermo Fisher Scientific, Carlsbad, CA) at 5 × 10¹¹ VG per mouse. Mice were placed in a restrainer, and 5 × 10¹¹ VG were injected intravenously in a volume of 100 μl through the tail vein. For vector administration, mice were weight-matched and randomly placed in groups of 4.
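The dosing arithmetic implied above can be checked with a short sketch. It assumes that each 100 μl injection is prepared by diluting vendor stock directly in PBS; the function name and the single-step dilution are illustrative assumptions, not a description of the actual bench procedure.

```python
def dilution_volumes(stock_vg_per_ml, dose_vg, injection_volume_ul):
    """Return (stock_ul, diluent_ul) for one injection.

    Hypothetical helper: assumes a single-step dilution of stock in PBS.
    """
    stock_ul = dose_vg / stock_vg_per_ml * 1000.0  # convert ml to microlitres
    if stock_ul > injection_volume_ul:
        raise ValueError("stock is too dilute for the requested dose and volume")
    return stock_ul, injection_volume_ul - stock_ul


# rAAV2-CMV-EGFP stock at 1.22e13 VG/ml, dose of 5e11 VG in 100 microlitres
stock_ul, pbs_ul = dilution_volumes(1.22e13, 5e11, 100.0)
print(f"{stock_ul:.1f} ul stock + {pbs_ul:.1f} ul PBS")
```

For the more concentrated rAAV9 stock (3.65 × 10¹³ VG/ml), the same call gives a correspondingly smaller stock volume per injection.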

Table 1 rAAV vectors used.

In vivo imaging system

IVIS was performed non-blinded on days 13, 22 and 27. The B6 albino mice were subdivided into groups of 4 (2 males and 2 females) and injected i.p. with IVISbrite™ D-Luciferin Ultra Bioluminescent Substrate in RediJect™ Solution (Cat #770505, Perkin Elmer, Waltham, MA) at 150 μg per gram of body weight. After 10 min, they were placed in an isoflurane induction chamber. The flow was maintained at 2.0 l/min at 4%. After 5 min, the animals were placed inside the IVIS Spectrum system (Perkin Elmer) in a supine position, and the anesthesia was reduced to 2–3%. Subsequently, images were acquired at FOV 22.8 cm (position D) using Auto Mode. The analysis was performed in Living Image software (Perkin Elmer). Elliptical ROIs were drawn in the liver area. Radiance values (p/s/cm²/sr), calculated pixel by pixel, were used to generate color-coded maps of luminescence. Total Flux (p/s) values, calculated in Living Image by summing radiance over the manually defined liver ROI, were reported in the graphs.

Tissue sampling

On day 28, animals were sacrificed, and livers were collected. Parts of the left lateral lobe were either fixed in neutral buffered 10% formalin for 24 h and processed for histology according to standard procedures or frozen in Optimal Cutting Temperature (OCT) using dry ice and stored at –80 °C until further processing.

Histopathology

Chromogenic ISH on fresh frozen and FFPE sections

Slides with 7 μm fresh frozen tissue sections were fixed in neutral buffered 10% formalin (Cat #9713.5000, VWR International, LLC., Radnor, PA) for 15 min, followed by dehydration in increasing ethanol concentrations (50%, 70%, 2 × 100%; 5 min each) at room temperature. The slides were air-dried and subsequently transferred to the Ventana Discovery Ultra instrument (Ventana Medical Systems, Inc., Tucson, AZ, USA). mRNA expression of the targets was determined using the RNAscope VS Universal AP assay (Advanced Cell Diagnostics (ACD), Inc., Newark, CA) with the DISCOVERY mRNA RED, Amplification & Pretreatment PTO kit (Roche Diagnostics, Basel, CH, Cat #07074654001). ISH probes were all acquired from ACD and included EGFP (Cat #400289) as well as the negative control DapB (Cat #312039) and the positive control Mm-PPIB (Cat #313919). The sections were counterstained with hematoxylin and bluing reagent for nuclei detection.

Formalin-fixed, paraffin-embedded (FFPE) sections were cut at 3.5 μm, and EGFP and EGFP-sense detection was performed using the RNAscope VS Universal HRP Detection reagents (Cat #323210, ACD), the Ventana Discovery mRNA Purple Detection Kit (Cat #760-255, Ventana Medical Systems, Inc., Tucson, AZ, USA) and the Discovery mRNA Probe Amplification Kit (Cat #760-222, Ventana Medical Systems, Inc.). The sections were counterstained with hematoxylin and bluing reagent for nuclei detection.

IHC on FFPE sections

IHC was performed on FFPE liver tissue using the Ventana Discovery Ultra instrument. Briefly, slides were deparaffinized and antigen retrieval was performed using a citrate-based buffer (pH 6.5) at 95 °C for 32 min. EGFP was detected using a rabbit polyclonal GFP antibody (0.12 μg/ml, 37 °C, 20 min, Cat #ab6556, Abcam Limited, Cambridge, UK) and an OmniMap anti-Rabbit HRP secondary antibody (Cat #760-4311, Ventana Medical Systems). Chromogenic detection was performed using the Discovery Purple Kit (Cat #760-229, Ventana Medical Systems). Sections were counterstained with hematoxylin and bluing reagent for nuclei detection.

Quantification of ISH and IHC on FFPE

Scanning was performed using a NanoZoomer S 360 C13220 series (Hamamatsu Photonics K.K., Hamamatsu City, Japan) at 40x magnification. The .NDPI image files were uploaded to HALO (Indica Labs, Albuquerque, NM). To evaluate the percentage of positive cells in IHC and ISH staining, a pre-trained nuclear segmentation algorithm was first applied and fine-tuned for our study. Subsequently, the ISH and IHC modules in HALO were adjusted to improve their performance for the respective staining. After nuclear detection, the cytoplasm was defined by nuclear extension. EGFP mRNA was quantified using the cytoplasmic signal from the EGFP ISH probe. For the EGFP protein, the percentage of EGFP IHC-positive cells was determined. In all cases, data were segregated into (1) control, rAAV2, and rAAV9 and (2) male and female.

Visium ST sample preparation

For RNA quality assessment, RNA from three random animals per group was extracted using the Arcturus® PicoPure® RNA Isolation Kit (Cat #KIT0204, Applied Biosystems™, Waltham, MA). For cell lysis, 2–10 μm sections of the sample were resuspended in 200 μl extraction buffer. Total RNA was extracted following the instructions in the manual. RNA integrity number (RIN) was assessed using the 2100 Bioanalyzer system (Agilent Technologies, Inc.) with an Agilent RNA 6000 Nano Kit (Cat # 5067-1511, Agilent Technologies, Inc.). Samples with RIN above 7.0 were used.

Tissue optimization was carried out according to the manufacturer’s instructions (Visium Spatial Tissue Optimization User Guide, Rev C. 10x Genomics, Pleasanton, CA). Image acquisition was performed on the Hamamatsu NanoZoomer S 360 C13220 series (Hamamatsu Photonics K.K.) at 40x magnification, and the coverslip was removed afterwards by immersing the slide in a 3x Saline-Sodium Citrate buffer (Cat #S66391L, Sigma-Aldrich, Merck KGaA, Darmstadt, Germany). The stained tissue sections were permeabilized using a time course to test for the optimal permeabilization time. After performing fluorescent cDNA synthesis, the tissue was removed. Finally, the fluorescent cDNA was imaged using a Zeiss Axio Scan.Z1 (Zeiss, Oberkochen, Germany) with a Plan Apochromat 20x/0.8 M objective, an ET-Gold FISH filter (ex 538-551 nm/em 556-560 nm) and 100 ms exposure time.

For the gene expression analysis, 10 μm-thick sections of the samples were placed with a random non-blinded distribution on chilled 10x Genomics Visium Gene Expression slides. From each group (control, rAAV2-CMV-EGFP and rAAV9-CMV-EGFP), 3 males and 3 females were included. The sections were stained with H&E and subsequently imaged as described above. To release the mRNA, the sections were permeabilized for 12 min as defined by tissue optimization. For further processing, the cDNA was amplified according to the manufacturer’s protocol (CG000239_Visium Spatial Gene Expression User Guide_RevC). Dual-indexed libraries were prepared. The libraries were quality controlled using a 2100 Bioanalyzer system with an Agilent High Sensitivity DNA Kit (Cat #5067-4626, Agilent Technologies, Inc., Santa Clara, CA) and quantified with a Qubit™ 1X dsDNA HS Assay Kit (Cat #Q33230, Invitrogen) on a Qubit 4 Fluorometer (Cat #Q33238, Invitrogen, Waltham, CA). The libraries were loaded onto the NovaSeq 6000 (Illumina, Inc., San Diego, CA) at a concentration of 150 pM. A NovaSeq SP v 1.5, S1 v 1.5 or S2 v 1.5 Reagent Kit (100 cycles) (Cat #20028319 and #20028401, Illumina, Inc) was used. For paired-end dual-indexed sequencing, the following read protocol was used: read 1: 28 cycles; i7 index read: 10 cycles; i5 index read: 10 cycles; and read 2: 90 cycles. All libraries were sequenced at a minimum of 50000 reads per covered spot.

Raw sequencing data were demultiplexed using the mkfastq function from Space Ranger (v. 1.2.0). Demultiplexed data were mapped to a custom reference with spaceranger count. The custom reference was generated by adding the vector sequence to the mouse reference mm10 using spaceranger mkref. Spots under tissue folds, artifacts and tissue areas with insufficient morphological quality were manually removed using the 10x Genomics Loupe browser (v. 5.1.0).

ST data analysis

Data preprocessing and quality control

SpaceRanger output was pre-processed and stored in anndata objects using the Scanpy [9] and Squidpy [10] packages. Upon examination of the overall spatial distribution of counts and detected genes, low-quality spots were discarded following an approach similar to the one proposed by Ben-Moshe and colleagues [11]. In short, spots whose total Unique Molecular Identifier (UMI) count was more than two Median Absolute Deviations (MAD) below or more than three MAD above the mean UMI count across all spots, computed individually for each sample, were filtered out. Similarly, spots with a fraction of mitochondrial gene counts more than four MAD above the slide mean were excluded from the analysis. Additionally, spots under tissue folds, artifacts, and those with insufficient morphological quality were removed. RNA counts per spot for each sample were normalized and log-transformed. The anndata objects from the different samples were then concatenated, and Pearson’s correlation coefficient was calculated between each gene and the transgene.
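The MAD-based spot filter can be illustrated with a minimal standard-library sketch. It assumes that the MAD is the median of absolute deviations from the median and that the thresholds are centered on the per-sample mean, as the text states; the function names and toy values are hypothetical, and the original analysis operated on anndata objects rather than plain lists.

```python
from statistics import mean, median


def mad(values):
    """Median absolute deviation around the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


def keep_spot_mask(umi_counts, mito_fractions,
                   n_mad_low=2.0, n_mad_high=3.0, n_mad_mito=4.0):
    """Return one boolean per spot: True if the spot passes all QC thresholds."""
    umi_low = mean(umi_counts) - n_mad_low * mad(umi_counts)
    umi_high = mean(umi_counts) + n_mad_high * mad(umi_counts)
    mito_high = mean(mito_fractions) + n_mad_mito * mad(mito_fractions)
    return [
        umi_low <= umi <= umi_high and mito <= mito_high
        for umi, mito in zip(umi_counts, mito_fractions)
    ]


# Toy sample: two spots have outlying UMI counts, one has a high mito fraction.
umi = [1000, 1020, 980, 1010, 990, 1300, 700]
mito = [0.010, 0.012, 0.008, 0.011, 0.030, 0.010, 0.010]
mask = keep_spot_mask(umi, mito)
```

Because the thresholds are computed per sample, the function would be called once for each Visium capture area rather than on the concatenated dataset.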

Allocating liver zonation regions

The pericentral and periportal regions of the mouse liver were delineated by leveraging established markers [2, 12, 13, 14]. Specifically, gene expression of pericentral markers (Glul, Cyp2e1, Oat, Slc1a2, Cyp1a2) and periportal markers (Sds, Cyp2f2, Hal, Hsd17b13, Alb, Arg1, Pck1) was used to compute a per-sample score using Scanpy’s scanpy.tl.score_genes function. Spots with scores in the top 80th percentile of the per-sample distribution were assigned to the corresponding pericentral or periportal category; the remaining spots were labeled as intermediate.
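The zone assignment can be sketched as follows, starting from precomputed pericentral and periportal scores per spot. Two points are assumptions made for illustration only: the cutoff is read as "score above the 80th percentile of the per-sample distribution", and a spot exceeding both cutoffs is given to the zone whose cutoff it exceeds by the larger margin. The text does not specify either detail.

```python
def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def assign_zones(pericentral_scores, periportal_scores, q=80.0):
    """Label each spot as pericentral, periportal, or intermediate."""
    pc_cut = percentile(pericentral_scores, q)
    pp_cut = percentile(periportal_scores, q)
    labels = []
    for pc, pp in zip(pericentral_scores, periportal_scores):
        is_pc, is_pp = pc > pc_cut, pp > pp_cut
        if is_pc and is_pp:
            # Hypothetical tie-break: larger margin above the cutoff wins.
            labels.append("pericentral" if pc - pc_cut >= pp - pp_cut else "periportal")
        elif is_pc:
            labels.append("pericentral")
        elif is_pp:
            labels.append("periportal")
        else:
            labels.append("intermediate")
    return labels


pc = [0.9, 0.1, 0.2, 0.1, 0.0, 0.3, 0.2, 0.1, 0.2, 0.1]
pp = [0.0, 0.8, 0.1, 0.2, 0.1, 0.2, 0.1, 0.3, 0.1, 0.2]
labels = assign_zones(pc, pp)
```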

Deconvolution

We performed cell type deconvolution of the ST data using the snRNA-seq data as the reference dataset. In our snRNA-seq reference, each nucleus was assigned a cell type label as described in the snRNA-seq analysis section. Nuclei annotated as “B T cell doublet” or “mixed” were discarded for this analysis. Cell type annotations were transferred from the snRNA-seq reference to each spot in the ST data by comparing spot-level expression profiles to those of the annotated single-nucleus reference. Deconvolution was carried out using two well-established methods, RCTD [15] and cell2location [16], to ensure that results were consistent across different approaches. Although both methods yielded similar estimates of the cell type composition, we present the findings from RCTD in the results section, because small differences in several immune subtypes (in particular within T cell fractions) were more aligned with the pathologist’s expectations for this tissue.

The RCTD analysis was performed with default parameters and in “full mode” to enable modeling of mixtures of multiple cell types per spot. The output of RCTD is a probability distribution of cell type mixtures for each spot.

For cell2location, we created the signature from our snRNA-seq data with the following parameters: cell_count_cutoff = 25, cell_percentage_cutoff2 = 0.1, and nonz_mean_cutoff = 1.25. This resulted in 8064 genes from 146663 nuclei on which the regression model was trained for 400 epochs using mouse sex as a categorical covariate. Using this single-cell level signature, the cell2location model was then trained for 50000 epochs using the parameters N_cells_per_location = 10 and detection_alpha = 100. The final output of cell2location is a spot-by-cell-type matrix of estimated cell type abundances, which yielded similar results to the RCTD probabilities.

Pseudo-bulk and differential gene expression analysis

Raw counts from the ST dataset were summed per sample for each liver region (pericentral, periportal, and intermediate) to generate pseudo-bulk profiles using the decoupleR package [17]. The counts were then normalized to counts per million (CPM), log transformed, and scaled to a maximum of 10. Genes with fewer than 10 counts per sample or 40 reads across all samples were not included in the analysis.
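As an illustration of the aggregation step, the sketch below sums raw counts per sample and zone and converts each pseudo-bulk profile to log-transformed CPM. It is a standard-library stand-in for what decoupleR and Scanpy do on anndata objects; the data layout and function names are assumptions, the use of log1p is an assumption about the log transform, and the scaling and gene-filtering steps are omitted.

```python
import math
from collections import defaultdict


def pseudobulk(spots):
    """Sum raw counts per (sample, zone).

    spots: iterable of (sample, zone, {gene: raw_count}) tuples.
    """
    profiles = defaultdict(lambda: defaultdict(int))
    for sample, zone, counts in spots:
        for gene, count in counts.items():
            profiles[(sample, zone)][gene] += count
    return {key: dict(genes) for key, genes in profiles.items()}


def log_cpm(profile):
    """Counts per million followed by log(1 + x)."""
    total = sum(profile.values())
    return {gene: math.log1p(count / total * 1e6) for gene, count in profile.items()}


spots = [
    ("m1", "pericentral", {"Glul": 30, "Alb": 70}),
    ("m1", "pericentral", {"Glul": 20, "Alb": 80}),
    ("m1", "periportal", {"Alb": 100}),
]
profiles = pseudobulk(spots)
normalized = {key: log_cpm(profile) for key, profile in profiles.items()}
```

Summing before normalizing keeps each animal-by-zone profile as one observation, which is what allows the downstream test to treat animals, not spots, as replicates.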

Using PyDESeq2 [18] with Benjamini-Hochberg false discovery rate correction, differential gene expression analysis was performed between pseudobulk profiles for the different conditions (liver zone, rAAV treatment, and sex). The apeglm shrinkage method [19] was used to shrink log2 fold changes (log2FC), and differentially expressed genes were classified based on adjusted p-values (adjusted p-value < 0.05) and absolute log2FC (>0.5).
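The multiple-testing correction and the classification thresholds can be made concrete with a small sketch. PyDESeq2 performs the Benjamini-Hochberg adjustment internally; the standalone function below only illustrates the procedure, and the "up"/"down"/"ns" labels are illustrative names rather than terms from the analysis.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


def classify_gene(log2fc, padj, fc_cutoff=0.5, alpha=0.05):
    """Apply the thresholds from the text: padj < 0.05 and |log2FC| > 0.5."""
    if padj < alpha and log2fc > fc_cutoff:
        return "up"
    if padj < alpha and log2fc < -fc_cutoff:
        return "down"
    return "ns"


padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
calls = [classify_gene(fc, p) for fc, p in zip([1.2, -0.8, 0.3, 2.0], padj)]
```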

Functional characterization

We estimated pathway activity per spot using PROGENy with a multivariate linear model as implemented in decoupleR (decoupler.run_mlm). The PROGENy model incorporates 14 pathways: Wnt, VEGF, Trail, TNFα, TGF-β, PI3K, p53, NFkB, MAPK, JAK/STAT, Hypoxia, Estrogen, Androgen, and EGFR. The model considers the expression of the genes that are most responsive to perturbations of those pathways [20]. In our analysis, we ran PROGENy using the top 500 most responsive genes for each pathway. Following the same procedure, the Wald statistic from the differential gene expression analysis on the pseudo-bulk profiles was used to estimate differences in pathway activities between conditions. A p-value < 0.05 was considered significant for changes in pathway activity.

Transcription factor activity per spot was determined using a univariate linear model (decoupler.run_ulm) on regulons derived from the CollecTRI resource [21]. The Wald statistic from the differential gene expression analysis on the pseudo-bulk profiles was utilized to infer differences in transcription factor activity between conditions using the same procedure. A p-value < 0.05 was considered significant for changes in transcription factor activity. The top 35 transcription factors with the most activity, regardless of condition, are displayed in the heatmap in the results section for comparison with the control.
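The idea behind the univariate linear model can be sketched in a few lines: for one transcription factor, the gene-level statistics are regressed on the regulon weights (zero for non-targets), and the t-value of the slope serves as the activity score. This is a simplified, standard-library illustration of the principle and not decoupleR's implementation; the gene names and weights are toy values.

```python
import math


def ulm_score(stat_by_gene, weight_by_target):
    """t-value of the slope when regressing gene statistics on regulon weights."""
    genes = sorted(stat_by_gene)
    x = [weight_by_target.get(gene, 0.0) for gene in genes]  # non-targets get 0
    y = [stat_by_gene[gene] for gene in genes]
    n = len(genes)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("weights and statistics must both vary across genes")
    r = sxy / math.sqrt(sxx * syy)
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    # For simple regression, the slope t-value depends only on r and n.
    return r * math.sqrt((n - 2) / (1.0 - r * r))


wald = {"A": 3.0, "B": 2.5, "C": -2.0, "D": 0.1, "E": -0.2, "F": 0.3}
activator_regulon = {"A": 1.0, "B": 1.0, "C": -1.0}
score = ulm_score(wald, activator_regulon)
```

A positive score indicates that activated targets tend to be up and repressed targets down in the contrast; flipping the sign of every weight flips the sign of the score.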

The Gene Ontology Biological Process (GO:BP) [22] and Kyoto Encyclopedia of Genes and Genomes (KEGG) [23] gene sets were used as a reference to conduct gene set enrichment analysis (GSEA) on the differential gene expression analysis results on the pseudo-bulk profiles (decoupler.get_gsea_df with default parameters). Changes in a specific term (i.e., biological process or pathway) were considered significant if they showed an adjusted p-value < 0.05 after Benjamini-Hochberg correction for multiple hypothesis testing. The Scanpy scanpy.tl.score_genes function was employed to calculate a per-spot score using the combined gene expression of Nr1d1 and Nr1d2 (selected for “RHYTHMIC_BEHAVIOR”), and Eci2 and Ech1 (selected for “FATTY_ACID_CATABOLIC_PROCESS” and “FATTY_ACID_BETA_OXIDATION”). These genes were common leading-edge genes identified across tested conditions for these terms.

Inference of factors modulating rAAV entry

To investigate potential factors mediating rAAV entry into the cell, we employed LIANA+ [24] with the goal of identifying spatially co-expressed transgene-receptor pairs. To do this, we first defined the parameters of the spatial connectivity by selecting a Gaussian kernel with bandwidth = 150 and cutoff = 0.1. We then computed the spatially weighted cosine similarity as a local measure of co-expression and the global bivariate Moran’s R to quantify global spatial autocorrelation. We assessed statistical significance using 1000 permutations and filtered out interactions where either the ligand or the receptor was expressed in less than 20% of the spots. As potential receptors, we included the genes encoding well-known receptors from the literature that mediate rAAV entry: AU040320, Cd9, Fgfr1, Hspg2, Itgav, Itgb1, Itgb5, Met, and Rpsa. Additionally, we extracted a list of receptors from CellCommuNet [25] and, to obtain receptors relevant to the mouse liver, filtered by Mus musculus, normal condition, single-study type, and liver tissue. We also extracted potential receptors from UniProt [26], selecting genes annotated with cell membrane localization (SL000039). It should be noted that Moran’s R is computed per sample; therefore, we subsequently averaged the scores for the samples belonging to the same condition. We defined relevant genes as those with a significant (p < 0.01) average Moran’s R > 0.025 with the transgene, a threshold chosen to ensure that detected interactions exhibit true spatial autocorrelation (i.e., localized patterns) rather than being randomly or ubiquitously distributed. Finally, for visualization purposes, we selected the top 10 receptors exhibiting the largest global Moran’s R with the transgene.
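To make the spatial co-expression measure more tangible, the sketch below builds Gaussian-kernel weights between spots and computes a global bivariate Moran's R with a one-sided permutation test. It is an illustration under stated assumptions and not the LIANA+ implementation: the kernel form exp(-d²/(2·bandwidth²)), the exclusion of self-weights, the textbook normalization of R, and the one-sided p-value are all choices made here, and all function names are hypothetical.

```python
import math
import random


def gaussian_weights(coords, bandwidth=150.0, cutoff=0.1):
    """Dense spot-by-spot weight matrix; weights below the cutoff are set to 0."""
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            if i == j:
                continue  # assumption: a spot is not its own neighbour
            dist_sq = (xi - xj) ** 2 + (yi - yj) ** 2
            kernel = math.exp(-dist_sq / (2.0 * bandwidth ** 2))
            if kernel >= cutoff:
                weights[i][j] = kernel
    return weights


def bivariate_moran(x, y, weights):
    """Global bivariate Moran's R between variables x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dx[i] * dy[j] for i in range(n) for j in range(n))
    scale = math.sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    return (n / total_weight) * cross / scale


def permutation_pvalue(x, y, weights, n_perm=1000, seed=0):
    """One-sided p-value: how often a shuffled y gives R >= the observed R."""
    rng = random.Random(seed)
    observed = bivariate_moran(x, y, weights)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if bivariate_moran(x, shuffled, weights) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

In this framing, x would be the transgene expression per spot and y the expression of a candidate receptor gene; the per-sample R values would then be averaged per condition as described above.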

Nuclei isolation

Nuclei were isolated from fresh frozen liver samples. Samples from control animals (3 males and 3 females) and from rAAV2- or rAAV9-CMV-EGFP-treated animals (4 males and 4 females per vector) were processed. One rAAV9-CMV-EGFP-treated female was processed in two replicates. For nuclei isolation, animals were distributed randomly and without blinding into groups of 4. Tissue samples of 17–35 mg were dissociated using a Singulator 100 (S2 Genomics, Inc., Livermore, CA). Nuclei suspensions were centrifuged at 500 g for 5 min at 4 °C and resuspended in Nuclei Storage Reagent (Cat #100-058-874, S2 Genomics, Inc.) supplemented with 0.2 U/μl RNaseOUT™ Recombinant Ribonuclease Inhibitor (Cat #10777019, Thermo Fisher Scientific). Nuclei were stained with NucBlue™ Fixed Cell ReadyProbes™ Reagent (Cat #R37606, Thermo Fisher Scientific) containing 4’,6-diamidine-2’-phenylindole dihydrochloride (DAPI) at one drop per 500 μl. After 30 min incubation on ice, nuclei were strained through 35 μm cell strainers prior to fluorescence-activated cell sorting (FACS) using a FACSAria™ Fusion Flow Cytometer (BD Biosciences, San Jose, CA, USA). Cell strainers were pre-wet with 1% bovine serum albumin (BSA, Cat #A9576 50 mL, Sigma) in PBS. Nuclei were identified by size and signal intensity and sorted into a pre-coated tube containing 1% BSA in PBS. Subsequently, nuclei were centrifuged at 500 g for 5 min at 4 °C and resuspended in PBS containing 0.04% BSA using pre-coated filter tips. Nuclei were manually counted using a hemocytometer and used for snRNA-seq.

Single-nucleus RNA sequencing

For the construction of snRNA-seq libraries, Chromium Next GEM Single Cell 3’ Kit v3.1 (10x Genomics) was used according to the manufacturer’s instructions. 8000 sorted nuclei were loaded onto a 10x Genomics Chromium Next GEM Chip G and processed immediately in a 10x Genomics Chromium controller. Amplification of cDNA was performed during 12 PCR cycles. Libraries were quantified with a Qubit™ 1X dsDNA HS Assay Kit (Cat #Q33230, Invitrogen) on a Qubit 4 Fluorometer (Cat #Q33238, Invitrogen). Fragment size was assessed with a TapeStation High Sensitivity D5000 Screen Tape (Agilent Technologies, Inc.). Libraries were pooled and loaded at a concentration of 200 pM and sequenced using a NovaSeq 6000 (Illumina, Inc., San Diego, CA) and S4 flow cell (Cat #20028313, 200 cycle kit) with a read one length of 28 cycles, and a read two length of 90 cycles. All libraries were sequenced at a minimum of 25000 reads per covered cell.

Raw sequencing data were demultiplexed using the mkfastq function from CellRanger (v. 6.0.2, 10x Genomics). Using CellRanger count, demultiplexed data were mapped to a custom reference that included the vector sequence and was generated with the mkref function.

Single-nucleus RNA sequencing data analysis

Data preprocessing and quality control

CellBender, a variational inference framework used to estimate and remove ambient RNA contamination from droplet-based single-cell (or snRNA-seq) data, was employed to mitigate technical artifacts and ambient RNA signals prior to downstream analysis [27]. CellBender was employed with default parameters for fpr (0.01) and epochs (150) for all the samples, while the expected number of cells and the learning rate were adjusted for each sample to prevent over-correction.

Post–ambient RNA removal, nuclei were subjected to a stringent filtering process based on UMIs, number of detected genes, and percentage of mitochondrial transcripts. Specifically, we excluded nuclei with <1000 or >40000 counts, <400 or >8000 genes, or >1% mitochondrial reads.
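Expressed as a predicate, the nucleus-level thresholds read as follows. The function and argument names are hypothetical; in practice such a filter would typically be applied to the per-nucleus QC columns of an anndata object.

```python
def passes_nucleus_qc(n_counts, n_genes, pct_mito):
    """True if a nucleus lies within all thresholds stated in the text."""
    return (
        1000 <= n_counts <= 40000   # total UMI counts
        and 400 <= n_genes <= 8000  # detected genes
        and pct_mito <= 1.0         # percent mitochondrial reads
    )


# Each tuple: (total counts, detected genes, percent mitochondrial reads)
nuclei = [(5000, 2000, 0.5), (900, 600, 0.2), (5000, 2000, 1.5), (50000, 7000, 0.1)]
kept = [passes_nucleus_qc(*nucleus) for nucleus in nuclei]
```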

Downstream analysis and cell type annotation

The Besca toolkit, an automated single-cell analysis pipeline implemented in Python and built upon Scanpy, was employed for clustering and subsequent interpretative analyses [9, 28]. Besca streamlines typical sc/snRNA-seq analysis steps, including normalization, feature selection, dimensionality reduction, clustering, and cell type annotation. Briefly, RNA counts were normalized per 10000 (cp10k), the top highly variable genes were selected for each sample, and a principal component analysis (PCA) was performed. The first 50 principal components (PCs) were used for nearest neighbor (n_neighbors = 10) calculations and Leiden clustering (resolution = 1.5). To visualize the data, uniform manifold approximation and projection (UMAP) embeddings were computed from the PCA space.

Cell annotation was performed using Besca’s cell annotation workflow and the sig-annot module, with the nomenclature following cell ontology (CL) conventions wherever possible. Cell type annotation was performed using Besca markers, along with additional markers detailed in Table S3. The UMAP in Fig. S1g shows a coarse-grained version of this annotation (Table S4). The detailed cell type annotations from Table S3 were used for deconvolution.

Pseudo-bulk and differential gene expression analysis

Raw counts from the snRNA-seq dataset were summed independently for each sample, for both periportal and pericentral annotated hepatocytes, to generate pseudo-bulk profiles using the decoupleR package [17]. The counts were then normalized to CPM, log transformed, and scaled to a maximum of 10. Genes with fewer than 10 counts per sample or 15 reads across all samples were excluded from the analysis.

Differential gene expression analysis was performed between pseudobulk profiles following the same procedure described above for the ST data.

Functional characterization

As described in the corresponding ST section, the Wald statistic from the differential gene expression analysis on the pseudo-bulk profiles of pericentral and periportal hepatocytes was used to estimate differences in pathway and transcription factor activities between conditions. A p-value < 0.05 was considered significant for changes in pathway or transcription factor activity.

GSEA was performed on the differential gene expression analysis results derived from the pseudo-bulk profiles of pericentral and periportal hepatocytes. This analysis followed the same methodology as described for the ST dataset. Significant changes within a specific term (e.g., biological process or pathway) were determined using a Benjamini-Hochberg adjusted p-value threshold of less than 0.05 to account for multiple hypothesis testing.

Statistical analysis

Statistical methods for each analysis are detailed in their respective sections. Pearson correlation coefficients were computed using the numpy.corrcoef function from the NumPy Python package. We compared average gene expression levels between male and female groups across zonation regions within each experimental condition using Welch’s t-test (scipy.stats.ttest_ind).
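For reference, Welch's t statistic and its Welch-Satterthwaite degrees of freedom can be computed directly, as in the standard-library sketch below. The p-value requires the t-distribution and is what scipy.stats.ttest_ind returns when called with equal_var=False; the sample values here are toy numbers, not data from the study.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test."""
    var_a = variance(group_a) / len(group_a)  # squared standard error of mean A
    var_b = variance(group_b) / len(group_b)  # squared standard error of mean B
    t_stat = (mean(group_a) - mean(group_b)) / math.sqrt(var_a + var_b)
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(group_a) - 1) + var_b ** 2 / (len(group_b) - 1)
    )
    return t_stat, dof


males = [2.0, 4.0, 6.0]
females = [1.0, 2.0, 3.0]
t_stat, dof = welch_t(males, females)
```

Unlike Student's t-test, this form does not pool the variances, which is the safer choice when male and female groups differ in spread.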

The figures, including heatmaps, spatial maps, scatter plots, bar plots and dot plots, were created using Matplotlib [29], Scanpy and Seaborn [30]. Dot plots showing quantification of ISH, IHC and snRNA-seq were generated using GraphPad Prism v10.1.2 (GraphPad Software, Boston, Massachusetts, USA).

Ethical approval

As described in the material and methods section, all studies were conducted with the approval of the cantonal veterinary authority of Basel-Stadt under license BS-3082 and in strict adherence to the Swiss federal regulations on animal protection.

Results

Spatial transcriptomics maps rAAV biodistribution

We hypothesized that the 10x Genomics Visium ST platform represents a novel method for mapping rAAV vectors to tissue-specific hallmarks and assessing corresponding transcriptomic signatures at the whole-genome level. To validate our approach, we analyzed liver samples from male and female mice transduced with either 5 × 10¹¹ VG rAAV2- or rAAV9-CMV-EGFP, along with PBS-injected control samples. For rAAV2- or rAAV9-serotypes, distinct spatial liver distribution patterns have been described [3]. We complemented our ST analysis with snRNA-seq and confirmed transgene expression using ISH (Fig. 1a, Methods).

Fig. 1: Spatial transcriptomics deciphers rAAV-CMV-EGFP liver biodistribution.

a Experimental setup. Created in BioRender. https://BioRender.com/q78t856 b Spatial mapping of the transgene expression in the liver of an rAAV9-CMV-EGFP-treated male mouse. The color bar indicates the level of normalized transgene expression. c Pearson’s correlation between transgene abundance and gene expression across rAAV9-CMV-EGFP and rAAV2-CMV-EGFP-treated male and female liver samples. The correlation coefficient was jointly computed for all the samples of the same condition. Known periportal and pericentral zonation markers are highlighted. d Spatial mapping of the predicted zones in the liver of a male, rAAV9-CMV-EGFP-treated mouse. e Average transgene expression for rAAV9- and rAAV2-CMV-EGFP per zone in male and female mice. Black bars represent the standard deviation per condition. Statistical analysis to compare males and females was performed using Welch’s t-test (Methods); p-values are indicated. f Percentage of transgene-positive nuclei in the single-nucleus RNA sequencing (snRNA-seq) data for male and female rAAV2-, rAAV9-CMV-EGFP-treated and control animals. Each dot represents one individual with the mean represented by horizontal lines. Only animals with corresponding spatial transcriptomics, in situ hybridization and EGFP immunohistochemistry data are displayed. g In situ hybridization (ISH) using an EGFP antisense probe maps the vector in liver tissue. Graph shows quantification of ISH as percent cells showing cytoplasmic ISH signals, as well as both cytoplasmic and nuclear ISH signals, corresponding to RNA expression. Cells showing exclusively nuclear ISH signals, suggestive of rAAV DNA, were not included. Each dot represents one individual with the mean represented by horizontal lines. h EGFP antisense ISH of the liver of an rAAV2-CMV-EGFP-treated male animal. Scale bar 100 μm. i EGFP antisense ISH of the liver of an rAAV9-CMV-EGFP-treated male animal. Scale bar 100 μm.

ST detected the transgene across the whole liver section (Fig. 1b). We employed Pearson correlation analysis to identify genes that are positively associated with the EGFP transgene, revealing numerous liver zonation markers (Fig. 1c). To examine the zonal distribution of the transgene, the different liver zones were assigned using conserved markers (Figs. 1d and S1a–f, Methods) [2, 12,13,14]. Zone classifications were assessed by a pathologist based on morphological features and further validated by using known zonation markers (Fig. S1a–d). Moreover, we applied deconvolution, a widely used computational technique for estimating cell type proportions in ST data, to map periportal and pericentral hepatocytes from the snRNA-seq data to the tissue [15]. Periportal and pericentral hepatocytes aligned with periportal and pericentral zones, respectively (Fig. S1e, f, Methods).
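The correlation step described above can be illustrated with a minimal sketch. It assumes per-spot normalized expression values are already available as plain lists; the function names and the toy gene names are hypothetical, and this is not the authors' pipeline (see Methods for the actual procedure).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)


def rank_genes_by_transgene_correlation(transgene, expression):
    """Rank genes from most positive to most negative correlation.

    transgene  -- per-spot normalized transgene values
    expression -- dict mapping gene name to per-spot values (same spot order)
    """
    scores = [(gene, pearson(transgene, values))
              for gene, values in expression.items()]
    # Genes with undefined correlation (constant expression) are dropped.
    scores = [(gene, r) for gene, r in scores if not math.isnan(r)]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data with hypothetical gene names (not taken from the study).
toy_transgene = [1.0, 2.0, 3.0, 4.0]
toy_expression = {
    "gene_up": [2.0, 4.0, 6.0, 8.0],
    "gene_down": [4.0, 3.0, 2.0, 1.0],
    "gene_flat": [1.0, 1.0, 1.0, 1.0],
}
ranking = rank_genes_by_transgene_correlation(toy_transgene, toy_expression)
```

In this sketch, genes at the top of `ranking` would be candidates for positive association with the transgene, which is where zonation markers appeared in Fig. 1c.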

Animals treated with rAAV9-CMV-EGFP showed higher pericentral EGFP expression, while rAAV2-CMV-EGFP exhibited the highest transgene expression in the periportal zone (Fig. 1e). Additionally, animals treated with rAAV9-CMV-EGFP had higher EGFP expression than those treated with rAAV2-CMV-EGFP, and males demonstrated higher transgene expression than females for both rAAV2- and rAAV9-CMV-EGFP (Fig. 1e). These findings are consistent with previously reported results [3]. snRNA-seq confirmed the sex- and serotype-specific patterns (Fig. 1f). The analysis of transgene-expressing cells revealed that around 50–60% are hepatocytes (Fig. S1g–j and Table S1). ISH and IHC confirmed the differences between sexes and serotypes (Figs. 1g–i and S1k). These findings were further in line with In Vivo Imaging System (IVIS) data generated in an independent experiment (Fig. S1l–o).
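The male versus female comparison per zone in Fig. 1e relies on Welch's t-test (Methods). As a rough sketch of what that test computes, the snippet below derives the Welch t statistic and the Welch-Satterthwaite degrees of freedom from two small samples. The function name and the toy values are illustrative assumptions, not data from the study.

```python
import math


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    # Unbiased sample variances (denominator n - 1).
    var_a = sum((v - mean_a) ** 2 for v in sample_a) / (n_a - 1)
    var_b = sum((v - mean_b) ** 2 for v in sample_b) / (n_b - 1)
    # Squared standard errors of the two means.
    se2_a = var_a / n_a
    se2_b = var_b / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")
    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Toy per-animal zone averages (illustrative values only).
toy_males = [4.0, 5.0, 6.0]
toy_females = [1.0, 2.0, 3.0]
t_stat, dof = welch_t(toy_males, toy_females)
```

Converting the statistic into a p-value requires the Student t distribution, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) would typically be used for that step.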

Hence, we demonstrate the ability of 10x Genomics Visium ST to spatially map the transgene expression and confirm the distinct sex- and serotype-specific patterns of rAAV vector biodistribution and transgene expression in the liver, with rAAV9 favoring the pericentral zone and rAAV2 the periportal zone.

ST and snRNA-seq enable zonation-based investigation of rAAV2- and rAAV9-CMV-EGFP transduced livers

Leveraging the comprehensive capture of both nuclear and cytoplasmic RNA by ST, we characterized zonation-, serotype-, and sex-specific transcriptomic responses in murine livers treated with rAAV2- or rAAV9-CMV-EGFP compared to PBS-treated controls. To enhance robustness, we integrated these data with a parallel snRNA-seq analysis, prioritizing genes with consistent trends across both platforms. Our analysis revealed numerous differentially expressed genes (DEGs) linked to lipid metabolism, the circadian clock, and immune responses (Fig. 2a). We further inferred transcription factor and pathway activities using decoupleR [17] and identified enriched functional pathways via Gene Set Enrichment Analysis (GSEA) [22, 23] to explore the underlying regulatory mechanisms (Figs. 3, 4 and S3, S4).

Fig. 2: Spatial transcriptomics reveals differential gene expression in rAAV2- and rAAV9-CMV-EGFP-treated mice livers per zone.

a Differential gene expression in treated versus control animals, categorized by liver zone and sex in the spatial transcriptomics (ST) data. Displayed is a selection of genes related to lipid metabolism, circadian rhythm, and immune modulation. p-values were calculated using the Wald statistical test (Methods). Spatial mapping of Srebf1 gene expression across the liver of b a male rAAV2-CMV-EGFP-treated animal, c a male control animal, d a female rAAV2-CMV-EGFP-treated animal and e a female control animal. Spatial mapping of Gadd45g gene expression across the liver of f a male rAAV9-CMV-EGFP-treated animal, g a male control animal, h a female rAAV9-CMV-EGFP-treated animal and i a female control animal. The color bar indicates the level of normalized expression.

Fig. 3: Spatial transcriptomics reveals transcription factor and pathways activity changes in mice livers per zone.

a Predicted transcription factor activity per zone in the spatial transcriptomics (ST) data. p-values were calculated using the Wald statistical test (Methods). Genes discussed in the text are highlighted with a red square. Spatial mapping of the predicted Ppara transcription factor activity across the liver of b a male rAAV9-CMV-EGFP-treated animal, c a male control animal, d a female rAAV9-CMV-EGFP-treated animal and e a female control animal. f Differential pathway activity computed on pseudo-bulk RNA-seq generated from the zones in the ST data. p-values were calculated using the Wald statistical test (Methods). Pathways discussed in the text are highlighted with a red square. Spatial mapping of the predicted hypoxia pathway activity across the liver of g a female rAAV2-CMV-EGFP-treated animal, h a female control animal, i a male rAAV2-CMV-EGFP-treated animal and j a male control animal. The color bar indicates the predicted activity of the respective pathway.

Fig. 4: Spatial transcriptomics reveals changes in rhythmic processes and lipid metabolism.

a Gene set enrichment analysis (GSEA) using the GO biological process (GO_BP) database in the spatial transcriptomics (ST) data. NES, normalized enrichment score. “REGULATION_OF_NUCLEAR_TRANSCRIBED_MRNA_CATABOLIC_PROCESS_DEADENYLATION_DEPENDENT_DECAY” is shortened for display purposes. p-values were calculated using the Wald statistical test (Methods). Gene sets discussed in the text are highlighted with a red square. Combined expression score of the two leading genes Nr1d1 and Nr1d2 for the “RHYTHMIC_BEHAVIOUR” process across the liver of b a male rAAV2-CMV-EGFP-treated and c a male control animal. Combined expression score of the two leading genes Eci2 and Ech1 for the “FATTY_ACID_CATABOLIC_PROCESS” and “FATTY_ACID_BETA_OXIDATION” across the liver of d a male rAAV9-CMV-EGFP-treated animal, e a male control animal, f a female rAAV9-CMV-EGFP-treated animal and g a female control animal. The color bar indicates the score of expression.

rAAV2- and rAAV9-CMV-EGFP are associated with zonation-independent changes in lipid metabolism

ST revealed alterations in lipid metabolism-associated genes, displaying partially sex-specific but overall serotype- and zonation-independent patterns.

Assessment of DEGs indicated an overall downregulation of Elovl3, a long-chain fatty acid elongase [31], which was most pronounced in rAAV9-CMV-EGFP-treated males and females. Chka, involved in the synthesis of lipid membrane components [32], and Irs2, which plays a role in fatty acid synthesis [33], were downregulated in rAAV2- and rAAV9-CMV-EGFP-treated males and rAAV9-CMV-EGFP-treated females, with the intensity of deregulation varying across conditions and zones (Fig. 2a). Ppard, which regulates fatty acid beta oxidation and induces antiviral effects [34, 35], was significantly downregulated in rAAV2- and rAAV9-CMV-EGFP-treated males and rAAV9-CMV-EGFP-treated females (Fig. 2a).

Transcription factor activity estimates showed Ppard significantly upregulated in rAAV9-CMV-EGFP-treated males and downregulated in rAAV9-CMV-EGFP-treated females, suggesting compensatory regulation of upstream components and indicating sex-specific pathways modulating Ppard activity (Fig. 3a–e). Similarly, transcription factor activity estimates predicted Ppara, a regulator of lipid metabolism with antiviral effects [35], to be significantly upregulated in rAAV2- and rAAV9-CMV-EGFP-treated females and downregulated in males (Fig. 3a). Srebf1, a regulator of fatty acid and cholesterol synthesis [36, 37], was overall downregulated in rAAV2- and rAAV9-CMV-EGFP-treated males in both differential expression analysis and transcription factor activity estimates (Figs. 2a–e and 3a).

In addition, we identified sex-specific DEG patterns for Tlcd4 and Cpt2, which are involved in lipid metabolism and fatty acid oxidation, respectively [38,39,40]. Overall, there was an upregulation of these genes in rAAV2- and rAAV9-CMV-EGFP-treated males, with Cpt2 showing a significant increase (Fig. 2a). Acot1, a regulator of fatty acid oxidation [41, 42], was upregulated in rAAV9-CMV-EGFP-treated males and downregulated in females (Fig. 2a).

Pathway activity scores showed a trend toward downregulation of the androgen pathway, which is closely linked to lipid metabolism, in rAAV2- and rAAV9-CMV-EGFP-treated males and rAAV9-CMV-EGFP-treated females (Fig. 3f).

In line with DEG and transcription factor activity prediction, GSEA revealed sex-specific trends, including an overall upregulation of fatty acid catabolic processes as well as fatty acid and lipid oxidation in the males and downregulation in the females (Fig. 4a, d–g). Additionally, GSEA using the KEGG database revealed an upregulation of arachidonic acid metabolism in rAAV2- and rAAV9-CMV-EGFP-treated males and a downregulation of PPAR signaling in rAAV2- and rAAV9-CMV-EGFP-treated females (Fig. S4b).

In the snRNA-seq data, all DEGs showed high or the highest expression in hepatocytes (Fig. S2b). Overall, similar up- and/or downregulation patterns were observed for most of the genes (Fig. S2a), transcription factor activity estimates (Fig. S3a), pathway activity assessments (Fig. S3b) and GSEA (Fig. S4a, c).

These changes may be linked to rAAV transduction and/or transgene expression, potentially modulating the AAV life cycle, transgene expression levels, and host immune responses in a partially sex-specific manner.

rAAV2- and rAAV9-CMV-EGFP modulate the circadian clock across liver zones in male and female mice

Our ST data further revealed several rAAV-related circadian clock-associated signatures that were overall zonation- and sex-independent and found for both rAAV2- and rAAV9-CMV-EGFP.

DEG analysis identified an overall zonation-independent upregulation of the circadian clock regulators Dbp and Tef [43] in rAAV9-CMV-EGFP-treated mice and rAAV2-CMV-EGFP-treated males (Fig. 2a), which was significant in periportal and pericentral zones. Conversely, the circadian rhythm regulator Arntl (encoding Bmal1) and Nfil3, a circadian rhythm inhibitor, showed a general trend of downregulation, whereas Nr1d1 and Nr1d2 were upregulated across liver zones (Fig. 2a) [44, 45]. Transcription factor activity predictions suggested an overall upregulation of Bmal1 and Clock activity (Fig. 3a). These findings indicate an upregulation of genes upstream of Bmal1, coupled with a compensatory downregulation of Arntl transcription itself. In line with the observed changes in the DEGs, GSEA predicted an upregulation of rhythmic behavior in both male and female rAAV2- and rAAV9-CMV-EGFP-treated animals (Fig. 4a–c), which was consistent with the upregulation of histone H3 acetylation, a process regulated by CLOCK and BMAL1 [45], in all treated conditions (Fig. 4a).

Overall, the snRNA-seq data showed similar trends and additionally revealed a significant upregulation of Dbp and Tef for rAAV2-CMV-EGFP-treated females (Fig. S2a).

rAAV2- and rAAV9-CMV-EGFP alter immune response-related genes with zonation-independent and partially sex-specific patterns

ST data provided further evidence for rAAV2- and rAAV9-CMV-EGFP-mediated alterations in hepatocyte-driven immune modulation and DNA replication-associated genes.

DEG analysis identified in rAAV9-treated males a significant upregulation of Gadd45g (Fig. 2a, f–i), which is associated with replication stress and immune response [46]. In contrast, a significant downregulation of Gadd45g was observed in females.

Additionally, Gadd45a, involved in immune regulation, DNA repair, and stress responses, was overall upregulated in the ST data (Fig. 2a). Irf2bp2, a regulator of macrophage function [47, 48] and also expressed in hepatocytes [12], was significantly downregulated in rAAV9-CMV-EGFP-treated animals. Rnf125, a negative regulator of the antiviral immune response [49, 50], was also overall downregulated in rAAV9-CMV-EGFP-treated males and females, as well as in rAAV2-CMV-EGFP-treated males (Fig. 2a).

Conversely, Ripk2, involved in the activation of NF-κB and MAPK pathways with protective functions against viral infections [51], along with Id2, a negative regulator of antiviral immune responses [52], were upregulated following rAAV2- and rAAV9-CMV-EGFP treatment.

snRNA-seq data revealed similar trends overall, with deregulation observed for both Gadd45g and Irf2bp2 in rAAV2- and rAAV9-CMV-EGFP-treated conditions (Fig. S2a). Hepatocytes rather than immune cells were the primary source of expression of these genes (Fig. S2b), which matched the absence of immune cell infiltrates observed in histopathology.

GSEA of the ST data using the KEGG database revealed significant downregulation of the complement and coagulation cascades in rAAV2- and rAAV9-CMV-EGFP-treated females, along with antigen processing and presentation in periportal and intermediate zones (Fig. S4b). Pathway activity predictions showed a significant downregulation of the hypoxia pathway in females, while EGFR and TGF-β pathways exhibited an overall upregulation trend in males and females (Fig. 3f). These patterns were overall consistent in the snRNA-seq data (Fig. S3b). These changes suggest a potential protective response to rAAV-CMV-EGFP treatment, with partially sex-specific trends.

ST delineates known and potential novel factors modulating rAAV entry

The zonation-associated distribution and transduction differences of rAAV9 and rAAV2 in the mouse liver may be linked to spatial variations in receptors, co-receptors, and other factors influencing rAAV transduction. Given the higher transgene capture efficiency of ST compared to snRNA-seq, we used the ST dataset to assess spatial factors that could potentially modulate AAV entry, as well as their association with EGFP transgene expression. It is currently unknown whether published AAV entry factors [53,54,55,56,57] exhibit varying spatial expression patterns across liver zones or sex differences in mice, potentially modulating serotype- and sex-specific liver transduction patterns. Therefore, we first examined the zonal distribution and expression levels of known rAAV entry factors in control male and female mice.

Only Fgfr1, a known AAV2 co-receptor, showed higher expression in the portal zone of males compared to females (Fig. S5), potentially contributing to the portal bias and sex difference in rAAV2 transgene expression. For other factors, either no zonation pattern was identified, or it did not correlate with the serotype-specific, periportal or pericentral zonation pattern or sex differences. snRNA-seq confirmed expression of all known entry factors in hepatocytes (Fig. S5s).

Next, we hypothesized that EGFP transgene expression would spatially correlate with the expression of known and putative novel AAV entry factors. Therefore, we analyzed transgene expression and its colocalization with membrane-associated genes as well as known AAV receptors (see Methods).

Among the top spatially colocalized genes for rAAV9-CMV-EGFP-treated males and females, we identified the gene Rpsa, encoding the laminin receptor, a known co-receptor for rAAV2 and rAAV9 [56] (Table S2). Additionally, the heparan sulfate proteoglycan (HSPG) Sdc1 was found to colocalize with the transgene in rAAV2-CMV-EGFP-treated males, consistent with the known role of HSPGs as AAV2 receptors [58, 59]. Further, Itgb1, part of the α5β1 integrin, a coreceptor for rAAV2 [60], was identified for the rAAV9-CMV-EGFP-treated male animals. Finally, we also observed spatial co-expression of EGFP with Dpp4, which was recently described to bind to rAAV9 [55].

Top spatially colocalized genes for rAAV9-CMV-EGFP-treated males and females (Fig. 5a, b and Table S2) included the organic cation transporters Slc22a1 and Slco1b2, as well as Nrp1 and Kdr, both of which have already been identified as receptors for other viruses [61, 62]. Additionally, Fcgrt, encoding FcRn, a known target for therapeutic inhibition to reduce immune responses [63], was identified. In the ST data from control animals, none of the abovementioned genes showed consistent sex differences (Fig. S6).

Fig. 5: Spatial transcriptomics to investigate rAAV transgene and receptor colocalization.

Top 10 genes that show colocalization with the transgene in the spatial transcriptomics (ST) data for a rAAV9-CMV-EGFP-treated males, b rAAV9-CMV-EGFP-treated females, c rAAV2-CMV-EGFP-treated males and d rAAV2-CMV-EGFP-treated females. Genes are sorted by their average Moran’s I (a measure of spatial autocorrelation, see Methods). Spatial mapping of the interaction (measured by the cosine similarity, see Methods) between the transgene and e Rpsa, f Slco1b2, g Pigr and h Fbp1 across a liver tissue section. i Cell-type-specific expression in the single-nucleus RNA sequencing (snRNA-seq) data combined for all animals. LSEC, liver sinusoidal endothelial cells.
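The two measures named in this caption can be sketched generically as follows. This is only an illustration of the textbook definitions of cosine similarity and global Moran's I with binary neighbor weights; the neighborhood definition, the weighting scheme and the exact (possibly bivariate or per-spot) variant used in the study are described in Methods and are not reproduced here. All function names and toy values are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length expression vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def morans_i(values, neighbors):
    """Global Moran's I with binary weights.

    values    -- one value per spot
    neighbors -- dict mapping each spot index to a list of neighboring indices
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    total_weight = sum(len(nbrs) for nbrs in neighbors.values())
    if denom == 0 or total_weight == 0:
        return float("nan")
    # Sum of cross-products over all ordered neighbor pairs (weight 1 each).
    num = sum(dev[i] * dev[j] for i, nbrs in neighbors.items() for j in nbrs)
    return (n / total_weight) * (num / denom)


# Four toy spots arranged in a line: 0 - 1 - 2 - 3.
toy_neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
clustered = morans_i([1.0, 1.0, 0.0, 0.0], toy_neighbors)    # similar neighbors
alternating = morans_i([1.0, 0.0, 1.0, 0.0], toy_neighbors)  # dissimilar neighbors
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Positive Moran's I values indicate that neighboring spots carry similar values (spatial clustering), whereas negative values indicate a checkerboard-like pattern; cosine similarity close to 1 indicates that two expression profiles point in the same direction regardless of magnitude.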

Genes exclusively found in rAAV9-CMV-EGFP-treated males (Fig. 5a and Table S2) were Slc19a2, Ncl, and Nectin2, all of which have already been identified as entry receptors for other viruses or to be involved in promoting viral entry [64,65,66,67,68]. Cldn2 and claudins in general have been identified to regulate viral infection [69]. Sort1 and Lrp1 were also identified, both involved in trafficking and endocytosis, respectively [70,71,72].

Spatially colocalized genes exclusively observed in rAAV9-CMV-EGFP-treated females (Fig. 5b and Table S2) included Bsg, linked to viral entry [73, 74], and Cldn5 and Cd59a [75], both implicated in regulating viral infections.

Among the top transgene colocalizing genes for rAAV2-CMV-EGFP in male and female mice (Fig. 5c, d and Table S2), we identified Fbp1, Cdh1, Pigr, and Spp1, all of which are known to be involved in the entry of other viruses [76,77,78,79].

Additionally, Ly6e, which promotes viral entry through mechanisms other than receptor-mediated entry [80], was also detected in rAAV2-CMV-EGFP-treated livers of both sexes. The difference in expression between males and females in the ST data was also compared, and none of the transgene colocalizing genes showed significant consistent sex differences (Fig. S6).

For rAAV2-CMV-EGFP-treated males, Ifitm3 (Fig. 5c and Table S2), known to negatively affect viral entry through mechanisms other than receptor-mediated entry [81, 82], was identified.

In rAAV2-CMV-EGFP-treated females, Abhd2 (Fig. 5d and Table S2), which is known to promote viral propagation [83], was observed.

Since ISH and IHC revealed transgene expression predominantly in hepatocytes, the expression of each of the above-mentioned cell membrane-expressed genes was examined per cell type (Fig. 5i). Most of them were predominantly expressed in hepatocytes or also showed expression in other cell types besides hepatocytes. Exceptions were Kdr and Cldn5, which were mostly expressed in liver sinusoidal endothelial cells (LSECs) and Spp1, which seemed specific for cholangiocytes (Fig. 5i).

Beyond receptors, we investigated factors recently emphasized by Rostami et al. for their roles in modulating AAV gene transfer and expression, analyzing their patterns in the context of sex differences and zonation (Figs. S7 and S8) [84]. None of the investigated genes showed differences that could potentially explain the observed sex and zonation differences.

In summary, our approach uncovered several established as well as potential new factors that spatially associate with rAAV transduction in hepatocytes. These findings not only offer potential explanations for serotype- and sex-specific patterns but also underscore the utility of this method in generating novel, spatially informed hypotheses. The identified genes represent a rich set of targets for future functional validation aimed at enhancing gene therapy efficacy.

Discussion

This study highlights ST using the 10x Genomics Visium platform as a powerful tool to map rAAV transduction patterns and reveals sex- and serotype-specific effects on gene expression in liver tissue. Visium ST offers unbiased, high-throughput transcriptomic profiling, enabling spatial resolution of transgene expression and tissue responses. When combined with snRNA-seq, it allows single-cell validation, overcoming the limitations of bulk RNA sequencing.

By exploiting ST and snRNA-seq, we systematically mapped rAAV2- and rAAV9-CMV-EGFP-mediated transduction across hepatic zones and examined associated transcriptomic alterations.

Our findings reveal that rAAV2- and rAAV9-CMV-EGFP transduction is associated with zonation-independent modulation of lipid metabolism, circadian rhythm, and immune responses, with several pathways displaying distinct sex-specific patterns. Moreover, our study identifies known and putative novel rAAV entry factors, providing insights into mechanisms underlying rAAV tropism in hepatocytes. These findings highlight the potential of ST in gene therapy research, offering a comprehensive framework to study AAV vector biology in situ.

ST confirms that rAAV2 preferentially transduces periportal hepatocytes, while rAAV9 exhibits a bias toward pericentral regions, consistent with previous findings [3]. Notably, we observed higher transgene expression in males, in line with previous findings [5]. Beyond transduction efficacy, we identified profound transcriptomic changes influenced by sex and serotype, but largely independent of zonation. Some effects were consistent across both serotypes and sexes, indicating conserved mechanisms induced by rAAV-CMV-EGFP transduction or transgene expression. Further experiments with empty capsid controls and promoterless transgenes, as well as other transgenes, are necessary to differentiate effects specifically linked to the CMV-EGFP expression cassette, especially given EGFP’s reported cytotoxic effects [85, 86].

One of the most striking findings of our study is the serotype- and zonation-independent downregulation of lipid metabolism genes, including Elovl3, Chka, Irs2, and Ppard, across both sexes. Notably, Elovl3 downregulation was also observed in a study using an AAVrh.10 vector without a transgene (rAAVrh.10Null) [87], indicating that this effect is not related to EGFP. These genes regulate lipid synthesis, transport, and beta-oxidation, suggesting that rAAV transduction broadly influences hepatic lipid homeostasis. Interestingly, Srebf1, Tlcd4, Cpt2, and Acot1 showed sex-specific transcriptional changes, indicating that rAAV vectors may differentially impact lipid metabolism in male and female mice. These findings align with recent studies linking viral infections to lipid metabolism dysregulation [35, 88,89,90]. Given that lipid metabolism is crucial for viral entry, replication, and immune evasion [91,92,93], our results raise the possibility that rAAV vectors may engage similar host pathways to optimize transgene expression. This insight is particularly relevant for AAV-based liver gene therapies, as pre-existing lipid metabolic conditions could influence vector efficacy [94].

Moreover, to the best of our knowledge, our study is the first to report that rAAV transduction induces long-term alterations in hepatic circadian clock regulators, extending previous findings of acute dysregulation observed at 2 days post-rAAV8-transduction [95]. We observed zonation-independent upregulation of Dbp, Tef, Nr1d1 and Nr1d2 coupled with downregulation of Arntl (Bmal1) and Nfil3, suggesting that rAAV2- and rAAV9-CMV-EGFP vectors affect key circadian modulators. The circadian clock regulates hepatic metabolism, detoxification, and immune function, and its disruption has been linked to chronic liver diseases [96]. The upregulation of Dbp is particularly interesting, as the CMV enhancer contains a binding motif for its gene product [97]. We hypothesize that this may facilitate a feedback mechanism wherein transgene expression indirectly influences host circadian regulation via Dbp. Given that circadian disruption is known to affect viral replication and immune responses [98,99,100,101], these observations warrant further investigation to determine whether this mechanism impacts gene therapy timing and efficacy.

Our transcriptomic analysis also uncovered immune regulatory genes that exhibit both serotype- and sex-specific expression patterns. In particular, Gadd45g, a regulator of replication stress and immune response [102, 103], was upregulated in males and downregulated in females following rAAV9-CMV-EGFP transduction. This sex difference may reflect variations in host immune surveillance mechanisms that influence AAV transduction efficacy. Notably, Gadd45g has been previously identified as a factor suppressing the innate immune response following Herpes Simplex Virus 1 infection [103], suggesting a complex role in virus-host interactions. Additionally, we observed an upregulation of TGF-β pathway activity, indicating a potential pro-tolerogenic response to AAV transduction [104]. The EGFR pathway was also upregulated, consistent with findings from the rAAVrh.10Null study [87], suggesting that the upregulation of EGFR may be attributed to viral transduction. Moreover, the downregulation of the hypoxia pathway in females may contribute to lower transgene expression in female mice due to hypoxia’s known role in episomal transcriptional regulation [105, 106]. In addition, the sex-independent suppression of Rnf125 and Irf2bp2, genes modulating innate antiviral responses [47,48,49,50], indicates potential activation of antiviral immune responses. Rnf125 downregulation was also identified in the rAAVrh.10Null study, pointing to an EGFP-independent effect [87].

These observed differences, particularly the sex-specific patterns, represent promising, spatially derived hypotheses that warrant further functional investigation to determine their precise impact on AAV transduction efficacy and persistence.

In addition to validating known AAV receptors (e.g., Rpsa and Dpp4 for rAAV9, and Sdc1 for rAAV2), our ST analysis proposed novel candidate receptors that may influence serotype-specific and sex-biased transduction patterns. These include solute carrier family proteins (e.g., Slc22a1, Slco1b2, Slc19a2), some of which have been implicated in viral uptake [64], as well as cell adhesion and trafficking proteins (e.g., Cldn2, Nrp1, Kdr), known to mediate virus-host interactions [61, 62, 69]. We emphasize that these associations are correlational at this stage and represent promising, spatially derived hypotheses. As such, further functional validation (e.g., knockdown or overexpression studies) is required to confirm whether these factors directly modulate rAAV transduction efficacy.

Despite its methodological strengths, our study has a few limitations. Although the sample size was limited (n = 3 per condition), the consistency of many observed patterns across both rAAV serotypes or sexes supports the robustness of the results; nevertheless, future studies should expand sample numbers for greater statistical confidence. Moreover, the resolution of Visium ST (55 μm per spot) precludes single-cell resolution, but upcoming high-resolution platforms like Visium HD and Xenium may overcome this issue [107]. In our study, we also focused on one mouse strain, and future work should explore strain-specific effects on AAV transduction and host responses. Additionally, assessment of other time points post-transduction will allow for a more detailed investigation of the immune responses [108, 109]. Finally, our study design does not allow us to distinguish between transgene- and vector-induced effects; therefore, confirmation with empty vector controls is necessary. With regard to identifying potential novel rAAV entry factors, it is important to consider that mRNA and protein levels may not always correlate [110, 111], which can hinder the identification of these receptors. Further studies using knockout models, receptor binding assays, and metabolic profiling may also deepen our understanding of AAV transduction mechanisms and their broader implications for gene therapy.

In conclusion, our study establishes ST as a powerful approach for investigating AAV transduction patterns and their molecular consequences in the liver. We reveal rAAV2- and rAAV9-CMV-EGFP sex- and serotype-specific alterations in lipid metabolism, circadian regulation, and immune responses, and propose novel putative AAV entry factors. These findings provide critical insights into the host-vector interplay and highlight important considerations for optimizing AAV-based gene therapies. Future research should build on these findings to refine vector design and administration strategies, ultimately improving the safety and efficacy of gene therapy applications.