Introduction

Forty million people are living with the human immunodeficiency virus type one (HIV-1), and more than ten million have died of acquired immune deficiency syndrome (AIDS) since the epidemic began in 1981. Despite the abilities of antiretroviral therapy (ART) to suppress active viral production and reduce disease-associated comorbidities, low-level persistent viral growth continues. The virus is detected in several tissue compartments of people living with HIV-1 (PLWH)1,2,3,4. Treatment interruptions commonly result in viral rebound5. During infection, the virus easily integrates into the host genome after escaping the host’s immune response, and latent infection persists for extended times in PLWH, even under suppressive ART treatment6,7,8. Long-term ART treatment leads to the emergence of HIV-1 drug resistance and treatment failures9. These pose a challenge to any available HIV-1 cure strategies.

Recently, a novel gene-editing tool, the clustered regularly interspaced short palindromic repeat (CRISPR)-associated nuclease 9 (Cas9), with high specificity and efficacy, has emerged as a pathway towards potential HIV-1 elimination10,11,12. Together with ART, CRISPR-based therapies have achieved success in viral elimination. In 2019, we showed complete HIV-1 elimination from tissue compartments in a subset of infected humanized mice (hu-mice) through sequential LA ART and CRISPR-Cas9 excision therapy targeting the HIV-1 LTR-Gag region of the viral genome. LA-ART consisted of nanoformulated rilpivirine (RPV) and myristoylated dolutegravir (DTG), lamivudine (3TC), and abacavir (ABC) and was administered as nanoformulations. Myristoylated modifications for DTG, 3TC, and ABC were referred to as MDTG, M3TC, and MABC, while RPV in unmodified form as a nanoformulated drug was referred to as NRPV. In one-third of the infected humanized mice, the virus was eliminated as detected using highly sensitive viral detection assays, indicating a sterilizing cure13. Recently, the success of the viral elimination rate was further advanced by using dual CRISPR targeting host CCR5 and HIV-1 LTR-Gag in the infected and LA-ART suppressed hu-mice14. Such success was recorded by excision of the integrated proviral DNA from the infected cells of the dual-treated hu-mice. However, viral rebound was observed in a select number of treated animals13. How the virus adapts to dual LA ART and CRISPR treatments is not understood. Thus, this study examined the molecular genetic signature of the rebound virus that escaped the dual therapy.

The approaches used in the clinic to investigate rebound viruses rely primarily on genetic resistance testing, which is performed under suboptimal ART-mediated viral suppression (such as drug failure) or during analytical treatment interruption15,16. While population-based Sanger sequencing (SS) remains the standard approach, highly sensitive next-generation sequencing (NGS) techniques are increasingly employed for better detection of minor drug-resistant variants during treatment failures seen as a consequence of latency-reversing agent (LRA) treatments or vaccine trial failures17,18. Herein, we employed SS and NGS assays to determine the molecular signature of the rebound virus escaping LA ART and CRISPR therapies. We investigated the gag, pol, and env regions of the viral genome in untreated (control), ART-treated, and/or CRISPR-treated hu-mice. The data were then mapped against antiretroviral drug (ARV) resistance databases and CRISPR-targeted viral regions. The results revealed unique, previously unidentified mutations in LA ART and dual-treated rebound viruses, which could have contributed to viral rebound. The lack of CRISPR-associated signatures in the rebounded viruses further supports the effectiveness of CRISPR-mediated viral excision as a potential next-generation treatment modality, either alone or in combination with other therapeutics, for future clinical translation.

Results

HIV-1 variant analysis

PCR amplification of 6.7 kb of the HIV-1 genome of the viral gag, pol, and env genes was performed on rebound viruses recovered from infected hu-mice from untreated and LA ART-, CRISPR-, and dual-treated animal groups. LA-ART consisted of nanoformulated RPV (NRPV), and nanoformulated and myristoylated DTG (NMDTG), 3TC (NM3TC), and ABC (NMABC) as prodrugs. Rebound viruses from all treatment groups were recovered from plasma samples using a sensitive membrane-based Qiagen MinElute virus kit19. Three independent amplicons from 29 samples (22 samples from study 1 and 7 samples from study 2) were sequenced by GENEWIZ Sanger assays (Fig. 1).

Fig. 1: Study Scheme.

The study scheme shows the infection times, treatment details, plasma collection, RNA isolation, reverse transcription PCR amplification, and subsequent sequencing assays and data analysis. In this study, humanized mice were infected with either HIV-1NL4-3 or HIV-1ADA and were subsequently divided into four treatment subgroups: 1- no treatment; 2- LA ART treatment consisting of NMDTG and NRPV at 45 mg/kg and NMABC and NM3TC at 40 mg/kg, administered via intramuscular injection two weeks post-infection (WPI) and ceased at 6 WPI; 3- CRISPR-Cas9 treatment at 7/8 WPI; or 4- sequential LA ART and CRISPR-Cas9 treatment with viral rebound. Plasma-derived viral RNA was isolated from all groups at designated time points, and three independent PCR-amplified HIV-1 regions (gag, pol, env) were sequenced and analyzed using Sanger sequencing. Amplicons of the pol regions were further sequenced and analyzed on the Illumina MiSeq platform. Sequences of interest, including the guide RNA targeting region, protease (PR), reverse transcriptase (RT), and integrase (IN) regions, env variable loop, and CD4 binding loop region, were analyzed using the bioinformatic pipeline. The crystal structure of the unliganded HIV-1 clade B strain YU2 gp120 core was downloaded from the RCSB Protein Data Bank (https://www.rcsb.org/). Fig. 1 was created using the BioRender software.

The sequence analysis yielded high-quality sequence reads with an average quality score greater than 45. Hypermutation was assessed using Hypermut 2.0 software20; it was detected in only one RNA sequence, which was excluded from further downstream analysis. Plasma viral RNA measures from individual humanized mice from each treatment group in all studies are provided in Supplementary Tables 2,3. Resistance genotyping tests in the reverse transcriptase (RT), integrase (IN), and protease (PR) viral genes were successfully performed using the more sensitive MiSeq next-generation sequencing system. Hu-mice samples from two independent studies infected with two different viral strains [HIV-1NL4-3 (n = 22) and HIV-1ADA (n = 7)] were analyzed. In the sequencing analysis, 93.15% of the reads passed the standard quality check in the first step. During the second, more stringent quality check, one sample (4348) was excluded. We employed analysis thresholds of 2, 5, and 20% in NGS to identify minor variants with high sensitivity, as validated in prior multi-laboratory comparative studies21,22.
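The tiered thresholds described above can be illustrated with a minimal sketch. It assumes each variant is summarized as a percentage of supporting reads; the variant names and frequencies below are hypothetical placeholders, not study data, and the function names are ours rather than part of any pipeline used in the study.

```python
from typing import Dict, List

# Analysis thresholds named in the text, in percent of reads.
THRESHOLDS = (20.0, 5.0, 2.0)


def call_variants(freqs: Dict[str, float], threshold: float) -> List[str]:
    """Return the variants whose read frequency (%) meets or exceeds the threshold."""
    return sorted(name for name, pct in freqs.items() if pct >= threshold)


def calls_by_threshold(freqs: Dict[str, float]) -> Dict[float, List[str]]:
    """Apply every analysis threshold to one sample's variant frequencies."""
    return {t: call_variants(freqs, t) for t in THRESHOLDS}


# Hypothetical sample: placeholder variant names and made-up frequencies.
example = {"var_major": 98.0, "var_mid": 6.5, "var_low": 2.4, "var_noise": 0.8}
calls = calls_by_threshold(example)
```

In this toy sample, only `var_major` is called at 20%, `var_mid` is added at 5%, and `var_low` appears only at 2%, which mirrors how lowering the threshold exposes minor variants while still excluding sub-threshold noise.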

We sequenced full-length gag, pol, and env regions of viral RNA isolated from the plasma of either HIV-1NL4-3- or HIV-1ADA-infected humanized mice in two independent studies from four different treatment groups (untreated, LA ART, CRISPR, and dual ART-CRISPR therapies where the rebounded virus was recovered). These were used to evaluate treatment effects on the emergence of specific viral molecular signatures. Multiple alignment analysis of HIV-1NL4-3-infected endpoint gag, pol, and env sequences demonstrated low-frequency mutations across untreated virus-infected samples, validating the existence of host-selection pressure in infected hu-mice (Fig. 2A, B and Supplementary Figs. 1,2).

Fig. 2: Endpoint Sequencing Analysis.

A Highlighter plot of all infected and treatment groups (Untreated, CRISPR-, LA ART-, and Dual-treated with viral rebound) endpoint sequences in the HIV-1NL4_3 env region. Sequences were aligned to the most conserved sample from the HIV-1-infected untreated group [master sequence(m)]. Mutations were color-coded. B Highlighter plot of all samples belonging to all four treatment groups, endpoint sequences in the HIV-1NL4_3 env region with silent (green) and non-silent (red) mutations. C Frequency of mutations detected in gag, pol, and env regions derived from study one (HIV-1NL4_3) endpoint untreated (n = 3), CRISPR-treated (n = 5), LA ART-treated (n = 7–9), and dual-treated and viral rebound samples (n = 5). Statistical comparisons between the different treatment groups were analyzed using one-way ANOVA. Data represent the mean ± standard error for 4 different groups with independent biological replicates.

While the increased frequency of mutations, including nucleotide substitutions and indels, was observed in all treated viruses in the gag, pol, and env regions (Fig. 2A, B, Supplementary Figs. 1,2), the env region possessed a greater mutation frequency. It showed higher evolutionary divergence than the other structural viral genes in the same treatment group. HIV-1ADA viruses showed higher mutation frequency in env than in the gag and pol regions (Supplementary Figs. 3,5). No statistical significance was observed using one-way ANOVA between the treatment groups from HIV-1NL4-3-infected mice (Fig. 2C). A trend suggesting an increased mutation rate was identified as non-significant in the HIV-1ADA-infected mice group using the non-parametric Kruskal-Wallis’ test (Supplementary Fig. 6), suggesting the existence of viral strain specificity. These results supported viral genetic dynamics under parallel treatment conditions23.

Correlation Analyses

We performed Spearman’s rank correlation analyses on HIV-1NL4-3 (Table 1) and HIV-1ADA (Supplementary Table 1) infected viral samples to investigate whether the endpoint variant frequency detected was associated with the endpoint plasma viral load. The HIV-1NL4-3 nucleotide endpoint mutation frequencies in the pol region were inversely correlated with the plasma viral load (r = −0.5879, p = 0.008) (Table 1). We did not observe any significant correlation between endpoint mutation frequency and plasma viral load in the gag and env regions (Table 1): the r values for the gag and env regions were −0.207 and 0.166, and the corresponding p-values were 0.382 and 0.485, respectively. We did not observe any significant correlations in HIV-1ADA-infected samples (Supplementary Table 1) with the noted sample size.
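For readers unfamiliar with the statistic, the following is a minimal sketch of how Spearman's rank correlation coefficient can be computed: both variables are converted to ranks (ties receive the average rank), and the Pearson correlation of the ranks is taken. It is a plain-Python illustration with made-up inputs, not the software used for the analyses in Table 1, and it does not compute p-values.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))
```

A negative coefficient, as reported for the pol region, means that samples ranking higher in mutation frequency tend to rank lower in plasma viral load.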

Table 1 Correlation analysis of HIV-1NL4_3 Mutation Frequency with endpoint plasma viral load

Longitudinal HIV-1 variant analyses

Given the notable diversity reported for the HIV-1 env region in clinical samples and observed in our endpoint env sequences, we conducted a longitudinal sequence analysis of the samples displaying high levels of nucleotide variation to rule out variants that pre-existed before LA ART treatment. We mapped the env nucleotide variants identified from five animals (3139, 3171, 3182, 4356, 4358) before ART treatment, following ART and before CRISPR, and at the study endpoint (Fig. 3A). We examined potential variant dynamic changes (Fig. 3B, C). The higher mutation frequencies observed in the env region sequences from animals treated with LA ART or CRISPR at the endpoint relative to controls indicated that some mutations were generated either during or after the treatments (Fig. 3B, C, Supplementary Fig. 7). Nonetheless, the mutations observed over time reflected the CRISPR and LA ART influence on viral evolution, and this is the first report from a multimodal treatment paradigm in an HIV-disease-specific animal model.

Fig. 3: Longitudinal Sample Analysis of Viral Mutations using SS in the HIV-1 env region.

A Mapping of nucleotide variants detected in the env region sequences from the five selected HIV-1NL4-3 animal groups (identified by the mouse ID on the right axis) at pre-ART (PRE), post-ART/pre-CRISPR (MID), and endpoints (END) (shown on the left axis). Individual nucleotide substitutions were depicted with distinct colors per the key below the graph. B The frequency of env variants at pre-ART (n = 5), post-ART/pre-CRISPR (n = 3), and endpoint (n = 5) was compared and analyzed using one-way ANOVA. Individual animals were represented with unique color-coded dots and lines (3139, 3171, 3182, 4356, and 4358). A significant increase in mutation frequency was observed (p = 0.019). C Donut charts illustrate the proportion of env nucleotide variants from individual mice at the specified time points, with pre-ART, post-ART/pre-CRISPR, and endpoint phases denoted by blue, red, and orange, respectively.

Drug resistance-associated mutations (DRAMs) detection by SS

We screened our data for DRAMs, which also allowed us to detect and re-align mixed-up samples for integrative studies, using the Stanford drug-resistance database; the results were validated against the International AIDS Society (IAS-USA) 2022 mutations list. This identified further drug-associated mutations in the HIV-1NL4-3 and HIV-1ADA samples (Table 2). One accessory integrase resistance mutation, V151I, was identified in all plasma samples from HIV-1NL4-3-infected hu-mice. This affirmed the susceptibility to the drug(s) used for viral suppression. A primary NRTI-associated mutation, M41L, was identified in all the samples from HIV-1ADA-infected hu-mice. M41L has been reported as a significant RT resistance mutation in clinical studies of infected patients treated with either abacavir, azidothymidine, or tenofovir disoproxil fumarate24. We also identified one protease inhibitor (PI)-specific accessory mutation, A71I, in one sample belonging to the dual-treatment rebound group; this mutation has been shown to be associated with other PI-resistance mutations24.

Table 2 DRAMs detected in RT/PR/IN using Sanger sequencing

DRAMs detected using Next Generation Sequencing (NGS)

We further performed MiSeq NGS to examine the presence of low-frequency drug-resistant sequence variants in the pol subregions: protease, reverse transcriptase, and integrase. Twenty-five plasma viral samples were tested from HIV-1NL4-3 or HIV-1ADA-infected hu-mice belonging to four treatment groups and derived from samples before ART and at the study’s conclusion (Fig. 4A). One sample failed to pass the stringent quality check and was excluded from the analysis. Using the Stanford NGS drug-resistance database algorithm, additional DRAMs were detected in the RT, PR, and IN regions (Table 3). NNRTI primary mutation E138K and accessory mutation H211Y were observed at low frequency in our NGS analysis at a 2% threshold in sample 3169 (an LA ART-treated sample). NNRTI accessory mutation K101E was observed at the NGS 5% and 2% thresholds in one sample (3181) belonging to the LA ART-treated group. NRTI accessory mutation V75I was detected at the NGS 5% and 2% thresholds in another LA ART-treated sample (3182). One INSTI primary mutation, R263K, was detected at NGS 2% in a dual-treated and viral rebound sample (4375). An INSTI accessory mutation, V151I, was detected at the NGS 20%, 5%, and 2% thresholds in all the samples from all four groups. One PI-associated primary mutation, I54T, was observed at the NGS 5% and 2% thresholds in one LA ART-treated sample (3170). Another PI accessory mutation, F53L, was detected only at the NGS 2% threshold in an LA ART-treated plasma sample (3182). Interestingly, all the DRAMs identified in our highly sensitive low-threshold NGS analysis belonged to either the LA ART-alone or the LA ART and CRISPR-treated viral rebound groups (Fig. 4B). The predicted major antiretroviral drug susceptibility profiles were further analyzed for all samples from HIV-1NL4-3 and HIV-1ADA-infected mice using the ART resistance algorithm25 and are presented in Fig. 4B. We did not detect any drug-resistance-associated mutations in the HIV-1-infected untreated or CRISPR-alone hu-mice samples, so they were not included in the major ART-susceptibility analysis diagram. Overall, 41.6% (5/13) of HIV-1NL4-3-infected hu-mice and 100% (4/4) of HIV-1ADA-infected hu-mice had at least one mutation (Fig. 4B). A total of 7.7% (1/13), 15.4% (2/13), 15.4% (2/13), and 7.7% (1/13) of HIV-1NL4-3-infected hu-mice were found resistant to nucleoside reverse transcriptase inhibitors (NRTIs) with low resistance, NNRTIs with medium to high resistance, PIs with low resistance, and integrase strand transfer inhibitors (INSTIs) with medium resistance, respectively. All HIV-1ADA-infected humanized mice exhibited low resistance to NRTIs but no resistance to any other ART category. Interestingly, we identified new mutations with unclear function in all HIV-1NL4-3-infected humanized mice and 25% (1/4) of the HIV-1ADA-infected hu-mice. These mutations have been shown to increase the replication fitness of viruses when present along with other INSTI/PI-resistance mutations24, but further investigation is needed.

Fig. 4: NGS analysis identified new antiretroviral drug-resistant mutations.

A Next-generation sequencing (NGS) was used to screen for low-frequency HIV mutants (less than 15% detection threshold) in the protease (PR), reverse transcriptase (RT), and integrase (IN) regions within pol amplicons. A total of 25 humanized mouse samples belonging to either HIV-1NL4-3 or HIV-1ADA-infected differential treatment groups from pre-ART and endpoint were subjected to the NGS analysis. B A heat map shows the predicted major antiretroviral drug susceptibility profiles based on the Stanford drug resistance database from HIV-1NL4-3 and HIV-1ADA-infected and differentially treated samples. (The colors used to indicate the results are red, high-level resistance; orange, medium-level resistance; pink, low-level resistance; blue, sensitive; dark purple, unclear functionality.) The four ARVs used for this study are highlighted with a red arrow.

Table 3 DRAMs detected in RT/PR/IN using Next Generation Sequencing (NGS)

Comparison of DRAMs between SS and NGS detection systems

The rates of detecting HIV-1 mutations using Sanger sequencing (SS) and next-generation sequencing (NGS) at 20%, 5%, and 2% thresholds are presented in Fig. 5A. We first compared the mutation rate of SS with that of NGS at a 20% threshold, which reflects the variant detection limit of SS, known to be ~20%26. Prior clinical studies reported discrepancies in the detection of DRAMs using NGS at a 20% threshold compared to SS results for various HIV-1 genomic subregions27,28,29. In our current humanized mice study, we observed a high concordance in the detection of DRAMs between NGS at a 20% threshold and SS in the reverse transcriptase, protease, and integrase region sequences. Compared to SS and NGS at the 20% threshold, NGS at 2% and 5% allowed better detection of primary and accessory DRAMs in our humanized mice plasma samples (Fig. 5B). A total of seven DRAMs, including four nucleoside reverse transcriptase/nonnucleoside reverse transcriptase inhibitor (NRTI/NNRTI)-, two PI-, and one integrase-associated DRAM, were detected using NGS at the 5% and 2% thresholds.
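The comparison described above can be sketched as set operations on the mutation calls from each method. The sketch assumes concordance is summarized as the Jaccard index (shared calls divided by all calls), which is our illustrative choice and not necessarily the measure used in this study; the mutation labels are hypothetical placeholders.

```python
from typing import Set


def concordance(calls_a: Set[str], calls_b: Set[str]) -> float:
    """Jaccard index of two call sets; defined as 1.0 when both are empty."""
    union = calls_a | calls_b
    if not union:
        return 1.0
    return len(calls_a & calls_b) / len(union)


def gained_at_lower_threshold(high: Set[str], low: Set[str]) -> Set[str]:
    """Mutations called at the lower threshold but missed at the higher one."""
    return low - high


# Hypothetical call sets for one sample (placeholder labels, not study data).
ss_calls = {"mut_1", "mut_2"}
ngs_20 = {"mut_1", "mut_2"}
ngs_2 = {"mut_1", "mut_2", "mut_3", "mut_4"}

agreement = concordance(ss_calls, ngs_20)
extra = gained_at_lower_threshold(ngs_20, ngs_2)
```

Here the two methods agree fully at the 20% threshold, while the 2% threshold adds two calls that neither SS nor NGS at 20% would report.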

Fig. 5: Comparison of DRAMs obtained using SS and NGS.

A Bubble chart representing the frequency of predicted DRAMs to NRTIs/NNRTIs, PIs, and INSTIs (using the Stanford drug resistance database standard platform) detected in HIV-infected humanized mouse samples. The size of the bubbles indicates the count of mice with each mutation, identified using Sanger sequencing (SS) and next-generation sequencing (NGS) at varying detection thresholds (2%, 5%, 20%). B The DRAMs and their respective proportions detected using SS and three NGS thresholds (2%, 5%, 20%) are presented using the grid graph, with color coding specific to each amino acid mutation; additional mutations detected are highlighted with dotted squares. Significant mutations are shown with asterisks. SS, Sanger sequencing; NGS, next-generation sequencing.

Unique Mutations detected using NGS

By employing NGS as described in Fig. 4A, we not only detected lower-frequency DRAMs across the PR, RT, and IN regions of the pol gene amplicons from our samples but also identified new and unique mutations that have not been reported in the drug resistance database (Fig. 6A, B). Upon comparing the distribution of the new mutations across different treatment groups, we observed no significant clustering of the mutations within any treatment group. Notably, these mutations were present in all treated groups in HIV-1NL4-3 and HIV-1ADA-infected animals (Supplementary Figs. 8,9). However, we found a region-specific pattern in the distribution of new mutation frequencies, with the RT region exhibiting the highest frequency, followed by integrase (IN), and then the protease (PR) region, which displayed the lowest (Fig. 6A). The mutations in the RT regions of the three treatment groups are shown in Fig. 6C–E with color-coded amino acid changes and their respective frequencies. The mutation frequency is the proportion of mutation positions relative to the consensus within each sample group. Overall, we found that the LA ART-treated group harbored more new mutations than the CRISPR-treated and dual-treated groups (Fig. 6C–E), and each treatment group displayed a unique mutation signature. In the LA ART-treated group, substitutions at positions I375, V381, and E523 were observed at frequencies greater than 50%. In the CRISPR group, a substitution frequency greater than 50% was observed at position R35, while in the dual-treated viral rebound group, no single mutation reached a frequency exceeding 50%. Interestingly, three new common amino acid substitutions (RT codon positions 29, 151, and 316) emerged in both the LA ART (Fig. 6C) and the dual-treated groups (Fig. 6E). In contrast, one new amino acid substitution (RT codon position 460) was observed in both the CRISPR-treated (Fig. 6D) and dual-treated rebound groups. Notably, unique mutations E29G and Q151R were exclusively observed in the LA ART and dual-treated groups.
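One way to read the per-position frequencies discussed above is as the fraction of sequences in a group that differ from the consensus residue at a given position. The sketch below takes that reading as an assumption; it uses short made-up amino-acid strings rather than RT sequences from the study, treats `-` as an alignment gap, and the 50% cutoff simply mirrors the value mentioned in the text.

```python
from typing import Dict, List, Sequence


def substitution_frequencies(consensus: str, sequences: Sequence[str]) -> Dict[int, float]:
    """Map each 1-based position to the fraction of sequences that differ
    from the consensus residue there (gaps '-' are not counted as changes)."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    if any(len(s) != len(consensus) for s in sequences):
        raise ValueError("all sequences must be aligned to the consensus length")
    freqs: Dict[int, float] = {}
    for pos, ref in enumerate(consensus, start=1):
        changed = sum(1 for s in sequences if s[pos - 1] not in (ref, "-"))
        if changed:
            freqs[pos] = changed / len(sequences)
    return freqs


def positions_above(freqs: Dict[int, float], cutoff: float = 0.5) -> List[int]:
    """Positions whose substitution frequency strictly exceeds the cutoff."""
    return sorted(pos for pos, f in freqs.items() if f > cutoff)


# Hypothetical aligned fragment for one treatment group (not study data).
toy_consensus = "MKVE"
toy_group = ["MKVE", "MRVE", "MRVG", "MRVE"]
toy_freqs = substitution_frequencies(toy_consensus, toy_group)
toy_dominant = positions_above(toy_freqs)
```

In this toy group, position 2 differs from the consensus in three of four sequences (0.75) and position 4 in one of four (0.25), so only position 2 exceeds the 50% cutoff.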

Fig. 6: Next-generation sequencing identified new mutations in treated rebound animals.

A, B Distribution of new and unique mutations in the protease (PR), reverse transcriptase (RT), and integrase (IN) regions of rebound viruses belonging to the different treatment groups infected with either HIV-1 strain NL4_3 or ADA. Violin plots depict the frequency of new mutations in each region relative to the total mutations identified in the individual sample. The frequency of new mutations identified in PR, RT, and IN regions from (n = 16) samples (HIV-1NL4_3 group) and (n = 4) samples (HIV-1ADA group) was analyzed using Type III Analysis of Variance Table with Satterthwaite’s method (p values shown in the figure). Comparison between the two regions was performed using Student’s t-test (two-tailed, paired). C–E The pattern of polymorphism and mutations associated with new mutations identified in the RT region of HIV-1NL4_3 treatment groups. Bar plots represent the frequency of new mutations in the RT region across different treatment groups: LA ART, CRISPR, and the dual-treated rebound group. Each bar represents the frequency of a specific amino acid mutation at a consensus position, with the height of the bar corresponding to the percentage frequency; mutations shared between groups are indicated by consistent color coding. The letters represent single-letter amino acid codes.

CRISPR-associated and CD4 binding loop mutation analysis

Previous studies have reported CRISPR-mediated indels and mutations30,31. We detected no CRISPR-mediated indels or nucleotide substitutions at or near our gag gRNA targeting region cleavage site (Supplementary Fig. 10A). Interestingly, non-synonymous mutations were observed within the CD4 binding loop region in (2/5) dual-treated samples, resulting in amino acid changes from glycine to arginine (Supplementary Fig. 10B).

First, we performed Sanger sequencing to analyze viral sequences at the endpoint from HIV-1-infected animals that were untreated or treated with LASERART alone, CRISPR alone, or dual therapy (rebound animals), as presented in Figs. 1, 2. This analysis revealed a limited number of mutations. Second, to determine whether these mutations were pre-existing or generated during the treatment, we extended our analysis to stored frozen pre- and post-ART plasma samples, highlighted in Fig. 3. Surprisingly, Sanger sequencing did not identify significant drug-resistance mutations (DRMs) in the rebound animals despite the administration of a combination of four long-acting antiretroviral drugs (dolutegravir, lamivudine, abacavir, and rilpivirine) in our study; this could be attributed to the detection sensitivity of Sanger sequencing, which identifies DRMs only at frequencies of 15-20% or higher. Third, this unexpected result led us to employ a higher-resolution sequencing method, Illumina MiSeq NGS, to investigate whether there were undetected DRMs in the single- and/or double-treated rebound animal samples. Using the Illumina MiSeq NGS platform, we detected drug resistance and new mutations in the viral sequences, as presented in Figs. 4–6. Fig. 1 integrates data from Figs. 1,2 and Table 1 (Sanger sequencing results) on the bottom left and Figs. 3–6 and Tables 2, 3 (Illumina MiSeq NGS results) on the bottom right.

Discussion

We and others have demonstrated the role of hu-mice as a model of persistent viral infection and antiretroviral therapies32,33,34,35,36,37,38,39. The current study reports the longitudinal molecular characterization of rebound viruses in an HIV-1-infected humanized mouse model with four treatment groups, administered alone or combined with four LA ARTs and CRISPR-Cas9 targeting the HIV-1 LTR-Gag region. We isolated plasma viral RNA at two weeks following viral infection, at the initiation of ART at six weeks, and at 14 weeks at the study’s conclusion. The RNA isolation was followed by single-step PCR amplification of three structural viral genes. This comprised 6.7 Kb of the HIV-1 genome in three independent reactions from 29 samples. These samples were infected with HIV-1NL4-3 or HIV-1ADA strains. We screened for viral rebound mutants using Sanger Sequencing at the study endpoint. Then, we went back to analyze the earlier pretreatment time points. We then employed highly sensitive Illumina-based next-generation sequencing to screen for low-frequency variants not identified in SS. Interestingly, we detected low-frequency mutations across the untreated viral plasma samples compared to the master sequence. Though we observed increased mutations in most treated and rebound animals in all three viral regions analyzed, the env region showed the highest number of mutations (Fig. 2C and Supplementary Fig. 6). We also found that the variant mutants that emerged in the HIV-1NL4-3 pol region were significantly correlated with the plasma viral load at the study’s conclusion. In addition, it is also noteworthy to mention that we observed predominant sequence variability in the V1 and V2 regions of endpoint env sequences as compared to the other three env regions of the treated and rebound animals, which aligns with the previously reported patient-derived sequences40. 
These mutations identified in the env region of our rebound hu-mice samples have been shown to play an essential role in disturbing the co-receptor binding and escape from neutralizing antibodies in clinical studies41. While the HIV env region is known to acquire a higher mutation rate than gag or pol due to host immune selection and for viral fitness to survive in humans42, we are the first to report the selection pressure in the env region in a humanized mouse model within 14 weeks of infection. In addition, though the HIV-gag region was targeted by CRISPR treatment, the gag regions remain highly conserved in our humanized mice. We also observed APOBEC3G-associated mutations in the pol region of our treated animal samples using the Mark APOBEC signature program43, suggesting the active role of host restriction factors during selection pressure (Supplementary Fig. 1C and Supplementary Fig. 4C), and the impact of host adaptation in hu-mice.

Pretreatment resistance is a principal concern for ART. It was suggested that pretreatment NRTI drug resistance was associated with lower rates of virologic suppression in clinical study participants receiving integrase inhibitor-based ART therapies and could eventually lead to treatment failures44. Thus, we aimed to validate and identify the source of the mutations observed in the rebound animals at the study endpoint and determine whether they resulted from the treatment modalities employed during the study course or preexisted in the viral genome. We conducted a longitudinal analysis of the same mouse samples stored frozen before ART and post-ART/before CRISPR treatments were initiated. Interestingly, we observed higher env sequence diversity in the single-treatment (either LA ART or CRISPR) groups compared to the dual-treated viral rebound group, so sequences from the single-treated groups were included in the analysis in Fig. 4. We confirmed from our analysis that though a few pretreatment mutations were present before ART treatment, most new mutations identified at the endpoint were introduced during the LA ART treatment period, but not after ART withdrawal.

Next, we examined the LA ART-associated DRAMs. Though HIV-1-infected patients are currently using the combinatorial long-acting ART treatment45, limited information is available on the drug resistance pattern against the long-acting regimens. There are only three reported drug resistance studies related to long-acting ART. One recent clinical trial evaluated the efficacy of LA Cabotegravir (CAB) and Rilpivirine (RPV) combination and found that participants experiencing virological failure had mutations associated with resistance to RPV. These included E138A, E138K plus C108I, along with the integrase mutation N155H. In addition, INSTI resistance was also reported in patients with virological failure in a preventive LA injectable Cabotegravir trial. Another study reported one NNRTI and NRTI carrying the M184V mutation against tenofovir disoproxil fumarate/emtricitabine (TDF/FTC)46,47. The other report was against the LA Lenacapavir (LEN), a novel capsid HIV-1 inhibitor48.

By utilizing SS, we detected a few accessory and minor LA ART-associated DRAMs in both HIV-1NL4-3 and HIV-1ADA-infected hu-mice rebound samples, except M41L, which is a primary NRTI HIV-1 drug resistance mutation (Table 2). It is known that SS generates a single-consensus sequence at a 15-20% threshold; thus, any lower-frequency DRAMs that are less than the threshold would be undetected. Earlier evidence has shown that the undetectable minor multi-drug resistant strains during treatment interruption contribute to the later dominant viral escape mutants when ART was restored49, suggesting the critical role of low-frequency drug-resistant variants escaping ART treatment. More evidence later has also revealed that low-frequency HIV-1 DRAMs detected using NGS can be detrimental to the treatment outcomes50,51,52. Thus, we harnessed the ability of NGS for the low-prevalence DRAMs, at or near 1% of the threshold. It has been reported that mutation detection specificity increased dramatically at the 2% threshold under analytical sensitivity conditions, suggesting that the 2% threshold is more reliable than the lowest 1% threshold21. Thus, in this study, we employed a differential range of thresholds (20, 5, and 2%) and performed three independent sample analyses to detect low-frequency mutants in our NGS platform. Interestingly, we detected seven additional major and several accessory DRAMs, as shown in Table 3. We also observed a strong concordance between our NGS-based HIV-1 drug resistance (HIVDR) testing data and SS at the 20% threshold, which was reported by others previously in patient samples28. Compared to SS, NGS at 2% and 5% thresholds (Fig. 5B) allowed better detection of major as well as minor DRAMs in our humanized mice plasma samples, consistent with previously reported clinical studies21,53,54, and is the first report from a disease-specific small animal model system receiving ART and CRISPR multimodal therapy.

We also identified new mutations in our NGS analysis at a 5% cut-off 50,55 across all three significant regions analyzed in our hu-mice rebound samples treated with either LA-ART, CRISPR, or dual therapy, which were not documented in the Stanford DR database (Fig. 6). We identified a significant difference in region-specific mutation frequencies of these mutations in both HIV-1NL4-3 and HIV-1ADA-infected plasma samples. Interestingly, most of the new mutations we observed were in the LA ART-treated group, and each treatment group exhibited a relatively unique new mutation pattern. However, common E29EG and Q151QR mutations were present in both the LA ART and the dual-treated group. This suggests that LA ART treatment may have contributed to these new escape mutations and viral rebound in the dual-treated group. The functional significance and the role of these new mutations identified at higher frequency in the RT region of our hu-mice remain to be determined. Still, they may have contributed to viral rebounds observed in the animals administered LA ART and/or CRISPR therapy. Though limited evidence has shown that guide RNA-mediated selective pressure drives viral evolution at untargeted loci, our findings suggest targeting gag may impose selective constraints that indirectly promote adaptive changes in other viral regions, such as pol and env. The selective pressure likely arises from the fitness cost imposed by CRISPR-mediated disruption of gag, which may constrain replication-competent variants and favor those with compensatory changes.

To determine whether the reported mutations will alter the virus’s binding capability to the host receptor, we examined the CD4 binding loop sequence of all the animals from the first study (Supplementary Fig. 10B). Interestingly, we identified one single-nucleotide substitution in the CD4 binding loop sequence in 40% (2/5) of the dual-treated viral rebound mice. This suggests that the combined treatment may have altered viral entry and transmission to other immune cells, but it requires further investigation.

Lastly, we examined CRISPR-mediated mutations from our HIV-1NL4-3-infected samples. So far, there are only a few reports on CRISPR-mediated viral escape mutants. A small population of evolving quasi-species was reported escaping either single or dual CRISPR targets31,56,57, which implies that HIV-1 possesses the capability to undergo evolutionary adaptations during the suboptimal CRISPR-based therapeutics. We are the first to report the molecular analysis of rebound viruses from CRISPR-based combinatorial therapy (Supplementary Fig. 10A). However, we detected no mutations at or near the guide RNA targeting site in the CRISPR-treated rebound virus.

To the best of our knowledge, this study is the first to identify unique mutations associated with LA ART and CRISPR therapies in infected hu-mice. However, the study has limitations. CRISPR on-target mutation analysis using NGS was not performed. Although mutations were identified from plasma samples, which body compartment gave rise to the viral mutants needs further investigation. We did not observe evidence of sequence cluster formation in the phylogenetic analysis of our treated samples (Supplementary Fig. 11), which could be attributed to the origin of the founder virus. The phylogenetic tree generated from our longitudinal analysis sequences (Supplementary Fig. 11B) demonstrated viral evolution during or after the treatment period of approximately 14 weeks.

In conclusion, our study reports several unique findings. First, we provide the molecular characterization of rebound viruses from infected hu-mice receiving HIV-disease-specific multimodal long-acting ART and CRISPR treatments. Second, as noted, we observed host-selection pressure and the evolution of rebound viruses in our hu-mice. Third, we performed a longitudinal analysis of viral mutants from plasma-derived RNA samples using NGS in infected hu-mice. Fourth, we detected new emerging variants carrying DRAMs using NGS analysis of longitudinal samples. Fifth, we report, for the first time, HIV DRAM genotyping using both SS and NGS. Sixth, by employing the updated Stanford Drug Resistance Database, we identified a few accessory mutations. Our results showed a high correlation between SS and NGS at the 20% threshold within the same treatment groups. However, low-frequency major and accessory DRAMs were detected only by NGS at the 5% and 2% thresholds. Seventh, the env region was found to have highly diverse sequences, in support of previous clinical findings. Eighth, we identified unique mutations in the PR/RT/IN regions of our treated animals. Ninth, our work demonstrated that the emergence of drug-resistant escape mutants to the mono or dual therapies could have contributed to the viral rebound. We will examine the role of the new mutations and explore other mechanisms that may have contributed to the viral rebound in our humanized mice study. These findings highlight the role of treatment-dependent selection pressure in driving viral evolution in our hu-mice and underscore the importance of continued monitoring and assessment of viral genetic diversity over time. These findings in a disease-appropriate translational model system may lead to the improvement of current therapeutic interventions.
The important goal is to understand the genetic and evolutionary mechanisms responsible for viral persistence and drug resistance during HIV-1 infection and treatment in patients, as combinatorial long-acting therapies are already employed in the clinics.

Methods

Sample Collection

NSG (NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ) humanized mice were analyzed in this study. The procedure for humanization was detailed in our prior publication13. The samples for this analysis were derived from two independent studies. CD34+ cells used for humanization were obtained from two independent single donors. In the first study, humanized mice were infected intraperitoneally with HIV-1NL4-3 at 10^4 tissue culture infective dose 50 (TCID50)/ml, and in the second study, they were infected intraperitoneally with HIV-1ADA at 10^4 TCID50/ml. The plasma samples were collected through sub-mandibular vein puncture into EDTA-coated tubes and stored in an ultra-low freezer for subsequent analysis. All plasma samples analyzed in this manuscript were obtained from 29 virally rebounded NSG humanized mice, including 22 samples from study one and seven from study two, as reported in our previous publication on HIV-1 elimination from a subset of humanized mice (Nature Communications, 2019; PMID: 31266936). In this manuscript, we aimed to investigate the reason for viral rebound in those dual-treated humanized mice using sensitive sequential detection methods targeting major HIV genes. All the humanized mice in the first publication were of the same age (18–20 weeks). Within these cohorts, mice were assigned to four treatment groups: LA ART, CRISPR-Cas9, dual therapy, or no treatment.
In that study, humanized mice were infected with either HIV-1NL4-3 or HIV-1ADA and were subsequently divided into four treatment subgroups: 1- no treatment; 2- LA ART treatment consisting of NMDTG and NRPV at 45 mg/kg and NMABC and NM3TC at 40 mg/kg, administered via intramuscular injection two weeks post-infection (WPI) and ceased at 6 WPI; 3- CRISPR-Cas9 treatment at 7/8 WPI via a single intravenous injection using adeno-associated virus serotype 9 (AAV9); or 4- sequential LA ART and CRISPR-Cas9 treatment with viral rebound. The design of the gRNA and the construction of the CRISPR-Cas9 expression plasmid and AAV9 vector were as described in our published manuscript13. After sequence verification of the constructed plasmid pX601-CMV-saCas9-LTR1-GagD, the construct was packaged into the AAV9 serotype. AAV9 was chosen as the vector for CRISPR-Cas9 delivery for its robust transduction efficiencies in multiple tissues. The aim was to permit efficient AAV entry into all putative HIV-1 target tissue reservoirs. A single intravenous (IV) dose of AAV9-CRISPR-Cas9 was administered at week 9 for study 1 and at week 7 for study 2 to the CRISPR-assigned experimental groups, as shown in Fig. 1. The assignment of control and treatment groups (1- no treatment, 2- LA ART treatment, 3- CRISPR-Cas9 treatment, and 4- sequential LA ART and CRISPR-Cas9 treatment with viral rebound) is illustrated in Fig. 1 and derived from our earlier publication13. At the end of the earlier study, from which the above-mentioned plasma samples were derived, all animals were sacrificed using isoflurane inhalation.

RNA extraction and One-step RT-PCR

RNA was extracted from 50 μL of plasma collected in EDTA blood collection tubes using a Qiagen MinElute virus spin kit (Cat. No. 57704, QIAGEN, Hilden, Germany). cDNA synthesis and first-round PCR were performed in a single reaction using the Superscript III One-Step RT PCR Platinum Taq HiFi kit (Cat. No. 12574035, Thermo Fisher, Massachusetts, USA). For one-step reverse transcription and PCR, each sample reaction was run independently in its own tube using gene-specific primers and target RNA. For PCR amplification of each HIV region of interest, 500 ng of RNA was added to a mixture containing molecular biology grade water, reaction mix, enzyme mix, and 10 pmol/μl of forward and reverse primer sets. The primers used for PCR amplification are described in Supplementary Table 4. The PCR reaction conditions for the HIV Gag, Pol, and Env regions are provided in Supplementary Table 5. To minimize the risk of cross-contamination between treatment groups, all molecular biology procedures, including PCR amplification, gel electrophoresis, purification of PCR products, and preparation for sequencing, were carried out on samples belonging to the same treatment group in a single experimental session and in two different clean rooms. This approach was designed to prevent unintended transfer of genetic material between samples from different treatment groups.

Sanger sequencing

Targeted PCR products were visualized by gel electrophoresis with a Quick-Load purple 1 kb DNA ladder (New England Biolabs). PCR products were purified using the QIAquick® PCR Purification Kit (Cat. No. 28104, QIAGEN, Hilden, Germany). Purified amplicons (50 ng) were added to a mixture containing molecular biology grade water and 5 pmol/μl primer, in a final volume of 15 μL, and submitted to the Genewiz Sanger Sequencing Facility (Genewiz, Azenta Life Sciences, MA, USA). The GenBank accession numbers for the Sanger-derived gag, pol, and env region sequences are PQ144791-PQ144834 for HIV-1NL4-3 and PQ144835-PQ144848 for HIV-1ADA.

Next-generation sequencing

We selected the HIV pol region for our NGS analysis. The pol amplicon tested for NGS analysis was approximately 2959 bp, spanning positions 2254 to 5213 on the HIV genome (based on standard HXB2 coordinates) and covering three essential genes: nearly full-length protease (PR, codons 3-99), full reverse transcriptase (RT), and full integrase (IN). A 2100 Expert High Sensitivity DNA Assay was performed for QC on the pol PCR amplicons of 25 selected samples. For NGS library construction, the targeted PCR products underwent a segmentation process. They were re-amplified using read sequencing primers containing the MiSeq (Illumina, Inc., San Diego, CA, USA) index adapter sequences. The samples were processed according to the Illumina DNA Preparation Reference Guide (Illumina, Inc., San Diego, CA, USA). Briefly, 250 ng of each sample’s pol amplicon products were segmented using the bead-linked transposons, and the adapter-tagged DNA was cleaned using Tagment Wash Buffer. Tagmentation was used as the first step in library preparation, in which unfragmented DNA was cleaved and tagged for analysis. The tagmented DNA was then re-amplified for five cycles of denaturation at 98 °C for 45 s, annealing at 62 °C for 30 s, and extension at 68 °C for 2 min, followed by a final extension at 68 °C for 10 min. The indexed PCR product was cleaned with purification beads and 80% ethanol. The indexed amplicons were quantified using a Qubit assay, and a normalized library was generated. The pooled library was loaded on the Illumina MiSeq chip, and a 2 × 150-bp paired-end multiplex run was used to sequence the constructed library. The GenBank accession numbers for the Illumina MiSeq NGS sequences are PQ152024-PQ152044 for HIV-1NL4-3 and PQ152045-PQ152051 for HIV-1ADA.

Sequencing Data Analysis

High-quality reads derived from SS (quality score, QS, equal to or higher than 40) were selected for contig assembly using the Seqman Ngen software (DNASTAR, Inc.) with optimal parameter settings. Gag, pol, and env sequences were aligned and highlighted with mutations or APOBEC signatures utilizing Highlighter software43. Selected key genes and subregions (Protease, RT, Integrase, envelope V1-V5) were trimmed using SeqNinja software (DNASTAR, Inc.). Multiple sequence alignment files were created for all the samples against HXB2/HIV-1NL4-3/HIV-1ADA references using the MUSCLE (Multiple Sequence Comparison by Log-Expectation) algorithm, version 5.1.0, at https://www.ebi.ac.uk/Tools/msa/muscle and/or the built-in MAFFT7 algorithm in the Los Alamos HIV database HIVAlign tool. The fastq files derived from NGS were quality checked using the fastqc tool (bioinformatics.babraham.ac.uk/projects/fastqc), with an evaluation of quality scores across all the bases, sequence length distribution, and overrepresented sequences. One of our samples (4348) failed the quality test and was thus excluded from the subsequent assembly step and analysis. The resequencing assembly automatically merges forward and reverse reads from each sample. Read trimming, adapter scanning, and primer removal were built into the Lasergene MegAlign Pro software (DNASTAR, Inc.) with the input sequencing platform set as “Illumina” and the data type as “pair-ended” data. The “xng” algorithm in the DNAstar Lasergene package was used for assembling sequence reads. Variant analysis was also performed using Lasergene MegAlign Pro software from DNASTAR, Inc. The original fastq files were converted to CodFreq files using the HIV-1 NGS analysis pipeline21. The resulting CodFreq files were used for downstream HIV DRM analysis per the Stanford HIV Drug Resistance Database. While employing the Stanford HIVdb Algorithm, we applied 20%, 5%, and 2% sensitivity thresholds to detect drug resistance mutations.
Susceptibility to drugs was classified as susceptible (S), low-level (L), intermediate (I), or resistant (R) as per the drug resistance mutation scores for PR, NRTI, NNRTI, and INSTI in the Stanford HIV Drug Resistance Database and the 2022 update of the drug resistance mutations in HIV-1 from IAS-USA (https://www.iasusa.org/resources/hiv-drug-resistance-mutations/). Data visualization was performed using tools provided by the LANL HIV Database (https://www.hiv.lanl.gov/) and the R environment (http://www.rstudio.com/).
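As an illustration of the kind of per-position codon tally on which threshold-based DRM calling relies, the following minimal sketch counts codons and derives amino acid fractions. It assumes reads that are already aligned, in frame, and start at the same codon; the function names, the toy reads, and the starting codon number are hypothetical. It is not the HIV-1 NGS analysis pipeline cited above and does not reproduce the CodFreq file format.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, with codons ordered by T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_frequencies(reads, first_codon=1):
    """Count codons per position for in-frame reads that share a start codon."""
    counts = defaultdict(Counter)
    for read in reads:
        usable = len(read) - len(read) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = read[i:i + 3].upper()
            if codon in CODON_TABLE:  # skip codons containing N or gap characters
                counts[first_codon + i // 3][codon] += 1
    return counts


def amino_acid_fractions(codon_counts):
    """Collapse codon counts at one position into amino acid read fractions."""
    totals = Counter()
    for codon, n in codon_counts.items():
        totals[CODON_TABLE[codon]] += n
    depth = sum(totals.values())
    return {aa: n / depth for aa, n in totals.items()}


# Toy reads (made up) covering two codons, numbered 184 and 185 for illustration.
reads = ["ATGGTG", "ATGGTG", "GTGGTG", "ATGNNN"]
counts = codon_frequencies(reads, first_codon=184)
print(amino_acid_fractions(counts[184]))
```

The resulting fractions can then be compared against a chosen cut-off such as 20%, 5%, or 2%; real pipelines additionally handle read alignment, base quality filtering, and insertions or deletions, which this sketch omits.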

Statistics and reproducibility

Statistical comparisons between more than two groups (Figs. 2C, 4) were performed using one-way analysis of variance (ANOVA) in GraphPad Prism 10.0.0 (GraphPad Software, San Diego, CA). In this study, humanized mice were infected with either HIV-1NL4-3 (Study 1; n = 22) or HIV-1ADA (Study 2; n = 7). For study 1, mice were subsequently divided into four treatment subgroups: 1- no treatment (n = 3); 2- LA ART treatment (n = 7–9); 3- CRISPR-Cas9 treatment (n = 5); or 4- sequential LA ART and CRISPR-Cas9 treatment with viral rebound (n = 5). For study 2 (HIV-1ADA infection), three subgroups were included: 1- no treatment (n = 3); 2- LA ART treatment (n = 2); and 3- sequential LA ART and CRISPR-Cas9 treatment with viral rebound (n = 2). Statistical comparisons between more than two groups with smaller sample sizes (n < 3) were performed using the non-parametric Kruskal-Wallis test (Supplementary Fig. 6). A modified ANOVA, Type III Analysis of Variance with Satterthwaite’s method, was performed using R to compare the differences in the frequency of distribution in three subregions derived from individual samples in Fig. 6A and B. Comparisons between two subregions (Fig. 6A, B) were performed using a paired Student’s t-test (two-tailed). The Shapiro-Wilk test was used to assess the normality of the data distribution. If the data were not normally distributed, Spearman’s rank correlation analysis was used to test associations between plasma viral load and mutation frequency (Table 1 and Supplementary Table 1). A p-value less than 0.05 was considered statistically significant. Figures with error bars represent the mean ± standard error of the mean.
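For readers unfamiliar with Spearman’s rank correlation, the following minimal sketch computes the coefficient as the Pearson correlation of average ranks, which handles tied values. The function names and the paired values are hypothetical and for illustration only; it does not compute a p-value and is not the software used for the analyses reported here (a statistics package such as `scipy.stats.spearmanr` would normally be used in practice).

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up paired observations: plasma viral load and mutation frequency.
viral_load = [1.2e4, 5.0e5, 3.3e3, 8.1e4, 2.0e6]
mutation_freq = [0.03, 0.12, 0.05, 0.02, 0.30]
print(round(spearman_rho(viral_load, mutation_freq), 3))
```

Because the coefficient depends only on ranks, it does not assume normally distributed data, which is why it is a common fallback when a normality test such as Shapiro-Wilk rejects normality.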

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.