A small molecule stabilizer rescues the surface expression of nearly all missense variants in a GPCR

Mighell, Taylor L.; Lehner, Ben

doi:10.1038/s41594-025-01659-6

Article
Open access
Published: 22 September 2025

Nature Structural & Molecular Biology (2025)

Abstract

Reduced protein abundance is the most frequent mechanism by which rare missense variants cause disease. A promising therapeutic avenue for treating reduced abundance variants is pharmacological chaperones (PCs, also known as correctors or stabilizers), small molecules that bind to and stabilize target proteins. PCs have been approved as clinical treatments for specific variants, but protein energetics suggest their effects might be much more general. To comprehensively assess PC efficacy for variation in a given protein, it is necessary to first assign the molecular mechanism explaining all pathogenic variants, then measure the response to the PC. Here we establish such a framework for the vasopressin 2 receptor (V2R), a G-protein-coupled receptor in which loss-of-function variants cause nephrogenic diabetes insipidus (NDI). Our data show that more than half of NDI variants are poorly expressed, highlighting loss of stability as the major pathogenic mechanism. Treatment with a PC rescues the expression of 87% of destabilized variants. The non-rescued variants identify the drug’s predicted binding site. Our results provide proof-of-principle that small molecule binding can rescue destabilizing variants throughout a protein’s structure. The application of this principle to other proteins should allow the development of effective therapies for many different rare diseases.

Main

Rare genetic diseases pose a formidable challenge for global health. For any single disease, the number of affected individuals is a small percentage of the population; however, in total, as many as 300 million people are affected by rare diseases¹. Despite recent progress in computational methods², the identification of causal pathogenic variants and the determination of molecular mechanisms remain arduous challenges^3,4. Furthermore, developing effective therapies for genetic diseases for which only a small number of patients carry each causal variant is extremely challenging.

The most frequent mechanism by which missense variants cause rare diseases is reduced protein abundance. Large-scale experimental⁵ and computational^6,7 surveys estimate that 40–60% of pathogenic variants are explained by loss of abundance. Compensating for this reduced abundance therefore represents a potentially general strategy to treat rare diseases. PCs are typically small molecules whose binding increases the thermostability and subsequent steady-state expression level of a target protein. A striking example of PC success is in the treatment of cystic fibrosis, in which a combination treatment of two PCs and a channel potentiator offers effective treatment for people with the most common alteration, p.Phe508del, and some other variants⁸. PCs—also referred to as protein stabilizers or correctors—have also been developed for non-membrane proteins, including clinically approved PCs for lysosomal storage disease proteins⁹ and the amyloidogenic transthyretin protein¹⁰, as well as experimental PCs for the most frequently mutated tumor-suppressor protein, p53 (refs. ^11,12). To maximize the effectiveness of PC therapy, however, the mechanism of all pathogenic variation for a given protein, as well as the response to PC, must be identified.

Depending on the mechanism of action, some PCs could have high specificity, rescuing only subsets of pathogenic variants localized in particular regions of the protein^13,14. Alternatively, some PCs could behave simply, in accordance with the law of mass action, and have largely non-specific stabilizing effects that offset the destabilization by most variants in a protein, wherever they are located^12,15. Most human proteins are marginally stable, meaning that only small changes in folding energy are required to produce large changes in folded protein abundance¹⁶. Indeed, in a collated set of 223 experimentally determined changes in Gibbs free energy of folding (ΔΔG) values for membrane protein mutations, 91% were less than 3 kcal mol⁻¹ (ref. ¹⁷). Similarly, massive experimental mutagenesis of soluble proteins has confirmed that the vast majority of variants cause only small changes in folding energy¹⁸, as do known pathogenic variants⁵. Such small changes in fold stability could potentially be easily compensated for by small-molecule binding, which can produce comparable changes in free energy to those induced by mutations¹⁹. Moreover, provided that free energies combine mostly additively^12,20,21, the stabilization conferred by small-molecule binding should be largely independent of the binding site and mutation location, as long as the compound specifically binds the native folded state.

The vasopressin 2 receptor (V2R) is a GPCR with an important role in water homeostasis in the kidneys²². The hormone arginine vasopressin (AVP) is the primary endogenous ligand of V2R, and AVP levels regulate the permeability of the kidney’s collecting duct: elevated AVP levels lead to increased permeability, promoting water reabsorption. Upon AVP binding, V2R adopts an active conformation and couples primarily with G_αs-containing heterotrimeric G proteins, leading to intracellular signaling that results in translocation of aquaporin-2 water channels to the plasma membrane²³. When V2R function is lost, water reabsorption is compromised, leading to nephrogenic diabetes insipidus (NDI)²⁴. Individuals with NDI experience chronic dehydration that can lead to severe clinical outcomes, and treatment options are available only to manage symptoms²⁵. Highlighting the sensitivity and importance of this system, rare gain-of-function mutations that elevate the basal activity of V2R result in nephrogenic syndrome of inappropriate antidiuresis (NSIAD)²⁶. The gene encoding V2R, AVPR2, is on the X chromosome. The majority of people with NDI are hemizygous males, although skewed X inactivation has been reported to cause NDI or subclinical phenotypes in some females²⁷. Hundreds of AVPR2 variants have been found in individuals with NDI, of which about half are missense variants; the remainder are nonsense variants, small insertions or deletions, or splice-site mutations²⁸. Only a fraction of the missense variants has been experimentally characterized^23,28,29,30,31,32,33,34,35.

Here, we use V2R as a model system to directly test whether PCs can rescue all destabilizing mutants in a protein. First, we use a multiplexed assay to quantify the effects of all possible variants on the cell surface expression of V2R, revealing that more than half the known pathogenic variants strongly impair V2R expression, as do thousands of other missense variants throughout the protein. Strikingly, treatment with the V2R small-molecule binder tolvaptan rescues the expression of nearly all these variants. Only a small number are not compensated for, with these identifying functionally important sites and the drug binding site. The application of this approach and principle to other proteins should allow the development of general stabilizers for many genetic diseases.

Results

Massively parallel measurement of V2R surface expression

We used scalable and uniform nicking (SUNi) mutagenesis³⁶ to generate a saturation variant library containing all single amino acid changes to the V2R coding sequence. Mutagenesis primers were designed to introduce a degenerate NNK or NNS codon at each position (depending on the wild-type codon; K = G or T, S = G or C), and random DNA barcodes were subsequently inserted into the plasmid backbone to enable identification of each variant with short-read sequencing. To link the full V2R variant sequence with the short DNA barcode, we used long-read sequencing and successfully linked 66,031 barcode variants (Extended Data Fig. 1a). In total, 7,005 out of 7,400 (94.7%) possible missense and nonsense variants were represented by at least 1 barcode, with a median of 5 barcodes per variant.
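The reach of the degenerate codons described above can be checked by enumeration. The following is a minimal sketch using the standard genetic code; the helper names `expand` and `encoded` are illustrative and not taken from the study's pipeline.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed with the first base varying slowest
# and the third base varying fastest (TTT, TTC, TTA, TTG, TCT, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC degenerate bases used in the text: N = any base, K = G or T, S = G or C.
DEGENERATE = {"N": "ACGT", "K": "GT", "S": "CG"}

def expand(degenerate_codon):
    """List every concrete codon matched by a degenerate codon such as 'NNK'."""
    pools = [DEGENERATE.get(base, base) for base in degenerate_codon]
    return ["".join(codon) for codon in product(*pools)]

def encoded(degenerate_codon):
    """Set of amino acids ('*' = stop) encoded by a degenerate codon."""
    return {CODON_TABLE[codon] for codon in expand(degenerate_codon)}

for name in ("NNK", "NNS"):
    products = encoded(name)
    print(name, len(expand(name)), "codons;",
          len(products - {"*"}), "amino acids; stop included:", "*" in products)
```

Both NNK and NNS expand to 32 codons that together cover all 20 amino acids plus a single stop codon (TAG), which is why either design can yield every missense and nonsense substitution at a position.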

We implemented a fluorescence-activated cell sorting (FACS)-based approach to measure the surface expression of the library of variants in human cells^37,38. First, the plasmid library was recombined into HEK293T landing-pad cells³⁹ (see Methods), ensuring that each cell carried exactly one variant. The V2R construct had a hemagglutinin (HA)-epitope tag at the amino terminus, which would be extracellular in a properly folded and trafficked receptor. Therefore, we performed immunostaining with a fluorescent antibody but without permeabilizing the cells, so that only receptors that had reached the membrane would contribute to signal. Then, we sorted cells into four bins, isolated DNA from the cells in each bin and used short-read sequencing to count the frequency of each variant across the bins (Fig. 1a).

**Fig. 1: Experimental approach and primary dataset overview.**

Surface expression scores were calculated using the frequency of each variant in each bin multiplied by the geometric mean fluorescence value associated with each bin (see Methods). Across four replicates, we obtained high-confidence measurements for 6,844 (92.5% of possible; Methods) variants with high reproducibility (average pairwise replicate Pearson’s r = 0.90, Extended Data Fig. 1b and data in Supplementary Table 1). Multiplexed scores show a strong correlation with FACS measurements from a series of isogenic cell lines containing single V2R variants (r = 0.95, Fig. 1i). Scores were normalized between a complete loss of function (designated as the median score of all nonsense variants in the first 300 positions of the receptor; assigned score 0), and the wild-type genotype (assigned a score of 1). Synonymous wild-type variant scores cluster distinctly from nonsense scores, and the distribution of all missense variants was bimodal: a large subset was expressed at near-wild-type levels, and a smaller subset near the level of nonsense variants (Fig. 1b). We used the top 95th percentile of truncation scores (0.35) and the bottom 95th percentile of synonymous wild-type scores (0.825) to categorize missense variants as well expressed (3,415 variants), moderately expressed (1,772 variants) or poorly expressed (1,025 variants, Fig. 1b).
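The scoring and categorization steps above can be sketched as follows. The exact formula is given in the Methods, which are not reproduced here, so the frequency-weighted average below is an assumed form; the function names and the choice of which side of each threshold is inclusive are also illustrative assumptions.

```python
from statistics import median

def weighted_score(bin_counts, bin_totals, bin_fluorescence):
    """Frequency-weighted mean fluorescence for one variant (assumed form).

    bin_counts:       reads for this variant in each sorted bin
    bin_totals:       total reads in each bin (converts counts to frequencies)
    bin_fluorescence: geometric mean fluorescence associated with each bin
    """
    freqs = [count / total for count, total in zip(bin_counts, bin_totals)]
    weighted = sum(f * g for f, g in zip(freqs, bin_fluorescence))
    return weighted / sum(freqs)

def normalize(raw, nonsense_raw, wild_type_raw):
    """Rescale so the median nonsense score maps to 0 and wild type maps to 1."""
    floor = median(nonsense_raw)
    return (raw - floor) / (wild_type_raw - floor)

def classify(score, low=0.35, high=0.825):
    """Bin a normalized score using the two thresholds quoted in the text."""
    if score < low:
        return "poor"
    if score < high:
        return "moderate"
    return "well"

# Hypothetical numbers, for illustration only.
raw = weighted_score([5, 20, 60, 15], [1000, 1000, 1000, 1000], [50, 200, 800, 3000])
print(classify(normalize(raw, nonsense_raw=[60, 75, 90], wild_type_raw=1500)))
```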

A heatmap representation of the data reveals that nonsense mutations are uniformly damaging through all seven transmembrane (TM) helices, but mostly do not compromise surface expression in the unstructured, carboxy-terminal tail (Fig. 1c). TM helices are particularly intolerant to substitutions, especially to charged amino acids. In fact, substitutions in the TM helices as a population are significantly more damaging than are those in the extra- or intracellular loops (Mann–Whitney U test, two-sided, P = 3.16×10⁻¹³⁵, P = 8.54×10⁻¹⁵³, respectively), whereas there is no difference between the extra- and intracellular loops (P = 0.39, Fig. 1d,g). Comparing the surface expression scores of all variants in each TM helix, variation in TM3 is the most damaging (median, 0.49; Fig. 1e). This is consistent with previous work suggesting that TM3 has a critical role in receptor stability⁴⁰ and acts as a ‘structural hub’⁴¹ for the TM bundle across class A GPCRs. TM2 is the second most mutation-sensitive (median, 0.56) TM helix. The predicted free-energy change (ΔG) for membrane integration⁴² of these helices indicates a relatively unfavorable process (median predicted ΔG values of 3.1 and 3.7 for TM2 and TM3, respectively), suggesting that substitutions in these locations could further compromise an already inefficient process. By contrast, TM7 exhibits a median predicted ΔG of 4.2, indicating inefficient membrane incorporation; however, substitutions here are well tolerated (Fig. 1f).

TM regions are solvated in lipids and are therefore enriched in hydrophobic residues, compared with extra- or intracellular residues. Substitutions in the TM regions that decrease hydrophobicity have negative effects on surface expression (Spearman’s ρ = 0.28, P = 1.8 × 10⁻⁴⁸), whereas the relationship is much weaker and in the opposite direction for the extra- or intracellular regions (ρ = −0.05, P = 0.002, Supplementary Fig. 1c). Visualization of each residue’s preference for hydrophobicity (calculated as rank correlation of surface expression scores with Kyte–Doolittle hydrophobicity) agrees with this notion (Fig. 1h), but highlights several residues in the core of the receptor that prefer hydrophilic residues (Extended Data Fig. 1f).
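The per-residue hydrophobicity preference described above is a rank correlation between the Kyte–Doolittle value of each substituted amino acid and its surface expression score. A minimal sketch in plain Python is shown below; the function names are illustrative, and the study's own implementation may differ.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = average
        i = j + 1
    return out

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def hydrophobicity_preference(scores_by_aa):
    """Rank correlation of substitution scores with hydropathy at one position.

    scores_by_aa maps a one-letter amino acid code to its expression score.
    Positive values mean hydrophobic substitutions are better tolerated.
    """
    amino_acids = sorted(scores_by_aa)
    return spearman([KD[a] for a in amino_acids],
                    [scores_by_aa[a] for a in amino_acids])
```

A position buried in the lipid-facing surface of a TM helix would be expected to return a value near +1, whereas the polar core residues highlighted in the text would return negative values.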

V2R undergoes post-translational modification in the form of O- and N-glycosylation on the N-terminal tail⁴³ as well as palmitoylation on the C-terminal tail⁴⁴. Positions 22–24 represent the N-glycosylation motif (N-X-S/T), and substitutions at positions 22 and 24 are much more deleterious than are those at neighboring positions in the unstructured N terminus (respective median position score, 0.77 and 0.68; Extended Data Fig. 1d). Whereas O-glycosylation was reported at several serines and threonines in the N terminus⁴³, we see strong mutational effects only at S5 and T6 (respective median position scores, 0.74 and 0.73), suggesting these are the only critical glycosylation sites, at least in HEK cells with this overexpression system. Substitutions to cysteine are significantly more deleterious in the extracellular loops than in the intracellular loops (Fig. 1j, Mann–Whitney U test, two-sided, P = 7.3 × 10⁻¹⁰). The extracellular cysteine substitutions likely disrupt a conserved disulfide bond⁴¹. Finally, palmitoylation at cysteines 341 and 342 has been reported to be important for surface expression⁴⁴; our data show that in HEK cells, substitutions at these sites are well tolerated (Extended Data Fig. 1e).

The contribution of V2R surface expression to pathogenicity

We next sought to understand the contribution of surface expression defects to V2R-related disease. To do this, we collated clinical variants from ClinVar⁴⁵, population variants from gnomAD²⁹ and variants reported in individuals with NDI or NSIAD in the Human Gene Mutation Database⁴⁶. There are striking differences between the surface expression scores for putatively benign variants (in gnomAD, or benign or likely benign in ClinVar) and the putatively loss-of-function ones (pathogenic or likely pathogenic in ClinVar, or NDI, Fig. 2a). There are no poorly expressed variants in the gnomAD, benign or likely benign sets, and only 30.7% and 26.6% are moderately expressed in likely benign and gnomAD, respectively (Fig. 2b). By contrast, for pathogenic, likely pathogenic and NDI variants, 40%, 53.3% and 55.6%, respectively, are poorly expressed, whereas 26.6%, 26.6% and 30%, respectively, are moderately expressed. There are only five known NSIAD variants, and they are constitutively active⁴⁷. Four are moderately expressed and one is well expressed, in accordance with the gain-of-function mechanism. Among variants of uncertain significance, 15.6% and 25% are poorly and moderately expressed, respectively (Fig. 2b). Stratifying gnomAD variants by allele frequency shows that common variants (allele frequency > 0.001) are all well expressed (Fig. 2c).

**Fig. 2: V2R clinical genetics and variant effect predictors.**

Classifying pathogenic variant mechanisms

Although computational variant effect predictors (VEPs) can distinguish pathogenic from benign variation with increasing accuracy, they cannot perform the crucial task of determining molecular mechanisms for pathogenicity (Fig. 2d). First, we evaluated how well computational VEPs can discriminate pathogenic (NDI alleles) from putatively benign variation (gnomAD alleles). ESM1b⁴⁸, a protein language model, achieves an area under the receiver operator characteristic curve (AUROC) of 0.91. EVE⁴⁹, which uses multiple sequence alignments to model evolution, achieves an AUROC of 0.92. Finally, AlphaMissense⁵⁰, which uses structural predictions, population variant frequencies and evolutionary data, achieves an AUROC of 0.94. In comparison, surface expression scores achieve an AUROC of 0.84, again suggesting that surface expression is a major determinant of pathogenicity.
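For readers unfamiliar with the metric, AUROC is the probability that a randomly chosen pathogenic variant is ranked as more damaging than a randomly chosen benign one, with ties counted as half. The sketch below computes it directly from that definition; the variant scores are invented for illustration. Because low surface expression indicates damage, expression scores are negated before ranking.

```python
def auroc(positive_scores, negative_scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Higher scores are taken to indicate the positive class. This equals the
    Mann-Whitney U statistic divided by the number of pairs.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical surface expression scores (not data from the study).
pathogenic_expression = [0.10, 0.30, 0.90]
benign_expression = [0.95, 1.00, 0.85]

# Negate so that lower expression corresponds to a higher "damage" score.
print(auroc([-s for s in pathogenic_expression],
            [-s for s in benign_expression]))
```

The quadratic loop is adequate for a few hundred clinical variants; a rank-based implementation would be preferable for much larger sets.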

However, models designed to predict effects of variation on protein stability performed less well than did the empirical surface expression scores. RaSP⁵¹ achieved an AUROC of 0.76, and ThermoMPNN⁵² achieved one of 0.80 (Fig. 2e). We also compared the predictors with the surface expression scores, and with each other. The VEPs correlated better with the surface expression data, with ρ = 0.6, 0.57 and 0.56 for AlphaMissense, EVE and ESM1b, respectively (Fig. 2f and Extended Data Fig. 2a–c). ThermoMPNN and RaSP correlated less well, at 0.46 and 0.33, respectively (Fig. 2f and Extended Data Fig. 2d,e). The poorer correlation of the stability predictors likely reflects the difficulty of predicting stability changes for membrane proteins, especially given that RaSP and ThermoMPNN were trained on soluble proteins. Finally, we explored whether mutation effect predictors could capture the pathogenicity of the NSIAD gain-of-function mutations. The number of variants is too small to make strong conclusions, but there seems to be much variability for the different models’ predictions, suggesting that gain-of-function variants could be more difficult to predict than loss-of-function ones (Extended Data Fig. 2a–e).

We hypothesized that our surface expression scores could clarify the mechanisms behind pathogenic variants, specifically whether a V2R variant is pathogenic owing to decreased surface expression or an impaired signaling capability. We posited that variation at positions close to the natural ligand (AVP) binding site in 3D space is more likely to disrupt signaling than is variation located further away. Indeed, we found that well and moderately expressed NDI variants are closer to AVP in a solved structure (Mann–Whitney U test, two-sided, P = 1.5×10⁻³ and 7.7×10⁻⁴ for well-expressed and moderately expressed variants compared with poorly expressed variants; PDB structure 7KH0; Fig. 2g,h).

Although there are more than 100 known NDI variants, this is likely only a small fraction of pathogenic variation in V2R. Therefore, we used AlphaMissense to predict all pathogenic variants in V2R (AlphaMissense threshold, 0.564 (ref. ⁵⁰)). There are 2,911 variants predicted to be pathogenic, of which there are high-confidence surface expression scores for 2,595. Of these, 979 (37.7%) are poorly expressed, 953 (36.7%) are moderately expressed and 663 (25.5%) are well expressed (Fig. 2i). AlphaMissense-predicted pathogenic variants have bimodal surface expression scores, just like the distribution of all missense variants. However, although the mode of the higher expression peak for all missense variants is near 1.0, the mode of the higher expression peak for AlphaMissense-predicted pathogenic variants is around 0.8; this is likely because the majority of them are in TM helices (1,789 out of 2,595, 68.9%). The mode of the higher expression peak for TM variants is also near 0.8 (see Fig. 1d). Finally, we compared the proximity of AlphaMissense-predicted pathogenic variants with AVP in the solved V2R structure. As with the NDI variants, the well-expressed and moderately expressed pathogenic variants are closer to AVP in 3D space than are the poorly expressed variants (P = 2.2 × 10⁻²¹ and 0.03, respectively, Mann–Whitney U test, two-sided, Fig. 2j), consistent with a specific signaling defect for variants that have at least a moderate expression level. For this much larger set, the well-expressed variants are also closer to AVP than are the moderately expressed ones (P = 1.9 × 10⁻⁹, Fig. 2j).

Temperature rescue of V2R variants

In principle, variants that are poorly expressed owing to decreased thermodynamic stability should be rescued by incubating the cells expressing the variant library at reduced temperature (Fig. 3a). Although reduced temperature could also affect biosynthetic processes that result in increased expression⁵³, we assume that, in the majority of the cases, rescue is through thermodynamic stabilization¹⁴. We sought to understand what fraction of V2R variants could be rescued by temperature reduction by culturing the cells expressing the V2R variant library at 27 °C. After 24 h at 27 °C, FACS analysis revealed a shift in the distribution of surface expression, with more cells with high V2R expression levels (Extended Data Fig. 3a). To gather quantitative data for all variants, we then sorted and sequenced the libraries rescued at 27 °C. High-confidence measurements were collected for 6,787 missense and nonsense variants (91.7% of possible), and replicates were highly correlated (r = 0.91, Extended Data Fig. 3c and data in Supplementary Table 1). The distribution of multiplexed missense variant scores mirrors the FACS data, with a shift of some variants from the low-expressing to the high-expressing peak (Fig. 3b).

**Fig. 3: Thermodynamic rescue of V2R variants.**

Then, we compared the surface expression of all variants at 27 °C with their expression under the control condition, 37 °C. The magnitude of rescue is greatest for variants with a moderate expression level (Fig. 3c). However, variants that are well or poorly expressed in the control condition do not show large changes. This result is consistent with temperature reduction having a constant, additive change in free energy, which, in turn, has a non-linear effect on the phenotype of interest^16,54, in this case steady-state expression level. Namely, as has been observed before in similar experimental paradigms^55,56, the magnitude of rescue is greatest for variants whose free energy values begin in the steepest part of the curve relating free energy to expression level (Fig. 3d).
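The non-linear relationship invoked above can be illustrated with a two-state folding model, in which the folded fraction is a sigmoidal function of the folding free energy. This is a minimal sketch under simplifying assumptions: rescue is modeled as a fixed 1 kcal mol⁻¹ stabilization (an arbitrary illustrative value), the Boltzmann temperature is held constant, and surface expression is taken to be proportional to the folded fraction.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal mol^-1 K^-1

def fraction_folded(dg_unfold, temp_k=310.15):
    """Two-state folded fraction.

    dg_unfold is the free energy of unfolding in kcal/mol; positive values
    favor the folded state, and 0 gives a 50:50 equilibrium.
    """
    return 1.0 / (1.0 + math.exp(-dg_unfold / (R_KCAL * temp_k)))

def rescue(dg_unfold, stabilization=1.0):
    """Gain in folded fraction from adding a constant stabilizing free energy."""
    return fraction_folded(dg_unfold + stabilization) - fraction_folded(dg_unfold)

# Variants spanning strongly destabilized (-4) to very stable (+3).
for dg in (-4.0, -2.0, -0.5, 1.0, 3.0):
    print(f"dG = {dg:+.1f}  folded = {fraction_folded(dg):.3f}  "
          f"rescue = {rescue(dg):.3f}")
```

The same additive stabilization produces a large gain for variants near the midpoint of the curve and almost none for variants that are already fully folded or very strongly destabilized, matching the pattern described for Fig. 3c,d.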

To explore the generality of rescue achieved through temperature reduction, we designed an approach to identify positions with varying levels of rescue compared with all other positions. First, we fit a LOWESS curve to the control versus rescue data, establishing a null expectation for the magnitude of rescue at each control surface expression value. Following this, we calculate residuals for each variant relative to the fit line. Then, the residuals for each position are compared with those at all other positions to find outliers (Fig. 3e). After fitting the LOWESS curve, we used a two-sided Mann–Whitney U test to identify positions with significantly biased residuals (Methods and Supplementary Table 2). For the 27 °C condition, out of 371 positions, only 4 positions are rescued less than expected and 6 are rescued more than expected on the basis of the model (false discovery rate (FDR) = 0.1, Fig. 3g,h). Among these ten positions are three glycosylation sites (positions 5, 6 and 24), which would be expected to have effects beyond simple thermodynamic destabilization. This suggests that the majority of variants can be rescued by temperature reduction.
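The final step of the outlier analysis, controlling the false discovery rate across per-position tests, can be sketched as below. The text does not state which FDR procedure was used; the Benjamini–Hochberg step-up procedure shown here is an assumption, and the P values are invented for illustration.

```python
def benjamini_hochberg(pvalues, fdr=0.1):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, in the input order, that is True where the
    corresponding hypothesis is rejected at the requested FDR.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    # Find the largest rank k whose p-value is at or below fdr * k / m.
    for rank, index in enumerate(order, start=1):
        if pvalues[index] <= fdr * rank / m:
            cutoff = rank
    rejected = [False] * m
    for index in order[:cutoff]:
        rejected[index] = True
    return rejected

# Hypothetical per-position P values from the residual comparison.
position_pvalues = {5: 0.001, 6: 0.03, 24: 0.04, 100: 0.5, 200: 0.9}
flags = benjamini_hochberg(list(position_pvalues.values()), fdr=0.1)
outliers = [pos for pos, flag in zip(position_pvalues, flags) if flag]
print(outliers)
```

In the full analysis each P value would come from a two-sided Mann–Whitney U test comparing one position's LOWESS residuals with those of all other positions.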

Pharmacological chaperone rescue of V2R variants

On the basis of the broad effectiveness of temperature rescue, we predicted that PC binding could have a general rescue effect. PCs are typically hydrophobic small molecules that cross the plasma membrane and stabilize a target protein by binding to the folded state to promote trafficking (Fig. 4a). Although PCs have effectively corrected NDI-associated V2R variants both in vitro and in the clinic⁵⁷, uncertainty around how general the effect of any PC would be has been a major obstacle²³. PCs might have quite specific effects, such that each PC rescues only a subset of structurally related variants^13,14. Alternatively, PCs might generally rescue mutants with reduced thermodynamic stability¹². Tolvaptan (also known as OPC-41061) is a V2R-specific, competitive, small-molecule antagonist⁵⁸ (Fig. 4b), approved for treating autosomal dominant polycystic kidney disease⁵⁹, in which AVP-V2R signaling is misregulated. Tolvaptan has also been explored in vitro for its activity as a PC. In most cases, tolvaptan not only rescues surface expression, but also enables some level of AVP-mediated signaling^33,35,60,61. However, as for PCs in general, tolvaptan has been tested for effectiveness on only a very limited set of variants (<15)^33,35,60,61.

**Fig. 4: Pharmacological rescue of V2R variants.**

Therefore, we treated cells bearing the V2R variant library with tolvaptan for 24 h and then profiled the population surface expression with FACS. Compared with the 27 °C condition, the tolvaptan condition has a stronger rescue effect, with more cells shifting from low to high expression (11% of cells shifting in tolvaptan compared with 5.3% at 27 °C, Extended Data Fig. 3b).

We sorted and sequenced the tolvaptan-rescued libraries and collected high-confidence measurements for 6,759 missense and nonsense variants (91.3% of possible, data in Supplementary Table 1). Replicates were well correlated (r = 0.74, Extended Data Fig. 3d), and multiplexed measurements were consistent with individual variant measurements in isogenic cell lines (r = 0.85, Extended Data Fig. 3e). Multiplexed missense score distributions match the FACS distributions and emphasize a near-complete shift of variants from low to high expression (Fig. 4c). Likewise, directly comparing surface expression levels in the presence and absence of tolvaptan indicates nearly universal rescue (Fig. 4d).

Next, we assessed the efficacy of tolvaptan rescue for known and predicted pathogenic variants. Tolvaptan is remarkably effective at rescuing NDI alleles: of 69 poorly expressed variants in the control condition, 60 (87.0%) are rescued to at least moderately expressed levels (Fig. 4e,f). We next evaluated effectiveness on variants predicted to be pathogenic by AlphaMissense. Strikingly, 835 of 965 poorly expressed variants (86.5%) are rescued to at least moderately expressed levels (Fig. 4g and Extended Data Fig. 3f). Tolvaptan has minor effects on NSIAD variant expression, likely because these are moderately or well expressed in the control condition (Extended Data Fig. 3g). As an alternative estimation of the extent of rescue, we identified those variants with tolvaptan rescue magnitude (that is, tolvaptan expression – control expression) greater than the 95% confidence interval of tolvaptan expression. Although the expected rescue magnitude for well-expressed variants is small, when considering variants with low control expression (<0.6), 88.4% are significantly rescued (Extended Data Fig. 3h).
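The rescue percentages quoted above follow from counting variants that move out of the poorly expressed category. A minimal sketch of that count is given below, assuming the 0.35 threshold from the control-condition classification is applied unchanged to the treated scores; the function name and the toy scores are illustrative.

```python
def rescue_counts(control, treated, low=0.35):
    """Count poorly expressed variants that reach at least moderate expression.

    control and treated map a variant name to its normalized expression score.
    Only variants scored in both conditions are considered.
    Returns (number rescued, number poorly expressed in control).
    """
    poor = [v for v, score in control.items() if score < low and v in treated]
    rescued = [v for v in poor if treated[v] >= low]
    return len(rescued), len(poor)

# Toy scores, not data from the study.
control = {"varA": 0.10, "varB": 0.20, "varC": 0.90, "varD": 0.05}
treated = {"varA": 0.70, "varB": 0.30, "varC": 0.95, "varD": 0.60}
n_rescued, n_poor = rescue_counts(control, treated)
print(f"{n_rescued}/{n_poor} rescued")

# The percentages in the text can be reproduced from the reported counts.
print(round(100 * 60 / 69, 1), round(100 * 835 / 965, 1))
```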

To further investigate tolvaptan rescue, we fitted a LOWESS curve (Fig. 4d) and analyzed the residuals to this line to identify positions exhibiting greater or lesser rescue levels than expected (Supplementary Table 2). More positions have specific interactions with tolvaptan than with 27 °C (32 compared with 10); 21 positions are rescued less than expected, and 11 are rescued more than expected (Fig. 4h,i, FDR = 0.1). Of the ten outlier residues for 27 °C rescue, six are also outliers for tolvaptan: S5, T6 and S167^4x53 are below expected levels in both cases, and L336^8x54 (GPCRdb⁶² numbering in superscript) is rescued more than expected in both cases. Notably, although S24 and Y205^5x39 are rescued more than expected at 27 °C, they are actually less rescued than expected in the tolvaptan condition. Overall, however, rescue in the two conditions is correlated (ρ = 0.49, Supplementary Fig. 3i).

Identifying drug binding and functional sites

Next, we sought to understand the structural features associated with rescue outlier positions. For 27 °C rescue, three out of ten outlier positions are glycosylation sites. Both O-glycosylation sites (S5 and T6) are rescued less than expected, but S24 is actually rescued more than expected, suggesting that temperature reduction can compensate for the N-glycosylation defect (Fig. 5a; residues are colored according to FDR-adjusted P value, residues outlined in yellow are rescued more than expected and those not outlined are rescued less than expected). Of the others, three are near the orthosteric site (Fig. 5b) and two are in or near helix 8 (Fig. 5c).

**Fig. 5: Identifying sites with more or less rescue than expected.**

For tolvaptan, all four glycosylation sites (S5, T6, N22 and S24) are rescued less than expected (Fig. 5d), implying that tolvaptan cannot compensate for the lack of proper glycosylation. In addition, rescue for a cluster of residues on one side of the orthosteric binding site is worse than expected (Fig. 5e). We suspected that this could be the tolvaptan-binding site. Indeed, a study employing molecular dynamics and site-directed mutagenesis⁶³ determined that the most important residues for tolvaptan antagonism are M123^3x36, F178^3x36, Y205^5x39, V206^5x40, and F287^6x51, of which all but F178^3x36 are rescued less than expected by tolvaptan. Finally, R137^3x50, in the highly conserved E/DRY motif, and two arginines in close proximity (R158^4x44 and R143^3x56) are rescued less well than expected (Fig. 5f). The E/DRY motif is well conserved among class A GPCRs and has an important role in stabilizing the inactive state; mutations in this motif cause constitutive activity in various GPCRs⁶⁴; similarly mutations at R137^3x50 in V2R are known to cause constitutive signaling activity⁴⁷, so this cluster of substitutions might render tolvaptan less effective by biasing the receptor to the active conformation. An examination of the sites at which variants are rescued more than expected highlights a cluster of residues at the intracellular interface of the receptor (Fig. 5g). Substitutions at these sites could potentially affect signaling and/or internalization; PC stabilization of the inactive state might therefore have an exaggerated effect here. Further mechanistic studies would be required to understand the behavior of substitutions at these sites.

Taken together, these results demonstrate nearly universal surface expression rescue of destabilized V2R variants with the PC tolvaptan. The few sites at which variants are consistently not rescued shed light on the likely tolvaptan-binding site, as well as other functional sites of the receptor.

Discussion

Reduced expression due to impaired fold stability is the predominant mechanism by which missense variants cause disease, including in both soluble^{5,65,66,67,68} and membrane proteins^69,70,71,72. The destabilization conferred by most missense variants in most proteins is, however, small, suggesting that re-stabilization of a protein by a small-molecule PC that binds the native state might confer sufficient free energy to rescue the effects of very diverse substitutions throughout a protein’s structure. For this approach to be effective, however, the mechanism for all pathogenic variants in a protein, as well as their response to PC, must be prospectively classified. Here, we have implemented this framework and tested the ‘universal PC’ hypothesis for V2R. First, we show that more than half of known loss-of-function variants are poorly expressed, and use temperature reduction to show that the vast majority of variants lose expression as a result of thermodynamic instability. On this basis, we test the efficacy of tolvaptan, a well-characterized antagonist and PC of V2R. Our results show that binding of a single small-molecule PC can rescue the cell surface expression of nearly all variants throughout the receptor’s structure.

Previous low-throughput studies mostly tested the efficacy of PCs on a limited number of variants, and the failed rescue of individual variants led to suggestions of widespread idiosyncratic effects between PCs and variants^32,33,34,35. More systematic efforts testing hundreds of variants in rhodopsin^14,55 and CFTR⁵⁶ demonstrated rescue of many variants, but the authors also suggested substantial region-specific differences in rescue. The rhodopsin investigations emphasized differences in rescue between mutations in transmembrane helices 2 and 7 (ref. ¹⁴), but subsequently found that 67 out of 69 reduced expression retinopathy variants had measurably increased expression in the presence of a PC⁵⁵, consistent with our findings. CFTR has multiple folding domains⁷³, and PCs were found to be most effective for variants in close proximity to the PC-binding site⁵⁶. Class A GPCRs then, as single-domain proteins, could be amenable targets for PCs. Here, we have profiled an order of magnitude more variants than previous studies, demonstrating protein-wide, nearly universal PC rescue of variants.

We expect that our results will generalize to other proteins and PCs, on the basis of empirical and theoretical evidence. Most proteins are marginally stable, meaning that large changes in expression result from substitutions with ΔΔG of only a few kcal mol⁻¹ (ref. ¹⁶), and indeed large-scale surveys of mutational effects show most pathogenic substitutions cause only small changes in fold stability^5,17,18. It is reasonable to think that such small changes in fold stability can be easily compensated for by small-molecule binding in many proteins. Tolvaptan binding to V2R, for example, confers a free energy change of approximately −12 kcal mol⁻¹ (ref. ⁶³), potentially completely offsetting the destabilizing effects of nearly all variants, provided that drug concentrations are high enough.

Accordingly, small molecules could offer a much more general strategy to rescue poorly expressed variants than is commonly perceived. The law of mass action means that, to rescue expression, a small molecule must specifically bind to the folded conformation of a protein, with sufficient energy to shift the folding equilibrium toward the folded state (Fig. 6). There is no requirement to specifically recognize the mutated amino acid or to bind any particular region of the protein. Similar principles apply when multiple mutations are combined in a protein, where large-scale experimental analyses have revealed that changes in fold stability are nearly always additive when diverse mutations are combined in a single protein^21,74,75. Mutations, like small-molecule binding, can be viewed as perturbations to a protein that lead to changes in free energy. We believe, therefore, that many small molecules binding to proteins with sufficient free energy will behave as general or universal PCs.
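
The additive free-energy argument above can be illustrated with a minimal two-state sketch. It assumes a simple equilibrium between unfolded and folded states, a ligand that binds only the folded state, and the standard linkage relation ΔG_app = ΔG_fold − RT ln(1 + [L]/K_d). Converting the quoted binding free energy of −12 kcal mol⁻¹ into a K_d with a 1 M standard state is an assumption made here for illustration, as are the wild-type stability and the variant ΔΔG used in the example call; none of these is a measured V2R parameter.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal mol^-1 K^-1
T_KELVIN = 310.15   # 37 °C


def fraction_folded(dg_fold):
    """Two-state fraction folded; dg_fold in kcal/mol, negative means stable."""
    return 1.0 / (1.0 + math.exp(dg_fold / (R_KCAL * T_KELVIN)))


def ligand_stabilization(dg_bind, ligand_molar):
    """Change in apparent folding free energy when a ligand binds only the folded state.

    dg_bind is the standard binding free energy (kcal/mol, 1 M standard state,
    an assumption of this sketch); ligand_molar is the free ligand concentration.
    """
    rt = R_KCAL * T_KELVIN
    kd = math.exp(dg_bind / rt)
    return -rt * math.log(1.0 + ligand_molar / kd)


def folded_with_ligand(dg_wt, ddg_mut, dg_bind, ligand_molar):
    """Fraction folded for a variant, adding mutation and ligand terms."""
    dg_total = dg_wt + ddg_mut + ligand_stabilization(dg_bind, ligand_molar)
    return fraction_folded(dg_total)


# Hypothetical variant: wild-type stability -2 kcal/mol, destabilized by +3 kcal/mol.
without_drug = folded_with_ligand(-2.0, 3.0, -12.0, 0.0)
with_drug = folded_with_ligand(-2.0, 3.0, -12.0, 10e-6)  # 10 μM, as used in Methods
print(f"fraction folded without ligand: {without_drug:.3f}")
print(f"fraction folded with ligand:    {with_drug:.3f}")
```

In this sketch the stabilization depends on ligand concentration as well as affinity, which is one way to read the proviso that drug concentrations must be high enough.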

**Fig. 6: Additive model of tolvaptan rescue.**

In further support of the general, energetically additive view of PC efficacy, the small number of variants that are not rescued by tolvaptan mostly have clear mechanistic explanations: substitutions in the binding site of tolvaptan interfere with binding of the drug, and variants that affect expression by mechanisms other than reduced fold stability are not rescued, here exemplified by those at post-translational modification sites. The variants that are not rescued by a PC can be rapidly identified by selection and sequencing experiments and excluded from clinical trials. This approach to identifying changes in abundance that are not rescued by small-molecule binding is a potentially very general strategy to rapidly identify drug-binding sites in proteins.

High-throughput protein abundance selection assays have now been developed for many different protein classes^18,76,77. Similar to the present study, these assays could quickly assess the efficacy of PCs across all variants in a protein, to prioritize broadly effective PCs. One limitation of our tolvaptan-rescue finding is that we have not established the fraction of rescued variants that can signal at the membrane. However, there is reason to expect that a significant portion will retain signaling activity. Several studies have demonstrated detectable signaling for PC-rescued V2R^33,35,60,61 as well as for other GPCR variants in vitro and in vivo^78,79. Indeed, a small clinical trial of another V2R antagonist PC, SR49059, reported clinical improvement in people, although the trial was discontinued owing to off-target effects.

The demonstration that a PC rescues the expression of most missense variants throughout the structure of a protein has wide-ranging implications for rare disease research. Previous experimental⁵ and computational^6,7 approaches estimate that 40–60% of pathogenic variants are explained by loss of stability or abundance (which is in line with our findings here), suggesting a broad scope for PC therapy. Such general PCs will not have to bind to specific sites in a protein, and they will not need to be tailored to each pathogenic variant. Rather, in accordance with the simple principle of additive free energies, any molecule that binds specifically to the folded state of a protein with sufficient free energy is a potential universal PC.

Methods

V2R saturation mutagenesis and barcoding

The sequence for the barcoded attB-HA-V2R construct is available as a GenBank file (https://zenodo.org/records/14216036). To introduce all single amino acid substitutions, we used SUNi mutagenesis³⁶. In short, oligonucleotides were designed to encode all amino acid substitutions (using NNK or NNS degenerate codons) with flanking sequence of 18–40 bases to optimize melting temperature and ensure the presence of a 5′ GC clamp³⁶. Oligonucleotides were ordered as an oPool from Integrated DNA Technologies (IDT, Supplementary Table 2). Four microliters of final product from the SUNi protocol was electroporated into 50 μl of 10β high-efficiency electrocompetent Escherichia coli cells (New England Biolabs (NEB)), followed by 1 h recovery with 2 ml of super optimal broth with catabolite repression (SOC) medium. At this point, 0.5% of the recovery was plated onto an LB agar plate with ampicillin, and 99.5% was inoculated into 100 ml of LB liquid with ampicillin. The total transformant number, estimated from the number of colonies on the plate, was >200,000. The plasmid was isolated the following morning using the Qiagen Plasmid Plus Midi kit.
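
The transformant estimate follows from simple scaling of the plated fraction. A minimal sketch of that arithmetic is given here; the function name and the example colony count are hypothetical and are not taken from the study.

```python
def estimate_total_transformants(colonies_on_plate, plated_fraction):
    """Scale a colony count from a plated aliquot up to the whole recovery culture."""
    if not 0 < plated_fraction <= 1:
        raise ValueError("plated_fraction must be in (0, 1]")
    return colonies_on_plate / plated_fraction


# With 0.5% of the recovery plated, a hypothetical count of 1,000 colonies
# corresponds to the 200,000-transformant threshold quoted in the text.
print(estimate_total_transformants(1000, 0.005))
```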

To introduce the barcode construct, 5 μg of purified plasmid was digested with 250 units of ApaI restriction enzyme (NEB) for 1 h at 37 °C. Then, 1.5 μl QuickCIP (NEB) was added, and the reaction was incubated at 37 °C for a further 30 min. The reaction was then run on an agarose gel, and the digested fragment was isolated and column purified. The barcode (ApaI_barcode, Supplementary Table 4) was designed to have 20 degenerate nucleotides interspersed with constant AT dinucleotides: [5×N]AT[5×N]AT[5×N]AT[5×N]. It is flanked by a sequence complementary to the sequence flanking the ApaI cut site in the attB-HA-V2R plasmid and was ordered as a single-stranded oligonucleotide from IDT. ApaI_barcode was made double stranded and amplified by primers amp_ApaI_barcode[F/R]. Then, the barcode was introduced into the plasmid via Gibson assembly using 60 ng of vector and 3 ng of barcode and incubated at 50 °C for 20 min. The reaction was then diluted 1:12 with water; 1.5 μl of this was electroporated into 25 μl of 10β high-efficiency electrocompetent E. coli cells. Recovery was done in 2 ml of SOC at 37 °C for 45 min, then 0.5% was plated onto an LB-agar plate with ampicillin; the other 99.5% was split between four flasks of 15 ml LB-amp liquid culture. In the morning, the number of clones on the plate was used to estimate that each flask should contain ~25,000 transformants. Two flasks were discarded, and the other two were combined to arrive at an estimated library diversity of 50,000. These were further incubated for another 12 h in 100 ml LB and finally purified using the Qiagen Plasmid Plus Maxi kit.
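
The barcode layout described above, [5×N]AT[5×N]AT[5×N]AT[5×N], can be expressed as a regular expression, which is convenient for checking sequenced barcodes. The sketch below is illustrative only; the helper names are hypothetical and the flanking sequences of the real oligonucleotide are not modeled.

```python
import random
import re

# Four blocks of five degenerate bases separated by constant AT dinucleotides.
BARCODE_PATTERN = re.compile(r"[ACGT]{5}AT[ACGT]{5}AT[ACGT]{5}AT[ACGT]{5}")


def random_barcode(rng):
    """Draw one barcode with 20 random bases and three constant AT spacers."""
    blocks = ["".join(rng.choice("ACGT") for _ in range(5)) for _ in range(4)]
    return "AT".join(blocks)


def is_valid_barcode(seq):
    """Return True if seq matches the full 26-nt barcode layout."""
    return BARCODE_PATTERN.fullmatch(seq) is not None


rng = random.Random(0)
example = random_barcode(rng)
print(example, is_valid_barcode(example))
# 20 degenerate positions give 4**20 possible barcodes, far more than the
# estimated library diversity of 50,000.
print(4 ** 20)
```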

Long read sequencing to associate variants with barcodes

To liberate the sequence of interest (that is, the barcode and V2R coding sequence) from the rest of the plasmid, 10 μg of plasmid was digested with 15 units of KpnI-HF and 6 units of FseI (NEB), and the reaction was then run on an agarose gel. The digested fragment was isolated and column purified.

SMRTbell libraries were prepared using the SMRTbell Express Template Prep Kit 2.0 (Pacific Biosciences (PacBio)). The V2R mutagenesis libraries were multiplexed with other libraries and run across two SMRT Cells on the Sequel IIe instrument (PacBio). The PacBio sequencing data were analyzed with alignparse⁸⁰. Reads were quality filtered by removing any read with an estimated error rate above 1 × 10⁻⁴ for the V2R coding sequence, or above 1 × 10⁻³ for the barcode. Then, consensus sequences were called using the alignparse.consensus.simple_mutconsensus method with default parameter settings. For further processing, only barcodes with variant call support of ≥2 were retained, alongside those with variant call support of 1, provided the variant exhibited multiple nucleotide changes compared with the wild type.
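
The barcode retention rule in the last sentence can be written as a small predicate. This is a sketch of the stated rule only, not the authors' code; the record type and field names are hypothetical, and "multiple nucleotide changes" is interpreted here as two or more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BarcodeCall:
    barcode: str
    support: int        # number of reads supporting the consensus variant call
    n_nt_changes: int   # nucleotide differences from the wild-type sequence


def keep_barcode(call):
    """Retain calls with support >= 2, or single-read calls with >= 2 nucleotide changes."""
    if call.support >= 2:
        return True
    return call.support == 1 and call.n_nt_changes >= 2


calls = [
    BarcodeCall("bc_a", support=3, n_nt_changes=1),  # kept: supported by several reads
    BarcodeCall("bc_b", support=1, n_nt_changes=2),  # kept: single read, multi-nucleotide change
    BarcodeCall("bc_c", support=1, n_nt_changes=1),  # dropped: single read, single change
]
retained = [c.barcode for c in calls if keep_barcode(c)]
print(retained)
```

One plausible reading of the rule is that a single-read call differing from wild type at only one nucleotide is harder to distinguish from a sequencing error than a designed multi-nucleotide codon change.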

Landing-pad cell line recombination

HEK293T LLP-iCasp9-Blast Clone 12 from Matreyek et al.³⁹ (hereafter referred to as landing-pad cells) were used for library integration, expression and screening. The cells were a gift from K. Matreyek at the University of Washington. Cells were cultured in DMEM supplemented with 10% tetracycline-free FBS. For recombination, 10 million landing-pad cells were plated onto a T175 cell culture flask. The following day, 20 μg of the library was combined with 20 μg of pCAG-NLS-Bxb1 and transfected with Lipofectamine 3000 (Thermo Fisher Scientific), per the manufacturer’s instructions. After 48 h, the medium was removed and replaced with doxycycline-containing medium (2 μg ml⁻¹, Sigma-Aldrich). Twenty-four hours later, the medium was replaced again with medium containing doxycycline as well as rimiducid (10 nM, Selleckchem). Rimiducid causes cell death in unrecombined cells, and substantial cell death was apparent after 24 h. At this point, the medium was replaced with medium containing only doxycycline, not rimiducid. Cells were then grown out and passaged when approaching 95% confluency, always in the presence of doxycycline. The recombined cell population was expanded, then cryopreserved in many vials and used for all experiments.

Cell sorting

Drug treatment was done 24 h before sorting. Tolvaptan (Selleckchem, catalog no. S2593) was dissolved to 10 mM in DMSO then added to cell culture medium, for a final concentration of 10 μM. To dissociate cells, they were first washed once with PBS, then incubated with Trypsin-EDTA (0.05%) for 4 min at room temperature. Cells were then washed off the plate with medium, pelleted and resuspended in blocking buffer (1% bovine serum albumin in phosphate-buffered saline). Cells were counted and 30 million–50 million cells were transferred to a new tube. Blocking buffer was added to attain 15 million cells ml⁻¹. Then, cells were incubated on a rotating wheel at 4 °C for 30 min. Following this, HA-Tag (6E2) monoclonal antibody Alexa Fluor 647 conjugate (no. 3444, Cell Signaling Technology) was added to a final dilution of 1:100, and cells were again incubated on a rotating wheel at 4 °C for 60 min. At this point, cells were pelleted, the supernatant was removed and the cells were resuspended in 5 ml blocking buffer with propidium iodide (1 μg ml⁻¹).

Cells were sorted on a BD FACSAria II and analyzed with onboard FACSDiva software or post-sorting with FlowJo. Cells were first filtered by forward scattering area and side scattering area, then single cells were isolated with forward scattering width and height. BFP-positive cells were filtered out as unrecombined landing-pad cells, and propidium-iodide-positive cells were filtered out as dead cells. Then, the remaining population of cells was sorted into four bins, on the basis of Alexa Fluor 647 signal intensity, that were designed to result in a similar number of cells in each bin (Supplementary Fig. 1g). For each replicate, about 10 million cells were collected in total. After sorting, cells were pelleted and frozen at −80 °C, and processed later.

Sequencing library preparation

DNA was isolated from cell pellets using the DNeasy Blood & Tissue Kit (Qiagen) and eluted with 150 μl buffer EB. For each sample, 128 μl sample was amplified in 400 μl polymerase chain reactions (PCRs) across 8 PCR tubes using Q5 High-Fidelity DNA Polymerase (NEB). Primers contained partial Illumina adapters, variable degenerate bases to promote complexity on the flowcell, and sequence complementary to the barcode flanks to amplify the barcode (ApaI_barcode_seq_[F/R]_[3-5]N, Supplementary Table 4). The PCR program was 98 °C for 30 s; then 25 cycles of 98 °C for 15 s, 64 °C for 30 s, 72 °C for 30 s; then 72 °C for 30 s followed by 8 °C thereafter. The products from this reaction were cleaned up with NucleoSpin columns (Macherey-Nagel) and eluted in 50 μl water. Two microliters of the eluate were then amplified in a second PCR that appended the rest of the Illumina sequencing adapter as well as index sequences (PCR2_i[5/7]). The PCR program was 98 °C for 30 s; then 5 cycles of 98 °C for 15 s, 64 °C for 30 s, 72 °C for 30 s; then 72 °C for 30 s, followed by 8 °C thereafter. The products were run on an agarose gel and the intended product isolated and column purified. Libraries were sequenced on an Illumina NovaSeq 6000 with 2×50 paired end reads.

Sequencing data analysis and calculation of surface expression scores

Paired-end sequencing reads in FASTQ format were first merged with vsearch⁸¹, then adapters were trimmed with cutadapt⁸². The remaining sequence corresponds to the barcode, which was compared with the full list of barcodes identified from the PacBio variant-barcode association and tallied. Then, for each variant, the frequency of that variant in each bin was calculated. This frequency was compared with the original number of cells sorted into each bin to estimate the number of cells of that genotype in each bin. Then, all barcode-variants that code for the same amino acid change were combined, and the log₁₀-transformed geometric mean fluorescence (fluor) value of all cells sorted into each bin was combined with the estimated cell counts to arrive at the raw surface expression estimate:

$$\frac{\begin{array}{l}\left(\left({\rm{bin}}{\mbox{-}}1\;{\rm{no}}.{\rm{cells}}\times {\rm{bin}}{\mbox{-}}1{\rm{fluor}}\right)+\,\left({\rm{bin}}{\mbox{-}}2\;{\rm{no}}.{\rm{cells}}\times {\rm{bin}}{\mbox{-}}2{\rm{fluor}}\right)\right.\\+\,\left.\left({\rm{bin}}{\mbox{-}}3\;{\rm{no}}.{\rm{cells}}\times {\rm{bin}}{\mbox{-}}3{\rm{fluor}}\right)+\,\left({\rm{bin}}{\mbox{-}}4\;{\rm{no}}.{\rm{cells}}\times {\rm{bin}}{\mbox{-}}4{\rm{fluor}}\right)\,\right)\end{array}}{{\rm{bin}}{\mbox{-}}1\;{\rm{no}}.{\rm{cells}}+{\rm{bin}}{\mbox{-}}2\;{\rm{no}}.{\rm{cells}}+{\rm{bin}}{\mbox{-}}3\;{\rm{no}}.{\rm{cells}}+{\rm{bin}}{\mbox{-}}4\;{\rm{no}}.{\rm{cells}}}\,$$

Then, the surface expression scores were normalized such that the wild-type genotype was 1 and the median of known loss-of-function variants (premature stop codons before the 300th residue) was 0. Variants were retained and considered high confidence if the estimated number of cells sorted for that genotype was ≥50.
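
The score calculation and normalization can be sketched in a few lines. This is an illustrative reading of the description, not the authors' code: the function names are hypothetical, and scaling each variant's read frequency by the number of cells sorted into the bin is an assumed interpretation of how cell counts were estimated.

```python
from statistics import median


def estimate_cells_per_bin(variant_reads, total_reads, cells_sorted):
    """Estimate cells of one genotype per bin: read frequency times cells sorted."""
    return [v / t * n for v, t, n in zip(variant_reads, total_reads, cells_sorted)]


def raw_surface_expression(cells_per_bin, bin_log10_fluor):
    """Cell-count-weighted mean of the per-bin log10 fluorescence values."""
    total = sum(cells_per_bin)
    if total == 0:
        raise ValueError("variant has no estimated cells in any bin")
    return sum(n * f for n, f in zip(cells_per_bin, bin_log10_fluor)) / total


def normalize_score(raw, wild_type_raw, nonsense_raws):
    """Rescale so wild type is 1 and the median early-nonsense variant is 0."""
    null = median(nonsense_raws)
    return (raw - null) / (wild_type_raw - null)


def is_high_confidence(cells_per_bin, min_cells=50):
    """Apply the >=50 estimated sorted cells threshold stated in the text."""
    return sum(cells_per_bin) >= min_cells


# Hypothetical variant observed in four bins.
cells = estimate_cells_per_bin([10, 40, 30, 20], [1000] * 4, [2500] * 4)
raw = raw_surface_expression(cells, [2.0, 2.8, 3.4, 4.0])
print(raw, is_high_confidence(cells))
```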

For the control scores, four replicates were combined; these were the controls for the rescue experiments: one condition with normal culture conditions (37 °C condition), two conditions with 0.1% DMSO (DMSOA and DMSOB, controlling for tolvaptan) and one condition with 1% DMSO. Because these replicates were well correlated with each other, they were all used to derive the control estimates. Data analysis was performed in JupyterLab environments using Python and the following packages: matplotlib, numpy, seaborn, pandas, scipy, sklearn and statsmodels.

Modeling rescue and identifying outliers

LOWESS curves were fit to the controls versus rescue data with the Python package statsmodels.nonparametric.smoothers_lowess⁸³ with these parameter values: frac=0.3, it=3, delta=0.0. After fitting the LOWESS curve, residuals were calculated for all variants as the distance in the y axis between the curve and the variant. Then, for each position, a two-sided Mann–Whitney U test was conducted, comparing the residuals of all variants at that position with control surface expression below 0.85 with all other variants with control surface expression below 0.85. This threshold was intended to include only variants that actually had the potential to be rescued. To control the FDR, we then used statsmodels.stats.multitest.multipletests with method="fdr_bh" and alpha=0.1.
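
The outlier procedure can be sketched with the statsmodels and scipy functions named above. This is a minimal illustration under stated assumptions, not the authors' code: the helper names and the input layout (parallel arrays of positions, control scores and rescue scores) are hypothetical, the residual sign convention (rescue minus fitted curve) is assumed, and missing values are assumed to have been removed beforehand.

```python
import numpy as np
from scipy.stats import mannwhitneyu
from statsmodels.nonparametric.smoothers_lowess import lowess
from statsmodels.stats.multitest import multipletests


def lowess_residuals(control, rescue):
    """Residuals of rescue scores around a LOWESS fit on control scores."""
    control = np.asarray(control, dtype=float)
    rescue = np.asarray(rescue, dtype=float)
    # return_sorted=False returns fitted values in the input order.
    fitted = lowess(rescue, control, frac=0.3, it=3, delta=0.0, return_sorted=False)
    return rescue - fitted


def position_outliers(positions, control, residuals, threshold=0.85, alpha=0.1):
    """Per-position two-sided Mann-Whitney U tests with Benjamini-Hochberg FDR control.

    Returns a dict mapping position -> (adjusted p value, significant flag).
    """
    positions = np.asarray(positions)
    control = np.asarray(control, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    eligible = control < threshold  # only variants with room to be rescued
    tested, pvals = [], []
    for pos in np.unique(positions[eligible]):
        here = eligible & (positions == pos)
        rest = eligible & (positions != pos)
        if here.sum() == 0 or rest.sum() == 0:
            continue
        _, p = mannwhitneyu(residuals[here], residuals[rest], alternative="two-sided")
        tested.append(pos)
        pvals.append(p)
    if not pvals:
        return {}
    reject, qvals, _, _ = multipletests(pvals, alpha=alpha, method="fdr_bh")
    return {pos: (float(q), bool(r)) for pos, q, r in zip(tested, qvals, reject)}
```

A negative median residual at a flagged position would correspond to variants rescued less than expected, and a positive one to variants rescued more than expected.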

Clinical variant curation

The gnomAD data were from v2.1.1 (controls only) and were obtained on 8 February 2024. ClinVar variants were obtained on 9 February 2024 and filtered for missense variants. HGMD variants were obtained on 8 February 2024 and filtered for missense variants.

Computational predictors

ESM1b was installed and run as described (https://github.com/ntranoslab/esm-variants) using the esm_score_missense_mutations.py script. AlphaMissense, EVE and RaSP scores were available as precomputed scores that we downloaded from the respective servers. ThermoMPNN scores were computed using the Google Colab server.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Files needed to reproduce analyses can be found at Zenodo (https://zenodo.org/records/14216036)⁸⁴. Raw sequencing reads can be found at the Sequence Read Archive (accession number PRJNA1190688). Clinical annotations were taken from ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/), the Human Gene Mutation Database (https://www.hgmd.cf.ac.uk/ac/index.php) and gnomAD (https://gnomad.broadinstitute.org/). Data and materials can be obtained from the corresponding author upon request. Source data are provided with this paper.

Code availability

Custom code to reproduce analyses can be found at GitHub (https://github.com/lehner-lab/V2R_surfexp_rescue).

References

The Lancet Diabetes & Endocrinology. Rare diseases: individually rare, collectively common. Lancet Diabetes Endocrinol. 11, 139 (2023).
Livesey, B. J. & Marsh, J. A. Interpreting protein variant effects with computational predictors and deep mutational scanning. Dis. Model. Mech. 15, dmm049510 (2022).
Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–423 (2015).
Pejaver, V. et al. Calibration of computational tools for missense variant pathogenicity classification and ClinGen recommendations for PP3/BP4 criteria. Am. J. Hum. Genet. 109, 2163–2177 (2022).
Beltran, A., Jiang, X., Shen, Y. & Lehner, B. Site-saturation mutagenesis of 500 human protein domains. Nature 637, 885–894 (2025).
Cagiada, M., Jonsson, N. & Lindorff-Larsen, K. Decoding molecular mechanisms for loss of function variants in the human proteome. Preprint at bioRxiv https://doi.org/10.1101/2024.05.21.595203 (2024).
Jänes, J. et al. Predicted mechanistic impacts of human protein missense variants. Preprint at bioRxiv https://doi.org/10.1101/2024.05.29.596373 (2024).
Allen, L. et al. Future therapies for cystic fibrosis. Nat. Commun. 14, 693 (2023).
Keyzor, I. et al. Therapeutic role of pharmacological chaperones in lysosomal storage disorders: a review of the evidence and informed approach to reclassification. Biomolecules 13, 1227 (2023).
Maurer, M. S. et al. Tafamidis treatment for patients with transthyretin amyloid cardiomyopathy. N. Engl. J. Med. 379, 1007–1016 (2018).
Song, H. et al. Diverse rescue potencies of p53 mutations to ATO are predetermined by intrinsic mutational properties. Sci. Transl. Med. 15, eabn9155 (2023).
Joerger, A. C. & Fersht, A. R. The p53 Pathway: origins, inactivation in cancer, and emerging therapeutic approaches. Annu. Rev. Biochem. 85, 375–404 (2016).
Beerepoot, P., Nazari, R. & Salahpour, A. Pharmacological chaperone approaches for rescuing GPCR mutants: current state, challenges, and screening strategies. Pharmacol. Res. 117, 242–251 (2017).
McKee, A. G. et al. Systematic profiling of temperature- and retinal-sensitive rhodopsin variants by deep mutational scanning. J. Biol. Chem. 297, 101359 (2021).
Powers, E. T., Morimoto, R. I., Dillin, A., Kelly, J. W. & Balch, W. E. Biological and chemical approaches to diseases of proteostasis deficiency. Annu. Rev. Biochem. 78, 959–991 (2009).
Tokuriki, N. & Tawfik, D. S. Stability effects of mutations and protein evolvability. Curr. Opin. Struct. Biol. 19, 596–604 (2009).
Kroncke, B. M. et al. Documentation of an Imperative to Improve Methods for Predicting Membrane Protein Stability. Biochemistry 55, 5002–5009 (2016).
Tsuboyama, K. et al. Mega-scale experimental analysis of protein folding stability in biology and design. Nature 620, 434–444 (2023).
Ross, G. A. et al. The maximal and current accuracy of rigorous protein-ligand binding free energy calculations. Commun. Chem. 6, 222 (2023).
Wells, J. A. Additivity of mutational effects in proteins. Biochemistry 29, 8509–8517 (1990).
Faure, A. J. et al. The genetic architecture of protein stability. Nature 634, 995–1003 (2024).
Birnbaumer, M. The V2 vasopressin receptor mutations and fluid homeostasis. Cardiovasc. Res. 51, 409–415 (2001).
Erdélyi, L. S., Hunyady, L. & Balla, A. V2 vasopressin receptor mutations: future personalized therapy based on individual molecular biology. Front. Endocrinol. 14, 1173601 (2023).
Bichet, D. G. Vasopressin receptors in health and disease. Kidney Int. 49, 1706–1711 (1996).
Bockenhauer, D. & Bichet, D. G. Pathophysiology, diagnosis and management of nephrogenic diabetes insipidus. Nat. Rev. Nephrol. 11, 576–588 (2015).
Rosenthal, S. M., Feldman, B. J., Vargas, G. A. & Gitelman, S. E. Nephrogenic syndrome of inappropriate antidiuresis (NSIAD): a paradigm for activating mutations causing endocrine dysfunction. Pediatr. Endocrinol. Rev. 4, 66–70 (2006).
Sasaki, S., Chiga, M., Kikuchi, E., Rai, T. & Uchida, S. Hereditary nephrogenic diabetes insipidus in Japanese patients: analysis of 78 families and report of 22 new mutations in AVPR2 and AQP2. Clin. Exp. Nephrol. 17, 338–344 (2013).
Spanakis, E., Milord, E. & Gragnoli, C. AVPR2 variants and mutations in nephrogenic diabetes insipidus: review and missense mutation significance. J. Cell. Physiol. 217, 605–617 (2008).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Arthus, M.-F. C. et al. Report of 33 novel AVPR2 mutations and analysis of 117 families with X-linked nephrogenic diabetes insipidus. J. Am. Soc. Nephrol. 11, 1044 (2000).
Morello, J.-P. et al. Pharmacological chaperones rescue cell-surface expression and function of misfolded V2 vasopressin receptor mutants. J. Clin. Invest. 105, 887–895 (2000).
Janovick, J. A., Spicer, T. P., Bannister, T. D., Scampavia, L. & Conn, P. M. Pharmacoperone rescue of vasopressin 2 receptor mutants reveals unexpected constitutive activity and coupling bias. PLoS ONE 12, e0181830 (2017).
Prosperi, F. et al. Characterization of five novel vasopressin V2 receptor mutants causing nephrogenic diabetes insipidus reveals a role of tolvaptan for M272R-V2R mutation. Sci. Rep. 10, 16383 (2020).
Ranadive, S. A. et al. Identification, characterization and rescue of a novel vasopressin-2 receptor mutation causing nephrogenic diabetes insipidus. Clin. Endocrinol. 71, 388–393 (2009).
Robben, J. H., Sze, M., Knoers, N. V. A. M. & Deen, P. M. T. Functional rescue of vasopressin V2 receptor mutants in MDCK cells by pharmacochaperones: relevance to therapy of nephrogenic diabetes insipidus. Am. J. Physiol. Ren. Physiol. 292, F253–F260 (2007).
Mighell, T. L., Toledano, I. & Lehner, B. SUNi mutagenesis: scalable and uniform nicking for efficient generation of variant libraries. PLoS ONE 18, e0288158 (2023).
Penn, W. D. et al. Probing biophysical sequence constraints within the transmembrane domains of rhodopsin by deep mutational scanning. Sci. Adv. 6, eaay7505 (2020).
Heredia, J. D. et al. Mapping interaction sites on human chemokine receptors by deep mutational scanning. J. Immunol. 200, 3825–3839 (2018).
Matreyek, K. A., Stephany, J. J., Chiasson, M. A., Hasle, N. & Fowler, D. M. An improved platform for functional assessment of large protein libraries in mammalian cells. Nucleic Acids Res. 48, e1 (2020).
Heifetz, A. et al. Characterizing interhelical interactions of G-protein coupled receptors with the fragment molecular orbital method. J. Chem. Theory Comput. 16, 2814–2824 (2020).
Venkatakrishnan, A. J. et al. Molecular signatures of G-protein-coupled receptors. Nature 494, 185–194 (2013).
Hessa, T. et al. Molecular code for transmembrane-helix recognition by the Sec61 translocon. Nature 450, 1026–1030 (2007).
Sadeghi, H. & Birnbaumer, M. O-Glycosylation of the V2 vasopressin receptor. Glycobiology 9, 731–737 (1999).
Sadeghi, H. M., Innamorati, G., Dagarag, M. & Birnbaumer, M. Palmitoylation of the V2 vasopressin receptor. Mol. Pharmacol. 52, 21–29 (1997).
Landrum, M. J. et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067 (2018).
Stenson, P. D. et al. Human Gene Mutation Database (HGMD): 2003 update. Hum. Mutat. 21, 577–581 (2003).
Vezzi, V. et al. Vasopressin receptor 2 mutations in the nephrogenic syndrome of inappropriate antidiuresis show different mechanisms of constitutive activation for G protein coupled receptors. Sci. Rep. 10, 9111 (2020).
Brandes, N., Goldman, G., Wang, C. H., Ye, C. J. & Ntranos, V. Genome-wide prediction of disease variant effects with a deep protein language model. Nat. Genet. 55, 1512–1522 (2023).
Frazer, J. et al. Disease variant prediction with deep generative models of evolutionary data. Nature 599, 91–95 (2021).
Cheng, J. et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381, eadg7492 (2023).
Blaabjerg, L. M. et al. Rapid protein stability prediction using deep learning representations. eLife 12, e82593 (2023).
Dieckhaus, H., Brocidiacono, M., Randolph, N. Z. & Kuhlman, B. Transfer learning to leverage larger datasets for improved prediction of protein stability changes. Proc. Natl Acad. Sci. USA 121, e2314853121 (2024).
Denning, G. M. et al. Processing of mutant cystic fibrosis transmembrane conductance regulator is temperature-sensitive. Nature 358, 761–764 (1992).
Domingo, J., Baeza-Centurion, P. & Lehner, B. The causes and consequences of genetic interactions (epistasis). Annu. Rev. Genomics Hum. Genet. 20, 433–460 (2019).
Roushar, F. J. et al. Molecular basis for variations in the sensitivity of pathogenic rhodopsin variants to 9-cis-retinal. J. Biol. Chem. 298, 102266 (2022).
McKee, A. G. et al. General trends in the effects of VX-661 and VX-445 on the plasma membrane expression of clinical CFTR variants. Cell Chem. Biol. 30, 632–642 (2023).
Bernier, V. et al. Pharmacologic chaperones as a potential treatment for X-linked nephrogenic diabetes insipidus. J. Am. Soc. Nephrol. 17, 232–243 (2006).
Ghali, J. K., Hamad, B., Yasothan, U. & Kirkpatrick, P. Tolvaptan. Nat. Rev. Drug Discov. 8, 611–612 (2009).
Raina, R. et al. Clinical utility and tolerability of tolvaptan in the treatment of autosomal dominant polycystic kidney disease (ADPKD). Drug Healthc. Patient Saf. 14, 147–159 (2022).
Szalai, L. et al. Functional rescue of a nephrogenic diabetes insipidus causing mutation in the V2 vasopressin receptor by specific antagonist and agonist pharmacochaperones. Front. Pharmacol. 13, 811836 (2022).
Takahashi, K. et al. V2 vasopressin receptor (V2R) mutations in partial nephrogenic diabetes insipidus highlight protean agonism of V2R antagonists. J. Biol. Chem. 287, 2099–2106 (2012).
Isberg, V. et al. Generic GPCR residue numbers - aligning topology maps while minding the gaps. Trends Pharmacol. Sci. 36, 22–31 (2015).
Liu, H.-L. et al. Structural basis of tolvaptan binding to the vasopressin V2 receptor. Acta Pharmacol. Sin. 23, 3784–3799 (2024).
Rovati, G. E., Capra, V. & Neubig, R. R. The highly conserved DRY motif of class A G protein-coupled receptors: beyond the ground state. Mol. Pharmacol. 71, 959–964 (2007).
Mighell, T. L., Thacker, S., Fombonne, E., Eng, C. & O’Roak, B. J. An integrated deep-mutational-scanning approach provides clinical insights on PTEN genotype–phenotype relationships. Am. J. Hum. Genet. 106, 818–829 (2020).
Clausen, L. et al. A mutational atlas for Parkin proteostasis. Nat. Commun. 15, 1541 (2024).
Amorosi, C. J. et al. Massively parallel characterization of CYP2C9 variant enzyme activity and abundance. Am. J. Hum. Genet. 108, 1735–1751 (2021).
Weng, C., Faure, A. J., Escobedo, A. & Lehner, B. The energetic and allosteric landscape for KRAS inhibition. Nature 626, 643–652 (2024).
Tiemann, J. K. S., Zschach, H., Lindorff-Larsen, K. & Stein, A. Interpreting the molecular mechanisms of disease variants in human transmembrane proteins. Biophys. J. 122, 2176–2191 (2023).
Yee, S. W. et al. The full spectrum of SLC22 OCT1 mutations illuminates the bridge between drug transporter biophysics and pharmacogenomics. Mol. Cell 84, 1932–1947 (2024).
Coyote-Maestas, W., Nedrud, D., He, Y. & Schmidt, D. Determinants of trafficking, conduction, and disease within a K⁺ channel revealed through multiparametric deep mutational scanning. eLife 11, e76903 (2022).
Howard, M. K. et al. Molecular basis of proton-sensing by G protein-coupled receptors. Cell 188, 671–687 (2025).
McDonald, E. F., Sabusap, C. M. P., Kim, M. & Plate, L. Distinct proteostasis states drive pharmacologic chaperone susceptibility for cystic fibrosis transmembrane conductance regulator misfolding mutants. Mol. Biol. Cell 33, ar62 (2022).
Ding, D. et al. Protein design using structure-based residue preferences. Nat. Commun. 15, 1639 (2024).
Park, Y., Metzger, B. P. H. & Thornton, J. W. The simplicity of protein sequence-function relationships. Nat. Commun. 15, 7953 (2024).
Matreyek, K. A. et al. Multiplex assessment of protein variant abundance by massively parallel sequencing. Nat. Genet. 50, 874–882 (2018).
Faure, A. J. et al. Mapping the energetic and allosteric landscapes of protein binding domains. Nature 604, 175–183 (2022).
Janovick, J. A. et al. Restoration of testis function in hypogonadotropic hypogonadal mice harboring a misfolded GnRHR mutant by pharmacoperone drug therapy. Proc. Natl Acad. Sci. USA 110, 21030–21035 (2013).
René, P., Lanfray, D., Richard, D. & Bouvier, M. Pharmacological chaperone action in humanized mouse models of MC4R-linked obesity. JCI Insight 6, e132778 (2021).
Crawford, K. H. D. & Bloom, J. D. alignparse: a Python package for parsing complex features from high-throughput long-read sequencing. J. Open Source Softw. 4, 1915 (2019).
Rognes, T., Flouri, T., Nichols, B., Quince, C. & Mahé, F. VSEARCH: a versatile open source tool for metagenomics. PeerJ 4, e2584 (2016).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. J. 17, 10–12 (2011).
Seabold, S. & Perktold, J. Statsmodels: econometric and statistical modeling with Python. Proc. 9th Python in Science Conference 92–96 (SciPy, 2010).
Lehner, B. & Mighell, T. A pharmacological chaperone stabilizer rescues the expression of the vast majority of pathogenic variants in a G protein-coupled receptor. Zenodo https://doi.org/10.5281/zenodo.14216035 (2024).

Acknowledgements

This work was funded by a European Research Council Advanced Grant (883742), Wellcome (220540/Z/20/A), the Spanish Ministry of Science and Innovation (LCF/PR/HR21/52410004, EMBL Partnership, Severo Ochoa Centre of Excellence), the Bettencourt Schueller Foundation, the AXA Research Fund, Agencia de Gestio d’Ajuts Universitaris i de Recerca (AGAUR, 2017 SGR 1322) and the CERCA Program/Generalitat de Catalunya. T.L.M. was funded by an EMBO fellowship (ALTF 113-2021). We thank all members of the Lehner Lab and J. Selent and T. Stepniewski for helpful discussions. We thank the CRG/UPF Flow Cytometry Unit for assistance with the sorting experiments.

Author information

Authors and Affiliations

Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
Taylor L. Mighell & Ben Lehner
Universitat Pompeu Fabra (UPF), Barcelona, Spain
Ben Lehner
ICREA, Barcelona, Spain
Ben Lehner
Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
Ben Lehner

Authors

Taylor L. Mighell
Ben Lehner

Contributions

The project was conceived by T.L.M. and B.L. Experiments and analyses were performed by T.L.M. Figures were prepared by T.L.M. The manuscript was written by T.L.M. and B.L.

Corresponding author

Correspondence to Ben Lehner.

Ethics declarations

Competing interests

B.L. is a founder and shareholder of ALLOX. T.L.M. declares no competing interests.

Peer review

Peer review information

Nature Structural & Molecular Biology thanks Willow Coyote-Maestas, Kenneth Matreyek and Jonathan Schlebach for their contribution to the peer review of this work. Primary Handling Editor: Katarzyna Ciazynska, in collaboration with the Nature Structural & Molecular Biology team. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Extended analyses of primary V2R surface expression data.

a Overview of data processing for barcode-variant association with PacBio long-read sequencing. b Replicate correlations for the four control conditions. 37° was culture media without DMSO, DMSOA and DMSOB were culture media with 0.1% DMSO, DMSOC was culture media with 1% DMSO. Because the conditions are all well correlated, they were all considered control and combined for the control estimates. P < 1×10⁻¹⁰⁰ in all cases. c Hexbin plots comparing the change in hydrophobicity with surface expression score, separated by positions that are in transmembrane domains, outside of transmembrane domains, or the whole receptor considered together. d Violin plot of variant effects in the N-terminus of V2R. Highlighted with arrows are the putative O- and N-glycosylation sites (5 and 6, 22 and 24, respectively). e Variant effects at the sites of palmitoylation. f Top-down view of V2R, with residues colored by their preference for hydrophobicity. Highlighted with arrows are residues in the core of the receptor where hydrophilic amino acids are preferred. g Flow cytometry gating strategy.

Extended Data Fig. 2 Comparison of computational VEPs with empirical surface expression scores, highlighting NSIAD scores.

a Hexbin plot comparing surface expression with AlphaMissense scores for all missense variants. NSIAD scores are highlighted. b Hexbin plot comparing surface expression with ESM1b scores for all missense variants. NSIAD scores are highlighted. c Hexbin plot comparing surface expression with EVE scores for all missense variants. NSIAD scores are highlighted. d Hexbin plot comparing surface expression with RaSP scores for all missense variants. NSIAD scores are highlighted. e Hexbin plot comparing surface expression with ThermoMPNN scores for all missense variants. NSIAD scores are highlighted.

Extended Data Fig. 3 Extended analysis of rescue experiments.

a FACS data for V2R library in the control versus reduced temperature (27°) condition. b FACS data for V2R library in the control versus Tolvaptan condition. c Replicate correlations for 27° experiment. P < 1×10⁻¹⁰⁰. d Replicate correlations for Tolvaptan experiment. P < 1×10⁻¹⁰⁰. e Comparison of ten variants measured in multiplex or individually, in the presence of Tolvaptan. f Percent of AlphaMissense predicted pathogenic variants that are poorly, moderately, or well expressed in control and Tolvaptan conditions. g Violin plot comparing the surface expression scores of NSIAD variants in the control compared with the Tolvaptan condition. h Control surface expression compared with Tolvaptan rescue magnitude. i Reduced temperature rescue compared with Tolvaptan rescue magnitude. P < 1×10⁻¹⁰⁰.

Supplementary information

Reporting Summary

Peer Review File

Supplementary Tables 1–4.

Source data

Source Data Fig. 1

Numerical Source Data.

Source Data Fig. 2

Numerical Source Data.

Source Data Fig. 3

Numerical Source Data.

Source Data Fig. 4

Numerical Source Data.

Source Data Extended Data Fig. 1

Numerical Source Data.

Source Data Extended Data Fig. 2

Numerical Source Data.

Source Data Extended Data Fig. 3

Numerical Source Data.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Mighell, T.L., Lehner, B. A small molecule stabilizer rescues the surface expression of nearly all missense variants in a GPCR. Nat Struct Mol Biol (2025). https://doi.org/10.1038/s41594-025-01659-6

Received: 13 December 2024
Accepted: 24 July 2025
Published: 22 September 2025
DOI: https://doi.org/10.1038/s41594-025-01659-6
