Introduction

Splice site modulators are synthetic molecules, either oligonucleotides (oligos) or small molecules, that change the splicing outcome of a given pre-mRNA by either enhancing or preventing the use of a given splice site (the boundaries between exons and introns in a pre-mRNA)1,2. To date, the most successful of these molecules have been nusinersen (an oligo) and risdiplam (a small molecule) that are both approved treatments for spinal muscular atrophy (SMA)3,4. SMA is caused by mutations in the SMN1 gene that prevent production of functional protein5. A nearly identical gene, SMN2, is also present that can be used to produce the same protein but is normally inactive due to exclusion of exon 7 from the mRNA transcript by alternative splicing6,7,8. Both nusinersen and risdiplam promote exon 7 inclusion to allow for translation of the functional protein in SMA patients4. However, the mechanisms by which these two drugs function are quite different. Nusinersen targets an intronic splicing silencer element downstream of exon 7 that naturally represses exon 7 splicing9. On the other hand, risdiplam and related molecules such as branaplam are believed to restore exon 7 splicing by increasing the interaction between the U1 small nuclear ribonucleoprotein (snRNP) splicing factor and exon 7’s weak 5′ splice site (SS)10,11,12,13.

Remarkably, risdiplam and branaplam do not modulate splicing at all 5′SS but instead exhibit a high degree of selectivity12,13,14,15. The origins of this specificity or differences in which sites are impacted between risdiplam and branaplam are not well-understood14. The SMN2 exon 7 5′SS is atypical relative to more common sites in that it contains a bulged nucleotide at the −1 position (−1 represents the last nucleotide of the exon, +1 represents the first nucleotide of the intron). The bulged nucleotide (an adenosine) is predicted to be unpaired and flipped out of the RNA duplex formed between the 5′SS and the small nuclear RNA (snRNA) component of the U1 snRNP10. Biochemical and in vivo data have suggested that risdiplam and branaplam act specifically at these types of 5′SS and function by a bulge repair mechanism to convert the bulged duplex into something that more closely resembles a standard interaction between a 5′SS and U1 snRNA10,14,16. Since perturbations in U1 snRNP/5′SS binding can lead to changes in alternative splicing17,18, it is thought that these drugs enhance U1 snRNP affinity for the SMN2 exon 7 5′SS, which in turn promotes spliceosome assembly, splicing modulation, and exon inclusion. Thus, these drugs appear to modulate the SS recognition process to ultimately alter the nucleotide sequences of spliced mRNA products.

The 5′SS is recognized by multiple factors during spliceosome assembly including the U1 and U6 snRNAs. U1 and U6 base pair sequentially to the 5′SS with U1 snRNP associating during the earliest stages of spliceosome assembly before being displaced by the U6 snRNA just prior to spliceosome activation, when the active site for splicing is formed19. The U1 snRNP is composed of the U1 snRNA and 10 protein factors. The snRNA region that base pairs with the 5′SS is found at the very 5′ end and can form up to 11 contiguous base pairs, flanking both sides of the exon/intron boundary17. In humans, most 5′SS have fewer potential base pairing interactions with U1 snRNA and are highly sequence diverse: more than 9000 different 5′SS sequences are functional in humans, some of which contain atypical base pairing registers or bulged nucleotides as in SMN217. While the GU at the +1 and +2 positions is nearly invariant due to the constraints placed on this location by splicing chemistry, every possible nucleotide can be accommodated at every other position17,20. In both yeast and humans, the zinc finger domain of the U1-C protein binds the backbone of the 5′SS/snRNA duplex near the GU dinucleotide and has been postulated to fine-tune U1 snRNP interactions with RNA21. While yeast U1-C is an obligate snRNP component22, human U1-C is smaller and readily dissociates from purified human U1 snRNPs at room temperature23.

Quantitative models for 5′SS recognition by human U1 snRNP could reveal how the splicing factor discriminates between different RNA sequences as well as how this process is modulated by U1-C or drugs such as branaplam. A thermodynamic model has recently been proposed for explaining the in vivo effects of splice site modulating drugs14; however, there are no detailed kinetic data yet available for splice site recognition by human factors. Obtaining a kinetic mechanism of 5′SS recognition in a biochemically-defined system is essential for understanding and predictive modeling of the splicing decisions that give rise to the cellular transcriptome as well as understanding the mechanism of action of splicing modulator drugs14. A kinetic description of this process is particularly important since splicing, like many events in gene expression, does not occur under equilibrium conditions and is likely limited to a window of opportunity24. The window of opportunity is defined, in part, by how quickly U1 can associate with a 5′SS and how long it may remain bound in order to promote spliceosome assembly before the RNA is degraded, exported, or a competing 5′SS is utilized. Splice site modulation by drugs such as branaplam is likely also restricted to the same, or related, window of opportunity.

To elucidate 5′SS recognition and modulation in humans, we reconstituted a model human U1 snRNP and assayed its interactions with RNA oligos in the presence and absence of branaplam. A combination of surface plasmon resonance (SPR), microscale thermophoresis (MST) and colocalization single molecule spectroscopy (CoSMoS) assays reveals how 5′SS containing a bulged adenosine at the −1 position (−1A) are recognized and modulated by drugs working collaboratively with protein splicing factors. Branaplam reversibly binds to the U1 snRNP/5′SS complex, and drug modulation of this complex is strictly dependent on reversible binding of U1-C. U1-C in turn can only bind to the snRNP if the 5′SS has not yet been engaged. Thus, our sequential binding mechanism predicts that 5′SS modulation by branaplam depends on an ordered series of events: U1-C binds to U1 snRNP, this complex then binds to the 5′SS, and finally branaplam binds to the U1 snRNP/U1-C/5′SS ternary complex. This mechanism reveals how a reversibly binding splicing modulator can elicit formation of long-lived U1 snRNP/5′SS interactions as well as fundamental features of human 5′SS recognition.

Results

A −1A bulge in the 5′SS induces dynamic RNA binding to U1 snRNP

To measure the effects of branaplam on U1 snRNP/5′SS interactions, we reconstituted a minimal U1 snRNP based upon prior work from the Nagai laboratory (Fig. 1A, Supplementary Fig. 1)21. Our U1 snRNP complex consisted of seven Sm proteins, the N-terminal region of U1-70K, the zinc finger region of U1-C, and a truncated U1 snRNA. It has previously been shown that the U1-A protein and snRNA stem loops I and II (which contains the binding site for U1-A) are not essential for splicing25. As in prior work by Nagai and coworkers, several Sm proteins were also truncated to remove unstructured domains and the U1-70K fragment was fused to SmD1 (see Methods). The sequence of our commercially synthesized U1 snRNA was further altered in the shortened stem loop 1 region to prevent dimerization (as opposed to the construct used by Nagai and coworkers for crystallization) and modified with 2′-O-methyls at the +1A and +2U positions and pseudouridines at +5 and +6 as found in the native human U1 snRNA. The snRNA was designed without a 5′ cap, which is also not believed to be required for SS recognition and splicing25. The complex was assembled in solution from purified components (including U1-C, unless otherwise noted) following established methods and purified via size exclusion chromatography21.

Fig. 1: A −1A bulge induces dynamic RNA binding to U1 snRNP.

A Crystal structure of minimal U1 snRNP (PDB: 4PJO). Arrows indicate relative placement of modifications for single molecule measurements. B RNA oligos containing 5′SS. Shaded region indicates predicted base pairing interactions with the U1 snRNA (shown at the top, above a schematic of the exon (box)/intron (line) junction). C SPR sensorgrams showing the association and dissociation of surface-tethered (top) 11bp and (bottom) 11bp-1A RNAs at various U1 snRNP concentrations (0.02 to 100 nM). D Cartoon schematic of the two-color CoSMoS assay for monitoring U1/RNA interactions. E Fields of view under 633 nm (left, immobilized U1 snRNP molecules) and 532 nm (right, interacting RNA oligos) excitation (scale bar = 20 µm). Inset highlights colocalization (scale bar 1 µm). Fluorescent beads were included as fiducial markers (yellow arrow). Images are rendered by averaging three consecutive images and applying uniform brightness and contrast values. F Fluorescence in arbitrary units (au) across time in seconds (s) showing the binding of 9bp (top) and 9bp-1A (bottom) to surface tethered U1 snRNP (0.33 frames/s, Hz). Black lines indicate idealized fit. G Linear regression (solid line) of keq values (circles) on 9bp concentration, given by keq = kon([9bp]) + koff. The shaded region is the 95% confidence interval of the linear regression. Here, the kon value is fixed from maximum likelihood estimations of unbound dwell times (see F) at kon = 3.9 ×106 M−1s−1, resulting in a koff = 4.6 ± 2.9 ×10−4 s−1. H Cumulative probability distributions of (left) unbound dwell times (0.5 nM, N = 2670; 1 nM, N = 5154; 2 nM, N = 3788; 4 nM, N = 4373) and (right) bound dwell times (0.5 nM, N = 2380; 1 nM, N = 5216; 2 nM, N = 4093; 4 nM, N = 4895) across a range of 9bp-1A RNA concentrations. I Cumulative probability distribution of bound dwell times of the 9bp-1A RNA at 1 nM (grey, N = 5216) overlaid with MLE of single (blue dashed) and biexponential (red) distributions.
Source data are provided as a Source data file.

We first determined the effect of a −1A substitution in the 5′SS on U1 snRNP/5′SS duplex formation using a fully complementary 5′SS oligo (11bp) and a -1A bulged variant (11bp-1A) (Fig. 1B) by surface plasmon resonance (SPR). In these assays, the 5′SS oligo is tethered to the surface via a 3′ biotin and U1 snRNP is added in solution at various concentrations (1.23-100 nM). As expected, the fully complementary 11bp sequence binds the U1 snRNP very tightly (Fig. 1C top, KD = 2.03 ± 0.08 x 10−10 M) while the introduction of the -1A reduces the affinity by an order of magnitude (KD = 2.70 ± 0.01 x 10−9 M). These values stem from global fitting to a single-site binding model consisting of a single kon and koff; however, this simple model does not sufficiently describe the sensorgrams of 11bp or 11bp-1A (Supplementary Fig. 2). To circumvent the challenge of identifying unique kinetic parameters that best describe ensemble data in nonlinear models26, we used colocalization single-molecule spectroscopy (CoSMoS) to observe U1 snRNP/5′SS interactions.
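The single-site binding model mentioned above can be sketched in a few lines. This is a generic 1:1 (Langmuir) sensorgram model, not the fitting code used in this work; the function names and the parameter values (`KON`, `KOFF`, `RMAX`) are hypothetical and chosen only so that KD = koff/kon falls in the sub-nanomolar range discussed in the text.

```python
import math

def association(t, conc, kon, koff, rmax):
    """Response during analyte injection for a 1:1 (single-site) model."""
    kobs = kon * conc + koff                  # observed equilibration rate (s^-1)
    req = rmax * conc / (conc + koff / kon)   # plateau response at this concentration
    return req * (1.0 - math.exp(-kobs * t))

def dissociation(t, r0, koff):
    """Response after analyte wash-out: single exponential decay from r0."""
    return r0 * math.exp(-koff * t)

# Hypothetical parameters (assumptions, not fitted values from this study)
KON = 1.0e6    # M^-1 s^-1
KOFF = 2.0e-4  # s^-1
RMAX = 100.0   # response units at saturation

kd = KOFF / KON                                       # equilibrium dissociation constant (M)
plateau_at_kd = association(1e9, kd, KON, KOFF, RMAX)  # very long injection
print(f"KD = {kd:.1e} M, plateau at [U1] = KD: {plateau_at_kd:.1f} RU")
```

In this model a single kon and koff describe every concentration, so a global fit shares both parameters across all sensorgrams; systematic residuals, as reported in Supplementary Fig. 2, indicate that the underlying process involves more than one step or population.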

For CoSMoS experiments, we generated a modified minimal U1 snRNP similar to that used for the SPR assays except that it also contained a biotin moiety on SmF for surface tethering and a 3′-Cy5 dye on the U1 snRNA (Fig. 1A, Supplementary Fig. 1B). Due to potential interference from U1-C dissociating from the U1 snRNP21,23 under the dilute conditions required for single molecule immobilization, we reconstituted our U1 snRNP particle without U1-C. The U1-C zinc-finger domain was then separately purified and added directly into solution as required (typically at 100 nM unless otherwise noted; U1-C binding was also analyzed in depth as described subsequently).

The fluorescently-labeled and biotinylated U1 snRNP particle was immobilized with streptavidin onto a passivated and biotinylated glass slide for CoSMoS measurements (Fig. 1D). An RNA oligo containing a 3′-Cy3 dye was added to the solution for direct visualization of U1 snRNP/5′SS binding dynamics. Binding events were monitored using either sequential or alternating laser excitation of 532 nm (5′SS, Cy3) and 633 nm (U1 snRNP, Cy5). Co-localized spots were detected in each channel following channel alignment, and the fluorescence time series at each spot were idealized using unsupervised statistical methods to delineate bound and unbound events for each U1 snRNP molecule (Methods, Fig. 1E). For all 5′SS oligos we investigated, we flanked the 5′ and 3′ ends with an ‘ACA’ motif outside of the 11 nucleotides corresponding to the 5′SS. The secondary structure of each 5′SS/U1 snRNA duplex was predicted using the RNAstructure Web Server27 to ensure the minimum free energy structure matched our expectations (Supplementary Fig. 3).

We first replicated the effect of a -1A bulge on the U1:5′SS interaction observed by SPR at the single molecule level. Considering the tight binding of the 11bp-1A RNA without branaplam, we opted to use a 9bp-1A oligo for single molecule experiments. We hypothesized the reduced number of base pairs would increase the koff and minimize the impact of photobleaching over extended observation times. Mismatches were incorporated at +7 and +8 of the 5′SS, as these positions are not well conserved in humans17. Similar to our SPR data, we see exceptionally tight binding of the 9bp complement (Fig. 1F, Supplementary Fig. 4). No binding was observed at 10 nM of a fully mismatched oligo at the same frame rate (observed fraction bound <0.002 across 1297 U1 snRNP molecules compared to an observed fraction bound of 0.77 across 1271 molecules at 1 nM of 9bp, Supplementary Fig. 5). This demonstrates that non-specific binding did not meaningfully impact our analysis under the experimental conditions.

Given that the dissociation of the 9bp 5′SS RNA was very slow, we implemented a non-equilibrium imaging scheme whereby the cumulative binding of the 9bp RNA was monitored over time immediately following its addition to solution28. This measurement was repeated across various concentrations of RNA from 0.25-2 nM (0.25 nM, N = 1204; 0.5 nM, N = 1283; 1 nM, N = 1271; 2 nM, N = 1156, where N is the number of U1 molecules). We then determined the apparent association rate (kapparent) at each concentration using maximum likelihood estimation (MLE) of our unbound dwell time distributions. These parameters stem from estimations of a modified exponential probability density function that accounts for the sampling rate and observation time29. Next, we performed linear regression of kapparent values obtained at different 9 bp 5′SS RNA concentrations to determine a kon = 3.9 ± 0.2 × 106 M−1 s−1. The equilibration rate (keq) at each RNA concentration was determined by fitting the observed fraction bound over time curves to a single exponential function (Supplementary Fig. 6). Finally, we performed linear regression of keq values as a function of 9bp RNA concentration (Fig. 1G). The slope was constrained at the kon value from MLE, resulting in a koff = 6.4 ± 2.3 × 10−4 s−1. Overall, we estimate a KD ≈ 1.2 × 10−10 M for a 9bp 5′SS RNA, which closely matches our SPR data of an 11bp 5′SS RNA (KD = 2.03 × 10−10 M). The similarity of these values provides high confidence that neither photobleaching nor surface immobilization is significantly impacting our analysis. Together the SPR and CoSMoS data indicate that U1 snRNP binds highly complementary RNAs very tightly with a KD of ~100 pM, and this affinity is primarily attributable to formation of very stable bound complexes with lifetimes of ~27 min rather than rapid association kinetics. 
These data also indicate that additional base pairing interactions between the +7 and +8 positions of a 5′SS with the AU dinucleotide present at the 5′ end of the snRNA do not necessarily confer significant changes in the dissociation constant, KD.
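The constrained regression described above follows from the relation keq = kon[RNA] + koff given in the Fig. 1G legend: once the slope is fixed at kon, the least-squares intercept is simply the mean of keq − kon[RNA]. The sketch below illustrates this under stated assumptions; the keq values are made-up illustrative numbers, not the measured data, and the helper name `koff_from_fixed_slope` is hypothetical.

```python
KON = 3.9e6  # M^-1 s^-1, association rate constant quoted in the text

def koff_from_fixed_slope(concs_molar, keq_values, kon):
    """Least-squares intercept of keq = kon*[RNA] + koff with the slope fixed at kon."""
    residuals = [keq - kon * c for c, keq in zip(concs_molar, keq_values)]
    return sum(residuals) / len(residuals)

# Illustrative (not measured) equilibration rates at 0.25, 0.5, 1 and 2 nM RNA
concs = [0.25e-9, 0.5e-9, 1.0e-9, 2.0e-9]   # M
keqs = [1.5e-3, 2.4e-3, 4.4e-3, 8.3e-3]     # s^-1

koff = koff_from_fixed_slope(concs, keqs, KON)
kd = koff / KON   # equilibrium dissociation constant (M)
print(f"koff = {koff:.2e} s^-1, KD = {kd:.2e} M")
```

Fixing the slope trades flexibility for stability: with only four concentrations and a very small intercept, a free two-parameter fit would leave koff poorly determined.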

At the single molecule level, the introduction of the -1A bulge (9bp-1A) into the 5′SS results in dynamic RNA binding to U1 snRNP (Fig. 1F, Supplementary Fig. 7). Given the faster kinetics from the weaker binding, we were able to perform equilibrium measurements whereby imaging commenced after equilibrium was reached. Compared to the 9bp RNA, we see only a slight decrease in kon to 2.9 ± 0.2 × 106 M−1 s−1, indicating that the energetic penalty of the -1A bulge stems from duplex stability and not recruitment. On average, the 9bp-1A 5′SS RNA exhibits a koff of 3.4 ± 0.2 × 10−3 s−1 and KD = 1.2 × 10−9 M. These results are also close to our SPR data for the 11bp-1A 5′SS RNA and confirm an order of magnitude reduction in kinetic stability from the -1A substitution.

We next analyzed the single molecule distributions of bound and unbound dwell times (Fig. 1H, Supplementary Fig. 7). Bound dwell times across all 9bp-1A 5′SS RNA concentrations were poorly described by a single exponential distribution and instead required at least two exponential components (Fig. 1I, Supplementary Fig. 8). MLE resulted in two bound time constants (\({\tau }_{B}^{1}\) ≈ 20 s and \({\tau }_{B}^{2}\) ≈ 330 s), each with its own relative weight (A1 ≈ 0.12 and A2 ≈ 0.88). As expected for a unimolecular dissociation process, bound dwell time parameters are constant across -1A RNA concentrations (Supplementary Fig. 8B). We additionally performed MLE for unbound dwell times across 9bp-1A 5′SS RNA concentrations. Although subtle, we do see evidence for two time constants that becomes more apparent with increasing RNA concentration (Supplementary Fig. 8). We conclude that introduction of a bulged -1A nucleotide reduces U1 snRNP affinity for a 5′SS by an order of magnitude, although the interaction is still quite strong with a KD of 1-2 nM. The reduction in affinity is driven by a reduced stability of the U1 snRNP/5′SS complex whose average lifetime decreases from ~27 to <5 min. This corresponds with a thermodynamic penalty due to the -1A bulge of ~2 kcal/mol when binding to the U1 snRNP, similar to values reported for the impact of bulged nucleotides on the stabilities of RNA-only duplexes30.
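The comparison between single and biexponential descriptions of the bound dwell times can be illustrated with a plain mixture-of-exponentials likelihood. This is a simplified sketch: it omits the corrections for sampling rate and observation time used in the actual analysis, it assumes that A1 and A2 are the mixture weights of the probability density, and the dwell times in `example_dwells` are hypothetical.

```python
import math

def biexp_pdf(t, a1, tau1, tau2):
    """Probability density of a two-component exponential mixture (weights a1, 1 - a1)."""
    a2 = 1.0 - a1
    return a1 / tau1 * math.exp(-t / tau1) + a2 / tau2 * math.exp(-t / tau2)

def log_likelihood(dwells, a1, tau1, tau2):
    """Log-likelihood of the dwell times; a1 = 1 reduces to a single exponential."""
    return sum(math.log(biexp_pdf(t, a1, tau1, tau2)) for t in dwells)

def mean_lifetime(a1, tau1, tau2):
    """Mean dwell time of the mixture: weighted average of the two time constants."""
    return a1 * tau1 + (1.0 - a1) * tau2

# Approximate parameters quoted in the text for the 9bp-1A RNA
A1, TAU1, TAU2 = 0.12, 20.0, 330.0
mean_bound = mean_lifetime(A1, TAU1, TAU2)   # seconds

# Hypothetical dwell times (s) containing both short and long events
example_dwells = [5.0, 10.0, 300.0, 600.0]
ll_bi = log_likelihood(example_dwells, A1, TAU1, TAU2)
ll_mono = log_likelihood(example_dwells, 1.0, mean_bound, TAU2)
print(f"mean bound lifetime = {mean_bound:.1f} s, LL(bi) - LL(mono) = {ll_bi - ll_mono:.2f}")
```

With these weights the mixture mean is just under 5 min, in line with the reciprocal of the average koff reported for the 9bp-1A RNA.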

Branaplam modulates the binding of -1A bulged 5′SS RNA to U1 snRNP

We then investigated the modulation of the U1 snRNP/5′SS duplex by branaplam (Fig. 2A, B). At the ensemble level using SPR, we maintained a constant concentration of U1 snRNP in solution (10 nM) and varied the concentration of branaplam, revealing a branaplam-dependent slowing of U1 snRNP:11bp-1A 5′SS RNA dissociation (Fig. 2C). The dissociation phase of each sensorgram was fit to a single exponential decay function to determine koff. In the case of 11bp-1A RNA, the fitted koff values yielded an EC50 = 1.3 ± 0.1 µM for branaplam, and we see a 4-fold reduction in koff of the RNA due to inclusion of branaplam (2.35 ± 0.14 × 10−3 s−1 and 6.10 ± 0.21 × 10−4 s−1 at 0 and 5 µM branaplam, respectively, Fig. 2D). Strikingly, under these high concentrations of branaplam, the koff of U1 snRNP:11bp-1A 5′SS RNA approaches the dissociation rate of the 11bp 5′SS RNA (koff = 2.11 ± 0.03 × 10−4 s−1). As expected, branaplam had no effect on the binding kinetics of the 11bp 5′SS RNA itself since this RNA lacks the -1A bulge (Fig. 2D).
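The dose dependence of koff can be pictured with a simple hyperbolic (Hill coefficient of 1) response curve. The exact equation used for fitting is not stated in this section, so the functional form below is an assumption, as is the saturating value `KOFF_SATURATED`; only the drug-free koff, the koff at 5 µM branaplam, and the EC50 are taken from the text.

```python
def hyperbolic_response(conc, bottom, top, ec50):
    """Value that falls from `top` (no drug) toward `bottom` (saturating drug)."""
    return bottom + (top - bottom) * ec50 / (ec50 + conc)

KOFF_NO_DRUG = 2.35e-3    # s^-1, 11bp-1A RNA without branaplam (from the text)
KOFF_5UM = 6.10e-4        # s^-1, 11bp-1A RNA at 5 uM branaplam (from the text)
EC50_UM = 1.3             # uM (from the text)
KOFF_SATURATED = 2.0e-4   # s^-1, hypothetical plateau used only for illustration

fold_reduction = KOFF_NO_DRUG / KOFF_5UM
midpoint = hyperbolic_response(EC50_UM, KOFF_SATURATED, KOFF_NO_DRUG, EC50_UM)
print(f"fold reduction at 5 uM: {fold_reduction:.2f}")
print(f"model koff at the EC50: {midpoint:.2e} s^-1")
```

By construction, the model value at the EC50 lies halfway between the drug-free and saturating rates, which is what the fitted EC50 reports.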

Fig. 2: Branaplam stabilizes -1A bulged 5′SS RNA binding to U1 snRNP.

A RNA oligos containing -1A bulged 5′SS. Shaded region indicates predicted base pairing interactions with the U1 snRNA (top). B Chemical structure of branaplam. C SPR showing the association and dissociation of 11bp-1A RNA across various concentrations of branaplam (0–5 µM) and at 10 nM U1 snRNP. D Dose response curve showing the fitted dissociation rates (koff) of response units vs branaplam concentration for 11bp-1A (white circles) and 11bp (black circles) RNAs. Data for 11bp-1A is overlaid with the fitted equation to determine EC50 value (1.3 ± 0.1 µM). E Fluorescence in arbitrary units (au) across time in seconds (s) of 1 nM 9bp-1A RNA binding to immobilized U1 in the presence of 100 nM U1-C plus (top) DMSO (0.33 Hz) or (bottom) 10 µM branaplam (0.11 Hz) overlaid with idealizations (black lines). F Dose response curves of 9bp-1A RNA binding to U1 snRNP vs branaplam concentration. The fraction bound at each branaplam concentration (white circles) is overlaid with a fitted equation (solid red line) and the 95% confidence interval of the fitted equation (shaded red region) to estimate an EC50 value (0.45 ± 0.17 µM). G Cumulative probability distribution of bound dwell times across branaplam concentrations (DMSO, N = 5216; 0.1 µM, N = 6234; 0.3 µM, N = 4811; 1 µM, N = 3640; 3 µM, N = 2211; 10 µM, N = 1503). H Cumulative probability distribution of bound dwell times at 1 nM 9bp-1A and 100 nM U1-C (grey circles, N = 3640) in presence of 1 µM branaplam overlaid with MLE of mono (blue dashed) and biexponential distributions (solid red). I Maximum likelihood estimates of (left) time constants (\({\tau }_{B}^{1}\) and \({\tau }_{B}^{2}\)) and (right) amplitude of \({\tau }_{B}^{2}\) of a biexponential distribution for bound dwell times at 1 nM 9bp-1A RNA and 100 nM U1-C across branaplam concentrations (mean ± SEM).
Plotted parameters are computed across all single molecules for each branaplam concentration (DMSO, N = 5216; 0.1 µM, N = 6234; 0.3 µM, N = 4811; 1 µM, N = 3640; 3 µM, N = 2211; 10 µM, N = 1503). J Contour plots showing the correlation between successive bound event durations (\(i\) and \(i+1\)) within individual molecules. Dashed line is the identity line. Source data are provided as a Source data file.

We then used CoSMoS assays to glean more detailed insights into how branaplam is altering 5′SS binding. CoSMoS experiments were conducted across branaplam concentrations from 0.1 to 10 µM, with 1 nM 9bp-1A 5′SS RNA and 100 nM U1-C in solution (Fig. 2E, Supplementary Fig. 9). Consistent with our SPR data, we see the lifetimes of single binding events increase with increasing concentrations of branaplam (Fig. 2G). At our highest concentration of branaplam (10 µM), we observe some individual binding events persisting for over an hour. To capture these data with a minimal contribution from photobleaching, we decreased our sampling frame rate from 0.33 Hz in the absence of branaplam to 0.11 Hz at 10 µM branaplam. These experiments yielded an EC50 = 0.60 ± 0.12 µM for branaplam (Fig. 2F), similar to the EC50 determined by the koff rates from SPR for the 11bp-1A 5′SS RNA (1.3 ± 0.1 µM). We did not observe any significant effect of branaplam on the association rate of the 9bp-1A 5′SS RNA (Supplementary Fig. 10), consistent with branaplam specifically perturbing 5′SS RNA dissociation, but not association, rates.

We performed MLE of single and biexponential distributions of our bound dwell times across branaplam concentrations. As in the absence of branaplam, we find that two time constants are required to describe the distributions in all cases (Fig. 2H). We find that branaplam does not affect the relative amplitude of our two populations, but predominantly increases the slower bound time constant, \({\tau }_{B}^{2}\), in a concentration dependent manner. The \({\tau }_{B}^{1}\) value and the relative amplitudes of the two bound time constants are similar to the no branaplam conditions (Fig. 2I). Overall, an order of magnitude increase in \({\tau }_{B}^{2}\) for the 9bp-1A 5′SS RNA is observed between the absence (342 ± 8 s) and presence of 10 µM branaplam (4426 ± 458 s). Together, the SPR and CoSMoS data indicate that branaplam does not facilitate -1A 5′SS RNA association and primarily functions to stabilize the U1 snRNP/-1A 5′SS RNA complex. However, only a subset of U1 snRNP/-1A 5′SS interactions is branaplam-sensitive.

We next tested whether or not this subset of branaplam-sensitive U1 snRNP/-1A 5′SS interactions arises from two different types of U1 snRNP/-1A 5′SS complexes on the surface (e.g., +/- a particular, non-fluorescent subunit) or from reversible interconversion of a kinetically homogenous U1 snRNP/-1A 5′SS complex between two states (e.g., a conformational change). We correlated the lifetimes of successive binding events at individual molecules. In the case of two different types of U1 snRNP/-1A 5′SS complexes, such analysis should yield at least two clusters of individual molecules showing distinct kinetic behaviors, whereas only a single cluster should emerge if a homogenous group of U1 snRNP/-1A 5′SS complexes were interconverting between two states31. This analysis indicates the presence of two distinct clusters, dependent on the branaplam concentration (Fig. 2J). This is particularly evident at 10 µM branaplam, where two clusters identified by agglomerative clustering display unique average bound and unbound times for the -1A 5′SS RNA (Methods). Here, the molecules that more weakly and dynamically engage with the -1A 5′SS RNA account for 5% of the data (Supplementary Fig. 11). This result suggests the presence of two types of U1 snRNP molecules on the surface, each with distinct kinetic properties. We were able to robustly detect the smaller population due to the analysis of many thousands of single molecule binding events and by avoiding ensemble averaging which would have obscured their presence.
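The successive-event analysis can be sketched as follows: for each molecule, every bound dwell time is paired with the next one from the same molecule, and the resulting (i, i+1) pairs are what a plot such as Fig. 2J displays. The code below is a minimal illustration with hypothetical dwell times; the Pearson correlation of log durations is added here only as a convenient summary and is not a statistic reported in this work, and the clustering step itself is not reproduced.

```python
import math

def successive_pairs(dwells_by_molecule):
    """Pair each bound dwell time with the next one from the same molecule."""
    pairs = []
    for dwells in dwells_by_molecule.values():
        pairs.extend(zip(dwells[:-1], dwells[1:]))
    return pairs

def pearson_log(pairs):
    """Pearson correlation between log10 durations of events i and i+1."""
    xs = [math.log10(a) for a, _ in pairs]
    ys = [math.log10(b) for _, b in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical bound dwell times (s): one short-lived and one long-lived molecule
example = {
    "molecule_a": [20.0, 25.0, 18.0],
    "molecule_b": [3000.0, 4000.0, 3500.0],
}
pairs = successive_pairs(example)
r = pearson_log(pairs)
print(f"{len(pairs)} successive pairs, correlation of log durations = {r:.2f}")
```

When molecules belong to kinetically distinct classes, short events tend to follow short events and long events follow long ones, so the pairs separate into groups along the identity line.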

U1-C is required for 5′SS modulation by branaplam

We hypothesized that the source of the two distinct kinetic populations could be attributed to the presence or absence of U1-C given the reported lability of this factor and its contacts with the snRNA/5′SS duplex and site of the -1A bulge21,23. In this regard, the relative amplitudes from MLE of our bound time distributions in the presence of branaplam would presumably correspond to the fractions of U1 snRNP molecules without (the amplitudes of the short-lived parameters) and with U1-C (the amplitudes of the long-lived parameters).

In agreement with this hypothesis, we observed that withholding U1-C dramatically changed the binding kinetics (Fig. 3A, B; Supplementary Fig. 12). In the absence of branaplam and U1-C, we see only weak binding at 1 nM 9bp-1A 5′SS RNA with an average bound lifetime of \({\tau }_{B}\) = 65.2 ± 1.3 s (N = 2381). This value matches the \({\tau }_{B}^{1}\) parameter we observed in the branaplam and no branaplam conditions. The addition of 10 µM branaplam in the absence of U1-C had no effect on duplex stability (\({\tau }_{B}\) = 57.9 ± 0.9 s, N = 3806). These data are consistent with a model in which U1-C is required for branaplam binding and modulation of U1 snRNP/5′SS interactions.

Fig. 3: U1-C is required for branaplam modulation of -1A 5′SS RNA binding.

A Fluorescence in arbitrary units (au) across time in seconds (s) of 1 nM 9bp-1A binding in the absence of U1-C with (top) DMSO and (bottom) 10 µM branaplam overlaid with idealizations (black lines). B Violin plots of bound dwell time distributions at 1 nM 9bp-1A RNA with and without U1-C and/or branaplam. DMSO was included in the absence of branaplam. C Change in fluorescence due to branaplam binding to a duplex of 11bp-1A and U1 snRNP in the absence (grey triangles) and presence (white circles) of U1-C by microscale thermophoresis (MST). The change in fluorescence in the presence of U1-C is overlaid with a fitted equation (solid purple) and 95% confidence interval of the fitted equation (shaded purple region) to estimate a KD value (2.69 ± 0.36 µM). D Violin plots showing the bound dwell time distributions across different permutations of U1-C and branaplam concentrations in solution across indicated RNA oligo sequences (SMN2, FOXM1, SF3B3, HTT*). Each violin plot is overlaid with a box plot that shows the median (horizontal line), interquartile range (IQR, box), and whiskers representing data within 1.5\(\times\)IQR. Highlighted nucleotides in the 5′SS sequences above each plot indicate predicted base pairs to the U1 snRNA. The lower case letters in HTT* indicate +7 G:A and +8 G:U substitutions included to enable RNA synthesis. For B and D, numbers above the violins indicate the number of bound lifetimes included in each distribution. Source data are provided as a Source data file.

Since the CoSMoS assay cannot monitor branaplam binding directly, our single molecule data cannot by themselves reject an alternative model in which branaplam initially binds the U1 snRNP/-1A 5′SS duplex. In this alternate model, branaplam would then interact with the bulged nucleotide to relieve a steric clash and enable U1-C association, as has been proposed for risdiplam and branaplam-like molecules10,16. To distinguish between U1-C-first and branaplam-first models, we used MST to directly monitor the binding of branaplam to the U1 snRNP bound to a fluorescent 11bp-1A 5′SS RNA at the ensemble level (Fig. 3C). We pre-formed a U1 snRNP/11bp-1A 5′SS RNA complex using our U1 snRNP particle and observed the shift in fluorescence intensity upon binding branaplam. When performed in the absence of U1-C, we do not observe any appreciable binding to branaplam at up to 20 µM. Upon inclusion of U1-C in solution, we see a branaplam concentration-dependent response and can estimate a KD of 2.7 ± 0.4 x10−6 M for branaplam binding, which is consistent with EC50 values we measured using SPR and CoSMoS (Fig. 2C, F). This assay was repeated using the 11bp 5′SS RNA and, as expected, did not detect any binding to branaplam due to the absence of the -1A bulge (Supplementary Fig. 13A). Finally, we replaced the U1 snRNP particle with an RNA mimicking the U1 snRNA 5′ end, annealed to the 11bp-1A 5′SS RNA, and assayed for branaplam binding by MST. Under these conditions, we also could not detect any binding of branaplam to the RNA:RNA duplex (Supplementary Fig. 13B). Combined, these data show that branaplam binds a -1A bulged U1 snRNA/5′SS duplex only in the presence of the U1 snRNP and U1-C and support the U1-C-first model.

We next determined if the U1-C requirement extended to other naturally occurring -1A bulged 5′SS, including those with clinical relevance. We examined four 5′SS associated with alternative splicing and known to be affected by branaplam or risdiplam (Fig. 3D)14,15. These include exon 7 of SMN2 which is involved in SMA, pseudoexon 50a of HTT which is involved in Huntington’s disease, pseudoexon 2a of SF3B3 which is used as a model 5′SS for targeted exonization by branaplam for gene therapy32, and exon 9 of FOXM1 which is an off-target of risdiplam identified in patient-derived fibroblasts33. In the case of the HTT 5′SS, the endogenous sequence proved to be synthetically intractable due to a stretch of guanine bases. Therefore, we substituted guanines at the +7 and +8 positions with UA (HTT*).

CoSMoS was used to measure the extent to which U1-C and branaplam modulate the interaction of each 5′SS with U1 snRNP. For simplicity, each oligo was only examined at one concentration (SMN2: 10 nM, SF3B3: 10 nM, FOXM1: 10 nM, HTT*: 3 nM) across four conditions: +/- U1-C (0 or 100 nM) and +/- branaplam (DMSO or 10 µM) (Supplementary Figs. 14-17). Consistent with the 9bp-1A 5′SS RNA, all four 5′SS display weaker binding to U1 and no effect upon addition of branaplam in the absence of U1-C. In the case of SF3B3, we were not even able to observe enough binding events above background to determine a bound lifetime in the absence of U1-C.

The addition of U1-C alone promoted binding of each 5′SS by at least 1.7-fold, and this was further enhanced by branaplam. The magnitude of branaplam-enhancement varies among the different 5′SS (SMN2: 3.7x, SF3B3: 7.7x, FOXM1: 5.5x, HTT*: 3.6x), consistent with additional sequence dependencies beyond just the presence of a -1A bulged nucleotide14,15. Overall, our combined single molecule and ensemble data show that U1-C must be present for branaplam to bind to and modulate U1 snRNP/-1A 5′SS complexes and that the extent of binding enhancement is sequence-dependent.

U1-C promotes and stabilizes 5′SS binding to U1 snRNP

Given the critical roles of U1-C for both duplex stability and branaplam enhancement, we next investigated the effect of U1-C on 5′SS recognition in general. For these CoSMoS measurements, we held the concentration of 9bp-1A 5′SS RNA constant at 1 nM and varied the solution concentration of the U1-C from 0 to 100 nM (Supplementary Fig. 18). The average fraction bound increased as a function of U1-C concentration with an EC50 = 1.3 ± 0.2 nM (Fig. 4A).

Fig. 4: U1-C promotes and stabilizes the binding of a -1A RNA to U1 snRNP.

A Dose response curves of 9bp-1A RNA binding to U1 snRNP vs. U1-C concentration. The red line and shading indicate the fit and 95% CI to the fitted equation to estimate an EC50 value (1.3 ± 0.2 nM). B Cumulative probability distribution of unbound dwell times at 1 nM 9bp-1A RNA across U1-C concentrations (0 nM, N = 3106; 0.1 nM, N = 2920; 1 nM, N = 5543; 10 nM, N = 3169; 100 nM, N = 5154). C Unbound time constants (\({\tau }_{U}\)) determined from MLE of a monoexponential distribution for unbound dwell times overlaid with a fit to the fitted equation (EC50 = 1.8 ± 0.5 nM). Plotted \({\tau }_{U}\) values (circles) are shown as mean ± SEM and are computed across all single molecules for each U1-C concentration (0 nM, N = 2381; 0.1 nM, N = 2247; 1 nM, N = 4764; 10 nM, N = 3105; 100 nM, N = 5216). The fitted equation is shown as the fit (solid line) and 95% confidence interval of the fit (shaded region). D Cumulative probability distribution of bound dwell times at 1 nM 9bp-1A RNA across U1-C concentrations (0 nM, N = 2381; 0.1 nM, N = 2247; 1 nM, N = 4764; 10 nM, N = 3105; 100 nM, N = 5216). E Cumulative probability distribution of bound dwell times at 1 nM 9bp-1A RNA and 100 nM U1-C (grey circles) overlaid with MLE fits of monoexponential (blue dashed) and biexponential (solid red) distributions. F MLE of bound time constants (\({\tau }_{B}^{1}\) and \({\tau }_{B}^{2}\), left) and amplitude of \({\tau }_{B}^{2}\) (right) of a biexponential distribution for bound dwell times at 1 nM 9bp-1A RNA across U1-C concentrations (mean ± SEM). The amplitudes of \({\tau }_{B}^{2}\) are overlaid with a fit to the fitted equation (EC50 = 1.0 ± 0.2 nM). The fitted equation is shown as the fit (solid line) and 95% confidence interval of the fit (shaded region). Plotted parameters and errors are computed across all single molecules for each U1-C concentration (0 nM, N = 2381; 0.1 nM, N = 2247; 1 nM, N = 4764; 10 nM, N = 3105; 100 nM, N = 5216).
G Contour plots showing the correlation between successive bound event durations (\(i\) and \(i+1\)) within individual molecules. Dashed line is the identity line. Source data are provided as a Source data file.

We analyzed the single molecule dwell times of the unbound 9bp-1A 5′SS RNA events across these U1-C concentrations. We find that the RNA unbound lifetimes decrease with increasing U1-C concentration, demonstrating U1-C also helps promote the binding of a -1A 5′SS RNA (Fig. 4B). We estimated kapparent values using single exponential distributions and the resulting kapparent values yielded an EC50 = 1.8 ± 0.5 nM (Fig. 4C). Overall, we see the association rate at 1 nM 9bp-1A 5′SS RNA double between the absence and presence of saturating U1-C.
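The monoexponential estimate described above can be illustrated with a minimal sketch. For an exponential distribution, the maximum likelihood estimate of the time constant is the sample mean of the dwell times, and the apparent rate is its reciprocal. The function name and the example dwell times below are hypothetical, and the sketch assumes that every dwell is fully observed; it does not include corrections for events truncated by the observation window, which a full analysis may require.

```python
import math

def fit_monoexponential(dwell_times):
    """MLE for an exponential distribution: tau_hat is the sample mean."""
    n = len(dwell_times)
    if n == 0:
        raise ValueError("need at least one dwell time")
    tau = sum(dwell_times) / n
    sem = tau / math.sqrt(n)      # asymptotic standard error of tau_hat
    k_apparent = 1.0 / tau        # apparent rate in s^-1
    return tau, sem, k_apparent

# Hypothetical unbound dwell times in seconds (not data from this study)
example_dwells = [12.0, 48.0, 30.0, 95.0, 15.0]
tau_u, tau_sem, k_app = fit_monoexponential(example_dwells)
```

Under these assumptions, a halving of the unbound time constant corresponds directly to a doubling of the apparent association rate.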

The larger effect of U1-C on the affinity for the 9bp-1A 5′SS RNA stems from an increase in the bound lifetime (Fig. 4D). MLE analysis of bound dwell times indicates that at least two time constants are required to describe the data (Fig. 4E). In contrast to the effect of branaplam on duplex lifetimes, we find that the values of the time constants do not change significantly across U1-C concentrations, but rather, their amplitudes change dramatically (Fig. 4F). On average, the 9bp-1A 5′SS RNA duplex exhibits \({\tau }_{B}^{1}\) ≈ 30 s and \({\tau }_{B}^{2}\) ≈ 330 s. In the absence of U1-C, U1 snRNP can still form the longer-lived complexes with this RNA; however, these are rare relative to the shorter-lived interactions. Consistent with this, the amplitude of \({\tau }_{B}^{1}\) is dominant at low concentrations of U1-C, but \({\tau }_{B}^{2}\) dominates at high concentrations. The change in the amplitude of \({\tau }_{B}^{2}\) across U1-C concentrations yielded an EC50 = 1.0 ± 0.2 nM.
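As an illustration of how two time constants and their relative amplitudes can be estimated from a set of bound dwell times, the sketch below uses expectation-maximization (EM) to find the maximum likelihood parameters of a two-exponential mixture. EM is a stand-in chosen for brevity and is not necessarily the optimizer used for the analysis reported here. The simulated dwell times use the approximate time constants quoted above (30 s and 330 s); the 0.4 amplitude, the starting guesses, and the function name are hypothetical.

```python
import math
import random

def fit_biexponential(dwells, tau1=10.0, tau2=100.0, amp2=0.5, n_iter=200):
    """EM estimate of (tau1, tau2, amp2) for a two-exponential mixture."""
    n = len(dwells)
    for _ in range(n_iter):
        # E-step: posterior probability that each dwell belongs to component 2
        resp = []
        for t in dwells:
            p1 = (1.0 - amp2) * math.exp(-t / tau1) / tau1
            p2 = amp2 * math.exp(-t / tau2) / tau2
            resp.append(p2 / (p1 + p2))
        # M-step: responsibility-weighted means update the time constants
        w2 = sum(resp)
        w1 = n - w2
        tau1 = sum((1.0 - r) * t for r, t in zip(resp, dwells)) / w1
        tau2 = sum(r * t for r, t in zip(resp, dwells)) / w2
        amp2 = w2 / n
    return tau1, tau2, amp2

# Simulated dwell times: 40% long-lived (330 s), 60% short-lived (30 s)
random.seed(1)
simulated = [
    random.expovariate(1 / 330.0) if random.random() < 0.4
    else random.expovariate(1 / 30.0)
    for _ in range(2000)
]
tau_b1, tau_b2, amp_b2 = fit_biexponential(simulated)
```

In this picture, a change in U1-C concentration would shift `amp_b2` while leaving `tau_b1` and `tau_b2` largely unchanged.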

To test whether these two time constants reflect dynamic association/dissociation of U1-C with a kinetically homogeneous population of U1 snRNP molecules, we correlated the dwell time durations of successive binding events within individual immobilized U1 snRNPs (Fig. 4G). At the extremes of either no U1-C or saturating U1-C, bound lifetimes of the 9bp-1A 5′SS RNA predominantly align to a single cluster at the level of individual molecules corresponding to either faster \({\tau }_{B}^{1}\) durations or slower \({\tau }_{B}^{2}\) durations. Near the EC50 of U1-C, we observe both short and long events corresponding to dynamic interconversion of the two time constants, which likely stems from the association and dissociation of U1-C. Higher concentrations of U1-C increase the probability of binding to U1 snRNP and the probability of observing a more stable duplex. Overall, these data show that U1-C dynamically binds U1 snRNP, and that its presence can help recruit and stabilize a -1A bulged 5′SS RNA.
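The pairing step behind this kind of successive-event analysis can be sketched as follows. The code collects (event i, event i+1) duration pairs within each molecule, never across molecules, and then reports the fraction of pairs in which both events fall on the same side of a duration threshold. The data layout, the 100 s threshold, and the function names are hypothetical and serve only to illustrate the idea; the contour plots in Fig. 4G are a richer representation of the same pairs.

```python
def successive_pairs(dwells_by_molecule):
    """Return (duration_i, duration_i+1) pairs, never pairing across molecules."""
    pairs = []
    for dwells in dwells_by_molecule.values():
        pairs.extend(zip(dwells[:-1], dwells[1:]))
    return pairs

def fraction_same_class(pairs, threshold):
    """Fraction of pairs whose two events are both short or both long."""
    if not pairs:
        return float("nan")
    same = sum((a < threshold) == (b < threshold) for a, b in pairs)
    return same / len(pairs)

# Hypothetical bound dwell times (s) for three immobilized molecules
example = {
    "mol_1": [25.0, 40.0, 18.0],
    "mol_2": [350.0, 290.0],
    "mol_3": [20.0, 400.0, 310.0],
}
pairs = successive_pairs(example)
same_fraction = fraction_same_class(pairs, threshold=100.0)
```

A fraction near 1 would indicate that individual molecules stay in one kinetic class between events, whereas lower values indicate interconversion on the timescale of successive events.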

U1-C modulation of 5′SS binding is sequence dependent

We next aimed to determine if the U1-C dependent increase in kon and decrease in koff is generalizable to other 5′SS motifs using a small collection of designed 5′SS oligos (Fig. 5A). We first assayed if the association rate of a non-bulged 5′SS RNA (9bp 5′SS RNA) is enhanced by U1-C. We performed kinetic association experiments in the absence or presence of saturating U1-C at various 9bp (Supplementary Fig. 19) or 9bp-1A 5′SS RNA (Supplementary Fig. 20) concentrations. U1-C enhances the kon of 9bp by a factor of 2.1 and 9bp-1A by a factor of 1.9 (Fig. 5B), the latter of which confirms our result from varying U1-C at a constant RNA concentration (Fig. 4C). We could not conclude if U1-C decreases the koff of 9bp, as the estimated koff from non-equilibrium experiments in the absence of U1-C was not significantly different than the koff with saturating U1-C (p = 0.92, stepwise regression statistical test).
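A minimal sketch of how kon can be obtained from such a concentration series is given below: the apparent association rate is regressed linearly against RNA concentration, and the slope is taken as kon. The concentrations, the small intercept, and the function name are hypothetical; the slope used to build the synthetic points is the value reported in Fig. 5B for the 9 bp RNA with U1-C.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept, with R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical RNA concentrations (M) and noise-free apparent rates (s^-1)
conc = [1e-9, 2e-9, 5e-9, 10e-9]
k_app = [3.9e6 * c + 0.001 for c in conc]
k_on, k_intercept, r2 = linear_fit(conc, k_app)
```

With real data, the scatter of the points around the line determines the uncertainty of the slope, which is why the fits in Fig. 5B are reported with errors and R2 values.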

Fig. 5: The impact of U1-C on 5′SS recognition is sequence dependent.

A RNA oligos. Shaded region indicates predicted base pair interactions with the U1 snRNA (top). B Apparent association rates for 9 bp (left) and 9bp-1A RNAs (right) at various concentrations in the absence and presence of saturating U1-C (100 nM). Data are overlaid with linear fits (solid line) to determine kon (9 bp with 100 nM U1-C: kon = 3.9 ± 0.4 × 10⁶ M⁻¹ s⁻¹, R² = 0.97; 9 bp without U1-C: kon = 1.8 ± 0.5 × 10⁶ M⁻¹ s⁻¹, R² = 0.95; 9bp-1A with 100 nM U1-C: kon = 3.2 ± 0.6 × 10⁶ M⁻¹ s⁻¹, R² = 0.98; 9bp-1A without U1-C: kon = 3.2 ± 0.6 × 10⁶ M⁻¹ s⁻¹, R² = 0.98). Shaded region indicates the 95% confidence interval of the linear regression. C Violin plots showing the distributions of unbound (top) and bound (bottom) dwell times for various RNAs at 0 and 100 nM U1-C. Data for 9bp-1A are presented at 1 nM. N indicates the number of single molecule events included in the violin plot. Each violin plot is overlaid with a box plot that shows the median (horizontal line), interquartile range (IQR, box), and whiskers representing data within 1.5\(\times\)IQR. D Scatter plot showing the average unbound (top) and bound (bottom) dwell times of RNAs at 0 (x-axis) or 100 nM (y-axis) U1-C. Dashed line is the identity line. Source data are provided as a Source data file.

To further investigate how U1-C enhances 5′SS recognition, we performed equilibrium binding measurements on various 5′SS RNAs to determine the effect of U1-C (Fig. 5C, Supplementary Figs. 21-25). This collection of oligos included a -1C variant, which is known to form a bulged -1C but is not activated by branaplam12, a +1C mutant which can inhibit splicing34,35, and +2C, which converts the common +1/+2 GU 5′SS subtype to the less common +1/+2 GC motif (<1% human 5′SS)17. In addition, we designed two +1/+2 GU 5′SS-containing RNAs with 6 base pairs of complementarity to the U1 snRNA either on the exonic (-4 to +2 positions) or intronic (+1 to +6 positions) side of the exon/intron boundary. All experiments were conducted at a single concentration of the RNA (9bp-1C: 1 nM, 9 bp+1C: 10 nM, 9 bp+2C: 1 nM, 6bp-exon: 3 nM; 6bp-intron: 3 nM; depending on their affinity) and either in the absence or presence of 100 nM U1-C.

For all of these RNAs, we see a decrease in the average unbound lifetimes, corresponding to a faster association rate, when U1-C is included relative to its absence (Fig. 5D, top). On average, U1-C increased the apparent association rate by 2.5-fold. This suggests that U1-C does not have a single, specific sequence requirement (e.g., presence of a GU 5’SS) for facilitating RNA association to U1 snRNP.

In contrast, we see a larger variance in the effect on bound lifetimes by U1-C. The largest enhancement due to U1-C is a 29.3x increase in average bound lifetimes for the 6bp-intron 5′SS RNA (3.6 s to 105.8 s, Supplementary Fig. 25) while the smallest is a 1.4x increase for 6bp-exon 5′SS RNA (6.1 s to 8.8 s, Supplementary Fig. 24). The enhancement due to U1-C results in an ~20-fold increased lifetime for the 6bp-intron U1 snRNP/5′SS RNA complex despite its predicted U1 snRNA/5′SS duplex being 2.2 kcal/mol less stable than that for the 6bp-exon 5′SS RNA (Supplementary Fig. 3). Therefore, U1-C stabilizes U1 snRNP interactions with RNAs dependent on the register of the 5′SS:U1 snRNA duplex and pairing potential with the U6 snRNA. In particular, this result reinforces observations made in yeast that Watson-Crick base-pairing potential alone is not predictive of 5′SS:U1 snRNP interaction lifetimes36.

The bound lifetimes of the 9bp-1C and 9bp-1A 5′SS RNAs are similar with (325.8 s) and without U1-C (50.1 s), despite no enhancement by branaplam in the case of the former (Supplementary Fig. 21). Thus, the effect of branaplam is likely to be driven by the particular structural details of the -1A bulge rather than by inherently different kinetic properties of the U1 snRNP/5′SS RNA interaction. A +1C substitution that converts the canonical GU 5’SS to a CU nearly abolishes stable RNA binding (Supplementary Fig. 22), and U1-C is ineffective at stabilizing the bound state. This suggests that U1 snRNP may enforce the requirement for a +1G at the 5′SS through kinetic selection against mismatches at this position. This selection results from both U1-C independent (poor binding in the absence of U1-C) and dependent (failure of U1-C to stabilize the bound state) components.

Finally, the 9 bp+2C 5′SS RNA (which converts the +1/+2 GU 5′SS to the less frequent +1/+2 GC subtype) exhibits less stable binding than the +1/+2 GU 5′SS in the absence (86 s) and presence (171 s) of U1-C (Supplementary Fig. 23). We estimate the 9 bp 5′SS RNA with a GU motif binds nearly 10x more tightly compared to the GC variant. This correlates with results from high throughput in vivo splicing assays which showed strong preference for GU over GC37, reinforcing that splicing outcomes in cells are often dependent on U1-binding kinetics.

A sequential binding model describes the mechanism of 5′SS modulation by branaplam

To fully describe -1A bulged 5′SS recognition and how it is influenced by U1-C and modulated by branaplam, we used hidden Markov modeling (HMM, see Methods) and simulations to fully define and evaluate potential kinetic mechanisms (Supplementary Figs. 26-31, Supplementary Table 5). A sequential binding model provided the best description of our data (Fig. 6). This model predicts that 9bp-1A 5′SS RNA binding and unbinding can proceed through two pathways dependent on the presence of U1-C. The faster dissociation and slower association arise from U1 snRNP lacking U1-C, whereas the slower dissociation and faster association arise from U1-C bound to the U1 snRNP. Our model therefore predicts that the multiple components in unbound and bound dwell time distributions stem from the binding and unbinding of U1-C and are not inherent to the U1 snRNA:9bp-1A 5′SS interaction. Surprisingly, we find no mathematical evidence for the binding of U1-C once the 5′SS is already bound. We suspect that 5’SS:U1 snRNA duplex may sterically occlude the binding of U1-C, given that the U1-C zinc finger domain rests between the duplex and the SmD3 protein21.

Fig. 6: A sequential binding mechanism for -1A bulged 5′SS RNA recognition by U1 snRNP and modulation by branaplam.

A kinetic model for -1A bulged 5′SS association and dissociation in the presence and absence of U1-C and branaplam. Optimized rate transitions for 9bp-1A are provided in Table 1. Equilibrium arrows in grey indicate transitions that are not supported by our experimental data or mathematical modeling.

Branaplam binding only occurs in the presence of U1-C and an existing -1A bulge U1 snRNA/5′SS duplex. Our model predicts a modest affinity for the branaplam interaction (KD = 6.59 ± × 10−7 M), which is similar to the KD determined via MST (Fig. 3C). Surprisingly, the dissociation rate of branaplam is three times faster than the dissociation rate of the 5′SS RNA when U1-C is bound. The increase in bound lifetimes with increasing branaplam concentrations is supported by this model. High concentrations of branaplam lead to its faster association relative to the dissociation rate of the 5′SS RNA, thereby driving the equilibrium toward branaplam rebinding. The dynamic binding of branaplam contributes to the observed high stability of the U1 snRNP/-1A 5′SS RNA complex. We were able to confirm this prediction using SPR and by washing branaplam from solution after an initial binding period to the U1 snRNP/11bp-1A complex. Under these conditions, we do not see increased stability of the 11bp-1A RNA consistent with branaplam reversibly binding the complex and dissociating during the wash (Supplementary Fig. 32). Our work therefore shows that transient, reversible drug binding can nonetheless contribute to formation of very long-lived U1 snRNP/5′SS interactions.

Discussion

Herein, we used a reconstituted U1 snRNP to study the detailed kinetics of its interactions with 5′SS RNA oligos and how these interactions change upon the inclusion of a small molecule splicing modulator. Both single molecule and bulk biophysical measurements show that U1 snRNP binds RNA in a sequence-specific manner, that branaplam extends U1 snRNP/oligo lifetimes of -1A bulged 5′SS if U1-C is present, and that splicing modulation can involve a complex, multi-step process. The origin of this complexity is in part due to reversible binding of the U1-C component, which dynamically interacts with the snRNP. U1-C itself both promotes RNA binding by U1 snRNP and stabilizes the U1/RNA complex in a sequence-specific manner. We were able to use our feature-rich and large single molecule data sets to determine a sequential binding mechanism for U1 snRNP, U1-C, and branaplam interactions with a -1A 5′SS-containing oligo. In this mechanism, U1-C associates with the snRNP prior to 5′SS binding and decreases the KD for the 5′SS sixteen-fold. Branaplam then reversibly associates with this complex with a moderate KD (~0.7 µM).

Differences and similarities between yeast and human U1 snRNPs

Our work here with human U1 snRNP recapitulates several features of 5′SS recognition that we previously characterized in the much larger yeast complex36,38. Both human and yeast U1 snRNPs exhibit short and long-lived interactions with RNAs. In yeast, the data were consistent with a two-step binding mechanism in which RNAs could either be released from the snRNP or the U1 snRNP/RNA complex would transition to a more tightly bound state. This transition is possibly due to a conformational change in U1 snRNP that could involve the U1-C and Luc7 proteins22,36,39,40. In human U1 snRNP interactions with a -1A 5′SS, there is complexity due to dynamic binding of U1-C; however, we do not detect branaplam-independent transitions of the U1 snRNP/U1-C/5′SS complexes between kinetically distinct states. Thus, the origins of the complexity are different in each system, and this could reflect fundamental differences between the yeast and human splicing machinery. In yeast, there are fewer and smaller introns41, and U1 snRNP is ~10-fold less abundant than in humans (~0.3 µM in yeast vs 2.5 µM in human nuclei)42. Moreover, interacting domains between U1 snRNP and RNAP II found in humans are not conserved43 and putative yeast-specific Pol II interaction domains on U1 have no effect on co-transcriptional U1 snRNP recruitment or the early stages of spliceosome assembly44. Based on these observations, we hypothesize that rapid surveillance of the transcriptome for 5′SS enabled by reversible, two-step binding by yeast U1 snRNP may enable efficient splicing in yeast but be less essential in humans wherein U1 snRNP can be localized proximal to potential 5′SS as they are synthesized by RNAP II43. This hypothesis is also consistent with the lifetimes of yeast and human U1 interactions with highly complementary RNAs. The lifetimes of yeast U1/5′SS complexes are much shorter than their human counterparts (a longest-lived lifetime of <400 s vs. ~1563 s for RNA oligos with 9 bp of complementarity for yeast and human U1 snRNPs, respectively)36. Long-lived U1 snRNP/5′SS interactions may be important for maintaining 5′SS definition and RNAP II elongation rates while long introns and the downstream exon are transcribed in humans45.

In both yeast and humans, base-pairing potential and predicted thermodynamic stability of the snRNA/5′SS duplex are not good predictors by themselves of the U1 snRNP/5′SS interaction lifetime. Neither yeast nor human U1 snRNP efficiently stabilizes duplexes containing mismatches at the +1 position even when extensive base pairing interactions are predicted at other positions. A +1G at the 5′SS is critical for formation of an atypical base-pairing interaction with the -1G of the 3′SS during exon ligation, which occurs in a spliceosome complex that lacks U1 snRNP20. Moreover, human U1 snRNP binds more stably (and with greater enhancement from U1-C) to 5′SS with complementarity in the +1 to +6 (intronic) positions over the -4 to +2 (exonic) positions despite the greater predicted thermodynamic stability of the latter. This may reflect the importance of specifying the intronic portion of the 5′SS sequence in order to permit proper pairing with the U6 snRNA during the catalytic steps of splicing20. A similar, ruler-like function was previously noted for the yeast U1 snRNP36. Thus, kinetic selection of substrates chemically competent for splicing is a conserved feature of both human and yeast U1 snRNPs even though neither U1 snRNP is present during the transesterification steps.

The role of U1-C in 5′SS recognition

For human U1 snRNP, selection against non-+1G 5′SS appears to involve both failure of the snRNP to associate stably with these RNAs in the absence of U1-C and failure of U1-C to stimulate formation of a longer-lived state. What causes the weaker binding of the RNA with a mismatch at +1 in the absence of U1-C is not clear but could be a combination of reduced overall stability of the remaining base-pairing interactions and the mismatch interfering with contacts between the snRNA/5′SS duplex and the nearby U1 snRNA helix H21. Failure of U1-C to stabilize binding of the RNA lacking a +1G is consistent with the protein’s location adjacent to the +1G base-pairing site with the snRNA21. Another prediction of our kinetic model is that U1-C can only associate with U1 snRNPs in the absence of 5′SS pairing. In other words, U1-C must be pre-associated with U1 snRNP prior to its engagement with RNA in order to modulate 5′SS recognition.

U1-C interacts dynamically with the snRNP in the absence of 5′SS pairing. We do not believe that this is due to our use of a model, reconstituted U1 snRNP since U1-C has previously been observed to be a salt-labile component of endogenous U1 snRNPs23. In addition, interpretable density due to U1-C is absent in a cryo-EM structure of endogenous human U1 snRNP bound to RNA polymerase II despite its presence during the purification43. In our experiments, U1-C has an EC50 of ~1 nM, and our model predicts a KD of ~0.3 nM. While U1-C binds tightly, the calculated lifetime of the U1/U1-C complex is only ~2 min in the absence of a paired RNA, less than the amount of time RNAP II would take to transcribe the average human gene (~10 min for a median human gene size of 24 kbp46). Based on these kinetics, it should not be assumed that a U1-C-containing U1 snRNP that is recruited to the transcriptional machinery for co-transcriptional spliceosome assembly still contains U1-C at the moment a 5′SS is transcribed.
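The timescale comparison above can be made concrete with a short calculation. The sketch assumes that U1-C release follows a single-exponential decay and that no U1-C rebinds from solution. Both are simplifying assumptions introduced here for illustration, and rebinding in particular could be substantial in cells. The three input values are the approximate figures quoted in the text; the function name is hypothetical.

```python
import math

def fraction_still_bound(elapsed_min, lifetime_min):
    """Single-exponential survival probability, ignoring any rebinding."""
    return math.exp(-elapsed_min / lifetime_min)

gene_length_kbp = 24.0      # median human gene size quoted in the text
transcription_min = 10.0    # approximate transcription time quoted in the text
u1c_lifetime_min = 2.0      # approximate U1/U1-C complex lifetime quoted in the text

# Elongation rate implied by the two quoted numbers
implied_rate_kbp_per_min = gene_length_kbp / transcription_min

# Probability that a U1-C present at the start is still bound 10 min later
p_bound_at_end = fraction_still_bound(transcription_min, u1c_lifetime_min)
```

Under these assumptions, fewer than 1% of the initially bound U1-C molecules would remain after 10 min, which illustrates why continued U1-C occupancy over the course of transcription should not be taken for granted.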

Combined, the above observations suggest that U1-C-lacking and U1-C-containing U1 snRNPs may both be biologically relevant for gene expression. In terms of splicing, it is intriguing to speculate that U1-C-lacking snRNPs could be fully functional in the splicing reaction for both 5′SS recognition and transfer to the U6 snRNA. If true, this could have implications for splicing regulation, the ATP requirement for 5′SS transfer to U6, and how 5′SS compete for U1 snRNPs when U1-C is limiting (e.g., a hungry spliceosome-like mechanism47). Based on our data, we would predict that strong, highly complementary splice sites would be the most likely to stably recruit U1/ΔU1-C snRNP particles and that these sites could be preferentially used by the spliceosome when U1-C is limiting. While splice site complementarity was not analyzed specifically, U1-C knockdown in HeLa cells does result in site-specific changes in 5′SS usage48. In Drosophila, knockdown of U1-C appears to reduce usage of the nonconsensus dAdar exon 3a 5′SS in favor of the stronger exon 3 5′SS49. Alternatively, U1 snRNPs containing or lacking U1-C could aid in distinguishing complexes involved in spliceosome assembly from those functioning in other processes such as telescripting50.

As noted above, whether or not U1/5′SS interactions are influenced by the presence or absence of U1-C depends on the sequence of the 5′SS: a highly complementary RNA oligo binds tightly even in the absence of U1-C while U1-C can greatly enhance the bound lifetimes of RNAs containing mismatches. These properties—reversible binding to U1 and RNA sequence-specific effects—show that U1-C has features in common with many alternative splicing factors. In humans, U1-C may have more in common with these factors than with other constitutive components of the splicing machinery consistent with previous observations by Rösel-Hillgärtner and coworkers48. While beyond the scope of this manuscript, our work suggests that understanding the correlations between predicted base-pairing strength, location of the base pairs within the U1 snRNA/5′SS duplex, and dependency on U1-C for exon inclusion in vivo are all critical for predictive modeling of U1 snRNP occupancy in cells.

Mechanism of 5′SS Modulation by Branaplam

While it may seem counterintuitive that a reversibly binding splicing modulator leads to formation of a long-lived interaction between U1 snRNP and a -1A bulged 5′SS, our kinetic mechanism provides a rationale for this observation. The -1A 5′SS RNA can only dissociate from U1 snRNP when branaplam is not bound, and rapid re-binding of branaplam limits the lifetime of this state.
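This rationale can be expressed with a reduced two-state scheme, offered here as a simplified sketch rather than the full model of Fig. 6. Assume a drug-free complex that either releases RNA at rate k_rna_off or binds branaplam, and a drug-bound complex from which RNA cannot leave until branaplam dissociates. Solving the mean first-passage time from the drug-free state gives a mean bound lifetime of (1 + [branaplam]/KD)/k_rna_off, where KD is the branaplam dissociation constant. The RNA off-rate below is a hypothetical placeholder, U1-C exchange is ignored, and only the KD value is taken from the text.

```python
def mean_bound_lifetime(k_rna_off, branaplam_conc, kd_branaplam):
    """Mean time to RNA release when release is blocked while drug is bound.

    Starts from the drug-free complex; concentrations share the units of KD.
    """
    return (1.0 + branaplam_conc / kd_branaplam) / k_rna_off

KD_BRANAPLAM_M = 6.59e-7    # model-derived KD quoted in the text
K_RNA_OFF = 0.003           # hypothetical RNA off-rate in s^-1

lifetime_no_drug = mean_bound_lifetime(K_RNA_OFF, 0.0, KD_BRANAPLAM_M)
lifetime_at_kd = mean_bound_lifetime(K_RNA_OFF, KD_BRANAPLAM_M, KD_BRANAPLAM_M)
```

In this reduced scheme the mean lifetime grows linearly with branaplam concentration, so a drug that itself dissociates quickly can still produce long-lived complexes as long as it rebinds often enough.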

Recently, a thermodynamic model for splicing modulator drug action has been proposed based on RNA-Seq data and measurements of mRNA production in cells14. In this work, the authors proposed two branaplam-binding modes for U1 snRNP: a risdiplam-like binding mode that occurs on -1A bulged 5′SS that also contain a -2G and a second state that leads to hyperactivation of some 5′SS that additionally contain a -3A. It is unlikely that these two states are due to presence/absence of U1-C since our data show that branaplam can only bind U1 snRNP when U1-C is present.

While we did not study the sequence requirements for hyperactivation explicitly, we do note that the 5′SS RNA oligos that showed the largest changes in U1 snRNP bound state lifetimes were also those with the hyperactivation AGA motif (9 bp -1A, SF3B3, and HTT*). These results suggest that the hyperactivation phenotype has a kinetic basis and might be due to larger changes in the lifetime of the U1 snRNP/5′SS interaction. In addition, while the thermodynamic model included two different branaplam-binding modes, the authors were not able to determine if the risdiplam-like binding mode is a necessary precursor for formation of the hyperactivated state. Our single molecule data support only a single branaplam-bound state for the U1 snRNP/U1-C/5′SS complex. The risdiplam-like and hyperactivated binding modes of branaplam likely occur independently of one another, each involving particular molecular interactions with their corresponding 5′SS.

Finally, our results suggest that drug development efforts for molecules such as branaplam should consider the U1 ribonucleoprotein target rather than optimizing RNA duplex interactions alone. Combined with our observation of the reversibility of U1-C association, it is also likely that branaplam’s ability to act as a splicing modulator of a given transcript is a function of U1-C binding and re-binding rates. Differences in binding/re-binding rates for both branaplam and U1-C could contribute to variation in splicing modulation that has been observed among different transcripts12,14. Splicing modulation may, therefore, be limited in part by the inherent kinetic properties of the factors and processes involved.

Methods

Key Reagents

Key chemicals and materials are described in Supplementary Table 6.

RNA oligonucleotides

RNA oligonucleotides (Supplementary Table 1) for SPR, MST, and single-molecule experiments were purchased from Integrated DNA Technologies (IDT, CoSMoS), Dharmacon (SPR), or Metabion (MST). Stocks of fluorescent RNAs intended for single-molecule experiments were prepared by resuspending the lyophilized oligonucleotides in nuclease-free water (20-50 µM, Ambion). RNA concentrations were calculated from their absorbance values at 260 nm, measured using a NanoDrop, and the extinction coefficients from IDT via the Beer-Lambert law.
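For reference, the Beer-Lambert calculation takes the following form. This is a minimal sketch: the absorbance reading and extinction coefficient are hypothetical values, the function name is illustrative, and the default path length assumes that the reported absorbance has been normalized to a 1 cm path.

```python
def rna_concentration_uM(a260, extinction_M_cm, path_cm=1.0, dilution=1.0):
    """Beer-Lambert law, c = A / (epsilon * l), returned in micromolar."""
    molar = a260 * dilution / (extinction_M_cm * path_cm)
    return molar * 1e6

# Hypothetical A260 reading and extinction coefficient (M^-1 cm^-1)
conc_uM = rna_concentration_uM(a260=3.0, extinction_M_cm=120000.0)
```

With these example inputs the stock concentration falls within the 20-50 µM range used for the single-molecule stocks.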

Protein Purification

Individual components and reconstitution of the miniU1 particle were produced and purified as previously described21. All plasmids were purchased from GenScript (Piscataway, NJ, USA) based on the pET-28a(+) backbone and codon optimized then transformed into Escherichia coli BL21 Star (DE3) cells (Cat# C601003, ThermoFisher Scientific, Waltham, MA, USA).

For the U1-70K_SmD1/D2 polycistronic construct, an N-terminal thioredoxin-6xHis-tag followed by a tobacco etch virus (TEV) protease cleavage site was appended to the U1-70K fragment comprised of residues 2-59 followed by a Gly-Ser triplet linker then residues 7-91 of SmD1. A second open reading frame containing SmD2 was comprised of residues 1−118. For the SmD3/B polycistronic construct, an N-terminal 6xHis-tag followed by a TEV cleavage site was appended to residues 1−126 of SmD3, which was followed by a second open reading frame for residues 1-95 of SmB. For the SmF/E/G polycistronic construct, an N-terminal His-SUMO-Avi tag was introduced prior to residues 1-75 of SmF, followed by additional open reading frames for SmE (residues 1-92) and SmG (1-76). For SPR and MST experiments, a construct with only a His-SUMO tag was used. For U1-C_61, a C-terminal 6xHis-tag was added after residues 1-61 of U1-C.

Cells were cultured at 37 °C in 1 L of 2xYT media supplemented with kanamycin (50 µg/mL) then induced with 0.5 mM IPTG at 16 °C overnight. Cell pellets were resuspended in lysis buffer (20 mM HEPES, 1 M KCl, 1 M Urea, 5 mM TCEP, pH 7.5) plus cOmplete ULTRA EDTA-free protease inhibitor cocktail (Roche, Basel, Switzerland), then lysed by sonication. Clarified lysates were diluted in IMAC Buffer A (20 mM HEPES, 1 M KCl, 1 M Urea, 2 mM TCEP, pH 7.5) then loaded onto a HisTrap HP 5 mL Ni-NTA column (Cytiva, Marlborough, MA, USA) and eluted with a gradient of IMAC Buffer B (20 mM HEPES, 1 M KCl, 1 M Urea, 2 mM TCEP, 300 mM Imidazole, pH 7.5). Eluted fractions were pooled in dialysis tubing (Cat# 68035, ThermoFisher Scientific) with TEV protease and dialyzed overnight against 20 mM HEPES, 250 mM KCl, 1 M Urea, 2 mM TCEP, pH 7.5. Following dialysis, the solution was adjusted to 1 M KCl and loaded onto a HisTrap column equilibrated in IMAC Buffer A. The flow-through was collected and injected onto a Superdex HiLoad 75 26/60 column (Cytiva) equilibrated in 20 mM HEPES, 250 mM KCl, 2 mM TCEP, pH 7.5 and the fractions were collected then concentrated by centrifugation (Cat# UFC9003, MilliPore Sigma, Burlington, MA, USA).

For the SmF/E/G trimer, cells were cultured like the other constructs with the addition of 1% (w/v) glucose to the culture media. Protein was similarly purified via the 6xHis-tag, cleaved by SUMO protease (Cat#12588018, ThermoFisher Scientific), and finally purified by IMAC and SEC as described. For single molecule studies, the SmF/E/G trimer was biotinylated on the SmF AviTag using the BirA biotin-protein ligase reaction kit (Avidity LLC, Aurora, CO, USA) and biotinylation was confirmed by MALDI-TOF MS.

U1 snRNA Production

The U1 snRNA used for reconstitution of the miniU1 particle was purchased from AxoLabs (LGC Group, Kulmbach, Germany) and dissolved to a concentration of 500 µM in RNAse-free ddH2O comprising the sequence: 5′-AmUmACψψACCU GGCAGUGACC ACCACACACU GCAUAAUUUG UGGUAGUGGG CGAAAGCCCG-3′, where Am and Um represent 2′-O-methyl nucleotides, and ψ is pseudouridine. To enable single molecule studies, a U1 snRNA of the same sequence was produced with an aminohexyl linker on the 3′ end that was subsequently labeled with Cy5 NHS ester as the fluorophore.

Minimal U1 snRNP Reconstitution

Prior to complex reconstitution, U1 snRNA was prepared by refolding at 80 °C for 3 min and then cooling on ice for 10 min. In a pre-warmed solution of Reconstitution Buffer (20 mM HEPES, 250 mM KCl, 2 mM TCEP, pH 7.5) containing 40 U/mL RNAsin (Cat#N2111, Promega, Madison, WI, USA), each Sm protein sub-complex was combined to a final concentration of 8 µM and incubated for 5 min at 37 °C. To this, U1 snRNA was added to a final concentration of 4 µM and incubated for 45 min at 37 °C. When required, U1-C_61 can be added to a final concentration of 8 µM, incubated for 15 min at 37 °C, then the complex is cooled overnight at 4 °C. The crude complex was then loaded onto a MonoQ 10/100 GL column (Cytiva) in Reconstitution Buffer and eluted with a gradient of Reconstitution Buffer containing 1 M KCl. Eluted fractions were pooled and loaded onto a Superdex column (Cytiva) equilibrated in Reconstitution Buffer. After purification, fractions corresponding to miniU1 were concentrated using a 30 kDa MWCO centrifugal filter (Cat#UFC9030, MilliPore Sigma).

Surface plasmon resonance (SPR)

For SPR assays, a Biacore 8K (Cytiva) was used with a streptavidin-coated Series S Sensor Chip SA (Catalog #BR100531). The instrument was equilibrated in 20 mM HEPES, 200 mM KCl, 5 mM MgCl2, 0.5 mM TCEP, 0.05% (v/v) Tween-20, and 2% (v/v) DMSO. For immobilization, RNA was synthesized with a 3′-biotin (-1A bulge: CAGAGUAAGUAU; SMN2: AGGAGUAAGUCU; Match: CAGGUAAGUAU; Reverse: UAUGAAUGGAC; Dharmacon) and injected at 1 nM with 30 s contact time at 10 µL/min to afford 3-5 RU of capture. U1 snRNP binding studies were performed by injecting a titration of complex that was serially diluted from 100 nM with a contact time of 180 s and a dissociation time of 600 s at a flow rate of 30 µL/min in duplicate. After each cycle, the chip surface was regenerated by an injection of 3 M MgCl2, with a 30 s contact time at 30 µL/min, followed by a 50% (v/v) DMSO needle wash. All data were analyzed after reference subtraction, blank subtraction, and application of a 1.5-2.5% (v/v) DMSO solvent correction. Data were analyzed using a two-state binding model in the Biacore Insight Software. To assess the effect of ligand on U1 snRNP binding kinetics, a co-inject format was used to allow for compound to be present during the association and dissociation phases of the experiment. A protein solution of U1 snRNP was prepared at 10 nM with varying concentrations of branaplam serially diluted from 5 µM and injected across the immobilized RNA surface as previously described.

MicroScale Thermophoresis (MST)

For branaplam binding assays, a serial dilution was performed in DMSO at 25x final concentration, followed by dilution with assay buffer (20 mM HEPES, 200 mM NaCl, 1 mM MgCl2, 1 mM TCEP, 0.05% Tween-20, pH 7.0) to 2x final concentration. Separately, a pre-formed complex of U1 snRNP ΔU1-C was prepared at 200 nM in the presence of 20 nM of a 5′SS oligonucleotide labeled with Cy5 (5′-Cy5-CAGAGUAAGUAU; Metabion), with or without supplementation with 500 nM U1-C. The branaplam titration was then mixed 1:1 with the pre-formed U1 snRNP complex and incubated at room temperature for 15 min before loading Monolith LabelFree Premium Capillaries (MO-Z025; NanoTemper Tech). Capillaries were analyzed by a Monolith X red-continuous instrument (NanoTemper Tech) at 25 °C with 100% LED and laser power.

For U1-C binding assays, U1-C protein was serially diluted at 2x final concentration in assay buffer. Separately, a pre-formed complex of U1 snRNP ΔU1-C was prepared at 200 nM in the presence of 20 nM of a 5′SS oligonucleotide labeled with Cy5 with the addition of branaplam at 5 µM or DMSO, to yield a final DMSO concentration of 2% (v/v). The U1-C titration was mixed 1:1 with the pre-formed U1 snRNP complex and assessed on a Monolith X as previously described.

Equation fitting

Dissociation constants (KD) and EC50 values from CoSMoS (e.g., Fig. 2A, B), SPR (e.g., Fig. 1D) and MST (Fig. 2J) were obtained by fitting binding data to the Hill equation (Eq. 1)

$$Y={Y}_{\min }+\frac{({Y}_{\max }-{Y}_{\min })}{1+\frac{{Y}_{50}}{x}}$$
(1)

where \({Y}_{50}\) is EC50 or KD, depending on the data, Y is the observed response (i.e., fluorescence) and x is the concentration. Curves were fit by nonlinear regression using custom code in MATLAB with the function nlinfit. 95% confidence intervals were computed via nlparci from the Jacobian estimated by the nonlinear least squares fit.
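As an illustration of Eq. 1, the following is a minimal Python sketch and not the MATLAB code used in this work. Because Eq. 1 is linear in Ymin and Ymax once Y50 is fixed, the sketch scans a log-spaced grid of Y50 values and solves the remaining linear least squares problem at each grid point. The function names, the grid bounds, and the grid size are assumptions made for illustration, and no confidence intervals are computed.

```python
def hill(x, y_min, y_max, y50):
    """Eq. 1: Y = Ymin + (Ymax - Ymin) / (1 + Y50 / x), for x > 0."""
    return y_min + (y_max - y_min) / (1.0 + y50 / x)


def fit_hill(xs, ys, y50_lo, y50_hi, n_grid=2000):
    """Least squares fit of Eq. 1 by scanning Y50 on a log-spaced grid.

    For each candidate Y50 the model is y = a + b * f(x) with
    f(x) = 1 / (1 + Y50 / x), so a = Ymin and b = Ymax - Ymin follow
    from ordinary linear regression. Returns (Ymin, Ymax, Y50).
    """
    n = len(xs)
    mean_y = sum(ys) / n
    best = None
    for i in range(n_grid):
        y50 = y50_lo * (y50_hi / y50_lo) ** (i / (n_grid - 1))
        f = [1.0 / (1.0 + y50 / x) for x in xs]
        mean_f = sum(f) / n
        s_ff = sum((fi - mean_f) ** 2 for fi in f)
        s_fy = sum((fi - mean_f) * (yi - mean_y) for fi, yi in zip(f, ys))
        b = s_fy / s_ff
        a = mean_y - b * mean_f
        sse = sum((a + b * fi - yi) ** 2 for fi, yi in zip(f, ys))
        if best is None or sse < best[0]:
            best = (sse, a, a + b, y50)
    return best[1], best[2], best[3]


# Synthetic, noiseless example (values are arbitrary, not experimental data).
concentrations = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0]
responses = [hill(x, 2.0, 10.0, 5.0) for x in concentrations]
fit_y_min, fit_y_max, fit_y50 = fit_hill(concentrations, responses, 0.01, 1000.0)
```

A gradient-based optimizer such as nlinfit refines all three parameters simultaneously; the grid scan is used here only to keep the sketch dependency-free.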

Single molecule chamber preparation

Single molecule imaging chambers were prepared using microscope slides (24 mm × 60 mm, #1.5, GoldSeal) and cover glasses (25 mm × 25 mm, #1.5, Corning) at least one day before each experiment. Substrates were first cleaned by successive sonication in 2% v/v Micro-90, 100% ethanol, and 1 M KOH for 60 min each in slide-mailers (Fisher Scientific), with a MilliQ water wash after each step. Cleaned substrates were then dried with high purity nitrogen (Airgas) and aminosilanized with 1.5% (v/v) VECTABOND (Vector Laboratories) in acetone (Spectrophotometric Grade, Alfa Aesar) for 10 min. Substrates were washed with 100% ethanol, dried with nitrogen, and passivated by overnight incubation with a 1:100 w/w mixture of mPEG-biotin-SVA (Laysan Bio) and mPEG-SVA (Laysan Bio) in 100 mM NaHCO3 (pH 8). Prior to use, the substrates were rinsed with MilliQ water and dried with nitrogen. Imaging chambers were created by placing thin strips of double-sided tape and vacuum grease along the glass slide and adhering a cover glass on top. This typically resulted in three 25 µL volume lanes per slide. Prior to use, assembled chambers were rinsed with at least 200 µL of wash buffer (WB: 20 mM HEPES pH 7.5, 200 mM KCl, 1 mM MgCl2, 0.5% v/v Tween20, 0.1% w/v PEG8000).

Single molecule data acquisition

Single molecule data were collected on a custom-built micro-mirror total internal reflection fluorescence microscope system as previously described36,51. Each experiment utilized a 532 nm (CrystaLaser) and 633 nm (Power Technology Inc.) laser for fluorophore excitation, and a 785 nm laser (Power Technology Inc.) for continuous focal plane drift correction (TIRF-Lock, Mad City Labs). Laser powers were measured before each video (532 nm, 900-1400 µW; 633 nm, 800-1400 µW; 785 nm, 250 µW) at a point prior to the objective in the optical path. Excitation and emission passed through a 60x 1.49 NA oil immersion objective (Olympus). Emission was passed through a short-pass filter (FES0700, ThorLabs) and imaged onto two separate sCMOS detectors (Hamamatsu ORCA-Flash4.0 V3). All imaging was controlled with Micro-Manager 2.052 and the TIRF-Lock was controlled with LabView. All images were collected with 2x2 pixel binning and active hot pixel correction.

Immediately prior to data collection, streptavidin-labeled fluorescent beads (T10711, Invitrogen) were flowed into the lane at a low concentration (~5 × 10⁴-fold dilution from stock in WB) to serve as fiducial markers for channel alignment and lateral drift correction. The lane was then washed with 50 µL of 0.2 mg/mL streptavidin (SA10-10, Agilent) for two minutes, followed by 50 µL of WB to remove unbound beads and streptavidin. The U1 snRNP particle labeled with Cy5 was then diluted to 10-20 pM in WB and incubated in a lane for one minute, followed by a 50 µL wash with WB. The surface density of Cy5-labeled U1 snRNP-∆U1-C was checked by flowing in imaging buffer (IB: 20 mM HEPES pH 7.5, 200 mM KCl, 1 mM MgCl2, 0.05% v/v Tween20, 0.1% w/v PEG8000, 5 mM protocatechuic acid (PCA), 1 U/mL protocatechuic-dioxygenase (PCD), 1 mM Trolox, 2% (v/v) DMSO).

Two imaging schemes were used for data collection: alternating laser excitation or sequential laser excitation. In alternating laser excitation, a 50 µL solution containing variable concentrations of RNA, U1-C, and branaplam in IB was added and successive images were captured with a 1 s exposure under 532 nm then 633 nm excitation, separated by a 400 ms switching time. In sequential laser excitation, approximately 30 s were first recorded (633 nm, 1 Hz) to identify areas of interest (AOIs), followed by addition of a 50 µL solution containing variable concentrations of RNA, U1-C, and branaplam in IB, after which images were captured sequentially at 1 Hz (532 nm, 1 s exposure). Regardless of imaging scheme, a total of 600 frames were collected across varying frame rates (0.11 to 1 Hz) to minimize photobleaching. Finally, in both schemes, the lane was washed with WB to remove oxygen scavengers and images were collected under 633 nm excitation until all surface tethered molecules photobleached (typically 30-60 frames). Importantly, this step allowed us to ensure we only analyzed AOIs featuring a single U1 snRNP molecule.

Data were collected under both non-equilibrium and equilibrium conditions. In the case of non-equilibrium experiments (e.g., 9bp RNA), imaging commenced immediately after addition of RNA to the lane, with the goal of characterizing initial association rates. In the case of equilibrium binding experiments (e.g., 9bp-1A RNA), the RNA solution was incubated on the lane for up to 15 min to reach equilibrium (the rate of which was determined in pilot experiments not reported herein) prior to the start of imaging. No lane was used for longer than two hours after U1 snRNP deposition to minimize possible surface effects. A summary of each dataset is found in Supplementary Table 2.

Single molecule data analysis

Raw tiff stacks from Micro-Manager 2.0 were processed using custom code written in MATLAB (MathWorks, see Software Availability below). The 532 nm channel (RNA) was mathematically mapped to the 633 nm channel (surface tethered U1) in a two-step process using the fluorescent beads present in each spectral channel. First, a nonreflective similarity transformation was applied to align the detectors in physical space (rotation and translation), followed by an affine transformation using the beads as anchor points to correct for chromatic aberrations between the spectrally separated emitters (rotation, translation, and shear). Lateral drift was automatically corrected in each image by computing a nonreflective similarity transformation between temporally separated images. Areas of interest (AOIs) likely containing at least one single molecule were detected in the 633 nm channel by averaging the first five frames and performing a generalized likelihood ratio test on the resulting image53. Detected AOIs were fit to a two-dimensional Gaussian function within a 5x5 pixel space. AOIs were filtered by removing those with intensity values greater than three scaled median absolute deviations from the median (e.g., beads, multiple overlapping molecules) and those with a Euclidean distance less than 5 pixels away from a neighboring AOI. Accepted AOIs were then mapped to the 532 nm channel using the mathematical transformations described above. The time dependent fluorescence of each AOI in each channel was computed by integrating over all frames in a 3x3 pixel space centered on each AOI’s sub-pixel location. All the steps of this process were incorporated into a graphical user interface (smVideoProcessing).
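The AOI filtering step can be sketched in Python as follows. This is an illustrative reimplementation and not the smVideoProcessing code. Several details are assumptions: the scale factor 1.4826 (the usual consistency constant that makes the median absolute deviation comparable to a standard deviation for normally distributed data), a two-sided intensity criterion, and a neighbor check performed against all detected AOIs rather than only the accepted ones. The function name and input layout are hypothetical.

```python
import math
from statistics import median


def filter_aois(aois, n_mads=3.0, min_dist=5.0):
    """Filter AOIs given as (x, y, intensity) tuples.

    Removes AOIs whose intensity lies more than n_mads scaled median
    absolute deviations from the median, and AOIs that have another
    detected AOI closer than min_dist pixels.
    """
    intensities = [a[2] for a in aois]
    med = median(intensities)
    # Assumed consistency constant for the scaled MAD.
    scaled_mad = 1.4826 * median(abs(v - med) for v in intensities)

    kept = []
    for i, (x, y, inten) in enumerate(aois):
        if abs(inten - med) > n_mads * scaled_mad:
            continue  # intensity outlier (e.g., bead or overlapping molecules)
        crowded = any(
            j != i and math.hypot(x - u, y - v) < min_dist
            for j, (u, v, _) in enumerate(aois)
        )
        if not crowded:
            kept.append((x, y, inten))
    return kept


# Made-up coordinates and intensities for illustration only.
example_aois = [
    (10, 10, 100), (12, 11, 105), (40, 40, 98), (60, 60, 102),
    (80, 80, 1000), (100, 20, 101), (20, 100, 99),
]
accepted = filter_aois(example_aois)
```

In this example the bright spot at (80, 80) is rejected on intensity and the pair near (10, 10) is rejected because the two AOIs are about 2.2 pixels apart.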

Fluorescence trajectories of each AOI in both 633 nm (surface U1 snRNP, Cy5) and 532 nm channels (solution RNA binding, Cy3) were first idealized using the divisive segmentation and clustering algorithm (DISC)54. Only trajectories showing single step photobleaching or photoblinking of Cy5 (633 nm channel) were included for further analysis. For binding data, time series idealization enabled a binary interpretation of the time-dependent signal, whereby each frame was classified as a 0 for unbound or 1 for bound. All identified binding events were visually inspected to ensure on-target binding by examining a cropped image at each AOI for each frame in both channels (i.e., image gallery). Events determined to be off-target binding (e.g., a fluorescent RNA diffused within the AOI) were manually removed in the idealized time series. In the case of sparse, fast binding dynamics, a variational Bayesian hidden Markov modeling algorithm (vbFRET) was used instead of DISC55, owing to their known differences in event detection accuracy54. Only molecules exhibiting at least one binding event were included for further analysis. All the steps of this process were incorporated into a graphical user interface (smTraceViewer).
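To illustrate how a binary idealized trace translates into dwell times, a minimal Python sketch is given below. It is not the smTraceViewer implementation. The function name is hypothetical, and the sketch keeps the first and last runs of each trace even though they are truncated by the observation window; how such events were handled is not specified here.

```python
def dwell_times(trace, frame_time=1.0):
    """Split a 0/1 trace into (unbound, bound) dwell time lists in seconds.

    Each run of identical consecutive frames is one dwell; its duration
    is the number of frames in the run multiplied by frame_time.
    """
    unbound, bound = [], []
    if not trace:
        return unbound, bound
    state, length = trace[0], 1
    for s in trace[1:]:
        if s == state:
            length += 1
        else:
            (bound if state == 1 else unbound).append(length * frame_time)
            state, length = s, 1
    # Close the final run, which ends with the recording.
    (bound if state == 1 else unbound).append(length * frame_time)
    return unbound, bound


example_unbound, example_bound = dwell_times([0, 0, 1, 1, 1, 0, 1, 0, 0, 0])
```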

Single molecule dwell time analysis

Unbinned dwell time distributions were treated as mono- or biexponential distributions and the underlying parameters of each distribution were estimated using maximum likelihood estimation (MLE)29. We used a conditional probability density function (PDF) that accounts for the experimental limitations of frame rate (\({t}_{\min }\)) and experiment duration (\({t}_{\max }\)), as any observed dwell time \(t\) must therefore be \({t}_{\min }\le t\le {t}_{\max }\). The conditional PDFs of mono (PDF1) and biexponential (PDF2) distributions are given by (Eqs. 2 and 3)

$${{\rm{PDF}}}1\left(t\right)=\frac{\lambda {e}^{(-\lambda t)}}{{e}^{(-\lambda {t}_{\min })}-{e}^{(-\lambda {t}_{\max })}}$$
(2)
$${{\rm{PDF}}}2\left(t\right)=\frac{({A}_{1})\left({\lambda }_{1}{e}^{\left(-{\lambda }_{1}t\right)}\right)+({1-A}_{1})\left({\lambda }_{2}{e}^{\left(-{\lambda }_{2}t\right)}\right)}{({A}_{1})\left({e}^{\left(-{\lambda }_{1}{t}_{\min }\right)}-{e}^{\left(-{\lambda }_{1}{t}_{\max }\right)}\right)+({1-A}_{1})\left({e}^{\left(-{\lambda }_{2}{t}_{\min }\right)}-{e}^{\left(-{\lambda }_{2}{t}_{\max }\right)}\right)}$$
(3)

whereby each exponential distribution is described by a rate constant \({\lambda }_{n}\) and an amplitude \({A}_{n}\) where \({\sum }_{n=1}^{N}{A}_{n}=1\). Rate constants obtained from MLE are often interpreted as their corresponding time constants (\(\tau\)) rather than rates (\(\lambda\)) by the relationship \(\lambda={\tau }^{-1}\). A log likelihood-ratio (LLR) test was performed for each bound dwell time distribution to compare the goodness of fit of mono- and biexponential distributions (Eq. 4).

$${{\rm{LLR}}}=-2({\mathrm{ln}}L({\theta }_{0})-{\mathrm{ln}}L({\theta }_{1}))$$
(4)

where \(L({\theta }_{1})\) is the likelihood of the data given the parameters of the alternative hypothesis (biexponential distribution) and \(L({\theta }_{0})\) is the likelihood of the data given the parameters of the null hypothesis (monoexponential distribution). Values are reported as the MLE of the parameter ± standard error. Here, standard error is calculated from the 95% confidence interval obtained from the mle function in MATLAB using the Wald method. The results of all maximum likelihood estimations of dwell times are provided in Supplementary Table 3 for unbound dwell times and Supplementary Table 4 for bound dwell times.
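The monoexponential case (Eq. 2) can be illustrated with a short Python sketch. This is not the MATLAB mle call used here. It maximizes the log likelihood of the truncated PDF with a ternary search, which is valid because this log likelihood is concave in \(\lambda\). The function names, the search bounds, and the synthetic sampler are assumptions for illustration; standard errors and the biexponential fit are omitted.

```python
import math
import random


def log_lik_mono(lam, dwells, t_min, t_max):
    """Log likelihood of dwell times under the truncated PDF of Eq. 2."""
    norm = math.exp(-lam * t_min) - math.exp(-lam * t_max)
    n = len(dwells)
    return n * math.log(lam) - lam * sum(dwells) - n * math.log(norm)


def mle_mono(dwells, t_min, t_max, lo=1e-6, hi=10.0, iters=200):
    """Ternary search for the rate constant that maximizes Eq. 2's likelihood.

    The bounds lo and hi (in 1/s) are assumed to bracket the maximum.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_lik_mono(m1, dwells, t_min, t_max) < log_lik_mono(m2, dwells, t_min, t_max):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def sample_truncated_exp(lam, t_min, t_max, rng):
    """Draw one dwell time from Eq. 2 by inverse transform sampling."""
    a = math.exp(-lam * t_min)
    b = math.exp(-lam * t_max)
    return -math.log(a - rng.random() * (a - b)) / lam


# Synthetic check with an arbitrary rate of 0.2 1/s, 1 s frames, 600 s window.
rng = random.Random(0)
synthetic = [sample_truncated_exp(0.2, 1.0, 600.0, rng) for _ in range(2000)]
lam_hat = mle_mono(synthetic, 1.0, 600.0)
tau_hat = 1.0 / lam_hat  # time constant, tau = 1 / lambda
```

Ignoring the truncation and simply inverting the mean dwell time would bias the estimate, because dwells shorter than \({t}_{\min }\) are never observed.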

Dwell time distributions are visualized as cumulative probabilities, violin plots, or histograms. Cumulative probability plots were constructed by computing a cumulative distribution function (CDF) estimate in which the value of each bin \(({v}_{i})\) is given by Eq. 5.

$${v}_{i}={\sum }_{j=1}^{i}\frac{{c}_{j}}{N}\quad {\rm{where}}\quad 0\le {v}_{i}\le 1$$
(5)
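A minimal Python sketch of Eq. 5 follows. It assumes that \({c}_{j}\) is the number of dwell times in bin \(j\) and that \(N\) is the total number of dwell times, since neither symbol is defined explicitly in the text. The function name is hypothetical.

```python
from itertools import accumulate


def cumulative_probability(counts):
    """Eq. 5: running sum of bin counts divided by the total count."""
    total = sum(counts)
    return [running / total for running in accumulate(counts)]


example_cdf = cumulative_probability([2, 3, 5])
```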

Overlays of MLE of mono- and biexponential distributions were computed by integrating PDF1 and PDF2 over a range [\({t}_{\min },\, {t}_{\max }\)] as provided by Eqs. 6 and 7.

$${{\rm{CDF}}}1\left(t\right)=\int {{\rm{PDF}}}1\left(t\right)\,{dt}=-\frac{{e}^{\left(-\lambda t\right)}}{{e}^{\left(-\lambda {t}_{\min }\right)}-{e}^{\left(-\lambda {t}_{\max }\right)}}+{{\rm{C}}}$$
(6)
$${{\rm{CDF}}}2\left(t\right)=\int {{\rm{PDF}}}2\left(t\right)\,{dt}=-\frac{({A}_{1}){e}^{\left(-{\lambda }_{1}t\right)}+({1-A}_{1}){e}^{\left(-{\lambda }_{2}t\right)}}{({A}_{1})\left({e}^{\left(-{\lambda }_{1}{t}_{\min }\right)}-{e}^{\left(-{\lambda }_{1}{t}_{\max }\right)}\right)+({1-A}_{1})\left({e}^{\left(-{\lambda }_{2}{t}_{\min }\right)}-{e}^{\left(-{\lambda }_{2}{t}_{\max }\right)}\right)}+{{\rm{C}}}$$
(7)

In both cases, \(C\) was computed from \({CDF}1({t}_{\min })\) or \({CDF}2({t}_{\min })\) so that the resulting values lie in the range [0,1]. In the case of histograms, binned dwell times are overlaid with MLE estimates using PDF values computed from Eq. 2 and Eq. 3 and scaled for the number of observations in the distribution. For data collected at equilibrium (e.g., 9bp-1A), all unbound and bound events were considered for MLE. For non-equilibrium data (e.g., 9bp, Fig. 4b–c), only the initial unbound event (aka time to first binding) was considered for MLE to reduce the potential bias introduced by photobleaching of tight binders.
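The CDF overlays can be sketched in Python as shown below. This is an illustration and not the plotting code used here. The constant \(C\) is folded into the expressions by the assumption that each curve equals 0 at \({t}_{\min }\), which also makes it equal 1 at \({t}_{\max }\). Function names and the example parameter values are hypothetical.

```python
import math


def cdf1(t, lam, t_min, t_max):
    """Monoexponential CDF on [t_min, t_max], equal to 0 at t_min."""
    norm = math.exp(-lam * t_min) - math.exp(-lam * t_max)
    return (math.exp(-lam * t_min) - math.exp(-lam * t)) / norm


def cdf2(t, a1, lam1, lam2, t_min, t_max):
    """Biexponential CDF on [t_min, t_max], equal to 0 at t_min."""
    norm = (a1 * (math.exp(-lam1 * t_min) - math.exp(-lam1 * t_max))
            + (1.0 - a1) * (math.exp(-lam2 * t_min) - math.exp(-lam2 * t_max)))
    num = (a1 * (math.exp(-lam1 * t_min) - math.exp(-lam1 * t))
           + (1.0 - a1) * (math.exp(-lam2 * t_min) - math.exp(-lam2 * t)))
    return num / norm


# Arbitrary example parameters: 1 s frames, 600 s experiment.
times = [1.0, 5.0, 20.0, 100.0, 600.0]
overlay_mono = [cdf1(t, 0.2, 1.0, 600.0) for t in times]
overlay_bi = [cdf2(t, 0.7, 0.5, 0.02, 1.0, 600.0) for t in times]
```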

Dwell time correlation plots (Figs. 2J and 4G) were constructed by plotting the log10 transformation of event duration \(i\) vs event duration \(i+1\) for a given trace and rendered across all molecules as a contour plot. Clustering of dwell time correlations (Supplementary Fig. 11) was performed by agglomerative clustering of centroid distances via linkage and cluster functions in MATLAB. In the case of Supplementary Fig. 11, the likelihood of 2 clusters vs 1 cluster was determined by computing a Bayesian Information Criterion (BIC) score for each model by

$${{\rm{BIC}}}=N \, {{\mathrm{ln}}}\left(\frac{{SSE}}{N}\right)+k\times {{\mathrm{ln}}}\left(N\right)$$
(8)

where N is the number of datapoints, k is the number of clusters, and SSE is the sum of squared errors within a cluster summed across all clusters by

$${SSE}={\sum }_{i=1}^{k}{\sum }_{{x}_{j}\in {C}_{i}}{{||}{x}_{j}-{\mu }_{i}{||}}^{2}$$
(9)

where k is the number of clusters, \({C}_{i}\) is the \(i\)-th cluster, \({x}_{j}\) is a data point in cluster \({C}_{i}\), and \({\mu }_{i}\) is the centroid of cluster \({C}_{i}\). A lower BIC value was used as evidence for a better model.
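Eqs. 8 and 9 can be illustrated with a small Python sketch. It takes cluster assignments as given and does not perform the agglomerative clustering itself, which was done with the linkage and cluster functions in MATLAB. The function names and the toy points are assumptions for illustration.

```python
import math


def cluster_sse(clusters):
    """Eq. 9: squared distances to each cluster centroid, summed over clusters."""
    total = 0.0
    for points in clusters:
        dim = len(points[0])
        centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
        total += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in points
        )
    return total


def cluster_bic(clusters):
    """Eq. 8: BIC = N ln(SSE / N) + k ln(N)."""
    n = sum(len(points) for points in clusters)
    k = len(clusters)
    return n * math.log(cluster_sse(clusters) / n) + k * math.log(n)


# Toy data: two well separated groups of four points each.
group_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
group_b = [(10, 10), (10, 11), (11, 10), (11, 11)]
bic_one_cluster = cluster_bic([group_a + group_b])
bic_two_clusters = cluster_bic([group_a, group_b])
```

For these toy points the two-cluster assignment gives the lower BIC, because the drop in SSE outweighs the penalty for the extra cluster.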

Global HMM modeling of single molecule data

Kinetic modeling of single molecule data was performed with QuB56. Hidden Markov models were globally optimized to simultaneously describe the idealized behavior across all molecules within a particular dataset. In contrast to MLE analysis of single-molecule dwell time distributions, HMMs enable hypothesis testing of state connectivity and globally optimize transition rates across all molecules for a model postulated a priori. Given the complexity and scale of our CoSMoS data collected with the 9bp-1A 5′SS RNA, we tested specific models on the following four subsets of data: (1) varying 9bp-1A concentration at 0 nM U1-C in DMSO, (2) varying 9bp-1A concentration at 100 nM U1-C in DMSO, (3) varying U1-C concentration at 1 nM 9bp-1A in DMSO, and (4) varying branaplam concentration at 1 nM 9bp-1A with 100 nM U1-C (Supplementary Fig. 26, Supplementary Table 5). In the latter case, the 10 µM branaplam data were excluded from QuB modeling due to our inability to fully sample the bound dwell time distribution (Supplementary Fig. 9F). Each dataset was used to optimize specific rate parameters, the results of which are reported in Table 1.

Table 1 Globally optimized transition rates for 9bp-1A

Transition rates of user-defined models were optimized using maximum idealized point (MIP) likelihood rate estimation57. In all cases, a dead time parameter in QuB was set equal to the sampling time. For each condition, multiple models of varying complexity were built and ranked by their Bayesian Information Criterion58 (Eq. 10)

$${{\rm{BIC}}}=k\times {\mathrm{ln}} \, \left(N\right)-2\times {\mathrm{ln}} \, \left(L(\theta )\right)$$
(10)

where \(k\) is the number of free parameters, \(N\) is the number of data points (frames), and \(L(\theta )\) is the likelihood of the data given the model returned by MIP. Optimized rate constants are provided in Supplementary Table 4. Please note that of the models tested (Supplementary Fig. 26), no single model was considered the best fit, owing to the inability to test all concentration-dependent variables (e.g., [9bp-1A], [U1-C], and [branaplam]) in a single model at once using QuB. Overall, the physical states, their connectivity, and their rates were determined across the different datasets using their corresponding best fit models. The final model presented in Fig. 6 contains the physical states and connectivity of Model 4a in Supplementary Fig. 26, but the transition rates of this model presented in Table 1 stem from the optimized transition rates of the best fit model for each dataset defined in Supplementary Fig. 26.
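Ranking candidate models by Eq. 10 can be sketched in Python as follows. The model labels, log likelihoods, and parameter counts below are invented placeholders and do not correspond to any model in Supplementary Fig. 26; in practice the log likelihood would be the value returned by the MIP optimization.

```python
import math


def model_bic(log_likelihood, n_params, n_points):
    """Eq. 10: BIC = k ln(N) - 2 ln L(theta)."""
    return n_params * math.log(n_points) - 2.0 * log_likelihood


def rank_models(models, n_points):
    """Sort (label, log_likelihood, n_params) tuples from lowest to highest BIC."""
    return sorted(models, key=lambda m: model_bic(m[1], m[2], n_points))


# Hypothetical candidates scored on 600 frames.
candidates = [("two-state", -1000.0, 2), ("three-state", -990.0, 4)]
ranked = rank_models(candidates, n_points=600)
```

Here the three-state placeholder ranks first: its gain of 10 log likelihood units lowers the BIC by 20, which exceeds the penalty of about 12.8 for two extra parameters.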

To confirm this model matched our experimental data, we performed single molecule simulations (Supplementary Figs. 27-31). Single molecule trajectories were simulated as noiseless transitions between discrete states following the global model of 9bp-1A binding under Markov assumptions (Fig. 6, Table 1). Each simulation consisted of 10,000 molecules for a given concentration of 9bp-1A, U1-C, and branaplam. Five percent of these molecules were simulated without the inclusion of U1-C, to mimic our single molecule observations (Supplementary Fig. 11). Each molecule was simulated for 600 frames at the frame rate used in data collection (Supplementary Table 2).
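The idea of a noiseless Markov simulation can be illustrated with a deliberately simplified Python sketch. It uses only two states (unbound and bound) with hypothetical rate constants, whereas the global model in Fig. 6 contains additional states and concentration-dependent rates that are not reproduced here. Per-frame transition probabilities are taken from the exact solution of a two-state continuous-time Markov process sampled at the frame interval.

```python
import math
import random


def simulate_trace(k_on, k_off, n_frames=600, frame_time=1.0, rng=None):
    """Simulate a noiseless 0/1 (unbound/bound) trace for one molecule.

    k_on and k_off are pseudo-first-order rate constants in 1/s
    (hypothetical values, not taken from Table 1).
    """
    rng = rng or random.Random()
    k_sum = k_on + k_off
    relax = 1.0 - math.exp(-k_sum * frame_time)
    p_bind = (k_on / k_sum) * relax      # P(bound at next frame | unbound now)
    p_unbind = (k_off / k_sum) * relax   # P(unbound at next frame | bound now)

    state, trace = 0, []
    for _ in range(n_frames):
        trace.append(state)
        if state == 0:
            if rng.random() < p_bind:
                state = 1
        elif rng.random() < p_unbind:
            state = 0
    return trace


# One long trace with equal, arbitrary rates; the bound fraction should
# approach k_on / (k_on + k_off) = 0.5.
long_trace = simulate_trace(0.5, 0.5, n_frames=50000, rng=random.Random(1))
bound_fraction = sum(long_trace) / len(long_trace)
```

Simulated traces of this kind can be passed through the same dwell time analysis as the experimental data, which is what makes a direct comparison between model and measurement possible.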

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.