Abstract
Single-molecule fluorescence microscopy (SMFM) can reveal important biological insights. However, uncovering rare but critical intermediates often demands manual inspection of time traces and iterative ad hoc approaches. To facilitate systematic and efficient discovery from SMFM time traces, we introduce META-SiM, a transformer-based foundation model pretrained on diverse SMFM analysis tasks. META-SiM rivals best-in-class algorithms on a broad range of tasks including trace classification, segmentation, idealization and stepwise photobleaching analysis. Additionally, the model produces embeddings that encapsulate detailed information about each trace, which the web-based META-SiM Projector (https://www.simol-projector.org) casts into lower-dimensional space for efficient whole-dataset visualization, labeling, comparison and sharing. Combining this Projector with the objective metric of local Shannon entropy enables rapid identification of condition-specific behaviors, even if rare or subtle. Applying META-SiM to an existing single-molecule Förster resonance energy transfer dataset, we discover a previously undetected intermediate state in pre-mRNA splicing. META-SiM removes bottlenecks, improves objectivity and both systematizes and accelerates biological discovery in single-molecule data.
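The local Shannon entropy idea mentioned above can be illustrated with a minimal sketch. It rests on an assumption, since this excerpt does not give the authors' exact definition: here the entropy of experimental-condition labels is computed among each trace's k nearest neighbors in embedding space, so that a low value marks a trace sitting in a neighborhood dominated by one condition. The function name, the value of k, and the toy 2D embeddings are hypothetical.

```python
import math
from collections import Counter


def local_shannon_entropy(embeddings, conditions, k=3):
    """Entropy (bits) of condition labels among each point's k nearest neighbors.

    Assumed definition for illustration only; the point itself is excluded.
    """
    entropies = []
    for i, point in enumerate(embeddings):
        # Brute-force Euclidean distances to every other embedding.
        dists = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(embeddings)
            if j != i
        )
        labels = [conditions[j] for _, j in dists[:k]]
        n = len(labels)
        counts = Counter(labels)
        # H = sum p * log2(1/p) over the label frequencies in the neighborhood.
        entropies.append(sum((c / n) * math.log2(n / c) for c in counts.values()))
    return entropies


# Toy data: a mixed cluster near the origin and a cluster seen only in condition "B".
embeddings = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
conditions = ["A", "B", "A", "B", "B", "B", "B", "B"]
lse = local_shannon_entropy(embeddings, conditions, k=3)
```

In this toy example the traces in the condition-specific cluster receive an entropy of zero, while traces in the mixed cluster receive a value close to 0.92 bits. The brute-force neighbor search scales quadratically and would need replacing for datasets of realistic size.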
Data availability
The experimental data and the data analysis code that support the findings of this study are available via GitHub at https://github.com/simol-lab/META-SiM.
Code availability
Code for generating embedding vectors and UMAP projections, applying META-SiM for downstream tasks and calculating metrics including the SCS and LSE is available in our GitHub repository at https://www.github.com/simol-lab/meta-sim. A comprehensive user manual is available in this GitHub repository and on the META-SiM Projector website (https://www.simol-projector.org).
References
Shashkova, S. & Leake, M. C. Single-molecule fluorescence microscopy review: shedding new light on old problems. Biosci. Rep. 37, BSR20170031 (2017).
Bacic, L., Sabantsev, A. & Deindl, S. Recent advances in single-molecule fluorescence microscopy render structural biology dynamic. Curr. Opin. Struct. Biol. 65, 61–68 (2020).
Lerner, E. et al. FRET-based dynamic structural biology: challenges, perspectives and an appeal for open-science practices. eLife 10, e60416 (2021).
Wasserman, M. R., Alejo, J. L., Altman, R. B. & Blanchard, S. C. Multiperspective smFRET reveals rate-determining late intermediates of ribosomal translocation. Nat. Struct. Mol. Biol. 23, 333–341 (2016).
Shebl, B., Norman, Z. & Cornish, P. V. Ribosome structure and dynamics by smFRET microscopy. Methods Enzymol. 549, 375–406 (2014).
Agam, G. et al. Reliability and accuracy of single-molecule FRET studies for characterization of structural dynamics and distances in proteins. Nat. Methods 20, 523–535 (2023).
Peter, M. F. et al. Cross-validation of distance measurements in proteins by PELDOR/DEER and single-molecule FRET. Nat. Commun. 13, 4396 (2022).
Blanco, M. & Walter, N. G. in Methods in Enzymology Vol. 472 (ed. Walter, N. G.) Ch. 9 (Academic, 2010).
Roy, R., Hohng, S. & Ha, T. A practical guide to single-molecule FRET. Nat. Methods 5, 507–516 (2008).
Hartmann, A. et al. An automated single-molecule FRET platform for high-content, multiwell plate screening of biomolecular conformations and dynamics. Nat. Commun. 14, 6511 (2023).
Juette, M. F. et al. Single-molecule imaging of non-equilibrium molecular ensembles on the millisecond timescale. Nat. Methods 13, 341–344 (2016).
Hellenkamp, B. et al. Precision and accuracy of single-molecule FRET measurements—a multi-laboratory benchmark study. Nat. Methods 15, 669–676 (2018).
Götz, M. et al. A blind benchmark of analysis tools to infer kinetic rate constants from single-molecule FRET trajectories. Nat. Commun. 13, 5402 (2022).
Li, J. et al. Exploring the speed limit of toehold exchange with a cartwheeling DNA acrobat. Nat. Nanotechnol. 13, 723–729 (2018).
Chung, H. S. et al. Extracting rate coefficients from single-molecule photon trajectories and FRET efficiency histograms for a fast-folding protein. J. Phys. Chem. A 115, 3642–3656 (2011).
Mohapatra, S., Lin, C.-T., Feng, X. A., Basu, A. & Ha, T. Single-molecule analysis and engineering of DNA motors. Chem. Rev. 120, 36–78 (2020).
Schmid, S., Götz, M. & Hugel, T. Single-molecule analysis beyond dwell times: demonstration and assessment in and out of equilibrium. Biophys. J. 111, 1375–1384 (2016).
Di Antonio, M. et al. Single-molecule visualization of DNA G-quadruplex formation in live cells. Nat. Chem. 12, 832–837 (2020).
Asher, W. B. et al. Single-molecule FRET imaging of GPCR dimers in living cells. Nat. Methods 18, 397–405 (2021).
Tsekouras, K., Custer, T. C., Jashnsaz, H., Walter, N. G. & Pressé, S. A novel method to accurately locate and count large numbers of steps by photobleaching. Mol. Biol. Cell 27, 3601–3615 (2016).
Leake, M. C. et al. Stoichiometry and turnover in single, functioning membrane protein complexes. Nature 443, 355–358 (2006).
Ulbrich, M. H. & Isacoff, E. Y. Subunit counting in membrane-bound proteins. Nat. Methods 4, 319–321 (2007).
Uno, S.-N. et al. A spontaneously blinking fluorophore based on intramolecular spirocyclization for live-cell super-resolution imaging. Nat. Chem. 6, 681–689 (2014).
Püntener, S. & Rivera-Fuentes, P. Single-molecule peptide identification using fluorescence blinking fingerprints. J. Am. Chem. Soc. 145, 1441–1447 (2023).
McKinney, S. A., Joo, C. & Ha, T. Analysis of single-molecule FRET trajectories using hidden Markov modeling. Biophys. J. 91, 1941–1951 (2006).
Hon, J. & Gonzalez, R. L. Bayesian-estimated hierarchical HMMs enable robust analysis of single-molecule kinetic heterogeneity. Biophys. J. 116, 1790–1802 (2019).
Bronson, J. E., Fei, J., Hofman, J. M., Gonzalez, R. L. & Wiggins, C. H. Learning rates and states from biophysical time series: a Bayesian approach to model selection and single-molecule FRET data. Biophys. J. 97, 3196–3205 (2009).
Li, J., Zhang, L., Johnson-Buck, A. & Walter, N. G. Automatic classification and segmentation of single-molecule fluorescence time traces with deep learning. Nat. Commun. 11, 5833 (2020).
Thomsen, J. et al. DeepFRET, a software for rapid and automated single-molecule FRET data classification using deep learning. eLife 9, e60404 (2020).
Wanninger, S. et al. Deep-LASI: deep-learning assisted, single-molecule imaging analysis of multi-color DNA origami structures. Nat. Commun. 14, 6564 (2023).
Xu, J. et al. Automated stoichiometry analysis of single-molecule fluorescence imaging traces via deep learning. J. Am. Chem. Soc. 141, 6976–6985 (2019).
Zhou, S. et al. Deep learning based local feature classification to automatically identify single molecule fluorescence events. Commun. Biol. 7, 1404 (2024).
Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems 30 (eds Guyon, I. et al.) 5998–6008 (Curran Associates, 2017).
Brown, T. et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems 33 (eds Larochelle, H. et al.) 1877–1901 (Curran Associates, 2020).
Widom, J. R. et al. Ligand modulates cross-coupling between riboswitch folding and transcriptional pausing. Mol. Cell 72, 541–552 (2018).
Blanco, M. R. et al. Single molecule cluster analysis identifies signature dynamic conformations along the splicing pathway. Nat. Methods 12, 1077–1084 (2015).
Semlow, D. R., Blanco, M. R., Walter, N. G. & Staley, J. P. Spliceosomal DEAH-box ATPases remodel pre-mRNA to activate alternative splice sites. Cell 164, 985–998 (2016).
Bommasani, R. et al. On the opportunities and risks of foundation models. Preprint at https://arxiv.org/abs/2108.07258 (2022).
Radford, A. et al. Language Models Are Unsupervised Multitask Learners (OpenAI, 2019).
Johnson-Buck, A. et al. Kinetic fingerprinting to identify and count single nucleic acids. Nat. Biotechnol. 33, 730–732 (2015).
Jolliffe, I. T. & Cadima, J. Principal component analysis: a review and recent developments. Philos. Trans. A Math. Phys. Eng. Sci. 374, 20150202 (2016).
Healy, J. & McInnes, L. Uniform manifold approximation and projection. Nat. Rev. Methods Primers 4, 82 (2024).
Hinton, G. E. & Roweis, S. T. Stochastic neighbor embedding. In Advances in Neural Information Processing Systems 15 (eds Becker, S. et al.) 857–864 (MIT Press, 2003).
Schmid, S. & Hugel, T. Controlling protein function by fine-tuning conformational flexibility. eLife 9, e57180 (2020).
Dong, J. et al. Direct imaging of single-molecule electrochemical reactions in solution. Nature 596, 244–249 (2021).
Deguchi, T. et al. Direct observation of motor protein stepping in living cells using MINFLUX. Science 379, 1010–1015 (2023).
Fu, J. et al. Multi-enzyme complexes on DNA scaffolds capable of substrate channelling with an artificial swinging arm. Nat. Nanotechnol. 9, 531–536 (2014).
Suddala, K. C., Wang, J., Hou, Q. & Walter, N. G. Mg2+ shifts ligand-mediated folding of a riboswitch from induced-fit to conformational selection. J. Am. Chem. Soc. 137, 14075–14083 (2015).
Suddala, K. C. et al. Local-to-global signal transduction at the core of a Mn2+ sensing riboswitch. Nat. Commun. 10, 4304 (2019).
Hayward, S. L. et al. Ultraspecific and amplification-free quantification of mutant DNA by single-molecule kinetic fingerprinting. J. Am. Chem. Soc. 140, 11755–11762 (2018).
Dosovitskiy, A. et al. An image is worth 16×16 words: transformers for image recognition at scale. In International Conference on Learning Representations (ICLR 2021) https://openreview.net/forum?id=YicbFdNTTy (OpenReview, 2020).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics (eds Burnstein, J. et al.) 4171–4186 (Association for Computational Linguistics, 2019).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).
Johnson-Buck, A., Li, J., Tewari, M. & Walter, N. G. A guide to nucleic acid detection by single-molecule kinetic fingerprinting. Methods 153, 3–12 (2019).
Fix, E. & Hodges, J. L. Discriminatory analysis. Nonparametric discrimination: consistency properties. Int. Stat. Rev. 57, 238–247 (1989).
Zosel, F., Soranno, A., Buholzer, K. J., Nettels, D. & Schuler, B. Depletion interactions modulate the binding between disordered proteins in crowded environments. Proc. Natl Acad. Sci. USA 117, 13480–13489 (2020).
Acknowledgements
We thank J. Tang and L. Dai for manual labeling of the stepwise photobleaching data. We thank L. Dai and P. Banerjee for multiplexing SiMREPS data. We thank A. Chauvier for SiM-KARTS data. We thank Z. Li for esthetic consulting for the design of Fig. 1. We thank S. Schmid for sharing the protein HSP90 study data for our benchmark test. This work was supported by NIH grant R35 GM131922 to N.G.W.
Author information
Contributions
J.L. and L.Z. conceived the ideas and analyzed and interpreted data. L.Z. and J.L. wrote Python programs for data processing, deep learning network training and single-molecule trace simulation. J.L., L.Z., A.J.-B. and N.G.W. cowrote the paper. All authors discussed the results and edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks Hugo Sanabria and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available. Primary Handling Editor: Rita Strack, in collaboration with the Nature Methods team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Performance of META-SiM in various downstream tasks.
a, The area under the receiver-operating characteristic curve (ROC AUC) for trace classification by META-SiM and DeepFRET compared to manual analysis. Error bars are ± one standard deviation (s.d.) from 10 subsampling experiments. b, Representative FRET histograms based on traces curated and segmented by META-SiM versus manual analysis. c, A distribution of dwell times predicted by the model is fit with single-exponential distributions to yield transition rate constants. d, A representative confusion matrix comparing the labels from manual counting (‘True Label’) to the labels predicted by META-SiM. e, Concordance between manual counting and predictions by META-SiM that either match exactly or differ by no more than one step. f, Standard curve for T790M generated by META-SiM and HMM analysis. g,h, Evaluation of performance in trace idealization on 131 time traces for META-SiM (Fine-Tuned), and benchmarking against 14 other common analysis tools (ref. 13), on the basis of measured rate constants (g) and FRET efficiencies (h): (1) Pomegranate; (2) Tracy(HMM); (3) FRETboard; (4) Hidden-Markury; (5) SMACKS(SS); (6) SMACKS; (7) Correlation; (8) Edge finding(CK); (9) Edge finding(k-means); (10) Step finding; (11) STaSI; (12) MASH-FRET(bootstrap); (13) MASH-FRET(prob); (14) postFRET. Error bars in g and h are ± one s.d. of fitted rate constants and fitted Gaussian distribution of FRET efficiency, respectively.
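As a rough illustration of how a dwell-time distribution yields a rate constant (panel c), the sketch below uses the maximum-likelihood estimator for a single-exponential process, k = 1/mean dwell time. This is a swapped-in technique chosen for brevity: the caption does not say whether the authors fit a histogram, a cumulative distribution, or used another procedure. The simulated dwell times, the true rate of 2.0 s⁻¹ and the function name are hypothetical.

```python
import random
import statistics


def fit_single_exponential(dwell_times):
    """Return (rate, standard error) for exponentially distributed dwell times.

    Maximum-likelihood estimate: rate = 1 / mean dwell time.
    Asymptotic standard error of that estimate: rate / sqrt(n).
    """
    rate = 1.0 / statistics.fmean(dwell_times)
    stderr = rate / len(dwell_times) ** 0.5
    return rate, stderr


# Simulated dwell times (seconds) drawn from a known rate, for demonstration only.
rng = random.Random(0)
true_rate = 2.0  # s^-1, hypothetical
dwells = [rng.expovariate(true_rate) for _ in range(5000)]

rate, stderr = fit_single_exponential(dwells)
print(f"estimated rate = {rate:.3f} +/- {stderr:.3f} s^-1")
```

The sketch ignores effects that matter for real traces, such as finite camera exposure time and dwell times truncated by the start or end of the observation window.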
Extended Data Fig. 2 Evaluation of UMAP projection on both simulated and experimental data.
a–c, 2D UMAP projections of simulated traces with varying high-FRET value (a), number of FRET states (b) and number of photobleaching steps (c). d–g, 2D UMAP projections of traces from datasets D2 (ref. 47; d), D3 (ref. 48; e), D4 (ref. 35; f) and D5 (ref. 49; g) that were manually accepted (red) or rejected (blue) for further analysis.
Extended Data Fig. 3 Varying principal projections from the same dataset.
a–c, 2D UMAP projections of dataset D4 (ref. 35) based on different attributes: kinetic rate (a), photobleaching steps (b) and donor and acceptor fluorophore lifetime prior to photobleaching (c). d–f, 2D UMAP projections of dataset D7 (ref. 50) based on different attributes: single-channel kinetic rate (d), single-channel photobleaching steps (e) and single-channel SNR (f). g,h, 2D UMAP projections of dataset D6 with different attributes: single-channel kinetic rate (g) and single-channel SNR (h).
Extended Data Fig. 4 Different annotations of the smFRET Atlas.
a, An smFRET Atlas constructed with 22,000 traces derived from simulation. b–f, The same Atlas in which only low-SNR traces (b), traces with a single FRET state (c), traces with three FRET states (d), traces with two FRET states and slow transitions (e) or traces with two FRET states and fast transitions (f) are plotted. Codes for cluster names (1-c-l, etc.) are listed in Supplementary Table 3.
Extended Data Fig. 5 Titration of KCl into the paused transcriptional elongation complex system (D4).
a,b, 2D UMAP projections of embeddings from uncurated traces under different KCl concentrations in Atlas coordinates (a) and system-specific coordinates (b). c, TODPs of traces from the different KCl concentration conditions. d, FRET histogram of traces from the different KCl concentrations. e,f, 2D UMAP projections of the 10% of embeddings with lowest LSE from traces under the different KCl concentrations in Atlas coordinates (e) and system-specific coordinates (f). g, TODPs of 10% lowest LSE traces from the different KCl concentration conditions. h, FRET histograms of the 10% of traces with lowest LSE from the different KCl concentrations. i, FRET histograms of manually curated traces from the different KCl concentrations.
Extended Data Fig. 6 Single-molecule FRET characterization of the effects of the A577I mutation on the conformation of yeast Hsp90.
a–c, The 2D UMAP projections of trace embeddings from the different experimental conditions in Atlas coordinates (a) and system-specific coordinates (b,c). d, FRET histograms of all traces from the different experimental conditions. e, 2D UMAP projections of the 20% of traces with lowest LSE under the different experimental conditions in system-specific coordinates. f, FRET histograms of the 20% of traces with lowest LSE from the different experimental conditions, more strongly highlighting the conclusion from the original study (ref. 44) that, in the presence of ATP, both the A577I/A577I homodimer (5th column) and the A577I/wild-type (wt) heterodimer (6th column) lead to a shift toward closed conformations, corresponding to high-FRET states.
Extended Data Fig. 7 Single-molecule FRET characterization of the effects of macromolecular crowding by Ficoll 400 on the conformation of yeast Hsp90.
a–c, The 2D UMAP projections of trace embeddings from the different experimental conditions in Atlas coordinates (a) and system-specific coordinates (b,c). d, FRET histograms of traces from the different experimental conditions. e, 2D UMAP projections of the 20% of traces with lowest LSE under the different experimental conditions in system-specific coordinates. f, FRET histograms of the 20% of traces with lowest LSE from the different experimental conditions, highlighting the conclusion from the original study (ref. 8) that crowding by Ficoll 400 at increasing concentrations (1st, 2nd, 3rd columns) leads to progressively greater shifts toward closed conformations, corresponding to high-FRET states. While such a shift is present for sucrose at 30%, it is much less pronounced than for Ficoll 400.
Extended Data Fig. 8 Single-molecule FRET characterization of the effects of cochaperone Aha1 on the conformation of yeast Hsp90.
a–c, The 2D UMAP projections of trace embeddings from the different experimental conditions in Atlas coordinates (a) and system-specific coordinates (b,c). d, FRET histograms of traces from the different experimental conditions. e, 2D UMAP projections of the 20% of traces with lowest LSE under the different experimental conditions in system-specific coordinates. f, FRET histograms of the 10% of traces with lowest LSE from the different experimental conditions, more strongly highlighting the conclusion from the original study (ref. 44) that the presence of cochaperone Aha1 (labeled as ‘(+)Aha1+ATP’) leads to a shift toward closed conformations, corresponding to high-FRET states.
Extended Data Fig. 9 Full smFRET characterization of the yeast pre-mRNA splicing pathway.
a, Diagram of the splicing pathway. Experimental conditions used to block any further progress beyond specific steps in the pathway are annotated in orange font (ref. 36). b,c, 2D UMAP projections of trace embeddings from the different experimental conditions in Atlas coordinates (b) and system-specific coordinates (c). d, TODPs of traces from the different experimental conditions. e, FRET histograms of traces from the different experimental conditions. f,g, 2D UMAP projections of the 10% of traces with lowest LSE under the different experimental conditions in Atlas coordinates (f) and system-specific coordinates (g). h, TODPs of the 10% of traces with lowest LSE from the different experimental conditions. i, FRET histograms of the 10% of traces with lowest LSE from the different experimental conditions.
Extended Data Fig. 10 Distribution of the 10% of traces with lowest LSE across the different experimental conditions in the splicing study.
The count and fraction of traces from each experimental dataset that fall among the 10% of traces with the lowest LSE values differ across conditions, indicating that certain conditions exhibit a larger fraction of highly condition-specific traces.
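The per-condition breakdown described in this caption can be sketched as follows, assuming one LSE value and one condition label per trace are already available, however they were computed. The function name, the handling of ties at the cutoff and the toy values are hypothetical and not taken from the authors' code.

```python
from collections import Counter


def lowest_lse_breakdown(lse_values, conditions, fraction=0.10):
    """Per condition, return (count, fraction of that condition's traces)
    found among the `fraction` of all traces with the lowest LSE."""
    n_select = max(1, int(len(lse_values) * fraction))
    # Rank trace indices by ascending LSE; ties keep their input order.
    order = sorted(range(len(lse_values)), key=lambda i: lse_values[i])
    selected = Counter(conditions[i] for i in order[:n_select])
    totals = Counter(conditions)
    return {cond: (selected[cond], selected[cond] / totals[cond]) for cond in totals}


# Toy data: 10 traces per condition; only condition "X" has very low LSE values.
lse_values = [0.0, 0.1] + [0.9] * 8 + [0.5] * 10
conditions = ["X"] * 10 + ["Y"] * 10
breakdown = lowest_lse_breakdown(lse_values, conditions, fraction=0.10)
print(breakdown)
```

Here both selected traces come from condition "X", so that condition contributes 2 of its 10 traces (a fraction of 0.2) and condition "Y" contributes none.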
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, J., Zhang, L., Johnson-Buck, A. et al. Foundation model for efficient biological discovery in single-molecule time traces. Nat Methods (2025). https://doi.org/10.1038/s41592-025-02839-4
DOI: https://doi.org/10.1038/s41592-025-02839-4