  • Article

Foundation model for efficient biological discovery in single-molecule time traces

Abstract

Single-molecule fluorescence microscopy (SMFM) can reveal important biological insights. However, uncovering rare but critical intermediates often demands manual inspection of time traces and iterative ad hoc approaches. To facilitate systematic and efficient discovery from SMFM time traces, we introduce META-SiM, a transformer-based foundation model pretrained on diverse SMFM analysis tasks. META-SiM rivals best-in-class algorithms on a broad range of tasks including trace classification, segmentation, idealization and stepwise photobleaching analysis. Additionally, the model produces embeddings that encapsulate detailed information about each trace, which the web-based META-SiM Projector (https://www.simol-projector.org) casts into lower-dimensional space for efficient whole-dataset visualization, labeling, comparison and sharing. Combining this Projector with the objective metric of local Shannon entropy enables rapid identification of condition-specific behaviors, even if rare or subtle. Applying META-SiM to an existing single-molecule Förster resonance energy transfer dataset, we discover a previously undetected intermediate state in pre-mRNA splicing. META-SiM removes bottlenecks, improves objectivity and both systematizes and accelerates biological discovery in single-molecule data.
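
The abstract does not define local Shannon entropy, so the following is only a minimal sketch of one plausible reading: for each trace embedding, take its k nearest neighbors and compute the Shannon entropy of their experimental-condition labels, so that low values flag neighborhoods dominated by a single condition. The function name, the brute-force neighbor search, the exclusion of the point itself and the default k are assumptions made for illustration; none of them is taken from the META-SiM code.

```python
import math
from collections import Counter


def local_shannon_entropy(points, labels, k=5):
    """Entropy (in bits) of condition labels among each point's k nearest neighbors.

    points: list of equal-length coordinate tuples (e.g. embedding vectors).
    labels: list of condition labels, one per point.
    Hypothetical helper; the neighborhood definition is an assumption.
    """
    entropies = []
    for i, p in enumerate(points):
        # Brute-force squared Euclidean distance to every other point.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        neighbor_labels = [labels[j] for _, j in dists[:k]]
        counts = Counter(neighbor_labels)
        total = len(neighbor_labels)
        # Shannon entropy of the label distribution in this neighborhood.
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies


# Toy example: one pure cluster (all "A") and one mixed cluster ("A" and "B").
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["A", "A", "A", "A", "A", "B", "A", "B"]
lse = local_shannon_entropy(points, labels, k=3)
```

In this toy case the points in the pure cluster get an entropy of zero, while points in the mixed cluster get a positive value, which is the behavior a condition-specificity metric of this kind would need.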

Fig. 1: META-SiM enables analysis, visualization and efficient discovery from diverse single-molecule datasets.
Fig. 2: Architecture and training of the META-SiM model.
Fig. 3: Performance of META-SiM on diverse analysis tasks.
Fig. 4: Whole-dataset visualization and interpretation with META-SiM Projector.
Fig. 5: Quantitative metrics for quality control and discovery.
Fig. 6: META-SiM as a tool for biological discovery in complex smFRET data.

Data availability

The experimental data and the data analysis code that support the findings of this study are available via GitHub at https://github.com/simol-lab/META-SiM.

Code availability

Code for generating embedding vectors and UMAP projections, applying META-SiM for downstream tasks and calculating metrics including the SCS and LSE is available in our GitHub repository at https://www.github.com/simol-lab/meta-sim. A comprehensive user manual is available in this GitHub repository and on the META-SiM Projector website (https://www.simol-projector.org).

References

  1. Shashkova, S. & Leake, M. C. Single-molecule fluorescence microscopy review: shedding new light on old problems. Biosci. Rep. 37, BSR20170031 (2017).

  2. Bacic, L., Sabantsev, A. & Deindl, S. Recent advances in single-molecule fluorescence microscopy render structural biology dynamic. Curr. Opin. Struct. Biol. 65, 61–68 (2020).

  3. Lerner, E. et al. FRET-based dynamic structural biology: challenges, perspectives and an appeal for open-science practices. eLife 10, e60416 (2021).

  4. Wasserman, M. R., Alejo, J. L., Altman, R. B. & Blanchard, S. C. Multiperspective smFRET reveals rate-determining late intermediates of ribosomal translocation. Nat. Struct. Mol. Biol. 23, 333–341 (2016).

  5. Shebl, B., Norman, Z. & Cornish, P. V. Ribosome structure and dynamics by smFRET microscopy. Methods Enzymol. 549, 375–406 (2014).

  6. Agam, G. et al. Reliability and accuracy of single-molecule FRET studies for characterization of structural dynamics and distances in proteins. Nat. Methods 20, 523–535 (2023).

  7. Peter, M. F. et al. Cross-validation of distance measurements in proteins by PELDOR/DEER and single-molecule FRET. Nat. Commun. 13, 4396 (2022).

  8. Blanco, M. & Walter, N. G. in Methods in Enzymology Vol. 472 (ed. Walter, N. G.) Ch. 9 (Academic, 2010).

  9. Roy, R., Hohng, S. & Ha, T. A practical guide to single-molecule FRET. Nat. Methods 5, 507–516 (2008).

  10. Hartmann, A. et al. An automated single-molecule FRET platform for high-content, multiwell plate screening of biomolecular conformations and dynamics. Nat. Commun. 14, 6511 (2023).

  11. Juette, M. F. et al. Single-molecule imaging of non-equilibrium molecular ensembles on the millisecond timescale. Nat. Methods 13, 341–344 (2016).

  12. Hellenkamp, B. et al. Precision and accuracy of single-molecule FRET measurements—a multi-laboratory benchmark study. Nat. Methods 15, 669–676 (2018).

  13. Götz, M. et al. A blind benchmark of analysis tools to infer kinetic rate constants from single-molecule FRET trajectories. Nat. Commun. 13, 5402 (2022).

  14. Li, J. et al. Exploring the speed limit of toehold exchange with a cartwheeling DNA acrobat. Nat. Nanotechnol. 13, 723–729 (2018).

  15. Chung, H. S. et al. Extracting rate coefficients from single-molecule photon trajectories and FRET efficiency histograms for a fast-folding protein. J. Phys. Chem. A 115, 3642–3656 (2011).

  16. Mohapatra, S., Lin, C.-T., Feng, X. A., Basu, A. & Ha, T. Single-molecule analysis and engineering of DNA motors. Chem. Rev. 120, 36–78 (2020).

  17. Schmid, S., Götz, M. & Hugel, T. Single-molecule analysis beyond dwell times: demonstration and assessment in and out of equilibrium. Biophys. J. 111, 1375–1384 (2016).

  18. Di Antonio, M. et al. Single-molecule visualization of DNA G-quadruplex formation in live cells. Nat. Chem. 12, 832–837 (2020).

  19. Asher, W. B. et al. Single-molecule FRET imaging of GPCR dimers in living cells. Nat. Methods 18, 397–405 (2021).

  20. Tsekouras, K., Custer, T. C., Jashnsaz, H., Walter, N. G. & Pressé, S. A novel method to accurately locate and count large numbers of steps by photobleaching. Mol. Biol. Cell 27, 3601–3615 (2016).

  21. Leake, M. C. et al. Stoichiometry and turnover in single, functioning membrane protein complexes. Nature 443, 355–358 (2006).

  22. Ulbrich, M. H. & Isacoff, E. Y. Subunit counting in membrane-bound proteins. Nat. Methods 4, 319–321 (2007).

  23. Uno, S.-N. et al. A spontaneously blinking fluorophore based on intramolecular spirocyclization for live-cell super-resolution imaging. Nat. Chem. 6, 681–689 (2014).

  24. Püntener, S. & Rivera-Fuentes, P. Single-molecule peptide identification using fluorescence blinking fingerprints. J. Am. Chem. Soc. 145, 1441–1447 (2023).

  25. McKinney, S. A., Joo, C. & Ha, T. Analysis of single-molecule FRET trajectories using hidden Markov modeling. Biophys. J. 91, 1941–1951 (2006).

  26. Hon, J. & Gonzalez, R. L. Bayesian-estimated hierarchical HMMs enable robust analysis of single-molecule kinetic heterogeneity. Biophys. J. 116, 1790–1802 (2019).

  27. Bronson, J. E., Fei, J., Hofman, J. M., Gonzalez, R. L. & Wiggins, C. H. Learning rates and states from biophysical time series: a Bayesian approach to model selection and single-molecule FRET data. Biophys. J. 97, 3196–3205 (2009).

  28. Li, J., Zhang, L., Johnson-Buck, A. & Walter, N. G. Automatic classification and segmentation of single-molecule fluorescence time traces with deep learning. Nat. Commun. 11, 5833 (2020).

  29. Thomsen, J. et al. DeepFRET, a software for rapid and automated single-molecule FRET data classification using deep learning. eLife 9, e60404 (2020).

  30. Wanninger, S. et al. Deep-LASI: deep-learning assisted, single-molecule imaging analysis of multi-color DNA origami structures. Nat. Commun. 14, 6564 (2023).

  31. Xu, J. et al. Automated stoichiometry analysis of single-molecule fluorescence imaging traces via deep learning. J. Am. Chem. Soc. 141, 6976–6985 (2019).

  32. Zhou, S. et al. Deep learning based local feature classification to automatically identify single molecule fluorescence events. Commun. Biol. 7, 1404 (2024).

  33. Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems 30 (eds Guyon, I. et al.) 5998–6008 (Curran Associates, 2017).

  34. Brown, T. et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems 33 (eds Larochelle, H. et al.) 1877–1901 (Curran Associates, 2020).

  35. Widom, J. R. et al. Ligand modulates cross-coupling between riboswitch folding and transcriptional pausing. Mol. Cell 72, 541–552 (2018).

  36. Blanco, M. R. et al. Single molecule cluster analysis identifies signature dynamic conformations along the splicing pathway. Nat. Methods 12, 1077–1084 (2015).

  37. Semlow, D. R., Blanco, M. R., Walter, N. G. & Staley, J. P. Spliceosomal DEAH-box ATPases remodel pre-mRNA to activate alternative splice sites. Cell 164, 985–998 (2016).

  38. Bommasani, R. et al. On the opportunities and risks of foundation models. Preprint at https://arxiv.org/abs/2108.07258 (2022).

  39. Radford, A. et al. Language Models Are Unsupervised Multitask Learners (OpenAI, 2019).

  40. Johnson-Buck, A. et al. Kinetic fingerprinting to identify and count single nucleic acids. Nat. Biotechnol. 33, 730–732 (2015).

  41. Jolliffe, I. T. & Cadima, J. Principal component analysis: a review and recent developments. Philos. Trans. A Math. Phys. Eng. Sci. 374, 20150202 (2016).

  42. Healy, J. & McInnes, L. Uniform manifold approximation and projection. Nat. Rev. Methods Primers 4, 82 (2024).

  43. Hinton, G. E. & Roweis, S. T. Stochastic neighbor embedding. In Advances in Neural Information Processing Systems 15 (eds Becker, S. et al.) 857–864 (MIT Press, 2003).

  44. Schmid, S. & Hugel, T. Controlling protein function by fine-tuning conformational flexibility. eLife 9, e57180 (2020).

  45. Dong, J. et al. Direct imaging of single-molecule electrochemical reactions in solution. Nature 596, 244–249 (2021).

  46. Deguchi, T. et al. Direct observation of motor protein stepping in living cells using MINFLUX. Science 379, 1010–1015 (2023).

  47. Fu, J. et al. Multi-enzyme complexes on DNA scaffolds capable of substrate channelling with an artificial swinging arm. Nat. Nanotechnol. 9, 531–536 (2014).

  48. Suddala, K. C., Wang, J., Hou, Q. & Walter, N. G. Mg2+ shifts ligand-mediated folding of a riboswitch from induced-fit to conformational selection. J. Am. Chem. Soc. 137, 14075–14083 (2015).

  49. Suddala, K. C. et al. Local-to-global signal transduction at the core of a Mn2+ sensing riboswitch. Nat. Commun. 10, 4304 (2019).

  50. Hayward, S. L. et al. Ultraspecific and amplification-free quantification of mutant DNA by single-molecule kinetic fingerprinting. J. Am. Chem. Soc. 140, 11755–11762 (2018).

  51. Dosovitskiy, A. et al. An image is worth 16×16 words: transformers for image recognition at scale. In International Conference on Learning Representations (ICLR 2021) https://openreview.net/forum?id=YicbFdNTTy (OpenReview, 2020).

  52. Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics (eds Burnstein, J. et al.) 4171–4186 (Association for Computational Linguistics, 2019).

  53. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).

  54. Johnson-Buck, A., Li, J., Tewari, M. & Walter, N. G. A guide to nucleic acid detection by single-molecule kinetic fingerprinting. Methods 153, 3–12 (2019).

  55. Fix, E. & Hodges, J. L. Discriminatory analysis. Nonparametric discrimination: consistency properties. Int. Stat. Rev. 57, 238–247 (1989).

  56. Zosel, F., Soranno, A., Buholzer, K. J., Nettels, D. & Schuler, B. Depletion interactions modulate the binding between disordered proteins in crowded environments. Proc. Natl Acad. Sci. USA 117, 13480–13489 (2020).

Acknowledgements

We thank J. Tang and L. Dai for manual labeling of the stepwise photobleaching data. We thank L. Dai and P. Banerjee for multiplexing SiMREPS data. We thank A. Chauvier for SiM-KARTS data. We thank Z. Li for esthetic consulting for the design of Fig. 1. We thank S. Schmid for sharing the protein HSP90 study data for our benchmark test. This work was supported by NIH grant R35 GM131922 to N.G.W.

Author information

Contributions

J.L. and L.Z. conceived the ideas and analyzed and interpreted data. L.Z. and J.L. wrote Python programs for data processing, deep learning network training and single-molecule trace simulation. J.L., L.Z., A.J.-B. and N.G.W. cowrote the paper. All authors discussed the results and edited the manuscript.

Corresponding authors

Correspondence to Jieming Li or Nils G. Walter.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks Hugo Sanabria and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available. Primary Handling Editor: Rita Strack, in collaboration with the Nature Methods team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Performance of META-SiM in various downstream tasks.

a, The area under the receiver operating characteristic curve (ROC AUC) for trace classification by META-SiM and DeepFRET compared to manual analysis. Error bars are ±1 standard deviation (s.d.) from 10 subsampling experiments. b, Representative FRET histograms based on traces curated and segmented by META-SiM versus manual analysis. c, A distribution of dwell times predicted by the model is fit with single-exponential distributions to yield transition rate constants. d, A representative confusion matrix comparing the labels from manual counting (‘True Label’) to the labels predicted by META-SiM. e, Concordance between manual counting and predictions by META-SiM that either match exactly or differ by no more than one step. f, Standard curve for T790M generated by META-SiM and HMM analysis. g,h, Evaluation of performance in trace idealization on 131 time traces for META-SiM (Fine-Tuned), and benchmarking against 14 other common analysis tools (ref. 13), on the basis of measured rate constants (g) and FRET efficiencies (h): (1) Pomegranate; (2) Tracy(HMM); (3) FRETboard; (4) Hidden-Markury; (5) SMACKS(SS); (6) SMACKS; (7) Correlation; (8) Edge finding(CK); (9) Edge finding(k-means); (10) Step finding; (11) STaSI; (12) MASH-FRET(bootstrap); (13) MASH-FRET(prob); (14) postFRET. Error bars in g and h are ±1 s.d. of the fitted rate constants and of the fitted Gaussian distribution of FRET efficiency, respectively.
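
As a rough illustration of how a dwell-time distribution yields a rate constant (panel c), the sketch below uses the maximum-likelihood estimate for a single-exponential distribution, k = 1 / mean dwell time, with the asymptotic standard error k / sqrt(n). This is a stand-in chosen for simplicity: the legend does not say which fitting procedure the authors used, the function name is hypothetical, and the sketch ignores the finite frame time and observation window.

```python
import math


def fit_rate_constant(dwell_times):
    """Maximum-likelihood rate for p(t) = k * exp(-k * t).

    dwell_times: positive durations in seconds.
    Returns (k, standard_error), both in 1/seconds.
    Hypothetical helper, not the authors' fitting routine.
    """
    n = len(dwell_times)
    if n == 0:
        raise ValueError("need at least one dwell time")
    mean_dwell = sum(dwell_times) / n
    k = 1.0 / mean_dwell            # MLE of the exponential rate
    std_err = k / math.sqrt(n)      # asymptotic standard error of the MLE
    return k, std_err


# Example: dwell times (s) collected from idealized traces of one state.
example_dwells = [0.4, 0.6, 0.5, 0.3, 0.7]
rate, rate_err = fit_rate_constant(example_dwells)
```

For the example list the mean dwell time is 0.5 s, so the estimated rate is 2 per second. With experimental data, very short dwells near the frame time and dwells truncated by photobleaching would bias this estimate and need separate treatment.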

Extended Data Fig. 2 Evaluation of UMAP projection on both simulated and experimental data.

a–c, 2D UMAP projections of simulated traces with varying high-FRET value (a), number of FRET states (b), and number of photobleaching steps (c). d–g, 2D UMAP projections of traces from datasets D2 (d; ref. 47), D3 (e; ref. 48), D4 (f; ref. 35) and D5 (g; ref. 49) that were manually accepted (red) or rejected (blue) for further analysis.

Extended Data Fig. 3 Varying principal projections from the same dataset.

a–c, 2D UMAP projections of dataset D4 (ref. 35) based on different attributes: kinetic rate (a), photobleaching steps (b), donor and acceptor fluorophore lifetime prior to photobleaching (c). d–f, 2D UMAP projections of dataset D7 (ref. 50) based on different attributes: single-channel kinetic rate (d), single-channel photobleaching steps (e), single-channel SNR (f). g,h, 2D UMAP projections of dataset D6 based on different attributes: single-channel kinetic rate (g), single-channel SNR (h).

Extended Data Fig. 4 Different annotations of the smFRET Atlas.

a, An smFRET Atlas constructed with 22,000 traces derived from simulation. b–f, The same Atlas with only the following traces plotted: low-SNR traces (b), traces with a single FRET state (c), traces with three FRET states (d), traces with two FRET states and slow transitions (e), or traces with two FRET states and fast transitions (f). Codes for cluster names (1-c-l, etc.) are listed in Supplementary Table 3.

Extended Data Fig. 5 Titration of KCl into the paused transcriptional elongation complex system (D4).

a,b, 2D UMAP projections of embeddings from uncurated traces under different KCl concentrations in Atlas coordinates (a) and system-specific coordinates (b). c, TODPs of traces from the different KCl concentration conditions. d, FRET histogram of traces from the different KCl concentrations. e,f, 2D UMAP projections of the 10% of embeddings with lowest LSE from traces under the different KCl concentrations in Atlas coordinates (e) and system-specific coordinates (f). g, TODPs of 10% lowest LSE traces from the different KCl concentration conditions. h, FRET histograms of the 10% of traces with lowest LSE from the different KCl concentrations. i, FRET histograms of manually curated traces from the different KCl concentrations.

Extended Data Fig. 6 Single-molecule FRET characterization of the effects of the A577I mutation on the conformation of yeast Hsp90.

a–c, The 2D UMAP projections of trace embeddings from the different experimental conditions in Atlas coordinates (a) and system-specific coordinates (b) and (c). d, FRET histograms of all traces from the different experimental conditions. e, 2D UMAP projections of the 20% of traces with lowest LSE under the different experimental conditions in system-specific coordinates. f, FRET histograms of the 20% of traces with lowest LSE from the different experimental conditions, more strongly highlighting the conclusion from the original study (ref. 44) that, in the presence of ATP, both the A577I/A577I homodimer (5th column) and the A577I/wild-type (wt) heterodimer (6th column) lead to a shift toward closed conformations, corresponding to high-FRET states.

Extended Data Fig. 7 Single-molecule FRET characterization of the effects of macromolecular crowding by Ficoll 400 on the conformation of yeast Hsp90.

a–c, The 2D UMAP projections of trace embeddings from the different experimental conditions in Atlas coordinates (a) and system-specific coordinates (b) and (c). d, FRET histograms of traces from the different experimental conditions. e, 2D UMAP projections of the 20% of traces with lowest LSE under the different experimental conditions in system-specific coordinates. f, FRET histograms of the 20% of traces with lowest LSE from the different experimental conditions, highlighting the conclusion from the original study (ref. 8) that crowding by Ficoll 400 at increasing concentrations (1st, 2nd, 3rd columns) leads to progressively greater shifts toward closed conformations, corresponding to high-FRET states. While such a shift is present for sucrose at 30%, it is much less pronounced than for Ficoll 400.

Extended Data Fig. 8 Single-molecule FRET characterization of the effects of cochaperone Aha1 on the conformation of yeast Hsp90.

a–c, The 2D UMAP projections of trace embeddings from the different experimental conditions in Atlas coordinates (a) and system-specific coordinates (b) and (c). d, FRET histograms of traces from the different experimental conditions. e, 2D UMAP projections of the 20% of traces with lowest LSE under the different experimental conditions in system-specific coordinates. f, FRET histograms of the 10% of traces with lowest LSE from the different experimental conditions, more strongly highlighting the conclusion from the original study (ref. 44) that the presence of cochaperone Aha1 (labeled as ‘(+)Aha1+ATP’) leads to a shift toward closed conformations, corresponding to high-FRET states.

Extended Data Fig. 9 Full smFRET characterization of the yeast pre-mRNA splicing pathway.

a, Diagram of the splicing pathway. Experimental conditions used to block any further progress beyond specific steps in the pathway are annotated in orange font (ref. 36). b,c, 2D UMAP projections of trace embeddings from the different experimental conditions in Atlas coordinates (b) and system-specific coordinates (c). d, TODPs of traces from the different experimental conditions. e, FRET histograms of traces from the different experimental conditions. f,g, 2D UMAP projections of the 10% of traces with lowest LSE under the different experimental conditions in Atlas coordinates (f) and system-specific coordinates (g). h, TODPs of the 10% of traces with lowest LSE from the different experimental conditions. i, FRET histogram of the 10% of traces with lowest LSE from the different experimental conditions.

Extended Data Fig. 10 Distribution of the 10% of traces with lowest LSE across the different experimental conditions in the splicing study.

The total count and the fraction of traces from each experimental dataset that fall among those with the 10% lowest LSE values differ across conditions, indicating that certain conditions exhibit a larger fraction of highly condition-specific traces.
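
The per-condition tally described in this legend can be sketched as follows: rank all traces by LSE, keep the lowest 10% overall, then report for each condition how many of its traces were kept and what fraction of that condition they represent. The function name, the rounding of the cutoff and the toy values are assumptions for illustration only and do not come from the study.

```python
from collections import Counter


def lowest_lse_share(conditions, lse_values, fraction=0.10):
    """Count and fraction of each condition's traces among the lowest-LSE traces.

    conditions: condition label per trace.
    lse_values: LSE value per trace (same order as conditions).
    Returns {condition: (count_selected, count_selected / total_in_condition)}.
    Hypothetical helper; the cutoff rounding rule is an assumption.
    """
    n_select = max(1, round(fraction * len(lse_values)))
    # Indices of traces sorted from lowest to highest LSE.
    order = sorted(range(len(lse_values)), key=lambda i: lse_values[i])
    selected = order[:n_select]
    totals = Counter(conditions)
    hits = Counter(conditions[i] for i in selected)
    return {c: (hits[c], hits[c] / totals[c]) for c in totals}


# Toy data: 10 traces per condition; two "mut" traces have very low LSE.
toy_conditions = ["wt"] * 10 + ["mut"] * 10
toy_lse = [1.0] * 10 + [0.0, 0.1] + [0.9] * 8
share = lowest_lse_share(toy_conditions, toy_lse)
```

Here both selected traces come from the "mut" condition, so that condition contributes 20% of its traces to the low-LSE set and "wt" contributes none, mirroring the kind of imbalance the figure reports.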

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, J., Zhang, L., Johnson-Buck, A. et al. Foundation model for efficient biological discovery in single-molecule time traces. Nat Methods (2025). https://doi.org/10.1038/s41592-025-02839-4

  • DOI: https://doi.org/10.1038/s41592-025-02839-4
