Abstract
The Network for Advanced NMR (NAN) is a novel distributed resource that connects Nuclear Magnetic Resonance (NMR) facilities via a scalable cyberinfrastructure supporting NMR data harvesting, interactive data management, and the discovery of instruments, methods, and data to enable emerging data standards in biomedicine, chemistry, and materials science. Anchored by the first open-access 1.1 GHz instruments in the USA, NAN integrates NMR facilities around a centralized hub for identity management, resource discovery, and access control. The system includes automated data harvesting through the NAN data transport system (NDTS), metadata-rich data archiving, and interactive web-based tools for data and metadata browsing, editing, and publishing, as well as tools for facility and laboratory data management by facility managers and principal investigators. NAN knowledgebases provide vetted, standardized pulse programs, protocols, parameters, and example datasets, along with processed data. Supported by the US National Science Foundation Mid-scale Research Infrastructure program, NAN helps to democratize access to NMR resources and fosters open, reproducible science.
Data availability
The NAN resource is available as a web portal (https://usnan.nmrhub.org), which provides interactive access to the data browser, sample browser, and associated management tools. All datasets originate from community users of NAN and are automatically harvested from NAN nodes and archived with rich metadata in the repository (see Data Harvesting). Publicly available datasets can be searched, filtered, and downloaded through the NAN data browser (see Data and Sample Browsers). Users without a NAN account can access all public datasets via the “Public & Knowledgebase Datasets” view located on the Resource Connector (https://usnan.nmrhub.org/resource-connector/public-datasets). Datasets become public three years after harvesting unless released earlier by the investigator. Immutable published versions are assigned persistent identifiers and distributed under a Creative Commons Attribution (CC BY) license to support citation and reuse (see Publishing & Public Data). Access to embargoed, non-public datasets is governed by PI-controlled permissions (see Accounts & Permissions). In addition to the interactive web portal, NAN provides programmatic access through a Python software development kit (SDK) and a RESTful API, enabling automated queries, dataset downloads, and integration into external workflows.
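The embargo rule stated above (public three years after harvesting, or earlier if the investigator releases the data) can be illustrated with a minimal sketch. It uses only the Python standard library and is not part of the NAN SDK or API; the function names `public_release_date` and `is_public` are hypothetical, and the handling of a 29 February harvest date is an assumption made for the example.

```python
from datetime import date
from typing import Optional

# From the text: datasets become public three years after harvesting.
EMBARGO_YEARS = 3


def public_release_date(harvested: date, released_early: Optional[date] = None) -> date:
    """Return the date on which a dataset would become public (hypothetical helper)."""
    try:
        default = harvested.replace(year=harvested.year + EMBARGO_YEARS)
    except ValueError:
        # Harvested on 29 February and the target year is not a leap year.
        # Assumption for this sketch: fall back to 28 February.
        default = harvested.replace(year=harvested.year + EMBARGO_YEARS, day=28)
    if released_early is not None:
        # An earlier investigator release shortens the embargo; a later one has no effect.
        return min(default, released_early)
    return default


def is_public(harvested: date, today: date, released_early: Optional[date] = None) -> bool:
    """True if the dataset is public on the given day under the rule above."""
    return today >= public_release_date(harvested, released_early)


if __name__ == "__main__":
    harvested = date(2022, 5, 1)
    print(public_release_date(harvested))                   # default embargo end
    print(is_public(harvested, date(2024, 1, 1)))           # still embargoed
    print(is_public(harvested, date(2024, 1, 1),
                    released_early=date(2023, 6, 1)))       # released early by the PI
```

In practice the authoritative release status is whatever the NAN portal reports; this sketch only makes the stated timing rule concrete.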
Code availability
As a research infrastructure system, most NAN software has little relevance for individual investigators and is therefore not released publicly. The exception is the Python SDK, which provides programmatic access to the system and is available on GitHub (https://github.com/NanNMR/PythonSDK) and on the Python Package Index (https://pypi.org/project/usnan/). Internal components, such as the NDTS spectrometer components, gateway software, receiver and parser services, and back-end APIs, are restricted for security reasons but may be made available to responsible research organizations subject to conditions. Organizations interested in deploying an instance of the NAN cyberinfrastructure should contact the corresponding author.
Acknowledgements
NAN is supported by the U.S. National Science Foundation (NSF) through the Mid-scale Research Infrastructure-2 program (Grant Number 1946970) and an Operations and Maintenance grant for sustaining the NAN cyberinfrastructure (Grant Number 2529058).
Author information
Contributions
Conceptualization of NDTS and the NAN portal was carried out by M.W.M., J.C.H. & K.H.W. Technical requirements were defined by M.W.M., J.C.H., K.H.W., Y.P., S.T., M.R.G., M.P.W., I.I.M. & A.L.P. NDTS and web-portal software components were developed by C.B., J.R.W., G.W. & H.B. Validation and testing were performed by M.W.M., J.C.H., K.H.W., Y.P., S.T., M.R.G., A.P., B.S., L.M., J.N.G., M.U., A.E., A.E.M., J.H.G., A.L.P., S.W., P.R.P. & B.H.V. The original draft was prepared by M.W.M., J.C.H., C.B. & K.H.W., with all authors contributing to reviewing and editing. Project administration was undertaken by Q.C., J.C.H., M.W.M., K.H.W., A.S.E., C.M.R., L.M. & B.H.V.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Hoch, J.C., Henzler-Wildman, K., Edison, A.S. et al. Scalable cyberinfrastructure for experimental NMR data. Sci Data (2025). https://doi.org/10.1038/s41597-025-06446-y
DOI: https://doi.org/10.1038/s41597-025-06446-y


