Innovative molecular networking analysis of steroids and characterisation of the urinary steroidome

Chen, Ting; Massias, Justine; Bertrand, Samuel; Guitton, Yann; Le Bizec, Bruno; Dervilly, Gaud

doi:10.1038/s41597-024-03599-0

Data Descriptor
Open access
Published: 24 July 2024

Scientific Data volume 11, Article number: 818 (2024)

Abstract

Steroids are cholesterol-derived biomolecules that play an essential role in biological processes. Their use as growth promoters in animals is strictly regulated worldwide. Targeted assays are the conventional means of monitoring steroid abuse, but they have a key limitation: they detect only known metabolites. Metabolism also produces many potential compounds, including isomers, which complicates the analysis. To overcome these limitations, non-targeted analysis offers new opportunities for a deeper understanding of metabolites related to steroid metabolism. Molecular networking (MN) is an innovative strategy that combines high-resolution mass spectrometry with specific data processing to study metabolic pathways. In the present study, two databases and networks of steroids were constructed to lay the foundations for implementing the GNPS-MN approach. Steroids of the same family were grouped together, and nandrolone and testosterone were linked to other analogues. The networks and associated databases were then applied to a few urine samples to demonstrate their annotation capacity in steroidome studies. The results show that the MN strategy can be used to study steroid metabolism and highlight biomarkers.

Background & Summary

Steroid hormones are non-polar compounds with a perhydro-1,2-cyclopentanophenanthrene ring system. Based on their specific chemical structures, they can be divided into androgens (C19), estrogens (C18), glucocorticoids (C21), and mineralocorticoids (C21) (Fig. 1). Endogenous steroids, e.g. testosterone, are naturally produced by the organism and play a significant role in physiological activities such as physical development, metabolic homeostasis, and sexual maturation. For specific applications (e.g. medical or veterinary), certain steroids are chemically synthesized, either as the same structures as those naturally produced or as strictly exogenous variants^1,2,3,4. In particular, endogenous and exogenous steroids have been used in animal husbandry as anabolic agents with benefits on growth over the past 60 years. The steroids of interest in breeding are mainly androgens (e.g. testosterone, nandrolone, boldenone) or estrogens such as estradiol. The presence of residues of these substances in foodstuffs from treated animals raises health questions for the consumer, since steroids are endocrine disruptors^5,6,7,8,9. For these reasons, and on the basis of risk assessments, notably those carried out at the global level by the Joint FAO/WHO Expert Committee on Food Additives (JECFA), measures have been taken in various regions of the world, going as far as banning their use, in Europe in particular. JECFA has defined an Acceptable Daily Intake (ADI) of 0–2 µg/kg bw/d for testosterone and 0–0.05 µg/kg bw/d for estradiol (52nd JECFA, 1999), as well as Maximum Residue Limits (MRL) in certain animal tissues. To comply with these provisions, control measures based in particular on analytical strategies are put in place by analytical laboratories to monitor the presence of steroids or their metabolites in animal products^10,11.

Conventional analytical strategies to monitor steroids are targeted analyses relying on Gas Chromatography Mass Spectrometry (GC-MS) or Liquid Chromatography Mass Spectrometry (LC-MS). The targeted methods can monitor the parent compounds or their direct metabolites, e.g. phase I or phase II metabolized compounds^5,7,12,13,14,15,16,17,18. Although these strategies are very effective, they come up against certain limits, such as the administration of unknown synthetic steroids or the use of low-dose steroid cocktails. These two doping strategies put at risk the selectivity and sensitivity of current approaches. It is in this context, and to guarantee a high level of food safety, that new strategies to optimize control performance are being sought^19,20. It is known that the administration of steroids leads to an overall modification of the urinary steroidome, which is composed of hundreds of steroidal compounds whose profiles (qualitative and quantitative) are inter-correlated. It is therefore of interest to explore this specific chemical space more comprehensively in order to reveal new biomarker candidates, so as to detect a greater number of compounds and their effects over a wider detection window^21,22. As steroids represent a large family of compounds and their metabolites are just as numerous, the administration of one steroid has consequences on the profile of others. The main challenge is still to improve detection performance in order to find more new metabolites related to steroid metabolism. Hence, a global strategy for studying the relationships between many steroids from different families is of great interest.

Innovative approaches based on tandem mass spectrometry data were recently developed to point out structural similarity among detected ions. One such methodology is Molecular Networking (MN), which has become a popular tool in Tandem Mass Spectrometry (MS/MS)-based metabolomics studies^23,24,25,26. As MN organizes chemically similar compounds into clusters and reveals the relationships between molecules, such as a shared sub-structure between metabolites, it can facilitate the recognition of patterns at a chemical family level and also enhance the structural characterization of multiple connected metabolites. The workflow of MN is shown in Fig. 2. Global Natural Product Social Molecular Networking (GNPS, https://gnps.ucsd.edu), a platform that serves as an MN data analysis tool, a data repository and a spectral library resource, is currently the only online infrastructure that provides MN^25,27,28,29,30. Recently, GNPS-MN has become an essential toolbox for metabolomics-type studies. It compares pairs of MS/MS spectra based on their similarities and connects them to MS/MS reference spectral libraries. MN then allows further propagation of annotations via mass spectral relationships. Visualization of molecular networks in GNPS represents each spectrum as a node, with spectrum-to-spectrum alignments as edges between nodes. The clusters generated in the molecular networks can thus bring together metabolites belonging to the same chemical families or sharing closely related chemical structures, which could apply to steroids and help to study new steroid metabolites. GNPS-MN was first introduced to analyse LC-MS data using a method referred to as classical molecular networking. Feature-Based Molecular Networking (FBMN) was then developed to improve the classical approach by incorporating MS1 information, which can distinguish isomers that might otherwise have remained hidden and facilitates spectral annotation^29,31. Subsequently, GNPS-MN was extended to GC-MS data, with a workflow including GC-MS deconvolution, alignment and mass spectral library matching²⁷.
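To make the node-and-edge idea concrete, the following is a minimal sketch of how pairwise spectral similarity can be turned into network edges. It uses a plain greedy cosine score on raw intensities, which is a simplification and not the scoring implemented by GNPS. The spectra, node names, the 0.02 Da tolerance and the 0.5 threshold used here are illustrative assumptions.

```python
import math

def cosine_score(spec_a, spec_b, tol=0.02):
    """Greedy cosine similarity between two centroided spectra.

    Each spectrum is a list of (m/z, intensity) pairs. Two peaks match when
    their m/z values differ by at most `tol` Da; each peak is used once.
    """
    # All candidate peak matches, strongest intensity product first.
    candidates = sorted(
        (
            (ia * ib, i, j)
            for i, (mza, ia) in enumerate(spec_a)
            for j, (mzb, ib) in enumerate(spec_b)
            if abs(mza - mzb) <= tol
        ),
        reverse=True,
    )
    used_a, used_b, dot = set(), set(), 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical spectra: node_1 and node_2 share two fragments, node_3 shares none.
spectra = {
    "node_1": [(97.065, 100.0), (109.065, 80.0), (253.195, 40.0)],
    "node_2": [(97.066, 90.0), (109.064, 85.0), (271.205, 30.0)],
    "node_3": [(145.101, 100.0), (159.117, 60.0)],
}
names = sorted(spectra)
edges = [
    (a, b, cosine_score(spectra[a], spectra[b]))
    for idx, a in enumerate(names)
    for b in names[idx + 1:]
]
# Keep only pairs above the (assumed) minimum cosine score.
edges = [e for e in edges if e[2] >= 0.5]
```

In this toy case only the pair sharing fragments is connected, which is the behaviour that lets structurally related steroids fall into the same cluster.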

The MN strategy thus appears to be an interesting approach for globally studying compounds such as steroids, whose chemical structures are close since they share a parent hydrocarbon structure derived from cholesterol³². Steroids are therefore expected to provide similar fragmentation patterns that may be used to create networks using an MN approach. To our knowledge, no research has yet been carried out on urinary steroid profiles based on an MN strategy, although MN carried out on GNPS has shown good application prospects in metabolomics studies. The aim of this study was therefore to explore the MN-based strategy to enable further exploration of steroid metabolism. The first step in such a study is to construct a large (n > 120 steroids) database forming a steroidal network before applying it to samples of interest.

Methods

Reference substances and preparation

All reference steroids (n = 88) were acquired from Steraloids (Newport, RI, USA), including epi-19-nortestosterone, 19-nortestosterone, 19-norandrosterone, 19-noretiocholanolone, 19-nortestosterone sulfate, estradiol, testosterone, androstenedione, 5β-estran-3α,17α-diol, 5α-estran-3β,17α-diol, 5α-estran-3α,17β-diol, cortisone, dexamethasone, prednisolone, etc. Each steroid stock solution was prepared at 100 µg/mL in ethanol. The working standard solutions were prepared by diluting the stock standard solutions in methanol. For LC-HRMS analysis, the 88 steroid reference compounds were prepared in a working solution at 10 µg/mL, and the injection volume was 5 µL. For GC-HRMS analysis, 27 steroid reference compounds were prepared in a working solution at 10 µg/mL. These 27 steroids were then derivatized for 40 min at 60 °C with 20 µL MSTFA-TMIS-DTE (1000:5:5, v/v/w). Afterwards, the compounds were ready for injection. All solutions were stored in amber glass vials at −20 °C.
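The dilution and injection figures above can be checked with a short calculation. This is only a worked sketch: the 1 mL final volume is a hypothetical value chosen for illustration, since the text does not state the volume of working solution prepared.

```python
def stock_volume_ul(stock_ug_per_ml, target_ug_per_ml, final_volume_ul):
    """Volume of stock (µL) needed for a dilution, from C1 * V1 = C2 * V2."""
    return target_ug_per_ml * final_volume_ul / stock_ug_per_ml

def mass_injected_ng(conc_ug_per_ml, injection_ul):
    """Mass injected in ng: 1 µg/mL equals 1 ng/µL, so multiply by the volume."""
    return conc_ug_per_ml * injection_ul

# 100 µg/mL stock diluted to a 10 µg/mL working solution
# (final volume of 1000 µL is an assumption, not taken from the text).
stock_ul = stock_volume_ul(100.0, 10.0, 1000.0)

# 5 µL of the 10 µg/mL working solution injected for LC-HRMS.
on_column_ng = mass_injected_ng(10.0, 5.0)
```

Under these assumptions a tenfold dilution is needed, and each LC injection carries 50 ng of a given steroid.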

Urine samples and preparation

The urine samples were collected from bovines treated with steroids and were already stored at −20 °C in the laboratory biobank. Bovine urine samples were prepared as follows, according to previous publications^14,33. Briefly, 10 ng of 17β-nandrolone-d₃ and 17β-estradiol-d₃, 1 mL of acetate buffer (2 M, pH 5.2) and 200 µL of β-glucuronidase from purified Helix pomatia (Sigma-Aldrich, St. Quentin Fallavier, France) were added to 10 mL of urine. Hydrolysis was performed over 15 h at 52 °C. Urine samples were centrifuged (10 min, 1000 × g) before purification on SPE Envi-ChromP. The cartridges were conditioned with 6 mL ethyl acetate, 6 mL methanol and 6 mL water. The extract was applied to the column. The phase was washed with 3 mL water and then 2 mL hexane. High vacuum was applied before and after each washing. Steroids were eluted with 14 mL hexane/diethyl ether (70:30, v/v), and the eluate was evaporated to dryness under a gentle stream of nitrogen. After hydrolysis, 0.5 mL of 1 M NaOH was added. Liquid/liquid extraction in alkaline medium was performed with 4 mL hexane/diethyl ether (70:30, v/v). The extract was centrifuged for 1 min at 700 × g. The supernatant was applied onto an SPE silica column conditioned with 6 mL hexane. The column was washed with 3 mL hexane/ethyl acetate (75:25, v/v) and then 8 mL hexane/ethyl acetate (85:15, v/v). The analytes were eluted with 20 mL hexane/ethyl acetate (60:40, v/v), and the eluate was evaporated to dryness under a gentle stream of nitrogen. After that, 10 ng norgestrel (Sigma-Aldrich) was added as an external standard. Finally, the urine samples were derivatized for 40 min at 60 °C with 20 µL MSTFA-TMIS-DTE (1000:5:5, v/v/w). Two microlitres of each sample were injected into the GC-Q-Exactive.

Chemicals and reagents

All reagents and solvents used in this study were of analytical grade unless otherwise specified. Acetonitrile was purchased from Honeywell Chromasolv (Bucharest, Romania). Formic acid (eluent additive for LC-MS) was acquired from LGC Standards GmbH (Wesel, Germany). Isotope-labeled internal standards, namely L-leucine-5,5,5-d₃, L-tryptophan-2,3,3-d₃, indole-2,4,5,6,7-d₅-3-acetic acid, and 1,14-tetradecanedioic-d₂₄ acid, were purchased from Sigma-Aldrich (Saint Quentin Fallavier, France). MSCAL6 Proteo Mass LTQ/FT-Hybrid standard mixtures, used for calibration of the MS instrument (positive and negative ionization modes), were obtained from Sigma-Aldrich (Saint Quentin Fallavier, France). Ultra-pure water was acquired from VWR (Fontenay-sous-Bois, France). Envi-ChromP and silica (0.5 g and 1 g stationary phase, respectively) solid-phase extraction (SPE) cartridges were from Carlo Erba Réactifs SDS (Val de Reuil, France). Derivatisation reagents N-methyl-N-(trimethylsilyl)-trifluoroacetamide (MSTFA), dithio-threitol (DTE) and trimethyliodosilane (TMIS) were purchased from Acros Organics (Geel, Belgium). The internal standards used were 17β-nandrolone-d₃ (17β-NT-d₃) and 17β-estradiol-d₃ (17β-E2-d₃) from NARL reference materials (Pymble, Australia).

Data acquisition

The 88 steroid reference compounds were characterized on LC-Q–Exactive, and the analysis was performed according to previously reported LC-based methods for steroid analysis^20,33,34. The chromatographic separation was performed on an Acquity UPLC® System from Waters® with a C18 column (Acquity UPLC® BEH C18, 2.1 × 100 mm, 1.7 µm; Waters®). Separations were carried out at 50 °C under gradient elution conditions. Elution solvents were 0.1% formic acid in water (A) and 0.1% formic acid in acetonitrile (B). The mobile phase was supplied at a flow rate of 0.6 mL/min. Two gradient programs were used for analysing different kinds of steroids: one for characterizing steroid ester and progestogen standards, and another for characterizing androgens, estrones and others. The elution gradient (A/B, v/v) for androgen and estrone characterization was as follows: 95:5 between 0 and 0.3 min, 57:46 at 9.6 min, 0:100 from 13.5 to 15.5 min, and 95:5 from 16 to 19.5 min. The elution gradient (A/B, v/v) for characterizing steroid ester and progestogen standards was as follows: 50:50 from 0 to 2 min, 90:10 at 9.6 min, 0:100 from 13.5 to 15.5 min, and 50:50 from 16 to 19.5 min. The acquisition was performed on a Q–Exactive system (Thermo Fisher Scientific, Bremen, Germany) in positive and negative electrospray ionization mode (ESI+/−). The spectrometric parameters were as follows: spray voltage 3.0 kV, capillary temperature 350 °C, sheath gas flow rate 55 AU, gas flow rate 10 AU, and column oven temperature 50 °C. The MS settings were as follows: full scan mass spectra were acquired from 66.7 to 1000 m/z with a mass resolution of 70 000 in centroid mode, a maximum inject time of 200 ms, an AGC target of 1e6 and a runtime of 0 to 19.5 min. The integrated Xcalibur software version 4.1 was used for data acquisition and analysis.
Background ions can be evaluated and put onto an “exclusion list” to increase the likelihood of obtaining MS2 spectra. Additionally, a separate injection acquired by data-dependent MS2 (DDMS2) for a set of precursor ions (exclusion lists) in positive or negative mode was performed to collect more diagnostic MS2 spectra. The parameters of DDMS2 were as follows: resolution 17 500, AGC target 1e5, maximum inject time 60 ms, loop count 5, isolation window 1.0 m/z, collision energy 10, 30, 60; dd settings (minimum AGC target 6.00e3, intensity threshold 1.0e5, charge exclusion 3–8, >8, dynamic exclusion 3.0 s).

Twenty-seven steroid reference compounds were also characterized using a GC-EI-Q–Exactive, according to analytical parameters already reported^15,35,36,37. For this purpose, 1 µL of derivatized steroid standards was injected into the GC injector at 250 °C by the robotic arm of a TriPlus™ RSH autosampler (Thermo Scientific™, Bremen, Germany). A flow rate of 5 mL/min of helium in split flow (20.0 mL/min) was applied. The chromatographic separation was performed on a TRACE™ 1310 gas chromatography instrument (Thermo Scientific™, Austin, TX, USA). Helium carrier gas at a flow rate of 1 mL/min was applied for the separation on an OPTIMA 5-MS Accent column (30 m length × 0.25 mm i.d., 0.25 μm, SN 1205365). The temperature gradient program was set as follows: the temperature was initially held at 120 °C for 2 min, a 15 °C/min ramp rate was applied up to 320 °C, the final temperature of 320 °C was held for 13 min, and the total run time was 20 min. Eluting peaks were transferred through an auxiliary transfer line at 320 °C into a Q Exactive™-GC mass spectrometer (Thermo Scientific™, Bremen, Germany). Electron ionization (EI) at 70 eV energy and an emission current of 50 µA was used as the ionization mode, with an ion source temperature of 300 °C. Data were acquired in full scan mode and PRM mode with a mass range of m/z 66.7–1000 and a resolving power of 120 000 at m/z 200. The AGC target was set at 5 × 10⁵ ions with an automatic filling limit and a maximum IT of 500 ms. The instrument was controlled by Xcalibur software version 4.1.

The urine samples were analysed under the same conditions as those applied for the characterisation of steroid reference compounds on the GC-HRMS platform.

Database constitution

A database of the 88 steroid reference compounds acquired on LC-Q-Exactive was generated, including molecular formula, exact mass, retention time and detected ions in positive and negative modes. Xcalibur version 4.3 software was used to integrate chromatographic peaks acquired in full scan mode with a mass tolerance of 5 ppm. The workflow relies on a combination of open bioinformatics tools, including MZmine (https://github.com/mzmine/mzmine2/releases)^38,39,40, Cytoscape (Version 3.8, https://cytoscape.org/)⁴¹ and the GNPS web platform²⁸; the workflow for database and molecular network constitution is shown in Fig. 3. Statistical analysis in MS-based metabolomics studies is done predominantly on MS1-based peaks with a specific accurate mass-to-charge (m/z) ratio, described as features. Feature-based molecular networking (FBMN) links MS1 intensities derived from LC-MS features with MS2 information from molecular networking, bridging the gap between MS1 abundance and MS2 qualitative information that exists in classical molecular networking⁴². The FBMN job was therefore performed on the GNPS platform (https://gnps.ucsd.edu). The database constitution procedure for the 88 steroid standards acquired on LC-Q–Exactive was as follows. Firstly, the 88 standard files in .raw format were converted to centroided .mzML format using MSconvert (version 3.0.20248, http://proteowizard.sourceforge.net/downloads.shtml)⁴³. Secondly, a feature detection and alignment tool, MZmine (version 2.53), was used to process the data, which allowed the annotation of steroid isomers⁴⁰. This step detected the spectral features across the chromatographic fractions; Targeted Feature Detection was used to detect all the features.
The parameters were as follows: mass detector = Centroid, intensity tolerance = 50, m/z tolerance = 0.05 & 10.0 ppm, retention time tolerance = 2.0 min, algorithm = Wavelets (ADAP), m/z centre calculation = MEDIAN, m/z range for MS2 scan pairing = 0.1 Da, RT range for MS2 scan pairing = 0.2 min, S/N threshold = 10, S/N estimator = intensity window SN, feature height = 1 min, area threshold = 15, peak duration range = 0.02–1.0 min, RT wavelet range = 0.01–0.02, m/z tolerance = 0.001 & 5.0 ppm, retention time tolerance = 0.05 min. Two files were then exported: a feature quantification table (.CSV format) and an MS2 spectral summary (.MGF format). The feature quantification table contains information about the features of the 88 steroid standards, including retention time, intensity, m/z value and feature ID (unique identifier) for each feature. The MS2 spectral summary contains a list of MS2 spectra, with one representative MS2 spectrum per feature. Thirdly, the molecular network was generated by analysing the MS/MS data on the GNPS web platform²⁸. The parameters of FBMN were as follows: precursor ion mass tolerance = 0.02 Da, fragment ion mass tolerance = 0.02 Da, minimum cosine score = 0.5, and maximum analogue search mass difference = 100. The analogue search mode was used by searching against MS/MS spectra with a maximum difference of 100 in the precursor ion value. The resulting network was filtered based on edges: an edge between two nodes was retained in the network only if each node appeared in the respective top 10 most similar nodes of the other node. The spectra in the network were compared against GNPS spectral libraries and the library built in this study. The molecular network job is available on the GNPS platform: https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=1b18a9f9c1e649e59deb186ffa020c49. Finally, visualization and analysis of GNPS-generated molecular networks can be performed in Cytoscape (Version 3.8), open-source software for visualizing complex networks⁴¹.
All the dereplicated steroid standards were assigned within the network based on LC-Q-Exactive, MS2 and retention time matching. Adduct connection was manually introduced in the network to provide an ion identity molecular network⁴⁴.
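The edge-filtering rule described above (an edge is kept only if each node is among the other node's top 10 most similar nodes) can be sketched as follows. This is an illustrative reading of the rule, not the GNPS implementation; the node names and scores are hypothetical, and k = 1 is used in the example only to keep it small.

```python
from collections import defaultdict

def mutual_top_k(edges, k=10):
    """Keep an edge only if each node is in the other's top-k neighbours.

    `edges` is a list of (node_a, node_b, cosine) tuples, one per node pair.
    """
    neighbours = defaultdict(list)
    for a, b, score in edges:
        neighbours[a].append((score, b))
        neighbours[b].append((score, a))
    # For every node, the set of its k most similar neighbours.
    top = {
        node: {other for _, other in sorted(pairs, reverse=True)[:k]}
        for node, pairs in neighbours.items()
    }
    return [(a, b, s) for a, b, s in edges if b in top[a] and a in top[b]]

# Hypothetical candidate edges that already passed the minimum cosine score.
candidate_edges = [
    ("n1", "n2", 0.9),
    ("n2", "n3", 0.8),
    ("n1", "n3", 0.6),
]
kept = mutual_top_k(candidate_edges, k=1)
```

With k = 1, only the n1/n2 edge survives because it is the best match in both directions. With k = 10, as in the study, all three edges of this small example would be kept.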

A database was also generated for data acquired on GC-Q-Exactive, including retention time, MS, and MS2 information extracted manually using Xcalibur version 4.3. The 27 standard files in .raw format were also converted to centroided .mzML format using MSconvert (version 3.0.20248). An FTP client (WinSCP for Windows) was used to upload data files to MassIVE, a public repository. A repository-scale analysis infrastructure for GC-MS data enables the creation of networks within the GNPS molecular networking platform. The community infrastructure can be accessed at https://gnps.ucsd.edu, under the heading “GC-MS EI Data Analysis”. MSHub algorithms (https://gnps.ucsd.edu/ProteoSAFe/static/gnps-splash.jsp) were used for data processing and deconvolution. The .mgf file generated by GNPS-MSHub, which includes the deconvoluted spectra with aligned retention times and a feature table of peak areas across all files, was then searched against the public libraries and the private libraries generated in this study. The MSHUB-GC job and molecular network job are available on the GNPS platform: https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=1c23aec9a10848d3ae24c8554fd81071#; https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=6ec04b438b444203b00edc096fa11015. The molecular network was also visualized in Cytoscape (Version 3.8, https://cytoscape.org/)⁴¹.

Molecular network construction based on the LC-HRMS database

The LC-HRMS database contains the formulae, accurate mass, retention time, and MS1 and MS2 spectra of 88 reference steroid compounds, which correspond to 198 spectra because multiple adducts were detected for some compounds. These steroids eluted between 4.6 and 14.56 min on LC-Q-Exactive. Most of the steroids could be detected in positive mode, especially as [M + H]⁺, [M–2H₂O + H]⁺ and [M–H₂O + H]⁺, such as 19-nortestosterone, androsterone, estradiol and boldenone. The group of testosterone steroids could be detected only as [M + H]⁺, e.g. epi-testosterone, testosterone benzoate, testosterone decanoate and 4-chlorotestosterone. As for some androgens, the known metabolites of 17β-nandrolone could be detected as [M–H₂O + H]⁺ and [M–2H₂O + H]⁺, including estranediols, 19-norandrostenedione and 19-norandrosterone. Six steroids could be detected in both positive and negative mode: 19-nortestosterone sulfate; 19-nortestosterone glucuronide; estradiol 3-glucuronide, 17-sulfate; estradiol 17-sulfate; estradiol 3-glucuronide; and estradiol 17-glucuronide. Furthermore, one interesting point is that estradiol 3-glucuronide, 17-sulfate and estradiol 17-glucuronide can be detected only in negative mode, as [M–2H]²⁻ and [M–H]⁻, and as [M–Na]⁻ and [2M–Na]⁻, respectively. The database was then used to construct the molecular network. Figure 4 shows the molecular network of the 88 reference steroids, presented as circular nodes. The molecular network was constructed from the FBMN job performed on GNPS. This molecular network builds on the fundamental observation that structurally related molecules share fragment ion patterns when subjected to MS2 fragmentation methods. Structurally related molecules yield comparable MS2 spectra due to the commonalities in their structures and are represented by separate nodes that connect within the network via edges.
In the molecular network, each cluster can be found with the same numbering in the .mgf file, and each node has its spectral information and identification. The compound annotation of this molecular network is based on the steroid standard database generated in this study.
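As an illustration of the positive-mode ions listed above, the sketch below computes theoretical m/z values for [M + H]⁺, [M–H₂O + H]⁺ and [M–2H₂O + H]⁺ from an elemental composition, together with a ppm error of the kind used for mass-tolerance checks. Testosterone (C19H28O2) is used as the example; the "observed" value is hypothetical and the helper names are not taken from any tool used in the study.

```python
# Monoisotopic masses (u) of the elements needed here, and the proton mass.
MONO = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
PROTON = 1.00727646688
WATER = 2 * MONO["H"] + MONO["O"]

def monoisotopic_mass(composition):
    """Neutral monoisotopic mass from a {element: count} mapping."""
    return sum(MONO[element] * count for element, count in composition.items())

def positive_adducts(neutral_mass):
    """Theoretical m/z of three singly charged positive ions."""
    return {
        "[M+H]+": neutral_mass + PROTON,
        "[M-H2O+H]+": neutral_mass - WATER + PROTON,
        "[M-2H2O+H]+": neutral_mass - 2 * WATER + PROTON,
    }

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

testosterone = monoisotopic_mass({"C": 19, "H": 28, "O": 2})
adducts = positive_adducts(testosterone)

# Hypothetical observed m/z, checked against a 5 ppm tolerance.
error = ppm_error(289.2165, adducts["[M+H]+"])
within_tolerance = abs(error) <= 5.0
```

For testosterone this gives an [M + H]⁺ of about 289.2162 and an [M–H₂O + H]⁺ of about 271.2056, so a single standard can legitimately contribute several nodes to the network.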

MN allows the identification of molecular families corresponding to clusters in the network, with spectral similarity calculated through the cosine score. In this study, the steroid compounds are positioned in the global molecular network according to the relationships between their structures. Overall, the obtained molecular network tended to cluster according to steroid family. It can thus be observed in Fig. 4 that the androgens (C19 structures) in pink are grouped, the corticosteroids (C21 structures) are grouped in the blue cluster, and the estrogens (C18 structures) are grouped in the green cluster. Some steroid esters are also highlighted as purple nodes, some of which are not linked to each other because they are grouped according to the class of the parent steroid, e.g. androgen steroids or estrogen steroids. At the bottom, there are also several single nodes because of their unique adduct. Most of the steroids are detected in positive mode, especially as [M + H]⁺, and several steroids can be detected in negative mode, for example estradiol 17-glucuronide ([M–H]⁻) and estrone 3-sulfate ([2M + Na–2H]⁻). Figure 5 compares the adducts detected for all the steroids, including androgens, corticosteroids, estrogens and steroid esters. Most of the steroids were detected as [M + H]⁺, [M–2H₂O + H]⁺ and [M–H₂O + H]⁺. Illustrative chemical representatives of the discussed clusters are displayed in Fig. 6. As an example, Fig. 6A shows the androgen cluster, in which the structures and detected adducts of epi-19-nortestosterone, 19-nortestosterone, 19-norandrostenedione, epi-testosterone sulfate and epi-testosterone glucuronide are highlighted. These compounds are grouped in one cluster because their fragmentation leads to similar patterns. An interesting example is epi-testosterone glucuronide, which could be detected as four adducts: [M + H]⁺, [M–C₆H₈O₆ + H]⁺, [M–C₆H₈O₆–H₂O + H]⁺ and [M + Na]⁺.
As expected, the molecular network cluster of 19-nortestosterone allowed the dereplication of previously known metabolites, including 19-noretiocholanolone, epi-testosterone, 19-norandrostenedione and 19-norandrosterone. Two further examples are shown in Fig. 6B and 6C: estradiol 3-glucuronide, 17-sulfate and estradiol 17-glucuronide could only be detected in negative mode and are linked with estradiol 17-glucuronide in the cluster. Dexamethasone is also an interesting compound, which can be detected as different adducts and forms a cluster with other corticosteroids, such as cortisone, prednisolone, and prednisone. Molecular families are clusters of molecules whose structures are expected to be similar to each other, which offers multiple advantages for further metabolomics data analysis.

Molecular network construction based on the GC-HRMS database

The GC-HRMS database contains the formulae, retention time, SMILES, MS spectra and the molecular mass of the derivatives bearing two TMS groups for 27 reference steroids. These steroids eluted between 13.2 and 15.16 min on GC-Q-Exactive. The majority of GC-MS platforms used in metabolomic studies operate with EI ionisation, and the identification process involves, in addition to the retention time, the specificity associated with the fragmentation process, which depends on the molecular structure⁴⁵. Early MN strategies were developed focusing on LC-ESI-MS data, which differ significantly from traditional GC-EI-MS data. Recently, algorithmic auto-deconvolution of GC-MS data has enabled molecular networking with GNPS. In GNPS-based GC-MS molecular networking, an essential step of data processing is the deconvolution of the multiple signals from these highly fragmented spectra into individual analyte EI-MS spectra^27,46. In GC-MS data analysis, spectral deconvolution is the process of separating the fragment ion patterns of each eluting molecule into a composite mass spectrum. Annotation of GC-MS data is achieved by matching the deconvoluted fragmentation spectra against reference spectral libraries of known molecules. GC-EI-MS offers significantly higher chromatographic resolution and higher reproducibility in ionization, which gives good prospects for applying GC-GNPS in non-targeted approaches. Therefore, the GC-GNPS approach was used to study a set of steroids in the present study. Figure 7 depicts the molecular network generated by GC-GNPS, involving 27 steroid reference compounds characterized on GC-Q-Exactive, which was visualized in Cytoscape. Each node in the molecular network represents a unique mass spectrometry feature obtained by spectral deconvolution.
Since the precursor ion signal is absent on the mass spectrum obtained in GC-HRMS due to energetic ionization by EI and subsequent extensive fragmentation, a list of spectral matches is more likely to contain erroneous annotations (isomers or related isobars) when searching the fragmentation spectra against the GNPS GC libraries. Consequently, the annotations presented in this work were based on GNPS reference libraries and specific libraries (molecular ions, retention time and chemical ionization with MS spectra) generated as part of this study.

Because the molecular ion is absent in GC-EI data, the molecular networks in the GNPS GC-MS pipeline are created from the spectral similarity of the deconvoluted fragmentation spectra without considering the molecular ion. The clustering patterns of the cosine similarity networks in the GC-EI database are mainly determined by structural similarity. In the molecular network of Fig. 7, nandrolone and its known metabolites, e.g. 19-norandrosterone, 19-noretiocholanolone, 19-norandrostenedione and five estranediols (e.g. 5β-estrane-3α,17β-diol; 5β-estrane-3α,17α-diol), are clustered together. Steroids comprise a diverse family with similar structures, and androgen steroids share common sub-structures, which explains why they are connected and clustered together. This molecular network therefore further guides annotation at the structural similarity and family level by utilizing information from connected nodes. The proposed molecular network and database provide new insight into investigating steroid metabolomics on GC-HRMS.

LC-HRMS and GC-HRMS Molecular Networks fusion

The fusion of the LC-MS/MS and GC-MS molecular networks was performed manually based on standard compound consistency. Formally, nodes related to the same compound analyzed using both methods were manually connected.
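The fusion was performed manually in the study. Purely as an illustration of the underlying rule (connect nodes that carry the same compound annotation in both networks), the following sketch builds the cross-platform edges programmatically. The node identifiers and annotations are hypothetical.

```python
def fuse_networks(lc_nodes, gc_nodes):
    """Return cross-platform edges between nodes annotated as the same compound.

    Both arguments map a node identifier to its compound annotation.
    Each returned edge is (lc_node, gc_node, compound).
    """
    gc_by_compound = {}
    for node, compound in gc_nodes.items():
        gc_by_compound.setdefault(compound, []).append(node)
    return [
        (lc_node, gc_node, compound)
        for lc_node, compound in lc_nodes.items()
        for gc_node in gc_by_compound.get(compound, [])
    ]

# Hypothetical annotations: one compound seen as two LC adduct nodes and one GC node.
lc = {
    "LC_12": "19-nortestosterone",
    "LC_13": "19-nortestosterone",
    "LC_40": "cortisone",
}
gc = {
    "GC_3": "19-nortestosterone",
    "GC_9": "19-norandrosterone",
}
bridges = fuse_networks(lc, gc)
```

Compounds measured on only one platform, such as cortisone and 19-norandrosterone in this toy example, produce no bridging edge and remain attached to their original network only.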

LC-HRMS and GC-HRMS data sharing and reuse

The HRMS data generated and the MS/MS spectra extracted in the present study have been made public through https://doi.org/10.57745/HZPEDR. All files are public and can be opened with dedicated software (Thermo Xcalibur for .raw files, and many others for mzML files, e.g. MZmine, MS-DIAL, R). Spectral databases can be reused easily in identification software, such as MSsearch, EntropySearch or MZmine. The workflow used for data conversion and deposition is shown in Fig. 8.
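For readers who wish to reuse exported spectra outside these tools, the sketch below reads spectra from text in the generic MGF layout (BEGIN IONS / END IONS blocks with KEY=VALUE headers followed by m/z and intensity pairs). It assumes the files follow that common layout, which is not verified here against the deposited data; the example record and its values are invented for illustration.

```python
def parse_mgf(text):
    """Parse MGF-formatted text into a list of spectra.

    Each spectrum is a dict with "params" (header key/value strings)
    and "peaks" (list of (m/z, intensity) float tuples).
    """
    spectra, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Header line such as PEPMASS=289.2162
                key, value = line.split("=", 1)
                current["params"][key] = value
            else:
                # Peak line: m/z and intensity separated by whitespace
                mz, intensity = line.split()[:2]
                current["peaks"].append((float(mz), float(intensity)))
    return spectra

# Invented example record, for illustration only.
example = """BEGIN IONS
FEATURE_ID=1
PEPMASS=289.2162
RTINSECONDS=480.0
97.0648 1200.0
109.0648 950.0
END IONS
"""
records = parse_mgf(example)
```

Each parsed record keeps its header fields as strings, so values such as the precursor m/z can be converted to numbers only when needed.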

Data Records

The data set is available at [recherche.data.gouv.fr]⁴⁷, including raw data characterized on both LC-Q-Exactive and GC-Q-Exactive, mzML files of associated raw data, two batches of steroid information, and two steroids databases. It can be accessed through the website link: https://entrepot.recherche.data.gouv.fr/dataset.xhtml?persistentId=doi:10.57745/HZPEDR or (https://doi.org/10.57745/HZPEDR).

Metadata

The two databases of steroid standards acquired on the LC-Q-Exactive and the GC-Q-Exactive are recorded with a variety of details, including instrument details, chemical formula, adduct, polarity, retention time, accurate mass, MS1 and MS2 spectra, SMILES, etc. The metadata are available on recherche.data.gouv.fr at https://doi.org/10.57745/HZPEDR.

Technical Validation

Validation of database compounds

Some of the most commonly used preparations in production animals include testosterone propionate in combination with estradiol benzoate, or compounds such as 19-nortestosterone (17β-nandrolone) and boldenone. For this reason, these steroids were selected in the present study as a proof of concept to evaluate the value of molecular networks in characterizing their metabolism. To cover the broadest possible steroid space around these target compounds, we constructed a network with 88 steroid standards, including androgens, estrogens and corticosteroids, characterized by LC-HRMS. Additionally, 27 steroid standards were characterized by GC-HRMS. Figure 9 shows the number of steroid standards characterized on the two instruments and their overlap. For example, 17β-nandrolone and its known metabolites in bovine urine, e.g. epi-nandrolone, 19-noretiocholanolone, norandrostenedione and estranediols, were selected for study on both analytical platforms because their physicochemical properties make them amenable to both techniques. The steroid standards were analysed using optimized analytical methods available in the laboratory. Structures, chemical formulae, SMILES and accurate masses were retrieved from the literature and PubChem. The mass spectra, exact masses and retention times obtained were inspected to verify the annotations of the molecular networks.

Network fusion

The network fusion approach refers to the integration of multiple types of metabolomics data, such as liquid chromatography and gas chromatography mass spectrometry data, or other omics data, into a unified network. Network fusion in metabolomics offers a powerful way to integrate and analyse diverse metabolomics data sources, providing insights into metabolic pathways and biomarker discovery⁴⁸. The integration of both networks is therefore particularly innovative: it provides new insights into the metabolomics of steroids and reveals possible complementarity between the two ionisation modes from an MN perspective. Molecular networks were first generated with GNPS from the data of each instrument, based on the large dataset containing more than 120 steroid standards characterized on both LC-HRMS and GC-HRMS (Fig. 9). To go further towards a more global annotation tool for steroidome studies, the two networks were then fused. The fusion of the LC-MS/MS and GC-MS molecular networks was performed manually, based on the reference standards common to both datasets: nodes corresponding to the same compound analysed by both methods were manually connected. The fused network (Fig. 10) is based on the 22 steroid compounds, e.g. androgens and estrogens, characterized on both platforms, and the blue edges represent the connections between the LC and GC networks. This network fusion highlights the complementarity between the two data sources. For example, 17β-boldenone and 19-norandrosterone, both androgens, are not connected in the LC-HRMS/MS network, whereas they are connected in the GC-MS network. Such complementarity is clearly visible in Fig. 10, where the GC-MS (triangle) nodes are grouped in the middle of the global network, between two sub-clusters of the main androgen cluster. Because estrogens, e.g. estradiol and estranediols, are metabolites of 17β-nandrolone (an androgen)¹⁴,³³, estrogen compounds may share fragmentation patterns or mass spectral features with certain androgens and may therefore be connected within the same molecular network. This result highlights the expected differences between ESI and EI fragmentation mechanisms. Grouping data from both origins may thus add connectivity and hence interpretability.

Data integration is a current challenge that aims to bridge the gap between multi-omics approaches or between different metabolomics platforms, all of which generate vast amounts of data. The objective is to improve the global understanding of biological systems by aggregating different data sets. Although data integration makes it possible to explore a scientific question more fully, combining these types of data is a major challenge. In particular, each platform has its own characteristics, which complicates the integration process. The diversity in dataset size and format, and the lack of standardisation in data communication and storage, make it difficult to combine datasets from different suppliers and instruments. In addition, missing data, noise in the different data types, and the correspondence and correlation between measurements from different platforms can also create difficulties. Standardised protocols, advanced data processing algorithms, and new tools and software are emerging to address these challenges and facilitate the integration of data from various platforms or sources. In this study, merging the networks was facilitated by the consistency of the steroid compounds characterised on the two instruments, which also came from the same supplier, so that the data were in a similar format. The fused molecular network was constructed by adding an edge between the LC and GC nodes of each shared compound.

Database application to urine

Technical validation was also achieved by applying the database and molecular network to annotate steroids in a biological matrix. Bovine urine, a natural excretion matrix rich in steroids, was analysed by GC-HRMS. Raw data from urine samples were converted to .mzML files with MSConvert software. An FTP client (WinSCP for Windows) was used to upload the .mzML files to MassIVE. A repository-scale analysis infrastructure for GC-MS data enables the creation of networks within the GNPS molecular networking interface; this community infrastructure can be accessed at https://gnps.ucsd.edu. The GC-MS Hub platform was used for spectral deconvolution, and the .mgf file containing the deconvoluted spectra was generated by GNPS. The spectra were then submitted to the library search and matched against the GNPS public libraries and the personal libraries generated in this study. The MSHUB-GC job and library search job are available on the GNPS platform (MSHUB-GC job: https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=7737ac63afc74820b57fcc480cf26639; library search job: https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=6d0b0f1772a04da5847d0ee9a0272df7). The molecular network of the urine samples was visualized in Cytoscape.

The MN methodology was then applied to the acquired data, as shown in Fig. 11. Figure 11A illustrates the overall molecular network of the urine samples, Fig. 11B is a zoomed-in view of the subnetwork of a given cluster, and Fig. 11C compares the library spectrum of estradiol with the spectrum annotated in the urine sample. This urine molecular network was generated by GC-GNPS library search after MSHUB deconvolution. An annotation of estradiol is highlighted after searching the urine spectra against the library. The cosine similarity between the library and urine spectra of estradiol is 0.9. This result shows the potential of the approach as a new tool for annotating steroid compounds in complex matrices. It also meets expectations in terms of identification in non-targeted steroidomics studies.
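The principle of a library search such as the one behind the estradiol annotation can be sketched as follows. This is a simplified stand-in for the GNPS library search, not its actual scoring: the nominal-mass binning, the 0.7 reporting threshold and the names `to_unit_vector` and `library_search` are assumptions for illustration. A query spectrum is scored against each reference spectrum, and the best hits above the threshold are returned.

```python
import math


def to_unit_vector(peaks):
    """Bin (mz, intensity) peaks to nominal m/z and scale to unit length."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz)
        binned[key] = binned.get(key, 0.0) + intensity
    norm = math.sqrt(sum(v * v for v in binned.values()))
    if norm == 0:
        return {}
    return {k: v / norm for k, v in binned.items()}


def library_search(query_peaks, library, min_cosine=0.7, top_n=3):
    """Rank library entries by cosine score against the query spectrum.

    `library` maps a reference name to its peak list. Only hits with a
    score >= min_cosine are kept (threshold chosen for illustration).
    """
    query = to_unit_vector(query_peaks)
    hits = []
    for name, ref_peaks in library.items():
        ref = to_unit_vector(ref_peaks)
        # Dot product of two unit vectors equals their cosine similarity.
        score = sum(query[k] * ref[k] for k in query.keys() & ref.keys())
        if score >= min_cosine:
            hits.append((name, score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:top_n]
```

With such a scheme, a score of 0.9 like the one reported for estradiol would rank well above the threshold, while unrelated library entries would be filtered out.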

Due to the complexity of steroid metabolism, many unknown metabolites may be formed and excreted in urine. Non-targeted strategies based on LC-HRMS or GC-HRMS provide a new research perspective and opportunities for deeper investigation of steroid metabolomics and the discovery of new biomarkers of interest. In the present study, two steroid standard databases were generated: 88 reference steroids characterized by LC-HRMS and 27 reference steroids characterized by GC-HRMS. The databases include retention times, formulae, adducts, etc. GNPS then served as the data analysis infrastructure for MN and was used to construct a molecular network of steroids for each database. The global steroid network of the LC-HRMS data involves 198 detected steroid adducts; steroids from the same family, e.g. androgens, estrogens and corticosteroids, cluster together, and the clusters of different families are separated. In the molecular network of the GC-HRMS data, androgens and estrogens also formed two clusters. To our knowledge, this is the first study using GNPS-MN to investigate steroid metabolomics. In addition, a few urine samples were characterized by GC-HRMS, and the database generated in the current study was applied to investigate steroids in these samples. An annotation of estradiol is highlighted: thanks to the database and GNPS technologies, steroids can be detected and annotated in bovine urine samples. Therefore, based on the available results, MN can be considered a proof of concept of a new strategy to reveal structural similarities between steroids and could be used to investigate new steroid biomarkers of interest. Moreover, the two steroid libraries are also regularly extended with new reference spectra by the scientific community, which could strengthen MS annotations.

Code availability

The spectra in .mgf format were merged with an R script (R 3.6.0), which is publicly available on GitHub at https://github.com/Ting1217/Feature-Merge-R.

References

Massé, R., Ayotte, C. & Dugal, R. Studies on anabolic steroids. Journal of Chromatography B: Biomedical Sciences and Applications 489, 23–50 (1989).
Puymbroeck, M. V. et al. Metabolites in feces can be important markers for the abuse of anabolic steroids in cattle†. Analyst 123, 2449–2452 (1998).
Samuels, T. P., Nedderman, A., Seymour, M. A. & Houghton, E. Study of the metabolism of testosterone, nandrolone and estradiol in cattle. The Analyst 123, 2401–4 (1998).
Yamada, M., Kinoshita, K., Kurosawa, M., Saito, K. & Nakazawa, H. Analysis of exogenous nandrolone metabolite in horse urine by gas chromatography/combustion/carbon isotope ratio mass spectrometry. J Pharm Biomed Anal 45, 654–8 (2007).
Anizan, S. et al. Screening of 4-androstenedione misuse in cattle by LC-MS/MS profiling of glucuronide and sulfate steroids in urine. Talanta 86, 186–94 (2011).
Cloteau, C., Kaabia, Z., Le Bizec, B., Bailly-Chouriberry, L. & Dervilly, G. From targeted methods to metabolomics based strategies to screen for growth promoters misuse in horseracing and livestock: A review. Food Control 148, 109601 (2023).
Destrez, B. et al. Criteria to distinguish between natural situations and illegal use of boldenone, boldenone esters and boldione in cattle 2. Direct measurement of 17β-boldenone sulpho-conjugate in calf urine by liquid chromatography–high resolution and tandem mass spectrometry. Steroids 74, 803–808 (2009).
Le Bizec, B. et al. New anabolic steroid illegally used in cattle—structure elucidation of 19-norchlorotestosterone acetate metabolites in bovine urine. The Journal of Steroid Biochemistry and Molecular Biology 98, 78–89 (2006).
Ouzia, S. et al. Nandrolone and estradiol biomarkers identification in bovine urine applying a liquid chromatography high‐resolution mass spectrometry metabolomics approach. Drug Testing and Analysis 14, 879–886 (2022).
Kind, T. & Fiehn, O. Strategies for dereplication of natural compounds using high-resolution tandem mass spectrometry. Phytochem Lett 21, 313–319 (2017).
Vogg, N. et al. Targeted metabolic profiling of urinary steroids with a focus on analytical accuracy and sample stability. J Mass Spectrom Adv Clin Lab 25, 44–52 (2022).
Anizan, S. et al. Targeted phase II metabolites profiling as new screening strategy to investigate natural steroid abuse in animal breeding. Analytica chimica acta 700, 105–13 (2011).
Blokland, M. H., Van Tricht, E. F., Van Rossum, H. J., Sterk, S. S. & Nielen, M. W. F. Endogenous steroid profiling by gas chromatography-tandem mass spectrometry and multivariate statistics for the detection of natural hormone abuse in cattle. Food Additives & Contaminants: Part A 29, 1030–1045 (2012).
Dervilly-Pinel, G. et al. 5alpha-Estrane-3beta,17beta-diol and 5beta-estrane-3alpha,17beta-diol: definitive screening biomarkers to sign nandrolone abuse in cattle? The Journal of steroid biochemistry and molecular biology 126, 65–71 (2011).
Pinel, G. et al. Elimination kinetic of 17beta-estradiol 3-benzoate and 17beta-nandrolone laureate ester metabolites in calves’ urine. The Journal of steroid biochemistry and molecular biology 110, 30–8 (2008).
Pinel, G., Rambaud, L., Monteau, F., Elliot, C. & Le Bizec, B. Estranediols profiling in calves’ urine after 17beta-nandrolone laureate ester administration. The Journal of steroid biochemistry and molecular biology 121, 626–32 (2010).
Scarth, J. P. et al. A review of analytical strategies for the detection of ‘endogenous’ steroid abuse in food production. Drug Test Anal 4(Suppl 1), 40–9 (2012).
Prévost, S., Nicol, T., Monteau, F., André, F. & Le Bizec, B. Gas chromatography/combustion/isotope ratio mass spectrometry to control the misuse of androgens in breeding animals: new derivatisation method applied to testosterone metabolites and precursors in urine samples. (2001).
Kaabia, Z. et al. Ultra high performance liquid chromatography/tandem mass spectrometry based identification of steroid esters in serum and plasma: an efficient strategy to detect natural steroids abuse in breeding and racing animals. Journal of chromatography. A 1284, 126–40 (2013).
Kaabia, Z., Laparre, J., Cesbron, N., Le Bizec, B. & Dervilly-Pinel, G. Comprehensive steroid profiling by liquid chromatography coupled to high resolution mass spectrometry. J Steroid Biochem 183, 106–115 (2018).
Khodadadi, M. & Pourfarzam, M. A review of strategies for untargeted urinary metabolomic analysis using gas chromatography-mass spectrometry. Metabolomics 16, 66 (2020).
Pinel, G. et al. Targeted and untargeted profiling of biological fluids to screen for anabolic practices in cattle. TrAC Trends in Analytical Chemistry 29, 1269–1280 (2010).
Fox Ramos, A. E., Evanno, L., Poupon, E., Champy, P. & Beniddir, M. A. Natural products targeting strategies involving molecular networking: different manners, one goal. Nat. Prod. Rep. 36, 960–980 (2019).
Nothias, L.-F. et al. Bioactivity-Based Molecular Networking for the Discovery of Drug Leads in Natural Product Bioassay-Guided Fractionation. J. Nat. Prod. 81, 758–767 (2018).
Wang, M. et al. Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat Biotechnol 34, 828–837 (2016).
Watrous, J. et al. Mass spectral molecular networking of living microbial colonies. Proc. Natl. Acad. Sci. USA 109, (2012).
Aksenov, A. A. et al. Auto-deconvolution and molecular networking of gas chromatography–mass spectrometry data. Nat Biotechnol 39, 169–173 (2021).
Aron, A. T. et al. Reproducible molecular networking of untargeted mass spectrometry data using GNPS. Nat Protoc 15, 1954–1991 (2020).
Frank, A. M. et al. Clustering Millions of Tandem Mass Spectra. J. Proteome Res. 7, 113–122 (2008).
Quinn, R. A. et al. Molecular Networking As a Drug Discovery, Drug Metabolism, and Precision Medicine Strategy. Trends Pharmacol Sci 38, 143–154 (2017).
Nothias, L. F. et al. Feature-Based Molecular Networking in the GNPS Analysis Environment. http://biorxiv.org/lookup/doi/10.1101/812404 (2019).
Moss, G. P. Nomenclature of steroids (Recommendations 1989). Pure and Applied Chemistry 61, 1783–1822 (1989).
Dervilly-Pinel, G. et al. Assessment of two complementary liquid chromatography coupled to high resolution mass spectrometry metabolomics strategies for the screening of anabolic steroid treatment in calves. Analytica chimica acta 700, 144–54 (2011).
Hernández-Mesa, M., Le Bizec, B., Monteau, F., García-Campaña, A. M. & Dervilly-Pinel, G. Collision Cross Section (CCS) Database: An Additional Measure to Characterize Steroids. Anal. Chem. 90, 4616–4625 (2018).
L et al. Review: control of anabolic steroid in breeding animals: mass spectrometry, a powerful analytical tool. Chromatographia S3–S11 (2004).
Le Bizec, B., Montrade, M.-P., Monteau, F. & Andre, F. Detection and identification of anabolic steroids in bovine urine by gas chromatography—mass spectrometry. Analytica Chimica Acta 275, 123–133 (1993).
Marchand, P., Le Bizec, B., Gade, C., Monteau, F. & André, F. Ultra trace detection of a wide range of anabolic steroids in meat by gas chromatography coupled to mass spectrometry. Journal of Chromatography A 867, 219–233 (2000).
Katajamaa, M., Miettinen, J. & Orešič, M. MZmine: toolbox for processing and visualization of mass spectrometry based molecular profile data. Bioinformatics 22, 634–636 (2006).
Olivon, F., Grelier, G., Roussi, F., Litaudon, M. & Touboul, D. MZmine 2 Data-Preprocessing To Enhance Molecular Networking Reliability. Anal. Chem. 89, 7836–7840 (2017).
Pluskal, T., Castillo, S., Villar-Briones, A. & Orešič, M. MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC bioinformatics 11, 1–11 (2010).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome research 13, 2498–2504 (2003).
Nothias, L. F. et al. Feature-based molecular networking in the GNPS analysis environment. Nat Methods 17, 905–908 (2020).
Kessner, D., Chambers, M., Burke, R., Agus, D. & Mallick, P. ProteoWizard: open source software for rapid proteomics tools development. Bioinformatics 24, 2534–2536 (2008).
Schmid, R. et al. Ion identity molecular networking for mass spectrometry-based metabolomics in the GNPS environment. Nat Commun 12, 3832 (2021).
Dunn, W. B. et al. Mass appeal: metabolite identification in mass spectrometry-focused untargeted metabolomics. Metabolomics 9, 44–66 (2013).
Stein, S. E. An integrated method for spectrum extraction and compound identification from gas chromatography/mass spectrometry data. J. Am. Soc. Mass Spectrom. 10, 770–781 (1999).
Ting C & Gaud, D. These high-resolution mass spectrometry data generated on LC-QExactive and GC-QExactive instruments were acquired from bovine urines in the context of the REC-19-STEROIDOM project. https://doi.org/10.57745/HZPEDR (2024).
Chierici, M. et al. Integrative Network Fusion: A Multi-Omics Approach in Molecular Profiling. Front. Oncol. 10, 1065 (2020).


Acknowledgements

We gratefully acknowledge the financial support of l’Alliance-Agreenium and the China Scholarship Council.

Author information

Authors and Affiliations

Oniris, INRAE, LABERCA, Nantes, 44307, France
Ting Chen, Justine Massias, Yann Guitton, Bruno Le Bizec & Gaud Dervilly
Nantes Université, Institut des Substances et Organismes de la Mer, ISOMER, UR 2160, F-44000, Nantes, France
Samuel Bertrand
Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000, Nantes, France
Samuel Bertrand

Authors

Ting Chen
Justine Massias
Samuel Bertrand
Yann Guitton
Bruno Le Bizec
Gaud Dervilly

Contributions

Ting CHEN: Data collection and curation, Formal analysis, Investigation, Writing – original draft, editing and submission. Justine MASSIAS: assisted in data collection. Samuel BERTRAND: assisted in data analysis and annotation process, Writing – review & editing, Supervision. Yann GUITTON: assisted in data collection, analysis, and data deposit. Bruno LE BIZEC: Project administration, Funding acquisition. Gaud DERVILLY: Project administration, Funding acquisition, Writing – review & editing, Supervision.

Corresponding author

Correspondence to Gaud Dervilly.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Chen, T., Massias, J., Bertrand, S. et al. Innovative molecular networking analysis of steroids and characterisation of the urinary steroidome. Sci Data 11, 818 (2024). https://doi.org/10.1038/s41597-024-03599-0


Received: 11 April 2024
Accepted: 02 July 2024
Published: 24 July 2024
DOI: https://doi.org/10.1038/s41597-024-03599-0