Extended Data Fig. 3: Evaluation of annotation false discovery rate (FDR) and fraction of gold-standard peaks annotated correctly using different reference databases.
From: Metabolite discovery through global annotation of untargeted metabolomics data

The four tested reference compound databases are HMDB (Human Metabolome Database), PBCM (short for PubChemLite.0.2.0, zenodo.org/record/3611238), PBCM_BIO (short for PubChemLite_BioPathway, the subset of biopathway-related entries in PubChemLite.0.2.0) and YMDB (Yeast Metabolome Database). (a) False discovery rate estimated using a target-decoy strategy. (b) Fraction of the 314 manually curated ‘ground truth’ annotations made correctly. For a and b, each individual data point (circle) is from a different randomized decoy library. N = 10 randomized libraries were tested for each reference compound database. Boxes show the median and interquartile range (IQR); whiskers extend to the largest and smallest values no further than 1.5 × IQR from the hinge.
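
The box-and-whisker convention described in the caption can be illustrated with a minimal Python sketch. It is not the authors' plotting code. The function name tukey_box_stats and the input values are hypothetical, and the quartiles are computed by linear interpolation (the "inclusive" method of Python's statistics module), which is an assumption because the caption does not say how the hinges were calculated.

```python
from statistics import quantiles


def tukey_box_stats(values, k=1.5):
    """Summarize values the way the caption describes the boxes.

    Hypothetical helper: returns the hinges (q1, q3), the median, the
    whisker ends, and any points that lie beyond the whiskers.
    """
    data = sorted(values)
    # Assumption: quartiles by linear interpolation between data points.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Made-up numbers standing in for one database's N = 10 decoy libraries.
    example = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
    for name, value in tukey_box_stats(example).items():
        print(name, value)
```

With ten values per database, as in this figure, a single extreme library is enough to fall outside the whiskers, which is why the individual data points are drawn alongside the boxes.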