Extended Data Fig. 2: Scheme for simulating the datasets and evaluating the explanations.

Each profile records the concentrations of the “biomarker” molecules and the other molecules, together with the interaction logic among them. The spectra were simulated from the profile and the spectra dictionary. The classification labels were assigned by applying a concentration threshold and the interaction logic among the biomarkers. Several interaction logics (such as ‘AND’, ‘OR’ and ‘SOFT’) were used to simulate the different interactions and correlations among molecules in various diseases. Recall and accuracy were evaluated by comparing the predicted biomarkers with the ground-truth biomarkers.
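
The scheme can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `simulate_sample` and `biomarker_recall`, the uniform concentration sampling, and the linear mixing of dictionary spectra are all assumptions made for illustration. Only the ‘AND’ and ‘OR’ logics are shown, because the caption does not define ‘SOFT’, and only recall is computed, because the caption does not state how accuracy is defined.

```python
import random


def simulate_sample(dictionary, biomarkers, logic, threshold, rng):
    """Return (profile, spectrum, label) for one simulated sample.

    dictionary: hypothetical mapping of molecule name -> reference spectrum
                (lists of equal length).
    biomarkers: names of the ground-truth biomarker molecules.
    logic:      "AND" or "OR" (assumed semantics, see comments below).
    """
    # Profile: one concentration per molecule (assumed uniform in [0, 1)).
    profile = {name: rng.random() for name in dictionary}

    # Spectrum: concentration-weighted sum of the reference spectra
    # (linear mixing is an assumption of this sketch).
    n_bins = len(next(iter(dictionary.values())))
    spectrum = [0.0] * n_bins
    for name, conc in profile.items():
        for i, intensity in enumerate(dictionary[name]):
            spectrum[i] += conc * intensity

    # Label: threshold each biomarker, then combine with the chosen logic.
    above = [profile[b] > threshold for b in biomarkers]
    if logic == "AND":
        label = int(all(above))   # positive only if every biomarker is high
    elif logic == "OR":
        label = int(any(above))   # positive if at least one biomarker is high
    else:
        raise ValueError(f"logic not covered by this sketch: {logic}")
    return profile, spectrum, label


def biomarker_recall(predicted, truth):
    """Fraction of ground-truth biomarkers found among the predicted ones."""
    truth = set(truth)
    return len(truth & set(predicted)) / len(truth)
```

For example, with ground-truth biomarkers `["A", "B"]` and predicted biomarkers `["A", "C"]`, `biomarker_recall` returns 0.5, since one of the two true biomarkers was recovered.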