Fig. 2: Compared with the GC, the SFC prioritizes dimensions governing continuity in the synthetic dataset DS1.

For dataset DS1, the SFC \(\left(w_{AB}=w_{BA}=w_{BC}=w_{CB}=0.05\text{–}0.5,\; w_{AC}=w_{CA}=0\right)\) and the GC were constructed, and the feature importance outputs of the two classifiers were compared. A Violin plots showing the distribution of group data for each dimension in the generated dataset DS1, which comprises 20,000 genes mimicking real disease data and exhibits continuity among groups A–C. B Scatter plot showing the group data distribution for genes 1 and 2 in DS1. C Feature importance of genes 1 and 2 as output by the SFC and GC. D p-values calculated using Welch’s t-test from the feature importance outputs of the SFC and GC. E Test data accuracy of each classifier. Mean ± SD, n = 5.