Table 2 Averaged normalized HV and IGD values of synthetic tasks (upper) and RE tasks (lower) in the Off-MOO benchmark, where the best and runner-up results on each task are highlighted by bold and underlined numbers.

From: Learning design-score manifold to guide diffusion models for offline optimization

 

| Method | ZDT (n=2) Avg. HV (↑) | ZDT (n=2) Avg. IGD (↓) | OmniTest (n=2) Avg. HV (↑) | OmniTest (n=2) Avg. IGD (↓) | DTLZ (n=3) Avg. HV (↑) | DTLZ (n=3) Avg. IGD (↓) | HV Rank (↓) | IGD Rank (↓) |
|---|---|---|---|---|---|---|---|---|
| \({{\mathcal{D}}}_{\,\text{train}}^{\text{(best)}\,}\) (Preferred) | 1.0 (1.118) | 1.0 (0.0) | 1.0 (1.056) | 1.0 (0.0) | 1.0 (1.098) | 1.0 (0.0) | / | / |
| MM-NSGA2 (k=1) | 0.888 ± 0.013 | 2.555 ± 0.118 | 1.044 ± 0.010 | 0.454 ± 0.129 | 0.987 ± 0.012 | 0.635 ± 0.048 | 8.3 / 10 | 6.7 / 10 |
| MO-DDOM (k=1) | 0.948 ± 0.009 | 1.674 ± 0.150 | 0.983 ± 0.002 | 0.827 ± 0.046 | 0.932 ± 0.014 | 0.795 ± 0.063 | 9.3 / 10 | 8.0 / 10 |
| ManGO (k=1) | 1.097 ± 0.004 | 0.729 ± 0.064 | 1.050 ± 0.002 | 0.399 ± 0.068 | 0.971 ± 0.016 | 0.739 ± 0.104 | 6.0 / 10 | 6.0 / 10 |
| ManGO+Self-IS (k=1) | 1.099 ± 0.005 | 0.702 ± 0.065 | 1.050 ± 0.001 | 0.359 ± 0.017 | 1.053 ± 0.015 | 0.250 ± 0.084 | 4.3 / 10 | 3.7 / 10 |
| MM-MOBO | 0.963 ± 0.007 | 4.723 ± 0.164 | **1.056 ± 0.000** | 0.206 ± 0.019 | 1.075 ± 0.000 | 0.362 ± 0.016 | 4.0 / 10 | 6.0 / 10 |
| ParetoFlow | 1.000 ± 0.008 | 2.867 ± 0.405 | 0.953 ± 0.057 | 1.523 ± 0.567 | 0.998 ± 0.009 | 0.672 ± 0.115 | 7.7 / 10 | 8.3 / 10 |
| MM-NSGA2 (k=256) | 1.055 ± 0.003 | 3.592 ± 0.044 | 1.046 ± 0.002 | 1.008 ± 0.019 | **1.086 ± 0.000** | 0.752 ± 0.016 | 4.0 / 10 | 9.0 / 10 |
| MO-DDOM (k=256) | 0.981 ± 0.006 | 1.052 ± 0.144 | 1.033 ± 0.001 | 0.270 ± 0.010 | 1.054 ± 0.006 | 0.255 ± 0.044 | 6.7 / 10 | 4.3 / 10 |
| ManGO (k=256) | **1.107 ± 0.002** | **0.420 ± 0.030** | 1.051 ± 0.002 | <u>0.118 ± 0.015</u> | 1.066 ± 0.003 | <u>0.172 ± 0.013</u> | 2.7 / 10 | 1.7 / 10 |
| ManGO+Self-IS (k=256) | <u>1.106 ± 0.002</u> | <u>0.445 ± 0.043</u> | <u>1.052 ± 0.000</u> | **0.094 ± 0.002** | <u>1.079 ± 0.003</u> | **0.123 ± 0.033** | 2.0 / 10 | 1.3 / 10 |

 

| Method | RE (n=2) Avg. HV (↑) | RE (n=2) Avg. IGD (↓) | RE (n=3) Avg. HV (↑) | RE (n=3) Avg. IGD (↓) | RE (n=4) Avg. HV (↑) | RE (n=4) Avg. IGD (↓) | HV Rank (↓) | IGD Rank (↓) |
|---|---|---|---|---|---|---|---|---|
| \({{\mathcal{D}}}_{\,\text{train}}^{\text{(best)}\,}\) (Preferred) | 1.0 (1.037) | 1.0 (0.0) | 1.0 (1.082) | 1.0 (0.0) | 1.0 (1.310) | 1.0 (0.0) | / | / |
| MM-NSGA2 (k=1) | 1.016 ± 0.004 | 56.958 ± 12.159 | 0.935 ± 0.007 | 2.979 ± 0.159 | 0.780 ± 0.009 | 1.504 ± 0.060 | 9.3 / 10 | 9.7 / 10 |
| MO-DDOM (k=1) | 1.010 ± 0.012 | 2.261 ± 0.220 | 1.046 ± 0.002 | 0.579 ± 0.020 | 1.058 ± 0.003 | 0.845 ± 0.008 | 8.7 / 10 | 4.7 / 10 |
| ManGO (k=1) | 1.024 ± 0.004 | 6.809 ± 0.740 | 1.051 ± 0.011 | 0.782 ± 0.083 | 1.234 ± 0.009 | 0.421 ± 0.018 | 5.7 / 10 | 6.0 / 10 |
| ManGO+Self-IS (k=1) | 1.022 ± 0.004 | 4.717 ± 2.081 | 1.066 ± 0.005 | 0.588 ± 0.027 | 1.240 ± 0.016 | <u>0.304 ± 0.020</u> | 4.7 / 10 | 4.3 / 10 |
| MM-MOBO | 1.027 ± 0.002 | **1.156 ± 0.133** | <u>1.071 ± 0.001</u> | 0.912 ± 0.093 | 1.123 ± 0.009 | 0.816 ± 0.029 | 3.7 / 10 | 5.0 / 10 |
| ParetoFlow | 1.017 ± 0.004 | 12.403 ± 0.052 | 1.010 ± 0.004 | 1.806 ± 0.024 | 0.668 ± 0.001 | 1.887 ± 0.006 | 9.0 / 10 | 8.7 / 10 |
| MM-NSGA2 (k=256) | **1.034 ± 0.000** | 98.945 ± 0.011 | 1.069 ± 0.001 | 2.160 ± 0.029 | <u>1.249 ± 0.012</u> | 0.320 ± 0.052 | 2.0 / 10 | 6.7 / 10 |
| MO-DDOM (k=256) | 1.020 ± 0.001 | <u>1.476 ± 0.031</u> | 1.056 ± 0.001 | <u>0.552 ± 0.007</u> | 1.073 ± 0.004 | 0.808 ± 0.008 | 6.7 / 10 | 3.3 / 10 |
| ManGO (k=256) | 1.026 ± 0.002 | 3.373 ± 0.428 | 1.063 ± 0.005 | 0.611 ± 0.026 | 1.248 ± 0.006 | 0.388 ± 0.014 | 4.0 / 10 | 4.7 / 10 |
| ManGO+Self-IS (k=256) | <u>1.028 ± 0.001</u> | 2.264 ± 0.237 | **1.072 ± 0.002** | **0.498 ± 0.013** | **1.264 ± 0.002** | **0.234 ± 0.020** | 1.3 / 10 | 2.0 / 10 |

  1. Higher HV values indicate better performance, while lower IGD values are preferred. \({{\mathcal{D}}}_{\,\text{train}}^{\text{(best)}\,}\) (Preferred) denotes the best (preferred) HV/IGD in the offline training dataset. For compact presentation, the reported numbers are task performance averaged over the tasks with the same objective number n. The ZDT (n = 2), OmniTest (n = 2), and DTLZ (n = 3) sets consist of 5 ZDT tasks [41], 1 OmniTest task [52], and 2 DTLZ tasks [42], respectively. The RE (n = 2), RE (n = 3), and RE (n = 4) sets comprise 5, 7, and 2 real-world application tasks [43] with n = 2, 3, and 4, respectively. Note that each task's results are normalized by the best HV and IGD of its training dataset; RE (n = 2) shows higher averaged IGD values because the IGD values of the RE22 task are roughly ten times larger than those of the other tasks. MO-DDOM denotes a standard conditional diffusion-based baseline method.
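For readers unfamiliar with the two indicators, a minimal sketch of how they are typically computed is given below (bi-objective minimization case; the function names, the reference point, and the normalization by the training-set value are illustrative assumptions, not the benchmark's actual implementation):

```python
import numpy as np

def hypervolume_2d(front, ref):
    """Hypervolume of a 2-objective minimization front: the area dominated
    by the front and bounded above by the reference point `ref`."""
    # Keep only nondominated points, sorted by the first objective.
    nd, best_f2 = [], float("inf")
    for f1, f2 in sorted(map(tuple, front)):
        if f2 < best_f2:           # strictly improves the second objective
            nd.append((f1, f2))
            best_f2 = f2
    # Sweep left to right, adding one rectangle per nondominated point.
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in nd:
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv

def igd(pareto_front, solutions):
    """Inverted generational distance: mean Euclidean distance from each
    reference Pareto-front point to its nearest obtained solution."""
    pf = np.asarray(pareto_front, float)
    sols = np.asarray(solutions, float)
    dists = np.linalg.norm(pf[:, None, :] - sols[None, :, :], axis=-1)
    return dists.min(axis=1).mean()

# Illustrative normalization, mirroring the table's convention: divide a
# method's raw HV by the HV of the best training subset (hypothetical data).
# normalized_hv = hypervolume_2d(method_front, ref) / hypervolume_2d(train_best_front, ref)
```

A solution set that exactly matches the reference Pareto front attains IGD = 0, which is why the \({{\mathcal{D}}}_{\,\text{train}}^{\text{(best)}\,}\) row reports raw IGD values of 0.0.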