Table 4 Ablation study on core components of our method.

From: Generating Chinese intangible cultural heritage images with structure and color awareness

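| Configuration | Content Preservation | Style Alignment | CLIP-S | User Pref. (%) |
| --- | --- | --- | --- | --- |
| w/o Structure Token | 80.1 | 74.0 | 0.468 | 28.5 |
| w/o Color Token | 81.2 | 76.1 | 0.471 | 29.7 |
| w/o Two-Stage Training | 82.4 | 77.3 | 0.479 | 31.4 |
| Full Model (Colorful Heritage) | **84.6** | **81.2** | **0.502** | **34.5** |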

  1. Removing any single component lowers all four metrics, which indicates that each contributes to the full model. Note: User Pref. values come from pairwise comparisons with the full model, so they do not sum to 100%.
  2. Best results are highlighted in bold.
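
The size of each drop can be read off the table by subtracting every ablated row from the full model. The snippet below is a minimal sketch of that arithmetic, using only the values reported in Table 4; the names `FULL`, `ABLATIONS`, and `drops` are illustrative and do not come from the paper. Because the User Pref. values come from pairwise comparisons, their differences are best read as indicative only.

```python
# Values copied from Table 4.
# Column order: Content Pres, Style Align, CLIP-S, User Pref.
METRICS = ("Content Pres", "Style Align", "CLIP-S", "User Pref.")

FULL = (84.6, 81.2, 0.502, 34.5)  # Full Model (Colorful Heritage)

ABLATIONS = {
    "w/o Structure Token": (80.1, 74.0, 0.468, 28.5),
    "w/o Color Token": (81.2, 76.1, 0.471, 29.7),
    "w/o Two-Stage Training": (82.4, 77.3, 0.479, 31.4),
}


def drops(ablated, full=FULL):
    """Return full-minus-ablated differences, rounded to 3 decimals."""
    return tuple(round(f - a, 3) for f, a in zip(full, ablated))


if __name__ == "__main__":
    for name, row in ABLATIONS.items():
        # Pair each metric name with its drop for readable output.
        print(name, dict(zip(METRICS, drops(row))))
```

On these numbers, removing the structure token gives the largest drop on every metric (4.5, 7.2, 0.034, and 6.0), and removing two-stage training gives the smallest (2.2, 3.9, 0.023, and 3.1).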