Table 2 Comparison with recent state-of-the-art text-to-image generation models (↑: higher is better; ↓: lower is better)
From: LFMDiff: generation of Chinese traditional landscape paintings based on diffusion model

| Method | CLIP-T ↑ | LPIPS ↓ | FID ↓ |
|---|---|---|---|
| SDXL | 0.224 | 0.011 | 67.784 |
| DALL-E3 | 0.223 | 0.085 | 88.698 |
| GLIDE | 0.219 | 0.969 | 118.474 |
| Taiyi | 0.221 | 0.831 | 90.354 |
| ControlNet | 0.226 | 0.026 | 148.352 |
| P+ | 0.019 | 0.021 | 116.453 |
| T2I-Adapter | 0.981 | 0.085 | 157.265 |
| CCLAP | 0.234 | 0.303 | 62.523 |
| Tongyi Wanxiang | 0.222 | 0.765 | 77.631 |
| RAPHAEL | 0.231 | 0.498 | 72.687 |
| WenXin4.5 Turbo | 0.247 | 0.684 | 72.846 |
| LlamaGen-XL | 0.229 | 0.807 | 97.541 |
| OpenMAGVIT2 | 0.214 | 0.797 | 116.505 |
| DALL.E Mini | 0.229 | 0.734 | 77.733 |
| Ours | 0.334 | 0.438 | 61.544 |
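
As a reading aid, the FID column of Table 2 can be ranked programmatically. The snippet below is a minimal sketch, not part of the original paper: the `fid` dictionary simply copies the FID values from the table, and the variable names (`ranked`, `margin`) are illustrative assumptions.

```python
# FID values copied verbatim from Table 2 (lower is better).
fid = {
    "SDXL": 67.784,
    "DALL-E3": 88.698,
    "GLIDE": 118.474,
    "Taiyi": 90.354,
    "ControlNet": 148.352,
    "P+": 116.453,
    "T2I-Adapter": 157.265,
    "CCLAP": 62.523,
    "Tongyi Wanxiang": 77.631,
    "RAPHAEL": 72.687,
    "WenXin4.5 Turbo": 72.846,
    "LlamaGen-XL": 97.541,
    "OpenMAGVIT2": 116.505,
    "DALL.E Mini": 77.733,
    "Ours": 61.544,
}

# Sort ascending because a lower FID indicates a closer match
# between the generated and reference image distributions.
ranked = sorted(fid.items(), key=lambda kv: kv[1])

best_name, best_fid = ranked[0]
runner_up_name, runner_up_fid = ranked[1]
margin = runner_up_fid - best_fid  # gap between the two lowest FID scores

# Print the three lowest-FID methods and the gap to the runner-up.
for name, value in ranked[:3]:
    print(f"{name}: {value:.3f}")
print(f"Margin over runner-up: {margin:.3f}")
```

Under these values, the proposed method ("Ours") ranks first on FID, followed by CCLAP and SDXL.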