Table 2 Comparison with recent state-of-the-art text-to-image generation models

From: LFMDiff: generation of Chinese traditional landscape paintings based on diffusion model

| Method | CLIP-T | LPIPS | FID |
| --- | --- | --- | --- |
| SDXL | 0.224 | 0.011 | 67.784 |
| DALL-E3 | 0.223 | 0.085 | 88.698 |
| GLIDE | 0.219 | 0.969 | 118.474 |
| Taiyi | 0.221 | 0.831 | 90.354 |
| ControlNet | 0.226 | 0.026 | 148.352 |
| P+ | 0.019 | 0.021 | 116.453 |
| T2I-Adapter | 0.981 | 0.085 | 157.265 |
| CCLAP | 0.234 | 0.303 | 62.523 |
| Tongyi Wanxiang | 0.222 | 0.765 | 77.631 |
| RAPHAEL | 0.231 | 0.498 | 72.687 |
| WenXin4.5 Turbo | 0.247 | 0.684 | 72.846 |
| LlamaGen-XL | 0.229 | 0.807 | 97.541 |
| OpenMAGVIT2 | 0.214 | 0.797 | 116.505 |
| DALL-E Mini | 0.229 | 0.734 | 77.733 |
| Ours | 0.334 | 0.438 | 61.544 |

Bold values mark the best result for each evaluation metric among all compared methods; the results of our proposed model (Ours) are also set in bold for clearer comparison.