Table 1 The Quantitative Results of the Comparison Experiment. Lower Values of LPIPS [33] Indicate Higher Similarity Between Two Images, and Values of SSIM [34] Closer to 1 Likewise Indicate Higher Similarity. A Higher BLIP-Pmatch [35] Indicates a Better Match Between Image and Text, Meaning That the Image Manipulation Method Represents the Semantics of the Prompt More Faithfully. Each Result is the Average over 200 Edited Images for the Corresponding Text Prompt.

From: Irrelevant region preserving for counterfactual image manipulation

 

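| Text prompt | LPIPS\(\downarrow\) Ours | LPIPS\(\downarrow\) CF-CLIP | SSIM\(\uparrow\) Ours | SSIM\(\uparrow\) CF-CLIP | BLIP-Pmatch\(\uparrow\) Ours | BLIP-Pmatch\(\uparrow\) CF-CLIP |
|---|---|---|---|---|---|---|
| GreenLipsticks | 0.3164 | 0.348 | 0.7557 | 0.7048 | 0.8457 | 0.8281 |
| RainbowHair | 0.3978 | 0.3769 | 0.677 | 0.6355 | 0.8045 | 0.5972 |
| BlueGoatee | 0.3676 | 0.4028 | 0.7304 | 0.6434 | 0.8116 | 0.7484 |
| BlueEyes | 0.3378 | 0.3911 | 0.6992 | 0.6161 | 0.8865 | 0.8715 |
| Makeup | 0.3025 | 0.3468 | 0.7588 | 0.6624 | 0.1147 | 0.1221 |
| Smile | 0.3522 | 0.3688 | 0.7343 | 0.7188 | 0.5427 | 0.6374 |
| RedHair | 0.3293 | 0.3798 | 0.7193 | 0.6779 | 0.7685 | 0.7921 |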
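
Because the three metrics point in different directions (lower is better for LPIPS, higher is better for SSIM and BLIP-Pmatch), it can help to tally the comparison mechanically. The following is a minimal sketch, not part of the original article: the names `TABLE_1`, `LOWER_IS_BETTER`, and `count_wins` are hypothetical, and the numbers are transcribed from Table 1 as (Ours, CF-CLIP) pairs.

```python
# Values transcribed from Table 1; each tuple is (Ours, CF-CLIP).
# All names in this sketch are illustrative assumptions, not the authors' code.
TABLE_1 = {
    "GreenLipsticks": {"LPIPS": (0.3164, 0.348), "SSIM": (0.7557, 0.7048), "BLIP-Pmatch": (0.8457, 0.8281)},
    "RainbowHair": {"LPIPS": (0.3978, 0.3769), "SSIM": (0.677, 0.6355), "BLIP-Pmatch": (0.8045, 0.5972)},
    "BlueGoatee": {"LPIPS": (0.3676, 0.4028), "SSIM": (0.7304, 0.6434), "BLIP-Pmatch": (0.8116, 0.7484)},
    "BlueEyes": {"LPIPS": (0.3378, 0.3911), "SSIM": (0.6992, 0.6161), "BLIP-Pmatch": (0.8865, 0.8715)},
    "Makeup": {"LPIPS": (0.3025, 0.3468), "SSIM": (0.7588, 0.6624), "BLIP-Pmatch": (0.1147, 0.1221)},
    "Smile": {"LPIPS": (0.3522, 0.3688), "SSIM": (0.7343, 0.7188), "BLIP-Pmatch": (0.5427, 0.6374)},
    "RedHair": {"LPIPS": (0.3293, 0.3798), "SSIM": (0.7193, 0.6779), "BLIP-Pmatch": (0.7685, 0.7921)},
}

# Direction of improvement per metric, following the caption.
# All SSIM values here are below 1, so "closer to 1" means "higher".
LOWER_IS_BETTER = {"LPIPS": True, "SSIM": False, "BLIP-Pmatch": False}


def count_wins(table, lower_is_better):
    """Count, per metric, the prompts on which Ours scores better than CF-CLIP."""
    wins = {metric: 0 for metric in lower_is_better}
    for scores in table.values():
        for metric, (ours, baseline) in scores.items():
            if lower_is_better[metric]:
                better = ours < baseline
            else:
                better = ours > baseline
            if better:
                wins[metric] += 1
    return wins


if __name__ == "__main__":
    for metric, n in count_wins(TABLE_1, LOWER_IS_BETTER).items():
        print(f"{metric}: Ours better on {n} of {len(TABLE_1)} prompts")
```

With the values above, the tally is 6 of 7 prompts for LPIPS (RainbowHair is the exception), 7 of 7 for SSIM, and 4 of 7 for BLIP-Pmatch (CF-CLIP scores higher on Makeup, Smile, and RedHair).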