Table 1. Quantitative results of the comparison experiment. Lower LPIPS [33] values indicate higher similarity between two images, whereas SSIM [34] values closer to 1 indicate higher similarity. A higher BLIP-Pmatch [35] score indicates a better match between image and text, meaning that the image manipulation method is more capable of semantic representation. Each result is the average over 200 edited images for the corresponding text prompt.
From: Irrelevant region preserving for counterfactual image manipulation
Text prompt | LPIPS\(\downarrow\) (Ours) | LPIPS\(\downarrow\) (CF-CLIP) | SSIM\(\uparrow\) (Ours) | SSIM\(\uparrow\) (CF-CLIP) | BLIP-Pmatch\(\uparrow\) (Ours) | BLIP-Pmatch\(\uparrow\) (CF-CLIP) |
|---|---|---|---|---|---|---|
GreenLipsticks | 0.3164 | 0.348 | 0.7557 | 0.7048 | 0.8457 | 0.8281 |
RainbowHair | 0.3978 | 0.3769 | 0.677 | 0.6355 | 0.8045 | 0.5972 |
BlueGoatee | 0.3676 | 0.4028 | 0.7304 | 0.6434 | 0.8116 | 0.7484 |
BlueEyes | 0.3378 | 0.3911 | 0.6992 | 0.6161 | 0.8865 | 0.8715 |
Makeup | 0.3025 | 0.3468 | 0.7588 | 0.6624 | 0.1147 | 0.1221 |
Smile | 0.3522 | 0.3688 | 0.7343 | 0.7188 | 0.5427 | 0.6374 |
RedHair | 0.3293 | 0.3798 | 0.7193 | 0.6779 | 0.7685 | 0.7921 |
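
As a reading aid, the per-prompt comparison in Table 1 can be tallied with a short script. The sketch below is not part of the original evaluation: it transcribes the table values and counts, for each metric, the prompts on which "Ours" scores better than CF-CLIP, respecting the direction of each metric (lower is better for LPIPS, higher is better for SSIM and BLIP-Pmatch). The names `TABLE`, `HIGHER_IS_BETTER`, `count_wins`, and `mean_scores` are hypothetical, and the unweighted mean across prompts is an assumption made here for illustration, not a figure reported in the source.

```python
# Values transcribed from Table 1 as (ours, cf_clip) pairs per text prompt.
TABLE = {
    "LPIPS": {
        "GreenLipsticks": (0.3164, 0.348),
        "RainbowHair": (0.3978, 0.3769),
        "BlueGoatee": (0.3676, 0.4028),
        "BlueEyes": (0.3378, 0.3911),
        "Makeup": (0.3025, 0.3468),
        "Smile": (0.3522, 0.3688),
        "RedHair": (0.3293, 0.3798),
    },
    "SSIM": {
        "GreenLipsticks": (0.7557, 0.7048),
        "RainbowHair": (0.677, 0.6355),
        "BlueGoatee": (0.7304, 0.6434),
        "BlueEyes": (0.6992, 0.6161),
        "Makeup": (0.7588, 0.6624),
        "Smile": (0.7343, 0.7188),
        "RedHair": (0.7193, 0.6779),
    },
    "BLIP-Pmatch": {
        "GreenLipsticks": (0.8457, 0.8281),
        "RainbowHair": (0.8045, 0.5972),
        "BlueGoatee": (0.8116, 0.7484),
        "BlueEyes": (0.8865, 0.8715),
        "Makeup": (0.1147, 0.1221),
        "Smile": (0.5427, 0.6374),
        "RedHair": (0.7685, 0.7921),
    },
}

# Direction of each metric, as stated in the table caption.
HIGHER_IS_BETTER = {"LPIPS": False, "SSIM": True, "BLIP-Pmatch": True}


def count_wins(metric):
    """Count the prompts on which 'Ours' beats CF-CLIP for the given metric."""
    higher = HIGHER_IS_BETTER[metric]
    wins = 0
    for ours, cf_clip in TABLE[metric].values():
        if (ours > cf_clip) if higher else (ours < cf_clip):
            wins += 1
    return wins


def mean_scores(metric):
    """Unweighted mean over prompts (an assumption, not reported in the source)."""
    rows = list(TABLE[metric].values())
    n = len(rows)
    ours_mean = sum(ours for ours, _ in rows) / n
    cf_clip_mean = sum(cf for _, cf in rows) / n
    return ours_mean, cf_clip_mean


if __name__ == "__main__":
    for metric in TABLE:
        ours_mean, cf_clip_mean = mean_scores(metric)
        print(f"{metric}: ours better on {count_wins(metric)} of "
              f"{len(TABLE[metric])} prompts; "
              f"mean ours={ours_mean:.4f}, CF-CLIP={cf_clip_mean:.4f}")
```

Read this way, the table shows "Ours" ahead on every prompt for SSIM and on all but RainbowHair for LPIPS, while for BLIP-Pmatch CF-CLIP scores higher on Makeup, Smile, and RedHair.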