Table 3 Comparative analysis of Tiger Model against other data augmentation methods in the classification of benign and malignant rare subtypes of thyroid cancer

From: Improving AI models for rare thyroid cancer subtype by text guided diffusion models

Values are AUROC (95% CI). Augmentation data number = 30 k. The classifier for all three tasks is ResNet50.

| Category | Augmentation method | Benign vs PTC, Valid | Benign vs PTC, Test | Benign vs FTC, Valid | Benign vs FTC, Test | Benign vs MTC, Valid | Benign vs MTC, Test |
|---|---|---|---|---|---|---|---|
| No Data Augmentation | No Data Augmentation | 0.8532 (0.83–0.93) | 0.8677 (0.83–0.92) | 0.7312 (0.68–0.75) | 0.7364 (0.69–0.75) | 0.7438 (0.63–0.78) | 0.7523 (0.67–0.79) |
| Basic Data Augmentation | Basic image manipulation [38] | 0.8763 (0.85–0.88) | 0.8614 (0.84–0.87) | 0.6999 (0.68–0.72) | 0.6872 (0.65–0.69) | 0.6988 (0.63–0.70) | 0.6511 (0.63–0.66) |
| | Mixing image [39] | 0.8321 (0.81–0.84) | 0.8563 (0.82–0.87) | 0.6761 (0.65–0.69) | 0.6321 (0.60–0.64) | 0.6947 (0.64–0.72) | 0.6827 (0.63–0.72) |
| | Random erasing [40] | 0.8426 (0.83–0.86) | 0.8637 (0.82–0.89) | 0.6590 (0.56–0.72) | 0.6472 (0.62–0.73) | 0.6743 (0.63–0.69) | 0.7182 (0.65–0.72) |
| | Feature space augmentation [41] | 0.8532 (0.83–0.86) | 0.8452 (0.81–0.87) | 0.6842 (0.65–0.73) | 0.6828 (0.60–0.72) | 0.7083 (0.67–0.73) | 0.7427 (0.72–0.75) |
| Deep Learning Generation | PG-GAN [42] | 0.8427 (0.82–0.86) | 0.8453 (0.82–0.86) | 0.7022 (0.68–0.75) | 0.6732 (0.63–0.72) | 0.7529 (0.68–0.81) | 0.7468 (0.72–0.79) |
| | Diffusion Transformers [43] | 0.8743 (0.84–0.92) | 0.8703 (0.85–0.92) | 0.7380 (0.71–0.75) | 0.7328 (0.69–0.75) | 0.7121 (0.65–0.73) | 0.7252 (0.69–0.76) |
| | Imagen [44] | 0.8538 (0.82–0.86) | 0.8412 (0.84–0.86) | 0.7543 (0.72–0.80) | 0.7577 (0.68–0.82) | 0.7574 (0.74–0.82) | 0.7581 (0.63–0.79) |
| | Stable Diffusion [24] | 0.8694 (0.84–0.89) | 0.8503 (0.85–0.89) | 0.7471 (0.73–0.79) | 0.7737 (0.76–0.82) | 0.7627 (0.74–0.82) | 0.7582 (0.73–0.76) |
| | ⑩ Stable Diffusion + ControlNet [63] | 0.8473 (0.84–0.89) | 0.8527 (0.83–0.87) | 0.7033 (0.65–0.75) | 0.7172 (0.69–0.74) | 0.7261 (0.69–0.76) | 0.7425 (0.71–0.76) |
| | ⑪ Tiger-N | 0.8882 (0.84–0.89) | 0.8969 (0.84–0.92) | 0.8067 (0.79–0.82) | 0.8132 (0.78–0.83) | 0.7733 (0.76–0.83) | 0.8027 (0.79–0.82) |
| | ⑫ Tiger-F | 0.9127 (0.90–0.93) | 0.9263 (0.85–0.94) | 0.8338 (0.76–0.83) | 0.8442 (0.75–0.85) | 0.8043 (0.75–0.83) | 0.8234 (0.78–0.85) |
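For readers unfamiliar with the metric, the following is a minimal sketch of how an AUROC value and a 95% interval of the kind reported in this table could be computed. The table does not state how its confidence intervals were obtained; the percentile bootstrap used here is an assumption, and the function names `auroc` and `bootstrap_ci` are illustrative only.

```python
import random

def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (assumed method, not confirmed by the source)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # resample contains a single class; draw again
            continue
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data: label 1 = malignant, label 0 = benign; scores are classifier outputs.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))        # 0.75
print(bootstrap_ci(labels, scores))
```

With only four samples the interval is very wide. Intervals as narrow as those in the table require evaluation sets of realistic size.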

[…] refers to the baseline without any data augmentation. […] refers to flipping, cropping, rotation, translation, and noise injection operations. […] refers to image-to-image generation models. ⑩⑪⑫ refer to text-to-image generation models and are fine-tuned using our thyroid image-level features. ⑪ is the Tiger Model trained and used for generation with only disease subtype names as prompts. ⑫ is the Tiger Model trained and used for generation with the detailed description text of the nodule image as prompts.
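The basic operations named in the footnote (flipping, cropping, rotation, translation, and noise injection) can be sketched as follows. This is an illustration on a grayscale image stored as a nested list of 0–255 integers, not the pipeline used in the paper. All function names, the 90-degree rotation step, and the Gaussian noise model are assumptions made for the example.

```python
import random

def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]

def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def crop(img, top, left, height, width):
    """Keep the height x width window whose upper-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]

def translate(img, dx, fill=0):
    """Shift right by dx pixels (left if dx < 0), padding the exposed edge with `fill`."""
    w = len(img[0])
    out = []
    for row in img:
        if dx >= 0:
            out.append([fill] * min(dx, w) + row[:max(w - dx, 0)])
        else:
            out.append(row[-dx:] + [fill] * min(-dx, w))
    return out

def add_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel and clip to the 0-255 range."""
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row] for row in img]

img = [[1, 2, 3],
       [4, 5, 6]]
print(flip_horizontal(img))   # [[3, 2, 1], [6, 5, 4]]
print(rotate90(img))          # [[4, 1], [5, 2], [6, 3]]
print(crop(img, 0, 1, 2, 2))  # [[2, 3], [5, 6]]
print(translate(img, 1))      # [[0, 1, 2], [0, 4, 5]]
print(add_noise(img, 10, random.Random(0)))
```

In practice these transforms are usually applied with random parameters at training time, so each epoch sees a slightly different version of every image.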