Table 4 Data distribution for the coarse-labeled training set used in fine-tuning.

From: Optimizing skin disease diagnosis: harnessing online community data with contrastive learning and clustering techniques

| Disease Name | Number of Images | Disease Name | Number of Images |
| --- | --- | --- | --- |
| Acne | 36942 (27.71%) | Lupus erythematosus | 2515 (1.89%) |
| Actinic keratosis | 822 (0.61%) | Melasma | 5256 (3.94%) |
| Alopecia areata | 1728 (1.30%) | Palmoplantar pustulosis | 5970 (4.48%) |
| Androgenetic alopecia | 1068 (0.80%) | Pigmented nevus | 2149 (1.61%) |
| Blue nevus | 1008 (0.75%) | Psoriasis | 6483 (4.86%) |
| Cutaneous amyloidosis | 6793 (5.10%) | Seborrheic dermatitis | 16380 (12.19%) |
| Eczema dermatitis | 11448 (8.59%) | Seborrheic keratosis | 3437 (2.58%) |
| Epidermal cyst | 4550 (3.41%) | Tinea | 2116 (1.59%) |
| Folliculitis | 900 (0.67%) | Urticaria | 3868 (2.90%) |
| Herpes zoster | 999 (0.75%) | Viral warts | 4866 (3.65%) |
| Lichen planus | 8672 (6.50%) | Vitiligo | 5327 (4.00%) |
| Total | | | 133297 (100%) |

 
  1. In addition to the 1.18 million unannotated dermatosis-related skin images used in pre-training, we collected 0.13 million coarse-labeled images from the Internet based on keywords and topics. Physicians standardized the labels with reference to CDISC and ICD-10.
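
For readers who want to check the table's arithmetic, the following is a minimal sketch that recomputes the total and the per-class shares from the image counts listed in Table 4. The variable names (`counts`, `shares`, `imbalance_ratio`) are illustrative assumptions and are not taken from the paper; the largest-to-smallest class ratio is an added convenience for gauging class imbalance, not a statistic the authors report.

```python
# Image counts per coarse label, copied from Table 4.
counts = {
    "Acne": 36942,
    "Actinic keratosis": 822,
    "Alopecia areata": 1728,
    "Androgenetic alopecia": 1068,
    "Blue nevus": 1008,
    "Cutaneous amyloidosis": 6793,
    "Eczema dermatitis": 11448,
    "Epidermal cyst": 4550,
    "Folliculitis": 900,
    "Herpes zoster": 999,
    "Lichen planus": 8672,
    "Lupus erythematosus": 2515,
    "Melasma": 5256,
    "Palmoplantar pustulosis": 5970,
    "Pigmented nevus": 2149,
    "Psoriasis": 6483,
    "Seborrheic dermatitis": 16380,
    "Seborrheic keratosis": 3437,
    "Tinea": 2116,
    "Urticaria": 3868,
    "Viral warts": 4866,
    "Vitiligo": 5327,
}

# Total number of coarse-labeled images across all classes.
total = sum(counts.values())

# Share of each class as a percentage of the total.
shares = {name: 100 * n / total for name, n in counts.items()}

# Ratio between the largest and smallest class (illustrative imbalance measure).
imbalance_ratio = max(counts.values()) / min(counts.values())

if __name__ == "__main__":
    # Print classes from most to least frequent.
    for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:<25} {n:>6} {shares[name]:6.2f}%")
    print(f"{'Total':<25} {total:>6}")
    print(f"Largest/smallest class ratio: {imbalance_ratio:.1f}")
```

Because the shares are recomputed directly from the counts, they can be compared against the printed percentages, which were presumably rounded or truncated to two decimal places.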