Table 1 Parameter counts of the pretrained models. The models comprise the EfficientNet variants (B0-B7) and CLIP-based architectures with ResNet (RN50, RN101, RN50x4, RN50x16, RN50x64) and vision transformer (ViT) backbones. These models are evaluated for their generalization performance in our benchmarking testbed.
From: A practical generalization metric for deep networks benchmarking

| EfficientNet | # Params | CLIP | # Params |
|---|---|---|---|
| efficientnet-b0 | 5.3M | RN50 | 38M |
| efficientnet-b1 | 7.8M | RN101 | 56M |
| efficientnet-b2 | 9.2M | RN50x4 | 87M |
| efficientnet-b3 | 12M | RN50x16 | 167M |
| efficientnet-b4 | 19M | RN50x64 | 420M |
| efficientnet-b5 | 30M | ViT-B/32 | 87M |
| efficientnet-b6 | 43M | ViT-B/16 | 86M |
| efficientnet-b7 | 66M | ViT-L/14 | 304M |
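
Because the two families are listed side by side, comparing models across families by size takes a little bookkeeping. The following is a minimal sketch, not part of the paper's testbed, that copies the counts from Table 1 and ranks all sixteen models by parameter count. The names `PARAMS`, `parse_count`, and `models_by_size` are hypothetical helpers introduced here for illustration.

```python
# Parameter counts copied verbatim from Table 1 (as printed, e.g. "5.3M").
PARAMS = {
    "efficientnet-b0": "5.3M", "efficientnet-b1": "7.8M",
    "efficientnet-b2": "9.2M", "efficientnet-b3": "12M",
    "efficientnet-b4": "19M", "efficientnet-b5": "30M",
    "efficientnet-b6": "43M", "efficientnet-b7": "66M",
    "RN50": "38M", "RN101": "56M", "RN50x4": "87M",
    "RN50x16": "167M", "RN50x64": "420M",
    "ViT-B/32": "87M", "ViT-B/16": "86M", "ViT-L/14": "304M",
}

# Hypothetical helper: the suffix convention (K, M, B) is an assumption;
# Table 1 itself only uses "M".
_SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_count(text: str) -> int:
    """Convert a printed count such as '5.3M' into an integer."""
    scale = _SUFFIXES[text[-1]]
    return round(float(text[:-1]) * scale)


def models_by_size() -> list[tuple[str, int]]:
    """Return (model name, parameter count) pairs, smallest model first."""
    pairs = ((name, parse_count(count)) for name, count in PARAMS.items())
    return sorted(pairs, key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, count in models_by_size():
        print(f"{name:<16} {count:>12,}")
```

Sorting this way makes the overlap between the families visible: the smaller CLIP models fall within the size range of the larger EfficientNet variants, while the largest CLIP models are several times bigger than efficientnet-b7.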