Fig. 8 | Scientific Reports

Fig. 8

From: An autoencoder and vision transformer based interpretability analysis on the performance differences in automated staging of second and third molars

Comparison of mean stage images and attention maps for tooth 38. Top row: Mean stage images show a blurrier average per stage, indicating that the tooth shapes for 38 contain more intra-class variation. Second row: Mean ViT attention maps seem similar to those of tooth 37 and remain plausible; however, they do not incorporate the information below the mid-region of the tooth and thus do not explain the lower accuracy. Third row: Mean reconstructions are less visually similar to the mean stage images, indicating that the mean images are not the optimal representation of stage morphology. Bottom row: For all stages, the attention maps from the AE + ViT pipeline focus on the lower region of the tooth more than those of the ViT, indicating that root formation informed the classification process.
