Fig. 6: Gradient-weighted token attribution by model.

Wordpiece-level attributions are normalized per note to sum to 1 and then averaged across notes (5-fold CV). To avoid rare-token artefacts, only tokens that appear in ≥ 50 notes ( ≥ 100 total occurrences) are considered. Bars show the mean normalized attribution share for the top 20 tokens per model. Values printed on the bars represent the distribution of attribution mass among the displayed top tokens, not corpus-level frequency. Models trained by 25% of labeled data were used to extract these tokens.