Fig. 5: Heatmap of the learned probability scores assigned to each feature extractor by our mixture-of-experts (MoE) framework.

Source tasks are ordered from left to right by increasing dataset size. Contrary to the heuristic of transferring from the largest source task, our MoE framework usually did not assign the highest score to the largest source task. Instead, the model often assigned scores that were physically intuitive.