Extended Data Fig. 3: Classification performance of TADA and effect of features on TADA’s prediction performance.
From: Identification of plant transcriptional activation domains

a, Training and validation loss of TADA. b, TADA’s performance in terms of precision, recall, area under the receiver operating characteristic curve (AUC), accuracy, area under the precision-recall curve (AUPR) and F1 score. TADA was trained three separate times: on random peptides (ref. 20), on PADI (referred to as “plant TFs”), and on random peptides and PADI combined. c, TADA outperforms all published AD predictors. We compared the performance of TADA with that of three published AD predictors (ADpred, PADDLE and a composition model; refs. 4, 10, 20) using a hand-curated list of 599 ADs from 451 human TFs. For each TF, we predicted ADs with each predictor and considered predictions that overlapped a known annotation by more than 10 amino acids to be true positives. TADA made the most predictions and had the highest sensitivity and the highest F1 score. d, Z-score-normalized SHAP values, leading to the selection of 8 features with a z-score above 1. e, Normalized SHAP values, ranked from most to least important overall, for fragments scoring above 1 for each of the 6 identified AD subclasses.
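The overlap criterion in panel c can be illustrated with a minimal sketch. It is not the authors' code: the interval convention (0-based, end-exclusive residue positions), the function names `overlap_len` and `score_predictions`, and the way precision and sensitivity are counted (per prediction and per known AD, respectively) are all assumptions made for illustration. Only the "more than 10 amino acids" threshold comes from the legend.

```python
from typing import List, Tuple

# Assumed convention: (start, end) residue positions, 0-based, end exclusive.
Interval = Tuple[int, int]


def overlap_len(a: Interval, b: Interval) -> int:
    """Number of residues shared by two intervals (0 if they are disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def score_predictions(
    predicted: List[Interval],
    known: List[Interval],
    min_overlap: int = 10,
) -> Tuple[float, float, float]:
    """Return (precision, sensitivity, F1) for one TF.

    A prediction is a true positive if it overlaps any known AD by MORE than
    `min_overlap` residues. Sensitivity is computed here as the fraction of
    known ADs hit by at least one prediction (an assumption; the legend does
    not state how sensitivity was tallied).
    """
    true_pos = sum(
        1 for p in predicted
        if any(overlap_len(p, k) > min_overlap for k in known)
    )
    recovered = sum(
        1 for k in known
        if any(overlap_len(p, k) > min_overlap for p in predicted)
    )
    precision = true_pos / len(predicted) if predicted else 0.0
    sensitivity = recovered / len(known) if known else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return precision, sensitivity, f1


# Hypothetical coordinates, for illustration only.
known_ads = [(0, 40), (100, 140)]
predicted_ads = [(5, 45), (200, 240)]
print(score_predictions(predicted_ads, known_ads))
```

In this toy case, one of the two predictions overlaps a known AD by 35 residues and the other overlaps nothing, so one of two predictions is a true positive and one of two known ADs is recovered. An overlap of exactly 10 residues would not count, because the threshold is strict.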