Fig. 3: GEMSTAT performance measures and parameters.
From: Mechanistic analysis of enhancer sequences in the estrogen receptor transcriptional program

a Scatter plot showing the training and validation performance (area under the ROC curve, AUROC) of the 4624 GEMSTAT models in the constructed ensemble. Each point represents a model; the Y-axis shows its training AUROC and the X-axis its validation AUROC. Points in red indicate models that are in the top 150 of the ensemble by either validation AUROC or validation AUPRC; these models were selected for testing on unseen data (c). b Same as (a), except that the performance metric shown is the area under the PR curve (AUPRC) instead of AUROC; red points have the same meaning as in (a). c Scatter plot of the test performance (AUROC and AUPRC) of the models colored red in (a, b). d, e ROC and PR curves indicating the test performance of the model with the best AUPRC on the test set. Color bar indicates the raw prediction threshold values used to generate the ROC and PR curves. f Heatmap representation of the GEMSTAT ensemble model parameters. Rows represent the 244 GEMSTAT models forming the ensemble, which fall into three broad clusters. Columns correspond to model parameter values belonging to the “Activation”, “Binding”, or “Cooperativity” categories. Larger values of the Binding parameter indicate greater binding potential for a TF. An Activation parameter greater/less than 1 represents an activating/repressive role for a TF, respectively, and its absolute log10-transformed value represents the regulatory strength of the TF. A Cooperativity parameter greater/less than 1 represents a cooperative/antagonistic interaction, respectively, and its absolute log10-transformed value represents the strength of the interaction.
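
The selection rule in (a, b) and the parameter reading in (f) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `select_top_models` and `interpret_parameter` are hypothetical, the direction labels are invented for illustration, and ties in the ranking are broken by model index, which the caption does not specify.

```python
import math


def select_top_models(val_auroc, val_auprc, k=150):
    """Return sorted indices of models ranked in the top k by
    validation AUROC or by validation AUPRC (union of both sets).

    Assumption: ties are broken by model index (stable sort).
    """
    def top_k(scores):
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return set(order[:k])

    return sorted(top_k(val_auroc) | top_k(val_auprc))


def interpret_parameter(value):
    """Read an Activation or Cooperativity parameter as in panel (f).

    Direction is set by comparison with 1; strength is |log10(value)|.
    The labels "positive"/"negative"/"neutral" are illustrative only:
    positive = activating or cooperative, negative = repressive or
    antagonistic.
    """
    if value <= 0:
        raise ValueError("parameter must be positive to take log10")
    strength = abs(math.log10(value))
    if value > 1:
        direction = "positive"
    elif value < 1:
        direction = "negative"
    else:
        direction = "neutral"
    return direction, strength


if __name__ == "__main__":
    # Toy scores for four hypothetical models, keeping the single best per metric.
    print(select_top_models([0.9, 0.5, 0.7, 0.6], [0.2, 0.8, 0.3, 0.1], k=1))
    # A parameter of 100 and one of 0.01 have equal strength, opposite direction.
    print(interpret_parameter(100.0))
    print(interpret_parameter(0.01))
```

Because the selection is a union of two top-150 sets, it can contain anywhere from 150 to 300 models, depending on how much the two rankings overlap.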