Fig. 4: The bio-primed model systematically identifies CN biomarkers with significantly stronger co-dependency and biological relevance to target genes compared to the baseline model.
From: Bio-primed machine learning to enhance discovery of relevant biomarkers

A Scatter plot showing the correlation coefficient between gene-level CN variation and UTP4 dependency (y-axis) across genes sorted by genomic location (x-axis). CN biomarkers identified by the baseline and bio-primed models are colored in blue and red, respectively. The size of each gene symbol is proportional to the absolute coefficient derived from either model. Points are colored by chromosome. Genes identified by the baseline model, the bio-primed model, and both are colored in red, blue, and purple, respectively. B Boxplot showing UTP4 co-dependency (Pearson correlation; y-axis) for the top mutually exclusive biomarkers derived from the bio-primed (red) and baseline (blue) models (x-axis). Biomarkers derived from the bio-primed model show significantly greater co-dependency. C Scatter plots showing the association between UTP4 dependency (y-axis) and CN variation as well as co-dependency (x-axis) for DDX10 and BRIX1. The Pearson correlation coefficient and p-value for each trendline are shown in blue. D Empirical cumulative density plot showing increased co-dependency between the target gene and biomarkers derived from the bio-primed (blue) model compared to the baseline (red) model.
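
The two statistics named in panels B to D, the Pearson correlation and the empirical cumulative distribution, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `pearson_r` and `ecdf` and all numeric values are hypothetical, chosen only to show how a co-dependency score and a cumulative curve might be computed from per-cell-line values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def ecdf(values):
    """Return the sorted values and the cumulative fraction at each sorted position."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered, [(i + 1) / n for i in range(n)]


# Hypothetical toy data: one value per cell line (not taken from the paper).
target_dependency = [-1.2, -0.8, -0.5, -0.9, -0.1, -0.3]
biomarker_cn = [0.9, 1.1, 1.6, 1.0, 2.1, 1.8]

# Correlation between a candidate biomarker and the target gene's dependency.
r = pearson_r(biomarker_cn, target_dependency)

# Hypothetical per-biomarker correlations for two models, compared as ECDFs.
bio_primed_scores = [0.42, 0.35, 0.51, 0.29]
baseline_scores = [0.12, 0.20, 0.08, 0.31]
bio_x, bio_y = ecdf(bio_primed_scores)
base_x, base_y = ecdf(baseline_scores)
```

Plotting `bio_y` against `bio_x` and `base_y` against `base_x` as step curves would give a comparison of the kind described for panel D; a curve shifted to the right indicates higher correlations overall.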