Extended Data Fig. 6: Functional biomarker discovery using interpretable machine learning analysis.
From: Interpretable inflammation landscape of circulating immune cells

(a) Genes ranked top to bottom by importance (absolute d-SHAP value), shown with max-normalized expression levels computed per cell type (Level1) for selected diseases. From left to right: top-ranked genes for T CD4 Naive cells in RA, and for monocytes and pDC in SLE patients. (b) Importance rank (absolute d-SHAP value) of the CYBA gene in every combination of cell type (Level1) and disease. (c) Scatter plots of d-SHAP values against aggregated s-SHAP values for the monocyte population in specific diseases (first row: PS, PSA, CD; second row: UC, Asthma, COPD; from left to right). (d) Importance rank (absolute d-SHAP value) of the IFITM1 gene in every combination of cell type (Level1) and disease. (e) Scatter plots of d-SHAP values against aggregated s-SHAP values for the T CD4 Non-Naive (top) and ILC (bottom) populations in specific diseases (Asthma, COPD, and Cirrhosis, from left to right). In panels (a), (b), and (d), genes expressed in less than 5% of the selected cell population were dropped first. In panels (c) and (e), the top 20 genes by d-SHAP are marked in turquoise; of these, the genes that are also among the top 20 by s-SHAP are marked in purple. The gene of interest is annotated in red.
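
The filtering, ranking, and highlighting steps described in this legend can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes one d-SHAP value, one aggregated s-SHAP value, and one expressing-cell fraction per gene for a single cell type and disease. The function names, the use of absolute values when selecting the top 20 in panels (c) and (e), and the "grey" label for unhighlighted genes are all hypothetical choices made for illustration.

```python
import numpy as np

def importance_ranks(d_shap, expr_fraction, min_fraction=0.05):
    """1-based rank by |d-SHAP| (1 = most important).

    Genes expressed in less than `min_fraction` of cells are dropped
    before ranking and receive rank 0 (assumed convention).
    """
    d_shap = np.asarray(d_shap, dtype=float)
    keep = np.asarray(expr_fraction, dtype=float) >= min_fraction
    kept_idx = np.flatnonzero(keep)
    # Sort the retained genes by decreasing absolute d-SHAP value.
    order = kept_idx[np.argsort(-np.abs(d_shap[kept_idx]), kind="stable")]
    ranks = np.zeros(d_shap.shape[0], dtype=int)
    ranks[order] = np.arange(1, order.size + 1)
    return ranks

def highlight_colors(d_shap, s_shap, top_n=20):
    """Color label per gene: turquoise for the top `top_n` by d-SHAP,
    purple if also in the top `top_n` by s-SHAP, grey otherwise.

    Ranking by absolute value is an assumption of this sketch.
    """
    d_abs = np.abs(np.asarray(d_shap, dtype=float))
    s_abs = np.abs(np.asarray(s_shap, dtype=float))
    d_top = set(np.argsort(-d_abs, kind="stable")[:top_n].tolist())
    s_top = set(np.argsort(-s_abs, kind="stable")[:top_n].tolist())
    colors = ["grey"] * d_abs.shape[0]
    for i in d_top:
        colors[i] = "purple" if i in s_top else "turquoise"
    return colors
```

Applying `importance_ranks` to each cell type and disease combination and reading off the entry for one gene would give the kind of rank summary shown in panels (b) and (d).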