Fig. 9: Analysis of personalized risk factors shows multiple paths towards risk of a composite event. | Nature Communications

From: Machine learning can identify newly diagnosed patients with CLL at high risk of infection

a Input interface of the CLL-TIM webserver. CLL-TIM can be used with whatever patient data are available. After data input, the physician is provided with: the predicted risk (high or low) of a composite event within the next 2 years; a confidence value indicating the reliability of that patient’s prediction; and a set of patient-specific risk factors that drove the patient towards the predicted risk. b Personalized risk factors provided by CLL-TIM’s webserver. Examples are shown for three patients in the internal test cohort whose outcomes CLL-TIM predicted correctly and with high confidence. CLL-TIM’s probabilistic outputs were 0.09 (low-risk, high-confidence) for the first patient, and 0.87 and 0.71 (high-risk, high-confidence) for the remaining two patients. To calculate these risk factors, for each patient, the SHAP value of each of CLL-TIM’s 228 features was extracted and averaged across CLL-TIM’s 28 base-learners. The features were then ranked positively and negatively to extract the top 5 features pushing the patient towards high risk and those pushing the patient towards low risk. c Co-occurrence of the top personalized high-risk factors. Shown are all features that were among the top 3 personalized high-risk factors for the 970 patients in the entire Danish cohort with a composite outcome. In total, these 970 high-risk patients had 330 unique combinations of 34 features in their top 3 personalized high-risk factors. Figure generated using the UpSet web interface. d t-distributed Stochastic Neighbor Embedding (t-SNE) clustering of the top 3 personalized high-risk factors. t-SNE was performed using scikit-learn with default parameters on the top 3 personalized high-risk factors for the 970 patients with a composite outcome. Color coding is according to c.
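
The ranking step described for panel b can be illustrated with a minimal sketch. It is not CLL-TIM's actual code. It assumes the per-learner SHAP values for one patient are already available as a list of rows (one row per base-learner, one column per feature), that a positive averaged SHAP value pushes towards high risk, and that the same cut-off is applied in both directions. The function name `personalized_risk_factors`, the `top_k` parameter, and the feature names are hypothetical.

```python
from statistics import fmean


def personalized_risk_factors(shap_by_learner, feature_names, top_k=5):
    """Rank one patient's features by SHAP value averaged over base-learners.

    shap_by_learner: one row per base-learner, one column per feature
    (assumed layout). Positive averages are treated as pushing towards
    high risk, negative averages towards low risk (assumed sign convention).
    """
    # Column-wise mean: average each feature's SHAP value across base-learners.
    mean_shap = [fmean(column) for column in zip(*shap_by_learner)]

    # Sort features from most positive to most negative averaged SHAP value.
    ranked = sorted(zip(feature_names, mean_shap), key=lambda p: p[1], reverse=True)

    # Strongest positive contributions first, at most top_k of them.
    high = [pair for pair in ranked if pair[1] > 0][:top_k]
    # Strongest negative contributions first, at most top_k of them.
    low = [pair for pair in reversed(ranked) if pair[1] < 0][:top_k]
    return high, low


# Toy example: 3 base-learners, 4 hypothetical features.
feature_names = ["feature_a", "feature_b", "feature_c", "feature_d"]
shap_by_learner = [
    [0.2, -0.1, 0.0, 0.05],
    [0.4, -0.3, 0.0, 0.05],
    [0.3, -0.2, 0.0, 0.05],
]
high, low = personalized_risk_factors(shap_by_learner, feature_names, top_k=2)
print([name for name, _ in high])  # features pushing towards high risk
print([name for name, _ in low])   # features pushing towards low risk
```

In this toy input, features with an averaged SHAP value of exactly zero appear in neither list, which is one possible way to handle ties at zero.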