Fig. 6: Negative predictive value and positive predictive value at 5 years on the training cohort.

Determination of the stratification thresholds on the training cohort. The left panel shows the false omission rate (equal to 1 − negative predictive value) at 5 years as a function of the decision threshold. The right panel shows the positive predictive value (PPV) at 5 years as a function of the decision threshold. The machine learning model provides a relapse risk for every time horizon t observed in the training dataset. For our use case, we set t to 5 years, the standard horizon clinicians consider when building a surveillance plan for their patients. Our primary goal is to identify a sizeable group of patients with a very low risk of recurrence at 5 years. To do so, we plot the false omission rate against the cumulative frequency of patients in the very low-risk group as the risk threshold varies. We set the very low-risk threshold at the point where the false omission rate begins to increase markedly. We then apply a similar strategy to the PPV to define a high-risk group of patients, looking for a PPV "plateau" to determine the risk threshold. The same method is used to separate the medium- and low-risk groups.
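
The threshold sweep described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a binary outcome (relapse within 5 years, yes or no) with no censored follow-up, whereas a survival setting would normally require a censoring-aware estimate. The function name `threshold_curves` and its arguments are hypothetical.

```python
import numpy as np

def threshold_curves(risk, relapsed):
    """Sweep every distinct predicted risk as a candidate threshold.

    Assumption (not from the source): `relapsed` is a fully observed
    binary 5-year outcome, so censoring is ignored.

    Returns one row per threshold with columns:
    threshold, fraction of patients at or below it,
    false omission rate in that group (1 - NPV),
    PPV in the complementary (above-threshold) group.
    """
    risk = np.asarray(risk, dtype=float)
    relapsed = np.asarray(relapsed, dtype=bool)
    rows = []
    for thr in np.unique(risk):
        low = risk <= thr           # candidate low-risk group
        high = ~low                 # candidate high-risk group
        frac_low = low.mean()       # cumulative frequency of patients
        fomr = relapsed[low].mean() # relapses among predicted low risk
        ppv = relapsed[high].mean() if high.any() else np.nan
        rows.append((thr, frac_low, fomr, ppv))
    return np.array(rows)
```

Plotting the third column against the second gives a curve of the kind shown in the left panel; the threshold would then be read off where that curve starts to rise, which remains a visual judgment rather than an automated rule in this sketch.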