Fig. 4: Retrospective validation results. | Nature Communications

From: Development and deployment of a histopathology-based deep learning algorithm for patient prescreening in a clinical trial

A Confusion matrix with sensitivity and specificity metrics (target and achieved). Note that this dataset was enriched for FGFR+ patients to achieve a statistical power of 93% (FGFR prevalence of 43%). B Simulated confusion matrix for 1000 patients, assuming a typical FGFR+ prevalence in the trial of ~15% (ref. 10) and the observed algorithm performance (shown in A). Note that 28.7% of patients screened by the image-based device would not be recommended for FGFR molecular testing. C Performance stratified by central laboratory site. The left-side plot shows receiver operating characteristic (ROC) curves and area under the curve (AUC) values per site. The right-side plot shows sensitivity point estimates with 95% confidence intervals (CI) per site. The sites total n = 348 independent samples, distributed across Sites #1 to #5 as n = 210, n = 47, n = 35, n = 30 and n = 26, respectively. D Simulated 3-tier FGFR model showing potential clinical utility for prioritizing (or de-prioritizing) patients for molecular testing in a standard clinical setting (where molecular testing may not be part of standard of care).
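
The simulation described for panel B can be sketched as simple arithmetic: split a cohort by prevalence, then apply sensitivity and specificity to each group. The following is a minimal illustration, not the authors' code. The function name `simulate_confusion_matrix` is hypothetical, and the sensitivity and specificity passed in the example are placeholders, since the achieved values appear only in panel A and not in this caption. It also assumes that the patients "not recommended for molecular testing" are those the algorithm predicts as negative (true negatives plus false negatives).

```python
def simulate_confusion_matrix(n_patients, prevalence, sensitivity, specificity):
    """Expected confusion-matrix counts for a screening test.

    All inputs are supplied by the caller; nothing here is taken
    from the paper except the general setup (cohort size, prevalence).
    """
    positives = n_patients * prevalence      # truly FGFR+ patients
    negatives = n_patients - positives       # truly FGFR- patients

    tp = sensitivity * positives             # FGFR+ flagged for testing
    fn = positives - tp                      # FGFR+ missed by the algorithm
    tn = specificity * negatives             # FGFR- correctly screened out
    fp = negatives - tn                      # FGFR- still sent for testing

    return {
        "TP": tp,
        "FN": fn,
        "FP": fp,
        "TN": tn,
        # Assumption: predicted-negative patients are the ones
        # not recommended for molecular testing.
        "not_recommended_fraction": (tn + fn) / n_patients,
    }


if __name__ == "__main__":
    # Placeholder performance values, NOT the values reported in panel A.
    result = simulate_confusion_matrix(
        n_patients=1000, prevalence=0.15, sensitivity=0.90, specificity=0.30
    )
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

Substituting the sensitivity and specificity reported in panel A should yield the counts shown in panel B, provided the assumption about predicted-negative patients matches the authors' definition.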
