Table 3 Research questions as outlined in Methods.

From: Multimodal machine learning in precision health: A scoping review

RQ1

• The literature published in this area, as characterized in the Results section, reflects growing and global interest.

• This interest is fueled by a desire to improve predictive capabilities by relying on complementary and correlative (reinforcing) data. The papers surveyed and included in this review bore this out, with a 6.4% increase in accuracy.

• The most common health topics were neurology and cancer. This is likely fostered by curated databases that lend themselves to multi-modality predictions, such as the Alzheimer’s Disease Neuroimaging Initiative165 and The Cancer Genome Atlas Program166.

• The dominance of early data fusion methods is likely attributable to three reasons:

◦ Using 2 modalities rather than 3 means less work overall in model building and deployment.

◦ EHR and image data do not require the extensive digital conversion for models that text does.

◦ Early fusion is built on a single model with a multitude of feature inputs and is typically less computationally complex than intermediate or late fusion.

• Articles seldom compared machine learning findings against those of human clinicians.

• Several did perform comparisons between uni-modal and multi-modal predictions, with the majority having found a consistent improvement in classification accuracy, sensitivity, and specificity53,113,159 when leveraging multi-modal data.

• Performance benefits were seemingly not limited to any particular subtype of multi-modal strategy detectable in our metadata.

• General recommendation: attempt multi-modal data integration to improve performance and better mirror a human expert by creating a higher-validity environment from which to make clinical decisions.
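The contrast between early and late fusion drawn in the RQ1 points above can be sketched in a few lines. This is a minimal illustration only, not a method taken from any reviewed paper: the feature values, the `mean_score` stand-in model, and the function names are all hypothetical.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Model = Callable[[Vector], float]  # maps a feature vector to a score


def early_fusion(modalities: Sequence[Vector]) -> List[float]:
    """Concatenate per-modality features into one input for a single model."""
    fused: List[float] = []
    for features in modalities:
        fused.extend(features)
    return fused


def late_fusion(modalities: Sequence[Vector], models: Sequence[Model]) -> float:
    """Score each modality with its own model, then average the scores."""
    if len(modalities) != len(models):
        raise ValueError("need exactly one model per modality")
    scores = [model(features) for model, features in zip(models, modalities)]
    return sum(scores) / len(scores)


def mean_score(features: Vector) -> float:
    """Hypothetical stand-in for a trained model: the mean of its inputs."""
    return sum(features) / len(features)


# Hypothetical toy inputs: two EHR-derived values and three image-derived values.
ehr_features = [0.7, 0.1]
image_features = [0.2, 0.9, 0.4]

# Early fusion: one model sees a single five-element vector.
fused_input = early_fusion([ehr_features, image_features])
early_score = mean_score(fused_input)

# Late fusion: two models, one per modality, combined at the decision level.
late_score = late_fusion([ehr_features, image_features], [mean_score, mean_score])
```

Early fusion needs only one model to build and deploy, which is consistent with the lower complexity noted above; late fusion requires one model per modality plus a rule for combining their outputs (a simple average here, chosen purely for illustration).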

RQ2

• The analysis techniques are varied, and no gold standard machine learning method has yet emerged in the field. This is likely because the field is relatively new and still emerging.

• The varied techniques implemented are highlighted in Table 3.

• N-fold cross-validation was the most common validation approach and is a robust estimator in the face of bias within a dataset.

• Strength of generalizability stems from either the dataset containing multi-site/location patient data to begin with, or the use of an external dataset from a remote location167.
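The N-fold cross-validation mentioned above can be illustrated with a minimal index splitter. This is a sketch under simplifying assumptions: the function name `k_fold_indices` is hypothetical, and samples are split in order without the shuffling or stratification that practical pipelines usually add.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists so each sample is held out exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Ten hypothetical patients split into three folds.
folds = list(k_fold_indices(10, 3))
```

A model would be fitted on each `train` list and scored on the matching `test` list, with the per-fold scores then averaged. External validation, by contrast, keeps an entire dataset from another site out of every fold.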

RQ3

• The health contexts predominantly impacted include neurology and cancer.

• No domain or method reported building translational models that obtained FDA (or equivalent) approval for use in clinical circumstances.

• Models should be compared more readily against physician decision makers168,169,170,171. This will guide the validity of the environments suitable to machine learning, increase adoption, and permit FDA approval of these tools172.