Fig. 7: Data cleaning process.
From: UKB-MDRMF: a multi-disease risk and multimorbidity framework based on UK biobank data

For continuous and integer variables, we first apply special encoding techniques and then handle each variable as continuous or discrete depending on whether more than 20% of its values are identical. For categorical variables, unordered encodings are transformed into binary variables, whereas ordered encodings undergo special encoding techniques before being handled as discrete. Icons provided by Icons8 (https://icons8.com).
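
The routing described in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, the caption does not specify the special encoding step (so it is omitted here), and two assumptions are made: "identical values" is read as the share of the single most frequent value, and exceeding the 20% threshold routes a variable to discrete handling.

```python
from collections import Counter

# The 20% figure comes from the caption; how it is applied is an assumption.
IDENTICAL_SHARE_THRESHOLD = 0.20


def most_common_share(values):
    """Fraction of non-missing values equal to the single most frequent value."""
    present = [v for v in values if v is not None]
    if not present:
        return 0.0
    _, count = Counter(present).most_common(1)[0]
    return count / len(present)


def route_numeric(values):
    """Assumed rule: more than 20% identical values -> discrete, else continuous."""
    if most_common_share(values) > IDENTICAL_SHARE_THRESHOLD:
        return "discrete"
    return "continuous"


def to_binary_variables(values):
    """Turn an unordered categorical column into one 0/1 column per category."""
    categories = sorted({v for v in values if v is not None})
    return {c: [int(v == c) for v in values] for c in categories}
```

For example, a column in which 30% of the entries share one value would be routed to discrete handling under these assumptions, while a column of ten distinct values would be kept as continuous.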