Fig. 1

Overview of the framework developed to integrate EHRs with a BKG to identify patient endotypes. In Step 1, the cohort is created, its clinical characteristics are extracted, and these characteristics are mapped to medical concepts in the BKG. In Step 2, the BKG is enriched with candidate medical concepts that are derived from the EHRs but absent from the BKG. Step 3 covers graph representation learning: an embedding is generated for each node in the BKG and used to augment the standard binary patient representation. In Step 4, clustering is applied to the augmented patient representation to identify subgroups, which are then described using clinical characteristics sourced from the EHRs and relations identified in the BKG.
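
As an illustration of Steps 3 and 4 only, the following minimal sketch shows one possible way to augment a binary patient vector with node embeddings and then cluster the result. The caption does not specify how the augmentation or the clustering is performed, so everything here is an assumption: the concept names, the two-dimensional embeddings, the choice to concatenate the binary vector with the mean embedding of the concepts present, and the plain k-means routine (with the hypothetical helpers `augment` and `kmeans`) are all invented for illustration and are not the authors' method.

```python
from typing import Dict, List

# Toy inputs (all hypothetical): three medical concepts and made-up
# two-dimensional node embeddings standing in for learned BKG embeddings.
concepts = ["asthma", "eczema", "copd"]
embeddings: Dict[str, List[float]] = {
    "asthma": [1.0, 0.0],
    "eczema": [0.9, 0.1],  # deliberately close to "asthma"
    "copd": [0.0, 1.0],
}

# Standard binary patient representation: 1 if the concept is recorded.
patients = [
    [1, 0, 0],  # asthma only
    [0, 0, 1],  # copd only
    [0, 1, 0],  # eczema only
    [1, 1, 0],  # asthma and eczema
]


def augment(binary: List[int]) -> List[float]:
    """Assumed augmentation: binary vector + mean embedding of present concepts."""
    dim = len(next(iter(embeddings.values())))
    present = [embeddings[c] for c, flag in zip(concepts, binary) if flag]
    if present:
        mean = [sum(v[i] for v in present) / len(present) for i in range(dim)]
    else:
        mean = [0.0] * dim  # patient with no mapped concepts
    return [float(b) for b in binary] + mean


def kmeans(points: List[List[float]], k: int, iters: int = 20) -> List[int]:
    """Plain k-means, initialised deterministically with the first k points."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


augmented = [augment(p) for p in patients]
labels = kmeans(augmented, k=2)
print(labels)
```

In this toy data the eczema-only patient is equally far from the asthma-only and COPD-only patients in the binary representation, and it is the appended embedding that places it with the asthma patients. Real embeddings, a real clustering method, and a principled choice of the number of clusters would replace these placeholders.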