Fig. 1: TCRD knowledge graphs concept and overview of the meta-path-XGBoost algorithm, MPxgb(AD), and workflow. | Communications Biology

From: Machine learning prediction and tau-based screening identifies potential Alzheimer’s disease genes relevant to immunity

Centered around the knowledge tree, this concept was essential in selecting the data types (Table 1) for the ML algorithms used to impute AD associations for potential proteins/genes. a. Transformation of the knowledge graph into an ML-ready dataset and training of the model. An example metapath, {Target — (member of) → PPI (protein–protein interaction network) ← (member of) — Protein — (associated with) → Disease}, summarizes multiple metapaths for PPI data. b. Evidence weighting by degree-weighted path count (DWPC). c, d. Five-fold cross-validation and test-set performance are used to select the best-performing model: the weighted method (left) gives AUC-ROC = 0.91/0.93 (five-fold CV/test set), and the balanced method (right) gives AUC-ROC = 0.98/0.62 (five-fold CV/test set). e. Feature importance prediction for the AKNA-AD association.
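To make the evidence weighting in panel b concrete, the following is a minimal sketch of a degree-weighted path count on a toy graph. It assumes the commonly used DWPC definition, in which each path contributes the product of (source degree × target degree)^(−w) over its edges and the contributions are summed over all paths that match the metapath. The node names (`T`, `ppi_1`, `P1`, `P2`, `AD`), the helper functions, and the damping values are hypothetical illustrations; the caption does not state the damping exponent or the implementation used in the paper. For brevity the sketch treats edges as undirected and does not check node types.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map per edge type.

    edges: iterable of (node_u, edge_type, node_v) triples.
    Returns adj[edge_type][node] -> set of neighbouring nodes.
    """
    adj = defaultdict(lambda: defaultdict(set))
    for u, etype, v in edges:
        adj[etype][u].add(v)
        adj[etype][v].add(u)
    return adj


def dwpc(adj, source, target, metapath, damping=0.4):
    """Degree-weighted path count between source and target.

    metapath: list of edge types to follow in order.
    damping:  exponent w; w = 0 reduces DWPC to a plain path count.
    """
    total = 0.0
    # Each stack entry: (current node, step index, product so far, visited nodes)
    stack = [(source, 0, 1.0, (source,))]
    while stack:
        node, step, product, visited = stack.pop()
        if step == len(metapath):
            if node == target:
                total += product
            continue
        etype = metapath[step]
        for nxt in adj[etype].get(node, ()):
            if nxt in visited:  # paths may not revisit a node
                continue
            # Degrees are counted within the edge type being traversed.
            deg_u = len(adj[etype][node])
            deg_v = len(adj[etype][nxt])
            weight = (deg_u * deg_v) ** (-damping)
            stack.append((nxt, step + 1, product * weight, visited + (nxt,)))
    return total


# Hypothetical toy graph: target T shares a PPI network with P1 and P2,
# both of which are associated with the disease node AD.
toy_edges = [
    ("T", "member_of", "ppi_1"),
    ("P1", "member_of", "ppi_1"),
    ("P2", "member_of", "ppi_1"),
    ("P1", "associated_with", "AD"),
    ("P2", "associated_with", "AD"),
]
adj = build_adjacency(toy_edges)
ppi_metapath = ["member_of", "member_of", "associated_with"]

path_count = dwpc(adj, "T", "AD", ppi_metapath, damping=0.0)
damped = dwpc(adj, "T", "AD", ppi_metapath, damping=0.5)
print(path_count, damped)
```

With damping set to 0 the function returns the raw number of matching paths. A positive damping exponent down-weights paths that pass through high-degree nodes, such as a large PPI network, so that hubs contribute less evidence per path.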
