Table 2 Model performance of APOE-ε4 count, polygenic risk score, and SNP models in dementia genetic prediction, UCLA ATLAS sample, stratified by genetically inferred ancestryᵃ

From: Improving genetic risk modeling of dementia from real-world data in underrepresented populations

  

| Model | Specification | AUPRC | AUROC | F1 score | Accuracy | Precision | Recall | Specificity |
|---|---|---|---|---|---|---|---|---|
| **Hispanic Latino Americans (N = 610)** | | | | | | | | |
| APOE | ε4 count | 0.308 | 0.652 | 0.424 | 0.707 | 0.357 | 0.524 | 0.754 |
| *AD-PRS models* | | | | | | | | |
| AD EUR PRS | P-significant | 0.306 | 0.619 | 0.335 | 0.759 | 0.389 | 0.294 | 0.880 |
| | Gene-annotated | 0.288 | 0.615 | 0.387 | 0.397 | 0.245 | 0.921 | 0.260 |
| AD AFR PRS | P-significant | 0.312 | 0.644 | 0.409 | 0.692 | 0.339 | 0.516 | 0.738 |
| | Gene-annotated | 0.305 | 0.648 | 0.427 | 0.666 | 0.330 | 0.603 | 0.682 |
| AD multi-ancestry PRS | P-significant | 0.298 | 0.626 | 0.389 | 0.444 | 0.252 | 0.857 | 0.337 |
| | Gene-annotated | 0.298 | 0.640 | 0.401 | 0.448 | 0.259 | 0.897 | 0.331 |
| *Multi-PRS models* | | | | | | | | |
| PRSs using AD GWASs onlyᵇ | P-significant | 0.312 | 0.643 | 0.415 | 0.644 | 0.314 | 0.611 | 0.653 |
| | Gene-annotated | 0.302 | 0.646 | 0.404 | 0.670 | 0.322 | 0.540 | 0.705 |
| PRSs using AD + Neuro GWASsᶜ | P-significant | 0.283 | 0.617 | 0.382 | 0.661 | 0.306 | 0.508 | 0.700 |
| | Gene-annotated | 0.309 | 0.643 | 0.411 | 0.662 | 0.321 | 0.571 | 0.686 |
| *Elastic Net SNPs models* | | | | | | | | |
| SNPs from AD GWASs only | P-significant | 0.321 | 0.662 | 0.408 | 0.530 | 0.276 | 0.786 | 0.463 |
| | Gene-annotated | 0.351 | 0.679 | 0.436 | 0.602 | 0.308 | 0.746 | 0.564 |
| SNPs from AD + Neuro GWASs | P-significant | 0.359 | 0.715 | 0.472 | 0.633 | 0.336 | 0.794 | 0.591 |
| | Gene-annotated | 0.410 | 0.728 | 0.458 | 0.779 | 0.463 | 0.452 | 0.864 |
| *Non-linear SNPs models* | | | | | | | | |
| SNPs from AD + Neuro GWASs | GBM | 0.304 | 0.634 | 0.381 | 0.707 | 0.337 | 0.437 | 0.777 |
| Gene-annotated SNPs | XGBoost | 0.298 | 0.642 | 0.375 | 0.710 | 0.338 | 0.421 | 0.785 |
| **African Americans (N = 440)** | | | | | | | | |
| APOE | ε4 count | 0.271 | 0.606 | 0.388 | 0.570 | 0.267 | 0.714 | 0.537 |
| *AD-PRS models* | | | | | | | | |
| AD EUR PRS | P-significant | 0.221 | 0.592 | 0.369 | 0.432 | 0.234 | 0.869 | 0.329 |
| | Gene-annotated | 0.226 | 0.573 | 0.348 | 0.318 | 0.213 | 0.952 | 0.169 |
| AD AFR PRS | P-significant | 0.242 | 0.584 | 0.322 | 0.732 | 0.311 | 0.333 | 0.826 |
| | Gene-annotated | 0.241 | 0.581 | 0.344 | 0.584 | 0.246 | 0.571 | 0.587 |
| AD multi-ancestry PRS | P-significant | 0.234 | 0.592 | 0.360 | 0.386 | 0.225 | 0.905 | 0.264 |
| | Gene-annotated | 0.230 | 0.598 | 0.370 | 0.443 | 0.236 | 0.857 | 0.346 |
| *Multi-PRS models* | | | | | | | | |
| PRSs using AD GWASs onlyᵇ | P-significant | 0.238 | 0.589 | 0.358 | 0.527 | 0.242 | 0.690 | 0.489 |
| | Gene-annotated | 0.233 | 0.590 | 0.357 | 0.484 | 0.234 | 0.750 | 0.421 |
| PRSs using AD + Neuro GWASsᶜ | P-significant | 0.187 | 0.516 | 0.311 | 0.195 | 0.186 | 0.952 | 0.017 |
| | Gene-annotated | 0.217 | 0.538 | 0.087 | 0.809 | 0.500 | 0.048 | 0.989 |
| *Elastic Net SNPs models* | | | | | | | | |
| SNPs from AD GWASs only | P-significant | 0.356 | 0.669 | 0.356 | 0.802 | 0.471 | 0.286 | 0.924 |
| | Gene-annotated | 0.421 | 0.678 | 0.342 | 0.834 | 0.704 | 0.226 | 0.978 |
| SNPs from AD + Neuro GWASs | P-significant | 0.391 | 0.704 | 0.342 | 0.825 | 0.606 | 0.238 | 0.963 |
| | Gene-annotated | 0.446 | 0.710 | 0.365 | 0.834 | 0.677 | 0.250 | 0.972 |
| *Non-linear SNPs models* | | | | | | | | |
| SNPs from AD + Neuro GWASs | GBM | 0.225 | 0.479 | 0.314 | 0.186 | 0.187 | 0.976 | 0.000 |
| Gene-annotated SNPs | XGBoost | 0.220 | 0.506 | 0.139 | 0.802 | 0.412 | 0.083 | 0.972 |

Abbreviations: AD Alzheimer’s Disease, AFR African, APOE apolipoprotein E, AUROC Area Under the ROC Curve, AUPRC Area Under the Precision-Recall Curve, EUR European, GBM Gradient Boosting Machine, GWAS Genome-Wide Association Study, PRS Polygenic Risk Score, SNP Single-Nucleotide Polymorphism.

Notes:

ᵃ All models (unless otherwise specified) regressed out age, sex, and ancestry-specific principal components. Classification thresholds were determined by maximizing the absolute Matthews correlation coefficient.

ᵇ All AD PRSs built with EUR, AFR, and multi-ancestry GWASs using P-significant/gene-annotated SNPs were included in the model at the same time.

ᶜ All AD PRSs built with EUR, AFR, and multi-ancestry GWASs, and neurodegenerative disease PRSs (Parkinson’s disease, progressive supranuclear palsy, Lewy body dementia, and stroke) using P-significant/gene-annotated SNPs were included in the model at the same time.
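
Note a states that the classification thresholds behind the F1 score, accuracy, precision, recall, and specificity columns were chosen by maximizing the absolute Matthews correlation coefficient (MCC). The following is a minimal sketch of how such a threshold search could work, not the authors' code. The function names, the toy labels and scores, the use of every observed score as a candidate threshold, and the tie-breaking rule (lowest threshold wins) are all assumptions made for illustration.

```python
from math import sqrt


def confusion_counts(y_true, scores, threshold):
    """Return (tp, fp, tn, fn), calling a sample positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0 when the denominator is 0."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def best_threshold(y_true, scores):
    """Pick the observed score that maximizes |MCC| (assumption: lowest wins ties)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: abs(mcc(*confusion_counts(y_true, scores, t))))


def summarize(y_true, scores, threshold):
    """Threshold-dependent metrics, matching the table's column names."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"f1": f1, "accuracy": accuracy, "precision": precision,
            "recall": recall, "specificity": specificity}


# Hypothetical toy data: 1 = dementia case, 0 = control; scores are model outputs.
y_true = [0, 0, 0, 1, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.7, 0.9]

threshold = best_threshold(y_true, scores)
metrics = summarize(y_true, scores, threshold)
print(threshold, metrics)
```

Because the threshold is tuned to MCC rather than to any single column, rows in the table can pair very high recall with very low specificity (or the reverse). AUPRC and AUROC do not depend on the threshold.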