Table 3 Human genetic variation data sets and derived tools.

From: Commonalities across computational workflows for uncovering explanatory variants in undiagnosed cases

	BaylorSeq	BCM	Duke/Columbia	Harvard	Miami	NIH	PacificNW	Stanford	UCLA	Utah	Vanderbilt	WUSTL
Known disease gene databases
ClinVar	●	●	●	●	●	●	●	●	●	●	●	●
OMIM	●	●	●	●	●	●	●	●	●	●	●	●
HGMD: Human Gene Mutation Database	●	●	●	●		●			●	●	●
dbSNP	●	●	●				●			●
CGD: Clinical Genomic Database										●	●
Orphanet								●			●
Healthy human population single-nucleotide variant (SNV)/indel databases
gnomAD: Genome Aggregation Database	●	●	●	●	●	●	●	●	●	●	●	●
ExAC: Exome Aggregation Consortium	●	●	●	●	●		●	●	●		⚬	●
1000 Genomes Project	●		●				●	●	●	●	●	●
Institution—internal controls^a		●	●	●		●	●		●	●	●
EVS: Exome Variant Server	●		●		●		●	●
TOPMed: Trans-Omics for Precision Medicine			⚬				●	●		⚬		⚬
UK10K							●	●	●
Greater Middle East (GME) Variome Project			⚬									⚬
xKJPN: 1000+ Japanese			⚬
GenomeAsia 100K Project			⚬
Iranome			⚬
Human structural variant (SV) databases
gnomAD-SV: Genome Aggregation Database SVs		●		⚬	●		●	●	●	●	●	●
DGV: Database of Genomic Variants		●		⚬			●	●	●	●	●	●
dbVar: Database of Genomic Structural Variation		●						●		●		●
ClinGen: Clinical Genome Resource		●		⚬				●		●		●
DECIPHER		●		⚬			●			●
Institution—internal controls^a									●	●		●
Within-human selective constraint scores
pLI: probability of loss-of-function (LoF) intolerance		●	●	●	●	●		⚬	●	●	●	●
Missense (constraint) Z score		●	●		●	●	●		●	●
pREC: probability of homozygote LoF intolerance			●	⚬					●
(sub)RVIS: Residual Variation Intolerance Score			●			●
L-o/e-UF: LoF observed/expected upper-bound fraction				●			●
CCR: constrained coding regions									●	●
LIMBR: Localized Intolerance Model w/ Bayesian Regression			●
MTR: missense tolerance ratio			●
s_het: selective effect of heterozygous LoF				●
M-o/e-UF: missense observed/expected upper-bound fraction							●
LoFtool										●
● Tool used by default. ⚬ Tool used in specific cases or contexts only.^b

Knowledge of variation within human populations with and without disease can be used to assess how likely a variant is to cause the genetic condition under investigation. Tool and data set citations are listed in Extended Data Table 1.
^aHuman sequence variation data sets that are internal to particular institutions and used by clinical sites surveyed here include variants present in patients from Baylor College of Medicine (BCM), the Institute for Genomic Medicine (Duke/Columbia), Brigham Genomic Medicine (Harvard), the NIH Undiagnosed Diseases Program (NIH), Centers for Mendelian Genomics (PacificNW), University of California–Los Angeles (UCLA), the Centre d’Etude du Polymorphisme Humain (Utah), and BioVu (Vanderbilt), and a curated set of copy-number variants (CNVs) detected via genome sequencing (GS) and confirmed via chromosomal microarray analysis (Washington University School of Medicine [WUSTL]).
^bSpecific human population variant data sets are used in the following contexts: for historical reasons (ExAC); when a variant’s gnomAD-derived minor allele frequency (MAF) is 0 or close to 0 (TOPMed); when a patient’s inferred ancestry is non-European (TOPMed), Middle Eastern (GME), Japanese (xKJPN), Asian (GenomeAsia), and/or Iranian (Iranome); and when a predicted structural variant impacts a clinically relevant gene (gnomAD-SV, DGV, ClinGen, DECIPHER).
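
The context-dependent usage described in footnote b can be read as a small set of selection rules. The Python sketch below is illustrative only and is not taken from any surveyed site's pipeline. The function name, the ancestry labels, and the `near_zero_maf` cutoff are assumptions (the footnote says only "0 or close to 0"), and the sketch merges rules that individual sites apply separately, as the ⚬ marks in the table show. It also assumes that a variant absent from gnomAD is treated as having a MAF of 0, and that any ancestry label other than "european" counts as non-European.

```python
from typing import Dict, List, Optional

# Ancestry labels (keys) are hypothetical; data set names come from footnote b.
ANCESTRY_DATASETS: Dict[str, List[str]] = {
    "middle_eastern": ["GME"],
    "japanese": ["xKJPN"],
    "asian": ["GenomeAsia"],
    "iranian": ["Iranome"],
}

# Structural variant resources that footnote b lists as context-specific.
SV_DATASETS: List[str] = ["gnomAD-SV", "DGV", "ClinGen", "DECIPHER"]


def context_specific_datasets(
    gnomad_maf: Optional[float],
    inferred_ancestry: str,
    sv_hits_clinical_gene: bool = False,
    near_zero_maf: float = 1e-4,  # assumed cutoff; the source gives no number
) -> List[str]:
    """Return the supplementary data sets suggested by the footnote b contexts."""
    selected: List[str] = []

    # Assumption: a variant missing from gnomAD (None) is treated as MAF 0.
    maf = 0.0 if gnomad_maf is None else gnomad_maf

    # TOPMed: gnomAD MAF is 0 or close to 0, or inferred ancestry is non-European.
    if maf <= near_zero_maf or inferred_ancestry != "european":
        selected.append("TOPMed")

    # Ancestry-matched population data sets (GME, xKJPN, GenomeAsia, Iranome).
    selected.extend(ANCESTRY_DATASETS.get(inferred_ancestry, []))

    # SV resources: only when a predicted SV impacts a clinically relevant gene.
    if sv_hits_clinical_gene:
        selected.extend(SV_DATASETS)

    return selected


if __name__ == "__main__":
    print(context_specific_datasets(0.02, "european"))   # []
    print(context_specific_datasets(None, "iranian"))    # ['TOPMed', 'Iranome']
```

ExAC is left out of the sketch because its context ("historical reasons") is not a property of the variant or the patient. Data sets marked ● in the table are used by default and would be consulted regardless of these rules.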
