Fig. 3
From: Localization of adaptive variants in human genomes using averaged one-dependence estimation

Genome-wide SWIF(r) scan for adaptation in ‡Khomani San SNP array data. a Empirical genome-wide univariate distributions of three of the component statistics (XP-EHH, ΔDAF, and iHS) are shown in gray, as is the empirical joint distribution of ΔDAF and iHS (darker bins have more observations than lighter bins). The number of sites differs among the genome-wide univariate distributions because some component statistics are undefined more often than others. In pink are the corresponding distributions for the 108 variants that SWIF(r) identifies as having posterior sweep probabilities of >50% (variants above the dashed pink line in b). The full set of distributions can be found in Supplementary Fig. 25. b The value plotted for each position along the genome is the calibrated posterior probability of adaptation computed by SWIF(r) (the per-site prior for a selective sweep is π = 10⁻⁴, chosen to detect signals of relatively old sweeps given the high long-term Ne of the ‡Khomani San); only SNPs with a calibrated posterior sweep probability >1% are plotted, and the horizontal line indicates a probability cutoff of 50%. A strong signal of adaptation over the major histocompatibility complex on chromosome 6 is shown in black. Gene names are listed for genes previously associated with metabolism-related and obesity-related traits (colors match categories in c; open circles denote genes of interest that are not in any category in c). c We used the gene set enrichment analysis tool Enrichr [35] to identify categories that had an overrepresentation of genes containing SWIF(r) signals (Supplementary Data 3). We found multiple enriched dbGaP categories related to metabolism and obesity, including adiponectin (a protein hormone that influences multiple metabolic processes, including glucose regulation and fatty acid oxidation), body mass index, and triglycerides. Genes in these categories containing SWIF(r) signals are listed next to category names.
p values, q values, and the total number of genes are shown for each category, and categories are ranked by a combined score computed by Enrichr [35]. Adiponectin, body mass index, and γ-glutamyltransferase all have q values below 5%.
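
To make the roles of the prior and the two probability cutoffs in panel b concrete, the following is a minimal sketch of a generic two-class Bayes posterior with a per-site prior π = 10⁻⁴. It is not SWIF(r)'s averaged one-dependence estimator and it omits the calibration step; the function names `posterior_sweep` and `classify_site` and the likelihood inputs are hypothetical, introduced only for illustration.

```python
# Illustrative sketch only: a plain two-class Bayes posterior, NOT the
# SWIF(r) estimator or its calibration. All names here are hypothetical.

PRIOR_SWEEP = 1e-4   # per-site prior for a selective sweep (pi in the caption)
PLOT_CUTOFF = 0.01   # caption: only SNPs with probability > 1% are plotted
CALL_CUTOFF = 0.50   # caption: horizontal line at a probability of 50%


def posterior_sweep(lik_sweep, lik_neutral, prior=PRIOR_SWEEP):
    """Posterior probability of 'sweep' given the likelihood of the data
    under each class (assumed inputs) and the per-site prior."""
    numerator = prior * lik_sweep
    denominator = numerator + (1.0 - prior) * lik_neutral
    if denominator == 0.0:
        raise ValueError("both likelihoods are zero; posterior is undefined")
    return numerator / denominator


def classify_site(probability):
    """Map a posterior probability onto the two cutoffs used in panel b."""
    if probability > CALL_CUTOFF:
        return "candidate"   # above the 50% line
    if probability > PLOT_CUTOFF:
        return "plotted"     # shown in the figure, below the 50% line
    return "hidden"          # not plotted


if __name__ == "__main__":
    for ratio in (1.0, 1e3, 9999.0, 1e5):
        p = posterior_sweep(ratio, 1.0)
        print(f"likelihood ratio {ratio:g}: posterior {p:.4f} -> {classify_site(p)}")
```

Under these assumptions, a site reaches a posterior above 50% only when the data are more than (1 − π)/π = 9999 times as likely under a sweep as under neutrality, which shows how strongly a prior of 10⁻⁴ guards against false positives.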