Fig. 3: Functional mapping of the FLT1 signal.

a Regional association plot of the FLT1 signal in the Cameroon-Tanzania meta-analysis. Linkage disequilibrium (LD, r²) between the lead variant (rs74617914) and each of the other variants is indicated by the colour key. The middle window presents the relative position of the fine-mapped variants; pink represents the meta-analysis fine-mapped variant, blue represents the fine-mapped variants in the Cameroonian cohort (Supplementary Fig. 8a presents the regional association for the Cameroonian cohort FLT1 signal). b Statistics of the fine-mapped variants (BETA, effect size; SE, standard error of the effect size estimate; P, unadjusted p value) from the association test described in Fig. 1 and Table 1. PIP, posterior inclusion probability, the probability that each fine-mapped variant is causal. Functional annotations from the ENSEMBL resource and the JASPAR algorithm (TFBS, transcription factor binding site) are shown. The distance of each variant from the FLT1 transcription start site is indicated as dTSS. NA, information not available. Transcription factors in bold have known roles in erythropoiesis (see Supplementary Information). c Genomic map of the FLT1 regulatory region showing chromatin state predictions in different cell lines, the promoter (HS1) and candidate enhancer (HS2), the relative position of the fine-mapped variants (light blue vertical lines), and relevant TFBSs (visualised in the UCSC Genome Browser using the hg19 reference sequence). Hypoxia response elements (HREs), bound by the hypoxia-inducible factors (HIFs; HIF1A/2A), are highlighted in yellow. d Minor allele frequency (MAF) distribution of the fine-mapped variants and other variants looked up in the Tanzanian, CSSCD, and SITT cohorts. The MAFs are displayed for the Cameroonian and Tanzanian sickle cell anaemia populations, as well as for unascertained global ancestries from the 1000 Genomes dataset. One of the six variants in perfect LD is used to represent the rest.
e–g Sequence logos of the disrupted transcription factor binding motifs (retrieved from https://jaspar.genereg.net/ and reverse-complemented to the forward strand to reflect the base change presented throughout our text). rs75294023 disrupts the absolutely required GFI1 binding core AATC (reverse complement: GATT) (ref. 115). The box plots show additive effects of rs11840478, rs75294023, and rs11843606 on fetal haemoglobin (HbF) level and mean corpuscular volume (MCV). The centre line in each box plot denotes the median, and the lower and upper ends of the box denote the lower and upper quartiles. Whiskers extend from the ends of the boxes to the minimum (lower whisker) and maximum (upper whisker) values. Violin plots describe the density of the distribution. h ATAC-seq data from three datasets of erythropoietic cell lines overlap visually with the FLT1 signal, indicating that it is enhancer-associated. The BCL11A signal is used here as a control for both the Cameroon and Tanzania GWAS. Source data are provided as Source Data Fig. 3.
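The strand convention used for the motifs in panels e–g can be checked with a minimal Python sketch. It is an illustration only and not part of the original analysis; the function name `reverse_complement` is hypothetical, and only unambiguous DNA bases (A, C, G, T) are assumed.

```python
# Illustrative sketch: reverse-complement a DNA motif so that a sequence
# reported on one strand can be compared with the opposite strand.
# Assumes only unambiguous bases (A, C, G, T); other characters pass through unchanged.

# Watson-Crick base pairing as a translation table (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    # Complement each base, then reverse the order of the bases.
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    # The GFI1 binding core named in the caption and its reverse complement.
    print(reverse_complement("AATC"))  # GATT
```

Applying the function twice returns the original sequence, which is a simple sanity check when converting motif coordinates between strands.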