Extended Data Fig. 3: Schematic showing how variant-to-gene distance features are calculated.

Distance to the transcription start site (TSS) is the number of base pairs from the variant to the TSS of the gene's canonical transcript, as defined by Ensembl. Distance to the gene footprint is the smallest number of base pairs between the variant and any position between the TSS and the transcription end site of the canonical transcript. (Calculations, left) Some L2G distance features are averages across variants, weighted by each variant’s posterior probability from fine mapping. (Calculations, right) ‘Neighborhood’ features are defined on a log scale relative to the gene with the best score in that category (here, the smallest distance). The negative log is used so that genes with better feature values have higher neighborhood scores.
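
The calculations described in this legend can be illustrated with a minimal Python sketch. It is not the L2G implementation: the function names, the normalization of the weighted average by the sum of the posterior probabilities, the pseudocount of 1 bp, and the use of the natural logarithm are all assumptions made for illustration, since the legend does not specify them.

```python
import math

def distance_to_tss(variant_pos, tss):
    """Base pairs between the variant and the TSS of the canonical transcript."""
    return abs(variant_pos - tss)

def distance_to_footprint(variant_pos, tss, tes):
    """Smallest distance from the variant to any position between TSS and TES.

    Follows from the definition: a variant inside the footprint has distance 0.
    min/max handle genes on either strand.
    """
    start, end = min(tss, tes), max(tss, tes)
    if start <= variant_pos <= end:
        return 0
    return min(abs(variant_pos - start), abs(variant_pos - end))

def weighted_mean_distance(distances, posteriors):
    """Average distance across variants, weighted by fine-mapping posteriors.

    Assumption: weights are normalized by their sum.
    """
    total = sum(posteriors)
    return sum(d * p for d, p in zip(distances, posteriors)) / total

def neighborhood_scores(feature_by_gene, pseudocount=1.0):
    """Negative log of each gene's distance relative to the best (smallest) one.

    Assumption: a hypothetical pseudocount avoids log(0) when a distance is zero.
    The best gene scores 0 and all other genes score below it.
    """
    best = min(feature_by_gene.values())
    return {
        gene: -math.log((value + pseudocount) / (best + pseudocount))
        for gene, value in feature_by_gene.items()
    }

# Hypothetical example: two variants in a credible set and two nearby genes.
mean_a = weighted_mean_distance([100, 300], [0.75, 0.25])
scores = neighborhood_scores({"GENE_A": 99, "GENE_B": 999})
```

In this sketch the gene with the smallest distance receives the highest neighborhood score, and every other gene is penalized according to how many times further away it lies on a log scale.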