Supplementary Figure 7: The median absolute gene expression extremity statistics over 462 individuals in the GEUVADIS data set. | Nature Methods


From: Quantification of private information leakage from phenotype-genotype data: linking attacks


(a) For each individual, the extremity is computed over all 23,662 genes reported in the expression dataset, and the median of the absolute extremity is plotted. The x-axis shows the sample index and the y-axis shows the median absolute extremity. The median absolute extremity fluctuates around 0.25, which is the midpoint between the minimum and maximum values of absolute extremity. (b) The extremity threshold versus the median number of genes (over the 462 individuals) with extremity above the threshold. Around half of the genes have extremity higher than 0.3 on average over all individuals (dashed yellow lines), and around 1,000 genes have extremity higher than 0.45 over all individuals (dashed green lines). (c) Accuracy of extremity-based genotype prediction as the absolute correlation threshold changes. (d) Linking accuracy as the absolute extremity threshold (x-axis) and the absolute correlation threshold (y-axis) change. The heatmap colors indicate the accuracy.
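
The statistics in panels (a) and (b) can be sketched in a few lines of Python. The caption does not define extremity, so this sketch assumes a rank-based definition: for one gene, an individual's extremity is the rank of their expression level among all individuals divided by the number of individuals, minus 0.5, so that absolute extremity lies between 0 and 0.5. The function names and the synthetic expression matrix are hypothetical and only illustrate the computation; they are not taken from the article.

```python
import random
from statistics import median


def extremity(values):
    """Extremity of each individual for one gene.

    ASSUMED definition (not stated in the caption): rank / n - 0.5,
    where rank runs from 1 (lowest expression) to n (highest).
    """
    n = len(values)
    order = sorted(range(n), key=lambda j: values[j])
    ext = [0.0] * n
    for rank, j in enumerate(order, start=1):
        ext[j] = rank / n - 0.5
    return ext


def median_abs_extremity(expr):
    """Panel (a): per-individual median of |extremity| over all genes.

    expr[k][j] is the expression of gene k in individual j.
    """
    per_gene = [extremity(row) for row in expr]
    n_ind = len(expr[0])
    return [median(abs(g[j]) for g in per_gene) for j in range(n_ind)]


def median_genes_above(expr, threshold):
    """Panel (b): median over individuals of the number of genes
    whose |extremity| exceeds the threshold."""
    per_gene = [extremity(row) for row in expr]
    n_ind = len(expr[0])
    counts = [sum(abs(g[j]) > threshold for g in per_gene)
              for j in range(n_ind)]
    return median(counts)


# Synthetic stand-in for the expression matrix (500 genes x 100 individuals);
# the real data set has 23,662 genes and 462 individuals.
random.seed(0)
expr = [[random.gauss(0, 1) for _ in range(100)] for _ in range(500)]

med = median_abs_extremity(expr)
print(min(med), max(med))
print(median_genes_above(expr, 0.3), median_genes_above(expr, 0.45))
```

Under this assumed definition, absolute extremity is roughly uniform on the interval from 0 to 0.5 for any gene, which is consistent with the median hovering near 0.25 in panel (a).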
