Fig. 1: Overview of the GOAT algorithm. | Communications Biology

From: GOAT: efficient and robust identification of gene set enrichment

Gene set enrichment with GOAT in four steps: (1) The required input is a list of genes with their respective test statistics (p value or effect size) and a list of gene sets obtained from GO or alternative resources. (2) Test statistics from the gene list are transformed to gene scores by rank(-p value)^2 or rank(effect size)^2, depending on user input; i.e., smaller p values translate to higher gene scores. The result is a skewed gene score distribution. (3) For each gene set size N (number of genes), bootstrapping procedures generate a null distribution of gene set scores. This yields a skew-normal distribution for small gene sets and converges to a normal distribution for large gene sets. (4) To determine gene set significance, the gene set score (the mean of its gene scores) is tested against the skew-normal distribution that represents the size-matched null distribution (same gene list length and gene set size).
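
The four steps in the caption can be illustrated with a small Python sketch. This is not the GOAT implementation: all function names and the toy data are hypothetical, ties in p values are ignored, the null is built from random gene sets drawn without replacement (an assumption about what the bootstrapping procedure does), and the final step uses a plain empirical upper-tail p value in place of the fitted skew-normal distribution the caption describes.

```python
import random

def gene_scores(p_values):
    """Map {gene: p value} to {gene: rank(-p)^2}; the smallest p gets the highest score."""
    # Order genes from largest to smallest p value, so rank 1 is the least significant.
    ordered = sorted(p_values, key=lambda g: p_values[g], reverse=True)
    return {gene: (rank + 1) ** 2 for rank, gene in enumerate(ordered)}

def null_means(scores, set_size, n_draws, rng):
    """Mean scores of random gene sets of one size (a size-matched null)."""
    pool = list(scores.values())
    return [sum(rng.sample(pool, set_size)) / set_size for _ in range(n_draws)]

def empirical_p(observed, null):
    """Upper-tail empirical p value; the +1 correction keeps it above zero."""
    exceed = sum(1 for x in null if x >= observed)
    return (exceed + 1) / (len(null) + 1)

def score_gene_set(scores, gene_set, n_draws=2000, seed=0):
    """Return (gene set score, p value) for one gene set."""
    members = [g for g in gene_set if g in scores]
    observed = sum(scores[g] for g in members) / len(members)
    null = null_means(scores, len(members), n_draws, random.Random(seed))
    return observed, empirical_p(observed, null)

# Toy input (hypothetical): 200 genes with evenly spaced p values.
toy_p = {f"g{i}": (i + 1) / 201 for i in range(200)}
toy_scores = gene_scores(toy_p)

top_set = [f"g{i}" for i in range(10)]          # ten smallest p values
bottom_set = [f"g{i}" for i in range(190, 200)]  # ten largest p values

top_score, top_p = score_gene_set(toy_scores, top_set)
bottom_score, bottom_p = score_gene_set(toy_scores, bottom_set)
print(top_score, top_p)
print(bottom_score, bottom_p)
```

In this toy case the set holding the ten smallest p values receives a small p value, while the set holding the ten largest receives a p value of 1. Replacing the empirical tail with a skew-normal fit to the null means, as in step (4), would allow p values smaller than the resolution of the resampling.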
