Fig. 2: Experimental design strategies for optimizing sample size and improving signal:noise ratio. | Nature Communications

From: How thoughtful experimental design can empower biologists in the omics era

Statistical power depends on both the between-group variance (the signal or effect size) and the within-group variance (the noise). Smaller effect sizes require larger sample sizes to detect, especially when noise is high.

A Points show trait values of the individuals that make up two populations (in the statistical sense of the word); horizontal lines indicate the true mean trait value for each population. Thus, the distance between the horizontal lines is the effect size. The populations could be two different species of yeast, the same species of yeast growing in two experimental conditions, wild-type and mutant genotypes, etc. To estimate the difference between populations, the researcher can feasibly measure only a subset (i.e., a sample) of individuals from each population. The yellow boxes report the minimum sample size per group needed to provide an 80% chance of detecting the difference using a t-test, as determined by power analysis.

B Blocking reduces the noise contributed by unmeasured, external factors (e.g., soil quality in a field experiment that aims to compare fungal colonization in the roots of two plant species). Soil quality is represented by the background color in each panel. Top: without blocking, soil quality influences the dependent variable for each replicate in an unpredictable way, creating high within-group variance. Middle: spatial arrangement of replicates into two blocks allows estimation of the difference between species while accounting for the fact that trait values are expected to differ between the blocks on average. Bottom: in a paired design, each block contains one replicate from each group, allowing the difference in fungal colonization to be calculated directly for each block and tested with a powerful one-sample or paired t-test.

C Three statistical models that could be used in an ANOVA framework to test for the difference in fungal density between plant species, as illustrated in panel B. Relative to Model 1, the within-group variance for each plant species will be reduced in both Model 2 and Model 3. For Model 2, this is accomplished using blocking: a portion of the variance in fungal density can be attributed to environmental differences between the blocks (although these may be unknown and/or unmeasured) and is therefore removed from the within-group variances. For Model 3, it is accomplished by including covariates: a portion of the variance in fungal density can be attributed to the concentrations of N and P in the soil near each plant and is therefore removed from the within-group variances. Note that for Model 3, the covariates need to be measured for each experimental unit (i.e., plant) rather than for each block; in fact, the covariate approach is most useful when blocking is not an option.
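
The relationship between effect size, noise, and the per-group sample sizes described for panel A can be sketched with a short calculation. This is only an illustration, not the calculation used to produce the figure: it assumes a two-sided test at a significance level of 0.05, expresses the effect size as a standardized difference (the distance between the population means divided by the within-group standard deviation), and uses the normal approximation to the t-test. The function name `n_per_group` and the example effect sizes are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample test.

    effect_size is the standardized difference: the distance between the
    two population means divided by the within-group standard deviation.
    Assumption: normal approximation to the t-test; an exact t-based
    power analysis typically gives a slightly larger n.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical effect sizes, chosen only to show the trend.
for d in (0.2, 0.5, 1.0):
    print(f"standardized effect {d}: about {n_per_group(d)} per group")
```

Because the required sample size scales with the inverse square of the standardized effect, doubling the within-group standard deviation (which halves the standardized effect) roughly quadruples the number of individuals needed per group.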

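The gain from the paired design described for panel B can be illustrated by computing two t statistics on the same measurements: one that ignores the blocks and one that uses the within-block differences. The sketch below is a minimal example under stated assumptions: the fungal colonization values are invented for illustration (six blocks, one plant of each species per block, with large differences between blocks), and the helper names `unpaired_t` and `paired_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Invented illustrative values: one entry per block, in the same block order.
# Block-to-block differences (e.g., soil quality) dominate the variation.
species_a = [10.1, 12.0, 13.9, 16.2, 18.0, 19.8]
species_b = [11.0, 13.1, 15.0, 17.0, 19.1, 21.0]


def unpaired_t(x, y):
    """Pooled-variance two-sample t statistic; ignores the blocks."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(y) - mean(x)) / sqrt(pooled_var * (1 / nx + 1 / ny))


def paired_t(x, y):
    """Paired t statistic: a one-sample test on within-block differences."""
    diffs = [b - a for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


print(f"unpaired t statistic: {unpaired_t(species_a, species_b):.2f}")
print(f"paired t statistic:   {paired_t(species_a, species_b):.2f}")
```

With these values the mean difference between species is identical in both calculations; only the noise term changes. Differencing within each block removes the between-block variation, so the paired statistic is much larger than the unpaired one.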