Table 1 Common errors in the statistical analysis of sex inclusive preclinical studies
From: Navigating the paradigm shift of sex inclusive preclinical research and lessons learnt
Error | Why is it an error? |
|---|---|
No statistical tests used to support conclusions reached11. | Humans are hard-wired to see patterns and therefore statistical tests are needed to challenge cognitive biases and assess whether the observed relationship is caused by something other than chance. |
Pooling the data for a treatment across the sexes studied11,16 | Fails to account for the variation introduced by sex, which reduces sensitivity, and does not allow an assessment of whether the treatment effect depends on sex. |
Disaggregating the data by sex during the analysis11,16 e.g., running independent statistical analyses for each sex studied | Loses statistical power because the data are not analysed together in a common statistical model. Does not allow an assessment of whether the treatment effect depends on sex, and encourages comparing p values between the sexes (the differences in sex-specific significance error). |
Differences in sex-specific significance error (DISS error)11,46 e.g., finding a statistically significant effect in one sex but not in the other sex and concluding the effect depended on sex. | This reasoning is flawed because no statistical test is used to support the conclusion. The strategy is at high risk of false positives. The difference between significant and not significant could be a function of sampling variability or statistical power. Fundamentally, the inference fails “because the difference between significant and not significant need not itself be statistically significant”61. |
Comparing the females and males within the treatment group to assess whether there was sex-related variation in the treatment effect11. | This is a flawed strategy. It assumes that the difference between the groups is due solely to the treatment effect, when it could instead be confounded by a baseline sex difference that is present regardless of treatment. |
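
The last two errors in the table can be illustrated with a small numerical sketch. All numbers below are hypothetical and chosen only for illustration. The helper `two_sided_p` is an assumed name, and it uses a large-sample normal (z) approximation rather than the factorial model a real analysis would fit. The first part shows an effect that is significant in one sex but not the other, while the direct test of the difference between the two effects is not significant. The second part shows a female-male gap within the treated group that comes entirely from a baseline sex difference.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(estimate, se):
    """Two-sided p value for estimate/se under a normal (z) approximation.

    Assumption: a large-sample approximation used for illustration only.
    """
    z = abs(estimate) / se
    return 2 * (1 - NormalDist().cdf(z))


# --- DISS error -----------------------------------------------------------
# Hypothetical treatment effects (treated minus control) and standard errors.
effect_f, se_f = 25.0, 10.0   # females
effect_m, se_m = 10.0, 10.0   # males

p_f = two_sided_p(effect_f, se_f)   # test of the effect within females
p_m = two_sided_p(effect_m, se_m)   # test of the effect within males

# The question "does the effect depend on sex?" is about the DIFFERENCE of
# the two effects. For independent groups the variances add.
diff = effect_f - effect_m
se_diff = sqrt(se_f ** 2 + se_m ** 2)
p_diff = two_sided_p(diff, se_diff)

print(f"females: p = {p_f:.3f}")
print(f"males:   p = {p_m:.3f}")
print(f"difference between sexes: p = {p_diff:.3f}")

# --- Comparing sexes within the treated group only -------------------------
# Hypothetical group means: the treatment adds the same amount in both sexes,
# but females start from a higher baseline.
means = {
    ("F", "control"): 50.0,
    ("F", "treated"): 60.0,
    ("M", "control"): 40.0,
    ("M", "treated"): 50.0,
}

# Flawed comparison: the gap between sexes among treated animals only.
within_treated_gap = means[("F", "treated")] - means[("M", "treated")]

# Interaction contrast: difference between the sex-specific treatment effects.
interaction = (
    (means[("F", "treated")] - means[("F", "control")])
    - (means[("M", "treated")] - means[("M", "control")])
)

print(f"female-male gap within treated group: {within_treated_gap}")
print(f"interaction contrast (difference of effects): {interaction}")
```

In this sketch the within-sex tests disagree about significance even though the test of the difference gives no support for a sex-dependent effect, and the within-treated gap is non-zero even though the interaction contrast is zero. With real data, the same contrast would normally be tested as the treatment-by-sex interaction term in a single model fitted to both sexes.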