Extended Data Fig. 2: Study design.
From: Insights into the genetic architecture of the human face

Sample wrangling: Images and genotypes from each study were intersected, and unrelated participants of European ancestry with quality-controlled images, covariates and imputed genetic data were selected to obtain the analyzed data. Identification: For each facial segment, canonical correlation analysis (CCA) with Rao’s F-test approximation was used to identify the multivariate combination of facial principal components most correlated with the genotypes, yielding a P value (P_CCA-US or P_CCA-UK) and the multivariate phenotypic trait most correlated with each SNP (Trait_US or Trait_UK). Verification: The principal components of the other dataset were then projected onto this trait to obtain a univariate variable representing the distribution of participants from the verification dataset for the trait identified in the identification dataset (UniVar_UK or UniVar_US). The genotypes of the verification dataset were then tested against this variable via linear regression, resulting in an additional P value (P_UniVar-UK or P_UniVar-US). Meta-analysis: The P values from identification and verification were meta-analyzed using Stouffer’s method, resulting in the final set of P values for each meta-analysis track (P_META-US and P_META-UK).
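
The identification, verification and meta-analysis steps for a single SNP and a single facial segment can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`cca_single_snp`, `verify`, `stouffer`) are hypothetical, covariate adjustment is omitted, both datasets are assumed to share one principal-component space, and the two P values are assumed to be combined unweighted and as returned by each test. With one genotype variable, CCA reduces to regressing the genotype on the principal components, and Rao's F approximation is exact.

```python
import numpy as np
from scipy import stats


def cca_single_snp(pcs, genotype):
    """Identification: CCA of one SNP against k facial PCs.

    Returns the unit-length trait direction in PC space and the P value.
    """
    n, k = pcs.shape
    X = pcs - pcs.mean(axis=0)
    g = genotype - genotype.mean()
    # With a single genotype variable, the canonical weights are the
    # least-squares coefficients of the genotype regressed on the PCs.
    w, *_ = np.linalg.lstsq(X, g, rcond=None)
    r2 = 1.0 - np.sum((g - X @ w) ** 2) / np.sum(g ** 2)
    # Rao's F statistic; exact here, with (k, n - k - 1) degrees of freedom.
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    p_cca = stats.f.sf(f_stat, k, n - k - 1)
    return w / np.linalg.norm(w), p_cca


def verify(pcs_other, trait, genotype_other):
    """Verification: project the other dataset's PCs onto the trait,
    then regress the resulting univariate variable on the genotypes."""
    uni_var = (pcs_other - pcs_other.mean(axis=0)) @ trait
    return stats.linregress(genotype_other, uni_var).pvalue


def stouffer(p_values):
    """Meta-analysis: unweighted Stouffer combination of P values."""
    z = stats.norm.isf(np.asarray(p_values, dtype=float))
    return stats.norm.sf(z.sum() / np.sqrt(len(z)))
```

Under these assumptions, one meta-analysis track would call `cca_single_snp` on the identification dataset, pass the returned trait to `verify` with the other dataset, and combine the two P values with `stouffer`; the second track repeats the procedure with the dataset roles swapped.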