Fig. 4

Assignment of weights to the 53 SDM traits (parameters) for detecting sex differences. (A) Principal component analysis (PCA) of the 53 SDM traits. All traits, including one-hot encoded categorical variables, were z-score transformed before PCA. (B) Degree of sexual dimorphism in WT mice before and after missing-value imputation. The x-axis shows the 53 SDM traits, and the y-axis shows the statistical significance of the sex difference for each trait, expressed as −log10(p-value). Colored regions highlight traits whose −log10(p-value) changed markedly after imputation. For measured parameters (SDM traits) with small sample sizes within an analysis group (typically n = 3–4 per sex), missing-value imputation can artificially exaggerate the apparent sex difference and inflate the corresponding −log10(p-value). For the parameter names shown on the x-axis, see the “SDM ID” or “Parameter ID” columns in Table 2. (C) Correction of PC1 factor loadings distorted by missing-value imputation. Because the variance distortion introduced by missing-value imputation propagated to the PCA and produced artificially large PC1 factor loadings for the affected traits, we applied robust nonlinear regression (MM estimation) to obtain stabilized loading values. For each SDM trait, a PC1 loading that was exaggerated relative to the actual degree of abnormality (−log10(p-value)) was corrected using this model (panel (i), before correction; panel (ii), after correction).
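
As a rough illustration of the preprocessing described for panel (A), the following Python sketch z-scores each trait and then extracts PC1 factor loadings. It is a minimal sketch under stated assumptions, not the analysis code used for this figure: the function name `pc1_loadings`, the simulated three-column data set, and the convention of defining a loading as the eigenvector scaled by the square root of its eigenvalue are all assumptions introduced here for illustration.

```python
import numpy as np

def pc1_loadings(X):
    """Return (PC1 loadings, fraction of variance explained by PC1).

    X is a samples-by-traits matrix. The name and the loading convention
    (eigenvector * sqrt(eigenvalue)) are assumptions made for this sketch.
    """
    X = np.asarray(X, dtype=float)
    # z-score each trait (column): zero mean, unit sample variance
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # PCA via singular value decomposition of the standardized matrix
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    eigvals = s**2 / (Z.shape[0] - 1)  # variance carried by each component
    # With z-scored traits, this loading equals the correlation between
    # the trait and the PC1 score
    loadings = Vt[0] * np.sqrt(eigvals[0])
    explained = eigvals[0] / eigvals.sum()
    return loadings, explained

# Hypothetical data: a binary-encoded sex column, one trait that depends
# on sex, and one trait that is pure noise
rng = np.random.default_rng(0)
n = 40
sex = np.repeat([0.0, 1.0], n // 2)
trait_a = 2.0 * sex + rng.normal(size=n)
trait_b = rng.normal(size=n)
X = np.column_stack([sex, trait_a, trait_b])

loadings, explained = pc1_loadings(X)
print("PC1 loadings:", np.round(loadings, 3))
print("PC1 explained variance ratio:", round(float(explained), 3))
```

Under this convention the squared loadings sum to the PC1 eigenvalue, so a trait whose variance has been distorted (for example by imputation) shifts its own loading and, through the eigenvalue, the loadings of the other traits. The sign of the loadings is arbitrary, so only their magnitudes should be compared.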