Figure 1

Study design and predictive modeling of head circumference at 3 months and 1 year using CHILD Cohort Study data (n = 1022 mother-infant dyads; n = 672 features). (A) Variables (features) across eight categories (modalities) were assessed at different time points before and after birth: Home Environment (35 variables), Maternal Characteristics (42), Parental Body Composition (8), Maternal Health (149), Maternal Diet (207), Infant Feeding (45), Infant Morbidities (95), and Human Milk (157). Using machine learning approaches, we then jointly integrate these data modalities to model infant head circumference (z-score for age) at 3 months and at 1 year. (B) Results from linear (ridge regression) and nonlinear (support vector machine) models using different combinations of features for prediction. Bars indicate the significance of the predictive power for each data subset and model, measured as the negative log p-value of the Spearman correlation between the prediction and head circumference, with 95% confidence intervals (black lines). For Spearman r values, see Supplemental Fig. S3. Dashed red lines denote the p-value < 0.05 threshold for statistical significance without correction for multiple hypothesis testing; grey dashed lines denote the Bonferroni-corrected significance threshold assuming 22 experiments. Bar colors correspond to the modality color scheme in this figure. Combining all features into a multi-modal model (B1.1, B2.1) increases predictive power, particularly at 3 months. Predicting head circumference further into the future (1 year) is more challenging than short-term prediction (3 months). Human milk components are predictive (B1.2, B2.2), particularly fatty acids at 3 months (B1.3, B2.3). HMOs, human milk oligosaccharides. For a full list of features and modalities, see Supplemental Table S1.
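
As a reading aid for the two dashed threshold lines, the following minimal sketch computes where they would fall on a negative log p-value axis. It assumes a base-10 logarithm, which the legend does not specify; the function and variable names are illustrative and are not taken from the study's code.

```python
import math

ALPHA = 0.05        # nominal significance level stated in the legend
N_EXPERIMENTS = 22  # number of experiments assumed for the Bonferroni correction


def neg_log10(p: float) -> float:
    """Return -log10(p); base 10 is an assumption, the legend says only 'negative log'."""
    return -math.log10(p)


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Bonferroni correction: divide the nominal level by the number of tests."""
    return alpha / n_tests


# Heights of the two threshold lines on the -log10(p) axis.
uncorrected_line = neg_log10(ALPHA)
corrected_line = neg_log10(bonferroni_alpha(ALPHA, N_EXPERIMENTS))


def clears(p_value: float, line: float) -> bool:
    """True if a bar for this p-value would rise above the given threshold line."""
    return neg_log10(p_value) > line


print(f"uncorrected threshold: {uncorrected_line:.3f}")
print(f"Bonferroni threshold:  {corrected_line:.3f}")
```

Under this assumption, a bar must rise above the higher (grey) line for its prediction to remain significant after correction for the 22 experiments.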