Fig. 1: Unsupervised hierarchical clustering of 16S rRNA sequences identifies an association between GI microbiota composition, housing, diet, and risk of above-median P. fragile parasitemia in RMs.

A Divisive hierarchical clustering of baseline microbiota samples to identify major clusters. With this unsupervised approach, RM microbiota profiles segregated according to housing location/diet (Building A/high fiber=teal; Building B/high protein=purple; n = 8 RMs per group). B Peak peripheral parasitemia, measured as percent infected RBC (%RBC), was significantly higher in Building A/high fiber RMs than in Building B/high protein RMs (p = 0.00031; n = 8 RMs per group). Values along the left y axis indicate Z-scored %RBC, while right y axis values indicate %infected RBC. Individual data points overlaying each box represent Z-scored %RBC at peak parasitemia for each macaque. Box limits represent the upper and lower quartiles, horizontal lines within each box represent the median, and whiskers represent 1.5x the interquartile range. Statistical significance between Building A/high fiber and Building B/high protein RM parasitemia was determined using a 2-sided Wilcoxon rank-sum test, with the p value shown above the horizontal line at the top of the plot. C %RBC data for each macaque across the entire study. Plots are ordered according to peak parasite load for each RM and colored according to cohort. W=Week.
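
For readers unfamiliar with the test named in panel B, the following is a minimal sketch of a 2-sided exact Wilcoxon rank-sum test for two groups of 8, written with the Python standard library only. The function name `rank_sum_exact_p` and the values in `building_a` and `building_b` are hypothetical and chosen purely for illustration; they are not the study's data. The legend does not state which software or p value method (exact or approximate) the authors used, so the exact enumeration here is an assumption.

```python
from itertools import combinations


def rank_sum_exact_p(group_a, group_b):
    """Two-sided exact Wilcoxon rank-sum p value (assumes no tied values)."""
    pooled = sorted(group_a + group_b)
    rank = {value: i + 1 for i, value in enumerate(pooled)}  # ranks 1..N
    n_a, n_total = len(group_a), len(pooled)
    observed = sum(rank[v] for v in group_a)
    expected = n_a * (n_total + 1) / 2  # mean rank sum of group A under the null
    # Enumerate every way of assigning n_a of the N ranks to group A.
    all_sums = [sum(c) for c in combinations(range(1, n_total + 1), n_a)]
    # Count assignments at least as far from the null mean as the observed one.
    extreme = sum(
        1 for s in all_sums if abs(s - expected) >= abs(observed - expected)
    )
    return extreme / len(all_sums)


# Hypothetical peak %infected RBC values, n = 8 per group (NOT study data).
# Exactly one Building B value (2.5) exceeds one Building A value (2.1).
building_a = [2.1, 3.0, 3.4, 4.2, 5.0, 5.6, 6.3, 7.1]
building_b = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 1.4, 2.5]

p_value = rank_sum_exact_p(building_a, building_b)
print(f"p = {p_value:.5f}")  # 4 of 12870 assignments -> p = 0.00031
```

Because the test uses only ranks, it returns the same p value for raw %infected RBC and for Z-scored values, provided the Z-scoring applies one common mean and standard deviation to all animals (an assumption; the legend does not specify how the Z-scores were computed). With 8 animals per group there are 12870 possible rank assignments, so the smallest attainable 2-sided exact p value is 2/12870, or about 0.00016, which occurs when the groups do not overlap at all.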