Fig. 2: Schistosoma mansoni population structure.

Principal component analysis (PCA) of genetic differentiation within and between the 198 S. mansoni isolates produced in the present study, plus nine previously published samples from Uganda and elsewhere. a Principal components 1 and 2 and b principal components 3 and 4; together, the first four principal components account for 60% of the total variance. Points represent samples from three schools in Mayuge district (Uganda), Bugoto (light blue), Bwondha (green) and Musubi (yellow), and one school in Tororo district, Kocoge (pink). We included nine additional previously published samples: one from Buloosi school (orange), ~40 km east of Mayuge district; a second from Walukuba school (dark blue), near Lake Albert; and the remaining samples (grey) from Guadeloupe, Senegal, Kenya, Puerto Rico and Cameroon. c Midpoint-rooted neighbour-joining phylogeny showing the relatedness between samples; branches are coloured by the school or region where the sample was collected. d Autosomal nucleotide diversity (π) values, calculated as the mean of non-overlapping 5 kb windows for each school population: Bugoto (n = 75 miracidia), Bwondha (n = 60), Musubi (n = 46) and Kocoge (n = 17). For all boxplots, the central line indicates the median, and the bottom and top edges of the box indicate the 25th and 75th percentiles, respectively. Whiskers extend to a maximum of 1.5 times the interquartile range. e Pairwise comparisons of fixation index (FST) and absolute divergence (dXY) between each pair of school populations. Both statistics were calculated from autosomal variants in non-overlapping 5 kb windows, using the same sample sizes as in (d). Median values for each comparison are shown in bold; numbers in parentheses are the 95% bootstrap confidence intervals around the median. f ADMIXTURE plots illustrating the population structure, assuming 2–4 populations (K), using 10-fold cross-validation and standard error estimation with 250 bootstraps.
Y-axis values show admixture proportions for each value of K (K = 2–4); each shade of purple indicates a different population.
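
For readers who want to reproduce the kind of summary reported in panel e, the following is a minimal sketch of a median with a 95% bootstrap confidence interval over per-window values. The caption does not state which bootstrap variant or how many replicates were used for panel e, so the percentile method, the replicate count and the function name `bootstrap_median_ci` are assumptions made for illustration, and the input values are invented placeholders rather than data from the study.

```python
import random
import statistics


def bootstrap_median_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Return (median, lower, upper) using a percentile bootstrap.

    Hypothetical helper: the percentile method and n_boot=1000 are
    assumptions, not settings reported in the caption.
    """
    if not values:
        raise ValueError("values must be non-empty")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    # Resample the windows with replacement and record each resample's median.
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return statistics.median(values), medians[lo_idx], medians[hi_idx]


# Placeholder per-window statistics (made-up numbers, not from the paper).
window_values = [0.0012, 0.0008, 0.0021, 0.0015, 0.0009, 0.0030, 0.0011]
point, lower, upper = bootstrap_median_ci(window_values)
print(f"median = {point:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

The resampling unit here is the genomic window, which treats windows as independent; a block bootstrap over adjacent windows would be one alternative if linkage between neighbouring windows is a concern.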