Introduction

Vaccines play a critical role in the prevention of infectious diseases, especially in high-burden populations1,2,3,4,5. However, vaccine-induced immune responses vary across populations. Lower vaccine immunogenicity and efficacy (‘hypo-responsiveness’) is mainly observed in rural areas in low- and middle-income countries (LMICs), especially when compared to high-income countries (HICs) or to urban areas within the same countries6,7,8,9. Vaccine responses are impacted by a range of extrinsic and intrinsic factors, including host (age, sex, genetics, co-morbidities), behavioral (e.g. smoking) and environmental (rural/urban living location, season) factors, diet and nutrition, pre-existing immunity and vaccine factors (vaccine type, adjuvants, dose and administration route)10,11,12. Evidence suggests that the gut microbiome influences immune system development and regulation, thereby affecting vaccine responses13,14,15,16,17,18,19. Like vaccine responsiveness, the gut microbiota is highly variable across age, geographical locations and between people of different lifestyles20,21,22,23. Broad differences in gut microbiota composition and diversity have been observed between individuals living in rural and urban areas in LMICs23,24,25,26,27. Generally, these rural living individuals show a predominance of bacteria capable of polysaccharide degradation and fermentation, including Prevotella and other commensal bacteria such as the spirochete Treponema succifaciens, which is enriched in non-industrialized populations20,23,24,28,29. In contrast, urban living individuals, who generally consume more processed foods and refined sugars, have a less diverse gut microbiota characterized by high abundance of Bacteroides23,26,30,31.

Gut microbiota variation has been linked to both humoral and cellular vaccine responses6 with the strongest associations found in mice receiving non-adjuvanted vaccines (e.g. non-adjuvanted influenza vaccine and inactivated polio vaccine)32,33. The mechanisms underlying these associations include the production of immunomodulatory metabolites such as short-chain fatty acids6,33. In addition, it has been shown that microbial ligands from the microbiome may serve as natural vaccine adjuvants34. Indeed, studies in knockout mice have shown that innate sensing of bacterial flagellin in the gut microbiome can provide adjuvant signals enhancing the antibody response to non-adjuvanted vaccines such as trivalent influenza vaccines (TIV) and polio vaccine32. The role of the microbiome in live-attenuated (self-adjuvanted) vaccine responses remains contentious. While studies on the yellow fever vaccine show that antibody responses in antibiotic-treated mice are similar to those in non-treated mice, suggesting minimal microbiome influence32,34, there is also research indicating that the microbiome may impact responses to other live-attenuated vaccines, like BCG35,36,37. These findings so far suggest variation in associations between the host microbiome and the response to specific vaccines. In humans, similar antibiotic intervention studies have not been performed for self-adjuvanted vaccines. Therefore, it is currently unknown whether the microbiota plays a role in potentiating the immune responses to self-adjuvanted vaccines, such as yellow fever vaccine. Given that the gut microbiota can be modulated through diet and the administration of pre- or postbiotics, it may pose an interesting target for future strategies to enhance vaccine responses in vulnerable populations.

We hypothesize that the gut microbiota profiles in rural and urban living Tanzanians are linked to vaccine responsiveness. To study this, we recruited healthy Tanzanian adults and vaccinated them with yellow fever vaccine (YF-17D). Stool and blood samples were collected over time. In line with previous work, we demonstrate that the gut microbiota composition is profoundly different between rural and urban living individuals and is linked to dietary habits. Variation in yellow fever neutralizing antibody responses may be at least partly explained by differences in microbiota community state types, potentially outweighing the impact of living location.

Results

Baseline characteristics of the study population

We enrolled 185 individuals living in rural and urban locations in Moshi, Northern Tanzania (Fig. 1a, b and Supplementary Fig. 1). Individuals were randomized to either a vaccination group that received the yellow fever vaccine (YF-17D; n = 155) or a non-vaccinated control group (n = 30). All individuals were followed for six months, and demographic and lifestyle variables (housing, assets and food history) were collected (Supplementary Tables 1 and 2, Supplementary Fig. 2). Plasma and/or stool samples were collected at baseline and on days 28, 56 and 178 (Fig. 1c).

Fig. 1: Study overview.

a Geographic map of study sites in Tanzania (Moshi Urban and Moshi Rural), within the Kilimanjaro region. b Graphical representation of the number of urban and rural living participants included in the study and of those from whom stool samples for microbiota assays were available, stratified by the total number of participants (N) and the number of individuals who received the yellow fever (YF-17D) vaccine (Nvacc). c Study design and number of samples at each time point. Numbers depicted as n (nvacc), i.e. total number of samples (number of samples from vaccinated individuals). The number of samples in which neutralizing antibodies were measured only includes those samples with a matching stool sample. d Graphical summary of study analyses. Questionnaire data from all included individuals (N = 185) were used in multiple correspondence analyses (MCA)/principal component analyses (PCA) to derive lifestyle/food scores. The associations between microbiota profiles and demographics and lifestyle factors (summarized in scores) and vaccine responses were assessed.

Stool samples were available for microbiota sequencing from 143 of 185 (77.3%) individuals (n = 74 rural and 69 urban) (Table 1). A total of 154 stool samples were collected from 143 individuals (paired samples before and after vaccination were available from 11 individuals). The median age was 23.1 years (interquartile range [IQR], 21.2-27.1 years) and 43% were female (35% vs 52% in rural and urban locations, respectively, p = 0.040). Baseline characteristics of the microbiota cohort were similar to the overall cohort (Table 1 and Supplementary Table 1). The prevalence of parasitic infections was 4.9% and these infections were detected only in individuals from rural areas (Table 1).

Table 1 Baseline characteristics of the study population (N = 143)

Lifestyle and food scores vary with living location

Lifestyle questionnaire data were combined into a single lifestyle score to objectively gauge rural/urban living location-associated lifestyles (N = 185). To obtain the lifestyle score, we applied multiple correspondence analysis (MCA, a dimensionality reduction method for categorical data) to 38 questions (118 variable categories; Supplementary Fig. 3a). MCA separated individuals based on living location, especially across principal component (PC) 1 (‘lifestyle score’, Supplementary Fig. 3b), which captured 12.1% of the variation in questionnaire data. Rural individuals showed a larger spread across both PC1 and PC2, indicating they exhibit more diverse lifestyles. All variable categories contributed to the lifestyle score (Supplementary Fig. 3c), with variables related to possession of assets showing the highest cumulative contribution (57.8%). Variable categories most related to a higher lifestyle score (associated with living in urban areas) included possession of household assets (e.g. working television, iron, watch, computer, refrigerator, radio or car/truck). Variables related to housing quality, including the presence of a pit latrine, floors made of soil or sand and walls made of cane/palm/trunks/bamboo, contributed to a low lifestyle score (related to living in rural areas) (Supplementary Fig. 3d-e). PC2 scores were additionally driven by livestock-associated variables (Supplementary Fig. 3f, g). Lifestyle scores (PC1) were similar for the overall (N = 185) and microbiota (N = 143) cohort (Supplementary Fig. 3h).

Given the known association between diet and microbiome38, we additionally developed a food score, based on 11 questions on frequency of consumption of specific food per week (ordinal variables). Using principal component analysis (PCA), variation in food consumption was captured across PC1 (‘food score’), again showing a clear separation between rural and urban groups (Supplementary Fig. 4a, b). In contrast, PC2 covaried with the average reporting frequency across all food variables (Supplementary Fig. 4c), indicating interindividual differences in how the food questionnaire was filled out. Logistic regression analysis indicated that frequent consumption of fish, locally brewed beer (composed of fermented banana and millet) and green vegetables (p ≤ 0.009) was significantly associated with rural living, whereas a carbohydrate-rich diet consisting of rice and potatoes was related to living in urban areas (adjusted for all other food variables and sex, p ≤ 9.0 × 10-4; Supplementary Fig. 4d). Both lifestyle and food score were included in microbiota association tests (N = 143; Fig. 1d).

Characteristics of microbiota data

To characterize the microbiota profiles of rural and urban living individuals, stool samples were subjected to Illumina MiSeq sequencing of the V3-V4 region of the 16S rRNA gene, resulting in a median of 93,471 reads (range 58,482–124,868) per sample. A median of 68,648 reads (38,118–93,731) per sample remained after bioinformatic processing and quality filtering. We identified 3354 amplicon sequence variants (ASVs) that occurred in ≥2 samples. ASVs could be aggregated into 272 genus-level taxa, 25 of which were part of the ‘core microbiota’ (genera detected in 80% of samples at ≥0.1% relative abundance). Highly abundant genera included Prevotella (mean relative abundance 27.6%), Bacteroides (7.9%), Faecalibacterium (7.7%), Lachnospiraceae (4.2%), Oscillospiraceae (3.5%), Blautia (2.6%), and Succinivibrio (2.4%). All microbiota analyses were conducted on samples collected at day 56, unless stated otherwise.
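The 'core microbiota' criterion above can be illustrated with a minimal Python sketch. It assumes the definition means a genus must reach at least 0.1% relative abundance in at least 80% of samples; the function names and the toy counts are hypothetical and are not study data or the authors' pipeline.

```python
def relative_abundance(counts):
    """Convert one sample's genus counts to fractions of that sample's total."""
    total = sum(counts.values())
    return {genus: n / total for genus, n in counts.items()}


def core_genera(samples, min_abundance=0.001, min_prevalence=0.80):
    """Return genera reaching min_abundance in at least min_prevalence of samples."""
    profiles = [relative_abundance(s) for s in samples]
    genera = set().union(*(p.keys() for p in profiles))
    core = []
    for genus in sorted(genera):
        # Count the samples in which this genus passes the abundance threshold.
        n_detected = sum(1 for p in profiles if p.get(genus, 0.0) >= min_abundance)
        if n_detected / len(profiles) >= min_prevalence:
            core.append(genus)
    return core


# Hypothetical toy counts (not study data): 5 samples, 3 genera.
toy_samples = [
    {"Prevotella": 700, "Bacteroides": 300, "Treponema": 0},
    {"Prevotella": 500, "Bacteroides": 500, "Treponema": 0},
    {"Prevotella": 900, "Bacteroides": 100, "Treponema": 0},
    {"Prevotella": 300, "Bacteroides": 650, "Treponema": 50},
    {"Prevotella": 600, "Bacteroides": 400, "Treponema": 0},
]
core = core_genera(toy_samples)
```

In this toy input, Treponema passes the abundance threshold in only one of five samples, so it is excluded from the core set at the 80% prevalence cut-off.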

Microbiota profiles differ between rural and urban living Tanzanian adults

First, we investigated differences in microbiota profiles between rural and urban living Tanzanian adults. The number of observed ASVs was higher in rural compared to urban living Tanzanians (linear model, adjusted for vaccination status, age, sex and sequencing depth, β = 89.0, p = 1.6 × 10−8). Shannon diversity, a measure of within-sample microbial diversity, was also higher in rural compared to urban living adults (β = 0.368, p = 1.1 × 10−6; Fig. 2a). Similarly, a higher number of observed genera was detected in rural compared to urban living individuals (β = 13.2, p = 1.0 × 10−4), yet genus-level Shannon diversity did not differ between groups (β = −0.019, p = 0.789; Fig. 2b).
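For readers less familiar with these alpha-diversity measures, the following Python sketch computes observed richness and the Shannon index, H = −Σ p_i ln p_i, for a single sample. The use of the natural logarithm is an assumption (the text does not state the log base), and the two toy samples are hypothetical.

```python
import math


def observed_richness(counts):
    """Number of features (ASVs or genera) with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over features with non-zero counts."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical samples with equal richness but different evenness.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

h_even = shannon_diversity(even_sample)
h_skewed = shannon_diversity(skewed_sample)
```

Both toy samples have the same richness, but the evenly distributed one has the higher Shannon index. This is why richness and Shannon diversity can diverge, as seen at genus level above.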

Fig. 2: Microbiota profiles differ across Tanzanians living in rural and urban regions.

ASV-level (a) and genus-level (b) Shannon diversity between study groups. Box plots represent the 25th and 75th percentiles (lower and upper boundaries of boxes, respectively), the median (middle horizontal line), and measurements that fall within 1.5 times the interquartile range (IQR; distance between 25th and 75th percentiles; whiskers). Density plots were used to visualize the distribution of data points. Statistical significance between groups was assessed using linear models with observed richness or Shannon diversity as an outcome, adjusting for vaccination status, age, sex, and sequencing depth. c Principal component analysis (PCA) biplot using CLR-transformed genus-level microbiota features across day 56 samples. Percentages in square brackets denote the total variance explained by the first two principal coordinates. Each data point indicates a stool microbiota sample colored by group (rural/urban living). Ellipses reflect the data spread at a 95% confidence level. Density plots show the distribution of MDS1 (x-axis) and MDS2 (y-axis) scores. The 15 highest ranking genera across all day 56 samples were simultaneously visualized (squares). R2 and statistical significance of the association between the group and the overall microbiota composition was assessed using PERMANOVA-test (1,000 permutations), while adjusting for vaccination status, age, and sex. d Bar plots indicating the effect size (R2) of the association between demographic, technical, lifestyle, or food-related variables (or derived scores based on these variables; see Supplementary Fig. 3 and Supplementary Fig. 4) and the overall microbiota composition. Statistical significance was assessed using PERMANOVA-tests. Each variable was tested separately. All analyses were adjusted for living location (except when assessing the impact of lifestyle/food-related variables, helminth infection status or sex), vaccination status, age, and sex.

Next, we investigated the global differences in gut microbial community structure between rural and urban living Tanzanians. Principal component analysis (PCA on a centered-log-ratio [CLR]-transformed genus-level abundance matrix) indicated profound differences linked to rural/urban living location (Fig. 2c), which was confirmed by Permutational Multivariate Analysis of Variance (PERMANOVA; adjusting for vaccination, age and sex; R2 = 9.0%, p < 0.001). Other important factors driving overall microbial community variation included lifestyle score (PC1 housing, assets and food-related questionnaire data), food score (PC1 food-related questionnaire data), highest level of education, PC2 (lifestyle) and sex (PERMANOVA, R2 = 7.9%, 6.6%, 6.3%, 2.2% and 1.6%, respectively, p ≤ 0.005). These factors are all at least partly collinear with rural/urban living location, which was therefore not adjusted for in these models (Table 1 and Supplementary Table 1, Supplementary Fig. 3b and Supplementary Fig. 4b). No statistically significant association between helminth infection status and microbiota composition was detected (R2 = 0.8%, p = 0.184). Importantly, no association with vaccination status was detected (PERMANOVA, adjusting for living location, age and sex; R2 = 0.5%, p = 0.787), suggesting that the vaccine had no impact on the gut microbiota composition at day 56 post-vaccination. In addition, no statistically significant effects for age, BMI or sequencing depth were observed (PERMANOVA, adjusting for living location, vaccination, age and sex, as appropriate; R2 = 0.8%, 0.8% and 0.6%, respectively, p ≥ 0.122; Fig. 2d).
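To make the CLR transformation and the PERMANOVA R2 more concrete, a simplified Python sketch follows. It is not the authors' analysis: it assumes a pseudocount of 0.5 to handle zeros, uses Euclidean distance on CLR values (the Aitchison distance), and fits an unadjusted one-factor design, whereas the reported models adjust for covariates. All names and counts are hypothetical.

```python
import math
import random


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's genus counts."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def group_r2(d2, labels):
    """Share of total squared-distance variation explained by the grouping."""
    n = len(labels)
    ss_total = sum(d2[i][j] for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for group in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == group]
        pairs = sum(d2[i][j] for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += pairs / len(idx)
    return 1.0 - ss_within / ss_total


def permanova(profiles, labels, n_perm=999, seed=1):
    """One-factor PERMANOVA: observed R2 and a permutation p-value."""
    n = len(profiles)
    # Squared Euclidean distances between CLR profiles, computed once.
    d2 = [[sum((x - y) ** 2 for x, y in zip(profiles[i], profiles[j]))
           for j in range(n)] for i in range(n)]
    observed = group_r2(d2, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between label and profile
        if group_r2(d2, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical genus counts (3 genera) for three "rural" and three "urban" samples.
toy_counts = [[50, 5, 20], [60, 4, 25], [55, 6, 18],
              [5, 60, 20], [4, 50, 22], [6, 55, 19]]
toy_labels = ["rural"] * 3 + ["urban"] * 3
toy_clr = [clr(sample) for sample in toy_counts]
r2, p_value = permanova(toy_clr, toy_labels)
```

For a single grouping factor, ranking permutations by R2 is equivalent to ranking them by the pseudo-F statistic, so the sketch compares R2 directly. With only six toy samples the permutation p-value cannot become very small, regardless of how well the groups separate.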

Differential abundance analyses were primarily performed at genus level (MaAsLin2), testing all genera present at >0.1% abundance in ≥10% of samples. We found a higher abundance of 34 genera in rural living individuals compared to 14 genera enriched in those living in an urban environment (109 genera tested; padj < 0.05 and absolute log2-transformed fold change (FC) > 1.5; Fig. 3a).

Fig. 3: Differential abundance analyses indicate specific taxa are related to urban/rural living.

a Volcano plot visualizing genera enriched in rural or urban living individuals. Statistical significance between groups (rural/urban living) was assessed using linear models with genus-level abundance as outcome, adjusting for vaccination status, age and sex. Genera with padj < 0.05 and log2-fold change (FC) < −1.5 or > 1.5 are colored and genera with a padj < 1 × 10−5 and log2FC < −1.5 or > 1.5 were annotated. The shape of the data points indicates whether these genera were previously considered to belong to VANISH (volatile and/or associated negatively with industrialized societies of humans) or BloSSUM (bloom or selected in societies of urbanization/modernization) taxa. 2 × 2 tables showing the number of BloSSUM (b and d) or VANISH taxa (c and e) across urban- and rural-associated taxa. The number between brackets indicates the percentage of VANISH/BloSSUM/other taxa out of the total number of (non-)rural-/urban-associated taxa. Statistical significance was assessed using Fisher’s Exact tests. f Stacked bar chart showing the genus-level taxonomic composition of each sample as relative abundances. Bar colors are based on a hierarchical color palette with hues specified by Phylum and shades specified by Genus. Samples are arranged based on Bray-Curtis dissimilarities and the plot is divided by living location. Right of the per-sample bar chart, a bar chart showing mean relative abundance for each taxon per group is shown. g Density plots showing the relative abundance distribution of Bacteroides, Succinivibrio and Treponema within rural or urban living individuals. Modes are indicated by dashed lines.

Specific taxa have previously been associated with industrialized and non-industrialized populations and are referred to as BloSSUM (bloom or selected in societies of urbanization/modernization) and VANISH (volatile and/or associated negatively with industrialized societies of humans) taxa respectively22. We found that urban-associated genera (14/109 genera tested) were significantly enriched for BloSSUM taxa (9/14 urban-associated genera, compared to 6/95 non-urban-associated genera, Fisher’s Exact test p = 1.8 × 10-6; Figs. 3b and 2c), whereas rural living individuals were specifically devoid of these genera (1/34 compared to 14/75, p = 0.034; Fig. 3d). In contrast, genera associated with rural living location were not significantly enriched for VANISH taxa (total 18/109; 8/34 rural-associated genera, compared to 10/75 non-rural-associated genera, Fisher’s Exact test p = 0.264; Fig. 3e).
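The enrichment tests above are Fisher's Exact tests on 2 × 2 tables. A minimal Python sketch of a two-sided test is given below. It assumes the common convention of summing all table probabilities that are no larger than that of the observed table, and it assumes the table layout shown in the comments; neither detail is stated in the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x "hits" in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum every table at least as extreme (as improbable) as the observed one.
    total = sum(p for p in map(prob, range(lowest, highest + 1))
                if p <= p_observed * (1 + 1e-7))
    return min(1.0, total)


# Rows: associated / not associated genera; columns: BloSSUM / other taxa.
# 9 of 14 urban-associated genera vs 6 of 95 remaining genera are BloSSUM.
p_blossum_urban = fisher_exact_two_sided(9, 5, 6, 89)
# 1 of 34 rural-associated genera vs 14 of 75 remaining genera are BloSSUM.
p_blossum_rural = fisher_exact_two_sided(1, 33, 14, 61)
```

Under these assumptions, the two calls should come out close to the reported p = 1.8 × 10−6 and p = 0.034, respectively.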

For rural living individuals the most strongly enriched genus was Succinivibrio (log2FC = 6.76, padj = 1.3 × 10−12). Together with Treponema (log2FC = 1.62, padj = 0.061), these genera showed a clear multimodal distribution within rural living individuals, with maximum relative abundance peaks detected at 8.6% and 7.0% for Succinivibrio and Treponema, respectively (post-hoc analysis, Fig. 3f, g). Six genera belonging to the family of Prevotellaceae were strongly associated with rural living, including Prevotella (7/9) and Alloprevotella (log2FC ≥ 3.92, padj ≤ 7.7 × 10−5). Similarly, five genera belonging to the family of Lachnospiraceae were enriched in rural living individuals including Butyrivibrio, Eubacterium ruminantium and Ruminococcus torques (log2FC ≥ 1.78, padj ≤ 3.3 × 10−4). Other highly significantly enriched genera in rural individuals included Fournierella, Holdemanella, Solobacterium, Sutterella, Anaeroplasma and Catenibacterium (padj ≤ 3.4 × 10−5; Fig. 3a and Supplementary Fig. 5).

In urban living individuals, abundance of Bacteroides and Parabacteroides was significantly higher compared to rural living (log2FC = -3.70 and -2.30, padj = 8.9 × 10-12 and 3.78 × 10-7, respectively). Bacteroides showed a multimodal distribution among individuals living at an urban location with two major peaks at a relative abundance of 1.2% and at 24.2% (Fig. 3g). Other highly urban-associated genera included Alistipes, Parasutterella, Bifidobacterium, Odoribacter, Bilophila and Akkermansia (padj ≤ 2.8×10-6). The gut microbiome of urban living individuals was further enriched for genera belonging to the family of Enterobacteriaceae, including Enterobacter, Klebsiella and Escherichia/Shigella (padj ≤ 0.009; Fig. 3a and Supplementary Fig. 5). Taken together, rural living individuals lacked BloSSUM taxa and had higher abundance of VANISH taxa such as Prevotella and Succinivibrio, whereas urban individuals were enriched with BloSSUM taxa like Bacteroides.

Rural and urban-associated differences in diet are linked to microbiota profiles

Given the multimodal distribution of the relative abundance of specific genera not only between rural and urban living individuals, but also within individuals from the same location (Fig. 3g), we next assessed the impact of dietary history on microbiota composition. To study this, we clustered samples into two Community State Types (CSTs; Supplementary Fig. 6a and Supplementary Fig. 6b) using Dirichlet-multinomial modeling (DMM). CST1 was characterized by a (non-significant) enrichment of VANISH bacteria (25% vs 12%, Fisher’s Exact test p = 0.106), including Prevotella, Alloprevotella, Holdemanella and Succinivibrio, whereas CST2 was strongly enriched for BloSSUM taxa (67% vs 7%, Fisher’s Exact test p = 6.5×10-6), like Bacteroides, Alistipes, Parasutterella and Bifidobacterium (Fig. 4a and Supplementary Fig. 6c-f).

Fig. 4: Frequent consumption of carbohydrate-rich dietary products is associated with urbanization.

a Volcano plot visualizing genera enriched in CST1 and CST2. Statistical significance between CSTs was assessed using linear models with genus-level abundance as outcome (no adjustment for covariates). Genera with padj < 0.05 and log2-fold change (FC) < −1.5 or > 1.5 are colored and genera with a padj < 1 × 10−7 and log2FC < −1.5 or > 1.5 were annotated. The shape of the data points indicates whether these genera were previously considered to belong to VANISH (volatile and/or associated negatively with industrialized societies of humans) or BloSSUM (bloom or selected in societies of urbanization/modernization) taxa. b Differences in lifestyle scores (PC1/PC2; Supplementary Fig. 3) and food scores (PC1/PC2; Supplementary Fig. 4) between rural living individuals, rural-like urbanites (urban living individuals belonging to community state type [CST]1; Supplementary Fig. 6) and urban living individuals (urban living individuals with CST2). See legend Fig. 2a, b. Statistical significance between groups (n = 3) was assessed using linear models with lifestyle/food score as outcome, adjusting for vaccination status, age and sex. Pairwise contrasts were extracted and adjusted using Tukey’s post-hoc test. c Association between food variables and group (rural living individuals, rural-like urbanites (urban living + CST1) or urban living individuals with CST2). For each comparison (panel), a separate logistic regression model was fitted including all food variables and sex. Model estimates are depicted along the x-axis (colored points). Colors indicate the group with which a given food variable is positively associated. Whiskers denote 95% confidence intervals (CIs; Wald-method). Association between food variables and genera (present in ≥10% of samples at >0.1% abundance across day 56 samples; n = 109) within rural (d) and urban living individuals (e). Results were stratified by food variable, and only food variables with any significant (p < 0.05) associations are shown. Log2 fold change (FC) is shown along the x-axis. Whiskers denote 95% confidence intervals (CIs; Wald-method). Asterisks denote statistical significance (* p ≤ 0.05; ** p ≤ 0.01; *** p ≤ 0.001).

Although the majority of samples from rural living individuals clustered in CST1 (93.2%), urban living individuals were split across CST1 (30.4%) and CST2 (69.6%). Baseline characteristics between urban living individuals with CST1 (referred to as ‘rural-like urbanites’) and urban living individuals with CST2 did not differ, except that a higher proportion of rural-like urbanites was male (71.5% vs 37.5%, p = 0.009; Supplementary Table 3). Comparing rural-like urbanites to urban-living individuals revealed highly divergent profiles, with differences similar to those observed when assessing CST1 compared to CST2. Conversely, rural-like urbanites showed much more subtle differences compared to rural-living individuals, with enrichment of Haemophilus and Akkermansia, but lack of rural-associated Succinivibrio and Fournierella (109 genera tested; padj < 0.05 and absolute log2-transformed fold change (FC) > 1.5; Supplementary Fig. 7).

We did not observe differences in either lifestyle or food score between rural-like urbanites and urban living individuals with CST2 (Fig. 4b). However, individual food variables did differ: rural-like urbanites consumed rice less frequently (logistic regression analyses adjusted for all other food variables and sex, β = −1.95, p = 0.032) and tended to consume beans and/or peas more frequently (β = 0.85, p = 0.078) than urbanites belonging to CST2 (i.e. non-rural-like urbanites). Nevertheless, rural-like urbanites still consumed rice more frequently than rural living individuals (β = 2.20, p = 0.033). Furthermore, rural-like urbanites ate significantly less ugali (maize stiff porridge) than rural living individuals (β = −3.38, p = 0.023), whereas their ugali consumption did not differ from that of other urban living individuals (CST2; β = −0.73, p = 0.212). Last, rural-like urbanites tended to eat potatoes less frequently than both urban living individuals (CST2; β = −0.815, p = 0.087) and rural living individuals (β = −2.02, p = 0.100; Fig. 4c), although neither difference reached statistical significance. Together, these findings suggest that urban living individuals who less frequently consume starch/carbohydrate-rich products (potatoes and rice) and more frequently consume fiber-rich products (beans and/or peas) harbor a microbiota reminiscent of that of rural living individuals, despite slight differences in dietary habits compared to this latter group.

Genus-level microbiota association tests (linear models including all food variables, vaccination, age and sex) were stratified by living location, given the strong collinearity between living location and dietary habits. Among rural living adults, we detected seven significant associations, suggesting enrichment of Prevotella, Prevotellaceae and Eubacterium ventriosum and lower abundance of Odoribacter with increased consumption of locally brewed beer. Odoribacter abundance was also negatively associated with the consumption of bananas, as was Alistipes abundance (Fig. 4d). Within urban living individuals, we detected five significant associations, indicating frequent ugali consumption is linked to the depletion of Oscillibacter, Christensenellaceae, Clostridia, Eubacterium eligens and Eubacterium siraeum (Fig. 4e).

Rural-like microbiota profile is linked to yellow fever neutralizing antibody titers

Next, we investigated whether vaccine immunogenicity differed between rural and urban living individuals. Across all vaccinated individuals (N = 155), no significant differences in yellow fever Plaque Reduction Neutralization Test (PRNT) titers (PRNT50 and PRNT90) were found between rural and urban living individuals at 4 weeks post-vaccination (generalized linear mixed effects regression [GLMER], adjusted for age and sex, p = 0.161 and p = 0.226, respectively, Supplementary Fig. 8a-b). However, for the microbiota sub-cohort (i.e. with stool sample available, N = 121), yellow fever neutralizing antibodies (PRNT50) at 4 weeks post-vaccination were higher in rural (geometric mean titer [GMT], 954 [95% CI, 742–1,226]) compared to urban living individuals (656 [95% CI, 490–880], p = 0.042). This was similar for PRNT90 values (p = 0.032, Fig. 5b and Supplementary Fig. 8d). Antibody titers (PRNT50) were lower at 6 months compared to 4 weeks post-vaccination for both urban (PRNT50; β = −1.16, p = 0.082) and rural living individuals (PRNT50; β = −1.48, p = 0.040), with rural living individuals showing a slightly stronger and statistically significant drop (Fig. 5b). Similar results were found when considering all individuals (N = 155; Supplementary Fig. 8c).

Fig. 5: Yellow fever neutralizing antibody titer variation across living location and with microbiota profiles (N = 121).

a–f Boxplots (see legend Fig. 2a, b) showing yellow fever neutralizing antibody titers (PRNT50) in vaccinated individuals from whom a stool sample was collected (N = 121) at baseline (day 0, d0), 4 weeks post-vaccination (day 28, d28) and 6 months post-vaccination (day 178, d178). Participants were compared according to group (rural/urban living location; a), CST (c) and group/CST (i.e. rural living individuals, rural-like urbanites [CST1] and urban living individuals [CST2]; e). Boxplots showing yellow fever neutralizing antibody titers (PRNT50) at 4 weeks post-vaccination (day 28, d28) compared to 6 months post-vaccination (day 178, d178) for group (rural/urban living location; b), CST (d) and group/CST (f). Diamond-shaped data points and color values at the bottom of the plot denote log10-base geometric mean titers. Statistical analyses were performed using a generalized linear mixed effects model with log10-transformed PRNT50-values as outcome, group, CST or group/CST, time point, the interaction between group, CST or group/CST and time point, age and sex as fixed effects and participant ID as random effect. Pairwise comparisons of estimated marginal means between groups were computed at each time point (a, c and e) or between day 28 and day 178 for each group (b, d and f). Values under the limit of detection (<10) were assumed to have a value of 5.
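The geometric mean titer (GMT) and the handling of values under the limit of detection described in this legend can be sketched in Python as follows. This only illustrates the arithmetic (averaging on the log10 scale after substituting 5 for titers below 10). It is not the mixed-effects model used for inference, and the function name and toy titers are hypothetical.

```python
import math


def geometric_mean_titer(titers, lod=10.0, below_lod_value=5.0):
    """Geometric mean of titers, substituting a fixed value for titers below the LOD."""
    adjusted = [below_lod_value if t < lod else t for t in titers]
    # Average on the log10 scale, then back-transform.
    mean_log10 = sum(math.log10(t) for t in adjusted) / len(adjusted)
    return 10 ** mean_log10


# Hypothetical PRNT50 titers; 0 stands in for a result reported as "<10".
toy_titers = [5120, 1280, 640, 320, 0]
toy_gmt = geometric_mean_titer(toy_titers)
```

Because averaging happens on the log scale, a single value below the limit of detection pulls the GMT down less than it would pull down an arithmetic mean of the raw titers relative to the other values, but the substituted value of 5 remains an assumption that affects the estimate.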

To assess whether microbiota profiles impact subsequent YF-17D-induced vaccine responses, we first ascertained that vaccination at day 0 had no impact on microbiota profiles collected at day 56. No significant difference in overall microbiota composition between day 56 samples from vaccinated compared to non-vaccinated subjects was found (PERMANOVA-test, adjusted for living location, age and sex, R2 = 0.5%, p = 0.787; Fig. 2d and Supplementary Fig. 9a). Also, paired day 0 and day 56 samples of vaccinated individuals did not show a consistent direction of movement (Supplementary Fig. 9b,c, PERMANOVA-test, R2 = 2.3%, p = 0.422). Within- and between-subject distances between sample pairs and permutation tests across between-subject distances similarly indicated no statistically significant effect of vaccination (Supplementary Fig. 9d,e). Therefore, day 56 samples were considered representative of day 0 (i.e. before vaccination) and were used as such for downstream analyses.

Next, yellow fever neutralizing antibodies (PRNT50 at 4 weeks post-vaccination), were linked to Shannon diversity estimates, adjusting for living location, the interaction between living location and Shannon diversity, age and sex. Shannon diversity was negatively associated with antibody titers in rural-living individuals (β = −0.345, p = 0.034), but positively in urban-living individuals (interaction term; β = 0.496, p = 0.025; Supplementary Fig. 8e). Similar results were found when considering PRNT50 at 6 months post-vaccination and PRNT90 titers.
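As a reading aid for the interaction model, the short Python sketch below derives the slope implied for urban living individuals. It assumes standard treatment coding with rural living as the reference level, so that the urban slope equals the rural slope plus the interaction term; the text does not state the coding, and the variable names are hypothetical.

```python
# Reported estimates from the interaction model described above.
beta_shannon_rural = -0.345  # Shannon slope in the (assumed) reference group: rural
beta_interaction = 0.496     # living location x Shannon diversity interaction term

# Under treatment coding, the urban slope is the sum of both coefficients.
slope_urban = beta_shannon_rural + beta_interaction
```

Under this assumption the implied urban slope is about 0.151, which is positive but smaller in magnitude than the negative rural slope.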

Subsequently, yellow fever neutralizing antibodies (PRNT50 or PRNT90) were related to CSTs, showing that CST1 (enriched in rural living individuals) is linked to higher antibody titers compared to CST2 (PRNT50; p = 0.005, Fig. 5c and Supplementary Fig. 8f [PRNT90]). Of note, this association showed large model estimates and was highly significant, as opposed to initial comparisons between rural and urban living individuals, indicating microbial signatures may better explain antibody variation than living location. Although antibody titers were higher 4 weeks post-vaccination, individuals with CST1 showed a much stronger and faster drop over time (PRNT50; β = −1.96, p = 0.002) compared to CST2 (β = −0.02, p = 0.805; Fig. 5d).

We found that within urban living individuals, there was a striking variation in antibody titers (particularly PRNT50) linked to CSTs, with rural-like urbanites showing higher PRNT50 compared to urban living individuals with a CST2-profile (GMT [95% CI], 1,064 [723–1,565] vs 548 [381–787], p = 0.022, Fig. 5e and Supplementary Fig. 8g [PRNT90]). Furthermore, in rural-like urbanites, stronger waning was seen (PRNT50 at 6 months compared to 4 weeks post-vaccination; β = −0.502, p = 2.5 × 10−4) compared to rural living individuals (β = −0.116, p = 0.065) and urbanites with CST2-profile (β = −0.015, p = 0.856, Fig. 5f). Findings for PRNT90 were similar but less pronounced (Supplementary Fig. 10). Together, these findings indicate that microbiota profiles might be an important driver of yellow fever neutralizing antibodies in this population.

Differential abundance analyses at genus level (n = 109) revealed no significant associations between specific genera and yellow fever neutralizing antibodies (PRNT50 or PRNT90) at 4 weeks or 6 months post-vaccination (linear models, adjusting for either (1) living location, age and sex or (2) age and sex). This may indicate that the bacterial community in its totality, rather than single genera, relates to yellow fever vaccine immunogenicity.

Discussion

Here, we assessed the associations between rural or urban living location, gut microbiota profiles, and yellow fever vaccine responses. We identified striking differences in microbiota community composition between rural and urban living individuals, which appeared in part related to dietary habits. Gut microbiota profile variation was linked to yellow fever vaccine immunogenicity and waning at the microbiota community level, but not at the genus level.

We show that rural living individuals harbor a gut microbiota enriched for Succinivibrio, Treponema, and Prevotella, which is consistent with literature20,26,28,39 and likely explained by variation in dietary habits. Higher abundance of these microbiota members has been associated with the digestion of plant-rich diets and the production of short-chain fatty acids24,40,41. Indeed, we found that Tanzanians living in rural settings consume more grain-based food products such as ugali (maize stiff porridge), vegetables and local beer made from a mix of fermented bananas and finger millet26. Strikingly, within those living in rural areas, local beer consumption in particular was associated with a higher abundance of Prevotella, suggesting it may be a particularly important driver of the rural microbiota signature.

Urban living individuals exhibited a higher abundance of Bacteroides, Parabacteroides, Enterobacteriaceae and Bifidobacterium, which may be related to the consumption of a more carbohydrate-rich diet, again largely aligning with literature27,31,42. Interestingly, approximately half of the urban-living individuals showed a high abundance of Prevotella, which we identified as characteristic of the rural microbial signature. The microbiota composition of this subgroup of urban living individuals co-clustered with rural living individuals based on their microbial make-up. Despite that, these rural-like urbanites showed a different dietary consumption pattern from both rural living individuals and their urban counterparts, with high consumption of beans and peas (rich in proteins, carbohydrates, and dietary fibers) and lower rates of potato and rice consumption. Within the urban population we found that frequent consumption of ugali (maize stiff porridge) is related to a reduced abundance of Eubacterium siraeum. Eubacterium siraeum has previously been linked to the consumption of a Western diet and systemic inflammation43, possibly indicating that ugali consumption limits the establishment of microbes related to industrialization.

Genera associated with rural or urban living location were classified as either BloSSUM or VANISH taxa. The distinction between these taxa has been based on previous work comparing the gut microbiome of a Tanzanian population of Hadza hunter-gatherers to that of populations living industrialized lifestyles22. Although rural living individuals in our study showed a lack of industrialized lifestyle-associated BloSSUM taxa, we did not observe a significant enrichment for VANISH taxa, suggesting that the population of rural living individuals recruited in our study may indeed be in transition between traditional and industrialized lifestyles. Conversely, a clear signature of Westernized microbiota profiles was detected in urban-living individuals, indicated by significant enrichment of BloSSUM taxa. Apart from diet, there may be other host and environmental factors contributing to the observed differences in microbiota composition between rural and urban living individuals. Among others, agricultural activities, such as livestock farming44,45, and environmental exposures, including indoor cooking, wood stove cooking or exposure to pollutants, may impact microbiota composition46. Helminth infections have previously been linked to microbiota changes47,48,49,50, but since helminth infection prevalence was low in our rural cohort (4.3%), we presume this was not a main driver of rural-urban differences in our study.

We detected slightly higher yellow fever neutralizing antibody responses 4 weeks post-vaccination in participants living in rural compared to urban settings. The difference was statistically significant in the microbiota sub-cohort, but not in the total vaccinated cohort. At 6 months, titers remained higher in rural settings. This is not in line with the general hypothesis that vaccine immunogenicity is lower in rural than in urban-living individuals, especially within LMICs, which has been observed for a wide range of vaccines6,51.
Yellow fever vaccine, which is highly effective, has previously been used to model human immune responses to vaccines in general. As such, several studies assessed yellow fever vaccine immunogenicity, showing lower antibody titers and rapid waning in Ugandan compared to Swiss individuals and lower seroprevalence rates in rural Ghanaian compared to urban Malian infants52,53. In both studies, the observed differences in immunogenicity are relatively small compared to other vaccines, such as for malaria54,55, and may be obscured by between-country differences. Moreover, discordant results have been reported in some studies assessing factors associated with yellow fever vaccine immunogenicity56,57. Regardless, a recent study in Uganda (POPVAC)58 did show higher yellow fever vaccine antibody titers in urban living individuals compared to individuals living in rural settings, which is in contrast with our findings and at least in part could be explained by the high helminth and malaria infection prevalence in the Ugandan study. Also, the rural sites sampled in our study are relatively developed compared to other rural sites in Tanzania and some other East African countries, with better access to health care and nutrition59,60. Last, rural individuals in our study live at a high altitude, which induces a hypoxic state, as reflected by their high hemoglobin levels. We speculate this causes an increase of the transcription factor hypoxia-inducible factor (HIF), which induces metabolic and phenotypic changes in B cells and boosts B cell differentiation and antigen switching, thus resulting in higher antibody titers in these rural individuals61,62,63,64.

We found that especially within urban living individuals, a rural-like microbiota profile was related to relatively higher antibody titers but stronger waning, as opposed to a more Prevotella-depleted, industrialized microbial community composition.
Despite this, no significant associations were detected between specific genera and yellow fever vaccine-induced neutralizing antibody titers. This lack of genus-level associations could be explained by insufficient statistical power or limited sequencing resolution from 16S-rRNA-sequencing. Alternatively, it may suggest that the overall bacterial community, rather than individual genera, is linked to yellow fever vaccine immunogenicity. The microbiota has been suggested to potentiate vaccine responses by providing adjuvating signals, although this seems most relevant to vaccines other than the yellow fever vaccine, which has endogenous adjuvants and is therefore able to engage with toll-like receptors (TLRs) and RIG-I receptors itself32. Alternatively, we speculate higher yellow fever antibody titers observed in rural-living individuals and rural-like urbanites could be related to the increased consumption of fiber-rich diets, which enriches for short-chain fatty acid (SCFA)-producing bacteria, such as Prevotella40. In B cells, SCFAs increase acetyl-CoA and metabolic activity, resulting in increased antibody production65. Indeed, a study showed that mice fed with a diet low in dietary fibers had lower production of SCFAs and a reduced specific antibody response towards pathogens65. It is possible that the microbiota-associated increase in antibody titers observed in rural individuals in our study has been obscured in other rural populations by factors such as a high prevalence of helminth infections and other infections such as malaria.

Although the gut microbiota of rural and urban living individuals in LMICs has been described previously, there remains a significant gap in the literature regarding gut microbiota profiles from LMIC populations currently transitioning from traditional to industrialized lifestyles. Our work expands on this topic, but also furthers our understanding of microbiome-host associations in the context of yellow fever vaccination. Our study also has limitations, such as the use of 16S-rRNA sequencing rather than shotgun sequencing. Additionally, as this was primarily an observational study aimed at identifying associations between the gut microbiome and the yellow fever vaccine responses, we could not assess causal effects. Also, we cannot exclude the possibility that (unmeasured) microbiota-independent factors (e.g., altitude, genetic variation and historic microbial exposures) drive vaccine immunogenicity, although the observed variation within the urban subgroup argues against this. Fungal and viral microbiota may covary with the bacterial microbiota and explain the residual variation in vaccine responses we identified. We lacked longitudinal stool samples for most participants, as most samples were collected post-vaccination, which may have influenced the findings. However, our extensive analyses, also leveraging the paired samples we had available, indicated no major impact of the vaccine on the gut microbiota. This supports our assumption that the microbiota measured post-vaccination is reflective of the baseline microbiota in these individuals.

To further explore microbe-host interactions in the context of yellow fever vaccination, future studies could include measurements of the cellular immune responses, which have previously been shown to differ at least between HIC and LMIC populations52. Taken together, we show strong differences in gut microbiota profiles between rural and urban living individuals, with a subset of urban living individuals currently transitioning towards a more industrialized microbiota profile. Microbiota variation within urban living individuals was associated with moderate variation in yellow fever vaccine-induced neutralizing antibodies and antibody waning. These findings suggest that gut microbiota profiles may impact vaccine responsiveness in a vaccine- and context-dependent manner. Identifying adult subpopulations where microbiota influences vaccine responsiveness creates opportunities for research on microbiota-based interventions.

Methods

Study design

This is a prospective longitudinal cohort study (CapTan) in a healthy, 18–35-year-old Tanzanian population recruited in rural and urban Moshi, Northern Tanzania. Volunteers were enrolled and randomized into a vaccine group receiving a single 0.5 mL intramuscular dose of the yellow fever vaccine (YF-17D, Sanofi-Pasteur) and a control group. Randomization occurred by allocating every sixth individual to the control group. No placebo was administered to the control group. Neither the volunteers nor study personnel were blinded to group allocation. Plasma samples were collected at baseline (pre-vaccination), day 28 (4 weeks post-vaccination), day 56 (8 weeks post-vaccination), and day 178 (~6 months post-vaccination). Paired stool samples were collected at baseline and on day 56 for 11 individuals. For 143 individuals, stool samples were collected on day 56 only. All questionnaires and clinical samples were collected by a well-trained study team consisting of medical doctors, nurses, and laboratory scientists. All samples were processed according to established standard operational procedures and good clinical and laboratory practice principles.

The study protocol was approved by the Ethical Board of the KCMC University (No. 2588) and by the Tanzania National Ethical Committee Board (NIMR/HQ/R.8a/Vol.IX/4089). The study was registered under The Pan African Clinical Trial Registry (PACTR) with trial number PACTR202405738173023 on 03 May 2024. Data were collected in REDCap, a cloud-based electronic data collection system, with a server hosted at the Kilimanjaro Clinical Research Institute (KCRI) in Tanzania.

Description of study areas

The study was conducted in rural and urban Moshi, located in the Kilimanjaro region (total population of 1.9 million; Fig. 1a). The rural study area is at an elevation of 2000–2100 meters above sea level, while urban Moshi is at 700–950 meters above sea level. The district of Moshi City (urban Moshi) is the administrative, commercial, and educational center of the Kilimanjaro region, having 331,733 inhabitants. Most people practice a Western lifestyle with good general sanitation and access to clean water. The main ethnicities are Chagga and Pare. Formal business is the main activity, followed by government and public employment, while a small proportion of people are involved in agricultural and entrepreneurial activities. Rural Moshi has about 535,803 inhabitants who are mainly involved in farming activities. Most people have access to clean water, but a small proportion uses borehole water sources. People live in large family units, and their main economic activities are subsistence farming and animal husbandry. The primary ethnicity is Chagga, and people follow Chagga traditions, such as drinking local brews made from banana or plantain.

Participant screening and enrollment

In rural Moshi, study information was given through community leaders and announcements during mass gatherings in mosques, churches, and village meetings. In urban Moshi, study information was distributed using leaflets and through community leaders, office announcements and university gatherings. Eligible participants (aged 18–35 years and with permanent residency of a given location) were asked to enroll in the study. Following informed consent, 233 participants were voluntarily screened for in- and exclusion criteria. Exclusion criteria were having an acute or chronic disease (including HIV, tuberculosis, cancer, cardiovascular disease, gastrointestinal diseases, recurrent infections, liver, renal, endocrine or neurological disorders), ≥2 hospital admissions in the last year, chronic use of antibiotics or corticosteroids, use of other immunosuppressive drugs, recent receipt of another vaccine, lactation, positive pregnancy test, history of blood product transfusion, hemoglobin level ≤8.5 g/dL, testing positive for HIV or malaria, low or high blood pressure (≤90/60 mmHg and ≥140/90 mmHg, respectively) or high blood glucose (≥7.1 mmol/L fasting or ≥11.1 mmol/L random glucose).

Participants were screened for HIV infection (SDBIOLINE HIV-1/2 3.0 kit, LOT:03ADG020A), malaria (Malaria Ag p.f/Pan, Ref: 05FK60, LOT:05EDG018A) and soil-transmitted helminths such as hookworms (Ancylostoma duodenale and Necator americanus), Trichuris trichiura, Ascaris lumbricoides, Strongyloides stercoralis and Schistosoma mansoni using Kato-Katz or POC-CCA (Schistosoma mansoni and Schistosoma haematobium; batch no:220701075). Furthermore, hemoglobin levels were measured (HemoCue Hb 301; CE:1450820055) and random blood glucose was assessed (ACCU-CHECK glucose test strips, Roche Diabetic care, 06993761001). Weight and height were measured using a well-calibrated machine (RGZ-160, made in China), and blood pressure was measured using OMRON (SN:202111007949 V). All individuals with abnormal laboratory or clinical findings except those with parasitic infections received nurse counseling, referral, or treatment before being excluded. Based on exclusion criteria, 48 of 233 participants were excluded.

Lifestyle questionnaire

Questionnaires were adapted from the Tanzania Demographic and Health Survey and Malaria Indicator Survey (TDHS-MIS) and a food history questionnaire previously applied to a similar population54,60. We collected information on the water source, toilet type, available cooking facilities, and the materials used to construct the floor, roof, and walls of the house. We assessed exposure to livestock by collecting information on the number of milk cows, cattle, goats, sheep, horses, and poultry owned. Inquiries were made on land ownership and possession of non-productive assets, including radios, televisions, computers, refrigerators, ironing tools (whether powered by charcoal or electricity), watches, motorcycles, trucks, animal-drawn carts, generators, and motorboats. Using food questionnaires, data were collected on how frequently participants consume dietary products each week, including ugali (maize stiff porridge), plantain, banana, rice, potatoes, meat, fish, beans/peas, green vegetables, fruits and locally brewed beer.

Microbiota profiling

Stool sample collection

Stool samples were collected at baseline and/or at day 56. For rural individuals, stool samples were collected at the outpatient clinic or at home. For urban individuals, stool was collected at the Kilimanjaro Christian Medical Center (KCMC) or at home. Feces were initially stored in a dry stool container before being transferred into DNA/RNA Shield Fecal Collection tubes (Zymo Research, Irvine, California, USA) for transportation and storage (median [IQR] time between stool production and storage in medium, 58.5 [15.0–113.0] minutes). Samples were stored in a −80 °C freezer at the KCMC (median [IQR] time between storage in medium and storage in freezer, 120.5 [93.0–168.5] minutes). Samples were transported to the Netherlands on dry ice and stored at −80 °C before further processing. Samples of individuals who took antibiotics between study enrollment and stool sample collection at day 56 were excluded (n = 9).

Bacterial DNA isolation

DNA was extracted from 250 µL diluted feces (in DNA/RNA Shield) by Repeated Bead Beating (RBB) combined with purification using the chemagic DNA Stool 200 Kit H96 (Revvity, Waltham, Massachusetts, USA). Briefly, 250 µL diluted feces was homogenized with 1.0 mL Lysis Buffer 1 of the chemagic DNA Stool 200 Kit H96 by vortexing in a 2 mL tube (with screw cap) containing 0.5 g of sterile zirconia beads Ø 0.1 mm (BioSpec, Cat. No. 11079101z). Cells were mechanically lysed on a FastPrep-24™ 5G Instrument (MP Bio, Irvine, California, USA) at 5.5 m/s for 3 cycles of 1 minute. Samples were subsequently centrifuged (16,000 × g, 4 °C, for 5 min) and the supernatant was transferred to a new tube to which 30 µL Proteinase K was added, mixed and incubated for 10 minutes at 70 °C. Thereafter, samples were incubated for another 5 minutes at 95 °C followed by centrifugation for 5 minutes at high speed (13,000 rpm). Deep-well plates were filled with 800 µL of the supernatant (lysates), and further isolation was done on a chemagic™ 360 instrument (Revvity, Waltham, Massachusetts, USA) according to the manufacturer’s instructions. Finally, purified DNA was quantified on a Qubit Fluorometer (Thermo Fisher Scientific, Waltham, Massachusetts, USA) using the dsDNA HS Assay Kit (Invitrogen, Waltham, Massachusetts, USA).

16S-rRNA gene amplicon sequencing

The V3-V4 variable regions of the bacterial 16S rRNA gene were PCR amplified from each DNA sample in a single reaction workflow with simultaneous indexing and target amplification using the EasySeq™ 16S Microbiome Library Prep Kit (NimaGen, Nijmegen, the Netherlands) according to the manufacturer’s instructions. Amplicon libraries were sequenced on an Illumina MiSeq instrument (Illumina, Eindhoven, The Netherlands) (MiSeq Reagent Kit v3, 2 × 300 cycles, 10% PhiX) to generate paired-end reads of 300 bases in length in both directions.

16S rRNA gene amplicon sequence data processing

Forward and reverse primers were removed using cutadapt (v4.7)66. Next, using DADA2 (v1.28.0), paired-end sequences were filtered and trimmed (maxEE = 2, truncLen = 240/210 bp), denoised, merged (minOverlap = 10, maxMismatch = 0), and a sequence table was constructed. Chimeras were identified and removed (method = ‘consensus’). After denoising and merging, any ASVs with a length of <350 or >500 bases were discarded. ASVs were annotated up to genus-level using the DADA2 implementation of the naïve Bayesian classifier based on the SILVA v138.1 reference database. Species-level annotations were added using the addSpecies()-function67. ASVs not assigned to the kingdom Archaea/Bacteria or annotated as Mitochondria (family) or Chloroplast (class) were removed.
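The final ASV filtering rules described above can be illustrated with a minimal Python sketch. This is not the code used in the study (which relied on DADA2 in R); the function name `keep_asv` and the dictionary layout of the taxonomy record are hypothetical and chosen only for illustration.

```python
def keep_asv(sequence, taxonomy, min_len=350, max_len=500):
    """Return True if an ASV passes the length and taxonomy filters.

    `taxonomy` is a hypothetical dict of rank -> assigned name,
    e.g. {"Kingdom": "Bacteria", "Class": "...", "Family": "..."}.
    """
    # Discard ASVs shorter than 350 or longer than 500 bases.
    if not (min_len <= len(sequence) <= max_len):
        return False
    # Keep only ASVs assigned to the kingdoms Bacteria or Archaea.
    if taxonomy.get("Kingdom") not in ("Bacteria", "Archaea"):
        return False
    # Remove organelle-derived sequences, at the ranks named in the text.
    if taxonomy.get("Family") == "Mitochondria":
        return False
    if taxonomy.get("Class") == "Chloroplast":
        return False
    return True
```

Applied to a table of ASVs, a filter of this kind would be evaluated once per sequence, retaining only those for which the function returns True.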

Yellow fever neutralizing antibody titer measurements

Blood sample collection and processing

Blood samples were collected in 5 mL EDTA tubes on day 0 (pre-vaccination), day 28 (4 weeks post-vaccination; presumed peak in antibody titers) and day 178 (~6 months post-vaccination; to detect delayed responses and assess waning). Blood samples were transported to a clinical laboratory for plasma separation through centrifugation. Plasma samples were stored at −80 °C within two hours. Samples were shipped between institutions on dry ice.

Yellow fever plaque reduction neutralization tests

Plasma neutralizing antibodies against the yellow fever virus from all individuals were quantified using plaque reduction neutralization tests (PRNTs). Results are presented as PRNT50 and PRNT90 titers, which correspond to the reciprocal of the plasma dilution, achieving a 50% and 90% reduction in virus plaque-forming units per 0.1 mL of the reference 17D virus preparation, respectively. Assay quality control included determining the 50% and 90% neutralization cut-off values for each assay via back titration of the virus inoculum. Further details are provided elsewhere68.

Data analysis

All data preprocessing and statistics were performed in R v4.3.3 within RStudio v2024.04.1+748 (Boston, MA). Microbiota-specific analyses and visualization, including alpha- and beta-diversity analyses and microbiota association tests, were performed using the microViz v0.12.1 R package69. All statistical tests were two-sided and p-values were corrected for multiple testing using the Benjamini-Hochberg procedure (referred to as padj-values). P-/padj-values < 0.05 were considered statistically significant.
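For readers unfamiliar with the Benjamini-Hochberg procedure, the adjustment can be sketched in a few lines of Python. This is an illustrative re-implementation under the standard definition of the procedure, not the code used in the study; the function name `benjamini_hochberg` is our own.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For example, the raw p-values 0.01, 0.04, 0.03 and 0.005 are adjusted to approximately 0.02, 0.04, 0.04 and 0.02, so all four would remain below the 0.05 threshold.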

Baseline characteristics

Descriptives for baseline and lifestyle questionnaire data were generated using the gtsummary v1.7.2 R package.

Lifestyle and food score

Lifestyle questionnaire data (N = 185) were mode imputed (missing values) and subjected to either Multiple Correspondence Analysis (MCA) or Principal Component Analysis (PCA) to calculate lifestyle and food score, respectively. Lifestyle score was calculated by applying MCA to categorical questionnaire data (n = 38 manually curated lifestyle-related questions; 21 on assets, 11 on food, and 6 on housing) (FactoMineR v2.7 R package, MCA()-function; PC1 defined as ‘lifestyle score’). To calculate a food score, variables capturing the frequency of consumption of dietary products (n = 11; 0, 1, 2–4 or ≥5×/week) were converted into ordinal variables and used as input variables for a PCA ordination (PCA()-function). Coordinates of samples and variable categories/variables were visualized in biplots. For lifestyle score analyses, (cumulative) variable category contributions were extracted and shown.
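The preprocessing step before ordination (mode imputation followed by ordinal coding of consumption frequencies) can be sketched as follows. This is a simplified illustration and not the study code: the category labels, the integer coding 0 to 3 and the function names are assumptions, and the subsequent MCA/PCA step is not reproduced here.

```python
from collections import Counter

# Assumed integer coding of the four frequency categories (times per week).
FREQUENCY_LEVELS = {"0": 0, "1": 1, "2-4": 2, ">=5": 3}


def impute_mode(values):
    """Replace missing answers (None) with the most frequent observed answer."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def encode_frequency(values):
    """Mode-impute one food variable and convert it to ordinal integers."""
    return [FREQUENCY_LEVELS[v] for v in impute_mode(values)]
```

Each of the 11 food variables would be encoded in this way before being passed to the ordination.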

Alpha-diversity analysis

Diversity analyses were performed on unfiltered and non-rarefied raw read counts (ASVs present in ≥2 samples). Shannon diversity was primarily used as a measure for within-sample/alpha-diversity, leveraging both species richness and evenness (ASV- and genus-level). In addition, we compared the number of observed ASVs or genera. Statistical significance between groups was assessed using linear models with observed richness or Shannon diversity as an outcome, adjusting for vaccination status, age, sex, and sequencing depth.
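As a brief illustration of the two alpha-diversity measures, the sketch below computes observed richness and the Shannon index from a vector of read counts for one sample. It assumes the natural logarithm, which is a common convention but is not stated in the text, and it is not the microViz implementation used in the study.

```python
import math


def observed_richness(counts):
    """Number of taxa (ASVs or genera) with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)
```

A sample with four equally abundant taxa has a Shannon index of ln(4), roughly 1.39, whereas a sample dominated by a single taxon approaches 0, which shows how the index combines richness and evenness.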

Beta-diversity analysis

Beta-diversity analyses were performed on centered-log-ratio [CLR]-transformed genus-level abundance matrices. CLR-transformed abundances were used in principal components analysis (PCA) to visualize major patterns of microbiota variation. Associations between rural/urban living location, host and environmental variables (age, sex, helminth infection status, highest level of education, body mass index (BMI), lifestyle and food scores (PC1 and PC2), vaccination status) and technical variables (sequencing depth) were explored by multivariable Permutational Multivariate Analysis of Variance (Aitchison-distance matrix; PERMANOVA), adjusting for living location, vaccination, age and sex as appropriate. For variables for which collinearity with living location was expected (i.e. lifestyle/food scores, helminth infection status, and sex), models were adjusted for vaccination, age, and sex (dropping living location). Paired day 0 and day 56 samples from vaccinated individuals were tested for differences between time points (indicating a vaccination effect) using a PERMANOVA with permutations restricted within participants.
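The relationship between the CLR transformation and the Aitchison distance can be made concrete with a short sketch: the Aitchison distance between two samples is the Euclidean distance between their CLR-transformed abundances. The pseudocount used to handle zero counts is an assumption made here for illustration; the zero-handling strategy of the study is not described in the text, and this is not the microViz implementation.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered-log-ratio transform of one sample's genus counts.

    The pseudocount (an assumed value) avoids taking the log of zero.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def aitchison_distance(a, b, pseudocount=0.5):
    """Euclidean distance between the CLR-transformed samples a and b."""
    ca, cb = clr(a, pseudocount), clr(b, pseudocount)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))
```

Because CLR values are centered per sample, they sum to zero, and without a pseudocount two samples with identical proportions but different sequencing depths have a distance of zero.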

Microbiota clustering

Microbiota profiles (n = 154 samples) were clustered into Community State Types (CSTs) by fitting a Dirichlet Multinomial model (DirichletMultinomial v1.46.0 R-package; dmn()-function). Genus-level count data were taken as input, and models were fitted with 1 to 10 components. The optimal number of Dirichlet components was determined by inspecting measures of fit (Laplace, AIC, and BIC). Species associated with either CST1 or CST2 were identified using differential abundance analyses (MaAsLin2; no adjustment for covariates, see below).

Food-variable association tests

To assess the association between food variables (frequency of consumption of dietary products per week) and living location and living location/CST (including rural-like urban individuals), we used logistic regression analysis, adjusting for all other food variables and sex.

Differential abundance analysis

Differentially abundant taxa between groups were identified using linear models, similar to the implementation in the MaAsLin2 framework (default parameters; log2-normalization)68. For each comparison, genera present at >0.1% relative abundance in ≥10% of samples were selected. We explored the associations between (1) rural/urban living location (adjusted for vaccination status, age and sex), (2) CST1 and CST2 (no adjustment for covariates), (3) urban (urban + CST2)/rural living individuals and rural-like urbanites (urban + CST1) (no adjustment for covariates), (4) food variables (stratified by living location; adjusted for other food variables and sex), and (5) log10-transformed yellow fever neutralizing antibodies (adjusted for living location, age and sex or age and sex) and relative abundances of individual taxa. When comparing >2 groups (e.g. urban/rural/rural-like urbanites), estimates and p-values for each pairwise comparison between groups (e.g. urban vs rural) were extracted using the emmeans v1.10.0 R-package. p-values were adjusted per rank, term and contrast, as appropriate, and resulting BH-adjusted p-values below 0.05 were considered statistically significant.
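The abundance and prevalence filter applied before differential abundance testing (>0.1% relative abundance in ≥10% of samples) can be sketched as below. The nested-dictionary input format and the function name `select_genera` are hypothetical and serve only to illustrate the rule; the study itself performed this step in R.

```python
def select_genera(counts, min_abundance=0.001, min_prevalence=0.10):
    """Return genera exceeding `min_abundance` in at least `min_prevalence` of samples.

    `counts` is a hypothetical dict: sample ID -> {genus: read count}.
    """
    n_samples = len(counts)
    hits = {}  # genus -> number of samples in which it exceeds the threshold
    for genus_counts in counts.values():
        total = sum(genus_counts.values())
        if total == 0:
            continue  # skip empty samples; they still count in the denominator
        for genus, reads in genus_counts.items():
            if reads / total > min_abundance:
                hits[genus] = hits.get(genus, 0) + 1
    return sorted(g for g, n in hits.items() if n / n_samples >= min_prevalence)
```

With 154 samples, a genus would need to exceed 0.1% relative abundance in at least 16 samples to be retained under this rule.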

BloSSUM and VANISH taxa

Classification into BloSSUM (bloom or selected in societies of urbanization/modernization) and VANISH (volatile and/or associated negatively with industrialized societies of humans) taxa was based on a publication describing gut microbiota profiles in a non-industrialized population of Hadza hunter-gatherers. Lists of BloSSUM and VANISH taxa were extracted from the publication and collapsed at genus-level. Genera classified as both BloSSUM- and VANISH-associated were dropped. For differentially abundant genera associated with rural/urban living or CST1/CST2, we tested for significant enrichment of either BloSSUM or VANISH taxa using Fisher’s Exact tests.
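The enrichment test can be illustrated with a self-contained two-sided Fisher's exact test on a 2 × 2 table, for example differentially abundant genera (urban-associated versus rural-associated) cross-tabulated against classification (BloSSUM versus not BloSSUM). The sketch assumes a two-sided test, in line with the general statement that all tests were two-sided; the table layout and function name are our own, and the study itself used R.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed
    # one; the small tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)
```

For instance, the table [[8, 2], [1, 5]] yields a p-value of about 0.035, which would be considered a significant enrichment at the 0.05 level.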

Multimodal analyses

Based on visual inspection of stacked bar plots, we manually selected several genera for which the (multimodal) distribution of relative abundance was assessed in a post-hoc analysis using the multimode R-package. The number of modes was determined using the nmodes()-function, based on the inspection of a stairs plot across a range of density bandwidths. The mode location was determined by the locmodes()-function.

Yellow fever neutralizing antibody titers

Log10-base geometric mean yellow fever neutralizing antibody titers (PRNT50 and PRNT90) at baseline (pre-vaccination), day 28 (4 weeks post-vaccination), and at day 178 (~6 months post-vaccination) were calculated. Values under the limit of detection (<10) were assumed to have a value of 5. Statistical analyses were performed using a generalized linear mixed effects model with log10-transformed PRNT50/PRNT90-values as outcome, group, CST or group/CST, time point, the interaction between group, CST or group/CST and time point, age and sex as fixed effects, and participant ID as a random effect. Pairwise comparisons of estimated marginal means between groups were computed at each time point or between day 28 and day 178 for each group (emmeans v1.10.0 R package).
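The calculation of a geometric mean titer with substitution of values below the limit of detection can be sketched as follows. This is an illustration of the rule described above rather than the study code; representing censored values as numbers below 10 and the function name are assumptions.

```python
import math


def geometric_mean_titer(titers, lod=10.0, substitute=5.0):
    """Geometric mean titer computed on the log10 scale.

    Titers below the limit of detection (`lod`) are replaced by `substitute`.
    """
    values = [substitute if t < lod else t for t in titers]
    mean_log10 = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log10
```

For example, titers of 10 and 1,000 give a geometric mean of 100, whereas the arithmetic mean would be 505, which illustrates why the log scale is preferred for titer data spanning several orders of magnitude.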