Introduction

The impact of parasites on health is usually studied in the laboratory under controlled conditions with a relatively small sample of host individuals. Scaling such studies to large wild populations can be challenging because of the level of sampling required, but molecular microbiome approaches make large-scale surveys of wild populations tractable. In Antarctic penguins, the difficulty of these parasite surveys is compounded by the remote location and climate. Landscape-level surveys are, however, a priority in the polar regions, since the extreme seasonality and large variations in local and regional climate might influence parasite persistence and diversity. Features of polar ecosystems (such as freezing of soil outside the breeding season) mean that polar parasitology likely differs from better-documented temperate and tropical systems. Freezing temperatures may kill parasites outside the host between seasons and affect transmission of some parasites.

While there is limited research on parasites in Antarctic fauna, helminths (parasitic worms) generally have lower diversity than their temperate and terrestrial counterparts. Intestinal parasite diversity has also been found to be lower in pelagic species than in those that spend the majority of their time on land1. The majority of helminths in these pelagic species are generally tapeworms (Cestoda), and this remains the case in penguins1,2. There have been 13 recorded helminth species in Antarctic and sub-Antarctic penguins1. Studies of internal parasites in penguins have generally been reliant on necropsies3,4; these ‘samples of convenience’ may be biased towards sick animals that are not representative of the larger population, which limits our understanding of parasite loads throughout the population at large. Some studies have used faecal samples from birds to detect the presence of parasites, but these hinge on the detection and identification of tapeworm eggs in faeces5, which is accurate6 but can be time-consuming. Molecular methods can be an alternative and are more easily scaled.

In pygoscelid penguins, which comprise Adélie, chinstrap, and gentoo penguins in the Antarctic and sub-Antarctic islands, the cestode Parochites zederi and the nematode Stegophorus macronectes are most frequently reported1,3,4,7,8,9. In a necropsy-focused study, Martin et al. 2016 found that 7 of 8 adult pygoscelid small intestines contained Parochites zederi4, which were associated with lesions, haemorrhage, and oedemas, and probably affected the health of the birds4. There are also two reports of Tetrabothrius pauliani (another cestode) in chinstrap penguins7,10. Palacios et al. 201211 found a correlation between de-worming treatment and greater weight gain among free-living chinstrap penguin chicks, which supports the hypothesis that parasites can impact bird fitness, at least in early developmental stages.

Little is known about the lifecycles of the tapeworm species that have been associated with gentoo penguins and other pygoscelids. Generally, intermediate hosts are thought to be the krill and fish that dominate pygoscelid diets, though the greater reliance of chinstrap and Adélie penguins on krill versus the more flexible gentoos12,13 may influence infection rates and helminth identity. As Parochites zederi belongs to the Dilepididae, which usually involve crustaceans in their lifecycle, there is a hypothesis that krill rather than fish are an intermediate host4,14. In one study, 7 of 9 adult pygoscelids had multiple P. zederi present per specimen, almost all of which were small or immature7. Tapeworms of the genus Tetrabothrius are commonly found in pelagic birds, and thus their detection in pygoscelids is unsurprising. A lifecycle encompassing a first and second intermediate host of crustacean and fish or cephalopod for those Tetrabothrius that infect pygoscelids has been proposed7.

Generally, while helminth infection is not believed to be sufficient to cause direct mortality in infected populations, it likely contributes to mortality and morbidity when in combination with other stressors including nutritional, environmental and infectious challenges15. Indeed, helminths can modulate immune responses to other infections and affect microbiome communities, which in turn are thought to influence general health and resilience in host species16. Given that environmental factors (e.g., temperature) and diet are key variables that may affect the life history of helminth infections, it is reasonable to suggest that the changing climate in the Western Antarctic will lead to changes in the distribution of parasites, as also suggested by Carlson et al.17. This study documents helminth infection of penguins across a broad ecological range and may help predict the current and future impact of these infections on the health and resilience of wild penguin populations.

Materials and methods

We reanalysed the data set reported in Kaczvinsky et al.18, which is accessioned at the Sequence Read Archive (SRA) repository under project PRJNA956456. These samples were collected from penguin colonies at 23 sites around the Scotia Arc, with 1 to 26 samples per colony (see Supplemental Table 1). Within each colony, faecal samples were taken from nests at least 2 nests apart to minimize resampling of an individual. These samples were then preserved in RNALater and were frozen on arrival in the UK. DNA was then extracted using the MoBio PowerSoil® kit (Qiagen, 51804, now branded as Qiagen QIAamp® PowerFecal® DNA Kit), with the lysis step modified to a 12–18 h bead beating on a shaker block at 65 °C. Extracted DNA was then frozen at -20 °C and amplified using barcoded PCR primers. Prokaryotic barcodes were amplified from the V3-V4 region of the 16S rRNA gene using primers 16S_338F (ACTCCTACGGGAGGCAGCAGT)19 and 16S_806R (GGACTACHVGGGTWTCTAAT)20. Eukaryotic DNA was amplified by targeting a ~170 bp fragment of the V7 region of the nuclear small subunit (SSU) 18S rRNA gene using primers 18S rRNA gene_SSU3_F (GGTCTGTGATGCCCTTAGATG) and 18S rRNA gene_SSU3_R (GGTGTGTACAAAGGGCAGGG)21. DNA amplification involved standard PCRs with the high fidelity Phusion Hot Start Flex DNA polymerase enzyme (New England Biolabs, UK, M0535), with 2 µL of a 1:10 dilution of extracted sample. Negative controls were included at the PCR step and are available in the data set as WC.

PCR products were visualised on agarose gels and normalized with a sub-group of samples quantified with the Qubit 1X dsDNA High-Sensitivity Assay Kit (Invitrogen, UK, Q33231) using a Qubit Fluorometer. Samples were pooled into batches of up to 96 samples and cleaned with the MinElute PCR purification kit (Qiagen, UK, 28004). Libraries were then prepared with the NEBNext® Ultra™ II DNA Library Prep Kit for Illumina® (New England Biolabs, UK, E7103S) according to the manufacturer’s instructions, with no size selection. NEBNext Multiplex Oligos for Illumina (New England Biolabs, UK, E7335S and E7500S) indexing primers were used to allow multiple libraries in a lane. MiSeq TapeStation (Agilent Genomics) and qPCR (NEBNext® Library Quant Kit for Illumina®, E7630S, New England Biolabs, UK) were used to check for successful preparation of libraries. The libraries and 5% phiX were added to the MiSeq at the University of Oxford, Department of Zoology. Sequencing was divided across four MiSeq runs, all using the 600-cycle MiSeq Reagent kit v3 (Illumina, UK), giving 300-nucleotide paired reads that could be merged with sufficient overlap to cover the 468 bp amplicons (with the barcode sequence) for the 16S data. 18S samples used 150 bp paired-end reads on the 300-cycle MiSeq Reagent Kit v2 (Illumina, UK, MS-102-2002).
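As a rough check on the read lengths described above, the expected overlap between the two reads of a pair can be computed from the read length and the amplicon length. The sketch below is illustrative arithmetic only: the function name is hypothetical, the 170 bp value is the approximate 18S fragment length given in the text, and quality trimming would reduce the usable overlap in practice.

```python
def expected_overlap(read_length: int, amplicon_length: int) -> int:
    """Number of bases covered by both the forward and the reverse read.

    A positive value means the pair can in principle be merged;
    zero or a negative value means the reads do not meet in the middle.
    """
    return 2 * read_length - amplicon_length


# 16S: 300-nucleotide paired reads over the 468 bp amplicon (with barcode)
overlap_16s = expected_overlap(300, 468)

# 18S: 150 bp paired reads over the ~170 bp fragment (approximate length)
overlap_18s = expected_overlap(150, 170)

print(overlap_16s, overlap_18s)
```

Under these assumptions both amplicons leave more than 100 bases of nominal overlap before any trimming is applied.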

Bioinformatic analysis

Sequence data were processed using a custom Python22 script to de-multiplex sequences and remove primer and barcode sequences with demultiplex23. These were then processed in R using DADA2, analysing the data as a pool24, to remove chimeras, align sequences, and assign taxonomy using the DADA2 formatted NCBI database for 16S rRNA gene data and the SILVA 18S rRNA gene database, version 132.9925 (script available: https://doi.org/10.6084/m9.figshare.20457378). These assignments were performed26 using the reference data sets with Amplicon Sequence Variant (ASV) assignment set to 0.99 identity. Reads were trimmed based on error plots of the pooled data sets to minimize degradation at the end of reads. No uncalled positions (Ns) were allowed and k-values for the alignment algorithm were set to 3. After demonstrating that technical triplicates replicated the outputs to a high level, these data were pooled for downstream analyses as in Kaczvinsky et al.18.
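The primer-removal step can be illustrated with a minimal sketch. This is not the custom script or the demultiplex tool cited above; it only shows how a degenerate primer such as 16S_806R could be matched at the start of a read by expanding IUPAC ambiguity codes into a regular expression. The function names and the example read are hypothetical.

```python
import re

# IUPAC nucleotide ambiguity codes expanded to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a degenerate primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def strip_primer(read, primer):
    """Return the read with the leading primer removed, or None if absent."""
    match = primer_regex(primer).match(read.upper())
    if match is None:
        return None
    return read[match.end():]


PRIMER_806R = "GGACTACHVGGGTWTCTAAT"  # sequence as given in the text

# Hypothetical read: one concrete primer variant followed by 8 bases of insert
example_read = "GGACTACACGGGTATCTAAT" + "ACGTTGCA"
trimmed = strip_primer(example_read, PRIMER_806R)
print(trimmed)
```

A production pipeline would usually also tolerate a small number of mismatches and handle barcodes; this sketch requires an exact match to one of the primer variants.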

16S data were then filtered to exclude cyanobacteria and chloroplast DNA, and 18S data were filtered to remove chloroplasts and fungi. Reads that could not be identified to phyla were excluded from both 16S and 18S data sets to improve data quality. Percent host/bird was calculated as the reads attributed to Aves as a percentage of the total reads remaining after chloroplast and fungal reads were removed. All avian reads were then removed from the 18S data set for downstream analysis as per Kaczvinsky et al.18. All non-metazoans and non-identified families were also removed at this point. Next, a food subset was created (Eumalacostraca, Decapodiformes, Neopterygii, and Thaliacea), and the percent crustacean was recorded relative to the total food reads. This was selected as an informative metric as most diet for these species was either krill or fish, with tiny amounts of the other food sources, making percent crustacean a meaningful descriptor of the most recent meal18. For 18S, a tapeworm data subset was created that included all ASVs identified as belonging to tapeworm (recorded as Cestoda). Then, ASVs with fewer than 100 reads were excluded from the dataset, which left 10 ASVs present in the analysis, with between 142 and 50,860 reads in the unrarefied data set. The percentage of reads identified as tapeworm was then calculated as a percentage of the remaining metazoan reads (minus host and unidentified families) and added to the sample data for the filtered 16S gene profiles.
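The two read-based metrics described above (proportion tapeworm and proportion crustacean) reduce to simple ratios over per-taxon read counts. The following sketch assumes counts are already summarised per taxon for one sample; the helper name and the example counts are hypothetical and are not taken from the data set.

```python
# Food taxa as listed in the text; Eumalacostraca is the crustacean component
FOOD_TAXA = {"Eumalacostraca", "Decapodiformes", "Neopterygii", "Thaliacea"}


def fraction_of(counts, targets, universe):
    """Reads assigned to `targets` divided by reads assigned to `universe`.

    Returns 0.0 when the universe contains no reads, so that empty
    samples do not raise a division error.
    """
    total = sum(counts.get(taxon, 0) for taxon in universe)
    if total == 0:
        return 0.0
    hits = sum(counts.get(taxon, 0) for taxon in targets if taxon in universe)
    return hits / total


# Hypothetical metazoan read counts for a single faecal sample
sample = {"Aves": 9000, "Eumalacostraca": 600, "Neopterygii": 200, "Cestoda": 200}

# Host (avian) reads are removed before the tapeworm proportion is taken
non_host = set(sample) - {"Aves"}
prop_tapeworm = fraction_of(sample, {"Cestoda"}, non_host)

# Crustacean reads relative to food reads only
prop_crustacean = fraction_of(sample, {"Eumalacostraca"}, FOOD_TAXA)

print(prop_tapeworm, prop_crustacean)
```

Because host reads dominate many faecal samples, removing them before taking the ratio keeps the tapeworm proportion from being driven by how much penguin DNA happened to be extracted.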

While the SILVA 18S25 database used with phyloseq was able to identify the 10 ASVs as tapeworms, further identification was not possible with that database. Therefore, the exact sequences that were collapsed into ASVs by DADA2 (from 1 to 6 sequence variants per ASV) were run using megaBLAST27 against the nucleotide collection to refine identification. The percentages of reads for the top four tapeworm ASVs (10, 25, 48, and 50) relative to the filtered 18S dataset were also added as variables to the 16S data.

We generated phylogenies using known reference sequences from the BLAST database that were generally model systems or common zoonotic parasites. The phylogenies were all generated using the NJS algorithm in ape28 in R29, but different substitution models (TN93, K80, JC69, and GG95) were used to better ascertain likely relationships.
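To make the role of the substitution model concrete, the sketch below computes two of the named distances, JC69 and K80, for a pair of aligned sequences using their standard closed-form expressions. It is a plain Python illustration, not the ape implementation used in the study, and the two 20-base sequences are invented for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_proportions(seq1, seq2):
    """Return (P, Q): proportions of transitions and transversions.

    Only aligned positions where both sequences carry A, C, G or T are
    counted; gaps and ambiguous bases are skipped.
    """
    transitions = transversions = sites = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return transitions / sites, transversions / sites


def jc69(p):
    """Jukes-Cantor distance from the proportion of differing sites."""
    return -0.75 * math.log(1 - 4 * p / 3)


def k80(P, Q):
    """Kimura two-parameter distance from transition/transversion proportions."""
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


# Invented sequences: one transition (A>G) and one transversion (C>A)
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "GAGTACGTACGTACGTACGT"

P, Q = substitution_proportions(seq_a, seq_b)
print(jc69(P + Q), k80(P, Q))
```

Both corrected distances are slightly larger than the raw proportion of differing sites, because each model allows for multiple substitutions at the same position. With only about 170 aligned positions, a single substitution shifts these distances noticeably, which is one reason to compare trees across several models.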

Beta diversity of the associated microbial profile was calculated using Bray–Curtis dissimilarity on the 16S gene profiles from the same samples (see Supplemental Fig. 1 for PCoA). To account for the read depth bias of this metric, each analysis was run using a randomized rarefaction down to 5098 reads, with samples with fewer reads than this excluded (removing 40 of 381). The models were then run 100 times using this rarefaction process, and the median, 5%, and 95% coefficient estimate values were reported along with the median p-value. Each model and hypothesis is explored below. This was repeated for 16S data, where all samples with fewer than 5000 reads were pruned (leaving 333 samples).
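The repeated-rarefaction procedure can be sketched in a few lines. The example below rarefies two samples to a common depth, computes the Bray–Curtis dissimilarity, and summarises 100 replicates by their median and 5% and 95% quantiles. It summarises the dissimilarity itself rather than a model coefficient, which is a simplification for illustration; the counts, the depth of 100, and the function names are hypothetical.

```python
import random
import statistics


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement.

    Returns None when the sample has fewer than `depth` reads, mirroring
    the exclusion of shallow samples.
    """
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        return None
    rarefied = {}
    for taxon in rng.sample(pool, depth):
        rarefied[taxon] = rarefied.get(taxon, 0) + 1
    return rarefied


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count profiles."""
    taxa = set(a) | set(b)
    difference = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    total = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return difference / total


def summarise_replicates(sample_a, sample_b, depth, n_reps=100, seed=1):
    """Return (5% quantile, median, 95% quantile) over rarefaction replicates."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_reps):
        ra = rarefy(sample_a, depth, rng)
        rb = rarefy(sample_b, depth, rng)
        if ra is None or rb is None:
            raise ValueError("sample shallower than rarefaction depth")
        values.append(bray_curtis(ra, rb))
    cuts = statistics.quantiles(values, n=20)  # 19 cut points at 5% steps
    return cuts[0], statistics.median(values), cuts[-1]


# Hypothetical phylum-level counts for two samples
sample_a = {"Fusobacteria": 70, "Firmicutes": 40, "Proteobacteria": 10}
sample_b = {"Fusobacteria": 20, "Firmicutes": 60, "Proteobacteria": 30}

low, median, high = summarise_replicates(sample_a, sample_b, depth=100)
print(low, median, high)
```

Reporting a median with a quantile interval across replicates, rather than a single rarefied draw, shows how much of the result depends on which reads happened to be subsampled.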

Fig. 1

A heatmap showing the co-occurrence of each ASV identified as tapeworm. The diagonal from top left to bottom right reports the number of samples, out of the 337 total, in which the ASV was present in at least one read.

Tapeworm and microbiome variations

We assessed the association between host tapeworm burden and gut community composition using PERMANOVA (Permutational Multivariate Analysis of Variance)30,31. Beta diversity metrics were calculated after rarefying sequence data to an equal depth, with 100 independent rarefaction replicates to account for sampling stochasticity. Each PERMANOVA was run with percent tapeworm infection as the predictor and colony included to control for potential non-independence among individuals from the same colony. To evaluate sensitivity to how colony structure was modelled, we tested two approaches: (i) including colony as a random effect in the model, and (ii) specifying colony as a stratum term to constrain permutations within colonies. We also generated a phylogeny with bootstraps to examine the groupings used in this study. While bootstrap support was relatively low, this is likely due to the limited information in a fragment of only ~170 bp. The same tree was supported by multiple algorithms (see Supplemental Figs. 2–5 for all trees). While none of the branches had very high support, Parochites zederi samples and ASV247 and ASV48 grouped consistently. We also built a heatmap to examine the co-occurrence of individual tapeworm ASVs (Fig. 1) using a custom script (Figshare: https://doi.org/10.6084/m9.figshare.30011779). Interestingly, ASV50 had much lower rates of grouping with other top ASVs (see Fig. 1).

Fig. 2

The distribution of tapeworm sequences in individual penguins. Histogram of the proportion of tapeworm 18S sequences as a fraction of total metazoan 18S sequences after removal of penguin specific reads. Frequency is the total number of samples in each category. Of the 337 gentoo samples examined, 9 (2.7% of the samples) had 0 reads identified as tapeworm, represented by the green bar on the far left of the histogram.

Species comparison

Finally, we also had a data set of 20 samples each from chinstraps and gentoos from the Aitcho Islands, both collected in 2018. We fitted a set of ordinary least squares regressions to examine the between-species differences in percent tapeworm and in the percent represented by each of the top three ASVs. We also ran a differential abundance test of the microbiomes of the two species using DESeq232.
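When an ordinary least squares regression has a single two-level predictor such as species, the intercept is the mean of the reference group and the slope is the difference between the group means. The sketch below shows this with the closed-form estimates for simple linear regression. The values are invented and do not come from the Aitcho Islands data; treating chinstraps as the reference level (coded 0) is an assumption made for the example.

```python
def simple_ols(x, y):
    """Closed-form intercept and slope for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Invented proportions of one tapeworm ASV; 0 = chinstrap (reference), 1 = gentoo
species = [0, 0, 1, 1]
proportion_asv = [0.3, 0.5, 0.1, 0.1]

intercept, slope = simple_ols(species, proportion_asv)

# The intercept equals the chinstrap mean; intercept + slope equals the gentoo mean
print(intercept, slope, intercept + slope)
```

Read this way, a negative species estimate means the non-reference species has a lower mean proportion than the reference species, by the size of that estimate.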

Results

The percent tapeworm in the metazoan data set (minus host) ranged from 0% to 97.8% and showed a long right tail consistent with tapeworm infections across populations33 (Fig. 2). Of the 337 samples in the dataset, 9 (2.7%) had no reads identified as tapeworms. Infections also appeared variable by colony, with some colonies having much higher than average proportion of tapeworm reads in samples and others much lower than average (mean: 0.198, std: 0.236). The mean and standard deviation were calculated from the proportion tapeworm for all 337 gentoo samples relative to the total metazoan reads identifiable to phyla minus those recorded as “Aves.” All samples were weighted equally and numbers were calculated before samples were removed for rarefaction. Colonies did not show clear geographic differentiation (Fig. 3).

Fig. 3

A map of all sample sites coloured by proportion tapeworm (from metazoan reads excluding host), centred around the mean value across all samples of 0.196. Each colony is coloured by its mean proportion of tapeworm reads. Colour scale is graduated, with examples of rough proportions shown in the legend, ranging from high proportions in green to low proportions in purple. The inset shows the entire Scotia Arc region, including the 3 sites not in the Western Antarctic Peninsula. Map was created in R 4.5.229 using packages raster (3.6–32)34, sf35, mapdata (2.3.1)36, and prettymapr (0.2.5)37.

Identification

ASV10 is likely a Tetrabothrius sp. (99.42–100% identity, though it had the same to slightly lower identity with Mesocestoides sp.), which is a known pathogen in pygoscelids. The encompassing orders Tetrabothriidea and Mesocestoididea, respectively, are monophyletic in recent phylogeny with the Cyclophyllidea and Nippotaeniidea and sometimes collapsed within Cyclophyllidea38,39. The close relationship between the orders, the absence of records of Mesocestoides sp. in Antarctica, and its status as a carnivore parasite (though birds are intermediate hosts) suggest a Tetrabothrius sp. identification38. ASV48 had 100% identity with Parochites zederi, the most commonly reported pygoscelid tapeworm.

The first sequence of ASV50 had a majority (64%) of 100% identity matches in BLAST from Diphyllobothriidae. This was consistent with the other two sequences grouped to this ASV, which each had 100 matches at 94.2% identity, also with 64 matches to Diphyllobothriidae. Given the current estimations of tapeworm phylogeny, this family is relatively basal39, meaning that a match to Diphyllobothriidae is not improbable. However, given the uncertainty of these data, we are unwilling to further speculate on the identity of this ASV. (See Supplemental Table 2 for example BLASTN results for ASV10). ASV25 had 5 different sequences associated with it, but the identity of these could not be resolved, as they matched multiple cestode orders (see Supplemental Table 3 for example BLASTN results for ASV25). Hence, we report the ASV but do not link the sequence to a defined identity. All four ASVs showed a right-skewed distribution similar to that in the histogram of percent tapeworm DNA (see Supplemental Fig. 6).

The majority of ASVs seem to co-occur with each other. Of the 337 samples, only nine (2.7%) had none of the top ten tapeworm ASVs, and 27 (8%) had only one of the top ten tapeworm ASVs. Among the samples tested, 89.3% had at least two tapeworm ASVs. However, ASV50 seems to be the ASV that most commonly occurs separately from other cestode ASVs, possibly indicating a different infection mechanism, such as different intermediate host. Of the top three, all seem to have significant overlap in their occurrence with ASV10, but it could be possible that they are indicating co-infection rather than genetic variation within a species or a sequencing error. Based on the putative phylogeny of the samples and the heatmap showing high co-occurrence, we merged ASV247 with ASV48 and ASV457 with ASV10.
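The co-occurrence counts summarised in Fig. 1 can be tabulated from per-sample read counts as in the sketch below. This is not the custom script deposited on Figshare; it is a minimal illustration in which an ASV counts as present when it has at least one read, the diagonal holds the number of samples containing each ASV, and the three example samples are invented.

```python
def cooccurrence(samples, asvs):
    """Count, for every pair of ASVs, the samples in which both are present.

    `samples` is a list of dicts mapping ASV name to read count.
    The diagonal entry matrix[a][a] is the number of samples containing a.
    """
    matrix = {a: {b: 0 for b in asvs} for a in asvs}
    for counts in samples:
        present = [a for a in asvs if counts.get(a, 0) > 0]
        for a in present:
            for b in present:
                matrix[a][b] += 1
    return matrix


def asvs_per_sample(samples, asvs):
    """Number of distinct ASVs detected in each sample."""
    return [sum(1 for a in asvs if counts.get(a, 0) > 0) for counts in samples]


# Invented read counts for three samples
asv_names = ["ASV10", "ASV48", "ASV50"]
example_samples = [
    {"ASV10": 5, "ASV48": 2},
    {"ASV10": 1},
    {"ASV50": 3},
]

matrix = cooccurrence(example_samples, asv_names)
richness = asvs_per_sample(example_samples, asv_names)
print(matrix["ASV10"]["ASV48"], richness)
```

Comparing an off-diagonal count with the two corresponding diagonal counts indicates whether an ASV mostly appears alongside others or, as reported for ASV50, mostly on its own.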

Diet and infection

Many tapeworms enter the gut through diet, which is also deducible from 18S signatures. Four generalized linear mixed models run with glmm40 using the ordered beta linear model looked at percent crustacean as a function of the percent ASV with colony as a random effect. None of the models showed a correlation between the tapeworm ASV and the percent crustacean in diet (p-values: 0.602, 0.266, 0.0531, 0.0531).

Tapeworm and microbiome variation

The same PERMANOVA models used for the total amount of tapeworm present in samples were repeated for the percent tapeworm attributed to each of the top four ASVs. While the presence of ASV25 was significant in all of the tests, neither ASV48 nor ASV10 was significant when the variation was examined only within colony (Tables 1, 2, 3, 4 and 5). Generally, the presence of tapeworms is associated with significant differences in the beta diversity of the associated host microbiome, i.e. samples with greater amounts of tapeworm DNA have different host microbiomes than those with less or no tapeworm DNA present. The estimate was much lower when colony was a stratum term, i.e. when the differences in variables were considered only within colony. This likely indicates that there are significant differences in microbiome beta diversity that are related to the different infection rates between colonies as demonstrated in Fig. 3. That is, structure in microbiome and infection rates is associated with specific colonies.

Table 1 PERMANOVA comparing the total percent of tapeworm reads (of all metazoan reads identifiable to phyla minus those identified as host) and beta diversity of the microbiome.
Table 2 PERMANOVA comparing the total percent of the top tapeworm ASV (ASV 10, putative identification as Tetrabothrius sp.) (of all metazoan reads identifiable to phyla minus those identified as host) and beta diversity of the microbiome.
Table 3 PERMANOVA comparing the total percent of the 2nd top tapeworm ASV (ASV 25) (of all metazoan reads identifiable to phyla minus those identified as host) and beta diversity of the microbiome.
Table 4 Percentage of the 3rd top tapeworm ASV (ASV 48, putative identification as Parochites zederi) (of all metazoan reads identifiable to phyla minus those identified as host) and beta diversity of the microbiome.
Table 5 Percentage of the 4th top tapeworm ASV (ASV 50) (of all metazoan reads identifiable to phyla minus those identified as host) and beta diversity of the microbiome.

Infection cut-offs

Based on the distribution of percent tapeworm, separate PERMANOVAs were run using different cut-offs for tapeworm reads as a percentage of the remaining metazoan reads (minus host and unidentified families): 0.5%, 1%, 2%, 3%, 4%, 5%, 10%, and 50%. Each cut-off created a categorical model in which samples with a percent tapeworm above the cut-off were considered infected and those below it were not. These cut-offs were intended to capture the fact that the percentage representations likely contain a mix of true infections and incidental tapeworm (for example tapeworms of prey that were not infectious to penguins). Models linking host infection status and beta diversity of microbiomes showed that colony and infection cut-offs varied in significance depending on the percent of tapeworm DNA selected as the cut-off (see Table 6 for full results). Based on Kaczvinsky et al.18, we know that colony has a significant impact on the microbiome of gentoos, so we restricted permutations of the data to those within colonies.
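The cut-off approach combines two steps: converting a tapeworm proportion into an infected or uninfected label, and testing that label against a distance matrix while shuffling labels only within colonies. The sketch below implements both for a one-factor design using the standard PERMANOVA pseudo-F statistic. It is a simplified stand-in for the analysis described above, not the authors' code: the data are invented, the distance is an absolute difference on a single made-up coordinate rather than Bray–Curtis, and treating values exactly at the cut-off as uninfected is an assumption.

```python
import random
from itertools import combinations


def classify(proportions, cutoff):
    """Label samples strictly above the cut-off as infected (assumed rule)."""
    return ["infected" if p > cutoff else "uninfected" for p in proportions]


def pseudo_f(dist, groups):
    """PERMANOVA pseudo-F for one factor from a square distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for label in labels:
        members = [i for i, g in enumerate(groups) if g == label]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(labels) - 1)) / (ss_within / (n - len(labels)))


def permute_within_strata(groups, strata, rng):
    """Shuffle group labels separately inside each stratum (colony)."""
    permuted = list(groups)
    for stratum in sorted(set(strata)):
        idx = [i for i, s in enumerate(strata) if s == stratum]
        values = [groups[i] for i in idx]
        rng.shuffle(values)
        for i, value in zip(idx, values):
            permuted[i] = value
    return permuted


def stratified_permanova(dist, groups, strata, n_perm=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    hits = sum(
        pseudo_f(dist, permute_within_strata(groups, strata, rng)) >= observed
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)


# Invented data: 8 samples from two colonies
tapeworm_prop = [0.30, 0.25, 0.001, 0.0, 0.40, 0.05, 0.01, 0.002]
colony = ["A", "A", "A", "A", "B", "B", "B", "B"]
coordinate = [0.0, 0.1, 1.0, 1.1, 0.05, 0.15, 1.05, 1.15]  # made-up 1-D profile
dist = [[abs(x - y) for y in coordinate] for x in coordinate]

status = classify(tapeworm_prop, cutoff=0.02)  # the 2% cut-off
observed_f, p_value = stratified_permanova(dist, status, colony)
print(status, observed_f, p_value)
```

Restricting the shuffles to within colonies means the null distribution keeps each colony's mix of infected and uninfected samples fixed, so a between-colony difference alone cannot produce a small p-value.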

Table 6 Infection cut-off models examining different cut-offs for percentage of tapeworm as a factor in the differences in beta diversity of microbiome across samples.

We evaluated cut-offs based on a histogram of the proportion of tapeworm present in all samples (see Supplemental Fig. 7). It is notable that moving from a 1% cut-off to a 2% cut-off results in a change in both the median and statistical significance. While a 3% cut-off is not significant, it has an estimate consistent with others at the middle values identified from the histogram. Above 10%, the model is not significant and the estimate changes, but only a few samples meet this threshold, reducing n, and these samples are likely ones that had proglottids contained in the extracted portion. Given this, we would speculate that, for these results, around 2% tapeworm DNA is the point at which key changes can be seen in the microbiome and might represent a health-impacting amount of tapeworm.

Differential abundance

Using a differential abundance test comparing the percent tapeworm present in each sample (compared with all metazoan reads identifiable to phyla that are not avian), we found 27 bacterial ASVs with log2-fold changes in abundance. These ASVs came from one of three phyla: Fusobacteria (8), Firmicutes (12), or Proteobacteria (7) (Fig. 4). Of these, 14 of the bacterial ASVs increased with an increasing proportion of tapeworm (7 of which were classed as Fusobacteria), while the opposite pattern (reduced ASVs with increased amounts of tapeworm in the sample) was seen in the remaining 13. Both Firmicutes and Proteobacteria are generally considered part of the core avian microbiome, which is also the case in penguins41,42,43,44,45. The identified taxa comprise a wide variety of bacteria, including methylotrophic46 and spore-forming bacteria47. When differential abundance tests were conducted on the individual tapeworm ASV groups, they produced lists of differentially abundant bacteria that, while not identical, nevertheless overlapped with each other and with the results for total tapeworm (Supplemental Figs. 8–11; Supplemental Tables 4–8).

Fig. 4

A visualization of the results of the differential abundance test in DESeq2 for the percent tapeworm in sample controlling for colony, showing all ASVs with log2-fold changes to their lowest determined identity. Those in purple showed negative log fold changes and those in green showed positive log fold changes. The ASV number for each ASV is on the y-axis and the closest identified taxa is recorded on the bar. The length of the bar shows the magnitude of the log fold change over the model.

ASV10, ASV25, and ASV48 all showed significant negative log-fold changes in the bacterial ASV15 associated with increasing amounts of the respective tapeworm ASV. This barcode was identified as a member of the Clostridiales. Increasing amounts of tapeworm ASV10 and ASV25 were associated with significant log-fold reductions in the presence of Carnobacterium (bacterial ASV23). Increases in the presence of both tapeworm ASV48 and ASV50 were associated with log-fold changes in bacterial ASV115, though in opposite directions. This bacterial signature was identified as a Dermacoccus sp.

Direct species comparison

We also had a data set of 20 samples each from chinstraps and gentoos on Aitcho Islands, both collected in 2018. A t-test examining differences between the species in how much tapeworm was recorded had an intercept estimate of 0.40 for gentoos compared to 0.078 for chinstraps, with p-values of 2.1e-14 and 0.0023, respectively. The gentoos had lower percentages of ASV10 (estimates of 0.36530 for the chinstrap intercept and -0.21 for gentoos, with a p-value of 0.0018 for the gentoos) and ASV25 (estimates of 0.29 for the chinstrap intercept and -0.25 for gentoos, with a p-value of 3.4e-05 for the gentoos). However, there were no significant differences in the amount of ASV48 between the two groups, which had estimates of 0.063 for chinstraps and 0.070 for gentoos. This could be because of relatively fewer hits for ASV48 than for ASV10 and ASV25, but it could also indicate genuine differences between the parasite loads of these species for the first two ASVs that do not hold for ASV48.

The simple PERMANOVAs comparing the 16S data for the two species and including percent tapeworm as a factor indicate that while there are statistically significant differences between the microbiomes of the two species, these microbiomes do not appear to differ significantly due to percent tapeworm (Tables 7, 8, 9, 10 and 11). With the exception of ASV48 (discussed further below), species differences in the microbiome are not correlated to percent tapeworm, consistent with the PERMANOVAs looking at specific ASVs and microbiome differences. By contrast, ASV48 (potentially Parochites zederi) is correlated with different microbiomes, even when considered only within a single species (stratum set to species). The species term did show significant microbiome differential abundances, with 22 ASVs showing log-fold reductions in gentoos compared to chinstraps, and 29 showing log-fold increases (Fig. 5).

Table 7 PERMANOVAs examining differences in the beta diversity of the microbiome between gentoos and chinstraps at Aitcho Island.
Table 8 PERMANOVAs comparing the total percent of the top tapeworm ASV (ASV 10, putative identification as Tetrabothrius sp.) (of all metazoan reads identifiable to phyla minus those identified as host) and beta diversity of the microbiome between chinstraps and gentoos at Aitcho Island.
Table 9 PERMANOVAs comparing the total percent of the 2nd top tapeworm ASV (ASV 25) (of all metazoan reads identifiable to phyla minus those identified as host) and beta diversity of the microbiome between chinstraps and gentoos at Aitcho Island.
Table 10 PERMANOVAs comparing the total percent of the 3rd top tapeworm ASV (ASV 48, putative identification as Parochites zederi) (of all metazoan reads identifiable to phyla minus those identified as host) and beta diversity of the microbiome between chinstraps and gentoos at Aitcho Island.
Table 11 PERMANOVAs comparing the total percent of the 4th top tapeworm ASV (ASV 50) (of all metazoan reads identifiable to phyla minus those identified as host) and beta diversity of the microbiome between chinstraps and gentoos at Aitcho Island.
Fig. 5

A visualization of the results of the differential abundance test in DESeq2 for 16S microbiome ASVs examining the difference between chinstrap and gentoo penguins at Aitcho Island, showing all microbial ASVs with log2-fold changes to their lowest determined identity. Those in purple showed negative log-fold changes and those in green showed positive log-fold changes. The ASV number for each ASV is on the y-axis and the closest identified taxon is recorded on the bar. The length of the bar shows the magnitude of the log-fold change over the model. The chinstrap is the reference species in the graph, so log-fold changes are for gentoos relative to chinstraps.

Discussion

The work presented validates the use of molecular methods to examine parasite infections in penguin faeces, finding that tapeworm infections were commonly detected throughout the Antarctic Peninsula. This study also revealed correlations between microbiome beta diversity and tapeworm DNA content that vary by both colony and the specific ASV of the tapeworm DNA. Patterns of tapeworm distribution as measured by the presence of tapeworm DNA in gentoo samples show strong differences between colonies, but do not show clear biogeographic patterns. Note especially the different results with permutations limited to within colonies (stratum term) versus permutations throughout the dataset. This indicates that the differences associated with increasing amounts of tapeworm DNA seem to operate more at the colony than the individual level, though there are correlations between the amount of tapeworm DNA and the composition of the microbiome. This is roughly consistent with the geographic and colony infection patterns we found.

Prey species represent the primary vector for tapeworm infection in penguins. However, the sparsity of research on tapeworms in Antarctic systems means their lifecycle and intermediate hosts are unknown. For Parochites zederi, the primary speculation has been a krill intermediate host, though fish has been suggested4. The lifecycle of Tetrabothrius is more uncertain. While Tetrabothrius species have been reported in pygoscelids2,3, potential intermediate hosts have not been identified, though krill, fish, and cephalopods are suitable hosts in the pelagic environment2. While our study did not find strong correlations between diet and tapeworm infection, it should be noted that this study encompasses only the most recent meal these individuals consumed. It is entirely possible that a study examining longer-term dietary composition using methods like isotope analysis might provide different results.

In the pygoscelid species comparison, there are strong differences between the species that are independent of the amount of tapeworm DNA recorded. However, the percent of tapeworm DNA present was different between the species for everything but ASV 48, while this was the only ASV where the amount present was connected to differences in microbiome beta-diversity even when considered only within the species. This ASV has been identified as having 100% identity with Parochites zederi, a known parasite in this species, which provides evidence that infections with Parochites zederi may have broad impacts on pygoscelid microbiomes. The differential abundance test also indicates the taxa with largest differences between groups and provides a foundation for future studies examining the interactions between parasites and specific microbes in infected birds and their susceptibility to infection in the first place.

This study is reliant on DNA databases, which are mostly focussed on diseases of humans or livestock with patchy coverage of wildlife species, particularly those from more difficult to access areas (such as Antarctica). As more wild bird derived pathogen sequences are added to these databases, this will support greater resolution of sequence identity. Nevertheless, this study provides a set of penguin associated helminth sequences and demonstrates how these can be used to study penguin pathogens with non-invasive methodologies. It also provides an indication of the interaction between pathogens and the microbiome as well as acting as a baseline for future studies. Future studies could also include longer sequences or non-18S targets for greater resolution of candidate helminths.

Given their reliance on sea ice, krill and other components of the food web are likely to be disrupted by climate change or other stressors48,49,50. Together, these have the potential to interact with tapeworm infection and host-associated microbiomes, and may affect penguin health (e.g.,51). A better understanding of penguin-associated parasitology is of growing importance given the prevalence of these parasites and their potential to affect health either directly or as a secondary stressor. Future studies may examine the wider ecological context within Antarctica (hosts, pathogens, diet, and microbiome) to consider the broader impacts of climate change in this delicate ecosystem.

Regardless, the detection of tapeworm DNA in gentoo faeces that corresponds to known parasites using common 18S markers opens an important potential tool for further ecological studies in Antarctic systems, work that has previously most often been limited to small numbers of samples. In addition, this study joins others that demonstrate complex links between helminth infection and microbiomes in a wide variety of wild and experimental systems16,52.