Abstract
DNA metabarcoding has played a pivotal role in advancing our understanding of the diversity and function of soil-inhabiting fungi. The Australian Microbiome Initiative has produced an extensive soil fungal metabarcoding dataset of more than 2000 plots across a breadth of ecosystems in Australia and Antarctica. Sequence data requires rigorous approaches for the integration of species occurrences into biodiversity platforms, addressing biases due to false positives or overinflated diversity estimates, among others. To tackle such biases, we conducted a rigorous analysis of the fungal dataset following best practices in fungal metabarcoding and integrated it with over 100 predictor variables to fast-track data exploration. We carefully validated our methodology based on studies conducted on historical versions of the dataset. Our approach generated robust information on Australian soil fungi that can be leveraged by end-users interested in biodiversity, biogeography, and conservation. This novel resource will unlock new frontiers in soil fungal research within the Southern Hemisphere and beyond.
Similar content being viewed by others
Background & Summary
Soil-inhabiting fungi play an indispensable role in shaping the composition and function of terrestrial ecosystems1. As dominant drivers of soil carbon and nutrient cycles, fungi sustain plant production through a stable supply of available nutrients2. Mycorrhizal fungi influence plant diversity3, distribution4, and productivity5,6 by facilitating nutrient uptake, improving pathogen resistance, and promoting overall ecosystem health7. While soil-borne fungal pathogens can pose threats to global food security and ecosystem resilience in the face of global environmental change8,9, they can also promote biodiversity10,11 by engaging in antagonistic interactions with plants and animals. To address environmental challenges, it is essential to harness our understanding of soil fungal biodiversity, as the balance between functional guilds is vital to ensure ecosystem stability12, improve restoration outcomes13, and support sustainable agriculture initiatives14. However, achieving robust research to inform environmental policy and management requires the establishment of comprehensive protocols and well-curated information that can accurately capture fungal diversity15.
Fungi represent a megadiverse kingdom dominated by inconspicuous taxa that remain largely undetected by the naked eye16. Traditionally, observational approaches to documenting fungi have limited our attention to groups that produce visible reproductive structures, such as mushrooms, or fungi that can be isolated and grown under laboratory conditions17,18. The advent of high-throughput DNA sequencing methods, including metabarcoding, has dramatically improved the detection and understanding of fungal diversity in a variety of ecosystems worldwide19,20. The capacity to determine the species composition of fungi within a given environmental sample has revealed, for example, new fungal phyla and their hidden ecological functions16,17.
The Australian Microbiome Initiative aims to promote microbiome research by developing publicly available metabarcoding data focused on four main groups of organisms: bacteria, archaea, eukaryotic microbes, and fungi21 (www.australianmicrobiome.com) This initiative has primarily focused on sampling terrestrial topsoil (0–10 cm) and subsoil (20–30 cm), as well as marine samples from coastal and pelagic zones. Using Illumina amplicon paired-end sequencing, the Australian Microbiome has generated fungal metabarcodes targeting the internal transcribed spacer region (ITS), the genetic marker of choice for fungal metabarcoding22. ITS amplicons generated from topsoil samples cover 2,225 uniquely georeferenced sites so far (Fig. 1), spanning an extraordinary variety of bioregions, vegetation classes, and land use types. This dataset has been used to address a range of fundamental and applied research questions, including unravelling new fungal records in Australia23, evaluating diversity patterns in soils24, modelling the distributions of fungal species25, monitoring fungi for revegetation applications26,27 and human health purposes28, as well as exploring correlations between fungal diversity and community assembly with a range of environmental predictors29,30,31,32,33, disturbance scenarios34,35,36 and vegetation types37.
Geographic and taxonomic distribution of ITS1 fungal operational taxonomic units (OTUs) from our contemporary Australian Microbiome dataset. Plot locations (n = 2,103) of Australian Microbiome samples collected from terrestrial biomes in eight ecoregions, with the number of samples per ecoregion in parenthesis: (a) continental Australia (n = 1,874), (b) Christmas Island (n = 8) and (c) Antarctica (n = 220). (d) OTU richness, sequence abundance, and sample prevalence of the top ten most OTU-rich fungal genera colour-coded by their primary guilds. (e) KRONA chart showing the taxonomic distribution of dominant fungal phyla, orders, and species; an interactive chart for all taxonomic groups and ranks is provided on figshare57.
Sequence artefacts (or false positives) can occur at various stages of a metabarcoding project, from sample collection to bioinformatic analysis. These artefacts include biological contamination, chimera formation during library preparation, incorrect base calls (or sequencing errors) during sequencing, misassignment of sequences to samples (or index switching), or taxonomic misidentifications38. Due to sequence length limitations of Illumina platforms, most fungal metabarcoding studies, including large-scale and global datasets39,40,41, typically target the ITS1 or ITS2 subregions independently. Merging complementary forward and reverse sequences is a critical sequence quality filter to reduce sequencing errors, thereby improving the accuracy of diversity estimates38,42. The chosen polymerase chain reaction (PCR) primers of the Australian Microbiome, ITS1F43 and ITS444, target the full ITS region, leading to two amplicons (i.e. ITS1 for forward sequences and ITS2 for reverse sequences) from which sequences were generally too short to be merged45. Consequently, the GlobalFungi46 database—a comprehensive atlas of global fungal distribution comprised of hundreds of individual metabarcoding studies—revealed extremely high levels of richness in well-described fungal genera known for their relatively low diversity levels47. Yet this inflated diversity was mostly attributed to single-end sequences from the Australian Microbiome ITS dataset47, underscoring their uneven quality when processed using routine bioinformatic techniques. When analysing the bacterial component of the Australian Microbiome dataset, richness estimates in individual samples were highly dependent upon the overall diversity of the set of samples sequenced together (i.e. within a sequencing library), a phenomenon mostly attributed to index switching48. Processed Australian Microbiome ITS data were integrated into the Atlas of Living Australia49 (ALA), which is transferred to the Global Biodiversity Information Facility (GBIF). Consequently, misleading occurrences of exotic Amanita species in Australia were detected in the ALA and GBIF, which were attributed to lenient settings during the taxonomic assignment of Australian Microbiome ITS data50. Furthermore, ALA and GBIF now house more than 1,000 material sample records of ectomycorrhizal fungi in Antarctica sourced from the Australian Microbiome ITS dataset51,52, despite the absence of ectomycorrhizal host plants53, human observations, or specimen records to corroborate these findings54. Together, these studies highlighted the need to re-analyse the Australian Microbiome ITS dataset for their integration into biodiversity platforms, to achieve accurate fungal detections and robust ecological conclusions.
To provide novel and robust insights into fungal biodiversity in Australian and Antarctic soils, we meticulously reanalysed all ITS topsoil (0–10 cm) samples currently available from the Australian Microbiome, following the most up-to-date protocols and recommendations for fungal metabarcoding38,42,55,56 (Fig. 2). We established a detailed and reproducible bioinformatic pipeline and carefully benchmarked our results by evaluating the impact of data processing on fungal diversity, ectomycorrhizal occurrences, and Amanita taxonomy, based on previous studies47,48,50. We further validated our dataset by reproducing a study that modelled fungal species distributions based on a historical version of the Australian Microbiome dataset25. With the aim of boosting research on the diversity, ecology, and conservation of soil fungi in Australia, we provide a reliable dataset that is readily transferable to end-users interested in exploring, modelling, or conserving fungi in soils. Our conservative approach is best suited for the integration of sequence data into biodiversity platforms such as ALA or GBIF and will benefit future research on soil fungi in Australia and beyond.
Methods
We provide a summary of our workflow in Fig. 2. All reproducible scripts are available on GitHub (see Code Availability), and primary data files and OTU matrices are on a dedicated data repository57.
Data collection
We retrieved ITS amplicon data from topsoil (0–10 cm layer) samples generated from terrestrial soil biomes by the Australian Microbiome Initative21 (www.australianmicrobiome.com) and available on the Bioplatform Australia data portal (https://data.bioplatforms.com/organization/australian-microbiome) in January 2024, using search terms “sample_type:Soil & amplicon:ITS & depth_lower:0.1”. Accession numbers and persistent URL links for each sample are available in the sample metadata file57. The original data we sourced are openly available for re-use and re-distribution under Creative Common Attribution License: see the Australian Microbiome data sharing policy (https://www.australianmicrobiome.com/protocols/data-policy/) and references in Table 1 for the georeferenced grid data. Detailed soil sampling and sequencing protocols with historical updates are available on the Australian Microbiome website (https://www.australianmicrobiome.com/protocols).
Bioinformatics workflow
This workflow is informed by up-to-date recommendations for fungal metabarcoding38,42,55,58. We retained demultiplexed sequences from 2,443 samples representing 2,225 uniquely georeferenced plots from 42 sequencing libraries. We initially trimmed sequences using Trimmomatic59 v0.36 to eliminate noisy distal components of sequences, thereby improving sequence quality scores and the recovery of ITS1 sequences during ITS extraction. The full ITS1 subregion was extracted using ITSxpress60 v2.0.0 (all eukaryotes) and a minimum sequence length > 50 bases; partial ITS1 sequences were discarded as too noisy (see Technical Validation: Impact of ITS1 length on detection). ITS1 sequences were quality filtered using a maximum expected error rate of one, allowing zero ambiguous nucleotides in VSEARCH61 v2.22.1. Denoising was performed library-by-library to improve error rate estimates using DADA262 v1.30. We removed putative chimeras using the denovo method and reference-based method against UNITE + INSD63 v10.0 (all eukaryotes) using the UCHIME264 algorithm in VSEARCH61.
We acknowledge that we were unable to perform some recommended quality filtering steps due to compromised sequence data. For example, we could not merge complimentary forward and reverse sequences because the combined length of forward and reverse sequences was typically shorter than the full ITS (ITS1–5.8S–ITS2) amplicons that were targeted with the chosen PCR primers (ITS1F43 and ITS444). Since the single-end forward sequences (i.e. ITS1 sequences) were superior in quality to the single-end reverse sequences (i.e. ITS2 sequences), we relied on single-end forward sequences that captured the ITS1 region, as did most previous studies using Australian Microbiome data23,24,25,26,27,28,29,30,31,32,33,34,35,36,37. Assessing index switching rates based on positive control samples is another important filtering step38,56, as index switches represent one of the most detrimental artefacts in the evaluation of biogeographical patterns65. Because positive control samples only occurred in 24 of the 42 sequencing libraries generated by the Australian Microbiome, this limited our ability to rigorously detect index switches. To adapt to these methodological constraints, we chose a conservative approach when processing the sequence data to minimise the likelihood of retaining false positives. This conservative approach included denoising using the DADA262 algorithm, a haplotype-based approach (i.e. amplicon sequence variant or ASV) developed to cluster ribosomal RNA gene amplicons based on estimated rates of sequencing errors55,62. To achieve robust inferences, we quality filtered sequences using a maximum expected error rate of one, as the default settings of DADA262 (max error = 2) led to inflated diversity and unreliable species distributions. While denoising significantly reduces the proportion of sequence artifacts62, it can also underestimate the richness of rare and phylogenetically unique fungi by incorrectly identifying low abundant ASVs as noise38,42. This discrimination is likely to disproportionately eliminate early diverging fungi, while inflating the proportional richness of dominant fungal groups within the phyla Ascomycota and Basidiomycota38. We found this to be the case in our analysis (see Technical Validation: Impact of abundance filtering approaches) and recognise that our conservative approach will result in some biases in fungal diversity estimates.
Curation of the sample-by-ASV matrix
To curate the sample-by-ASV matrix, we applied two sequential filters: (1) A sample-wise abundance filter removed ASVs with relative abundance < 0.1% of the total sequence count within each sample to address sequencing errors as well as environmental and wet lab contaminations56. (2) A library-wise abundance filter removed ASVs from individual samples where thier abundance was < 0.5% of the total ASV’s sequence count across the entire library. This library-wise filter targeted index-switching artifacts, which manifest as low-abundance ‘bleed-through’ of ASVs into non-source samples56. We established these thresholds (1) based-on positive control richness, (2) by assessing ectomycorrhizal distributions in Antarctic samples where ectomycorrhizal host plants are known to be absent, and (3) by eliminating putative artefacts while minimising relative taxonomic biases between abundant and rare taxa (see Technical Validation: Impact of abundance filtering approaches). We further eliminated positive controls on a library-by-library basis. Finally, we removed plot replicates by retaining the sample with the lowest richness-to-abundance ratio. We also removed low-abundance samples with sequencing depth < 5000 sequences, since low-abundance samples tend to accumulate greater proportions of sequence artefacts than high abundance samples38. We acknowledge that our abundance filtering approach is fairly stringent and can result in false negatives and an overall underestimation of fungal diversity (see Technical Validation: Impact of abundance filtering approaches). However, we sometimes detected extremely high index switching rates in the sequence data prior to filtering, based on mock community samples with a mean richness of 141 (±123), instead of an expected OTU richness of 10. We therefore focused on limiting false positives that are more deleterious when informing biodiversity platforms, such as ALA or GBIF66, or for ecological inferences such as niche and distribution modelling65.
Taxonomic and trait assignments
We used DNAbarcoder58 to predict taxon-specific similarity score cut-offs for taxonomic assignments from rank phylum to species, using ITS1-extracted fungal sequences in UNITE + INSD67 v10.0. Local cut-offs were set for each supertaxon–subrank combination (e.g. class cut-off for subrank class in supertaxon Ascomycota) where at least ten taxa were represented by 30 sequences, and the max proportion of sequences attributed to any individual taxon was less than 50%. Local cut-off predictions for supertaxon–subrank combinations were computed across all higher ranks (e.g. species-level cut-offs were predicted for Cotinarius, Cortinariaceae, Agaricales, Agaricomycetes and Basidiomycota). Final cut-offs were retained for each supertaxon–subrank combination, giving preference to cut-offs with the highest confidence. When local cut-offs could not be obtained due to insufficient number of sequences or taxa within a subgroup, a global cut-off was predicted. Because the global predictions were highly biased toward Dikarya, we subdivided the dataset based on phylogenetic relatedness to improve global cut-off predictions from class to species (Fig. 3). Phylum-level cut-offs were predicted for all fungi, and kingdom-level cut-offs were derived from these phylum cut-offs.
Species-level global similarity scores as predicted by DNAbarcoder58 using unique fungal sequences in UNITE + INSD67 v10.0. The analysis covered three major fungal groups: Dikarya (phyla Ascomycota, Basidiomycota and Entorrhizomycota); terrestrial early diverging fungi (phyla Basidiobolomycota, Calcarisporiellomycota, Entomophthoromycota, Glomeromycota, Kickxellomycota, Mortierellomycota, Mucoromycota and Zoopagomycota); and single-celled zoosporic fungi (phyla Aphelidiomycota, Blastocladiomycota, Chytridiomycota, Monoblepharomycota, Neocallimastigomycota, Olpidiomycota, Rozellomycota and Sanchytriomycota, as well as GS01). Global similarity scores for each group are shown in parentheses. These results demonstrate that kingdom-level fungal scores are skewed toward Dikarya, and more accurate global scores of early diverging fungi can be achieved by analysing these groups separately.
We assigned taxonomy to OTUs by initially running BLASTn queries (-task blastn-short in BLAST+68 v2.14.1) against an ITS1 extracted version of UNITE + INSD63 v10.0 and retaining the top five hits. This process was independently applied to the full UNITE + INSD dataset, as well as a filtered version containing sequence species-level information. We used the resulting taxonomy tables to assign taxonomy from kingdom to species using a three-step approach. First, we computed similarity scores using BLAST percent identify as outlined in DNAbarcoder58. This approach adjusts for short alignment lengths to mitigate artificially inflated high similarity scores that result from such alignments. Second, we filtered BLAST hits using taxon-specific cut-offs predicted by DNAbarcoder58, and coverage cut-offs of 90% for genus and 95% for species to avoid misassignments based on high similarity scores and partial alignments. Third, taxonomy was accepted at a given rank if the remaining BLAST hits achieved over 66% consensus at that rank, with preference given to hits from the species-only UNITE + INSD dataset. The resulting taxonomy table was annotated with trait information based on generic identity in FungalTraits69, except for arbuscular mycorrhizal traits which were assigned to all Glomeromycota ASVs70.
We clustered ASVs into OTUs to account for intraspecific and intragenomic sequence variants in the ITS region38,42,55,71, using a taxonomically-informed approach adapted from previous research72,73,74. This approach clustered sequences from ranks kingdom to species using the taxon-specific cut-offs we predicted with DNAbarcoder58. Initially, closed-reference clustering was performed with ASVs having taxonomic affiliations serving as cluster cores, while unidentified ASVs were matched to their nearest cluster core using BLASTn (-task blastn-short in BLAST+68 v2.14.1) and the taxon-specific sequence similarity cut-offs. Similarity scores were adjusted for short alignment lengths following the methods outlined in DNAbarcoder58. This step iterated until no new matches were found, forming approximate single-linkage clusters. Single-linkage de novo clustering was then performed on the remaining unclustered ASVs using BLASTClust75 v2.2.26 and each de novo cluster was tagged with unique pseudotaxon names (e.g. ‘pseudo_class_Ascomycota_1234’ for an unidentified class cluster with taxonomic affiliation to Ascomycota). Reference-based and de novo clustering required 90% and 95% sequence coverage at genus and species ranks, respectively. This process was performed in a nested fashion from kingdom to species, with each step constrained by a given supertaxon, and species-level clusters represented OTUs. The final dataset57 contained 31,926 fungal OTUs and 2,104 unique samples58.
Collation of environmental predictors
To facilitate data exploration by end-users, we supplemented sample metadata with more than 120 predictor variables from a variety of data sources (Table 1). These predictors include sample- soil physiochemical measurements from the Australian Microbiome, as well as georeferenced predictors representing climate76, soil temperature77, soil physiochemistry78,79,80,81,82,83,84,85,86,87, vegetation structure88, plant diversity89, habitat condition90, and geographic variables91. We also provided references to source raster files of georeferenced variables to assist predictive modelling applications.
Data Records
All primary data products produced in our study are available at figshare57 in eight files: (1) sample metadata file, (2) sample metadata descriptor, (3) sample-by-OTU matrix with absolute sequence abundances, (4) sample-by-OTU matrix normalised to a minimum sequencing depth of 5000 reads, (5) quality-filtered ITS1 reference sequences of OTUs in FASTA format, (6) taxonomy file without pseudotaxon names (i.e. pseudotaxa renamed as ‘unidentified’), (7) taxonomy file with pseudotaxon names, and (8) interactive KRONA chart showing the taxonomic distribution of our contemporary Australian Microbiome dataset. The sample metadata contains persistent links to the source files in fastq format, information on sample location, date of collection, a range of measured soil physicochemical characteristics, georeferenced environmental variables, alpha-diversity statistics, and estimated level of mould contamination (Table 1). The taxonomy file contains information on representative BLAST hits for each OTU, along with the associated trait information and relevant taxonomic assignment statistics (Table 2).
Technical Validation
Data processing
Impact of ITS1 length on detection across taxonomic groups
We processed 300 bp sequences targeting the fungal ITS1, flanked by the highly conserved SSU (18S) and 5.8S rRNA genes. These sequences captured approximately 46 bp of the SSU, with many sequences extending into the 5.8S region, considering that the average length of fungal ITS1 is 177 bp92. Removing these conserved regions, which lack species-level resolution, is crucial for enhancing OTU clustering93 and taxonomic accuracy94. We used ITSxpress60 for ITS1 extraction, which is designed to extract ITS regions before denoising. Ideally, forward and reverse reads are merged during ITS extraction, with partial ITS1 sequences discarded to improve ASV calling60. Since paired-end reads in the Australian Microbiome fungal dataset cannot be merged (see Bioinformatics workflow), ITS extraction and the discarding of partial ITS1 sequences biases against fungal groups with long ITS1 regions (>230 bp) (Figs. 4–6), likely resulting in false negatives and underestimation of diversity in these groups. To retain partial ITS1 reads, ITS extraction can be performed after quality processing using ITSx95. We found that this approach improved the detection of some groups with longer ITS1 regions, such as Cortinarius and Inocybe (Fig. 5), but coincided with an overall decrease in sequence quality. We acknowledge that discarding partially trimmed ITS1 sequences is conservative and likely results in the exclusion of fungal groups with long ITS1 regions. However, we focused on limiting false positives, which are more deleterious when informing biodiversity platforms66 and inferring niche and distribution patterns65.
Distribution of unique ITS1 fungal sequence lengths across UNITE + INSD67 v10.0 and in our contemporary Australian Microbiome dataset, comparing all fungi, phylum Mucoromycota, ectomycorrhizal (ECM) fungi, and non-ECM (primarily saprotrophic) Agaricomycetes. In these groups, we expect that the exclusion of partial ITS1 sequences during ITS extraction can reduce species detection and biase diversity estimates. The dotted line at 230 bp indicates the 99th percentile of ITS1 sequence length in our contemporary Australian Microbiome dataset, suggesting that species with longer sequences are likely to remain undetected.
Impact of ITS1 extraction methods on operational taxonomic unit (OTU) richness, sequence abundance, and OTU prevalence in our contemporary Australian Microbiome dataset. The ITSx95 method includes full and partial ITS1 sequences, whilst ITSxpress60 retains only full ITS1 sequences. We selected ITSxpress as the preferred approach due to reduced noise in the final dataset compared to ITSx. The x-axis displays relative values across the global dataset, with numbers in bars showing absolute measures.
Mean (±SD) ITS1 region length of unique sequences from UNITE + INSD67 v10.0 focusing on fungal classes and mycorrhizal fungal genera with described species in Australia based on the Fungi Name Index (https://fungi.biodiversity.org.au). (a) Fungal classes, (b) ectomycorrhizal (ECM) genera from the class Agaricomycetes, (c) ECM genera from phylum Ascomycota, and (d) arbuscular mycorrhizal genera from phylum Glomeromycota. Colours illustrate classes and genera that were detected (in black) or not (in red) in our contemporary Australian Microbiome dataset. The detection of classes did not improve when partial ITS1 regions were retained. The dotted line at 230 bp marks the 99th percentile of ITS1 sequence length in our contemporary Australian Microbiome dataset. Classes with a standard deviation extending above this line are likely to contain taxa likely to remain undetected in the dataset.
Impact of abundance filtering approaches on diversity estimates across fungal phyla
To assess the robustness of our filtering approach, we evaluated how sample-wise and library-wise OTU abundance filtering impacted the diversity and prevalence of OTUs across high level groups in our contemporary Australian Microbiome dataset. These groups included phylum Ascomycota, phylum Basidiomycota, subkingdom Mucoromyceta (Calcarisporiellomycota, Glomeromycota, Mortierellomycota and Mucoromycota), and early diverging phyla of unicellular zoosporic fungi96 (mostly phyla Rozellomycota and Chytridiomycota) (Fig. 7).
Impact of sample‐wise and library‐wise abundance filtering thresholds on the mean richness (absolute and relative), abundance (absolute and relative), and prevalence of fungal operational taxonomic units (OTUs): (a) Phylum Ascomycota, (b) phylum Basidiomycota, (c) subkingdom Mucoromyceta (comprising phyla Calcarisporiellomycota, Glomeromycota, Mortierellomycota and Mucoromycota), and (d) early‐diverging lineages of zoosporic fungi (predominantly phyla Rozellomycota and Chytridiomycota). Sample‐wise thresholds are shown along the x‐axis, and paired boxes represent library‐wise thresholds of 0.5% (on the left) and 1.0% (on the right). Panels (e,f) demonstrate the impact of sample‐wise filtering on global relative richness and abundance, based on the 0.5% library‐wise abundance filter, which produced a marked effect on richness and a marginal effect on relative sequence abundance at high taxonomic resolution.
Increasing OTU filtering thresholds generally led to a decrease in mean OTU abundance, richness, and prevalence (Fig. 7). Overall, higher filtering thresholds favoured abundant OTUs in phylum Basidiomycota, while discriminating against OTUs from early diverging phyla. Consequently, our chosen 0.1% sample-wise and 0.5% library-wise cut-offs tended to overestimate the proportional richness of Basidiomycota OTUs to the detriment of early diverging phyla. A similar trend was noted for global OTU richness, however, there was no significant impact on the global relative abundance of sequences (Fig. 7).
Impact of sequencing depth on OTU richness
We visualised the relationship between sequencing depth and OTU richness, as well as sequencing depth per sample, with rarefaction curves (Fig. 8). Rarefaction curves were in saturation across all samples, and sequencing depth explained <0.1% of the variation in OTU richness within the Australian samples of the contemporary dataset. In contrast, sequencing depth explained 8.5% of the variation in OTU richness in the Antarctic samples (Fig. 8a). These results illustrate that the raw sample-by-OTU matrix does not need rarefaction before conducting diversity analyses on Australian samples, yet diversity analyses specifically focusing on Antarctic samples may benefit from including log-transformed sequencing depth as a covariate in models.
Impact of sequencing depth on operational taxonomic unit (OTU) richness: (a) Relationship between sequencing depth and fungal OTU richness in Australian and Antarctic samples; (b) rarefaction curves per sample. The vertical grey line represents the minimum sequencing depth with the maximum sequencing depth truncated to 20,000 sequences to improve readability.
Impact of mould abundance on OTU richness
Moulds resulting from poor sample preservation can negatively impact sample OTU richness97. We tested this by correlating mould relative abundance (i.e. Mucorales, Mortierellales, Umbelopsidales, Aspergillaceae, Trichocomaceae, Bifiguratus and Trichoderma) with fungal OTU richness in our contemporary dataset. Although mould relative abundance did not affect the overall richness of OTUs, it accounted for 10.1% of the variation in OTU richness in samples with a mould relative abundance greater than the median value (Fig. 9a). We found that the relative abundance of Mortierellales, Umbelopsidales and Trichoderma (herein collectively referred to as moulds) had a particularly negative impact on OTU richness, and used their cumulative relative abundance as a proxy for mould contamination. After removing samples with putative mould contamination based on a 35% relative abundance threshold (i.e. mean mould relative abundance plus three standard deviations), the impact of mould relative abundance on OTU richness was drastically reduced (Fig. 9b).
Relationships between mould (Mortierellales, Umbelopsidales and Trichoderma) relative abundance and operational taxonomic unit (OTU) richness in our contemporary Australian Microbiome dataset: (a) All samples, including those with putative mould contamination, and (b) samples without putative mould contamination (i.e. samples with mould relative abundance below the 35% cut‐off). The non‐truncated linear lines (in blue) and Loess curves (in red) represent all data points, while the truncated linear lines (in green) are fitted to data points with mould relative abundance exceeding the median.
Spatial validation
Tracking ectomycorrhizal distributions as a proxy to index switching
Considering that no putative ectomycorrhizal host plants exist in Antarctica53, nor any human observation or specimens of ectomycorrhizal fungi have been recorded from that region54, material sample records of ectomycorrhizal fungi in Antarctica attributed to the Australian Microbiome ITS dataset51,52 are likely artefactual. We therefore used these Antarctic samples as ‘environmental controls’ to assess index switching rates based on the detection of ectomycorrhizal OTUs in Antarctica. We further explored biogeographic patterns in the ectomycorrhizal genus Cortinarius, a group with a well-described distribution in Australia based on >11,000 human observation and fungarium specimen records98. We compared Cortinarius distribution in Australia based on observation and specimen records with occurrences from the historical Australian Microbiome dataset on ALA51, as well as from our contemporary dataset52 (Fig. 10). To make a fair comparison between the contemporary and historical datasets, we limited the latter to topsoil samples and ITS1 sequences, as ALA also houses data from subsoil samples and the ITS2 region.
Distribution of Cortinarius in Australia: (a) Records based on human observations and fungarium specimens98, (b) amplicon sequence variants (ASVs) from the historical Australian Microbiome dataset on the Atlas of Living Australia51, and (c) operational taxonomic units (OTUs) from our contemporary dataset.
Our contemporary dataset exhibited 11,379 occurrences of ectomycorrhizal OTUs corresponding to 1,932 OTUs in 81 genera across 1,523 plots (67.4% of all plots), none of which occurred in Antarctica. In contrast, the historical dataset (topsoil ITS1 sequences) detected 24,948 occurrences of ectomycorrhizal ASVs, corresponding to 89 genera across 1,503 plots (87.2% of all plots). The historical dataset contained 818 occurrences of ectomycorrhizal ASVs from 38 genera in Antarctica, including 102 occurrences of Cortinarius ASVs. In Australia, Cortinarius OTUs in our contemporary dataset followed a similar distribution to human observation and specimen records98 (Fig. 10a and c). On the other hand, the historical dataset revealed high Cortinarius prevalence in the central and northern regions of Australia where observational records are sparse (Fig. 10a,b). Many of these detections probably include false positives due to index switching in the historical dataset, leading to erroneous distributions in the ALA and GBIF. Such misleading information is likely to be mirrored in other dominant taxa, including important pathogens. Therefore, we advocate for the substitution of the historical Australian Microbiome dataset on the ALA and GBIF with our contemporary dataset.
Evaluating the accuracy of taxonomy assignment
We compared our taxonomic assignments with those in the historical Australian Microbiome dataset on Bioplatforms Australia (https://data.bioplatforms.com/bpa/otu), which classified ASVs using the Ribosomal Database Project Classifier and UNITE v9.0. Our contemporary dataset provided taxonomic information for 14,312 OTUs (44.9%) at the genus rank and 2,058 OTUs (6.5%) at the species rank. In contrast, the historical dataset assigned 157,396 (51.3%) ASVs at the genus rank and 35,996 (11.7%) ASVs at the species rank. Each identified species in our dataset corresponded to a single OTU, linking 2,058 unique species to 2,058 unique OTUs. Conversely, the historical dataset linked 35,996 ASVs to 5,358 unique species.
There are currently 8,712 accepted non-lichenised fungal names in Australia, according to the Fungi Names Project (https://fungi.biodiversity.org.au/)—updated using the MycoBank database (https://www.mycobank.org/). We found that only 2,889 of these fungi had sequences in UNITE, suggesting that many species identified in the historical dataset might be linked to sequences from outside Australia.
Our taxonomic assignment approach proved to be comparatively more conservative than the historical dataset. For instance, the historical dataset had 252 ASVs linked to 50 species and 18 genera in the class Ustilaginomycetes (which are mostly plant pathogenic smut fungi), while our contemporary dataset identified 14 species and 10 genera within Ustilaginomycetes. Both datasets extended the known geographic range of many Ustilaginomycetes genera and species, particularly in the savanna region of north-central Australia (Fig. 11). However, many ASVs in the historical dataset had sequences with low similarity to their species annotations. When applying our taxonomic assignment approach to the historical dataset, only 97 of the 252 ASVs remained identified at the class Ustilaginomycetes, with 33 of these receiving new taxonomic annotations. This suggests that our contemporary dataset provides a more conservative but robust species-level annotations and species-distribution estimates.
Distribution of Ustilaginomycetes species in Australia: (a) Records based on human observation and fungarium specimens98, (b) amplicon sequence variants from the historical Australian Microbiome dataset on Biopatforms Australia (https://data.bioplatforms.com/bpa/otu), and (c) operational taxonomic units from our contemporary dataset.
A recent study highlighted misidentifications of Amanita records from the historical Australian Microbiome dataset on the ALA, due to lenient confidence thresholds used during taxonomic assignment50. Specifically, 18 unique sequences were misassigned with northern hemisphere taxon names and likely represented native Amanita species closely related to those exotic species. To validate the reliability of Amanita annotations in our contemporary dataset, we tracked collection locations of sequenced specimens (i.e. source specimens) linked to species hypotheses in UNITE (https://unite.ut.ee/search.php), along with human observation and preserved specimen records in Australia98, and compared their distribution with Amanita OTUs from our contemporary dataset.
Our contemporary dataset comprised 13 OTUs identified as Amanita (Table 3). Among these, 11 specimens had reference sequences originating from Australia. The distribution patterns of these OTUs were typically consistent with source specimens and known locations, yet some exhibited range extensions consistent with known biogeographic patterns in Australia99 (Fig. 12). Amanita muscaria, known for its global distribution, was also detected, as well as Amanita silvifuga, which may be an exotic introduction, though further material is needed for confirmation50. These findings suggest that our taxonomic assignment approach using DNAbarcoder58 was generally robust and reliably identified Amanita fungal OTUs.
Distribution of Amanita in Australia: Operational taxonomic units from our contemporary Australian Microbiome dataset (OTUs; in green), locations of species hypotheses linked to source specimens in UNITE + INSD67 (Source spec.; in gold), and human observations and fungarium specimen records98 (Obs. & spec.; in purple).
Modelling the ecological niches and distribution patterns of mycorrhizal fungi
To assess the performance of our contemporary dataset in species distribution models (SDMs), we replicated SDMs that were previously built for two orchid mycorrhizal fungal OTUs (OTU C and OTU O) from the family Ceratobasidiaceae, based on the historical version of the Australian Microbiome dataset25. Presence-background maximum entropy100,101 distribution models were created using the R package dismo102. The SDMs were developed by independently constructing continental-scale climatic and edaphic models, then multiplying these to generate a composite estimate of habitat suitability for each OTU (Fig. 13). We then compared estimations of niche overlap and area of occupancy between the contemporary and historical datasets for each OTU. Detailed methods and results are presented in Supplementary File 1.
Projected distributions of Ceratobasidiaceae mycorrhizal fungi for operational taxonomic units (OTUs) OTU C (top panel) and OTU O (bottom panel), combining both climatic and edaphic drivers: Projected likelihood of occurrence and proportional contributions to SDMs using the historical Australian Microbiome datasets25 (panel a) and the contemporary dataset generated from this study (panel b) as training data. Red points indicate Australian Microbiome data used to train the respective models. Orange points indicate records isolated from orchids25.
The performance of the models for both OTUs appears to be generally robust across datasets, with moderate to high performance scores indicating their reliability in projecting geographical occurrence within the training niche space (Supplementary Table 1). Intraspecific niche overlap between the models was low (OTU C = 0.598; OTU O = 0.269), highlighting marked differences in predictive performance between the historical and contemporary datasets (Fig. 12). Models using our contemporary dataset projected 23,147 km2 more suitable area for OTU C and 1,075,343 km2 more suitable area for OTU O compared to the historical dataset. Models based on our contemporary dataset appeared more plausible for OTU C, a taxon that is also ectomycorrhizal103, therefore with a distribution influenced by host distribution and environmental variables104, and less likely to occur across large bioregional gradients. In contrast, models based on our contemporary dataset suggested a more widespread distribution for OTU O than those based on the historical dataset. This modelled distribution matches more closely with known occurrences of OTU O orchid hosts (particularly Pterostylis spp.) that are distributed throughout south-west and south-east Australia, including in inland and semi-arid areas25. Models based on our contemporary dataset continue to suggest that these fungi have distributions larger than that of their orchid hosts, which is expected as orchid mycorrhizal fungi can have multiple lifestyles105, as free-living saprotrophs106,107 and ectomycorrhizal fungi with non-orchid plants103,108.
Usage Notes
The contemporary Australian Microbiome dataset generated here is ready-to-use for the detection and ecological modelling of soil fungi in Australia and Antarctica. To account for differences in sequence depth in abundance-based analyses, a normalised OTU-by-sample matrix has been provided. Our dataset can be used without further bioinformatic manipulation or expertise in fungal taxonomy. This dataset is particularly suitable as presence-only data for exploring fungal occurrences and distributions. Our conservative quality filtering approach has likely led to some level of underestimation of fungal diversity, as well as false absences. With this in mind, we recommend that researchers using this dataset for diversity analyses (1) exclude samples with ‘mould contamination’ greater than 35% to account for its negative effect on fungal OTU richness (Fig. 9), (2) note that rarefaction may be required for Antarctic samples, but not for Australian samples (Fig. 8), and (3) keep in mind that the detection rate and diversity of early diverging fungi, as well as fungal groups with long ITS1 regions, are likely to be disproportionately underestimated (Figs. 4–7).
The taxonomic and functional annotations have been rigorously assessed using up-to-date methods for taxonomic assignment and functional reference databases, ensuring robust OTU annotations that overcome underlying quality issues in the raw data. Therefore, we strongly advocate for the integration of our contemporary Australian Microbiome dataset into biodiversity platforms such as the ALA and GBIF, unlocking its immense potential to advance fungal biodiversity and ecology research from local to global scales.
Code availability
The Bash and R scripts and a list of dependencies used to perform the bioinformatics and downstream analyses are available at https://github.com/LukeLikesDirt/AusMycobiome. All primary data products produced in our study are available at figshare57.
References
Bardgett, R. D. & van der Putten, W. H. Belowground biodiversity and ecosystem functioning. Nature 515, 505–511 (2014).
Gessner, M. O. et al. Diversity meets decomposition. Trends Ecol Evol 25, 372–380 (2010).
Delavaux, C. S. et al. Mycorrhizal feedbacks influence global forest structure and diversity. Commun Biol 6 (2023).
Delavaux, C. S. et al. Mycorrhizal fungi influence global plant biogeography. Nat Ecol Evol 3, 424–429 (2019).
Anthony, M. A. et al. Forest tree growth is linked to mycorrhizal fungal composition and function across Europe. ISME J 16 (2022).
Luo, S. et al. Higher productivity in forests with mixed mycorrhizal strategies. Nat Commun 14, 1377 (2023).
van der Heijden, M. G. A., Martin, F. M., Selosse, M. A. & Sanders, I. R. Mycorrhizal ecology and evolution: The past, the present, and the future. New Phytol 205, 1406–1423 (2015).
Singh, B. K. et al. Climate change impacts on plant pathogens, food security and paths forward. Nat Rev Microbiol 21, 640–656 (2023).
Fisher, M. C. et al. Emerging fungal threats to animal, plant and ecosystem health. Nature 484, 186–194 (2012).
van Ruijven, J., Ampt, E., Francioli, D. & Mommer, L. Do soil‐borne fungal pathogens mediate plant diversity–productivity relationships? Evidence and future opportunities. J Ecol 108, 1810–1821 (2020).
Sarmiento, C. et al. Soilborne fungi have host affinity and host-specific effects on seed germination and survival in a lowland tropical forest. Proc Natl Acad Sci USA 114, 11458–11463 (2017).
Liu, S. et al. Phylotype diversity within soil fungal functional groups drives ecosystem stability. Nat Ecol Evol 6, 900–909 (2022).
Averill, C. et al. Defending Earth’s terrestrial microbiome. Nat Microbiol 7, 1717–1725 (2022).
Lutz, S. et al. Soil microbiome indicators can predict crop growth response to large-scale inoculation with arbuscular mycorrhizal fungi. Nat Microbiol 8, 2277–2289 (2023).
Ryberg, M. & Nilsson, R. H. New light on names and naming of dark taxa. MycoKeys 30, 31–39 (2018).
Lücking, R. et al. Fungal taxonomy and sequence-based nomenclature. Nat Microbiol 6, 540–548 (2021).
Niskanen, T. et al. Pushing the Frontiers of Biodiversity Research: Unveiling the Global Diversity, Distribution, and Conservation of Fungi. Annu Rev Environ Resour 48, 149–176 (2023).
Gonçalves, S. C., Haelewaters, D., Furci, G. & Mueller, G. M. Include all fungi in biodiversity goals. Science (1979) 373, 403–403 (2021).
Mikryukov, V. et al. Connecting the multiple dimensions of global soil fungal diversity. Sci Adv 9, eadj8016 (2023).
Tedersoo, L. et al. Global diversity and geography of soil fungi. Science 346, 1052–1053 (2014).
Bissett, A. et al. Introducing BASE: the Biomes of Australian Soil Environments soil microbial diversity database. Gigascience 5, s13742–016 (2016).
Schoch, C. L. et al. Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi. Proc Natl Acad Sci USA 109, 6241–6246 (2012).
Midgley, D. J., Greenfield, P., Bissett, A. & Tran-Dinh, N. First evidence of Pezoloma ericae in Australia: using the Biomes of Australia Soil Environments (BASE) to explore the Australian phylogeography of known ericoid mycorrhizal and root-associated fungi. Mycorrhiza 27, 587–594 (2017).
Davoodian, N., Jackson, C. J., Holmes, G. D. & Lebel, T. Continental‐scale metagenomics, BLAST searches, and herbarium specimens: The Australian Microbiome Initiative and the National Herbarium of Victoria. Appl Plant Sci 8 (2020).
Freestone, M. W. et al. Continental-scale distribution and diversity of Ceratobasidium orchid mycorrhizal fungi in Australia. Ann Bot 128, 329–343 (2021).
Yan, D. et al. High-throughput eDNA monitoring of fungi to track functional recovery in ecological restoration. Biol Conserv 217, 113–120 (2018).
Mills, J. G. et al. Revegetation of urban green space rewilds soil microbiotas with implications for human health and urban design. Restor Ecol 28, S322–S334 (2020).
Liddicoat, C. et al. Ambient soil cation exchange capacity inversely associates with infectious and parasitic disease risk in regional Australia. Sci Total Environ 626, 117–125 (2018).
Egidi, E. et al. Delving into the dark ecology: A continent-wide assessment of patterns of composition in soil fungal communities from Australian tussock grasslands. Fungal Ecol 39, 356–370 (2019).
Delgado-Baquerizo, M. et al. Ecological drivers of soil microbial diversity and soil biological networks in the Southern Hemisphere. Ecology 99, 583–596 (2018).
Li, J., Nie, M., Powell, J. R., Bissett, A. & Pendall, E. Soil physico-chemical properties are critical for predicting carbon storage and nutrient availability across Australia. Environ Res Lett 15, 094088 (2020).
Viscarra Rossel, R. A. et al. Environmental controls of soil fungal abundance and diversity in Australia’s diverse ecosystems. Soil Biol Biochem 170, 108694 (2022).
Yang, Y., Shen, Z., Bissett, A. & Viscarra Rossel, R. A. Estimating soil fungal abundance and diversity at a macroecological scale with deep learning spectrotransfer functions. SOIL 8, 223–235 (2022).
Bowd, E. J. et al. Direct and indirect effects of fire on microbial communities in a pyrodiverse dry‐sclerophyll forest. J Ecol 110, 1687–1703 (2022).
Bowd, E. J., Banks, S. C., Bissett, A., May, T. W. & Lindenmayer, D. B. Disturbance alters the forest soil microbiome. Mol Ecol 31, 419–447 (2022).
Waymouth, V. et al. Riparian fungal communities respond to land-use mediated changes in soil properties and vegetation structure. Plant Soil 475, 491–513 (2022).
Waymouth, V., Miller, R. E., Ede, F., Bissett, A. & Aponte, C. Variation in soil microbial communities: elucidating relationships with vegetation and soil properties, and testing sampling effectiveness. Plant Ecol 221, 837–851 (2020).
Tedersoo, L. et al. Best practices in metabarcoding of fungi: From experimental design to results. Mol Ecol 31, 2769–2795 (2022).
Egidi, E. et al. A few Ascomycota taxa dominate soil fungal communities worldwide. Nat Commun 10, 2369 (2019).
Větrovský, T. et al. A meta-analysis of global fungal distribution reveals climate-driven patterns. Nat Commun 10, 5142 (2019).
George, P. B. L. et al. Divergent national-scale trends of microbial and animal biodiversity revealed across diverse temperate soil ecosystems. Nat Commun 10, 1107 (2019).
Hakimzadeh, A. et al. A pile of pipelines: An overview of the bioinformatics software for metabarcoding data analyses. Mol Ecol Resour 00, 1–17 (2023).
Gardes, M. & Bruns, T. D. ITS primers with enhanced specificity for basidiomycetes ‐ application to the identification of mycorrhizae and rusts. Mol Ecol 2, 113–118 (1993).
White, T. J., Bruns, T., Lee, S. & Taylor, J. Amplification and direct sequencing of fungal ribosomal rna genes for phylogenetics. in PCR Protocols: A Guide to Methods and Applications 315–322, https://doi.org/10.1016/B978-0-12-372180-8.50042-1 (Academic Press, San Diego, CA, 1990).
Manter, D. K. & Vivanco, J. M. Use of the ITS primers, ITS1F and ITS4, to characterize fungal abundance and diversity in mixed-template samples by qPCR and length heterogeneity analysis. J Microbiol Methods 71, 7–14 (2007).
Větrovský, T. et al. GlobalFungi, a global database of fungal occurrences from high-throughput-sequencing metabarcoding studies. Sci Data 7, 1–14 (2020).
Tedersoo, L. et al. The Global Soil Mycobiome consortium dataset for boosting fungal diversity research. Fungal Divers 111, 573–588 (2021).
Bissett, A. & Brown, M. V. Alpha-diversity is strongly influenced by the composition of other samples when using multiplexed sequencing approaches. Soil Biol Biochem 127, 79–81 (2018).
Belbin, L., Wallis, E., Hobern, D. & Zerger, A. The Atlas of Living Australia: History, current state and future directions. Biodivers Data J 9 (2021).
Lebel, T. et al. Confirming the presence of five exotic species of Amanita in Australia and New Zealand. Swainsona 44, 1–44 (2024).
Atlas of Living Australia occurrence download for Biome of Australia Soil Environments https://doi.org/10.26197/ala.17104622-e3ef-4f3f-b0ad-2e83743c0dcd (Accessed 8 April 2024).
GBIF occurrence download for Biome of Australia Soil Environments https://doi.org/10.15468/hn7xl9 (Accessed 8 April 2024).
Colesie, C., Walshaw, C. V., Sancho, L. G., Davey, M. P. & Gray, A. Antarctica’s vegetation in a changing climate. WIREs Climate Change 14 (2023).
GBIF occurrence download for Antarctica https://doi.org/10.15468/dl.68djdq (Accessed 8 April 2024).
Kauserud, H. ITS alchemy: On the use of ITS as a DNA marker in fungal ecology. Fungal Ecol 65, 101274 (2023).
Drake, L. E. et al. An assessment of minimum sequence copy thresholds for identifying and reducing the prevalence of artefacts in dietary metabarcoding data. Methods Ecol Evol 13, 694–710 (2022).
Florence, L. et al. A curated soil fungal dataset to advance fungal ecology and conservation research in Australia and Antarctica, figshare, https://doi.org/10.6084/m9.figshare.27938037 (2025).
Vu, D., Nilsson, R. H. & Verkley, G. J. M. Dnabarcoder: An open-source software package for analysing and predicting DNA sequence similarity cutoffs for fungal sequence identification. Mol Ecol Resour 22, 2793–2809 (2022).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Rivers, A. R., Weber, K. C., Gardner, T. G., Liu, S. & Armstrong, S. D. ITSxpress: Software to rapidly trim internally transcribed spacer sequences with quality scores for marker gene analysis. F1000Res 7, 1418 (2018).
Rognes, T., Flouri, T., Nichols, B., Quince, C. & Mahé, F. VSEARCH: a versatile open source tool for metagenomics. PeerJ 4, e2584 (2016).
Callahan, B. J. et al. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods 13, 581–583 (2016).
Abarenkov, K. et al. Full UNITE + INSD fasta release for eukaryotes. Version 04.04.2024. UNITE Community https://doi.org/10.15156/BIO/2959331 (2024).
Edgar, R. C. UCHIME2: improved chimera prediction for amplicon sequencing. biorxiv 074252 (2016).
Calderón-Sanou, I., Münkemüller, T., Boyer, F., Zinger, L. & Thuiller, W. From environmental DNA sequences to ecological conclusions: How strong is the influence of methodological choices? J Biogeogr 47, 193–206 (2020).
Nilsson, R. H. et al. Introducing guidelines for publishing DNA-derived occurrence data through biodiversity data platforms. Metabarcoding Metagenom 6 (2022).
Abarenkov, K. et al. Full UNITE + INSD fasta release for fungi. Version 04.04.2024. UNITE Community https://doi.org/10.15156/BIO/2959330 (2024).
Camacho, C. et al. BLAST + : architecture and applications. BMC Bioinformatics 10, 421 (2009).
Põlme, S. et al. FungalTraits: a user-friendly traits database of fungi and fungus-like stramenopiles. Fungal Divers 105 (2020).
Tedersoo, L. et al. High-level classification of the Fungi and a tool for evolutionary ecological analyses. Fungal Divers 90, 135–159 (2018).
Bradshaw, M. J. et al. Extensive intragenomic variation in the internal transcribed spacer region of fungi. iScience 26, 107317 (2023).
Ovaskainen, O. et al. Global Spore Sampling Project: A global, standardized dataset of airborne fungal DNA. Sci Data 11, 1–17 (2024).
Furneaux, B., Bahram, M., Rosling, A., Yorou, N. S. & Ryberg, M. Long- and short-read metabarcoding technologies reveal similar spatiotemporal structures in fungal communities. Mol Ecol Resour 21, 1833–1849 (2021).
Burg, S. et al. Experimental evidence that root‐associated fungi improve plant growth at high altitude. Mol Ecol 33 (2024).
Dondoshansky, I. & Wolf, Y. BLASTCLUST-BLAST score-based singlelinkage clustering. (2000).
Harwood, T. 9 s climatology for continental Australia 1976-2005: BIOCLIM variable suite. v1. https://doi.org/10.25919/5dce30cad79a8 (2019).
Lembrechts, J. J. et al. Global maps of soil temperature. Glob Chang Biol 28, 3110–3144 (2022).
Malone, B. & Searle, R. Soil and Landscape Grid National Soil Attribute Maps - Sand (3” Resolution) - Release 2. v3. Data Collection. https://doi.org/10.25919/rjmy-pa10 (2022).
Malone, B. & Searle, R. Soil and Landscape Grid National Soil Attribute Maps - Clay (3” Resolution) - Release 2. v4. Data Collection. https://doi.org/10.25919/hc4s-3130 (2022).
Malone, B. & Searle, R. Soil and Landscape Grid National Soil Attribute Maps - Silt (3” Resolution) - Release 2. v2. Data Collection. https://doi.org/10.25919/2ew1-0w57 (2022).
Malone, B. & Searle, R. Soil and Landscape Grid National Soil Attribute Maps - Soil Depth (3” Resolution) - Release 2. v3. Data Collection. https://doi.org/10.25919/djdn-5x77 (2020).
Viscarra Rossel, R., Chen, C., Grundy, M., Searle, R. & Clifford, D. Soil and Landscape Grid Australia-Wide 3D Soil Property Maps (3” Resolution) - Release 1. v3. Data Collection. https://doi.org/10.4225/08/5aaf553b63215 (2014).
Wadoux, A. et al. Soil and Landscape Grid National Soil Attribute Maps - Organic Carbon (3” Resolution) - Release 2. v2.Data Collection. https://doi.org/10.25919/ejhm-c070 (2022).
Zund, P. Soil and Landscape Grid National Soil Attribute Maps - Available Phosphorus (3” Resolution) - Release 1. v1. Data Collection. https://doi.org/10.25919/6qzh-b979 (2022).
Malone, B. Soil and Landscape Grid National Soil Attribute Maps - Cation Exchange Capacity (3” Resolution) - Release 1. v1. Data Collection. https://doi.org/10.25919/pkva-gf85 (2022).
Roman Dobarco, M. et al. Soil and Landscape Grid National Soil Attribute Maps - Soil Organic Carbon Fractions (3” Resolution) - Release 1. v5. Data Collection. https://doi.org/10.25919/fa46-ey49 (2022).
Malone, B. Soil and Landscape Grid National Soil Attribute Maps - PH (Water) (3” Resolution) - Release 1. v1. Data Collection. https://doi.org/10.25919/37z2-0q10 (2022).
Scarth, P., Armston, J., Lucas, R. & Bunting, P. Vegetation Height and Structure - Derived from ALOS-1 PALSAR, Landsat and ICESat/GLAS, Australia Coverage. V1. https://portal.tern.org.au/metadata/TERN/de1c2fef-b129-485e-9042-8b22ee616e66 (2023).
Mokany, K. et al. Plant Diversity Spatial Layers for Australia. v2.Data Collection https://doi.org/10.25919/mk24-1792 (2022).
Harwood, T. et al. 9 Arcsecond Gridded HCAS 2.1 (2001-2018) Base Model Estimation of Habitat Condition for Terrestrial Biodiversity, 18-Year Trend and 2010-2015 Epoch Change for Continental Australia. v7. Data Collection. https://doi.org/10.25919/nkjf-f088 (2021).
Gallant, J., Wilson, N., Tickle, P. K., Dowling, T. & Read, A. 3 Second SRTM Derived Digital Elevation Model (DEM). Version 1.0. https://pid.geoscience.gov.au/dataset/ga/69888 (2009).
Yang, R.-H. et al. Evaluation of the ribosomal DNA internal transcribed spacer (ITS), specifically ITS1 and ITS2, for the analysis of fungal diversity by deep sequencing. PLoS One 13, e0206428 (2018).
Nilsson, R. H., Ryberg, M., Abarenkov, K., Sjökvist, E. & Kristiansson, E. The ITS region as a target for characterization of fungal communities using emerging sequencing technologies. FEMS Microbiol Lett 296, 97–101 (2009).
Lindahl, B. D. et al. Fungal community analysis by high-throughput sequencing of amplified markers - a user’s guide. New Phytologist 199, 288–299 (2013).
Bengtsson-Palme, J. et al. Improved software detection and extraction of ITS1 and ITS2 from ribosomal ITS sequences of fungi and other eukaryotes for analysis of environmental sequencing data. Methods Ecol Evol 4, 914–919 (2013).
Amses, K. R. et al. Diploid-dominant life cycles characterize the early evolution of Fungi. Proc Natl Acad Sci USA 119, 1–10 (2022).
Tedersoo, L. et al. Regional-Scale In-Depth Analysis of Soil Fungal Diversity Reveals Strong pH and Plant Species Effects in Northern Europe. Front Microbiol 11, 1–31 (2020).
Hao, T., Elith, J., Guillera-Arroita, G., Lahoz-Monfort, J. J. & May, T. W. Enhancing repository fungal data for biogeographic analyses. Fungal Ecol 53, 101097 (2021).
May, T. Biogeography of Australasian Fungi: From Mycogeography to the Mycobiome. in Handbook of Australasian Biogeography 156–213 https://doi.org/10.1201/9781315373096 (CRC Press, Boca Raton, FL, 2017).
Phillips, S. J., Anderson, R. P. & Schapire, R. E. Maximum entropy modeling of species geographic distributions. Ecol Modell 190, 231–259 (2006).
Phillips, S. J. & Dudík, M. Modeling of species distributions with Maxent: new extensions and a comprehensive evaluation. Ecography 31, 161–175 (2008).
Hijmans, R. J., Phillips, S., Leathwick, J., Elith, J. & Hijmans, M. R. J. Package ‘dismo’. Circles 9, 1–68 (2017).
Bougoure, J., Ludwig, M., Brundrett, M. & Grierson, P. Identity and specificity of the fungi forming mycorrhizas with the rare mycoheterotrophic orchid Rhizanthella gardneri. Mycol Res 113, 1097–1106 (2009).
van der Linde, S. et al. Environment and host as large-scale controls of ectomycorrhizal fungi. Nature 558, 243–248 (2018).
Veldre, V. et al. Evolution of nutritional modes of Ceratobasidiaceae (Cantharellales, Basidiomycota) as revealed from publicly available ITS sequences. Fungal Ecol 6, 256–268 (2013).
Martos, F. et al. Independent recruitment of saprotrophic fungi as mycorrhizal partners by tropical achlorophyllous orchids. New Phytol 184, 668–681 (2009).
Suetsugu, K., Yamato, M., Matsubayashi, J. & Tayasu, I. Comparative study of nutritional mode and mycorrhizal fungi in green and albino variants of Goodyera velutina, an orchid mainly utilizing saprotrophic rhizoctonia. Mol Ecol 28, 4290–4299 (2019).
Yagame, T., Orihara, T., Selosse, M., Yamato, M. & Iwase, K. Mixotrophy of Platanthera minor, an orchid associated with ectomycorrhiza‐forming Ceratobasidiaceae fungi. New Phytol 193, 178–187 (2012).
Sánchez‐Ramírez, S., Tulloss, R. E., Amalfi, M. & Moncalvo, J. Palaeotropical origins, boreotropical distribution and increased rates of diversification in a clade of edible ectomycorrhizal mushrooms (Amanita section Caesareae). J Biogeogr 42, 351–363 (2015).
Acknowledgements
This work was supported by an Australian Government Research Training Program scholarship from La Trobe University to L.F. We would like to acknowledge the contribution of the Australian Microbiome and Biomes of Australian Soil Environments consortiums in the generation of data used in this publication. The Australian Microbiome initiative is supported by funding from Bioplatforms Australia and the Integrated Marine Observing System through the Australian Government’s National Collaborative Research Infrastructure Strategy, Parks Australia through the Bush Blitz program funded by the Australian Government and BHP, and CSIRO. We thank Sophie Mazard and Andrew Bissett for guidance in accessing the Australian Microbiome data. Leho Tedersoo, Sten Anslan, Kessy Abarenkov, Brendan Furneaux and Duong Vu gave valuable suggestions for the data analyses. Tom May and Kristen Fernandes provided feedback on the manuscript.
Author information
Authors and Affiliations
Contributions
L.F. conceived the study, collected and processed the primary data products, and contributed to the first draft of the manuscript. S.T. contributed to collect and analyse the data relating to the spatial distribution models, and to the first draft of the manuscript. M.F. contributed to collect and analyse the data, relating to the spatial distribution models, and to the first draft of the manuscript. J.W.M. conceived and supervised the study. J.L.W. conceived and supervised the study. C.T. conceived and supervised the study, contributed to process the primary data products and to the first draft of the manuscript. All authors contributed to the manuscript revision.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Florence, L., Tomlinson, S., Freestone, M. et al. A curated soil fungal dataset to advance fungal ecology and conservation research in Australia and Antarctica. Sci Data 12, 353 (2025). https://doi.org/10.1038/s41597-025-04598-5
Received:
Accepted:
Published:
Version of record:
DOI: https://doi.org/10.1038/s41597-025-04598-5
This article is cited by
-
A Reference DNA Barcode Library for UK Fungi associated with Bark and Ambrosia Beetles
Scientific Data (2025)















