Background & Summary

Functional traits, representing morphological, physiological, and behavioral characteristics, provide insights into species’ ecological strategies, niche differentiation, and community assembly1,2. These traits can be classified into two broad categories: response traits and effect traits. The study of functional traits is especially important for soil protists because they have diverse morphologies and lifestyles. Soil protists serve as bioindicators of soil health and ecosystem disturbance, reflecting the impacts of land management practices, pollution, and climate change3. Among soil protists, testate amoebae are an important group of single-celled organisms that inhabit soils worldwide. They have been recognized as essential indicators of environmental change and have been used in studies of paleoecology, biogeography, and biogeochemistry4,5,6,7. Testate amoebae have evolved a range of shell shapes, sizes, and compositions, as well as behavioral and physiological adaptations that allow them to survive and thrive in a range of soil environments8,9,10,11,12. Recent evidence shows that the shell of arcellinid testate amoebae is a crucial component facilitating the amoebae’s attack on large prey; thus, the shell might be considered not purely protective but also a weapon13. Together, these features enable testate amoebae to play important roles in nutrient cycling and decomposition, and make them sensitive to a range of ecological factors14.

Response traits of testate amoebae represent adaptations to their environment. They reflect the organism’s ability to react to changes and variations in environmental conditions. Response traits in testate amoebae include morphological characteristics (the shape, size, and composition of their shells), locomotion strategies in response to environmental cues, and reproduction rate1,15,16,17,18,19. Effect traits describe the functional roles of testate amoebae in ecosystems. They indicate the organism’s influence on ecological processes and community dynamics. Examples of effect traits in testate amoebae include feeding strategy and trophic position1,20. Within these broad categories, numerous specific traits are relevant to testate amoebae, such as structural features, body dimensions, and feeding types. Together, these functional traits provide a comprehensive representation of their ecological roles21.

Trait data for testate amoebae can be found in taxonomic keys, primary literature, and online materials22,23. However, a detailed database is absent. Thus, the primary aim of this work is to provide a dataset on the functional traits of testate amoebae to facilitate further research into their ecological significance, functional diversity, and contributions to ecosystem functioning.

Methods

Taxonomic and geographical coverage

We compiled a list of testate amoeba species from three major sources (Fig. 1). First, we extracted a species list from our own extensive database of samples that were collected in three main habitat types (mires, soils, lakes) across Eurasia during 2004–2020. Specifically, 1115 samples were collected from mires, 91 samples were taken from lakes, and 429 samples originated from soils. These samples are partially represented in open access databases24,25,26,27.

Fig. 1

(a) Location of sampling sites for filtering the list of testate amoebae species. (b) Venn diagram which shows the number of species in three major sources.

Second, we extracted a species list from the database of quantitative samples compiled to develop a pan-European testate amoeba transfer function for reconstructing peatland palaeohydrology28. This database contains 1799 samples from 113 sites in 18 countries spanning 35° of latitude and 55° of longitude.

Third, we extracted a species list from the database of testate amoeba quantitative samples compiled to develop a palaeohydrological transfer function for North America29. This database contains 1956 samples from 137 peatlands located throughout Canada and the USA.

In total, a list of 376 morphospecies (Fig. 1b) was compiled from the three databases22,23. It covers ca. 20% of the approximately 2000 testate amoeba species described so far14.

Trait collection

The traits were chosen based on their relevance to the ecological functions and morphological characteristics of testate amoebae, aiming to capture essential aspects of their ecological strategies1. In addition, we selected traits that are readily available and well documented in the literature and online material. We collected functional trait values from the primary literature22, online databases (https://arcella.nl/), and taxonomic keys23. Most data sources provide ranges for numerical traits. It is well known that continuous trait values vary across a species’ geographic range depending on climatic conditions and other environmental variables1,30. However, detailed information on intraspecific variation is available for a very limited number of testate amoeba species. Thus, to provide consistency across the database, we retrieved the maximum and minimum values of the numerical traits. These extreme values likely correspond to extreme environmental conditions (either beneficial or unfavorable), so we also provide the average value, which we recommend using in most applications. If only a single value was available, we recorded it as the average value for the species and treated the minimum and maximum for this trait as missing values. For categorical and binary traits, we assigned the trait category based on the textual descriptions in the sources.
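
The rule for numerical traits described above (a range yields minimum, maximum, and average; a single value yields only an average) can be illustrated with a minimal Python sketch. The names `TraitRange` and `summarize_trait` are hypothetical, and taking the average as the arithmetic mean of the reported values is an assumption made only for illustration; the text does not state how averages were derived.

```python
from statistics import mean
from typing import NamedTuple, Optional, Sequence


class TraitRange(NamedTuple):
    minimum: Optional[float]  # None marks a missing value
    average: float
    maximum: Optional[float]  # None marks a missing value


def summarize_trait(reported: Sequence[float]) -> TraitRange:
    """Summarize the values a source reports for one numerical trait.

    A single reported value is stored as the average, with minimum and
    maximum left missing. For several values, the extremes are kept and
    the average is ASSUMED to be their arithmetic mean (illustration only).
    """
    if not reported:
        raise ValueError("at least one reported value is required")
    if len(reported) == 1:
        return TraitRange(None, float(reported[0]), None)
    return TraitRange(float(min(reported)), float(mean(reported)), float(max(reported)))


# Hypothetical measurements in micrometers, not taken from the dataset.
single = summarize_trait([45])
ranged = summarize_trait([40, 60])
```

Keeping missing extremes as explicit `None` values, rather than copying the average into them, preserves the distinction between "no range reported" and "a range of zero width".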

Table 1 summarizes the traits included in the dataset. All 18 traits can be grouped into five major blocks.

Table 1 Description for traits in the dataset.

First, the shell dimensions (shell length, shell width, shell depth, indices R1 and R2) of testate amoebae determine their physical characteristics and resource acquisition abilities1,31. Shell length was defined as the distance from the aperture of the shell to the opposite side along the first axis (Table 1; Fig. 2a). Shell width was defined as the maximum dimension perpendicular to the first axis, representing the widest part of the shell along the second axis (Fig. 2a). Lastly, shell depth was defined as the dimension perpendicular to both the first and second axes, representing the depth or thickness of the shell (Fig. 2a). These three traits describe the overall size of the shell. Shell size influences protection against environmental stressors. Larger shells provide enhanced resistance to desiccation and predation, increasing survival rates in challenging habitats1,31. Two ratios, R1 and R2, were derived from these three traits to describe structural proportions from different views. Each ratio reflects a specific aspect of shell shape. R1 is the ratio of shell width to shell length and indicates the shape in lateral view. The lower the R1, the more closely the overall shape of the shell resembles a long strip. As R1 approaches one, the shape approaches that of a sphere. Our definition of shell length does not require it to be the longest dimension, so R1 can exceed one. This case corresponds to a plate-like shape viewed from the side. R2 is the ratio of shell depth to shell width and reflects the shape in latero-apertural view. The closer this ratio is to one, the closer the shape is to a circle.

Fig. 2

Positions of the measured axes for the numerical traits of testate amoebae: (a) broad lateral view (Lagenodifflugia vas); (b) apertural view (L. vas). Numbers denote: 1, shell length; 2, shell width; 3, shell depth; 4, aperture length; 5, aperture width.

Second, outline represents the overall shape of the shell in three-dimensional space. This is a qualitative trait with eight categories (Fig. 3). Shell shape is related to the environment in which testate amoebae live31.

Fig. 3

Outline/shell (test) shape: (a) sphere (Difflugia urceolata); (b) hemisphere (Phryganella hemisphaerica); (c) cylinder (Cylindrifflugia lanceolata); (d) patelliform (Arcella vulgaris undulata); (e) rectangular cuboid (Heleopera petricola); (f) ovoid (Euglypha rotunda); (g) pyriform (Difflugia bacillifera); (h) spiral (Lesquereusia spiralis).

Third, eight traits are associated with the aperture. Aperture length was defined as the longest dimension of the opening of the shell (Table 1; Fig. 2b). Aperture width was defined as the shorter dimension perpendicular to the aperture length, representing the second maximum width of the opening (Fig. 2b). Together, aperture length and width describe the extent of the opening. The size of the aperture can affect the size of prey captured and feeding efficiency1,16. Larger apertures facilitate food intake and movement, leading to enhanced nutrient uptake and growth rates. Two ratios (R3, R4) were derived from these numerical traits to describe structural proportions16. R3 is the ratio of aperture length to shell width and indicates the size of the opening relative to the shell width in vertical view. R4 is the ratio of aperture width to aperture length and describes the shape of the aperture in apertural view. R4 values close to unity indicate circular apertures, which may allow more efficient feeding or locomotion. Conversely, lower R4 values indicate narrower apertures, which may be characteristic of specific feeding strategies or environmental conditions. Variations in other aperture-related traits, such as position (Fig. 4), degree of invagination (Fig. 5), aperture rim (Fig. 6), and presence or absence of a collar (Fig. 7a), further contribute to ecological success by influencing prey capture efficiency and protection against desiccation1,32.
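
The four shape ratios defined in this section (R1 and R2 for the shell, R3 and R4 for the aperture) can be computed from the five measured dimensions. The following Python sketch is illustrative only: the function names and the example measurements are hypothetical and are not taken from the dataset.

```python
from typing import Dict, Optional


def safe_ratio(numerator: Optional[float], denominator: Optional[float]) -> Optional[float]:
    """Return numerator / denominator, or None if a value is missing or the denominator is zero."""
    if numerator is None or denominator is None or denominator == 0:
        return None
    return numerator / denominator


def shape_ratios(
    shell_length: Optional[float],
    shell_width: Optional[float],
    shell_depth: Optional[float],
    aperture_length: Optional[float],
    aperture_width: Optional[float],
) -> Dict[str, Optional[float]]:
    """Compute the ratios as defined in the text (all inputs in the same unit)."""
    return {
        "R1": safe_ratio(shell_width, shell_length),        # shell width / shell length
        "R2": safe_ratio(shell_depth, shell_width),         # shell depth / shell width
        "R3": safe_ratio(aperture_length, shell_width),     # aperture length / shell width
        "R4": safe_ratio(aperture_width, aperture_length),  # aperture width / aperture length
    }


# Hypothetical shell: 100 x 50 x 50 micrometers with a circular 20-micrometer aperture.
example = shape_ratios(100.0, 50.0, 50.0, 20.0, 20.0)
```

For this made-up shell, R1 is 0.5 (elongated in lateral view), R2 and R4 are 1 (circular cross-section and circular aperture), and R3 is 0.4.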

Fig. 4

Position of the aperture: (a) straight terminal (Assulina muscorum); (b) sub terminal (Cyphoderia ampullula); (c) central ventral (Arcella catinus); (d) shifted ventral (Centropyxis aerophila); (e) amphistomic (Archerella flavum).

Fig. 5

Degree of invagination of aperture: (a) absent (Phryganella acropodia); (b) slightly (Trigonopyxis arcula); (c) strongly (Galeripora catinus).

Fig. 6

Aperture rim: (a) straight (Gibbocarina galeata); (b) curved (Quadrulella symmetrica); (c) lobed (Netzelia oviformis); (d) denticular (Euglypha ciliata).

Fig. 7

Three binary traits: (a) presence of collar (Arcella hemisphaerica); (b) presence of internal partitions (Zivkovicia spectabilis); (c) presence of spines/horns (Centropyxis aculeata).

Fourth, three structural features (presence or absence of internal partitions, Fig. 7b; presence or absence of spines/horns, Fig. 7c; and type of shell covering, Fig. 8) affect predator avoidance, drought adaptations, ecological strategy, and competitive interactions1,33.

Fig. 8

Shell covering: (a) organic (Hyalosphenia elegans); (b) xenosomes (Difflugia venusta); (c) idiosomes (Trinema lineare); (d) cleptosomes (Gibbocarina galeata).

Fifth, the feeding type (e.g. mixotrophy, bacterivory, and predation) reflects the dietary preferences and foraging strategies of testate amoebae.

All traits reflect adaptations to specific ecological pressures, contributing to the ecological success of testate amoebae populations34,35. In summary, the suite of morphological and behavioral traits in testate amoebae reflects their ecological strategies for survival, resource acquisition, and niche differentiation.

Data visualization

To display the distribution of each trait and to check for anomalous values, we used the R library ‘ggplot2’36 to create histograms and bar plots.

Data Records

The dataset (species-level_trait.csv file)37 comprises trait data for 372 species of testate amoebae, representing the most commonly encountered species in the Northern Holarctic realm. We did not find trait data for four morphospecies (Centropyxis aculeata minor, Centropyxis aculeata gibbosa, Centropyxis ecornis quadripannosa, Euglypha dolioformis). The dataset covers 100% of species from our own samples, 97% of species from the pan-European peatland database28, and 100% of species from the North American peatland database29. Thus, we consider our dataset to cover the most widespread testate amoeba species from the Northern Holarctic realm.

The completeness of the dataset37 is 90% (defined as 1 − proportion of missing values in the main table, expressed as a percentage). Missing values are present only for the minimum and maximum of numerical traits when no information on the range was available. In all such cases, the average value is provided. Therefore, our database contains a full set of traits for each of the 372 species.
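
As a minimal sketch of the completeness measure, reading the definition as one minus the proportion of missing cells in the trait table, the calculation could look as follows in Python. The column names and the two example rows are hypothetical and serve only to show the arithmetic.

```python
from typing import Dict, List, Optional, Union

Cell = Optional[Union[float, str]]  # None marks a missing value


def completeness(rows: List[Dict[str, Cell]]) -> float:
    """Return 1 - (number of missing cells / total number of cells)."""
    total = sum(len(row) for row in rows)
    if total == 0:
        raise ValueError("the table has no cells")
    missing = sum(1 for row in rows for value in row.values() if value is None)
    return 1 - missing / total


# Two hypothetical species; the second has only a single reported length,
# so its minimum and maximum are missing while the average is present.
table = [
    {"length_min": 40.0, "length_avg": 50.0, "length_max": 60.0, "outline": "ovoid", "collar": "absent"},
    {"length_min": None, "length_avg": 45.0, "length_max": None, "outline": "pyriform", "collar": "absent"},
]
score = completeness(table)  # 2 missing cells out of 10
```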

Genera with relatively high species richness are Difflugia (81 species), Euglypha (35 species), and Centropyxis (31 species) (Fig. 9). However, there are also 19 genera comprising only one species. The selected species ensure broad representation across different geographic regions and habitat types. Each species in the dataset is characterized by a set of trait measurements, providing comprehensive coverage of key functional traits relevant to the ecological roles and behaviors of testate amoebae. The trait data offer insights into the morphological, physiological, and behavioral characteristics of these organisms.

Fig. 9

Number of species in each genus for trait data table.

The distribution of the traits can be characterized for each of the five major blocks. First, for most species, the average shell length, width, and depth are in the range from 3 to 200 micrometers, and the distributions of all three dimensions are right-skewed (Fig. 10a–c). More than half of the species have an R1 value of less than one, reflecting a strip-like shape (Fig. 10d). Most species have equal width and depth, reflected in an R2 value of one (Fig. 10e). Second, ovoid and pyriform shells are most common, while rectangular cuboid shells are less frequent (Fig. 11). Third, for over half of the species, the average aperture size is less than 50 micrometers (Fig. 10f,g). The R3 ratio has a more or less symmetric distribution around a median value of 0.39 (Fig. 10h). Most species have a circular aperture, reflected in an R4 value of one (Fig. 10i). In addition, more than half of the species are characterized by apertures with a straight terminal position, without invagination, with straight or curved rims, and with no collar (Fig. 11). Fourth, xenosome and idiosome shell coverings, along with the absence of internal partitions, spines, or horns, are prevalent among all species (Fig. 11). Fifth, the vast majority of species feed on bacteria (Fig. 11).

Fig. 10

Histogram of distribution for the numerical traits.

Fig. 11

Bar plots of distribution for the categorical and binary traits.

Technical Validation

To ensure the reliability of our data, we have implemented the following three measures. First, we utilized primary literature, authoritative online resources and major identification guides in our field. When collecting trait data, we used the latest species names and accurate taxonomic keys for retrieval. Second, we used trait distributions to check for outliers. If outliers were identified, we reviewed the data sources and verified them to determine whether to retain the data. Third, after completing data collection, experts in the field conducted a specialized review of the data.
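
The outlier check in the second measure was based on inspecting trait distributions. As one possible way to automate a first screening pass, the Python sketch below flags values outside Tukey's fences (1.5 times the interquartile range). This rule, the function name `flag_outliers`, and the example values are assumptions introduced for illustration; they are not the procedure described above, and flagged values would still need to be verified against the data sources.

```python
from statistics import quantiles
from typing import Dict, List


def flag_outliers(values: Dict[str, float], k: float = 1.5) -> List[str]:
    """Return the species whose trait value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = quantiles(values.values(), n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [species for species, value in values.items() if value < lower or value > upper]


# Hypothetical average shell lengths in micrometers; "sp_g" mimics a likely data-entry error.
lengths = {"sp_a": 10, "sp_b": 12, "sp_c": 11, "sp_d": 13, "sp_e": 12, "sp_f": 11, "sp_g": 300}
suspects = flag_outliers(lengths)
```

Because the size distributions in this dataset are right-skewed, a rule like this would likely flag genuinely large species as well, so applying it to log-transformed values may be preferable.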

Usage Notes

Broad range of applications

The dataset’s potential applications extend beyond ecology, with relevance to climate change studies, soil science, and conservation biology. For example, the functional trait data can be utilized in ecological modeling to predict species responses to environmental changes, assess biodiversity under different scenarios, and monitor ecosystem health. Case studies demonstrating these applications, such as using community-weighted means (CWMs) to track shifts in community composition due to climate change38,39 or employing functional diversity metrics to guide conservation strategies40, underscore the dataset’s broad utility and significance.

Specific applications

The collection of trait data on the morphological and feeding features of testate amoebae offers potential applications in ecological, environmental, and paleontological research. One application is the calculation of community-weighted means (CWMs). By integrating our trait data with abundance data from specific studies, researchers can assess the average trait values in a community, providing insights into community assembly processes and environmental filtering41. Additionally, our trait data enable the calculation of functional distances between species. These distances can be used in various analyses, such as ordination and clustering, to explore patterns of functional similarity and divergence among communities42.
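
A community-weighted mean is the abundance-weighted average of a trait across the species present in a sample, CWM = Σ p_i t_i, where p_i is the relative abundance of species i and t_i is its trait value. The Python sketch below shows the calculation under simple assumptions: the function name, the species labels, and the numbers are hypothetical, and species without a trait value are dropped before the abundances are renormalized.

```python
from typing import Dict, Optional


def community_weighted_mean(
    abundances: Dict[str, float],
    trait_values: Dict[str, Optional[float]],
) -> Optional[float]:
    """Abundance-weighted mean of one numerical trait for one sample.

    Species with zero abundance or without a trait value are ignored;
    returns None if no species remains.
    """
    usable = [
        species
        for species, count in abundances.items()
        if count > 0 and trait_values.get(species) is not None
    ]
    total = sum(abundances[species] for species in usable)
    if total == 0:
        return None
    return sum(abundances[species] * trait_values[species] for species in usable) / total


# Hypothetical sample: counts per species and average shell length in micrometers.
counts = {"species_a": 30, "species_b": 10, "species_c": 5}
shell_length = {"species_a": 40.0, "species_b": 20.0, "species_c": None}
cwm = community_weighted_mean(counts, shell_length)  # (30*40 + 10*20) / 40
```

In practice, `trait_values` would be filled from the average-value column of the trait table, joined to the abundance table by species name.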

Our data can also facilitate the use of null models to test hypotheses about community assembly. By comparing observed patterns to those generated by null models, researchers can infer the influence of ecological processes such as competition, niche differentiation, and stochastic events on community composition43. Furthermore, the trait data can be employed to measure functional diversity (e.g. functional richness and evenness), which provides an understanding of biodiversity that incorporates the ecological roles and interactions of species44. Overall, the integration of trait data with ecological and statistical methods allows for a comprehensive understanding of testate amoeba communities, enhancing both our knowledge of these microorganisms and broader ecological theories.
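
To make the null-model idea concrete, the Python sketch below compares the observed trait range of a community (a simple one-dimensional stand-in for functional richness) with the ranges of random communities of the same size drawn from the species pool, and reports a standardized effect size (SES). Everything here is an illustrative assumption: the choice of metric, the equal-probability draw from the pool, the function names, and the trait values.

```python
import random
import statistics
from typing import Dict, List, Optional


def trait_range(species: List[str], traits: Dict[str, float]) -> float:
    """Range (max - min) of one trait among the given species."""
    values = [traits[name] for name in species]
    return max(values) - min(values)


def null_model_ses(
    community: List[str],
    traits: Dict[str, float],
    n_random: int = 999,
    seed: int = 0,
) -> Dict[str, Optional[float]]:
    """Compare the observed trait range with random draws of equal richness from the pool."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    pool = sorted(traits)
    observed = trait_range(community, traits)
    null_values = [
        trait_range(rng.sample(pool, len(community)), traits) for _ in range(n_random)
    ]
    mean_null = statistics.mean(null_values)
    sd_null = statistics.stdev(null_values)
    ses = (observed - mean_null) / sd_null if sd_null > 0 else None
    return {"observed": observed, "ses": ses}


# Hypothetical pool of five species with average shell lengths in micrometers.
pool_traits = {"sp_a": 10.0, "sp_b": 20.0, "sp_c": 30.0, "sp_d": 40.0, "sp_e": 100.0}
result = null_model_ses(["sp_b", "sp_c"], pool_traits)
```

A negative SES, as expected for this made-up community of two similar species, indicates a narrower trait range than random assembly would produce, which is the pattern usually interpreted as environmental filtering.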

Dataset limitations

There are two major limitations of our dataset. First, it is a compilation of species-level information from the literature. Most information comes from keys, which have identification as their primary purpose. The categorical traits incorporated into our dataset are species-invariant, so their values can be regarded with high confidence. However, numerical traits are usually provided by data sources without a detailed measurement procedure. Consequently, we were unable to estimate the degree of uncertainty. Second, most of the quantitative data we used to compile the species list were samples from peatlands, and a disproportionately small number of samples originated from freshwater lakes and mineral soils. Thus, our dataset could be biased to some degree towards peatlands.

Future directions

We can outline two major directions to improve the quality and applicability of functional trait databases for testate amoebae. First, higher confidence could be achieved with individual-based measurements of numerical traits. However, the development of a database of this type is time-consuming and should be the topic of a separate collaborative effort. Second, test shape is an important aspect of testate amoeba morphology and deserves a more detailed description. Conventionally, this information has been represented by a categorical trait (in our dataset, we provide a trait with eight categories). A better alternative is a quantitative description of shape. Here we provide two quantitative ratios, R1 and R2, describing the main aspects of test shape. However, the modern framework for describing shape is geometric morphometrics, based on multivariate analysis of landmark configurations. It provides an opportunity to perform shape analysis without sacrificing any aspect of shape, a loss that is inevitable when using shape indices. This type of analysis has already been tested on freshwater Arcellinida31. The development of a comprehensive shape database requires the creation of a large set of high-quality images of amoeba shells, which could be regarded as a potential direction of further research.