Background & Summary

Thermal limits of life have interested researchers since at least the 1700s1,2,3,4,5, with early comparisons of thermal tolerances between organisms published in the late 1800s3. Despite the large volume of research on the biological and ecological influence of temperature over the last two centuries, many questions regarding thermal tolerances remain unanswered6,7,8. For example, the relationship between oxygen and thermal tolerance is still contested9,10, and the influence of thermal tolerance on range limits is not fully understood11,12. The evolutionary determinants of thermal tolerance also remain an active area of research13,14,15,16,17.

Furthermore, thermal tolerance can be measured in different ways, complicating large-scale comparisons18,19,20,21,22. Two fundamental methodologies for measuring the thermal tolerance of a species have emerged in the last century. The first involves ramping the temperature until a specified endpoint is reached. It is known as the critical thermal method, a term coined by Cowles and Bogert in 1944 (ref. 23), though similar methods existed previously24,25. The metrics for upper and lower thermal tolerance are known as critical thermal maximum (CTmax) and critical thermal minimum (CTmin), respectively, which represent the upper and lower bounds of a thermal performance curve26,27,28. The most commonly measured endpoint for this method is the loss of equilibrium, which represents a temperature at which an organism can no longer escape conditions that would lead to death; however, the organism should be able to recover once removed from the test19,29. This method has gained popularity over time because the test can be conducted quickly, repeatedly on the same organism, with a small sample of test organisms, and without permanently harming them29. Modern mobile heating units allow this methodology to be conducted in the field30,31. When the temperature is instead ramped until death occurs, the resulting metrics are the lethal thermal maximum (LTmax) and lethal thermal minimum (LTmin)32. The second method is a static assay, wherein groups of organisms are kept at a fixed temperature for a fixed period of time. This method was formally named the incipient lethal method by Fry in 1947 (ref. 33), based on work by others in the two decades prior34,35,36,37,38. Percentage mortality is assessed as an endpoint for each group, and statistical methods such as probit analysis are used to estimate the temperature at which a certain percentage of the group is affected37. The LT50 metric is the temperature at which 50% of the organisms perish, analogous to the LC50 of ecotoxicological assays for chemicals33,39. As with LC50 metrics, LT50 metrics are commonly measured at 24, 48, or 96 hours18. The incipient upper or lower lethal temperature (IULT and ILLT, respectively) is distinguished from the LT50 by the longer time course of the test: the IULT represents the temperature which is not fatal to 50% of a population irrespective of exposure time34. This method has decreased in relative popularity over time, due to the large amount of resources required29. A few studies have compared metrics from the two methods on a theoretical40,41 or empirical42 level, but with limited sample sizes.

A third method known as the thermal death time (TDT) was developed concurrently with the critical thermal and incipient lethal temperature methods. This method observes the time elapsed until an endpoint (usually death) occurs at constant temperatures. The role of time in temperature tolerance has been acknowledged since the 1940s, notably by Fry during the development of the incipient lethal temperature procedure25,34,43. Recent works have unified thermal tolerance methodologies by considering tolerance in the context of intensity and duration of exposure40,41,44,45,46. Thermal death time studies fall outside of the scope of the present study, as the TDT metric (time) does not directly match the temperature metric of the other methodologies. While mathematical models could resolve this discrepancy40,44,45, the present work includes only reported metrics that can be tracked to a primary source.

With the advancing understanding of climate change effects, large collections of thermal limits have been compiled to assess the vulnerability of organisms to warming on a global scale and to answer fundamental questions on the determinants of thermal tolerance47,48,49,50,51,52. Freshwater taxa are not well represented within these collections, with the exception of amphibians47,52,53,54. Freshwaters support disproportionately high levels of biodiversity and provide vital resources for people55. Freshwater ecosystems, including rivers, lakes, and wetlands, cover approximately 3% of Earth’s surface but provide habitats for over 50% of all described fish species and a third of all known vertebrates56,57. In addition, these systems supply drinking water, purify water, and contribute to climate regulation. Freshwater ecosystems are directly threatened by global warming on a large scale, and locally by thermal pollution from industrial effluents58,59,60. Climate change may also increase water temperatures indirectly; for example, flow intermittence induced by climatic changes may strongly increase water temperatures6,61. Other stressors, such as hydromorphological changes and removal of riparian vegetation, can also be associated with increased water temperature6,62,63. Evaluating the effects of increasing temperatures on biodiversity requires information on species’ thermal tolerance. Hence, we established the ThermoFresh database focusing on the thermal tolerance of freshwater taxa, which allows for intra- and interspecific comparisons among global freshwater assemblages, assessment of thermal vulnerability under warming scenarios, and prediction of future range shifts.

Thermal tolerance can also change in the presence of other stressors53,64, which affects its ecological relevance in a world where many organisms are increasingly subjected to multiple stressors58,65. We include thermal tolerance tests in the presence of additional stressors, with information on the type and level of the stressor(s). This facilitates future use of the database in disentangling complex interactions between multiple stressors and consequential effects on organisms in stressful environments.

The database was developed following current recommendations of best practice66,67,68. In comparison to previous studies we aimed to reduce geographic bias, provide multiple records per species (where possible, to allow assessment of intraspecific variation), and report measures of dispersion, to address gaps identified in recent literature7,54. We have included thermal tolerance metrics determined from the critical thermal method or the incipient lethal method, including those conducted on the same species within the same study42,69, which facilitates comparison of how the metrics are related across taxonomic groups. Our database focuses on freshwater fish and invertebrates, with records collated from previous compilations and expanded with new literature searches, including in widely spoken languages other than English. Including widely spoken non-English languages in ecological literature searches decreases geographical bias, which is critical for extrapolating results across the diverse regions of the earth70,71. The current uncertainty induced by geographical bias in physiological data overrides the uncertainty in future climate predictions72, and non-English literature can help counteract such bias73,74,75. Additionally, the publication rate of non-English studies on biodiversity is growing in many countries76. The relevance of non-English literature has been increasingly recognized70,71,73.

Methods

Workflow

The conceptualization of the database started with two previous compilations, the GlobTherm database and a study by Leiva and colleagues47,52. The search terms used in these studies were implemented in scoping searches, during which two additional compilations, authored by Cereja53 and Sundermann and colleagues77, respectively, were found. These four recently published databases were initially selected based on their size, freshwater taxa inclusion, and date of publication. The reasons for the inclusion or exclusion of previous databases are listed in Table 1. To cover gaps in previous publications, we conducted searches in Chinese, German, French, and Spanish up until April 2023, and searches in English until April 2023 in Web of Science as well as from 2019 until April 2023 in Google Scholar. English search terms are listed in Table 2, and all search terms in Supplementary Table 1. During the searches it became apparent that invertebrates were severely underrepresented compared to fish. Therefore, we exerted additional search effort to increase the coverage of invertebrates in the database, using taxonomic names of common invertebrates in the search terms. Throughout this study, several additional compilations51,78,79,80,81,82,83,84,85,86,87 were discovered, most newly published (Table 1). Of these, four were selected, then harmonized and added to the database following the same procedure as before. The complete workflow is visualized in Fig. 1.

Table 1 A list of previously published large collections of thermal tolerance data considered for harmonization in this study, with information on when it was found and the reason for inclusion or exclusion.
Table 2 Search strings for the literature searches conducted in English, along with the targeted organism group and platform within which the search was conducted.
Fig. 1

A flow chart describing the workflow to obtain studies included in the database. A total of 572 studies were included, yielding 6825 records of thermal tolerance. Nine studies were obtained from references.

Inclusion criteria

The database includes studies that fulfill three primary criteria: 1) the study reports at least one thermal tolerance metric for an organism, 2) the endpoint is mortality or a sublethal indication of imminent mortality (such as loss of equilibrium, loss of righting response, or onset of spasms), and 3) the test organisms are fish or invertebrates residing in fresh or brackish water habitats for at least part of their life cycle (Fig. 1). Studies with incomprehensible or missing methodology for thermal tolerance tests were excluded; minor methodological inconsistencies were noted during data extraction. The thermal tolerance metrics considered for inclusion needed to represent the result of either a dynamic or static assay, and the study needed either to contain enough methodological information to deduce which kind of assay was conducted or to cite a recognized metric name (CTmax, CTmin, LTmax, LTmin, LT50, IULT, ILLT) in accordance with literature describing the methodology (notably key papers such as Cowles and Bogert 1944, Becker and Genoway 1979, Lutterschmidt and Hutchison 1997, Beitinger and Bennett 2000)18,19,23,29.
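
The criteria above can be read as a simple screening rule. The Python sketch below is one hypothetical way to encode them; screening for the database was done by the authors reading each study, not by code like this. All field names, category labels, and the list of accepted endpoints are assumptions introduced for illustration (the text names its endpoints only as examples).

```python
from dataclasses import dataclass
from typing import List, Optional

# Metric names listed in the inclusion criteria.
RECOGNIZED_METRICS = {"CTmax", "CTmin", "LTmax", "LTmin", "LT50", "IULT", "ILLT"}
# Illustrative endpoint labels; the source gives these only as examples.
ACCEPTED_ENDPOINTS = {
    "mortality",
    "loss of equilibrium",
    "loss of righting response",
    "onset of spasms",
}
ACCEPTED_GROUPS = {"fish", "invertebrate"}
ACCEPTED_HABITATS = {"freshwater", "brackish"}


@dataclass
class CandidateStudy:
    """Hypothetical summary of one screened study."""
    n_tolerance_values: int        # number of thermal tolerance values reported
    metric_name: Optional[str]     # metric name cited by the study, if any
    assay_type: Optional[str]      # "dynamic", "static", or None if not deducible
    endpoint: str
    organism_group: str
    habitats: frozenset            # habitats used during the life cycle
    methods_clear: bool            # False if methodology is missing or incomprehensible


def exclusion_reasons(study: CandidateStudy) -> List[str]:
    """Return the reasons a study fails the criteria (empty list means include)."""
    reasons = []
    if study.n_tolerance_values < 1:
        reasons.append("no thermal tolerance metric reported")
    if study.endpoint not in ACCEPTED_ENDPOINTS:
        reasons.append("endpoint is not mortality or imminent mortality")
    if study.organism_group not in ACCEPTED_GROUPS or not (study.habitats & ACCEPTED_HABITATS):
        reasons.append("not a fresh or brackish water fish or invertebrate")
    if not study.methods_clear:
        reasons.append("methodology missing or incomprehensible")
    if study.assay_type not in {"dynamic", "static"} and study.metric_name not in RECOGNIZED_METRICS:
        reasons.append("assay type not deducible and no recognized metric name")
    return reasons


example = CandidateStudy(3, "CTmax", "dynamic", "loss of equilibrium",
                         "fish", frozenset({"freshwater"}), True)
print(exclusion_reasons(example))  # an empty list means the study passes
```

Returning the list of reasons, rather than a bare yes or no, mirrors the practice of noting why a study was eliminated.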

Previous databases

We initially selected four recently published databases on thermal tolerance that include freshwater organisms and contain variables relevant to the test organisms and the test metrics. These were identified via scoping searches on thermal tolerance literature. The GlobTherm database47 contains thermal tolerance records for the largest number of species in one compilation to date, the compilation of Leiva and colleagues52 includes a high proportion of freshwater invertebrate taxa, the compilation by Cereja53 includes only aquatic species (therefore more freshwater taxa) and notes additional stressors53, and the dataset by Sundermann and colleagues77 focuses exclusively on freshwater invertebrates. Other large datasets either contained fewer relevant variables due to the nature of the research questions they were collected for51,88, fewer freshwater taxa (especially invertebrates)54,89, or a high duplication of studies with the selected databases90,91. We excluded collections published before 2018, given the high duplication with more recent literature (Table 1).

We attempted to harmonize the selected databases. However, previous databases either only reported one value per taxon, even when multiple values were reported in constituent studies, mislabelled thermal tolerance metrics (i.e. labelling all thermal tolerance values as CTmax, though some were other metrics), or lacked key information related to the method, location, or test organism. Thus all selected databases were filtered for relevant freshwater or brackish taxa, all existing data for these taxa combined, and each corresponding source reference examined to cross-check existing data, correct errors, and fill in missing variables. This included extracting additional data from the main text, tables, figures, and supplementary information of source studies. Additional records were added when present in the original source. When data was only presented in figures, it was extracted using Plot Digitizer92. Collected variables are summarized in Table 3, and the extraction protocol with descriptions of all variables in detail is available in the “methods” folder on GitHub (https://github.com/hsbayat/ThermoFresh/tree/main/methods; see also Code Availability section).

Table 3 An overview of the variables included for records in the database.

Four additional large collections were identified during and after the search and data extraction process (Table 1). The compilation by Dahlke and colleagues81 focuses on fish and includes many sources from the grey literature as well as all papers compiled by Comte & Olden 2017 (ref. 84). Collections by Morley and colleagues51, Pottier and colleagues82, and Sasaki and colleagues83 all include invertebrates as well as fish. These were filtered for freshwater taxa and screened for duplicate sources; data was then extracted from each original source as described above.

Literature searches

We complemented the selection of records from existing compilations with new literature searches. A general literature search was conducted in Google Scholar in English for the period from January 1, 2019 until April 26, 2023, to cover studies that appeared after the previous databases were compiled, with search terms targeting both the critical thermal and incipient lethal methodologies (Table 2). Far more studies were obtained using search terms targeting the critical thermal method than those targeting the incipient lethal method. We concentrated search efforts on upper thermal tolerance, but also extracted lower thermal tolerance where it was reported. We also conducted a search in Web of Science from the earliest index date until April 26, 2023. Pilot searches were conducted in all languages to fine-tune the search terms. Final searches in French, German, and Spanish were conducted in Google Scholar, and the search in Chinese was conducted in the China National Knowledge Infrastructure (CNKI), from the earliest indexed studies up until April 26, 2023 (Supplementary Table 1). The pilot search and search term refinement for the Spanish search were performed in the Scientific Electronic Library Online (SciELO) and Google Scholar. More relevant publications were indexed in Google Scholar, so the final search was conducted there. Pilot searches in Italian and Norwegian did not result in any relevant publications. One publication in Danish came up as part of the English search and was included. Papers in other languages were excluded.

Fish made up the majority of freshwater taxa in existing databases, while invertebrates were vastly underrepresented47. The GlobTherm database containing the most taxa (over 2000) includes only 8 species of freshwater insects47, a four-fold underrepresentation according to current estimates of the total species on earth93,94. Therefore we exerted additional effort to find papers focused on the thermal tolerance of freshwater invertebrates. We deemed this extra effort unnecessary for fish, since freshwater fish were already well-represented in existing collections, with the large compilation by Dahlke and colleagues focusing on fish entirely. Freshwater invertebrate species outnumber freshwater fish at least seven to one; freshwater insect species alone outnumber freshwater fish species five to one95, yet fish species outnumber invertebrate species in all existing collections covering both groups.

Scoping searches confirmed that key papers focused on invertebrates were partly missed in searches with general search terms. We then modified the general English search terms by adding the order, family, or genus name of freshwater invertebrates obtained from a list of freshwater invertebrates sampled in Germany over the last 12 years, spanning habitats from near-natural to highly degraded conditions96,97. The orders and families from this list are distributed globally56, and contain many of the most studied taxa. Papers resulting from these searches were not limited to one geographic region. Nonetheless we expanded the search to include global invertebrate families from Africa, Asia, Europe, North America, Oceania, and South America98,99. A full list of all taxonomic names used in the search is found in the “methods” folder on GitHub (see Code Availability section). The expanded search resulted in 6 additional eligible studies, compared to 67 resulting from the initial list.

Google Scholar search results were saved as a file using Publish or Perish v8 software100. Search results were screened first for relevance by title and abstract, then for duplicates with the studies already included from previous databases. Some studies were eliminated during the data extraction phase because inclusion criteria were not met; in these cases, the reason for elimination was noted.
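
Duplicate screening of this kind can be partly automated. The Python sketch below shows one possible approach, assuming each search hit is a dictionary with hypothetical `doi`, `title`, and `year` keys; it is not the authors' procedure (their screening was carried out in R, with manual checks). Records are matched on a normalized DOI when one exists, and otherwise on a normalized title plus year. The DOIs in the example are placeholders.

```python
import re
import unicodedata


def normalize_doi(doi):
    """Lower-case a DOI and strip common URL or 'doi:' prefixes."""
    doi = (doi or "").strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi)


def normalize_title(title):
    """Strip accents, case, and punctuation so near-identical titles match."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # \W is Unicode-aware, so titles in non-Latin scripts are preserved.
    return re.sub(r"[\W_]+", " ", text.casefold()).strip()


def study_key(record):
    """Prefer the DOI as identifier; fall back to title plus year."""
    doi = normalize_doi(record.get("doi"))
    if doi:
        return ("doi", doi)
    return ("title", normalize_title(record["title"]), record.get("year"))


def split_new_and_duplicate(search_hits, already_included):
    """Separate search hits into new studies and duplicates of known ones."""
    seen = {study_key(rec) for rec in already_included}
    new, duplicates = [], []
    for hit in search_hits:
        key = study_key(hit)
        (duplicates if key in seen else new).append(hit)
        seen.add(key)
    return new, duplicates


# Placeholder records for illustration only.
included = [{"doi": "https://doi.org/10.1000/ABC", "title": "Study one", "year": 2020}]
hits = [
    {"doi": "10.1000/abc", "title": "Study one (reprint)", "year": 2020},
    {"doi": "", "title": "Thermal tolerance of Émile's fish", "year": 2021},
    {"doi": None, "title": "thermal tolerance of emile s fish!", "year": 2021},
]
new, duplicates = split_new_and_duplicate(hits, included)
print(len(new), len(duplicates))
```

A limitation of this key is that the same study listed once with a DOI and once without will not match, so a manual pass over the remaining records is still needed.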

Data extraction and processing

Data was extracted according to a standardized protocol. Following data extraction, a subset of the data was cross-checked against the source references by three authors. Data was also examined for outliers, and unreasonable values indicative of errors were corrected (for instance, a reported weight of 2 kilograms for invertebrate larvae, which was caused by a missed decimal point). Spelling errors were also checked and corrected. The endpoint was scored variably by different authors, though it was often essentially the same; these scores were unified into simpler categories, with details relegated to the notes. All categorical variables were examined and corrected for consistency; all variables and their categories are described in the full metadata on GitHub (https://github.com/hsbayat/ThermoFresh/tree/main/data/metadata). Once datasets from all authors were compiled and checked for errors, coordinates were used to query the Köppen-Geiger climate region101, elevation, continent, and country of each sampling location. Missing values, primarily due to coastal proximity, were entered manually. Full taxonomic information for each taxon was queried from the Open Tree of Life102, which collates taxonomic information from multiple sources. To do this, Open Tree Taxonomy (OTT) identifiers were queried for all taxon names in the database, then taxonomic classifications were added for each taxon using the OTT identifier. All screening, harmonization, data processing, querying, and plotting were conducted in R v4.3.3 (ref. 103) with the following packages: revtools v0.4.1 (ref. 104), tidyverse v2.0.0 (ref. 105), data.table v1.14.10 (ref. 106), rotl v3.1.0 (ref. 107), taxize v0.9.100 (ref. 108), sf v1.0-15 (ref. 109), elevatr v0.99.0 (ref. 110), geodata v0.5-9 (ref. 111), terra v1.7-65 (ref. 112), fs v1.6.3 (ref. 113), rnaturalearth v1.0.1 (ref. 114), viridis v0.6.5 (ref. 115), patchwork v1.2.0 (ref. 116), and ggridges v0.5.6 (ref. 117).

Data Records

ThermoFresh is split into four tables: thermal tolerance test information, reference information, taxonomic information, and location information. A combined table with all data, and code to combine them, is available on GitHub (https://github.com/hsbayat/ThermoFresh). Code and data are also available at Zenodo118. Full metadata is available in the “metadata” subfolder in the GitHub repository. All data files in the repository are saved as comma-separated values files (.csv). Reference information, including DOI where available, is provided in the reference table (thermtol_reference_final_ch.csv) as well as for each record in the combined table (thermtol_comb_final.csv). Reference information in the original language is also provided.
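
The relationship between the four tables and the combined table can be pictured as a set of key-based joins. The Python sketch below illustrates the idea on tiny inline tables; it is not the repository's own code (which is written in R), and the key columns `ref_id`, `taxon_id`, and `location_id`, along with every other column name and value shown, are placeholders assumed for this example. The actual column names are documented in the metadata folder.

```python
import csv
import io


def read_table(csv_text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def index_by(rows, key):
    """Index rows by a key column, refusing duplicate keys."""
    index = {}
    for row in rows:
        if row[key] in index:
            raise ValueError(f"duplicate {key}: {row[key]}")
        index[row[key]] = row
    return index


def combine(tests, references, taxa, locations):
    """Attach reference, taxon, and location fields to every test record."""
    ref_by_id = index_by(references, "ref_id")
    taxon_by_id = index_by(taxa, "taxon_id")
    location_by_id = index_by(locations, "location_id")
    combined = []
    for test in tests:
        row = dict(test)
        row.update(ref_by_id[test["ref_id"]])
        row.update(taxon_by_id[test["taxon_id"]])
        row.update(location_by_id[test["location_id"]])
        combined.append(row)
    return combined


# Placeholder tables; column names and values are invented for illustration.
tests = read_table("test_id,ref_id,taxon_id,location_id,metric\n"
                   "t1,r1,x1,l1,CTmax\n"
                   "t2,r1,x2,l1,LT50\n")
references = read_table("ref_id,language\nr1,English\n")
taxa = read_table("taxon_id,taxon_name\nx1,Taxon A\nx2,Taxon B\n")
locations = read_table("location_id,continent\nl1,Europe\n")

combined = combine(tests, references, taxa, locations)
print(len(combined), sorted(combined[0]))
```

Keeping the four tables separate avoids repeating reference, taxon, and location details on every record, while the combined table is more convenient for analysis.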

The database includes 6825 records for a total of 931 taxa, 470 invertebrates and 461 fish, from 1082 locations worldwide (Fig. 2). The database contains primarily organisms identified to the species level, with 86 taxa at the genus level and 8 above the genus level. Of the 931 total taxa, 666 reside solely in freshwater, 73 only in brackish waters, and 192 in both. A total of 505 tests recorded thermal tolerance of species in the presence of an additional stressor (e.g., pollutants, pathogen, reduced oxygen, flow velocity, salinity). The inclusion of variables like body size, life stage, coordinates, and environmental conditions allows for a wide range of hypotheses to be tested with this data. The detailed information on test metrics and methodology allows for metrics to be compared; Fig. 3 illustrates the density distributions for each metric. Records can be filtered according to desired features, e.g. a high or low acclimation temperature. Thermal tolerances from different climates can also be compared (Fig. 4a). Non-English languages account for 620 records, with 323 Chinese, 235 Spanish, 34 French, 21 German, and 7 Danish records. Non-English studies contributed most to records tested in arid climates, followed by temperate, tropical, and continental (Fig. 4b). The database features records from 1900 until 2023, which presents the opportunity for comparisons across time, though these may be limited by the scarcity of early data (Fig. 5).
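
As a quick arithmetic check, the counts reported in this paragraph can be verified for internal consistency. The short Python sketch below uses only the numbers stated above; the variable and function names are ours.

```python
def parts_match_total(parts, total):
    """Return True when the category counts add up to the stated total."""
    return sum(parts.values()) == total


TOTAL_RECORDS = 6825
TOTAL_TAXA = 931
TOTAL_NON_ENGLISH_RECORDS = 620

taxa_by_group = {"invertebrates": 470, "fish": 461}
taxa_by_habitat = {"freshwater only": 666, "brackish only": 73, "both": 192}
records_by_language = {"Chinese": 323, "Spanish": 235, "French": 34,
                       "German": 21, "Danish": 7}

print(parts_match_total(taxa_by_group, TOTAL_TAXA))
print(parts_match_total(taxa_by_habitat, TOTAL_TAXA))
print(parts_match_total(records_by_language, TOTAL_NON_ENGLISH_RECORDS))

# Share of all records that come from non-English sources.
non_english_share = TOTAL_NON_ENGLISH_RECORDS / TOTAL_RECORDS
print(f"{non_english_share:.1%}")
```

All three checks hold, and non-English sources account for about 9.1% of all records.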

Fig. 2

A map of the critical thermal maxima (CTmax) values in the database; the color ramp indicates the value in degrees Celsius.

Fig. 3

A comparison of the density distributions of thermal tolerance values in the database for each metric. The points underneath each curve represent the individual values, with n = 460, 409, 692, 4362, 172, 83, 94, and 553 from top to bottom (upper other, upper LTmax, upper LT50, upper CTmax, lower other, lower LTmin, lower LT50, and lower CTmin, respectively). The fill color indicates the tolerance value in degrees Celsius.

Fig. 4

A comparison of all tolerance values by Köppen-Geiger climate region; (a) the distributions of tolerance values within each group, scaled by the sample size, and (b) the fraction of values within each group by source language.

Fig. 5

Histogram of the number of studies per publication year included in the database. Publication dates span from 1900 until 2023.

Technical Validation

Data was extracted from the original literature sources for all studies resulting from literature searches; a note was made when data was found in supplementary information or extracted with Plot Digitizer92. Data was cross-checked with the original study for each record obtained from previous databases. A manual double-check of data entry with reference to the original source was done for 20.8% of all records by at least one additional person. Additionally, the distributions of all numerical variables were visualized in R103, and outliers were manually checked. Outliers were defined as values more than 1.5 times the interquartile range below the first quartile or above the third quartile. Typos in taxon names were corrected, and outdated species names changed to currently accepted names (as of July 2024 in the Open Tree of Life Taxonomy). Where names were changed, the names used in the original source are recorded in the notes column. The OTT identifier, which is included as a variable, allows for names to be updated more easily should they change in the future.
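
The outlier rule stated above can be sketched in a few lines of Python. This illustrates the 1.5 × IQR definition only; the checks for the database were carried out in R. The function name and example values are assumptions. The `"inclusive"` quantile method is chosen here because it uses the same interpolation as R's default quantile type, but the quartile method actually used in the study is not stated.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Made-up body weights in grams; the last value mimics a missed decimal point.
weights_g = [0.8, 1.1, 1.2, 1.3, 1.5, 2000.0]
print(flag_outliers(weights_g))
```

With these example values only the 2000.0 g entry is flagged. As in the workflow described above, a flagged value is a prompt to re-check the source and is not automatically treated as an error.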

All records include a taxon name at the genus or species level, the origin of the test taxon, a thermal tolerance measure, the metric (indicating which methodology was used), the endpoint, the habitat, the location of sampling (or laboratory location for non-wild organisms), variables relating to the location (continent, country, elevation), and reference information (including publication year and language). Over 90% of records include sample size, acclimation information, and the life stage of the organism at the time of the test. Error measures are included for 65% of records; the type of error measure is also noted in all cases where error is included. This information, or lack thereof, can be used to filter the data as needed to investigate various research questions. The variables describing test methods (ramp, duration) allow for integration of the tolerance metrics into mathematical models of the tolerance landscape framework41,44,45. These contextualize tolerance with respect to intensity and duration of thermal stress, and enable extrapolation to field conditions119.

The multi-faceted search strategy, which combined harmonization of previous literature with additional literature searches in multiple languages across three search platforms (Google Scholar, China National Knowledge Infrastructure, and Web of Science), produced a comparatively comprehensive compilation of thermal tolerance for freshwater invertebrates and fish. The concentrated search effort added 299 invertebrate taxa, nearly double the 171 taxa obtained from existing databases. Of 470 freshwater or brackish invertebrates, 322 are insects, representing 0.4% of the currently estimated total number of species93,95. While this may seem low, it is several hundred to thousand-fold more than previous work focused on all taxa47,52,53. The 461 freshwater fish species in the database represent 3% of the currently estimated number of species95.

While we attempted to counteract geographic bias in our approach, Europe and North America represent 65.6% of total records in the database though they make up 23.7% of Earth’s land area. With only English studies included, they would make up 70% of the records. While non-English languages contribute to lessening data gaps (Fig. 2), considerable work remains to fill large gaps in the geographic distribution of data.

When it comes to data gaps, multiple factors, including political ones, are at play. Data may simply be absent from certain areas, due to a lack of access, interest, or resources. In these areas more research and resources are needed to increase data coverage. However, data may also be locked in older studies which are much more difficult to access than more recently published work. For instance, thermal tolerance papers concerned with the deleterious effects of thermal effluents from power plants were prevalent in the time period from 1960 to 1980. While scans of these are common in US archives that are indexed by Google Scholar, the archives of other countries, which almost certainly also had research programs on the topic120, are difficult or impossible to access for non-native researchers. Screening literature in non-English languages requires resources, skill, and collaboration, but can play a key role in countering bias and harnessing information from locations that are least covered.

Given the continual increase in volume of literature on thermal tolerance (Fig. 5), on par with scientific research as a whole121, we expect our work to benefit from an update in the future. The search terms provided can be used to conduct equivalent searches and expand the database. Advances in automated data extraction122,123,124,125 foretell the automation of data synthesis research, which can enhance efforts to update synthetic work amidst ever-increasing output. So far, full automation of data extraction has been conducted for only a few ecological variables122 or in medical research123,124,125. Medical research enforces stringent reporting guidelines for individual studies, which assist data synthesis and allow automated methods to operate effectively. Reporting guidelines have also been introduced for terrestrial respirometry126, climate change genomics127, and landscape ecology128, but have yet to be developed for thermal tolerance studies. A well-curated database such as ours can serve as a reference, benchmark, and training tool in future efforts to continuously synthesize new information on freshwater thermal tolerance.