Background & Summary

Invasive mosquito vectors are a global public health concern because of their capacity to transmit pathogens that cause substantial human mortality and morbidity1. Among these species, some belonging to the Aedes genus, specifically Aedes aegypti and Aedes albopictus, have rapidly expanded their geographical range over the last few decades. Moreover, these two species have been implicated as the main vectors in recent arboviral epidemic outbreaks around the world in both their native and exotic ranges2,3,4.

The substantial public health burden posed by these species has prompted the development of predictive mathematical models designed to deepen our understanding of mosquito population dynamics, vectorial capacity and, ultimately, to enhance our ability to anticipate the risk of arbovirus transmission [e.g.,5,6,7,8,9]. Because many processes underlying vector life cycles and pathogen transmission are highly temperature-dependent10, mosquito-borne disease models increasingly incorporate mechanistic temperature-driven processes. This incorporation of mechanism is aimed at improving our ability to extrapolate predictions of vector seasonality and disease transmission risk across time and space11,12,13,14,15.

Mosquitoes are poikilothermic ectotherms, so the biological rate processes that govern their traits, such as survival, reproduction, and viral transmission rates, are strongly influenced by variation in environmental temperature16,17,18,19,20. However, although laboratory studies have provided valuable information on how temperature influences mosquito traits (e.g., larval development time, extrinsic incubation period), the current knowledge base remains fragmented. Thermal traits of Ae. aegypti and Ae. albopictus are comparatively well studied in some regions of the world17,19, while research on thermal traits of other invasive Aedes species, such as Aedes japonicus and Aedes koreicus, is still in its infancy21,22,23,24,25.

Synthesis of thermal traits that underlie modelling efforts requires that data be readily available in consistent formats. However, published data are often presented in the tables and figures of scientific publications, mostly in summarised formats. Even when data are made available upon publication (as they increasingly are by default as an editorial requirement), file formats and data standards vary between studies26,27. Together, these factors require researchers who wish to synthesise information across studies to invest substantial time and effort in manually extracting the data and converting them into machine–readable formats. In this study, we present AedesTraits, a dataset of systematised temperature–dependent trait observations extracted from the published literature for four Aedes species: Ae. aegypti, Ae. albopictus, Ae. japonicus, and Ae. koreicus. By creating a machine–readable dataset that encompasses multiple species, populations, and experimental settings, this work supports in–depth investigations into the biology of Aedes mosquitoes and provides the broad basis necessary to further develop predictive mechanistic models. Furthermore, it allows the identification of critical gaps in current knowledge, such as the need for more experimental data on understudied species, specific traits, and environmental conditions, guiding future research efforts to fill these gaps. AedesTraits aims to assist the research community by providing a comprehensive basis for advancing our understanding of vector–borne disease risk and supporting the development of outbreak forecasting approaches.

Methods

Literature search

To identify studies for inclusion, we followed the PRISMA [Preferred Reporting Items for Systematic Reviews and Meta–Analyses;28] procedure, a structured approach to conduct and report systematic reviews and meta-analyses, ensuring transparency and consistency between studies. We conducted an extensive global literature search across multiple electronic databases, including Scopus, PubMed, and Web of Science. The last search was performed on January 28th, 2025. The search encompassed published journal articles without restrictions on date or language. We queried each database using Boolean operators with the following terms for each species to limit duplicates:

  • (“Aedes aegypti” OR “Yellow fever mosquito”) AND temperature AND survival AND development

  • (“Aedes albopictus” OR “Tiger mosquito” OR “Stegomyia albopicta”) AND temperature AND survival AND development

  • “Aedes koreicus” AND temperature AND survival AND development

  • “Aedes japonicus” AND temperature AND survival AND development
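The four search strings share the same trait terms and differ only in the species synonyms, so they can be generated from a small mapping. The following is a minimal Python sketch; the names `SPECIES_TERMS`, `TRAIT_TERMS`, and `build_query` are illustrative, and the exact syntax each database accepts (field tags, quoting rules) is not specified here and may differ.

```python
# Species synonyms as listed in the search strings above.
SPECIES_TERMS = {
    "Aedes aegypti": ["Aedes aegypti", "Yellow fever mosquito"],
    "Aedes albopictus": ["Aedes albopictus", "Tiger mosquito", "Stegomyia albopicta"],
    "Aedes koreicus": ["Aedes koreicus"],
    "Aedes japonicus": ["Aedes japonicus"],
}
TRAIT_TERMS = ["temperature", "survival", "development"]


def build_query(synonyms, trait_terms=TRAIT_TERMS):
    """Join species synonyms with OR, then require every trait term with AND."""
    quoted = [f'"{name}"' for name in synonyms]
    # Parentheses are only needed when there is more than one synonym.
    species_part = quoted[0] if len(quoted) == 1 else "(" + " OR ".join(quoted) + ")"
    return " AND ".join([species_part, *trait_terms])


queries = {species: build_query(terms) for species, terms in SPECIES_TERMS.items()}
for query in queries.values():
    print(query)
```

Keeping one query per species, rather than a single combined query, mirrors the per-species searches described above.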

In addition, we manually searched the reference lists of retrieved articles and relevant reviews for potential supplementary studies. The screening process comprised three sequential steps. First, duplicate records were eliminated. Subsequently, articles were screened by three authors based on title, abstract, and keywords, followed by a full-text evaluation to extract pertinent information. The inclusion criteria focused on studies examining the relationship between mosquito traits (e.g., life history, physiological, transmission) and temperature. This encompassed both laboratory and field experiments conducted in diverse experimental settings and using specimens from various populations or geographic origins. To qualify for inclusion, studies had to meet four criteria: (1) they must be laboratory or field experiments, rather than surveillance-based entomological studies; (2) they must report measurable Aedes life-history traits, such as survival, developmental time, or size, as outcomes; (3) temperature must be the main environmental driver investigated; and (4) the data must provide sufficient detail to be digitised and integrated into a machine–readable format.

The initial search across the academic databases yielded a total of 510 studies: Scopus (78), PubMed (205), and Web of Science (227) (Fig. 1; Table 1). After removing duplicates, we screened the titles and abstracts of 324 studies, ultimately selecting 59 for digitisation. In addition, we identified and digitised 68 other studies sourced from Google Scholar and the reference lists of relevant articles. This search process resulted in a total of 127 digitised studies, distributed across species as follows: Aedes aegypti (86), Aedes albopictus (59), Aedes japonicus japonicus (1), and Aedes koreicus (1).
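The counts reported above can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers stated in the text; the variable names are illustrative, and the number of records removed before screening is a value implied by the reported totals rather than one stated explicitly.

```python
# Records returned by each database, as reported in the text.
database_hits = {"Scopus": 78, "PubMed": 205, "Web of Science": 227}
total_hits = sum(database_hits.values())

# Records screened on title and abstract after duplicate removal.
screened_after_deduplication = 324
removed_before_screening = total_hits - screened_after_deduplication  # implied, not stated

# Digitised studies: database selection plus other sources.
selected_from_databases = 59
added_from_other_sources = 68
digitised_total = selected_from_databases + added_from_other_sources

# Per-species counts sum to more than the number of studies because
# some studies investigated more than one species.
studies_per_species = {
    "Ae. aegypti": 86,
    "Ae. albopictus": 59,
    "Ae. japonicus japonicus": 1,
    "Ae. koreicus": 1,
}
species_study_pairs = sum(studies_per_species.values())

print(total_hits, removed_before_screening, digitised_total, species_study_pairs)
```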

Fig. 1

PRISMA flow diagram illustrating the selection process of studies included in AedesTraits following an initial search across three databases (Scopus, PubMed, and Web of Science). Note that some studies investigated more than one species.

Table 1 Number of studies per Aedes species retrieved from each citation database source (Scopus, Web of Science, and PubMed).

Data extraction

We requested raw data directly from the corresponding authors where possible. In cases where no response was received, we manually digitised the data and compiled it into tables. For data presented in figures, where raw data were not available, we used WebPlotDigitizer v4.829 to extract the data and convert it into table format. Throughout the process of building the dataset, we followed the standard format established by the VectorByte initiative (https://www.vectorbyte.org/), which is a global platform for open–access trait [VecTraits;30] and abundance [VecDyn;31] data on disease vectors, alongside tools [e.g., Bayesian thermal performance curve fitting;32] and training for researchers.

The information extracted from the literature includes species, life-history stage, location, GPS coordinates, experimental settings, and rearing conditions. This information was digitised according to the following rules. Specimens reared in laboratory colonies for more than five generations were considered adapted to laboratory conditions and hence different from field populations33. If the coordinates for a specimen’s collection site were unavailable, we used the centroid of the administrative area provided in the study. Additionally, a “coord_precision” field was added to indicate the spatial resolution of the location according to the Database of Global Administrative Areas (GADM), so that users can account for geolocation uncertainty in downstream analyses: (i) ADM3+ (municipality/city, local level), (ii) ADM2 (county/district level), (iii) ADM1 (state level), and (iv) ADM0 (national level, when only the country was reported).
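The “coord_precision” levels form an ordered scale from finest to coarsest, which makes it straightforward to exclude records whose location is too coarse for a given analysis. The following is a minimal Python sketch; the function name, the example records, and the exact spelling of the level labels in the released file are assumptions made for illustration.

```python
# Precision levels from finest to coarsest, as listed in the text.
# How these labels are spelled in the released file is an assumption.
PRECISION_ORDER = ["ADM3+", "ADM2", "ADM1", "ADM0"]


def at_least_as_precise(record, coarsest_allowed):
    """Return True if the record's coord_precision is no coarser than the threshold."""
    level = record.get("coord_precision")
    if level not in PRECISION_ORDER:
        return False  # unknown or missing precision: exclude conservatively
    return PRECISION_ORDER.index(level) <= PRECISION_ORDER.index(coarsest_allowed)


# Hypothetical records, for illustration only.
records = [
    {"locationtext": "site A", "coord_precision": "ADM3+"},
    {"locationtext": "site B", "coord_precision": "ADM1"},
    {"locationtext": "site C", "coord_precision": "ADM0"},
]
kept = [r["locationtext"] for r in records if at_least_as_precise(r, "ADM2")]
print(kept)  # ['site A']
```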

Data Records

AedesTraits34 currently hosts 30,969 rows of temperature–dependent Aedes trait observations. Traits are described through fields such as “originaltraitname” and “originaltraitdef”, which record trait names (e.g., development time) and original definitions (e.g., mean duration of life stage). The values, units, and errors for these traits are stored in “originaltraitvalue”, “originaltraitunit”, “originalerrorpos”, “originalerrorneg” and “originalerrorunit”.

Environmental and experimental contexts are described using fields such as “habitat”, “labfield” (i.e., where the experiment was performed), “ambienttemp”, and “ambientlight”, among others, which capture the surrounding conditions and experimental setup under which the observations were collected. Geographical data are recorded in fields such as “locationtext”, “locationtype” (i.e., whether the specimen comes from a wild or colony strain), “latitude”, and “longitude”. The specific temperatures that individuals were exposed to during experiments, and their units, are stored in the “interactor1temp” and “interactor1tempunit” fields. Fields including “interactor1stage” and “interactor1sex” indicate the life stage (e.g., larval, pupal, adult) and sex (female, male, indeterminate) of the individuals observed during experimentation. When publications studied the effect of temperature and additional variables, the latter are recorded in the “secondstressor” fields. Studies that assessed mosquito traits under fluctuating temperature are included in the database as long as they also include a fixed temperature treatment. In such cases, the fixed temperature treatment is the first stressor and fluctuating temperature is the second stressor. Publication and data lineage are detailed in fields such as “figuretable”, “citation”, and “doi”. The “notes” field holds additional metadata, supporting each record’s completeness and usability.
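As an illustration of how these fields fit together, the Python sketch below groups trait values by exposure temperature for one trait and one life stage. It assumes the data are available as a comma-separated file whose column names match the field names above; the sample rows, their values, and the function name are invented for the example and do not come from the dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows using field names described above; all values are invented.
sample_csv = """originaltraitname,originaltraitvalue,originaltraitunit,interactor1temp,interactor1tempunit,interactor1stage
development time,12.0,days,20,Celsius,larval
development time,7.5,days,25,Celsius,larval
development time,8.1,days,25,Celsius,larval
survival,0.9,proportion,25,Celsius,adult
"""


def values_by_temperature(rows, trait_name, stage):
    """Group trait values by exposure temperature for one trait and life stage."""
    grouped = defaultdict(list)
    for row in rows:
        if row["originaltraitname"] == trait_name and row["interactor1stage"] == stage:
            temperature = float(row["interactor1temp"])
            grouped[temperature].append(float(row["originaltraitvalue"]))
    return dict(grouped)


rows = list(csv.DictReader(io.StringIO(sample_csv)))
larval_dev = values_by_temperature(rows, "development time", "larval")
print(larval_dev)  # {20.0: [12.0], 25.0: [7.5, 8.1]}
```

In practice, the unit fields (“originaltraitunit”, “interactor1tempunit”) should be checked before pooling values, since the dataset preserves units as originally reported.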

Data Overview

AedesTraits holds information on an array of temperature–dependent traits, grouped here for clarity into five broad categories (Table 2), following the guidelines provided in Moretti et al.26: Behaviour, Infection & Transmission, Life History, Morphology, and Physiology. The original trait names, as reported in the studies, were nonetheless kept in the dataset to preserve transparency and facilitate traceability.

Table 2 Summary of trait diversity grouped into five macrocategories, following the classification framework of Moretti et al.26.

Fig. 2A shows the number of distinct trait types documented for each mosquito species across the five functional categories. This count reflects the diversity of traits, not the number of studies, i.e., a single study may contribute data for multiple trait types. For instance, although all traits recorded for Ae. koreicus originate from a single study, they encompass multiple distinct traits within the Life History category23. Overall, most traits are classified under Life History, with Ae. aegypti exhibiting 14 distinct traits and Ae. albopictus, 10. Infection & Transmission traits are also well-represented, with 6 traits for Ae. aegypti and 8 for Ae. albopictus. Morphological, Stress Tolerance & Physiological Performance traits, along with Behaviour traits, are comparatively under-represented. For Ae. japonicus japonicus and Ae. koreicus, a limited number of traits are currently documented, highlighting gaps in available trait information for these species. For illustration purposes, we display here the larval development time for Ae. aegypti and Ae. albopictus, one of the many life-history traits available in AedesTraits (Fig. 2B). The observations exhibit considerable variation in development time across both species, likely reflecting differences in experimental protocols such as temperature, humidity, and resource availability35,36, as well as inherent ecological plasticity and potential local adaptation in Aedes populations [sensu37]. This pronounced variation emphasises the challenge of isolating intrinsic biological traits from external experimental factors and underscores the importance of adopting standardised methodologies to improve cross–study comparability27.

Fig. 2

(A) Number of distinct trait types reported for each mosquito species across five functional categories. Bars represent the diversity of traits reported rather than the number of studies. Note that multiple traits may originate from a single study; (B) larval development time of Ae. aegypti and Ae. albopictus (orange and light blue dots respectively).

During data extraction, when possible, we also categorised the origin of mosquito populations as either derived from laboratory colonies or field collections, based on information reported in the original studies. For Ae. aegypti, most populations originated from field collections (62 instances), while colony populations were used in 30 cases, and one study did not report the origin. Similarly, studies on Ae. albopictus showed a predominance of field–derived populations (73 instances), with colony populations used in 36 instances and two studies with unspecified origin. In contrast, data for Ae. japonicus japonicus and Ae. koreicus are more limited, with only field collections reported (2 and 1 instances, respectively).

An overview of the geographical distribution of collection/experimental sites available in AedesTraits and retrieved for Ae. aegypti and Ae. albopictus is displayed in Fig. 3A. Experimental sites for Ae. aegypti are predominantly concentrated in tropical and subtropical regions. In contrast, Ae. albopictus experimental sites are primarily located in Europe or the global North, as shown in Fig. 3B, reflecting the more temperate range of Ae. albopictus compared to Ae. aegypti. As only two studies met our inclusion criteria for Ae. koreicus and Ae. japonicus, their locations—northern Italy and western Germany, respectively—are not shown. It is striking that, despite their medical importance and widespread distribution, relatively few Aedes populations have been sampled in local areas denoted highly suitable for DENV transmission 3, likely underestimating the degree to which mosquito trait responses to temperature may vary across geographically distinct populations and species38,39.

Fig. 3

(A) DENV transmission suitability Index P from Nakase et al.42 and location of sampled population/experiments of Ae. aegypti and Ae. albopictus (orange and light blue dots, respectively) included in AedesTraits. The P index represents a mechanistic measure of dengue transmission suitability for Ae. aegypti mosquitoes based on temperature and relative humidity. (B) Latitudinal distribution of Ae. aegypti and Ae. albopictus experiments included in AedesTraits. The bars represent the number of digitised studies conducted at different latitudes, illustrating the geographic trends in experimental coverage for both species. (C) Temporal distribution of Ae. aegypti and Ae. albopictus experiments included in AedesTraits. The bars show the number of studies published per year, highlighting temporal trends in research activity across both species.

We retrieved studies spanning nearly a century, with publication years ranging from 1930 to 2024. For Ae. aegypti, studies date back as early as 1930, while for Ae. albopictus, the earliest studies were published from 1969 onwards (Fig. 3C). However, most studies for both species are concentrated from 2000 onwards, reflecting the increased research attention over recent decades. In contrast, studies on Ae. japonicus japonicus and Ae. koreicus are much more recent, first appearing in 2018 and 2019 respectively, consistent with their more recent recognition as invasive vector species.

Technical Validation

All data were verified against original sources. When information was found in reviews or secondary literature, we systematically consulted the original publications to extract and confirm the data. Studies from the same research group or location were included if they represented independent experiments (e.g., different time points, temperature regimes, or treatments). Exact duplicates were excluded, but our aim was to retain biological and methodological variation when distinct data were reported.

Manual input of large volumes of data is likely to introduce errors. To minimise such errors during data entry, each life-history trait variable was checked using frequency histograms, box plots, and/or scatter plots in R40. Any outliers identified in these plots were cross–checked against the source publications, and discrepancies were corrected accordingly.
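The box-plot check described above can be approximated programmatically. The Python sketch below flags values lying beyond 1.5 times the interquartile range from the quartiles, which is the conventional box-plot whisker rule. It is not the R code used for AedesTraits; the threshold, the quartile method, the function name, and the example values are all assumptions made for illustration.

```python
import statistics


def boxplot_outliers(values, k=1.5):
    """Return values lying beyond k * IQR from the quartiles (box-plot whisker rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical development times in days; 75.0 mimics a misplaced decimal point.
development_days = [7.1, 7.4, 7.5, 7.6, 7.8, 8.0, 75.0]
flagged = boxplot_outliers(development_days)
print(flagged)  # [75.0]
```

Flagged values are candidates for cross-checking against the source publication, not automatic deletions, since a genuine biological extreme can also fall outside the whiskers.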

Additionally, to quantify the variability introduced by manual digitisation, we re-digitised a randomly selected subset of approximately 10% of all figures in AedesTraits. Five operators extracted the data using WebPlotDigitizer29, and we compared the results across operators for identical datapoints. The analysis was performed by grouping points according to relevant contextual variables (e.g., sex, temperature, life stage, and additional stressors) to ensure that variability was measured within comparable experimental contexts. For each group, we calculated the coefficient of variation (CV), the robust coefficient of variation (rCV), and the quartile coefficient of dispersion (QDC). The rCV was computed as:

$${\rm{rCV}}=\frac{1.4826\times {\rm{MAD}}}{| {\rm{Median}}| }$$

where MAD is the median absolute deviation. The factor 1.4826 scales the MAD to be comparable to the standard deviation under a normal distribution41. Unlike the conventional CV, the rCV is less sensitive to outliers and skewed distributions, making it particularly appropriate for heterogeneous datasets. The QDC, defined as

$${\rm{QDC}}=\frac{{Q}_{3}-{Q}_{1}}{{Q}_{3}+{Q}_{1}}$$

(where Q1 and Q3 are the first and third quartiles, respectively), provides a robust, unitless measure of dispersion. Across all re-digitised points, we obtained a mean CV of 0.121 (SD = 0.257; median = 0.010), a mean rCV of 0.113 (SD = 0.282; median = 0.005), and a mean QDC of 0.072 (SD = 0.163; median = 0.005). While the low medians indicate high consistency for most points, the large standard deviations reveal that some points exhibited substantial variability among operators. These high-variability cases likely arose from figures with low resolution, overlapping datapoints, unclear axis scaling, or non-standard layouts. Although the median values suggest that the majority of digitised points were extracted consistently, this analysis confirms that manual digitisation inevitably introduces additional variability compared to direct use of raw numerical data. The observed pattern is not unexpected for retrospective data extraction, but it underscores the importance of publishing primary datasets in machine-readable form to improve reproducibility and reduce uncertainty. We did not assess accuracy per se, as true values were unavailable; our metrics therefore focus exclusively on inter-operator consistency.
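The three dispersion metrics defined above can be computed for one group of re-digitised values as in the following Python sketch. It is an illustration under stated assumptions rather than the analysis code used for AedesTraits: the CV uses the sample standard deviation, quartiles use linear interpolation (the "inclusive" method of the `statistics` module), and the five operator readings are invented. Groups with a zero mean, a zero median, or Q1 + Q3 equal to zero would need separate handling.

```python
import statistics

MAD_SCALE = 1.4826  # scales the MAD to be comparable to the SD under normality


def cv(values):
    """Coefficient of variation: sample SD divided by |mean| (sample SD is an assumption)."""
    return statistics.stdev(values) / abs(statistics.mean(values))


def rcv(values):
    """Robust CV: 1.4826 * MAD / |median|, where MAD is the median absolute deviation."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return MAD_SCALE * mad / abs(med)


def qdc(values):
    """Quartile coefficient of dispersion: (Q3 - Q1) / (Q3 + Q1)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / (q3 + q1)


# Hypothetical readings of the same datapoint by five operators.
readings = [10.0, 10.2, 9.8, 10.1, 9.9]
print(round(cv(readings), 4), round(rcv(readings), 4), round(qdc(readings), 4))
```

Because the rCV and QDC depend only on medians and quartiles, a single badly misread point moves them far less than it moves the CV, which is consistent with the gap between the means and medians reported above.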

Usage Notes

Users are advised not to use the dataset for fine-scale spatial modelling, as some locations are approximate and intended solely to indicate general collection regions rather than precise georeferencing. Georeferenced information for laboratory cultures should be treated with particular caution, as it may reflect either the colony location or the original collection site; in either case, such data are not suitable for spatial interpolation.

The dataset preserves the original definitions and measurement protocols of all traits, even when similarly named traits differ in biological meaning across studies. This approach ensures transparency and methodological fidelity, but it also means that users should carefully inspect the trait definition field (“originaltraitdef”) and accompanying metadata before combining data across sources. Apparent duplicates, such as repeated use of the same laboratory strain in different experiments, were retained when the experimental settings differed, as the aim was to capture as much biological variability as possible.
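One way to act on this advice is to group records by both trait name and original definition before pooling, so that similarly named traits with different definitions stay separate. The Python sketch below uses the “originaltraitname” and “originaltraitdef” fields described under Data Records; the example records, including the second definition, are invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical records; the values and the second definition are invented.
records = [
    {"originaltraitname": "development time",
     "originaltraitdef": "mean duration of life stage",
     "originaltraitvalue": 7.5},
    {"originaltraitname": "development time",
     "originaltraitdef": "time to 50% emergence",
     "originaltraitvalue": 9.0},
    {"originaltraitname": "Development time",
     "originaltraitdef": "mean duration of life stage",
     "originaltraitvalue": 8.1},
]


def group_by_definition(rows):
    """Group trait values by (name, definition), ignoring case and outer whitespace."""
    grouped = defaultdict(list)
    for row in rows:
        key = (row["originaltraitname"].strip().lower(),
               row["originaltraitdef"].strip().lower())
        grouped[key].append(row["originaltraitvalue"])
    return dict(grouped)


groups = group_by_definition(records)
for (name, definition), values in groups.items():
    print(name, "|", definition, "|", values)
```

Exact string matching is a conservative starting point: definitions that are worded differently but biologically equivalent will still land in separate groups and need to be merged by hand after inspection.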

Although the focus is on experimental studies where temperature was the main manipulated variable, some records also reflect interactions between temperature and other factors, such as resource availability. Traits measured without concurrent temperature manipulation were excluded. Other environmental drivers, including precipitation and humidity, can also influence mosquito traits but are not directly represented here.

Overall, AedesTraits should be considered a structured compilation of diverse trait data presented within a systematised framework, rather than a fully standardised dataset in which all records are directly comparable without further filtering or transformation. Users should therefore avoid spatial interpolation for approximate locations, carefully review trait definitions, and take into account experimental context before fitting temperature-performance curves or conducting cross-study comparisons.