Abstract
Accurate identification of mosquito species is essential for effective vector control and mitigation of mosquito-borne disease outbreaks. Traditional morphological identification requires highly specialized personnel and is time-consuming, while molecular techniques can be costly and depend on comprehensive genetic information. Wing geometric morphometry has emerged as a promising alternative, leveraging detailed geometric measurements of wing shapes and vein patterns to distinguish between species and detect intraspecies variations. This paper presents a curated dataset of 18,104 mosquito wing images, collected from 10,500 mosquito specimens and annotated with extensive meta-information. It is designed to support research in wing geometric morphometry and the development of machine learning models, ultimately supporting efforts in vector surveillance and research.
Background & Summary
Mosquitoes are the most important arthropod vectors of pathogens worldwide, e.g. for dengue and Zika virus1. Global change, particularly global warming and globalization, facilitates the spread of mosquitoes and their pathogens, increasing the risk of mosquito-borne pathogen transmission in previously unaffected areas2. This highlights the need for mosquito monitoring and surveillance methods to develop early prevention measures. Therefore, it is essential to accurately identify mosquito species as the ecology and vector capacity strongly vary among species3.
Mosquito species are commonly identified using taxonomic keys based on morphological characters4. Morphological species identification can be time-consuming and requires extensive entomological experience, limiting its scalability for large-scale studies5. In addition, it has been widely recognized that entomological expertise is declining as traditional taxonomy becomes less central to the biological curriculum6. Alternative methods (e.g. DNA barcoding) require specialized equipment and technical expertise7, which can be difficult to provide in low-resource settings. These limitations highlight the need for complementary approaches and comprehensive datasets to improve species identification accuracy.
Wing geometric morphometrics offers a promising alternative to traditional identification methods of mosquitoes, as it reliably captures interspecific variations8,9,10. This approach utilizes the coordinates of anatomical features of the wing to analyse shape differences between species. Furthermore, the method can be used to detect subtle intraspecific differences, providing valuable insights into population structure, breeding conditions, or fecundity11,12,13. This makes wing geometric morphometrics a versatile and powerful tool for entomological research, offering a scalable and cost-effective solution for large-scale studies in mosquito biology and vector control. However, the analysis is based on the manual setting of selected landmarks on the mosquito wing image, a time-consuming and observer-biased process14.
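As an illustration of how landmark coordinates can be compared as shapes, the following minimal sketch removes position, size and rotation from two 2D landmark configurations (a partial Procrustes superimposition) and returns the remaining shape distance. It is not the workflow of any cited study: the function names and the five example landmarks are hypothetical, and reflection (e.g. left versus right wings) is not handled.

```python
import cmath
import math


def normalise(landmarks):
    """Centre a 2D landmark configuration and scale it to unit centroid size."""
    pts = [complex(x, y) for x, y in landmarks]
    centroid = sum(pts) / len(pts)
    centred = [p - centroid for p in pts]
    size = math.sqrt(sum(abs(p) ** 2 for p in centred))
    if size == 0:
        raise ValueError("all landmarks coincide; centroid size is zero")
    return [p / size for p in centred]


def procrustes_distance(reference, target):
    """Shape distance after removing translation, scale and rotation."""
    ref = normalise(reference)
    tgt = normalise(target)
    # The rotation minimising the summed squared distances has the same
    # angle as the complex inner product of the two configurations.
    inner = sum(r * t.conjugate() for r, t in zip(ref, tgt))
    rotation = inner / abs(inner) if abs(inner) > 0 else 1
    aligned = [t * rotation for t in tgt]
    return math.sqrt(sum(abs(r - a) ** 2 for r, a in zip(ref, aligned)))


def transform(landmarks, angle_deg, scale, shift):
    """Rotate, scale and translate a configuration (used for the example)."""
    factor = scale * cmath.exp(1j * math.radians(angle_deg))
    moved = [complex(x, y) * factor + complex(*shift) for x, y in landmarks]
    return [(p.real, p.imag) for p in moved]


# Hypothetical landmark coordinates in pixels, not taken from the dataset.
wing_a = [(0, 0), (4, 0), (4, 1), (2, 2), (0, 1)]
wing_b = transform(wing_a, angle_deg=30, scale=2.5, shift=(10, -3))
wing_c = [(0, 0), (4, 0), (4, 2), (2, 2), (0, 1)]

same_shape = procrustes_distance(wing_a, wing_b)   # close to zero
other_shape = procrustes_distance(wing_a, wing_c)  # clearly above zero
```

Representing each landmark as a complex number keeps the rotation step to a single multiplication. Studies with many specimens usually align all configurations to a common mean shape (generalised Procrustes analysis) rather than pairwise.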
Advances in computer vision, utilizing image processing and machine learning techniques, offer new pathways by automating the species and landmark identification process15. Convolutional Neural Networks (CNNs) have demonstrated high potential for accurately identifying mosquito species based on images of either the entire body16,17,18 or solely the wings19,20. For instance, Sauer et al. reported that a CNN model trained on wing images achieved a 91% macro-F1 score in classifying seven Aedes species19. Additionally, segmentation and regression models have been used to automatically identify landmarks in Diptera. For example, Geldenhuys et al. (2023) demonstrated this for tsetse flies (Glossina spp.), training their model on 14,534 pairs of wings of two different species and achieving automated landmark detection with a mean distance error of 3.43 pixels per landmark21.
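For readers unfamiliar with the macro-F1 score mentioned above, the following minimal sketch computes it as the unweighted mean of the per-class F1 scores, so that rare species count as much as common ones. The labels and predictions are hypothetical and do not reproduce any reported result.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for six wing images of three Aedes species.
y_true = ["aegypti", "aegypti", "albopictus", "albopictus", "vexans", "vexans"]
y_pred = ["aegypti", "albopictus", "albopictus", "albopictus", "vexans", "vexans"]

score = macro_f1(y_true, y_pred)  # (2/3 + 4/5 + 1) / 3 = 37/45
```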
Despite these advancements, the public availability of large high-quality datasets remains a bottleneck for the widespread application of geometric morphometrics and machine learning techniques. Existing wing datasets are often limited in size or scope, hindering the development of robust and generalizable models22,23. Herein, we present a dataset24 comprising a comprehensive collection of diverse wing images from 72 mosquito taxa from 12 countries on five continents, collected between 2008 and 2024. This dataset not only supports traditional morphometric studies but also enables the application of advanced machine learning techniques, creating opportunities for new insights and innovations in mosquito surveillance and research. By providing this information to the scientific community, we aim to accelerate the progress of research in mosquito biology, vector control, and disease prevention.
Methods
Mosquito Sampling
The wing dataset is composed of images consolidated from various experimental and field studies. This study consolidates the efforts of 22 research projects conducted from 2008 to 2024, incorporating a total of 10,500 mosquito specimens from 12 countries across 5 continents (Fig. 1). Various methods were used to collect the mosquitoes, including CO2-baited traps (n = 4,614), aspirators (1,630), ovitraps (308), egg-raft collection (1,683), and rearing from breeding facilities (2,113). Most of the sampled mosquitoes were identified as female (n = 9,049), with a smaller proportion being male (1,425). Species identification was primarily conducted using morphological methods (n = 4,658)4 or molecular techniques such as COI/nad4 gene barcoding (1,914)25, ITS2 gene barcoding for the identification of Anopheles species (621)26 and qPCR targeting CQ11 and ACE2 genes to identify the taxa morphologically identified as Culex pipiens s.l./torrentium (2,092)27. The remaining samples were identified based on their association with a laboratory colony (1,839).
Fig. 1: Geographic distribution of images in the dataset. (A) Countries colour-coded by the number of images. (B) Mosquito sampling locations in Europe. (C) Sampling locations in Southeast Asia. We included both Europe and Asia in the figure due to their higher variance in sampling locations compared to Africa and South America.
The complete wing dataset comprises specimens from nine genera: Culex (n = 3,980), Aedes (5,029), Anopheles (1,135), Coquillettidia (141), Culiseta (158), Uranotaenia (1), Armigeres (49), Mansonia (6), and Toxorhynchites (1). Many mosquito species are difficult or impossible to identify based solely on morphology. However, the dataset includes specimens from such groups that were identified using morphological characteristics alone. Thus, to present the taxonomic information in a machine-readable format, we developed a hierarchical system of taxonomic levels that also describes the uncertainties in species identification (Supplementary Table 1). The first level corresponds to the family, the second level to the genus and the fourth level to species. The third taxonomic level encompasses morphologically very similar species pairs (e.g. Ae. communis/Ae. punctor), species groups (e.g. Ae. annulipes group), species complexes (e.g. An. maculipennis s.l.) or combinations of these aggregated taxa (e.g. Cx. pipiens s.l./Cx. torrentium). The species names (fourth taxonomic level) for these specimens were assigned only when the identification was confirmed through molecular assays. Information on subspecies or biotypes is presented under the fifth taxonomic level.
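The five-level scheme described above can be represented as a small record type. The sketch below is illustrative only: the class and field names are hypothetical and do not correspond to the column names in Supplementary Table 1. It shows how the most specific confirmed level can be read off for a specimen.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Taxonomy:
    """Hypothetical record mirroring the five taxonomic levels."""
    family: str                       # level 1
    genus: str                        # level 2
    aggregate: Optional[str] = None   # level 3: pair, group or complex
    species: Optional[str] = None     # level 4: set only if confirmed
    subspecies: Optional[str] = None  # level 5: subspecies or biotype

    def most_specific(self) -> Tuple[int, str]:
        """Return the deepest level that carries a name."""
        levels = [
            (5, self.subspecies),
            (4, self.species),
            (3, self.aggregate),
            (2, self.genus),
            (1, self.family),
        ]
        for level, name in levels:
            if name:
                return level, name
        raise ValueError("record has no taxonomic information")


# Identified by morphology alone: the species level stays empty.
morphology_only = Taxonomy(
    family="Culicidae",
    genus="Culex",
    aggregate="Cx. pipiens s.l./Cx. torrentium",
)

# Same aggregate, with the species confirmed by a molecular assay.
molecular_confirmed = Taxonomy(
    family="Culicidae",
    genus="Culex",
    aggregate="Cx. pipiens s.l./Cx. torrentium",
    species="Cx. torrentium",
)
```

Keeping the unresolved levels empty, rather than guessing a species, lets a classifier be trained at whichever level is reliably labelled.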
Wing preparation and image capture
Wings were removed from mosquitoes with tweezers under a stereo microscope. The wings were placed on a microscope slide and embedded in Euparal (Carl Roth, Karlsruhe, Germany) under a cover slip for long-term storage. Detailed instructions on the wing removal process are provided in the supplementary material (see supplementary material: wing_removal_instructions.pdf). In total, 18,104 images were captured using different stereomicroscopes and a smartphone with an attached macro-lens. Most images (n = 12,462) were captured using the Olympus SZ61 (Olympus, Tokyo, Japan) in conjunction with the Olympus DP23 camera (Olympus, Tokyo, Japan), followed by 3,577 images with the Leica M205c microscope (Leica Microsystems, Wetzlar, Germany) and 1,685 images using an iPhone SE 3rd generation (Apple Inc., Cupertino, USA) in combination with a macro-lens at 24x magnification (Apexel-24XMH, Apexel, Shenzhen, China). The images were captured in TIF format, with resolutions of 3024 × 3024 pixels for smartphone images, 3088 × 2076 pixels for Olympus DP23 images and 2560 × 1920 pixels for images captured using the Leica M205c. Imaging settings were not standardized and thus varied between image collection projects in parameters that were not recorded, e.g. exposure time or lighting conditions (Fig. 2). The images captured with the stereomicroscopes include a scale bar, providing a reference for wing size measurements such as wing length.
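As a sketch of how the scale bar can be used, the following example converts a distance measured in pixels into millimetres. All numbers are hypothetical: the pixel length of the scale bar and the length it represents must be read from each image, since magnification differs between devices and projects.

```python
import math


def pixels_to_mm(length_px, scale_bar_px, scale_bar_mm):
    """Convert a pixel distance to millimetres using the image's scale bar."""
    if scale_bar_px <= 0:
        raise ValueError("scale bar length in pixels must be positive")
    return length_px * scale_bar_mm / scale_bar_px


# Hypothetical measurement: two wing landmarks given as (x, y) pixel positions.
wing_base = (200.0, 1000.0)
wing_tip = (2000.0, 1000.0)
length_px = math.dist(wing_base, wing_tip)

# Hypothetical scale bar: 1 mm spans 500 pixels in this image.
wing_length_mm = pixels_to_mm(length_px, scale_bar_px=500.0, scale_bar_mm=1.0)
```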
Data Records
The image dataset was uploaded to the BioImage Archive and published under a CC-BY 4.0 license (S-BIAD1478, https://doi.org/10.6019/S-BIAD1478)24. Images are organized in a hierarchical folder structure with separate folders per 2nd taxonomic level, i.e. genus. Comprehensive metadata (e.g., sampling location or image capture device) is provided for each image (Supplementary Table 2). The images are named according to the following template:
<2. Taxonomic Level > _ <project> _ <sex> _ <wing-side> _ <image_id>.TIF
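A file name following this template can be split back into its fields. The sketch below is a hedged example, not a tool shipped with the dataset: it assumes the fields are separated by underscores without surrounding spaces, that only the project name may itself contain underscores, and that the example file name (including the values for sex, wing side and image ID) is hypothetical.

```python
from typing import NamedTuple


class WingImageName(NamedTuple):
    genus: str      # 2nd taxonomic level
    project: str
    sex: str
    wing_side: str
    image_id: str


def parse_image_name(filename: str) -> WingImageName:
    """Split '<genus>_<project>_<sex>_<wing-side>_<image_id>.TIF' into fields."""
    stem, dot, extension = filename.rpartition(".")
    if not dot or extension.upper() not in ("TIF", "TIFF"):
        raise ValueError(f"not a TIF file name: {filename!r}")
    parts = [part.strip() for part in stem.split("_")]
    if len(parts) < 5:
        raise ValueError(f"expected at least 5 fields: {filename!r}")
    # Read the first field and the last three; whatever remains in the
    # middle is treated as the project name.
    sex, wing_side, image_id = parts[-3:]
    project = "_".join(parts[1:-3])
    return WingImageName(parts[0], project, sex, wing_side, image_id)


# Hypothetical file name built from the template.
parsed = parse_image_name("Aedes_CNN-study_female_left_000123.TIF")
```

Reading the fixed fields from both ends keeps the parser tolerant of project names with extra separators; the metadata in Supplementary Table 2 remains the authoritative source for each image.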
Technical Validation
Mosquito taxa were mostly identified using one of two approaches: by trained taxonomists with a dichotomous taxonomic key4 or through established molecular techniques such as gene barcoding or taxa-specific PCRs as described in the Methods section25,26,27.
The validity of the mosquito wing images has been demonstrated in previous studies. Specifically, several projects within this dataset, such as “landmark-mosquito-identification”8, “aegypti-diversity-study”28 and “landmark-aedes-collection”9 have shown that the wing images exhibit species-specific characteristics useful for wing geometric morphometrics for species identification or the analysis of population structure. Moreover, the dataset has proven valuable for deep learning applications: Sauer et al.19 (Project: CNN-study) and Nolte et al.20 (Project: ConVector) successfully trained CNN models to identify mosquito species. In addition, Maciel-de-Freitas et al. (Project: aegypti-Wolbachia-study) demonstrated the utility of these image data to analyse the relationship between wing measurements, such as wing length and shape, and mosquito fitness parameters, such as fecundity, in Ae. aegypti11.
Despite extensive efforts to ensure comprehensive data collection, some metadata entries remain incomplete. Capture location is unavailable for 7.9% of samples, capture method is missing for 1.4%, and precise collection dates are missing for 24.1% of wing samples. Notably, the majority of samples lacking a collection date (75.5%) were derived from laboratory colonies.
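Users who want to check metadata completeness for their own subset can compute the share of empty entries per field. The following minimal sketch uses only the Python standard library. The column names and the four example rows are hypothetical and do not reflect the actual layout of the metadata table.

```python
import csv
import io


def missing_share(rows, column):
    """Percentage of rows in which the given column is empty or absent."""
    rows = list(rows)
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if not (row.get(column) or "").strip())
    return 100.0 * missing / len(rows)


# Hypothetical metadata excerpt; real column names may differ.
example_csv = """image_id,capture_location,capture_method,collection_date
1,Hamburg,CO2-baited trap,2019-07-02
2,,aspirator,2020-08-15
3,laboratory colony,rearing,
4,Cotonou,ovitrap,
"""

rows = list(csv.DictReader(io.StringIO(example_csv)))
share_without_location = missing_share(rows, "capture_location")
share_without_date = missing_share(rows, "collection_date")
```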
Additionally, 48.4% of the wing images in this dataset originate from unpublished projects that adhered to data collection standards consistent with those in previously published studies. The remaining 9,433 images were associated with eight different publications8,9,11,19,20,29,30,31, of which half provided the images as part of their publication32,33,34,35.
Usage Notes
This dataset supports the development and testing of geometric morphometric methods and machine learning models. However, users should be aware of certain limitations. The images were not captured under standardized conditions, resulting in significant variation across projects, such as differences in lighting and background (Fig. 2). This is due to the retrospective nature of data collection. Additionally, the dataset includes images of damaged wing samples (n = 598), which lack certain morphological features.
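Because imaging conditions vary between projects, one possible way to test whether a model generalizes is to hold out entire projects for evaluation and to exclude damaged wings where complete morphology is required. The sketch below illustrates this idea. It is a suggestion rather than a procedure used by the authors, and the record keys (`project`, `damaged`) are hypothetical.

```python
def split_by_project(records, test_projects, drop_damaged=True):
    """Assign whole projects to either the training or the test subset."""
    train, test = [], []
    for record in records:
        if drop_damaged and record.get("damaged"):
            continue  # skip wings that lack morphological features
        if record["project"] in test_projects:
            test.append(record)
        else:
            train.append(record)
    return train, test


# Hypothetical image records; project names are taken from the text above.
records = [
    {"image_id": "1", "project": "CNN-study", "damaged": False},
    {"image_id": "2", "project": "CNN-study", "damaged": True},
    {"image_id": "3", "project": "ConVector", "damaged": False},
    {"image_id": "4", "project": "ConVector", "damaged": False},
]

train, test = split_by_project(records, test_projects={"ConVector"})
```

With this kind of split, no project contributes images to both subsets, so differences in lighting or background cannot be exploited as a shortcut during evaluation.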
As further image data are collected for mosquitoes and other dipteran vectors, we plan to expand this dataset regularly. We also invite contributions from the scientific community to enhance this growing collection of wing images by contacting the authors.
Code availability
No custom code was used in this study. All images are available in their original state without any processing applied.
References
1. Franklinos, L. H. V., Jones, K. E., Redding, D. W. & Abubakar, I. The effect of global change on mosquito-borne disease. Lancet Infect. Dis. 19, e302–e312 (2019).
2. Kraemer, M. U. G. et al. Past and future spread of the arbovirus vectors Aedes aegypti and Aedes albopictus. Nat. Microbiol. 4, 854–863 (2019).
3. Ferraguti, M. Mosquito species identity matters: unraveling the complex interplay in vector-borne diseases. Infect. Dis. 56, 685–696 (2024).
4. Becker, N. et al. Mosquitoes: Identification, Ecology and Control (Springer, Cham, 2020).
5. Farlow, R., Russell, T. L. & Burkot, T. R. Nextgen Vector Surveillance Tools: sensitive, specific, cost-effective and epidemiologically relevant. Malar. J. 19, 432 (2020).
6. Wilkerson, R. C., Linton, Y.-M. & Strickman, D. Mosquitoes of the World. https://doi.org/10.1353/book.79680 (Johns Hopkins University Press, 2021).
7. Piper, A. M. et al. Prospects and challenges of implementing DNA metabarcoding for high-throughput insect surveillance. GigaScience 8, giz092 (2019).
8. Sauer, F. G. et al. Geometric morphometric wing analysis represents a robust tool to identify female mosquitoes (Diptera: Culicidae) in Germany. Sci. Rep. 10, 17613 (2020).
9. Sauer, F. G. et al. Using geometric wing morphometrics to distinguish Aedes japonicus japonicus and Aedes koreicus. Parasit. Vectors 16 (2023).
10. Martinet, J.-P. et al. Wing Morphometrics of Aedes Mosquitoes from North-Eastern France. Insects 12, 341 (2021).
11. Maciel-de-Freitas, R. et al. Wolbachia strains wMel and wAlbB differentially affect Aedes aegypti traits related to fecundity. Microbiol. Spectr. 12, e00128–24 (2024).
12. Lorenz, C. et al. Geometric morphometrics in mosquitoes: What has been measured? Infect. Genet. Evol. 54, 205–215 (2017).
13. Phanitchat, T. et al. Geometric morphometric analysis of the effect of temperature on wing size and shape in Aedes albopictus. Med. Vet. Entomol. 33, 476–484 (2019).
14. Dujardin, J.-P. A., Kaba, D. & Henry, A. B. The exchangeability of shape. BMC Res. Notes 3, 266 (2010).
15. Høye, T. T. et al. Deep learning and computer vision will transform entomology. Proc. Natl. Acad. Sci. 118, e2002545117 (2021).
16. Goodwin, A. et al. Mosquito species identification using convolutional neural networks with a multitiered ensemble model for novel species detection. Sci. Rep. 11, 13656 (2021).
17. Zhao, D. et al. A Swin Transformer-based model for mosquito species identification. Sci. Rep. 12, 18664 (2022).
18. Couret, J. et al. Delimiting cryptic morphological variation among human malaria vector species using convolutional neural networks. PLoS Negl. Trop. Dis. 14, e0008904 (2020).
19. Sauer, F. G. et al. A convolutional neural network to identify mosquito species (Diptera: Culicidae) of the genus Aedes by wing images. Sci. Rep. 14, 3094 (2024).
20. Nolte, K. et al. Robust mosquito species identification from diverse body and wing images using deep learning. Parasit. Vectors 17, 372 (2024).
21. Geldenhuys, D. S. et al. Deep learning approaches to landmark detection in tsetse wing images. PLOS Comput. Biol. 19, e1011194 (2023).
22. Virginio, F. et al. WingBank: A Wing Image Database of Mosquitoes. Front. Ecol. Evol. 9 (2021).
23. Cannet, A. et al. An annotated wing interferential pattern dataset of dipteran insects of medical interest for deep learning. Sci. Data 11, 4 (2024).
24. Nolte, K. et al. Dataset: Comprehensive Mosquito Wing Image Repository for Advancing Research on Geometric Morphometric- and AI-Based Identification. BioImage Archive https://doi.org/10.6019/S-BIAD1478 (2024).
25. Fang, Y., Shi, W.-Q. & Zhang, Y. Molecular phylogeny of Anopheles hyrcanus group (Diptera: Culicidae) based on mtDNA COI. Infect. Dis. Poverty 6, 61 (2017).
26. Lühken, R. et al. Distribution of individual members of the mosquito Anopheles maculipennis complex in Germany identified by newly developed real-time PCR assays. Med. Vet. Entomol. 30, 144–154 (2016).
27. Rudolf, M. et al. First Nationwide Surveillance of Culex pipiens Complex and Culex torrentium Mosquitoes Demonstrated the Presence of Culex pipiens Biotype pipiens/molestus Hybrids in Germany. PLOS ONE 8, e71832 (2013).
28. Hounkanrin, G. et al. Genetic diversity and wing geometric morphometrics among four populations of Aedes aegypti (Diptera: Culicidae) from Benin. Parasit. Vectors 16, 320 (2023).
29. Sauer, F. G., Timmermann, E., Lange, U., Lühken, R. & Kiel, E. Effects of Hibernation Site, Temperature, and Humidity on the Abundance and Survival of Overwintering Culex pipiens pipiens and Anopheles messeae (Diptera: Culicidae). J. Med. Entomol. 59, 2013–2021 (2022).
30. Keirsebelik, M. S. G. et al. Dengue Virus Serotype 1 Effects on Mosquito Survival Differ among Geographically Distinct Aedes aegypti Populations. Insects 15, 393 (2024).
31. Şuleșco, T., Sauer, F. G. & Lühken, R. Update on the distribution of Anopheles maculipennis s.l. members in the Republic of Moldova with the first record of An. daciae. Preprint at https://doi.org/10.21203/rs.3.rs-4747160/v1 (2024).
32. Nolte, K. et al. Data from: Robust mosquito species identification from diverse body and wing images using deep learning. 4845261526 bytes Dryad https://doi.org/10.5061/DRYAD.B8GTHT7MX (2024).
33. Sauer, F. et al. Data from: A convolutional neural network to identify mosquito species (Diptera: Culicidae) of the genus Aedes by wing images. 9952586077 bytes Dryad https://doi.org/10.5061/DRYAD.VX0K6DJZ9 (2024).
34. Sauer, F. G. et al. Data from: Using geometric wing morphometrics to distinguish Aedes japonicus japonicus and Aedes koreicus. 5320199560 bytes Dryad https://doi.org/10.5061/DRYAD.ZCRJDFNJN (2023).
35. Sauer, F. G. et al. Data from: Geometric morphometric wing analysis represents a robust tool to identify female mosquitoes (Diptera: Culicidae) in Germany. 3688850492 bytes Dryad https://doi.org/10.5061/DRYAD.ZS7H44J5S (2020).
Acknowledgements
This project is funded through the Federal Ministry of Education and Research of Germany (grant number 01Kl2022), the PAN-ASEAN Coalition for Epidemic and Outbreak Preparedness (PACE-UP; German Academic Exchange Service (DAAD) Project ID: 57592343), the Federal Ministry of Health (ZMI1-2521NIK400), the Deutsche Forschungsgemeinschaft (DFG) (grant numbers MA 9541/1-1 and SCHM 2413/9-1), the Federal Office for Agriculture and Food (BLE) (FKZ 2819113519), the Fundação Carlos Chagas Filho de Amparo à Pesquisa no Estado do Rio de Janeiro (grant number E-14/2019), Heinrich-Böll-Stiftung grants for doctoral students of the German Federal Ministry of Education and Research, GIZ (Deutsche Gesellschaft für Internationale Zusammenarbeit; grant number 81281913), the German Center for Infection Research, and the Federal Ministry for the Environment, Nature Conservation, Building and Nuclear Safety (BMUB) through the Federal Environment Agency (UBA) (FKZ 3721484020).
Funding
Open Access funding enabled and organized by Projekt DEAL.
Author information
Contributions
K.N., F.G.S., R.L. wrote the manuscript. K.N. harmonized all data entries. All authors collected the original data. All authors revised and corrected the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Nolte, K., Agboli, E., Garcia, G.A. et al. Comprehensive Mosquito Wing Image Repository for Advancing Research on Geometric Morphometric- and AI-Based Identification. Sci Data 12, 715 (2025). https://doi.org/10.1038/s41597-025-05043-3