Abstract
Reliable identification of teleost fishes is essential for understanding their biology and conserving biodiversity. To support this, a comprehensive DNA barcode reference library was developed for Aurora, Cagayan, and Zambales, Luzon, Philippines. A total of 1,513 specimens were collected from 27 sampling sites, including fish markets, landing areas, and freshwater habitats, and analyzed using COI (707 sequences) and 12S rRNA gene (343 sequences) markers. The dataset identified 323 fish species across 187 genera, 83 families, and 37 orders, including nine newly recorded species for the Philippines and nine deep-sea species (>200 meters). Additionally, 29 newly barcoded taxa were deposited in GenBank, with sequences available for COI, 12S rRNA gene, or both. The expansion of the 12S rRNA gene sequence library enhances its utility as an alternative genetic tool, particularly for environmental DNA (eDNA) studies. This reference database serves as a valuable resource for species identification, biodiversity assessments, and sustainable fisheries management in North Luzon, Philippines.
Background & Summary / Introduction
Species identification is fundamental to biological research and conservation, enabling biodiversity assessment, ecological studies, and effective conservation strategies1,2. Accurate identification supports monitoring species abundance, restoring habitats, understanding community dynamics, and ensuring food safety while combating food fraud3,4,5. Incorrect species identification can result in flawed conclusions, wasted resources, compromised research and misdirected conservation efforts, thereby hindering biodiversity understanding and ecosystem management1,2. Morphological methods once dominated species identification, especially in taxonomy, but require extensive expertise, particularly for specialized groups like fish6,7. The shortage of taxonomists highlights the need for alternative, accurate identification methods8,9.
DNA barcoding has revolutionized species identification, offering a reliable, standardized approach that can separate species distinguished by only subtle morphological differences and can reveal intraspecific variation10. Its advantages are particularly valuable in biodiversity-rich marine habitats, supporting fisheries management, species discovery, wildlife trafficking prevention, and the detection of marketplace mislabeling11,12,13,14. Various genetic markers aid in species identification, with the cytochrome c oxidase subunit 1 (COI) gene being widely recognized for its high species-level resolution, especially in fish3,15,16.
The 12S rRNA gene is also an effective alternative for fish identification through eDNA metabarcoding, although occasional ambiguities can arise due to the short DNA fragment length (<200 bp), which is a result of the specific primers commonly used in eDNA studies17,18,19. Despite this, these markers reliably distinguish most species and genera within the same family. Additionally, DNA barcoding has emerged as a key tool to monitor biodiversity, combat wildlife trade, and guide conservation efforts20,21,22.
The Philippines, dubbed the “global center of marine biodiversity,” lies within the Indo-Pacific Coral Triangle, the epicenter of reef fish diversity22,23,24. With over 3,700 marine species and 374 freshwater species documented, its waters are vital to global fish diversity and national food security, supporting ecological balance, sustaining coastal livelihoods, enhancing fishery resource availability, and contributing to national nutrition and self-sufficiency22,25,26. However, overfishing, habitat loss, and pollution threaten fisheries and biodiversity27,28.
North Luzon hosts key fisheries hotspots (Aurora, Cagayan, and Zambales) renowned for their rich fish diversity and significant contributions to the Philippine fishing industry29,30,31. Further research is essential for sustainable fisheries and effective conservation. DNA barcoding, particularly using the COI gene, has proven invaluable for fish species identification in the Philippines32,33,34, while the 12S rRNA gene enhances eDNA metabarcoding analyses of species diversity and the detection of invasive species17,18,19.
Our dataset, focusing on DNA barcoding of teleost fishes in Aurora, Cagayan, and Zambales, was compiled from 1,513 voucher specimens collected between January and June 2023 (Fig. 1, Table 1). Samples were obtained from fish markets, landing sites, and rivers across these three provinces of the North Luzon Islands. A total of 707 COI and 343 12S rRNA gene sequences were successfully generated, representing 323 species, 187 genera, 83 families, 37 orders, and one infraclass. A summary of species per family with GenBank accession numbers is presented in Table 2, while detailed sample information is provided in Tables S1a–S1c. However, 11 specimens were identified only to the genus level (Table 2) for one or more of the following reasons: (1) species lacking corresponding BLAST results with genetic similarity equal to or greater than 98%; (2) species that were challenging to identify based on morphology; and (3) species that may represent potentially undescribed taxa requiring further investigation.
Fig. 1. Map of the 27 sampling sites in the provinces of Aurora (blue), Cagayan (red), and Zambales (green). The large circle represents two adjacent sampling areas: two fish markets in Cagayan and two river sites in Aurora. Details of the sampling localities are given in Table 1.
Specimens were either purchased directly from markets and landing sites or collected with the assistance of local fishermen to capture freshwater species diversity. Each specimen was sequenced for a ~650 bp fragment of the COI gene and a ~170 bp fragment of the 12S rRNA gene (before trimming) to build a barcode reference library linked to voucher specimens. Our dataset also constitutes the most comprehensive regional DNA reference library of bony fishes, including voucher specimens, collection data, color photographs (Figure S1), and genetic samples for future analyses. Voucher specimens were identified through combined DNA sequencing and morphological examination, ensuring accurate taxonomic classification, even for taxa lacking COI or 12S rRNA gene entries in GenBank (see Methods and Technical Validation).
Among the 83 documented families, eight exhibited particularly high species diversity (≥10 species) (Fig. 2a): Lutjanidae (30 species), Oxudercidae (25 species), Carangidae (23 species), Acanthuridae (17 species), Epinephelidae (16 species), Labridae (13 species), Lethrinidae (11 species), and Nemipteridae (10 species). The dataset also includes 29 taxa from 16 families that were newly sequenced for COI, the 12S rRNA gene, or both in GenBank and BOLD (Fig. 2b; Table 2). We further analyzed the most speciose families separately for each province: Aurora, Cagayan, and Zambales (Fig. 3a–c). Marine and freshwater species distributions are illustrated in Figs. 2a and 3a–c to highlight habitat differences and sampling effort. These figures distinguish species collected from marine environments (e.g., coastal markets and landing sites) from those obtained in freshwater habitats (e.g., rivers), underscoring both the ecological diversity of the sampled species and the breadth of our field surveys.
Fig. 2. Species diversity included in the DNA barcoding of teleosts in Aurora, Cagayan, and Zambales. (a) Number of species per family across all sampling sites; families with only one barcoded species are shown to the right of the graph. (b) Number of newly sequenced species in the dataset. Black bars represent species found in marine habitats, gray bars represent freshwater species, and light gray bars represent species found in both marine and freshwater habitats.
Fig. 3. Number of species per family in (a) Aurora, (b) Cagayan, and (c) Zambales. Black bars represent species found in marine habitats, gray bars represent freshwater species, and light gray bars represent species found in both marine and freshwater habitats. Families represented by only one species are not included.
Our dataset also includes nine newly recorded species for Philippine waters, representing seven teleost families (Table 3). Moreover, it significantly expands the documented fish fauna by barcoding and identifying nine deep-sea taxa (Table 4), adding 78 taxa not recorded by ref. 35 (Table 2), and supplying 12S rRNA gene data for all species.
Most species in the dataset are classified as Least Concern (78.64%) on the IUCN Red List. A small proportion fall in threatened or near-threatened categories: Vulnerable (1.55%), Near Threatened (0.62%), and Endangered (0.31%). Additionally, 4.95% of species are listed as Data Deficient, and 45 species (13.93%) have not yet been evaluated.
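The percentages above can be checked for internal consistency with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' workflow: it assumes each reported percentage is a species count divided by the 323 species in the dataset, and the per-category counts it prints are back-calculated under that assumption rather than taken from the paper.

```python
# Back-calculate species counts from the reported IUCN percentages.
TOTAL_SPECIES = 323   # species in the dataset (from the text)
NOT_EVALUATED = 45    # reported in the text as a count, not a percentage

reported_pct = {
    "Least Concern": 78.64,
    "Vulnerable": 1.55,
    "Near Threatened": 0.62,
    "Endangered": 0.31,
    "Data Deficient": 4.95,
}

# Assumption: each percentage equals count / 323, rounded to two decimals,
# so rounding pct * 323 / 100 recovers the (inferred) count.
counts = {cat: round(pct / 100 * TOTAL_SPECIES) for cat, pct in reported_pct.items()}
counts["Not Evaluated"] = NOT_EVALUATED

for cat, n in counts.items():
    print(f"{cat:16s} {n:4d}  {100 * n / TOTAL_SPECIES:6.2f}%")
print("Total species accounted for:", sum(counts.values()))
```

Under this assumption the six categories account for every species in the dataset, which suggests the reported figures are mutually consistent.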
Our genetic diversity analysis using COI and 12S rRNA gene amplicons showed increasing K2P genetic distances with higher taxonomic levels. For COI, the mean distances were 0.45 ± 0.03% (intra-specific), 14.50 ± 0.23% (intra-generic), 19.87 ± 0.18% (intra-family), 23.83 ± 0.33% (intra-order), and 25.10 ± 0.08% (intra-class) (Fig. 4a, Table 5). For the 12S rRNA gene, a similar trend of increasing distance with higher taxonomic level was observed; however, intra-specific distances could not be calculated because only one sample was sequenced per species. The mean distances were 4.92 ± 0.21% (intra-generic), 14.99 ± 0.40% (intra-family), 19.82 ± 0.83% (intra-order), and 23.84 ± 0.26% (intra-class) (Fig. 4b, Table 6). A neighbor-joining (NJ) tree reconstructed from 707 COI sequences confirmed monophyly for all species. The NJ tree was generated using optimal substitution models, with 1,000 bootstrap replications and outgroup sequences (Figures S2–S3), following ref. 36.
Fig. 4. Distribution of K2P genetic distances at different taxonomic levels for (a) COI, based on 501 bp of 678 sequences (specimens identified only to genus level were excluded), and (b) the 12S rRNA gene, based on 146 bp of 341 sequences (two sequences were excluded for being too short after trimming). K2P genetic distances were used strictly as a technical validation tool to support the reliability of the barcode data and the presence of a barcode gap.
DNA barcoding of teleost fishes in Aurora, Cagayan, and Zambales expanded the Philippine DNA barcode library for both COI and the 12S rRNA gene. The 12S rRNA gene, an emerging marker for species identification, is increasingly used in eDNA metabarcoding for species surveys. This barcode library of teleost fishes (Teleostei) serves as a checklist and reference library, improving knowledge of species diversity and distribution. It also aids in verifying the identity of fish specimens collected from key sampling areas, supporting accurate documentation of ichthyofaunal diversity.
Methods
Collection of samples
Sample collection was coordinated with the DA-Bureau of Fisheries and Aquatic Resources (BFAR) Regional Offices 02 (Tuguegarao City, Cagayan) and 03 (City of San Fernando, Pampanga), Philippines. Field sampling was conducted over 24 days (January 10–20 and June 13–25, 2023), with necessary collection and transport permits secured from BFAR Central Office and Regions 2 and 3. Marine fish specimens were obtained from 12 fish markets and landing sites, while freshwater species were collected with the assistance of local fisherfolk from 15 rivers across North Luzon (Fig. 1, Table 1). Each specimen was promptly documented, with fin clips or tissue samples preserved in 95% ethanol. Voucher specimens were fixed in 10% formalin and gradually transferred to 70% ethanol for long-term preservation. Samples from Aurora, Cagayan, and Zambales were transported to the Laboratory of Ichthyological Systematics and Evolution, Department of Oceanography, National Sun Yat-sen University, Taiwan (DOS), for DNA extraction. An initial species checklist was created based on morphological assessment of voucher specimens and photographs, guiding the selection of up to two samples per species for genetic analysis.
DNA Extraction, amplification and sequencing
DNA extraction was performed from fin clip samples (1–2 per species) using the GeneMark DNA Purification Kit (GMbiolab, Taiwan), following the manufacturer’s protocol. Partial mitochondrial COI and 12S rRNA gene fragments were amplified via PCR using fish-specific primers for the 12S rRNA gene37 and for COI38. Each PCR reaction had a total volume of 25 µL, containing 9.5 µL of ultrapure water, 12.5 µL of Zeju PCR Master Mix (2X), 1.0 µL each of forward and reverse primers, and 1.0 µL of extracted DNA. PCR amplification conditions for the COI gene were initial denaturation at 94 °C for 5 min; 35 cycles of denaturation at 94 °C for 1 min, annealing at 55 °C for 45 s, and extension at 72 °C for 1 min; followed by a final extension at 72 °C for 10 min. For the 12S rRNA gene, the thermal profile included an initial denaturation at 95 °C for 1 min; 35 cycles of denaturation at 95 °C for 15 s, annealing at 60 °C for 15 s, and extension at 72 °C for 15 s; followed by a final extension at 72 °C for 25 min. Amplification success was assessed via electrophoresis on a 1.5% agarose gel. Successfully amplified products were purified using the SAP-EXO Purification Kit (Jena Bioscience, Jena, Germany). Purified PCR amplicons were submitted to Genomics Sci & Tech (Taipei, Taiwan) for sequencing. Resulting sequences were checked and manually edited, if necessary, using Molecular Evolutionary Genetics Analysis software (MEGA version 11)39. COI sequences were translated into amino acids to detect potential indels or stop codons indicative of nuclear mitochondrial pseudogenes36. All sequences have been submitted to GenBank, with corresponding accession numbers and specimen metadata detailed in Tables S1a–S1c. Additional information, including collection location, depth, date, and method, has been provided both in the FASTA headers and in the supplementary tables.
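The stop-codon screen described above can be illustrated with a short, self-contained sketch. This is not the authors' procedure (they inspected translations in MEGA 11); it is a minimal example that assumes the vertebrate mitochondrial genetic code (NCBI translation table 2), in which TAA, TAG, AGA, and AGG act as stop codons. The function names and the toy sequences are hypothetical.

```python
# Stop codons under the vertebrate mitochondrial code (NCBI table 2).
MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def internal_stops(seq: str, frame: int) -> list[int]:
    """Return 0-based nucleotide positions of stop codons in one reading frame.

    `seq` is an ungapped nucleotide sequence; `frame` is 0, 1 or 2.
    """
    seq = seq.upper()
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in MITO_STOPS]


def open_frames(seq: str) -> list[int]:
    """Forward-strand frames (0-2) that contain no stop codon."""
    return [frame for frame in range(3) if not internal_stops(seq, frame)]


# Toy sequences (hypothetical, far shorter than a real ~650 bp COI fragment).
clean = "ATGGCACTAAGCCTTCTT"    # frame 0 reads through without a stop
suspect = "ATGGCATAAAGCCTTCTT"  # frame 0 hits TAA at position 6

print(open_frames(clean))
print(internal_stops(suspect, 0))
```

For a genuine COI barcode, the expected reading frame should appear among the open frames; a fragment with stop codons in every frame would be a candidate pseudogene or sequencing error and would merit re-inspection of the chromatogram.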
Failed or low-quality sequences were managed through a series of remedial steps. For the COI gene, two primer sets (F1R1 and F2R2) were employed; when amplification with one set was unsuccessful, the alternative set was used, accompanied by optimization of PCR conditions. Samples that failed to amplify with both primer sets were excluded from subsequent analyses. For the 12S rRNA gene, a single primer pair was utilized. In instances of amplification failure, PCR conditions were similarly optimized; however, if amplification remained unsuccessful, those samples were likewise omitted from further analysis.
Data Analyses
Chromatogram signals were examined using MEGA version 1139, with manual trimming as needed. Species identification was verified using NCBI BLAST (https://blast.ncbi.nlm.nih.gov) and the Barcode of Life Database (BOLD; https://www.boldsystems.org/), accepting matches with ≥98% sequence similarity36. COI and 12S rRNA gene sequences were compared against the GenBank and BOLD databases to generate a species list. Phylogenetic analysis was conducted in MEGA 11. For specimens with conflicting morphological and molecular identifications or multiple high-similarity matches in online databases, identification was verified using voucher specimens, color photographs, taxonomic keys, and published literature (e.g., ref. 40). The online tools Automatic Barcode Gap Discovery (ABGD) (https://bioinfo.mnhn.fr/abi/public/abgd/abgdold.html) and Assemble Species by Automatic Partitioning (ASAP) (https://bioinfo.mnhn.fr/abi/public/asap/asapold.html) were applied solely as supplementary tools to support the consistency, quality, and reliability of the barcode dataset.
All manually trimmed sequences were uploaded to these platforms to assess molecular divergence and identify candidate species clusters. Taxonomic classification followed refs. 41,42. The final species list was cross-referenced with ref. 25 and recent studies (refs. 35,36,43,44) to identify newly recorded species in the Philippines. Conservation status was assessed using the IUCN Red List45. After species identification, all sequences were aligned and trimmed to uniform lengths (COI: 501 bp, 12S rRNA gene: 146 bp) using MEGA 1139. Kimura 2-parameter (K2P)46 genetic distances were calculated at different taxonomic levels, including intra-specific distance (excluding species with only one sequence), inter-specific distance within the same genus (excluding genera with only one species), inter-genus distance within the same family (excluding families with only one genus), and inter-family distance within the same order (excluding orders with only one family). The best-fit nucleotide substitution model (e.g., TN93 + G + I) was selected via the Bayesian Information Criterion (BIC) in MEGA 11 and applied to reconstruct a neighbor-joining (NJ) tree with 1,000 bootstrap replicates47,48,49. Results were compared with the final species list based on morphology and BLAST matches to confirm species identities.
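For readers unfamiliar with the K2P model, the pairwise distance is d = -½ ln(1 - 2P - Q) - ¼ ln(1 - 2Q), where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. The sketch below is a minimal illustration of that formula, not the authors' pipeline (distances in this study were computed in MEGA 11). The function name is hypothetical, and skipping gapped or ambiguous sites pair by pair is an assumption of this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes for this pair only
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # math.log raises ValueError if the sequences are too divergent (saturated).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Toy 10-bp alignment with a single A->G transition.
print(round(100 * k2p_distance("ACGTACGTAC", "GCGTACGTAC"), 2), "%")
```

Multiplying by 100 gives the percentage scale used for the distances reported in this paper. Averaging such pairwise values within species, within genera, and so on yields the kind of per-level means described above.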
Data Records
The verified COI and 12S rRNA gene sequence library for DNA barcoding of teleost fishes from Aurora, Cagayan, and Zambales includes (1) voucher specimens, (2) voucher collection information, (3) fresh color photographs, and (4) COI (≥501 bp) and 12S (≥150 bp) sequences. All DNA sequences were deposited in NCBI GenBank (COI: PP355131–PP461061; 12S: PP456720–PP697971)50,51, while photographs, voucher catalog numbers, and collection data are detailed in Figure S1 and Tables S1a–S1c.
Technical Validation
To resolve discrepancies between morphological and molecular identifications, or when databases suggested multiple closely related species, we re-examined photographs and voucher specimens (if available) using taxonomic keys and published literature (e.g., ref. 31). This verification process was essential for ensuring accurate species identification. Molecular validation was performed using COI sequences from BOLD, which integrates with GenBank and provides a comprehensive reference library. To complement this, 12S rRNA gene sequences from GenBank were used. Specimens were identified as the same taxon if sequence similarity was ≥98.0%, with confirmation through voucher specimen examination or fresh-color photographs. Specimens with <98.0% similarity were reprocessed through PCR and sequencing to ensure barcode quality. To further verify species identity, a neighbor-joining (NJ) tree was constructed using 707 COI and 343 12S rRNA gene sequences, with monophyletic clustering as the primary criterion. Additionally, Automatic Barcode Gap Discovery (ABGD) analysis estimated the number of operational taxonomic units (OTUs) to range from 261 to 331, while Assemble Species by Automatic Partitioning (ASAP) suggested a potentially higher diversity, identifying 307 to 336 OTUs. In total, among the 707 COI and 343 12S rRNA gene sequences in our dataset, 29 taxa were newly sequenced for the 12S rRNA gene, for COI, or for both.
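The similarity rule above can be expressed as a small decision function. The following is a hedged sketch rather than the authors' workflow, which relied on BLAST and BOLD searches followed by manual checks. The function names, the outcome labels, and the simple gap-ignoring identity measure are all assumptions made for illustration; only the 98.0% threshold comes from the text.

```python
THRESHOLD = 98.0  # percent similarity cut-off stated in the text


def percent_identity(query: str, subject: str) -> float:
    """Percent identity over two aligned sequences, ignoring gap columns."""
    if len(query) != len(subject):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(query.upper(), subject.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    matches = sum(a == b for a, b in pairs)
    return 100 * matches / len(pairs)


def triage(identity_pct: float, voucher_confirms: bool) -> str:
    """Map a best-hit similarity and a voucher check to a (hypothetical) action."""
    if identity_pct >= THRESHOLD:
        # A match at or above the cut-off still needs voucher or photograph support.
        return "accept" if voucher_confirms else "re-examine voucher"
    return "reprocess (PCR and sequencing)"


# Toy 100-bp example with one mismatch at the last site.
reference = "ACGT" * 25
query = "ACGT" * 24 + "ACGA"
identity = percent_identity(query, reference)
print(identity, triage(identity, voucher_confirms=True))
```

Keeping the threshold in a single named constant makes it easy to see how sensitive the accepted species list would be to a stricter or looser cut-off.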
Code availability
No custom code was used.
References
Gaston, K. J. & Spicer, J. I. Biodiversity: An Introduction. (Blackwell Publishing, 2004).
Primack, R. B. Essentials of Conservation Biology. (Sinauer Associates, 2017).
Hebert, P. D. N., Ratnasingham, S. & deWaard, J. R. Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species. Proc. R. Soc. 270 (2003).
Leston, L. et al. A framework to support the identification of critical habitat for wide-ranging species at risk under climate change. Biodivers Conserv. 33, 603–628 (2024).
NOAA. Presidential Task Force on Combating Illegal Unreported and Unregulated (IUU) Fishing and Seafood Fraud Action Plan. Federal Register 80, 66867–66879 (2015).
Strauss, R. E. & Bond, C. E. in Methods for Fish Biology (eds. Schreck, C. B. & Moyle, P. B.) Ch. 4 (Am. Fish. Soc., 1990).
Saraswati, P. K. & Srinivasan, M. S. in Micropaleontology (eds. Saraswati, P.K. & Srinivasan, M.S.) Ch. 4. Springer https://doi.org/10.1007/978-3-319-14574-7_4 (2016).
Al-Asif, A. & Nerurkar, S. Taxonomy in crisis: Addressing the shortage of taxonomists in a biodiversity hotspot era. J. Aquat. Res. Sustain. 1, 1–4 (2024).
Engel, M. S. et al. The taxonomic impediment: a shortage of taxonomists, not the lack of technical approaches. Zool. J. Linn. Soc. 193, 381–387 (2021).
Trivedi, S., Aloufi, A., Ansari, A. A. & Ghosh, S. Role of DNA barcoding in marine biodiversity assessment and conservation: An update. Saudi J. Biol. Sci. 23, 161–171 (2016).
Janzen, D. H. et al. Integration of DNA barcoding into an ongoing inventory of complex tropical biodiversity. Mol. Ecol. Resour. 9, 1–26 (2009).
Poniente, J. A., Pereda, J. M. R., Dela Peña, J. T., Ventolero, M. F. H. & Santos, M. D. Mitochondrial DNA-based species testing of confiscated aquatic wildlife in the Philippines. Forensic Sci. Int. Anim. Environ. 2, 100051 (2022).
Windsor, A. M., Moore, M. K., Warner, K. A., Stadig, S. R. & Deeds, J. R. Evaluation of variation within the barcode region of Cytochrome c Oxidase I (COI) for the detection of commercial Callinectes sapidus Rathbun, 1896 (blue crab) products of non-US origin. PeerJ 7, e7827 (2019).
Iyiola, O. A. et al. DNA barcoding of economically important freshwater fish species from north‐central Nigeria uncovers cryptic diversity. Ecol. Evol. 8, 6932–6951 (2018).
Wiley, E. O. & Lieberman, B. S. Phylogenetics: Theory and Practice. (Wiley-Blackwell, 2011).
Haldar, C. & Nath, S. DNA barcoding for fish species identification: current status and future prospective. Int. J. Fauna Biol. Stud. 7(4), 72–75 (2020).
Milan, D. T. et al. New 12S metabarcoding primers for enhanced Neotropical freshwater fish biodiversity assessment. Sci. Rep. 10, 17966 (2020).
Cilleros, K. et al. Unlocking biodiversity and conservation studies in high‐diversity environments using environmental DNA (eDNA): A test with Guianese freshwater fishes. Mol. Ecol. Resour. 19, 27–46 (2019).
Shaw, J. L. A. et al. Comparison of environmental DNA metabarcoding and conventional fish survey methods in a river system. Biol. Conserv. 197, 131–138 (2016).
Fontanilla, I. K. et al. State of animal DNA barcoding in the Philippines: A review of COI sequencing of Philippine native fauna. Philipp. Sci. Lett. 7, 1–7 (2014).
Santana, P. et al. DNA barcode reveals occurrence of threatened species and hidden diversity on Teleost fish trade in the Coastal Amazon. Sci Rep. 13, 19749 (2023).
Carpenter, K. E. & Springer, V. G. The center of the center of marine shore fish biodiversity: the Philippine Islands. Environ Biol Fish 72, 467–480 (2005).
Allen, G. R. Conservation hotspots of biodiversity and endemism for Indo-Pacific coral reef fishes. Aquatic Conserv: Mar. Freshw. Ecosyst. 18, 541–556 (2008).
Go, K. T. B. et al. Species richness and abundance of non-cryptic fish species in the Philippines: a global center of reef fish diversity. Biodivers. Conserv. 24, 2475–2495 (2015).
Froese, R. & Pauly, D. Fishbase https://www.fishbase.se/search.php (2024).
Herre, A. W. Check list of Philippine fishes. Fish Wildl. Serv. Res. Rep. 20, 293–295 (1953).
Guerrero, R. III Commercially Caught Freshwater Fishes in the Philippines: Status, Issues, and Recommendations. Trans. Nat Aca. Sci. Tech. Philippines 44, 1–15 (2022).
Briones, M., Dey, M. & Ahmed, M. The future of fish in the food and livelihoods of the poor in Asia. NAGA MegaFish Cent. Q. 27, 48–50 (2004).
BFAR-RO3. Bureau of Fisheries and Aquatic Resources https://r3.bfar.da.gov.ph/fisheries-profile-central-luzon/ (2023).
Calicdan, M., Gumiran, E., Encarnacion, A. & Ayson, J. Assessment of fisheries resources in the Babuyan Channel. Philipp. J. Fish. 25, 14–24 (2018).
Rueca, L. M., Bien, N. B., Bathan, R. M., Yuzon, J. I. & Salamat, G. B. in Fish Stock Assessment in Northern Zambales Coast (Bureau of Fisheries and Aquatic Resources, 2009).
Aquilino, S. V. L. et al. DNA barcoding of the ichthyofauna of Taal Lake, Philippines. Mol. Ecol. Resour. 11, 612–619 (2011).
Aquino, L. M. G. et al. DNA barcoding of fishes of Laguna de Bay, Philippines. Mitochondrial DNA 22, 143–153 (2011).
Abdulmalik-Labe, O. P. & Quilang, J. P. DNA barcoding of fishes from Lake Lanao, Philippines. Mitochondrial DNA B 4, 1890–1894 (2019).
Bemis, K. E. et al. Biodiversity of Philippine marine fishes: A DNA barcode reference library based on voucher specimens. Sci. Data 10, 411 (2023).
Huang, W.-C. et al. DNA barcoding of marine teleost fishes (Teleostei) in Cebu, the Philippines, a biodiversity hotspot of the coral triangle. Sci. Rep. 13, 14867 (2023).
Miya, M. et al. MiFish, a set of universal PCR primers for metabarcoding environmental DNA from fishes: detection of more than 230 subtropical marine species. R. Soc. Open Sci. 2, 150088 (2015).
Ward, R. D., Zemlak, T. S., Innes, B. H., Last, P. R. & Hebert, P. D. N. DNA barcoding Australia’s fish species. Philos. Trans. R. Soc. B Biol. Sci. 360, 1847–1857 (2005).
Tamura, K., Stecher, G. & Kumar, S. MEGA11: Molecular evolutionary genetics analysis version 11. Mol. Biol. Evol. 38, 3022–3027 (2021).
Shakhovskoy, I. B. Specific features of distribution in the World Ocean of some flying fishes of the genera Exocoetus, Hirundichthys and Cypselurus (Exocoetidae). FishTaxa 3, 40–80 (2018).
Betancur-R, R. et al. Phylogenetic classification of bony fishes. BMC Evol. Biol. 17, 162 (2017).
Fricke, R., Eschmeyer, W. N. & Van der Laan, R. Eschmeyer’s Catalog of Fishes: Genera, Species, References http://researcharchive.calacademy.org/research/ichthyology/catalog/fishcatmain.asp (2025).
Jamandre, B. W. Freshwater fishes of the Philippines: a provisional checklist. Zootaxa 5301, 151–181 (2023).
Balisco, R. A. T. et al. Marine fishes of Palawan, Philippines: Species diversity, new records, and conservation status. Reg. Stud. Mar. Sci. 60, 102825 (2023).
IUCN. The IUCN Red List of Threatened Species https://www.iucnredlist.org/ (2025).
Kimura, M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16, 111–120 (1980).
Felsenstein, J. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39, 783–791 (1985).
Saitou, N. & Nei, M. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987).
Tamura, K. & Nei, M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10, 512–526 (1993).
NCBI GenBank https://identifiers.org/ncbi/bioproject:PRJNA1294862 (2025).
NCBI GenBank https://identifiers.org/ncbi/bioproject:PRJNA1294866 (2025).
Acknowledgements
We would like to extend our sincere gratitude to BFAR-RO2 and BFAR-RO3, especially to RD Wilfredo Cruz and Dr. Emma Ballad, for their invaluable support during field collections in Aurora, Cagayan and Zambales. We also express our special appreciation to Erickson F. Gumiran, Leon A. Antonio Jr., Norma De Peralta, Guan-Chen Liu, and Kuan-Hsun Chen for their assistance in field collection in Cagayan Province. Likewise, we are grateful to Rex Sabbaluca, Marlon Alejandro, Neil Encinares, Renel Pena, Sunshine Amo, Dianna Jean Batan, and Samuel Leung for their contributions to fieldwork in Aurora and Zambales. Additionally, we extend our deepest thanks to Jen-Ying Chen, Chih-Chen Hsu, Yan-Chi Hu, and Che-Chi Yu for their expertise in map creation and support during field sampling. Your collective efforts were instrumental in the success of this study.
Author information
Contributions
Al C. Dimaquibo: Conceptualization, Data collection and curation, Formal analysis, Investigation, Writing – original draft, Writing – review & editing. Wei-Cheng Jhuang: Data collection, Formal analysis. Wen-Chien Huang: Formal analysis, Writing – review & editing. Angel Encarnacion: Data collection, Writing – review & editing. Melanie C. Villarao & Romina V. Yutuc: Data collection. Te-Yu Liao: Conceptualization, Supervision, Writing – review & editing.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Figure S1. Photographs of sequenced voucher specimens with accession numbers (COI and 12S rRNA gene). Figure S2. Neighbor-joining tree based on 707 COI sequences. Figure S3. Neighbor-joining tree based on 343 12S rRNA gene sequences.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Dimaquibo, A.C., Jhuang, WC., Huang, WC. et al. DNA Barcoding of Teleost Fishes from North Luzon, Philippines: A Dataset for Ichthyofaunal Diversity Assessment. Sci Data 12, 1571 (2025). https://doi.org/10.1038/s41597-025-05758-3