Background and Summary

Traits, the measurable properties of organisms, capture the biology and morphology of species and have been used for centuries to infer species boundaries based on shared characteristics1. Traits also dictate organismal performance and its interaction with the ecological niche2,3. Thus, exploring patterns of trait diversity across organisms and species also brings us closer to understanding the general principles that determine, among other processes, the patterns of abundance and distribution of species4,5, their evolution6, their resilience (e.g., resistance and recovery capacity)7,8, and their contribution to ecosystem functioning and services9,10. Consequently, the use of trait-based approaches in ecology has increased exponentially in recent decades, leading both to the development of new methods and to a growing need for trait data among the scientific community11. The compilation of trait data across the Tree of Life nonetheless remains a major challenge12. In fact, trait databases are still restricted to relatively few taxonomic groups of mostly terrestrial organisms (e.g.13,14,15,16,17,18,19,20), and are even rarer in other systems traditionally less accessible to scientists, such as the marine realm. Nevertheless, considerable effort has recently been made to compile trait data for some marine groups (e.g., stony corals21, algae22, and fish23, among others). However, trait data gaps still exist for most taxa, and even when data are available for some species, it is often difficult and time-consuming to integrate them within a single study24. This is in part because trait information mostly comes from scattered sources that are often difficult to find or access and are written in different languages (e.g., taxonomic descriptions, local identification guides, grey literature, etc.). In addition, traits are often measured using differing methodologies and standards, and relevant contextual information is often lacking. This highlights the need for new trait databases that simplify the process for neglected yet ecologically important marine taxa.

Coral-dominated ecosystems are some of the most biodiverse ecosystems on Earth25. They provide millions of people with innumerable services including food provision, coastal protection, and recreation26. Yet, coral-dominated ecosystems are being severely transformed by global change, leading them into uncharted territories27,28,29. To advance the science of these systems more rapidly at this time of accelerating change, the Coral Trait Database was launched in 2016 (ref. 21). It centralized diverse trait information for stony corals (i.e., Sub-Phylum Anthozoa, Class Hexacorallia, Order Scleractinia) into a single, open-access repository. The Coral Trait Database became the basis for research and education that has advanced coral reef science worldwide (e.g.8,27,30,31). However, these advances are limited by the lack of information for the anthozoan Class Octocorallia, which significantly contributes to the biodiversity and functioning of many coral-dominated marine ecosystems.

Octocorals are a class of anthozoans comprising more than 3,500 nominal species of mainly non-stony corals (e.g., soft corals, sea fans and sea pens) distributed across more than 75 families32. They can be found from shallow waters to the deep sea and across all marine ecoregions33. Octocorals differ from hexacorals (e.g., scleractinians) in many traits; most notably, they have polyps with eight pinnate tentacles rather than six, or multiples of six, non-pinnate tentacles, and they tend to have unconsolidated sclerites or semi-rigid skeletons rather than the hard skeletons of Scleractinia34,35. Like hexacorals, octocorals also provide critical functions to ecosystems and services to human societies. Octocorals can be foundation species that form three-dimensional habitats that are home, refuge, and spawning grounds for many associated taxa, which in turn increases biodiversity, stability, and food provision to humans36,37,38,39,40. Their complex morphologies can also locally modify water flows and light intensity, which may favor the settlement of sessile invertebrate larvae over opportunistic macroalgae, thus further promoting assemblage stability and resilience41,42. Octocorals can also directly influence nutrient cycling and the trophic network by acting as a food source (e.g., for nudibranchs), and as suspension feeders that capture particulate organic matter and plankton from the water column43,44,45,46,47. Moreover, although octocorals are not typically considered reef-builders, they can contribute to the carbon cycle and reef calcification processes by photosynthesis and by fixing calcium carbonate (CaCO3) in their sclerites and skeletal structures48,49,50. Skeletons of some octocoral species such as Corallium rubrum have been used for jewelry by humans since ancient times51. Octocorals are also important sources of pharmaceutical products and have a high aesthetic value that enhances dive tourism and inspiration52,53.

Under intensifying global change, many octocoral species are playing important roles in terms of reef reconfiguration processes. For instance, in the Caribbean Sea, where stony corals have generally suffered marked declines in abundance following cumulative climatic impacts, most octocoral species have prevailed54,55,56. Similarly, some octocorals are becoming the new dominant groups in some impacted reefs of the Western and Central Indo-Pacific following a decline in hard coral cover57,58. Conversely, in other regions such as the Mediterranean Sea, habitat-forming octocorals are experiencing population collapses following recurrent marine heatwaves59.

Octocorals and their traits are thus a crucial piece of the puzzle to understand how coral-dominated ecosystems function, how they are being transformed by global change, and how we can improve their management in the Anthropocene. Here, we introduce the Octocoral Trait Database, a global, open-source database of curated trait data for octocorals. We release its first data descriptor, OctocoralTraits v2.2, which has been integrated with the scleractinian coral data in the Coral Trait Database (www.coraltraits.org), and hosts species- and individual-level data alongside contextual data that provide relevant framing for analyses. The primary goals of the new database are: (i) to aggregate diverse information on octocoral traits into a single open-access repository that ensures transparent and accurate archiving of coral trait data, (ii) to promote the appropriate crediting of original data sources, (iii) to continue engaging the reef coral research community in the gathering and quality control of trait data to facilitate future coral reef research, and (iv) to expand the current stony coral version of the Coral Trait Database with octocoral data to promote the advancement of marine science, with a particular emphasis on coral reefs. This first global data release contains more than 97,500 error-checked trait observations, including over 148,000 trait measurements across 128 traits (30 of which are contextual) and over 3,500 valid octocoral species. Trait observations were compiled across all marine ecoregions and across all depth zones, from the deep sea to shallow waters. Furthermore, while OctocoralTraits v2.2 serves as a static data descriptor, we envision it as an evolving data product that will continue to grow collaboratively to further facilitate the quantification of trait variation among coral species in the Anthropocene.

Methods

Ontology of the data descriptor

The structure of OctocoralTraits v2.2 matches that of the data descriptor published by Madin et al.21 for scleractinian corals, as well as those of other similar trait databases, such as AusTraits60. Specifically, it follows the principles of the Observation and Measurement Ontology61, where observations at the individual or species level bind associated measurements and may provide context for other observations (Fig. 1). For example, recording both the height and width of the same octocoral colony would be considered a single observation with two distinct measurements, each representing a different trait of the coral. If the water depth is also noted, it remains part of the same observation but is categorized as a contextual trait, as it does not directly pertain to the colony itself.
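The colony example above can be sketched as a small data structure. This is only an illustration of the observation-measurement idea in Python, not the database's actual schema: the class names, field names, and the "hypothetical" source and site strings are assumptions, and the numeric values are toy numbers rather than data from the descriptor.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Measurement:
    """One measured value of one trait, with its unit and method."""
    trait_name: str
    value: Union[float, str]
    standard: Optional[str] = None      # unit, e.g. "cm" or "m"
    methodology: Optional[str] = None   # e.g. "ruler"
    value_type: str = "raw_value"       # hypothetical label for a single raw measurement
    is_contextual: bool = False         # True for context such as water depth


@dataclass
class Observation:
    """Binds all measurements taken on the same colony (or species)."""
    observation_id: int
    species: str
    resource: str                       # scientific source of the data
    location: Optional[str] = None
    measurements: List[Measurement] = field(default_factory=list)

    def traits(self) -> List[Measurement]:
        """Measurements that describe the organism itself."""
        return [m for m in self.measurements if not m.is_contextual]

    def context(self) -> List[Measurement]:
        """Contextual measurements that frame the observation."""
        return [m for m in self.measurements if m.is_contextual]


# One observation: height and width of a single colony, plus water depth.
# All values below are toy numbers for illustration only.
obs = Observation(
    observation_id=1,
    species="Corallium rubrum",
    resource="hypothetical source",
    location="hypothetical site",
    measurements=[
        Measurement("Colony height", 12.0, "cm", "ruler"),
        Measurement("Colony width", 8.5, "cm", "ruler"),
        Measurement("Water depth", 30.0, "m", is_contextual=True),
    ],
)

print(len(obs.traits()), "trait measurements;", len(obs.context()), "contextual")
```

Keeping the contextual flag on the measurement rather than on the observation mirrors the description above, where water depth belongs to the same observation as height and width but is categorized differently.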

Fig. 1

(a) General scheme of how data are structured in the database following the Observation-Measurement scheme61 and Madin et al.21. (b) Example of an observation. Each observation contains data for the data enterer, the species of interest, the scientific source, the type of access (i.e., optional variable that data enterers can control to keep data private before publication) and the location. In addition, each observation binds one or more measurements that, apart from specifying the name of the measured trait (e.g., height of the colony), include information about: the standard used to measure it (e.g., m, cm, etc.), the method used (e.g., ruler), the actual value measured, and the value type (e.g., raw data for a single measurement, expert/group opinion for single/consensus view of experts, mean, median, range, maximum, or minimum). The number of replicates and estimates of precision (e.g., standard error, standard deviation) can also be specified when applicable. Finally, one or multiple contextual traits (e.g., water depth, habitat type) that might be relevant for explaining intra-specific variation in the trait/s of interest can also be associated with the observation.

Traits selection

OctocoralTraits v2.2 has been compiled to populate the Class Octocorallia within the Coral Trait Database with the first global data descriptor. To achieve this goal, we first identified a set of relevant octocoral traits based on: (i) traits that were already present for scleractinian corals in the Coral Trait Database and that also apply to octocorals, and (ii) new traits identified by the team as relevant for octocorals in terms of their biomechanical, morphological, physiological, biogeographical, ecological, conservation and/or reproductive characteristics. The selected traits include both individual-level traits (i.e., heritable features of organisms) that are measured at the individual level and can potentially vary within species (e.g., growth rate, colony height) and species-level traits that are properties of species as entities and, therefore, invariant within species (e.g., species depth range, conservation status). The complete list of 98 selected traits can be found in Table S1, along with their definitions and detailed information on their types of associated variables (e.g., numeric vs categorical) and allowed values. Table S2 includes information on the 30 contextual traits (e.g., habitat characteristics such as depth), which provide additional information about the specific conditions under which individual-level traits were quantified. This information is crucial for understanding trait variation21.

Data acquisition

OctocoralTraits v2.2 was assembled using only scientific sources, ensuring a peer-review level of quality control. Specifically, in April 2022, we searched the Web of Science for scientific works using the general query: “octocoral*” OR “soft coral” OR “sea pen” OR “gorgonian” OR “sea fan”. We conducted a single, general search to balance the creation of an initial global dataset of octocoral trait data for the Coral Trait Database with maintaining a manageable workload for the team. This search yielded over 11,000 scientific sources, which were then manually screened at the abstract level using the abstrackr software62 to exclude sources unrelated to the traits or organisms of interest. This process led to an initial selection of 2,360 potentially relevant publications. However, many of them were subsequently excluded upon further review because they were not related to the organisms or traits of interest. For example, many sources mentioned ‘soft corals’ in the abstract but referred to organisms outside of the Class Octocorallia, such as zoanthids.

From the selected sources, the data extraction process primarily involved capturing values from text and result tables. In the few cases where information was only available in figures, data were extracted by establishing a scale based on axis values using ImageJ2 software63. Finally, as data compilation progressed, some additional articles of interest (those not captured by the initial search but referenced within reviewed sources) were also included, along with some newly published articles unavailable at the time of the initial search. Similarly, although the initial search focused primarily on publications in English, some relevant publications in other languages (e.g., Spanish, German) were also included where team members were sufficiently confident with those languages. The complete list of 796 used data sources can be found at Supp. Table S3 (refs. 32, 36, 49, 64–859).

This final list of data sources included published research articles, taxonomic monographs, scientific books, theses, identification guides, published conference abstracts and reports. Moreover, for the geographical traits marine province and marine realm, trait data compilation from the aforementioned sources was supplemented by information from the online, public, scientific platform OBIS (https://obis.org; ref. 846). Specifically, we first retrieved occurrence data from the platform using the robis R package860 in RStudio software v2023.06.0+421 (ref. 861), utilizing the Aphia_ID code of each species as the taxon ID. Spatial outliers were then flagged with the CoordinateCleaner R package862 and manually examined for accuracy by consulting original sources. Verified data were subsequently classified into realms and/or provinces using the meowR package863 and incorporated into the data descriptor.

Species taxonomy

For the static OctocoralTraits v2.2 used in this descriptor, all trait observations are associated with accepted species of octocorals. The accepted name and taxonomy for each species were determined based on the most authoritative and up-to-date lists of octocoral species, namely the 2024 World List of Octocorallia864 and the World Register of Marine Species (WoRMS, http://www.marinespecies.org, as of May 2024; ref. 865). In cases where source publications contained trait data for combinations that are no longer valid (e.g., Pseudopterogorgia americana), species names were updated in the database to reflect the currently accepted name (e.g., Antillogorgia americana). To ensure the traceability of taxonomic updates, all trait observations also include the Aphia ID code for the original species name that appeared in the scientific source from which the trait data were obtained. These Aphia ID codes are linked to the WoRMS platform.
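The name update with traceability described above could look roughly like the following Python sketch. It is not the authors' workflow: the synonym table holds only the one name pair given in the text, the field names are hypothetical, and the Aphia ID shown is a placeholder rather than a real WoRMS identifier.

```python
# Hypothetical synonym table: name used in the source -> currently accepted name.
# The single entry is the example pair given in the text.
ACCEPTED_NAME = {
    "Pseudopterogorgia americana": "Antillogorgia americana",
}


def update_taxonomy(record):
    """Return a copy of the record that carries the accepted species name.

    The name and Aphia ID as they appeared in the source are preserved in
    separate fields so the update stays traceable.
    """
    original_name = record["species_name"]
    updated = dict(record)
    updated["original_species_name"] = original_name
    updated["original_aphia_id"] = record["aphia_id"]
    updated["species_name"] = ACCEPTED_NAME.get(original_name, original_name)
    return updated


# Placeholder record: 0 stands in for an Aphia ID and is NOT a real identifier.
source_record = {
    "species_name": "Pseudopterogorgia americana",
    "aphia_id": 0,
    "trait": "Colony height",
}

updated_record = update_taxonomy(source_record)
print(updated_record["species_name"])
print(updated_record["original_species_name"])
```

Names that are already accepted pass through unchanged, because the lookup falls back to the original name when no synonym entry exists.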

Data integration

To progressively integrate trait data from various scientific sources into a unified, structured dataset, we implemented a data processing workflow following Madin et al.21. Specifically, we: (i) manually assigned observation IDs to link related measurements, (ii) reformatted location coordinates to the decimal degree system when necessary, (iii) standardized terminology for traits, locations, methodologies, and standards across sources, (iv) updated taxonomy as needed (see previous section for details), and (v) harmonized values for categorical traits, as terminology can vary significantly among sources (refer to Table S1 for accepted values). A notable challenge was the extensive variety of nomenclature used in the scientific literature to describe growth forms across the Octocorallia tree of life, with different terms often describing the same concepts. To address this, we introduced a hierarchical classification system that aims to facilitate the use of octocoral morphologies in ecological studies by consolidating hundreds of terms into a manageable set of ecologically meaningful morphologies. It includes seven high-level Types of Growth (Trait 1) based on spatial occupation patterns, further subdivided into 17 basic (Trait 2) and 32 detailed (Trait 3) growth forms for finer resolution (see Tables S4-S11 for detailed descriptions). For example, a coral described as “fan-shaped” by an author in the original source would be hierarchically classified as “erect branched” (Trait 1; high level), “arborescent” (Trait 2; medium level), and “branched planar” (Trait 3; fine level) under the Type of Growth, Growth Form Basic, and Growth Form Detailed traits, respectively. Additionally, a fourth trait, labeled simply as Growth Form (Trait 4), retains the original terminology from the data source to maintain traceability.
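Two of the steps above lend themselves to a short illustration: converting coordinates to decimal degrees (step ii) and mapping an author's growth-form term onto the three hierarchical traits while keeping the original wording (step v). The Python sketch below is an assumption-laden example, not the authors' code. The lookup table contains only the "fan-shaped" case described in the text, and the function names are hypothetical.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    Southern and western hemispheres are returned as negative values.
    """
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere.upper() in ("S", "W") else value


# Hypothetical lookup: original term -> (Type of Growth, Growth Form Basic,
# Growth Form Detailed). Only the example from the text is included; the full
# classification is described in the supplementary tables.
GROWTH_FORM_HIERARCHY = {
    "fan-shaped": ("erect branched", "arborescent", "branched planar"),
}


def classify_growth_form(original_term):
    """Map an author's term to the three hierarchical growth-form traits.

    The original wording is kept under "Growth Form" for traceability.
    """
    key = original_term.strip().lower()
    if key not in GROWTH_FORM_HIERARCHY:
        raise KeyError(f"No classification defined for term: {original_term!r}")
    type_of_growth, basic, detailed = GROWTH_FORM_HIERARCHY[key]
    return {
        "Type of Growth": type_of_growth,
        "Growth Form Basic": basic,
        "Growth Form Detailed": detailed,
        "Growth Form": original_term,
    }


latitude = dms_to_decimal(42, 30, 0, "N")
longitude = dms_to_decimal(3, 15, 0, "W")
print(latitude, longitude)

for trait, value in classify_growth_form("Fan-shaped").items():
    print(f"{trait}: {value}")
```

Raising an error for unknown terms, rather than guessing a category, reflects the manual harmonization described above, where each term is assigned deliberately.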

A similar challenge was encountered with the vast existing terminology for octocoral skeletons. Here, we built upon the work of McFadden et al.32 to propose a summarized classification comprising 20 basic categories that encompass the observed variation in this trait across octocorals (see accepted values for the “Type of Skeleton” trait in Table S1).

Lastly, for quantitative traits (e.g., colony height), where standardization to a preferred unit is straightforward, we did not enforce a specific unit, leaving the choice of which standard to use up to end users.
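Because units are left as reported, end users will typically convert values to a common unit before analysis. A minimal Python sketch of that step follows. The unit labels, field names, and values are assumptions for illustration and may not match the entries of the Standards table.

```python
# Conversion factors to metres. The unit labels are assumed; check them
# against the Standards table before applying this to real data.
TO_METRES = {"mm": 0.001, "cm": 0.01, "m": 1.0}


def convert_length(value, from_unit, to_unit):
    """Convert a length between any two units listed in TO_METRES."""
    try:
        return value * TO_METRES[from_unit] / TO_METRES[to_unit]
    except KeyError as err:
        raise ValueError(f"Unsupported length unit: {err}") from None


def harmonize(measurements, target_unit="cm"):
    """Return copies of the measurements expressed in target_unit."""
    return [
        {
            **m,
            "value": convert_length(m["value"], m["standard"], target_unit),
            "standard": target_unit,
        }
        for m in measurements
    ]


# Toy colony-height values reported in three different units.
heights = [
    {"trait": "Colony height", "value": 1.2, "standard": "m"},
    {"trait": "Colony height", "value": 250.0, "standard": "mm"},
    {"trait": "Colony height", "value": 40.0, "standard": "cm"},
]

harmonized = harmonize(heights, "cm")
for row in harmonized:
    print(row["trait"], round(row["value"], 6), row["standard"])
```

Unknown units raise an error instead of being passed through silently, so values in an unexpected unit cannot be mixed into an analysis unnoticed.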

Data Records

Access

The OctocoralTraits v2.2 data descriptor is publicly accessible as a ZIP file deposited in Zenodo866. This ZIP file, named OctocoralTraits, contains a primary core table in .csv format (OctocoralTraits_v2_2.csv) with trait observations linked to measurements, along with multiple interrelated tables (also in .csv format) that provide additional details on both the observations (e.g., location, species, resource) and the measurements (e.g., trait, methodology, standard, value type, precision type). Specific information about each of these core and interrelated tables is provided in Tables 1–8.

Table 1 Description of the core OctocoralTraits v2.2 table containing observations binding measurements.
Table 2 Structure of the Locations table, which contains details about each site where trait observations were made.
Table 3 Structure of the Species table, which contains details about each species for which traits were quantified.
Table 4 Structure of the Resources table, which contains details about each primary and/or secondary scientific source from which trait data were compiled.
Table 5 Structure of the Trait table, which contains details about each measured trait.
Table 6 Structure of the Methodologies table, which contains details about each methodology used to quantify trait data.
Table 7 Structure of the Standards table, which contains details about the standards and units used to quantify trait data.
Table 8 Structure of the Precision table, which contains details about the precision estimates used to quantify the uncertainties associated with the trait measurements.
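As a rough illustration of how the core table relates to the interrelated tables, the Python sketch below joins a toy core table to toy species and trait tables through identifier columns. The column names, identifiers, and values are all hypothetical and are held in memory. The real column names are those documented in Tables 1–8, and real use would read the .csv files from the Zenodo archive instead.

```python
import csv
import io

# Toy stand-ins for the core table and two interrelated tables.
# Column names, identifiers and values are assumptions for illustration.
CORE_CSV = """observation_id,species_id,trait_id,value
1,10,100,12.0
1,10,101,8.5
2,11,100,30.0
"""

SPECIES_CSV = """species_id,species_name
10,Corallium rubrum
11,Muricea californica
"""

TRAIT_CSV = """trait_id,trait_name
100,Colony height
101,Colony width
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def index_by(rows, key):
    """Build a lookup from an identifier column to its full row."""
    return {row[key]: row for row in rows}


species = index_by(read_rows(SPECIES_CSV), "species_id")
traits = index_by(read_rows(TRAIT_CSV), "trait_id")

# Resolve the identifiers in each core row to readable names.
joined = [
    {
        "observation_id": row["observation_id"],
        "species_name": species[row["species_id"]]["species_name"],
        "trait_name": traits[row["trait_id"]]["trait_name"],
        "value": float(row["value"]),
    }
    for row in read_rows(CORE_CSV)
]

for row in joined:
    print(row)
```

The same lookup pattern would extend to the locations, resources, methodologies, standards, and precision tables, each keyed by its own identifier column.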

Data coverage

The OctocoralTraits v2.2 data release integrates more than 97,500 error-checked trait observations and over 148,000 trait measurements across 128 traits. These data encompass over 3,500 valid octocoral species and 75 families, with observations spanning all marine ecoregions and depth zones, from deep-sea habitats to shallow waters. OctocoralTraits v2.2 therefore offers broad phylogenetic (Figs. 2 and 3), geographic (Fig. 4), and bathymetric (Fig. 5) coverage. However, considerable data gaps persist: many species lack comprehensive trait data (Fig. 2), some traits are underrepresented across octocoral families (e.g., growth rate) (Fig. 3), and certain realms (e.g., polar) and depth zones (e.g., mesophotic) remain poorly sampled (Figs. 4 and 5).

Fig. 2

Species-complete tree with phylogenetic distribution of trait data coverage (i.e., number of trait observations per species). The 15 largest families of the Class Octocorallia are labeled by color. The two orders within the class, Scleralcyonacea and Malacalcyonacea, are also indicated. Arrowheads indicate that, for this first data release, Keratoisis grayi (4,708 records), Muricea californica (437 records) and Corallium rubrum (280 records) have a disproportionately larger number of trait observations that do not fit in the plot. The phylogenetic tree was added to this figure solely for visualization purposes, organizing families by evolutionary relationships to enhance the interpretability of trait distributions across lineages, without implying further phylogenetic analysis. It corresponds to an adaptation of the family-resolved maximum-likelihood tree of Octocorallia inferred from a 1059 bp alignment of the mitochondrial gene mtMutS (ref. 32), with species being incorporated as polytomies. (*) To facilitate visualization, all sea pens have been grouped into a single Pennatuloidea superfamily. Data for species belonging to genera that are currently incertae sedis have not been included in the figure. Finally, a complementary figure showing the phylogenetic distribution of trait data coverage as the number of different traits with data per species can be found in Fig. S1.

Fig. 3

Family-complete heatmap with phylogenetic distribution of trait data coverage for each trait of interest. Parentheses following family names indicate the number of species within a given family. Purple cells correspond to the traits of interest released in this data descriptor, with the color gradient indicating the % of species with data for a given trait within a given family. Dark yellow cells indicate traits that are only possible in sea pens (e.g., the number of polyps per polyp leaf) and therefore do not apply to other octocorals. (*) refers to the Conservation trait category, while (**) refers to the Stoichiometric trait category. To facilitate visualization, all sea pens have also been grouped into a single Pennatuloidea superfamily (as in Fig. 2). See Table S1 for the trait definitions and Fig. S2 for trait data coverage across genera whose family assignment is incertae sedis. The phylogenetic tree used to order families across the x axis corresponds to the family-resolved maximum-likelihood tree of Octocorallia inferred from a 1059 bp alignment of the mitochondrial gene mtMutS (ref. 32). As for Fig. 2, it was added solely for visualization purposes, organizing families by evolutionary relationships to enhance the interpretability of trait distributions across lineages, without implying further phylogenetic analysis.

Fig. 4

(a) Relative percentage of compiled trait observations among global and georeferenced estimates. (b) Geographic distribution of georeferenced trait data across marine realms. (c) Total number of trait observations per marine realm. (*) in panel c denotes that the number of trait observations for that region is higher than shown in the plot (i.e., Temperate Northern Atlantic; 5,633 records). Additional notes: As mentioned in Table 3, Global estimates are associated with traits that do not (or very rarely) vary within species regardless of the geographical context. Thus, a global estimate for a given trait can be assumed to be fairly or totally constant across the species. For instance, the presence of a skeletal axis is an inherent characteristic of gorgonians regardless of where the trait is quantified. Apart from the global estimates, OctocoralTraits v2.2 contains over 10,000 data points of geo-referenced records. These records are associated with traits that may vary within species depending on local/regional environmental parameters. Therefore, the geographic context of the trait observation is needed to provide relevant framing for analysis.

Fig. 5

(a) Bathymetric (i.e., across light zones) distribution of compiled trait data across the 54 marine provinces where data has been collected. Numbers have been added to each marine province with data, corresponding exactly to the list provided by Spalding et al.868. Specifically: 1. Arctic, 2. Northern European Seas, 3. Lusitanian, 4. Mediterranean Sea, 5. Cold Temperate Northwest Atlantic, 6. Warm Temperate Northwest Atlantic, 8. Cold Temperate Northwest Pacific, 9. Warm Temperate Northwest Pacific, 10. Cold Temperate Northeast Pacific, 11. Warm Temperate Northeast Pacific, 12. Tropical Northwestern Atlantic, 13. North Brazil Shelf, 14. Tropical Southwestern Atlantic, 16. West African Transition, 17. Gulf of Guinea, 18. Red Sea and Gulf of Aden, 19. Somali/Arabian, 20. Western Indian Ocean, 22. Central Indian Ocean Islands, 24. Andaman, 25. South China Sea, 26. Sunda Shelf, 28. South Kuroshio, 29. Tropical Northwestern Pacific, 30. Western Coral Triangle, 31. Eastern Coral Triangle, 32. Sahul Shelf, 33. Northeast Australian Shelf, 34. Northwest Australian Shelf, 35. Tropical Southwestern Pacific, 36. Lord Howe and Norfolk Islands, 37. Hawaii, 38. Marshall, Gilbert, and Ellis Islands, 39. Central Polynesia, 40. Southeast Polynesia, 43. Tropical East Pacific, 44. Galapagos, 45. Warm Temperate Southeastern Pacific, 47. Warm Temperate Southwestern Atlantic, 48. Magellanic, 50. Benguela, 51. Agulhas, 53. Northern New Zealand, 54. Southern New Zealand, 55. East Central Australian Shelf, 56. Southeast Australian Shelf, 57. Southwest Australian Shelf, 58. West Central Australian Shelf, 59. Subantarctic Islands, 60. Scotia Sea, 61. Continental High Antarctic, 62. Subantarctic New Zealand. (b) Mean percentage (±SD) of trait observations per light zone across the 54 provinces with data.

Technical Validation

OctocoralTraits v2.2 is a curated data descriptor developed to contribute to the Coral Trait Database. Therefore, it has undergone the same editorial control and quality assurance processes established for that initiative (see https://www.coraltraits.org/procedures and Madin et al.21). Specifically, all compiled data come from scientific sources that have been published (e.g., articles) or undergone rigorous peer review (e.g., PhD theses), ensuring that data accuracy was validated before integration. Moreover, to address errors that may have gone unnoticed in prior validation and those that may arise during the harmonization of the data into a unified, comprehensive dataset, additional measures were taken to ensure a fully curated version for publication. First, whenever data from a new source were compiled, the .csv files were manually reviewed to confirm that all variables had the correct and intended data types. Following the finalization of data compilation, custom R scripts (available with the data descriptor on the Zenodo repository866) were applied to: (i) detect duplicate measurements and/or observations, (ii) identify inconsistencies in data structure (e.g., an observation linked to multiple species, locations, or resources), and (iii) detect and flag outliers in quantitative traits and invalid values in categorical traits. Flagged values were then manually checked against the original sources by trait editors, and when necessary, original authors were consulted. Any confirmed errors identified through the technical validation process were corrected or removed, resulting in the curated OctocoralTraits v2.2 version released here.
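The three automated checks can be illustrated with a short Python sketch. This is not a translation of the authors' R scripts, which are available in the Zenodo repository. The field names and toy rows are hypothetical, and the interquartile-range rule used to flag outliers is an assumption, since the text does not state which outlier criterion was applied.

```python
import statistics
from collections import Counter, defaultdict


def find_duplicates(rows):
    """Check (i): identical (observation, trait, value) combinations."""
    keys = [(r["observation_id"], r["trait"], r["value"]) for r in rows]
    return [key for key, n in Counter(keys).items() if n > 1]


def find_multi_species_observations(rows):
    """Check (ii): observations linked to more than one species."""
    species_by_obs = defaultdict(set)
    for r in rows:
        species_by_obs[r["observation_id"]].add(r["species"])
    return sorted(obs for obs, sp in species_by_obs.items() if len(sp) > 1)


def flag_outliers(values, k=1.5):
    """Check (iii), quantitative: flag values beyond k * IQR from the quartiles.

    The IQR rule is an assumed criterion chosen for illustration.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def find_invalid_categories(rows, trait, allowed):
    """Check (iii), categorical: values outside the accepted list."""
    return [r for r in rows if r["trait"] == trait and r["value"] not in allowed]


# Toy rows with deliberate problems; species "A" and "B" are placeholders.
rows = [
    {"observation_id": 1, "species": "A", "trait": "Colony height", "value": 12.0},
    {"observation_id": 1, "species": "A", "trait": "Colony height", "value": 12.0},
    {"observation_id": 2, "species": "A", "trait": "Colony height", "value": 15.0},
    {"observation_id": 2, "species": "B", "trait": "Colony width", "value": 9.0},
    {"observation_id": 3, "species": "A", "trait": "Type of Growth", "value": "erect branched"},
    {"observation_id": 4, "species": "A", "trait": "Type of Growth", "value": "fan"},
]

print(find_duplicates(rows))
print(find_multi_species_observations(rows))
print(flag_outliers([10, 11, 12, 13, 14, 15, 200]))
print(find_invalid_categories(rows, "Type of Growth", {"erect branched"}))
```

Each function only flags candidates. As described above, flagged values would still need to be checked manually against the original sources before anything is corrected or removed.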

Usage Notes

By hosting the dataset on the public repository Zenodo, we adhere to FAIR principles867 and enable reuse under a CC-BY license with appropriate attribution, specifically citing this data descriptor. Additionally, the descriptor includes direct references to all original scientific sources used to compile the global harmonized dataset, found in the resource_id.csv table (also referenced in this manuscript; refs. 64–859). We therefore encourage users to reference the original sources whenever feasible, along with this data descriptor. Finally, the OctocoralTraits data descriptor is an evolving data product, continuously linked to ongoing data entry and validation. We encourage users to access the latest versions of OctocoralTraits files, which will be regularly updated in the same Zenodo repository866. Furthermore, since OctocoralTraits v2.2 is part of the Coral Trait Database collaborative effort, subsequent new versions of the dataset will also be accessible and downloadable at www.coraltraits.org.

Despite the meticulous curation process aimed at identifying and rectifying potential errors, the inherent complexity of data acquisition and compilation may warrant an additional layer of scrutiny from end-users. It is recommended, therefore, that users apply standard validation procedures and cross-referencing methodologies to ensure the accuracy and reliability of the data for their specific analyses. This precautionary measure, which may include consulting referenced original sources or the corresponding author of this publication in case of doubt, aligns with the best practices in data utilization, reinforcing the robustness of the database for scientific inquiry and research endeavors. Finally, special caution is advised when using data from geographical traits (e.g., marine realms and provinces), particularly those derived from the OBIS platform, as not all observations on that platform have been verified for accurate species identification or up-to-date taxonomy.