Introduction

Phenol is a widely used organic compound that frequently enters aquatic environments through both natural processes and anthropogenic activities. With global production exceeding 12 million tons annually1, phenol is of considerable environmental concern due to its high water solubility, long environmental half-life (~ 3,240 h), resistance to biodegradation, and toxicity even at low concentrations2. Phenol is frequently detected in sediments at levels ranging from 0.13 to 34,780 ng/g dry weight3,4,5,6,7,8, raising concerns about adverse effects on benthic organisms and sediment ecosystems and underscoring the need for sediment ecological risk assessment (ERA).

Previous attempts to derive sediment predicted no-effect concentrations (PNECsediment) for phenol and to assess sediment risks have yielded widely varying values (0.01–0.08 µg/g dw), largely reflecting differences in datasets and methods6,8. These estimates often relied on large assessment factors applied to limited aquatic toxicity data before extrapolation to sediments, thereby introducing substantial uncertainty. This context motivates the development of a more transparent and ecologically relevant framework for deriving PNECsediment when sediment toxicity data are scarce.

Although sediment ERAs are typically recommended for substances with a log organic carbon partition coefficient (log Koc) of 3 or greater9,10, phenol’s persistence and mobility justify consideration even at lower Koc. Consistent with this rationale, Simpson et al. demonstrated that perfluorooctane sulfonic acid (PFOS), a compound with low Koc, can pose risks to benthic organisms when present in sediments, and argued for expanding sediment risk assessment beyond strict trigger criteria11. When direct sediment toxicity data are insufficient, the equilibrium partitioning method (EPM) provides a pragmatic route to convert water-based thresholds (e.g., PNECwater or species-level chronic metrics) into sediment-equivalent values10,12. However, limited methodological guidance9 and substantial variability in reported thresholds, up to 27-fold in PNECwater across studies (0.7 µg/L in Li et al.6; 19.0 µg/L in Zhong et al.7; 3.6 µg/L in Zhou et al.8), propagate uncertainty into sediment benchmarks.

The species sensitivity distribution (SSD) approach is recommended when sufficient ecotoxicity data exist because it incorporates interspecies variability and yields community-relevant benchmarks13,14. Yet the robustness of SSD-derived hazardous concentrations (e.g., HC5) depends on taxonomic composition, habitat type, geographic origin (native vs. foreign), and sample size15,16,17,18. Moreover, most SSDs are constructed from laboratory tests with limited ecological realism19,20,21. Artificial-stream experiments can partially bridge this gap by quantifying species-level responses under semi-natural conditions while maintaining experimental control19,22,23.

This study evaluates the applicability of water toxicity data to sediment ERA for phenol by integrating EPM with SSDs in a transparent, case-study framework. Specifically, we (1) compiled acute and chronic water-based toxicity data from laboratory tests and artificial-stream experiments using native aquatic species; (2) applied EPM to convert water-based values into sediment-equivalent concentrations and quantified how data characteristics (geographic origin, habitat type, taxonomic group, and sample size) influence SSD-derived HC5 and PNECsediment; and (3) applied the derived PNECsediment to literature-reported sediment concentrations to illustrate risk categorization under data-limited conditions. Throughout, we acknowledge that EPM-based sediment thresholds are model-derived and should be interpreted cautiously until validated with direct sediment toxicity tests.

Materials and methods

Chemical and organisms

Phenol (C₆H₅OH; CAS No. 108-95-2; ≥99% purity) was purchased from Sigma-Aldrich (St. Louis, MO, USA). Seven representative aquatic species from five taxonomic groups native to Korea were selected for toxicity testing (Table 1). Organism collection and acclimation procedures are detailed in the Supplementary Information. All experiments involving fish were approved by the Institutional Animal Care and Use Committee (IACUC) of the Korea Institute of Toxicology (Approval No. 2102-0001).

Table 1 Test conditions of phenol in laboratory toxicity tests and the artificial stream experiment. TL, total length; NA, not available; LC50, median lethal concentration; EC50, median effective concentration; NOEC, no observed effect concentration.

Laboratory toxicity test

Acute toxicity tests were conducted for fish (Aphyocypris chinensis and Zacco platypus), an insect (Glyptotendipes tokunagai), and a crustacean (Branchinella kugenumaensis). Tests followed OECD Test Guidelines 203 (ref. 24) and 235 (ref. 25) for the fish and insect species, respectively, and a modified OECD Test Guideline 202 (ref. 26) for the crustacean B. kugenumaensis. Organisms were exposed to phenol at five concentration levels (dilution factor 1.8–2.0) (Table S1). Mortality was assessed in the fish and in the crustacean B. kugenumaensis, while immobilization was assessed in the insect G. tokunagai. Water quality parameters (temperature, pH, dissolved oxygen) were monitored at the start and end of each test (Table S2). Further details are provided in the Supplementary Information.

Artificial stream toxicity test

A 30 d artificial stream experiment was performed in an indoor artificial stream system to obtain chronic toxicity values under semi-natural conditions (Table 1; Table S3). The system included a head tank, riffle, pool, run, and tail tank (Fig. S1); details are provided in the Supplementary Information. Briefly, six species were introduced into designated sections and acclimated based on their life cycle. Organisms were exposed to three concentrations of phenol for 10.1 h. Endpoints included mortality and growth of fish (A. chinensis, Z. platypus), reproduction and mortality of crustaceans (Moina macrocopa), emergence of insects (G. tokunagai), mortality of worms (Limnodrilus hoffmeisteri), and growth of periphytic algae (Table 1). Water quality parameters were recorded throughout the experiment (Table S4).

Chemical analysis

Water samples were collected during the exposure period for phenol quantification. Analysis was performed using an Agilent 1200 HPLC system equipped with a ZORBAX EclipsePlus C18 column (4.6 × 150 mm, 3.5 μm) and a UV diode array detector at 230 nm. The mobile phase consisted of water and acetonitrile (10:90, v/v). The flow rate was 0.6 mL/min with a 20 µL injection volume. Calibration was linear between 0.05 and 10 mg/L (R² ≥ 0.999). Limits of detection (LOD) and quantification (LOQ) were 0.02 and 0.05 mg/L, respectively. Samples were diluted 10-fold and analyzed in triplicate. Retention time was 2.7 min and data were processed using Chemstation® software Rev.B.04.02.

Collection and classification of ecotoxicity data

Acute (LC50 or EC50) and chronic (NOEC) toxicity data for phenol were retrieved from the ECOTOX database and peer-reviewed literature. Data were classified by taxonomic group, geographic origin (native vs. foreign species), and habitat type (benthic vs. non-benthic). Early life stage data (e.g., egg, larva) were considered benthic. Native species were identified based on the National Species List of Korea27. Toxicity data quality was assessed using Klimisch scores28 and the ToxRTool29; only data rated 1 or 2 were included28. Acute toxicity data were converted to chronic values using an acute-to-chronic ratio (ACR) of 6.3 (refs. 30–34). If multiple data points were available for a species, their geometric mean was used35.
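The data-reduction step described above can be sketched in a few lines of Python. This is only an illustration: the record layout, the function names, and the example values are hypothetical, and pooling ACR-converted acute values with measured NOECs in one geometric mean is an assumption of the sketch rather than a rule stated in the text.

```python
import math

ACR = 6.3  # acute-to-chronic ratio stated in the text


def chronic_equivalent(value_mg_l, is_acute):
    """Divide acute LC50/EC50 values by the ACR; pass chronic NOECs through."""
    return value_mg_l / ACR if is_acute else value_mg_l


def species_mean_chronic_values(records):
    """Geometric mean of chronic-equivalent values per species.

    records: iterable of (species, value_mg_l, is_acute) tuples (hypothetical layout).
    """
    logs_by_species = {}
    for species, value, is_acute in records:
        chronic = chronic_equivalent(value, is_acute)
        logs_by_species.setdefault(species, []).append(math.log(chronic))
    return {sp: math.exp(sum(v) / len(v)) for sp, v in logs_by_species.items()}


# Hypothetical records, not taken from the study dataset.
records = [
    ("species A", 63.0, True),   # acute LC50 -> 10.0 mg/L after ACR conversion
    ("species A", 2.5, False),   # chronic NOEC
    ("species B", 4.0, False),   # chronic NOEC
]
smcv_water = species_mean_chronic_values(records)
```

Averaging on the log scale is numerically equivalent to taking the n-th root of the product and avoids overflow for long value lists.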

Derivation of PNECsediment

The species mean chronic value in sediment (SMCVsediment) was estimated from water-based values (SMCVwater) using the modified equilibrium partitioning method9,11:

$$SMCV_{sediment} = \frac{K_{susp-water}}{\rho_{susp}} \times SMCV_{water} \times 1000 \times 4.6$$
(1)
$$K_{susp-water} = F_{water-susp} \times F_{solid-susp} \times F_{oc-susp} \times \frac{K_{oc}}{1000} \times \rho_{solid}$$
(2)

where SMCVsediment and SMCVwater are the SMCVs in sediment (µg/g dw) and in water (mg/L), respectively; ρsusp and ρsolid are the densities of wet suspended matter and of the solid phase (1150 and 2500 kg/m3, respectively); Fwater−susp and Fsolid−susp are the volume fractions of water and solid in suspension (0.9 and 0.1 m3/m3, respectively); Foc−susp is the mass fraction of organic carbon in suspension (0.1 kg/kg); and Koc is the organic carbon-water partition coefficient of phenol (79.34 L/kg), an estimated value obtained from EPI Suite v4.11 (US EPA, Washington, DC, USA). The factor 1000 converts m3 to L, and the factor 4.6 converts wet weight to dry weight.
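As a worked illustration of Eqs. (1) and (2), the following Python sketch evaluates the conversion with the parameter values listed above. The function and constant names are hypothetical. The default follows Eq. (2) as printed; the optional `additive_water_term` flag is an assumption of this sketch, included because the EU Technical Guidance Document formulation of the suspended matter-water partition coefficient adds the water fraction to the solid-phase term instead of multiplying by it.

```python
RHO_SUSP = 1150.0    # kg/m3, density of wet suspended matter
RHO_SOLID = 2500.0   # kg/m3, density of the solid phase
F_WATER_SUSP = 0.9   # m3/m3, volume fraction of water in suspension
F_SOLID_SUSP = 0.1   # m3/m3, volume fraction of solids in suspension
F_OC_SUSP = 0.1      # kg/kg, organic carbon mass fraction
KOC_PHENOL = 79.34   # L/kg, EPI Suite estimate quoted in the text
WET_TO_DRY = 4.6     # wet-weight to dry-weight conversion factor


def k_susp_water(koc=KOC_PHENOL, additive_water_term=False):
    """Suspended matter-water partition coefficient (m3/m3)."""
    solid_term = F_SOLID_SUSP * F_OC_SUSP * (koc / 1000.0) * RHO_SOLID
    if additive_water_term:
        # Variant assumed from the EU TGD formulation, not from Eq. (2).
        return F_WATER_SUSP + solid_term
    # Eq. (2) exactly as printed in the text.
    return F_WATER_SUSP * solid_term


def smcv_sediment(smcv_water_mg_l, koc=KOC_PHENOL, additive_water_term=False):
    """Eq. (1): convert a water-based SMCV (mg/L) to sediment (ug/g dw)."""
    k = k_susp_water(koc, additive_water_term)
    return k / RHO_SUSP * smcv_water_mg_l * 1000.0 * WET_TO_DRY


factor_as_printed = smcv_sediment(1.0)                        # about 7.14 ug/g dw per mg/L
factor_additive = smcv_sediment(1.0, additive_water_term=True)  # about 11.53 ug/g dw per mg/L
```

Because 1000 × 4.6 / 1150 = 4, the sediment-equivalent value is simply four times the partition coefficient multiplied by the water-based concentration, which makes hand checks of tabulated values straightforward.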

SSD curve fitting was performed using the SSD Toolbox software provided by the EPA (https://www.epa.gov/chemical-research/species-sensitivity-distribution-ssd-toolbox) with the maximum likelihood method and six distribution models (Normal, Logistic, Triangular, Gumbel, Weibull, and Burr). The best distribution was selected based on a goodness-of-fit test with a P-value greater than 0.05, the lowest corrected Akaike Information Criterion (AICc), and inspection of the quantile-quantile plot. The hazardous concentration for 5% of species (HC5) was calculated from the fitted SSD curve. The PNECsediment was calculated by dividing the HC5 by an assessment factor (AF)9. The AF can be assigned a value between 1 and 5 depending on the amount and quality of toxicity data. Here, an AF of 5 was applied following international guidance11, which recommends this factor for artificial stream experiment data that are ecologically relevant but still involve uncertainties.
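The HC5 and PNEC calculation can be illustrated with a minimal Python sketch. It is a stand-in, not the SSD Toolbox implementation: only one of the six candidate models is shown (a normal distribution fitted to log10-transformed values by maximum likelihood), the function names are hypothetical, and the three-point dataset is invented purely to exercise the code. The `aicc` helper uses the standard small-sample correction of the AIC.

```python
import math
from statistics import NormalDist, mean, pstdev


def fit_lognormal(values):
    """Maximum-likelihood normal fit to log10-transformed toxicity values."""
    logs = [math.log10(v) for v in values]
    return mean(logs), pstdev(logs)  # pstdev divides by n (the ML estimate)


def hc5_lognormal(values):
    """Concentration at the 5th percentile of the fitted distribution."""
    mu, sigma = fit_lognormal(values)
    return 10 ** NormalDist(mu, sigma).inv_cdf(0.05)


def aicc(log_likelihood, n_params, n_obs):
    """Corrected Akaike Information Criterion for comparing candidate models."""
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def pnec_from_hc5(hc5, assessment_factor=5.0):
    """PNEC = HC5 / AF, with the AF of 5 used in the text as default."""
    return hc5 / assessment_factor


example_hc5 = hc5_lognormal([1.0, 10.0, 100.0])  # hypothetical SMCVs, ug/g dw
example_pnec = pnec_from_hc5(4.03)               # combined-dataset HC5 from the text
```

With the combined-dataset HC5 of 4.03 µg/g dw and an AF of 5, the sketch returns 0.806 µg/g dw, which rounds to the reported 0.81 µg/g dw.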

Sample size analysis for SSD

To determine the minimum sample size required for robust SSD modeling, bootstrapping was performed using 3–53 randomly sampled toxicity data points, with 5000 iterations per sample size. HC5 values and confidence intervals were calculated for each sample size. Change point analysis was applied at α = 0.05 to identify the point at which HC5 variance stabilized and thereby define the minimum sufficient sample size36,37.
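The resampling step can be sketched as follows. Several choices here are assumptions made for illustration only: sampling is done with replacement, the HC5 is taken from a log-normal fit, the interval is a simple percentile interval, and the dataset is hypothetical. The change point analysis itself is not reproduced.

```python
import math
import random
from statistics import NormalDist, mean, pstdev


def hc5_lognormal(values):
    """5th percentile of a normal distribution fitted to log10 values."""
    logs = [math.log10(v) for v in values]
    return 10 ** NormalDist(mean(logs), pstdev(logs)).inv_cdf(0.05)


def bootstrap_hc5(values, sample_size, iterations=5000, seed=0):
    """Return (median, lower 2.5%, upper 97.5%) of bootstrapped HC5 estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(iterations):
        sample = rng.choices(values, k=sample_size)  # draw with replacement
        if len(set(sample)) < 2:
            continue  # zero variance: the distribution cannot be fitted
        estimates.append(hc5_lognormal(sample))
    estimates.sort()
    n = len(estimates)
    lower = estimates[int(0.025 * n)]
    upper = estimates[min(n - 1, int(0.975 * n))]
    return estimates[n // 2], lower, upper


# Hypothetical sediment-equivalent SMCVs (ug/g dw), not the study dataset.
smcv = [0.1, 0.5, 1, 2, 5, 8, 12, 20, 35, 60, 100, 150, 300, 700]
summary_by_size = {
    size: bootstrap_hc5(smcv, size, iterations=500) for size in (3, 8, 12)
}
```

Plotting the median and interval width against sample size gives the kind of curve to which a change point test can then be applied.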

Ecological risk assessment

Ecological risk was evaluated following the European Commission’s Technical Guidance Document10. The risk quotient (RQ) was calculated as the ratio of measured environmental concentration in sediment (MECsediment) to derived PNECsediment. Ecological risk was assessed as follows: RQ value of less than 0.1 indicates a low risk, between 0.1 and 1 indicates a moderate risk, and greater than 1 indicates a high risk to organisms.
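The risk categorization above reduces to a ratio and two thresholds, as in this short Python sketch. The function names and the example concentrations are hypothetical, and assigning an RQ of exactly 0.1 to the moderate class is an assumption, since the text does not specify that boundary.

```python
def risk_quotient(mec_sediment, pnec_sediment):
    """RQ = MEC / PNEC; both values must share the same unit (e.g., ug/g dw)."""
    return mec_sediment / pnec_sediment


def risk_category(rq):
    """Low (< 0.1), moderate (0.1 to 1), or high (> 1) risk."""
    if rq < 0.1:
        return "low"
    if rq <= 1.0:
        return "moderate"
    return "high"


# Hypothetical measured concentrations (ug/g dw) against a PNEC of 0.81 ug/g dw.
categories = [risk_category(risk_quotient(mec, 0.81)) for mec in (0.05, 0.4, 2.0)]
```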

Statistical analysis

All statistical analyses of the toxicity data obtained from the experiments were performed using the Comprehensive Environmental Toxicity Information System™ (CETIS, Tidepool Scientific Software, McKinleyville, CA, USA). The LC50 or EC50 for each test species was calculated using the appropriate method, such as probit analysis, binomial analysis, or Trimmed Spearman-Kärber analysis.

Results

Phenol toxicity to native freshwater species

Laboratory acute toxicity tests revealed clear species-specific differences in phenol sensitivity among native freshwater species. The fish Zacco platypus was the most sensitive (96 h-LC50: 20.8 mg/L), followed by fish Aphyocypris chinensis (96 h-LC50: 25.4 mg/L), crustacean Branchinella kugenumaensis (24 h-LC50: 27.3 mg/L), and insect Glyptotendipes tokunagai (48 h-EC50: 52.5 mg/L) (Table 2). Overall, vertebrates were more sensitive than invertebrates, consistent with previous studies (e.g., Cyprinus carpio 30.4 mg/L, Oryzias latipes 24.1 mg/L)38,39. Similarly, the EC50 of G. tokunagai was comparable to that of Propsilocerus akamusi (48 h-EC50: 67.7 mg/L)40.

Table 2 Toxicity data of phenol to native aquatic organisms.

In the artificial-stream experiment simulating a 10.1 h pulse exposure followed by a recovery period, the emergence of G. tokunagai, reproduction of Moina macrocopa, and mortality of Z. platypus were the most sensitive endpoints, whereas algal growth and Limnodrilus hoffmeisteri mortality were less affected. The lowest chronic NOECs were 0.86 mg/L for G. tokunagai, M. macrocopa, and Z. platypus, followed by 1.64 mg/L for A. chinensis and 3.22 mg/L for periphytic algae and L. hoffmeisteri (Table 2). Reported chronic values for phenol—e.g., Pimephales promelas (1.83–20.2 mg/L)41,42, O. latipes (2.63–12.1 mg/L)32, and Daphnia magna (1.5 mg/L)35—were generally higher than those in this study, indicating that M. macrocopa and Z. platypus are relatively sensitive under semi-natural conditions.

Derivation of HC5 values using SSD

Effect of geographical distribution

Chronic toxicity data from 16 native and 36 foreign species were converted into sediment-equivalent species mean chronic values (SMCVs) using the equilibrium partitioning method (EPM) (Table 3, Table S5). SMCVs ranged from 7.38 to 737.81 µg/g dw for native species and from 0.09 to 659.09 µg/g dw for foreign species, with no significant difference (Mann–Whitney U test, p = 0.378). SSD modeling indicated that triangular, logistic, and logistic distributions best described the native, foreign, and combined datasets, respectively. The estimated hazardous concentration protecting 95% of species (HC5) was 10.74 µg/g dw (95% confidence interval [CI]: 6.71–28.58 µg/g dw) for native species, 2.86 µg/g dw (95% CI: 1.09–7.08 µg/g dw) for foreign species, and 4.03 µg/g dw (95% CI: 1.71–8.74 µg/g dw) for the combined dataset (Fig. 1). The HC5 for foreign species was more conservative, likely due to the inclusion of sensitive taxa such as amphibians. However, the difference was within one order of magnitude, suggesting that foreign data can supplement native datasets where local information is limited.

Table 3 Toxicity data used to derive a species sensitivity distribution curve for phenol.
Fig. 1 Species sensitivity distribution curves for phenol using native (red), foreign (blue), and combined (native + foreign) (green) species datasets and estimated hazardous concentration for 5% of species (HC5) with confidence intervals (dashed lines).

Effect of habitat type

Chronic toxicity data from 30 benthic and 22 non-benthic species were analyzed, with SMCVs ranging from 0.09 to 659.09 µg/g dw for benthic species and from 18.48 to 737.81 µg/g dw for non-benthic species (Table 3). SSDs were constructed separately for benthic species, non-benthic species, and all species combined (Fig. 2). The benthic SSD followed a Weibull distribution, while the non-benthic SSD fitted best to a Gumbel model. HC5 was significantly lower for benthic species (1.44 µg/g dw; 95% confidence interval [CI]: 0.32–6.86 µg/g dw) than for non-benthic species (17.85 µg/g dw; 95% CI: 13.87–25.84 µg/g dw), indicating greater sensitivity of benthic organisms to phenol exposure (Fig. 2). When all toxicity data were pooled, the resulting HC5 (4.03 µg/g dw; 95% CI: 1.71–8.74 µg/g dw) was not significantly different from that of benthic species alone (Fig. 2). The greater sensitivity of benthic organisms reflects their direct and prolonged exposure to contaminated sediments. Early life stages of amphibians (eggs and larvae), classified as benthic in this study, strongly influenced the lower tail of the SSD, contributing substantially to the overall sensitivity distribution.

Fig. 2 Species sensitivity distribution curves for phenol using benthic (red), non-benthic (blue), and combined (benthic + non-benthic) (green) species datasets and estimated hazardous concentration for 5% of species (HC5) with confidence intervals (dashed lines).

Effect of taxonomic composition

To evaluate the effect of taxonomic composition on SSD-derived HC5 values, chronic toxicity data were reanalyzed by sequentially excluding each major taxonomic group from the dataset. Removing amphibians increased the HC5 substantially, from 4.03 to 12.33 µg/g dw, whereas excluding any other group caused only minor changes (1.72–3.93 µg/g dw) (Fig. 3). Amphibians thus had a pronounced influence on the conservative end of the SSD curve, confirming their key role in defining protective thresholds.
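The sequential-exclusion analysis can be sketched as a leave-one-group-out loop. Everything in this Python example is illustrative: the grouped values are hypothetical, and a log-normal fit is assumed in place of the distribution selected by the SSD Toolbox.

```python
import math
from statistics import NormalDist, mean, pstdev


def hc5_lognormal(values):
    """5th percentile of a normal distribution fitted to log10 values."""
    logs = [math.log10(v) for v in values]
    return 10 ** NormalDist(mean(logs), pstdev(logs)).inv_cdf(0.05)


def leave_one_group_out(smcv_by_group):
    """HC5 recomputed after dropping each taxonomic group in turn."""
    results = {}
    for excluded in smcv_by_group:
        remaining = [
            value
            for group, values in smcv_by_group.items()
            if group != excluded
            for value in values
        ]
        results[excluded] = hc5_lognormal(remaining)
    return results


# Hypothetical sediment-equivalent SMCVs (ug/g dw) grouped by taxon.
smcv_by_group = {
    "amphibians": [0.1, 0.5, 2.0],
    "fish": [20.0, 50.0, 100.0],
    "crustaceans": [10.0, 40.0, 150.0],
    "algae": [500.0, 700.0],
}
baseline = hc5_lognormal([v for vals in smcv_by_group.values() for v in vals])
hc5_without = leave_one_group_out(smcv_by_group)
```

In this invented dataset, dropping the most sensitive group shifts the lower tail upward and raises the HC5, which mirrors the direction of the amphibian effect reported above.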

Fig. 3 Estimated hazardous concentration for 5% of species (HC5) for phenol calculated using different taxonomic compositions. Error bars denote.

Effect of sample size on HC5 estimation

To evaluate how sample size influences the stability and accuracy of HC5 estimates in SSD modeling, change point analysis was conducted using chronic toxicity data for phenol (Fig. 4). The results showed a general decreasing trend in HC5 values as the sample size increased, indicating that smaller datasets tend to overestimate the HC5. A distinct stabilization point was observed when the number of species included in the SSD reached eight. Beyond this point, both the estimated HC5 values and their associated 95% confidence intervals showed no significant change with increasing sample size. These findings suggest that a minimum of eight species is required to obtain a statistically stable and reliable HC5 estimate for phenol. This threshold may serve as a practical guideline for determining the minimum data requirement in SSD-based ecological risk assessments.

Fig. 4 Hazardous concentration for 5% of species (HC5) for phenol calculated based on the size of toxicity data. Dashed lines denote confidence intervals.

Ecological risks of phenol in sediments

An ecological risk assessment of phenol was performed using the predicted no-effect concentration in sediments (PNECsediment). The PNECsediment (0.81 µg/g dw) was derived by applying an AF of 5 to the HC5 value (4.03 µg/g dw), which was obtained from SSD analysis of sediment-equivalent chronic toxicity values generated via the EPM using all available aquatic toxicity data. Phenol concentrations in sediment were compiled from 23 sites across 15 previous studies3,4,5,6,7,8,43,44,45,46,47,48,49,50,51 (Table 4). The highest concentration (810 µg/g dw) was detected at a former synthetic resin manufacturing site in Brazil44, while the lowest concentration (0.002 µg/g dw) was reported in the Beitang Drainage River, China7. Based on risk quotient (RQ) values calculated as MEC/PNEC, 16 sites (70%) would be categorized as high risk (RQ > 1), and 4 sites (17%) as moderate risk (0.1 < RQ ≤ 1) (Table 4). In total, 87% of the sites indicated at least moderate ecological risk. The highest RQ value (1006) was found at the Brazilian industrial site, followed by the Xigang River (43), Xiehe River (36), and the Surui Mangrove (31) (Table 4). The lowest values were observed in the Beitang Drainage River (0.002), Dagu Drainage River (0.007), and Trabzon, Turkey (0.04) (Table 4).

Table 4 Measured concentration and risk quotient (RQ) of phenol in sediment samples.

The PNECsediment and RQ values reported in previous studies differed markedly from those recalculated in this study using the SSD-derived PNEC. For example, Li et al. reported a PNECsediment of 0.01 µg/g dw and RQ values ranging from 63.4 to 3478.0 in the Xi River and from 19.2 to 2010.0 in the Pu River6, whereas the values recalculated in this study ranged from 0.79 to 43.22 and from 0.24 to 24.97, respectively (Table 5). Similarly, in the Yinma River Basin, Zhou et al. reported a PNECsediment of 0.08 µg/g dw and seasonal RQ ranges of 0.79–1.98 (normal season), 0.93–2.30 (wet season), and 0.41–1.65 (dry season)8 (Table 5). In contrast, this study obtained lower RQs for the same regions: 0.08–0.20, 0.09–0.23, and 0.04–0.16, respectively (Table 5).
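Because an RQ is a simple ratio, a previously reported RQ can be rescaled to a new PNEC without the underlying concentration, as the following Python sketch shows. It assumes the original studies computed RQ as MEC/PNEC, uses a hypothetical function name, and works with the rounded PNEC of 0.81 µg/g dw, so the results can differ from Table 5 in the last digits.

```python
def rescale_rq(rq_reported, pnec_reported, pnec_new):
    """Recover the MEC implied by a reported RQ and divide it by a new PNEC."""
    mec = rq_reported * pnec_reported  # assumes RQ was computed as MEC / PNEC
    return mec / pnec_new


PNEC_THIS_STUDY = 0.81  # ug/g dw, rounded value from the text

# Xi River range reported by Li et al. (PNEC 0.01 ug/g dw).
xi_river = [rescale_rq(rq, 0.01, PNEC_THIS_STUDY) for rq in (63.4, 3478.0)]

# Yinma River Basin, normal season, reported by Zhou et al. (PNEC 0.08 ug/g dw).
yinma_normal = [rescale_rq(rq, 0.08, PNEC_THIS_STUDY) for rq in (0.79, 1.98)]
```

With these inputs the Xi River range becomes roughly 0.78 to 42.9 and the Yinma normal-season range roughly 0.08 to 0.20, close to the recalculated values quoted above.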

Table 5 Comparison of risk quotients (RQs) for phenol in sediment samples.

Notably, in some marine and estuarine environments, the recalculated RQ values were higher than those originally reported. For instance, El-Dekhila Harbor and the Abu Talat area showed RQ values of 2.07–3.34 and 1.78–2.60, respectively, in this study, compared to 0.10–3.34 and 0.01–0.06 reported by El Zokm et al.3 (Table 5). These discrepancies likely result from methodological differences, particularly the reliance on freshwater-based toxicity data, the relatively high assessment factor (AF) applied, and the equilibrium partitioning parameters used in our approach. In regulatory practice, higher AFs are often applied for deriving marine PNECs when few or no marine organisms are represented in the ecotoxicity database, which may also contribute to the observed differences. Furthermore, variations in salinity, organic carbon content, and benthic community structure between freshwater and marine systems may influence the ERA outcomes. Together, these considerations highlight the importance of accounting for habitat-specific parameters, taxonomic representation, and methodological consistency when interpreting ecological risk estimates across diverse aquatic environments.

Discussion

Species-specific toxicity and endpoint sensitivity

Species-specific responses to phenol exposure varied considerably depending on both the type of toxicity test and the biological endpoint measured. In artificial stream experiments, Glyptotendipes tokunagai exhibited the highest sensitivity among all tested species, with a 28 d-NOEC of 0.86 mg/L based on emergence. In contrast, in laboratory acute toxicity tests, G. tokunagai exhibited relatively low sensitivity, with a 48 h-EC50 of 52.5 mg/L based on immobilization. This discrepancy underscores how exposure conditions and endpoint selection can markedly influence the apparent sensitivity of a species and the resulting interpretation of ecological risk.

Similar endpoint-dependent sensitivity was observed in other taxa. For the cladoceran Moina macrocopa, mortality (9 d-NOEC < 0.86 mg/L) was more sensitive than reproduction (9 d-NOEC = 0.86 mg/L). In fish, mortality also tended to be more sensitive than sublethal growth endpoints; for instance, Zacco platypus exhibited mortality at a 30 d-NOEC of 0.86 mg/L, whereas growth-related effects occurred only above 1.64 mg/L. These results highlight that reliance on a single endpoint—particularly acute mortality—may underestimate chronic or sublethal effects that are ecologically significant and more representative of long-term population impacts.

In accordance with the European Chemicals Agency’s Technical Guidance Document on Risk Assessment52, the most sensitive or, where appropriate, geometric mean values of multiple endpoints were used per species for SSD construction. This approach ensures that hazard thresholds reflect the most ecologically relevant responses while maintaining consistency and comparability across species.

Effect of geographical distribution (native vs. foreign species)

A persistent challenge in ecological risk assessment (ERA) is the limited availability of chronic toxicity data for native species. To address this gap, toxicity data from foreign species—often standardized test organisms specified by international guidelines such as the OECD Test Guidelines10—are frequently incorporated into risk evaluations. While this practice broadens the applicability of SSD-based approaches, it also raises concerns about regional ecological relevance, differences in species sensitivity, and the reliability of extrapolated thresholds53.

To investigate the influence of geographical origin on hazard threshold derivation, this study compared HC5 values estimated from native and foreign species. The HC₅ derived from foreign species (2.86 µg/g dw) was notably lower than that from native species (10.74 µg/g dw), yielding an HC₅ ratio (foreign/native) of 0.27. This indicates greater apparent sensitivity among foreign species and a more conservative risk threshold when non-native data are used. These findings align with previously reported ratios for phenol—0.26 for Australian versus Northern Hemisphere species54 and 0.39 for fish in the Netherlands compared to other regions55—and similar patterns observed for other chemicals56,57.

The lower HC₅ obtained from foreign datasets likely reflects differences in taxonomic composition and the inclusion of highly sensitive taxa such as amphibians. Notably, amphibians were identified as the most sensitive group in this study but have not yet been tested among native species, highlighting an important limitation in the current dataset. This absence may partly explain the higher HC₅ derived from native species.

Despite these limitations, the conservative nature of HC₅ values derived from foreign datasets can be advantageous in regulatory contexts, providing a precautionary safeguard where local data are scarce. Integrating high-quality foreign data—particularly those generated under standardized and internationally validated testing protocols—remains a scientifically defensible approach for preliminary risk assessments. However, continued efforts to expand chronic toxicity datasets for native species are essential to improve the regional representativeness and ecological relevance of future ERA frameworks.

Influence of habitat type (benthic vs. non-benthic species)

Comparison of species sensitivity distributions (SSDs) between benthic and non-benthic organisms revealed distinct differences in their response to phenol exposure. The HC₅ value derived from benthic species (1.44 µg/g dw) was substantially lower than that from non-benthic species (17.85 µg/g dw), indicating that benthic organisms were generally more sensitive to phenol. In addition, the SSD curve for non-benthic species showed a steeper slope, suggesting lower variability in sensitivity within this group. These differences can be partly attributed to variations in taxonomic composition and ecological traits between the two habitat types58.

The benthic dataset encompassed seven taxonomic groups, including amphibians, which exhibited the lowest chronic toxicity values (0.09–23.27 µg/g dw). In contrast, the non-benthic dataset consisted primarily of crustaceans (7.38–148.29 µg/g dw), fish (18.48–104.36 µg/g dw) and green algae (525.44–737.81 µg/g dw), all of which showed relatively higher toxicity thresholds. The presence of highly sensitive taxa such as amphibians within the benthic dataset substantially influenced the lower tail of the SSD and contributed to the reduced HC₅ values.

These findings underscore the ecological relevance of benthic species in establishing protective thresholds for sediment-associated contaminants such as phenol. Because benthic organisms inhabit or interact directly with sediments, they typically experience prolonged and repeated exposure to sediment-bound chemicals, making their responses more representative of real-world conditions. In this study, however, the exposure scenario represented a short-term pulse event (10.1 h), which does not fully replicate the continuous exposure that benthic species may encounter in natural environments. The pronounced sensitivity observed in the larvae of Glyptotendipes tokunagai under such a brief exposure suggests that even transient contamination events could trigger adverse effects in benthic communities, but longer-term tests are needed to confirm the cumulative impacts of chronic exposure.

Effect of taxonomic composition

The structure and outcome of species sensitivity distribution (SSD) models are strongly affected by the taxonomic composition of the dataset. In this study, sequentially excluding major taxonomic groups generally resulted in only minor variation in the estimated HC₅ values (Table S5). However, a clear exception was observed for amphibians, which exhibited substantially higher sensitivity to phenol than other taxa (Table 3). When amphibian data were excluded, the SSD curve became noticeably steeper, leading to higher HC₅ estimates and a potential underestimation of ecological risk.

These results highlight the disproportionate influence that highly sensitive taxa can exert on SSD-derived thresholds. Sensitive groups such as amphibians often occupy the lower tail of the SSD, thereby determining the protective end of the distribution59, and their inclusion is critical for ensuring conservative and protective hazard estimates16. Previous research has emphasized that including a wide range of taxonomic groups, especially those representing distinct ecological niches, improves the representativeness and robustness of SSD-based predictions60. Amphibians, which often have benthic early life stages and permeable integuments, are particularly relevant in sediment exposure scenarios and should not be overlooked.

Consistent with the “Technical Guidance for Deriving Environmental Quality Standards (EQS)”61, when sufficient data are available, SSDs can also be constructed using only the most sensitive taxonomic group to derive protective HC₅ values. Nevertheless, overreliance on a single sensitive group without complementary taxa may bias the SSD if inter-taxa variability is not captured. Therefore, SSD construction for ecological risk assessment should aim to include both taxonomic diversity and sensitive groups relevant to the exposure medium, ensuring that derived thresholds are both ecologically representative and adequately protective of vulnerable species.

Influence of sample size on SSD robustness

The reliability of species sensitivity distribution (SSD) models is strongly influenced by the number and diversity of toxicity data available. Previous studies have shown that larger sample sizes reduce uncertainty in both the fitted SSD curve and the estimated hazardous concentration for 5% of species (HC5)60,62. In the present study, increasing the number of toxicity datasets resulted in narrower confidence intervals and lower HC5 values (Table S5), indicating that a more conservative and statistically stable estimation can be achieved as sample size grows37,63. Change point analysis revealed that HC₅ values stabilized once at least eight species were included, beyond which additional data had little effect on either the point estimate or its confidence range.

This aligns with previous findings recommending a minimum of 6–14 datasets depending on modeling methods (e.g., bootstrap, parametric fitting) and chemical-specific variability18,62. However, no universally accepted minimum sample size currently exists. The present findings therefore support adopting an evidence-based criterion—such as the observed convergence of HC₅ values—rather than relying on arbitrary data requirements. In regulatory contexts, particularly for data-limited substances or regions, this approach offers a more transparent and scientifically grounded basis for determining the minimum dataset needed to derive ecologically meaningful and statistically stable SSD-based thresholds.

Derivation and comparison of PNECsediment values

The method used to derive the predicted no-effect concentration in sediment (PNECsediment) plays a critical role in determining the outcome of ecological risk assessment (ERA) for phenol. In this study, a refined stepwise approach was adopted in which aquatic chronic toxicity data were first converted to sediment-equivalent concentrations using the equilibrium partitioning method (EPM), and the resulting dataset was then analyzed using the species sensitivity distribution (SSD) framework. The hazardous concentration protecting 95% of species (HC₅) obtained from the SSD was divided by an assessment factor (AF) of 5 to account for residual uncertainty, yielding a PNECsediment of 0.81 µg/g dry weight (dw).

This approach differs substantially from conventional methods employed in earlier studies, where PNEC values were typically derived by applying large AFs to a few lowest-observed- or no-observed-effect concentrations (LOECs/NOECs), often from pelagic test species, and then extrapolating these values to sediment through a post hoc EPM conversion. Such methodological differences can lead to large variations in the final risk characterization. For example, Li et al.6 reported a PNECsediment of 0.01 µg/g dw—approximately 80 times lower than the value calculated in this study—resulting in very high risk quotients (RQs up to 3478). Similarly, Zhou et al.8 applied a PNEC of 0.08 µg/g dw to seasonal data from the Yinma River Basin, which produced moderate to high risk classifications. When recalculated using the SSD-derived PNEC from the present study, the RQ values for these sites were substantially lower, in some cases by more than one order of magnitude.

These discrepancies highlight how methodological choices, particularly the sequence of EPM application and the extent of dataset integration, can significantly influence ERA outcomes. The combined EPM–SSD approach used in this study provides a more transparent and ecologically representative framework by incorporating interspecies variability and accounting for benthic exposure pathways prior to SSD construction. At the same time, the use of an AF of 5 ensures a conservative interpretation of the HC₅, reflecting the model-based nature of the extrapolation. Therefore, while the PNECsediment derived here offers an improved balance between realism and precaution, it should still be regarded as a provisional threshold until validated by direct sediment toxicity data.

Ecological risk assessment of phenol in sediments

Using the SSD-derived PNECsediment of 0.81 µg/g dw, phenol concentrations compiled from 23 sites across 15 previous studies were reassessed for ecological risk. The results revealed that 16 out of 23 sites (70%) were classified as high risk (RQ > 1), and an additional 4 sites (17%) showed moderate risk (0.1 < RQ ≤ 1), indicating that 87% of the sites exhibited at least moderate ecological concern. Extremely high RQ values were found at industrially impacted sites, including a former synthetic resin plant in Brazil (RQ = 1006), and several sites in China and Brazil had RQs exceeding 30.
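The classification rule can be written out as a short sketch. The cut-offs for high and moderate risk are those stated above; the label used for RQ ≤ 0.1 is an assumption, since the text does not name that class. The example list mixes RQ values quoted in this section with two hypothetical values (0.5 and 0.05) added only to exercise the lower classes.

```python
def classify_rq(rq):
    """Risk class from a risk quotient, using the cut-offs quoted in the text."""
    if rq > 1:
        return "high"
    if rq > 0.1:
        return "moderate"
    return "low"  # assumption: the text does not name the class for RQ <= 0.1


# RQ values quoted in the text, plus two hypothetical ones (0.5 and 0.05)
example_rqs = [1006, 3.34, 2.07, 1.78, 0.5, 0.05]
classes = [classify_rq(rq) for rq in example_rqs]
for rq, risk_class in zip(example_rqs, classes):
    print(f"RQ = {rq}: {risk_class} risk")

# Site counts reported in the text: 16 high and 4 moderate out of 23 sites
n_sites, n_high, n_moderate = 23, 16, 4
share_at_least_moderate = round(100 * (n_high + n_moderate) / n_sites)
print(f"{share_at_least_moderate}% of sites at moderate or high risk")
```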

These findings confirm the persistence and severity of phenol contamination in specific sediment environments, despite the use of a more ecologically robust and less conservative PNEC than in earlier studies. The reassessed risk levels remain elevated at many locations, underscoring the limited effectiveness of natural attenuation and the need for active mitigation strategies.

Interestingly, the reassessment also revealed cases where the updated RQ values were higher than originally reported, particularly in marine and estuarine systems. For example, in El-Dekhila Harbor and Abu Talat, recalculated RQ values (2.07–3.34 and 1.78–2.60, respectively) exceeded those reported by El Zokm et al.4, where some values were as low as 0.01. These differences are likely due to the use of freshwater-based toxicity data and equilibrium partitioning parameters in the current study, which may not fully reflect site-specific characteristics such as salinity, organic carbon content, or benthic community composition.

This variability highlights the importance of considering habitat-specific parameters in ERA and ensuring methodological consistency across studies. Risk assessments that rely on inappropriate or generalized parameters may either underestimate or overestimate risks in non-freshwater systems. Therefore, future assessments should prioritize regionally and environmentally relevant toxicity data, along with refined partitioning models that account for sediment-specific conditions.
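To illustrate why site-specific organic carbon matters, the sketch below applies a simplified equilibrium partitioning expression (sediment concentration = Koc × foc × water concentration) at several organic carbon fractions. This is an assumed, minimal form; regulatory guidance often uses fuller formulations with additional compartment terms, and salinity effects are not represented at all. The Koc, water concentration, and foc values are hypothetical placeholders rather than parameters from this study.

```python
def sediment_equivalent(c_water_ug_per_l, koc_l_per_kg, foc):
    """Simplified EPM: C_sed = Koc * foc * C_water, returned in ug/g dw."""
    ug_per_kg = koc_l_per_kg * foc * c_water_ug_per_l
    return ug_per_kg / 1000.0  # ug/kg -> ug/g


KOC = 30.0       # L/kg, hypothetical value for a weakly sorbing compound
C_WATER = 100.0  # ug/L, hypothetical water-based threshold

# Hypothetical organic carbon fractions spanning low to high carbon sediments
for foc in (0.01, 0.02, 0.05):
    c_sed = sediment_equivalent(C_WATER, KOC, foc)
    print(f"foc = {foc:.2f}: sediment-equivalent threshold = {c_sed:.3f} ug/g dw")
```

Under this form the sediment-equivalent threshold scales linearly with foc, so a fivefold difference in organic carbon between sites translates into a fivefold difference in the derived threshold and in every RQ computed from it.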

Limitations and future directions

The derived PNECsediment represents a model-based extrapolation from waterborne toxicity data rather than direct sediment toxicity tests, and should therefore be considered preliminary. Validation through sediment toxicity experiments with benthic species is necessary to confirm ecological relevance. Nevertheless, artificial-stream data provided valuable chronic and sublethal endpoints under semi-natural conditions, complementing the model-based approach. Future work should focus on generating sediment toxicity data for benthic organisms, refining site-specific partitioning parameters (Koc, foc), and expanding paired acute–chronic datasets to improve acute-to-chronic ratio estimation.

Overall, this study serves as a methodological case example demonstrating how water-based toxicity data, integrated with EPM and SSD, can support sediment ecological risk assessments for moderately hydrophilic substances such as phenol under data-limited conditions.

Conclusion

This study demonstrated the applicability of using water-based toxicity data to assess the ecological risk of phenol-contaminated sediments through the combined use of the equilibrium partitioning method (EPM) and species sensitivity distribution (SSD). The analysis revealed that ecological and dataset characteristics, such as habitat type, taxonomic composition, and sample size, can influence SSD-derived hazardous concentrations (HC₅). Although benthic species and amphibians showed slightly higher sensitivity than other taxa, the differences were within one order of magnitude, suggesting that standard test species such as Daphnia magna provide broadly protective benchmarks. Nevertheless, incorporating diverse taxa improves SSD robustness and enhances the ecological representativeness of derived thresholds.

Based on the derived sediment predicted no-effect concentration (PNECsediment) of 0.81 µg/g dry weight, approximately 87% of reported sites were classified as having moderate to high ecological risk. Compared with conventional assessment factor (AF) methods, the EPM–SSD approach yielded more ecologically relevant and quantitatively consistent estimates, particularly for moderately hydrophilic substances such as phenol (log Koc < 3).

While the EPM–SSD framework offers a practical and scientifically sound solution for data-limited assessments, it remains a model-based approach that requires validation through direct sediment toxicity testing. The present work therefore serves as a case study illustrating how aquatic toxicity data can be strategically applied to derive sediment risk thresholds for compounds lacking sediment data. From an environmental perspective, the findings emphasize the need for timely management of sediment contamination events and for generating sediment-specific ecotoxicological data to improve the accuracy and reliability of future sediment ecological risk assessments.