Introduction

Polychlorinated naphthalenes (PCNs) are a class of compounds containing one to eight chlorine atoms per naphthalene molecule. Exposure to PCNs reportedly causes a wide range of adverse effects, ranging from neurotoxicity, hepatotoxicity, and suppression of the immune response to endocrine disruption leading to embryotoxicity and reproductive disorders1. In May 2015, PCNs were listed in the Stockholm Convention because of their persistence, toxicity, bioaccumulation, and ability for long-range transport2.

PCNs were initially synthesized in 1833 and were widely used in commercial goods during the 1910s–1980s3. Because of their high thermal stability, hydrophobicity, and inertness, PCNs were used in many industrial applications, such as cable insulation, wood preservatives, retardants, engine oil additives, and raw materials for dye production4. During this period, approximately 150,000–400,000 t of technical PCNs were produced5. This historical manufacturing is a significant source of PCNs in the environment. Another important source of PCNs is unintentional emissions from industrial processes6, such as municipal solid waste incineration7, iron ore sintering8, and secondary copper smelting9. Although the production and use of PCNs are now restricted, the legacy of their use and current unintentional emissions have contributed to global pollution5. In recent years, PCNs have been detected in a variety of environmental media10,11, and even in food12,13,14.

An emissions inventory is essential for policymakers to take appropriate measures to reduce PCN concentrations in biotic and abiotic environments and track the origins of these toxic chemicals15. While previous studies have provided data on PCN inventories in specific regions or over specific periods, these discrete data are insufficient to understand global contamination and its associated temporal and spatial variability. For example, between 2005 and 2015, 3267 kg of PCNs were emitted from sinter plants in China16. Estimated emissions from waste incineration and metallurgical sources in 2014 in China were 511.6 kg by mass and 7650.8 mg of toxic equivalents (TEQ)2. In 2019, the total unintentional PCN emissions in China were approximately 757.0 kg, with the iron and steel industry, waste incineration, non-ferrous metal production, and cement production being the primary sources17. Estimates of annual emissions of PCNs from the global coking industry range from 430 to 692 mg TEQ18. There is still a lack of global, long-term, and comprehensive emissions inventories for the historical production and use of PCNs and current unintentional emissions.

In this work, we establish a global inventory of PCNs, spanning emissions resulting from their initial production and use, as well as current unintentional emissions. Specifically, we quantify emissions from the historical production and use of PCNs using material flow analysis (MFA) modeling. Trends in unintentional emissions of PCNs from 2000 to 2020 are estimated at a national level. A global PCN emission inventory on a 1 km × 1 km grid is developed for 2020. Additionally, we create a random forest model of emissions and economic parameters to predict PCN emissions. Future trends up to 2050 are simulated according to Intergovernmental Panel on Climate Change (IPCC) scenarios19. We also use the BETR Global transport model and Monte Carlo simulations to evaluate the uncertainty of the estimated results. Overall, the results of this work improve our understanding of the quantitative relationship between the emission sources of PCNs and their occurrence in the global environment. The emission estimates serve as a starting point for future worldwide fate and transport modeling of PCNs. The results could also inform region-specific decision-making to cost-effectively reduce the global environmental and health risks of PCNs.

Results and discussion

Global emissions from industrial production and use

From 1912 to 1987, the cumulative global production of PCNs was 1,042,790.65 t (Supplementary Method 1). The production, manufacture, and use of PCNs, which occurred mainly in the 1920s and 1960s, emitted 468,014 t of PCNs to the environment globally. Most of the cumulative PCN emissions entered the atmosphere (99.4%). Emissions to water and soil were relatively small, accounting for 0.4% and 0.2% of the total emissions, respectively. Throughout the life cycle, emissions to air occurred during all phases except landfilling, with 96.6% of emissions occurring during the product use stage. Although a relatively high proportion of PCNs was used as an impregnant in capacitors, emissions from capacitors should theoretically be low because they are closed systems. However, our data suggest that the use of PCNs as an impregnant is a significant contributor to emissions, accounting for 85.3% of the total emissions. Therefore, our estimation model may have overestimated the emission potential of PCNs in closed systems, and estimates based on production and use alone need further optimization and improvement. Additionally, dyes (8.7%) and additives (4.5%) were important sources of PCN emissions, while masking agents contributed the least (1.8%).

Figure 1 shows the global emissions of different congeners of PCNs. Tri-CNs dominated PCN emissions to air, while mono-CNs dominated emissions to water and soil. CN-21/24 was the most significant contributor to total emissions at 92,541.2 t. This could be attributed to the fact that most technical PCNs are lower chlorinated PCNs, with tri-CNs and tetra-CNs dominating20. Lower chlorinated PCNs are more likely to volatilize into the environment than higher chlorinated PCNs21. This is consistent with findings in the environment, where tri-CNs and tetra-CNs are predominantly detected in most environments22,23. Temporal trends in global emissions showed that emissions increased significantly to a maximum between the late 1950s and the mid-1960s. This is consistent with the proportions detected in a dated core from the profundal sediments of Esthwaite Water23. In this previous study, CN-24/14 was the congener with the highest concentration in the early 1960s. Our estimated yield of PCNs was greater than that in previous studies, which estimated production of 100–40,000 t (Supplementary Table 1)5,24,25,26,27. The limitations of previous estimates include: (1) a focus on a single period within the entire production cycle; (2) the lack of a clear methodology for yield estimation; and (3) estimation of yield based solely on PCN content in industrial polychlorinated biphenyls (PCBs).

Fig. 1: Polychlorinated naphthalene emissions from historical production.

a Cumulative emissions of PCNs by phase. WWTP: waste water treatment plant. b Annual emissions of PCNs to environmental media. c Annual emissions of PCN congeners and their share in the environment. Black line refers to total congener emissions. d Historical production material flow analysis of PCNs.

Global unintentional emissions

We used random forests with historical national emissions and critical influencing factors to construct regression models for individual sources, filling in the missing values for national emissions of PCNs worldwide for the period from 2000 to 2020. The coefficient of determination (R²) ranged from 0.82 to 0.99 (Supplementary Table 2), which suggests that the models perform well in predicting PCN emissions at a national level (Supplementary Fig. 1). The key characterization factor varies from source to source (Supplementary Table 2). For most sources, the key characterization factor is primary energy consumption. For secondary aluminum production, it is gross domestic product. Population is the key characterization factor for emission sources such as municipal waste (MW) incineration, primary copper production, secondary zinc production, household heating, and wildfires.

Global emissions of PCNs were 293.5 t (15.8 kg TEQ) in 2020 (Supplementary Table 3). Globally, MW incineration was the main source of PCNs, accounting for 94.5% of global PCNs, followed by cement production (1.6%) and electric arc furnace (EAF) steelmaking (1.6%). The main source of emissions in terms of TEQ was MW incineration (98.0%), followed by iron production (0.7%) and cement production (0.5%). The emission factor (EF) for EAF steelmaking was approximately four times that of iron production in terms of mass concentration, and one to two times that of iron production in terms of the TEQ concentration28,29. The main sources of PCNs vary from country to country because of differences in economic development, industrial structure, energy structure, population, and the natural environment. PCN emissions are mainly concentrated in developing countries, which account for 99.1% of global emissions. For developing countries, MW incineration was the main source (95.3%), followed by cement production (1.6%) and EAF steelmaking (1.2%). In contrast, the main source of emissions in developed countries was EAF steelmaking (42.0%), followed by iron production (13.4%) and iron ore sintering (11.3%). This is because developed countries, such as the United States and Japan, have advanced waste management strategies to mitigate the environmental effects of large-scale production of consumer goods30. Kazakhstan was the largest emitter of PCNs in the world in 2020, with 125.8 t or 42.9% of global emissions, and this mainly originated from MW incineration (99.95%). Compared with developed countries, Kazakhstan has inefficient waste utilization31. The next largest PCN emitters were Russia (90.2 t), Armenia (23.3 t), Bangladesh (87.2 t), and China (66.3 t), which together had approximately 86.8% of global emissions.

Figure 2 shows the spatial distribution of total global emissions of PCNs from all source sectors in 2020 with a grid resolution of 1 km × 1 km. Supplementary Figs. 2–4 illustrate the spatial distribution of industrial sources, household combustion sources, and wildfire sources, respectively. Overall, the global emissions of PCNs varied considerably spatially. Figure 3a shows the total PCN emissions in 2020 from the nine geographic regions and the relative contribution of each source to the major geographic regions. Globally, PCN emissions came mostly from MW incineration, which dominated emissions in South America (79.8%), North Africa (89.1%), Sub-Saharan Africa (93.4%), Europe (97.5%), South and Southeast Asia (85.0%), and West and Central Asia (99.5%). Most of these areas are economically underdeveloped and have poor waste management systems, which result in high emissions of PCNs from MW incineration31. In economically developed regions, the contribution from industrial sources was relatively high. In North America, Oceania, and East Asia, iron and steel production was the main source of emissions, with the main sources being EAF steelmaking (60.9%), iron ore production (43.5%), and iron production (26.5%), respectively. In East Asia, the main contributors to emissions from ferrous and non-ferrous metal production sources were iron production (19.0%), EAF steelmaking (16.8%), and oxygen blown converter steelmaking (12.2%). According to the World Steel Association, in 2022, China, India, and Japan were the world’s three largest crude steel producers32.

Fig. 2: Total global polychlorinated naphthalene emissions in 2020 with 1 km × 1 km resolution.

The map was drawn based on the vector data from the Resource and Environmental Science Data Platform (https://www.resdc.cn/data.aspx?DATAID=205).

Fig. 3: Unintentional emissions of polychlorinated naphthalenes (PCNs).

a Emissions of PCNs by region in 2020. b Emissions of PCNs from 2000 to 2020. Black line refers to total emissions by region or year. (prod.: production; Sec.: secondary; Oxy.: oxygen blown converter; Elec.: electric arc furnace; sint.: sintering; incin.: incineration; HW: hazardous waste; MW: municipal waste).

Emissions temporal trends

Figure 3b shows the historical unintentional emission trends of PCNs globally from 2000 to 2020. From 2000 to 2020, a cumulative 11,534 t of PCNs were emitted globally. The overall trend was upward and then downward, mainly because of changes in the amount of waste incinerated. Compared with results previously reported in the literature2,16,17,33,34,35,36, our estimates are high. This is mainly because we included the effects of technological change and emissions that are not effectively controlled. Emissions of PCNs from all sources in all United Nations Economic Commission for Europe countries (except for the United States and Canada) were estimated in this study for the year 2000 and compared with those from a previous study (Supplementary Fig. 5)33. For most countries, PCN emissions in this study were 1 to 100 times those in the earlier study. Comparisons with other inventories are all within a factor of 10 (Supplementary Fig. 6). This suggests that estimation of PCN emissions as 1/50 of the PCDD/F emissions will lead to underestimation of PCN emissions.

The average annual emission of PCNs from 2000 to 2020 was 577 t, while the average yearly emission from historical intentional production and use was 6240 t. Current unintentional emissions are therefore roughly one-tenth of the emissions from historical intentional production and use. This suggests that banning the production of PCNs is effective for controlling PCN emissions. Historical intentional emissions were the main source of air pollution from PCNs during their early use, especially in countries where PCNs were predominantly produced and used37,38. Legacy effects of historical emissions have been detected even in areas where PCNs were not historically produced or used, such as Harbin in 2007, which suggests that historical emissions of PCNs are global and long-term39. The environmental effects of emissions from the historical production and use of PCNs have decreased over time40. For example, the concentrations of CN-71/72 and CN-59, which are indicators of the technical product Halowax 1014, detected in sediment cores from Jiaozhou Bay, China, have gradually decreased since 199041. Nevertheless, the environmental effects of historical emissions will continue for a long time. For example, in 2014, PCN concentrations at rural and peri-urban sites in the Jenab River region of Pakistan were mainly influenced by coal and wood burning, while urban sites were mainly influenced by historical PCN production42. Recently, historical production and use and combustion sources have been identified as the main sources of emissions in environmental samples43,44. The proportion of emissions from combustion sources is gradually increasing, both in environmental and human samples13,45,46. The results of this study indicate that the continued unintentional release of PCNs during thermal processes is an increasingly important factor in PCN pollution and human exposure.

Emissions of PCNs were modeled for each country and region between 2020 and 2050 using six scenarios (A1G, A1B, A1T, A2, B1, and B2) from the IPCC Special Report on Emissions Scenarios (SRES) (Supplementary Method 2). Combining available statistics from the World Health Organization with the IPCC’s projection scenarios, most scenarios were optimistic for primary energy use47. Like many other projection studies48, our results face significant uncertainty, which stems from the complexity and unpredictability of socio-economic development, the evolution of the energy mix, and technological advances. Additionally, changes in EFs and technology split scores may also be influenced by changes in socio-economic development. Despite these uncertainties, our projections consistently show that emissions will trend upward under most scenarios (Fig. 4). Future emissions of PCNs are likely to fluctuate significantly, with changes ranging from a decrease of 29% to an increase of 347%.

Fig. 4: Global temporal trends of PCN emissions from 2000 to 2050.

The six charts show the forecast results for different regions: a Total; b ALM (Africa and Latin America region); c ASIA; d OECD (Organization for Economic Co-operation and Development); e REF (countries undergoing economic reform: Central and Eastern Europe, newly independent states of the former Soviet Union and Sub-Saharan); and f others (countries and territories not mentioned in the IPCC). Black line refers to total emissions by region from 2000 to 2020. A1B, A1G, A1T, A2, B1, B2 are the six illustrative scenarios predicted by the IPCC.

Future regional emissions are mainly concentrated in the ALM (Africa and Latin America) region, and the growth in 2020–2030 is mainly attributed to this region. According to the 2000–2020 inventory, the main emissions of PCNs come from waste incineration. The ALM region is dominated by lower-middle income countries, and the current share of incineration in the region is almost zero31. Waste generation is expected to increase significantly as urbanization and economic growth accelerate. Total waste in Sub-Saharan Africa is expected to increase approximately threefold31. Waste incineration plays a prominent role in general waste management as a means of reducing waste volumes and recovering energy, which can be used to supplement traditional supplies. For example, in upper-middle income countries, the share of waste incineration has grown from 0.1% to 10% in the past decade31. As its economy develops, the ALM region may follow the same trend as the ASIA region did in earlier years, with emissions first increasing and then decreasing.

Global PCN emissions increase most in the A1G scenario by 2050. This is because the A1G scenario assumes a fossil fuel-intensive profile, and PCN emissions are influenced by primary energy consumption (Supplementary Table 4). The most stable scenario is A1T, which assumes a technologically diverse and non-fossil fuel-intensive world. Except for the ALM region, which shows a significant upward trend, emissions in the other regions increase only marginally. In Asia, emissions decline in most scenarios. In the B1 scenario, Asia shows the largest decrease in PCN emissions, with a 66% reduction. However, unlike dioxins, PCNs have not been assigned clear concentration limits in plant emission control policies4,49. By 2050, global waste is expected to increase to 3.40 billion t31. Without improvements in the waste incineration sector, PCN emissions will continue to grow.

Model evaluation and uncertainty analysis

Limitations because of a lack of data can introduce uncertainty into estimates and assessments. Early historical data are poorly documented officially and are mostly rough estimates, which directly contributes to the high level of uncertainty in estimating emissions of PCNs from historical production and use. Additionally, the physicochemical parameters used in the model estimates may contain uncertainties because these parameters are heavily dependent on predictions. To assess the adequacy of model estimates of global PCN emissions, we used them as inputs to the global fate and transport model BETR Global and generated time- and space-resolved PCN concentrations for comparison with environmental monitoring data from the literature (Supplementary Fig. 7). Comparisons showed that more than 82% of the predicted concentrations were within two orders of magnitude of the measured data, and more than 59% were within one order of magnitude. Considering the variability of concentrations in environmental monitoring, this level of agreement between model and measurements indicates that the model performs satisfactorily for characterizing global emissions of PCNs. In addition, we performed sensitivity analyses for the MFA model parameters. The sensitivity |S| values for each parameter ranged from 0 to 1, exhibiting no or limited sensitivity (Supplementary Table 5).
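
The order-of-magnitude comparison described above can be illustrated with a short sketch. It is not the evaluation code used in this study: the function name `fraction_within` and the example concentrations are hypothetical, and the sketch assumes that agreement is judged by the absolute base-10 logarithm of the predicted-to-measured ratio.

```python
import math

def fraction_within(predicted, measured, orders):
    """Share of (predicted, measured) pairs that agree within `orders` orders of magnitude."""
    # Only strictly positive pairs can be compared on a log scale.
    pairs = [(p, m) for p, m in zip(predicted, measured) if p > 0 and m > 0]
    hits = sum(1 for p, m in pairs if abs(math.log10(p / m)) <= orders)
    return hits / len(pairs)

# Hypothetical concentrations in arbitrary units, for illustration only.
predicted = [1.0, 50.0, 0.02, 300.0]
measured = [2.0, 4.0, 5.0, 250.0]

within_one = fraction_within(predicted, measured, 1)
within_two = fraction_within(predicted, measured, 2)
```

With these illustrative values, two of the four pairs agree within one order of magnitude and three of the four within two orders.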

For current unintentional emissions, uncertainties arise mainly from incomplete data on activity levels and the geographical specificity of EFs. A modeling approach has been used to fill the data gaps, which undoubtedly introduces uncertainty to the emission estimates. However, this uncertainty is considered relatively limited because the data used are collected nationally, and the countries with missing data typically have low activity levels. Given the scarcity and heterogeneity of EF data at the global level, the geometric mean calculated using available and accessible data was used as a proxy for EFs in this study. While we recognize that this approach may not fully capture the emission characteristics specific to different countries or regions, we believe it is a reasonable approximation under current data conditions. By performing 10,000 iterations of a Monte Carlo stochastic simulation of global unintentional emissions of PCNs from 2000 to 2020, we found that the probability density function of unintentional emissions of PCNs had a log-normal distribution (Supplementary Fig. 8). We used 95% confidence intervals to define the margin of error. The difference between the 2.5th and 97.5th percentiles was within acceptable limits, which indicated the results were robust. From 2000 to 2020, the relative random error based on the mean ranged from −96.42% to 472.68% (Supplementary Table 6). In 2020, the relative random error based on the mean ranged from −99.39% to 456.22% (Supplementary Table 7). PCN emissions were in the 95% probability range of 5.95 to 5398.87 kg. Additionally, we identified the emission sources that contributed the most to the uncertainty through Pearson correlation analysis. MW incineration was the primary source of uncertainty, followed by oxygen blown converter steelmaking (Supplementary Fig. 9).
It is worth noting that despite the Monte Carlo analysis used in this study, additional uncertainties could not be captured because of a lack of information, which made an exhaustive quantitative assessment of all sources of uncertainty impossible.
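
A minimal sketch of a Monte Carlo estimate of this kind is given below. It is an illustration under stated assumptions rather than the simulation performed in this study: it treats a single source with a log-normally distributed EF described by a geometric mean and geometric standard deviation, holds the activity level fixed, and uses hypothetical input values and function names.

```python
import math
import random

def simulate_emissions(activity_t, ef_geo_mean, ef_geo_sd, n_iter=10_000, seed=0):
    """Monte Carlo draws of E = EF x A with a log-normally distributed EF.

    Returns the mean and the 2.5th and 97.5th percentiles of the simulated emissions.
    """
    rng = random.Random(seed)
    # Parameters of the underlying normal distribution of ln(EF).
    mu, sigma = math.log(ef_geo_mean), math.log(ef_geo_sd)
    draws = sorted(activity_t * rng.lognormvariate(mu, sigma) for _ in range(n_iter))
    mean = sum(draws) / n_iter
    low = draws[int(0.025 * n_iter)]
    high = draws[int(0.975 * n_iter)]
    return mean, low, high

def relative_error_percent(bound, mean):
    """Relative error of a percentile bound with respect to the mean, in percent."""
    return (bound - mean) / mean * 100.0

# Hypothetical inputs: 1e6 t of activity, EF geometric mean 10 ug/t, geometric SD 3.
mean, low, high = simulate_emissions(activity_t=1.0e6, ef_geo_mean=10.0, ef_geo_sd=3.0)
rel_low = relative_error_percent(low, mean)
rel_high = relative_error_percent(high, mean)
```

Because a log-normal distribution is right-skewed, the upper relative error is larger in magnitude than the lower one, which matches the asymmetric error ranges reported above.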

More precise data, especially EFs, are needed to improve the accuracy and reliability of future studies. Efforts should be made to collect more localized and high-resolution emission data to develop more accurate and representative EFs. Collecting these data is essential to enhance the accuracy and robustness of the inventory. Despite the limitations described above, we consider our methodology and associated data to be the best-supported options given the data currently available. We believe that with continuous improvement of the data and further methodology development, our study will provide a solid basis for assessing unintentional emissions.

Methods

Inventory development

In this study, emissions of PCNs were calculated for three periods (Supplementary Fig. 10): (i) the historical production period (1912–1987), (ii) the unintentional emission period (2000–2020), and (iii) the future emission period (2020–2050).

(i) Historical production period. The production period is presumed to span from the first report of manufacturing activities in 1912 to the last documented instance in 198720. During this period, PCNs were circulated and used as products worldwide. Therefore, the MFA method was used to calculate the emissions throughout the entire product life cycle, which encompassed the stages of production, processing, utilization, and disposal of products containing PCNs (Supplementary Fig. 11). Because historical PCN import and export information is lacking, our system does not consider emissions due to trade flows. We used the CiP-CAFE (Chemicals in Products – Comprehensive Anthropospheric Fate Estimation) model to simulate emission fractions and rates50. The model tracks long-term changes in chemicals in products globally, including the flow paths of chemicals through the stages of production, use and waste disposal, as well as stockpiles and releases over time. We considered four end-use applications of PCNs: additives, impregnants, masking agents, and soluble dyes. The module has three inputs: (1) production of PCN congeners and production of finished formulations, (2) use of products containing PCNs, and (3) physicochemical properties of PCNs. The outputs of the module are the corresponding annual emissions of PCNs to different environmental compartments (air, water and soil) at different life cycle stages for each region in the producing country.

(ii) Unintentional emission period (2000–2020). The inventory was developed using a top-down approach. We primarily considered 20 quantifiable sources of unintentional emissions, categorized into five groups (waste incineration, ferrous and non-ferrous metal production, heat and power generation, production of mineral products, and production of mineral products) and 14 categories (Supplementary Table 8). In addition to the nine combustion sources, 11 material production sources were considered, including five processes involved in the iron and steel industry (sintering, coke production, pig iron production, oxygen blown converter steelmaking, and electric arc furnace steelmaking) and production of copper, aluminum, lead, zinc, magnesium, and cement. A total of 248 countries and regions were included in the emissions calculations (Supplementary Table 9). Because complete indicator data were not available for every country in every year, we assumed that all countries had relevant economic activities. For spatial analysis, the 2020 minimum emissions data were gridded to a resolution of 1 km × 1 km. We built a regression model of emissions using random forest models. This process may introduce uncertainty because countries differ in politics, resources, and other respects, and the activity may not occur in some regions. However, countries with missing data usually account for a minority of PCN emissions.

(iii) Future emission period (2020–2050). Future emissions were projected for six scenarios (A1G, A1B, A1T, A2, B1 and B2) of future economic activity assumed by the IPCC using the random forest model19. Based on the IPCC SRES projections of future GDP, population, and primary energy use, ten-year global and regional (OECD, REF, ASIA, ALM) growth rates for these features were calculated for 2020–2050. Calculations for other countries used the global average growth rate. Feature data for 2020–2050 were then obtained by applying these growth rates to the 2020 feature data.
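
The projection of feature data from the 2020 baseline can be sketched as follows. This is an illustration only: it assumes that each ten-year growth rate is a fractional change applied once per decade and compounded on the previous value, and the function name and the example rates are hypothetical rather than values from the IPCC scenarios.

```python
def project_feature(base_value, decadal_growth_rates, base_year=2020):
    """Return {year: value}, applying one fractional growth rate per decade."""
    year, value = base_year, base_value
    projection = {year: value}
    for rate in decadal_growth_rates:
        year += 10
        value *= 1.0 + rate  # compound on the previous decade's value
        projection[year] = value
    return projection

# Hypothetical index value of 100 in 2020 growing by 30%, 20% and 10% per decade.
gdp_path = project_feature(100.0, [0.30, 0.20, 0.10])
```

The projected features for each decade would then be passed to the fitted regression model to obtain the corresponding emissions.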

Data analysis of PCN production

Historical production of PCNs occurred primarily in the United States, Germany, Great Britain, France, Italy, Japan, Poland and Russia (Supplementary Table 10). We gathered, reviewed, and curated the literature-reported annual production volumes of technical PCN mixtures in the main producing countries between 1912 and 1987 (Supplementary Method 1). The retrieved data were processed as follows: (1) data with long-term series were preferred; (2) mean interpolation was used for years that lacked specific production data but had total production for the period; and (3) for countries with no available data in the literature, data were interpolated using a regression model constructed from the available annual data using GDP. The model was represented by yield = 193011.371 × lg(GDP) − 1064059.419 (R² = 0.873, P < 0.001). For U.S. production, we smoothed the collected and interpolated data using locally weighted regression (LOESS) (Supplementary Method 1 and Supplementary Fig. 12)51. Global production of PCNs is shown in Supplementary Fig. 13. For congener yield calculations, we obtained the proportion of congeners in each use category by a weighted average of the yields of technical PCNs with known compositional proportions (Supplementary Table 11).
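
The GDP-based regression above can be written as a small helper, shown here only as a sketch. The coefficients are those reported in the text; the function name is hypothetical, the units of GDP and yield are those of Supplementary Method 1 and are not restated here, and the example input is arbitrary.

```python
import math

# Coefficients of the reported model: yield = 193011.371 * lg(GDP) - 1064059.419
SLOPE = 193011.371
INTERCEPT = -1064059.419

def estimate_yield(gdp):
    """Estimate annual PCN yield from GDP using the reported log-linear regression."""
    if gdp <= 0:
        raise ValueError("GDP must be positive to take its base-10 logarithm")
    return SLOPE * math.log10(gdp) + INTERCEPT

# Arbitrary illustrative input (lg(GDP) = 6).
example_yield = estimate_yield(10 ** 6)
```

Note that the fitted line returns negative values when lg(GDP) is below roughly 5.51, so a sketch like this is only meaningful within the GDP range used to fit the model.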

Activity and emission factor data

Detailed data sources for activity levels are listed in Supplementary Table 8. In selecting activity data, priority was given to data from a single authoritative database to ensure long time coverage and comprehensive source categories. Where data were missing, statistical datasets from international organizations and data reported in the literature were used as supplements, with priority given to international organizations. The data were mainly derived from the World Steel Association (https://worldsteel.org/), the International Energy Agency (https://www.iea.org), the United States Geological Survey (https://www.usgs.gov), the Food and Agriculture Organization of the United Nations (https://www.fao.org), the United States Energy Information Administration (https://www.eia.gov), and the United Nations Statistics Division (https://unstats.un.org). For forest carbon stock, a 5-year count was used and linear growth was assumed for missing years52. Global emissions were geographically assigned to a 1 km × 1 km global grid using proxy data, including GDP (industrial sources), population data (household sources), and forest burning (forest combustion sources) (Supplementary Table 12)53,54,55. In total, we compiled emission information for 198 plants from 87 papers. To obtain emission factors (EFs) more appropriate to the local economic situation, we made assumptions about plants for which emission concentrations were measured but EFs were not calculated. We classified the technological level of the plants based on the Toolkit for Identification and Quantification of Releases of Dioxins, Furans and Other Unintentional POPs56. Possible EFs were then calculated from the operating parameters assumed in the toolkit for the different processes (mainly flue gas flow) combined with the flue gas concentrations reported in the literature. We used the following equation to estimate the EFs:

$${EF}=C\times R$$
(1)

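where EF is the emission factor (μg t⁻¹ or ng TEQ t⁻¹); C is the concentration of PCNs (ng Nm⁻³ or pg TEQ Nm⁻³); R is the flue gas flow rate (Nm³ t⁻¹). With these units, the product C × R is divided by 1000 to convert ng t⁻¹ to μg t⁻¹ (or pg TEQ t⁻¹ to ng TEQ t⁻¹).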
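
As a minimal sketch of Eq. 1 with the units listed above, the following function multiplies a flue gas concentration by a flue gas flow rate and converts the result from ng t⁻¹ to μg t⁻¹. The function name and the example values are hypothetical and are not measurements from this study.

```python
NG_PER_UG = 1000.0  # 1 ug = 1000 ng (the same factor converts pg TEQ to ng TEQ)

def emission_factor_ug_per_t(conc_ng_per_nm3, flue_gas_nm3_per_t):
    """Eq. 1: EF = C x R, returned in ug per tonne."""
    ef_ng_per_t = conc_ng_per_nm3 * flue_gas_nm3_per_t
    return ef_ng_per_t / NG_PER_UG

# Hypothetical plant: 20 ng/Nm3 in flue gas and 5000 Nm3 of flue gas per tonne.
ef_example = emission_factor_ug_per_t(20.0, 5000.0)
```

For these illustrative inputs the EF is 100 μg t⁻¹.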

The EFs mostly followed a log-normal distribution, so the geometric mean was taken as representative of the whole (Supplementary Table 13)48. For industrial sources, EFs usually tend to decrease over time, and technology splitting is often used to model these EFs48,57,58. Emissions from industrial sources were calculated using Eq. 2.

$${E}_{j,k,t}={{EF}}_{j,k,t}\times {A}_{j,k,t}\times {X}_{j,k,t}$$
(2)

where j, k, and t are category, country, and year, respectively; E is the emission of category j from country k in year t (μg or ng TEQ); EF is the emission factor specific to each technology (μg t⁻¹ or ng TEQ t⁻¹); A is production (t year⁻¹); X is the fraction of production for this sector by a specific technology, with ∑X = 1 for each sector. For waste incineration emission sources, the result is additionally multiplied by the waste incineration efficiency (98%)59.
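
Eq. 2 can be sketched for one category, country, and year as below. The sketch assumes that the total for a sector is obtained by summing EF × A × X over its technologies, which is our reading of the technology-split approach; the function name, the optional `efficiency` argument, and the example values are hypothetical.

```python
def industrial_emissions(production_t, technologies, efficiency=1.0):
    """Sum EF x A x X over the technologies of one sector.

    technologies: list of (ef_ug_per_t, fraction) pairs whose fractions sum to 1.
    efficiency: extra multiplier, e.g. 0.98 for waste incineration sources.
    """
    total_fraction = sum(fraction for _, fraction in technologies)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("technology fractions must sum to 1 for each sector")
    total = sum(ef * production_t * fraction for ef, fraction in technologies)
    return efficiency * total

# Hypothetical sector: 1000 t produced, 25% by an older technology (200 ug/t)
# and 75% by a newer one (40 ug/t).
split = [(200.0, 0.25), (40.0, 0.75)]
e_industrial = industrial_emissions(1000.0, split)
e_incineration = industrial_emissions(1000.0, split, efficiency=0.98)
```

With these illustrative numbers the sector emits 80,000 μg, or 78,400 μg when the 98% incineration efficiency is applied.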

X was calculated using Eq. 3:

$$X=\left({X}_{0}-{X}_{f}\right){e}^{\left(-{\left(t-{t}_{0}\right)}^{2}/2{s}^{2}\right)}+{X}_{f}$$
(3)

where t is the target year; t0 is the year in which the technology transition begins; s is the rate parameter of the transition; X0 and Xf are the initial and final values of the technology fractions, respectively. The parameter values were taken from Shen et al. (Supplementary Table 14)48.
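
The technology transition in Eq. 3 can be sketched as follows. Holding X at X0 before the transition year t0 is an assumption of this sketch, since Eq. 3 as written does not state the behavior for t < t0; the function name and parameter values are hypothetical and are not those in Supplementary Table 14.

```python
import math

def technology_fraction(t, t0, s, x0, xf):
    """Eq. 3: X = (X0 - Xf) * exp(-(t - t0)^2 / (2 s^2)) + Xf."""
    if t <= t0:
        return x0  # assumption: no transition has occurred before t0
    return (x0 - xf) * math.exp(-((t - t0) ** 2) / (2.0 * s ** 2)) + xf

# Hypothetical transition: the fraction falls from 0.8 towards 0.1 after 2000, s = 10 years.
x_start = technology_fraction(2000, 2000, 10.0, 0.8, 0.1)
x_mid = technology_fraction(2010, 2000, 10.0, 0.8, 0.1)
x_late = technology_fraction(2100, 2000, 10.0, 0.8, 0.1)
```

The fraction starts at X0, decays along a Gaussian-shaped curve whose width is set by s, and approaches Xf for years far beyond t0.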

Emissions from fuel combustion were calculated using Eq. 4.

$$E=\frac{{EF}\times {FC}\times {CE}}{Q}$$
(4)

where E is emissions (μg or ng TEQ); EF is the emission factor (μg t⁻¹ or ng TEQ t⁻¹); FC is fuel consumption (kJ year⁻¹); Q is the calorific value of the fuel (kJ t⁻¹); CE is the combustion efficiency (%). Q is from Lee et al.60. CE is from Wang et al.59.
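
A minimal sketch of Eq. 4 is shown below. It makes the intermediate step explicit: dividing fuel consumption in energy units by the calorific value gives the fuel mass burned, which is then multiplied by the EF and the combustion efficiency. The function name and input values are hypothetical, and the combustion efficiency is passed as a fraction rather than a percentage.

```python
def combustion_emissions(ef_ug_per_t, fuel_kj, calorific_kj_per_t, combustion_eff):
    """Eq. 4: E = EF x FC x CE / Q, returned in ug."""
    fuel_mass_t = fuel_kj / calorific_kj_per_t  # energy consumed -> tonnes of fuel
    return ef_ug_per_t * fuel_mass_t * combustion_eff

# Hypothetical fuel: 2e9 kJ consumed, 2e7 kJ/t calorific value, EF 50 ug/t, CE 0.9.
e_fuel = combustion_emissions(50.0, 2.0e9, 2.0e7, 0.9)
```

Here 100 t of fuel are burned, giving 4500 μg of emissions for the illustrative EF and efficiency.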

Emissions from wildfires were calculated using Eq. 5:

$$E={FL}\times {BA}\times {EF}\times {CE}$$
(5)

where E is emissions from wildfire (μg or ng TEQ); FL is the fuel load (t km⁻²); BA is the area of forest burned (km² year⁻¹); EF is the emission factor (μg t⁻¹ or ng TEQ t⁻¹); CE is the combustion efficiency (%). CE is from Song et al.61.
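
Eq. 5 can be sketched in the same way. The example sums the equation over several burned areas with different fuel loads, which is an illustrative extension rather than a procedure stated in this study; the function name and all values are hypothetical, and the combustion efficiency is again expressed as a fraction.

```python
def wildfire_emissions(fuel_load_t_per_km2, burned_area_km2, ef_ug_per_t, combustion_eff):
    """Eq. 5: E = FL x BA x EF x CE, returned in ug."""
    return fuel_load_t_per_km2 * burned_area_km2 * ef_ug_per_t * combustion_eff

# Hypothetical burned areas as (fuel load in t/km2, burned area in km2) pairs.
fires = [(1000.0, 20.0), (500.0, 8.0)]
e_wildfire = sum(wildfire_emissions(fl, ba, 10.0, 0.5) for fl, ba in fires)
```

The two illustrative fires contribute 100,000 μg and 20,000 μg, respectively.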

Random forest model

We built random forest regression models based on the calculated emissions, using three common drivers as predictors: national gross domestic product (GDP), population, and primary energy consumption. All driver data are from the World Bank (Indicators | Data (worldbank.org)). The models were used to estimate missing emission values for other countries. The complete data set was divided into a training set and a test set at an 8:2 ratio. Hyperparameters of the random forest regressor were optimized using Optuna (a Python library) (Supplementary Table 15). Candidate hyperparameters were evaluated on the training set using the mean square error (MSE), and the combination giving the lowest MSE on the validation set was selected. Each model was then trained by substituting the selected optimal hyperparameters into the model algorithm. R2 and MSE were used to evaluate model performance. The random forest models were implemented with open-source code in Python using PyCharm software (https://www.jetbrains.com)62.
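The workflow can be sketched with scikit-learn as follows. This is not the authors' code: the data are synthetic, the three columns only stand in for GDP, population, and primary energy consumption, and the Optuna search is replaced here by fixed placeholder hyperparameters.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 200 "countries", three scaled drivers.
rng = np.random.default_rng(0)
n_samples = 200
X = rng.uniform(0.0, 1.0, size=(n_samples, 3))  # GDP, population, energy
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] * X[:, 2] + rng.normal(0.0, 0.05, n_samples)

# 8:2 split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Placeholder hyperparameters; the study tuned these with Optuna.
model = RandomForestRegressor(n_estimators=200, max_depth=8, random_state=0)
model.fit(X_train, y_train)

# Evaluate on the held-out test set with the two reported metrics.
predictions = model.predict(X_test)
mse = mean_squared_error(y_test, predictions)
r2 = r2_score(y_test, predictions)
```

Once fitted, model.predict can be called on the drivers of countries lacking emission estimates.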

Model performance evaluation and uncertainty analysis

Using the BETR Global 4.0 model, concentrations in ambient media were simulated from historical emission estimates63. The performance of the CiP-CAFE model was evaluated by comparing the simulated concentrations with measurements previously reported in the literature. The BETR Global model divides the global environment into 288 grid cells of 15° × 15°, each containing seven interconnected compartments: upper air, lower air, vegetation, freshwater, seawater, soil, and sediment. The model describes the transport and reaction of chemicals within compartments, the exchange of chemicals between compartments by reversible partitioning and diffusion processes (e.g. wet and dry deposition), and the exchange between cells by atmospheric or seawater advection. The input data required to run the model were: (1) the physicochemical properties of the chemicals (i.e., equilibrium partition coefficients between air and water, water and octanol, and octanol and air), (2) degradation half-life estimates for the chemicals in each bulk model compartment, and (3) spatially explicit emission estimates. A non-steady-state model was selected, i.e., chemical transport and conversion rates change in each month of the modelled period to reflect the conditions for that month. Environmental parameters were set to the model's default non-steady-state values. More detailed information on the environmental database is available on the BETR Global website (http://sites.google.com/site/betrglobal/home), and the model code is available at https://github.com/BETR-Global. Relevant model parameters are included in Supplementary Table 16. Monthly average emissions were calculated for each year from the historical annual emissions (1912–1987). The monthly regional emissions were geographically assigned to BETR Global grid cells based on proxy data (Global Population Count Grid Time Series Estimates, v1 (1970) https://sedac.ciesin.columbia.edu).
The monitored PCN concentrations in the actual environment are shown in Supplementary Tables 17–19.

In addition, we performed a sensitivity analysis of the CiP-CAFE estimates by separately scaling each input parameter to 10% above and below its baseline value (i.e. the value used in the CiP-CAFE modelling) and observing the relative change in the estimated emissions. The sensitivity was calculated using Eq. 6:

$$S=\frac{{O}_{+10\%}-{O}_{-10\%}}{{O}_{0}}\times \frac{{I}_{0}}{{I}_{+10\%}-{I}_{-10\%}}$$
(6)

where S is the sensitivity to the input parameter; O0 is the baseline output value; and O+10% and O-10% are the output values calculated by scaling the input parameter to 10% above (I+10%) and 10% below (I-10%) its baseline value (I0). Values of 0 ≤ |S| ≤ 1 indicate no or limited sensitivity to the input parameter, whereas |S| > 1 indicates remarkable sensitivity. We varied the parameters uniformly for all congeners and examined their effect on the final results.
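Eq. 6 can be applied to any model that maps named inputs to a single output. The sketch below uses two toy models as hypothetical stand-ins, not CiP-CAFE itself: a linear one, for which S = 1, and a quadratic one, for which S = 2.

```python
def sensitivity(model, inputs, name, step=0.10):
    """Eq. 6: normalised sensitivity of model output to one input parameter.

    model  : callable taking the inputs as keyword arguments
    inputs : {parameter name: baseline value}
    name   : the parameter to perturb by +/- step (10% by default)
    """
    base_in = inputs[name]
    base_out = model(**inputs)
    if base_in == 0 or base_out == 0:
        raise ValueError("baseline input and output must be non-zero")
    high = dict(inputs, **{name: base_in * (1.0 + step)})
    low = dict(inputs, **{name: base_in * (1.0 - step)})
    relative_output_change = (model(**high) - model(**low)) / base_out
    relative_input_change = (high[name] - low[name]) / base_in
    return relative_output_change / relative_input_change


def linear_model(ef, activity):
    # Toy model: output is proportional to each input.
    return ef * activity


def quadratic_model(ef, activity):
    # Toy model: output scales with the square of the emission factor.
    return ef ** 2 * activity


s_linear = sensitivity(linear_model, {"ef": 2.0, "activity": 100.0}, "ef")
s_quadratic = sensitivity(quadratic_model, {"ef": 2.0, "activity": 100.0}, "ef")
```

Under the thresholds given above, the quadratic toy model would be classed as remarkably sensitive to ef.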

Monte Carlo simulation was used to characterize the uncertainty of the unintentional emissions. The emissions were calculated 10,000 times by randomly drawing all inputs from given distributions with known coefficients of variation. For the emission factors, we considered a total of 12 candidate distributions: beta, exponential, gamma, log-normal, normal, Pareto, triangular, uniform, t-distribution, logistic, minimum Weibull, and maximum Weibull. The most appropriate probability distribution function for each EF was selected using the chi-square test (Supplementary Table 20). EFs with small quantities of data were assumed to be logarithmically distributed. Household and wildfire combustion source EFs were assumed to be normally distributed. The activity rates were assumed to be uniformly distributed. On the basis of previous studies, the variation intervals were set to 5% of the means for industrial sectors and 20% for household/domestic heating48. The coefficients of variation were set to 8.8% for carbon stock and burning loss area of forest64.
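A minimal sketch of such a simulation for a single source, using only the Python standard library. It assumes a log-normal EF and a uniform activity rate within ±5% of its mean, which is one reading of the variation interval. The function name, the geometric mean and standard deviation, and the activity value are hypothetical.

```python
import math
import random


def simulate_emissions(ef_geo_mean, ef_geo_sd, activity_mean,
                       activity_interval=0.05, n_draws=10_000, seed=0):
    """Monte Carlo estimate of E = EF * A with a log-normal EF and uniform A."""
    rng = random.Random(seed)
    mu = math.log(ef_geo_mean)   # mean of ln(EF)
    sigma = math.log(ef_geo_sd)  # standard deviation of ln(EF)
    low = activity_mean * (1.0 - activity_interval)
    high = activity_mean * (1.0 + activity_interval)
    draws = sorted(
        rng.lognormvariate(mu, sigma) * rng.uniform(low, high)
        for _ in range(n_draws)
    )
    return {
        "median": draws[n_draws // 2],
        "p2.5": draws[int(0.025 * n_draws)],
        "p97.5": draws[int(0.975 * n_draws)],
    }


# Illustrative inputs only: EF geometric mean 2 ug t-1, geometric SD 2, A = 1000 t.
result = simulate_emissions(2.0, 2.0, 1000.0)
```

The returned percentiles give a median and a 95% interval for the emission estimate; fixing the seed makes the run reproducible.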