Introduction

The acceleration of urban and industrial development has intensified human presence and activity, which in turn has increased the release and buildup of harmful environmental agents via multiple exposure pathways, including the atmosphere, hydrosphere, and terrestrial systems1. Urban industrialization has been increasingly identified as a key contributor to rising illness and death rates, as well as to declining life expectancy, based on accumulating scientific evidence2,3. Additionally, industrial complexes that have been operating for more than 20 years are increasingly seen as key sources of environmental contamination, raising health risks among nearby residents as a result of sustained exposure to emitted heavy metals4.

Heavy metals are poorly eliminated through human metabolism and tend to accumulate in bones, adipose tissue, and muscles, potentially leading to organ damage and harmful effects on the cardiovascular and nervous systems5,6. Compared to adults, children are more affected by indoor settled dust because they have weaker immune systems and spend the majority of their time in indoor environments such as homes and schools, which leads to higher exposure7,8. Although children are generally more susceptible to the toxic effects of heavy metals due to physiological and behavioral characteristics, adults may also face chronic health risks from prolonged exposure9. According to Somsunun et al. (2023)10, indoor dust in urbanized and industrial settings poses significant health risks across all age groups, highlighting the necessity for adult-specific exposure assessments and the development of tailored guidelines.

Heavy metal exposure from indoor settled dust primarily occurs through three main exposure routes, namely inhalation of airborne particles, dermal contact with contaminated surfaces, and unintentional ingestion11. Among the three exposure routes, ingestion of indoor settled dust is considered the most significant, as it poses the highest potential health risk, encompassing both non-carcinogenic and carcinogenic effects of various heavy metals12.

Dust containing heavy metals is generally introduced into indoor environments from outdoor sources, and the concentration of heavy metals in dust reflects both short-term and long-term human activities in the surrounding area13. Airborne dust can settle not only on roads (road dust) but also on various indoor and outdoor surfaces of buildings, including window frames, roofs, and windows14. Since dust particles may contain hazardous heavy metals, analyzing indoor settled dust is essential for identifying major emission sources and understanding potential exposure risks15,16. Therefore, applying statistical techniques such as principal component analysis (PCA) and hierarchical cluster analysis (HCA) helps to identify the sources of heavy metal contamination and clarify their contribution to indoor settled dust composition17. In addition, positive matrix factorization (PMF) has been widely employed to extract latent sources from environmental concentration data, even in the absence of prior knowledge about emission profiles18.

Residents of Myodo-dong, Yeosu, the Republic of Korea, located in close proximity to major industrial complexes, have been reported to exhibit higher cancer incidence rates and elevated concentrations of heavy metals in biomonitoring samples compared to national averages, indicating an increased risk of adverse health effects19. This study analyzed indoor settled dust from Myodo-dong, Yeosu, the Republic of Korea, a residential area near major industrial complexes, to assess health risks from exposure to copper (Cu), manganese (Mn), cobalt (Co), zinc (Zn), chromium (Cr), lead (Pb), cadmium (Cd), and nickel (Ni) by inhalation, dermal contact, and ingestion, and to identify emission sources through multivariate analysis.

Materials and methods

Study area

The study was conducted in Myodo-dong, Yeosu-si, the Republic of Korea, a coastal residential area in the southern part of the country (Fig. 1). Figure 1a presents Yeosu-si, located in the southern coastal region of the Republic of Korea, while Fig. 1b shows Myodo-dong in the northern part of the city. Figure 1c illustrates that Myodo-dong is directly adjacent to three major industrial complexes (Gwangyang National, Yeosu National, and Yulchon), which contain large-scale petrochemical plants, steel manufacturing facilities, and other heavy industries. The study area was selected due to its close proximity to these emission sources and previous reports of elevated environmental concentrations of heavy metals. From a total of 50 households in Myodo-dong, 20 households that voluntarily agreed to participate were selected to ensure adequate spatial coverage. Sampling design and household selection were guided by reference to a previous national survey conducted in the same area, thereby supporting the representativeness of the data19. Written informed consent for the collection of indoor settled dust samples was obtained from all participating households prior to sampling. Indoor settled dust samples were collected from these households over a three-day period, from June 28 to June 30, 2023. This study was conducted in accordance with standard procedures after obtaining IRB approval from Daegu Catholic University (IRB No. CUIRB-2023-0054), and all methods were carried out in accordance with the relevant guidelines and regulations.

Fig. 1

Geographical location of study area. (a) Location of Yeosu-si in the Republic of Korea. (b) Location of Myodo-dong within Yeosu-si. (c) Distribution of sampling sites (n = 20) and surrounding industrial complexes (Gwangyang, Yeosu, and Yulchon).

Fig. 2

Pearson correlation analysis of heavy metal concentrations in indoor settled dust.

Sampling and analysis methods

Indoor settled dust samples were collected from the main residential spaces, primarily the living room, by retrieving dust from beneath furniture such as sofas and refrigerators in accordance with the American Society for Testing and Materials (ASTM) International D7144-05a standard method20. Indoor settled dust was collected using a Tygon tube (R-3603) connected to a pump (Gilian Air Plus, Sensidyne, USA) and a 3-piece cassette (SKC, 225-3-01, MCE, 0.8 μm, 37 mm). Sampling was conducted at a flow rate of 2.5 L/min for a duration of 2 min. The collected samples were sealed in clean zipper bags (LDPE: Clean Wrap, 22 cm × 25 cm, Republic of Korea), transported to the laboratory, air-dried, weighed, and stored for further analysis. Metal components, including heavy metals, were extracted by acid digestion with a mixed solution of 1.03 M HNO₃ and 2.23 M HCl, following the standard methods for air pollution analysis19. Heavy metals in indoor settled dust were then quantified using an inductively coupled plasma optical emission spectrometer (ICP-OES; ICPE-9800, SHIMADZU, Japan). The calibration curves for each heavy metal are presented in Supplementary Fig. S1, while the limits of detection (LODs) and analytical reproducibility are provided in Supplementary Table S1.

Exposure assessment

Exposure to heavy metals from indoor settled dust was assessed via inhalation, dermal contact, and ingestion routes using standardized exposure algorithms, based on methodologies established by the U.S. EPA and the National Institute for Public Health and the Environment (RIVM)21,22. These algorithms consider resuspension and inhalation of indoor settled dust, dermal contact through skin adherence and absorption, and ingestion, as shown in Eqs. (1–3). In the exposure assessment, key exposure factors including body weight (64.5 ± 12.7 kg) and inhalation rate (14.6 ± 3.2 m³/day), derived from large-scale national survey data, were incorporated as probabilistic distributions (mean ± standard deviation) in the Monte Carlo simulation to capture inter-individual variability23. Indoor settled dust metal concentrations were also modeled as distributions, and simulation results were summarized using percentiles (25th, 50th, 75th, and 95th) to characterize variability in exposure estimates. All other exposure factors were applied as fixed default values recommended by the U.S. EPA, RIVM, and the National Institute of Environmental Research (NIER), including ingestion rate (20 mg/day), skin adherence (3.00 × 10⁻⁷ kg/cm²), surface area (4,271 cm²), particulate emission factor (1.36 × 10⁹ m³/kg), absorption factor (0.001; Pb: 0.01), exposure frequency (350 days/year), exposure duration (25 years), and averaging time (25 years for non-carcinogenic effects; 82.7 years for carcinogenic effects)21,22,24. All participants were adults aged 50 years or older with long-term residence in the study area, and residential exposure scenarios were therefore applied.

$${ADD}_{inh},\ {LADD}_{inh}=\frac{C\times InhR\times EF\times ED}{PEF\times BW\times AT}$$
(1)

where ADDinh is the inhalation average daily dose (mg/kg/day), LADDinh is the inhalation lifetime average daily dose (mg/kg/day), C is the concentration of heavy metal in indoor settled dust (mg/kg), InhR is the inhalation rate (m³/day), EF is the exposure frequency (days/year), ED is the exposure duration (years), PEF is the particulate emission factor (m³/kg), BW is the body weight (kg), and AT is the averaging time (years).

$${ADD}_{der},\ {LADD}_{der}=\frac{C\times SL\times SA\times ABS\times EF\times ED}{BW\times AT}$$
(2)

where ADDder is the dermal average daily dose (mg/kg/day), LADDder is the dermal lifetime average daily dose (mg/kg/day), C is the concentration of heavy metal in indoor settled dust (mg/kg), SL is the skin adherence factor (kg/cm²), SA is the surface area (cm²), ABS is the absorption factor, EF is the exposure frequency (days/year), ED is the exposure duration (years), BW is the body weight (kg), and AT is the averaging time (years).

$${ADD}_{ing},\ {LADD}_{ing}=\frac{C\times IngR\times EF\times ED}{BW\times AT}$$
(3)

where ADDing is the ingestion average daily dose (mg/kg/day), LADDing is the ingestion lifetime average daily dose (mg/kg/day), C is the concentration of heavy metal in indoor settled dust (mg/kg), IngR is the ingestion rate (kg/day), EF is the exposure frequency (days/year), ED is the exposure duration (years), BW is the body weight (kg), and AT is the averaging time (years).
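As an illustration of Eqs. (1)–(3), the following minimal Python sketch evaluates the three route-specific doses with the fixed and mean exposure factors quoted in the exposure assessment. The dust concentration of 1,000 mg/kg is hypothetical, the ingestion rate of 20 mg/day is expressed as 20 × 10⁻⁶ kg/day, and the conversion of AT from years to days (365 days/year) is an assumption made here so that the doses come out in mg/kg/day; it is not stated explicitly in the text.

```python
# Deterministic sketch of Eqs. (1)-(3); the concentration used here is hypothetical.
DAYS_PER_YEAR = 365  # assumption: AT (years) is converted to days


def dose_inhalation(c, inh_r, ef, ed, pef, bw, at_years):
    """Eq. (1): inhalation dose (mg/kg/day)."""
    return (c * inh_r * ef * ed) / (pef * bw * at_years * DAYS_PER_YEAR)


def dose_dermal(c, sl, sa, abs_f, ef, ed, bw, at_years):
    """Eq. (2): dermal dose (mg/kg/day)."""
    return (c * sl * sa * abs_f * ef * ed) / (bw * at_years * DAYS_PER_YEAR)


def dose_ingestion(c, ing_r, ef, ed, bw, at_years):
    """Eq. (3): ingestion dose (mg/kg/day)."""
    return (c * ing_r * ef * ed) / (bw * at_years * DAYS_PER_YEAR)


# Exposure factors quoted in the text (EF in days/year, ED in years, BW in kg).
FACTORS = dict(ef=350, ed=25, bw=64.5)
c = 1000.0  # hypothetical dust concentration, mg/kg

# ADD uses AT = 25 years; LADD uses AT = 82.7 years.
add_inh = dose_inhalation(c, inh_r=14.6, pef=1.36e9, at_years=25, **FACTORS)
add_der = dose_dermal(c, sl=3.00e-7, sa=4271, abs_f=0.001, at_years=25, **FACTORS)
add_ing = dose_ingestion(c, ing_r=20e-6, at_years=25, **FACTORS)
ladd_ing = dose_ingestion(c, ing_r=20e-6, at_years=82.7, **FACTORS)

print(f"ADD_inh={add_inh:.3e}  ADD_der={add_der:.3e}  ADD_ing={add_ing:.3e}")
print(f"LADD_ing={ladd_ing:.3e}")
```

Under these factors the ingestion dose exceeds the dermal dose, which in turn exceeds the inhalation dose, and the LADD differs from the ADD only by the ratio of the two averaging times.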

Toxicity values of heavy metals in indoor settled dust

Among the heavy metals in indoor settled dust, five heavy metals (Cu, Mn, Co, Zn, and Cr) were classified as non-carcinogenic, while three heavy metals (Pb, Cd, and Ni) were categorized as carcinogenic, making a total of eight selected heavy metals. The inhalation reference dose (RfDinh) in Eq. (4) and the inhalation cancer slope factor (CSFinh) in Eq. (5) were derived by considering body weight (70 kg) and inhalation rate (20 m³/day), based on the reference concentration (RfC) and inhalation unit risk (IUR) values for Pb, Cd, Ni, and Mn25.

$${RfD}_{inh}=\frac{RfC\times 20\ \text{m}^{3}/\text{day}}{70\ \text{kg}}$$
(4)

where RfDinh is the inhalation reference dose (mg/kg/day) and RfC is the reference concentration (mg/m³).

$${CSF}_{inh}=\frac{IUR\times 70\ \text{kg}}{20\ \text{m}^{3}/\text{day}}\times 1{,}000\ \mu\text{g}/\text{mg}$$
(5)

where CSFinh is the inhalation cancer slope factor (mg/kg/day)⁻¹ and IUR is the inhalation unit risk (µg/m³)⁻¹.

The dermal reference dose (RfDder) in Eq. (6) and the dermal cancer slope factor (CSFder) in Eq. (7) were calculated by applying the gastrointestinal absorption factor (ABSGI) to the ingestion reference dose (RfDing) and ingestion cancer slope factor (CSFing) of Pb, Mn, Co, Zn, and Cr26. The RfD and CSF values for each exposure route of heavy metals are presented in Supplementary Table S2.

$${RfD}_{der}={RfD}_{ing}\times {ABS}_{GI}$$
(6)

where RfDder is the dermal reference dose (mg/kg/day), RfDing is the ingestion reference dose (mg/kg/day), and ABSGI is the gastrointestinal absorption factor (Pb: 1.00, Ni: 0.04).

$${CSF}_{der}=\frac{{CSF}_{ing}}{{ABS}_{GI}}$$
(7)

where CSFder is the dermal cancer slope factor (mg/kg/day)⁻¹, CSFing is the ingestion cancer slope factor (mg/kg/day)⁻¹, and ABSGI is the gastrointestinal absorption factor (Mn: 0.04, Co: 0.25, Cr: 0.013).
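The conversions in Eqs. (4)–(7) can be sketched in a few lines of Python. The RfC, IUR, RfDing, and CSFing values passed in the example calls are illustrative placeholders and are not taken from Supplementary Table S2.

```python
# Sketch of the toxicity-value conversions in Eqs. (4)-(7).
BW_REF = 70.0       # reference body weight (kg), as used in Eqs. (4)-(5)
INHR_REF = 20.0     # reference inhalation rate (m3/day), as used in Eqs. (4)-(5)
UG_PER_MG = 1000.0  # unit conversion, micrograms per milligram


def rfd_inh(rfc):
    """Eq. (4): RfC (mg/m3) -> inhalation reference dose (mg/kg/day)."""
    return rfc * INHR_REF / BW_REF


def csf_inh(iur):
    """Eq. (5): IUR ((ug/m3)^-1) -> inhalation slope factor ((mg/kg/day)^-1)."""
    return iur * BW_REF / INHR_REF * UG_PER_MG


def rfd_der(rfd_ing, abs_gi):
    """Eq. (6): dermal reference dose from the ingestion reference dose."""
    return rfd_ing * abs_gi


def csf_der(csf_ing, abs_gi):
    """Eq. (7): dermal slope factor from the ingestion slope factor."""
    return csf_ing / abs_gi


# Illustrative placeholder inputs (hypothetical, not the study's values).
print(rfd_inh(5.0e-5))        # RfC of 5.0e-5 mg/m3
print(csf_inh(1.8e-3))        # IUR of 1.8e-3 per ug/m3
print(rfd_der(0.02, 0.25))    # RfD_ing of 0.02 mg/kg/day, ABS_GI of 0.25
print(csf_der(0.5, 0.04))     # CSF_ing of 0.5 per mg/kg/day, ABS_GI of 0.04
```

Because ABSGI is at most 1, the dermal reference dose is never larger than the ingestion value, while the dermal slope factor is never smaller, so the dermal route is treated conservatively.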

Risk assessment

The hazard quotient (HQ) for non-carcinogenic substances was obtained by calculating the ratio of the average daily dose (ADD) to the reference dose (RfD) for each exposure route (Eq. (8))26. The cancer risk (CR) was calculated as the product of the lifetime average daily dose (LADD) and the cancer slope factor (CSF) for each exposure route (Eq. (9))21. The hazard index (HI) and total cancer risk (TCR) were calculated as the sums of HQ and CR values across all exposure routes, respectively (Eqs. (10–11))27. Monte Carlo simulation was applied to incorporate variability and uncertainty, generating probabilistic distributions of HQ, HI, CR, and TCR. Simulation outputs were summarized as percentiles (25th, 50th, 75th, and 95th) together with two-sided 95% uncertainty intervals (2.5th–97.5th). Risk characterization was conducted based on the proportion of simulated outcomes in which the aggregated indices exceeded the thresholds (HI ≥ 1 for non-carcinogenic risk; TCR ≥ 1.00 × 10⁻⁶ for carcinogenic risk). In addition, sensitivity analysis was performed using partial rank correlation coefficients (PRCC) between key input parameters (metal concentration, inhalation rate, and body weight) and probabilistic risk outcomes (HI and TCR) to identify the major contributors to output variability.

$${HQ}_{inh,\,der,\,ing}=\frac{{ADD}_{inh,\,der,\,ing}}{{RfD}_{inh,\,der,\,ing}}$$
(8)

where HQ is the hazard quotient, ADD is the average daily dose (mg/kg/day), and RfD is the reference dose (mg/kg/day).

$${CR}_{inh,\,der,\,ing}={LADD}_{inh,\,der,\,ing}\times {CSF}_{inh,\,der,\,ing}$$
(9)

where CR is the cancer risk, LADD is the lifetime average daily dose (mg/kg/day), and CSF is the cancer slope factor (mg/kg/day)⁻¹.

$$HI=\sum HQ={HQ}_{inh}+{HQ}_{der}+{HQ}_{ing}$$
(10)

where HI is the hazard index, HQinh is the inhalation hazard quotient, HQder is the dermal hazard quotient, and HQing is the ingestion hazard quotient.

$$TCR=\sum CR={CR}_{inh}+{CR}_{der}+{CR}_{ing}$$
(11)

where TCR is the total cancer risk, CRinh is the inhalation cancer risk, CRder is the dermal cancer risk, and CRing is the ingestion cancer risk.
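A minimal Monte Carlo sketch of the HI calculation in Eqs. (8) and (10) is given here for illustration, using only the Python standard library. The reference doses, the lognormal concentration parameters, the lower bounds applied to body weight and inhalation rate, the number of draws, and the conversion of AT to days are all assumptions introduced for this example; they do not reproduce the study's inputs or results.

```python
import math
import random

DAYS_PER_YEAR = 365  # assumption: AT (years) is converted to days
RFD = {"inh": 1.4e-5, "der": 1.0e-3, "ing": 2.0e-2}  # hypothetical reference doses


def hazard_index(c, bw, inh_r, rfd):
    """HI = HQ_inh + HQ_der + HQ_ing for one simulated individual."""
    scale = (350 * 25) / (bw * 25 * DAYS_PER_YEAR)   # EF*ED / (BW*AT)
    add_inh = c * inh_r / 1.36e9 * scale             # Eq. (1)
    add_der = c * 3.00e-7 * 4271 * 0.001 * scale     # Eq. (2)
    add_ing = c * 20e-6 * scale                      # Eq. (3)
    return add_inh / rfd["inh"] + add_der / rfd["der"] + add_ing / rfd["ing"]


def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list (q in 0-100)."""
    rank = max(1, math.ceil(q / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def simulate_hi(n_draws=10_000, seed=1):
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        # Hypothetical lognormal concentration: GM = 500 mg/kg, GSD = 2.
        c = rng.lognormvariate(math.log(500.0), math.log(2.0))
        # Normal draws from the reported mean and SD; floors are assumptions
        # that keep the sampled values physically plausible.
        bw = max(rng.gauss(64.5, 12.7), 30.0)
        inh_r = max(rng.gauss(14.6, 3.2), 1.0)
        results.append(hazard_index(c, bw, inh_r, RFD))
    results.sort()
    summary = {q: percentile(results, q) for q in (25, 50, 75, 95)}
    exceedance = sum(v >= 1.0 for v in results) / n_draws
    return summary, exceedance


summary, exceedance = simulate_hi()
print(summary, f"P(HI >= 1) = {exceedance:.2%}")
```

The same structure applies to TCR by replacing the reference doses with slope factors, multiplying instead of dividing, using AT = 82.7 years, and comparing against 1.00 × 10⁻⁶.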

Proposal of exposure limits for heavy metals in indoor settled dust

The exposure limit for non-carcinogenic heavy metals in indoor settled dust was determined by calculating the concentration at which the HI remains < 1, as shown in Eq. (12), and for carcinogenic heavy metals, by back-calculation to determine the concentration at which the TCR remains < 1.00 × 10⁻⁶, as shown in Eq. (13), using default exposure factors summarized in Supplementary Table S3.

$$C=\frac{BW\times AT}{EF\times ED\times \left(\frac{InhR}{PEF\times {RfD}_{inh}}+\frac{SL\times SA\times ABS}{{RfD}_{der}}+\frac{IngR}{{RfD}_{ing}}\right)}$$
(12)

where C is the concentration of heavy metal in indoor settled dust (mg/kg), BW is the body weight (kg), AT is the averaging time (years), EF is the exposure frequency (days/year), ED is the exposure duration (years), InhR is the inhalation rate (m³/day), PEF is the particulate emission factor (m³/kg), RfDinh is the inhalation reference dose (mg/kg/day), SL is the skin adherence factor (kg/cm²), SA is the surface area (cm²), ABS is the absorption factor, RfDder is the dermal reference dose (mg/kg/day), IngR is the ingestion rate (kg/day), and RfDing is the ingestion reference dose (mg/kg/day).

$$C=\frac{1.00\times {10}^{-6}\times BW\times AT}{EF\times ED\times \left(\frac{InhR\times {CSF}_{inh}}{PEF}+SL\times SA\times ABS\times {CSF}_{der}+IngR\times {CSF}_{ing}\right)}$$
(13)

where C is the concentration of heavy metal in indoor settled dust (mg/kg), BW is the body weight (kg), AT is the averaging time (years), EF is the exposure frequency (days/year), ED is the exposure duration (years), InhR is the inhalation rate (m³/day), CSFinh is the inhalation cancer slope factor (mg/kg/day)⁻¹, PEF is the particulate emission factor (m³/kg), SL is the skin adherence factor (kg/cm²), SA is the surface area (cm²), ABS is the absorption factor, CSFder is the dermal cancer slope factor (mg/kg/day)⁻¹, IngR is the ingestion rate (kg/day), and CSFing is the ingestion cancer slope factor (mg/kg/day)⁻¹.
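The back-calculation in Eqs. (12) and (13) amounts to dividing the target risk by the risk produced per unit of dust concentration. The Python sketch that follows illustrates this under stated assumptions: the exposure factors are those quoted in the exposure assessment and stand in for Supplementary Table S3, the RfD and CSF values are hypothetical, and AT is converted from years to days.

```python
DAYS_PER_YEAR = 365  # assumption: AT (years) is converted to days

# Stand-in exposure factors taken from the exposure assessment text.
F = dict(bw=64.5, ef=350, ed=25, inh_r=14.6, pef=1.36e9,
         sl=3.00e-7, sa=4271, abs_f=0.001, ing_r=20e-6)


def time_weight(at_years):
    """EF x ED / (BW x AT), with AT expressed in days."""
    return F["ef"] * F["ed"] / (F["bw"] * at_years * DAYS_PER_YEAR)


def hi_per_unit_conc(rfd_inh, rfd_der, rfd_ing):
    """HI produced by 1 mg/kg of metal in dust (AT = 25 years)."""
    routes = (F["inh_r"] / (F["pef"] * rfd_inh)
              + F["sl"] * F["sa"] * F["abs_f"] / rfd_der
              + F["ing_r"] / rfd_ing)
    return time_weight(25.0) * routes


def tcr_per_unit_conc(csf_inh, csf_der, csf_ing):
    """TCR produced by 1 mg/kg of metal in dust (AT = 82.7 years)."""
    routes = (F["inh_r"] * csf_inh / F["pef"]
              + F["sl"] * F["sa"] * F["abs_f"] * csf_der
              + F["ing_r"] * csf_ing)
    return time_weight(82.7) * routes


def limit_noncarcinogenic(rfd_inh, rfd_der, rfd_ing):
    """Eq. (12): concentration (mg/kg) at which HI equals 1."""
    return 1.0 / hi_per_unit_conc(rfd_inh, rfd_der, rfd_ing)


def limit_carcinogenic(csf_inh, csf_der, csf_ing, target=1.0e-6):
    """Eq. (13): concentration (mg/kg) at which TCR equals the target."""
    return target / tcr_per_unit_conc(csf_inh, csf_der, csf_ing)


RFD = dict(rfd_inh=1.4e-5, rfd_der=1.0e-3, rfd_ing=2.0e-2)  # hypothetical
CSF = dict(csf_inh=6.3, csf_der=1.0, csf_ing=0.5)           # hypothetical
limit_nc = limit_noncarcinogenic(**RFD)
limit_c = limit_carcinogenic(**CSF)
print(f"non-carcinogenic limit: {limit_nc:.1f} mg/kg")
print(f"carcinogenic limit:     {limit_c:.3f} mg/kg")
```

Substituting either limit back into the forward equations returns HI = 1 or TCR = 1.00 × 10⁻⁶, which is a convenient consistency check on any implementation.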

Source tracking

Heavy metal concentrations were log-transformed to reduce skewness and approximate a normal distribution. Pearson correlation analysis was performed to assess linear relationships between metal concentrations28. Normality of the log-transformed data was evaluated using the Shapiro–Wilk test. Principal component analysis (PCA) was conducted on standardized heavy metal concentration data to uncover latent structures and reduce dimensionality, as described in Eq. (14). Components with eigenvalues greater than 1 and a cumulative explained variance exceeding 80% were retained based on the Kaiser criterion29. Scree plots were used to determine the optimal number of components. Each principal component represents a linear combination of the original variables that maximizes the explained variance30. Component loadings were examined to interpret the grouping patterns of the heavy metals31.

$${PC}_{i}={a}_{i1}{V}_{1}+{a}_{i2}{V}_{2}+\dots +{a}_{in}{V}_{n}$$
(14)

where PCi is the i-th principal component, Vj is the j-th original variable (j = 1, …, n), aij is the loading coefficient relating PCi to Vj, and n is the total number of original variables.

Subsequently, hierarchical cluster analysis (HCA) was conducted to group the samples according to the compositional patterns revealed by PCA32. Euclidean distance was used to quantify dissimilarity between samples, as defined in Eq. (15)33. Ward’s linkage method was employed to calculate inter-cluster distances while minimizing within-cluster variance, as described in Eq. (16)34. The resulting dendrogram was analyzed to interpret the structural relationships among the identified sources.

$$d\left({x}_{i},{x}_{j}\right)=\sqrt{\sum_{k=1}^{p}{\left({x}_{ik}-{x}_{jk}\right)}^{2}}$$
(15)

where \({x}_{ik}\) and \({x}_{jk}\) represent the contributions of species i and j to source k, and p is the number of sources.

$$\Delta\left({C}_{i},{C}_{j}\right)=\frac{{n}_{i}{n}_{j}}{{n}_{i}+{n}_{j}}{\left\Vert {\bar{x}}_{i}-{\bar{x}}_{j}\right\Vert}^{2}$$
(16)

where Ci and Cj denote the two clusters being merged, ni and nj are the numbers of elements in each cluster, and \({\bar{x}}_{i}\) and \({\bar{x}}_{j}\) are the respective cluster centroids.
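To make Eqs. (15) and (16) concrete, the following sketch computes the Euclidean distance between two points and the Ward merging cost between two clusters, then selects the pair of clusters with the smallest cost. The cost is taken as the size-weighted squared distance between centroids, which is the standard form of Ward's criterion. The coordinates are hypothetical, and the helper names are introduced only for this example.

```python
import math


def euclidean(x, y):
    """Eq. (15): Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def centroid(cluster):
    """Mean position of the points in a cluster."""
    n = len(cluster)
    return [sum(col) / n for col in zip(*cluster)]


def ward_cost(ci, cj):
    """Eq. (16): increase in within-cluster variance when ci and cj merge."""
    ni, nj = len(ci), len(cj)
    d = euclidean(centroid(ci), centroid(cj))
    return ni * nj / (ni + nj) * d ** 2


def best_merge(clusters):
    """Return (cost, a, b) for the pair of clusters that is cheapest to merge."""
    pairs = [(ward_cost(clusters[a], clusters[b]), a, b)
             for a in range(len(clusters))
             for b in range(a + 1, len(clusters))]
    return min(pairs)


# Hypothetical two-dimensional points grouped into three starting clusters.
clusters = [[(0.0, 0.0), (0.0, 2.0)], [(3.0, 1.0)], [(10.0, 10.0)]]
cost, a, b = best_merge(clusters)
print(f"merge clusters {a} and {b} at cost {cost:.2f}")
```

Repeating the merge step until a single cluster remains yields the sequence of merges that a dendrogram displays.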

Positive matrix factorization (PMF) was applied to the measured heavy metal concentration matrix to identify source profiles and estimate their contributions, subject to non-negativity constraints, as described in Eq. (17)36.

$${x}_{ij}=\sum_{k=1}^{p}{g}_{ik}{f}_{kj}+{e}_{ij}$$
(17)

where \({x}_{ij}\) is the observed concentration of species j in sample i, \({g}_{ik}\) is the contribution of source k to sample i, \({f}_{kj}\) is the fractional concentration of species j in source k, and \({e}_{ij}\) is the residual.

The PMF model is grounded in the principles of factor analysis, employing weighted least-squares minimization based on the individual uncertainties of the measured data. The model incorporates uncertainty by minimizing the objective function Q, as defined in Eq. (18), which reflects the measurement uncertainties35,36. In this study, uncertainty was estimated according to the method outlined in the EPA PMF 5.0 user guide37, which recommends incorporating both relative error and detection limit information. Model performance was assessed using the ratio of the observed Q value to its expected value (Q/Qexp), with values close to 1 indicating that the model adequately reproduced the observed data within the bounds of analytical uncertainty. In addition, bootstrap resampling was performed with 300 iterations to evaluate the robustness of the factorization results. For each iteration, source profiles were recalculated, and 95% confidence intervals (2.5th–97.5th percentiles) were derived, thereby quantifying the uncertainty associated with the source profiles.

$$Q=\sum_{i=1}^{n}\sum_{j=1}^{m}{\left[\frac{{x}_{ij}-\sum_{k=1}^{p}{g}_{ik}{f}_{kj}}{{u}_{ij}}\right]}^{2}$$
(18)

where \({x}_{ij}\) is the measured concentration of species j in sample i, \({g}_{ik}\) is the contribution of factor k to sample i, \({f}_{kj}\) is the profile of species j in factor k, \({u}_{ij}\) is the uncertainty associated with the concentration of species j in sample i, n is the total number of samples, m is the total number of chemical species, and p is the number of factors.
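The objective function in Eq. (18) and the Q/Qexp diagnostic can be illustrated with a short Python sketch. The matrices are hypothetical, and the expected value is approximated here as the number of data values minus the number of fitted elements in G and F, i.e. n × m − p × (n + m); this approximation is an assumption of the sketch rather than a description of the study's implementation.

```python
def pmf_q(x, g, f, u):
    """Eq. (18): sum of squared, uncertainty-scaled residuals."""
    n, m, p = len(x), len(x[0]), len(f)
    q = 0.0
    for i in range(n):
        for j in range(m):
            modeled = sum(g[i][k] * f[k][j] for k in range(p))
            q += ((x[i][j] - modeled) / u[i][j]) ** 2
    return q


def q_expected(n, m, p):
    """Assumed approximation: data values minus fitted elements of G and F."""
    return n * m - p * (n + m)


# Hypothetical example: 3 samples, 2 species, 1 factor.
x = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9]]   # measured concentrations
g = [[1.0], [2.0], [3.0]]                  # factor contributions per sample
f = [[1.0, 2.0]]                           # factor profile per species
u = [[0.1, 0.1], [0.1, 0.1], [0.1, 0.1]]   # measurement uncertainties

q = pmf_q(x, g, f, u)
ratio = q / q_expected(n=3, m=2, p=1)
print(f"Q = {q:.3f}, Q/Qexp = {ratio:.3f}")
```

A ratio far below 1 generally suggests that the uncertainties are overestimated or that too many factors are being fitted, whereas a ratio far above 1 suggests residuals larger than the stated uncertainties can explain.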

In this study, a Python-based PMF model was implemented using a non-negative matrix factorization (NMF) framework. This approach incorporates key features of the EPA model, including the use of uncertainty-weighted residuals and diagnostic evaluation through comparisons between observed and expected model fit statistics, as defined in Eq. (19)38.

$$X\approx WH$$
(19)

where X is an M × N matrix representing the concentrations of M chemical species across N samples, W is an M × L basis matrix describing source profiles, and H is an L × N coefficient matrix representing source contributions.

In addition, cosine similarity was employed to assess the compositional similarity between the PMF-resolved source profiles and industry-specific emission profiles obtained from the national pollutant release and transfer register (PRTR), as described in Eq. (20)40. The comparison was restricted to metals that were common to both datasets. For each PMF-resolved source, the PRTR sector with the highest cosine similarity was identified to aid in source interpretation. A similarity threshold of 0.6 was applied, and only matches exceeding this value were considered compositionally meaningful41. To trace emission sources of heavy metals not listed in the PRTR, indoor dust concentrations were compared with fenceline monitoring data near an industrial complex. Regression analyses based on PMF-resolved sources identified the source with the highest coefficient of determination (R²) as the most probable contributor.

$$\cos\left(\theta\right)=\frac{\sum_{i=1}^{n}{A}_{i}{B}_{i}}{\sqrt{\sum_{i=1}^{n}{A}_{i}^{2}}\times \sqrt{\sum_{i=1}^{n}{B}_{i}^{2}}}$$
(20)

where Ai and Bi represent the contributions of metal i to the PMF-resolved source and PRTR-based profile respectively, and n is the total number of metals included in the comparison.
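The matching procedure built on Eq. (20) can be sketched as follows: each profile is restricted to the metals shared with the comparison profile, the cosine similarity is computed, and the best-matching sector is kept only if it exceeds the 0.6 threshold. The profiles and sector names are hypothetical and serve only to show the mechanics.

```python
import math


def cosine_similarity(a, b):
    """Eq. (20): cosine of the angle between two composition vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_sector(source, sectors, threshold=0.6):
    """Return (sector, similarity) for the best match above the threshold, else None."""
    best = None
    for name, profile in sectors.items():
        common = sorted(set(source) & set(profile))  # metals in both datasets
        if not common:
            continue
        sim = cosine_similarity([source[m] for m in common],
                                [profile[m] for m in common])
        if best is None or sim > best[1]:
            best = (name, sim)
    return best if best is not None and best[1] > threshold else None


# Hypothetical PMF source profile and PRTR sector profiles (fractions by metal).
source = {"Zn": 0.6, "Mn": 0.3, "Cd": 0.1}
sectors = {
    "sector_a": {"Zn": 0.7, "Mn": 0.3},
    "sector_b": {"Zn": 0.1, "Mn": 0.9},
}
match = best_sector(source, sectors)
print(match)
```

Because cosine similarity depends only on the direction of the composition vectors, two profiles with the same relative proportions match perfectly even when their absolute emission magnitudes differ.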

Results

Concentration of heavy metals in indoor settled dust

The concentrations of heavy metals in 20 indoor settled dust samples were analyzed. The statistical summary is provided in Table 1 and includes the arithmetic mean (AM), standard deviation (SD), geometric mean (GM), geometric standard deviation (GSD), minimum (Min), and maximum (Max) values. Among all metals, Zn had the highest maximum concentration at 4,912.01 µg/g, while Cd had the lowest at 0.001 µg/g. The average concentrations of heavy metals in indoor settled dust decreased in the following order: Zn > Mn > Cu > Cr > Pb > Ni > Co > Cd.

Table 1 Concentration of heavy metals in indoor settled dust.

Health risk assessment of heavy metals in indoor settled dust

Exposure assessment

The probabilistic assessment of five non-carcinogenic heavy metals in indoor settled dust demonstrated that ingestion was the predominant exposure route, followed by dermal contact and inhalation (Table 2). Zn consistently showed the highest exposure across all routes, with median ADDing, ADDder, and ADDinh values of 4.31 × 10⁻⁴, 2.76 × 10⁻⁵, and 2.26 × 10⁻⁷ mg/kg/day, respectively. In contrast, Co exhibited the lowest exposures, with values of 1.94 × 10⁻⁶ mg/kg/day for ingestion, 1.24 × 10⁻⁷ mg/kg/day for dermal contact, and 1.02 × 10⁻⁹ mg/kg/day for inhalation. The three carcinogenic heavy metals showed the same trend as the ADD results, with ingestion as the dominant exposure route, followed by dermal contact and inhalation (Table 3). Pb presented the highest exposures, with median LADDing, LADDder, and LADDinh values of 1.15 × 10⁻⁵, 7.40 × 10⁻⁶, and 6.05 × 10⁻⁹ mg/kg/day, respectively. Conversely, Cd showed the lowest exposures, with ingestion, dermal, and inhalation values of 1.53 × 10⁻⁷, 9.79 × 10⁻⁹, and 7.99 × 10⁻¹¹ mg/kg/day, respectively.

Table 2 ADD percentiles (25th–95th) of heavy metals by exposure route.
Table 3 LADD percentiles (25th–95th) of heavy metals by exposure route.

Risk assessment

The probabilistic risk assessment showed that HQ and HI values of non-carcinogenic metals in indoor settled dust remained below 1 across the 25th–95th percentile range (Table 4). Mn recorded the highest HQinh, ranging from 1.02 × 10⁻² to 2.69 × 10⁻², while Zn showed the highest HQder, ranging from 2.31 × 10⁻³ to 1.16 × 10⁻², and Co exhibited the highest HQing, ranging from 3.97 × 10⁻³ to 2.13 × 10⁻². The HI, which represents the combined non-carcinogenic risks from all exposure routes, also remained below 1. The risk exceedance percentage for HI was 0%, indicating that non-carcinogenic health risks were not expected under the assessed scenario. The probabilistic risk assessment further showed that CR and TCR values of carcinogenic metals in indoor settled dust were distributed across the 25th–95th percentiles, with exceedances of the acceptable threshold of 1.00 × 10⁻⁶ observed for Cd and Ni but not for Pb (Table 5). Pb presented no potential health risks across individual exposure routes. In contrast, Cd exceeded the acceptable threshold through ingestion, with CRing ranging from 3.92 × 10⁻⁷ to 7.72 × 10⁻⁶, indicating that ingestion was the dominant contributor. Ni posed the greatest concern, with CRder ranging from 9.47 × 10⁻⁶ to 3.91 × 10⁻⁵ and CRing ranging from 5.91 × 10⁻⁶ to 2.44 × 10⁻⁵, both well above the threshold. The TCR for Pb ranged from 9.93 × 10⁻⁸ to 5.31 × 10⁻⁷, below the acceptable threshold of 1.00 × 10⁻⁶ across the 25th–95th percentile range, although a negligible exceedance of 0.59% was observed in the simulation. In contrast, TCR values ranged from 4.17 × 10⁻⁷ to 8.22 × 10⁻⁶ for Cd (49.75% exceedance) and from 1.54 × 10⁻⁵ to 6.35 × 10⁻⁵ for Ni (100% exceedance). The probabilistic distributions of HI and TCR derived from Monte Carlo simulations are illustrated in Supplementary Fig. S2. The sensitivity analysis based on PRCC identified concentration as the strongest positive contributor to both HI and TCR (PRCC: +0.95 to +1.00), while body weight showed a strong negative influence (PRCC: −0.89 to −0.91). Inhalation rate generally had only a minor effect (PRCC: 0.00 to +0.06), except for Mn, where a positive contribution was observed (PRCC: +0.83), as illustrated in Supplementary Fig. S3.

Table 4 HQ and HI percentiles (25th–95th) of heavy metals and corresponding risk exceedance.
Table 5 CR and TCR percentiles (25th–95th) of heavy metals and corresponding risk exceedance.

Proposal of exposure limits for heavy metals in indoor settled dust

A back-calculation approach was used to determine exposure limits for non-carcinogenic and carcinogenic heavy metals in indoor settled dust, based on the HI and TCR, respectively, with target values of HI < 1 for non-carcinogens and TCR < 1.00 × 10⁻⁶ for carcinogens; the results are summarized in Table 6. Among the non-carcinogenic metals assessed, Cr had the highest allowable concentration at 845,118.3 µg/g, followed by Zn, Cu, Mn, and Co, with values of 801,778.9, 110,806.6, 59,930.0, and 977.6 µg/g, respectively. Based on these estimates, corresponding exposure limits are proposed to protect human health. The proposed values are approximately 845,115.0, 801,775.0, 110,805.0, 59,925.0, and 975.0 µg/g for Cr, Zn, Cu, Mn, and Co, respectively. For carcinogenic heavy metals (Pb, Ni, and Cd), the allowable concentrations calculated for adults based on this approach were 796.5 µg/g for Pb, 1.7 µg/g for Cd, and 2.5 µg/g for Ni. Accordingly, the proposed exposure limits were set at 795.0 µg/g for Pb, 1.5 µg/g for Cd, and 2.0 µg/g for Ni.

Table 6 Exposure limits of heavy metals in indoor settled dust.

Source tracking

Pearson correlation analysis

The Shapiro–Wilk test confirmed that log transformation improved normality, thereby supporting the validity of applying Pearson correlation analysis, which revealed significant relationships among heavy metal concentrations in indoor settled dust (Fig. 2). Strong positive correlations were observed among Mn, Co, Zn, Cr, and Pb, including Cr–Pb (r = 0.987), Mn–Zn (r = 0.977), Co–Cr (r = 0.940), Co–Pb (r = 0.928), Zn–Cr (r = 0.760), Mn–Cr (r = 0.755), Zn–Pb (r = 0.735), and Mn–Pb (r = 0.727). Moderate positive correlations were also observed for Mn–Co (r = 0.694), Co–Zn (r = 0.688), Cu–Zn (r = 0.647), Cu–Ni (r = 0.599), and Cu–Mn (r = 0.596). All of these correlations were statistically significant (p < 0.01). In contrast, Cd appeared to behave independently from the other heavy metals, as it showed no statistically significant positive correlations and exhibited weak to negative associations with most of them.

Principal component analysis

The PCA extracted three principal components (PC1, PC2, and PC3) with eigenvalues greater than 1, which together explained 89.862% of the total variance (Fig. 3). PC1 was primarily associated with Mn (loading = 0.425), Co (loading = 0.395), Zn (loading = 0.420), Cr (loading = 0.422), and Pb (loading = 0.429), accounting for 58.143% of the total variance. PC2 was characterized by Cu (loading = 0.667) and Ni (loading = 0.446), explaining 18.674% of the variance. PC3 was dominated by Cd (loading = 0.932), accounting for 13.025% of the variance (Table 7).
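The link between the reported variance shares and the Kaiser criterion can be checked with simple arithmetic. Assuming the PCA was run on the correlation matrix of the eight standardized variables, the eigenvalues sum to 8, so each eigenvalue equals the explained-variance fraction multiplied by 8. The sketch applies this assumption to the percentages reported for PC1 to PC3.

```python
# Assumption: PCA on the correlation matrix of 8 standardized variables,
# so the eigenvalues sum to the number of variables.
N_VARIABLES = 8

# Explained variance (%) reported for the three retained components.
explained_pct = {"PC1": 58.143, "PC2": 18.674, "PC3": 13.025}

# Eigenvalue implied by each variance share.
eigenvalues = {pc: pct / 100 * N_VARIABLES for pc, pct in explained_pct.items()}

# Kaiser criterion: retain components whose eigenvalue exceeds 1.
retained = [pc for pc, ev in eigenvalues.items() if ev > 1.0]

# Variance share (%) that corresponds to an eigenvalue of exactly 1.
kaiser_threshold_pct = 100 / N_VARIABLES

for pc, ev in eigenvalues.items():
    print(f"{pc}: eigenvalue ~ {ev:.3f}")
print(f"retained: {retained}; threshold share = {kaiser_threshold_pct:.1f}%")
```

Under this assumption, a component must explain more than 12.5% of the total variance to pass the Kaiser criterion, which all three reported components do.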

Fig. 3

Variance explained by principal components in PCA.

Table 7 PCA of heavy metals in indoor settled dust.
Fig. 4

Hierarchical clustering of heavy metals.

Fig. 5

Optimal PMF components based on Q/Qexp and variance.

Fig. 6 Regression analysis between fenceline and indoor settled dust heavy metal concentrations.

Hierarchical cluster analysis

HCA was performed to assess compositional similarities among the eight heavy metals. As illustrated in Fig. 4, the heavy metals were categorized into three distinct clusters based on their pairwise dissimilarities. The first cluster included Mn, Co, Zn, Cr, and Pb, indicating a relatively high similarity in their compositional profiles. The second cluster comprised Cu and Ni, which exhibited close interrelationships. Cd was grouped separately into a third independent cluster, reflecting a markedly different compositional pattern from the other metals. Clustering was conducted using Ward’s linkage method with Euclidean distance as the dissimilarity metric, facilitating the formation of compact and statistically meaningful clusters. The resulting dendrogram clearly delineated the three groups, visually representing the underlying compositional structure of the metals. These clustering outcomes were largely consistent with the patterns identified through PCA, further supporting the multivariate relationships among the examined elements.
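A clustering of this kind can be sketched with SciPy's hierarchical clustering routines, using Ward's linkage and Euclidean distance as stated above. The input profiles are synthetic, and the z-scoring of each metal is an assumption made for the sketch, since the preprocessing used for the dendrogram is not restated in this section.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

metals = ["Mn", "Co", "Zn", "Cr", "Pb", "Cu", "Ni", "Cd"]

# Synthetic profiles: each metal follows one of three hypothetical latent
# patterns across 20 samples, plus a small amount of noise.
rng = np.random.default_rng(1)
base = rng.normal(size=(3, 20))
pattern_of = [0, 0, 0, 0, 0, 1, 1, 2]
profiles = np.array([base[g] + 0.05 * rng.normal(size=20) for g in pattern_of])

# z-score each metal so that clustering reflects pattern, not magnitude.
mean = profiles.mean(axis=1, keepdims=True)
std = profiles.std(axis=1, ddof=1, keepdims=True)
profiles = (profiles - mean) / std

# Ward linkage on Euclidean distances, then cut the tree into three clusters.
Z = linkage(profiles, method="ward", metric="euclidean")
labels = fcluster(Z, t=3, criterion="maxclust")

clusters = {}
for metal, label in zip(metals, labels):
    clusters.setdefault(int(label), []).append(metal)
print(clusters)
```

Here the metals, rather than the samples, are the objects being clustered, so each row of `profiles` is one metal's pattern across samples. The linkage matrix `Z` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw a tree of the kind shown in Fig. 4.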

Positive matrix factorization

Five components were identified as the most appropriate solution for the dataset, as they provided the best model fit. As shown in Fig. 5, the Q/Qexp ratio was reduced to 0.011, and the cumulative explained variance reached 98.556%, indicating that the five-component solution effectively captured the underlying structure without overfitting (Supplementary Table S4). The five source factors resolved by PMF and their heavy metal compositions are shown in Supplementary Fig. S4, while the detailed contribution profiles are summarized in Table 8. Source 1 primarily consisted of Cd, which represented 78.13% of the total Cd concentration, with only minor contributions from other heavy metals. Source 2 was mainly composed of Co (35.01%) and Mn (33.11%), whereas Cu, Cd, and Ni were not detected. Source 3 had major contributions from Zn (25.30%), and source 4 was the primary contributor to both Cu (9.45%) and Ni (47.65%). Source 5 was distinctively characterized by high contributions of Pb (28.76%) and Cr (26.16%). Bootstrap resampling with 300 iterations provided 95% confidence intervals for the heavy metal contributions, thereby allowing evaluation of the robustness of the resolved source profiles (Supplementary Fig. S5). The bootstrap results indicated that Cd exhibited the highest contribution with a median of 50.5% (95% CI: 3.5–85.5%), while Cu showed the lowest contribution with a median of 0.56% (95% CI: <0.01–14.5%). The PRTR-based sectoral emission profiles that can be compared with the PMF-derived source contribution profiles are presented in Supplementary Table S5. Source 1 was excluded from the similarity analysis because its dominant component, Cd (78.13%), was not included in the PRTR dataset. Emission sources were identified by calculating the cosine similarity based on the heavy metal compositional profiles derived from each PMF source and those of industrial sectors (Table 9). 
Source 2 showed the highest similarity to the manufacture of primary cells, batteries, and accumulators sector (0.921). Source 3 was closely matched with the manufacture of synthetic rubber and plastics in primary forms (0.898), while source 4 corresponded to the manufacture of basic precious and non-ferrous metals (0.862). Source 5 showed a moderate level of similarity to the manufacture of ceramic ware sector (0.686). All four matched sources exceeded the similarity threshold of 0.6, indicating reliable compositional agreement with the corresponding PRTR sectors.
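The matching step can be illustrated with a short sketch of cosine similarity between a source profile and a set of sector profiles. The profiles and sector names below are hypothetical placeholders, not PMF or PRTR values from this study. Only the 0.6 threshold is taken from the text.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("profiles must not be all zeros")
    return dot / (norm_a * norm_b)


def best_match(source_profile, sector_profiles, threshold=0.6):
    """Return (sector, similarity, exceeds_threshold) for the closest sector."""
    scores = {name: cosine_similarity(source_profile, profile)
              for name, profile in sector_profiles.items()}
    name = max(scores, key=scores.get)
    return name, scores[name], scores[name] > threshold


# Hypothetical fractional compositions over the same ordered set of metals.
source = [0.35, 0.33, 0.10, 0.05]
sectors = {
    "sector_A": [0.70, 0.60, 0.20, 0.10],
    "sector_B": [0.00, 0.10, 0.90, 0.30],
}
name, score, ok = best_match(source, sectors)
print(name, round(score, 3), ok)
```

Cosine similarity compares the shape of two compositional profiles and ignores their overall magnitude, so a small source and a large sector with the same metal proportions score 1.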

Table 8 Source contribution of heavy metals by the PMF model.
Table 9 Matched PRTR sectors for each PMF source based on cosine similarity metrics.

Fence monitoring

Linear regression analysis was performed to evaluate the relationship between fenceline metal concentrations and indoor settled dust levels, using monitoring data collected near a basic iron and steel manufacturing facility (Supplementary Table S6). The analysis revealed a strong positive correlation between the two datasets (R² = 0.983, p < 0.01), indicating that ambient emissions were clearly reflected in indoor metal concentrations (Fig. 6). Among the analyzed metals, Cd exhibited particularly consistent concentration patterns in both the fenceline and indoor environments. This result is closely associated with the interpretation of source 1 identified by PMF, which was characterized by a dominant contribution of Cd.
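For reference, the coefficient of determination reported above can be computed from an ordinary least-squares fit as in the sketch below. The paired values are hypothetical, the function name is illustrative, and the significance test (p-value) is not reproduced.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x, with R^2."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical paired values: fenceline level (x) and indoor dust level (y).
fenceline = [0.5, 1.2, 2.0, 3.1, 4.4]
indoor = [1.1, 2.3, 4.2, 6.0, 8.9]
slope, intercept, r_squared = linear_fit(fenceline, indoor)
print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
```

For a simple regression with one predictor, R² equals the square of Pearson's r between the two variables.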

Discussion

This study conducted a comprehensive assessment of human exposure and health risks associated with eight heavy metals (Cu, Mn, Co, Zn, Pb, Cd, Ni, and Cr) based on indoor settled dust collected from 20 households in Myodo-dong, Yeosu, the Republic of Korea. The concentrations of heavy metals in indoor dust were used to estimate both non-carcinogenic and carcinogenic risks, and exposure limits protective of human health were proposed. In addition, multivariate statistical analyses such as Pearson correlation, PCA, and HCA were conducted to explore interrelationships among the heavy metals and identify potential patterns. Furthermore, source apportionment was performed using PMF, and for substances not included in the PRTR database, potential emission sources were identified and validated through comparative analysis with fenceline monitoring data.

An analysis of heavy metal concentrations in indoor settled dust collected from 20 households in Myodo-dong, Yeosu, revealed that the average concentration was highest for Zn, with a mean value of 1,768.34 ± 1,304.81 µg/g. According to a study by Roy et al. (2024)42, elevated concentrations of Zn in indoor settled dust have been associated with various industrial activities. The high Zn concentrations detected in Myodo-dong are likely due to emissions from large-scale industrial complexes such as Gwangyang National, Yeosu, and Yulchon. ADD and LADD for five non-carcinogenic and three carcinogenic metals were estimated using exposure algorithms, with dust ingestion identified as the route contributing the highest values. Consistent with the findings of Ali et al. (2018)43, ingestion was identified as the most significant route for dust-related exposure. Although non-carcinogenic metals posed no significant risks, the TCR for Cd and Ni exceeded 1 × 10⁻⁶, with Ni showing the highest concern, highlighting the need for continued monitoring. Exposure limits for five non-carcinogenic and three carcinogenic heavy metals in indoor settled dust were established through back-calculations based on HI and TCR. The relatively higher thresholds for Cr and Zn indicate their lower toxicological potency and greater tolerable exposure levels44. In contrast, carcinogenic substances tend to have lower thresholds due to their potential to cause adverse health effects even at low doses45. When both the RfD and CSF are available, CSF-based values are generally preferred to better protect vulnerable populations, including children, pregnant women, and the elderly.

Pearson correlation analysis revealed strong positive correlations among Mn, Co, Zn, Cr, and Pb, including Cr–Pb (r = 0.987), Mn–Zn (r = 0.977), Co–Cr (r = 0.940), and Co–Pb (r = 0.928), suggesting a shared emission source. The simultaneous detection of Mn, Co, Zn, Cr, and Pb may be attributed to shared industrial sources, such as the use of metal-containing materials, high-temperature processing, and recycling activities, as supported by previous studies on heavy metal emissions46,47,48. Both PCA and HCA results showed co-occurrence of Cu and Ni, suggesting a potential link to the manufacture of basic precious and other non-ferrous metals, such as copper smelting, nickel refining, and alloy production processes that commonly emit these metals49. The PMF analysis resolved five distinct source profiles characterized by unique elemental compositions, and bootstrap resampling with 300 iterations supported their robustness despite the small sample size. Comparison with sector-specific emission data from the PRTR revealed strong compositional consistency across all matched sources, with cosine similarity values exceeding 0.642. Enriched in Co and Mn, source 2 showed the highest similarity with the manufacture of primary cells, batteries, and accumulators. This finding, in line with the results reported by Brown et al. (2024)50, suggests that battery-related industrial activities may serve as a major contributor to indoor metal contamination, especially in regions close to electronic or energy materials production facilities. Source 3, enriched in Zn, was identified as being associated with the manufacture of synthetic rubber and primary-form plastics, where zinc-based compounds are commonly used as stabilizers and processing aids, potentially leading to Zn emissions during production and recycling51.
The source dominated by Cu and Ni showed consistency with the results of correlation analysis, PCA, and HCA, and was identified as source 4, which was associated with the manufacture of basic precious and non-ferrous metals. This attribution is further supported by previous studies, including life cycle assessments and emission analyses, which have demonstrated that copper and nickel are major pollutants released during smelting and refining processes in the non-ferrous metal industry52,53. Source 5, identified as being associated with the manufacture of ceramic ware, was characterized by elevated levels of Pb and Cr, which are commonly used in ceramic glazes and colorants and may be emitted during high-temperature firing processes54,55. Cd was not included in the PRTR inventory, making it difficult to assign source 1, which showed a predominant contribution of Cd, to a specific sector. Nevertheless, a strong correlation between Cd concentrations at the fenceline and those found in indoor settled dust (R² = 0.983; p < 0.01) suggests that ambient emissions are substantially reflected indoors. This finding underscores the importance of fenceline monitoring as a complementary method for detecting emission sources that may be missing or underrepresented in inventory-based analyses, and it partly addresses the limitation of direct source attribution for Cd by providing indirect validation of its origin.

This study integrated exposure and health risk evaluation based on indoor settled dust, multivariate statistical analyses, PMF receptor modeling, and validation with PRTR and fenceline monitoring data, providing a comprehensive framework for robust source tracking even for substances not listed in emission inventories. However, several limitations remain. Although 20 out of approximately 50 households in the study area participated, which represents a relatively high sampling proportion, the overall sample size is still limited. In addition, the absence of Cd data in the PRTR inventory restricted direct source attribution. Furthermore, household-specific factors such as ventilation conditions, occupant behavior, and cleaning frequency were not considered, even though they can influence dust accumulation and human exposure levels. Future studies should apply site-specific indoor air measurements to refine inhalation exposure estimates, as this study relied on a default PEF recommended by EPA and RIVM that, while ensuring consistency and comparability with previous research, is inherently site-specific and may vary depending on meteorological conditions and indoor activities. As with all PMF applications, rotational ambiguity cannot be fully excluded. In this study, bootstrap resampling with 300 iterations provided confidence intervals that supported factor stability, partly addressing this limitation. Nonetheless, additional validation is warranted, as cosine similarity between PMF-derived profiles and PRTR sectors does not resolve alternative rotations. Future research should therefore complement this approach with displacement analysis or independent tracer validation to further strengthen source attribution.

Conclusions

This study evaluated the health risks, proposed exposure limits, and identified sources of eight heavy metals (Cu, Mn, Co, Zn, Cr, Pb, Cd, Ni) in indoor settled dust from households in Myodo-dong, an industrially impacted area in the Republic of Korea. Zn was the most abundant, and ingestion was the main exposure route. While non-carcinogenic risks were within safe limits, carcinogenic risks for Ni and Cd exceeded acceptable thresholds. Potential source groupings were first explored through multivariate analyses (correlation analysis, PCA, HCA), and five emission sources were then quantitatively resolved using PMF. These sources were validated against PRTR sector profiles using cosine similarity. Cd, although not included in the PRTR inventory, dominated one PMF source and showed a strong correlation with fenceline data. The findings highlight the need for integrated monitoring and modeling to capture sources overlooked by current inventories.