Introduction

Rheumatoid arthritis (RA) is an autoimmune disease characterized by limited joint function, joint deformity, warmth, swelling, and pain1,2. Epidemiological studies have shown that the incidence of RA is 0.5–1.0% among adults of any age and that it is particularly pronounced in women aged 40–50 years3. Although the pathogenesis of RA remains unclear, accumulating evidence indicates that its occurrence and development involve a multifactorial interaction4, in which genetic and environmental factors are particularly important. In recent years, environmental pollution, such as exposure to pesticides, insecticides, and traffic pollution, has been shown to increase the incidence of RA5,6,7. Although many risk factors have been controlled to limit the development of RA, it is still necessary to examine novel environmental pollutants in order to prevent RA.

Volatile organic compounds (VOCs) are common environmental pollutants in daily life. They have a low molecular mass, evaporate easily at normal temperatures, and are produced by industrial manufacturing processes, traffic pollution, home decoration materials, and cleaning products8,9. VOCs can be absorbed by the human body through the respiratory system10. Compared with other pollutants, VOCs are more harmful to the human body. VOCs are metabolized into a variety of compounds in the blood and are potentially associated with the risk of several diseases, including chronic respiratory disease11, kidney disease12, and obesity13. VOCs can also abnormally activate or overactivate immune cells and thereby induce autoimmune diseases such as RA and asthma14,15. According to previous research, a total of 279 compounds have been found in urine, whereas almost no VOCs can be detected directly in blood16. VOCs are metabolized by the liver and kidneys and then eliminated from the body, and the resulting metabolites can be detected in urine. Hence, VOC metabolites can be accurately determined in blood and urine and have been identified as stable biomarkers17,18. Although some urinary VOC metabolites can be detected, the association between VOC metabolites and RA is understudied.

Therefore, this study aimed to report the association between RA and urinary VOC metabolites, which specifically represent the degree of exposure to individual VOCs. We obtained data from the 2011–2018 survey cycles of the National Health and Nutrition Examination Survey (NHANES). To the best of our knowledge, this is the first study to use logistic regression, weighted quantile sum (WQS) regression, quantile g-computation (qgcomp), and Bayesian kernel machine regression (BKMR) models to systematically investigate whether exposure to specific VOCs is associated with RA. This study provides new evidence on the association between VOC exposure and RA and may guide prevention and control measures against RA.

Methods

Study design and sample population

We utilized data from the National Health and Nutrition Examination Survey (NHANES), sponsored by the CDC, which assesses the health and nutritional status of the U.S. population by collecting demographic, dietary, examination, and laboratory data (https://www.cdc.gov/nchs/nhanes/index.htm). All protocols implemented by NHANES were reviewed and approved by the National Center for Health Statistics (NCHS) Ethics Committee, and all participants signed informed consent forms. We confirm that all experiments were performed in accordance with relevant guidelines and regulations. The program employs a stratified, multistage probability sampling method and is conducted every two years. In this study, data from four NHANES cycles (2011–2012, 2013–2014, 2015–2016, 2017–2018) were included. Participants who did not know whether they had RA, or who had missing information on urinary VOCs or other covariates, were excluded. Finally, we identified 3390 participants for analysis.

RA assessment

RA status was ascertained via a questionnaire survey. Briefly, every participant was asked, “Has a doctor or other health professional ever told you that you have arthritis?” If the reply was “yes,” a follow-up question, “Which type of arthritis was it?”, was asked to clarify further. Participants who answered “rheumatoid arthritis” were defined as the RA group, while participants who answered “no” to the first question were defined as the non-RA group.

Urinary volatile organic compounds metabolites

In this study, the concentrations of VOC metabolites in human urine were measured by ultra-performance liquid chromatography coupled with electrospray tandem mass spectrometry (UPLC-ESI/MSMS). Following the NHANES guideline, concentrations below the limit of detection (LOD) were replaced by the LOD divided by the square root of two. In total, 15 VOC metabolites with detection rates > 70% were included in the analysis. The abbreviations and detection rates of the urinary VOCs involved in the study are shown in Table S1.
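The substitution rule and the detection-rate filter described above can be sketched as follows. This is a minimal illustration in Python, not the authors' R code; the function names, the sample concentrations, and the LOD value are hypothetical.

```python
import math

def impute_below_lod(values, lod):
    """Replace concentrations below the LOD with LOD / sqrt(2)."""
    fill = lod / math.sqrt(2)
    return [v if v >= lod else fill for v in values]

def detection_rate(values, lod):
    """Percentage of samples at or above the LOD."""
    detected = sum(1 for v in values if v >= lod)
    return 100.0 * detected / len(values)

# Hypothetical concentrations and LOD, for illustration only.
sample = [0.2, 1.5, 3.0, 0.4, 2.2]
lod = 0.5

imputed = impute_below_lod(sample, lod)
rate = detection_rate(sample, lod)

# Inclusion rule from the text: keep metabolites detected in > 70% of samples.
keep_metabolite = rate > 70.0
```

In this toy sample the metabolite would be dropped, because only three of the five values reach the LOD.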

Covariates

Covariates were selected according to previous studies19,20. Age, sex (male or female), race/ethnicity (Mexican American, Other Hispanic, non-Hispanic white, non-Hispanic black, other race/multiracial), and education (less than high school, high school or equivalent, some college, college graduate or above) were collected from questionnaires. Other covariates, such as weight, height, and BMI, were obtained from the physical examination. A smoker was defined as someone who had smoked at least 100 cigarettes in his or her lifetime. Similarly, alcohol users were identified as individuals who had consumed alcoholic drinks in the past 12 months. For hypertension and diabetes, participants were asked “Have you ever been told you had high blood pressure?” and “Did the doctor tell you that you have diabetes?”; those who answered “yes” were defined as having hypertension or diabetes, respectively. Based on previous research, liver function (ALB, ALP, ALT, AST, GGT, TB) and kidney function (BUN, Cr) indicators were collected from the laboratory results21,22.

Statistical analysis

Baseline characteristics stratified by RA status were compared using Chi-square tests. Because four survey cycles were combined, the sampling weights were computed as the 2-year subsample weights divided by 4. Continuous variables are expressed as unweighted mean (SD), while categorical variables are expressed as n (percentage). The concentrations of urinary VOCs were Ln-transformed to approximate a normal distribution. Correlations between the Ln-transformed concentrations of urinary VOCs were assessed with the Pearson method, and the concentrations were divided into quartiles (Q1, Q2, Q3, and Q4). For subgroup analysis, stratified analyses were performed according to age: a young and middle-aged group (20 ≤ age < 60) and an elderly group (age ≥ 60).
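The weight rescaling, Ln-transformation, and quartile grouping described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function names and input values are hypothetical, and the quartile cut points use the default method of the standard library's `statistics.quantiles`, which may differ from the rule used in the original analysis.

```python
import math
import statistics

def combined_weight(two_year_weight, n_cycles=4):
    """Rescale a 2-year subsample weight when n_cycles cycles are pooled."""
    return two_year_weight / n_cycles

def ln_transform(concentrations):
    """Natural-log transform; concentrations must be positive."""
    return [math.log(c) for c in concentrations]

def quartile_labels(values):
    """Label each value Q1..Q4 using the sample quartile cut points."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    labels = []
    for v in values:
        if v <= q1:
            labels.append("Q1")
        elif v <= q2:
            labels.append("Q2")
        elif v <= q3:
            labels.append("Q3")
        else:
            labels.append("Q4")
    return labels

# Hypothetical values, for illustration only.
weight = combined_weight(40000.0)
ln_values = ln_transform([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
labels = quartile_labels(ln_values)
```

Because the log transform is monotone, the quartile membership of a participant is the same whether the cut points are taken before or after the transformation of these sample values.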

First, multivariate logistic regression models were applied to determine the effect of urinary VOCs on RA. For sensitivity analyses, logistic regression was also performed with the first quartile (Q1) as the reference group, adjusting for different sets of covariates. Trend tests (p for trend) were performed by treating the quartiles of each concentration as a linear variable. In the stratified analysis, we classified age into two groups and tested the association between urinary VOCs and RA by multivariate logistic regression.
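As a simplified illustration of a quartile comparison against Q1, the unadjusted odds ratio for Q4 versus Q1 can be computed directly from a 2x2 table, together with a Wald 95% confidence interval. This corresponds only to a crude, unweighted logistic model with a single binary exposure; it does not reproduce the covariate-adjusted models of the study. The function name and the counts are hypothetical.

```python
import math

def crude_odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table.

    a: RA cases in Q4, b: non-RA participants in Q4,
    c: RA cases in Q1, d: non-RA participants in Q1.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the study.
odds_ratio, lower, upper = crude_odds_ratio(30, 70, 15, 85)

# The association is significant at the 5% level when the CI excludes 1.
significant = lower > 1.0 or upper < 1.0
```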

Second, WQS and qgcomp were used to evaluate the overall effect of the VOC mixture on RA. For WQS, we randomly divided the data into training and validation sets (40:60), and the regression model was bootstrapped 10,000 times. In contrast, qgcomp evaluates the combined effect of all exposures and does not require the directional homogeneity assumption; the positive and negative weights represent the proportional contribution of each VOC in the respective direction. Next, the BKMR model was used to assess the effect of mixed VOC exposure on RA at specific quantiles compared with the median, through 10,000 iterations. Bivariate exposure-response and dose-response curves were plotted to visualize the relationships among the VOCs and RA. The posterior inclusion probability (PIP) was calculated with the BKMR model to estimate the relative contribution of each component of the VOC mixture.
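The core idea of WQS can be sketched as follows: each exposure is scored by quartile (0 to 3), and the scores are collapsed into a single index using non-negative weights that sum to one; the index then enters the outcome regression. In practice the weights are estimated from bootstrap samples of the training set; in this sketch they are simply assumed, and the function name, weights, and scores are hypothetical.

```python
def wqs_index(quartile_scores, weights):
    """Weighted quantile sum index for one participant.

    quartile_scores: quartile score (0-3) of each exposure.
    weights: non-negative weights, one per exposure, summing to 1.
    """
    if len(quartile_scores) != len(weights):
        raise ValueError("one weight is required per exposure")
    if any(w < 0 for w in weights):
        raise ValueError("WQS weights must be non-negative")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("WQS weights must sum to 1")
    return sum(w * q for w, q in zip(weights, quartile_scores))

# Hypothetical three-component mixture, for illustration only.
weights = [0.5, 0.3, 0.2]
scores = [3, 1, 0]  # quartile scores of one participant
index = wqs_index(scores, weights)
```

The constraint that all weights share one sign is what the text calls the directional homogeneity assumption, which qgcomp relaxes.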

Third, restricted cubic spline (RCS) regression with four knots, selected according to the AIC value, was performed to further explore the relationship with RA. To identify potential interactions, likelihood ratio tests were used to assess whether the covariates interacted with AMCC, CEMC, and CYMC. All analyses were performed using R 4.2.2. A p value < 0.05 was considered statistically significant.

Results

Population characteristics

Figure 1 shows the flow chart of participant selection. In total, 3390 eligible participants from 2011 to 2018 were included in this study. Among them, 204 (6%) had been diagnosed with RA and 3186 (94%) did not have RA. Table 1 shows the baseline characteristics of the study participants. There were statistically significant differences between RA and non-RA participants in age, ethnicity, education, alcohol drinking, hypertension, diabetes, smoking, height, BMI, ALB, ALP, TB, Cr, and GGT (all p < 0.05). No significant differences were found in gender, weight, AST, ALT, or BUN (all p > 0.05).

Fig. 1 Flow chart of participant selection.

Table 1 Characteristics of participants.

Urinary VOCs levels and correlations of the study population

Table S1 shows the abbreviations and detection rates of the 15 urinary VOC metabolites. The detection rates of all urinary VOCs were > 80%, except for CYMC (78.97%). The urinary concentrations of AMCC, CEMC, DHBC, MB3C, PHGA, and PMMC were significantly higher in participants with RA than in those without RA (all p < 0.05). Fig S1 shows the Pearson correlation coefficients among the Ln-transformed concentrations of the urinary VOCs. MB3C was significantly correlated with HPMC (r = 0.83, p < 0.001) and HPMC (r = 0.88, p < 0.001), followed by 2MHA and 3,4-MHA (r = 0.89, p < 0.001), PMMC and HPMC (r = 0.86, p < 0.001), HPMC and CEMC (r = 0.83, p < 0.001), MADA and PHGA (r = 0.84, p < 0.001), and PHGA and DHBC (r = 0.81, p < 0.001). The other urinary VOCs exhibited moderate to weak correlations, with coefficients ranging from 0.22 to 0.80.
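For reference, a Pearson correlation between two Ln-transformed metabolite concentrations can be computed as in the sketch below. The function name and the concentration values are hypothetical and are not data from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical concentrations of two metabolites in four participants.
metabolite_a = [1.0, 2.0, 4.0, 8.0]
metabolite_b = [3.0, 5.0, 12.0, 20.0]

# Correlate the Ln-transformed concentrations, as in the text.
r = pearson_r([math.log(v) for v in metabolite_a],
              [math.log(v) for v in metabolite_b])
```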

Urinary VOCs exposure and RA risk in the logistic regression model

To assess the potential association between VOC exposure and RA, we employed univariate and multivariate logistic regression models adjusted for different covariates. As presented in Table 2, in model I, without controlling for any covariates, AMCC, CEMC, DHBC, MB3C, PHGA, and PMMC were significantly associated with RA (all p < 0.05). After adjustment for covariates, 2MHA, AAMC, AMCC, CEMC, CYMC, DHBC, HPMC, and MB3C were significantly associated with RA (all p < 0.05).

Furthermore, a sensitivity analysis was performed. In model I, without adjustment for covariates, participants in increasing quartiles of AAMC, AMCC, BMAC, CEMC, CYMC, DHBC, HPMC, MB3C, and PMMC showed an increased risk of RA compared with Q1. After adjustment for covariates in model II and model III, an increasing risk of RA was still observed with increasing quartiles of AAMC, AMCC, CEMC, CYMC, DHBC, HPMC, MB3C, and PMMC (all p for trend < 0.05, except for DHBC and PMMC) (Table S3).

Table 2 Logistic regression analysis of Ln-transformed urinary VOCs and RA, NHANES 2011–2018.

Subgroup analysis

Table 3 shows the subgroup analyses stratified by age (group I: age 20–59; group II: age 60 or older). After adjustment for all covariates in the multivariate logistic regression model, the association between urinary VOC levels and RA was mainly present in participants aged 20–59 years. In this group, 2MHA, 3,4-MA, AAMC, AMCC, CEMC, CYMC, HPMC, PMMC, and MB3C were significantly associated with RA (all p < 0.05).

Table 3 Logistic regression analysis of Ln-transformed urinary VOCs and RA in subgroup analyses stratified by age, NHANES 2011–2018.

WQS and qgcomp model to evaluate the associations of urinary VOCs co-exposure with RA risk

We employed the WQS and qgcomp models to examine the association between the combined effect of urinary VOCs and the prevalence of RA. As shown in Table 4, in both model I and model II, the WQS index of urinary VOCs in the positive direction was significantly associated with RA in the total population (model I: OR: 1.22, 95% CI: 1.00–1.48; model II: OR: 1.39, 95% CI: 1.07–1.80). However, the qgcomp index of urinary VOCs was significant only in model II (OR: 1.30, 95% CI: 1.05–2.86). Interestingly, both the WQS index (model I: OR: 1.88, 95% CI: 1.31–2.70; model II: OR: 1.7, 95% CI: 1.10–2.62) and the qgcomp index (model I: OR: 1.63, 95% CI: 1.18–3.27; model II: OR: 1.91, 95% CI: 1.33–3.47) were significantly associated with RA in the age 20–59 subgroup. Meanwhile, no significant association with RA was found in the group aged 60 or older.

Fig S2-4 present the estimated weights of the urinary VOC metabolites for RA. In the total population, AMCC had the highest weight in the mixture, followed by CYMC and AMCC. In the age 20–59 group, 2MHA had the highest weight in the mixture, followed by CEMC and CEMC. Additionally, these three components were in the positive direction in the qgcomp regression. Finally, in the group aged 60 or older, CYMC had the highest weight in the mixture, followed by CEMC and AMCC. Interestingly, in the age 20–59 group, the weights of AMCC, CYMC, and CEMC all exceeded the equal-weight threshold (1/15 ≈ 0.067).
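The threshold used above is the weight each component would receive if all 15 metabolites contributed equally, that is 1/15 ≈ 0.067. A minimal sketch of this comparison follows; the component names and weight values are placeholders rather than the estimated weights from Fig S2-4.

```python
def above_equal_weight(weights):
    """Return the equal-weight threshold and the components exceeding it.

    weights: mapping from component name to its estimated mixture weight.
    """
    threshold = 1 / len(weights)
    flagged = sorted(name for name, w in weights.items() if w > threshold)
    return threshold, flagged

# Placeholder weights for a 15-component mixture (they sum to 1):
# three components at 0.2 each, the other twelve sharing the remaining 0.4.
names = [f"VOC{i:02d}" for i in range(1, 16)]
weights = {name: (0.2 if i < 3 else 0.4 / 12) for i, name in enumerate(names)}

threshold, flagged = above_equal_weight(weights)
```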

Table 4 Odds ratios (95% CI) of RA associated with co-exposure to urinary VOCs by WQS and qgcomp analyses in total population and subgroups.

BKMR model to analyze the association of urinary VOCs metabolites co-exposure with RA

The joint effect of urinary VOC metabolites on RA is displayed in Fig S5-7A. For the total population, the overall risk of RA increased with urinary VOC metabolite concentrations below the 50th percentile, whereas above the 50th percentile the risk of RA tended to decline as concentrations increased, although the overall effect was not significant. Similar trends and effects were observed in the group aged 60 or older. However, in the age 20–59 group, the overall risk of RA showed a significant upward trend with increasing urinary VOC metabolite concentrations compared with the 50th percentile. As shown in Fig S5-7B, when the concentrations of the other urinary VOCs were fixed at the 50th percentile, CYMC, CEMC, and AMCC showed a significant positive effect on RA risk in the age 20–59 group, with PIP values of 0.514, 0.490, and 0.476, respectively. Similar trends and effects were seen in the total population and in the group aged 60 or older, although the associations did not reach significance. The PIP values of the urinary VOCs are shown in Table S4.

Fig S8-9 reveal the trends of the exposure-response functions of the urinary VOCs. With the other mixture components held at their median concentrations, AMCC, CEMC, and CYMC presented positive dose-response associations with RA, whereas HP2C showed a negative dose-response association with RA, among participants aged 20–59. Similarly, DHBC was positively associated and BMAC negatively associated with RA. In the total population, we did not find significant dose-response associations with RA. We further investigated the interactions between different urinary VOCs by fixing the other metabolites at their median levels and estimating the exposure-response function at the 10th, 50th, and 90th percentiles. We observed a crossing of the curves for AMCC and CEMC as the quantile increased from the 10th to the 90th, suggesting a possible interaction in the age 20–59 group. In the total population and in the group aged 60 or older, we did not find this possible interaction among AMCC, CEMC, and CYMC (Fig S11-13).

Dose-response relationships between AMCC, CEMC, CYMC and RA in age 20–59

The restricted cubic spline curves further characterized the relationship between urinary VOCs and RA. After adjustment for confounders, Fig S14-16 show a linear relationship between RA risk and the concentrations of CEMC (p for nonlinearity = 0.543), CYMC (p for nonlinearity = 0.550), and AMCC (p for nonlinearity = 0.689). The risk of RA increased significantly with urinary CEMC levels (p for overall < 0.001). A similar association was seen for CYMC (p for overall < 0.001) and AMCC (p for overall = 0.002). As presented in Table S5-7, when we stratified by gender, BMI, education, ethnicity, alcoholic drinks, smoking, diabetes, and hypertension, no significant interactions were observed for CEMC and AMCC (all p for interaction > 0.05). However, alcoholic drinks (p for interaction = 0.042) and smoking (p for interaction = 0.046) might be potential modifiers of the relationship between CYMC exposure and RA in participants aged 20–59.

Discussion

In this cross-sectional study, we investigated 3390 adults from the NHANES database between 2011 and 2018 and systematically analyzed the association between specific urinary VOC metabolites and the risk of RA. After adjustment for potential confounders, multivariate logistic regression showed that 2MHA, AAMC, AMCC, CEMC, CYMC, DHBC, HPMC, and MB3C were independently associated with RA. Subgroup analysis showed that the relationship between urinary VOC metabolites and RA was significant among participants aged 20–59. In general, WQS regression explores the effect of a mixed exposure on the outcome in one direction, while qgcomp regression allows exposures to be associated with the outcome in either direction. The BKMR model builds kernel functions of the exposures and then uses Bayesian sampling, which allows for nonlinear effects and interactions between exposures. However, the BKMR model is unable to assess the co-exposure patterns of low and high levels of chemicals23. Taken together, the WQS, qgcomp, and BKMR models indicate that the overall effect of VOCs can increase the risk of RA in participants aged 20–59, with CEMC, CYMC, and AMCC as the major contributors. A linear relationship and a cutoff effect were observed between CEMC, CYMC, AMCC and RA. Finally, when stratified by gender, race, education, alcohol, smoking, diabetes, hypertension, and BMI, the results reveal that smoking and alcohol were potential modifiers of the relationship between CYMC exposure and RA.

VOCs are common and readily encountered pollutants that evaporate easily at room temperature and are resistant to degradation. Vehicle exhaust emissions, adhesives, paints, and house decoration may increase the level of environmental pollution24,25. In general, the main route of VOC absorption in humans is inhalation into the lungs, followed by entry into the bloodstream through the alveoli. VOCs are metabolized by hepatic cytochrome P450 enzymes to form water-soluble VOC metabolites, which are excreted from the body through the kidneys26. Because VOCs are volatile, direct measurement in blood is not accurate. As such, the water-soluble VOC metabolites in urine serve as indicators of VOC exposure27. Most previous studies have focused on the relationship between urinary VOCs and diseases28,29. Recently, studies using NHANES data have revealed that urinary VOC metabolites are associated with depression30, obesity31, and chronic respiratory diseases32. Until now, no study has systematically examined the relationship between VOC metabolites and RA.

CEMC is a major VOC metabolite whose parent compound is acrolein, which is produced during combustion and is used as an industrial sterilant, in water treatment, and in the synthesis of many industrial chemicals33. One animal study showed that CEMC can form after oxidation when acrylic acid is transformed into 3HPMA through conjugation with glutathione in vivo. In a study of the effect of acrolein on glucose metabolism in skeletal muscle, exposure of mice to acrolein (2.5 and 5 mg/kg/day) for 4 weeks substantially increased fasting blood glucose and impaired glucose tolerance34. Another nationwide cross-sectional study indicated positive associations between multiple VOCs and metabolic syndrome35. Additionally, in the WQS analysis of that study, the weight of the positive association between the VOC mixture and high blood pressure was 66.40% 35. Only one previous cross-sectional study has investigated the association between VOCs and the risk of RA. Hu reported significantly higher concentrations of 7 VOCs (AMCC, CEMC, DHBMA, 3HPMA, MHBMA3, PGA and HMPMA) in the RA subgroup than in the non-arthritis subgroup. After adjustment for covariates, logistic regression showed that 6 VOCs (CEMA, 3HPMA, DHBMA, AMCC, PGA and MA) remained associated with RA36. Interestingly, our study found that 9 VOCs (AAMC, AMCC, BMAC, CEMC, CYMC, DHBC, HPMC, MB3C and PMMC) were associated with RA in the crude model. After adjustment for all covariates, 8 VOCs (AAMC, AMCC, CEMC, CYMC, DHBC, HPMC, MB3C, and PMMC) differed significantly between RA and non-RA participants. Most of the parent compounds of these metabolites have been shown to be hazardous environmental pollutants for humans.

To further analyze the association between VOC exposure and RA, subgroup analysis showed that higher levels of 8 VOCs (2MHA, 3,4-MA, AAMC, AMCC, CEMC, CYMC, HPMC, and MB3C) were significantly associated with an elevated prevalence of RA in participants aged 20–59. No significant association was found in participants aged 60 or older. A likely reason for this age difference is that young and middle-aged adults are more easily exposed to VOCs than older people. Most employees in gas stations, factories, kitchens, and the chemical industry are aged between 20 and 59 37,38, so these participants have more opportunities for VOC exposure. Additionally, the degree of the immune response to VOCs is a major factor. In general, autoimmune disease may be induced by a lack of immune tolerance that causes tissue damage. Altered function of different immune cells, such as T cells, dendritic cells (DCs), and naïve CD4+ T cells, increases the secretion of cytokines such as IL-12 and IL-23 39,40,41 and ultimately results in the emergence of autoimmune diseases. More than 100 autoimmune diseases have been correlated with VOC exposure42. The potential mechanisms by which VOCs are involved in RA are still not completely understood, although several explanations have been proposed. No specific environmental factor is yet fully supported by direct causal evidence.

Besides the acrolein metabolite, another VOC metabolite was positively associated with RA in this study. CYMC is the major metabolite of acrylonitrile, and the statistical evidence for CYMC was significant both in Q4 and across the quartiles. Acrylonitrile is commonly used in the manufacture of acrylic fibers, plastics, synthetic rubbers, and acrylamide, with which we come into contact in daily life43. Its metabolites are mainly excreted in urine after direct conjugation with reduced glutathione (GSH)44,45. Studies show that acrylonitrile can cause headaches, nausea, and dizziness46, and it has been identified as possibly carcinogenic. Consistent with a previous study, we found that urinary CYMC was associated with RA after adjustment for covariates in model II and model III36. Further analysis of the interaction between covariates and CYMC indicated that smoking and alcohol drinking are the two major factors that interact with CYMC and promote the occurrence of RA.

In addition, this study found that AMCC can increase the risk of RA. N,N-Dimethylformamide, the parent compound of AMCC, is metabolized primarily by liver enzymes47. Studies have shown that N,N-Dimethylformamide can induce liver damage by reducing GSH levels and increasing reactive oxygen species48. This involves activation of the NLRP3 inflammasome and leads to injury of the central nervous, immune, and reproductive systems49. In the current study, AMCC was significantly associated with RA in the crude model as well as in model II and model III, and the statistical evidence for AMCC was significant across the quartiles. Further analysis based on the BKMR model showed that AMCC possibly interacts with CYMC and CEMC. These results can help explain why VOCs may cause RA, since excessive oxidative stress and inflammatory reactions are also among its main mechanisms.

The current study has several strengths. The data come from the reliable and nationally representative NHANES survey. The sample of 3390 participants allowed adjustment for multiple covariates to minimize confounding bias. To the best of our knowledge, this is the first study to combine logistic regression, WQS, qgcomp, and BKMR models to estimate the effect of mixed exposures on RA. Quartile-based logistic regression models with different covariate adjustments were used to assess the association between VOCs and RA, and the results remained stable for CEMC, AMCC, and CYMC. Nonetheless, our study has several limitations. First, because of the cross-sectional design, a causal relationship between VOC exposure and RA cannot be inferred. Further prospective cohort studies with large sample sizes are essential to confirm our results. Second, the ascertainment of RA was based on self-report, which may lead to bias because participants may have been unclear about whether they had RA. Third, although we adjusted for potential confounders in the analysis, the presence of unmeasured confounding cannot be entirely ruled out. Finally, urinary VOC metabolites from the US population were used to reflect VOC exposure in this study, so the findings may not represent the global population.

Conclusion

In summary, our study demonstrated a positive correlation between urinary VOC concentrations and RA. Across the 15 VOC metabolites included in this study, the mixture analyses using logistic regression, WQS, qgcomp, and BKMR models consistently revealed positive associations between urinary VOC co-exposure and RA risk, and N-acetyl-S-(2-carboxyethyl)-L-cysteine (CEMC), N-acetyl-S-(N-methylcarbamoyl)-L-cysteine (AMCC), and N-acetyl-S-(2-cyanoethyl)-L-cysteine (CYMC) were the primary positive drivers in participants aged 20–59. Given the limitations of this study, more prospective studies should be conducted to confirm the relationship between VOC exposure and the prevalence of RA.