Introduction

Hashimoto’s thyroiditis (HT) is the leading cause of thyroid disease and hypothyroidism and the most common autoimmune disease, affecting mostly middle-aged women1. The primary treatment for HT is hormonal replacement therapy with levothyroxine, aiming to substitute the thyroid gland’s lost function. However, many patients have reduced quality of life (QoL) and persisting symptoms despite being euthyroid with normal thyroid stimulating hormone (TSH) levels, highlighting the need to address the underlying etiopathogenetic mechanisms of HT2. HT is a multifactorial autoimmune disease involving environmental, metabolic, dietary, hormonal and lifestyle triggers aggravated by susceptibility genes associated with the disease3. In addition, the presence of metabolic complications and/or comorbidities further aggravates the patient’s overall health4,5,6. The association between thyroid autoimmunity and cardiometabolic risk has been well established, even in euthyroid patients. It has been postulated that the autoimmunity-mediated pro-inflammatory microenvironment increases the risk of metabolic diseases5.

On the other hand, studies analyzing the association between insulin resistance (IR), inflammation and autoimmune thyroiditis have indicated a contributing role for IR in autoimmunity processes7. In addition, the presence of IR may hamper the efficacy of levothyroxine treatment, as shown by a parallel study of men with or without IR receiving levothyroxine8. Central to the above mechanisms is the role of the gut microbiome through the thyroid-gut axis. Dysbiosis, which refers to the disruption of the intestinal microbiome balance, can directly or indirectly affect thyroid hormone production and aggravate low-grade inflammation in a T-cell-dependent manner9. Micronutrients that are important for thyroid function and hormone production, such as iodine, iron, copper, chromium, selenium and zinc, are often deficient in patients with HT, either due to low intake or malabsorption caused by dysbiosis10. Beyond their effect on the thyroid gland, micronutrients are essential for regulating immune responses, and vitamin D supplementation has been shown to be beneficial in thyroid and other inflammatory diseases11,12. In addition, increased gut permeability due to dysbiosis facilitates the entry of antigens into the bloodstream, activating an immune response10. Considering the large number of studies demonstrating distinct metabolic imbalances and cellular dysfunctions in HT, this study aimed to define the metabolic imprint of HT using a comprehensive panel of metabolites as markers.

Metabolites, small molecules found in human biofluids, carry information on past exposures and the current health state at a cellular/metabolic level. Significant advancements have been made in mapping human metabolomes and deciphering disease-associated metabolites as potential biomarkers. Still, few studies have focused on identifying potential metabolic biomarkers for HT, and most of them either employ exploratory untargeted metabolomics or study a small heterogeneous group13,14,15. There is a pressing need to seek sensitive biomarkers that capture the metabolic state of patients with HT and reflect the underlying metabolic disturbances. Metabolism is a complex network of thousands of metabolites, each of which might serve as a marker of a specific metabolic block. Therefore, we analyzed the changes of metabolites participating in key pathways of cellular metabolism and function. Considering their biological relevance, we measured the levels of urine organic acids and plasma fatty acids in samples from patients with HT and compared them to those of age- and sex-matched healthy individuals.

Materials and methods

Study design

The present study is part of the metabolic biomarkers in Hashimoto’s thyroiditis and psoriasis (METHAP) clinical trial with registration number NCT04693936 at clinicaltrials.gov. The details of the rationale, objectives and design of the study have been described previously16. Between February 2021 and July 2023, a total of 200 individuals were recruited at the Heraklion University Hospital in Crete, and the Health Clinic for Autoimmune and Chronic Diseases in Athens, with the contribution from private practices in Athens to reach the required number of participants. For the purpose of this case-control study, a total of 120 participants, 62 patients with HT and 58 age and sex-matched healthy individuals were included.

Potential participants were first screened by an endocrinologist to assess whether the patients met the following criteria:

Inclusion criteria for all participants: 18–60 years old, BMI < 30, women who were neither pregnant nor lactating, and non-athletes.

Inclusion criteria for Hashimoto’s thyroiditis: Presence of anti-thyroid antibodies and gray-scale findings.

Exclusion criteria for Hashimoto’s thyroiditis: Individuals having undergone complete thyroidectomy with malignant or congenital goiter.

Exclusion criteria for the control group: Participants with acute or chronic disease receiving medication, anti-depressants or supplements.

Participants were verbally informed of the study objectives, duration, and methodological details, and upon their consent, they were requested to read and sign the informed consent.

Baseline measurements included TSH levels for both groups and FT3, FT4, anti-TPO and anti-TG for the HT group. In addition, participants were requested to fill in a form recording demographic and lifestyle data, including age, sex, waist circumference, presence of comorbidities, drug and/or supplement use, smoking status, exercise frequency (times/week) and alcohol consumption (number of glasses/week), while their dietary habits were recorded through the Mediterranean Diet Score (MDS). Metabolomic profiling was performed in both groups using targeted metabolomics.

Ethics

Collected data were anonymized, and the study conformed to the EU General Data Protection Regulation (GDPR). Research was performed in accordance with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. The present study has received approval by the Research Ethics Committee of the University of Crete (AP 147/10072020). Informed consent was obtained from all participants. This research received no external funding.

Sample collection

Fasted participants were requested to collect urine samples in a sterilized container. Peripheral blood was collected on the same day and at the same time, to minimize day-to-day and time-of-day metabolite fluctuations, using a K2-ethylenediaminetetraacetic acid (EDTA)-containing vacuum blood collection tube. Plasma was isolated by centrifugation at 1500×g at 4 °C. In the case of hemolysis, the hemolyzed sample was discarded, and the blood collection was repeated. Urine and plasma samples were aliquoted and then stored at −80 °C and −20 °C, respectively, until analysis and up to 24 h to ensure minimal metabolite degradation.

Samples processing and GC–MS conditions

The steps used for the isolation of urinary organic acids and plasma fatty acids, followed by the separation, detection and quantification methodology, were conducted referring to previously published methodology on the metabolites of interest17,18. Concentrations of organic acids are reported in relation to creatinine levels. As enzymes catalyze metabolic reactions, we estimated the ratio of product metabolite to reactant metabolite to assess the function of specific enzymes.
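The two calculations described above (normalization to creatinine and the product-to-reactant ratio used as a proxy for enzyme function) can be illustrated with a minimal sketch. The function names and the numeric values are hypothetical and serve only to show the arithmetic; they are not taken from the study's pipeline or data.

```python
def creatinine_normalized(acid_mmol_per_l, creatinine_mol_per_l):
    """Express a urinary organic acid as mmol per mol creatinine."""
    if creatinine_mol_per_l <= 0:
        raise ValueError("creatinine concentration must be positive")
    return acid_mmol_per_l / creatinine_mol_per_l


def product_to_reactant_ratio(product, reactant):
    """Proxy for enzyme function: product level divided by reactant level."""
    if reactant <= 0:
        raise ValueError("reactant level must be positive")
    return product / reactant


# Hypothetical readings from one urine sample (illustration only).
succinic = creatinine_normalized(acid_mmol_per_l=0.05, creatinine_mol_per_l=0.01)
fumaric = creatinine_normalized(acid_mmol_per_l=0.01, creatinine_mol_per_l=0.01)

# Fumaric acid is the product and succinic acid the reactant of succinate
# dehydrogenase, so a lower ratio suggests lower apparent enzyme activity.
sdh_proxy = product_to_reactant_ratio(product=fumaric, reactant=succinic)
```

Because both metabolites come from the same sample, the creatinine term cancels in the ratio, so the ratio itself does not depend on urine dilution.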

Statistical analysis

All statistical analyses were undertaken using the statistical package SPSS, the free software R, and the web-based platform MetaboAnalyst 6.0. All data variables were pre-processed via auto-centering, log transformation, and median normalization, while missing values for biomarkers were replaced by 1/5 of the lower limit of detection. The chi-square test was employed to explore the relationship between the categorical variables in the sample, such as “gender” (male/female), “smoking status” (yes/no), “exercise” (yes/no) and “alcohol consumption” (yes/no), and the group of patients (Hashimoto/control). To evaluate the normality of continuous variables (age, BMI, TSH levels), the Shapiro–Wilk test was employed as the standard approach, since it offers higher statistical power and greater credibility than other similar tests19. Under the normality assumption, we used the t-test, while its non-parametric equivalent, the Mann–Whitney U test, was used to investigate statistically significant differences across organic acids that did not follow the normal distribution. Nominal p values were corrected via the Bonferroni correction to limit false positives. In this study, a fold change threshold of 1.5 was also adopted to identify metabolites exhibiting significant differential expression, while to control for multiple comparisons we applied a false discovery rate (FDR) adjustment at a level of 0.1. In the next step, an ordinary least squares (OLS) regression model was fitted with each of the potentially significant biomarkers as the dependent variable, on a log-transformed scale, to explore the causal linear relationship and adjust for confounders.
For the case of biomarkers with proportion data, a beta regression was employed, which represents a flexible extension of the generalized linear model, with the logit as a link function, and provides an advantageous alternative for proportion-bounded data20. In line with our previous research, an artificial neural network (ANN) framework was finally implemented as a predictive model for Hashimoto’s disease based on critical OA and TFA biomarkers alongside main demographic data17,18. The ANN model employed was a feed-forward neural network, trained using the error backpropagation algorithm. The receiver operating characteristic (ROC) curve was utilized to evaluate the model’s accuracy, which aligns with established methods for assessing model performance. Finally, a debiased sparse partial correlation (DSPC) network analysis was employed in both groups. In short, DSPC enables the computation of partial correlation coefficients and their p values while facilitating the identification of underlying connectivity patterns among numerous metabolic features. The main network centrality measures, namely degree and betweenness, were also estimated.
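The pre-processing steps named above can be sketched for a single metabolite as follows. This is an illustration under stated assumptions, not the MetaboAnalyst implementation: the order of operations (imputation, then log transformation, then scaling) is assumed; "auto-centering" is interpreted here as mean-centering with division by the standard deviation; median normalization across samples is omitted for brevity; and the readings are hypothetical.

```python
import math
import statistics


def impute_below_lod(values, lod):
    """Replace missing readings (None) with one fifth of the detection limit."""
    return [lod / 5 if v is None else v for v in values]


def log_transform(values):
    """Base-10 log transform; all inputs must be strictly positive."""
    return [math.log10(v) for v in values]


def autoscale(values):
    """Mean-center and divide by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical readings for one metabolite; None marks a missing value.
raw = [1.0, None, 10.0, 100.0]
imputed = impute_below_lod(raw, lod=0.5)
scaled = autoscale(log_transform(imputed))
```

After these steps each metabolite has zero mean and unit variance on the log scale, which puts metabolites with very different concentration ranges on a comparable footing.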

Results

Population characteristics

In the present study, 120 participants were analyzed, including 62 patients with Hashimoto’s thyroiditis and 58 individuals in the control group who were matched for age and gender. The population characteristics are summarized in Table 1. The two groups exhibited similar characteristics in terms of demographics, diet, and lifestyle habits. TSH levels were within the normal range for both groups, with no statistical significance between them.

Table 1 Study population characteristics.

Metabolic profiling of HT patients

Quantification of urinary organic acids of patients with Hashimoto’s thyroiditis and controls revealed several differentially expressed metabolites, shown in Table 2 (mean ± SD) and categorized by their biological role. Among them, citric acid, isocitric acid, 2-ketoglutaric acid, pyruvic acid, methylsuccinic acid, glyceric acid, methylcitric acid and vanillylmandelic acid were statistically significantly different at a nominal level of p < 0.05. In addition, methylmalonic acid and 4-hydroxyphenylpyruvic acid were statistically significantly different after correction for multiple comparisons with the Bonferroni method (p < 0.05/30). Using fold-change analysis (FCA), we further identified two markers that were significantly decreased (the fumaric acid/succinic acid ratio and methylsuccinic acid) as well as a significant increase in 3-hydroxybutyric acid (Fig. 1). The quantification of total fatty acids (TFAs) in cases and controls revealed statistically significant differences in palmitoleic acid (C16:1n7), palmitic acid (C16:0), myristoleic acid (C14:1), and saturated fatty acids (SFA) levels after Bonferroni correction (Table 3). The false discovery rate (FDR), which is less stringent than the Bonferroni correction, also revealed statistically significant differences in dihomo-gamma-linolenic acid (C20:3n6) and the palmitoleic/palmitic ratio (C16:1n7/C16:0) (Fig. 1).
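To make the two correction schemes concrete, the sketch below computes the Bonferroni threshold for 30 tests and Benjamini-Hochberg adjusted p values, then applies the FDR < 0.1 and fold change > 1.5 filters. The choice of the Benjamini-Hochberg procedure is an assumption (the text says only "FDR adjustment"), the fold-change filter is assumed to be symmetric so that decreases are also captured, and the p values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def passes_filters(q_value, fold_change, q_max=0.1, fc_min=1.5):
    """Keep a metabolite if FDR < q_max and it changes at least fc_min-fold."""
    return q_value < q_max and (fold_change >= fc_min or fold_change <= 1 / fc_min)


# Hypothetical nominal p values for four metabolites (illustration only).
nominal = [0.30, 0.001, 0.04, 0.02]
q_values = benjamini_hochberg(nominal)
threshold = bonferroni_threshold(alpha=0.05, n_tests=30)
```

With 30 tests the Bonferroni threshold is 0.05/30, roughly 0.0017, which explains why only a few metabolites survive it while more pass the less stringent FDR criterion.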

Table 2 Concentrations of organic acids and ratios in Hashimoto’s disease and control groups (mmol/mol creatinine).
Fig. 1

Differential metabolite expression in HT patients. Top: fold change analysis of the organic acids. Bottom: fold change analysis of the fatty acids. P values were corrected by false discovery rate (Wilcoxon test, FDR < 0.1, fold change (FC) > 1.5). Colored dots indicate significantly different variables.

Table 3 Concentrations of total fatty acids and ratios in Hashimoto’s disease and control groups (μmol/L).

Exploring potential metabolic biomarkers

Next, an ordinary least squares (OLS) regression model was fitted with each potential biomarker from the organic acids analysis (methylmalonic acid, 4-hydroxyphenylpyruvic acid, methylsuccinic acid and 3-hydroxybutyric acid) as the dependent variable, on a log-transformed scale, to explore the potential causal relationship and to adjust for confounders. Table 4 shows the primary outcomes of the OLS regression models, which investigate the relationship between key metabolites and essential demographic and clinical variables. The tabulated results consist of the estimated regression coefficients and their corresponding statistical significance for each of the four models. In addition, the table presents model-specific R-squared (R2) and adjusted R-squared values as indicators of the explanatory power of the models, along with F-statistics to assess the overall significance and goodness-of-fit. Durbin–Watson (DW) statistics were computed to validate the models further and evaluate the potential presence of autocorrelation in the residuals. Residual heteroscedasticity was examined visually, while multicollinearity was investigated by the reciprocal of the variance inflation factor. Model 1 indicates that the group (Hashimoto/control) variable has a positive, statistically significant association with methylmalonic acid levels. Additionally, smoking was positively and statistically significantly associated with higher methylmalonic acid levels, while age showed the opposite effect. Overall, the model was statistically significant (F7,112 = 7.063, p < 0.001). The assumptions of sphericity of residuals were satisfied, as confirmed by a Durbin–Watson statistic of 1.762 for autocorrelation detection. Multicollinearity was assessed through Variance Inflation Factor (VIF) analysis, demonstrating values well below the critical threshold of 10 (VIF < 10) for all independent variables in the model.
The normality and homoscedasticity of residuals were assessed using a Q–Q plot, which indicated that the residuals were homoscedastic and followed a normal distribution. Similarly, 4-hydroxyphenylpyruvic acid showed statistical significance; however, the model’s goodness of fit was relatively poor. The other two models did not demonstrate statistically significant differences from the null model. In particular, the F statistic was estimated as (F7,112 = 1.429, p = 0.205) and (F7,112 = 1.426, p = 0.206) for methylsuccinic acid and 3-hydroxybutyric acid, respectively. Table 5 displays the outcomes of the beta regression analysis, which identifies the group variable and age as statistically significant factors at the 90% and 95% confidence levels, respectively. It has to be noted that the interpretation of coefficients within the beta regression framework diverges from that of a conventional linear model. Specifically, the exponentiation of coefficients in this context yields the odds ratios for a one-unit increase in a given predictor variable.
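The odds-ratio reading of a beta regression coefficient under a logit link can be illustrated numerically. In the sketch below the coefficient and the baseline mean proportion are hypothetical values chosen for illustration; they are not estimates from Table 5.

```python
import math


def logit(mu):
    """Log-odds of a mean proportion strictly between 0 and 1."""
    return math.log(mu / (1 - mu))


def inv_logit(eta):
    """Map a linear predictor back to a mean proportion in (0, 1)."""
    return 1 / (1 + math.exp(-eta))


def shifted_mean(baseline_mean, coefficient, delta=1.0):
    """Mean proportion after increasing one predictor by delta units."""
    return inv_logit(logit(baseline_mean) + coefficient * delta)


# Hypothetical coefficient and baseline mean (not taken from Table 5).
beta_group = 0.4
odds_ratio = math.exp(beta_group)
new_mean = shifted_mean(baseline_mean=0.10, coefficient=beta_group)
```

Here the odds of the mean proportion, mu / (1 - mu), are multiplied by exp(0.4) when the predictor increases by one unit. The change in the mean itself depends on the baseline, which is why the coefficients cannot be read as additive effects as in OLS.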

Table 4 Ordinary-least-square (OLS) analysis results of organic and total fatty acids.
Table 5 Beta regression phi = 3.1397 (± 0.0458) (p < 0.001); Estimation method; Maximum likelihood estimation; Logit used as a link function.

For the potential TFA biomarkers, beta and OLS regression analysis indicated that palmitic acid (C16:0), myristoleic acid (C14:1), SFA, and the palmitoleic/palmitic ratio (C16:1n7/C16:0) remained statistically significant after adjustment for confounders (Tables 4 and 5). All the models showed slight positive autocorrelation (DW < 2), and the collinearity diagnostics revealed no multicollinearity. Graphical examination of the standardized residuals indicated that the assumptions of normality and homoscedasticity held in all the models above.
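For readers less familiar with these diagnostics, the sketch below shows how the Durbin–Watson statistic and the variance inflation factor are computed. The residual sequences and the R-squared value are hypothetical; in practice these quantities are taken from the fitted model.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def variance_inflation_factor(r_squared):
    """VIF of a predictor, given R-squared from regressing it on the others."""
    return 1 / (1 - r_squared)


# Hypothetical residual sequences (illustration only).
dw_positive = durbin_watson([1.0, 1.0, -1.0, -1.0])   # neighbours tend to agree
dw_negative = durbin_watson([1.0, -1.0, 1.0, -1.0])   # neighbours alternate

vif = variance_inflation_factor(r_squared=0.5)
tolerance = 1 / vif   # the reciprocal of the VIF
```

The statistic ranges from 0 to 4: values near 2 indicate no first-order autocorrelation, values below 2 point to positive autocorrelation and values above 2 to negative autocorrelation.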

Predictive modeling using an improved metabolic markers combination

As a next step, we tested the significant metabolites from the regression analysis as predictive biomarkers using an artificial neural network (ANN). The analysis revealed that methylmalonic acid, SFA, and the palmitoleic/palmitic ratio emerged as the most prominent predictors. The area under the receiver operating characteristic curve (AUC-ROC) was 0.8 (Fig. 2). Additionally, the classification analysis showed that the model correctly predicts the group membership of 80% of patients in each group without indications of overfitting. The architecture of the ANN model, depicted in Fig. 3, provides insights into the network structure, and the contribution of biomarkers to the model is shown in Supplementary Fig. 1. It should be mentioned that, after a trial-and-error approach, we concluded that using only one hidden layer was a reasonable choice due to the limited number of observations.
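As an aid to interpreting the AUC, the sketch below computes it directly from its probabilistic definition: the chance that a randomly chosen case receives a higher predicted score than a randomly chosen control, with ties counted as one half. The labels and scores are hypothetical and do not reproduce the model output of this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of case-control pairs that are ranked correctly."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical predictions: 0 = control, 1 = HT case.
labels = [0, 0, 1, 1, 1]
scores = [0.2, 0.6, 0.5, 0.7, 0.9]
auc = roc_auc(labels, scores)
```

In this toy example five of the six case-control pairs are ordered correctly. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.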

Fig. 2

Receiver operating characteristics (ROC) curve for the ability of a specific combination of biomarkers to identify patients with Hashimoto’s disease. “0” represents the control group; “1” represents the HT cases.

Fig. 3

Neural network architecture used as a predictive model. “0” represents the control group; “1” represents the HT cases. SFA saturated fatty acids, MMA methylmalonic acid, 4-HPPA 4-hydroxyphenylpyruvic acid.

Debiased sparse partial correlation network analysis

Given the complexity of metabolic networks and the non-canonical pathways that may contribute to the disease profile, a network analysis was undertaken for organic and fatty acids (Fig. 4). Edges represent the associations between the metabolites measured (nodes), which can be positive (red-colored) or negative (blue-colored). In addition, each network’s structure is affected by the importance of a specific metabolite (betweenness) and the number of its connections (degree). In the HT group, palmitic acid (C16:0), alpha-linolenic acid (C18:3n3), and dihomo-gamma-linolenic acid (C20:3n6) were the most important for the network structure based on the values of degree and betweenness (Supplementary Table 1). In addition, a strong positive nonlinear association was observed between docosahexaenoic acid (C22:6n3) and tetracosanoic or lignoceric acid (C24:0). The organic acids network was mostly affected by 3-hydroxy-3-methylglutaric acid and citric acid, based on their associations with the other metabolites and their impact on the network. Multiple associations were observed, the strongest being between citric acid and suberic acid and between aconitic acid and 2-ketoglutaric acid. The networks for organic and fatty acids for the control group were also constructed as a comparator and are available in the supplementary material (Supplementary Fig. 2a,b).
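The two centrality measures used above can be illustrated on a small toy network. The sketch below is a simplification under stated assumptions: it treats the network as unweighted and undirected, whereas DSPC edges carry partial correlation weights, and the node labels and edges are hypothetical. Betweenness is computed with Brandes' algorithm.

```python
from collections import deque


def degree_centrality(adjacency):
    """Number of direct connections of each node."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        visit_order = []
        predecessors = {v: [] for v in adjacency}
        n_paths = {v: 0 for v in adjacency}
        distance = {v: -1 for v in adjacency}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        # Breadth-first search counting shortest paths from the source.
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate path dependencies from the farthest nodes inward.
        dependency = {v: 0.0 for v in adjacency}
        while visit_order:
            w = visit_order.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}


# Hypothetical toy network: A is a hub, and D links E to the rest.
toy = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A"],
    "D": ["A", "E"],
    "E": ["D"],
}
degree = degree_centrality(toy)
betweenness = betweenness_centrality(toy)
```

In this toy network node A has the highest degree and lies on the most shortest paths, while node D has a modest degree but still bridges E to the rest of the network, which is the kind of distinction the two measures are meant to capture.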

Fig. 4

A debiased sparse partial correlation (DSPC) network of OA and TFA metabolites in Hashimoto's disease. In this representation, metabolites are depicted as nodes, while the edges represent the association measures between them. The thickness of the edges corresponds to the strength of these associations, providing a visual indication of their relative importance. The network employs color coding to differentiate between positive (red lines) and negative (blue lines) associations.

Discussion

Hashimoto’s thyroiditis is an autoimmune disease affecting primarily the thyroid gland. The gradual destruction of the gland and the accumulation of immune cells at the damaged site can affect several distant organs and tissues, producing a wide variety of symptoms. Beyond the deterioration in quality of life, many patients with HT also develop other chronic diseases such as rheumatoid arthritis, psoriatic arthritis and Sjögren’s syndrome and are at an increased risk for cardiovascular disease5,21. Therefore, there is an increasing need to define the metabolic imbalances of HT to provide early detection and targeted treatment in addition to thyroid hormone replacement therapy.

The present study identified differentially expressed urine organic acids and plasma fatty acids in euthyroid patients with HT compared to healthy individuals. From the organic acids panel, we showed that methylmalonic acid, 4-hydroxyphenylpyruvic acid, methylsuccinic acid, 3-hydroxybutyric acid and the fumaric:succinic ratio were markedly different between case and control groups after adjusting for confounding variables. From the fatty acids panel, palmitoleic acid (C16:1n7), palmitic acid (C16:0), their ratio (palmitoleic:palmitic), myristoleic acid (C14:1), dihomo-gamma-linolenic acid (C20:3n6) and total saturated fatty acids (SFA) were all higher in the HT group compared to control. In the next paragraphs, we discuss our findings by grouping the tested metabolites according to their biological relevance, given the underlying metabolic interconnections.

Methylmalonic acid (MMA) is a by-product of methylmalonyl-CoA, and its levels are associated with vitamin B12 deficiency. Under normal conditions of vitamin B12 sufficiency, methylmalonyl-CoA is converted to succinyl-CoA to supply the tricarboxylic acid cycle (TCA) by the enzyme methylmalonyl-CoA mutase which is vitamin-B12-dependent. In the absence of B12, methylmalonyl-CoA is hydrolyzed to MMA instead, leading to the accumulation of MMA in biological fluids22. Low vitamin B12 levels have been previously detected in patients with autoimmune thyroiditis, in line with the studies showing an increased incidence of pernicious anemia in patients with HT23. In addition, in vitro studies have demonstrated the interactive effects of smoke on several forms of cobalamin (vitamin B12), possibly explaining the association between decreased B12 levels and smoking in our and other studies24,25. Overall, the present study validates these findings in the population of euthyroid patients with HT using MMA, which is regarded as a more sensitive biomarker for functional vitamin B12 insufficiency.

4-hydroxyphenylpyruvic acid (4-HPPA) participates in the biosynthesis of several key metabolic intermediates through sequential reactions. Some of these steps are sensitive to adequate levels of vitamin C and tocopherols (vitamin E), and their deficiency may cause the accumulation of 4-HPPA26. Studies exploring the vitamin C or vitamin E status of patients with thyroid disease are scarce, though some indicate low levels in benign thyroid disorders and other autoimmune diseases27,28. In the current study, preliminary findings indicated a possible change in 4-hydroxyphenylpyruvic acid (4-HPPA) levels among individuals with HT. However, the statistical significance of this hypothesis was not confirmed in the regression model, potentially due to the limited sample size. Further research with a larger sample size is needed to investigate the relationship between 4-HPPA and HT more comprehensively and determine its clinical significance. Therefore, considering the central role of antioxidants, such as vitamins C and E, in the fine-tuning of the immune system and the findings of this study, attention should be given to the marginal micronutrient deficiencies of HT patients and the use of sensitive biomarkers for their detection.

Methylsuccinic acid belongs to the dicarboxylic acids group, which includes ethylmalonic acid, suberic acid and sebacic acid, all of which were also measured in this study. Dicarboxylic aciduria is a common finding in inborn errors of fatty acid oxidation (FAO). However, subtle changes have been associated with a lack of riboflavin (vitamin B2) due to its central role in the enzymes involved in FAO29. There are no available data on the levels of methylsuccinic acid in patients with autoimmune thyroiditis (HT or Graves’ disease), though there is evidence that low serum vitamin B2 levels are linked to thyroid dysfunction30,31.

Analysis of the major metabolites of the TCA cycle identified three metabolites (citric acid, isocitric acid, 2-ketoglutaric acid) that were statistically significantly higher in the HT group compared to the control (p < 0.05). Even though the differences between the two groups were marginal once we corrected for multiple comparisons, the strong relations identified in the network analysis indicate low-level metabolic reprogramming (Fig. 4). Metabolic blocks during the first steps of the cycle could be due to improper function of the responsible enzymes or overload from the upstream steps (metabolic oversupply). Metabolic oversupply significantly contributes to the disruption of immunological tolerance through complex immunometabolic pathways, leading to the development of chronic and autoimmune diseases7,32.

Aconitase is an iron-sulfur protein catalyzing the conversion of citric to aconitic and isocitric and has been associated with thyroid dysfunction through complex networks involving the adipose and muscle tissues33. Carbohydrate metabolism contributes to mitochondrial overload and oxidative stress, and in our case, pyruvic acid and 3-hydroxybutyric acid were significantly higher, although these changes were insignificant in the adjusted model, probably due to the small sample size. Since the dietary habits between the case and the control group did not differ significantly (MDS and alcohol consumption), we hypothesize that the higher levels of these metabolites are probably due to intracellular metabolism dysfunction. 3-hydroxybutyric acid is one of the ketone bodies formed during periods of low availability of carbohydrates or high intake of fatty acids. Ketogenesis is a normal body process that provides alternative sources of energy during small periods of fasting, such as overnight sleeping. Even though there is no related literature on 3-hydroxybutyric acid levels in patients with HT, this finding could be associated with the increased levels of saturated fatty acids, suggesting a pre-insulin-resistant state caused by fatty acids excess intake/synthesis and/or mitochondrial dysfunction34.

Succinate dehydrogenase (SDH; complex II of the electron transport chain, ETC) is responsible for converting succinic acid to fumaric acid, linking the TCA cycle and oxidative phosphorylation to produce energy in the form of ATP. Impaired enzymatic activity has been studied in cancers bearing pathogenic variants of the responsible genes, which exhibit a higher succinic:fumaric ratio35. Another study showed that SDH plays a determinant role in the inflammatory process and the activation of macrophages, adding evidence to the metabolic reprogramming observed in many inflammatory diseases36. In the present study, we assessed SDH activity by measuring the fumaric acid:succinic acid ratio and found it was significantly lower in the HT group, indicating reduced enzymatic activity and accumulation of the reactant succinic acid. Riboflavin (vitamin B2) acts as an important cofactor for the ETC and the function of SDH in particular, suggesting a potential role of marginal vitamin B2 deficiency in HT.

In the present analysis, we chose to measure metabolites that are normally absent from urine samples and whose upregulation is indicative of potentially harmful pathogen overgrowth or a lack of beneficial bacteria. Methylcitric acid, a marker of bacteria-produced biotin (vitamin B7)37, was found to be significantly higher in the HT group. In addition, we observed a relative increase in 4-hydroxyphenylacetic acid, which is a marker of bacterial overgrowth38. Even though these changes were not significant in the models corrected for multiple comparisons, gut dysbiosis is a well-known mechanism in the aetiopathogenesis of autoimmune diseases, including HT, and other studies identifying potential biomarkers of the gut-thyroid axis highlight the need for future larger studies on gut metabolic markers in HT39.

Case-control analysis of metabolites of the neurotransmitters dopamine, serotonin and adrenaline (homovanillic, 5-hydroxyindoleacetic and vanillylmandelic acid, respectively) showed a marginal increase, which was most pronounced for vanillylmandelic acid (VMA). Mood changes are common in HT, and neurotransmitters play a major role in the activation and regulation of immune cells and in autoimmune disease40,41. The increased levels of VMA (adrenaline) could be due to reduced energy production as a result of the mitochondrial block indicated by the increased TCA metabolites42.

Fatty acid metabolism is crucial for cell structure, signaling, and function while regulating immune responses. Patients with HT have dysregulated lipid metabolism, possibly linked to the thyroid hormone imbalance and the extensive metabolic reprogramming affecting insulin signaling and inflammation processes6,43. Here we identified significantly higher levels of the saturated fatty acids myristic acid (C14:0), palmitic acid (C16:0) and stearic acid (C18:0), their related monounsaturated fatty acids myristoleic acid (C14:1) and palmitoleic acid (C16:1n7), and dihomo-gamma-linolenic acid (C20:3n6) of the omega-6 family, in line with previous case-control studies on autoimmune diseases17,44. Increased levels of serum palmitic acid (C16:0) have been previously associated with decreased free T4 and a higher fT3/fT4 ratio in a retrospective study of subclinical hypothyroidism45. Mechanistic studies suggest that dysregulated lipid metabolism and storage in non-adipose tissue trigger adverse cellular events promoting the development of subclinical hypothyroidism and autoimmune thyroiditis46,47. In turn, nutritional strategies for the management of insulin resistance, such as the Mediterranean diet or a low-carb dietary plan avoiding processed foods to promote the anti-inflammatory phenotype, have been recommended, though few interventional studies are available48. In our dataset, the two groups had similar adherence scores to the Mediterranean diet, suggesting a disrupted metabolism of endogenous fatty acids in the HT group rather than an increased intake of SFA. It should be noted, though, that food frequency questionnaires have, in general, reduced sensitivity in capturing inter-individual differences. A further indication that fatty acid intake did not differ came from the levels of the essential fatty acids linoleic acid and alpha-linolenic acid, which are obtained exclusively through diet and were similar between the two groups (p = 0.248 and 0.475, respectively).

This study’s ordinary least squares (OLS) and beta regression models were based on a pre-selection process that included the statistically significant Organic Acids, Total fatty acids, and demographic variables. This approach accounted for potential confounding factors while adhering to parsimonious and easily interpretable modeling principles, in line with good statistical practices. The same methodology was applied to the development of the artificial neural network (ANN), employing a limited number of interpretable and modifiable factors to facilitate the exploration of the entire process. Furthermore, it is essential to mention that although multivariate regression analysis possesses potential advantages, this study did not employ such an approach due to the limited sample size of patients involved in the trial. In such a case, the small sample size would fail to demonstrate the model’s ability to provide reliable estimates. Acknowledging this constraint, we opted to employ the aforementioned statistical approaches that were more suitable for addressing the research objectives. In the present analysis, we followed the steps described in the “Guide to Metabolomics Analysis: A Bioinformatics Workflow” incorporating basic descriptive techniques and advanced inferential statistics. The descriptive analysis showed that the two patient groups were reasonably balanced in their characteristics (as depicted in Table 1), indicating that more sophisticated techniques could be employed without compromising the primary results’ generality.

As this work explores complex relationships between statistical variables, several techniques were employed to provide objective information for interested readers. In particular, beta regression was appropriate because the dependent variable in the model was a ratio of values lying strictly within the range (0, 1) and was assumed to be continuous, bounded and unimodal. Additionally, the independent variables were reasonably expected to influence the ratio’s mean and dispersion. Conversely, OLS regression was utilized for the remaining biomarkers, as the assumptions for its use (absence of autocorrelation, heteroskedasticity and multicollinearity) were satisfied in the model. Regarding multivariate models, ANN models were preferred since the study did not involve multiple dependent variables. This choice was based on the capacity of ANNs to predict outcomes and classify data accurately and to uncover intricate patterns within the data that alternative models may miss. ANNs, combined with other advanced regression methodologies, have been widely utilized across various healthcare research disciplines, offering unique advantages and limitations49. The main advantage lies in their ability to analyze the interrelations between response and predictor variables without requiring a predefined functional form of dependence. This characteristic allows the model to capture non-linear interdependencies effectively. On the other hand, ANNs present some challenges because they rely on intricate mathematical algorithms and are prone to overfitting. Furthermore, the computational processes associated with ANNs are inherently data-intensive, necessitating substantial data to achieve optimal performance. It is reasonable to assume that a larger dataset would lead to better performance of the ANN framework.
However, this was not feasible within the scope of this study, as the collection of clinical and targeted metabolomics data is often constrained by experimental complexity and financial considerations. In general, it has been recommended that approximately ten training instances be available for each node of the model50. Numerous statistical efforts have been undertaken to relax this requirement for smaller sample populations51. Our model, which includes five nodes in the hidden layer and a total of 120 observations, partially fulfills this criterion, thereby supporting the adequacy of the sample size for ANN analysis. Sample size per group is critical for robust metabolomics assessments, though there is currently no widely accepted statistical method for this estimation, given the inherent complexities52,53. Factors such as patient heterogeneity and variability in other parameters influence these estimations. In practice, studies often include 30–50 patients per treatment group, which aligns closely with the sample size utilized in the present investigation. In summary, the implementation of ANN analytical techniques in this study demonstrates their applicability to targeted metabolomics data. Nevertheless, careful consideration of sample size and strategies to address overfitting are essential to optimize their effectiveness in medical research contexts.
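To make the rationale for beta regression given above more concrete, the following is a minimal sketch of the log-likelihood that such a model maximizes. It assumes a logit link for the mean and, for brevity, a single constant precision parameter, whereas the text notes that the predictors may also influence dispersion. The function names and arguments (`beta_loglik`, `coef`, `phi`) are illustrative and are not taken from the software used in this study.

```python
import math


def logistic(z):
    """Inverse of the logit link: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def beta_loglik(y, X, coef, phi):
    """Log-likelihood of a beta regression (mean-precision form).

    y    : responses, each strictly inside (0, 1), e.g. a metabolite ratio
    X    : list of predictor rows (include 1.0 for an intercept)
    coef : regression coefficients on the logit scale (hypothetical values)
    phi  : precision parameter > 0, assumed constant in this sketch
    """
    total = 0.0
    for yi, row in zip(y, X):
        # The mean of the response is tied to the predictors via the logit link.
        mu = logistic(sum(c * x for c, x in zip(coef, row)))
        shape_a = mu * phi
        shape_b = (1.0 - mu) * phi
        # Log-density of Beta(shape_a, shape_b) evaluated at yi.
        total += (
            math.lgamma(phi)
            - math.lgamma(shape_a)
            - math.lgamma(shape_b)
            + (shape_a - 1.0) * math.log(yi)
            + (shape_b - 1.0) * math.log(1.0 - yi)
        )
    return total
```

Because the fitted mean always lies inside (0, 1), predictions cannot leave the admissible range of a ratio, which is the property that an OLS fit on the raw ratio would not guarantee.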

To our knowledge, this is the first study to report circulating levels of a comprehensive panel of organic and fatty acids in patients with Hashimoto’s thyroiditis. We identified several critical alterations in metabolite levels between the two groups, highlighting the potential metabolic dysfunctions underlying HT: mitochondrial dysfunction, decreased micronutrient bioavailability, microbiome imbalances, and altered carbohydrate and fatty acid metabolism. These findings contribute to the establishment of metabolic biomarkers of Hashimoto’s disease that can be used in future studies and clinical trials. Given that targeted metabolomics in Hashimoto’s disease is an understudied field, we chose to analyze our data with several statistical techniques and to discuss the results in order to show the subtle but biologically relevant changes, acknowledging that some may not remain significant after correction for multiple comparisons. It has been argued that multiple-comparison correction thresholds, including the false discovery rate (FDR) and Bonferroni corrections, might not always be the most suitable approach for metabolomics studies. Indeed, identifying the optimal statistical method requires statistical and clinical judgment to ensure the credibility of the results54. Therefore, this study provides a detailed report of the differences in organic and fatty acids between individuals with Hashimoto’s thyroiditis and healthy individuals, generating a novel hypothesis that needs further exploration. Furthermore, we applied more stringent approaches in predictive modeling and regression analysis to explore the potential of certain metabolites as predictive biomarkers and identified an improved combination that reached fairly good predictive accuracy. Beyond predictive biomarkers, the present study identified certain borderline significant metabolites, including citric, isocitric and 2-ketoglutaric acids, which participate in central metabolic pathways.
A possible explanation for the mild differences observed is that these metabolic networks are supplied by numerous metabolic pathways to maintain cellular function and counterbalance metabolic dysfunctions at different stages. Therefore, metabolomic findings need to be interpreted with caution and assessed within the context of metabolic networks rather than as single markers. Network analysis methods, such as DSPC and others employed in the present and previous studies, are expected to shed light on these borderline significant differences and especially on their association with related phenotypical traits, such as a dysfunctional TCA cycle with fatigue in Hashimoto’s thyroiditis.
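The trade-off between the Bonferroni and FDR corrections mentioned above can be illustrated with a short sketch. It implements the standard Bonferroni adjustment and the Benjamini-Hochberg FDR adjustment in plain Python; the p-values in the usage lines are hypothetical and are not results from this study.

```python
def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four metabolites (illustration only).
example = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(example))
print(benjamini_hochberg(example))
```

The Benjamini-Hochberg values are never larger than the Bonferroni values, which is why borderline metabolites are more likely to survive an FDR threshold than a Bonferroni one.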

The present study has some limitations that need to be considered when interpreting the findings and planning future research. Metabolites are very sensitive to external factors (diet, stress, methodology) and some show large fluctuations due to uninduced biological variation, which could also affect the subtle changes observed in our study55. Thus, we employed several statistical methodologies to reduce the risk of false-positive differences between groups. It should also be noted that this study focused on identifying potential metabolic biomarkers for Hashimoto’s thyroiditis (HT), rather than examining the associations between metabolic dysfunctions and disease progression or metabolite changes over time. Such analyses would require longitudinal data, the collection of which is one of the objectives of the METHAP clinical trial. Further potential limitations of the present study are the limited sample size and the lack of detailed nutritional data (quantity and frequency of food items), which may affect the observed differences between cases and controls. The MDS is a validated questionnaire for estimating adherence to the Mediterranean diet, widely used in epidemiological studies. However, such questionnaires have reduced sensitivity in capturing inter-individual differences in dietary habits, which can impact metabolic profiles, highlighting the need for validated and more descriptive questionnaires in clinical studies56. In addition, although patients’ medical history was obtained and individuals with severe acute or chronic disease were excluded, routine biochemical and blood count testing would clarify potential clusters within each group, especially among controls. Age and BMI differed between females with HT and controls, and although these variables did not affect the detected association between HT and the selected metabolic biomarkers, subgroup analysis using a larger sample size would provide in-depth insight into these associations.
An in-depth subgroup analysis could reveal essential heterogeneities within the analyzed dataset. However, due to insufficient statistical power, a comprehensive separate subgroup analysis was not conducted in this study. Nevertheless, OLS regression with age and gender as predictors was performed to provide preliminary insights into potential subgroup effects. In a larger dataset, a complete subgroup analysis would require either a separate data analysis for each gender and age group or a regression model with interaction terms (age_group × Gender). An additional limitation of this study is the absence of analyses of population differences across regions or ethnic groups, which may restrict the generalizability of our findings to broader populations. Consequently, a more comprehensive population-specific analysis for distinct groups (Caucasians, African Americans, people of African descent, etc.) could be the focus of future research. Overall, the above points limit the generalizability of the present study’s findings to other populations, requiring further investigation.
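As an illustration of the interaction-term alternative described above, the sketch below fits an OLS model containing an age_group × Gender product column. It assumes, purely for illustration, that both factors are dummy-coded with two levels (0/1); the function name `fit_interaction_ols` and the data in the usage lines are hypothetical and do not come from this study.

```python
import numpy as np


def fit_interaction_ols(age_group, gender, y):
    """OLS fit of y ~ intercept + age_group + gender + age_group:gender.

    age_group, gender : 0/1 dummy codes (an assumption of this sketch)
    y                 : metabolite level for each participant
    Returns the four coefficients in the order listed above.
    """
    age = np.asarray(age_group, dtype=float)
    sex = np.asarray(gender, dtype=float)
    # The product column lets the age effect differ between genders.
    X = np.column_stack([np.ones_like(age), age, sex, age * sex])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef


# Hypothetical data: two participants in each age-by-gender cell.
age_group = [0, 0, 1, 1, 0, 0, 1, 1]
gender = [0, 0, 0, 0, 1, 1, 1, 1]
level = [0.9, 1.1, 1.9, 2.1, 1.4, 1.6, 3.9, 4.1]
print(fit_interaction_ols(age_group, gender, level))
```

A non-zero interaction coefficient indicates that the age effect is not the same in both genders, which is the kind of heterogeneity a separate subgroup analysis would otherwise be needed to detect.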

To conclude, Hashimoto’s thyroiditis is characterized by metabolic complications, which, in many cases, affect the patient’s overall health and quality of life, even in the euthyroid state. In the present study, we identified markers of mitochondrial dysfunction, micronutrient deficiencies, impaired carbohydrate and fatty acid metabolism, and microbiome imbalance in HT. Early detection and monitoring of the underlying metabolic dysfunctions are critical in the decision-making process to limit thyroid destruction and maintain hormonal regulation. In addition, targeted strategies can significantly contribute to the alleviation of the pro-inflammatory status and the metabolic burden of HT.