Introduction

The rapid spread of the SARS-CoV-2 virus, beginning at the end of 2019, took the world by surprise and exposed numerous deficiencies in global health systems. While laboratory diagnostic tests and vaccines were rapidly developed1,2, the need for more reliable and rapidly deployable diagnostic tools to address future pandemics remains critical. Molecular methods like quantitative polymerase chain reaction (qPCR) offer high reliability for SARS-CoV-2 diagnosis, but their widespread application in countries with limited economic resources has been hindered by their high cost3,4. Although antigen tests present a cheaper alternative, their implementation in developing countries is often challenged by the lack of proper supervision5. The limitations of current diagnostic methods underscore the importance of exploring alternative approaches, such as leveraging the information contained within the complex mixture of volatile and semi-volatile compounds produced by humans. Indeed, the human body produces a diverse array of volatile and semi-volatile compounds, varying with individual factors such as age, diet, gender, genetics, and physiological status. These compounds can be considered unique individual attributes6. Pathological processes can alter this biochemical profile, influencing body odor by either inducing the production of specific compounds or modifying the overall profile to create a distinctive “olfactory fingerprint”6,7. These “olfactory fingerprints” hold potential as diagnostic olfactory biomarkers4. Gas chromatography-mass spectrometry (GC-MS) is a powerful analytical technique that plays a key role in identifying and quantifying these volatile and semi-volatile compounds. GC-MS allows for the precise separation and detection of various compounds in biological samples, such as breath, blood, or urine. This technique’s high sensitivity and specificity make it valuable for detecting subtle changes in the “olfactory fingerprint” associated with disease8. 
By analyzing these changes, GC-MS can help identify potential biomarkers for various illnesses, offering a complementary approach to other diagnostic methods8,9.

Different compounds have been correlated with various non-infectious diseases10,11, syndromes12, and infectious diseases13,14, using Solid-Phase Microextraction (SPME) or similar methodologies. Recently, the volatile organic compound (VOC) profile of patients diagnosed with SARS-CoV-2 was identified by breath analysis using SPME-in-mask coupled with direct analysis in real time mass spectrometry15 and from blood serum by headspace solid-phase microextraction (HS-SPME)16,17. Moreover, it has been shown that the skin of healthy individuals offers a long-lasting emanation of VOCs through sweat evaporation, with approximately 532 different VOCs18, including ammonia, carboxylic acids, alcohols, hydrocarbons, ketones, aldehydes, esters, heterocyclic compounds, and volatile sulfur compounds19. Thus, sweat has great potential to be used as a reliable source of biomarkers.

As part of a project supported by the Mexican Government (CONACyT, now SECIHTI) for training dogs in the bio-detection of SARS-CoV-2, we conducted an exploratory analysis of volatile and semi-volatile compounds present in human sweat, hypothesizing that sweat compound profiles differ between SARS-CoV-2 infected and uninfected individuals. This study aimed to identify potential chemical biomarkers of infection and to evaluate whether sweat analysis could reliably distinguish COVID-19 positive from COVID-19 negative patients. Accordingly, two objectives were established: (1) to determine whether the analysis of sweat compounds enables differentiation between COVID-19 positive and negative individuals, and (2) to define the compound profile characteristic of SARS-CoV-2 infection.

Materials and methods

Samples

A total of 426 sweat samples were collected from human participants between September 7, 2021 and February 16, 2022. All the positive samples (n = 137) were obtained from two sources: the Progreso Health Center in Progreso, Yucatán, and the private laboratory BioStudio in Mérida, Yucatán. The negative samples (n = 289) were obtained from people in the city of Mérida (offices and workers at CINVESTAV-IPN Mérida). Before sampling, each patient was asked whether she/he was willing to participate in the research and, if so, was given an informed consent form to read and sign. All the informed consent forms are available upon request. The participant’s name was removed to maintain confidentiality. Subsequently, a questionnaire was administered to obtain the patient’s personal and epidemiological data, symptoms, and medical history. The questionnaire collected information on full name; age; sex; diagnosed chronic diseases; headache; diarrhea; fever; loss of taste; loss of smell; cough; runny nose; sore throat; body ache; chest pain; nausea; days with symptoms; treatment; days on medication (if provided); and contact with confirmed COVID-positive people. All sweat samples were handled following the safety measures recommended by the Mexican health authorities20. In all cases, the patients collected their own samples under technical supervision. To this end, each patient was given a resealable Ziploc® bag containing two new amber glass flasks (7 cm high, 4 cm in diameter) with bakelite caps, sterilized in an autoclave, and six pieces of new, sterile Jaloma® odorless gauze (10 cm × 10 cm). The patient was asked to rub her/his neck, face, and forearms for 1 min with two gauzes and to insert them into one of the flasks. Subsequently, the patient was asked to place two gauzes under each armpit for 10 min. After this time, the patient was instructed to insert the sweat-soaked gauzes into the glass flask, close it, and place it back in the resealable bag.
This process was approved by the Bioethics and Safety Committee of the University of Sonora. To reduce inter-individual variability, all sweat samples were collected in climate-controlled facilities. Participants were instructed to avoid vigorous physical activity, bathing, and the application of personal care products prior to sampling. The sweat samples were transported in separate coolers at 4 °C from the clinical facilities to the laboratory at CINVESTAV-IPN Mérida Unit, where they arrived within 1–2 h of collection. Upon arrival, all samples were immediately stored at −80 °C until they were processed.

Analysis of sweat components by SPME-GC-MS

To extract the compounds, the sweat absorbed in the gauzes was extracted with 10 mL of a solvent mixture (acetonitrile-methanol 50:50). To capture the sweat components, SPME was performed by introducing the SPME fiber directly into 500 µL of the solvent mixture at room temperature (approximately 22–24 °C) for 1 min. The 1-min extraction time was selected based on preliminary experiments demonstrating adequate sensitivity while minimizing carryover and fiber saturation. The Polydimethylsiloxane/Divinylbenzene/Carboxen (PDMS/DVB/Carboxen) fiber, 80/23 mm (Part Number: 5191–5874 Agilent, USA), was selected because its mixed-phase coating allows efficient adsorption of a wide variety of compounds, making it particularly suitable for comprehensive metabolomic profiling21. Subsequently, the SPME fiber contained in the holder was manually inserted into the injector for 1 min at 270 °C to desorb the captured compounds. The fiber was used in direct immersion in the solvent mixture (methanol: acetonitrile) because of prior experience with biological matrices. However, this combination has not been formally validated for comprehensive lipid extraction, and exposure to polar solvents may affect fiber performance over time. To mitigate this risk, the extraction time was limited to 1 min.

All the samples (426) were processed in a Gas Chromatography-Mass Spectrometry (GC-MS) system consisting of an Agilent Technologies model 7860 Gas Chromatograph and a model 5977B mass spectrometer. Chromatographic runs were performed under the following conditions. The injector was maintained at 270 °C in splitless mode. The chromatographic separation was performed using an Agilent HP-5MS capillary column (30 m length × 0.25 mm internal diameter × 0.25 μm film thickness; Agilent Technologies, USA). The oven was held at an initial temperature of 50 °C for 1 min. The temperature was then raised at 10 °C/min up to 300 °C and held there for 5 min, giving a chromatographic run time of 31 min. Helium was used as the carrier gas at a constant flow of 1 mL/min. The mass detector was operated in electron ionization mode at 70 eV, and the response was monitored in TIC (total ion current) SCAN format from 50 to 650 m/z. A C7 to C30 alkane calibration standard (Sigma-Aldrich, St. Louis, Missouri, United States of America) was analyzed to calculate the retention index (RI) of each component following a previously reported methodology22. To ensure data integrity and minimize the risk of contamination or analytical artifacts, procedural solvent blanks were included throughout the analytical workflow. These consisted of the solvent mixture (acetonitrile: methanol 1:1 v/v) processed under the same conditions as the sweat samples, including fiber exposure, thermal desorption, and GC-MS analysis. The blanks were run periodically across the sample batches to detect potential carry-over and to verify the absence of background interference. The analytical strategy employed in this study was exploratory and based on relative quantification using total ion chromatogram (TIC) signal intensity, as no calibration standards or specific quality control (QC) samples were included.
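The oven program and the retention index calculation described above can be illustrated with a minimal sketch. The run-time function only restates the arithmetic of the stated program (initial hold, linear ramp, final hold). The retention index function assumes the linear formula commonly used for temperature-programmed runs (van den Dool and Kratz); the methodology cited in the text22 may differ in detail, and the alkane retention times in the example are hypothetical values, not measurements from this study.

```python
def oven_run_time(start_c, initial_hold_min, ramp_c_per_min, final_c, final_hold_min):
    """Total run time (min) of a single-ramp GC oven program."""
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


def linear_retention_index(rt, alkane_rts):
    """Linear retention index of a peak eluting at `rt` (min).

    `alkane_rts` maps n-alkane carbon number -> retention time (min).
    Assumption: linear interpolation between the two bracketing alkanes.
    """
    carbons = sorted(alkane_rts)
    for n, n_next in zip(carbons, carbons[1:]):
        t_n, t_next = alkane_rts[n], alkane_rts[n_next]
        if t_n <= rt <= t_next:
            return 100 * (n + (n_next - n) * (rt - t_n) / (t_next - t_n))
    raise ValueError("retention time lies outside the alkane ladder")


# Program stated in the text: 50 °C for 1 min, 10 °C/min to 300 °C, 5 min hold.
run_time = oven_run_time(50, 1, 10, 300, 5)
print(run_time)  # 31.0

# Hypothetical alkane ladder (retention times are illustrative only).
example_ri = linear_retention_index(12.5, {10: 10.0, 11: 15.0})
print(example_ri)  # 1050.0
```

The first call reproduces the 31-min run time reported for the method: 1 min at 50 °C, 25 min of ramp, and 5 min at 300 °C.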

Annotation of sweat components

The gas chromatograph was used to separate the components of interest, and the mass spectrometer to identify them based on their unique mass fragmentation patterns. The mass spectrum of each chromatogram component was extracted from noise using the software Automated Mass Spectral Deconvolution & Identification System (AMDIS). The components were annotated by comparison of their extracted mass spectra with those in the National Institute of Standards and Technology database, United States of America (NIST17), including the same distribution of the 10 major ions, the parent ion, and the molecular weight ions. The NIST match and RI were also considered for annotation. Additionally, each annotation was classified according to the Metabolomics Standards Initiative (MSI) guidelines23.

Statistical treatment

The statistical analysis was conducted using the total ion chromatogram (TIC) signal intensity values derived from the GC-MS data. These values represent integrated spectral responses without retaining individual m/z features. The full TIC-based data matrix (including retention time and intensity) used for the multivariate modeling is provided as Supplementary Information S1 (Zenodo repository: https://doi.org/10.5281/zenodo.16740883). The TIC data were selected because they cover the whole window of several hundred mass-to-charge (m/z) units and therefore capture most of the chemical information in each sweat sample. The TIC files were imported into RStudio24, an integrated development environment for the R software25, using the function files2SpectraObject from the “ChemoSpec” package, an R package for the preprocessing and exploratory analysis of 2D data26. The “ChemoSpec” package can process spectroscopic or chromatographic 2D data (where Y is the signal intensity and X is a time or frequency unit).

Data import assumes that the raw chromatogram files contain two columns, the first for the retention time and the second for the intensity. After import, the chromatograms were baseline-corrected using the peakDetection algorithm, which performs peak detection and baseline correction simultaneously. The peaks were then aligned, and the data were binned to a resolution of 0.1 min. The last preprocessing step was an intensity normalization by probabilistic quotient normalization (PQN), using the median TIC (total ion current) of all chromatograms as the reference.
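The normalization step can be illustrated with a minimal sketch of probabilistic quotient normalization. This is not the “ChemoSpec” implementation used in the study; it assumes that each chromatogram has already been binned to the same retention-time grid, builds the reference as the per-bin median across all chromatograms, and omits the preliminary total-area scaling that some PQN variants apply. The function name and the toy intensities are hypothetical.

```python
from statistics import median


def pqn_normalize(chromatograms):
    """Probabilistic quotient normalization of binned chromatograms.

    `chromatograms` is a list of equal-length lists of intensities.
    Each chromatogram is divided by the median of its bin-wise quotients
    against the reference (per-bin median of all chromatograms).
    """
    n_bins = len(chromatograms[0])
    reference = [median(c[i] for c in chromatograms) for i in range(n_bins)]
    normalized = []
    for c in chromatograms:
        # Skip empty bins so they do not produce zero or undefined quotients.
        quotients = [x / r for x, r in zip(c, reference) if x > 0 and r > 0]
        factor = median(quotients) if quotients else 1.0
        normalized.append([x / factor for x in c])
    return normalized


# Toy data: three chromatograms with the same shape but different dilution.
toy = [[1.0, 2.0, 4.0], [2.0, 4.0, 8.0], [3.0, 6.0, 12.0]]
toy_normalized = pqn_normalize(toy)
print(toy_normalized)  # each row becomes [2.0, 4.0, 8.0]
```

Because the three toy chromatograms differ only by a dilution factor, PQN maps them onto the same profile, which is the behavior that makes the method attractive for samples such as sweat whose overall concentration varies between individuals.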

Supervised analysis was performed with the “caret” R package (short for Classification and Regression Training), which contains functions that attempt to streamline the process of creating a predictive model27. In this procedure, the data were divided into two randomly chosen groups (“training set” and “test set”). The first group, the “training set” (75% of the original dataset), was used to optimize the model parameters. In each model optimization, a resampling method was added to reduce the error in the estimate of mean model performance, in this case repeated k-fold cross-validation (k = 10, repeated three times). Five models were implemented and compared: PLS-DA (Partial Least-Squares Discriminant Analysis), RF (Random Forest), and three SVM (Support Vector Machine) models, one linear and two non-linear (polynomial kernel and radial kernel). The aim was to obtain the best classification model for discriminating between SARS-CoV-2 positive and negative samples from sweat GC-MS profiling data. These models are among those most commonly used to classify samples into groups from metabolomic data28,29. The area under the ROC curve (AUC) and Accuracy were the binary classification metrics used to select the optimal model. Additionally, to evaluate the performance of each classification model, metrics such as DOR (Diagnostic odds ratio), Accuracy, Balanced Accuracy, F1-score, Cohen’s Kappa, and MCC (Matthews Correlation Coefficient) were calculated. The formulas used to calculate these binary classification evaluation metrics are given in references 30,31. The second group of samples, the “test set” (the remainder of the original dataset), was used to evaluate the models, that is, to assess their power to discriminate between positive and negative samples by calculating a confusion matrix.
Thus, the confusion matrix is an organized way of mapping the predictions of the models to the original classes (PR: positive samples or NR: negative samples) to which the data belong. Additionally, several binary classification metrics beyond those generated by the R “caret” package (Accuracy, Balanced Accuracy, and Cohen’s Kappa) were calculated to compare each model’s performance on the “training set” with its performance on the “test set”, and to decide which model was better when models performed similarly on a single evaluation metric. To determine whether model performance was better under the ROC or the Accuracy metric, we performed a t-test (α = 0.05).

Other packages used in this study for 3D visualization were “ggplot2”32, “rgl”33, and “car”34. The efficiency of the PLS-DA binary classifier was assessed with metrics such as Accuracy, Sensitivity, and Specificity calculated from the confusion matrix. Balanced Accuracy was used as the metric for the binary classifier because the dataset is imbalanced, with more SARS-CoV-2 negative samples than positive ones.
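The reason for preferring Balanced Accuracy on imbalanced data can be shown with a short worked example. It uses the class sizes of the whole dataset (137 positive and 289 negative samples) and a hypothetical, uninformative classifier that labels every sample as negative; this classifier is an illustration only and is not one of the models evaluated in the study.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity."""
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


# Class sizes reported for the full dataset.
n_positive, n_negative = 137, 289

# Hypothetical classifier that predicts "negative" for every sample.
tp, fn = 0, n_positive
tn, fp = n_negative, 0

naive_accuracy = accuracy(tp, fp, tn, fn)
naive_balanced = balanced_accuracy(tp, fp, tn, fn)
print(round(naive_accuracy, 3))  # 0.678
print(naive_balanced)            # 0.5
```

Plain Accuracy rewards this classifier with about 68% simply because negatives are the majority class, whereas Balanced Accuracy stays at 0.5, the value expected from a model with no discriminating power.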

Finally, the most important original variables (retention times, RT) for the classification model were obtained using the VIP (Variable Importance in Projection) calculation from the “caret” package.

Results

Demographic data analysis

Four hundred twenty-six sweat samples, one per patient, were analyzed by GC-MS. The positive sweat samples were obtained from a total of 137 people from two sources: the Progreso Health Center in Progreso, Yucatán, and the private laboratory BioStudio in Mérida, Yucatán. The negative samples (n = 289) were obtained from people in the city of Mérida. The whole sample (positives and negatives) comprised 196 (46%) women and 230 (54%) men. The group of positive people comprised 81 (59%) women and 56 (41%) men, and the group of negative people comprised 115 (40%) women and 174 (60%) men. There were no significant differences between SARS-CoV-2 positive and negative patients in the proportion of women and men (Fisher’s exact test, difference between proportions = 1.65318, p > 0.05). The age of the whole sample ranged from 4 to 80 years, with a mean of 37 ± 14 years for women and 38 ± 14 years for men. There were no differences in the mean age between the SARS-CoV-2 positive and negative patients (Student’s t0.05 = 0.47, p > 0.05).

Optimal model

According to the analyses, the best model was the PLS-DA as determined by the Accuracy metric. The PLS-DA model had the best performance among the models, according to metrics like DOR (Diagnostic odds ratio), where a higher ratio value indicates a better result (Supplementary Information S2 Excel file).

To understand how the final PLS-DA model differentiates between the two classes (positives and negatives from the training data), we plotted the final PLS-DA model. The plot clearly shows the separation of the PR samples from the NR samples (Fig. 1). The variance explained by the first three latent variables (LVs) of the final PLS-DA model is 14.78% for LV1, 14.06% for LV2, and 8.41% for LV3 (37.25% overall).

Fig. 1

PLS-DA optimized model for the training set data shown as 3D plots, which show the separation of the Positive Result (PR, green dots) samples from the Negative Result (NR, red dots) samples. LVs = Latent Variables.

Compound profile of SARS-CoV-2 positive patients

Solid-phase microextraction gas chromatography-mass spectrometry (SPME-GC-MS) analysis detected 286 different compounds differentially distributed among the samples. Fig. 2 shows the mean total ion current chromatograms with the standard error of the mean for the COVID-19 positive and negative groups. The VIP analysis identified 20 compounds that discriminate between SARS-CoV-2 infected and non-infected patients. However, only 6 compounds (Table 1) were chemically annotated by spectrum recognition using the NIST library, in combination with spectrum interpretation for compounds not found in the database and RI determination (Supplementary Information S3 PDF file). Among the annotated compounds, we found one saturated fatty acid (SFA) (hexadecanoic acid), one methyl ester of an SFA (methyl hexacosanoate), two methyl esters of monounsaturated fatty acids (MUFA) (methyl (Z)-hexadec-9-enoate and methyl (Z)-octadec-9-enoate), one triterpenoid (squalene), and one aliphatic aldehyde (undecanal). According to the analysis, significant differences in profiles were observed between SARS-CoV-2 infected and non-infected patients. Hexadecanoic acid was elevated in SARS-CoV-2-positive patients (p = 0.004), as was methyl (Z)-octadec-9-enoate (p = 0.03). In contrast, methyl (Z)-hexadec-9-enoate (p = 0.0004), methyl hexacosanoate (p = 0.009), squalene (p = 0.029), and undecanal (p = 0.007) were lower in SARS-CoV-2-positive patients than in negative ones (Fig. 3).

Fig. 2

Mean total ion current chromatograms with the standard error of the mean for the COVID-19 positive and negative groups. PR stands for positive patients and NR for negative patients. The chromatograms show the variation within each group, along with the standard error of the TIC mean (SEM). When the dispersion is low, only a red line appears instead of three lines (mean − SEM, mean in black, and mean + SEM).

Table 1 Compounds annotated based on the VIPs from the PLS-DA models.

Fig. 3

Compounds profiles between infected (positive) and non-infected (negative) patients. (A) Undecanal, (B) Methyl palmitoleate, (C) Palmitic acid, (D) Methyl oleate, (E) Squalene and (F) Methyl cerotate.

Discussion

The hypothesis that the direct immersion SPME (DI-SPME) patterns in sweat samples of COVID-positive and negative patients would differ was only partially correct, since the DI-SPME patterns of SARS-CoV-2 infected and non-infected people were similar in composition. However, we found significant differences in the signal intensity of these compounds between infected and non-infected people, which suggests that these differences could be associated with different physiological responses to the viral infection. The best classification model (PLS-DA) correctly discriminated between positive and negative cases, with percentages of specificity and sensitivity above 80%, using the MCC as an evaluation metric. Recently, the PLS-DA model was used by Mitra et al.35 on VOC patterns from the hands of people infected and not infected with SARS-CoV-2, with very good discrimination. The PLS-DA model has also been used by Brixner et al.36 to discriminate plasma samples from patients with colorectal cancer (RCC) from those of healthy individuals, based on Fourier transform infrared spectroscopy (FTIR) data, with excellent results.

An important characteristic of our analysis was that only six of the compounds selected by the PLS-DA model could be annotated with a high level of similarity. The small number of annotated compounds reflects the fact that we concentrated our efforts on those compounds with the highest similarity to the spectra found in the NIST database and with supporting RI determination. The remaining compounds had a lower level of similarity to the entries in the NIST database, and so we considered their annotations unreliable. In any case, since we found significant differences in the signal intensity of these six compounds, the differences could be considered indicators of physiological differences between SARS-CoV-2 positive and negative people.

Among the annotated compounds, three were identified as methyl esters derived from long-chain fatty acids: methyl palmitoleate, methyl oleate, and methyl cerotate. The presence of methyl esters in sweat is not uncommon and could be attributed to several mechanisms. One possibility is endogenous esterification or microbial transesterification occurring on the skin surface37. However, the use of the methanol: acetonitrile (1:1 v/v) solvent mixture during compound extraction may have facilitated the formation of these esters, in which case they would be analytical artifacts38,39. Nevertheless, despite their esterified form, lipidomic studies have demonstrated that circulating fatty acid methyl ester concentrations correlate markedly with the corresponding fatty acid levels across biological matrices such as plasma and serum40. Therefore, the compounds observed likely reflect the presence of their corresponding fatty acids (palmitoleic acid, oleic acid, and cerotic acid) and thus serve as indirect biochemical indicators. Accordingly, the significant differences observed in lipid-related compounds (palmitic acid, palmitoleic acid, oleic acid, and cerotic acid) between infected and non-infected patients suggest probable alterations in lipid metabolism associated with the presence of the SARS-CoV-2 virus. However, it is important to note that the fiber and solvent combination used in this study is likely not optimal for recovering the full spectrum of lipids present in sweat. Therefore, the lipid findings presented here should be considered preliminary and limited to the subset of compounds that could be extracted under these conditions. Even so, the patterns found for palmitic, oleic, and palmitoleic acids are the same as those found by Stromberg et al.17 in plasma fatty acid profiles of individuals who experienced moderate or severe COVID-19 disease compared to those with mild infection or no history of infection.
Similarly, Spick et al.41 found that lipid levels were depressed in COVID-positive participants, indicative of dyslipidemia. In fact, various investigations have suggested that these changes in the lipid pattern can be attributed to inflammatory processes, liver dysfunction, increased vascular permeability42, and the interplay between cholesterol and viral replication43. Moreover, these changes seem to become more evident as the disease progresses. For example, Wu et al.44 and Palma et al.45 found that metabolite and lipid alterations exhibit an apparent correlation with the disease course, reflecting that its development affected whole-body metabolism. This indicates that plasma concentrations of both pro-inflammatory and pro-resolving lipid mediators were reduced in critically ill patients compared to those with severe disease. However, the mechanisms and consequences of SARS-CoV-2 lipid metabolic reprogramming are largely unexplored46.

In the fatty acid profiles, levels of palmitic acid were significantly higher in COVID-positive patients than in uninfected patients. Previous studies of fatty acid metabolism in patients with COVID-19 disease have revealed that palmitic acid is likely to play a role in viral entry into host cells, since this acid is known to attach covalently to the cysteine residues found on the SARS-CoV-2 spike and envelope proteins17,46. Thus, individuals with high levels of this fatty acid may be more susceptible to viral invasion and subsequently develop a more severe disease course. It is also possible that this is one of the reasons why the immune response of individuals with morbid obesity is adversely affected by COVID-19 disease, leading in turn to increased disease severity. Additionally, SARS-CoV-2 has been hypothesized46 to promote activation of palmitic acid synthesis via upregulation of the genes responsible for signaling the transcription of fatty acid synthase, acetyl-CoA carboxylase, and stearoyl-CoA desaturase 1. In doing so, the virus could increase the lipid stock, further promoting its replication and increasing viral load within the body46.

Oleic acid was also significantly higher in COVID-positive patients than in uninfected patients. Elevation of oleic acid in the presence of the virus was previously described by Barberis et al.47, who showed that levels of oleic acid directly correlated with disease severity. Interestingly, unsaturated fatty acids such as oleic, arachidonic, or linoleic acid have been shown to mediate antiviral activity by disintegrating the envelope of certain animal viruses, including herpes and influenza48. This pattern suggests a possible role for oleic acid in attacking the SARS-CoV-2 envelope, given its direct negative correlation with disease severity49.

In this study, methyl palmitoleate (≈ palmitoleic acid) was significantly reduced in the sweat of COVID-19 positive patients. This observation is consistent with recent evidence from Cartin-Ceba et al.50 and Stromberg et al.17, who reported that palmitoleic acid was one of the compounds significantly reduced in the plasma of COVID-19 patients. This reduction was even more pronounced in patients with acute respiratory distress syndrome, those requiring life support, and those who died during hospitalization. These authors claim that the decrease is linked to an increased use of unsaturated fatty acids for energy in response to inflammation. This reflects a shift in lipid homeostasis and suggests altered de novo lipogenesis, as palmitoleic acid is a direct product of palmitic acid desaturation17. Palmitoleic acid has been described as a lipokine (a lipid molecule with hormonal function) associated with IL-6, TNF-α, and other inflammatory mediators, supporting its anti-inflammatory effects and role in immunometabolic processes17,50.

Among the compounds that differed between COVID-19-positive and -negative individuals, methyl cerotate (≈ cerotic acid) was significantly decreased in the infected group. Cerotic acid belongs to the family of very long-chain saturated fatty acids (VLCFAs), which are primarily metabolized by peroxisomal β-oxidation. These fatty acids play an essential role in maintaining cellular lipid homeostasis, membrane structure, and immune regulation51. Mika et al.52 identified significantly reduced levels of cerotic acid in patients with colorectal cancer, suggesting that decreased VLCFAs may reflect altered peroxisomal function or increased metabolic demand associated with the disease. Therefore, although cerotic acid has not yet been directly linked to viral infections, the reduction observed in our study could indicate a peroxisomal alteration or a change in lipid metabolism in response to infection.

In this study, squalene occurred recurrently and abundantly in patients not infected with the virus, whereas it decreased significantly in COVID-positive patients. These changes could be attributed to the percentage contribution of the compounds that make up human sebum. Squalene, a linear intermediate that precedes cholesterol in its biosynthesis, represents 12% of the products of sebaceous secretion53. Curiously, in the sebaceous glands the squalene produced is not converted into lanosterol, which halts the pathway to cholesterol and favors the accumulation of squalene. A possible explanation for the accumulation of squalene in the sebaceous gland is overexpression or increased activity of squalene synthase in the cells, or a decreased level or activity of the enzymes involved in its conversion to cholesterol54. Therefore, in an infectious process such as the entry of a virus like SARS-CoV-2, in which irregularities occur in lipid metabolism, the extra production of palmitic acid or oleic acid possibly displaces squalene and/or modifies its relative abundance among the sebum components.

Many authors consider undecanal to be an exogenous compound8,55,56, primarily associated with dietary sources such as citrus fruits and cucumbers (https://hmdb.ca/metabolites/HMDB0030941). However, recent findings suggest that undecanal may also originate endogenously in humans. Zhao et al.57, investigating the chemical basis of host discrimination by Aedes aegypti mosquitoes, reported that human odor is characterized by a high relative abundance of long-chain aldehydes such as decanal and undecanal. These compounds were proposed to be oxidation products of squalene and sapienic acid, lipid components unique to human sebum58, which are thought to contribute to skin protection57,58. Furthermore, Omolo et al.59 identified undecanal as a key component of foot odor that contributes to the attraction of Anopheles gambiae mosquitoes. These findings suggest that undecanal may have both exogenous and endogenous origins. Given the current ambiguity, undecanal should be interpreted with caution as a potential biomarker.

In conclusion, the results of this study demonstrate significant differences in the signal intensity of six compounds in the volatile and semi-volatile compound profile between SARS-CoV-2-infected and uninfected individuals, reflecting clear alterations in lipid metabolism associated with viral infection. The effective discrimination achieved by the PLS-DA model (with a sensitivity and specificity greater than 80%) confirms the diagnostic potential of this approach. In particular, variations in the levels of saturated and unsaturated fatty acids, such as increased palmitic and oleic acids, and decreased squalene, palmitoleic acid, cerotic acid, and even undecanal, suggest a virus-induced metabolic reprogramming, possibly related to viral entry mechanisms, immune dysfunction, and interference with lipid homeostasis. These findings not only provide evidence of the role of lipids in the pathophysiology of COVID-19, but also position SPME-GC-MS analysis of body fluids as a promising tool for the early detection and monitoring of this and other infectious diseases.