Abstract
Metabolomics is gaining attention as a viable approach to disease detection and has shown promising results in COVID-19 diagnosis. This work extends the study of the relationship between extractable sweat compounds and COVID-19 positive patients. Sweat samples were collected from 426 patients (126 positives and 300 negatives) recruited at Merida and Progreso (Yucatán, México) health centers. The composition of sweat was analyzed with a solid phase microextraction gas chromatography-mass spectrometry (SPME-GC-MS) method, with the fiber used in direct immersion (DI-SPME). Partial least-squares discriminant analysis (PLS-DA), with specificity and sensitivity percentages greater than 80%, revealed significant differences between the DI-SPME profiles of infected and non-infected patients. Six key peaks used in the classification model were annotated as undecanal, methyl palmitoleate, palmitic acid, methyl oleate, squalene, and methyl cerotate, based on NIST17 spectral matches (> 90%) and interpretation aided by retention index (RI) data. Palmitic acid and methyl oleate signals were elevated in COVID-positive patients, while undecanal, methyl palmitoleate, squalene, and methyl cerotate were detected in lower amounts in COVID-positive patients. The identified set of compounds may serve as potential biomarkers for COVID-19 disease, not only in serologic assessments but also in the development of non-invasive diagnostic approaches, such as sweat-based or volatilome-based testing, as well as for monitoring disease progression and patient recovery.
Introduction
The rapid spread of the SARS-CoV-2 virus, beginning at the end of 2019, took the world by surprise and exposed numerous deficiencies in global health systems. While laboratory diagnostic tests and vaccines were rapidly developed1,2, the need for more reliable and rapidly deployable diagnostic tools to address future pandemics remains critical. Molecular methods like quantitative polymerase chain reaction (qPCR) offer high reliability for SARS-CoV-2 diagnosis, but their widespread application in countries with limited economic resources has been hindered by their high cost3,4. Although antigen tests present a cheaper alternative, their implementation in developing countries is often challenged by the lack of proper supervision5. The limitations of current diagnostic methods underscore the importance of exploring alternative approaches, such as leveraging the information contained within the complex mixture of volatile and semi-volatile compounds produced by humans. Indeed, the human body produces a diverse array of volatile and semi-volatile compounds, varying with individual factors such as age, diet, gender, genetics, and physiological status. These compounds can be considered unique individual attributes6. Pathological processes can alter this biochemical profile, influencing body odor by either inducing the production of specific compounds or modifying the overall profile to create a distinctive “olfactory fingerprint”6,7. These “olfactory fingerprints” hold potential as diagnostic olfactory biomarkers4. Gas chromatography-mass spectrometry (GC-MS) is a powerful analytical technique that plays a key role in identifying and quantifying these volatile and semi-volatile compounds. GC-MS allows for the precise separation and detection of various compounds in biological samples, such as breath, blood, or urine. This technique’s high sensitivity and specificity make it valuable for detecting subtle changes in the “olfactory fingerprint” associated with disease8. 
By analyzing these changes, GC-MS can help identify potential biomarkers for various illnesses, offering a complementary approach to other diagnostic methods8,9.
Different compounds have been correlated with various non-infectious diseases10,11, syndromes12, and infectious diseases13,14, using Solid-Phase Microextraction (SPME) or similar methodologies. Recently, the volatile organic compound (VOC) profile of patients diagnosed with SARS-CoV-2 was characterized by breath analysis using SPME-in-mask coupled with direct analysis in real-time mass spectrometry15 and from blood serum by headspace solid phase microextraction (HS-SPME)16,17. Moreover, it has been shown that the skin of healthy individuals offers a long-lasting emanation of VOCs through sweat evaporation, with approximately 532 different VOCs18, including ammonia, carboxylic acids, alcohols, hydrocarbons, ketones, aldehydes, esters, heterocyclic compounds, and volatile sulfur compounds19. Thus, sweat has great potential to be used as a reliable source of biomarkers.
As part of a project supported by the Mexican Government (CONACyT now SECIHTI) for training dogs in the bio-detection of SARS-CoV-2, we conducted an exploratory analysis of volatile and semi-volatile compounds present in human sweat, hypothesizing that the sweat compound profiles differ between SARS-CoV-2 infected and uninfected individuals. This study aimed to identify potential chemical biomarkers of infection and to evaluate whether sweat analysis could reliably distinguish COVID-19 positive from COVID-19 negative patients. Accordingly, two objectives were established: (1) to determine whether the analysis of sweat compounds enables differentiation between COVID-19 positive and negative individuals, and (2) to define the compound profile characteristic of SARS-CoV-2 infection.
Materials and methods
Samples
A total of 426 sweat samples were collected from human participants between September 7, 2021 and February 16, 2022. All the positive samples (n = 137) were obtained from two sources, the Progreso Health Center in Progreso, Yucatán, and the private laboratory BioStudio in Merida, Yucatán. The negative samples (n = 289) were obtained from people of Merida City (offices and workers at CINVESTAV-IPN Mérida). Before sampling, each patient was asked whether she/he was willing to participate in the research and, if so, was given an informed consent form to read and sign. All the informed consent forms are available upon request. The participant’s name was removed to maintain confidentiality. Subsequently, a questionnaire was administered to obtain the patient’s personal and epidemiological data, symptoms, and medical history. The questionnaire collected information on the full name; age; sex; diagnosed chronic diseases; headache; diarrhea; fever; loss of taste; loss of smell; cough; runny nose; sore throat; body ache; chest pain; nausea; days with symptoms; treatment; days on medication (if provided); and contact with confirmed COVID-positive people. All sweat samples were handled following the safety measures recommended by the Mexican health authorities20. In all cases, the patients collected their own samples under technical supervision. Each patient was given a resealable Ziploc® bag containing two new amber glass flasks (7 cm high, 4 cm in diameter) with bakelite caps, sterilized in an autoclave, and six pieces of new, sterile Jaloma® odorless gauze (10 cm × 10 cm). The patient was asked to rub her/his neck, face, and forearms for 1 min with two gauzes and to insert them in one of the flasks. Subsequently, the patient was asked to place two gauzes under each armpit for 10 min. After this time, the patient was instructed to insert the sweat-soaked gauzes into the glass flask, close it, and place it back in the resealable bag.
This process was approved by the Bioethics and Safety Committee of the University of Sonora. To reduce inter-individual variability, all sweat samples were collected in climate-controlled facilities. Participants were instructed to avoid vigorous physical activity, bathing, or applying personal care products prior to sampling. The sweat samples were transported in separate coolers at 4 °C from clinical facilities to the laboratory at CINVESTAV-IPN Mérida Unit, where they arrived within 1–2 h after collection. Upon arrival, all samples were immediately stored at − 80 °C until they were processed.
Analysis of sweat components by SPME-GC-MS
To extract the compounds, the sweat absorbed in the gauzes was extracted with 10 mL of a solvent mixture (acetonitrile-methanol 50:50). To capture the sweat components, SPME was performed by introducing the SPME fiber directly into 500 µL of the solvent mixture at room temperature (approximately 22–24 °C) for 1 min. The 1-min extraction time was selected based on preliminary experiments demonstrating adequate sensitivity while minimizing carryover and fiber saturation. The Polydimethylsiloxane/Divinylbenzene/Carboxen fiber (PDMS/DVB/Carboxen) 80/23 mm (Part Number: 5191–5874 Agilent, USA) was selected because its mixed-phase coating efficiently adsorbs a wide variety of compounds, making it particularly suitable for comprehensive metabolomic profiling21. Subsequently, the SPME fiber contained in the holder was manually inserted into the injector for 1 min at 270 °C to desorb the captured compounds. The fiber was used in direct immersion with the solvent mixture (methanol: acetonitrile) on the basis of prior experience with biological matrices. However, this combination has not been formally validated for comprehensive lipid extraction, and exposure to polar solvents may affect fiber performance over time. To mitigate this risk, the extraction time was limited to 1 min.
All the samples (426) were processed in a Gas Chromatography-Mass Spectrometry (GC-MS) system consisting of an Agilent Technologies model 7860 Gas Chromatograph and a model 5977B mass spectrometer. Chromatographic runs were performed under the following conditions. The injector was maintained at 270 °C in splitless mode. The chromatographic separation was performed using an Agilent HP-5MS capillary column (30 m length × 0.25 mm internal diameter × 0.25 μm film thickness; Agilent Technologies, USA). The oven was held at an initial temperature of 50 °C for 1 min. The temperature was then raised at 10 °C/min up to 300 °C and held for 5 min, giving a total chromatographic run time of 31 min. Helium was used as carrier gas at a constant flow of 1 mL/min. The mass detector was operated in electron ionization mode at 70 eV, and its response was monitored in TIC (Total ion current) SCAN format from 50 to 650 m/z. A C7 to C30 alkane calibration standard (Sigma-Aldrich, St. Louis, Missouri, United States of America) was analyzed to calculate the retention index (RI) of each component following a previously reported methodology22. To ensure data integrity and minimize the risk of contamination or analytical artifacts, procedural solvent blanks were included throughout the analytical workflow. These consisted of solvent blanks (acetonitrile: methanol 1:1 v/v) processed under the same conditions as the sweat samples, including fiber exposure, thermal desorption, and GC-MS analysis. The blanks were run periodically across the sample batches to detect potential carry-over and to verify the absence of background interference. The analytical strategy employed in this study was exploratory and based on relative quantification using total ion chromatogram (TIC) signal intensity, as no calibration standards or specific quality control (QC) samples were included.
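As an illustration of how a retention index can be derived from an n-alkane ladder, the sketch below uses the linear (temperature-programmed) interpolation formula commonly attributed to van den Dool and Kratz. That this is the formula of the cited methodology22 is an assumption; the function name and the retention times in the usage line are hypothetical and are not measurements from this study.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear retention index of a peak eluting at `rt` (min).

    `alkane_rts` maps n-alkane carbon number -> retention time (min),
    with retention times increasing with carbon number.
    Assumed formula: RI = 100 * (n + (N - n) * (rt - t_n) / (t_N - t_n)),
    where n and N are the alkanes bracketing the peak.
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the alkane ladder")
    # Index of the alkane eluting at or just before the peak
    i = min(bisect_right(times, rt) - 1, len(times) - 2)
    n, t_n, t_next = carbons[i], times[i], times[i + 1]
    step = carbons[i + 1] - n
    return 100 * (n + step * (rt - t_n) / (t_next - t_n))


# Hypothetical ladder: C10 at 8.0 min, C11 at 10.0 min, C12 at 12.0 min
ladder = {10: 8.0, 11: 10.0, 12: 12.0}
print(linear_retention_index(9.0, ladder))  # 1050.0
```

A peak eluting halfway between two consecutive alkanes therefore receives an index halfway between their nominal values (100 × carbon number).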
Annotation of sweat components
The gas chromatograph was used to separate the components of interest, and the mass spectrometer to identify them based on their unique mass fragmentation patterns. The mass spectrum of each chromatogram component was extracted from noise using the software Automated Mass Spectral Deconvolution & Identification System (AMDIS). The components were annotated by comparison of their extracted mass spectra with those in the National Institute of Standards and Technology database, United States of America (NIST17), including the same distribution of the 10 major ions, the parent ion, and the molecular weight ions. The NIST match and RI were also considered for annotation. Additionally, each annotation was classified according to the Metabolomics Standards Initiative (MSI) guidelines23.
Statistical treatment
The statistical analysis was conducted using the total ion chromatogram (TIC) signal intensity values derived from the GC-MS data. These values represent integrated spectral responses without retaining individual m/z features. The full TIC-based data matrix (including retention time and intensity) used for the multivariate modeling is provided as Supplementary Information S1 in the ZENODO repository: https://doi.org/10.5281/zenodo.16740883. The TIC data were selected because they cover the whole window of several hundred mass-to-charge (m/z) units and therefore capture most of the chemical information in each sweat sample. The TIC files were imported into RStudio24, an integrated development environment for the R software25, using the function files2SpectraObject from the “ChemoSpec” package, an R package for the preprocessing and exploratory analysis of 2D data26. The “ChemoSpec” package can process spectroscopic or chromatographic 2D data (where Y is the signal intensity and X is a time or frequency unit).
The data import assumes that the raw chromatogram files contain two columns, the first for the retention time and the second for the intensity. After import, the chromatograms were baseline-corrected using the peakDetection algorithm, which performs peak detection and baseline correction simultaneously. The peaks were then aligned and the data were binned to a resolution of 0.1 min. The last preprocessing step was an intensity calibration (normalization) by probabilistic quotient normalization (PQN), using the median TIC (total ion current) of all chromatograms as the reference.
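The normalization step can be sketched as follows. This is a minimal Python illustration of probabilistic quotient normalization with a median reference chromatogram, not the R code used in the study; the function name is hypothetical, and the handling of zero-intensity bins (they are skipped when computing quotients) is an assumption.

```python
from statistics import median


def pqn_normalize(chromatograms):
    """Probabilistic quotient normalization of binned chromatograms.

    `chromatograms` is a list of equal-length intensity lists (one per
    sample, one value per retention-time bin). The reference is the
    bin-wise median across all samples; each sample is divided by the
    median of its bin-wise quotients against that reference.
    """
    n_bins = len(chromatograms[0])
    reference = [median(c[j] for c in chromatograms) for j in range(n_bins)]
    normalized = []
    for c in chromatograms:
        # Assumption: bins with zero signal or zero reference are ignored
        quotients = [c[j] / reference[j] for j in range(n_bins)
                     if reference[j] > 0 and c[j] > 0]
        factor = median(quotients) if quotients else 1.0
        normalized.append([value / factor for value in c])
    return normalized


# Three hypothetical samples that differ only by an overall dilution factor
samples = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [4.0, 8.0, 12.0]]
print(pqn_normalize(samples))
```

Because the scaling factor is a median of quotients rather than a total-signal sum, a few strongly changing peaks have little influence on it, which is the usual motivation for choosing PQN over total-area normalization.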
The supervised analysis was performed with the “caret” R package (short for Classification and Regression Training), which contains functions that attempt to streamline the process of creating a predictive model27. In this procedure, the data were divided into two randomly chosen groups (“training set” and “test set”). The first group, the “training set” (75% of the original dataset), was used to optimize the model parameters. In each model optimization, a resampling method was added to reduce the error in the estimate of mean model performance, in this case repeated k-fold cross-validation (k = 10, repeated three times). Five models were implemented and compared: PLS-DA (Partial Least-Squares Discriminant Analysis), RF (Random Forest), and three SVM (Support Vector Machine) models, one linear and two non-linear (polynomial kernel and radial kernel). The aim was to obtain the classification model that best discriminates between SARS-CoV-2 positive and negative samples from sweat GC-MS profiling data. These models are among those most frequently used to assign samples to different groups from metabolomic data28,29. The area under the ROC curve (AUC) and Accuracy were the binary classification metrics used to select the optimal model. Additionally, to evaluate the performance of each classification model, metrics such as DOR (Diagnostic odds ratio), Accuracy, Balanced Accuracy, F1-score, Cohen’s Kappa, and MCC (Matthews Correlation Coefficient) were calculated. The formulas used to calculate the binary classification evaluation metrics are given in references 30 and 31. The second group of samples, the “test set” (the remainder of the original dataset), was used to evaluate the models, that is, to assess their power to discriminate between positive and negative samples by means of a confusion matrix.
The confusion matrix is an organized way of mapping the model predictions to the original classes (PR: positive samples; NR: negative samples) to which the data belong. Additionally, several binary classification metrics beyond those generated by the R “caret” package (Accuracy, Balanced Accuracy, and Cohen’s Kappa) were calculated for both the “training set” and the “test set”, in order to evaluate the classification models and to decide which model was better when several showed similar performance on a single binary classification evaluation metric. To determine whether the performance values were better under the ROC metric or under the Accuracy metric, we performed a t-test (α = 0.05).
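The partitioning and resampling scheme described above can be sketched in plain Python. This is an illustration only: the study used the “caret” R package, whose own partitioning may yield slightly different group sizes, and stratifying the 75/25 split by class is an assumption. The function names and the seed are hypothetical; the class sizes (137 PR, 289 NR) are those reported for the samples.

```python
import random


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split sample indices into training and test sets, class by class."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train += idx[:cut]
        test += idx[cut:]
    return sorted(train), sorted(test)


def repeated_kfold(indices, k=10, repeats=3, seed=0):
    """Yield (fit, held_out) index lists for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for f in range(k):
            fit = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield fit, folds[f]


labels = ["PR"] * 137 + ["NR"] * 289
train, test = stratified_split(labels)
resamples = list(repeated_kfold(train))
print(len(train), len(test), len(resamples))
```

Each of the 30 resamples (10 folds × 3 repeats) fits the model on nine tenths of the training set and scores it on the remaining tenth; the test set is never touched during this tuning stage.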
Other packages used in this study for 3D visualization were “ggplot2”32, “rgl”33, and “car”34. The efficiency of the binary PLS-DA classification was assessed with metrics such as Accuracy, Sensitivity, and Specificity calculated from the confusion matrix. Balanced Accuracy was used as the metric for the binary classifier because the classes are imbalanced: the data contain more SARS-CoV-2 negative cases than SARS-CoV-2 positive patients.
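For reference, the metrics named in this section can all be derived from the four cells of a binary confusion matrix. The sketch below uses their standard textbook definitions; it is not taken from the study's scripts, the function name is hypothetical, and the counts in the usage line are invented for illustration rather than results of this work. It assumes that no row or column of the confusion matrix is empty.

```python
from math import sqrt


def binary_metrics(tp, fn, fp, tn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: positive samples classified as positive/negative;
    fp/tn: negative samples classified as positive/negative.
    """
    total = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    # Agreement expected by chance, used by Cohen's kappa
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "dor": (tp * tn) / (fp * fn) if fp * fn else float("inf"),
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "kappa": (accuracy - expected) / (1 - expected),
    }


# Hypothetical counts, for illustration only
print(binary_metrics(tp=40, fn=10, fp=10, tn=40))
```

Balanced Accuracy averages sensitivity and specificity, so a classifier that simply predicts the majority class scores 0.5 regardless of how unequal the class sizes are, whereas plain Accuracy would reward it.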
Finally, the original variables (retention times, RT) most important for the classification model were obtained using the VIP (Variable Importance in Projection) calculation from the “caret” package.
Results
Demographic data analysis
Four hundred twenty-six sweat samples, one per patient, were analyzed by GC-MS. The positive sweat samples were obtained from a total of 137 people from two sources, the Progreso Health Center in Progreso, Yucatán, and the private laboratory BioStudio at Merida, Yucatán. The negative samples (n = 289) were obtained from people of Merida City. The whole sample (positives and negatives) comprised 196 (46%) women and 230 (54%) men. The group of positive people comprised 81 (59%) women and 56 (41%) men, and the group of negative people was 115 (40%) women and 174 (60%) men. There were no significant differences between SARS-CoV-2 positive and negative patients in the proportion of women and men (Fisher’s exact test, difference between proportions = 1.65318, p > 0.05). The age range of the whole sample was between 4 and 80 years, with 37 ± 14 years average for women and 38 ± 14 years for men. There were no differences in the mean age between the SARS-CoV-2 positive and negative patients (Student’s t0.05 = 0.47, p > 0.05).
Optimal model
According to the analyses, the best model was the PLS-DA as determined by the Accuracy metric. The PLS-DA model had the best performance among the models, according to metrics like DOR (Diagnostic odds ratio), where a higher ratio value indicates a better result (Supplementary Information S2 Excel file).
To understand how the final PLS-DA model differentiates between the two classes (positives and negatives from the training data), we plotted the final model. The plot clearly shows the separation of the PR samples from the NR samples (Fig. 1). The variance explained by the first three latent variables (LVs) of the final PLS-DA model is 14.78% for LV1, 14.06% for LV2, and 8.41% for LV3 (37.25% overall).
Compound profile of SARS-CoV-2 positive patients
Solid Phase Microextraction Gas Chromatography-mass spectrometry (SPME-GC-MS) analysis detected 286 different compounds differentially distributed among the samples. Fig. 2 shows the mean of the Total Ion Current Chromatograms with the standard error of the mean for the COVID-19 positive and negative groups. The VIP analysis identified 20 discriminatory compounds for SARS-CoV-2 infected and non-infected patients. However, only 6 compounds (Table 1) were chemically annotated by spectrum recognition using the NIST library, in combination with spectrum interpretation for compounds not found in the database and RI determination (Supplementary Information S3 PDF file). Among the annotated compounds, we found one saturated fatty acid (SFA) (hexadecanoic acid), one methyl ester of an SFA (methyl hexacosanoate), two methyl esters of monounsaturated fatty acids (MUFA) (methyl (Z)-hexadec-9-enoate and methyl (Z)-octadec-9-enoate), one triterpenoid (squalene), and one aliphatic aldehyde (undecanal). According to the analysis, significant differences in profiles were observed between SARS-CoV-2 infected and non-infected patients. Hexadecanoic acid was elevated in SARS-CoV-2-positive patients (p = 0.004), as was methyl (Z)-octadec-9-enoate (p = 0.03). In contrast, methyl (Z)-hexadec-9-enoate (p = 0.0004), methyl hexacosanoate (p = 0.009), squalene (p = 0.029), and undecanal (p = 0.007) were lower in SARS-CoV-2-positive patients than in negative ones (Fig. 3).
Mean of Total Ion Current Chromatograms with the standard error of the mean for COVID-19 positive and negative groups. PR stands for positive patients and NR for negative patients. The spectra show the variation within each group, along with the standard errors of the TIC mean (SEM). When the dispersion is low, only a red line appears instead of three lines (mean – SEM, mean in black, and mean + SEM).
Discussion
The hypothesis that the DI-SPME patterns in sweat samples of COVID-positive and negative patients would differ was only partially supported, since the DI-SPME patterns of SARS-CoV-2 infected and non-infected people were similar in composition. However, we found significant differences in the signal intensity of these compounds between infected and non-infected people, which suggests that these differences could be associated with different physiological responses to the viral infection. The best classification model (PLS-DA) correctly discriminated between positive and negative cases, with specificity and sensitivity above 80%, using the MCC as an evaluation metric. Recently, the PLS-DA model was used by Mitra et al.35 on VOC patterns from the hands of people infected and not infected with SARS-CoV-2, with very good discriminating results. The PLS-DA model has also been used by Brixner et al.36 to discriminate plasma samples from patients with colorectal cancer (RCC) from those of healthy individuals, based on Fourier transform infrared spectroscopy (FTIR) data, with excellent results.
An important characteristic of our analysis was that the PLS-DA model only selected six compounds with a high level of similarity. The small number of compounds annotated was related to the fact that we concentrated our efforts on those compounds with the highest levels of similarity with respect to the peaks found in the NIST database and RI determination. The remaining compounds had a lower level of similarity with respect to the peak values found in the NIST database, and so we considered them as non-reliable annotations. In any case, since we found significant differences in the signal intensity of six of the compounds, these differences could be considered as indicators of physiological differences between SARS-CoV-2 positive and negative people.
Among the annotated compounds, three were identified as methyl esters derived from long-chain fatty acids: methyl palmitoleate, methyl oleate, and methyl cerotate. The presence of methyl esters in sweat is not uncommon and could be attributed to several mechanisms. One possibility is endogenous esterification or microbial transesterification occurring on the skin surface37. However, the use of the methanol: acetonitrile (1:1 v/v) solvent mixture during compound extraction may have facilitated the formation of these esters, in which case they would be analytical artifacts38,39. Nevertheless, despite their esterified form, lipidomic studies have demonstrated that circulating fatty acid methyl ester concentrations correlate markedly with the corresponding fatty acid levels across biological matrices such as plasma and serum40. Therefore, the compounds observed likely reflect the presence of their corresponding fatty acids (palmitoleic acid, oleic acid, and cerotic acid) and serve as indirect biochemical indicators of them. Accordingly, the significant differences observed in lipid-related compounds (palmitic acid, palmitoleic acid, oleic acid, and cerotic acid) between infected and non-infected patients suggest probable alterations in lipid metabolism associated with the presence of the SARS-CoV-2 virus. However, it is important to note that the cartridge mix used in this study is likely not optimal for recovering the full spectrum of lipids present in sweat. Therefore, the lipid findings presented here should be considered preliminary and limited to the subset of compounds that could be extracted under these conditions. Even so, the patterns found for palmitic, oleic, and palmitoleic acids match those reported by Stromberg et al.17 for plasma fatty acid profiles of individuals who experienced moderate or severe COVID-19 disease compared with those with mild infection or no history of infection.
Similarly, Spick et al.41, found that lipid levels were depressed in COVID-positive participants, indicative of dyslipidemia. In fact, various investigations have suggested that these changes in the lipid pattern can be attributed to inflammatory processes, liver dysfunction, increased vascular permeability42, and the interplay between cholesterol and viral replication43. Moreover, these changes seem to be more evident with the progression of the disease. For example, Wu et al.44 and Palma et al.45 found that metabolite and lipid alterations exhibit an apparent correlation with the disease course, reflecting that its development affected the whole-body metabolism. This indicates that plasma concentrations of both pro-inflammatory and pro-resolving lipid mediators were reduced in critically ill patients compared to those with severe disease. However, the mechanisms and consequences of SARS-CoV-2 lipid metabolic reprogramming are largely unexplored46.
In the fatty acid profiles, levels of palmitic acid were significantly higher in COVID-positive patients than in uninfected patients. Previous studies of fatty acid metabolism in patients with COVID-19 disease have suggested that palmitic acid is likely to play a role in viral entry into host cells, since this acid is known to attach covalently to the cysteine residues found on the SARS-CoV-2 spike and envelope proteins17,46. Thus, individuals with high levels of this fatty acid may be more susceptible to viral invasion and may subsequently develop a more severe disease course. This could also be one of the reasons why the immune response of individuals with morbid obesity is adversely affected by COVID-19 disease, which in turn leads to increased disease severity. Additionally, SARS-CoV-2 has been hypothesized46 to promote activation of palmitic acid synthesis via upregulation of the genes responsible for signaling the transcription of fatty acid synthase, acetyl-CoA carboxylase, and stearoyl-CoA desaturase 1. In doing so, the virus could increase the lipid stock, further promoting its replication and increasing viral load within the body46.
Oleic acid was also significantly higher in COVID-positive patients than in uninfected patients. Elevation of oleic acid in the presence of the virus was previously described by Barberis et al.47, who showed that levels of oleic acid correlated directly with disease severity. Interestingly, unsaturated fatty acids, such as oleic, arachidonic, or linoleic acid, have been shown to mediate antiviral activity by disintegrating the envelope of certain animal viruses, including herpes and influenza48. This pattern suggests a possible role for oleic acid in attacking the SARS-CoV-2 envelope, given its direct negative correlation with disease severity49.
In this study, methyl palmitoleate (≈ palmitoleic acid) was significantly reduced in the sweat of COVID-19 positive patients. This observation is consistent with recent evidence from Cartin-Ceba et al.50 and Stromberg et al.17, who reported that palmitoleic acid was one of the compounds significantly reduced in the plasma of COVID-19 patients. This reduction was even more pronounced in patients with acute respiratory distress syndrome, those requiring life support, or those who died during hospitalization. These authors attribute the decrease to an increased use of unsaturated fatty acids for energy in response to inflammation. This reflects a shift in lipid homeostasis and suggests altered de novo lipogenesis, as palmitoleic acid is a direct product of palmitic acid desaturation17. Palmitoleic acid has been described as a lipokine (a lipid molecule with hormonal function) associated with IL-6, TNF-α, and other inflammatory mediators, supporting its anti-inflammatory effects and role in immunometabolic processes17,50.
Among the compounds that differed between COVID-19-positive and -negative individuals, methyl cerotate (≈ cerotic acid) was significantly decreased in the infected group. Cerotic acid belongs to the family of very long-chain saturated fatty acids (VLCFAs), which are primarily metabolized by peroxisomal β-oxidation. These fatty acids play an essential role in maintaining cellular lipid homeostasis, membrane structure, and immune regulation51. Mika et al.52 identified significantly reduced levels of cerotic acid in patients with colorectal cancer, suggesting that decreased VLCFAs may reflect altered peroxisomal function or increased metabolic demand associated with the disease. Therefore, although cerotic acid has not yet been directly linked to viral infections, the reduction observed in our study could indicate a peroxisomal alteration or a change in lipid metabolism in response to infection.
In this study, squalene occurred recurrently and abundantly in patients not infected with the virus, whereas it decreased significantly in COVID-positive patients. These changes could be attributed to shifts in the percentage contribution of the compounds that make up human sebum. Squalene, a linear intermediate that precedes cholesterol in its biosynthesis, represents 12% of the products of sebaceous secretion53. Curiously, in the sebaceous glands, the squalene produced is not converted into lanosterol, which halts the pathway to cholesterol and favors the accumulation of squalene. This accumulation in the sebaceous gland may be related to overexpression or increased activity of squalene synthase in the cells, or to a decreased level or activity of the enzymes involved in the conversion to cholesterol54. Therefore, in an infectious process such as SARS-CoV-2 infection, in which irregularities occur in lipid metabolism, the extra production of palmitic acid or oleic acid possibly displaces squalene and/or modifies its relative abundance among the sebum components.
Many authors consider undecanal to be an exogenous compound8,55,56, primarily associated with dietary sources such as citrus fruits and cucumbers (https://hmdb.ca/metabolites/HMDB0030941). However, recent findings suggest that undecanal may also originate endogenously in humans. Zhao et al.57, investigating the chemical basis of host discrimination by Aedes aegypti mosquitoes, reported that human odor is characterized by a high relative abundance of long-chain aldehydes such as decanal and undecanal. These compounds were proposed to be oxidation products of squalene and sapienic acid, lipid components unique to human sebum58, which are thought to contribute to skin protection57,58. Furthermore, Omolo et al.59 identified undecanal as a key component of foot odor contributing to Anopheles gambiae mosquito attraction. These findings suggest that undecanal may have both exogenous and endogenous origins. Given the current ambiguity, undecanal should be interpreted with caution as a potential biomarker.
In conclusion, the results of this study demonstrate significant differences in the signal intensity of six compounds in the volatile and semi-volatile compound profile between SARS-CoV-2-infected and uninfected individuals, reflecting clear alterations in lipid metabolism associated with viral infection. The effective discrimination achieved by the PLS-DA model (with a sensitivity and specificity greater than 80%) confirms the diagnostic potential of this approach. In particular, variations in the levels of saturated and unsaturated fatty acids, such as increased palmitic and oleic acids, and decreased squalene, palmitoleic acid, cerotic acid, and even undecanal, suggest a virus-induced metabolic reprogramming, possibly related to viral entry mechanisms, immune dysfunction, and interference with lipid homeostasis. These findings not only provide evidence of the role of lipids in the pathophysiology of COVID-19, but also position SPME-GC-MS analysis of body fluids as a promising tool for the early detection and monitoring of this and other infectious diseases.
Data availability
The datasets generated and/or analysed during the current study are available in the Zenodo repository: DOI 10.5281/zenodo.16740883.
References
Chen, H. et al. COVID-19 screening using breath-borne volatile organic compounds. J. Breath. Res. 15, 047104. https://doi.org/10.1088/1752-7163/ac21b5 (2021).
Mercer, T. R. & Salit, M. Testing at scale during the COVID-19 pandemic. Nat. Rev. Genet. 22, 415–426. https://doi.org/10.1038/s41576-021-00360-w (2021).
Giri, A. K. & Rana, D. R. Charting the challenges behind the testing of COVID-19 in developing countries: Nepal as a case study. Biosaf. Health. 2, 53–56. https://doi.org/10.1016/j.bsheal.2020.05.002 (2020).
Jendrny, P. et al. Canine olfactory detection and its relevance to medical detection. BMC Infect. Dis. 21, 838. https://doi.org/10.1186/s12879-021-06594-2 (2021).
Steppert, C., Steppert, I., Sterlacci, W. & Bollinger, T. Rapid detection of SARS-CoV-2 infection by multicapillary column coupled ion mobility spectrometry (MCC-IMS) of breath. J. Breath. Res. 15, 025001. https://doi.org/10.1088/1752-7163/abe5ca (2021).
Shirasu, M. & Touhara, K. The scent of disease: volatile organic compounds of the human body related to disease and disorder. J. Biochem. 150, 257–266. https://doi.org/10.1093/jb/mvr090 (2011).
Issitt, T. et al. Volatile compounds in human breath: critical review and meta-analysis. J. Breath. Res. 16, 024001. https://doi.org/10.1088/1752-7163/ac5a45 (2022).
Phillips, M. et al. Volatile biomarkers of pulmonary infections with Pseudomonas aeruginosa detected in the breath using real-time selected ion flow tube mass spectrometry. J. Clin. Microbiol. 45, 2058–2063. https://doi.org/10.1128/JCM.00171-07 (2007).
Bernaś, E. & Staniaszek, L. The role of volatile organic compounds in the early diagnosis of lung cancer. Adv. Respir Med. 86, 183–191. https://doi.org/10.5603/ARM.2018.0030 (2018).
Pirrone, F. & Albertini, M. Olfactory detection of cancer by trained sniffer dogs: a systematic review of the literature. J. Vet. Behav. 19, 105–117. https://doi.org/10.1016/j.jveb.2017.03.004 (2017).
Maa, E., Arnold, J., Ninedorf, K. & Olsen, H. Canine detection of volatile organic compounds unique to human epileptic seizure. Epilepsy Behav. 115, 107690. https://doi.org/10.1016/j.yebeh.2020.107690 (2021).
Ahmed, I., Greenwood, R., de Lacy Costello, B., Ratcliffe, N. M. & Probert, C. S. An investigation of fecal volatile organic metabolites in irritable bowel syndrome. PLoS One. 8, e58204. https://doi.org/10.1371/journal.pone.0058204 (2013).
Taylor, M. T. et al. Using dog scent detection as a point-of-care tool to identify toxigenic Clostridium difficile in stool. Open. Forum Infect. Dis. 5, ofy179. https://doi.org/10.1093/ofid/ofy179 (2018).
Guest, C. et al. Trained dogs identify people with malaria parasites by their odour. Lancet Infect. Dis. 19, 578–580. https://doi.org/10.1016/S1473-3099(19)30220-8 (2019).
Yuan, Z. C. et al. Solid-phase microextraction fiber in face mask for in vivo sampling and direct mass spectrometry analysis of exhaled breath aerosol. Anal. Chem. 92, 11543–11547. https://doi.org/10.1021/acs.analchem.0c02251 (2020).
Mougang, Y. K. et al. Sensor array and gas chromatographic detection of the blood serum volatilomic signature of COVID-19. iScience 24, 102851. https://doi.org/10.1016/j.isci.2021.102851 (2021).
Stromberg, S. et al. Relationships between plasma fatty acids in adults with mild, moderate, or severe COVID-19 and the development of post-acute sequelae. Front. Nutr. 9, 960409. https://doi.org/10.3389/fnut.2022.960409 (2022).
de Costello, B. Review of the volatiles from the healthy human body. J. Breath. Res. 8, 014001. https://doi.org/10.1088/1752-7155/8/1/014001 (2014).
Jalal, A. H. et al. Prospects and challenges of volatile organic compound sensors in human healthcare. ACS Sens. 3, 1246–1263. https://doi.org/10.1021/acssensors.8b00247 (2018).
Secretaría de Salud. Lineamiento estandarizado para la vigilancia epidemiológica y por laboratorio de la enfermedad respiratoria viral. (2021). Available at: https://coronavirus.gob.mx/wp-content/uploads/2020/06/Lineamiento_VE_Lab_enfermedad_respiratoria_viral-_20052020.pdf (Accessed 26 November 2021).
Rodrigues, F., Caldeira, M. & Camara, J. S. Development of a dynamic headspace solid-phase microextraction procedure coupled to GC-qMSD for evaluation the chemical profile in alcoholic beverages. Anal. Chim. Acta. 609, 82–104. https://doi.org/10.1016/j.aca.2007.12.041 (2008).
Sun, G. & Stremple, P. Retention Index Characterization of Flavor, Fragrance and Many Other Compounds on DB-1 and DB-XLB. J&W Scientific, 91 Blue Ravine Road, Folsom (2003).
Salek, R. M. et al. The role of reporting standards for metabolite annotation and identification in metabolomic studies. GigaSci 2, 13. https://doi.org/10.1186/2047-217X-2-13 (2013).
Posit team. RStudio: Integrated Development Environment for R. Posit Software, PBC. (2022). http://www.posit.co/
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2022).
Hanson, B. A. ChemoSpec: Exploratory Chemometrics for Spectroscopy. R package version 6.1.4. (2022). https://cran.r-project.org/package=ChemoSpec
Kuhn, M. Caret: Classification and Regression Training. R package version 6.0–93. (2022). https://cran.r-project.org/package=caret
Gromski, P. S. et al. A tutorial review: metabolomics and partial least squares-discriminant analysis—a marriage of convenience or a shotgun wedding. Anal. Chim. Acta. 879, 10–23. https://doi.org/10.1016/j.aca.2015.02.012 (2015).
Mendez, K. M., Reinke, S. N. & Broadhurst, D. I. A comparative evaluation of the generalised predictive ability of eight machine learning algorithms across ten clinical metabolomics data sets for binary classification. Metabolomics 15, 1–15. https://doi.org/10.1007/s11306-019-1612-4 (2019).
Vujović, Ž. Classification model evaluation metrics. Int. J. Adv. Comput. Sci. Appl. 12, 599–606. https://doi.org/10.14569/IJACSA.2021.0120670 (2021).
De Diego, M. I., Redondo, A. R., Fernández, R. R., Navarro, J. & Moguerza, J. M. General performance score for classification problems. Appl. Intell. 52, 12049–12063. https://doi.org/10.1007/s10489-021-03041-7 (2022).
Wickham, H. Ggplot2: Elegant Graphics for Data Analysis (Springer, 2016). https://ggplot2.tidyverse.org
Murdoch, D. & Adler, D. Rgl: 3D Visualization Using OpenGL. R package version 0.110.2. (2022). https://cran.r-project.org/package=rgl
Fox, J. & Weisberg, S. An R Companion to Applied Regression, 3rd edn (Sage, 2019). https://socialsciences.mcmaster.ca/jfox/Books/Companion/
Mitra, A. et al. The human skin volatilome: a systematic review of untargeted mass spectrometry analysis. Metabolites 12, 824. https://doi.org/10.3390/metabo12090824 (2022).
Brixner, R. L., Cassol, O. S., Rieger, A. & Corbellini, V. A. Discrimination of healthy and colorectal cancer patients using FTIR and PLS-DA. Rev. Jovens Pesqui. 9. https://doi.org/10.17058/rjp.v9i2.13372 (2019).
Fournière, M., Latire, T., Souak, D., Feuilloley, M. G. J. & Bedoux, G. Staphylococcus epidermidis and Cutibacterium acnes: two major sentinels of skin microbiota and the influence of cosmetics. Microorganisms 8, 1752. https://doi.org/10.3390/microorganisms8111752 (2020).
Lough, A. K., Felinski, L. & Garton, G. A. The production of methyl esters of fatty acids as artifacts during the extraction or storage of tissue lipids in the presence of methanol. J. Lipid Res. 3, 351–356. https://doi.org/10.1016/S0022-2275(20)40396-7 (1962).
Fukuda, J. I., Mizukami, E. & Imaichi, K. Production of methyl esters of fatty acids as artifacts during the concentration of methanolic extracts of serum or plasma lipids. J. Biochem. 61, 657–658 (1967).
Mallimoggala, Y. et al. Assessment of fatty acid concentrations among blood matrices. Metabolites 15, 482. https://doi.org/10.3390/metabo15070482 (2025).
Spick, M. et al. Changes to the sebum lipidome upon COVID-19 infection observed via rapid sampling from the skin. eClinicalMedicine 33, 100786. https://doi.org/10.1016/j.eclinm.2021.100786 (2021).
Wei, X. et al. Hypolipidemia is associated with the severity of COVID-19. J. Clin. Lipidol. 14, 297–304. https://doi.org/10.1016/j.jacl.2020.04.008 (2020).
Roccaforte, V., Daves, M., Lippi, G., Spreafico, M. & Bonato, C. Altered lipid profile in patients with COVID-19 infection. J. Lab. Precis. Med. 6. https://doi.org/10.21037/jlpm-20-98 (2020).
Wu, D. et al. Plasma metabolomic and lipidomic alterations associated with COVID-19. Natl. Sci. Rev. 7, 1157–1168. https://doi.org/10.1093/nsr/nwaa086 (2020).
Palmas, F. et al. Dysregulated plasma lipid mediator profiles in critically ill COVID-19 patients. PLoS One. 16, e0256226. https://doi.org/10.1371/journal.pone.0256226 (2021).
Tanner, J. E. & Alfieri, C. The fatty acid lipid metabolism nexus in COVID-19. Viruses 13, 90. https://doi.org/10.3390/v13010090 (2021).
Barberis, E. et al. Large-scale plasma analysis revealed new mechanisms and molecules associated with the host response to SARS-CoV-2. Int. J. Mol. Sci. 21, 8623. https://doi.org/10.3390/ijms21228623 (2020).
Kohn, A., Gitelman, J. & Inbar, M. Interaction of polyunsaturated fatty acids with animal cells and enveloped viruses. Antimicrob. Agents Chemother. 18, 962–970. https://doi.org/10.1128/AAC.18.5.962 (1980).
Valdés, A. et al. Metabolomics study of COVID-19 patients in four different clinical stages. Sci. Rep. 12, 1650. https://doi.org/10.1038/s41598-022-05667-0 (2022).
Cartin-Ceba, R. et al. Evidence showing lipotoxicity worsens outcomes in COVID-19 patients and insights about the underlying mechanisms. iScience 25, 104322. https://doi.org/10.1016/j.isci.2022.104322 (2022).
Wang, X., Yu, H., Gao, R., Liu, M. & Xie, W. A comprehensive review of the family of very-long-chain fatty acid elongases: structure, function, and implications in physiology and pathology. Eur. J. Med. Res. 28, 532. https://doi.org/10.1186/s40001-023-01523-7 (2023).
Mika, A. et al. Hyper-elongation in colorectal cancer tissue—cerotic acid is a potential novel serum metabolic marker of colorectal malignancies. Cell. Physiol. Biochem. 41, 722–730. https://doi.org/10.1159/000458431 (2017).
Popa, O., Băbeanu, N. E., Popa, I., Niță, S. & Dinu-Pârvu, C. E. Methods for obtaining and determination of squalene from natural sources. Biomed. Res. Int. 2015, 367202. https://doi.org/10.1155/2015/367202 (2015).
Picardo, M., Ottaviani, M., Camera, E. & Mastrofrancesco, A. Sebaceous gland lipids. Derm-Endocrinol. 1, 68–71. https://doi.org/10.4161/derm.1.2.8472 (2009).
Duffy, E., Jacobs, M., Kirby, B. & Morrin, A. Probing skin physiology through the volatile footprint: discriminating volatile emissions before and after acute barrier disruption. Exp. Dermatol. 26, 919–925. https://doi.org/10.1111/exd.13361 (2017).
Doležal, P., Kyjaková, P., Valterová, I. & Urban, Š. Qualitative analyses of less-volatile organic molecules from female skin scents by comprehensive two-dimensional gas chromatography–time of flight mass spectrometry. J. Chromatogr. A. 1505, 77–86. https://doi.org/10.1016/j.chroma.2017.05.027 (2017).
Zhao, Z. et al. Mosquito brains encode unique features of human odour to drive host seeking. Nature 605, 706–712. https://doi.org/10.1038/s41586-022-04675-4 (2022).
Verhulst, N. O., Weldegergis, B. T., Menger, D. & Takken, W. Attractiveness of volatiles from different body parts to the malaria mosquito Anopheles coluzzii is affected by deodorant compounds. Sci. Rep. 6, 27141. https://doi.org/10.1038/srep27141 (2016).
Omolo, M. O., Ndiege, I. O. & Hassanali, A. Semiochemical signatures associated with differential attraction of Anopheles gambiae to human feet. PLoS One. 16, e0260149. https://doi.org/10.1371/journal.pone.0260149 (2021).
Acknowledgements
The authors are indebted to Karla Valenzuela Lozano (nurse and sample collection), Georgina Villegas, and Enrique Claussen (former Secretary of Health of the state of Sonora). The authors thank Clara Vivas-Rodríguez, Gregory Arjona-Torres, Francisco Puc-Itza, Danilú Couoh-Puga, Nadia Herrera, Mariana Ávila López and Oscar Arturo Centeno-Chalé for support with the field and laboratory work.
Funding
This project was supported by the Mexican Government (CONAHCyT, now SECIHTI) for training dogs in the bio-detection of SARS-CoV-2 at the OBI-K19 training center. Agreement Number: 000000000317533, Secretaría de Ciencia, Humanidades, Tecnología e Innovación. https://secihti.mx/.
Author information
Contributions
LCS-J and VMV-M contributed to the study conception and design. Material preparation, data collection and the qPCR analysis were performed by DH-M. The SPME-GC-MS analysis was performed by EH-N and JV-M. The statistical analysis was performed by HAP-P. Funding was acquired by JMM-T. Access to patients and samples was provided by PG. The first draft of the manuscript was written by LCS-J, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics declarations
This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Research Ethics Committee of the University of Sonora (CEI-UNISON) (09-21-2020/Office No. CEI-Unison 016/2020) and Ethics Committee on Research, State’s General Hospital “Dr. Eduardo Ramos Bours” (COMBIOETHICS-26-CEI-002-20170517).
Consent to participate
In this research, informed consent was obtained from all study participants; in the case of minor participants, written informed consent was obtained from their parents.
Consent to publish
This manuscript does not contain personal data of any kind.
Declaration of authorship
We declare that this manuscript is original, has not been published before, and is not currently being considered for publication elsewhere. We confirm that the manuscript has been read and approved by all named authors and that no other persons who satisfy the authorship criteria have been omitted.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Soler-Jiménez, L.C., Peniche-Pavía, H.A., Vázquez-Martínez, J. et al. Solid-phase microextraction of sweat components of patients positive for Sars-Cov-2 for identification of possible biomarkers. Sci Rep 15, 35680 (2025). https://doi.org/10.1038/s41598-025-19509-2
DOI: https://doi.org/10.1038/s41598-025-19509-2