Introduction

People with type 1 diabetes (T1D) cannot produce insulin and therefore require insulin injections, as well as a good diet/exercise routine, to manage blood glucose (BG) levels1,2. Hypoglycemia is associated with diabetes and occurs when BG levels are < 70 mg/dL3,4. Hypoglycemic events are often accompanied by physical symptoms, including but not limited to fatigue, shakiness, and confusion, but persons with T1D may not experience any symptoms. In the most severe cases, hypoglycemia can lead to seizures and even death. Hyperglycemic events, which occur when BG levels are elevated out of range, can also harm persons with diabetes, since they can lead to nerve damage, eye problems, kidney damage, and other long-term health complications. Therefore, people with T1D must carefully monitor their glucose levels. Technologies to measure glucose are established and include glucometers and continuous glucose monitors (CGMs). Nonetheless, there are reasons to explore noninvasive techniques for monitoring hypoglycemia. For example, current technologies for glucose monitoring are invasive, as they require either finger-prick testing (glucometers) or sensors inserted beneath the skin (CGMs). Even though CGMs have the potential to increase glycemic control in youths and adolescents5, there are significant barriers to CGM use based on socioeconomic status, race, age, and ethnicity6,7. Not only do these devices have a high cost if not covered by insurance, but they can also produce frequent/false alarms and be uncomfortable for the user. Therefore, there is a drive to identify other cost-effective and noninvasive methods for detecting hypoglycemia. There are alternative noninvasive technologies based on monitoring glucose in tears or saliva, but these methods are hindered by inherently lower concentrations of glucose in these fluids, and even lower concentrations induced by hypoglycemia.
Therefore, methods for monitoring hypoglycemia that involve biomarkers other than glucose, including those which are upregulated, have great potential. Breath-based diagnostic approaches are emerging as another alternative since they are well situated to detect biomarkers that are upregulated (or downregulated) by hypoglycemia, whereas noninvasive tools for glucose monitoring in tears, saliva, or sweat rely solely on the downregulation of a single biomarker. Moreover, breath-based assays are typically developed to survey a panel or biosignature of biomarkers for accurate detection, rather than a single marker such as glucose. Finally, breath measurements can be taken on demand, given that the sample matrix is virtually limitless by nature, whereas alternative sample types for glucose sensing (e.g., tears, saliva, sweat) may be difficult to collect in certain circumstances.

Another approach for monitoring hypoglycemia is the use of diabetes alert dogs (DADs), which smell volatile organic compound (VOC) biomarkers emitted in their owner’s breath, sweat, and other noninvasively sampled bodily fluids. Research has shown that canines can be trained to detect a wide range of medical conditions by smelling VOCs in noninvasive sample types including breath8,9, and DADs have been demonstrated to objectively change behavior when their owners’ BG levels fluctuate10. However, DADs are costly and have high false-positive rates for hypoglycemia detection11. Furthermore, it is impractical to train canines to detect hypoglycemia for all persons with T1D worldwide. There is a strong motivation in current literature to identify the specific volatile analytes that canines smell for diagnosing a variety of medical conditions12,13. The biochemical rationale for surveying VOCs as biomarkers is that they are end- or byproducts of metabolic pathways that are altered by disease14. Different conditions including hypoglycemia can change the concentration of specific VOCs, leading to a unique volatilomic profile. Solid phase microextraction (SPME) coupled with gas chromatography-mass spectrometry (GC-MS) has been widely utilized to identify VOC biomarkers in breath15, as it has the capability of preconcentrating, separating, identifying and quantifying volatiles. Electronic nose (e-nose) technology has ushered in a new opportunity to detect VOC biomarkers at the point of care16,17. These devices mimic the canine olfactory system by employing multiple unique gas sensors in an array to detect VOC biosignatures through pattern recognition18,19. However, their biggest challenge is that even with the use of machine learning algorithms, they generally are not sufficiently selective for medical diagnostic applications20.
Identifying and verifying biomarkers for hypoglycemia in breath by SPME GC-MS can lead to the development of a noninvasive, highly tailored breathalyzer or wearable device that is designed to be more sensitive/selective to the specific biomarkers of hypoglycemia compared to existing e-noses.

VOC biomarker discovery is a growing field that is leading to preliminary studies for cystic fibrosis21,22, COVID-1923,24, lung cancer25, prostate cancer26, and other conditions27,28,29. Moreover, there has already been preliminary work on VOC biomarkers relating to diabetes and hypoglycemia. For example, Lekha et al. previously reported that the concentration of exhaled acetone in persons with T1D is 40× higher than in a healthy subject, and therefore breath analysis may be used to detect diabetes noninvasively15. Trefz et al. also conducted a cross-sectional study which indicated that a biosignature of seven VOCs is indicative of diabetes diagnosis30. Regarding hypoglycemia specifically, previous studies have shown that isoprene is a potentially clinically useful biomarker, although the precise relationship between hypoglycemia and isoprene is not fully understood31,32. Nonetheless, isoprene has been a commonly reported marker for different conditions33,34,35, and therefore is likely not specific to hypoglycemia. Most recently, Nicolier et al. undertook a study in which ten individuals with T1D underwent induced hypoglycemia and donated breath samples that were analyzed using GC-ion mobility spectrometry. Biostatistical analyses of the data showed that individual VOCs were significantly correlated with BG levels and that a biomarker panel of VOCs could classify hypoglycemic states with 93% accuracy36.

Finally, our team has previously demonstrated that hypoglycemia can be distinguished with clinically relevant accuracy using an exhaled VOC biomarker panel analyzed by SPME GC-MS4. Although these previous studies (from groups including our own) have identified putative biomarker panels for hypoglycemia, the results across the literature have not been blindly tested in an independent sample cohort. This is of utmost importance, as VOC biomarker discovery is prone to false discovery, and most studies do not evaluate biomarkers in newly collected samples, thus hindering their ultimate applicability. To move beyond the discovery phase, VOCs must be blindly tested in hundreds of samples for verification. Only once VOCs have been qualified/verified can biomarkers be clinically validated using even larger numbers of samples that represent the true patient population the test is intended for37,38. Herein, the authors make strides in validating exhaled biomarkers of hypoglycemia by verifying VOCs in hundreds of samples collected from a diabetes youth camp. Exhaled VOCs were analyzed in two sample cohorts collected at different times (one for biomarker discovery and the other, collected a year later, for independent verification) using established analytical methodology4 for breath sample collection, processing, and analysis.

Results

Recruitment and sample collection

To verify VOC biomarkers of hypoglycemia, 37 subjects for Cohort 1 and 51 subjects for Cohort 2 were recruited from a diabetes youth camp (different volunteers in Cohort 1 and Cohort 2) over the course of two different summers. The team aimed to collect samples from these volunteers during naturally occurring hypoglycemic events and when their BG levels were ≥ 70 mg/dL (non-hypoglycemic). BG levels were checked periodically over the course of breath collections, including when campers felt that they might be hypoglycemic, and in all cases glucose meters were used to verify hypoglycemic events. Not all subjects were able to contribute hypoglycemic samples: there were n = 20 hypoglycemic samples from Cohort 1 and n = 30 hypoglycemic samples from Cohort 2. Regarding non-hypoglycemic samples, n = 58 were collected from Cohort 1 and n = 99 were obtained for Cohort 2. Table S1 shows the total number of samples analyzed after removing outliers (based on VOC output). The hypoglycemia BG range across both sample cohorts was from 45 to 69 mg/dL and the non-hypoglycemia range was from 70 to 790 mg/dL.

Univariate statistical analysis

After the breath samples were collected and analyzed by SPME GC-MS and the chromatograms were spectrally aligned, a total of 40 VOCs were reliably identified as conserved between Cohort 1 and Cohort 2 after data screening. These VOCs had overlapping retention times (RTs) and mass-to-charge ratio (m/z) fragmentation patterns in both data sets. Many other VOCs were disqualified from analysis as they were either cohort-specific or removed by data screening procedures. Next, univariate statistics were used to assess biomarkers through analysis of normalized GC-MS signals, and as expected, no individual VOC was able to distinguish hypoglycemia with the desired accuracy. Of the 40 conserved VOCs, isopropanol was among those with the highest discriminatory power in both cohorts (receiver operating characteristic (ROC) curves in Fig. 1). This VOC could distinguish hypoglycemia with 65.0% sensitivity/87.9% specificity in Cohort 1, and 53.3% sensitivity/80.8% specificity in Cohort 2. To visualize the expression levels of all qualified features, a hierarchical heatmap of the 40 conserved VOCs was generated, displaying their signal intensities in every sample for both data sets (Fig. 2). Red represents high VOC signals, green indicates low signals, and black is designated for average values. VOCs on the y-axis (rows) are segregated through hierarchical clustering into those upregulated by hypoglycemia (bottom section of heatmap) and those downregulated by hypoglycemia (top section of heatmap). It should be noted that in both data sets, a relatively even number of up- and downregulated features were observed among the qualified VOCs. Interclass variation between hypoglycemic and non-hypoglycemic samples was high in both data sets, and both sample cohorts demonstrated low intraclass variation. Overall, even though the individual VOCs did not have high diagnostic accuracies in either data set, similar patterns in the data were observed for Cohort 1 and Cohort 2.
A single heatmap containing all samples is provided in Figure S1 in a similar fashion to show how these 40 VOCs are consistently dysregulated (up- or downregulated) by hypoglycemia in both sample cohorts.
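The univariate metrics reported above (sensitivity, specificity, and the area under the ROC curve) can be illustrated with a minimal sketch. The signal values, labels, and threshold below are made up for illustration and are not data from this study, and the function names are hypothetical; the sketch only shows how such metrics are typically computed for a single upregulated VOC.

```python
def sensitivity_specificity(signals, labels, threshold):
    """Call a sample hypoglycemic (label 1) when its signal is >= threshold, then score the calls."""
    tp = sum(1 for s, y in zip(signals, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(signals, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(signals, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(signals, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(signals, labels):
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(signals, labels) if y == 1]
    neg = [s for s, y in zip(signals, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up normalized signals for one upregulated VOC (NOT data from the study).
toy_signals = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.5, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]  # 1 = hypoglycemic, 0 = non-hypoglycemic

sens, spec = sensitivity_specificity(toy_signals, toy_labels, threshold=0.6)
auc = roc_auc(toy_signals, toy_labels)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
```

Sweeping the threshold over all observed signal values and plotting sensitivity against 1 − specificity would trace out a ROC curve of the kind shown in Fig. 1.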

Fig. 1

Receiver operating characteristic curves showing the accuracy of isopropanol for hypoglycemia detection in both Cohort 1 (training data, n = 78) and Cohort 2 (testing data, n = 129).

Fig. 2

Hierarchical heatmaps showing the 40 conserved VOCs in hypoglycemic (Hypo) and Non-Hypo samples from Cohort 1 (training data, n = 78) and Cohort 2 (testing data, n = 129).

Unsupervised multivariate statistical analysis

Because univariate statistical analysis did not yield highly accurate results, multivariate analyses were undertaken to assess how multiple VOCs distinguish hypoglycemia. First, unsupervised data mining through principal component analysis (PCA) was performed on the 40 conserved VOCs to visualize global data patterns in both sample cohorts and observe how hypoglycemic breath samples naturally segregate from non-hypoglycemic samples. Figure 3 shows two-dimensional PCA plots for both data sets in an independent fashion, where hypoglycemic samples are shown in red and non-hypoglycemic samples are illustrated in blue. The PCA plot for Cohort 1 showed that the hypoglycemic samples could not be perfectly distinguished from non-hypoglycemic samples. Nonetheless, PCA still had relatively high ability to stratify hypoglycemic events in this cohort, especially for an unsupervised multivariate statistical method. For example, hypoglycemia was distinguished along the first two principal components with sensitivity = 85.0% and specificity = 74.1%. For this analysis, PC1 accounted for 15.1% of the total variation in the data and PC2 accounted for 8.7%. PCA was also implemented for the 40 conserved VOCs within Cohort 2 samples, and PC1 accounted for 17.4% of the variation in the data and PC2 accounted for 9.8%. PCA implemented on this data set (Cohort 2) could differentiate hypoglycemic events with higher accuracy (sensitivity = 96.7% and specificity = 94.9%) compared to Cohort 1. Across both cohorts, PCA could distinguish hypoglycemia with greater accuracy than any single biomarker (i.e., isopropanol).
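The "variation accounted for" by a principal component can be illustrated with a minimal pure-Python sketch. It assumes the features are autoscaled (centered and scaled to unit variance) before PCA, which the text does not state, and it finds only the first component by power iteration rather than using a full eigendecomposition. The toy matrix and all function names are hypothetical and unrelated to the study data.

```python
import math
import random


def standardize(rows):
    """Center each column and scale it to unit variance (autoscaling; an assumption here)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) for j in range(d)]
    return [[(r[j] - means[j]) / sds[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)] for i in range(d)]


def leading_component(cov, iterations=500, seed=0):
    """Leading eigenvector and eigenvalue of cov, found by power iteration."""
    rng = random.Random(seed)
    d = len(cov)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, eigenvalue


# Toy matrix: 10 samples x 3 features, where the first two features are strongly correlated.
toy = [[i, i + (0.5 if i % 2 == 0 else -0.5), (7 * i) % 5] for i in range(10)]

z = standardize(toy)
cov = covariance(z)
pc1, eigenvalue = leading_component(cov)

# Fraction of total variance captured by PC1 = leading eigenvalue / trace of the covariance matrix.
ratio = eigenvalue / sum(cov[j][j] for j in range(len(cov)))
# PC1 score of each sample (the x-coordinate of a two-dimensional PCA plot).
scores = [sum(a * b for a, b in zip(row, pc1)) for row in z]
print(f"PC1 explains {100 * ratio:.1f}% of the variance")
```

With 40 weakly correlated features, as in this study, the leading eigenvalue is a much smaller share of the trace, which is consistent with PC1 values in the range of 15% to 17%.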

Fig. 3

PCA separating hypoglycemia (Hypo) and Non-Hypo samples for Cohort 1 (training data, n = 78) and Cohort 2 (testing data, n = 129) using the 40 conserved VOCs in both data sets.

Supervised multivariate statistical analysis

Unsupervised statistical analysis did not separate hypoglycemic events in Cohort 1 with the desired accuracy. PCA was also implemented using a fairly large number of VOCs, and because the sample classes are not defined within the algorithm, PCA and other unsupervised methods in general do not have inherent classifiers. For these reasons, a supervised multivariate classification approach was undertaken to identify a lean model of VOCs that could classify hypoglycemia with even higher accuracy. Initially, forward feature selection coupled to linear discriminant analysis (LDA)4 was used to identify a VOC biosignature that could distinguish hypoglycemia with the highest cross validated area under the ROC curve (AUC) in the training data set (Cohort 1). As shown in Fig. 4, a panel of six VOCs was able to distinguish hypoglycemia from non-hypoglycemia with AUC = 0.98 (sensitivity = 94.8% and specificity = 95.0%). This biosignature of six VOCs consisted of four VOCs upregulated by hypoglycemia (an alcohol (isopropanol), saturated hydrocarbon, ester, and aromatic ester) and two others that were downregulated (a saturated hydrocarbon and aromatic VOC). When this LDA model of VOCs was tested independently on Cohort 2, hypoglycemic events were classified with AUC = 0.93 (sensitivity = 90.0% and specificity = 89.9%). The verified result shows that this model of six VOCs is not overfit and can make reliable predictions when tested blindly in newly collected samples. The two-dimensional LDA plot and associated ROC curves for this biosignature of VOCs in both Cohort 1 and Cohort 2 can be observed in Fig. 4. Additional machine learning models were explored using this biomarker panel of six VOCs, and these included logistic regression, linear support vector machine (SVM), and weighted K-nearest neighbors (KNN). Table S2 provides in-depth detail on the biomarker results (in both the training and testing sets) using these alternative algorithms.
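Forward feature selection coupled to LDA can be sketched as a greedy loop: starting from an empty panel, add the feature whose inclusion gives the highest AUC for the LDA scores, and stop when no candidate improves it. The sketch below is a simplified illustration under stated assumptions, not the authors' implementation: it uses a two-class Fisher discriminant, scores each candidate panel on the training data itself rather than by cross-validation as in the study, and runs on randomly generated toy data. All function names are hypothetical.

```python
import random


def auc(scores, labels):
    """Probability that a random positive sample outscores a random negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def lda_scores(rows, labels, features):
    """Project samples onto the two-class Fisher discriminant built from `features`."""
    sub = [[r[f] for f in features] for r in rows]
    d = len(features)
    groups = {k: [s for s, y in zip(sub, labels) if y == k] for k in (0, 1)}
    means = {k: [sum(s[j] for s in g) / len(g) for j in range(d)] for k, g in groups.items()}
    sw = [[0.0] * d for _ in range(d)]  # pooled within-class scatter matrix
    for k, g in groups.items():
        for s in g:
            for i in range(d):
                for j in range(d):
                    sw[i][j] += (s[i] - means[k][i]) * (s[j] - means[k][j])
    w = solve(sw, [means[1][j] - means[0][j] for j in range(d)])
    return [sum(wj * sj for wj, sj in zip(w, s)) for s in sub]


def forward_select(rows, labels, max_features):
    """Greedily add the feature that most improves the AUC of the LDA scores."""
    selected, best_auc = [], 0.0
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < max_features:
        trials = [(auc(lda_scores(rows, labels, selected + [f]), labels), f) for f in remaining]
        score, feature = max(trials)
        if score <= best_auc:
            break  # no candidate improves the current panel
        selected.append(feature)
        remaining.remove(feature)
        best_auc = score
    return selected, best_auc


# Toy data (NOT from the study): 4 features, only feature 2 differs between classes.
rng = random.Random(0)
labels = [1] * 20 + [0] * 40
rows = [[rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(3.0 * y, 1), rng.gauss(0, 1)] for y in labels]

panel, panel_auc = forward_select(rows, labels, max_features=2)
print(panel, round(panel_auc, 3))
```

Because the training-set AUC can only be optimistic, a selection loop like this tends to keep adding features; scoring each candidate panel by cross-validation and then testing the final panel on an independent cohort, as described above, is what guards against overfitting.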

Fig. 4

LDA using a biomarker panel of 6 VOCs distinguishes hypoglycemic (Hypo) from Non-Hypo samples with 90% accuracy in both sample cohorts (n = 207 samples in total). Two-dimensional LDA plots are shown on the left and ROC curves are plotted on the right.
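An overall accuracy can be related to the reported sensitivity and specificity by weighting each with its class size. The short sketch below does this with the values and sample counts stated in the text; it assumes that no samples were removed from the counts given in the recruitment section, and the function name is hypothetical.

```python
def overall_accuracy(sensitivity, specificity, n_positive, n_negative):
    """Fraction of all samples classified correctly, given per-class rates and class sizes."""
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


# Sensitivity/specificity and class sizes as reported in the text for the LDA model.
cohort1_accuracy = overall_accuracy(0.948, 0.950, n_positive=20, n_negative=58)
cohort2_accuracy = overall_accuracy(0.900, 0.899, n_positive=30, n_negative=99)

print(f"Cohort 1: {cohort1_accuracy:.3f}")  # about 0.949
print(f"Cohort 2: {cohort2_accuracy:.3f}")  # about 0.899
```

Because non-hypoglycemic samples outnumber hypoglycemic ones in both cohorts, the overall accuracy is dominated by specificity, which is one reason sensitivity and specificity are reported separately.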

Other statistical analysis

In addition to developing the LDA model to distinguish hypoglycemia from non-hypoglycemia, the results were also assessed in the context of euglycemia and hyperglycemia, given the importance of detecting hyperglycemic events. Because the model had high sensitivity/specificity for hypoglycemia detection, the authors next explored whether it could also distinguish hyperglycemia (high blood sugar, defined as ≥ 180 mg/dL). On the left-hand side of Fig. 5, LDA multivariate predictor scores are shown for hyperglycemic and euglycemic samples in both sample cohorts. These two sample classes could not be distinguished with any relevant level of accuracy: in Cohort 1 and Cohort 2, non-hypoglycemic samples displayed similar LDA scores regardless of blood sugar level. To further explore whether exhaled VOCs correlate with elevated BG levels, scores from the first three linear discriminant axes were extracted for all Cohort 1 and Cohort 2 samples and analyzed through multiple linear regression. These analyses resulted in a correlation with R² = 0.12, adjusted R² = 0.11, and p-value = 1.1 × 10⁻⁵. These results are shown on the right-hand side of Fig. 5, which plots actual BG levels against VOC-based predicted levels. Although these analyses demonstrated a statistically significant correlation, the standard error was equal to 99.35, indicating relatively large variation in the predictive capability of VOCs for quantitative glucose readouts.
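The relationship between the two reported coefficients of determination can be checked with the standard adjusted R² formula. The sketch below assumes n = 207 samples (the total given in the figure captions, before any outlier removal) and three predictors (the three linear discriminant scores); both values are inferred from the text rather than stated for this regression, and the function name is hypothetical.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Assumed inputs: n = 207 samples in total, p = 3 LDA score predictors.
adj = adjusted_r2(0.12, n_samples=207, n_predictors=3)
print(round(adj, 2))  # 0.11
```

Under these assumptions the adjustment is small because the number of samples is large relative to the number of predictors; the low value itself indicates that the LDA scores explain only about a tenth of the variance in BG levels.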

Fig. 5

LDA scores from the first two dimensions for Non-Hypo samples in both sample cohorts, demonstrating that hyperglycemic samples cannot be distinguished from euglycemic samples (left-hand side). LDA scores in three dimensions were analyzed using multiple linear regression (Hypo and Non-Hypo samples in both cohorts), demonstrating a statistically significant correlation with R² = 0.12, adjusted R² = 0.11, and p-value = 1.1 × 10⁻⁵ (right-hand side).

Discussion

The current study verifies VOC biomarkers for hypoglycemia by collecting and analyzing breath samples by SPME GC-MS using a previously established method4. Breath samples were collected from a diabetes youth camp in two different summers (Cohort 1 and Cohort 2), and 40 VOCs conserved in both sample cohorts were identified. Of the 40 molecular features, isopropanol was identified to be significantly upregulated by hypoglycemia in both cohorts (Fig. 1). This is interesting, as multiple independent studies have demonstrated that isopropanol in exhaled breath is a potential biomarker for diabetes30,39. Beyond isopropanol, the other 39 VOCs in general had similar expression patterns across Cohort 1 and Cohort 2 (heatmap in Fig. 2). After assessing individual VOCs, it was observed that no single analyte could distinguish hypoglycemia with the desired accuracy. Therefore, multivariate analysis was undertaken to observe global trends in the data and build a predictive classification model based on a small panel of VOCs. Biomarkers are typically reported in biosignatures or panels as they tend to have higher diagnostic accuracies relative to a single compound40 and can reflect dysregulations in multiple metabolic pathways altered by hypoglycemia. PCA was implemented on the 40 VOCs for both cohorts, and hypoglycemia could be separated with reasonable accuracy in Cohort 1. In Cohort 2, on the other hand, hypoglycemia could be distinguished with approximately 95% sensitivity and specificity (Fig. 3). The higher accuracy in Cohort 2 may be attributed to the fact that the VOCs with the most variation in Cohort 1 may not be the defining features of hypoglycemia. Furthermore, a slightly higher degree of variation within the VOC data was captured in Cohort 2.

Next, forward feature selection coupled to LDA was undertaken to build a predictive classification model using a small biosignature of VOCs. Model training was implemented in Cohort 1, and a biosignature of six VOCs was identified to distinguish hypoglycemia with AUC = 0.98 (sensitivity = 94.8%/specificity = 95.0%). Any supervised multivariate model is subject to being overfit, which is defined as using too many independent variables to model a desired outcome41,42, in this case the detection of hypoglycemia. The optimal method to assess model performance and detect overfit models is to blindly test the model on an independent sample cohort. Therefore, samples from Cohort 2 were used for testing, and hypoglycemic samples were classified with AUC = 0.93 (sensitivity = 90.0%/specificity = 89.9%). These results indicate that the LDA model is not overfit and can predict hypoglycemic events with relatively high verified accuracy in newly collected samples (Fig. 4). It was confirmed that these VOCs within the biomarker panel were not significantly present in environmental air samples collected from different locations at the camp. This makes sense because the use of unique cohorts from two different years reduced the likelihood that the identified VOCs would have an environmental source.

The LDA results were also analyzed to determine the ability of the VOC biomarker panel to specifically distinguish euglycemic from hyperglycemic samples. Although these analyses showed that this specific biomarker panel is not well situated for the detection of hyperglycemia, this is to be expected based on biological characteristics. For example, hypoglycemic events activate metabolic pathways to increase glucose availability, and these include increased lipolysis and glycogenolysis. Hyperglycemia, on the other hand, alters endogenous metabolic pathways to utilize or store glucose, and these include lipogenesis as well as glycolysis. Therefore, the VOC biomarkers for these two conditions are expected to be different. Nonetheless, analyses were expanded to utilize all samples to draw correlations between the VOC-based multivariate predictor scores and blood glucose levels (Fig. 5). Although these analyses reached statistical significance, these results may be biased since the VOC biomarker panel was pre-determined to be dysregulated by relatively low blood sugar levels (hypoglycemic events). Furthermore, a relatively low adjusted R² value (0.11) and high standard error (99.35) indicate that these specific VOCs are best situated for hypoglycemia detection and may not be ideal for distinguishing hyperglycemia. For reasons discussed below, key VOCs that could be used for a more direct correlation of blood glucose levels (including acetone and isoprene) could not be detected in this analysis. Therefore, the specific VOCs for hyperglycemia may not be captured in these results, and future studies are required to explore these specific VOCs. Developing tests for hyperglycemia detection and BG level measurement is just as important as developing those for hypoglycemia, as together they may be used to track how often patients’ BG levels are in-range (70–180 mg/dL), elevated (≥ 180 mg/dL) or low (< 70 mg/dL).

The supervised model of VOCs demonstrates high diagnostic sensitivity/specificity; thus, if a breath sample from outside the current group (but from the same demographic/population) were analyzed, the probability of an accurate test result would be approximately 90%. These accuracies are clinically relevant, especially in the context of current CGM device capability. For example, CGMs have been previously studied for their ability to distinguish hypoglycemic events (glucose meters were used as the ground truthing method in a similar fashion as the current study). Previous results have demonstrated that CGMs could distinguish hypoglycemia with sensitivity equal to 86% and specificity equal to 81.4%43. It should be noted that this study only analyzed the capability of Dexcom G5® CGMs and no other devices. A more recent systematic review revealed that the mean sensitivity of CGMs was equal to 69.3% [95% CI: 56.8–79.4] with specificity equal to 93.3% [95% CI: 88.2–96.3]. The overall conclusion of the authors of this previous study was that noninvasive monitoring systems including CGMs are not sufficiently accurate for detecting hypoglycemia44. Comparison of accuracy between exhaled breath VOCs in this study and CGMs in previous studies is only performed to demonstrate clinical relevance of the breath-based results. Despite limitations in blood glucose accuracy, CGMs in general are not only very prevalent today, but have also demonstrated their ability to improve quality of life for people with diabetes. Beyond CGMs, there are also emerging methodologies that sense glucose in noninvasive biofluids such as tears, sweat, or even saliva. For example, a previous study developed a sweat-based electrochemical impedance biosensor device to measure glucose levels. By analyzing features of the sensor response through machine learning, blood glucose levels could be correlated in patient sweat samples with R² = 0.94 (ref. 45).
Although the accuracy for hypoglycemia detection for these devices is still up for debate, these preliminary studies show the potential merit for clinical application. Taken as a whole, noninvasive monitoring of breath VOCs for the detection of hypoglycemia has relevant diagnostic accuracy for clinical impact. However, there are some factors that need to be considered before commercialization and/or clinical translation of breath-based assays. First and foremost, there is no current analytical or clinical gold standard for breath sampling/analysis. Additionally, to help gain acceptance of VOC biomarkers in the future, research needs to be done to explore their biological origin in the context of the medical condition being studied.

Regarding biomarker identification, isopropanol was analytically validated using pure standards, and the rest were tentatively elucidated using mass spectral library matching (NIST17) and a previously published GC column-specific nonpolar retention index calibration curve12. The other VOCs in the panel, beyond isopropanol, were tentatively identified as octyl acetate, dimethyl phthalate, 5-phenylundecane, 2,6-dimethylnonane, and an unidentified saturated hydrocarbon (no compound hits in NIST with a reasonable nonpolar retention index value). The fact that isopropanol was implicated as a potential biomarker aligns well with metabolic pathways that are dysregulated by hypoglycemia. Specifically, isopropanol has been previously correlated metabolically to acetone. Acetone and other ketone bodies are biologically generated through the degradation of fatty acids, and isopropanol is produced by the reversible reduction of acetone, catalyzed by alcohol dehydrogenase39. Fatty acid and lipid degradation are associated with oxidative stress, which is correlated with hypoglycemic events46. Because isopropanol is correlated to acetone metabolically, one would expect acetone to also be detected as a biomarker for hypoglycemia in breath, but the current study did not find this relationship. A previous study published by our group compared different breath sampling techniques, and even though the current method was generally the most sensitive, it had low ability to detect VOCs with relatively lower molecular weight (primarily acetone and isoprene)47.

With regard to comparing the biomarkers from the current results to our previous study on hypoglycemia4, no coordinated analysis could be conducted. Improvements through previous experience in sample collection and processing, along with the use of slightly different GC columns and MS detectors, made the identification of exact similarities in VOCs difficult or in some cases unachievable. Nonetheless, two of the previously reported VOCs were present in the current biosignature, and the functional groups significantly overlapped between the two studies. In general, the VOCs in the previous biomarker panel were detected within the larger set of 40 VOCs in this study. The current results show a slightly different and more accurate biomarker panel relative to the previous investigation, which may be explained by the use of a more sensitive mass analyzer/detector. In addition to our previous study, research by other groups31,36 has implicated different VOC biomarkers and panels that can differentiate hypoglycemia with high diagnostic accuracy. In comparison to these other studies, the reported biosignature shows similarities and differences. For example, because the breath sampling method is not best situated to detect smaller VOCs, biomarker data could not be collected for previously reported compounds including isoprene, acetone, and methanol. On the other hand, many different studies have implicated isopropanol as a biomarker for diabetes/hypoglycemia, and this is verified and reflected in the current study. It should be noted that the divergence in results between studies may be attributed to the differences in analytical methodology for sampling and detection. For instance, GC-ion mobility spectrometry is less sensitive compared to GC-MS and is not optimal for the ionization of saturated hydrocarbons with low molecular weight48,49.

Limitations in this study include the fact that samples from both cohorts were collected from the same location and with similar demographics. It is well known that breath-based profiles are unique to different individuals based on an array of different biological or external factors, and therefore inter-individual variability of VOCs may be problematic when translating the results from this study to larger cohorts. Furthermore, even though a large number of samples were collected in this study, these included up to three samples from each volunteer and thus the results may not be generalizable to a broader range of subjects. Although inter-subject variability did not compromise hypoglycemia detection in this study, this may not hold true when expanding to larger and more complex data sets. Therefore, to move beyond verification towards clinical validation, thousands of samples should be collected from more subjects in different geographic locations and with more diverse demographics in the future. In addition to collecting samples from a more heterogeneous cohort to define inter-individual variability, longitudinal studies can be undertaken to benchmark and compare intra-subject variability. Additionally, breath sampling standardization is important to produce results that can be cross-referenced by others50, and a limitation of this study is that the sampling method did not normalize exhaled breath collection by time, exhalation flow, or capnography. Lastly, it is a potential limitation that VOCs identified in this study may also be biomarkers for other conditions. Future research needs to be undertaken to show that this VOC-based biomarker panel is specific for hypoglycemia and will not signal other medical conditions. In summary, a major step toward clinical validation of exhaled VOC biomarkers for hypoglycemia has been taken through verifying candidate biomarkers in hundreds of samples.
Further steps required in the future include assay optimization/refinement and validation using thousands of samples from volunteers that mirror the overall population that the test is intended for, that is, youths with T1D at risk for hypoglycemia. After, VOC tests can be refined for commercialization and/or obtaining approvals from regulatory pathways, including those from the U.S. Food and Drug Administration.

One day, VOC results in this study can be leveraged to develop a lightweight and portable integrated array of nanosensors that can detect hypoglycemia with high sensitivity and specificity at a point of care. There has already been an array of different studies that implement portable sensor technologies for the detection of breath-based biomarkers related to diabetes51,52. These include wearable health monitoring devices, which can revolutionize the way hypoglycemia is detected and ultimately be used to increase glycemic control in persons with diabetes, especially children and young adults. One recent example of this is that Kapur et al. developed a breath-based system for detection of diabetes. The platform, termed DiabeticSense, is composed of metal oxide sensors which are already commercially available and facilitate electrochemical detection of VOCs. Through collecting and analyzing numerous samples from diabetic and non-diabetic patients, machine learning algorithms (Gradient Boosting, K-Nearest Neighbours, Elastic Net, LASSO, etc.) were used to distinguish diabetes with approximately 86.6% accuracy53. Even more recently, the same authors published another article that introduces another platform called GlucoBreath, which has achieved even higher diagnostic accuracies in relatively larger data sets54. In the future, studies can focus more on the ability of portable sensing devices to noninvasively track glucose levels, and sensor performance can be improved through tailoring customized sensors toward the specific VOC biomarkers identified in this work or other published studies. Beyond integrated nanosensor arrays, other portable instrumentation including GC-MS systems and ambient ionization MS instruments may also be adaptable for point of care VOC biomarker detection. Clinical applications can be expanded to detect both hypo- and hyperglycemic events, as well as to other medical conditions outside the realm of diabetes.

Materials and methods

Subject recruitment

Children and young adults aged 7–22 years who had been diagnosed with T1D were recruited at the American Diabetes Association (ADA) Camp John Warvel (North Webster, Indiana) over two summers (Cohort 1 and Cohort 2). All campers/parents provided consent/assent and were informed of the procedures for this study. Specifically, for all participants who were minors, informed consent was obtained from a parent or legal guardian. The first breath sample collection occurred at the time of consent under the supervision of the researchers to allow the subjects and families to observe and learn the breath collection process. Institutional Review Board (IRB # MOD00000311) approval was obtained from St. Vincent Health, and all experiments were conducted in accordance with the relevant guidelines and regulations.

Breath sample collection

All campers were supervised by medical personnel the entire time they were at camp, and their BG levels were checked periodically and whenever a camper felt they might be hypoglycemic. When a volunteer became hypoglycemic, a glucose meter was used to verify BG levels immediately prior to breath sample collection. Breath sampling was performed using a previously reported procedure shown to include increased concentrations of exhaled carbon dioxide4 and therefore presumably a reasonable fraction of end-tidal phase breath. The breath samples were collected in Tedlar bags (Restek, Bellefonte, PA, USA), where subjects blew into the inlet until the bag was approximately 80% full. Breath samples were collected when subjects were hypoglycemic (BG levels < 70 mg/dL), euglycemic/hyperglycemic after fasting (collected before breakfast), and euglycemic/hyperglycemic after resting (collected upon arrival at camp after travelling by automobile). Samples were also collected while subjects were exercising (swimming), but these were excluded because exercise can induce hypoglycemia55,56. Because exercise in this study was unregulated and BG levels were not measured after exercise, the sample label (hypoglycemic or non-hypoglycemic) was unknown.

Sample processing

Samples in Tedlar bags were cryothermally transferred over dry ice into 20 mL headspace vials containing deactivated glass wool (Restek, Bellefonte, PA, USA) using a previously published method4. Briefly, two stainless steel needles (Med-Vet International, Mettawa, IL, USA) were used to pierce the septum of a headspace vial cap. One needle was connected to the Tedlar bag, and the other was interfaced with a mass flow controller (Alicat Scientific, Tucson, AZ, USA) connected to a vacuum line to transport the breath from the bag to the vial at a slow, constant flow rate. VOCs adsorb onto the glass wool and the walls of the glass vial during the cryothermal transfer. Vials were labeled with the respective de-identified subject number, sealed with Parafilm (Fisher Scientific, Florence, KY, USA), stored in a Thermo Scientific Forma 900 series −80 °C freezer (Waltham, MA, USA), and subsequently analyzed by SPME GC-MS.

SPME and GC-MS analysis

Cryothermally transferred breath samples were analyzed using an Agilent (Santa Clara, CA, USA) 7890A GC coupled to an Agilent 7200 Accurate-Mass Quadrupole Time-of-Flight MS system (GC-MS QTOF) with a front-end PAL autosampling system (CTC Analytics, Raleigh, NC, USA) and a 1 cm polydimethylsiloxane/Carboxen/divinylbenzene (PDMS/CAR/DVB) SPME fiber (Supelco, Bellefonte, PA, USA). The GC column was an Agilent HP-5ms, 5% phenylmethyl siloxane capillary column with a length of 30 m, an internal diameter of 250 μm, and a film thickness of 0.25 μm. Sample VOCs were extracted onto the SPME fiber for 45 min at 60 °C (with sample agitation at 250 RPM). The VOCs were then thermally desorbed into the GC-MS QTOF system (27 min run with alternating oven ramps of 8 °C/min and 15 °C/min). The MS was operated in full scan mode, scanning m/z values ranging from 26 to 400. Data were collected in centroid format using Agilent Mass Hunter Data Acquisition software (Version B.07.06.2704; https://www.agilent.com/en/promotions/masshunter-mass-spec). Method reproducibility and instrument performance were verified daily by analyzing external reference standards.

Data processing and screening

Similar data screening and chemometric analysis procedures have been previously reported by the authors in several VOC biomarker discovery studies21,57. For this data set, GC-MS chromatograms from Cohort 1 and Cohort 2 were spectrally aligned using Agilent Mass Hunter Quantitative Profinder (Version B.08.00 Service Pack 3). Sample cohorts were aligned separately, as Profinder cannot process them when batched together. Spectral alignment parameters included a RT tolerance of ± 0.00% + 0.40 min and a minimum dot product between mass spectral vectors of 0.40. A matrix containing m/z base peaks, RTs, and integrated signals for each VOC and corresponding sample was generated. Cohort 1 and Cohort 2 were initially analyzed independently, including all data processing procedures. In both sample cohorts, VOCs that were not present in at least 50% of the samples of at least one sample class (hypoglycemic or non-hypoglycemic) were removed from the data matrix. This yielded 481 VOCs in Cohort 1 and 518 VOCs in Cohort 2. The dimensionality of the data was further reduced by removing duplicated features, known exogenous compounds from the background samples, and VOCs originating from or altered by the collection materials (including silane/siloxane artifacts of SPME GC-MS). These data screening procedures yielded 95 VOCs that were detected in both data sets with identical base m/z and similar RT. These VOCs underwent further screening to ensure a relevant data set for hypoglycemia biomarker detection. VOCs that did not show the same regulation (up or down in hypoglycemia) or identical mass spectra in both cohorts were removed from the matrix, leading to the final set of 40 VOCs. To normalize the data, the VOC signals were z-scored (autoscaled). Sample outliers (regarding the number of VOCs and total integrated signal) were removed from both data sets.
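The class-wise presence filter and the autoscaling step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `presence_filter` and `autoscale`, the toy matrix, the encoding of an undetected VOC as a signal of 0, and the use of the sample standard deviation (`ddof=1`) are all assumptions made for illustration.

```python
import numpy as np

def presence_filter(X, labels, min_fraction=0.5):
    """Boolean mask of features detected in at least `min_fraction` of the
    samples of at least one class. Assumes 0 encodes 'not detected'."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    keep = np.zeros(X.shape[1], dtype=bool)
    for cls in np.unique(labels):
        detected = X[labels == cls] > 0           # True where the VOC was found
        keep |= detected.mean(axis=0) >= min_fraction
    return keep

def autoscale(X):
    """Z-score each feature column (mean 0, sample standard deviation 1)."""
    X = np.asarray(X, dtype=float)
    sd = X.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0                             # guard against constant columns
    return (X - X.mean(axis=0)) / sd

# Hypothetical toy matrix: 6 samples (rows) x 3 VOC features (columns)
X = np.array([[5.0, 0.0, 2.0],
              [7.0, 0.0, 0.0],
              [6.0, 0.0, 4.0],
              [8.0, 3.0, 0.0],
              [9.0, 0.0, 0.0],
              [4.0, 0.0, 0.0]])
labels = np.array(["hypo", "hypo", "hypo", "non", "non", "non"])

keep = presence_filter(X, labels)   # second VOC is found in only 1 of 6 samples
Z = autoscale(X[:, keep])           # autoscaled matrix of the retained VOCs
```

In this toy example the second VOC is dropped because it appears in only one of three non-hypoglycemic samples and in no hypoglycemic sample, while the third VOC is retained because it appears in two of three hypoglycemic samples.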
Outliers were defined as data points falling more than 1.5 times the interquartile range below the first quartile or above the third quartile. Any breath sample in either cohort whose value for either of the two metrics (number of VOCs or total integrated signal prior to the data screening procedures) fell outside these limits was removed prior to any biostatistical analyses. This criterion mainly served to remove samples with a relatively low number of VOCs or low total signal in a relatively unbiased fashion.
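The 1.5 × interquartile range rule can be sketched in a few lines of Python. This is an illustrative example only: the helper name `iqr_outlier_mask`, the per-sample values, and the quartile convention (NumPy's default linear interpolation) are assumptions, since the quartile method used in the study is not stated.

```python
import numpy as np

def iqr_outlier_mask(values, k=1.5):
    """True for values more than k * IQR below Q1 or above Q3."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])   # linear interpolation (assumed)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Hypothetical per-sample metrics computed before feature screening
n_vocs = np.array([410, 395, 420, 405, 150, 415, 400])
total_signal = np.array([9.1e6, 8.7e6, 9.4e6, 9.0e6, 8.9e6, 9.2e6, 2.5e7])

# A sample is flagged if it is an outlier on either metric
flagged = iqr_outlier_mask(n_vocs) | iqr_outlier_mask(total_signal)
kept_samples = np.flatnonzero(~flagged)        # indices of retained samples
```

Here the fifth sample is flagged for its low VOC count and the seventh for its high total signal, so five of the seven hypothetical samples are retained.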

Statistical analyses

After the removal of sample outliers, individual VOCs conserved in both sample cohorts were surveyed for their ability to discriminate hypoglycemic events as single biomarkers. Hierarchical heatmaps were created to visualize VOC signals in each of the cohorts. The hierarchical tree was generated using Euclidean distance and average linkage. PCA was used as an unsupervised multivariate statistical analysis method and was applied to VOCs from both sample cohorts to observe natural patterns within the data. LDA coupled to forward feature selection4 was employed for training purposes in Cohort 1 to identify a panel of VOC biomarkers with high discriminatory power. This process starts by identifying the linear combination of three VOCs that distinguishes hypoglycemia with the highest accuracy. VOCs are then added iteratively to identify a stable model with high diagnostic accuracy. The procedure continues until the cross-validated result starts to deteriorate, which is a sign that the model is overfit and has used too many molecular features for classification. Cohort 1 was used as a training set to identify the optimal VOCs, and Cohort 2 was used as a testing set to blindly and independently verify the VOC biosignature. In addition to LDA, alternative machine learning models were used to further validate the VOC biomarker panel through functional perturbation. Additional models used for training and testing in Cohort 1 and Cohort 2 included linear SVM, weighted KNN, and logistic regression. All biomarker panel results are reported as AUC, sensitivity, and specificity. The LDA model was also used to visualize euglycemic and hyperglycemic (BG ≥ 180 mg/dL) samples. LDA scores from the first three dimensions were also analyzed using multiple linear regression. Here, an Ordinary Least Squares regression analysis was completed on the multivariate scores corresponding to the three linear discriminant axes.
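As an illustration of the forward feature selection procedure described above for LDA, the following Python sketch starts from the best three-feature combination and adds features one at a time. It is a simplified stand-in, not the authors' implementation: the two-class Fisher LDA, the leave-one-out cross-validation, the use of accuracy as the score, the small ridge term, and the synthetic data are all assumptions. The loop also stops as soon as the cross-validated score fails to improve, which is a slightly stricter rule than waiting for it to deteriorate.

```python
import numpy as np
from itertools import combinations

def lda_fit(X, y):
    """Two-class Fisher LDA: returns weight vector w and decision threshold c."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance plus a small ridge (assumed) for stability
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    Sw = Sw / (len(X) - 2) + 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, m1 - m0)
    c = w @ (m0 + m1) / 2.0
    return w, c

def loo_accuracy(X, y):
    """Leave-one-out cross-validated accuracy of the LDA classifier."""
    hits = 0
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        w, c = lda_fit(X[train], y[train])
        hits += int((X[i] @ w > c) == bool(y[i]))
    return hits / len(y)

def forward_select(X, y, start_size=3):
    """Pick the best starting triple, then add features while CV improves."""
    n_feat = X.shape[1]
    best = max(combinations(range(n_feat), start_size),
               key=lambda s: loo_accuracy(X[:, list(s)], y))
    selected = list(best)
    score = loo_accuracy(X[:, selected], y)
    while len(selected) < n_feat:
        rest = [j for j in range(n_feat) if j not in selected]
        cand = max(rest, key=lambda j: loo_accuracy(X[:, selected + [j]], y))
        cand_score = loo_accuracy(X[:, selected + [cand]], y)
        if cand_score <= score:        # no improvement: stop adding features
            break
        selected.append(cand)
        score = cand_score
    return selected, score

# Synthetic data: 30 samples, 6 features; only features 0-2 differ by class
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 15)
X = rng.normal(size=(30, 6))
X[y == 1, :3] += 6.0
selected, score = forward_select(X, y)
```

In practice the selected panel would then be frozen and evaluated once on an independent cohort, as was done with Cohort 2 in this study.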
The explanatory variables were the LDA scores (independent), and the response variable was BG level (dependent). Analysis of variance (ANOVA) was implemented to determine if there was a significant linear correlation between the independent and the dependent variables. To evaluate the strength of the correlation, the team utilized determination coefficients (R² values), regression coefficients, and standard errors.
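The regression of BG level on LDA scores described above can be illustrated with an ordinary least squares sketch in Python. The function name `ols_summary` and the simulated scores and BG values are hypothetical, and the sketch reports only the quantities named in the text (regression coefficients, standard errors, and R²) together with the ANOVA F statistic; converting F to a p-value would additionally require an F distribution, which is omitted here.

```python
import numpy as np

def ols_summary(scores, bg):
    """OLS fit of bg on the score columns, with an intercept term."""
    scores = np.asarray(scores, dtype=float)
    bg = np.asarray(bg, dtype=float)
    n, p = scores.shape
    A = np.column_stack([np.ones(n), scores])          # design matrix
    beta, *_ = np.linalg.lstsq(A, bg, rcond=None)      # regression coefficients
    resid = bg - A @ beta
    ss_res = float(resid @ resid)                      # residual sum of squares
    ss_tot = float(((bg - bg.mean()) ** 2).sum())      # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                         # determination coefficient
    sigma2 = ss_res / (n - p - 1)                      # residual variance
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))  # standard errors
    f_stat = ((ss_tot - ss_res) / p) / sigma2          # ANOVA F statistic
    return {"coef": beta, "se": se, "r2": r2, "f": f_stat}

# Simulated example: 40 samples, 3 discriminant scores, BG in mg/dL
rng = np.random.default_rng(1)
scores = rng.normal(size=(40, 3))
bg = 120.0 - 25.0 * scores[:, 0] + 5.0 * scores[:, 1] + rng.normal(scale=10.0, size=40)
result = ols_summary(scores, bg)
```

The first element of `result["coef"]` is the intercept, followed by one coefficient per discriminant axis, with the matching standard errors in `result["se"]`.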