Introduction

There exist several biomedical applications for which a contactless sensor for real-time monitoring of metabolites would be invaluable1,2,3,4,5,6,7,8. These applications range from cell cultivation inside bioreactors for therapeutics manufacture1,4,6,7,8 to minimally invasive disease diagnosis2,3. For example, The association of fructose with Non-Alcoholic Fatty Liver Disease (NAFLD) is well established epidemiologically9,10,11,12, yet technologies available to measure fructolysis remain limited. The most common methods to study cellular metabolism are mass spectrometry (MS) and nuclear magnetic resonance (NMR)13,14. These methods can be destructive, bulky, and expensive and do not allow repeated measurements on the same sample. The development of inline optical sensors coupled with machine learning analysis of signals to yield metabolite concentrations within bioreactors is prevalent in the literature. Examples include NIR probes6,8,15,16,17,18,19,20, Raman probes4,16,21,22,23,24,25 and even NMR spectrometers26,27. However, these probes are prone to noise, fouling since they must be submerged within the cell sample, and sensor drift28.

As an alternative, the use of a short wave infrared hyperspectral camera (SWIR - HSI) allows for fully contactless collection of thousands of absorption spectra using a single HSI image. This wealth of signals can then be used to efficiently denoise sample spectra28. It can also train interpretable deep learning models using a small dataset of ground truth metabolite samples28. These properties enable instantaneous troubleshooting of model predictions or automatic filtering of drifting readouts. This HSI sensor setup can also produce sample measurements using less than 500 uL of liquid per sample measurement28. A truly multiplex sensor must be capable of distinguishing and quantifying similar metabolites since several biological metabolites, such as fructose and glucose, share similar chemical structures. Most studies in the literature attempt to quantify distinct metabolite relevant to bioprocessing using probes and traditional chemometric models like partial least square (PLS)4,6,7. These models tightly couple dimensionality reduction with regression analysis, so it is difficult to determine which features are both noise free and relevant to metabolite prediction29. Thus, an alternative example of interpretable linear models is explored called L-SLR. The performance of this model is compared with both PLS and well known unmixing model Non-negative Matrix Factorization (NMF).

This paper addresses the following objectives (Fig. 1): (1) enable contactless, label free rapid quantification of fructose and other cell growth parameters using a short-wave infrared (SWIR) hyperspectral camera; (2) develop and demonstrate models that easily adapts to varying cell media environments, and (3) develop model that allows practitioner to troubleshoot model performance based on sparse weight matrix maps. This leads to the development of small but interpretable models that can be extended to quantify any number of metabolites important to human health and cell growth alike.

Fig. 1
figure 1

An overview of the contactless, label free rapid quantification system using a short-wave infrared (SWIR) hyperspectral camera. (A) Collection of dataset of hyperspectral images and corresponding labels. (B) Collection of hyperspectral images in SWIR range for (C) training an encoder and sparse linear regression model for metabolite prediction.

Materials and methods

Materials

Spent Cell media samples were collected and imaged as described by Hevaganinge et al.28. In order to collect the fructose datasets, various weights of Fructose powder (Sigma F0127) were dissolved in solutions of Glucose (Sigma 150023-021) and deionized water to produce calibration datasets for fructose. Fructose was also dissolved in Fetal Bovine Serum (FBS) (Gibco 2309286) spiked with Glucose to generate further calibration curves. Sugar samples were weighed using a precise scale (Mettler-Toledo, ME204). Calibration curves were collected for fructose at varying glucose background levels, as shown in Table 1. The ID indicates the dataset number, while G denotes background glucose concentration in mg/dL,

\(\:{F}_{min\:},\:{F}_{max\:}\) denote fructose concentration range in mg/dL,\(\:\:{\varDelta\:}_{F}\) denotes fructose concentration increment in mg/dL, and N is the number of samples collected in each dataset. T represents test data, and V represents validation data in the Fold column of Table 1. Lines demarcate combinations of data used to evaluate each model. Because fructose dataset was collected with varying background levels of glucose in both water and FBS (Table 1), it was possible to isolate a single glucose level from water and FBS environments respectively for k-fold cross validation and optimization of model components, while the remaining data was divided into k folds and treated as test data.

Fructose model training and analysis

Because fructose dataset was collected with varying background levels of glucose in both water and FBS (Table 1), it was possible to isolate a single glucose level from water and FBS environments respectively for k-fold cross validation and optimization of model components, while the remaining data were treated as test data. Both validation and test data from each respective glucose background were split into k-folds consisting of 2 samples to obtain final metrics. 4 separate model sets were trained by partitioning the data as shown in Table 1, with model validation datasets bolded and all remaining datasets used as test data. Datasets used as validation data are bolded.

Fructose dataset collection

To create the water solutions, glucose and fructose were weighed using a balance (Mettler-Toledo, ME204) and placed in 15mL conical tubes. Next, the appropriate volume of water was added to reach the desired concentration. The tubes were then thoroughly mixed. Serum solutions were created with heat-inactivated Fetal Bovine Serum (FBS) (Gibco, Lot 2309286) using the same procedure as for water solutions. Since glucose is present in horse serum at unknown concentrations, the resulting solutions contained concentrations of glucose augmented by both fructose and additional glucose. Sample order was randomized before image acquisition. 500 uL of each sample was loaded into a quartz cuvette (Hellma 110-1-40) and imaged using Hyperspectral sensor (Headwall Photonics, Bolton, Massachusetts). The spatial resolution of this system was approximately 0.5 mm, which allowed collection of hundreds of transmittance readings from each cuvette photo. Each image pixel contains a transmittance spectrum which is collected in the range from 900 to 2500 nm in 8 nm increments. The quartz cuvette was washed with a single wash of water and two washes of solution of interest prior to image collection.

Table 1 Summary of experimental conditions used to collect fructose datasets.

Cell data model training and analysis

As detailed previously28, cell permeate dataset was collected and quantified from eight different cell culture flasks, such that approximately 30 samples were collected from each flask prior to data cleaning and filtering. Further, ground truth labels for cell media samples were kept unfiltered. Only data with cell viability greater than 80% and lactate levels less than 300 mg/dL were used for model training from spent cell media dataset, resulting in approximately 20 samples per flask. The first flask was chosen as the validation flask and used to determine minimum number of flasks that could be used for model training while maintaining reasonable metabolite quantification results, as well as optimal number of components for all models. Of the remaining flasks, 4 were randomly chosen as test flasks, and remaining flasks were used as training data. 20 folds of flask combinations were generated for evaluation. To minimize model extrapolation, only data with cell viability greater than 80% was used as test data for models trained using spent cell media dataset.

Data pre-processing

Hyperspectral images were processed as described by Hevaganinge et al.28 for all datasets. First, min-max normalization of each transmission signal was conducted against respective dark and white background reference hyperspectral images that are taken at the start of each experiment. Next, transmittance spectra are converted to polynomial basis30 based on Eq. (1), where \(\:\phi \:\: \in \:\mathbb{R}^{{NxP}}\), with N samples and P terms, represents set of linear and 2nd order cross terms and \(\:{x}_{i},\:{x}_{j}\) represent transmittance values from ith and jth wavelength respectively. Also, B is the total number of wavelengths. All terms undergo min-max normalization based on extreme features of respective training dataset.

$$\:\phi\:={x}_{1}+{x}_{2}+\:{x}_{i}+\dots\:+\:{x}_{B}+\:{x}_{1}{x}_{2}+\dots\:+{x}_{i}{x}_{j}+\dots\:+{x}_{B-1}{x}_{B}$$
(1)

Data analysis and model training

Three models were explored for contactless quantification of various cell parameters. The respective validation dataset was used for model hyperparameter tuning, and validation data was omitted from test data evaluation as needed. Scikit-Learn implementation of Partial Least Squares with 10 components29,30 and polynomial features (\(\:\phi\:)\) was used as a baseline model to compare against the performance of two source separation models. The first alternative model used a blind source separation model, specifically Scikit-Learn implementation of Non-negative Matrix Factorization (NMF)30,31 as the encoder which takes \(\:\phi\:\) as input and returns compressed signal with 10 components for subsequent linear regression. Nonnegative Double Singular Value Decomposition (NNDSVD) was used as the initialization method, and the solver was Multiplicative Update Solver. The architecture of the proposed L-SLR model is shown in Fig. 2. First, the transmittance signal, \(\:T\: \in\:{\mathbb{R}}^{NxB}\:\), with N samples and B wavelengths, was converted to a polynomial basis (2 A). Concurrently, the matrix of metabolite concentrations, \(\:Y\: \in\:{\mathbb{R}}^{NxM}\:\) from the training dataset with M metabolites was converted into a single integer index via Hilbert transformation32 (2B). Specifically, each metabolite label was divided by maximum concentration in training dataset, multiplied by 100 and rounded to the nearest integer. Each resulting number was treated as an index in M degree space which can be spanned using a Hilbert curve using the Hilbert curve package in python33. These integer labels were used to pretrain an Scikit-Learn implementation of Orthogonal Matching Pursuit sparse regression model (OMP)30,34 (2 C) which output a sparse feature mask that was used to mask out noisy terms in original feature space that were not relevant to metabolite prediction, as shown in Eq. (2), where \(\:{H}_{i}^{}\) is the sparse cross terms feature vector of sample i from \(\:\phi\:\) after masking, and s is the sparse feature mask.

$$\:{H}_{i}=\:{\phi\:}_{i}\left[s\right]$$
(2)

Next, these integer labels and corresponding masked features were used to pre-train a Linear Discriminant Analysis (LDA) encoder35 (2E). The LDA model forms a compressed latent space which helps to separate very redundant metabolite spectra from a variety of metabolite environments. Test signals were masked using the OMP method (2 C) then dimensionality reduction was done by pretrained LDA according to Eq. (3), where Z is the compressed signal, \(\:{H}_{i}\: \in\:{\mathbb{R}}^{Nx{P}^{+}},\:\)with N samples and \(\:{P}^{+}\:\)terms, is the masked cross terms signal, and W is the basis matrix of n = 10 highest eigenvectors obtained via optimization of LDA cost function (2E).

$$\text{Z}\,=\,\text{HW}$$
(3)

After, simple linear regression30 was used to quantify metabolite levels using compressed signal from LDA model, according to Eq. (4) (2 F) where \(\:\widehat{y}\) is the predicted metabolite reading, \(\:\varTheta\:\) is the sparse weights, and \(\:z\) is the LDA latent space signal. All ground truth labels are normalized by corresponding maximum training label prior to fitting linear regression model.

$$\:\widehat{y}=\:\varTheta\:z$$
(4)

Finally, the sparse weight visualizations were formed by taking the weighted average of LDA weights (W) based on linear regression weights (\(\:\varTheta\:\)) before replacing nonzero elements of sparse feature mask \(\:{s}_{i}\) with the corresponding weight value. All terms used to construct weight map are highlighted in red (Fig. 2). These steps are summarized in Eq. (5), where \(\:{s}_{p}\) is the pth term in the sparse feature mask, \(\:{w}_{p}\) is the pth column of LDA eigenvector matrix, and \(\:\varTheta\:\) is the weight matrix used for metabolite quantification.

$$\:{s}_{p}=\:\frac{1}{n}\:{w}_{p}\varTheta\:$$
(5)
Fig. 2
figure 2

Architecture of L-SLR. First, absorbance signal is converted to a polynomial basis (A), then masked based on weights from pretrained sparse linear regression model (C). Next, masked signal is passed into pretrained LDA model (E). LDA forms compressed latent space which is used to generate metabolite predictions via linear regression (F). Coefficients from LDA and SLR are then used to form weight matrix visuals (G). Models used in (C) and (E) are pretrained using a labeled dataset consisting of integer labels computed via Hilbert curve (B) that represent unique metabolite combinations.

L-SLR weight matrix map generation

Because L-SLR is a fully linear model, the weight matrix may be visualized and analyzed similar to an attention map. The weight matrix not only reveals which features are excluded from metabolite prediction36, but also reveals which features are combined to generate final metabolite predictions. Therefore, the cumulative sum of all weight matrices generated across folds for each dataset was used as an approximation of the stable weight matrix.

Model evaluation and statistics

Predicted cell parameter concentrations were compared to ground truth readings via r2, root mean squared error (RMSE), mean absolute error (MAE), and relative mean absolute percentage error (MAPE) metrics. All metrics were calculated using Scikit-Learn implementations30. MAE, and RMSE are reported in units of mg/dL, while the relative MAPE decimal is reported in all tables. Number of latent space components used for each model is listed as n, with 10 components used for all models.

Results and discussion

Three linear models were evaluated using a series of datasets collected in house using labeled HSI images of not only fructose solutions but also spent cell media from eight CHO cell culture flasks. Tables 2, 3 and 4 summarize validation and test performance when fructose samples in various glucose backgrounds are partitioned into k-folds each containing 2 unique fructose concentrations as test samples. In dataset number 8, 5 unique fructose concentrations are sampled for each test fold since it contains a higher number of unique fructose concentrations compared to the other fructose datasets collected. The dataset number corresponds to data which was collected under conditions listed in Table 1.

Table 2 Quantification of large fructose concentrations in horse serum and glucose solutions.
Table 3 Quantification of large fructose concentrations in glucose solutions.
Table 4 Quantification of small fructose concentrations in glucose solutions and horse serum.

Figures 3, 4 visualize PLS-R and L-SLR correlation to fructose ground truth for test datasets in all datasets of interest for a variety of fructose solution in glucose solutions (Fig. 3) and Horse Serum solutions (Fig. 4). r2 and MAE are listed in the top left corner of each graph. All measurements are reported in units of mg/dL.

Fig. 3
figure 3

Correlation Visualization between ground truth fructose and fructose predicted by PLS-R (A, C) and L-SLR (B, D) for test data collected in dataset conditions 2 and 4 (A, B) and conditions 5 and 7 (C, D). r2 and MAE are listed in the top left corner of each graph.

Fig. 4
figure 4

Correlation Visualization between ground truth fructose and fructose predicted by PLS-R (A, C) and L-SLR (B, D) for test data collected in dataset conditions 9 and 10 (A, B) and condition 8 (C, D). r2 and MAE are listed in the top left corner of each graph.

Across all folds, PLS-R exhibits better test performance when compared to L-SLR, with better scoring performance across all metrics assessed. Nevertheless, the accuracies of both L-SLR and PLS fall within the range of usability for biological studies of fructose2,12 and bioprocess monitoring16,23. The performance gap between PLS-R and L-SLR widens as the fructose concentration decreases, suggesting that LDA has trouble defining discrete clusters for small fructose samples, especially in the presence of relatively high glucose backgrounds. It is also clear that folds sampled at the extreme concentrations tend to exhibit higher levels of drift compared to concentrations in the center of the fructose concentration data distribution. Perhaps more concentrations beyond the extremes of interest need to be sampled and placed as fixed training data points across all folds. In Fig. 4C and D, there are a few outlier points that have high test error for both PLS-R and L-SLR, while most of the other datapoints are quantified with relatively smaller error. The misclassification of these points may stem from the fact that background glucose concentration is more than twice the fructose concentrations imaged, so signal differences caused by fructose concentration changes are small, subtle, and difficult to differentiate properly. Further, the performance of unmixing regression model based on nonlinear matrix factorization (NMF) is shown in Tables 2, 3 and 4, since this is a well established model in the literature for analysis of spectroscopic signals37,38. However, the performance of the NMF regression is worse when compared to PLS and L-SLR, most likely since NMF is an iterative model whose performance is dependent on initialization31. Next, to evaluate model performance on cell culture flask dataset quantification, all models were trained and evaluated on 20 folds of 4 randomly sampled flasks for training data, while using remaining cell culture flasks as test data. The validation flask was omitted from test data evaluation as needed. Tables 5, 6 and 7 summarize average viable cell density (VCD), glucose and lactate quantification performance across 20 cross validation folds when four cell culture flasks (50% of dataset ) are randomly reserved for model training. This number was determined as the minimum number of flasks required to achieve comparable validation performance as using 75% (6 randomly sampled flasks) as training data and flask 1 as validation data. The use of 50% of the dataset rather than 75% of the datasets highlights the powerful generalizability and training efficiency of linear models.

Table 5 Quantification of viable cell density (VCD) in spent cell media.
Table 6 Quantification of glucose in spent cell media.
Table 7 Quantification of lactate in spent cell media.

Figure 5, 6, 7 visualize PLS-R and L-SLR correlation to VCD, glucose and lactate ground truth labels respectively for randomly selected test dataset folds. r2 and MAE are listed in the top left corner of each graph. All measurements are reported in units of mg/dL.

Fig. 5
figure 5

Correlation visualization between ground truth VCD and VCD predicted by PLS-R (A) and L-SLR (B) respectively in spent cell media are shown for randomly selected test fold. r2 and MAE are listed in the top left corner of each graph.

Fig. 6
figure 6

Correlation visualization between ground truth glucose and glucose predicted by PLS-R (A) and L-SLR (B) respectively in spent cell media are shown for randomly selected test fold. r2 and MAE are listed in the top left corner of each graph.

Fig. 7
figure 7

Correlation visualization between ground truth lactate and lactate predicted by PLS-R (A) and L-SLR (B) respectively in spent cell media are shown for randomly selected test fold. r2 and MAE are listed in the top left corner of each graph.

Across all folds, L-SLR exhibits similar performance to PLS-R, although PLS-R tends to display higher performance across all metrics evaluated on the test datasets. Both models achieve best test performance with lactate, test r2 values above 0.9, and MAE less than 20 mg/dL. This suggests that signals correlated to lactate are robust and bi-linear. Performance metrics are lowered slightly when evaluating glucose quantification, but r2 values are still close to 0.9. After analyzing Fig. 6, it is clear that a few cell permeate images exhibit high glucose quantification error, while most of the points have relatively low error. It is important to note that SWIR spectra mainly reflect the absorbance of water, and the subtle shifts of the water peak are used as a “molecular mirror” of substrates dissolved in the hydrogen bonding matrix of water39. Thus, signal shifts caused by glucose are not always linear, or even bi-linear40. This may explain the spurious misclassification of certain glucose images by both models. Further, PLS-R exhibits significant overfitting when compared to L-SLR performance for glucose and lactate. This is highlighted by the relatively large reduction in all metrics between training and testing performance for PLS-R when compared to L-SLR. Finally, both models achieve r2 values less than 0.9 for VCD. Since cell free permeate is imaged, perhaps hyperspectral images lack concrete signals that correlate to VCD. Most likely, the model is forced to correlate signals belonging to other cell parameters with VCD instead. It was hoped that linear unmixing methods like NMF with the help of 2nd degree cross term features could help uncover interpretable signals that directly correlate to VCD. In practice, NMF regression models display the worst performance of the models evaluated, most likely since NMF is an iterative model whose performance is dependent on initialization. Of course, L-SLR might be thought of as a unmixing model as well, since it does factor the input matrix X into eigenvectors and eigen values that best explain the variance correlated to ground truth label separation35. In the future, perhaps the Fischer’s Discriminant ratio can be modified so that chosen components group similar environment samples close together.

L-SLR interpretability

Unlike Mid-Infrared (MIR) absorption, SWIR absorption reflects weak signals caused by overtones and combination vibrations39,41,42,43. These signals are difficult to simulate and interpret, but linear weight matrices help the practitioner correlate high weights with known functional group harmonics in the literature41. Linear models afford weight matrices which directly show both which features are omitted and which features are used as well as how these features contribute to the final metabolite predictions. In contrast, attention maps formed to summarize complex non linear models only afford a glimpse of which features are omitted for model prediction36. Summary visualizations of all weight matrices produced across all folds evaluated are shown in Figs. 8 and 9.

Fig. 8
figure 8

Cumulative sum weight matrices produced by L-SLR (AD) and PLS-R (EH) for bands and cross terms across all models evaluated for fructose quantification via datasets 9–11 (A, E), dataset 8 (B, F), datasets 2–4 (C, G) and datasets 5–7 (D, H). All weights are normalized by the respective maximum of each cumulative matrix to facilitate comparison of weight matrices across models and datasets alike.

It is well known that hydrogen bonding within the solution matrix impacts SWIR absorption spectra41. This phenomenon is reflected in the distinct differences between weight matrices for models trained to identify fructose within glucose solutions vs. horse serum solutions spiked with glucose (Fig. 8A–D). For example, the model trained using glucose solutions pays more “attention” to absorption signals below 1400 nm, whereas the model trained using horse serum pays more “attention” to absorption signals between 1600 and 1800 nm. The former region is dominated by overtones related to O-H stretch, whereas the latter region is dominated by C-H stretch vibrations41. This implies that the correlation between fructose concentration and SWIR absorption can be related to both changes in the hydrogen bonding matrix of water and interactions of fructose with organic compounds dissolved in the horse serum. Further only cross terms in region between bands less than 1400 and bands from 1600 to 1900 or region between 1600 and 1800 are heavily used for all models other than those used to quantify small quantities of fructose in FBS (Fig. 8B). According to Fig. 4, the corresponding model displays the largest extremes in model prediction error compared to the other models. Perhaps this distribution of weights suggests that the model was not able to clearly distinguish linear or bi-linear cross term signals that correlate with fructose concentrations, which leads to overfitting to noise in the training dataset instead. The misclassification of these points may stem from the fact that background glucose concentration is very high compared to the incremental fructose concentration changes in this dataset, so feature changes are small, subtle, and difficult to differentiate properly. A similar analysis is less clear in the PLS-R weight matrices, since non-zero weights are much less sparse compared to equivalent L-SLR model. However, points of distinction are highlighted in Fig. 8E–H.

Fig. 9
figure 9

Cumulative sum weight matrices produced by L-SLR (AC) and PLS-R (DF) for bands and cross terms across all models evaluated for quantification of VCD (A, D), glucose (B, E), and lactate (C, F) in spent cell media. All weights are normalized by the respective maximum of each cumulative matrix to facilitate comparison of weight matrices across models and datasets alike.

Although three distinct cell culture parameters were predicted using the cell culture flask dataset, the weight profile of all three models is quite similar (Fig. 9A–C). In fact, the weight matrices formed by PLS-R (Fig. 9D–F) are virtually indistinguishable, so weight matrices formed by L-SLR will be analyzed for the remainder of the discussion. Notably, both the VCD model and the lactate model share similar regions of high weight, implying that changes in lactate concentration correlate with VCD changes. While both VCD and lactate models share a high weight band around 2200 nm, the glucose model does not weigh this band region quite as heavily. This region corresponds to overtone stretches caused by carboxylic acid41, which is the main functional group on lactic acid. Still, all three models pay “attention” to absorption signals below 1400, implying that changes to the hydrogen bonding matrix of water correlate with changes in all three cell parameters investigated. Still, SWIR spectra exhibit broad, overlapping peaks with non-linear relationships to dissolved molecules. Similar absorption spectra may correspond to very different solution matrix profiles, so the model confuses distinct cell parameter predictions. Because LDA is already designed to maximize separation of data points based on class35, it was assumed that LDA would be able to treat different sets of metabolites as distinct classes and find an appropriate lower dimensional latent space that facilitates metabolite quantification. In the future, the cost function of LDA should be modified to distance points from different environments based on environment similarity. This smooth latent space would also allow for identification and removal of ambiguous absorption spectra which overlap multiple metabolic profiles. In contrast, it is difficult to imagine how PLS-R latent space may be modified to ensure continuous latent space, since dimensionality reduction and regression are highly coupled29. Across all datasets tested, the NMF model performed the worst, most likely due to dependence on weight initialization. Unlike PLS-R and L-SLR, NMF is a blind source separation model, so it cannot take advantage of the supervised labeled dataset available. It is reasonable and arguably essential in the bioprocessing and therapeutics realm for labeled datasets to be collected prior to model training for metabolite quantification. Based on remarks in the literature, an interpretable chemometrics model in the bioprocessing and therapeutics realm should (1) produce reduced dimensionality latent space that separates spectra based on corresponding metabolite environment, (2) easily adapts to varying cell media environments, and (3) allows practitioner to troubleshoot model performance based on sparse weight matrix maps and the prediction of confidence intervals28,36. So far, L-SLR has produced a reduced dimensionality latent space coupled with sparse weight matrices that highlight which bands and cross terms are used for final prediction. Both L-SLR and PLS demonstrate reasonable metabolite quantification accuracy, and the ability to generalize to test datasets using 50% of the dataset available. This implies that these models can generalize training data environments to test environments with the same cell type and media using a 1:1 ratio of train to test data. In the future, this sensor framework may be used to quantify other valuable cell parameters such as pH, amino acids, and even antibody glycosylation5,6,20,44.

Conclusion

Herein, we presented the development of a contactless label free sensor coupled with an interpretable model for the quantification of fructose and other key cell parameters. This SWIR sensor system has succeeded in (1) contactless, label free rapid quantification of fructose and other cell growth parameters, and (2) easily adapts to varying cell media environments using (3) interpretable models that shed light on how the bands of the nonlinear absorption spectra contribute to metabolite quantification. Further, the accuracies of both L-SLR and PLS fall within the range of usability for biological studies of fructose2,12 and bioprocess monitoring16,22.Because the HSI camera is capable of recording thousands of absorbance spectra within a single image, this sensor system can efficiently train and adapt to new metabolites using relatively small calibration datasets. Currently, existing invasive time consuming quantification methods like high performance liquid chromatography (HPLC) and mass spectrometry (MS) remain the gold standard measurement devices in biomedical arena. There are several biomedical applications for which real-time monitoring of cell parameter levels coupled with intermittent ground truth readings will enable both rapid diagnosis of diseases like NAFLD, and precise control of bioprocesses.