Abstract
Raman spectroscopy provides comprehensive biochemical information on a sample’s composition, yet it is often used to analyze aggregated spectra rather than specific shifts. We introduce Fluorescence Guided Raman Spectroscopy (FGRS) as a methodology enabling the isolation of proteins’ spectral signatures and the training of classifiers that generalize across cell lines. We demonstrate the utility of this approach using connexin 43, a marker protein of glioblastoma tumour microtubes. By screening eGFP, sodium fluorescein, and mTagBFP2 for their compatibility with a Raman system operating at 532 nm, we selected mTagBFP2 as the most Raman-compatible fluorophore, whereas the other fluorophores emitting near 532 nm caused spectral interference. mTagBFP2 was cloned into a connexin 43 expression vector, allowing fluorescent tracking and Raman interrogation with subsequent peak identification and correlation to an I-TASSER protein prediction model. We then trained two support vector machines (SVMs) for the classification of cells based on their connexin 43 content and highlighted the impact of different spectral ranges (full spectrum vs. most significant Raman shifts) on specificity and sensitivity in glioblastoma target cell lines. Connexin 43 expression led to a loss of the peaks at 600, 1253, and 1401 cm⁻¹, consistent with an increased α-helical content as predicted by I-TASSER. SVMs achieved up to 79% accuracy on unseen glioblastoma lines, with full-spectrum models reaching 98.7% sensitivity. Thus, FGRS enables the spectral isolation of tumour marker proteins and the development of robust classifiers across cell lines. By focusing on key Raman shifts, this method holds the potential to improve diagnostic accuracy and sensitivity, offering a customizable tool for tumour detection.
Introduction
Genetic information can significantly influence a patient’s therapy regimen based on individual biomolecular traits and, therefore, the prognosis1,2. However, some patients do not show clinical benefit from genetically targeted therapies because of post-translational modifications and intratumour heterogeneity3,4,5. In fact, for some cancer entities, such as glioblastoma multiforme and pancreatic cancer, prognosis has changed little over the past decades and remains low at 15–16 months6,7. In response, personalized medicine is striving to integrate comprehensive genetic, proteomic, and lipidomic data into diagnosis and treatment options5,8. Yet the clinically available methods for obtaining such information, such as immunostaining, sequencing, and mass spectrometry, are often laborious and slow, as they require extensive sample preparation protocols9,10.
Raman spectroscopy (RS) offers a potential diagnostic shortcut. It is a label-free, non-destructive method that can extract comprehensive biomolecular information from cells with little to no sample preparation. Technically, spontaneous RS is based on the inelastic scattering of light, i.e., the change in a photon’s energy and, thus, its wavelength (Stokes shift) as it interacts with a molecule11,12. Stokes shifts are specific for each molecular vibration11,13,14. Therefore, the combination of the measured shifts provides a fingerprint of each of the molecules found in the sample volume. RS thus delivers a biochemical fingerprint for each sample, containing unique features for pre- and posttranscriptional nucleic acids, lipids, proteins, and other molecules12,14.
The diagnostic properties of RS can be used to distinguish neoplastic from healthy tissue in oncology15 and surgery16. Many of the respective classifiers require the entirety of a Raman spectrum as input and return their results based on the overall compositional differences between malignant and healthy tissue16,17. To our knowledge, there is a relative paucity of classifiers trained on the unique spectral shifts raised by a specific protein.
Raman sampling of distinct intracellular structures is challenging due to both the complex localization of the structures of interest as well as the sophisticated interpretation of the data. It should be noted that Raman spectra contain a mixture of signals from all of the molecules in the sample volume. This is both a strength and a weakness, as specific molecules can be harder to identify from individual peaks. Therefore, some authors have made their molecules of interest amenable to Raman interrogation by selective isolation from bodily fluids18 or synthetic production19. Also, spectroscopically distinct labels, such as deuterium- or alkyne tags, have been used20,21 to provide a locational clue. However, these methods are limited in their applicability because many molecules cannot be easily isolated from body fluids, synthesized, or linked to alkyne or deuterium tags.
To address this gap, we present Fluorescence Guided Raman Spectroscopy (FGRS) as a method to selectively obtain the spectroscopic features of a protein in its native cellular environment. By fusing a blue-shifted fluorescent protein with a protein of interest, we can locate the protein, collect its spectral fingerprint, and train classifiers for its detection.
We prove the feasibility of the proposed method for Connexin 43 (Cx43) C-terminally fused to mTag blue fluorescent protein 2 (mTagBFP2). The latter is a blue-shifted derivative of the GFP-like protein TagRFP22,23, while Cx43 is a four-pass transmembrane protein with an intracellular N- and C-terminus and a prominent alpha-helical structure24. En route to the membrane, Cx43 travels through the endoplasmic reticulum (ER) and the Golgi-Apparatus, where monomers are phosphorylated and form hexameric connexons, which then connect to another cell’s connexon in the membrane25,26. The C-terminus is available for interactions with other proteins and has many functions, including regulatory ones for channel opening probability, gene expression, proliferation, and migration27. In experimental settings, connexins’ C-termini have been used for tagging with fluorescent proteins with no functional impairment of expression or gap junction formation28,29. We thus hypothesize that the introduction of mTagBFP2 would not impede the expression of Cx43 for subsequent Raman sampling.
Cx43 is of particular interest in neurooncology, where it is a functional protein in glioblastoma (GBM) tumour microtubes, connecting cells to a chemotherapy-resistant cellular network with a shared calcium homeostasis30,31. It is suggested that the intraoperative detection of tumour microtubes could enhance the resection of GBM stem cell networks and improve adjuvant therapy response to chemoradiation31.
We believe Cx43 is ideally suited for FGRS, as it is (a) hard to synthesize due to its oligomeric structure, yet (b) of clinical importance. The objective of this study is to fluorescently tag Cx43 (Cx43-mTagBFP2), making it detectable for Raman interrogation in its native environment. Our first hypothesis is that the emission wavelength of mTagBFP2 does not obscure the Raman spectrum using a 532 nm laser and collected with a detection system at longer wavelengths. Second, we propose that we can identify discrete spectral alterations induced by the presence of Cx43 in cell lines. For this, we present the development of four classification algorithms to discern HEK293 cells with high and low Cx43 content. Finally, we verify our classifiers on four human GBM cell lines with different Cx43 expression levels.
Results
mTagBFP2 is a fluorophore compatible with Raman imaging at 532 nm
We screened three fluorophores for their spectral interference with a Raman system using a 532 nm excitation laser. Of the tested fluorophores, sodium fluorescein (NaFl; 465–490/500–550 nm)32,33 was closest to the Raman laser and the only one to completely obscure the Raman signal (Fig. 1a). The enhanced green fluorescent protein (eGFP) (488/509 nm)34 showed a significant baseline elevation of the Raman signal (Fig. 1b). The shape of the baseline shift of the Raman signal resembled the emission spectrum of eGFP with an initial and terminal steep slope and a mid-range slurring. mTagBFP2 (399/454 nm)23 showed no significant fluorescence leaking into the Raman signal (Fig. 1c). Peaks in the fingerprint region were clearly distinguishable. To exclude the possibility of photobleaching before the Raman image acquisition we checked the fluorescence of mTagBFP2 before and after interrogation but detected no significant loss in intensity (Fig. 1d). In general, the lower the fluorophore’s emission wavelength, the further it was from the Raman spectral range and, therefore, the lower its interference with the Raman signal.
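The spectral separation described above can be illustrated by converting Raman shifts into absolute wavelengths. The following Python sketch is an illustration only and not part of the study's pipeline: the function name is hypothetical, and the 600–1800 cm−1 window is assumed to be the detection range of interest (it is the fingerprint region named in the Raman imaging section). The emission peaks are the values quoted in the text.

```python
def stokes_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Absolute wavelength (nm) of Stokes-scattered light for a given Raman shift."""
    excitation_cm1 = 1e7 / excitation_nm         # laser line expressed in cm^-1
    scattered_cm1 = excitation_cm1 - shift_cm1   # Stokes photons lose energy
    return 1e7 / scattered_cm1


LASER_NM = 532.0
# Assumed detection window: fingerprint region, 600-1800 cm^-1
low = stokes_wavelength_nm(LASER_NM, 600.0)
high = stokes_wavelength_nm(LASER_NM, 1800.0)
print(f"Fingerprint window: {low:.1f} to {high:.1f} nm")

# Emission peaks as quoted in the text (mTagBFP2: 454 nm, eGFP: 509 nm)
for name, emission_peak_nm in [("mTagBFP2", 454.0), ("eGFP", 509.0)]:
    gap = low - emission_peak_nm
    print(f"{name}: emission peak lies {gap:.0f} nm below the window")
```

Under these assumptions, the fingerprint region is detected at roughly 550 to 588 nm. The emission peak alone does not determine interference, since the long-wavelength tail of an emission spectrum can still reach into this window, but the sketch shows why a fluorophore peaking at 454 nm is far better separated than one peaking at 509 nm.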
The effect of various fluorescence spectra on Raman sampling. (a) Sodium fluorescein (NaFl): excitation/emission ≈498 nm (dark green)/≈517 nm (light green) and its Raman spectrum in a HEK293 cell (solid line) with the first standard deviation (shade). (b) eGFP: excitation/emission 488 nm (dark green)/507 nm (ocher) and its Raman spectrum in a HEK293 cell (solid line) with the first standard deviation (shade). (c) mTagBFP2: excitation/emission 405 nm (dark blue)/454 nm (light blue) and the Raman spectrum of mTagBFP2-Cx43 in a HEK293 cell (solid line) with the first standard deviation (shade). (a-c) Raman spectra are not baselined. Spectra were obtained from FPbase35. (d-f) HEK293 cells expressing mTagBFP2-Cx43. White bar ≈5 μm. (d) Fluorescence imaging before (left) and after (right) Raman sampling on a fluorescent microscope shows no significant loss in fluorescence. The sites for Raman sampling can be identified in correlating brightfield imaging on the Raman microscope (middle).
Cx43-mTagBFP2 is reliably expressed and located in HEK293 cells for Raman spectroscopy
The secondary structure of Cx43 and mTagBFP2 is retained in the fusion protein
Next, we explored whether a fusion protein composed of Cx43 and mTagBFP2 would retain the secondary structure of its individual constituents. For this, we employed the I-TASSER-server to model each protein’s structure. I-TASSER is an online server predicting the structure of a protein based on its amino acid sequence by iterative threading assembly simulations36,37. The top threading model for the predicted structure of mTagBFP2 was a mutant of the fluorescent protein mKate S158A, which was previously characterized by synchrotron radiation38. It showed a 92% sequential homology and resembled the typical β-barrel structure of fluorescent proteins22,23 (Fig. 2a; Table 1). Accordingly, the confidence score for the predicted secondary model was high at 7.15 out of 9.
The structure of Cx43 was less well-defined and modelled on sequences from three previously defined proteins. The N-terminal region was predominantly modelled on connexin 32 and a C-terminally truncated version of Cx43, exhibiting an α-helical structure (Fig. 2a; Table 1). These two templates covered 54% and 53% of the complete Cx43 with a respective sequence identity of 48% and 100%. The majority of the protein, including the C-terminus, was identified as a random coil39 (Table 1). The template covered approximately 34% of the human amino acid chain C-terminally with a sequence identity of 96%. The average confidence score for the predicted secondary structure was 6.51 out of 9 and considerably higher in the transmembrane helical structures (confidence score of 9).
The fusion molecule Cx43-mTagBFP2 largely retained the secondary and tertiary structure of its individual components (Fig. 2a). The number of alpha-helices with a minimum length of five amino acids for the Cx43-part remained unchanged at eight. The index of the initial and terminal amino acid for each alpha-helix in the Cx43-part varied by a maximum of two. For the mTagBFP2-part we observed 11 beta sheets in lifeact7-mTagBFP2 and one additional beta sheet of six amino acids in the fusion protein. The indices of the 11 mutual sheets varied by no more than two. For the linking amino acid sequence, I-TASSER predicted predominantly a random coil. The overall confidence score in the secondary structure of the fusion protein was 6.22 out of 9.
Expression of Cx43-mTagBFP2 in HEK293 cells. (a) Predicted protein models for mTagBFP2, Cx43, and Cx43-mTagBFP2 (light blue: β-pleated sheets, red: α-helices). (b, c) Confocal fluorescence images of HEK293 expressing Cx43-mTagBFP2 show the colocalization of Cx43-mTagBFP2 and the anti-Cx43 antibody. Square image: view of the XY-plane, right image: orthogonal view of the YZ-plane along the vertical blue or white line, bottom image: orthogonal view of the XZ-plane along the horizontal blue or white line. (d) Widefield epifluorescence image showing the colocalization of Cx43-mTagBFP2 (blue) and anti-Cx43 antibody (red) at the characteristic membranous locations. (e) Cropped western immunoblot with an anti-Cx43 antibody (bottom) and an anti-vinculin antibody (loading control, top) (1: blot ladder with molecular weight indicated on the left, 2: HEK293 cells expressing Cx43-mTagBFP2, 3: HEK293 wildtype). The fusion protein is observed at around 70 kDa in the transfected cells, but not in the wildtype cells. Both cell types also show bands at around 43 kDa. Exposure time: 40 s. Uncropped blot in Supplementary Material S5
Intracellular accumulations of Cx43-mTagBFP2 are accessible for Raman imaging
In FGRS, fluorescence imaging is a prerequisite for subsequent Raman sampling. We transiently transfected the expression vector for Cx43-mTagBFP2 into HEK293 cells. A polyclonal anti-Cx43 staining overlapped with the fluorescent signal from the fusion protein Cx43-mTagBFP2, indicating a successful transfection (Fig. 2b, c). Cx43 could be observed at the cellular membrane (Fig. 2d). We also noted a strong fluorescence peak inside the cell, which colocalized with other intracellular membranes in a Vybrant CM-DiI membrane stain (Fig. 2b, c). This location likely corresponds to the physiologic collection of Cx43 in the endoplasmic reticulum (ER) or Golgi apparatus, as the protein is oligomerized en route to the membrane40. Since these deposits would have the greatest intracellular concentration of the fusion protein while being structurally similar to membranous Cx43, we selected these locations for the subsequent Raman measurements.
Western blot of Cx43-mTagBFP2
In a western immunoblot, there was a strong band at 70 kDa (Fig. 2e), approximating the sum of mTagBFP2 (26.7 kDa)35 and Cx43 (43 kDa)41. We also observed bands at around 43 kDa in transfected and in wildtype cells.
Raman imaging
Specific Raman shifts identified for Cx43-mTagBFP2 and wildtype HEK293 cells
The second step in FGRS is Raman sampling. We first assessed spectral changes between the Raman spectra of the fusion protein Cx43-mTagBFP2 and the control fluorescent protein mTagBFP2-ER5 (Fig. 3a-c) expressed in HEK293 cells. For each protein, we identified representative Raman shifts (Table 2) by their peak location and peak intensity in the fingerprint region (600–1800 cm−1).
Peaks were identified by their prominence in comparison to neighbouring peaks using a prominence-value of 0.01. We detected a small number of peaks in the control fluorescent protein that were not present in the Cx43-mTagBFP2 cells (Fig. 3a, supplementary material S1). At 600 cm−1 and 1401 cm−1, we identified peaks with no clear correlation in the Cx43-mTagBFP2 sample. At 876 cm−1, 892 cm−1, and 1618 cm−1, both samples showed a corresponding peak formation, but the Cx43 samples did not exceed the prominence level. Further, the control fluorescent protein exhibited a prominent peak in the Amide III region at approx. 1253 cm−1.
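Prominence-based peak selection, as described above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the routine used in the study: the software is not named in this section, the function names are hypothetical, and the threshold is assumed to apply to intensity-normalized spectra. Prominence is taken in its common sense, i.e., the height of a peak above the higher of the two lowest points that separate it from a taller neighbour or from the edge of the spectrum.

```python
def peak_prominences(intensities):
    """Map the index of every local maximum to its prominence."""
    prominences = {}
    n = len(intensities)
    for i in range(1, n - 1):
        height = intensities[i]
        if not (height > intensities[i - 1] and height >= intensities[i + 1]):
            continue  # not a local maximum
        # Walk outwards until a higher sample or the spectrum edge is reached,
        # remembering the lowest point passed on each side.
        left_min = height
        j = i - 1
        while j >= 0 and intensities[j] <= height:
            left_min = min(left_min, intensities[j])
            j -= 1
        right_min = height
        j = i + 1
        while j < n and intensities[j] <= height:
            right_min = min(right_min, intensities[j])
            j += 1
        # Prominence: height above the higher of the two surrounding minima.
        prominences[i] = height - max(left_min, right_min)
    return prominences


def find_prominent_peaks(intensities, min_prominence):
    """Indices of local maxima whose prominence reaches the threshold."""
    found = peak_prominences(intensities)
    return [i for i, p in found.items() if p >= min_prominence]


# Toy spectrum (hypothetical values): a shoulder at index 3 between two peaks
toy = [0.0, 1.0, 0.5, 0.6, 0.2, 3.0, 0.0]
print(find_prominent_peaks(toy, min_prominence=0.5))
print(find_prominent_peaks(toy, min_prominence=0.05))
```

In the toy spectrum, the shoulder at index 3 is a local maximum in both runs but is only reported at the lower threshold. This mirrors the situation described for 876, 892, and 1618 cm−1, where a peak formation was visible in both samples but did not exceed the prominence level in one of them.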
Spectral intensities differed significantly for most Raman shifts associated with proteins, but less frequently for shifts associated with nucleic acids or lipids (Fig. 3d). We computed the effect size as a quantitative measure of the magnitude of the statistical difference42. Among the protein-associated shifts, it was greatest for the 1657 cm−1 Raman shift (d = 1.98), followed by the Raman shifts at 1580 cm−1 (d = 1.30), 752–760 cm−1 (d = 1.29), 1235–1240 cm−1 (d = 1.05), and 1003 cm−1 (d = 1.05). The smallest effect size among them was observed at 1336 cm−1 (d = 0.91). In contrast, the Raman shifts for glycogen and all-trans-retinol at 480 cm−1 and 1605 cm−1 did not result in significant intensity differences (Fig. 3d). However, the Raman shifts for lipids and carbohydrates at 877 cm−1 and for cytochrome c at 750 cm−1 and 1582 cm−1 exhibited significant intensity changes (d = 0.72 at 877 cm−1; d = 1.44 at 750 cm−1; d = 1.39 at 1582 cm−1). For lipids, the Raman shifts at 720 cm−1, 1090 cm−1, and 1449 cm−1 were not significantly different (Fig. 3d), yet there were significant intensity differences for the Raman shifts associated with lipids at 1297 cm−1 and 1660 cm−1 (d = 0.27 at 1297 cm−1; d = 1.78 at 1660 cm−1). For nucleic acids, we detected no significant intensity variations between Cx43-mTagBFP2 and mTagBFP2 at 720 cm−1, 811 cm−1, 1081 cm−1, and 1180 cm−1 (Fig. 3d).
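The effect sizes reported as d can be illustrated with a short Python sketch. It assumes that d denotes Cohen's d with a pooled standard deviation; the text only refers to "the effect size", so this definition, the function name, and the example intensities are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with pooled sample standard deviation (assumed definition)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical normalized intensities at one Raman shift for two groups of cells
fusion = [0.42, 0.47, 0.45, 0.50, 0.44]
control = [0.55, 0.58, 0.52, 0.60, 0.57]
print(f"d = {abs(cohens_d(fusion, control)):.2f}")
```

Because d scales the difference in means by the spread within the groups, it can be compared across Raman shifts with very different absolute intensities, which a raw intensity difference would not allow.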
Comparison of Raman spectra of Cx43-mTagBFP2 and mTagBFP2-ER5. (a) The average spectra of Cx43-mTagBFP2 and the control mTagBFP2-ER5 show distinct spectral peaks at the significance level of p ≥ 0.01. (b) Overview of the average Raman intensity (solid line) and its first standard deviation (shade) at known Raman shifts for Cx43-mTagBFP2, mTagBFP2-ER5, and their absolute difference. Raman shifts where differences exceeded the double standard deviation (dashed red line) are shaded in light grey. (c) Correlation of the brightfield and fluorescent images allows colocalization of the regions of interest for measurement (red line). autofl. indicates imaging based on green autofluorescence. White bar: ≈5 μm. (d) Comparison of the peak intensity measured at known Raman shifts identifies significant changes for proteins (n = 95; bars extend between the 25th and 75th percentile; white line indicates the median; black dots are individual data points; Wilcoxon rank sum test: * p < 0.05, ** p ≤ 10⁻², *** p ≤ 10⁻³).
Distinct Raman shifts identified in four human glioblastoma wildtype cell lines
We then analysed the Raman spectra of the human GBM cell lines U87MG, T98G, Ln229, and Ln18 (Fig. 4a-d). Using a prominence-value of 0.05, we detected peaks shared by all cell lines at approximately 720 cm−1, 850 cm−1, 938 cm−1, 1004 cm−1, 1449 cm−1, and 1660 cm−1 (supplementary material S2). Of note, Cx43-mTagBFP2 and U87MG could be identified by a mutual loss of the peaks at around 1250 cm−1 and 1580 cm−1, whereas all other cell lines showed a prominent peak. Cx43-mTagBFP2 was also the only sample to show no peak at around 784 cm−1. We then compared the spectra of the GBM wildtype cell lines to the spectra of Cx43-mTagBFP2 by calculating the statistical effect size for the previously discussed Raman shifts for proteins, lipids, nucleic acids, and other substances. The average effect size across all Raman shifts was the smallest for the difference between Cx43-mTagBFP2 and U87MG cells at 0.64 (Fig. 4e). We observed the single smallest effect sizes at 1449 cm−1 and at 877 cm−1. In contrast, we calculated the largest average effect size for the difference between Cx43-mTagBFP2 and Ln18 cells.
U87MG is known to express the highest level of endogenous Cx4343. We confirmed this for our U87MG cells in a western immunoblot with a polyclonal anti-hCx43-antibody (Fig. 4b). We observed prominent bands at approx. 43 kDa for U87MG cells, and only weak bands for T98G, Ln229, and Ln18 cells. We thus classified U87MG as “high Cx43 content” and all other cell lines as “low Cx43 content”.
Classification of cell lines based on their Cx43 content
Training support vector machine classifiers with high accuracies in HEK293 cells
Based on these findings, we concluded that spectral intensity differences exist that can be employed to train classification algorithms to discriminate between cells of high and low Cx43 content. We trained a medium and coarse Gaussian SVM on HEK293 cells transfected with Cx43-mTagBFP2 or the control fluorescent protein mTagBFP2-ER5.
First, we trained the classifiers on the full spectral range of the acquired data (571 wavenumbers), which resulted in a high training accuracy of 94.7% for the coarse and 98.4% for the medium Gaussian SVM classifier (Table 3). Second, to increase robustness when applied to other cell lines, we trained the classifiers only on the Raman shifts where the spectral intensity difference exceeded the double standard deviation. This included primarily Raman shifts associated with proteins and the shift at around 1580 cm−1. The coarse and medium Gaussian SVM achieved training accuracies of 83.7% and 94.7%, respectively (Table 3, supplementary materials S3/S4).
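The selection of Raman shifts for the reduced classifiers can be sketched as follows. The text does not specify which standard deviation serves as the reference, so this Python example makes an explicit assumption: a shift is kept when the absolute difference between the two class-mean spectra exceeds twice the standard deviation of that difference across all wavenumbers. The function name and the toy spectra are hypothetical.

```python
from statistics import mean, pstdev


def select_shifts(spectra_high, spectra_low, k=2.0):
    """Indices of wavenumbers whose class-mean difference exceeds k standard deviations.

    Each argument is a list of spectra; each spectrum is a list of intensities
    sampled at the same wavenumbers. The threshold definition is an assumption.
    """
    mean_high = [mean(column) for column in zip(*spectra_high)]
    mean_low = [mean(column) for column in zip(*spectra_low)]
    difference = [abs(h - l) for h, l in zip(mean_high, mean_low)]
    threshold = k * pstdev(difference)
    return [i for i, d in enumerate(difference) if d > threshold]


# Toy data: two spectra per class, eight wavenumbers, one clearly different shift
high = [[1.0, 1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0],
        [1.0, 1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0]]
low = [[1.0] * 8, [1.0] * 8]
print(select_shifts(high, low))
```

A classifier would then be trained only on the returned columns. Restricting the input in this way discards shifts that carry mostly cell-line-specific background, which is the rationale given above for the expected gain in robustness.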
Comparison and Classification of Raman spectra of human GBM wildtype cell lines. (a) Average spectrum (solid line) and first standard deviation (shade) for the indicated cell lines. Raman shifts where the difference exceeded the double standard deviation are shaded in light grey. (b) Cropped western immunoblot with an anti-Cx43 antibody (bottom) and an anti-vinculin antibody (loading control, top). (1: blot ladder with molecular weight indicated to the left, 2: T98G, 3: U87MG, 4: Ln229, 5: Ln18. Exposure time: 3 min. 6: Lane 3 at lower exposure time (7.5 s)). Uncropped blots in Supplementary Material S6. (c) Epifluorescent and correlating brightfield images of a HEK293 cell expressing Cx43-mTagBFP2 (*, approx. cell borders denoted by dotted line) and an adjacent wildtype cell (wt). Raman sampling for Cx43-mTagBFP2 was conducted along the blue line; for wildtype, along the red line. (d) Brightfield images of probed wildtype cells. White bar: ≈5 μm. (e) Effect sizes of the intensity differences at the indicated Raman shifts between HEK293 cells expressing Cx43-mTagBFP2 and the indicated cell lines. Ln18 (brown), Ln229 (dark ocher), T98G (light ocher), U87MG (turquoise). The horizontal lines represent the significance levels. Dots centered below the line do not reach the significance level. (f,g) ROC curves of the coarse and medium Gaussian SVMs using all Raman shifts and the most significant Raman shifts, i.e., those where the intensity difference exceeded the double standard deviation. TPR: true positive rate, FPR: false positive rate.
Testing the SVMs by distinguishing HEK-Cx43-mTagBFP2 from wildtype HEK293-cells
The classifiers were then used to discern positively transfected HEK293 cells (high Cx43) from adjacent HEK293 wildtype cells (low Cx43, Fig. 4a, c). Coarse Gaussian SVMs performed significantly better than their medium Gaussian counterparts (Table 3, Testing 1). The accuracy of the coarse Gaussian SVM trained on all wavenumbers was comparable to the training accuracy (92.9%), while that of the one trained on only the most important Raman shifts improved to 90.9%, mainly because of the high specificity of 98%. Medium SVMs showed a reduced accuracy of 79.2% when using all wavenumbers and 77.9% when using only the preselected shifts. The superior performance of the coarse Gaussian SVMs can also be inferred from their more rectangular receiver operating characteristic (ROC) curves (Fig. 4f, g).
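The link between the shape of a ROC curve and classifier quality can be made concrete with the area under the curve (AUC). The Python sketch below uses the rank-based formulation, in which the AUC equals the probability that a randomly chosen high-Cx43 spectrum receives a higher classifier score than a randomly chosen low-Cx43 spectrum. The function name and the scores are hypothetical, and the study's own AUC values may have been computed differently.

```python
def roc_auc(scores_positive, scores_negative):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    correct = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(scores_positive) * len(scores_negative))


# Hypothetical SVM decision scores for high- and low-Cx43 spectra
high_cx43_scores = [0.9, 0.8, 0.4]
low_cx43_scores = [0.7, 0.3, 0.2]
print(f"AUC = {roc_auc(high_cx43_scores, low_cx43_scores):.3f}")
```

A perfectly rectangular ROC curve corresponds to an AUC of 1, meaning every high-content spectrum is scored above every low-content spectrum, whereas a curve along the diagonal corresponds to 0.5.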
Testing the SVMs on four human GBM wildtype cell lines: number of Raman shifts allows optimizing for sensitivity or specificity
The coarse and medium SVMs trained on only the most significant Raman shifts showed higher testing accuracies (79.0% and 74.9%, resp.) and specificity (90.8% and 72.1%, resp.) than their counterparts trained on a complete spectrum (Fig. 4f, g; Table 3, Testing 2). Conversely, the SVMs trained on all wavenumbers were more sensitive than SVMs trained only on the most significant Raman shifts.
The coarse Gaussian SVMs showed the highest overall specificity (90.8% for the most significant Raman shifts) and sensitivity (98.7% for all wavenumbers), but at the expense of the other parameter dropping below 60%. For the medium Gaussian SVM, sensitivity and specificity differ less from each other.
Table 3 summarizes the classification results. Training: high-Cx43 content cells were HEK293 expressing Cx43-mTagBFP2; low-Cx43 were HEK293 expressing mTagBFP2-ER5. Testing 1: high-Cx43 same as for training; low-Cx43 were HEK293 wildtype cells. Testing 2: high-Cx43 content cells were U87MG cells; low-content T98G, Ln229, and Ln18. For “All wavenumbers” we computed 571 Raman shifts; for “Raman Shifts Greater Double Standard Deviation” 82 Raman shifts were used. AUC: area under the curve.
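For readers less familiar with the reported measures, the following Python sketch shows how sensitivity, specificity, and accuracy follow from the four confusion-matrix counts, with "high Cx43 content" as the positive class. The function name and the counts are hypothetical and do not reproduce the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                 # high-Cx43 spectra correctly detected
        "specificity": tn / (tn + fp),                 # low-Cx43 spectra correctly rejected
        "accuracy": (tp + tn) / (tp + fn + tn + fp),   # all correct calls
    }


# Hypothetical counts: 50 high-Cx43 and 50 low-Cx43 spectra
metrics = classification_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Accuracy depends on the class proportions, so when one class dominates the test set, a classifier can be highly sensitive and still have a modest accuracy if its specificity is low.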
Discussion
Molecular pathologic analysis of tumours can identify a plethora of marker proteins and therapeutic targets, but it is an inherently slow and work-intensive process9. In contrast, RS can deliver fast biochemical information on a sample44. However, current Raman classifiers are generally trained independently from proteomic and genomic discoveries because there is no versatile method for the collection of a specific protein’s Raman spectrum in its native environment. Therefore, the objective of this study was to address this gap by developing FGRS to obtain the spectra of proteins. We identified mTagBFP2 as a fluorescent protein compatible with standard blue fluorescence filters and a Raman system operating at 532 nm. We obtained the spectrum of Cx43, a gap junction protein involved in resistant glioma networks, and trained four classification algorithms for its identification in glioblastoma cell lines.
We first fused Cx43 to mTagBFP2 and validated its expression by fluorescence microscopy, western blot, and an I-TASSER protein prediction model. We observed Cx43-mTagBFP2 at the physiological expression sites for Cx43, most importantly intracellularly and on the cell membrane41. The intracellular accumulation of Cx43-mTagBFP2 resembled previously described expression patterns of Cx43 as it passes through the ER and Golgi-apparatus41,45. This site was chosen for Raman sampling because (a) it was easily detectable and (b) it ensured a local maximum concentration of Cx43-mTagBFP2, yielding stronger and specific spectroscopic signals. According to the protein prediction model, Cx43 retains its characteristic alpha-helical structure. This was expected since fluorescently tagged connexins have been used for functional studies before28,46. Finally, we confirmed the molecular weight of Cx43-mTagBFP2 with a western immunoblot showing the fusion protein at the anticipated weight and in line with other studies47.
We then demonstrated that the spectral properties of mTagBFP2 (399/454 nm) and RS at 532 nm are sufficiently distinct to allow hybrid sampling with RS and fluorescence imaging. This merits special consideration, as overlapping fluorescence is the nemesis of spontaneous RS, because Raman signals are many orders of magnitude weaker than their fluorescent counterparts and can thus be obscured48,49. Fluorescence originates from photon absorption, an energetically relatively intense process, while Raman signals result from the inelastic scattering of light, an energetically less intense process11,12. Several groups have used fluorescent proteins in Raman imaging before, but to our knowledge, this study reports the closest spectral proximity between a blue fluorescent protein’s emission wavelength and the Raman excitation laser. Chiu et al. (2017) combined the blue and enhanced cyan fluorescent proteins (emission 445 nm and 476 nm) with a 532 nm Raman laser but used an anti-Stokes fluorescence detection scheme that allowed focusing on a blue shifted emission50. A greater spectral distance between fluorescence emission and the Raman laser (785 nm) was chosen by Yuan et al. (2018) for the sampling of enhanced cyan fluorescent protein51, while Huang et al. (2007) probed the green fluorophore Cyanine 3 with a 532 nm laser after a photobleaching step52. Alternatively, in their review of intraoperative Raman and fluorescence imaging, Lauwerends et al. (2022) suggest red-shifting the Raman laser past the fluorescence excitation wavelength into the high wavenumber region beyond 2400 cm−1 (ref. 53).
Next, we showed that a discrete set of spectral changes could be identified for Cx43 and used to train classifiers. We observed significant differences in the peak locations and intensities. In the Amide III band, the presence of an intense Raman shift at 1230–1240 cm−1 has been reported as one of the most characteristic for β-sheets54. It was prominent in the cells transfected with the control fluorescent protein, but not in the cells transfected with the fusion protein. Inversely, the absence of a strong intensity at 1235–1240 cm−1 is typical of an α-helix54. At 1240 cm−1, we detected a significantly weaker signal and bend in the fusion protein, indicative of its comparatively higher concentration of helices54. As expected, the most significant intensity differences were detected for the Raman shifts associated with proteins at 752–760 cm−1, 1003 cm−1 and the amide I (1654/1655 cm−1), II (1580 cm−1) and III (1337 cm−1) bands. The spectral intensities at around 750 cm−1 and 1582 cm−1 have been reported for a plethora of molecules, most dominantly for the porphyrin ring in cytochrome c55. For lipids, we detected significant intensity differences only at 1301 cm−1 and 1660 cm−1. Since they fall within the Amide III or I region, respectively, a contribution of protein signals seems likely.
Based on the measurements of Cx43-mTagBFP2, we trained different SVMs to distinguish between cells with high and low Cx43 content. SVMs were chosen due to their versatility and wide application in RS56,57. The SVMs used either a coarse or medium Gaussian kernel and were trained either on a complete Raman fingerprint spectrum (571 wavenumbers) or only on the Raman shifts where the intensity difference exceeded the double standard deviation in the training data set (82 wavenumbers). The training accuracies to discern Cx43-mTagBFP2 (high content) from the control fluorescent protein (low content) (coarse: 94%, medium: 83%) resemble those of other published supervised56 and unsupervised58 classifiers. During testing, we gradually increased the samples’ level of biological diversity to evaluate the classifiers’ robustness. In the first testing data set, the low-content cells expressing the fluorescent protein only were replaced by HEK293 wildtype cells, expressing neither the control fluorescent protein nor much Cx43. With a testing accuracy of over 90%, the coarse SVM exceeded the medium one. This is not surprising, because the lower kernel scale value in the medium Gaussian reduces flexibility59 when applied to new cell lines. The second testing data set comprised only GBM cell lines not included in the training data set. This approach is unusual, since most previously published classifiers were trained and tested on the same cell lines60,61. In both SVMs, we found that models trained on only the most significant Raman shifts performed more accurately and specifically than models trained on the complete spectrum. Conversely, training on the complete spectrum resulted in a higher sensitivity. For maximum accuracy, we recommend a combination of the presented classifiers. The coarse Gaussian SVM analysing the complete Raman spectrum could be used as a screening tool, because it yields a high sensitivity.
The overall testing accuracy of the classifiers was again in the range of previously presented classifiers62,63.
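The role of the kernel scale in the comparison of medium and coarse Gaussian SVMs can be illustrated numerically. The Python sketch below rests on an assumption that is not stated in this section: that the presets follow the convention of MATLAB's Classification Learner, where the kernel scale is the square root of the number of predictors for the medium preset and four times that value for the coarse preset, and where the kernel is exp(−‖x − y‖² / s²). All names and the example spectra are hypothetical.

```python
from math import exp, sqrt

# Assumed preset convention (kernel scale = factor * sqrt(number of predictors))
PRESET_FACTORS = {"medium": 1.0, "coarse": 4.0}


def kernel_scale(n_predictors, preset):
    """Kernel scale for a preset under the assumed convention."""
    return PRESET_FACTORS[preset] * sqrt(n_predictors)


def gaussian_kernel(x, y, scale):
    """Gaussian (RBF) similarity of two spectra after scaling by the kernel scale."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-squared_distance / scale ** 2)


# Predictor counts taken from the text: full spectrum and preselected shifts
for n_predictors in (571, 82):
    medium = kernel_scale(n_predictors, "medium")
    coarse = kernel_scale(n_predictors, "coarse")
    print(f"{n_predictors} predictors: medium scale {medium:.1f}, coarse scale {coarse:.1f}")

# Two hypothetical spectra that differ by 0.5 at each of 571 wavenumbers
spectrum_a = [0.0] * 571
spectrum_b = [0.5] * 571
for preset in ("medium", "coarse"):
    s = kernel_scale(571, preset)
    print(f"{preset}: similarity {gaussian_kernel(spectrum_a, spectrum_b, s):.3f}")
```

With the smaller scale, the similarity between two spectra decays faster with their distance, so the decision boundary follows the training spectra more closely. With the larger scale, the same pair of spectra remains similar, which yields a smoother boundary and is one plausible reading of why the coarse preset transferred better to cell lines it had not seen.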
The shortcomings of this study include the limited sample size, which, according to the central limit theorem, results in a larger standard deviation. However, our experiments show a standard deviation comparable with other studies61,64,65. Also, our coarse Gaussian classification algorithm could distinguish cells with an overall accuracy of 79.0%, despite the greater degree of standard deviation. The biological downstream effects of introducing mTagBFP2 or Cx43 fused to it are not fully controlled in this study. However, fluorescent proteins have been extensively used for morphologic and functional experiments, and our own validation experiments suggest proper expression28,46. Finally, our classifiers work well for cultured cell lines, but their generalizability might be compromised by a greater degree of heterogeneity and autofluorescence in tissue samples17. We tried to reduce this susceptibility by reducing the number of Raman shifts needed for the classifier. It should be noted that the current spectra are obtained from cellular locations with high Cx43 content, while expression levels might be lower in tissue samples.
Methods
Cell culture
We cultured U87MG, T98G, Ln229, Ln18, and HEK293 cells in DMEM supplemented with 10% fetal bovine serum (both Gibco, Thermo Fisher, Schwerte, Germany) under standard cell culture conditions. The GBM cell lines had been purchased previously from the ATCC.
Plasmid cloning by restriction enzyme digest
pDest/hCx43-EGFP-N1 and mTagBFP2-lifeact-7 were gifts from Robin Shaw and Michael Davidson23,28. We propagated plasmids in DH5alpha (New England Biolabs, Frankfurt am Main, Germany) cultivated in Terrific Broth (Roth, Karlsruhe, Germany) supplemented with 50 mg/ml kanamycin (Roth, Karlsruhe, Germany). For the cloning restriction digest, we excised eGFP with AgeI and NotI (New England Biolabs, Frankfurt am Main, Germany) and substituted it with mTagBFP2. For the control digest, we used AgeI and HindIII (New England Biolabs, Frankfurt am Main, Germany). As a control, we included mTagBFP2-ER5 (a gift from Michael Davidson)23.
Protein prediction
We translated the plasmid sequences into amino acids with ExPASy (Swiss Institute of Bioinformatics, Lausanne, Switzerland)66 and submitted them to I-TASSER36,37. We visualized the predictions with Discovery Studio 2021 Client (BIOVIA, Dassault Systèmes, San Diego, USA).
Immunocytochemistry and fluorescence imaging
We cultivated samples on round cover slips (Hartenstein, Würzburg, Germany) and fixed them with 4% formaldehyde supplemented with 2 µl Vybrant CM-DiI (Thermo Fisher Scientific, Schwerte, Germany) per 500 ml. For indirect immunofluorescence, we used a polyclonal anti-Cx43 antibody (Sigma, Merck, Darmstadt, Germany) and an Alexa Fluor secondary antibody (Thermo Fisher, Schwerte, Germany). Autofluorescence imaging was conducted using the green fluorescence filter sets ZEISS 38 (Ex: 470/40, Em: 525/50) and ZEISS 44 (Ex: 475/40, Em: 530/50). We imaged cells on an Axioplan microscope (Zeiss, Jena, Germany) equipped with an HBO 100 lamp and a monochrome camera (Axiocam 503 mono, Zeiss, Jena, Germany; 2.8 megapixel). For image processing, we used ZEN lite 3.8 (Zeiss, Jena, Germany) and ImageJ 1.53q (National Institutes of Health, USA)67. Confocal images were obtained on an Olympus FV1000 microscope (20X/0.8 N.A. air-, 60X/1.42 N.A. oil-immersion objective, Olympus Corporation, Tokyo, Japan).
Western immunoblot
Cells were harvested at 80–90% confluency and lysed with RIPA buffer. Lysed samples were sonicated and centrifuged at 16,000 g at 4 °C. The supernatant was stored at −20 °C until protein quantification by a Bradford assay with dye reagent (Bio-Rad, Hercules, CA, USA; cat. no 500006). Samples were fractionated on a 16% SDS-PAGE gel and incubated overnight with polyclonal anti-Cx43 and anti-vinculin antibodies (Sigma, Merck, Darmstadt, Germany). Primary antibodies were detected with a horseradish peroxidase-conjugated secondary antibody and visualized with an imager for chemiluminescent signals (GE Healthcare Life Science, Munich, Germany; cat. no AI680). Western blots were conducted twice on independent samples.
Sample preparation and Raman spectroscopy
We trypsinized cells, fixed them in 4% formaldehyde, and washed them with distilled water prior to plating on a polished steel slide. We used fluorescence imaging to identify Cx43-expressing regions before performing Raman spectroscopy. For Raman microscopy, we used a WITec alpha300 R (WITec GmbH, Ulm, Germany) equipped with an SHG Nd:YAG laser (532 nm, max. 22.5 mW) and a lens-based spectrometer with a CCD camera (1024 × 128 pixel, Peltier cooled to −65 °C). The nominal spectral resolution was approx. 3 cm−1 per CCD pixel (600 mm−1 grating). Prior to each measurement session, the system was calibrated using the silicon peak at 520.4 cm−1. A 50x objective was used for experiments. Using the 50 μm core of a multimode fibre, we achieved a focal depth of approx. 1 μm. The integration time was typically set to 30 s per spectrum, and the laser intensity was adjusted manually for each sample to avoid burning damage. Experiments were conducted a minimum of three times.
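To illustrate where the Raman signal falls relative to the 532 nm excitation line, the following minimal sketch converts Raman shifts into absolute wavelengths. The function name `shift_to_wavelength_nm` is hypothetical and not part of any instrument software; the example shifts are the fingerprint limits used for pre-processing (400 and 1800 cm−1) and the silicon calibration peak.

```python
def shift_to_wavelength_nm(shift_cm1: float, excitation_nm: float = 532.0) -> float:
    """Absolute wavelength (nm) of Stokes-scattered light for a Raman shift in cm^-1."""
    excitation_cm1 = 1e7 / excitation_nm        # excitation line as absolute wavenumber
    scattered_cm1 = excitation_cm1 - shift_cm1  # Stokes scattering loses energy
    return 1e7 / scattered_cm1


# Fingerprint region limits and the silicon calibration peak
for shift in (400.0, 520.4, 1800.0):
    print(f"{shift:7.1f} cm^-1 -> {shift_to_wavelength_nm(shift):.1f} nm")
```

Under these assumptions, the fingerprint region corresponds to roughly 544–588 nm, so fluorescence emitted in this window would overlap with the Raman signal.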
Raman spectrum analysis
We processed spectra in Matlab R2023a (version 9.14.0) and R2021a (version 9.10.0) (The MathWorks Inc., Natick, Massachusetts) and imported them with the WITio toolbox (version 2.0.1)68. For pre-processing, we baselined the fingerprint region (400–1800 cm−1) using an asymmetric least squares algorithm69. Cosmic rays were identified as sharp peaks greater than the Amide I intensity and removed. Each spectrum was vector normalized to the greatest spectral intensity.
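The baseline step can be sketched as follows. This is a minimal Python illustration of asymmetric least squares smoothing in the spirit of Eilers and Boelens, not the Matlab code used in this study. The parameter values (`lam`, `p`, `n_iter`) and the function names are assumptions chosen for illustration, and `vector_normalize` uses the Euclidean norm as one possible reading of the normalization described above.

```python
import numpy as np


def als_baseline(y, lam=1e5, p=0.01, n_iter=10):
    """Estimate a smooth baseline under spectrum y by asymmetric least squares."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator with shape (n - 2, n)
    D = np.diff(np.eye(n), n=2, axis=0)
    penalty = lam * (D.T @ D)  # roughness penalty on the baseline
    w = np.ones(n)
    z = y.copy()
    for _ in range(n_iter):
        # Solve the weighted, smoothness-penalised system (W + lam * D'D) z = W y
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the baseline (peaks) get weight p, points below get 1 - p
        w = np.where(y > z, p, 1.0 - p)
    return z


def vector_normalize(y):
    """Scale a spectrum to unit Euclidean length (assumed normalization)."""
    y = np.asarray(y, dtype=float)
    return y / np.linalg.norm(y)


# Synthetic example: sloping background plus one Gaussian band (illustrative only)
x = np.linspace(400.0, 1800.0, 200)
background = 0.5 + 0.001 * (x - 400.0)
band = np.exp(-0.5 * ((x - 1000.0) / 15.0) ** 2)
corrected = vector_normalize(band + background - als_baseline(band + background))
```

The dense solve keeps the sketch short; for long spectra or many iterations, a sparse banded solver would be the more economical choice.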
Originally, we collected 459 spectra of the Cx43-mTagBFP2 cells. To match the number of spectra obtained from mTagBFP2, an equal number of spectra of the fusion protein (n = 95) was randomly chosen. Peaks were identified using the findpeaks-function of the signal processing toolbox. The difference in intensity was calculated as the absolute value of the spectral difference. Significance tests were performed with the ranksum-function for a Mann-Whitney U-Test for independent samples, choosing a p-value of 0.05 a priori. The effect size was calculated with the computeCohen_d(x1, x2, varargin)-function in Matlab70. Box plots were computed using the Alternative Box Plot Toolbox version 3.2.1.0.
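The class balancing, the absolute difference spectrum, and the effect size can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Matlab routines named above: all function names are hypothetical, Cohen's d is computed with the pooled standard deviation, and the selection rule compares the absolute mean difference at each Raman shift with twice the pooled standard deviation at that shift, which is only one possible reading of the threshold used in this study.

```python
import random
from math import sqrt
from statistics import mean, variance


def pooled_sd(x1, x2):
    """Pooled sample standard deviation of two independent groups."""
    n1, n2 = len(x1), len(x2)
    return sqrt(((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2))


def cohen_d(x1, x2):
    """Effect size: difference of means in units of the pooled SD."""
    return (mean(x1) - mean(x2)) / pooled_sd(x1, x2)


def balance_classes(spectra_a, spectra_b, seed=0):
    """Randomly subsample both classes to the size of the smaller one."""
    n = min(len(spectra_a), len(spectra_b))
    rng = random.Random(seed)
    return rng.sample(spectra_a, n), rng.sample(spectra_b, n)


def significant_shifts(spectra_a, spectra_b, factor=2.0):
    """Indices of Raman shifts where |mean difference| > factor * pooled SD."""
    selected = []
    columns = zip(zip(*spectra_a), zip(*spectra_b))  # iterate shift by shift
    for i, (col_a, col_b) in enumerate(columns):
        difference = abs(mean(col_a) - mean(col_b))
        if difference > factor * pooled_sd(col_a, col_b):
            selected.append(i)
    return selected
```

Each spectrum is represented here as a plain list of intensities on a common wavenumber axis; with real data, the returned indices would be mapped back to their Raman shifts.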
Machine learning classification
For training a binary SVM, data were loaded into the Matlab Classification Learner App. 80% of the imported data were used for training; 20% were held out for a five-fold cross-validation. An equal number of spectra was used for each class to avoid class imbalance. In the first round, we used all 571 wavenumbers; in the second round, we used only those Raman shifts at which the intensity difference exceeded twice the standard deviation (n = 82). For the medium Gaussian kernel, the scale was set to the square root of the number of predictors; for the coarse Gaussian kernel, to four times the square root. The overall accuracy was calculated as the percentage of correctly classified spectra. Results were visualized as a confusion matrix and ROC curve. The confusion matrix yielded the sensitivity (TP/(TP + FN)) and specificity (TN/(TN + FP)), where T and F denote true and false, respectively, and P and N denote positive and negative.
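The kernel scales and the confusion matrix metrics can be made concrete with a short Python sketch. It is illustrative only and does not reproduce the Classification Learner App. The function names are hypothetical, the kernel form exp(−‖x1/s − x2/s‖²) reflects our reading of how Matlab applies a kernel scale s and should be treated as an assumption, and the counts passed to `confusion_metrics` are invented for illustration rather than taken from this study.

```python
from math import exp, sqrt


def kernel_scale(n_predictors: int, preset: str) -> float:
    """Kernel scale for the medium (sqrt(P)) and coarse (4 * sqrt(P)) presets."""
    base = sqrt(n_predictors)
    if preset == "medium":
        return base
    if preset == "coarse":
        return 4.0 * base
    raise ValueError(f"unknown preset: {preset}")


def gaussian_kernel(x1, x2, scale: float) -> float:
    """Gaussian kernel after dividing both inputs by the kernel scale (assumed form)."""
    squared_distance = sum(((a - b) / scale) ** 2 for a, b in zip(x1, x2))
    return exp(-squared_distance)


def confusion_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, and overall accuracy from confusion matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Kernel scales for the full spectrum (571 predictors) and the reduced set (82)
for n in (571, 82):
    print(n, round(kernel_scale(n, "medium"), 2), round(kernel_scale(n, "coarse"), 2))

# Hypothetical counts for illustration only; not results from this study
print(confusion_metrics(tp=90, fn=10, tn=70, fp=30))
```

A larger scale makes the kernel value decay more slowly with distance, so the coarse preset treats spectra as similar over a wider range than the medium preset.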
Data availability
Matlab code and the minimal, preprocessed data set are available upon reasonable request from the corresponding author Johannes Reifenrath.
Abbreviations
- RS: Raman spectroscopy
- FGRS: Fluorescence Guided Raman Spectroscopy
- Cx43: Connexin 43
- mTagBFP2: mTag blue fluorescent protein 2
- GFP: Green fluorescent protein
- ER: Endoplasmic reticulum
- Cx43-mTagBFP2: Cx43 tagged with mTagBFP2
- GBM: Glioblastoma multiforme
- eGFP: Enhanced green fluorescent protein
- SVM: Support vector machine
- CMV: Cytomegalovirus
- ROC: Receiver operating characteristic
- TAE: Tris acetate EDTA buffer
- TBE: Tris borate EDTA buffer
References
Min, H. Y. & Lee, H. Y. Molecular targeted therapy for anticancer treatment. Exp. Mol. Med. 54, 1670–1694. https://doi.org/10.1038/s12276-022-00864-3 (2022).
Verdugo, E., Puerto, I. & Medina, M. An update on the molecular biology of glioblastoma, with clinical implications and progress in its treatment. Cancer Commun. (Lond). 42, 1083–1111. https://doi.org/10.1002/cac2.12361 (2022).
Dugger, S. A., Platt, A. & Goldstein, D. B. Drug development in the era of precision medicine. Nat. Rev. Drug Discovery. 17, 183–196. https://doi.org/10.1038/nrd.2017.226 (2018).
Schram, A. M. & Hyman, D. M. Quantifying the benefits of Genome-Driven oncology. Cancer Discov. 7, 552–554. https://doi.org/10.1158/2159-8290.Cd-17-0380 (2017).
Doll, S., Gnad, F. & Mann, M. The case for proteomics and Phospho-Proteomics in personalized Cancer medicine. Proteom. – Clin. Appl. 13, 1800113. https://doi.org/10.1002/prca.201800113 (2019).
Mukasa, A. Genome medicine for brain tumors: current status and future perspectives. Neurol. Med. Chir. (Tokyo). 60, 531–542. https://doi.org/10.2176/nmc.ra.2020-0175 (2020).
Oronsky, B., Reid, T. R., Oronsky, A., Sandhu, N. & Knox, S. J. A review of newly diagnosed glioblastoma. Front. Oncol. 10, 574012. https://doi.org/10.3389/fonc.2020.574012 (2020).
Molla, G. & Bitew, M. Revolutionizing personalized medicine: synergy with Multi-Omics data generation, main hurdles, and future perspectives. Biomedicines 12, 2750 (2024).
Vandereyken, K., Sifrim, A., Thienpont, B. & Voet, T. Methods and applications for single-cell and Spatial multi-omics. Nat. Rev. Genet. 24, 494–515. https://doi.org/10.1038/s41576-023-00580-2 (2023).
Fang, R., Wang, X. T., Xia, Q. Y., Zhou, X. J. & Rao, Q. Precision in diagnostic molecular pathology based on immunohistochemistry. Crit. Rev. Oncog. 22, 451–469. https://doi.org/10.1615/CritRevOncog.2017020548 (2017).
Smith, E. & Dent, G. Introduction, basic theory and principles in Modern Raman Spectroscopy (eds Smith, E. & Dent, G.) Ch. 1, 1–20 (Wiley, 2019).
Jones, R. R., Hooper, D. C., Zhang, L., Wolverson, D. & Valev, V. K. Raman techniques: fundamentals and frontiers. Nanoscale Res. Lett. 14, 231. https://doi.org/10.1186/s11671-019-3039-2 (2019).
Pezzotti, G. Raman spectroscopy in cell biology and microbiology. J. Raman Spectrosc. 52, 2348–2443. https://doi.org/10.1002/jrs.6204 (2021).
Cialla-May, D., Schmitt, M. & Popp, J. Theoretical principles of Raman spectroscopy in Micro-Raman Spectroscopy (eds Popp, J. & Mayerhöfer, T.) Ch. 1, 1–14 (De Gruyter, 2020).
Blake, N., Gaifulina, R., Griffin, L. D., Bell, I. M. & Thomas, G. M. H. Machine learning of Raman spectroscopy data for classifying cancers: A review of the recent literature. Diagnostics (Basel). 12. https://doi.org/10.3390/diagnostics12061491 (2022).
Jermyn, M. et al. Intraoperative brain cancer detection with Raman spectroscopy in humans. Sci. Transl Med. 7, 274ra219. https://doi.org/10.1126/scitranslmed.aaa2384 (2015).
Galli, R. et al. Rapid Label-Free analysis of brain tumor biopsies by near infrared Raman and fluorescence Spectroscopy—A study of 209 patients. Front. Oncol. 9 https://doi.org/10.3389/fonc.2019.01165 (2019).
Zhang, S., van der Mee, F. A. M., Erckens, R. J., Webers, C. A. B. & Berendschot, T. T. J. M. Raman spectroscopic detection of interleukin-10 and angiotensin converting enzyme. J. Eur. Opt. Society-Rapid Publications. 17, 7. https://doi.org/10.1186/s41476-021-00152-z (2021).
Kniggendorf, A. K. et al. Temperature-sensitive gating of hCx26: high-resolution Raman spectroscopy sheds light on conformational changes. Biomed. Opt. Express. 5, 2054–2065. https://doi.org/10.1364/boe.5.002054 (2014).
Uematsu, M. & Shimizu, T. Raman microscopy-based quantification of the physical properties of intracellular lipids. Commun. Biology. 4, 1176. https://doi.org/10.1038/s42003-021-02679-w (2021).
Jamieson, L. E. et al. Tracking intracellular uptake and localisation of alkyne tagged fatty acids using Raman spectroscopy. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 197, 30–36. https://doi.org/10.1016/j.saa.2018.01.064 (2018).
Subach, O. M. et al. Structural characterization of Acylimine-Containing blue and red chromophores in mTagBFP and TagRFP fluorescent proteins. Chem. Biol. 17, 333–341. https://doi.org/10.1016/j.chembiol.2010.03.005 (2010).
Subach, O. M., Cranfill, P. J., Davidson, M. W. & Verkhusha, V. V. An enhanced monomeric blue fluorescent protein with the high chemical stability of the chromophore. PLOS ONE. 6, e28674. https://doi.org/10.1371/journal.pone.0028674 (2011).
Lee, H. J. et al. Conformational changes in the human Cx43/GJA1 gap junction channel visualized using cryo-EM. Nat. Commun. 14, 931. https://doi.org/10.1038/s41467-023-36593-y (2023).
Musil, L. S. & Goodenough, D. A. Multisubunit assembly of an integral plasma membrane channel protein, gap junction connexin43, occurs after exit from the ER. Cell 74, 1065–1077. https://doi.org/10.1016/0092-8674(93)90728-9 (1993).
Qi, C. et al. Structure of the connexin-43 gap junction channel in a putative closed state. eLife 12, RP87616, (2023). https://doi.org/10.7554/eLife.87616
Basheer, W. & Shaw, R. The tail of Connexin43: an unexpected journey from alternative translation to trafficking. Biochim. Et Biophys. Acta (BBA) - Mol. Cell. Res. 1863, 1848–1856. https://doi.org/10.1016/j.bbamcr.2015.10.015 (2016).
Smyth, J. W. et al. Actin cytoskeleton rest stops regulate anterograde traffic of connexin 43 vesicles to the plasma membrane. Circul. Res. 110, 978–989. https://doi.org/10.1161/CIRCRESAHA.111.257964 (2012).
Laird, D. W. et al. Comparative analysis and application of fluorescent protein-tagged connexins. Microsc Res. Tech. 52, 263–272. https://doi.org/10.1002/1097-0029(20010201)52:3<263::Aid-jemt1012>3.0.Co;2-q (2001).
Osswald, M. et al. Brain tumour cells interconnect to a functional and resistant network. Nature 528, 93–98. https://doi.org/10.1038/nature16071 (2015).
Weil, S. et al. Tumor microtubes convey resistance to surgical lesions and chemotherapy in gliomas. Neuro Oncol. 19, 1316–1326. https://doi.org/10.1093/neuonc/nox070 (2017).
Zhang, N. et al. Sodium Fluorescein-Guided resection under the YELLOW 560 nm surgical microscope filter in malignant gliomas: our first 38 cases experience. Biomed. Res. Int. 2017, 7865747. https://doi.org/10.1155/2017/7865747 (2017).
Xu, R. et al. Optical characterization of sodium fluorescein in vitro and ex vivo. Front. Oncol. 11 https://doi.org/10.3389/fonc.2021.654300 (2021).
Ilagan, R. P. et al. A new bright green-emitting fluorescent protein – engineered monomeric and dimeric forms. FEBS J. 277, 1967–1978. https://doi.org/10.1111/j.1742-4658.2010.07618.x (2010).
Lambert, T. J. mTagBFP2 in fpbase: a community-editable fluorescent protein database. Nat. Methods. 16, 277–278. https://doi.org/10.1038/s41592-019-0352-8 (2019). accessed 2019/04/01.
Zhang, Y. I-TASSER server for protein 3D structure prediction. BMC Bioinform. 9, 40. https://doi.org/10.1186/1471-2105-9-40 (2008).
Roy, A., Kucukural, A. & Zhang, Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat. Protoc. 5, 725–738. https://doi.org/10.1038/nprot.2010.5 (2010).
Wang, Q. et al. Molecular mechanism of a Green-Shifted, pH-Dependent red fluorescent protein mKate variant. PLOS ONE. 6, e23513. https://doi.org/10.1371/journal.pone.0023513 (2011).
Sorgen, P. L. et al. Structural changes in the carboxyl terminus of the gap junction protein Connexin43 indicates signaling between binding domains for c-Src and Zonula Occludens-1 *. J. Biol. Chem. 279, 54695–54701. https://doi.org/10.1074/jbc.M409552200 (2004).
Epifantseva, I. & Shaw, R. M. Intracellular trafficking pathways of Cx43 gap junction channels. Biochim. Et Biophys. Acta (BBA) - Biomembr. 1860, 40–47. https://doi.org/10.1016/j.bbamem.2017.05.018 (2018).
Solan, J. L. & Lampe, P. D. Key connexin 43 phosphorylation events regulate the gap junction life cycle. J. Membr. Biol. 217, 35–41. https://doi.org/10.1007/s00232-007-9035-y (2007).
Sullivan, G. M. & Feinn, R. Using effect Size-or why the P value is not enough. J. Grad Med. Educ. 4, 279–282. https://doi.org/10.4300/jgme-d-12-00156.1 (2012).
Murphy, S. F. et al. Connexin 43 Inhibition sensitizes chemoresistant glioblastoma cells to Temozolomide. Cancer Res. 76, 139–149. https://doi.org/10.1158/0008-5472.Can-15-1286 (2016).
Klein, K. et al. Label-free live-cell imaging with confocal Raman microscopy. Biophys. J. 102, 360–368. https://doi.org/10.1016/j.bpj.2011.12.027 (2012).
Majoul, I. V. et al. Limiting transport steps and novel interactions of Connexin-43 along the secretory pathway. Histochem. Cell Biol. 132, 263–280. https://doi.org/10.1007/s00418-009-0617-x (2009).
Bukauskas, F. F. et al. Clustering of connexin 43–enhanced green fluorescent protein gap junction channels and functional coupling in living cells. Proceedings of the National Academy of Sciences 97, 2556–2561, (2000). https://doi.org/10.1073/pnas.050588497
Jordan, K. et al. Trafficking, assembly, and function of a Connexin43-Green fluorescent protein chimera in live mammalian cells. Mol. Biol. Cell. 10, 2033–2050. https://doi.org/10.1091/mbc.10.6.2033 (1999).
Shreve, A. P., Cherepy, N. J. & Mathies, R. A. Effective rejection of fluorescence interference in Raman spectroscopy using a shifted excitation difference technique. Appl. Spectrosc. 46, 707–711. https://doi.org/10.1366/0003702924125122 (1992).
Durrant, B., Trappett, M., Shipp, D. & Notingher, I. Recent developments in spontaneous Raman imaging of living biological cells. Curr. Opin. Chem. Biol. 51, 138–145. https://doi.org/10.1016/j.cbpa.2019.06.004 (2019).
Chiu, L. et al. Protein expression guided chemical profiling of living cells by the simultaneous observation of Raman scattering and anti-Stokes fluorescence emission. Sci. Rep. 7, 43569. https://doi.org/10.1038/srep43569 (2017).
Yuan, Y. et al. Raman spectra of the GFP-like fluorescent proteins. Biophys. Rep. 4, 265–272. https://doi.org/10.1007/s41048-018-0072-0 (2018).
Huang, W. E. et al. Raman-FISH: combining stable-isotope Raman spectroscopy and fluorescence in situ hybridization for the single cell analysis of identity and function. Environ. Microbiol. 9, 1878–1889. https://doi.org/10.1111/j.1462-2920.2007.01352.x (2007).
Lauwerends, L. J. et al. The complementary value of intraoperative fluorescence imaging and Raman spectroscopy for cancer surgery: combining the incompatibles. Eur. J. Nucl. Med. Mol. Imaging. 49, 2364–2376. https://doi.org/10.1007/s00259-022-05705-z (2022).
Rygula, A. et al. Raman spectroscopy of proteins: a review. J. Raman Spectrosc. 44, 1061–1076. https://doi.org/10.1002/jrs.4335 (2013).
Surmacki, J. M. & Abramczyk, H. Confocal Raman imaging reveals the impact of retinoids on human breast cancer via monitoring the redox status of cytochrome c. Sci. Rep. 13, 15049. https://doi.org/10.1038/s41598-023-42301-z (2023).
Qi, Y. et al. Recent progresses in machine learning assisted Raman spectroscopy. Adv. Opt. Mater. 11, 2203104. https://doi.org/10.1002/adom.202203104 (2023).
Ullah, R. et al. Raman spectroscopy combined with a support vector machine for differentiating between feeding male and female infants mother’s milk. Biomed. Opt. Express. 9, 844–851. https://doi.org/10.1364/boe.9.000844 (2018).
Wang, X. et al. Robust spontaneous Raman flow cytometry for Single-Cell metabolic phenome profiling via pDEP-DLD-RFC. Adv. Sci. 10, 2207497. https://doi.org/10.1002/advs.202207497 (2023).
The MathWorks, Inc. Choose Classifier Options. https://de.mathworks.com/help/stats/choose-a-classifier.html
Akagi, Y., Mori, N., Kawamura, T., Takayama, Y. & Kida, Y. S. Non-invasive cell classification using the paint Raman express spectroscopy system (PRESS). Sci. Rep. 11, 8818. https://doi.org/10.1038/s41598-021-88056-3 (2021).
Brauchle, E., Thude, S., Brucker, S. Y. & Schenke-Layland, K. Cell death stages in single apoptotic and necrotic cells monitored by Raman microspectroscopy. Sci. Rep. 4, 4698. https://doi.org/10.1038/srep04698 (2014).
Riva, M. et al. Glioma biopsies classification using Raman spectroscopy and machine learning models on fresh tissue samples. Cancers 13, 1073 (2021).
Quesnel, A. et al. Glycosylation spectral signatures for glioma grade discrimination using Raman spectroscopy. BMC Cancer. 23, 174. https://doi.org/10.1186/s12885-023-10588-w (2023).
Hsu, C. C. et al. A single-cell Raman-based platform to identify developmental stages of human pluripotent stem cell-derived neurons. Proc. Natl. Acad. Sci. 117, 18412–18423. https://doi.org/10.1073/pnas.2001906117 (2020).
Shaik, T. A. et al. Monitoring changes in biochemical and Biomechanical properties of collagenous tissues using Label-Free and nondestructive optical imaging techniques. Anal. Chem. 93, 3813–3821. https://doi.org/10.1021/acs.analchem.0c04306 (2021).
Gasteiger, E. et al. ExPASy: the proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res. 31, 3784–3788. https://doi.org/10.1093/nar/gkg563 (2003).
Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods. 9, 676–682. https://doi.org/10.1038/nmeth.2019 (2012).
Holmi, J. T. & Lipsanen, H. WITio: A MATLAB data evaluation toolbox to script broader insights into big data from WITec microscopes. SoftwareX 18, 101009. https://doi.org/10.1016/j.softx.2022.101009 (2022).
Eilers, P. H. & Boelens, H. F. Baseline correction with asymmetric least squares smoothing. Leiden Univ. Med. Centre Rep. 1, 5 (2005).
Bettinardi, R. G. computeCohen_d(x1, x2, varargin), (2024). https://www.mathworks.com/matlabcentral/fileexchange/62957-computecohen_d-x1-x2-varargin
Ma, C. et al. Single cell Raman spectroscopy to identify different stages of proliferating human hepatocytes for cell therapy. Stem Cell Res. Ther. 12, 555. https://doi.org/10.1186/s13287-021-02619-9 (2021).
Klein, K. Label-free microscopic bioimaging by means of confocal Raman spectroscopy on living glioblastoma cells MD thesis, Technical University of Munich, (2013).
Magni, G. et al. Experimental study on blue light interaction with human Keloid-Derived fibroblasts. Biomedicines 8, 573 (2020).
Gualerzi, A. et al. Raman spectroscopy uncovers biochemical tissue-related features of extracellular vesicles from mesenchymal stromal cells. Sci. Rep. 7, 9820. https://doi.org/10.1038/s41598-017-10448-1 (2017).
Han, X. et al. The combined use of serum Raman spectroscopy and D dimer testing for the early diagnosis of acute aortic dissection. Heliyon 10 https://doi.org/10.1016/j.heliyon.2024.e32474 (2024).
Aksoy, C. & Severcan, F. Role of vibrational spectroscopy in stem cell research. Spectroscopy: Int. J. 27, 513286. https://doi.org/10.1155/2012/513286 (2012).
Sadat, A. & Joye, I. J. Peak fitting applied to fourier transform infrared and Raman spectroscopic analysis of proteins. Appl. Sci. 10, 5918 (2020).
Liao, C. S. et al. Microsecond scale vibrational spectroscopic imaging by multiplex stimulated Raman scattering microscopy. Light: Sci. Appl. 4, e265–e265. https://doi.org/10.1038/lsa.2015.38 (2015).
Acknowledgements
The authors extend their sincere gratitude to Sandra Baur for assistance with cloning and staining and to Christian Schustetter and Thomal Matt for their technical support. Further, we would like to thank Monika Leischner-Brill for providing access to the Olympus confocal fluorescence microscope and for her advice on image analysis.
Funding
Open Access funding enabled and organized by Projekt DEAL. This work was supported by a generous grant of the Else Kröner-Fresenius-Stiftung.
Author information
Contributions
JR conceived the experimental lineup, conducted the experiments and data analysis, and drafted the manuscript. Ben Gardner and Nick Stone introduced JR to the Raman systems in Exeter, supplied MATLAB code, and provided constant support on data analysis. Alexander Gigler introduced JR to the Raman systems in Munich and developed the workflow for collecting the Raman spectra on the WITec system. Suzy Eldershaw contributed significantly to the cell culture and fluorescence imaging. Friederike Liesche-Starnecker helped with fluorescence imaging and analysis. Jürgen Schlegel conceived the experimental objective, provided vital feedback on cloning, imaging, western immunoblotting, and Raman interrogation, and mentored JR throughout the project. All authors commented on previous versions of the manuscript, contributed to the study conception and design, and read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Reifenrath, J., Gardner, B., Gigler, A. et al. Fluorescence Guided Raman Spectroscopy enables the training of robust support vector machines for the detection of tumour marker proteins. Sci Rep 15, 23711 (2025). https://doi.org/10.1038/s41598-025-08425-0