Abstract
Time-resolved coherent Raman spectroscopy (CRS) is a powerful non-linear optical technique for quantitative, in-situ analysis of chemically reacting flows, offering unparalleled accuracy and exceptional spatiotemporal resolution. Its application to large polyatomic molecules, crucial for understanding reaction dynamics, has thus far been limited by the complexity of their rotational-vibrational Raman spectra. Progress in developing comprehensive spectral codes for these molecules, a longstanding goal, has been hindered by prohibitively long computation times required for their spectral synthesis. Here, we present an algorithm that achieves a million-fold improvement in computation time compared to existing methods. The algorithm demonstrates remarkable accuracy, with an approximation error below 0.1% across all tested probe delays, at both room temperature (296 K) and elevated temperatures (1500 K). This result could greatly expand the application of time-resolved CRS, particularly in plasma research, as well as in broader atmospheric and astrophysical sciences.

Introduction
The advancement of laser technology over the last six decades has driven the development of optical diagnostics able to measure chemo-physical phenomena on molecular scales, moving from the rotational and vibrational nuclear motions on the pico- (ps) and femtosecond (fs) timescale1, to the motion of electrons on the attosecond one2. The increasingly shorter duration and correspondingly higher peak power of pulsed lasers have furthermore enabled the development of time-resolved nonlinear optical spectroscopy3. Coherent Raman spectroscopy (CRS), in particular, has benefitted from the application of fs laser pulses to impulsively generate coherent rotational and vibrational wave packets, maximizing the quantum coherence and thus the signal yield4.
CRS holds enormous potential to study chemical reactions in gas-phase media, and it is presently regarded as the gold standard for quantitative, in situ measurements in chemically reacting flows5,6. Bohlin and Kliewer, for example, have demonstrated hyperspectral imaging thermometry and wideband chemical detection in combustion environments7,8. Recently, CRS spectroscopy has also been employed to perform state-selective measurements of the collisional dynamics and energy transfer of molecular superrotors9,10, and to detect the angular momentum transfer between roton pairs in superfluid helium11.
Quantitative CRS measurements have nonetheless been limited to a notably small set of well-characterized chemical species—typically diatomic molecules, e.g., N2, H2, and CO, whose rotational and vibrational motions are well described by simple quantum mechanical models12. Numerical CRS codes for the vibrational spectra of a handful of triatomic molecules, such as CO2, N2O, and H2O13,14, are also available. The extension of these codes to simulate the spectra of larger polyatomic molecules is complicated by the substantial number of vibrational modes to consider, as well as by the possibility of mode coupling (e.g., Coriolis coupling15). Theoretical models have been developed to perform vibrational spectroscopy on some highly symmetric polyatomics like CH416, SF617, and H2CO18, while pure-rotational spectroscopy has been applied to C2H619 and C2H420.
The application of CRS to larger polyatomic molecules requires computer codes that can handle their vast vibrational complexity. In order to harness the full diagnostic potential of CRS one would thus need: (i) spectroscopic databases, providing the Raman frequencies and transition polarizabilities of the target molecules, and (ii) a way of rapidly synthesizing the CRS spectra from these databases.
On the one hand, recent advancements in computational chemistry have prompted the creation of large spectral databases for molecular species relevant to atmospheric, planetary, and astrophysical studies21,22. Quantum chemistry methods furthermore allow for the vibrational mode assignment and the calculation of the polarizability of significantly more complex molecules, e.g. organosilicon compounds23.
On the other hand, the synthesis of the CRS spectra from large spectral databases represents an unaddressed bottleneck. The recent study of ref. 24 provides an illustrative example: the authors employed the MeCaSDa spectral database25 (accessible at http://vamdc.icb.cnrs.fr/PHP/methane.php) to simulate the symmetric C–H stretch spectrum of CH4, but had to filter the line list of ~1.7 million (M) spectral lines, reducing the number of considered transitions to ~6000 at elevated temperature. They furthermore employed a high-performance computing cluster to calculate the resulting spectra24. Mazza et al. used the same database to compute the H–C–H bend spectrum of CH4: when employing the whole line list (~11 M lines), the computation of a single spectrum required more than 8 h on a commercial laptop26. With this being the state of the art, the widespread use of CRS for quantitative spectroscopy on large polyatomics, especially in hot environments where one may need to consider many billions (B) of spectral lines21, is hindered by the lack of fast spectral algorithms.
In the present work, we introduce an algorithm for the rapid synthesis of time-resolved CRS spectra, demonstrating a reduction of the computation time by a factor of 1 M. We refer to this algorithm as “ultrafast”, in view of the parallel between this result and the shift from nanosecond to femtosecond lasers, which enabled ultrafast time-resolved spectroscopy1. The algorithm is based on the discrete integral transform developed by van den Bekerom and Pannier27, who demonstrated a 2–3 orders-of-magnitude reduction in the computational cost of simulating the absorption spectrum of CO2, as compared to the legacy spectral synthesis code in RADIS28. Here, the accuracy and computational performance of the ultrafast algorithm are assessed against the numerical code of Mazza et al., which models the spectrum of CH4 in the dyad region of its Raman spectrum26. We furthermore discuss the potential implications of these results for the application of time-resolved CRS to investigate the gas-phase chemistry of large polyatomic molecules.
Results
Integral transform formulation
The time-domain description29,30 of CRS is based on the sudden approximation31, whereby the CRS signal arises from the interaction of a probe pulse with a molecular wave packet, impulsively created by the fs pump pulse32. The creation of such a wave packet need not be described in detail, but rather a phenomenological model for the nonlinear optical response of the medium can be used3. According to this model, the intensity of the CRS polarization field, as a function of its angular frequency \(\omega\), is given as:
\[I\left(\omega \right)={\left|{\hat{P}}_{{\rm{CRS}}}\left(\omega \right)\right|}^{2}\qquad (1)\]
with \({\hat{P}}_{{{{\rm{CRS}}}}}\left(\omega \right)={\sum}_{i}{\hat{P}}_{i}\left(\omega \right)\) the total polarization, and the polarization for a single Raman line \({\hat{P}}_{i}\left(\omega \right)\) given by:
\[{\hat{P}}_{i}\left(\omega \right)={\mathcal{F}}\{{\chi }_{i}^{\left(3\right)}\left({t}_{1}\right){E}_{{\rm{pr}}}\left({t}_{1}-\tau \right)\}\qquad (2)\]
where \({\chi }_{i}^{\left(3\right)}\left(t\right)={W}_{i}\exp \left[\left(j{\omega }_{i}-{\Gamma }_{i}\right)t\right]\) is the third-order nonlinear optical susceptibility for an individual line, with \({\omega }_{i}\) the transition’s line position and \({\Gamma }_{i}\) the corresponding damping coefficient, \({E}_{{{{\rm{pr}}}}}\) is the probe laser field, and \({t}_{1}\) is the coherence time relative to the pump/Stokes laser.
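The Fourier transform (FT) can be centered around the center of the probe intensity profile at \(\tau\) by replacing \({t}_{1}\to t+\tau\), with \(t\) the time relative to the center of the probe pulse. Equation 2 then becomes:
\[{\hat{P}}_{i}\left(\omega \right)=\exp \left(-j\omega \tau \right){\mathcal{F}}\{{\chi }_{i}^{\left(3\right)}\left(t+\tau \right){E}_{{\rm{pr}}}\left(t\right)\}\qquad (3)\]
Here the complex phase common to all lines, \(\exp \left(-j\omega \tau \right)\), may be omitted from Eq. 3, as it disappears when the squared modulus is taken in Eq. 1. Furthermore, using the frequency shift property of the FT and applying a forward shift of \({\omega }_{i}+j{\Gamma }_{i}\), the total polarization can be written as:
\[{\hat{P}}_{{\rm{CRS}}}\left(\omega \right)={\sum}_{i}{\chi }_{i}^{\left(3\right)}\left(\tau \right){\widehat{E}}_{{\rm{pr}}}\left(\omega -{\omega }_{i}-j{\Gamma }_{i}\right)\qquad (4)\]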
with \({\widehat{E}}_{{{{\rm{pr}}}}}\left(\omega \right)={{{\mathcal{F}}}} \{{E}_{{{{\rm{pr}}}}}\left(t\right)\}\) the FT of the probe field. Equation 4 shows that the total polarization may be expressed directly as the sum of lineshapes \({\widehat{E}}_{{{{\rm{pr}}}}}\left(\omega \right)\) that are scaled by a (complex) factor \({\chi }_{i}^{\left(3\right)}\left(\tau \right)\) and are “shifted” by \({\omega }_{i}+j{\Gamma }_{i}\), where convergence of the integral transform is guaranteed as \({\Gamma }_{i} > 0\) for all lines. In this expression, only the term \({\chi }_{i}^{\left(3\right)}\left(\tau \right)\) is a function of the pump-probe delay \(\tau\), and as such it is fully responsible for the well-known effect of coherence beating33.
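The conventional, line-by-line evaluation of the model above can be sketched in a few lines of Python. This is a minimal illustration and not the benchmarked reference code: the Gaussian probe envelope, the units (ps and rad/ps), and the function names `gaussian_probe` and `crs_spectrum_reference` are assumptions made for the example.

```python
import numpy as np


def gaussian_probe(t, fwhm):
    """Gaussian probe field envelope centred at t = 0 (assumed pulse shape)."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return np.exp(-0.5 * (t / sigma) ** 2)


def crs_spectrum_reference(omega_i, gamma_i, w_i, tau, t, probe_fwhm):
    """Line-by-line CRS intensity spectrum.

    omega_i: line positions (rad/ps); gamma_i: damping coefficients (1/ps);
    w_i: line strengths; tau: probe delay (ps);
    t: uniform time axis relative to the probe centre (ps).
    Returns the angular-frequency axis (rad/ps) and the intensity on it.
    """
    t1 = t + tau                    # time since the impulsive excitation
    causal = t1 >= 0.0              # no coherence before the pump/Stokes pulse

    # chi(t1) = sum_i W_i exp[(j*omega_i - Gamma_i) * t1], one line at a time:
    # the cost grows as (number of lines) x (number of time samples).
    chi = np.zeros(t.size, dtype=complex)
    for om, ga, w in zip(omega_i, gamma_i, w_i):
        chi[causal] += w * np.exp((1j * om - ga) * t1[causal])

    p_t = chi * gaussian_probe(t, probe_fwhm)     # time-domain polarization
    p_w = np.fft.fftshift(np.fft.fft(p_t))        # to the frequency domain
    omega = 2.0 * np.pi * np.fft.fftshift(np.fft.fftfreq(t.size, t[1] - t[0]))
    return omega, np.abs(p_w) ** 2
```

The explicit loop over lines is what makes this approach expensive for line lists with millions of entries: every line is evaluated on every time sample before a single transform is taken.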
Since the total polarization is thus essentially a sum of lineshapes, \({\widehat{P}}_{{{{\rm{CRS}}}}}\left(\omega \right)\) can be expressed as an integral kernel over the space of lineshape parameters \(\omega\) and \(\Gamma\) by following the approach of van den Bekerom and Pannier27:
Here \({\hat{X}}^{\left(3\right)}\) (and \(\hat{W}\)) is the distribution function of the third-order nonlinear susceptibility (and line strength, respectively), that contains all spectral information:
Equation 5 expresses \({\hat{P}}_{{{{\rm{CRS}}}}}\left(\omega \right)\) as a convolution over \(\omega\), which can equivalently be written as a product in the time domain. A compact expression for the time-domain polarization \({{{{\rm{P}}}}}_{{{{\rm{CRS}}}}}\left(t\right)\) is thus obtained:
Here the condition \(t\ge -\tau\) enforces the physical constraint that no CRS polarization field is generated before the impulsive excitation of the Raman coherence: \({P}_{{{{\rm{CRS}}}}}\left(t\right)\) is zero for \(t < -\tau\). By transforming \({P}_{{{{\rm{CRS}}}}}\left(t\right)\) back to the frequency domain and plugging into Eq. 1, the ultrafast computation of the CRS spectrum is achieved.
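The idea of replacing the sum over lines by a single transform of a gridded distribution can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: it assumes one damping coefficient shared by all lines (so the integral over \(\Gamma\) collapses to a single term), plain linear interpolation weights, a Gaussian probe, and the hypothetical function name `crs_spectrum_gridded`.

```python
import numpy as np


def crs_spectrum_gridded(omega_i, w_i, gamma, tau, t, probe_fwhm):
    """CRS intensity spectrum from a gridded line-strength distribution.

    Simplifications assumed here: a single damping coefficient `gamma`
    shared by all lines, linear interpolation weights, a Gaussian probe,
    and line positions between 0 and the Nyquist frequency of `t`.
    """
    n = t.size
    dt = t[1] - t[0]
    d_omega = 2.0 * np.pi / (n * dt)       # frequency step conjugate to t

    # Distribute each line over its two neighbouring gridpoints,
    # omega[k] = k * d_omega.
    pos = omega_i / d_omega
    k = np.floor(pos).astype(int)
    lam = pos - k                           # relative position, 0 <= lam < 1
    w_grid = np.zeros(n)
    np.add.at(w_grid, k, (1.0 - lam) * w_i)
    np.add.at(w_grid, k + 1, lam * w_i)

    # chi(t1) = exp(-gamma*t1) * sum_k W[k] exp(j*omega[k]*t1), t1 = t + tau.
    # The sum over k is one inverse DFT, whatever the number of lines.
    t1 = t + tau
    phase = np.exp(1j * np.arange(n) * d_omega * t1[0])
    decay = np.exp(-gamma * np.maximum(t1, 0.0))
    decay[t1 < 0.0] = 0.0                   # no coherence before excitation
    chi = n * np.fft.ifft(w_grid * phase) * decay

    sigma = probe_fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    p_t = chi * np.exp(-0.5 * (t / sigma) ** 2)   # product in the time domain
    omega = 2.0 * np.pi * np.fft.fftshift(np.fft.fftfreq(n, dt))
    return omega, np.abs(np.fft.fftshift(np.fft.fft(p_t))) ** 2
```

With linear weights, the interpolation error of each line grows roughly as the square of the frequency step times the delay, so the time window must be much longer than the probe delay for this sketch to be accurate.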
Equations 6a and 7 are discretized by replacing the continuous FT by a discrete FT (DFT) and replacing the integral over \(\Gamma\) by a summation, as summarized in Methods and explained in more detail in van den Bekerom and Pannier27. Figure 1 presents a graphical summary of the discretization method. In discretizing Eq. 6a, weights must be assigned to reproduce function values that are not on gridpoints. The choice of weights offers a degree of freedom that allows for the complete elimination of the approximation error at a single point on the \(t\) or \(\omega\) axis. With the polarization signal strongly concentrated around the probe pulse center, elimination of the error at t = 0 s seems appropriate. Under this choice, the weights for position and damping for the ith spectral line, \({a}_{\omega ,i}\) and \({a}_{\Gamma ,i}\) respectively, are given by:
with the dimensionless grid sizes \({\theta }_{\Delta \omega }=j\tau \Delta \omega /2\) and \({\theta }_{\Delta \Gamma }=-\tau \Delta \Gamma /2\). The factors \({\lambda }_{\omega ,i}=\left({\omega }_{i}-\omega [{k}_{i}]\right)/\Delta \omega\) and \({\lambda }_{\Gamma ,i}=\left({\Gamma }_{i}-\Gamma [{l}_{i}]\right)/\Delta \Gamma\), with values between 0 and 1, describe the relative position of the line with respect to the gridpoints. The step-by-step derivation of Eq. 8, as well as strategies for its efficient numerical evaluation, can be found in the Supplementary Information, section “Weight derivation and evaluation”.
(Top-left) Schematic representation of line strength distribution W[k, l]; crosses represent exact lines; red diamonds represent constituent lines with parameters on the grid. (Bottom-left) Lineshapes corresponding to the lines in the top graph, showing the constituent lineshapes in red, the approximated lineshape in blue, and the exact lineshape in dashed black. The opacity of the red lineshapes indicates the weight with which they contribute to the approximated lineshape. (Right) Detailed schematic of the last line, showing the weights of the individual position and damping contributions as bar plots. In this figure, the grid spacing is taken coarser than would be typical in order to highlight discrepancies.
The discretization error introduced by the choice of weights in Eq. 8 can be derived analytically under relatively weak assumptions (see Supplementary Information, section “Approximation error—frequency-domain approximation”). The error component due to the interpolation of the spectral position, \({\omega }_{i}\) (or damping coefficient, \({\Gamma }_{i}\)), is evaluated as a function of the following dimensionless variables: complex frequency, \(z=\left(\omega -{\omega }_{i}-j{\Gamma }_{i}\right){\Delta t}_{{{{\rm{pr}}}}}\); probe pulse duration-to-delay ratio, \({\theta }_{\Delta t}={\Delta t}_{{{{\rm{pr}}}}}/\tau\); and the previously defined dimensionless grid size, \({\theta }_{\Delta \omega }\) (or \({\theta }_{\Delta \Gamma }\)). It is possible to show (proof given in Supplementary Information, section “Approximation error—frequency-domain approximation”) that the approximation error has an upper bound of:
which is valid whenever \(\left|{\theta }_{\Delta x}\right|\ll 1\) and \({\theta }_{\Delta t}\left|{\theta }_{\Delta x}\right|\ll 1\) (see Supplementary Information, section “Frequency-domain approximation”). With \({\theta }_{\Delta t}\) and the probe laser pulse shape and duration set by the experiment, the error in Eq. 9 only depends on the grid size chosen to discretize the position and damping space, which can be set arbitrarily small. For example, if the probe delay were increased from 5 to 100 ps, the number of spectral points would need to be increased by a factor of √20 ≈ 4.5 to keep the same accuracy. For a large number of lines (>1 M), this will not affect performance because the computation is bottlenecked by populating the distribution function (Eq. 7), as shown by the linear increase in Fig. 2. The algorithm thus allows for arbitrary accuracy in the simulation of CRS spectra, the only limitation being the tradeoff between accuracy and computational performance. Nonetheless, the presented approach provides a dramatic performance improvement even at only a slight concession in accuracy.
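Read at face value, the numerical example above implies that the required number of spectral points \(N\) grows with the square root of the probe delay. This scaling is inferred here from the quoted numbers and is not a formula taken from Eq. 9:

\[\frac{N\left({\tau }_{2}\right)}{N\left({\tau }_{1}\right)}=\sqrt{\frac{{\tau }_{2}}{{\tau }_{1}}}=\sqrt{\frac{100\,{\rm{ps}}}{5\,{\rm{ps}}}}=\sqrt{20}\approx 4.47\]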
Benchmark of different approaches/implementations for synthesizing time-resolved CRS spectra. ref refers to the reference approach based on Eq. 3, implemented as Python (-py) and C++ code (-cpp); ufa refers to the ultrafast approximation based on Eq. 7, also implemented as both Python (-py) and C++ (-cpp) codes. The fully optimized implementation (ufa-cpp) includes multithreading and full SIMD vectorization, which improves computation time by a factor of 1 M compared to the conventional approach (ref-py) for a large number of spectral lines. The performance in the \(10^{7}\)–\(10^{8}\) line range is evaluated by synthetically expanding the number of lines in the MeCaSDa database (see Methods).
Improvement in computational performance
Figure 2 illustrates the reduction in the computational cost of different numerical implementations of the algorithm detailed above. The performance of the conventional time-domain modeling based on Eq. 3 (named: ref) is compared to that of the ultrafast algorithm based on Eq. 7 (named: ufa). For both cases, a Python and a C++ implementation (-py and -cpp, respectively) are tested; more details are available in the Methods section. The MeCaSDa spectral database25 is rescaled according to the procedure of Mazza et al.26 to calculate the weights \({W}_{i}\) of the ~\(10^{7}\) individual lines in the H–C–H bend (ν2 mode) CRS spectrum of CH4. To the best of the authors’ knowledge, this represents the largest experimentally verified line list available for CRS spectroscopy. The computation time for an increasing number of spectral lines included in the spectrum is measured for the different numerical implementations. As detailed in Methods, a total of nine spectra at different conditions is computed at each benchmarking point in Fig. 2, with the markers in the figure representing the average of the nine computation times.
While implementing the conventional time-domain model in a compiled language (ref-cpp) makes the simulation of the CRS spectrum a couple of times faster than in an interpreted language such as Python (ref-py), this improvement is insignificant compared to the gains of the ultrafast spectral algorithm. Even considering the slower, interpreted implementation of the fast algorithm (ufa-py), the computation time for a single spectrum employing the full database is reduced from 4 h to 5 s, an improvement of more than three orders of magnitude. The fast algorithm implemented in C++ (ufa-cpp) includes multithreading, making use of all the cores, as well as full code vectorization through single instruction multiple data (SIMD) intrinsics at the code bottlenecks. This implementation reduces the computation time to 14 ms, representing an improvement by a factor of 1 M over the present state of the art. We note that compared to ufa-py, ufa-cpp does not make any new approximations, and their outputs are nominally the same. The demonstrated improvement in computational performance enables the on-the-fly evaluation of experimental data for quantitative CRS (see Supplementary Information, section “On-the-fly fitting of experimental data”), which could lead to widespread commercialization of CRS spectrometers for gas-phase chemical analysis. The application code described there is included in the supplementary files as Supplementary Software 1.
We furthermore explore the possibility of employing even more extensive spectral databases, pushing above the tens of millions of Raman lines, by synthetically increasing the size of the spectral database (see Methods) as shown in Fig. 2. When a total of 100 M spectral lines is considered, the conventional modeling time for a single CRS spectrum is estimated to be on the order of an entire day, depending on the language employed. The same computation, on the other hand, takes less than a single second (97 ms) with the ultrafast algorithm. Our ultrafast spectral algorithm is, therefore, the only approach currently available to fully employ even larger line lists, possibly containing tens of billions of spectral lines (e.g. the YT34to10 line list34 contains 35 B transitions for high-temperature CH4), without the need for line reduction schemes. The estimated computation time required by conventional codes to simulate a single CRS spectrum employing 10 B lines is on the order of many months. In contrast, the ufa-cpp implementation would perform the same task in less than 10 s. Finally, we remark that the new algorithm already outperforms the reference case for a small number of lines. A typical spectrum for a diatomic molecule will contain between 100 and 1000 lines, such that the presented algorithm provides a factor of 30–300 improvement: fast enough to perform on-the-fly fitting. The data underlying Fig. 2 are appended to the paper as supplementary data in Supplementary Data 1.
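The speed-up factors and extrapolated timings quoted in this section can be reproduced with simple arithmetic. The sketch below uses only the timings stated in the text and assumes, as the linear trend in Fig. 2 suggests, that the computation time scales linearly with the number of lines; the variable and function names are illustrative.

```python
# Timings quoted in the text for the full CH4 nu2 line list (~1e7 lines).
REF_PY_S = 4 * 3600.0      # conventional Python code: about 4 h
UFA_PY_S = 5.0             # ultrafast algorithm in Python: about 5 s
UFA_CPP_S = 14e-3          # ultrafast algorithm in C++: about 14 ms

speedup_ufa_py = REF_PY_S / UFA_PY_S      # more than three orders of magnitude
speedup_ufa_cpp = REF_PY_S / UFA_CPP_S    # about one million


def extrapolate_linear(time_s, n_lines_from, n_lines_to):
    """Scale a computation time linearly with the number of lines (assumption)."""
    return time_s * n_lines_to / n_lines_from


# From 100 M lines (97 ms) to 10 B lines with ufa-cpp.
ufa_cpp_10b_s = extrapolate_linear(97e-3, 100e6, 10e9)

# From ~1e7 lines (4 h) to 10 B lines with the conventional code, in days.
ref_py_10b_days = extrapolate_linear(REF_PY_S, 1e7, 10e9) / 86400.0

print(f"ufa-py speed-up:  {speedup_ufa_py:.0f}x")
print(f"ufa-cpp speed-up: {speedup_ufa_cpp:.2e}x")
print(f"ufa-cpp, 10 B lines: {ufa_cpp_10b_s:.1f} s")
print(f"ref-py, 10 B lines:  {ref_py_10b_days:.0f} days")
```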
Accuracy of the ultrafast algorithm in the frequency domain
In addition to enabling the rapid computation of the CRS spectra, the integral transform approach is also highly accurate. The error introduced in the simulation of the CH4 ν2 mode spectrum by the ultrafast algorithm is represented in Fig. 3. The error, evaluated for the ufa-cpp implementation benchmarked in the section above, is relative to the ref-py implementation and is normalized to the maximum intensity of the spectrum. Figure 3a, b show a map of the error as a function of the frequency and probe pulse delay, simulating a pump-probe CRS experiment, for T = 296 K and T = 1500 K, respectively.
a, b Map of the approximation error of the frequency-domain implementation of the ultrafast algorithm (ufa-cpp) with respect to the reference (ref-py), evaluated at room temperature (T = 296 K, a) and elevated temperature (T = 1500 K, b), respectively, for a range of probe delays from 20 to 200 ps. The error is taken relative to the largest peak in the spectrum. Regardless of temperature and probe delay, the error never exceeds 0.1%. For probe delays of 50, 100, and 150 ps, representative spectra are plotted in c for T = 296 K and in d for T = 1500 K, as indicated by the dashed lines in (a, b). The residuals, imperceptible in (c, d), are plotted in (e, f), respectively.
A clear beating pattern is observed for all spectral lines in both Fig. 3a, b, matching the coherence beating of the unresolved, Coriolis-split ro-vibrational transitions26. This shows that the dominating error components in the simulated CRS spectra arise from the coalescence of the approximation errors of the individual unresolved lines. The frequency-domain error is observed to be weakly dependent on the probe delay, with the error slightly increasing for larger probe delays. Nonetheless, for all delays between 20 and 200 ps, the error remains below 0.1%. In Fig. 3c, three sample spectra at probe delays of τ = 50, 100, and 150 ps and a temperature of 296 K are depicted, with their residuals plotted in Fig. 3e. In all cases, a discrepancy between the ufa-cpp (solid, colored) and ref-py (dashed, black) implementations is imperceptible at this scale. The magnitude of the approximation error for τ = 50 and 100 ps is less than 0.05% for all the different spectral branches, while for τ = 150 ps, the error in the Q-branch and the O(6) line at ~1655 cm−1 is increased to about 0.1% in magnitude. To put this result into perspective, such an approximation error is equivalent to the stochastic error in a hypothetical CRS experiment, where the signal-to-noise ratio (SNR) exceeded 30 dB, which is beyond the dynamic range of most detectors used in commercial spectrometers.
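The quoted equivalence follows from expressing the 0.1% relative error in decibels. The worked step below assumes that the SNR is defined as a ratio of intensities (a power ratio), which the text does not state explicitly:

\[{{\rm{SNR}}}_{{\rm{dB}}}=10\,{\log }_{10}\left(\frac{1}{{10}^{-3}}\right)=30\,{\rm{dB}}\]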
Figure 3b, d, f show the relative error for the frequency-domain spectra computed at T = 1500 K, approximately the reference temperature for the calculation of the line intensities in the MeCaSDa database35. At elevated temperatures, the higher vibrational states become significantly populated and contribute to the CRS spectrum. This is underscored by the stronger coherence beating of all the spectral features in Fig. 3b, as well as by the larger number of P-branch lines appearing in Fig. 3d. Nevertheless, it is important to remark that the magnitude of the relative error is the same as in the room temperature simulation. The ultrafast algorithm thus allows for the inclusion of a large number of higher vibrational states in the spectral computation, with no additional detriment to its accuracy.
Accuracy of the ultrafast algorithm in the time domain
The possibility of performing time-resolved measurements, by delaying the probe pulse relative to the excitation pulse in pump-probe experiments, enables the use of CRS to study the decoherence of the molecular wave packet due to the interaction with its environment9,10,36. The synthetic modeling of these “probe delay scans” is even more computationally demanding than the frequency-domain spectral calculations described in the previous section, as Eqs. 1 and 3 need to be evaluated for all the values of τ in the delay series24. The integral transform formulation of the third-order susceptibility can also be employed here to derive an ultrafast algorithm for the time-domain synthesis of the CRS signal (details in Supplementary Information, section “Ultrafast probe delay scan”):
Analogous to Eq. 7, the condition \({\chi }^{(3)}\left(\tau \, < \, 0\right)=0\) introduces the physical consideration that there is no Raman coherence before the interaction of the molecular ensemble with the pump/Stokes laser pulse. The approximation in Eq. 10b is coarser than the one in the frequency domain, because the weights given in Eq. 8 eliminate the error only at a single point in the temporal grid. Moreover, the integral transform is predicated on the fact that neighboring lineshapes in the grid differ only slightly, which is not the case in the time domain where the prototype lineshapes rapidly oscillate (see Supplementary Information, section “Approximation error—time-domain approximation”).
In Fig. 4, we compare the dephasing of the integrated Q-branch signal over a delay of 200 ps, as simulated by the conventional and ultrafast spectral algorithms. In Fig. 4a, b, we compare three different approaches for the synthesis of the probe delay scan, including the 250 most intense lines in the Q-branch spectrum: (i) ref-py-w: integration of \(I(\omega )\) from Eq. 4, resulting in a computation time of 92.4 s; (ii) ufa-cpp-w: integration of \(I(\omega )\) from Eq. 7 (computation time: 1.0 s); and (iii) ufa-cpp-t: direct computation of \(I(\tau )\) using Eq. 10a (computation time: 4.8 ms). The error is shown to be minimal for the ufa-cpp-w case, but significantly larger for the direct approach implemented as ufa-cpp-t, showing a clear parabolic structure as expected (see Supplementary Information, section “Approximation error—time-domain approximation”).
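A probe delay scan obtained by spectral integration, in the spirit of the ref-py-w approach, can be sketched as follows. This is an illustrative, line-by-line version under stated assumptions (Gaussian probe, units of ps and rad/ps, hypothetical function name `delay_scan`) and is practical only for a small number of lines. It exploits the fact that only the complex factor \({\chi }_{i}^{\left(3\right)}\left(\tau \right)\) of each line depends on the delay.

```python
import numpy as np


def delay_scan(omega_i, gamma_i, w_i, taus, t, probe_fwhm):
    """Spectrally integrated CRS signal as a function of probe delay.

    omega_i, gamma_i, w_i: line positions (rad/ps), damping (1/ps), strengths;
    taus: probe delays (ps); t: time axis relative to the probe centre (ps).
    """
    sigma = probe_fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    e_probe = np.exp(-0.5 * (t / sigma) ** 2)      # assumed Gaussian probe

    # Delay-independent part of every line, exp[(j*omega_i - Gamma_i) * t].
    basis = np.exp((1j * omega_i[:, None] - gamma_i[:, None]) * t[None, :])

    signal = np.empty(len(taus))
    for m, tau in enumerate(taus):
        # Delay-dependent complex factor chi_i(tau) of each line.
        coeff = w_i * np.exp((1j * omega_i - gamma_i) * tau)
        chi = coeff @ basis                         # sum over lines
        chi[t + tau < 0.0] = 0.0                    # no coherence before excitation
        spectrum = np.abs(np.fft.fft(chi * e_probe)) ** 2
        signal[m] = spectrum.sum()                  # integrate over frequency
    return signal
```

For a single isolated line and delays well beyond the probe duration, the integrated signal returned by this sketch decays as \(\exp \left(-2{\Gamma }_{i}\tau \right)\); with several lines inside the probe bandwidth, the interference of the factors \({\chi }_{i}^{\left(3\right)}\left(\tau \right)\) produces the beating seen in delay scans.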
a The fast approximation is compared against the reference for the 250 most intense lines in the Q-branch. Integration in frequency space (-w) yields accurate results in either case, so ufa-cpp-w is used as a fast reference for the other plots. b The error for the direct time-domain approach (ufa-cpp-t) is somewhat larger, ±7.5%. A parabolic profile is clearly visible in the error. c Error evaluated for the direct approach (ufa-cpp-t) for different numbers of gridpoints \({N}_{\omega }\) and \({N}_{\Gamma }\). d The error in the time domain is significantly reduced when a larger number of points is considered. e The coherence decay is evaluated for Nω = 40,000 and NΓ = 4 at different temperatures. At T = 296 K, δ = 1 and n = 0 were used. At T > 296 K, the values δ = 0.47 and n = 2.67 were taken from ref. 24 (see Methods). f The error remains within 2% for all values of T.
In Fig. 4c, d, we employ the two ultrafast approaches, based on the frequency-domain integration (ufa-cpp-w) and direct time-domain calculation (ufa-cpp-t), to compute the dephasing of the Q-branch signal, including all the corresponding 2.5 M Raman lines. The performance of the time-domain implementation is assessed by employing three different combinations of grid resolution for the frequency and damping axes. Increasing the number of gridpoints has a clear benefit on the time-domain computation, reducing the absolute relative error to less than 1%.
Finally, Fig. 4e, f presents the collisional dephasing of the Q-branch at temperatures T = 296, 800, and 1500 K, synthesized by ufa-cpp-t for Nω = 40,000 and NΓ = 4. For all the tested temperatures, the error with respect to the frequency-domain integration is less than 2% and appears to be insensitive to temperature. The calculation time for the three traces is below 7 ms on average. We furthermore remark that the comparison between the two implementations of the ultrafast algorithm is representative of the accuracy of ufa-cpp-t with respect to the spectrally-integrated conventional time-domain CRS model (ref-py-w). We assessed this by spectrally integrating the error maps in Fig. 3a, b; the resulting approximation error in ufa-cpp-w was found to be <0.03% at all probe delays, for both temperatures.
Discussion
In summary, we have introduced an algorithm for the ultrafast synthesis of time-resolved CRS spectra from large spectral databases, demonstrating a reduction in the computational cost by six orders of magnitude, while maintaining the approximation error below 0.1% in the frequency domain. The achieved performance increase allows for on-the-fly evaluation of the CRS signal, enabling live monitoring of gas-phase chemical reactions. With spectral computation times of ~15 ms, fitting a spectrum typically takes under a second instead of many hours.
Besides the considerable improvement over existing spectral codes, the present algorithm has significant potential to enable the fruitful use of quantitative CRS spectroscopy in currently unexplored research domains. Here, we suggest a couple of prospective applications in the context of extraterrestrial and plasma chemistry. Gaseous CH4 is one of the key components in the atmosphere of astronomical objects such as exoplanets and brown dwarfs34, as well as natural satellites, Titan being a fine example37. As such, CH4 plays a key role in the physical (e.g., light absorption) and chemical (e.g., photochemistry) processes that shape these atmospheres, and large spectral line lists are available to investigate its rotational-vibrational spectrum at temperatures as high as 2000 K34. To the best of our knowledge, the ultrafast algorithm presented here is the only tool available to handle such large spectral databases without the need for additional simplifications, such as quasi-continuum cross-sections, which introduce arbitrary data cutoffs. This makes time-resolved CRS ideal for the laboratory investigation of CH4 chemistry under the extreme conditions found, e.g., in the hypersonic entry in Titan’s atmosphere38 or, possibly, in the prebiotic synthesis of complex organic molecules39.
Another likely application is thermometry in methane plasma discharges, which are rapidly gaining attention as a potential electric alternative for contemporary fossil-powered chemical processes40,41. The challenging conditions in the plasma make CRS an attractive diagnostic technique, but due to the high temperature in the plasma (>3000 K), as well as possible non-equilibrium thermodynamic effects, reducing the number of lines included in the synthetic spectrum comes at the risk of losing spectral accuracy. With the ultrafast algorithm, no such tradeoff needs to be made.
While CH4 is the simplest hydrocarbon, polycyclic aromatic hydrocarbons (PAHs) are large organic molecules, which are also abundant in the interstellar medium and play a significant role in the hydrogen and carbon chemistry in planetary nebulae42. PAHs are also the precursors of soot in hydrocarbon combustion and are thus the subject of widespread laboratory research. The application of CRS in the fingerprint region26 of the Raman spectrum can combine the high chemical specificity required to identify and measure different PAHs43, with the ability to resolve their specific vibrational dynamics in the ground electronic state, thus providing important complementary information to time-resolved XUV-IR spectroscopy44. Currently, however, comprehensive line lists for such large molecules are unavailable. For example, a line-by-line database for the much smaller ethylene molecule, the ECaSDa database25 (accessible at https://vamdc.icb.cnrs.fr/PHP/ecasda.php), contains only 6791 Raman transitions. It is our hope and expectation that removing the barrier to the computation of such complex spectra will provide a large impetus to the development of line lists for these larger molecules.
Methods
Discretization
For use in numerical programs, the expression for \({P}_{{{{\rm{CRS}}}}}\left(t\right)\) must be discretized. This is achieved by replacing the continuous FT by a DFT, and replacing the integral over \(\Gamma\) by the summation:
with \(t\left[n\right]={t}_{0}+n\Delta t\) and \(\omega \left[k\right]={\omega }_{0}+k\Delta \omega\) the discretized time/frequency axes, \(\Gamma \left[l\right]={\Gamma }_{0}+l\Delta \Gamma\) the discretized damping axis, and
the discretized distribution function, with the discrete line strength distribution \(\widehat{W}\left[k,l\right]\) to be determined.
The key to discretizing \(W\left(\omega ,\Gamma \right)\), and indeed to the presented approach in general, is that the exact nonlinear susceptibility of an individual line, \({\widehat{\chi }}_{i}^{\left(3\right)}\left(\omega \right)={{{\mathcal{F}}}}\{{\chi }_{i}^{\left(3\right)}\left(t\right)\}\), can be approximated by the weighted sum of susceptibilities with line positions \(\omega \left[k\right]\) and damping coefficients \(\Gamma \left[l\right]\) on the discrete grid. The contribution to the distribution function by a single line i for the approximated line position and damping coefficients is given, respectively, by:
with \({a}_{\omega ,i}\) and \({a}_{\Gamma ,i}\) the weights for line position and damping coefficient, respectively, with a value between 0 and 1, and \({k}_{i}=\left\lfloor \frac{{\omega }_{i}-{\omega }_{0}}{\Delta \omega }\right\rfloor\) and \({l}_{i}=\left\lfloor \frac{{\Gamma }_{i}-{\Gamma }_{0}}{\Delta \Gamma }\right\rfloor\) the indices of the closest gridpoints before (i.e., “to the left of”) the molecular line.
Equations 13a and 13b can be combined to approximate the exact nonlinear susceptibility as the weighted sum of 2 × 2 susceptibilities, each with line parameters only at discrete gridpoints:
where \(\delta k\) and \(\delta l\) (not to be confused with the Kronecker deltas \({\delta }_{{ij}}\)) can each take the values 0 or 1, and the weights \({a}_{\delta k,\delta l}^{i}\) are simply the products of the individual weights in Eq. 14, i.e.:
The numerical schemes employed for a computationally efficient evaluation of the position (\({a}_{{{{\rm{\omega }}}},i}\)) and damping (\({a}_{\Gamma ,i}\)) weights are detailed in Supplementary Information, section “Weight derivation and evaluation”.
The discrete distribution of the line strength \(\widehat{W}\left[k,l\right]\) is then constructed by every molecular line contributing its weighted intensity to the corresponding 2 × 2 gridpoints. The intensity \({W}_{i}\) is distributed over the gridpoints by applying the weights in Eq. 8, resulting in the following expression for \(\widehat{W}\left[k,l\right]\):
The entire procedure is summarized schematically in Fig. 1.
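The procedure can be illustrated with a minimal Python sketch. It is not the code distributed with this work, and it rests on several assumptions: the weights are taken as plain linear-interpolation weights (the weights actually used are derived in the Supplementary Information and may differ), the time axis starts at \({t}_{0}=0\), the response of a single line is taken as \({W}_{i}\exp (-i{\omega }_{i}t-{\Gamma }_{i}t)\), and the function names `build_line_strength_grid` and `synthesize_response` are hypothetical.

```python
import numpy as np

def build_line_strength_grid(omega_i, gamma_i, W_i,
                             omega0, d_omega, n_omega,
                             gamma0, d_gamma, n_gamma):
    """Spread each line strength W_i over its 2 x 2 neighbouring gridpoints."""
    omega_i = np.atleast_1d(np.asarray(omega_i, dtype=float))
    gamma_i = np.atleast_1d(np.asarray(gamma_i, dtype=float))
    W_i = np.atleast_1d(np.asarray(W_i, dtype=float))

    # Fractional grid coordinates of every line
    x = (omega_i - omega0) / d_omega
    y = (gamma_i - gamma0) / d_gamma

    # Indices of the gridpoints "to the left of" each line; the clip keeps
    # the right-hand neighbours (k + 1, l + 1) inside the grid
    k = np.clip(np.floor(x).astype(int), 0, n_omega - 2)
    l = np.clip(np.floor(y).astype(int), 0, n_gamma - 2)

    # ASSUMPTION: linear-interpolation weights between 0 and 1
    a_omega = x - k
    a_gamma = y - l

    W_hat = np.zeros((n_omega, n_gamma))
    for dk, w_k in ((0, 1.0 - a_omega), (1, a_omega)):
        for dl, w_l in ((0, 1.0 - a_gamma), (1, a_gamma)):
            # np.add.at accumulates correctly when several lines share a gridpoint
            np.add.at(W_hat, (k + dk, l + dl), w_k * w_l * W_i)
    return W_hat

def synthesize_response(W_hat, omega0, d_omega, gamma0, d_gamma):
    """One DFT per damping gridpoint, followed by a sum over the damping axis."""
    n_omega, n_gamma = W_hat.shape
    dt = 2.0 * np.pi / (n_omega * d_omega)   # Fourier-conjugate time step
    t = dt * np.arange(n_omega)              # ASSUMPTION: t0 = 0
    gamma = gamma0 + d_gamma * np.arange(n_gamma)

    oscillation = np.fft.fft(W_hat, axis=0)  # sum_k W[k, l] exp(-2 pi i k n / N)
    decay = np.exp(-np.outer(t, gamma))      # exp(-Gamma[l] t[n])
    chi = np.exp(-1j * omega0 * t) * np.sum(oscillation * decay, axis=1)
    return t, chi
```

In this sketch the cost of the transform scales with the grid size (one FFT per damping gridpoint) rather than with the number of lines, which enter only through the accumulation step. A line that falls exactly on a gridpoint is reproduced exactly, and the total line strength is conserved by construction because the four weights sum to one.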
Benchmarked implementations
Four implementations are benchmarked on a mid-range consumer laptop with an 8-core AMD Ryzen PRO 5850U processor (1901 MHz) running Windows 10: two of the reference method and two of the ultrafast approximation. The reference case is implemented both in pure Python (ref-py) and in C++ (ref-cpp). The C++ version of the reference, compiled with MinGW, benefitted from a couple of straightforward optimizations that roughly doubled its computational performance.
For the ultrafast approximation, two implementations are benchmarked: a pure Python implementation (ufa-py) and a C++ implementation (ufa-cpp), which uses multithreading through OpenMP pragmas as well as single instruction, multiple data (SIMD) vectorization. The FT evaluations in the C++ implementation are performed with the PocketFFT library45, which is also the basis for the FFT implemented in SciPy. In the multithreaded code, the computation of the line strength distribution is split over a number of parallel threads. Each thread processes a number of lines that are separated by a “chunk size”, which defaults to 2048, to prevent multiple threads from adding lines in the same wavelength region. If the threads start to overlap, a larger chunk size or a smaller number of parallel threads should be used. Full vectorization is implemented through AVX2 (Advanced Vector Extensions 2) SIMD intrinsics, which enable simultaneous processing of 4 spectral lines on top of the multithreading parallelization.
The ref-py and ref-cpp codes were implemented from scratch following the approach of ref. 26 and showed a frequency shift of 0.01115 cm−1 with respect to the original code. After comparison with the nominal Raman frequencies calculated in the MeCaSDa database, the discrepancy was attributed to a numerical artifact of the FFT algorithm in the original code. Like the ref-py and ref-cpp codes, ufa-py and ufa-cpp reproduce the correct line positions. In all the benchmarked implementations, the Fourier-conjugated time and frequency axes were discretized using Nω = Nt = 40,000 points, resulting in a resolution of Δω/2πc = 0.02 cm−1 and Δt = 40 fs, with the time axis spanning from −814 to 814 ps. Additionally, the “ufa” implementations employed NΓ = 4 points to discretize the linewidth space. The same settings were used for Figs. 2 and 3.
The benchmarking is performed by evaluating a total of 3 × 3 spectra, with T = 296, 800, 1500 K and τ = 20, 50, 100 ps. The evaluation is carried out for an increasing number of spectral lines, from 10 up to the ~11 M lines of the whole MeCaSDa database. To test the performance of the ultrafast algorithm up to 100 M spectral lines, the MeCaSDa database is synthetically extended by duplicating every spectral line in the database 10 times. The computation times reported in Results are the average of the nine spectral evaluations performed at each tested point. The standard deviation over the nine evaluations is around 20% for computation times below ~5 ms and drops to ~2% for longer computation times.
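The timing protocol can be sketched as a small Python harness. This is only an illustration under stated assumptions: `evaluate_spectrum` is a hypothetical callable standing in for any of the benchmarked implementations, its signature `(T, tau, n_lines)` is invented for the example, and the sample standard deviation is used because the estimator is not specified in the text.

```python
import statistics
import time
from itertools import product

TEMPERATURES_K = (296, 800, 1500)
PROBE_DELAYS_PS = (20, 50, 100)

def benchmark(evaluate_spectrum, n_lines):
    """Time the 3 x 3 (T, tau) evaluations for a given number of lines.

    `evaluate_spectrum(T, tau, n_lines)` is a hypothetical placeholder for
    the implementation under test. Returns the mean time in seconds, the
    relative standard deviation, and the number of evaluations.
    """
    timings = []
    for T, tau in product(TEMPERATURES_K, PROBE_DELAYS_PS):
        start = time.perf_counter()
        evaluate_spectrum(T, tau, n_lines)
        timings.append(time.perf_counter() - start)
    mean = statistics.mean(timings)
    rel_std = statistics.stdev(timings) / mean  # ASSUMPTION: sample std
    return mean, rel_std, len(timings)
```

A call such as `benchmark(my_implementation, 10_000)` would then be repeated for each tested number of lines to build a computation-time curve.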
Linewidths model
The widths of the Raman lines considered in the MeCaSDa database are here calculated according to ref. 26. The main dephasing mechanism for the Raman coherence is assumed to be the rotational energy transfer in molecular collisions, which is estimated by the modified exponential gap (MEG) scaling law:
with \({\gamma }_{{kj}}\) and \({\gamma }_{{jk}}\) the upward and downward collisional transition rates, respectively; \({T}_{0}\) = 296 K the reference temperature; \(p\) the gas pressure (assumed atmospheric); \({E}_{k}\) and \({E}_{j}\) the ro-vibrational energies in the upper and lower vibrational states; \({k}_{{\rm{B}}}\) Boltzmann’s constant; and \(\alpha\), \(\beta\), \(\delta\), \(a\) and \(n\) fitting parameters. The species-specific parameter \(a\) is set to 2, while \(\alpha =4.45\times {10}^{-2}\) and \(\beta\) = 1.52 according to ref. 26. The temperature-related coefficients \(n\) and \(\delta\), on the other hand, are not reported in the literature for the ν2 mode spectrum of CH4. To a first approximation, the values reported by ref. 24 for the ν1 mode spectrum are adopted here to simulate the ν2 mode spectrum: \(\delta\) = 0.47 and \(n\) = 2.67.
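As an illustration of how such rates can be evaluated, the following Python sketch uses a commonly quoted form of the MEG law; the exact expression adopted in ref. 26 may differ in detail. The assumptions are: energies are given in cm−1, the upward rate from level j to level k is \(p\alpha {({T}_{0}/T)}^{n}{[(1+a{E}_{j}/{k}_{{\rm{B}}}T\delta )/(1+a{E}_{j}/{k}_{{\rm{B}}}T)]}^{2}\exp [-\beta ({E}_{k}-{E}_{j})/{k}_{{\rm{B}}}T]\), the downward rate follows from detailed balance with level degeneracies `g` (an input not named in the text), and the linewidth of a level is approximated by the sum of the rates leaving it. The function names are hypothetical.

```python
import numpy as np

K_B = 0.6950348  # Boltzmann constant in cm^-1 per K (energies in cm^-1)

def meg_rate_matrix(E, g, T, p=1.0, alpha=4.45e-2, beta=1.52,
                    delta=0.47, a=2.0, n=2.67, T0=296.0):
    """Return rates[k, j], the collisional rate from level j to level k.

    ASSUMPTION: commonly used MEG form; `g` holds level degeneracies used
    only for the detailed-balance step. Default parameters are the values
    quoted in the text; the rates carry the units of alpha times pressure.
    """
    E = np.asarray(E, dtype=float)
    g = np.asarray(g, dtype=float)
    kT = K_B * T
    dE = E[:, None] - E[None, :]        # dE[k, j] = E_k - E_j
    gap = np.abs(dE) / kT

    # Factor that depends only on the initial (lower) level j
    prefactor = p * alpha * (T0 / T) ** n * (
        (1.0 + a * E / (kT * delta)) / (1.0 + a * E / kT)) ** 2

    # Upward rates (E_k > E_j): exponential-gap law
    upward = np.where(dE > 0, prefactor[None, :] * np.exp(-beta * gap), 0.0)

    # Downward rates from detailed balance:
    # rate(k -> j) = (g_j / g_k) * rate(j -> k) * exp((E_k - E_j) / kT)
    downward = (g[:, None] / g[None, :]) * upward.T * np.exp(gap)
    return upward + downward

def meg_linewidths(rates):
    """Approximate each level's width as the sum of all rates leaving it."""
    return rates.sum(axis=0)
```

Levels with exactly equal energies are assigned a zero mutual rate in this sketch, a simplification that a production code would need to revisit.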
Code availability
All code generated in the current study is attached as Supplementary Software 1 and is additionally available on the GitHub repository: https://github.com/dcmvdbekerom/ultrafast-crs. The remaining code files are available from the corresponding author upon request.
References
Zewail, A. H. Laser femtochemistry. Science 242, 1645–1653 (1988).
Corkum, P. B. & Krausz, F. Attosecond science. Nat. Phys. 3, 381–387 (2007).
Lavorel, B. et al. Femtosecond Raman time-resolved molecular spectroscopy. C. R. Phys. 5, 215–229 (2004).
Scully, M. O. et al. FAST CARS: engineering a laser spectroscopic technique for rapid identification of bacterial spores. Proc. Natl. Acad. Sci. USA 99, 10994–11001 (2002).
Roy, S., Gord, J. R. & Patnaik, A. K. Recent advances in coherent anti-Stokes Raman scattering spectroscopy: fundamental developments and applications in reacting flows. Prog. Energy Combust. Sci. 36, 280–306 (2010).
Lempert, W. R. & Adamovich, I. V. Coherent anti-Stokes Raman scattering and spontaneous Raman scattering diagnostics of nonequilibrium plasmas and flows. J. Phys. D 47, 433001 (2014).
Bohlin, A. & Kliewer, C. J. Diagnostic imaging in flames with instantaneous planar coherent Raman spectroscopy. J. Phys. Chem. Lett. 5, 1243–1248 (2014).
Bohlin, A. & Kliewer, C. J. Direct coherent Raman temperature imaging and wideband chemical detection in a hydrocarbon flat flame. J. Phys. Chem. Lett. 6, 643–649 (2015).
Milner, A. A., Korobenko, A., Hepburn, J. W. & Milner, V. Effects of ultrafast molecular rotation on collisional decoherence. Phys. Rev. Lett. 113, 1–5 (2014).
Chen, T. Y., Steinmetz, S. A., Patterson, B. D., Jasper, A. W. & Kliewer, C. J. Direct observation of coherence transfer and rotational-to-vibrational energy exchange in optically centrifuged CO2 super-rotors. Nat. Commun. 14, 3227 (2023).
Milner, A. A. & Milner, V. Controlled excitation of rotons in superfluid helium with an optical centrifuge. Phys. Rev. Lett. 131, 166001 (2023).
Hollas, J. M. High Resolution Spectroscopy (Butterworths, 1982).
Hall, R. J. & Stufflebeam, J. H. Quantitative CARS spectroscopy of CO2 and N2O. Appl. Opt. 23, 4319–4327 (1984).
Greenhalgh, D. A., Hall, R. J., Porter, F. M. & England, W. A. Application of the rotational diffusion model to the CARS spectra of high-temperature, high-pressure water vapour. J. Raman Spectrosc. 15, 71–79 (1984).
Jahn, H. A. A new Coriolis perturbation in the methane spectrum I. Vibrational-rotational Hamiltonian and wave functions. Proc. R. Soc. Lond. Ser. A 168, 469–495 (1938).
Jourdanneau, E. et al. CARS methane spectra: experiments and simulations for temperature diagnostic purposes. J. Mol. Spectrosc. 246, 167–179 (2007).
Akhmanov, S. A. et al. CARS thermometry of polyatomic gases: SF6. IEEE J. Quantum Electron. 20, 424–428 (1984).
Walser, A. M. et al. Time-resolved investigation of the ν1 ro-vibrational Raman band of H2CO with fs-CARS. J. Raman Spectrosc. 38, 147–153 (2007).
Hosseinnia, A., Nordström, E., Bood, J. & Bengtsson, P.-E. Ethane thermometry using rotational coherent anti-Stokes Raman scattering (CARS). Proc. Combust. Inst. 36, 4461–4468 (2017).
Hosseinnia, A., Brackmann, C. & Bengtsson, P.-E. Pure rotational coherent anti-Stokes Raman spectroscopy of ethylene, experiments and modelling. J. Quant. Spectrosc. Radiat. Transf. 234, 24–31 (2019).
Tennyson, J. et al. The ExoMol database: molecular line lists for exoplanet and other hot atmospheres. J. Mol. Spectrosc. 327, 73–94 (2016).
Richard, C., Boudon, V. & Rotger, M. Calculated spectroscopic databases for the VAMDC portal: new molecules and improvements. J. Quant. Spectrosc. Radiat. Transf. 251, 107096 (2020).
Carteret, C. & Labrosse, A. Vibrational properties of polysiloxanes: From dimer to oligomers and polymers. 1. structural and vibrational properties of hexamethyldisiloxane (CH3)3SiOSi(CH3)3. J. Raman Spectrosc. 41, 996–1004 (2010).
Chen, T. Y., Kliewer, C. J., Goldberg, B. M., Kolemen, E. & Ju, Y. Time-domain modelling and thermometry of the CH4 ν1 Q-branch using hybrid femtosecond/picosecond coherent anti-Stokes Raman scattering. Combust. Flame 224, 183–195 (2021).
Awa Ba, Y. et al. MeCaSDa and ECaSDa: methane and ethene calculated spectroscopic databases for the virtual atomic and molecular data centre. J. Quant. Spectrosc. Radiat. Transf. 130, 62–68 (2013).
Mazza, F. et al. The ro-vibrational ν2 mode spectrum of methane investigated by ultrabroadband coherent Raman spectroscopy. J. Chem. Phys. 158, 094201 (2023).
van den Bekerom, D. C. M. & Pannier, E. A discrete integral transform for rapid spectral synthesis. J. Quant. Spectrosc. Radiat. Transf. 261, 107476 (2021).
Pannier, E. & Laux, C. O. RADIS: a nonequilibrium line-by-line radiative code for CO2 and HITRAN-like database species. J. Quant. Spectrosc. Radiat. Transf. 222–223, 12–25 (2019).
Prince, B. D., Chakraborty, A., Prince, B. M. & Stauffer, H. U. Development of simultaneous frequency- and time-resolved coherent anti-Stokes Raman scattering for ultrafast detection of molecular Raman spectra. J. Chem. Phys. 125, 044502 (2006).
Pestov, D. et al. Optimizing the laser-pulse configuration for coherent Raman spectroscopy. Science 316, 265–268 (2007).
Mankoč-Borštnik, N., Fonda, L. & Bortnik, B. Coherent rotational states and their creation and time evolution in molecular and nuclear systems. Phys. Rev. A 35, 4132 (1987).
Friedrich, B. & Herschbach, D. Alignment and trapping of molecules in intense laser fields. Phys. Rev. Lett. 74, 4623–4626 (1995).
Faeder, J., Pinkas, I., Knopp, G., Prior, Y. & Tannor, D. J. Vibrational polarization beats in femtosecond coherent anti-Stokes Raman spectroscopy: a signature of dissociative pump-dump-pump wave packet dynamics. J. Chem. Phys. 115, 8440–8454 (2001).
Yurchenko, S. N., Amundsen, D. S., Tennyson, J. & Waldmann, I. P. A hybrid line list for CH4 and hot methane continuum. Astron. Astrophys. 605, A95 (2017).
Butterworth, T. D. et al. Quantifying methane vibrational and rotational temperature with Raman scattering. J. Quant. Spectrosc. Radiat. Transf. 236, 106562 (2019).
Owschimikow, N. et al. Cross sections for rotational decoherence of perturbed nitrogen measured via decay of laser-induced alignment. J. Chem. Phys. 133, 044311 (2010).
Niemann, H. B. et al. The abundances of constituents of Titan’s atmosphere from the GCMS instrument on the Huygens probe. Nature 438, 779–784 (2005).
Schipper, A. M. & Lebreton, J.-P. The Huygens probe-space history in many ways. Acta Astronaut. 59, 319–334 (2006).
Miller, S. L. & Urey, H. C. Organic compound synthesis on the primitive earth. Science 130, 245–251 (1959).
Fincke, J. R. et al. Plasma thermal conversion of methane to acetylene. Plasma Chem. Plasma Process. 22, 105–136 (2002).
Mašláni, A. et al. Pyrolysis of methane via thermal steam plasma for the production of hydrogen and carbon black. Int. J. Hydrog. Energy 46, 1605–1614 (2021).
García-Hernández, D. A. et al. Formation of fullerenes in H-containing planetary nebulae. Astrophys. J. Lett. 724, 39–43 (2010).
Faccinetto, A. et al. Evidence on the formation of dimers of polycyclic aromatic hydrocarbons in a laminar diffusion flame. Commun. Chem. 3, 112 (2020).
Lee, J. W. L. et al. Time-resolved relaxation and fragmentation of polycyclic aromatic hydrocarbons investigated in the ultrafast XUV-IR regime. Nat. Commun. 12, 6107 (2021).
Reinecke, M. Pocketfft. GitHub. https://github.com/mreineck/pocketfft (2025).
Acknowledgements
The authors gratefully acknowledge Martin Reinecke for his support and implementation of the new functionality in PocketFFT required for the C++ implementation of the ultrafast algorithm, as well as Dr. Michele Marrocco for his helpful discussions on the possible applications of the presented algorithm.
Author information
Contributions
F.M.: Writing—original draft, validation, writing—review and editing. D.v.d.B.: Conceptualization, methodology, software, validation, formal analysis, writing—review and editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Chemistry thanks Pieter H. Neethling and the other, anonymous, reviewers for their contribution to the peer review of this work. Peer review reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Mazza, F., van den Bekerom, D. An ultrafast algorithm for ultrafast time-resolved coherent Raman spectroscopy. Commun Chem 8, 3 (2025). https://doi.org/10.1038/s42004-024-01397-8
DOI: https://doi.org/10.1038/s42004-024-01397-8