Background & Summary

The triatomic carbon dioxide molecule, CO2, contains two terminal oxygen atoms and a single central carbon atom, and in the electronic ground state it has a linear equilibrium structure. Studying the internal motions and related rovibrational spectra of carbon dioxide at relatively low temperatures, so that none of the excited electronic states need to be taken into account, is relevant to many fields of science and technology.

The variation in the CO2 content of the atmosphere of Earth over time is one of its important characteristics1. Detailed understanding of CO2 spectroscopic features2,3, in particular the line center positions, both with and without collisional effects, helps to establish the total amount and the distribution of this molecule in the atmosphere of Earth. These data also help to understand the effect of CO2 on a number of environmental issues, such as climate change and the radiative balance of our atmosphere. The industrial revolution has had a significant impact on climate change; during the last century, the well-documented increase in the CO2 concentration of the atmosphere of the Earth is predominantly due to human activity4. Approximately 98.42 % of carbon dioxide molecules in the atmosphere of the Earth are in the form of the parent, 16O12C16O isotopologue, making it the most important isotopologue to study.

Carbon dioxide plays a crucial role in many areas of astronomical research as well, from the study of stars5,6 to the exploration of planetary atmospheres7,8,9: in our own solar system, carbon dioxide is the major constituent and thus determines the radiative balance of the atmospheres of the planets Mars and Venus. The detection and characterization of CO2 absorption features in exoplanetary spectra offer clues about the atmospheric composition, pressure, and temperature profiles of these distant worlds. CO2 was one of the first molecules detected in the atmosphere of an exoplanet10 and several recent observations have been made using the James Webb Space Telescope11. Within the dense molecular clouds that pervade the interstellar medium, CO2 serves as an important tracer of the physical and chemical conditions that govern the birth of new stars12,13. Emissions from the bending states of the CO2 isotopologues in the far infrared provide valuable information on the temperature, density, and kinematics of these star-forming regions12.

Remote sensing of the CO2 content of the Earth’s atmosphere is a major activity aimed at monitoring the carbon content of our atmosphere in increasing detail. Missions such as NASA’s OCO-2 and OCO-3 satellites and ESA’s planned CO2M satellite constellation have stringent requirements on laboratory spectroscopy results, required for the interpretation of their observations14,15. Similar accuracy is required for ground-based spectroscopic experiments such as TCCON (Total Carbon Column Observing Network)16.

CO2 spectra are important for medical17 and industrial applications18, as well. The study of CO2 spectra in plasma physics is widespread19, where there is particular emphasis on the use of plasma processes to valorize excess CO2 from the Earth’s atmosphere20.

Due to the importance of accurate high-resolution spectroscopic data related to carbon dioxide, they are available in several line-by-line spectroscopic databases, such as the Carbon Dioxide Spectroscopic Databank (CDSD-296)21, NASA Ames-202122, HITRAN202023, and ExoMol24. The present report on the spectroscopic data of 16O12C16O (626) is part of a long-term, ongoing project devoted to the construction of the most extensive empirical energy level datasets, calculated from measured line positions in high-resolution rovibrational spectra, for all isotopologues of carbon dioxide involving the 12C, 13C, 16O, 17O, and 18O isotopes. Empirical energy levels based on all the measured rovibrational transitions are already available for the carbon dioxide isotopologues 16O12C18O (628)25, 16O13C16O (636)26, 16O13C18O (638)27, 18O12C18O (828)28, 17O12C18O (728)28, and 18O13C18O (838)28 (see Table 1). In these projects, empirical energy levels are calculated using the MARVEL 4.0 (Measured Active Rotational-Vibrational Energy Levels) procedure29,30,31,32, built upon the theory of spectroscopic networks33,34. Statistical measures of these previous studies, mostly with respect to the Ames-202122 and CDSD-29621 datasets of energy levels, are also given in Table 1. For the six isotopologues studied thus far, agreement with the CDSD-296 data is significantly better, by more than an order of magnitude for the average absolute deviation.

Table 1 The number of empirical (MARVEL) rovibrational energy levels determined, and the maximum absolute (MAD) and average absolute (AAD) energy level differences, ΔE, in hc cm−1, between the MARVEL studies and those from Ames-202122 and CDSD-29621 for isotopologues of carbon dioxide studied by our group25,26,27,28.

The most important results of this study, obtained with the help of the MARVEL code for the 626 isotopologue of carbon dioxide, include the 626M24 dataset of validated experimental transitions and empirical rovibrational energy levels, and a large rovibrational line list, 626M24LL. All of these data will contribute not only to future spectroscopic measurements on carbon dioxide but also to the refinement of theoretical and computational spectroscopic models and the enhancement of spectroscopic line-by-line databases, such as HITRAN23 and ExoMol24,35.

Methods

Source data

References 36–180 contain the rovibrational transitions data considered during the MARVEL analysis of this study. The wavenumber range covered by these measurements is limited to 42.9 – 14 076 cm−1.

MARVEL

The MARVEL procedure29,30,31,32, used extensively during this study, starts with the careful collection, detailed examination, and subsequent validation of the positions of transitions in high-resolution (laboratory) spectra. The transitions collected are then used to construct a spectroscopic network (SN)33,34, whereby each energy level serves as a node and the nodes are interconnected by the observed transitions. The SN built allows the determination of empirical energy-level values along with educated estimates for their uncertainties32. Unlike the effective Hamiltonians widely used for spectroscopic analysis, the MARVEL approach is model-free. This has a number of advantages and, in particular, for the CO2 molecule with its many resonances, MARVEL does not require any special measures or extra parameters to characterize levels perturbed by “accidental” interactions with nearby states.
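The way a spectroscopic network yields energy levels can be summarized schematically as a weighted least-squares problem. The expression below is a sketch under stated assumptions, not necessarily the exact objective implemented in the MARVEL code: the symbols \({\sigma }_{i}\) (measured wavenumber of transition i), \({\delta }_{i}\) (its uncertainty), \({N}_{{\rm{t}}}\) (number of transitions), and up(i), low(i) (the levels connected by transition i) are notation introduced here for illustration, and inverse-squared-uncertainty weights are assumed.

\[
\mathop{\min }\limits_{\{E\}}\,\sum _{i=1}^{{N}_{{\rm{t}}}}\frac{1}{{\delta }_{i}^{2}}{\left[{\sigma }_{i}-\left({E}_{{\rm{up}}(i)}-{E}_{{\rm{low}}(i)}\right)\right]}^{2},\qquad {E}_{{\rm{root}}}=0.
\]

In this picture no model Hamiltonian enters: each residual involves only the two levels joined by a measured line, which is why perturbed levels need no special treatment.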

Ideally, the experimentally observed transitions allow the creation of a well-connected SN, linking all energy levels to the ground state (defined as the state with no rovibrational excitation), called the root of the SN. However, because of the limited coverage offered by the experimental data, this is usually not the case. Therefore, in practice, the SN can become fragmented, resulting in a principal component, where all the nodes are linked to the root, and a number of isolated, so-called floating components with their own roots.
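The split into a principal component and floating components is a connected-components problem on the network graph. The following is a minimal sketch of that idea, not the MARVEL implementation; the function name `split_components` and the toy level labels are hypothetical.

```python
from collections import defaultdict, deque


def split_components(transitions, root):
    """Split an undirected network into its principal and floating components.

    `transitions` is an iterable of (upper_label, lower_label) pairs.
    Returns (principal, floating), where `principal` is the set of levels
    reachable from `root` and `floating` is a list of the remaining sets.
    """
    graph = defaultdict(set)
    for upper, lower in transitions:
        graph[upper].add(lower)
        graph[lower].add(upper)

    def reach(start):
        # Breadth-first search collecting every level linked to `start`.
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        return seen

    principal = reach(root)
    visited = set(principal)
    floating = []
    for node in list(graph):
        if node not in visited:
            component = reach(node)
            visited |= component
            floating.append(component)
    return principal, floating


# Toy network (invented labels): A is the root; D and E are never linked to it.
toy = [("B", "A"), ("C", "B"), ("E", "D")]
principal, floating = split_components(toy, root="A")
```

Only levels in the principal component can be given absolute energies relative to the root; levels in a floating component are known only relative to each other.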

The MARVEL protocol allows for the detection of inconsistencies, that is, lines that conflict with the rest of the measured data. This feature proves invaluable for identifying issues with experimental data that usually come from several sources, such as user errors made during data collection and analysis, incorrect assignments, or the use of different naming conventions.

Notation and quantum numbers

CO2 has three fundamental vibrational modes, conventionally denoted as ν1, ν2, and ν3, associated with the vibrational quantum numbers \({v}_{i}\), i = 1, 2, and 3, respectively. The two-dimensional (degenerate) bending mode, ν2, is characterized by a vibrational angular momentum, described by the quantum number \({\ell }_{2}\). Herzberg’s notation is often used to assign energy levels in triatomics; in this notation, the vibrational states of CO2 are designated as \(({v}_{1}\,{v}_{2}^{{\ell }_{2}}\,{v}_{3})\). For the CO2 molecule with a linear equilibrium structure in its ground electronic state, there is a strong Fermi-resonance interaction between the states (\({v}_{1}\,{({v}_{2}+2)}^{{\ell }_{2}}\,{v}_{3}\)) and (\({v}_{1}+1\,{v}_{2}^{{\ell }_{2}}\,{v}_{3}\)). Therefore, it became customary to employ the so-called AFGL (Air Force Geophysics Laboratory) notation to denote the vibrational states and bands of CO2 isotopologues. In the AFGL notation181,182,183, the vibrational energy levels are designated by the quintuplet \(({v}_{1}\,{v}_{2}\,{\ell }_{2}\,{v}_{3}\,r)\), where r is the ranking index for states in Fermi resonance (the r index is used to distinguish the levels belonging to the same Fermi polyad). The lowest value of r, 1, is assigned to the energy level with the highest wavenumber (or frequency), and r increases for lower-energy levels. For example, the three vibrational states \((2\,{0}^{0}\,0)\), \((1\,{2}^{0}\,0)\), and \((0\,{4}^{0}\,0)\) are in Fermi resonance with each other and have the AFGL vibrational descriptors (2 0 0 0 3), (2 0 0 0 2), and (2 0 0 0 1), respectively.

It is customary to use polyad numbers P to denote strongly interacting groups of vibrational states, decoupling them from the other vibrations. This is a useful concept, especially when effective Hamiltonians are formed. P is not a quantum number, but it behaves like one. For carbon dioxide, based on the approximate relations of the harmonic frequencies, ω1 ≈ 2ω2 and ω3 ≈ 3ω2, the widely accepted definition of P, also used in this study, is P = 2v1 + v2 + 3v3.
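As a small illustration of the polyad definition, the sketch below evaluates P = 2v1 + v2 + 3v3 for the three Fermi-coupled Herzberg states mentioned in the text; the function name `polyad_number` is a hypothetical helper, not part of any published code.

```python
def polyad_number(v1, v2, v3):
    """Polyad number P = 2*v1 + v2 + 3*v3, as defined in the text."""
    return 2 * v1 + v2 + 3 * v3


# Herzberg (v1, v2, v3) labels of the Fermi-resonant triad discussed above.
triad = [(2, 0, 0), (1, 2, 0), (0, 4, 0)]
polyads = [polyad_number(*state) for state in triad]
```

All three states give P = 4, which is the sense in which a polyad groups strongly interacting vibrational states.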

The quantum number J is used to denote the angular momentum associated with rotational and (when \({\ell }_{2}\) > 0) vibrational motion of the CO2 molecule. Transitions with ΔJ = − 1 and ΔJ = + 1 are called the P- and R-branch transitions, respectively, while the Q-branch transitions are associated with ΔJ = 0. P and R transitions occur in both the parallel and perpendicular bands, while the Q-branch transitions only occur in the perpendicular bands, where the direction refers to the change in the dipole moment driving the transition relative to the linear equilibrium structure of the molecule. For the symmetric isotopologue 626, the Pauli principle means that symmetric vibrational states (those with even v3 values) only have even J levels, while anti-symmetric vibrational states (those with odd v3 values) have only odd J levels. Similarly, for states with even values of J + \({\ell }_{2}\) + v3 the rotationless parity is ‘e’, while for states with odd J + \({\ell }_{2}\) + v3 values the rotationless parity is ‘f’. The coupling of rotational and vibrational angular momentum means that J ≥ \({\ell }_{2}\).
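The labelling rules of this paragraph can be written down directly. The sketch below is illustrative only: the function names are hypothetical, and the parity rule is coded exactly as stated in the text (even J + ℓ2 + v3 gives ‘e’, odd gives ‘f’).

```python
def branch_label(j_upper, j_lower):
    """Return 'P', 'Q', or 'R' for Delta J = J' - J'' of -1, 0, or +1."""
    labels = {-1: "P", 0: "Q", 1: "R"}
    delta_j = j_upper - j_lower
    if delta_j not in labels:
        raise ValueError(f"Delta J = {delta_j} is not a P, Q, or R transition")
    return labels[delta_j]


def rotationless_parity(j, l2, v3):
    """'e' when J + l2 + v3 is even, 'f' when it is odd (rule from the text)."""
    return "e" if (j + l2 + v3) % 2 == 0 else "f"


def satisfies_j_bound(j, l2):
    """Check the constraint J >= l2 stated in the text."""
    return j >= l2


# By convention the branch is quoted with the lower-state J, e.g. 'P4'.
example = f"{branch_label(3, 4)}{4}"
```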

The upper and lower states involved in a transition are denoted as ′ and ″, respectively, and the P, R, and Q transitions are usually specified using the lower-state rotational quantum number (J″). For the purposes of the MARVEL analysis, each state is uniquely characterized using the set of seven descriptors \((J\,{v}_{1}\,{v}_{2}\,{\ell }_{2}\,{v}_{3}\,r\,e/f)\). This is the format followed by the data deposited in the Supplementary Material to this paper.

Data Records

The 626M24 dataset is available in an OSF (Open Science Framework) repository184. It contains (a) all experimentally measured transitions collected during this work, (b) all empirical rovibrational energy levels determined, and (c) an extensive line list derived from the levels. All validated transitions have positive wavenumber or frequency values, while transitions that had to be removed have negative values. The same repository contains a table describing the main characteristics of the 143 literature sources that contain the transitions collected and analyzed.
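The sign convention for validated and removed transitions lends itself to a simple filter. The sketch below assumes the line positions have already been read into a list of floats; the function name `split_by_validation` and the sample values are invented for illustration, and the actual column layout of the repository files is not assumed here.

```python
def split_by_validation(positions):
    """Separate validated (positive) from removed (negative) line positions.

    Removed transitions are returned with their sign restored to positive.
    """
    validated = [p for p in positions if p > 0]
    removed = [-p for p in positions if p < 0]
    return validated, removed


# Invented sample values, in cm-1; the second entry mimics a removed line.
sample = [667.38, -2349.14, 3612.84]
validated, removed = split_by_validation(sample)
```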

The file “626M24_segments.txt” is the segment input file utilized by the MARVEL code, where the units of the line positions and their uncertainties are specified for each data source. The file “626M24_transitions.txt” contains the 44 828 input transitions, collected from Refs. 36–180, used during the MARVEL analysis. In this file, each transition is characterized by (a) a line position (in units stored in the segment file), (b) an initial and an adjusted line-position uncertainty, (c) the rovibrational assignments for the upper and lower states (see the previous section for a description), and (d) a line tag, representing a unique identifier (each data source tag is based on the last two digits of the year of publication and the first two characters of the last names of the authors).

Of all the experimentally measured transitions only about half of them, 22 218, are unique. During the MARVEL analysis, 368 transitions had to be removed from our spectroscopic network; note, in particular, that all the measured transitions of 00TaPeTeLe107 had to be deleted. It is also worth mentioning that although the transitions of the sources 94Bailly93 and 97BaCaLa97 are included in the transition file, the transitions reported in them form floating components. Thus, we cannot independently validate them or determine the absolute values of the energy levels associated with them. Finally, it is important to add that although most of the transitions in the transition file “626M24_transitions.txt” are from measurements, the final dataset also contains calculated line positions. These sources are denoted by ‘_C’ in the tag. The reason to include these calculated line positions in the transitions list is that the uncertainty of these lines is several orders of magnitude smaller than that of other transitions measured in the given region and they help the analysis of the spectroscopic network of 626.

The empirical energy values, obtained for 8268 rovibrational states in the 0 − 20 654 cm−1 range, are placed in the file “626M24_energy_levels.txt”. Each energy level of this data file is characterized by (a) a rovibrational label, (b) an empirical (MARVEL) energy in hc cm−1, (c) an energy uncertainty in hc cm−1, and (d) the number of transitions incident to this state.

Using our empirical energies and the CDSD-29621, NASA Ames-202122, and HITRAN202023 line positions and intensities, an extended line list, named 626M24LL, was constructed, given in the file “626M24_line_list.txt”. Line intensities relate the probability of absorption by a given line at a specified temperature; here we adopt the standard HITRAN23 unit of cm molecule−1. This line list contains 285 503 dipole-allowed transitions in the range 147 − 19 909 cm−1, with room-temperature intensities down to \(10^{-31}\) cm molecule−1. Columns (1) − (21) of the “626M24_line_list.txt” file contain the following information: (1) CDSD-296 line position, (2) Ames-2021 line position, (3) HITRAN2020 line position, (4) MARVEL line position (generated from the MARVEL energy levels as \({E}_{{\rm{up,MARVEL}}}-{E}_{{\rm{low,MARVEL}}}\)), (5) MARVEL uncertainty, (6) Ames-2021 intensity (100% abundance assumed), (7) HITRAN2020 line intensity (scaled by natural abundance), (8-14) descriptors of the upper state, and (15-21) descriptors of the lower state. All line positions and uncertainties are in cm−1; the intensity values correspond to a temperature of 296 K. Beyond column (21) the line may contain a comment.

Four types of comments are used. (a) ‘ONLY IN MARVEL’ means that the line is found only in the experimental transitions dataset, and in neither CDSD-296 nor HITRAN2020. There are 290 such lines in the 626M24 dataset, typically with high v3 values (v3 > 6). (b) The 626M24 dataset contains 2506 lines that can be found ‘ONLY IN HITRAN’. Most of these lines (2134) are not assigned (their vibrational labels are –2 –2 –2 2 0). (c) When the deviation of the HITRAN2020 position from the CDSD-296 and/or MARVEL positions is larger than 0.01 cm−1, the comment ‘Incorrect HITRAN line position’ is used. These 565 HITRAN2020 lines should be reinvestigated and replaced with CDSD-296 or MARVEL positions. (d) When the deviation of the MARVEL position from the CDSD-296 position is larger than 0.005 cm−1, the comment ‘Conflict with MARVEL’ is used. There are 5110 such cases, divided into two groups. In the first group, both the lower and the upper energy levels are determined by at least three transitions, i.e., the MARVEL prediction is considered to be reliable, and ‘!’ is used at the end of the comment. For example, three sources76,88,101 measured a line at 3181.915 cm−1, but the CDSD-296 position of this line is 3181.909 cm−1. Most of the cases (4427 occurrences) belong to the second group, where one of the MARVEL energy levels, typically the upper energy level, is defined by only one or two transitions. In this case, ‘?’ is placed at the end of the comment, denoting that it is possible that the experimentally measured line is not reliable. As a point of interest, note that while the initial dataset, “626M24_transitions.txt”, contains 816 transitions with uncertainties of less than \(10^{-6}\) cm−1, the number of such transitions in the extended line list, “626M24_line_list.txt”, is 2101.
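The comment rules described above can be sketched as a small decision function. This is an illustration under stated assumptions, not the script used to build the line list: the function name `line_comments`, the use of `None` for a missing database entry, the exact comment strings, and the choice to return every applicable comment (rather than a single one) are all hypothetical. Only the thresholds (0.01 and 0.005 cm−1) and the three-transition criterion are taken from the text.

```python
HITRAN_TOL = 0.01     # cm-1, threshold for 'Incorrect HITRAN line position'
MARVEL_TOL = 0.005    # cm-1, threshold for 'Conflict with MARVEL'
MIN_TRANSITIONS = 3   # transitions per level for a 'reliable' MARVEL value


def line_comments(marvel=None, cdsd=None, hitran=None, n_upper=0, n_lower=0):
    """Return the list of comments that apply to one line.

    Positions are in cm-1 (None when the line is absent from a source);
    n_upper / n_lower are the numbers of transitions defining each level.
    """
    comments = []
    if marvel is not None and cdsd is None and hitran is None:
        comments.append("ONLY IN MARVEL")
    if hitran is not None and cdsd is None and marvel is None:
        comments.append("ONLY IN HITRAN")
    if hitran is not None:
        references = [p for p in (cdsd, marvel) if p is not None]
        if any(abs(hitran - p) > HITRAN_TOL for p in references):
            comments.append("Incorrect HITRAN line position")
    if marvel is not None and cdsd is not None:
        if abs(marvel - cdsd) > MARVEL_TOL:
            reliable = min(n_upper, n_lower) >= MIN_TRANSITIONS
            comments.append("Conflict with MARVEL" + ("!" if reliable else "?"))
    return comments


# Positions from the example in the text; the transition counts are invented.
well_determined = line_comments(marvel=3181.915, cdsd=3181.909,
                                n_upper=3, n_lower=5)
poorly_determined = line_comments(marvel=3181.915, cdsd=3181.909,
                                  n_upper=1, n_lower=5)
```

With these inputs the 0.006 cm−1 difference exceeds the 0.005 cm−1 threshold, so the first call yields the ‘!’ variant and the second the ‘?’ variant.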

Finally, the file “626M24_MARVEL.exe” is a developer version of the MARVEL code, written in the C++ language. This version of the MARVEL code, distributed with the necessary input files (“626M24_transitions.txt” and “626M24_segments.txt”), was used to generate the numerical data of the 626M24 repository.

Technical Validation

The principal validation of the 626M24 energy levels was performed via the MARVEL procedure (see the MARVEL subsection of Methods). It involved an elaborate check of the consistency of the experimentally measured transitions collected, in relation to their assignments, line positions, and uncertainties. Figure 1 shows the final experimental uncertainties of the validated rovibrational measurements of 16O12C16O as a function of the transition wavenumber. This figure shows that for 16O12C16O, one of the spectroscopically most studied molecules, (a) the uncertainties of the experimentally measured transitions cover almost eight orders of magnitude, from \(1\times 10^{-9}\) to \(10^{-1}\) cm−1, (b) the wavenumber range covered by the experiments is rather limited, only going up to 14 000 cm−1, and (c) there are no highly accurate measurements above 7000 cm−1.

Fig. 1 Uncertainties of the experimental rovibrational line-center positions available for the 16O12C16O molecule, as a function of the transition wavenumber (note the logarithmic scale of the vertical axis). If multiple measurements are available for the same line, the most accurate transition is chosen.

The global MARVEL analysis resulted in the best rovibrational energy-level dataset, based on the presently available transitions. An important validation of the 626M24 energy values is their comparison with entries in standard databases. Comparison of the predicted line positions of 626M24 with those in the CDSD-29621, Ames-202122, and HITRAN202023 line catalogs is particularly important, as it allows additional validation of the energy dataset derived in this study. Furthermore, this comparison might reveal database entries that require further verification and/or modification.

Figure 2 shows the absolute deviations between the MARVEL data and those of CDSD-296 and Ames-2021. The MARVEL data show significantly better agreement with CDSD-296 (with a root-mean-square, rms, deviation of 0.0032 hc cm−1) than with Ames-2021 (with an rms value of 0.0238 hc cm−1, which is nearly an order of magnitude higher). This is not surprising, as the CDSD-296 data are semi-empirical in nature. A comparison with the HITRAN2020 data can be found in the file “626M24_line_list.txt”.
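The statistical measures used in such comparisons (rms here, and the maximum and average absolute deviations of Table 1) are straightforward to compute once two sets of energies have been matched level by level. The following is a minimal sketch; the function name `deviation_statistics` and the toy numbers are hypothetical and do not come from the 626M24 dataset.

```python
import math


def deviation_statistics(energies_a, energies_b):
    """Return (rms, mad, aad) of the differences between two matched lists.

    rms: root-mean-square deviation
    mad: maximum absolute deviation
    aad: average absolute deviation
    """
    if len(energies_a) != len(energies_b) or not energies_a:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [a - b for a, b in zip(energies_a, energies_b)]
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    mad = max(abs(d) for d in diffs)
    aad = sum(abs(d) for d in diffs) / len(diffs)
    return rms, mad, aad


# Toy values chosen so that the three measures differ: one outlier of 2.
rms, mad, aad = deviation_statistics([0.0, 0.0, 0.0, 2.0], [0.0, 0.0, 0.0, 0.0])
```

The toy case shows why the three measures are reported separately: a single outlier dominates the maximum deviation while leaving the average comparatively small.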

Fig. 2 Comparison between rovibrational energies of the present 16O12C16O dataset and those of CDSD-29621 (blue squares) and Ames-202122 (red circles).

Figure 3 shows the empirical rovibrational energy levels of the 16O12C16O molecule determined in this study as a function of the rotational quantum number J and the total energy; the vibrational structure can also be seen in the quadratic curves formed as a function of J. Figure 3 shows that the list of rotational energy levels for the ground vibrational state extends up to J = 108, but is incomplete, as the J = 94 and 96 states are not present in the MARVEL energy levels.
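A completeness check of this kind can be sketched in a few lines. The example below assumes that the J values present for the ground vibrational state have been extracted into a list; it uses the fact, stated earlier, that symmetric vibrational states of 626 only have even J levels. The function name `missing_even_j` is hypothetical, and the input list is constructed to mimic the situation described in the text rather than read from the data file.

```python
def missing_even_j(present_j, j_max):
    """List the even J values in 0..j_max that are absent from `present_j`."""
    present = set(present_j)
    return [j for j in range(0, j_max + 1, 2) if j not in present]


# Mimic the ground-state ladder described above: even J up to 108,
# with J = 94 and 96 absent.
ground_state_j = [j for j in range(0, 109, 2) if j not in (94, 96)]
gaps = missing_even_j(ground_state_j, j_max=108)
```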

Fig. 3 Pictorial representation of the empirical rovibrational energy levels of the 16O12C16O molecule determined in this study, as a function of the rotational quantum number J and the vibrational states (different colors refer to different vibrational states).

Table 2 lists the vibrational band origins of 16O12C16O determined in this study; as J = 0 levels only exist for vibrational states with \({\ell }_{2}\) = 0, these band origins are only for vibrational states which have \({\ell }_{2}\) = 0. It is perhaps surprising to see that there are only nine energies listed there and only four of them have an accuracy better than \(1\times 10^{-3}\) hc cm−1.

Table 2 Empirical vibrational band origins of 16O12C16O determined in this study (part of the 626M24 dataset) and their counterparts in the CDSD-29621 and NASA Ames-202122 databases.