Main

Polycyclic aromatic hydrocarbons (PAHs) are the likely carriers of the unidentified infrared (UIR) bands that dominate the spectra of most galactic and extragalactic objects1. These bright features, specifically at 3.3, 6.2, 7.7, 8.6, 11.2 and 12.7 μm, are generally associated with vibrational modes of PAHs that undergo infrared (IR) fluorescence after having been electronically excited by absorbing far-ultraviolet photons2,3. In the astronomical objects where UIR bands are observed, PAHs are highly abundant (~10−7 relative to hydrogen1) and, therefore, critically impact the physics and chemistry of the interstellar medium (ISM). In particular, they play a key role in determining the ionization balance in molecular clouds, thus influencing ion–molecule chemistry4 and contributing to the neutral gas heating due to the photoelectric effect5. Despite their perceived importance, little is known about the nature of individual PAH molecules in the ISM. While the presence and abundance of PAHs in space is strongly supported by IR observations using, for example, the Infrared Space Observatory6, the Spitzer Space Telescope7 and the recently launched James Webb Space Telescope8, the spectra obtained in the mid-IR are a convolution of many different hot PAH molecules that all contain similar functional groups. Due to this spectral congestion, identification of individual PAHs in the ISM has not yet been achieved using their vibrational fingerprints. However, comprehensive efforts to compare spectral variations across multiple astronomical objects have constrained the PAH families present in these astrophysical environments9,10,11, including recently with unprecedented spatial resolution using the James Webb Space Telescope8.

While extraterrestrial PAHs have been found in carbonaceous chondrites such as Murchison and Orgueil12,13 and in samples returned from comet 81P/Wild 2 during the Stardust mission14, their recent discovery in return samples from asteroid Ryugu sheds new light on potential formation pathways15,16. Carbon-13 isotopic analysis of the PAHs found in Ryugu showed that three-ring species such as anthracene and phenanthrene were formed at high temperatures (>1,000 K). Meanwhile, the two- and four-ring PAHs naphthalene, fluoranthene and pyrene (the most abundant PAH in Ryugu) must have formed via a kinetically controlled route at low temperatures (~10 K). Indeed, two- and four-ring PAHs have been unambiguously detected in the cold, dark Taurus molecular cloud (TMC-1) by radio astronomical observations17,18,19,20,21.

In contrast to the hot, broad UIR bands, each molecule possessing a permanent dipole moment has a distinct rotational spectrum with narrow emission lines that can be observed using radio astronomy. Most PAHs considered in the literature are large (more than 30 carbon atoms), highly symmetric and unsubstituted (‘pure’ hydrocarbons) for which models predict a viable chance of survival under the harsh interstellar conditions22. However, due to their high symmetry, these PAHs often possess only a small or null dipole moment. Thus, despite their ubiquity, only five individual PAHs have been detected by radio astronomy so far17,18,19,20,21. The rotational emission from these unambiguously detected PAHs has been observed towards TMC-1 and originates from CN-functionalized PAHs (nitriles), with the exception of the asymmetric, pure PAH indene. It has been proposed that, owing to their large dipole moments, nitrile-substituted PAHs can be used as observational proxies for pure PAHs23,24. Extracting quantitative abundances of unsubstituted PAHs from these proxies, however, relies on knowledge of the kinetics of their dominant formation and destruction pathways25.

Here, we present the interstellar detection of two additional CN-functionalized PAHs, 2-cyanopyrene and 4-cyanopyrene, isomers of the recently discovered 1-cyanopyrene, in TMC-1 using broadband radio astronomical observations and enhance the statistical evidence for their detections with a stacking and matched filtering analysis. The discovery of 2- and 4-cyanopyrene completes the set of all possible singly CN-substituted pyrene isomers, allowing us to explore their potential formation routes by comparing their abundances with each other and to further constrain the abundance of pure pyrene in TMC-1.

Results

Discovery of 2- and 4-cyanopyrene in TMC-1

The GOTHAM (Green Bank Telescope (GBT) Observations of TMC-1: Hunting Aromatic Molecules) project is a high-sensitivity high-spectral-resolution broadband line survey of TMC-1 with near-continuous coverage from approximately 8 to 36 GHz (ref. 26). The data were collected with the 100 m Robert C. Byrd GBT from 2018 to 202220,21,27.

First, the laboratory rotational spectra of pure samples of 2- and 4-cyanopyrene (see Fig. 1 for their structures) were measured between approximately 7 GHz and 16 GHz using a cavity-enhanced Fourier transform microwave (FTMW) spectrometer. To determine the rotational constants, 762 and 318 (individual or partly blended) transitions of 2-cyanopyrene and 4-cyanopyrene, respectively, were fitted to a standard asymmetric top rotational Hamiltonian (Methods). The derived spectroscopic constants, which are reported in Supplementary Table 1, allowed us to calculate the rotational rest transition frequencies up to ~25 GHz with an accuracy of ~2 kHz. Searches for the radio emission features of 2- and 4-cyanopyrene towards TMC-1 were performed by simulating their rotational spectra under TMC-1 conditions (~5.8 km s−1, 5–10 K) and comparing them with the GOTHAM data (Supplementary Fig. 1). Comparing the simulated rotational spectra with the root mean square (RMS) noise of our GOTHAM data depicted in Supplementary Fig. 1, we identify only a few spectral windows in which the 4-cyanopyrene features might exceed 1σ (that is, lie above the noise but remain ≤3σ, where σ is the standard deviation), while the 2-cyanopyrene lines are even weaker. These spectral windows are plotted in Supplementary Figs. 2 and 3 for 2-cyanopyrene and 4-cyanopyrene, respectively.

Fig. 1: Structures of the cyanopyrene isomers.

CN functionalization of pyrene (C16H10) forms three possible isomers (C17H9N), namely, 1-cyanopyrene (1-CN–C16H9), 2-cyanopyrene (2-CN–C16H9) and 4-cyanopyrene (4-CN–C16H9). Equivalent sites are shown with coloured circles.

A Markov chain Monte Carlo (MCMC) analysis was used to derive the marginalized posterior parameters for the 2- and 4-cyanopyrene emission. These included the velocity in the local standard of rest, vlsr, and column densities, NT, in all four spatially separated velocity components of TMC-1 (ref. 28), a single excitation temperature, Tex, and linewidth, ΔV. From this analysis, using the 1-cyanopyrene marginalized posterior parameters as priors (Methods, Supplementary Table 2 and ref. 21), we derived a column density of \(0.8{4}_{-0.09}^{+0.09}\times 1{0}^{12}\,{{\rm{cm}}}^{-2}\) at an excitation temperature of \(7.9{0}_{-0.48}^{+0.53}\,{\rm{K}}\) for 2-cyanopyrene and \(1.3{3}_{-0.09}^{+0.10}\times 1{0}^{12}\,{{\rm{cm}}}^{-2}\) at \(8.2{7}_{-0.44}^{+0.46}\,{\rm{K}}\) for 4-cyanopyrene (Supplementary Figs. 4 and 5 and Supplementary Table 3). The partition functions used for the MCMC analysis are listed in Supplementary Table 4.

To further explore the significance of our detections, we performed a velocity-stack and matched filtering analysis on both species17,28. For this purpose, spectral windows centred on the 150 lines with the highest predicted signal-to-noise ratio (SNR) were extracted, in frequency space, from both the simulated cyanopyrene spectra and the GOTHAM data at the corresponding frequencies. Each set of windows was then collapsed into a single SNR-weighted line in velocity space. Cross-correlating the observational stack (Fig. 2a,c, black) with the simulated stack (Fig. 2a,c, colour) yielded an impulse response for the two detections. The statistical significance of the 2-cyanopyrene and 4-cyanopyrene detections was 8.0σ and 12.9σ, respectively.
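The stacking and matched filtering steps can be illustrated with a minimal sketch on synthetic data. Everything below is an assumption made for illustration rather than the GOTHAM pipeline: the channel grid, the Gaussian line shape, the per-line SNR of 0.5, the equal weights and the use of a known noise level to normalize the filter response.

```python
import math
import random

def gaussian_line(velocities, v0, fwhm):
    """Unit-amplitude Gaussian line profile on a velocity grid (km/s)."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [math.exp(-0.5 * ((v - v0) / sigma) ** 2) for v in velocities]

def snr_weighted_stack(windows, weights):
    """Collapse many velocity windows into one weighted-average spectrum."""
    total = sum(weights)
    n_chan = len(windows[0])
    return [sum(w * win[i] for w, win in zip(weights, windows)) / total
            for i in range(n_chan)]

def matched_filter(data, template, noise_rms):
    """Cross-correlate data with a template; response is in units of sigma."""
    n = len(data)
    half = len(template) // 2
    norm = noise_rms * math.sqrt(sum(t * t for t in template))
    response = []
    for centre in range(n):
        acc = 0.0
        for j, t in enumerate(template):
            i = centre + j - half
            if 0 <= i < n:
                acc += data[i] * t
        response.append(acc / norm)
    return response

# Synthetic example: 150 lines, each far too weak to be seen individually.
rng = random.Random(1)
velocities = [-5.0 + 0.05 * i for i in range(201)]   # hypothetical grid
n_lines, line_snr, rms = 150, 0.5, 1.0               # hypothetical values
profile = gaussian_line(velocities, 0.0, 0.4)
simulated = [[line_snr * rms * p for p in profile] for _ in range(n_lines)]
observed = [[s + rng.gauss(0.0, rms) for s in window] for window in simulated]
weights = [line_snr / rms] * n_lines                 # equal SNR, equal weights

obs_stack = snr_weighted_stack(observed, weights)
sim_stack = snr_weighted_stack(simulated, weights)
stack_rms = rms / math.sqrt(n_lines)                 # noise after averaging
response = matched_filter(obs_stack, sim_stack, stack_rms)
peak_sigma = max(response)
peak_index = response.index(peak_sigma)
print(f"peak response {peak_sigma:.1f} sigma at channel {peak_index}")
```

The point of the sketch is that individually undetectable lines can add up to a significant peak once they are aligned in velocity and filtered with the expected line shape.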

Fig. 2: Velocity-stacked spectra and matched filter responses of 2- and 4-cyanopyrene.

a,c, The stacked GOTHAM observations (black) are overlaid with the simulated stacked spectrum of 2-cyanopyrene (a) and 4-cyanopyrene (c) (blue and pink, respectively), each consisting of the 150 brightest SNR lines. Marginalized posterior parameters were used in both simulations, as reported in Supplementary Table 3. b,d, The corresponding impulse response for the matched filtering analysis is shown, yielding a significance of 8.0σ and 12.9σ for the 2-cyanopyrene (b) and 4-cyanopyrene detections (d), respectively. The small features in the stacked simulated spectrum (pink) in c result from the densely populated 4-cyanopyrene lines (Supplementary Figs. 1 and 3) that add up in the stack.

Discussion

Abundances of the cyanopyrene isomers

We extracted column densities of \(1.5{2}_{-0.16}^{+0.18}\), \(0.8{4}_{-0.09}^{+0.09}\) and \(1.3{3}_{-0.09}^{+0.10}\times 1{0}^{12}\,{{\rm{cm}}}^{-2}\) for the 1-cyanopyrene, 2-cyanopyrene and 4-cyanopyrene isomers, respectively, that is, a 1.8:1:1.6 abundance ratio (with an uncertainty of approximately ± 0.3) or roughly 2:1:2 (ref. 21). On the assumption that CN addition to a double bond in pyrene is the major formation pathway for the cyanopyrene isomers,

$${\rm{CN}}+{{\rm{C}}}_{16}{{\rm{H}}}_{10}\longrightarrow {{\rm{C}}}_{16}{{\rm{H}}}_{9}{\rm{CN}}+{\rm{H}}, \tag{1}$$

the observed ratio is consistent with two equivalent sites yielding 2-cyanopyrene and four equivalent sites yielding 1- and 4-cyanopyrene (Fig. 1, coloured circles).
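The comparison between the observed isomer ratio and the statistical expectation can be reproduced in a few lines. This is a minimal sketch using only the column densities and site counts quoted above; the dictionary names are illustrative.

```python
# Column densities quoted in the text, in units of 1e12 cm^-2.
column_density = {"1-cyanopyrene": 1.52, "2-cyanopyrene": 0.84, "4-cyanopyrene": 1.33}
# Number of equivalent CN-addition sites on pyrene leading to each isomer.
equivalent_sites = {"1-cyanopyrene": 4, "2-cyanopyrene": 2, "4-cyanopyrene": 4}

reference = "2-cyanopyrene"
observed_ratio = {iso: n / column_density[reference]
                  for iso, n in column_density.items()}
statistical_ratio = {iso: s / equivalent_sites[reference]
                     for iso, s in equivalent_sites.items()}

for iso in column_density:
    print(f"{iso}: observed {observed_ratio[iso]:.1f}, "
          f"statistical {statistical_ratio[iso]:.1f}")
```

Normalizing to 2-cyanopyrene gives observed ratios of about 1.8 and 1.6 for the 1- and 4-isomers, against a statistical value of 2 for both.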

The use of the cyanopyrene abundance to estimate the abundance of pyrene relies on knowledge of the formation pathway of cyanopyrene. If cyanopyrene is formed predominantly from CN addition to pyrene under kinetic control, then the relative cyanopyrene/pyrene abundance would be determined by the ratio of the CN addition rate and the total cyanopyrene destruction rate. In this case, it is reasonable to expect that the 1-, 2- and 4-cyanopyrene product-branching ratios would be proportional to the number of equivalent sites available for CN addition. Calculations of the CN addition rate coefficients were therefore carried out to evaluate whether the observed isomer distribution agrees with kinetic control for the addition of CN to pyrene (Methods and Supplementary Table 5).

The EP3-corrected ωB97X-D4/def2-TZVPP surface was incorporated into the energy-grained master equation calculator Master Equation Solver for Multi-Energy Well Reactions (MESMER) 7.0 (Supplementary Section 7). The site-specific bimolecular rate coefficients for the CN addition H elimination reaction with pyrene at 10 K were predicted to be k1 = 2.02 × 10−10 cm3 s−1, k2 = 1.01 × 10−10 cm3 s−1 and k4 = 2.02 × 10−10 cm3 s−1, consistent with the observation of the cyanopyrene isomers in a ~2:1:2 ratio. However, the product-branching ratios are sensitive to the heights of the submerged barriers for H elimination.

The heights of the submerged barriers on the EP3//ωB97X-D4/def2-TZVPP surface are large enough that, within the estimated uncertainty of the calculations, they could potentially impact the product distribution. Further computational work to evaluate the magnitude of the exit barriers will be important in elucidating whether the observed ratio of the cyanopyrene isomers agrees with a kinetically controlled addition of CN to pyrene.

In our previous work21, we used a wide range (0.01–0.1) for the CN/H ratio to estimate the abundance of pyrene from 1-cyanopyrene. Here, we use astrochemical modelling, performed with the nautilus modelling code29 (Methods) to further constrain the abundance of pyrene in TMC-1. While pyrene and cyanopyrene are not included in our models, chemically similar aromatics can be used to explore the CN/H ratio. In these models, we allow the rate coefficient for CN + aromatic (either indene or benzene) to vary, enabling us to determine how the CN/H ratio depends on the magnitude of the rate coefficient as well as the chemical age. We then used the outputs of the model (Fig. 3) and our predicted isomer-specific rate coefficients for CN + pyrene to estimate the expected CN/H ratio for cyanopyrene/pyrene in TMC-1. In addition, we compared the modelled CN/H ratio with the ratio of 2-cyanoindene/indene from radio observations20.

Fig. 3: The CN/H ratio as a function of the CN+aromatic rate coefficient, k.

a, The filled circles show ratios obtained from the astrochemical models for C9H7CN/C9H8 at different simulation ages. The isomer-specific CN/H ratio obtained at 0.2 Myr by taking 1/6 of the collision rate coefficient for the reaction of CN with indene (that is, 8.14 × 10−11 cm3 s−1) is shown by the purple diamond. The observed C9H7CN/C9H8 ratio is shown by the horizontal dashed line with uncertainties shown in the horizontal grey bar. The yellow-filled triangles show C6H5CN/C6H6 at 0.2 Myr as a function of the CN + C6H6 rate coefficient. b, Zoomed version of the area framed in grey in a.

Figure 3 shows the linear dependence of the CN/H ratio on the rate coefficient for the reaction of CN with an aromatic hydrocarbon, illustrated using 2-cyanoindene/indene, C9H7CN/C9H8, and benzonitrile/benzene, C6H5CN/C6H6. We explored this dependence for different chemical ages between 0.1 Myr and 2 Myr, since a range of values have been reported for different molecules and model parameters28,30. The linear dependence was observed for all chemical ages, albeit with different slopes. This trend is due to the increased efficiency of simultaneously destroying the pure aromatic and producing the nitrile-functionalized aromatic. Both benzene and indene display similar trends with their CN/H ratios nearly identical at 0.2 Myr. Assuming similar production and destruction mechanisms, these models can be extended to the CN/H ratio for other aromatic molecules and their nitriles, such as naphthalene and the cyanonaphthalene isomers.

It is important to note that, while CN addition to benzene produces only one isomer (benzonitrile), CN addition to a double bond in indene can occur via six distinct sites; therefore, the product-branching ratio must be taken into account when determining the appropriate CN + aromatic rate coefficient. To explore this in more detail, we estimated the CN + indene collision rate coefficient, kcoll (Methods and equation (3)). Using the simple assumption that each product channel is equally likely, kcoll/6 = 8.14 × 10−11 cm3 s−1 is used to approximate the rate coefficient for CN + indene to form 2-cyanoindene in the collision limit. To compare with observations, we chose an age of 0.2 Myr based on the approximate chemical age of TMC-1 that was derived previously from modelling of carbon chains31,32. As shown in Fig. 3, this rate coefficient is consistent with both the modelled and observed CN/H ratio for 2-cyanoindene/indene of 0.023. We therefore used these results to constrain the CN/H ratio for cyanopyrene/pyrene.

Using the MESMER predicted rate coefficients, the abundance of pyrene can be estimated from Fig. 3 as ~20× the abundance of either 1-cyanopyrene or 4-cyanopyrene (CN/H ≈ 0.05), or ~40× the abundance of 2-cyanopyrene (CN/H ≈ 0.025), that is, a column density of ~3 × 1013 cm−2.
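The arithmetic behind this estimate is a division of each observed nitrile column density by its CN/H ratio. A minimal sketch using only the numbers quoted in the text (variable names are illustrative):

```python
# Observed column densities (cm^-2) and the CN/H ratios read from Fig. 3.
column_density = {"1-cyanopyrene": 1.52e12,
                  "2-cyanopyrene": 0.84e12,
                  "4-cyanopyrene": 1.33e12}
cn_to_h_ratio = {"1-cyanopyrene": 0.05,
                 "2-cyanopyrene": 0.025,
                 "4-cyanopyrene": 0.05}

# Each isomer gives an independent estimate of the pyrene column density.
pyrene_estimates = {iso: column_density[iso] / cn_to_h_ratio[iso]
                    for iso in column_density}
mean_pyrene = sum(pyrene_estimates.values()) / len(pyrene_estimates)

for iso, value in pyrene_estimates.items():
    print(f"from {iso}: N(pyrene) ~ {value:.2e} cm^-2")
print(f"mean: {mean_pyrene:.1e} cm^-2")
```

The three isomers give mutually consistent values that cluster around 3 × 10^13 cm^-2.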

Bottom-up versus top-down formation pathways

Measurements of 13C isotopic substitutions in samples from asteroid Ryugu suggest that the two- and four-ring PAHs naphthalene, fluoranthene and pyrene probably formed in low-temperature interstellar environments16. However, the observed doubly 13C substitutions for three-ring PAHs anthracene and phenanthrene, as well as pyrene in the carbonaceous chondrite Murchison, suggest they may have formed in high-temperature circumstellar envelopes of evolved stars. Alternatively, these results could be explained by a scenario where these species formed or were altered on the parent body of asteroid Ryugu and, hence, after solar system formation. If these small PAHs are formed in circumstellar envelopes, they must survive destruction by radiation and shocks in the diffuse ISM33,34. However, if this is the case, it is difficult to understand why only the three-ring PAHs survived passage through the diffuse ISM. Indeed, cursory searches for the three-ring PAHs 9-cyanoanthracene and 9-cyanophenanthrene, whose rotational spectra are known35, in GOTHAM observations of TMC-1 have not yet been successful.

Various bottom-up mechanisms have been proposed to explain the formation pathways of PAHs in high-temperature environments, such as combustion systems36 and circumstellar envelopes37. These mechanisms generally occur in two stages36: first-ring closure followed by growth through subsequent addition of aromatic rings. The most prominent of the latter is the hydrogen abstraction, acetylene (C2H2) addition (HACA) mechanism36. In this mechanism, hydrogen is first abstracted from an aromatic molecule (typically by a hydrogen atom), followed by addition of acetylene to the radical site. A bottom-up HACA mechanism to pyrene has been proposed by ref. 37 involving the reaction of the 4-phenanthrenyl radical (C14H9) with C2H2. However, the addition requires overcoming a barrier on the order of 10–20 kJ mol−1 (1,200–2,400 K equivalent temperature), prohibiting the HACA mechanism at the low temperatures of TMC-1 (ref. 38).
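The equivalent temperatures quoted for the HACA barrier follow from dividing the molar barrier height by the gas constant. The sketch below also evaluates a simple Boltzmann factor at 10 K. That factor is an illustrative simplification, not a rate calculation from the paper.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1

def barrier_temperature(barrier_kj_mol):
    """Equivalent temperature E/R for a barrier given in kJ/mol."""
    return barrier_kj_mol * 1.0e3 / GAS_CONSTANT

def boltzmann_factor(barrier_kj_mol, temperature_k):
    """exp(-E/RT): rough thermal suppression for crossing the barrier."""
    return math.exp(-barrier_temperature(barrier_kj_mol) / temperature_k)

for barrier in (10.0, 20.0):
    print(f"{barrier:.0f} kJ/mol -> {barrier_temperature(barrier):.0f} K, "
          f"exp(-E/RT) at 10 K = {boltzmann_factor(barrier, 10.0):.1e}")
```

Even the lower end of the barrier range is suppressed by more than fifty orders of magnitude at 10 K, which is why an entrance barrier of this size rules the pathway out in TMC-1.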

An alternative low-temperature mechanism for the bottom-up ring growth of PAHs has been proposed, the so-called hydrogen abstraction, vinylacetylene (C4H4) addition (HAVA) mechanism. Phenanthrene has been shown to form by the HAVA mechanism via the naphthyl radical39. Since the HAVA mechanism generally involves only submerged barriers, it can operate at low temperatures like those in dense molecular clouds. Recently, the reaction between the phenylethynyl radical (C6H5CC) and benzene has been proposed as another viable bottom-up mechanism to form phenanthrene at low temperature40. However, vinylacetylene addition to the 4-phenanthrenyl radical is not expected to form pyrene41. While this mechanism may form vinylpyrene, it is one of the few HAVA mechanisms studied that possesses a barrier in the entrance channel, due to steric hindrance at the reaction site. Thus, a low-temperature bottom-up mechanism to pyrene has yet to be unveiled.

Top-down routes to smaller PAHs have also been explored, including the fragmentation of bulk amorphous carbon or graphite by collisions of dust grains or interstellar shocks42,43. The relative importance of top-down versus bottom-up chemistry is difficult to quantify, and most astrochemical models focus on the latter. A comparison between the abundances of pyrene and smaller aromatic molecules detected in TMC-1 thus far can provide clues about its formation pathways. The column density of benzonitrile is \(1.7{3}_{-0.10}^{+0.85}\times 1{0}^{12}\,{{\rm{cm}}}^{-2}\) (ref. 17); therefore, assuming a CN + benzene rate coefficient of ~4 × 10−10 cm3 s−1 (ref. 24), we predict a benzene abundance of ~1.4 × 1013 cm−2. Since this is approximately half the abundance that we derive above for pyrene, it is difficult to envision a bottom-up route to pyrene from benzene, unless benzene is destroyed much more efficiently than pyrene. Observational constraints on the abundance of larger PAHs would help to constrain the relative importance of top-down versus bottom-up formation pathways. If pyrene does form top-down, further work is required to reconcile this mechanism with the isotopic results from Ryugu.

Conclusions

We report interstellar detections of the two CN-functionalized pyrene isomers, 2- and 4-cyanopyrene, in our GOTHAM observations towards the dark molecular cloud TMC-1. Together with the previously detected 1-cyanopyrene, they form a family of the largest interstellar molecules identified by radio astronomy so far. New theoretical calculations of the CN + pyrene rate coefficients, together with the column densities of \(1.5{2}_{-0.16}^{+0.18}\), \(0.8{4}_{-0.09}^{+0.09}\) and \(1.3{3}_{-0.09}^{+0.10}\times 1{0}^{12}\,{{\rm{cm}}}^{-2}\) for 1-cyanopyrene, 2-cyanopyrene and 4-cyanopyrene, respectively, which form an approximate abundance ratio of 2:1:2, help to better constrain the CN/H ratio of all aromatic species present in TMC-1 and, hence, the abundance of pure pyrene. We estimate a column density for gas-phase pyrene of ~3 × 1013 cm−2 corresponding to an abundance of ~3 × 10−9 with respect to H2. In light of this, PAH formation mechanisms should be revisited to help explain the origin and abundance of pyrene in TMC-1.

Methods

Rotational spectroscopy

Rotational spectra of 2- and 4-cyanopyrene were predicted using the open-source quantum chemical package PSI4 (ref. 44). Their geometries were initially optimized at the B3LYP/6-311++G(d,p)45 level of theory and basis set and subsequently using M06-2X/6-31+G(d)46,47, determining their rotational constants A, B and C as presented in Supplementary Table 1. They agree well with the constants derived by the ‘Lego brick’ approach48. The 14N nuclear electric quadrupole hyperfine coupling constants, χaa and χbb, were estimated by rotating the χ tensor of benzonitrile (cyanobenzene), which has been accurately measured experimentally49, to the principal axis coordinate systems of 2- and 4-cyanopyrene, assuming that the electric field gradients remain identical with respect to the local CN bond axis and molecular plane.

The laboratory rotational transition rest frequencies were measured using a cavity-enhanced FTMW spectrometer50,51. A laser ablation source was employed for solid sample introduction. The molecule of interest was either mixed with anthracene (Sigma-Aldrich, purity ≥97%) as a binder material in a 1:1 ratio to produce a homogeneous mixture (4-cyanopyrene) or used as is (2-cyanopyrene) and pressed with 3 tonnes of press force in a hydraulic press into a 0.25″-diameter cylindrical sample rod. The sample rod was mounted 8 mm downstream of a pulsed solenoid valve that was backed with 2.5 kTorr of neon as carrier gas. Sample ablation was performed using the second harmonic of a Continuum Surelite (SLI-10) Nd:YAG laser at a wavelength of 532 nm with a pulse energy of 50 mJ synchronized to operate during the ≤1 ms opening time of the solenoid valve. The ablated cyanopyrenes were carried into the FTMW spectrometer, and the supersonic expansion cooled them to a rotational temperature of ~2 K. After a brief search near the predicted frequencies of the strongest transitions, we observed several lines that could be successfully assigned to 2- and 4-cyanopyrene and were used to iteratively refine the spectroscopic predictions to search for additional transitions. Ultimately, we observed 762 and 318 individual or partly blended rotational transitions over the 7–16 GHz frequency range for 2-cyanopyrene and 4-cyanopyrene, respectively. The rest frequencies were least-squares fit to a standard rotational Hamiltonian using SPCAT/SPFIT in Pickett’s CALPGM suite of programmes52 (A-reduced, Ir representation) including quartic centrifugal distortion and 14N nuclear electric quadrupole hyperfine coupling constants, which are reported in Supplementary Table 1.

Observations and analysis

Our analyses make use of data from the GOTHAM project17,26 including observations up until May 202220,27. Spectra were collected using the VErsatile GBT Astronomical Spectrometer (VEGAS) on the 100 m Robert C. Byrd GBT. Project codes for the observations used in the data set are AGBT17A_164, AGBT17A_434, AGBT18A_333, AGBT18B_007, AGBT19B_047, AGBT20A_516, AGBT21A_414 and AGBT21B_210. Data were recorded with a uniform frequency resolution of 1.4 kHz, or 0.05–0.01 km s−1 in velocity space.
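The quoted velocity resolution follows from the fixed channel width and the observing frequency. A minimal sketch, assuming the band edges of 7.9 GHz and 36.4 GHz reported for the survey coverage:

```python
SPEED_OF_LIGHT_KM_S = 299_792.458

def velocity_resolution(channel_width_hz, frequency_hz):
    """Velocity width (km/s) of one channel at the given sky frequency."""
    return SPEED_OF_LIGHT_KM_S * channel_width_hz / frequency_hz

channel = 1.4e3  # Hz, uniform frequency resolution of the survey
low_band = velocity_resolution(channel, 7.9e9)    # low-frequency edge
high_band = velocity_resolution(channel, 36.4e9)  # high-frequency edge
print(f"{low_band:.3f} km/s at 7.9 GHz, {high_band:.3f} km/s at 36.4 GHz")
```

Because the channel width is constant in frequency, the velocity resolution improves towards higher frequencies.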

The data span the X-, Ku-, K- and most of the Ka-receiver bands, with nearly continuous coverage from 7.9 to 11.6 GHz, 12.7 to 15.6 GHz and 18.0 to 36.4 GHz. The total bandwidth covered is 24.9 GHz. To visualize the spectral coverage, the GOTHAM dataset is overlaid in Supplementary Fig. 1 with the simulated rotational spectra of the cyanopyrene isomers at ~8 K. The cyanopyrene lines covered by the GOTHAM dataset are shown in violet (1-cyanopyrene), blue (2-cyanopyrene) and pink (4-cyanopyrene). The shaded grey boxes represent the averaged RMS noise level in each data chunk, typically ~2–20 mK. The RMS noise increase at higher frequencies is due to the shorter total integration times in those frequency ranges.

Pointing was performed on the cyanopolyyne peak in TMC-1 at right ascension \(0{4}^{{\rm{h}}}4{1}^{{\rm{m}}}4{2}_{.}^{{\rm{s}}}50\), declination \(+2{5}^{\circ }4{1}^{{\prime} }2{6}_{.}^{{\prime\prime} }8\) (J2000 equinox). Spectra (on/off source) were collected using position-switching between the source and an emission-free position offset by 1°. Repointing and focusing were generally carried out every 1–2 h, primarily on the calibrator J0530+1331. Flux calibration was performed using an internal noise diode and Karl G. Jansky Very Large Array observations of J0530+1331, that is, the same source used for pointing. The flux uncertainty is estimated to be ~20% (refs. 20,27).

MCMC analysis

Prior observations of TMC-1 have shown that most centimetre-wave emission can be separated into contributions from four different velocity components53,54,55. Thus, we consider four different Doppler components, each with independent source size, velocity (in the local standard of rest, vlsr) and column densities (NT); and a shared uniform excitation temperature (Tex) and linewidth (ΔV), resulting in 14 modelling parameters for each molecule.

To account for covariance between the model parameters, we use an affine-invariant MCMC sampling analysis, which has previously been applied to complex probability distributions in many components, including for previous observations of molecules17,28.

The priors we adopted for the MCMC analysis of the two newly detected cyanopyrene isomers are listed in Supplementary Table 2. A uniform distribution (that is, unconstrained within the minima and maxima) was chosen for the NT and Tex priors, whereas the remaining parameters were set to have more tightly constrained Gaussian distributed priors centred at the values determined by prior observations of chemically similar species benzonitrile, cyanonaphthalenes17 and 1-cyanopyrene21.

Posterior probability distributions for each of the model parameters, along with their covariances, were generated using 100 walkers with 10,000 samples. The resulting source-dependent molecular parameters for the cyanopyrene isomers are reported in Supplementary Table 3. The covariance plots of the 14 parameters resulting from our MCMC analysis for the 2- and 4-cyanopyrene isomers are shown in Supplementary Figs. 4 and 5. The covariance plot for 1-cyanopyrene can be found in the supplementary material for ref. 21.
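For readers unfamiliar with affine-invariant ensemble sampling, the following is a minimal sketch of the stretch move on a toy two-parameter problem. It is not the analysis code used here: the Gaussian `log_posterior` is a hypothetical stand-in for the real spectral likelihood, centred on the 4-cyanopyrene values quoted above with assumed widths, and the walker count and chain length are chosen only to keep the example fast.

```python
import math
import random

def log_posterior(theta):
    """Toy stand-in: independent Gaussians in (N_T / 1e12 cm^-2, T_ex / K)."""
    n_col, t_ex = theta
    return -0.5 * (((n_col - 1.33) / 0.10) ** 2 + ((t_ex - 8.27) / 0.45) ** 2)

def stretch_move_sampler(log_prob, start, n_steps, a=2.0, seed=0):
    """Serial stretch-move ensemble sampler; returns ensemble snapshots."""
    rng = random.Random(seed)
    walkers = [list(w) for w in start]
    n_walkers, n_dim = len(walkers), len(walkers[0])
    log_p = [log_prob(w) for w in walkers]
    chain = []
    for _ in range(n_steps):
        for k in range(n_walkers):
            # Pick a partner walker j != k from the rest of the ensemble.
            j = rng.randrange(n_walkers - 1)
            if j >= k:
                j += 1
            # Draw the stretch factor z from g(z) ~ 1/sqrt(z) on [1/a, a].
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
            proposal = [walkers[j][d] + z * (walkers[k][d] - walkers[j][d])
                        for d in range(n_dim)]
            log_p_new = log_prob(proposal)
            # The z^(n_dim - 1) factor keeps the move in detailed balance.
            log_accept = (n_dim - 1) * math.log(z) + log_p_new - log_p[k]
            if log_accept >= 0.0 or rng.random() < math.exp(log_accept):
                walkers[k], log_p[k] = proposal, log_p_new
        chain.append([list(w) for w in walkers])
    return chain

init = random.Random(1)
start = [[init.uniform(0.5, 2.0), init.uniform(5.0, 10.0)] for _ in range(32)]
chain = stretch_move_sampler(log_posterior, start, n_steps=1500)
kept = [w for snapshot in chain[500:] for w in snapshot]  # discard burn-in
mean_n = sum(w[0] for w in kept) / len(kept)
mean_t = sum(w[1] for w in kept) / len(kept)
print(f"posterior means: N_T ~ {mean_n:.2f}e12 cm^-2, T_ex ~ {mean_t:.2f} K")
```

Because each proposal is built from the positions of other walkers, the sampler adapts to parameters with very different scales without hand-tuned step sizes, which is useful when column densities and temperatures are sampled together.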

Theoretical calculations

Initial structures for the adducts and separated reagents were optimized with the RI-BP86 density functional theory (DFT) functional56 using the def2-SVP basis set57 and including D3(BJ) empirical dispersion corrections58 (RI-BP86-D3(BJ)/def2-SVP will be referred to as DFT-Cheap). The presence of barriers to subsequent transfer and elimination from the adducts was evaluated using relaxed surface scans with DFT-Cheap. Relaxed scans of the surface are carried out by allowing all but one coordinate within the molecule to reach their energy minimum while varying the fixed value of the frozen coordinate (typically a bond length). These scans were carried out in the forward and reverse directions to check for consistency. For example, in the case of CN addition to the ring, the coordinate scanned was the bond length between the ring carbon atom and the carbon of the CN radical. Where peaks in the energy profile for the relaxed scans were observed, these were used as the initial structure for transition state optimizations. All the structures calculated previously were then further refined and harmonic vibrations were calculated with the ωB97X functional59 with the triple zeta basis set def2-TZVPP60 and D4 empirical dispersion corrections61 (henceforth DFT-2); intrinsic reaction coordinate scans were carried out to ensure that these structures connected to reactants and products using the ORCA implementation of the method of Morokuma and coworkers62,63. Harmonic vibrational frequencies were scaled by 0.95334 and harmonic zero point energies were scaled by 0.9779 in the manner suggested by ref. 64.

Single-point energy corrections were then carried out using the EP3 approximation of the CCSD(T) (coupled cluster theory with full treatment of single and double excitations and including perturbative treatment of triple excitations) complete basis set (CBS) limit65,66; this approximation is based on three MP2 (Møller–Plesset perturbation theory second-order treatment of electron correlation) calculations with increasing basis set size and a single CCSD(T) calculation with the smallest basis set (equation (2)). Here, the small basis set was Dunning’s correlation consistent basis set cc-pVDZ with the extrapolation of the MP2 energies also using cc-pVTZ and cc-pVQZ67; these basis sets were taken from the correlation consistent basis set repository, ccREPO68. This approach can typically yield accuracy on the order of 5–15 kJ mol−1. However, the accuracy of the relative energies of similar species, such as the three isomers of cyanopyrene formed, will probably be much better, on the order of ±1 kJ mol−1. All calculations were performed using the open-source quantum chemical package ORCA 5.0.4 (ref. 63). As no barriers at the DFT-Cheap level had been observed for the addition of CN to the ring in the 1-, 2- or 4-position, additional relaxed scans with the more reliable hybrid functionals and triple zeta basis sets DFT-2 (refs. 59,60,61) and M06-2X/def2-TZVPP46,60 were carried out to verify this result. The potential energy curves recovered from these relaxed scans for CN approach to the ring did not reveal any submerged barriers or pre-reaction complexes; as such, the influence of pre-reaction complexes on these routes was not considered.

$${\rm{CCSD}}({\rm{T}})/{\rm{CBS}}\approx \text{EP3}={E}_{{\rm{HFCBS}}}+{E}_{{\rm{MP}}2{\rm{CBS}}}+{E}_{{\rm{CCSD}}({\rm{T}}){\rm{small}}}-{E}_{{\rm{MP}}2{\rm{small}}} \tag{2}$$
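Equation (2) is a simple additive composite, which the following sketch makes explicit. The energies are hypothetical placeholders in hartree, not results from this work, and reading the MP2 CBS term as a correlation energy added to the Hartree–Fock CBS energy is an assumption.

```python
HARTREE_TO_KJ_MOL = 2625.4996

def ep3_energy(e_hf_cbs, e_mp2_cbs, e_ccsdt_small, e_mp2_small):
    """EP3 estimate of the CCSD(T)/CBS energy following equation (2)."""
    # The last two terms form a small-basis correction for higher-order
    # correlation beyond MP2.
    return e_hf_cbs + e_mp2_cbs + (e_ccsdt_small - e_mp2_small)

def relative_energy_kj_mol(species, reference):
    """EP3 energy of `species` relative to `reference`, in kJ/mol."""
    return (ep3_energy(**species) - ep3_energy(**reference)) * HARTREE_TO_KJ_MOL

# Hypothetical component energies (hartree) for separated reagents and adduct.
reagents = {"e_hf_cbs": -100.000, "e_mp2_cbs": -0.400,
            "e_ccsdt_small": -0.350, "e_mp2_small": -0.330}
adduct = {"e_hf_cbs": -100.050, "e_mp2_cbs": -0.410,
          "e_ccsdt_small": -0.360, "e_mp2_small": -0.335}

well_depth = relative_energy_kj_mol(adduct, reagents)
print(f"adduct relative to reagents: {well_depth:.1f} kJ/mol")
```

Only the cheapest basis set needs a CCSD(T) calculation, which is what makes the scheme tractable for a molecule the size of cyanopyrene.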

The results show that the approach of CN to pyrene is barrierless for the addition of carbon to the 1-, 2- or 4-positions, leading to the formation of deeply bound adducts 121–182 kJ mol−1 below the entrance energies. The barriers to the subsequent H atom elimination are submerged by 2–17 kJ mol−1.

This EP3//DFT-2 surface was then incorporated into the energy-grained master equation calculator MESMER 7.0 (ref. 69), which allowed the reaction to be simulated over a range of densities (1 × 104 to 1 × 1015 cm−3) and temperatures (10–300 K). There are eight equivalent pathways to addition at the 1- and 4-positions and four equivalent pathways to addition at the 2-position. The temperature-dependent collision rate coefficient for CN + pyrene was estimated using classical capture theory70

$${k}_{{\rm{coll}}}(T\;)={\sigma }_{{\rm{coll}}}\langle v(T\;)\rangle =\left[\uppi {\left(\frac{2{C}_{6}}{{k}_{{\mathrm{B}}}T}\right)}^{1/3}\varGamma \left(\frac{2}{3}\right)\right]\left[{\left(\frac{8{k}_{{\mathrm{B}}}T}{\uppi \mu }\right)}^{1/2}\right], \tag{3}$$

where kB is the Boltzmann constant, Γ(x) is the gamma function such that Γ(2/3) = 1.354, μ is the reduced mass of the collision and C6 is the sum of coefficients describing the magnitude of the attractive forces between collision partners71.

$${C}_{6}=\frac{2}{3}\left(\frac{{\mu }_{1}^{2}{\mu }_{2}^{2}}{{k}_{{\mathrm{B}}}T{\left(4\uppi {\epsilon }_{0}\right)}^{2}}\right)+\frac{{\mu }_{1}^{2}{\alpha }_{2}+{\mu }_{2}^{2}{\alpha }_{1}}{4\uppi {\epsilon }_{0}}+\frac{3}{2}{\alpha }_{1}{\alpha }_{2}\left(\frac{{I}_{1}{I}_{2}}{{I}_{1}+{I}_{2}}\right), \tag{4}$$

where ϵ0 represents the permittivity of free space, μ1 and μ2 are the dipole moments of the reactants, and α1/α2 and I1/I2 are their polarizabilities and ionization energies, respectively. Where possible, these values were taken from the online databases National Institute of Standards and Technology Chemistry WebBook and Computational Chemistry Comparison and Benchmark Database72,73; where these were not available, they were calculated using DFT-2. This approach is generally accurate within a factor of 2 for the prediction of rate coefficients for neutral–neutral barrierless reactions71.
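Equations (3) and (4) can be evaluated directly in SI units, as in the sketch below. It assumes that α1 and α2 are polarizability volumes (in m3), which is what makes the three terms of equation (4) dimensionally consistent. The molecular inputs for CN and pyrene are rough placeholder values chosen for illustration, not the database or DFT-2 values used in this work.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
DEBYE = 3.33564e-30       # C m
EV = 1.602176634e-19      # J
AMU = 1.66053906660e-27   # kg
ANGSTROM3 = 1.0e-30       # m^3

def c6_coefficient(mu1, mu2, alpha1, alpha2, ion1, ion2, temperature):
    """Equation (4): dipoles in C m, polarizability volumes in m^3, IEs in J."""
    four_pi_eps0 = 4.0 * math.pi * EPS0
    dipole_dipole = (2.0 / 3.0) * mu1**2 * mu2**2 / (
        K_B * temperature * four_pi_eps0**2)
    dipole_induced = (mu1**2 * alpha2 + mu2**2 * alpha1) / four_pi_eps0
    dispersion = 1.5 * alpha1 * alpha2 * ion1 * ion2 / (ion1 + ion2)
    return dipole_dipole + dipole_induced + dispersion

def capture_rate(c6, reduced_mass, temperature):
    """Equation (3), converted from m^3 s^-1 to cm^3 s^-1."""
    cross_section = (math.pi * (2.0 * c6 / (K_B * temperature)) ** (1.0 / 3.0)
                     * math.gamma(2.0 / 3.0))
    mean_speed = math.sqrt(8.0 * K_B * temperature / (math.pi * reduced_mass))
    return cross_section * mean_speed * 1.0e6

def cn_pyrene_rate(temperature):
    """Capture rate for CN + pyrene with placeholder molecular parameters."""
    mu_cn, mu_pyr = 1.45 * DEBYE, 0.0                      # placeholders
    alpha_cn, alpha_pyr = 3.0 * ANGSTROM3, 28.0 * ANGSTROM3  # placeholders
    ie_cn, ie_pyr = 13.6 * EV, 7.43 * EV                   # placeholders
    m_cn, m_pyr = 26.02 * AMU, 202.25 * AMU
    reduced_mass = m_cn * m_pyr / (m_cn + m_pyr)
    c6 = c6_coefficient(mu_cn, mu_pyr, alpha_cn, alpha_pyr,
                        ie_cn, ie_pyr, temperature)
    return capture_rate(c6, reduced_mass, temperature)

k_10 = cn_pyrene_rate(10.0)
print(f"k_coll(10 K) ~ {k_10:.1e} cm^3 s^-1")
```

With these placeholder inputs the result lands at a few 10−10 cm3 s−1, the same order as the overall rate coefficient quoted for CN + pyrene. Because pyrene has no permanent dipole, the temperature-dependent first term of equation (4) vanishes and the capture rate scales as T^(1/6).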

The capture rate leading to an individual adduct was set to the product of the estimated overall capture rate and the fraction of the total number of available pathways that lead to the formation of that isomer. The redissociation of the adducts was then treated with the inverse Laplace transformation methodology74 in MESMER, along with a Rice–Ramsperger–Kassel–Marcus treatment75,76,77 for the H atom elimination reactions. Quantum mechanical tunnelling effects were accounted for using an asymmetric Eckart barrier model78. The impact of the size of the energy grains was evaluated by varying the grain size between 30 cm−1 and 5 cm−1 until the result had reached a consistent value.

The results of the MESMER simulations (for 10 K and 2 × 104 cm−3) are summarized in Supplementary Table 5. The primary route to product formation is via well-skipping. Uncertainties in the barrier heights were assessed by raising and lowering all three barriers together by 10 kJ mol−1. The predicted branching ratios are 2:1:2 for 1-, 2- and 4-cyanopyrene formation with an overall rate coefficient of 5.05 × 10−10 cm3 s−1.
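The arithmetic linking the pathway counts to the 2:1:2 branching can be sketched as follows. This is a minimal illustration that assumes the product branching simply follows the number of equivalent addition pathways (8, 4 and 8) and that the branching fractions can be applied directly to the overall rate coefficient; the tabulated MESMER results remain the reference values, and the names used here are hypothetical.

```python
from fractions import Fraction

# Number of equivalent addition pathways per isomer, as stated in the text
PATHWAYS = {"1-cyanopyrene": 8, "2-cyanopyrene": 4, "4-cyanopyrene": 8}
K_TOTAL = 5.05e-10  # overall rate coefficient, cm^3 s^-1


def branching_fractions(pathways):
    """Fraction of the total pathway count leading to each isomer."""
    total = sum(pathways.values())
    return {name: Fraction(n, total) for name, n in pathways.items()}


def partial_rate_coefficients(pathways, k_total):
    """Isomer-specific rate coefficients under the degeneracy assumption."""
    fractions = branching_fractions(pathways)
    return {name: float(f) * k_total for name, f in fractions.items()}


if __name__ == "__main__":
    for name, k in partial_rate_coefficients(PATHWAYS, K_TOTAL).items():
        print(f"{name}: {k:.2e} cm3 s-1")
```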

Astrochemical models

To gain insights into the CN/H ratio in TMC-1, we adapted the three-phase chemical network code nautilus v1.1 (ref. 29). The reaction network is that of ref. 32, with modifications to the ion–neutral reaction rate coefficients as described below. We use the same initial conditions as described in ref. 79, that is, gas and dust temperatures of 10 K, a gas density of 2 × 104 cm−3 and a cosmic-ray ionization rate of 1.3 × 10−17 s−1.

Initial elemental abundances were taken from ref. 80 with the exception of atomic oxygen, where we utilize a slightly carbon-rich C/O ≈ 1.1 and XO(t = 0) ≈ 1.5 × 10−4, as described in refs. 28,79.

Rate coefficients for the destruction of benzonitrile via ion–neutral reactions were updated (Supplementary Table 7) using estimates from a Su–Chesnavich capture model, which accounts for long-range Coulombic attractions81,82. In addition, the reaction of carbon atoms with benzene was added to the network for consistency with the other aromatics (indene and naphthalene) currently in the network:

$${\rm{C}}+{{\rm{C}}}_{6}{{\rm{H}}}_{6}\longrightarrow {{\rm{C}}}_{7}{{\rm{H}}}_{5}+{\rm{H}}$$
(5)

The rate coefficient was initially set to 5 × 10−10 cm3 s−1, the room temperature value83. Modifying the rate coefficient of C + aromatic was found to strongly influence the absolute abundances of benzene, benzonitrile, indene and cyanoindene; however, it had negligible influence on the modelled abundance ratios (≤4% for both benzonitrile/benzene and 2-cyanoindene/indene).
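For reference, the Su–Chesnavich capture estimate mentioned above for ion–neutral reactions can be sketched in Python using the commonly quoted parametrization, in which the Langevin rate coefficient is scaled by a factor that depends on x = μD/(2αkBT)1/2. This is only an illustrative sketch in Gaussian (cgs) units and is not necessarily the exact form implemented in the network; the dipole moment and polarizability assigned to benzonitrile are placeholders, and the function names are hypothetical.

```python
import math

# Constants in Gaussian (cgs) units
E_CHARGE = 4.80320425e-10  # elementary charge, esu
K_B = 1.380649e-16         # Boltzmann constant, erg K^-1
AMU = 1.66053906660e-24    # atomic mass unit, g
DEBYE = 1.0e-18            # 1 debye in esu cm
ANGSTROM3 = 1.0e-24        # 1 cubic angstrom in cm^3


def langevin_rate(alpha, reduced_mass):
    """Langevin capture rate coefficient (cm^3 s^-1) for a non-polar neutral."""
    return 2.0 * math.pi * E_CHARGE * math.sqrt(alpha / reduced_mass)


def su_chesnavich_factor(x):
    """Dipole enhancement factor as a function of the reduced dipole parameter x."""
    if x >= 2.0:
        return 0.4767 * x + 0.6200
    return (x + 0.5090) ** 2 / 10.526 + 0.9754


def su_chesnavich_rate(T, dipole, alpha, reduced_mass):
    """Ion-polar-neutral capture rate coefficient in cm^3 s^-1."""
    x = dipole / math.sqrt(2.0 * alpha * K_B * T)
    return langevin_rate(alpha, reduced_mass) * su_chesnavich_factor(x)


# Illustrative placeholder inputs for H3+ + benzonitrile (NOT the paper's values)
DIPOLE = 4.5 * DEBYE
ALPHA = 12.5 * ANGSTROM3
M_ION, M_NEUTRAL = 3.02 * AMU, 103.12 * AMU
REDUCED_MASS_ION = M_ION * M_NEUTRAL / (M_ION + M_NEUTRAL)

if __name__ == "__main__":
    for T in (10.0, 300.0):
        k = su_chesnavich_rate(T, DIPOLE, ALPHA, REDUCED_MASS_ION)
        print(f"T = {T:5.0f} K  k ~ {k:.2e} cm3 s-1")
```

In this parametrization the rate coefficient reduces to the Langevin value for a non-polar neutral and grows as the temperature falls for a polar one, which is why ion–neutral destruction of strongly polar species such as benzonitrile becomes particularly efficient at 10 K.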

The rate coefficient for CN + indene was calculated using classical capture theory70,71 to be 4.88 × 10−10 cm3 s−1; assuming equal formation of all isomers, this leads to a site-specific rate coefficient of 8.14 × 10−11 cm3 s−1 for 2-cyanoindene.

To evaluate the sensitivity of the model predictions to the CN + aromatic rate coefficient, the model was run for a range of \({k}_{{\rm{CN}}+{\rm{benzene}}}\) between 0.5 × 10−10 cm3 s−1 and 6 × 10−10 cm3 s−1 and for a range of \({k}_{{\rm{CN}}+{\rm{indene}}}\) between 0.81 × 10−10 cm3 s−1 and 10 × 10−10 cm3 s−1. The predicted ratios of benzonitrile/benzene and cyanoindene/indene depend linearly on the corresponding formation rate coefficient (\({k}_{{\rm{CN}}+{\rm{aromatic}}}\)). As can be seen in Fig. 3, there is excellent agreement between the predicted and observed ratios of cyanoindene to indene and benzonitrile to benzene using our updated reaction network. There is also strong agreement with the observed isomer-specific ratio for 2-cyanoindene to indene.
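The linear dependence can be rationalized with a simplified steady-state argument, offered here only as an illustrative sketch and not as the treatment used in the full network. Assume that a cyano-aromatic ArCN is formed solely by CN + ArH and is removed by processes with a combined first-order rate \({k}_{{\rm{dest}}}\) (a hypothetical symbol) that does not depend on \({k}_{{\rm{CN}}+{\rm{aromatic}}}\). Then

$$\frac{{\rm{d}}[{\rm{ArCN}}]}{{\rm{d}}t}={k}_{{\rm{CN}}+{\rm{aromatic}}}[{\rm{CN}}][{\rm{ArH}}]-{k}_{{\rm{dest}}}[{\rm{ArCN}}]=0\quad \Rightarrow \quad \frac{[{\rm{ArCN}}]}{[{\rm{ArH}}]}=\frac{{k}_{{\rm{CN}}+{\rm{aromatic}}}[{\rm{CN}}]}{{k}_{{\rm{dest}}}},$$

so that, for a fixed CN abundance and destruction rate, the abundance ratio scales in direct proportion to the formation rate coefficient.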

Synthesis

Syntheses of 2- and 4-cyanopyrene were achieved using pyrene and hexahydropyrene as starting materials, respectively, as reported in the literature84,85,86,87,88,89. The synthetic routes to form these two cyanopyrene isomers are described in detail in Supplementary Section 9.