SARS-CoV-2 human challenge reveals biomarkers that discriminate early and late phases of respiratory viral infections

Rosenheim, Joshua; Gupta, Rishi K.; Thakker, Clare; Mann, Tiffeney; Bell, Lucy C. K.; Broderick, Claire M.; Madon, Kieran; Papargyris, Loukas; Dayananda, Pete; Kwok, Andrew J.; Greenan-Barrett, James; Wagstaffe, Helen R.; Conibear, Emily; Fenn, Joe; Hakki, Seran; Lindeboom, Rik G. H.; Dratva, Lisa M.; Lemetais, Briac; Weight, Caroline M.; Venturini, Cristina; Kaforou, Myrsini; Levin, Michael; Kalinova, Mariya; Mann, Alex J.; Catchpole, Andrew; Knight, Julian C.; Nikolić, Marko Z.; Teichmann, Sarah A.; Killingley, Ben; Barclay, Wendy; Chain, Benjamin M.; Lalvani, Ajit; Heyderman, Robert S.; Chiu, Christopher; Noursadeghi, Mahdad

doi:10.1038/s41467-024-54764-3

Article
Open access
Published: 30 November 2024

Nature Communications volume 15, Article number: 10434 (2024)

Abstract

Blood transcriptional biomarkers of acute viral infections typically reflect type 1 interferon (IFN) signalling, but it is not known whether there are biological differences in their regulation that can be leveraged for distinct translational applications. We use high frequency sampling in the SARS-CoV-2 human challenge model to show induction of IFN-stimulated gene (ISG) expression with different temporal and cellular profiles. MX1 gene expression correlates with a rapid and transient wave of ISG expression across all cell types, which may precede PCR detection of replicative infection. Another ISG, IFI27, shows a delayed but sustained response restricted to myeloid cells, attributable to gene and cell-specific epigenetic regulation. These findings are reproducible in experimental and naturally acquired infections with influenza, respiratory syncytial virus and rhinovirus. Blood MX1 expression is superior to IFI27 expression for diagnosis of early infection, as a correlate of viral load and for discrimination of virus culture positivity. Therefore, MX1 expression offers potential to stratify patients for antiviral therapy or infection control interventions. Blood IFI27 expression is superior to MX1 expression for diagnostic accuracy across the time course of symptomatic infection and thereby, offers higher diagnostic yield for respiratory virus infections that incur a delay between transmission and testing.

Introduction

Host response biomarkers of viral infection have multiple potential clinical applications. These include diagnostic triage tests to prioritise confirmatory laboratory investigations, to guide clinical management decisions with the aim of reducing unnecessary antibacterial prescribing, or to trigger infection control measures and antiviral treatment. Attention has mostly focussed on biomarker discovery in whole blood samples, which enable easy and technically consistent access. Genome-wide transcriptional profiling has emerged as the most common unbiased data-driven approach due to the maturity of technical and analytical workflows¹.

Numerous blood transcriptional signatures for host responses to viral infections have been identified in this way using case-control studies of natural infection or experimental viral challenge in humans, designed to discover the most parsimonious measurements that discriminate viral infections from healthy controls or other diseases. We previously tested the accuracy of such blood transcriptional signatures of viral infection, identified by systematic review, to detect incident SARS-CoV-2 infection². We showed that the majority were highly correlated, and collectively driven by type 1 interferon (IFN) responses. Many, including single gene transcripts (such as that of IFI27), provided near perfect discrimination of PCR positive individuals compared to uninfected controls. In some cases, the transcriptional biomarkers identified infections before the first positive viral PCR in nasopharyngeal samples. The sensitivity of IFI27 measurements was further leveraged to provide evidence for abortive infections associated with virus specific T cell responses without detection of the virus by PCR³.

In observational studies of natural infection, it is not possible to synchronise the time course of exposure and replicative infection. This has precluded identification of temporally distinct host response biomarkers that may offer optimal solutions for different translational applications such as diagnostic triage or patient stratification for antiviral therapies. To address this limitation, we leveraged the first controlled human challenge model of SARS-CoV-2 infection, complemented with high frequency sampling to measure viral replication and host responses spanning the full time course of viral replication⁴. We updated our previous systematic review to undertake comprehensive head-to-head evaluation of all reported host transcriptional signatures of viral infection to date. We compared their ability to discriminate between groups of participants with and without evidence of replicative infection using whole blood samples stratified by time since experimental inoculation. For selected biomarkers, representative of differential host-responses over the time course, we evaluated associations with symptoms and viral load. We investigated their cellular source in single cell transcriptomic data, and the potential epigenetic mechanisms that may underpin their differential expression. We compared their measurement in blood and nasal swabs and explored the extent to which our findings were generalizable to other respiratory viruses in both experimental challenge and natural infection studies.

Results

Blood transcriptional signatures of viral infection

We updated our previous systematic review of the literature to identify 26 blood transcriptional signatures associated with viral infection (Supplementary Fig. 1A, Supplementary Table 1, Supplementary Data 1)⁵⁻²⁹. These included six single gene biomarkers. The remaining multigene signatures were made up of 2–47 constituent genes. The composition of these signatures was generally distinct, reflected by low Jaccard indices in a matrix of pairwise comparisons (Supplementary Fig. 1B).
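The pairwise comparison of signature composition can be illustrated with a minimal sketch. The Jaccard index of two gene sets is the size of their intersection divided by the size of their union, so values near 0 indicate little overlap. The signature names and gene lists below are hypothetical and chosen only for illustration; they are not the signatures identified by the review.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard index: |intersection| / |union| (0.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical signatures for illustration only (not taken from the review).
signatures = {
    "SigA3": {"MX1", "IFI44L", "RSAD2"},
    "SigB2": {"IFI27", "IFI44L"},
    "IFI27": {"IFI27"},
}


def jaccard_matrix(sigs: dict) -> dict:
    """Symmetric matrix of pairwise Jaccard indices, keyed by signature name."""
    names = list(sigs)
    matrix = {name: {name: 1.0} for name in names}
    for x, y in combinations(names, 2):
        matrix[x][y] = matrix[y][x] = jaccard(sigs[x], sigs[y])
    return matrix


matrix = jaccard_matrix(signatures)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

In this toy example the two multigene signatures share one of four distinct genes, giving an index of 0.25.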

Viral infection outcomes in the SARS-CoV-2 controlled human challenge model

Following nasal inoculation with a standardized dose of SARS-CoV-2, the 33 SARS-CoV-2 seronegative healthy volunteers divided into two groups, with (N = 17) and without (N = 16) evidence of sustained replicative infection from 2 days after challenge (Fig. 1). Although the individual viral load profiles were different in nose and throat swabs, both measurements segregated the same participants into two groups with and without replicative infection.

**Fig. 1: SARS-CoV-2 PCR viral load in nose and throat swabs following virus challenge.**

Blood transcriptional biomarker discrimination of participants with and without sustained replicative SARS-CoV-2 infection

Blood transcriptional biomarker scores were calculated for each of the 26 signatures identified by systematic review, from RNA sequencing of whole blood samples at selected time points before and after viral inoculation (Fig. 2A). Across this time course, all the biomarkers showed a transient increase in expression (Supplementary Fig. 3) associated with replicative SARS-CoV-2 infection. We first ranked all biomarkers by their ability to discriminate between participants with and without replicative infection by area under the receiver operating characteristic curve (AUROC) across 14 days. We limited calculations to data from days 3, 7, 10 and 14, in order to achieve equal sampling frequency distribution across the time course of infection (Supplementary Fig. 3). Point estimates of the AUROCs ranged from 0.6 to 0.99. Of the 26 biomarkers, 22 had point estimates of 0.92–0.99 and were statistically comparable, with overlapping 95% confidence intervals, suggesting that most biomarkers were able to accurately discriminate participants with and without replicative infection.
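As a reminder of what the AUROC measures, the sketch below computes a point estimate as the probability that a randomly chosen sample from the replicative infection group has a higher biomarker score than a randomly chosen sample from the group without replicative infection, counting ties as one half. The scores are made-up values for illustration, and the sketch does not reproduce the confidence interval method used in the study, which is not described in this section.

```python
def auroc(scores_pos, scores_neg):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Equivalent to the Mann-Whitney U statistic divided by the number of pairs;
    tied pairs contribute 0.5.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical biomarker scores, for illustration only.
infected = [2.1, 3.4, 1.9, 2.8]
uninfected = [0.3, 1.9, 0.5, -0.2]

print(auroc(infected, uninfected))
```

An AUROC of 0.5 corresponds to chance-level discrimination and 1 to perfect separation of the two groups.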

**Fig. 2: Blood transcriptional discrimination of participants with and without replicative infection by time from SARS-CoV-2 challenge.**

Identification of blood transcriptional biomarkers of early and late phases of SARS-CoV-2 infection

Next, we compared the AUROC of each signature stratified by time point. Most achieved near perfect discrimination of participants with and without replicative infection on days 4–10 (Supplementary Fig. 4A). We found greater variation in performance of each signature before and after this time interval, suggesting differential ability to identify early and late phases of viral infection. To investigate this hypothesis further, we focused on the single gene transcripts with highest AUROC on day 3 (MX1) and on day 14 (IFI27). On day 3, MX1 achieved an AUROC of 0.97 (0.93-1) which reduced to 0.8 (0.64-0.96) by day 14. In contrast, IFI27 achieved an AUROC of 0.73 (0.56-0.91) on day 3, increasing to 1 by day 14 (Fig. 2A). These findings reflected an early but transient increase in MX1 expression and a comparatively delayed but sustained increase in IFI27 expression (Fig. 2B). These distinct temporal profiles were comparable in male and female individuals (Supplementary Fig. 4B). A number of other single gene biomarkers (IFI44L, IFIT3 and RSAD2) were highly correlated to MX1 and distinct from IFI27 (Supplementary Fig. 4C).

Relationship of MX1 and IFI27 expression in blood to symptoms and SARS-CoV-2 viral load

Most biomarker discovery and validation has focused on naturally acquired symptomatic viral infection. We and others have shown that host response biomarkers are able to detect asymptomatic infection²,¹². Consistent with this, we found no correlation between blood transcriptional scores and prospective quantitation of daily symptom scores among individuals who developed replicative infection (Fig. 3A). We used a pre-specified threshold of Z score > 2 to indicate increased biomarker levels compared to baseline with 98% specificity. We found elevated biomarker scores at time points in which participants who experienced replicative infection were completely asymptomatic. This was more evident with MX1 measurements at early time points and with IFI27 measurements at late time points.
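The link between a Z score threshold of 2 and a specificity of about 98% can be sketched as follows. The example assumes that Z scores are computed against the mean and standard deviation of baseline (pre-infection) scores and that baseline scores are approximately normally distributed; both are assumptions for illustration, and the expression values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def z_scores(values, baseline):
    """Standardise values against the mean and SD of a baseline reference set."""
    mu = mean(baseline)
    sd = stdev(baseline)
    return [(v - mu) / sd for v in values]


# Hypothetical log-scale expression values, for illustration only.
baseline = [4.8, 5.1, 5.0, 4.9, 5.2]
post_challenge = [5.1, 6.4, 5.2]

z = z_scores(post_challenge, baseline)
elevated = [score > 2 for score in z]

# Under a normal distribution, the fraction of baseline scores at or below
# Z = 2 is the expected specificity of the threshold (about 0.977).
expected_specificity = NormalDist().cdf(2.0)

print([round(score, 2) for score in z], elevated)
print(round(expected_specificity, 3))
```

Only the second post-challenge value exceeds the threshold in this toy example.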

**Fig. 3: Relationship between blood transcript levels of MX1 and IFI27 with symptoms and viral load by time from SARS-CoV-2 challenge.**

In addition, we investigated the relationship between blood transcriptional signature scores and viral load stratified by time from inoculation in samples from individuals who developed replicative infection. Examples of elevated MX1 and IFI27 scores (Z > 2) were evident at time points with negative virus PCR in contemporaneous nose or throat swabs (Fig. 3B). Elevated MX1 scores associated with negative virus PCR tests were more evident at early time points, and elevated IFI27 scores associated with negative virus PCR tests were more evident at late time points. Importantly, MX1 and IFI27 scores in the normal range (Z < 2) were also evident at time points with positive virus PCR tests in contemporaneous samples. False-negative biomarker results were more evident for IFI27 at early time points, and for MX1 at late time points. To underscore the differential temporal relationship of each biomarker with viral load, we examined longitudinal biomarker measurements per participant who developed replicative infection, indexed by time from first PCR detection of virus (> 4 log10 copies/mL) in nasal swabs, which we have recently reported to correlate best with viral emissions³⁰. The rise in MX1 scores was generally coincident with PCR detection of the virus, and in some individuals evident before detection of virus by PCR. However, the MX1 response generally peaked before the peak in viral load, suggesting that clearance of MX1 transcript enrichment was faster than clearance of the virus. In contrast, IFI27 scores increased after detection of the virus and remained elevated after viral load started to fall (Fig. 3C). Among time points in which at least one of these biomarkers was elevated (Z > 2) to signify replicative infection, the ratio of MX1:IFI27 levels best correlated with time from virus challenge. In this analysis, the transition from a predominant MX1 response to a predominant IFI27 response occurred 5 days after virus challenge (Supplementary Fig. 5).

Both blood transcriptional biomarkers showed statistically significant correlation with viral load when including PCR negative time points (Supplementary Fig. 6A) consistent with the fact that they provided good discrimination of groups of participants with and without replicative infection. However, when restricting the analysis to time points with positive virus PCR tests, we found a significant correlation only to MX1, suggesting this biomarker provided better prediction of viral load than IFI27, which remained elevated at later time points (Fig. 3B, Supplementary Fig. 6B). Consistent with this observation, we also found that MX1 provided a better biomarker of infectiousness than IFI27, by predicting positive viral culture in contemporary samples. Among individuals who developed replicative infection, blood MX1 transcript levels discriminated virus culture positivity in nose or throat samples with AUROC 0.85 (0.79-0.92), significantly better than IFI27 which achieved AUROC of 0.66 (0.57-0.75). In this analysis false positive MX1 levels were limited to early time points, consistent with the observation that the rise in MX1 levels can precede PCR detection of the virus (Fig. 4).

**Fig. 4: Discrimination of virus culture positivity by blood transcript levels of MX1 and IFI27.**

Differential regulation of MX1 and IFI27 expression in blood

Both MX1 and IFI27 are widely recognised as interferon stimulated genes (ISG)³¹,³². To explore this relationship among participants in the replicative infection group, we compared MX1 and IFI27 levels with the average expression of a multigene signature (“STAT1 regulated module”) that we had previously derived and validated as a measure of type 1 IFN bioactivity³³. Both biomarkers showed a statistically significant correlation with the STAT1 module, but the relationship with MX1 was stronger with near perfect correlation and covariance, suggesting that IFI27 expression was subject to additional levels of transcriptional regulation (Fig. 5A, B). To obtain a deeper insight into the mechanisms of differential regulation of MX1 and IFI27, we investigated their expression in our previously reported single cell RNA sequencing analysis of PBMC from a subset of participants with replicative infection in the present SARS-CoV-2 challenge study³⁴. We found a clear increase of MX1 expression in all major PBMC subsets in pooled day 3 data, and subsequent reduction by day 7. In contrast, increased expression of IFI27 was almost exclusively restricted to myeloid cells (monocytes and conventional dendritic cells). Modest upregulation was evident at day 3, but then increased further at day 7 and day 10 before reducing again by day 14, although expression levels remained higher than baseline through to day 28 (Fig. 5C).
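The comparison of a single gene with a module score can be sketched as follows. The module score is taken as the average expression of the module's constituent genes in each sample, and each biomarker is then correlated with that score across samples. The gene names GENE_A to GENE_C are placeholders rather than members of the STAT1 regulated module, the expression values are invented, and the use of the Pearson coefficient is an assumption because the correlation statistic is not specified in this section.

```python
from math import sqrt
from statistics import mean


def module_score(sample: dict, genes: list) -> float:
    """Average expression of the module's constituent genes in one sample."""
    return mean(sample[g] for g in genes)


def pearson(x, y) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Placeholder module genes and invented expression values, for illustration only.
module_genes = ["GENE_A", "GENE_B", "GENE_C"]
samples = [
    {"GENE_A": 1.0, "GENE_B": 2.0, "GENE_C": 3.0, "MX1": 2.0, "IFI27": 1.0},
    {"GENE_A": 2.0, "GENE_B": 3.0, "GENE_C": 4.0, "MX1": 3.0, "IFI27": 1.0},
    {"GENE_A": 4.0, "GENE_B": 5.0, "GENE_C": 6.0, "MX1": 5.0, "IFI27": 2.0},
]

module = [module_score(s, module_genes) for s in samples]
r_mx1 = pearson(module, [s["MX1"] for s in samples])
r_ifi27 = pearson(module, [s["IFI27"] for s in samples])

print(module, round(r_mx1, 3), round(r_ifi27, 3))
```

In this toy data MX1 tracks the module score exactly, whereas IFI27 correlates less tightly, mirroring the qualitative pattern described above.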

In published ATAC sequencing data³⁵, we tested the hypothesis that differential time and cellular distribution of MX1 and IFI27 expression reflected differential epigenetic regulation (chromatin accessibility) of MX1 and IFI27 loci in circulating immune cells. In datasets from unstimulated monocytes, CD4 T effector cells and B-cells from healthy individuals, we found evidence that the MX1 locus contained areas of open chromatin (enrichment of sequencing peaks) close to the transcription start site and exon-1 (Supplementary Figs. 7A, B), which would enable rapid transcriptional upregulation of this gene across multiple cell types. In contrast, the IFI27 locus contained little evidence of open chromatin (Supplementary Figs. 7A, B) in any of these cell types, and was therefore inaccessible for rapid transcriptional upregulation. To evaluate subsequent epigenetic modifications following infection, we leveraged single cell ATACseq data from patients admitted to hospital with COVID-19³⁶. Despite the sparsity in single cell data and relatively low coverage of the IFI27 locus, in samples from patients with acute COVID-19, we found a higher number of IFI27 sequencing reads in monocytes compared to all lymphocyte populations. This difference was less evident in data from convalescent patients (Supplementary Fig. 7C), consistent with transient cell-type specific opening of the IFI27 locus in established infection, providing a mechanistic basis for the temporal delay and cellular restriction of IFI27 responses compared to MX1.

Generalisable differential utility of blood MX1 and IFI27 transcriptional biomarkers in acute respiratory virus infection

In order to investigate whether the differential host responses represented by MX1 and IFI27 were generalisable to other acute respiratory viral infections, we investigated their expression profiles in collated data from previously reported influenza, respiratory syncytial virus, and rhinovirus human challenges among participants with evidence of infection following inoculation as per original study definitions³⁷. In every case MX1 upregulation in whole blood transcriptional profiles preceded that of IFI27 (Fig. 6A). The data from these experiments were limited to ~6 days post-challenge and did not allow us to fully compare the temporal profiles of these biomarker measurements to the present SARS-CoV-2 challenge. Therefore, we undertook transcriptional profiling of blood samples from another recent H3N2 influenza human challenge model that included sampling beyond day 7³⁸. This analysis also reproduced our findings in the SARS-CoV-2 challenge (Fig. 6B).

**Fig. 6: Generalisable differences in temporal profiles of blood MX1 and IFI27 expression in diverse respiratory virus challenges.**

We further sought to extend the generalisability of our findings to natural infections. In a study of household contacts of index cases with COVID-19 spanning the pre-alpha and alpha (B.1.1.7) pandemic waves in the UK³⁹, blood transcript levels of MX1 and IFI27 achieved equivalently good discrimination of contacts with and without prevalent SARS-CoV-2 infection at recruitment (day 0, AUROC 0.97 (0.92-1)). This level of discrimination was maintained for IFI27 in follow up samples 7 days later, but significantly reduced for MX1, consistent with earlier resolution of this biomarker (Fig. 7A). In a further data set from patients with unselected community acquired respiratory virus infections, we evaluated MX1 and IFI27 expression in whole blood transcriptional profiles of individuals with PCR confirmed respiratory virus infections within 48 hours of symptom onset, in four sequential samples on alternate days⁴⁰. Compared to baseline (pre-infection) samples from the same individuals, increased levels of MX1 expression (Z > 2) were largely confined to early time points, day 0-2 after presentation, within 4 days of symptom onset. Increased levels of IFI27 expression (Z > 2) were evident over a longer time course including day 4-6 after presentation, up to 8 days after symptom onset (Fig. 7B, Supplementary Fig. 8A). Across all time points, IFI27 measurements achieved statistically better AUROC than MX1 measurements for discrimination of infection from baseline uninfected samples (Supplementary Fig. 8B). However, when the analysis was stratified by sample time point, MX1 achieved the highest AUROC for discrimination of infected samples on the day of presentation (Supplementary Fig. 8C). The AUROC for MX1 reduced significantly at each subsequent time point. The time point stratified analysis of IFI27 showed stable AUROC discrimination of infection. A combined biomarker signature, comprising the average expression of MX1 and IFI27, improved the AUROC discrimination at early time points compared to IFI27 alone, and at late time points compared to MX1 alone (Supplementary Fig. 8C).

**Fig. 7: Differences in temporal profiles of blood MX1 and IFI27 expression in naturally acquired respiratory virus infections, and delayed responses in the nose to virus challenge.**

Comparison of host response biomarkers of acute respiratory virus infection in blood and nose samples

The potential to measure host response transcriptional signatures in samples from upper respiratory tract swabs has recently been reported⁴¹,⁴². We compared MX1 and IFI27 transcript measurements in samples from blood and nose swabs in the present SARS-CoV-2 challenge. Surface nose swabs only yielded adequate RNA for sequencing in 103 of 238 samples (43%), reflecting an inherent technical limitation in this approach. Nonetheless, for nose samples which did yield RNA sequencing data, we found clear evidence of MX1 and IFI27 responses in participants who developed a replicative infection. Compared with blood measurements of these biomarkers, the signal in nose swab samples was weaker and delayed, and the differential time course for each biomarker evident in blood samples was lost (Fig. 7C). These findings were replicated in blood and nasal mucosal curettage samples from the H3N2 influenza human challenge and indicate that in general, blood biomarker measurements are likely to provide better diagnostic discrimination for prevalent infection as well as better differentiation of early and late phases of infection, compared to nasal swabs (Fig. 7C).

Discussion

We present a comprehensive evaluation of previously reported transcriptional signatures as host response biomarkers of viral infection in high frequency longitudinal blood and nose swab samples from the first SARS-CoV-2 human challenge experiment. We provide compelling evidence that single gene transcripts for MX1 and IFI27 in blood discriminate temporally distinct phases of infection, and we show that these findings are generalisable across a range of clinically important respiratory viruses in both experimental and naturally acquired infections. The earliest phase of replicative SARS-CoV-2 infection was associated with rapid upregulation of MX1 transcripts in blood, which may precede PCR detection of the virus and correlated with PCR positive viral load measurements. In contrast, blood transcriptional upregulation of IFI27 occurred after PCR detection of the virus. IFI27 expression did not correlate with PCR positive viral load measurements and was sustained above baseline levels after viral clearance. Of note, transcriptional upregulation of both biomarkers was independent of symptoms.

Both MX1 and IFI27 are widely recognised as ISGs³¹,³². The MX1 response closely reflected generalised type 1 ISG expression across all major cell types. We focused on MX1 because it achieved the highest single gene point estimate AUROC for discriminating groups of individuals with and without replicative viral infection at the first time point at which any biomarker achieved significant discrimination. Alternative interferon inducible single gene biomarkers such as IFI44L, IFIT3 and RSAD2 provided statistically comparable discrimination at this time point, and are highly correlated to MX1. These biomarkers are likely to share the same mechanisms for transcriptional regulation, and offer the same utility as MX1. Delayed transcriptional upregulation of IFI27 compared to other canonical ISGs has also been reported following in vitro stimulation of cells with IFN⁴³,⁴⁴. In vivo, IFI27 expression in blood samples was restricted to myeloid PBMC. We found evidence of differential epigenetic silencing of the IFI27 locus compared to the MX1 locus in resting PBMC, and cell type specific epigenetic modulation of this locus in monocytes during established COVID-19 infection. These data provide a mechanistic explanation for the differential temporal and cellular expression of the two biomarkers, namely that IFI27 is epigenetically silenced in resting cells but becomes accessible for transcription in specific myeloid lineages during the evolving immune response to infection.

The differential temporal expression of MX1 and IFI27 in the SARS-CoV-2 challenge model was replicated in challenge experiments with multiple influenza strains, respiratory syncytial virus, or rhinovirus, and in data from household contacts with naturally acquired SARS-CoV-2 infection. Likewise, in unselected community acquired symptomatic respiratory virus infections, MX1 measurements achieved high diagnostic accuracy for infections within 4 days of symptom onset, and IFI27 measurements achieved higher diagnostic accuracy at later time points. The combination of both measurements provided highest diagnostic accuracy across all time points.

In SARS-CoV-2 and H3N2 influenza challenge experiments, we found no evidence that measuring these biomarkers in nose swab samples offered any advantage over blood samples. Since IFI27 expression in blood is largely restricted to myeloid cells, quantitatively lower Z scores for IFI27 in nasal swabs may reflect lower sensitivity for host RNA biomarkers in superficial nasal swabs because these swabs capture fewer immune cells from sub-epithelial layers. MX1 expression appears to be less restricted to specific cell types. Therefore, lower MX1 Z scores suggest that nasal swabs may also be less sensitive because the sample provides lower sequencing read depth or the cells captured by nasal swabs do not upregulate IFN stimulated gene expression to the same extent as circulating immune cells. Unexpectedly, upregulation of these biomarkers in nose samples of individuals with replicative infection was also temporally delayed compared to the blood. This finding is also evident in our analysis of comparative single cell sequencing data from blood and nose swab samples. Whether it reflects faster transmission of IFN signalling to circulating blood cells, later onset of viral replication in the nose compared to the throat, or local suppression of IFN signalling by the virus in the nasal mucosa requires future mechanistic investigation.

Major strengths of this study are the comprehensive identification of reported blood transcriptional biomarkers of viral infection by systematic review, and their application in standardised SARS-CoV-2 and influenza human challenges with high frequency sampling, which enabled identification of differential temporal profiles of MX1 and IFI27 responses. Single cell data from the SARS-CoV-2 challenge model, and analyses of publicly available data, also allowed us to investigate the mechanism for these differential temporal profiles and to confirm reproducibility of our findings across a range of respiratory virus infections.

Our key conceptual advance is that different blood transcriptional biomarkers discriminate early and late phases of infection as a result of cell-specific epigenetic regulation and have important translational applications for respiratory virus infections. MX1 transcripts may have specific diagnostic utility in early pre-symptomatic infection. Their correlation with viral load and infectiousness represents the first evidence for a biomarker that can be used to identify patients most likely to benefit from antiviral treatments and infection control measures, such as quarantine or self-isolation. Finally, the delay between transmission and testing in naturally acquired infection means that, overall, IFI27 transcripts achieve greater diagnostic accuracy than MX1 transcripts as a diagnostic triage test for respiratory virus infection. To take full advantage of the differential temporal profiles, a combined approach including both MX1 and IFI27 (as averaged expression, or where a positive test for either gene triggers further confirmatory testing) may be the optimal approach to diagnostic triage.
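The two combination strategies mentioned above can be contrasted in a minimal sketch. It assumes that each biomarker is expressed as a Z score and reuses the Z > 2 threshold from the Results as the positivity cut-off for both rules; these choices, and the example values, are illustrative assumptions rather than a validated decision rule.

```python
def combined_average(mx1_z: float, ifi27_z: float) -> float:
    """Combined score as the average of the two biomarker Z scores."""
    return (mx1_z + ifi27_z) / 2


def either_positive(mx1_z: float, ifi27_z: float, threshold: float = 2.0) -> bool:
    """Positive if either biomarker alone exceeds the threshold."""
    return mx1_z > threshold or ifi27_z > threshold


# Hypothetical (MX1, IFI27) Z scores, for illustration only.
examples = {
    "early": (4.0, 0.5),       # strong MX1, IFI27 not yet raised
    "late": (0.8, 5.0),        # MX1 resolved, IFI27 sustained
    "borderline": (2.5, 0.5),  # weak MX1 signal only
}

results = {
    label: (combined_average(m, i) > 2.0, either_positive(m, i))
    for label, (m, i) in examples.items()
}
print(results)
```

The borderline case shows the trade-off: the either-positive rule flags it for confirmatory testing while the averaged score does not, so the former favours sensitivity and the latter is less affected by a marginal elevation of one gene.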

Our conclusions are currently limited to data derived from individuals with non-severe infection. Therefore, future validation in hospitalised cohorts for whom the time of exposure can be estimated will be required to assess whether severe disease alters the temporal profiles of these biomarkers. In addition, whether anti-type 1 IFN antibodies which have emerged as a risk factor for severe COVID-19⁴⁵ may reduce the expression of IFN-inducible genes and thereby reduce the sensitivity of MX1 and IFI27 as diagnostic biomarkers will need to be addressed in future studies. Finally, we do not address the specificity of our findings for respiratory virus infections. Therefore, we have limited our discussion of potential translational applications to diagnostic triage tests to trigger confirmatory virological investigations, stratification of patients with confirmed viral infections for antiviral treatment, and pre-symptomatic screening of contacts of index cases of confirmed viral infections. Notably, for each of these applications, the generalisability of blood transcriptional biomarkers across respiratory viruses may be considered a strength. Translation of these viral biomarkers to near-patient platforms is now required to enable further evaluation of clinical utility and impact in prospective observational and interventional studies.

Methods

Research ethics

Regulatory approvals for the human studies presented herein were provided by the UK Health Research Authority under the following reference numbers: 20/UK/2001 and 20/UK/0002 for the SARS-CoV-2 challenge study; 20/NW/0231 for the INSTINCT study; 19/LO/1441 for the H3N2 influenza challenge study. Written informed consent was provided by all participants directly or by legally authorized representatives for participants under the age of 18.

Identification of blood transcriptional signatures of viral infection

We updated our previous systematic review of blood transcriptional biomarkers for viral infection² (Supplementary Data 1). In the current analysis, we amended our previous eligibility criteria to identify concise blood transcriptional signatures discovered or applied with a primary objective of diagnosis of viral infection from human whole-blood or peripheral blood mononuclear cell samples, excluding those exclusively intended to stratify severity of infection. Other eligibility criteria remained the same as our previous review. In our update, we searched MEDLINE for articles published up to 31 December 2022, using comprehensive MeSH and keyword terms for “viral infection”, “transcriptome”, “biomarker”, and “blood”, as previously². Additional studies were identified in reference lists. Title and abstract screening was independently performed by two reviewers (CT and JGB); shortlisted articles were reviewed in full, with input from a third reviewer (RKG) to resolve conflicts. For eligible signatures, constituent genes, modelling approaches and gene weightings were extracted, with verification by a second reviewer. Multi-gene signatures are referred to using a prefix of the first-author’s name from the corresponding publication, and a suffix of the number of component genes. Single-gene signatures are referred to by the gene symbol.

Human challenge and patient cohorts

The SARS-CoV-2 human challenge model has been described previously⁴. Briefly, 36 SARS-CoV-2 unvaccinated seronegative healthy volunteers (age range 18–29 years, 22% female sex, 90% White or Caucasian ancestry) were inoculated intranasally with a standardized dose of D614G-containing pre-alpha wild-type SARS-CoV-2 under quarantine conditions (Fig. 1A). From 24 hours after inoculation, virus was quantified by PCR and culture in samples obtained at 12 hourly intervals from nose (mid-turbinate) and throat swabs for at least 14 days of quarantine, or longer if participants remained in quarantine beyond 14 days because they still had detectable virus. A final sample was obtained at 28 days after challenge. Blood samples for RNA sequencing were collected into PAXgene tubes (Qiagen) before virus challenge, 6 hours after challenge, daily thereafter for 14 days and on day 28. Mid-turbinate nose swabs (MW013, MedWire) for RNA sequencing were collected before virus challenge, and on days 1, 3, 5, 7, 10 and 14 after challenge, preserved in RNAprotect (Qiagen). Symptom diaries were collected and viral load was quantified by quantitative RT-PCR using N-gene primers (Forward: GACCCCAAAATCAGCGAAAT, Reverse: TCTGGTTACTGCCAGTTGAATCTG, Probe: ACCCCGCATTACGTTTGGTGGACC), and by culture using focus forming assay (FFA) in Vero cells, as previously described⁴. Three individuals were excluded from the present analysis: one who opted out of genetic testing (including RNA sequencing), and two who seroconverted in the interval between screening and inoculation, on the basis that a recent infection may affect the biomarker expression that is the focus of this study.

The SARS-CoV-2 household contact study (INSTINCT) has been described previously³⁹. Briefly, 52 household contacts (age range 7–79 years, 48% female sex, 90% White ancestry) of SARS-CoV-2 infected index cases, recruited within 5 days of index case symptom onset, provided nasopharyngeal swabs and blood RNA samples collected in PAXgene tubes on the day of enrolment (day 0), day 7, day 14 and day 28. Nasopharyngeal swabs were used to measure viral copy number using PCR against the E-gene (Forward: ACAGGTACGTTAATAGTTAATAGCGT, Reverse: ATATTGCAGCAGTACGCACACA, Probe: ACACTAGCCATCCTTACTGCGCTTCG-BBQ) as previously described⁴⁶.

The Influenza H3N2 human challenge model has been described previously³⁸. Briefly, 20 healthy volunteers (age range 22–55 years, 50% female sex, 68% White, 32% Black or Asian ancestry) were inoculated intranasally with a standardized dose of Influenza A/Belgium/4217/2015 (H3N2) under quarantine conditions. From 24 hours after inoculation, virus was quantified by PCR in nasal lavage samples obtained at 12 hourly intervals. Participants were ascertained to have replicative viral infection if found to have consecutive positive PCR tests at least 24 hours after challenge. Blood samples for RNA sequencing (available from 19 participants) were collected into PAXgene tubes before virus challenge and days 1, 2, 3, 7, 10, 14, and 28 after challenge. Nasal curettage samples (available from 17 participants) were collected on days -14 (baseline), 1, 2, 3, 7, 10 and 14 and preserved in TRIzol (ThermoFisher Scientific) as previously described⁴⁷.
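
The definition of replicative infection used above can be expressed as a simple rule. The following is a minimal Python sketch under one reading of that definition (two adjacent tests, both taken at or after 24 hours post-challenge, are both positive); the function name and the input layout are assumptions made for illustration, not part of the study protocol.

```python
from typing import Iterable, Tuple


def has_replicative_infection(pcr_results: Iterable[Tuple[float, bool]],
                              min_hours: float = 24.0) -> bool:
    """pcr_results holds (hours_after_challenge, is_positive) pairs.

    Returns True if any two consecutive tests taken at or after
    `min_hours` post-challenge are both positive.
    """
    # Keep only tests in the eligible window, ordered by sampling time.
    eligible = sorted((t, pos) for t, pos in pcr_results if t >= min_hours)
    # Scan adjacent pairs of tests for two positives in a row.
    return any(a[1] and b[1] for a, b in zip(eligible, eligible[1:]))


# Hypothetical 12-hourly results for one participant.
print(has_replicative_infection([(24, False), (36, True), (48, True)]))
```

A positive result before 24 hours is ignored in this sketch, on the assumption that it could reflect residual inoculum rather than replication.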

All RNA samples were stored at −80 °C until processing.

Transcriptional profiling

Total RNA was extracted from SARS-CoV-2 challenge PAXgene tubes using the PAXgene Blood RNA kit (Qiagen), including on-column DNase treatment and depletion of globin mRNA using the GLOBINclear Human Kit (Thermo Fisher Scientific). Total RNA was extracted from the INSTINCT SARS-CoV-2 household contact study and the H3N2 influenza challenge PAXgene tubes using the Qiasymphony PAXgene blood RNA kit (Qiagen), with subsequent DNase I treatment (Zymo) and clean-up using the RNA Clean and Concentrator-96 kit (Zymo), followed by globin mRNA and rRNA depletion using NEBNext® Globin & rRNA Depletion kits (New England BioLabs). Total RNA from SARS-CoV-2 challenge nasopharyngeal swabs and curettage samples was extracted using the RNeasy mini kit (Qiagen), including on-column DNase treatment. RNA concentrations were quantified using Qubit 2.0 Fluorometer (ThermoFisher Scientific). RNA integrity scores were determined using the Bioanalyser (RNA Nano 6000 Chip, Agilent) or 4200 TapeStation (Agilent).

Blood RNA samples from SARS-CoV-2 challenge underwent total RNA sequencing. DNA libraries were constructed using the KAPA RNA HyperPrep Kit with RiboErase (Roche) and sequenced on the Illumina NovaSeq 6000 platform using the NovaSeq 6000 S4 Reagent Kit (200 cycles) (Illumina), giving a median of 69.1 million (range 29.3–152.8) 100 base pair (bp) paired-end reads per sample. Nose swab RNA samples underwent mRNA sequencing. DNA libraries were constructed using the KAPA mRNA HyperPrep Kit (Roche) and sequenced on the Illumina NextSeq platform using the NextSeq 500/550 High Output Kit (75 cycles) (Illumina), giving a median of 32.3 million (range 3.2–176.2) 41 bp paired-end reads per sample. Blood RNA samples from the INSTINCT SARS-CoV-2 household contact study underwent mRNA sequencing. DNA libraries were constructed using the NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina (New England Biolabs) and sequenced on the Illumina HiSeq 4000 using the HiSeq 3000/4000 PE Cluster and SBS kits (Illumina), giving a median of 26.1 million (range 18.34–56.04) 75 bp paired-end reads per sample. Nose curettage RNA samples from the H3N2 human challenge underwent mRNA sequencing. DNA libraries were constructed using the NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina (New England Biolabs) and sequenced on the Illumina HiSeq 4000 using the HiSeq 3000/4000 SBS kit (Illumina), giving a median of 74.9 million (range 44.2–122) 75 bp paired-end reads per sample. Whole blood RNA samples from the H3N2 influenza challenge underwent mRNA sequencing. DNA libraries were constructed with the NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina (New England Biolabs) and sequenced on the Illumina NovaSeq 6000 platform using the NovaSeq 6000 S2 200 cycles Flowcell (Illumina), with a target of 40 million paired-end reads per sample.

SARS-CoV-2 challenge sequencing reads were mapped to the reference transcriptome (Ensembl Human GRCh38 release 108) using Kallisto (version 0.46.1)⁴⁸. 360 blood RNA samples from 33 seronegative individuals gave a median of 28.7 million (range 11.6–73.5) mapped reads per sample. 92 nose swab RNA samples gave a median of 23.7 million (range 2.5–138.6) mapped reads per sample. Transcript-level output DESeq2 normalised counts and transcripts per million values were summed at gene level and annotated with Ensembl gene ID, gene name, and gene biotype using the tximport (version 1.20.0) and biomaRt (version 2.48.0) Bioconductor packages in R⁴⁹⁻⁵³.
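
The gene-level summation step can be illustrated with a short sketch. The study used the tximport R package for this; the Python below is only a simplified illustration of the idea (adding up transcript-level values that map to the same gene), and the identifiers `tx_1`, `tx_2`, `tx_3`, `GENE_A` and `GENE_B` are hypothetical placeholders.

```python
from collections import defaultdict


def sum_to_gene_level(transcript_values, tx2gene):
    """Sum transcript-level abundances (e.g. TPM) per gene.

    transcript_values: dict of transcript ID -> abundance
    tx2gene: dict of transcript ID -> gene ID
    """
    gene_values = defaultdict(float)
    for tx_id, value in transcript_values.items():
        gene_id = tx2gene.get(tx_id)
        if gene_id is None:
            # Assumption of this sketch: unannotated transcripts are skipped.
            continue
        gene_values[gene_id] += value
    return dict(gene_values)


# Hypothetical transcript-level TPM values for one sample.
tpm = {"tx_1": 10.0, "tx_2": 2.5, "tx_3": 4.0}
mapping = {"tx_1": "GENE_A", "tx_2": "GENE_A", "tx_3": "GENE_B"}
print(sum_to_gene_level(tpm, mapping))
```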

Sequencing reads from the INSTINCT SARS-CoV-2 household contact study were mapped to the reference transcriptome (NCBI Human GRCh38) using STAR aligner (version 2.7.1a)⁵⁴. 134 blood RNA samples from 52 individuals gave a median of 14.26 million (range 7.30–37.67) mapped reads per sample. Read count matrices were generated using featureCounts from the Rsubread package⁵⁵ and normalised using the variance-stabilising transformation from the DESeq2 package.

For the sequencing reads of whole blood RNA from the H3N2 influenza challenge, quality control was performed using FastQC (v 0.11.7; https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and adapter sequences were removed using Trimmomatic (v 0.36)⁵⁶. The reads were mapped against the GRCh38 reference genome using STAR aligner (v 2.7.1a). The featureCounts tool from the Subread package (v 1.5.2) was used for transcript quantification. Computed gene counts were used for downstream analyses. Whole blood samples from 19 donors gave a median of 10.1 million (range 8.5–12.7) mapped reads per sample (read length = 100 bp). Transcript-level output DESeq2 normalised counts were annotated with Ensembl gene ID and gene name using the biomaRt (version 2.46.3) Bioconductor package in R.

Sequencing reads from the H3N2 challenge were mapped to the reference transcriptome (Ensembl Human GRCh38.p13) using STAR (version 2.7.10a). Nasal RNA samples from 17 donors gave a median of 21.4 million (range 3.32–34.7) mapped reads per sample. Transcript-level output DESeq2 normalised counts were annotated with Ensembl gene ID and gene name using the biomaRt (version 2.52.0) Bioconductor package in R.

For RNAseq datasets that were generated in more than one batch, processing batch effects were excluded by principal component analysis showing that data sets did not cluster separately by sample processing batch (Supplementary Fig. 2). Additional genome-wide transcriptomic microarray data were derived from previously published experimental challenge datasets of other respiratory viruses (GEO accession: GSE73072)³⁷ and from a natural infection study of respiratory viruses (GEO accession: GSE68310)⁴⁰. In each case, we used log-2 transformed and normalised data matrices to quantify biomarker scores, standardised to baseline samples.

Signature scores

Analyses were performed in R (version 4.2.2). Biomarker levels were represented by expression values for single genes. Multi-gene transcriptional signatures were calculated as per the original authors’ descriptions using transcripts per million values, as previously. In one example (Steinbrink19), 3 of 19 genes in the original signature were not available in our RNA sequencing data set, but we included this signature on the basis that the signature score excluding the missing genes still achieved good discrimination of study participants with and without replicative infection across the time course (Supplementary Figs. 3–4). Scores were standardised to Z scores by subtracting the mean and dividing by the standard deviation of pre-inoculation samples, and were multiplied by −1 for scores intended to decrease in the presence of viral infection. Discrimination of each signature for the outcome of replicative infection was calculated as the area under the receiver operating characteristic curve (AUROC), with 95% confidence intervals, and stratified by day since inoculation, using the pROC R package⁵⁷. Correlation between signatures and with viral loads was quantified as Spearman rank correlation coefficients using the ggpubr R package. Graphs were plotted using the ggplot2 R package.
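
The standardisation and discrimination steps can be sketched as follows. This is an illustrative Python sketch, not the authors' R code (which is linked under Code availability) and not the pROC implementation: the AUROC is computed here by direct pairwise comparison, confidence intervals are omitted, and the use of the sample standard deviation is an assumption. All function names and example values are hypothetical.

```python
from statistics import mean, stdev


def z_scores(scores, baseline_scores, decreases_with_infection=False):
    """Standardise scores against pre-inoculation (baseline) samples."""
    mu = mean(baseline_scores)
    sd = stdev(baseline_scores)  # sample SD (n - 1); an assumption here
    # Flip the sign for signatures expected to fall during infection.
    sign = -1.0 if decreases_with_infection else 1.0
    return [sign * (s - mu) / sd for s in scores]


def auroc(case_scores, control_scores):
    """Probability that a random case scores higher than a random
    control, counting ties as one half."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def auroc_by_day(records):
    """records: iterable of (day, z_score, infected) tuples.
    Returns {day: AUROC} for days with both infected and uninfected samples."""
    by_day = {}
    for day, z, infected in records:
        cases, controls = by_day.setdefault(day, ([], []))
        (cases if infected else controls).append(z)
    return {day: auroc(cases, controls)
            for day, (cases, controls) in sorted(by_day.items())
            if cases and controls}


# Hypothetical values, for illustration only.
baseline = [1.0, 2.0, 3.0]
print(z_scores([3.0, 5.0], baseline))
print(auroc_by_day([(1, 0.5, True), (1, -0.2, False),
                    (2, 0.1, True), (2, 0.3, False)]))
```

Standardising to each study's own baseline samples puts signatures measured on different scales onto a common one, so that scores can be compared across signatures and time points.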

Analysis of ATACseq data

Publicly available ATAC (Assay for Transposase Accessible Chromatin) sequencing fastq datasets derived from unstimulated human monocytes, B-cells and CD4 T-effector cells (GEO accession: GSE118189, European Nucleotide Archive accession: PRJNA484801)³⁵ were analysed with the nf-core ATAC-seq analysis pipeline (v2.0) curated in Nextflow⁵⁸,⁵⁹, using default parameters. Adaptors were trimmed using trimgalore (v0.6.7) and reads were aligned to the reference genome (NCBI GRCh38) using BWA (v0.7.17)⁶⁰. Duplicate reads were identified using picard (v2.27.4)⁶¹. Reads were filtered using SAMtools (v1.16.1)⁶². BEDtools (v2.30.0)⁶³ was used to remove duplicates, reads mapping to blacklisted regions and mitochondrial DNA, multimappers, unmapped reads or those not marked as primary alignments. Replicate datasets were merged using picard for some downstream analyses. Normalised scaled bigWig files were created using BEDtools and tracks were visualised using Integrative Genomics Viewer (v2.16.0)⁶⁴. Peak calling was performed using MACS2 (v2.2.7.1)⁶⁵ in broadpeak mode. Peaks were annotated to gene features using HOMER (v4.11)⁶⁶ and a consensus peak-set was generated using BEDtools. Matrices of reads falling within consensus peaks were generated using featureCounts from the subread package (v2.0.1)⁵⁵ for quantitation.
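
The read-filtering criteria listed above can be summarised as a single predicate. The Python below is a conceptual sketch only and does not reproduce the nf-core pipeline, SAMtools or BEDtools commands. The flag values are the standard SAM FLAG bits; treating a mapping quality of 0 as a marker of multimapping, the contig names `chrM`/`MT`, and the half-open interval convention are assumptions of this sketch.

```python
# Standard SAM FLAG bits.
UNMAPPED = 0x4
SECONDARY = 0x100   # not a primary alignment
DUPLICATE = 0x400

MITO_CONTIGS = {"chrM", "MT"}  # assumed contig names


def keep_read(flag, chrom, start, end, mapq, blacklist):
    """Return True if an aligned read passes all filters.

    blacklist: dict of chromosome -> list of (start, end) half-open intervals.
    """
    if flag & (UNMAPPED | SECONDARY | DUPLICATE):
        return False
    if chrom in MITO_CONTIGS:
        return False
    if mapq == 0:
        # Assumption: MAPQ 0 is used here as a proxy for multimapping reads.
        return False
    for b_start, b_end in blacklist.get(chrom, ()):
        if start < b_end and end > b_start:  # intervals overlap
            return False
    return True


# Hypothetical reads and blacklist, for illustration only.
blacklist = {"chr1": [(1000, 2000)]}
print(keep_read(0, "chr1", 100, 150, 60, blacklist))
print(keep_read(DUPLICATE, "chr1", 100, 150, 60, blacklist))
```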

Publicly available single-cell ATACseq data from the COMBAT consortium³⁶ (EGAD00001007963; Zenodo: https://doi.org/10.5281/zenodo.6120249) were reanalysed for read counts per cell type in established COVID infection from hospitalised COVID patients. Data were processed as described in the original publication using the ArchR software package (v0.9.3)⁶⁷. The sequencing reads at the IFI27 locus were plotted per cell type with the plotBrowserTrack function.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All source data for the analyses presented in this study are provided in the source data file. Processed RNAseq data is available at ArrayExpress for the SARS-CoV-2 challenge study (accession number: E-MTAB-12993, https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-12993). To comply with data privacy restrictions, raw sequencing data is available under managed access through the European Genome-Phenome Archive (https://ega-archive.org), under the following accession numbers: EGAD50000000942 for SARS-CoV-2 challenge study (https://ega-archive.org/datasets/EGAD50000000942), EGAD50000000956 for Influenza H3N2 challenge study (https://ega-archive.org/datasets/EGAD50000000956), and EGAD50000000684 for the INSTINCT SARS-CoV-2 household contact study (https://ega-archive.org/datasets/EGAD50000000684). Data will be shared with investigators whose proposed use is within the scope of participant consent subject to a data access agreement. RNAseq data are also available from Gene Expression Omnibus (GEO) under the following accession numbers: GSE73072 for previous human challenge studies, and GSE68310 for previous community acquired respiratory virus infections. ATACseq data are available from GEO under accession number GSE118189 for unstimulated PBMC, and the European Genome-Phenome Archive under accession number EGAD00001007963 (https://ega-archive.org/datasets/EGAD00001007931) for the COMBAT consortium data. Source data are provided with this paper.

Code availability

Custom code for deriving RNA signature Z scores from a gene expression matrix is available on GitHub: https://github.com/JRosenheim/SARS-CoV-2_challenge/blob/main/Viral_biomarker_scores.R (https://doi.org/10.5281/zenodo.10021757).

References

Gupta, R. K. & Noursadeghi, M. Toward a more generalizable blood RNA signature for bacterial and viral infections. Cell Rep. Med 3, 100866 (2022).
Gupta, R. K. et al. Blood transcriptional biomarkers of acute viral infection for detection of pre-symptomatic SARS-CoV-2 infection: a nested, case-control diagnostic accuracy study. Lancet Microbe 2, e508–e517 (2021).
Swadling, L. et al. Pre-existing polymerase-specific T cells expand in abortive seronegative SARS-CoV-2. Nature 601, 110–117 (2022).
Killingley, B. et al. Safety, tolerability and viral kinetics during SARS-CoV-2 human challenge in young adults. Nat. Med 28, 1031–1041 (2022).
Andres-Terre, M. et al. Integrated, multi-cohort analysis identifies conserved transcriptional signatures across multiple respiratory viruses. Immunity 43, 1199–1211 (2015).
Cappuccio, A. et al. Multi-objective optimization identifies a specific and interpretable COVID-19 host response signature. Cell Syst. 13, 989–1001.e8 (2022).
Gómez-Carballa, A. et al. Identification of a minimal 3-transcript signature to differentiate viral from bacterial infection from best genome-wide host RNA biomarkers: a multi-cohort analysis. Int J. Mol. Sci. 22, 3148 (2021).
Henrickson, S. E. et al. Genomic circuitry underlying immunological response to pediatric acute respiratory infection. Cell Rep. 22, 411–426 (2018).
Herberg, J. A. et al. Diagnostic test accuracy of a 2-transcript host RNA signature for discriminating bacterial vs viral infection in febrile children. JAMA 316, 835–845 (2016).
Tang, B. M. et al. A novel immune biomarker IFI27 discriminates between influenza and bacteria in patients with suspected respiratory infection. Eur. Respiratory J. 49, 1602098 (2017).
Gómez-Carballa, A. et al. A qPCR expression assay of IFI44L gene differentiates viral from bacterial infections in febrile children. Sci. Rep. 9, 11780 (2019).
McClain, M. T. et al. A blood-based host gene expression assay for early detection of respiratory viral infection: an index-cluster prospective cohort study. Lancet Infect. Dis. 21, 396–404 (2021).
Li, H. K. et al. Discovery and validation of a three-gene signature to distinguish COVID-19 and other viral infections in emergency infectious disease presentations: a case-control and observational cohort study. Lancet Microbe 2, e594–e603 (2021).
Lopez, R., Wang, R. & Seelig, G. A molecular multi-gene classifier for disease diagnostics. Nat. Chem. 10, 746–754 (2018).
Lydon, E. C. et al. Validation of a host response test to distinguish bacterial and viral respiratory infection. EBioMedicine 48, 453–461 (2019).
Pennisi, I. et al. Translation of a host blood RNA signature distinguishing bacterial from viral infection into a platform suitable for development as a point-of-care test. JAMA Pediatrics 175, 417–419 (2021).
Roers, A., Hochkeppel, H. K., Horisberger, M. A., Hovanessian, A. & Haller, O. MxA gene expression after live virus vaccination: a sensitive marker for endogenous type I interferon. J. Infect. Dis. 169, 807–813 (1994).
Rao, A. M. et al. A robust host-response-based signature distinguishes bacterial and viral infections across diverse global populations. Cell Rep. Med. 3, 100842 (2022).
Ravichandran, S. et al. VB10, a new blood biomarker for differential diagnosis and recovery monitoring of acute viral and bacterial infections. eBioMedicine 67, 103352 (2021).
Samy, A., Maher, M. A., Abdelsalam, N. A. & Badr, E. SARS-CoV-2 potential drugs, drug targets, and biomarkers: a viral-host interaction network-based analysis. Sci. Rep. 12, 11934 (2022).
Sampson, D. L. et al. A four-biomarker blood signature discriminates systemic inflammation due to viral infection versus other etiologies. Sci. Rep. 7, 2914 (2017).
Sampson, D. et al. Blood transcriptomic discrimination of bacterial and viral infections in the emergency department: a multi-cohort observational validation study. BMC Med 18, 185 (2020).
Steinbrink, J. M. et al. The host transcriptional response to Candidemia is dominated by neutrophil activation and heme biosynthesis and supports novel diagnostic approaches. Genome Med 13, 108 (2021).
Sweeney, T. E., Wong, H. R. & Khatri, P. Robust classification of bacterial and viral infections via integrated host gene expression diagnostics. Sci. Transl. Med 8, 346ra91 (2016).
Trouillet-Assant, S. et al. Type I Interferon in Children with Viral or Bacterial Infections. Clin. Chem. 66, 802–808 (2020).
Tsalik, E. L. et al. Host gene expression classifiers diagnose acute respiratory illness etiology. Sci. Transl. Med. 8, 322ra11 (2016).
Xu, N. et al. A two-transcript biomarker of host classifier genes for discrimination of bacterial from viral infection in acute febrile illness: a multicentre discovery and validation study. Lancet Digital Health 3, e507–e516 (2021).
Yu, J. et al. Host gene expression in nose and blood for the diagnosis of viral respiratory infection. J. Infect. Dis. 219, 1151–1161 (2019).
Zaas, A. K. et al. A host-based RT-PCR gene expression signature to identify acute respiratory viral infection. Sci. Transl. Med. 5, 203ra126 (2013).
Zhou, J. et al. Viral emissions into the air and environment after SARS-CoV-2 human challenge: a phase 1, open label, first-in-human study. SSRN Scholarly Paper at https://doi.org/10.2139/ssrn.4301808 (2022).
Haller, O. & Kochs, G. Mx genes: host determinants controlling influenza virus infection and trans-species transmission. Hum. Genet 139, 695–705 (2020).
Parker, N. & Porter, A. C. Identification of a novel gene family that includes the interferon-inducible human genes 6–16 and ISG12. BMC Genomics 5, 8 (2004).
Chandran, A. et al. Rapid synchronous type 1 IFN and virus-specific T cell responses characterize first wave non-severe SARS-CoV-2 infections. Cell Rep. Med 3, 100557 (2022).
Lindeboom, R. G. H. et al. Human SARS-CoV-2 challenge uncovers local and systemic response dynamics. Nature 631, 189–198 (2024).
Calderon, D. et al. Landscape of stimulation-responsive chromatin across diverse human immune cells. Nat. Genet 51, 1494–1505 (2019).
Ahern, D. J. et al. A blood atlas of COVID-19 defines hallmarks of disease severity and specificity. Cell 185, 916–938.e58 (2022).
Liu, T.-Y. et al. An individualized predictor of health and disease using paired reference and target samples. BMC Bioinforma. 17, 47 (2016).
Temple, D. S. et al. Wearable sensor-based detection of influenza in presymptomatic and asymptomatic individuals. J. Infect. Dis. 227, 864–872 (2023).
Derqui, N. et al. Risk factors and vectors for SARS-CoV-2 household transmission: a prospective, longitudinal cohort study. Lancet Microbe 4, e397–e408 (2023).
Zhai, Y. et al. Host transcriptional response to influenza and other acute respiratory viral infections–a prospective cohort study. PLoS Pathog. 11, e1004869 (2015).
Shojaei, M. et al. IFI27 transcription is an early predictor for COVID-19 outcomes, a multi-cohort observational study. Front. Immunol. 13, 1060438 (2023).
Cheemarla, N. R. et al. Nasal host response-based screening for undiagnosed respiratory viruses: a pathogen surveillance and detection study. Lancet Microbe 4, e38–e46 (2023).
Leaman, D. W. et al. Novel growth and death related interferon-stimulated genes (ISGs) in Melanoma: Greater Potency of IFN-β Compared with IFN-α2. J. Interferon Cytokine Res. 23, 745–756 (2003).
Rosebeck, S. & Leaman, D. W. Mitochondrial localization and pro-apoptotic effects of the interferon-inducible protein ISG12a. Apoptosis 13, 562–572 (2008).
Bastard, P. et al. Autoantibodies against type I IFNs in patients with life-threatening COVID-19. Science 370, eabd4585 (2020).
Rowan, A. G. et al. Optimized protocol for a quantitative SARS-CoV-2 duplex RT-qPCR assay with internal human sample sufficiency control. J. Virological Methods 294, 114174 (2021).
Dhariwal, J. et al. Mucosal Type 2 Innate Lymphoid Cells Are a Key Component of the Allergic Response to Aeroallergens. Am. J. Respir. Crit. Care Med 195, 1586–1596 (2017).
Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).
Soneson, C., Love, M. I. & Robinson, M. D. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res 4, 1521 (2015).
Durinck, S. et al. BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics 21, 3439–3440 (2005).
Anders, S. & Huber, W. Differential expression analysis for sequence count data. Genome Biol. 11, R106 (2010).
Durinck, S., Spellman, P. T., Birney, E. & Huber, W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat. Protoc. 4, 1184–1191 (2009).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Liao, Y., Smyth, G. K. & Shi, W. The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Res. 47, e47 (2019).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Robin, X. et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinforma. 12, 77 (2011).
Di Tommaso, P. et al. Nextflow enables reproducible computational workflows. Nat. Biotechnol. 35, 316–319 (2017).
Patel, H. et al. nf-core/atacseq: nf-core/atacseq v2.0 - Iron Iguana. Zenodo https://doi.org/10.5281/zenodo.7384115 (2022).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Picard Tools - By Broad Institute. https://broadinstitute.github.io/picard/.
Danecek, P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, giab008 (2021).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
Granja, J. M. et al. ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis. Nat. Genet 53, 403–411 (2021).

Acknowledgements

This research was supported by the Wellcome Trust (224530/Z/21/Z). AL acknowledges funding by the NIHR Health Protection Research Units (HPRU) in Respiratory Infections (NIHR200927). BMC acknowledges funding by the Rosetrees Foundation. CMB, MK and ML acknowledge funding by NIHR Biomedical Research to Imperial College London. CMW acknowledges funding from the Medical Research Council (MR/T016329/1). CT acknowledges funding from the Wellcome Trust (102186/B/13/Z). JCK acknowledges funding from NIHR Oxford Biomedical Research Centre. LCKB acknowledges funding from the NIHR (Academic Clinical Fellowship Programme). LMD acknowledges funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 955321. MZN acknowledges funding from an MRC Clinician Scientist Fellowship (MR/W00111X/1), Action Medical Research (GN2911) and funding from the Rutherford Fund Fellowship allocated by the MRC UK Regenerative Medicine Platform 2 (MR/5005579/1). MN acknowledges funding from the Wellcome Trust (207511/Z/17/Z) and by NIHR Biomedical Research Funding to UCL and UCLH. RH is an NIHR Senior Investigator. RKG acknowledges funding from the National Institute for Health Research (NIHR302829).

Author information

Authors and Affiliations

Division of Infection and Immunity, University College London, London, UK
Joshua Rosenheim, Clare Thakker, Tiffeney Mann, Lucy C. K. Bell, Briac Lemetais, Caroline M. Weight, Benjamin M. Chain, Robert S. Heyderman & Mahdad Noursadeghi
Institute of Health Informatics, University College London, London, UK
Rishi K. Gupta
UCL Respiratory, Division of Medicine, University College London, London, UK
Rishi K. Gupta, James Greenan-Barrett & Marko Z. Nikolić
Department of Infectious Disease, Imperial College London, London, UK
Claire M. Broderick, Loukas Papargyris, Pete Dayananda, Helen R. Wagstaffe, Myrsini Kaforou, Michael Levin, Wendy Barclay & Christopher Chiu
NIHR Health Protection Research Unit in Respiratory Infections, National Heart and Lung Institute, Imperial College London, London, UK
Kieran Madon, Emily Conibear, Joe Fenn, Seran Hakki & Ajit Lalvani
Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
Andrew J. Kwok & Julian C. Knight
Department of Medicine and Therapeutics, Faculty of Medicine, The Chinese University of Hong Kong, Shatin, Hong Kong
Andrew J. Kwok
Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
Rik G. H. Lindeboom
Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
Lisa M. Dratva & Sarah A. Teichmann
Department of Medicine, University of Cambridge, Cambridge, UK
Lisa M. Dratva & Sarah A. Teichmann
Infection, Immunity and Inflammation Department, Great Ormond Street Institute of Child Health, University College London, London, UK
Cristina Venturini
hVIVO Services Ltd, London, UK
Mariya Kalinova, Alex J. Mann & Andrew Catchpole
Department of Respiratory Medicine, University College London Hospitals NHS Foundation Trust, London, UK
Marko Z. Nikolić
Department of Infectious Diseases, University College London Hospital NHS Foundation Trust, London, UK
Ben Killingley

Contributions

J.R., R.K.G., and M.N. conceived this study. R.H., C.C., and M.N. obtained funding for this study. M.K., A.J.M., A.C., B.K., W.B., and C.C. conceived, obtained funding for, and supervised the conduct of the SARS-CoV-2 challenge study. C.T., L.K.B., R.K.G., J.G.B., and M.N. undertook the systematic review of blood transcriptional biomarkers of viral infection. J.R., R.K.G., L.K.B., and M.N. undertook an analysis of publicly available bulk RNAseq and ATACseq data. J.R., C.T., T.M., L.K.B., J.G.B., H.W., B.L., C.W., C.V., B.M.C., and R.H. undertook sample processing and data analysis for the SARS-CoV-2 challenge study. R.L., L.D., M.Z.N., and S.T. contributed single-cell RNAseq data from the SARS-CoV-2 challenge study. C.M.B., L.P., P.D., M.K., M.L., and C.C. contributed bulk data from the H3N2 influenza challenge study. K.M., E.C., J.F., S.H., and A.L. contributed data from the INSTINCT SARS-CoV-2 household contact study. A.J.K. and J.C.K. contributed single-cell ATACseq data from the COMBAT study. J.R., R.K.G., and M.N. wrote the manuscript with input from all the authors.

Corresponding author

Correspondence to Mahdad Noursadeghi.

Ethics declarations

Competing interests

The authors declare the following competing interests: In the past 3 years, S.A.T. has received remuneration for scientific advisory board membership from Sanofi, GlaxoSmithKline, Foresite Labs and Qiagen. S.A.T. is a co-founder of, and holds equity in, Transition Bio and Ensocell. Since 8 January 2024, S.A.T. has been a part-time employee of GlaxoSmithKline. A.J.M., A.C., M.K., M.M. and A.B. are full-time employees of hVIVO Services Ltd. No other authors report any competing interests.

Peer review

Peer review information

Nature Communications thanks Maryam Shojaei, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Description of Additional Supplementary Files

Supplementary Data 1

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Rosenheim, J., Gupta, R.K., Thakker, C. et al. SARS-CoV-2 human challenge reveals biomarkers that discriminate early and late phases of respiratory viral infections. Nat Commun 15, 10434 (2024). https://doi.org/10.1038/s41467-024-54764-3

Received: 21 July 2023
Accepted: 19 November 2024
Published: 30 November 2024
DOI: https://doi.org/10.1038/s41467-024-54764-3

This article is cited by

Longitudinal kinetics of the viral infection biomarker 3′-deoxy-3′,4′-didehydro-cytidine in SARS-CoV-2, influenza A virus and RSV human challenge models
- Ravi Mehta
- Elena Chekmeneva
- Shiranee Sriskandan
npj Viruses (2025)