Fig. 4: Somatic mutations in EIF4A1 and DDX3X, both RNA helicases of the DEAD (Asp-Glu-Ala-Asp) box protein family, are recurrent genetic lesions associated with virus-positive status.

A Combined log10(odds ratio) of mutation in genes associated with virus-positive (top) and virus-negative (bottom) status (q < 0.005) from pooled data of 1971 tumors across 9 virus-associated cancers. Data are presented as log10(odds ratio) values with error bars indicating 95% confidence intervals. The heatmap on the right displays the cancer cohorts included in the pooled data used for each gene, with colors representing mutation rate trends in each cohort (red: higher in virus-positive; blue: higher in virus-negative) and shades indicating the two-sided Fisher’s exact test p-value. HNSCC, head and neck squamous cell carcinoma; CC, cervical cancer; BL, Burkitt lymphoma; GC, gastric cancer; PBL, plasmablastic lymphoma; cHL, classical Hodgkin lymphoma; PCNSL, primary central nervous system lymphoma; MCC, Merkel cell carcinoma; HCC, hepatocellular carcinoma. B Mutations in DDX3X and EIF4A1 in 2488 tumors. * p < 0.05, two-sided binomial test. C Fraction of patients that are male by DDX3X mutation status. D DDX3X expression by DDX3X mutation status and sex in Burkitt lymphoma (n = 117). * p < 0.05, two-sided Mann-Whitney U test. Data are presented as median values with interquartile range (25th–75th percentile). E Frequencies of mutation of DDX3X and EIF4A1 in virus-positive tumors overall and summary of key biological functions. ANKL, aggressive NK-cell leukemia; NKTCL, natural killer/T-cell lymphoma; CAEBV, chronic active Epstein-Barr virus disease; ATL, adult T-cell leukemia/lymphoma. Source data are provided as a Source Data file.