Extended Data Fig. 2: Association of HPV clades with HIV status, gene expression, DNA methylation and survival.

a. HPV types in our cohort, separated by HIV status (n = 72 HIV-positive samples, n = 45 HIV-negative samples) and clade. The x axis indicates the percentage of samples in each HIV-status group infected by the indicated HPV type, with the number of samples in brackets. b. Unsupervised clustering of the top 1,000 most variable genes across our cohort (n = 118 samples). q-values were determined using Benjamini-Hochberg (BH)-corrected Fisher's exact tests. c. Percentage of differentially methylated probes between clades (A7, n = 51 samples; A9, n = 56 samples) at different genomic features, shown by HPV clade. d. Log2 fold change and BH-adjusted p-value of differentially expressed genes between clade A7- (n = 52) and A9-infected (n = 57) samples. e. Volcano plots showing the log2 fold change and BH-adjusted p-value of differentially expressed genes between clade A7- (n = 52) and A9-infected (n = 57) samples that are associated with A9-hypermethylated (top) and A7-hypermethylated (bottom) differentially methylated regions (DMRs). f, g. Top: kernel density of E6 (f) and E7 (g) expression in the HTMCP cohort, which separates samples into high- and low-expressing cases. Bottom: gene ontologies enriched in differentially expressed genes in samples with low/high E6 (n = 68/n = 48) (f) and E7 (n = 58/n = 59) (g). h. Multivariate Cox proportional hazards model for HPV clade, HIV status and disease stage (n = 66 patients). Hazard ratios and p-values reported for each variable were determined using log-rank tests. Where relevant, all statistical tests were two-sided.
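For readers unfamiliar with the BH adjustment referred to in panels b, d and e, the following is a minimal sketch of how BH-adjusted p-values (q-values) can be computed from a list of raw p-values. It is an illustration only, not the analysis code used for this figure: the function name `bh_adjust` and the example p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative helper (hypothetical name); not the pipeline used for the figure.
    """
    m = len(pvalues)
    if m == 0:
        return []
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # taking a running minimum so adjusted values are monotone in rank
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, for illustration only
example_p = [0.01, 0.04, 0.03, 0.20]
example_q = bh_adjust(example_p)
```

In practice, an established routine such as `multipletests(..., method="fdr_bh")` from statsmodels would typically be preferred over a hand-written helper.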