Extended Data Fig. 6: Complex SVs in FA SCC and the transcriptional landscapes of FA SCC and sporadic HNSCC.
From: Genomic signature of Fanconi anaemia DNA repair pathway deficiency in cancer

a Number of somatic SV chains detected in 10x linked-read WGS of FA SCCs (n = 4), where a chain is defined as ≥ 2 discrete SVs (≥ 4 unique breakpoints). Median and IQR are indicated. b Number of SVs present in 108 SV chains in 10x linked-read WGS of FA SCCs. Mean (4.6 SVs) and IQR are indicated. c Number of SVs of indicated class present in 108 SV chains from 10x linked-read WGS of FA SCCs. Means and IQRs are indicated. d SV breakpoint distribution from 108 SV chains stratified by human chromosome number. e Somatic SV burden of n = 9 PacBio-sequenced FA SCCs. Three samples (indicated) were sequenced to 10x average coverage, and six samples were sequenced to 30x average coverage. f Somatic SV class proportions in n = 9 PacBio-sequenced FA SCCs. Medians and IQRs are indicated. g Illumina & PacBio % SV call overlap for SVs > 1kb and deletions < 1kb for n = 9 FA SCCs sequenced on both platforms. Shown are % of PacBio SV calls > 1kb present in Illumina BRASS output, % of PacBio deletion calls < 1kb present in Illumina indel calls, and % of Illumina SV calls > 1kb present in PacBio BAMs. Median and IQR are indicated. h Comparison of deletion sizes (< 1kb) detected by SV calling in n = 9 PacBio FA SCCs and by indel calling in the same 9 FA SCCs sequenced by Illumina WGS. Median and IQR are shown. i Examples of fold-back inversions (FBI) driving sharp copy-number change at key oncogenic loci identified in FA SCCs (PacBio data). j Comparison of the raw number of unbalanced translocation events in FA SCC (n = 20), HPV-negative sporadic HNSCC (n = 23), BRCA2mut (n = 41), and BRCA1mut (n = 24) cohorts. Two-tailed Mann-Whitney U test p-values are indicated, with median and IQR shown. k Comparison of hg19 expected vs. observed percentage of somatic SV breakpoints in 9 PacBio-sequenced FA SCCs that localize to repeat regions. Unpaired two-tailed t-test p-value is indicated (t = 7.371, df = 8), with median and IQR shown.
l Breakpoint density graph displaying GC% sequence composition within +/− 100 bp from SV breakpoints identified in PacBio sequencing data, calculated relative to hg19 global GC% frequency (40.9%) (notated as “expected”). Median and IQR are displayed. m Comparison of hg19 expected vs. observed percentage of somatic SV breakpoints from FA SCCs (n = 20) and HPV-negative sporadic HNSCC cohorts (n = 23) that localize to repeat regions and to the indicated repeat class (Illumina WGS). Two-tailed Mann-Whitney U test p-values are indicated, with median and IQR shown. n Comparison of the number of retrotransposon element (RTE) insertions in FA SCC (n = 20), HPV-negative sporadic HNSCC (n = 23), BRCA2mut (n = 41), and BRCA1mut (n = 24) cohorts. Two-tailed Mann-Whitney U test p-values are indicated, with median and IQR shown. o Cancer-relevant genes differentially expressed between FA SCC (n = 6) and sporadic HNSCC (n = 520) as assessed by RNAseq, including genes displayed in Fig. 1c. Differential expression is gated at log2(FC) > 1 or log2(FC) < −1 with DESeq2 FDR-adjusted p-value < 0.05. DESeq2 implementation of Wald test with FDR-adjusted p-value is indicated. Genes whose relative expression is impacted by an sCNA are colored orange. Genes whose relative expression is discordant with sCNA frequency are colored blue. Genes not identified in focal sCNA peaks are colored white. GAPDH and PGK1 are indicated in black and added as housekeeping controls. p Quality-control distribution graph showing log2(FC) values of all genome-wide transcripts comparing FA SCC (n = 6) vs sporadic HNSCC (n = 520). Median and IQR are displayed. q DNA repair genes differentially expressed in FA SCC (n = 6) versus sporadic HNSCC (n = 520) by RNAseq. Differential expression is gated at log2(FC) > 1 or log2(FC) < −1 with DESeq2 FDR-adjusted p-value < 0.05. DESeq2 implementation of Wald test with FDR-adjusted p-value is indicated.
r Aldehyde dehydrogenase (Aldh) and alcohol dehydrogenase (Adh) genes differentially expressed between FA SCC (n = 6) and sporadic HNSCC (n = 520). Differential expression is gated at log2(FC) > 1 or log2(FC) < −1 with DESeq2 FDR-adjusted p-value < 0.05. DESeq2 implementation of Wald test with FDR-adjusted p-value is indicated. s Gene-set enrichment/depletion (GO) analysis of genes differentially expressed between FA SCC and sporadic HNSCC. Genes entered into analysis were gated at log2(FC) > 1 or log2(FC) < −1 with DESeq2 FDR-adjusted p-value < 10⁻⁵. Gene sets were gated at > 2-fold enrichment over expected background with GO Fisher’s exact test FDR-adjusted p-value < 0.01 to be reported in the figure. In all cases, n refers to independent biological samples.
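
The expression gates stated for panels o, q, r and s can be read as a simple two-condition filter on each gene. The following is a minimal Python sketch of that filter, not the authors' pipeline: the function name `passes_gate`, the gene labels, and all numeric values are hypothetical and chosen only to illustrate the thresholds, and the adjusted p-values are assumed to have been computed already (for example by DESeq2).

```python
def passes_gate(log2_fc, padj, fc_cutoff=1.0, fdr_cutoff=0.05):
    """Return True when log2(FC) > cutoff or < -cutoff AND adjusted p < FDR cutoff."""
    return abs(log2_fc) > fc_cutoff and padj < fdr_cutoff

# Hypothetical (log2 fold change, FDR-adjusted p-value) pairs; not data from the paper.
genes = {
    "GENE_A": (2.3, 1e-8),    # strong change, highly significant
    "GENE_B": (-1.4, 0.03),   # strong change, modestly significant
    "GENE_C": (0.6, 1e-10),   # significant, but fold change too small
    "GENE_D": (1.8, 0.2),     # large fold change, not significant
}

# Gate described for panels o, q and r: adjusted p-value < 0.05.
gated_default = sorted(g for g, (fc, p) in genes.items() if passes_gate(fc, p))

# Stricter gate described for genes entering the GO analysis in panel s: adjusted p-value < 1e-5.
gated_strict = sorted(
    g for g, (fc, p) in genes.items() if passes_gate(fc, p, fdr_cutoff=1e-5)
)

print(gated_default)
print(gated_strict)
```

Both conditions must hold at once, so a gene with a large fold change but a non-significant adjusted p-value is excluded, as is a highly significant gene whose fold change falls inside the ±1 band.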