Extended Data Fig. 4: SLC34A2 in ovarian cancer is likely driven by PAX8.

a) Differential expression of SLC34A2 in each tissue relative to the average of all tissues, using the combined GTEx, TCGA, and CCLE dataset. The relevant gynecological tissues (fallopian tube, ovary, and uterus) are highlighted in teal. The false discovery rate (FDR) was calculated using a two-sided Wilcoxon rank-sum test comparing each group to the average expression across all groups, with correction for multiple comparisons using Bonferroni’s method. The Cancer Genome Atlas abbreviations used include: LUAD = Lung adenocarcinoma; THCA = Thyroid carcinoma; KRP = Kidney renal cell papillary carcinoma; LUSC = Lung squamous cell carcinoma; OV = Ovarian serous cystadenocarcinoma; UCEC = Uterine corpus endometrial carcinoma. b) Expression of PAX8 and SLC34A2 mRNA in the indicated tissues. The Pearson correlation within these samples is indicated. c) Expression of PAX8 across the indicated tissues, compared as in Fig. 3a. See methods for exact N values. Boxplots indicate the first and third quartiles, and whiskers extend to the largest value within 1.5× the interquartile range. d) Immunoblot validation of CRISPR interference-mediated suppression of PAX8. N = 1 technical replicate, representative of N = 2 independent experiments. e) Expression of reported PAX8 target genes (see main text), relative to unperturbed parental cell lines profiled in parallel, after stable overexpression of WT-PAX8 (‘PAX8 O/E’) and/or induction of PAX8-targeting (sg4) or control (sg9) sgRNA and dCas9-KRAB. Data represent a single experiment. N = 1 replicate. f) XPR1 expression across all tissues in TCGA and GTEx, with ovarian and uterine tissues highlighted in teal. Boxplots are drawn as in b. g) XPR1 copy number heatmap for a ~2.5 Mb region of chromosome 1 indicating XPR1 amplification in TCGA Uterine Corpus Endometrial Carcinoma (ref. 20). Each patient sample is represented by a horizontal line. Red indicates copy gain and blue indicates copy loss. Data are a subset of the samples, rank-ordered by highest copy gain, to indicate both focal and chromosome arm-level gains.
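
The per-tissue comparison described for panel a can be illustrated with a minimal sketch. It is not the analysis code used for this figure. It assumes that "the average of all tissues" is approximated by pooling the samples from all other tissues, that the rank-sum p-value is obtained from a tie-corrected normal approximation without continuity correction, and that the Bonferroni adjustment multiplies each p-value by the number of tissues tested. The function names (`rank_sum_p`, `bonferroni`, `tissue_vs_rest`) and the toy expression values are hypothetical.

```python
import math
from statistics import NormalDist


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    # Pool both groups, remembering which group each value came from (0 = x, 1 = y).
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i
        mid_rank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum_x += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t**3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0  # all values identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def bonferroni(pvals):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    m = len(pvals)
    return {name: min(1.0, p * m) for name, p in pvals.items()}


def tissue_vs_rest(expr_by_tissue):
    """Compare each tissue with all other samples pooled (an assumption)."""
    pvals = {}
    for tissue, values in expr_by_tissue.items():
        rest = [v for t, vals in expr_by_tissue.items() if t != tissue for v in vals]
        pvals[tissue] = rank_sum_p(values, rest)
    return bonferroni(pvals)


# Hypothetical toy expression values, not data from this study.
toy = {
    "tissue_A": [10, 11, 12, 13, 14, 15],
    "tissue_B": [1, 2, 3, 4, 5, 6],
    "tissue_C": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5],
}
adjusted = tissue_vs_rest(toy)
```

In this toy example, `tissue_A` is fully separated from the remaining samples and therefore receives the smallest adjusted p-value. With real data, a library implementation of the rank-sum test would normally be preferred over this hand-written approximation.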