Figure 3

Complementary mathematical analysis of Pkd1- and Pkd2-responsive transcripts identified a core ADPKD data set of 178 consistently regulated genes (CD178). To identify genes involved in ADPKD pathogenesis, two independent mathematical approaches were used: (1) statistical analysis of differential gene expression (DGE); and (2) unbiased data simplification by principal component analysis (PCA). (a, b) RNA-seq of Pkd1- and Pkd2-deficient mIMCD3 cells revealed a total of 791 differentially expressed genes in Pkd1−/− and 1,250 in Pkd2−/− cells (large, dark-yellow dots: FDR < 0.05, |log2 fold change| ≥ 1). Of these, > 54% (429 genes) in Pkd1−/− cells and > 63% (790 genes) in Pkd2−/− cells were downregulated. (c) Filtering the DGE results of Pkd1−/− and Pkd2−/− cells for concordant changes yielded 254 genes (DGE254), of which 38 were up- and 216 downregulated (orange dots). (d) The 10% most variable genes (n = 2,506 of the 25,062 genes for which reads were obtained) were used as input for PCA. Principal component 1 (PC1) mainly accounted for non-Pkd-related differences between wildtype, Pkd1−/− and Pkd2−/− cells. PC2 discriminated Pkd1−/− and Pkd2−/− from wildtype cells, accounting for 29% of total variance. (e) Since PC2 correlated with Pkd1- and Pkd2-deficiency, genes were ranked according to their PC2 contribution, and the top 20% (n = 501, marked in blue) were selected for further analysis (PCA501). (f) Comparison of DGE254 and PCA501 identified 178 candidate genes specifically responding to Pkd1- and Pkd2-deficiency in mIMCD3 cells (core data set; CD178).
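
The selection logic of panels (a) to (f) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names (`concordant_dge`, `top_pc_genes`, `core_set`) are hypothetical, the per-gene log2 fold changes and FDR values are assumed to come from an upstream DGE tool, and a gene's "PC2 contribution" is assumed here to mean the absolute value of its PC2 loading, which the caption does not specify.

```python
import numpy as np


def concordant_dge(lfc_a, fdr_a, lfc_b, fdr_b, fdr_cut=0.05, lfc_cut=1.0):
    """Mask of genes passing both cutoffs in both contrasts with the same sign."""
    lfc_a = np.asarray(lfc_a, dtype=float)
    lfc_b = np.asarray(lfc_b, dtype=float)
    sig_a = (np.asarray(fdr_a) < fdr_cut) & (np.abs(lfc_a) >= lfc_cut)
    sig_b = (np.asarray(fdr_b) < fdr_cut) & (np.abs(lfc_b) >= lfc_cut)
    return sig_a & sig_b & (np.sign(lfc_a) == np.sign(lfc_b))


def top_pc_genes(expr, pc=1, var_frac=0.10, top_frac=0.20):
    """Indices of the genes with the largest absolute loading on one component.

    expr is a samples x genes matrix; pc is 0-based, so pc=1 means PC2.
    Only the var_frac most variable genes enter the PCA (assumed ranking:
    plain per-gene variance across samples).
    """
    expr = np.asarray(expr, dtype=float)
    n_var = int(round(expr.shape[1] * var_frac))
    var_idx = np.argsort(expr.var(axis=0))[::-1][:n_var]
    sub = expr[:, var_idx]
    centered = sub - sub.mean(axis=0)
    # PCA via SVD: rows of vt are the principal axes (gene loadings).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    n_top = int(round(n_var * top_frac))
    order = np.argsort(np.abs(vt[pc]))[::-1][:n_top]
    return var_idx[order]


def core_set(dge_mask, pca_idx):
    """Gene indices present in both the concordant DGE set and the PCA set."""
    return np.intersect1d(np.flatnonzero(dge_mask), pca_idx)
```

With the default fractions, an input of 25,062 genes gives 2,506 genes for the PCA and 501 genes in the PC2-ranked set, matching the counts in panels (d) and (e). The sizes of the DGE and core sets depend on the data and on the upstream statistics, so this sketch makes no claim to reproduce DGE254 or CD178.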