Extended Data Fig. 7: Analysis of sequence similarity parameters in models of transcriptional adaptation. | Nature

From: Genetic compensation triggered by mutant mRNA degradation

a, Numbers of differentially expressed genes in the different knockout cell line models; P ≤ 0.05; these genes are distributed throughout the genome (data not shown). b, Venn diagram of genes upregulated in the three different cell line models with L2F knockout > wild-type and P ≤ 0.05. c, KEGG pathway enrichment analysis for genes commonly upregulated in Fermt2, Actg1 and Actb knockout compared to wild-type cells. The top ten pathways based on P value are displayed. The dashed line marks a P value of 0.05. Circle sizes provide an estimation of scale; outer grey circles represent the total number of genes in the pathway; and centred coloured circles represent the number of genes in the pathway that are commonly upregulated. d, Impact of various values of three different BLASTn alignment-quality parameters (alignment length, bit score and E value) on the significance of the observed correlation between upregulation and sequence similarity, and therefore on the identification or prediction of putative adapting genes. The E value describes the number of matches of equal or better score expected to arise by chance (a lower value corresponds to a match that is less likely to be due to chance), and the bit score evaluates the combination of alignment quality and length (a higher value corresponds to a better alignment). The y axis of each diagram shows the negative log10 of the P value and the x axis shows the respective parameter value. A P value of 0.05 is marked with a black horizontal line. The E value thresholds used in our analyses are highlighted with a circle. Lines ending prematurely indicate a lack of any remaining alignments after that point. The first row of diagrams explores large variations of thresholds, in an attempt to identify the total range, whereas the second row focuses on the most relevant window for the three genes investigated. The optimal thresholds differ considerably depending on the gene analysed. n = 2 biologically independent samples. P values were computed by bootstrapping random subsamples (see the ‘Sequence similarity and subsampling analyses’ section of the Methods). P values were not corrected for multiple testing.
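
The subsampling idea behind panel d can be illustrated with a minimal sketch. It is not the procedure from the Methods, which is not reproduced on this page. It assumes one plausible scheme: draw random gene sets of the same size as the set of sequence-similar genes, and count how often a random set contains at least as many upregulated genes as observed. The function names (`empirical_p_value`, `neg_log10`), the gene identifiers and the iteration count are all hypothetical.

```python
import math
import random


def empirical_p_value(all_genes, upregulated, similar, n_iter=10_000, seed=0):
    """Empirical P value for the overlap between upregulated and similar genes.

    Assumed scheme (not taken from the paper): draw random subsamples of
    len(similar) genes from all_genes and count how often a subsample holds
    at least as many upregulated genes as the observed overlap.
    """
    rng = random.Random(seed)
    background = sorted(set(all_genes))  # sorted for reproducible sampling
    upregulated = set(upregulated)
    similar = set(similar)
    observed = len(upregulated & similar)
    sample_size = len(similar)

    hits = 0
    for _ in range(n_iter):
        subsample = rng.sample(background, sample_size)
        overlap = sum(1 for gene in subsample if gene in upregulated)
        if overlap >= observed:
            hits += 1

    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)


def neg_log10(p_value):
    """Transform a P value to the scale used on the y axis of panel d."""
    return -math.log10(p_value)


if __name__ == "__main__":
    # Hypothetical toy data: 200 genes, 20 upregulated, 10 sequence-similar.
    genes = [f"gene{i}" for i in range(200)]
    up = genes[:20]
    sim = genes[:5] + genes[100:105]
    p = empirical_p_value(genes, up, sim, n_iter=2000)
    print(f"P = {p:.4f}, -log10(P) = {neg_log10(p):.2f}")
```

Repeating such a calculation for each candidate threshold (alignment length, bit score or E value) and plotting `neg_log10(p)` against the threshold would give curves of the kind described for panel d. Because no multiple-testing correction is applied, each point should be read on its own.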
