Extended Data Figure 8: Cellular resolution tracking of protein expression in the C. elegans embryo.
From: Regulatory analysis of the C. elegans genome with spatiotemporal resolution

a, Cellular-resolution protein expression levels for 180 genes (x axis) in terminal embryo cells (N = 671, y axis). For each gene, the normalized expression signal in each cell is shown (see Methods). For each gene, expression signals in cells not measured directly correspond to the expression signal of the last measured ancestor. Focus factors (FF = 13) whose binding was assayed in embryonic stages are labelled red. Factors whose binding was assayed only in larval stages (FL = 23) are labelled blue. The broad tissue class of each cell is indicated in the sidebar. b, Quality controls for the embryonic, cellular-resolution expression data. The number of time-series recorded per gene (x axis) is shown. For genes with multiple time-series (NGR = 145), the Pearson correlation coefficient (R) between the fluorescence signals of recorded cells was calculated for NPR = 762 pairs of time-series (replicates). The distribution of correlation coefficients is shown, together with the median correlation coefficient among replicate experiments (R = 0.8310). The number (c) and percentage (d) of embryonic cells with expression measurements across any of the assayed genes (assayed cells, grey), all of the assayed genes (tracked cells), and all of the 13 genes (focus factors) for which both embryonic binding data and cellular-resolution expression data were acquired (focused cells) are plotted as a function of developmental time (Sulston minutes). The specific developmental times with the maximum coverage of the cells in the embryo are indicated for the tracked (TT) and focused cells (TF). e, Murray et al. (ref. 5) previously suggested that cells expressing an individual gene can be robustly identified by requiring a fluorescence signal ≥2000 that is also ≥10% of the maximum signal observed for that reporter (gene).
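The two-part expression call described for panel e can be sketched as below. This is a minimal illustration, not the authors' code: the function name, the dictionary layout and the signal values are hypothetical, and only the two thresholds (signal ≥2000 and ≥10% of the per-reporter maximum) come from the legend.

```python
def expressing_cells(signal_by_cell, min_signal=2000.0, fraction=0.1):
    """Return the cells called as expressing one reporter (gene).

    A cell is called expressing when its fluorescence signal is at least
    `min_signal` AND at least `fraction` of the maximum signal observed
    for that reporter. The dict layout (cell name -> signal) is an
    assumption made for this sketch.
    """
    if not signal_by_cell:
        return set()
    peak = max(signal_by_cell.values())
    return {
        cell
        for cell, signal in signal_by_cell.items()
        if signal >= min_signal and signal >= fraction * peak
    }


# Made-up signals for one reporter; the values are illustrative only.
signals = {"ABala": 30000.0, "MSap": 2500.0, "Ea": 4000.0, "P4": 1500.0}

called_10 = expressing_cells(signals, fraction=0.1)  # f = 0.1
called_20 = expressing_cells(signals, fraction=0.2)  # f = 0.2
```

Here MSap passes the absolute threshold but falls below 10% of the peak signal, so it is not called at either cut-off; raising ƒ from 0.1 to 0.2 additionally drops Ea.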
To confirm these recommendations, we calculated the overlap in the expressing cell populations for pairs of genes at 10% (ƒ = 0.1) and 20% (ƒ = 0.2) of the maximal signal for each gene, and computed the correlation between the overlaps calculated per gene pair at the two thresholds (R = 0.94). This analysis was extended to compare a wide range of expression cut-offs (ƒ) in e, where we observed robust correlations for the 10% cut-off (ƒ = 0.1). f, Cellular expression overlap matrix for 180 genes in the early embryo. For each pairwise gene comparison, we calculated the significance of the overlap between the populations of cells expressing each gene. The overlap enrichment and depletion P values between gene pairs were determined using directional Fisher’s exact tests and were Benjamini–Hochberg corrected. To generate a final overlap score, we selected the more significant of the enrichment and depletion P values, reporting either the -log10(P value of enrichment) or the log10(P value of depletion) to obtain positive and negative values for enrichment and depletion, respectively. g, Overlap between co-association cells and the gene-expressing cells (the expressing population) for non-focus factors (NNF = 168). For each cellular-resolution co-association pattern discovered (Fig. 4c), the set of co-association cells is defined as the population of cells in which the co-association is observed in the SOM. For 39 co-association patterns, co-association cells significantly overlap (hypergeometric test, Bonferroni-corrected, P < 0.01) the gene-expressing cells of at least one of 124 non-focus factor target genes. Pairs of co-association patterns and target genes with significant overlaps between the co-association cells and gene-expressing cells were classified as ‘co-association in promoter’ if the co-association pattern with the significant enrichment was observed at the promoter of the target gene, and as ‘co-association not in promoter’ if this was not the case.
The distributions of overlap significance values for the two classes and the Wilcoxon test P value comparing the two distributions are shown. MEP-1 (+) indicates experiments performed with strain OP102.
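The signed overlap score described for panel f can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the one-sided Fisher's exact P values are computed directly from the hypergeometric distribution, the enrichment and depletion P values are assumed to be Benjamini–Hochberg corrected as two separate families (the legend does not specify this), and all function names, gene labels and cell sets are hypothetical.

```python
from itertools import combinations
from math import comb, log10


def fisher_one_sided(k, n_a, n_b, n_total, enrichment=True):
    """One-sided Fisher's exact P value for an overlap of k cells between
    two expressing populations of sizes n_a and n_b among n_total cells.
    enrichment=True gives P(X >= k); enrichment=False gives P(X <= k)."""
    lo = max(0, n_a + n_b - n_total)
    hi = min(n_a, n_b)
    ks = range(k, hi + 1) if enrichment else range(lo, k + 1)
    total = comb(n_total, n_a)
    p = sum(comb(n_b, i) * comb(n_total - n_b, n_a - i) for i in ks) / total
    return min(1.0, p)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def signed_overlap_score(p_enrich, p_deplete, floor=1e-300):
    """Positive -log10(P) if enrichment is the more significant test,
    negative log10(P) if depletion is. `floor` avoids log10(0) and is
    an assumption of this sketch."""
    if p_enrich <= p_deplete:
        return -log10(max(p_enrich, floor))
    return log10(max(p_deplete, floor))


# Made-up expressing populations for three hypothetical genes in 20 cells.
expressing = {"A": set(range(10)), "B": set(range(8)), "C": set(range(10, 18))}
n_cells = 20

pairs = list(combinations(sorted(expressing), 2))
p_enr, p_dep = [], []
for g1, g2 in pairs:
    a, b = expressing[g1], expressing[g2]
    k = len(a & b)
    p_enr.append(fisher_one_sided(k, len(a), len(b), n_cells, enrichment=True))
    p_dep.append(fisher_one_sided(k, len(a), len(b), n_cells, enrichment=False))

q_enr = benjamini_hochberg(p_enr)
q_dep = benjamini_hochberg(p_dep)
scores = {
    pair: signed_overlap_score(qe, qd)
    for pair, qe, qd in zip(pairs, q_enr, q_dep)
}
```

In this toy example genes A and B share most of their expressing cells and receive a positive score, whereas A and C are expressed in disjoint cells and receive a negative score of the same magnitude.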