Figure 1

Comparing SeqSpider with conventional BN learning algorithms in predicting the hESC regulatory network from NGS data sets. (A) The consensus hESC regulatory network inferred by SeqSpider. The color of an edge indicates the Pearson correlation coefficient (PCC) between the total tag counts within TSS ± 2 kb (or TTS ± 2 kb for H3K36me3) for the two interacting nodes. (B) Stability of the consensus network inferred by SeqSpider (as shown in (A); panel f) compared with alternative implementations using different types of training data, with or without profile clustering (other panels). Network stability curves are evaluated on 10-fold incomplete training samples. The SeqSpider algorithm works on “combined vectored data”, whereas the conventional BN algorithm [11,12] works on “discretized data” and the original kernel-based BN algorithm [13] works on “real-valued data”. (C) Significance of literature co-citation rates for networks inferred by different algorithms and on distinct types of data (null, sk, k-means, ap: profile clustering is not performed, or is performed with the super k-means, the classic k-means, or the affinity propagation algorithm, respectively; vec, real, dis: vectored, real-valued, or discrete training data, respectively). (D) Joint validations of the hESC regulatory network in (A) and two cellular context-dependent/independent motif networks. P-values indicate the statistical significance of network overlaps. (E) The P-values for the overlap between general motif interaction networks and the hESC epigenome-based regulatory networks. “cons/uncons” indicates whether a motif network is learned using the hESC regulatory network in (A) as a structural constraint; see (C) for the notation of the other algorithms. (F) Prototype of information flow in the network in (A). Input from one modification or enhancer activity that feeds into H3K4me3 at the TSS causes information to flow out of this hub and engage other modifications or enhancers.
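
As an illustration of the edge coloring in (A), the PCC between two nodes can be computed from their per-gene tag counts as in the minimal sketch below. The function name `pearson`, the variable names, and the count values are hypothetical and chosen only for illustration; how the tag counts are extracted within TSS ± 2 kb, and how PCC values are mapped to colors, are not specified in this legend and are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the centered values.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("PCC is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical total tag counts per gene (one entry per gene) for two
# nodes of the network; these numbers are made up for illustration.
counts_node_a = [120, 340, 15, 560, 80]
counts_node_b = [100, 300, 30, 500, 90]

edge_pcc = pearson(counts_node_a, counts_node_b)
print(f"PCC for this edge: {edge_pcc:.3f}")
```

Each element of the two lists corresponds to the same gene, so the coefficient measures how similarly the two signals vary across genes.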