Table 5. The generation of the CNN model in the two cell lines from the ENCODE project.

From: Predicting CTCF cell type active binding sites in human genome

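| New cell line | N | Cell line for model training | Signal type | N0 | N1 | N2 |
|---|---|---|---|---|---|---|
| D721Med | 41,938 | HSMMtube | DNase I only | 1260 | 27,948 | 12,730 |
| GM10248 | 33,957 | HSMMtube | DNase I only | 796 | 23,098 | 10,063 |
| GM10266 | 31,923 | HSMMtube | DNase I only | 889 | 23,795 | 7239 |
| GM23338 | 65,903 | HSMMtube | DNase I only | 7388 | 52,115 | 6400 |
| H54 | 50,984 | HSMMtube | DNase I only | 469 | 38,944 | 11,571 |
| LNCaP clone FGC | 51,252 | HSMMtube | DNase I only | 3139 | 37,255 | 10,858 |
| IMR90 | 43,507 | MCF-7 | DNase I + RAD21 | 927 | 30,251 | 12,329 |
| SK-N-SH | 34,457 | GM12878 | DNase I + 2 TFs | 49 | 26,271 | 4837 |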

  1. N is the number of peaks derived from the new cell line; N0 is the number of peaks not included in the Complete_peak dataset; N1 is the number of peaks predicted as active CTCF binding sites; and N2 is the number of peaks predicted as inactive CTCF binding sites. These numbers satisfy the formula N = N0 + N1 + N2. “2 TFs” refers to RAD21 and SMC3.
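
The identity N = N0 + N1 + N2 stated in the footnote can be checked mechanically against the table. The following is a minimal Python sketch and is not part of the original article: the names `TABLE5`, `residual` and `check_rows` are hypothetical, and the counts are copied from the table as printed. A nonzero residual flags a row whose printed counts do not add up to N.

```python
# Counts copied from Table 5 as printed: (N, N0, N1, N2) per new cell line.
# The dictionary name and layout are illustrative assumptions.
TABLE5 = {
    "D721Med": (41938, 1260, 27948, 12730),
    "GM10248": (33957, 796, 23098, 10063),
    "GM10266": (31923, 889, 23795, 7239),
    "GM23338": (65903, 7388, 52115, 6400),
    "H54": (50984, 469, 38944, 11571),
    "LNCaP clone FGC": (51252, 3139, 37255, 10858),
    "IMR90": (43507, 927, 30251, 12329),
    "SK-N-SH": (34457, 49, 26271, 4837),
}


def residual(n, n0, n1, n2):
    """Return N - (N0 + N1 + N2); zero means the row satisfies the identity."""
    return n - (n0 + n1 + n2)


def check_rows(table):
    """Map each cell line to its residual under N = N0 + N1 + N2."""
    return {name: residual(*counts) for name, counts in table.items()}


if __name__ == "__main__":
    for name, r in check_rows(TABLE5).items():
        status = "ok" if r == 0 else f"off by {r}"
        print(f"{name}: {status}")
```

Running the script prints one line per cell line, so any row that needs to be re-checked against the source publication stands out.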