Table 1. Data features extracted by PEAS from ATAC-seq BAM files and genomic sequence, categorized by feature type.

From: A neural network based model effectively predicts enhancers from clinical ATAC-seq samples

| Feature Type | Feature Label (24) |
| --- | --- |
| ATAC-seq Peak Driven (n = 5) | Peak score (MACS) |
| | Peak length (MACS) |
| | Fold change |
| | Summit pileup |
| | Summit center distance |
| ATAC-seq Insert/Cut Driven (n = 10) | # of all inserts |
| | # of inserts (0, 50] |
| | # of inserts (50, 150] |
| | # of inserts (150, 300] |
| | # of inserts (300, 500] |
| | # of inserts (500, ∞) |
| | Insert size (mean) |
| | Long/short insert ratio |
| | # of cuts within peak |
| | # of overrepresented cuts |
| Sequence Driven (n = 3) | Conservation (mean) |
| | GC% (HOMER) |
| | CpG% (HOMER) |
| Motif Driven (n = 4) | # of CTCF motifs |
| | % of known motifs present |
| | % of de novo motifs present |
| Genomic Location Driven (n = 3) | Annotation (HOMER) |
| | Distance to TSS |
| | Gene type (HOMER) |

  1. For numeric ranges, square brackets denote inclusive bounds and parentheses denote exclusive bounds.
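
To make the interval notation concrete, the following is a minimal Python sketch of how insert sizes could be counted into the bins listed in the table. It is an illustration only, not the PEAS implementation: the function names `count_inserts_by_size` and `mean_insert_size` are hypothetical, and the sketch assumes insert sizes have already been extracted from the BAM file as positive integers (in base pairs).

```python
from bisect import bisect_left

# Upper (inclusive) bounds of the closed-right bins in the table.
# The final bin, (500, inf), has no upper bound.
BIN_EDGES = [50, 150, 300, 500]
BIN_LABELS = ["(0, 50]", "(50, 150]", "(150, 300]", "(300, 500]", "(500, inf)"]


def count_inserts_by_size(insert_sizes):
    """Count inserts per size bin (hypothetical helper, not part of PEAS).

    Each bin excludes its lower bound and includes its upper bound,
    so an insert of exactly 50 bp falls in (0, 50], not (50, 150].
    """
    counts = {label: 0 for label in BIN_LABELS}
    for size in insert_sizes:
        if size <= 0:
            raise ValueError(f"insert size must be positive, got {size}")
        # bisect_left returns the index of the first edge >= size,
        # which is exactly the closed-right bin the size belongs to.
        counts[BIN_LABELS[bisect_left(BIN_EDGES, size)]] += 1
    return counts


def mean_insert_size(insert_sizes):
    """Arithmetic mean of the insert sizes (hypothetical helper)."""
    if not insert_sizes:
        raise ValueError("no inserts given")
    return sum(insert_sizes) / len(insert_sizes)


if __name__ == "__main__":
    sizes = [50, 51, 150, 151, 300, 500, 501, 1000]
    for label, n in count_inserts_by_size(sizes).items():
        print(label, n)
    print("mean:", mean_insert_size(sizes))
```

Using `bisect_left` on the upper bounds keeps the boundary handling in one place: a value equal to an edge maps to the bin that closes at that edge, which matches the square-bracket convention in the footnote.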