Table 2 Cross-validation results on the CIRCLE-seq dataset

From: A versatile CRISPR/Cas9 system off-target prediction tool using language model

| Model | Bal Acc | F1-score | AUROC | AUPRC |
| --- | --- | --- | --- | --- |
| CCLMoff | 0.998 ± 0.001 | 0.409 ± 0.003 | 0.985 ± 0.001 | 0.524 ± 0.004 |
| LSTM | 0.843 ± 0.002 | 0.052 ± 0.001 | 0.926 ± 0.001 | 0.479 ± 0.003 |
| CRISPR-Net | 0.806 ± 0.002 | 0.083 ± 0.001 | 0.915 ± 0.003 | 0.462 ± 0.007 |
| CCTop | 0.887 ± 0.004 | 0.003 ± 0.001 | 0.711 ± 0.004 | 0.008 ± 0.001 |
| CCLMoff-Epi | 0.998 ± 0.001 | 0.429 ± 0.005 | 0.989 ± 0.001 | 0.513 ± 0.004 |
| CCLMoff-Van | 0.836 ± 0.001 | 0.053 ± 0.001 | 0.901 ± 0.003 | 0.422 ± 0.005 |
| **CCLMoff vs.** | | | | |
| LSTM | 6 × 10⁻¹² | 3 × 10⁻¹² | 9 × 10⁻⁹ | 4 × 10⁻¹¹ |
| CRISPR-Net | 1 × 10⁻⁹ | 8 × 10⁻⁷ | 1 × 10⁻⁵ | 7 × 10⁻⁶ |
| CCTop | 1 × 10⁻⁸ | 1 × 10⁻²³ | 1 × 10⁻¹³ | 1 × 10⁻¹⁸ |
| CCLMoff-Epi | 0.82 | 0.53 | 0.09 | 0.12 |
| CCLMoff-Van | 1 × 10⁻⁴ | 2 × 10⁻¹³ | 6 × 10⁻⁴ | 5 × 10⁻⁶ |

  1. Bal Acc: balanced accuracy, used for evaluating imbalanced datasets. CCLMoff-Van: CCLMoff-Vanilla.
  2. A t test was conducted to assess the statistical significance of performance differences between CCLMoff and the baseline models; the rows comparing CCLMoff with each model report the resulting p-values.
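
The two footnotes can be illustrated with a minimal sketch. It shows how balanced accuracy and F1-score are computed from confusion-matrix counts, and why a classifier on a heavily imbalanced dataset can score high on the first and low on the second. It also computes a t statistic for two sets of per-fold scores. Everything here is an assumption for illustration: the counts and fold scores are made up and do not come from the table, the function names are hypothetical, and Welch's unequal-variance form is used only because the source does not say which t test variant was applied.

```python
import math
from statistics import mean, variance


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; it ignores true negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def welch_t(a, b):
    """Welch's t statistic and its approximate degrees of freedom.

    Assumption: the source only says "a t test", so this variant is a guess.
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical counts: 100 true off-target sites among 100,100 candidate sites.
tp, fn, fp, tn = 99, 1, 300, 99_700
print("balanced accuracy:", round(balanced_accuracy(tp, fp, tn, fn), 4))
print("F1-score:", round(f1_score(tp, fp, fn), 4))

# Hypothetical per-fold AUROC values for two models (not taken from the paper).
model_a = [0.985, 0.986, 0.984, 0.985, 0.986]
model_b = [0.926, 0.927, 0.925, 0.926, 0.925]
t_stat, dof = welch_t(model_a, model_b)
print("t statistic:", round(t_stat, 2), "degrees of freedom:", round(dof, 2))
```

With these made-up counts, the few hundred false positives barely affect specificity because negatives are so numerous, yet they outnumber the true positives and pull precision, and therefore F1, down. Turning the t statistic into a p-value requires the t distribution; if SciPy is available, `scipy.stats.ttest_ind(model_a, model_b, equal_var=False)` performs the same Welch test and returns the p-value directly.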