Table 9. Comparison of results on the original dataset and its human-edited versions.
From: Classifying human vs. AI text with machine learning and explainable transformer models
| Dataset type | Accuracy (± CI) | Precision (± CI) | Recall (± CI) | F1 (± CI) | Brier (± CI) | ECE (± CI) |
|---|---|---|---|---|---|---|
| 5–10% Human Edit | 0.951 ± 0.014 | 0.920 ± 0.026 | 0.988 ± 0.003 | 0.953 ± 0.012 | 0.044 ± 0.012 | 0.491 ± 0.003 |
| 30–40% Human Edit | 0.9442 ± 0.0142 | 0.9094 ± 0.0235 | 0.9871 ± 0.0024 | 0.9466 ± 0.0128 | 0.0501 ± 0.0118 | 0.4896 ± 0.0032 |
| Actual Data | 0.961 ± 0.004 | 0.945 ± 0.007 | 0.979 ± 0.004 | 0.962 ± 0.004 | 0.034 ± 0.003 | 0.492 ± 0.010 |
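
The Brier and ECE columns are calibration metrics computed from predicted probabilities rather than hard labels. As a rough illustration of how such values are commonly obtained, here is a minimal sketch for a binary classifier. The function names, the equal-width binning, and the choice of 10 bins are assumptions made for illustration; the table does not state which ECE variant or bin count the authors used.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between the predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def expected_calibration_error(y_true, p_pred, n_bins=10):
    """Binary ECE with equal-width bins (an assumed variant, not necessarily the paper's).

    Samples are grouped by predicted positive-class probability; the result is the
    sample-weighted mean of |observed positive rate - mean predicted probability|.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, p_pred):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))

    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += (len(members) / n) * abs(observed - predicted)
    return ece


# Toy labels and probabilities, invented purely to exercise the functions.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.8, 0.9]
brier = brier_score(labels, probs)
ece = expected_calibration_error(labels, probs)
```

Both metrics are lower-is-better, and different ECE definitions (confidence-based versus positive-class-based, equal-width versus equal-mass bins) can yield noticeably different values on the same predictions, so figures from this sketch are not directly comparable to the table.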