Table 4. Comprehensive performance comparison between ACCI and two backbone networks on the GVC dataset.
From: Argument centric causal intervention for cross document event coreference resolution
Model | MUC P | MUC R | MUC F1 | B\(^3\) P | B\(^3\) R | B\(^3\) F1 | CEAF\(_e\) P | CEAF\(_e\) R | CEAF\(_e\) F1 | LEA P | LEA R | LEA F1 | CoNLL F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Ahmed2023-LH | | | | | | | | | | | | | |
 Baseline-Roberta | 89.6 | 87.0 | 88.3 | 67.9 | 82.3 | 74.4 | 55.2 | 62.0 | 58.4 | 57.8 | 77.6 | 66.2 | 73.7 |
 Baseline-Long | 91.1 | 84.0 | 87.4 | 76.4 | 79.0 | 77.7 | 52.5 | 69.6 | 59.9 | 63.9 | 74.1 | 68.6 | 75.0 |
 ACCI-Base | 92.8 | 88.3 | 90.5 | 76.5 | 82.2 | 79.3 | 59.7 | 71.7 | 65.2 | 68.2 | 78.3 | 72.9 | 78.3 |
 ACCI-Large | 92.7 | 87.1 | 89.8 | 78.6 | 80.6 | 79.6 | 60.3 | 75.5 | 67.1 | 69.7 | 76.2 | 72.8 | 78.8 |
Ahmed2023-LH_Oracle | | | | | | | | | | | | | |
 Baseline-Roberta | 90.2 | 89.1 | 89.6 | 68.0 | 85.0 | 75.6 | 59.6 | 62.7 | 61.1 | 59.5 | 80.6 | 68.5 | 75.4 |
 Baseline-Long | 91.4 | 84.9 | 88.0 | 77.4 | 80.4 | 78.9 | 54.3 | 70.5 | 61.3 | 65.5 | 75.7 | 70.2 | 76.1 |
 ACCI-Base | 92.4 | 89.7 | 91.0 | 74.6 | 84.7 | 79.4 | 63.0 | 70.8 | 66.6 | 67.4 | 81.2 | 73.7 | 79.0 |
 ACCI-Large | 91.9 | 91.5 | 91.7 | 70.7 | 88.0 | 78.4 | 68.7 | 69.7 | 69.2 | 64.6 | 84.8 | 73.4 | 79.8 |
Held2021 | | | | | | | | | | | | | |
 Baseline | 92.3 | 89.3 | 90.8 | 85.7 | 82.1 | 83.8 | 67.5 | 76.6 | 71.7 | 78.8 | 76.9 | 77.8 | 82.1 |
 ACCI | 92.8 | 92.4 | 92.6 | 85.0 | 88.3 | 86.6 | 75.6 | 77.2 | 76.4 | 84.0 | 79.6 | 81.7 | 85.2 |
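
The columns in the table are related by two standard conventions: each F1 is the harmonic mean of its P and R, and the CoNLL F1 is conventionally the unweighted mean of the MUC, B\(^3\), and CEAF\(_e\) F1 scores. The sketch below assumes those conventions apply here and recomputes the Held2021 rows from the reported values. The function and variable names are illustrative and are not taken from the paper.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * precision * recall / (precision + recall)


def conll_f1(muc_f1, b3_f1, ceafe_f1):
    """Unweighted mean of the MUC, B^3, and CEAF_e F1 scores (assumed convention)."""
    return (muc_f1 + b3_f1 + ceafe_f1) / 3


# Values copied from the Held2021 rows of the table.
held2021 = {
    "Baseline": {"muc_p": 92.3, "muc_r": 89.3, "muc": 90.8, "b3": 83.8, "ceafe": 71.7, "conll": 82.1},
    "ACCI": {"muc_p": 92.8, "muc_r": 92.4, "muc": 92.6, "b3": 86.6, "ceafe": 76.4, "conll": 85.2},
}

for model, row in held2021.items():
    recomputed_muc = round(f1(row["muc_p"], row["muc_r"]), 1)
    recomputed_conll = round(conll_f1(row["muc"], row["b3"], row["ceafe"]), 1)
    print(model, "MUC F1:", recomputed_muc, "reported:", row["muc"])
    print(model, "CoNLL F1:", recomputed_conll, "reported:", row["conll"])
```

Because the table reports values rounded to one decimal place, recomputed scores for other rows may differ from the reported ones in the last digit.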