Table 3. Results of k-fold cross-validation (k = 5, 10) applied to SpaCy with the manual corpus, artificial data, and combined data.
| Data | k | Statistic | Num_entities | Num_predictions | Num_correct | Precision | Recall | f_value |
|---|---|---|---|---|---|---|---|---|
| Manual corpus | 5 | Mean | 2165.2 | 2089.2 | 1630.2 | 0.781 | 0.753 | 0.766 |
| Manual corpus | 5 | S.D. | 35.1 | 91.0 | 34.1 | 0.020 | 0.016 | 0.006 |
| Manual corpus | 10 | Mean | 1082.6 | 1050.2 | 821.1 | 0.782 | 0.759 | 0.770 |
| Manual corpus | 10 | S.D. | 33.6 | 38.2 | 20.6 | 0.016 | 0.019 | 0.015 |
| Manual corpus + artificial data (10,000) | 5 | Mean | 2165.2 | 2056.2 | 1688.8 | 0.821 | 0.780 | 0.800 |
| Manual corpus + artificial data (10,000) | 5 | S.D. | 35.1 | 33.9 | 22.0 | 0.005 | 0.004 | 0.002 |
| Manual corpus + artificial data (10,000) | 10 | Mean | 1082.6 | 1031.7 | 843 | 0.817 | 0.779 | 0.798 |
| Manual corpus + artificial data (10,000) | 10 | S.D. | 33.5 | 33.4 | 19.8 | 0.013 | 0.022 | 0.016 |
| Artificial data only (10,000) | 5 | Mean | 2165.2 | 1365.6 | 1063.4 | 0.779 | 0.491 | 0.602 |
| Artificial data only (10,000) | 5 | S.D. | 35.1 | 16.3 | 18.6 | 0.008 | 0.012 | 0.011 |
| Artificial data only (10,000) | 10 | Mean | 1082.6 | 682.8 | 531.7 | 0.779 | 0.491 | 0.602 |
| Artificial data only (10,000) | 10 | S.D. | 33.6 | 15.5 | 15.5 | 0.011 | 0.015 | 0.013 |
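
The score columns in Table 3 can be related to the count columns with the standard definitions: precision = Num_correct / Num_predictions, recall = Num_correct / Num_entities, and f_value as the harmonic mean of the two. The following is a minimal sketch under that assumption. The function names `prf` and `summarize` are hypothetical and do not come from the source, and the use of the sample standard deviation is also an assumption, since the table does not state which estimator was used.

```python
from statistics import mean, stdev


def prf(num_entities, num_predictions, num_correct):
    """Return (precision, recall, f_value) from entity counts.

    Assumes the standard definitions; zero denominators yield 0.0.
    """
    precision = num_correct / num_predictions if num_predictions else 0.0
    recall = num_correct / num_entities if num_entities else 0.0
    denom = precision + recall
    f_value = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_value


def summarize(folds):
    """Aggregate per-fold scores into (mean, S.D.) pairs.

    folds: list of (num_entities, num_predictions, num_correct),
    one tuple per cross-validation fold (at least two folds).
    The sample standard deviation is an assumption.
    """
    scores = [prf(*fold) for fold in folds]
    names = ("precision", "recall", "f_value")
    return {
        name: (mean(column), stdev(column))
        for name, column in zip(names, zip(*scores))
    }


# Rough consistency check using the mean counts reported for the
# manual corpus at k = 5 (2165.2 entities, 2089.2 predictions, 1630.2 correct).
p, r, f = prf(2165.2, 2089.2, 1630.2)
print(round(p, 3), round(r, 3), round(f, 3))
```

Scores computed from the mean counts agree with the reported means only approximately, because the table presumably averages the per-fold ratios rather than taking the ratio of the averaged counts. `summarize` follows the per-fold approach.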