Table 2 Comparison of answer accuracy between KnBERT-TD and baseline models.
From: A machine solution for math word problems based on semantic understanding enhancement
| Baselines | Model | Math23K | Ape-210k | MathQA | MAWPS* |
|---|---|---|---|---|---|
| Classical baselines | DNS | – | 66.2% | – | 59.5% |
| | Math-EN | 66.7% | – | – | 69.2% |
| | RecursiveNN | 68.7% | – | – | – |
| | GTS | 75.6% | 73.2% | 71.3% | 82.6% |
| | TSN-MD | 77.4% | – | – | – |
| | Graph2Tree | 77.4% | – | 72.0% | 83.7% |
| | BERTGen | 76.6% | 75.4% | – | 86.9% |
| Baselines based on PLM | RoBERTaGen | 76.9% | 74.2% | – | 88.4% |
| | REAL | 82.3% | – | – | – |
| | BERT-CL | 83.2% | 81.5% | 76.3% | – |
| | GPT-4 | 84.3% | 83.0% | 77.3% | – |
| | SBI-RAG | 80.5% | – | – | – |
| | LCR | 80.1% | – | – | – |
| | KnBERT-TD | 85.7% | 83.5% | 77.9% | 89.0% |