Table 2 Comparison of answer accuracy between KnBERT-TD and baseline models.

From: A machine solution for math word problems based on semantic understanding enhancement

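| Baselines | Math23K accuracy | Ape-210k accuracy | MathQA accuracy | MAWPS* accuracy |
|---|---|---|---|---|
| Classical baselines | | | | |
| DNS | – | 66.2% | – | 59.5% |
| Math-EN | 66.7% | – | – | 69.2% |
| RecursiveNN | 68.7% | – | – | – |
| GTS | 75.6% | 73.2% | 71.3% | 82.6% |
| TSN-MD | 77.4% | – | – | – |
| Graph2Tree | 77.4% | – | 72.0% | 83.7% |
| BERTGen | 76.6% | 75.4% | – | 86.9% |
| Baselines based on PLM | | | | |
| RoBERTaGen | 76.9% | 74.2% | – | 88.4% |
| REAL | 82.3% | – | – | – |
| BERT-CL | 83.2% | 81.5% | 76.3% | – |
| GPT-4 | 84.3% | 83.0% | 77.3% | – |
| SBI-RAG | 80.5% | – | – | – |
| LCR | 80.1% | – | – | – |
| KnBERT-TD | 85.7% | 83.5% | 77.9% | 89.0% |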

* indicates five-fold cross-validation.