Scientific Reports

Table 2. Statistical data from generic domain datasets.

From: Intelligent recognition of counterfeit goods text based on BERT and multimodal feature fusion

Training Data	# Lines	Avg. Length	# Errors
SIGHAN 2013	350	49.2	350
SIGHAN 2014	6,526	49.7	10,087
SIGHAN 2015	3,174	30.0	4,237
Wang271K	271,329	44.4	382,704

Test Data	# Lines	Avg. Length	# Errors
SIGHAN 2013	1,000	74.1	1,227
SIGHAN 2014	1,062	50.1	782
SIGHAN 2015	1,100	30.5	715
