Table 4 Criteria measurement report.
From: Applying large language models for automated essay scoring for non-native Japanese
| Criteria | Measures | Infit Mnsq | Outfit Mnsq | S.E. |
|---|---|---|---|---|
| Lexical richness | Lexical diversity | 1.14 | 1.42 | 0.21 |
| | Lexical density | 0.99 | 0.85 | 0.24 |
| | Lexical sophistication | 1.12 | 1.21 | 0.19 |
| Syntactic diversity | Mean dependency distance | 1.21 | 1.21 | 0.13 |
| | Mean length of clause | 0.94 | 1.04 | 0.19 |
| | Verb phrases per T-unit | 0.89 | 0.93 | 0.20 |
| | Clauses per T-unit | 0.95 | 0.99 | 0.15 |
| | Dependent clauses per T-unit | 1.01 | 1.11 | 0.14 |
| | Complex nominals per T-unit | 1.13 | 1.05 | 0.20 |
| | Adverbial clause rate | 0.79 | 0.74 | 0.12 |
| | Coordinate phrases rate | 1.09 | 0.96 | 0.14 |
| Cohesion | Synonym overlap/paragraph (topic) | 1.22 | 1.46 | 0.24 |
| | Synonym overlap/paragraph (keywords) | 1.28 | 1.23 | 0.21 |
| | (a) word2vec → (b) cosine similarity between sample and reference | 0.76 | 0.85 | 0.28 |
| Content elaboration | Metadiscourse marker rate | 0.82 | 0.76 | 0.24 |
| Grammatical accuracy | Grammatical error rate | 0.86 | 0.93 | 0.26 |
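
One way to read the fit columns is to check each measure against an acceptance band for the mean-square statistics. The sketch below is illustrative only: the table does not state which band was applied, so the default 0.5 to 1.5 range (a commonly cited rule of thumb) and the stricter 0.7 to 1.3 alternative are assumptions, and the function name `misfitting` is hypothetical. The values are transcribed from Table 4.

```python
# (measure, infit MnSq, outfit MnSq), transcribed from Table 4
FIT = [
    ("Lexical diversity", 1.14, 1.42),
    ("Lexical density", 0.99, 0.85),
    ("Lexical sophistication", 1.12, 1.21),
    ("Mean dependency distance", 1.21, 1.21),
    ("Mean length of clause", 0.94, 1.04),
    ("Verb phrases per T-unit", 0.89, 0.93),
    ("Clauses per T-unit", 0.95, 0.99),
    ("Dependent clauses per T-unit", 1.01, 1.11),
    ("Complex nominals per T-unit", 1.13, 1.05),
    ("Adverbial clause rate", 0.79, 0.74),
    ("Coordinate phrases rate", 1.09, 0.96),
    ("Synonym overlap/paragraph (topic)", 1.22, 1.46),
    ("Synonym overlap/paragraph (keywords)", 1.28, 1.23),
    ("word2vec cosine similarity", 0.76, 0.85),
    ("Metadiscourse marker rate", 0.82, 0.76),
    ("Grammatical error rate", 0.86, 0.93),
]


def misfitting(rows, low=0.5, high=1.5):
    """Return measures whose infit or outfit lies outside [low, high].

    The default band is an assumed rule of thumb, not a threshold
    taken from the source table.
    """
    return [
        name
        for name, infit, outfit in rows
        if not (low <= infit <= high and low <= outfit <= high)
    ]


print(misfitting(FIT))            # assumed band 0.5 to 1.5
print(misfitting(FIT, 0.7, 1.3))  # stricter assumed band 0.7 to 1.3
```

Under the assumed 0.5 to 1.5 band no measure is flagged; under the stricter 0.7 to 1.3 band only the outfit values for lexical diversity (1.42) and synonym overlap by topic (1.46) fall outside.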