Table 5. Semantic Structural Congruence (SSC) across datasets for different document layout models with input embeddings: OCR and layout only.

From: Representation learning approach for understanding structured documents

| Dataset / Model | LayoutLMv3 [6] | DocLayout-YOLO [30] | DocSAM [28] | LayoutLLM [36] | DocLayLLM [37] | D-REEL |
| --- | --- | --- | --- | --- | --- | --- |
| PRImA Newspaper Dataset [34] | 57.14 | 60.27 | 58.63 | 63.42 | 64.19 | 73.05 |
| German-Brazilian Newspapers (GBN) [35] | 58.08 | 60.72 | 59.37 | 62.31 | 63.57 | 71.48 |
| S2-VL Dataset [10] | 83.52 | 86.11 | 86.43 | 88.59 | 89.03 | 90.47 |
| IIIT AR 13K [33] | 84.68 | 87.29 | 87.15 | 89.21 | 90.07 | 89.14 |
| PubLayNet [32] | 84.03 | 86.42 | 89.17 | 88.26 | 89.08 | 89.31 |