Table 4 Text preprocessing statistics.

From: Dual stream graph augmented transformer model integrating BERT and GNNs for context aware fake news detection

| Preprocessing step | Average tokens per article | Stopwords removed | Unique lemmas |
|---|---|---|---|
| Raw text | 500 | - | - |
| After stopword removal | 350 | 30% | 10,000 |
| After lemmatization | 340 | - | 9,500 |
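
As a rough illustration of how statistics of this kind can be computed, the sketch below tokenizes a small corpus, removes stopwords, and applies a lemma lookup. It is a minimal sketch, not the pipeline used in the paper: the `STOPWORDS` set, the `LEMMAS` map, and the function names `tokenize` and `preprocessing_stats` are all hypothetical stand-ins, and a real setup would likely use a full stopword list and a proper lemmatizer.

```python
from statistics import mean

# Toy stopword list and lemma map. Both are illustrative assumptions,
# not the resources used in the paper.
STOPWORDS = {"the", "a", "is", "are", "of", "in", "and", "to"}
LEMMAS = {
    "claims": "claim",
    "claimed": "claim",
    "stories": "story",
    "spreading": "spread",
}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,;:!?\"'()") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def preprocessing_stats(articles):
    """Return per-stage token and vocabulary statistics for a list of texts."""
    raw = [tokenize(a) for a in articles]
    no_stop = [[t for t in doc if t not in STOPWORDS] for doc in raw]
    lemmatized = [[LEMMAS.get(t, t) for t in doc] for doc in no_stop]

    raw_total = sum(len(doc) for doc in raw)
    kept_total = sum(len(doc) for doc in no_stop)

    return {
        "avg_tokens_raw": mean(len(doc) for doc in raw),
        "avg_tokens_no_stop": mean(len(doc) for doc in no_stop),
        # Share of all raw tokens that were dropped as stopwords.
        "stopwords_removed_pct": 100 * (raw_total - kept_total) / raw_total,
        "unique_before_lemmatization": len({t for doc in no_stop for t in doc}),
        "unique_lemmas": len({t for doc in lemmatized for t in doc}),
    }


if __name__ == "__main__":
    corpus = ["The claims are spreading.", "A story of the claimed stories."]
    for name, value in preprocessing_stats(corpus).items():
        print(f"{name}: {value}")
```

The removal percentage follows the same arithmetic as the table: going from 500 to 350 average tokens is a reduction of (500 - 350) / 500 = 30%. Note that a one-to-one lemma lookup like this one shrinks the vocabulary but leaves the token count unchanged, so it does not reproduce the drop from 350 to 340 tokens reported after lemmatization.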