Figure 1 | Scientific Reports

Figure 1

From: Deviation of Zipf's and Heaps' Laws in Human Languages with Limited Dictionary Sizes

The character frequency distribution of The Story of the Stone: (a1) p(k) with log-log scale and (a2) Z(r) with log-linear scale.

The number of distinct characters versus the text length of The Story of the Stone in (a3) log-log scale and (a4) linear-log scale. Similar plots in (b1–b4), (c1–c4) and (d1–d4) are for the books The Battle Wizard, Into the White Night and The History of the Three Kingdoms, respectively. The power-law exponent β is obtained by maximum likelihood estimation [36,37], while the exponent in the Zipf's plot is obtained by the least-squares method excluding the head (the majority of characters in the head play a role similar to that of auxiliary words, conjunctions or prepositions in English). We fit the data with r > 500 for Chinese books and r > 200 for Japanese and Korean books.
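
The quantities named in this caption can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code. It assumes the tail of the Zipf plot is fitted as a straight line of ln Z(r) against r (suggested by the log-linear axes of panel a2), and it uses a common continuous approximation to the discrete power-law maximum likelihood estimator, which may differ from the procedure in refs. 36 and 37. All function names and the parameters `r_min` and `k_min` are hypothetical.

```python
from collections import Counter

import numpy as np


def rank_frequency(text):
    """Return character counts sorted in descending order, i.e. Z(r) for r = 1, 2, ..."""
    counts = Counter(text)
    return np.array(sorted(counts.values(), reverse=True), dtype=float)


def fit_zipf_tail(z, r_min):
    """Least-squares fit of ln Z(r) = a - alpha * r using only ranks r > r_min.

    Assumption: a linear fit on log-linear axes; the head (r <= r_min) is excluded.
    Returns (alpha, a).
    """
    z = np.asarray(z, dtype=float)
    r = np.arange(1, len(z) + 1)
    mask = r > r_min
    slope, intercept = np.polyfit(r[mask], np.log(z[mask]), 1)
    return -slope, intercept


def mle_power_law_exponent(k, k_min=1):
    """Approximate MLE of beta in p(k) ~ k^(-beta) for k >= k_min.

    Assumption: the continuous approximation to the discrete estimator,
    beta ~ 1 + n / sum(ln(k_i / (k_min - 0.5))), which is rough for small k_min.
    """
    k = np.asarray(k, dtype=float)
    k = k[k >= k_min]
    return 1.0 + len(k) / np.sum(np.log(k / (k_min - 0.5)))


def heaps_curve(text):
    """Number of distinct characters seen after each position t = 1, ..., len(text)."""
    seen = set()
    curve = []
    for ch in text:
        seen.add(ch)
        curve.append(len(seen))
    return np.array(curve)
```

For a book loaded as a single string `text`, one would call `z = rank_frequency(text)`, then `fit_zipf_tail(z, r_min=500)` for a Chinese text or `r_min=200` for a Japanese or Korean text, following the thresholds stated in the caption. Here `mle_power_law_exponent(z)` treats the character counts as the sample of k values, and `heaps_curve(text)` gives the data plotted in panels (a3) and (a4).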
