Figure 2
From: Combining deep learning with token selection for patient phenotyping from electronic health records

(A) Frequency distribution of rank-ordered tokens. The solid green line is a linear regression fit over the corresponding rank range, showing that token frequency follows a power law with exponent α = 0.97. (B) Relative cumulative sum of token frequencies over the range of the linear regression fit in panel A (ranks 10 to 1500).
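
The two quantities in this caption can be illustrated with a minimal sketch. It is not the authors' code, and it rests on several assumptions: the regression is taken to be an ordinary least-squares fit of log frequency against log rank, the range 10 to 1500 is read as 1-indexed inclusive ranks, and the function names `fit_power_law` and `relative_cumsum` are hypothetical. The input data are synthetic, generated from an exact power law, and are not the paper's token counts.

```python
import numpy as np

def fit_power_law(freqs, lo=10, hi=1500):
    """Estimate alpha in f(r) ~ r**(-alpha) by least squares in log-log space.

    Assumes all frequencies are positive; ranks lo..hi are 1-indexed, inclusive.
    """
    freqs = np.sort(np.asarray(freqs, dtype=float))[::-1]  # rank 1 = most frequent
    ranks = np.arange(1, len(freqs) + 1, dtype=float)
    sel = (ranks >= lo) & (ranks <= hi)
    # Degree-1 polyfit returns [slope, intercept]; the slope is -alpha.
    slope, intercept = np.polyfit(np.log10(ranks[sel]), np.log10(freqs[sel]), 1)
    return -slope, intercept

def relative_cumsum(freqs, lo=10, hi=1500):
    """Cumulative sum of frequencies over ranks lo..hi, normalized to end at 1."""
    freqs = np.sort(np.asarray(freqs, dtype=float))[::-1]
    window = freqs[lo - 1:hi]
    return np.cumsum(window) / window.sum()

# Synthetic frequencies following an exact power law (illustration only).
ranks = np.arange(1, 5001, dtype=float)
synthetic = 1e6 * ranks ** -0.97

alpha, _ = fit_power_law(synthetic)
cum = relative_cumsum(synthetic)
```

On real token counts the fitted exponent depends on the chosen rank window, which is presumably why the fit is restricted to a range rather than applied to the full distribution.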