Table 4. Text properties and entropy of medical concept metadata records.
From: A reference set of curated biomedical data and metadata from clinical case reports
Concept | Average Entropy (bits, +/− standard deviation) | Character Count | Word Count | Segment Count |
|---|---|---|---|---|
Keywords | 2.17 +/− 2.04 | 127,932 | 8,326 | 6,636 |
Geographic Locations | 0.35 +/− 1.01 | 6,085 | 901 | 358 |
Life Style | 0.55 +/− 1.35 | 29,244 | 4,862 | 521 |
Family History | 1.15 +/− 1.83 | 138,162 | 21,342 | 1,717 |
Social History | 0.23 +/− 0.90 | 12,310 | 2,022 | 249 |
Medical/Surgical History | 3.02 +/− 1.84 | 804,975 | 119,816 | 8,783 |
Signs and Symptoms | 3.96 +/− 0.94 | 1,460,450 | 218,276 | 16,467 |
Comorbidities | 0.96 +/− 1.63 | 33,978 | 3,918 | 1,329 |
Diagnostic Techniques and Procedures | 3.98 +/− 0.87 | 1,369,668 | 195,000 | 15,936 |
Diagnosis | 3.85 +/− 0.66 | 206,418 | 24,432 | 4,718 |
Laboratory Values | 2.80 +/− 2.12 | 990,769 | 146,240 | 5,238 |
Pathology | 2.32 +/− 2.11 | 853,084 | 121,009 | 2,865 |
Pharmacological Therapy | 2.74 +/− 1.99 | 422,402 | 60,270 | 3,863 |
Interventional Therapy | 2.60 +/− 1.94 | 399,831 | 57,967 | 4,909 |
Patient Outcome Assessment | 3.07 +/− 1.77 | 440,602 | 66,786 | 4,526 |
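
The table does not state how the entropy values were calculated. As a rough illustration only, the sketch below shows one way columns of this kind could be derived from a list of text segments for a single concept. It assumes character-level Shannon entropy per segment, an entropy of 0 bits for empty text, a population standard deviation, and whitespace-delimited words. None of these choices is confirmed by the source, and the names `shannon_entropy` and `summarize` are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def shannon_entropy(text: str) -> float:
    """Shannon entropy, in bits, of the character distribution of `text`.

    Assumption: entropy is taken over characters; the source may have
    used a different unit (for example, words) or a different estimator.
    """
    if not text:
        return 0.0  # assumed convention: empty text carries 0 bits
    counts = Counter(text)
    total = len(text)
    # H = sum over symbols of p * log2(1 / p), with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def summarize(segments: list[str]) -> dict:
    """Summarize one concept's segments in the shape of a table row."""
    entropies = [shannon_entropy(s) for s in segments]
    return {
        "mean_entropy": mean(entropies) if entropies else 0.0,
        # population standard deviation is an assumption, not from the source
        "sd_entropy": pstdev(entropies) if entropies else 0.0,
        "characters": sum(len(s) for s in segments),
        "words": sum(len(s.split()) for s in segments),
        "segments": len(segments),
    }
```

Under these assumptions, a concept whose records are mostly empty would yield a low mean entropy with a comparatively large standard deviation, which is the pattern seen in rows such as Social History and Geographic Locations. Whether that explains the reported values cannot be determined from the table alone.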