Fig. 1: Word clouds illustrating top-weighted conditions for selected topics. | npj Digital Medicine

From: Finding Long-COVID: temporal topic modeling of electronic health records from the N3C and RECOVER programs

Fig. 1

Conditions are sized according to probability within each topic and colored according to relevance, with positive relevance indicating conditions more probable in the topic than overall. Each condition displays the numeric OMOP concept ID encoding the relevant medical code used for clustering, as well as the first few words of the condition name. Per-topic statistics in panel headers show usage of each topic across sites (\(\mathrm{U}\), rounded to the nearest 0.1%), topic uniformity across sites (\(\mathrm{H}\), 0–1, higher values being more uniform), and relative topic quality as a normalized coherence score (\(\mathrm{C}\), z-score, higher values being more coherent).
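As a rough illustration of how statistics of this kind could be computed, the sketch below makes several assumptions that the caption does not confirm: relevance is taken to be the log ratio of a condition's in-topic probability to its overall probability, uniformity \(\mathrm{H}\) is taken to be Shannon entropy of per-site usage normalized to 0–1, and \(\mathrm{C}\) is taken to be a plain z-score of raw coherence values across topics. All function names are hypothetical, and the article's actual definitions may differ.

```python
import math


def relevance(p_in_topic: float, p_overall: float) -> float:
    """Assumed definition: log ratio of in-topic to overall probability.

    Positive when the condition is more probable in the topic than overall.
    """
    return math.log(p_in_topic / p_overall)


def uniformity(site_usage: list[float]) -> float:
    """Assumed definition: Shannon entropy of usage across sites, scaled to 0-1.

    1.0 means the topic is used equally at every site; 0.0 means a single
    site accounts for all usage.
    """
    total = sum(site_usage)
    if total <= 0 or len(site_usage) < 2:
        raise ValueError("need at least two sites and positive total usage")
    probs = [u / total for u in site_usage if u > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(site_usage))


def z_scores(values: list[float]) -> list[float]:
    """Standardize raw coherence values across topics (population std. dev.)."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]
```

Under these assumptions, a condition with in-topic probability 0.2 and overall probability 0.1 has positive relevance, and a topic used equally at every site has uniformity 1.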
