Supplementary Figure 6: Pan-disease control profile sampling.

Sorted boxplots showing the results of repeated paper sampling (200 iterations, each randomly selecting 200 abstracts), with the proportion of hits calculated for each cell (left) and cytokine (right). The entire corpus of 521,625 disease-HPC and 438,012 disease-cytokine co-occurrence papers is used, without limiting to any condition, to define a pan-disease control immune profile. Cell subset and cytokine family classification is indicated by the color of individual members along the y-axis. The top 50% of the results are shown, with the most highly cited entities highlighted (grey area, median ≥ 0.05). Box-plot elements: center line, median; box limits, first to third quartile (Q1 to Q3); whiskers, from Q1 − 1.5 × IQR to Q3 + 1.5 × IQR; points, outliers.
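
The sampling procedure described in this caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `sample_hit_proportions` and `rank_by_median`, the representation of each abstract as a set of entity names, the definition of a "hit" as an abstract mentioning the entity, and sampling without replacement within each iteration are all assumptions made for the example.

```python
import random
from statistics import median


def sample_hit_proportions(papers, entities, n_iter=200, sample_size=200, seed=0):
    """Repeatedly sample abstracts and record the per-entity hit proportion.

    papers   -- list of sets; each set holds the entity names found in one
                abstract (hypothetical representation)
    entities -- iterable of entity names (cells or cytokines) to score
    Returns a dict mapping each entity to a list of n_iter proportions.
    """
    rng = random.Random(seed)
    proportions = {entity: [] for entity in entities}
    for _ in range(n_iter):
        # Assumption: abstracts are drawn without replacement per iteration.
        sample = rng.sample(papers, sample_size)
        for entity in proportions:
            hits = sum(1 for abstract in sample if entity in abstract)
            proportions[entity].append(hits / sample_size)
    return proportions


def rank_by_median(proportions):
    """Order entities by median hit proportion, highest first."""
    return sorted(proportions, key=lambda e: median(proportions[e]), reverse=True)
```

Each list in the returned dictionary corresponds to one box in the plot. Keeping the first half of the `rank_by_median` output and flagging entities whose median is at least 0.05 would mirror the "top 50%" and grey-area selections stated in the caption.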