Figure 2 | Scientific Reports

Figure 2

From: Robust shifts in S100a9 expression with aging: A novel mechanism for chronic inflammation

In silico strategy for identifying cis-regulatory mechanisms controlling S100a9 expression.

The figure illustrates a general procedure for identifying a cluster of S100a9-coexpressed genes (parts A – C), which can then be evaluated to identify TF binding sites that occur at disproportionately high frequency within associated genomic sequences (part D). The procedure is illustrated here for a single cell type (mouse chondrocytes), but we have applied the methodology across a broader panel of 30 mouse and 32 human cell types. In the first step (A), a foreground set of S100a9-coexpressed transcripts is identified. This is done by calculating the Spearman rank correlation (rs) between each transcript and S100a9 and then ranking all transcripts by the magnitude of rs. The dashed red line shown in (A) represents the segment with minimal distance between the origin (lower left corner) and the curve shown in the figure. This red line serves to define the foreground set of S100a9-coexpressed genes (dark grey region). In part (B), this S100a9-coexpression cluster is illustrated with respect to the 53 microarray samples used to calculate the Spearman rank correlations shown in (A), where each microarray sample was generated by hybridization with cDNA derived from mouse chondrocytes. The foreground set of S100a9-coexpressed genes can thus be viewed as the local sub-network that surrounds S100a9, as illustrated in (C). In the final step (D), a generalized additive logistic model (GAM) is used to identify significant associations between S100a9 coexpression and the number of TF binding sites present within the 2 kb region upstream of the transcription start site (or other genomic regions). In each GAM, the probability of S100a9 coexpression (vertical axis) is modeled as a function of two variables x1 and x2, where x1 is the length of unmasked sequence scanned for a given gene and x2 is the number of TF binding sites identified in the upstream region.
A GAM was fit for each of 1209 TF binding sites, and the association between S100a9 coexpression and binding site occurrence was assessed from the significance of the coefficient β2.
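
The foreground-set selection described for parts (A) to (C) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names (`average_ranks`, `spearman`, `foreground_set`) are hypothetical, and scaling both axes to the interval [0, 1] before measuring distance to the origin is an assumption, since the caption does not state how the axes were scaled.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rs, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def foreground_set(expression, target):
    """Rank transcripts by |rs| with `target` and cut the ranked curve
    at the point closest to the origin.

    `expression` maps transcript name -> list of expression values,
    one value per microarray sample.
    """
    scores = {
        gene: abs(spearman(values, expression[target]))
        for gene, values in expression.items()
        if gene != target
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    top = scores[ranked[0]] or 1.0
    # Assumption: rank and |rs| are both rescaled to [0, 1].
    dists = [
        sqrt(((i + 1) / n) ** 2 + (scores[gene] / top) ** 2)
        for i, gene in enumerate(ranked)
    ]
    cut = dists.index(min(dists))
    return ranked[: cut + 1], scores


# Toy data with six samples; real input would have one value per array.
toy = {
    "S100a9": [1, 2, 3, 4, 5, 6],
    "A": [1, 2, 3, 4, 5, 6],
    "B": [2, 1, 3, 4, 5, 6],
    "C": [1, 2, 3, 6, 5, 4],
    "D": [3, 1, 6, 2, 5, 4],
    "E": [3, 4, 1, 6, 5, 2],
}
foreground, rs_magnitude = foreground_set(toy, "S100a9")
print(foreground)
```

In this sketch every transcript ranked at or before the closest point is kept as foreground. Whether the boundary transcript itself is included, and how ties in rs are ordered, are details the caption leaves open.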
