Table 2 Overview of symbols used in the description of the scregclust algorithm
From: Reconstructing the regulatory programs underlying the phenotypic plasticity of neural cancers
Symbol | Description |
|---|---|
Problem setup: | |
n | Total number of cells in the input matrix |
pt | Number of target genes |
pr | Number of regulators |
Zt, Zr | Target gene and regulator expression matrices (n × pt and n × pr, respectively) |
K | Desired number of clusters |
Π | K × pt cluster membership matrix for target genes with \(\mathbf{\Pi}^{(i,j)} \in \{0, 1\}\) and \(\sum_{i=1}^{K} \mathbf{\Pi}^{(i,j)} \le 1\) |
J | pt × pt indicator matrix describing prior knowledge of biological relationships between target genes (e.g. pathway co-occurrence) |
For data splits d = 1, 2: | |
nd | Number of cells in the d-th data split |
Zt,d | Target gene expression used in the d-th data split (nd × pt matrix) |
Zr,d | Regulator expression used in the d-th data split (nd × pr matrix) |
For each cluster i = 1, …, K: | |
Ci | Set of target genes in cluster i |
Ri | Set of regulators associated with cluster i |
Ni | Maximum number of regulators associated with cluster i |
si | Sign vector containing one sign for each regulator in Ri |
Bi | Non-negative regression coefficients (∣Ri∣ × pt matrix) |
\(\sigma_{i,j}^{2}\) | Positive variance parameter for each target gene and cluster |
Optimization-related (i = 1, …, K, j = 1, …, pt): | |
λ | Positive penalty parameter used in coop-Lasso |
wi | Positive weight vector of length pr for each cluster i in coop-Lasso |
Bi,OLS | Ordinary least squares estimates of the regression coefficients in cluster i (pr × ∣Ci∣ matrix) |
Bi,CL | Coop-lasso estimates of the regression coefficients in cluster i (pr × ∣Ci∣ matrix) |
τ | Non-negative threshold for rag-bag clustering |
pi,j | Prior probability of target gene j being in cluster i |
Lj | K × n2 likelihood matrix for target gene j, where n2 is the number of cells in the second data split |
vj | Vector of n2 votes for the cluster assignment of target gene j |
μ | Prior strength in [0, 1] for trade-off between likelihood and prior in cluster allocation |
Validation measures (i = 1, …, K, j = 1, …, pt, k = 1, …, pr): | |
\(R_{i}^{2}\) | Predictive \(R^{2}\) for cluster i |
\(R_{i,-k}^{2}\) | Predictive \(R^{2}\) for cluster i with regulator k omitted |
Ii,k | Importance of regulator k in cluster i |
\(R_{i,j}^{2}\) | Predictive \(R^{2}\) for target gene j predicted by regulators in Ri |
Sj | Silhouette score for target gene j |
Output: | |
T | Regulatory table providing a summary of regulator strength in each cluster (pr × K matrix) |
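
The constraints on the membership matrix Π in the table (entries in {0, 1}, each column summing to at most 1) can be made concrete with a small sketch. This is only an illustration of the notation, not code from the scregclust package: the function names `make_membership`, `is_valid_membership`, and `cluster_sets` are hypothetical, indices are zero-based, and a target gene with an all-zero column is simply treated as unassigned.

```python
def make_membership(K, p_t, assignment):
    """Build a K x p_t 0/1 membership matrix (hypothetical helper).

    assignment[j] is a cluster index in 0..K-1, or None when target
    gene j is left unassigned (its column then sums to 0).
    """
    Pi = [[0] * p_t for _ in range(K)]
    for j, i in enumerate(assignment):
        if i is not None:
            Pi[i][j] = 1
    return Pi


def is_valid_membership(Pi):
    """Check that entries are 0/1 and every column sums to at most 1."""
    K = len(Pi)
    p_t = len(Pi[0]) if K else 0
    for j in range(p_t):
        column = [Pi[i][j] for i in range(K)]
        if any(v not in (0, 1) for v in column) or sum(column) > 1:
            return False
    return True


def cluster_sets(Pi):
    """Recover C_i, the set of target-gene indices in each cluster i."""
    return [{j for j, v in enumerate(row) if v == 1} for row in Pi]


# Two clusters, four target genes; gene 2 is left unassigned.
Pi = make_membership(K=2, p_t=4, assignment=[0, 1, None, 0])
C = cluster_sets(Pi)
```

Here `C[0]` contains genes 0 and 3 and `C[1]` contains gene 1, which mirrors how the sets Ci in the table relate to the rows of Π under these assumptions.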