Table 2 Overview of symbols used in the description of the scregclust algorithm

From: Reconstructing the regulatory programs underlying the phenotypic plasticity of neural cancers

| Symbol | Description |
| --- | --- |
| **Problem setup:** | |
| \(n\) | Total number of cells in the input matrix |
| \(p_t\) | Number of target genes |
| \(p_r\) | Number of regulators |
| \(Z_t\), \(Z_r\) | Target gene and regulator expression (\(n \times p_t\) and \(n \times p_r\) matrices, respectively) |
| \(K\) | Desired number of clusters |
| \(\mathbf{\Pi}\) | \(K \times p_t\) cluster membership matrix for target genes with \(\mathbf{\Pi}^{(i,j)} \in \{0, 1\}\) and \(\sum_{i=1}^{K} \mathbf{\Pi}^{(i,j)} \le 1\) |
| \(J\) | \(p_t \times p_t\) indicator matrix describing prior knowledge of biological relationships between target genes (e.g. pathway co-occurrence) |
| **For data splits \(d = 1, 2\):** | |
| \(n_d\) | Number of cells in the \(d\)-th data split |
| \(Z_{t,d}\) | Target gene expression used in the \(d\)-th data split (\(n_d \times p_t\) matrix) |
| \(Z_{r,d}\) | Regulator expression used in the \(d\)-th data split (\(n_d \times p_r\) matrix) |
| **For each cluster \(i = 1, \dots, K\):** | |
| \(C_i\) | Set of target genes in cluster \(i\) |
| \(R_i\) | Set of regulators associated with cluster \(i\) |
| \(N_i\) | Maximum number of regulators associated with cluster \(i\) |
| \(s_i\) | Sign vector containing one sign for each regulator in \(R_i\) |
| \(B_i\) | Non-negative regression coefficients (\(\lvert R_i \rvert \times p_t\) matrix) |
| \(\sigma_{i,j}^{2}\) | Positive variance parameter for each target gene and cluster |
| **Optimization-related (\(i = 1, \dots, K\), \(j = 1, \dots, p_t\)):** | |
| \(\lambda\) | Positive penalty parameter used in coop-Lasso |
| \(w_i\) | Positive weight vector of length \(p_r\) for each cluster \(i\) in coop-Lasso |
| \(B_{i,\mathrm{OLS}}\) | Ordinary least squares estimates of the regression coefficients in cluster \(i\) (\(p_r \times \lvert C_i \rvert\) matrix) |
| \(B_{i,\mathrm{CL}}\) | Coop-Lasso estimates of the regression coefficients in cluster \(i\) (\(p_r \times \lvert C_i \rvert\) matrix) |
| \(\tau\) | Non-negative threshold for rag-bag clustering |
| \(p_{i,j}\) | Prior probability of target gene \(j\) being in cluster \(i\) |
| \(L_j\) | \(K \times n_2\) likelihood matrix for target gene \(j\) |
| \(v_j\) | Vector of \(n_2\) votes for the cluster assignment of target gene \(j\) |
| \(\mu\) | Prior strength in \([0, 1]\) for trade-off between likelihood and prior in cluster allocation |
| **Validation measures (\(i = 1, \dots, K\), \(j = 1, \dots, p_t\), \(k = 1, \dots, p_r\)):** | |
| \(R_{i}^{2}\) | Predictive \(R^2\) for cluster \(i\) |
| \(R_{i,-k}^{2}\) | Predictive \(R^2\) for cluster \(i\) with regulator \(k\) omitted |
| \(I_{i,k}\) | Importance of regulator \(k\) in cluster \(i\) |
| \(R_{i,j}^{2}\) | Predictive \(R^2\) for target gene \(j\) predicted by regulators in \(R_i\) |
| \(S_j\) | Silhouette score for target gene \(j\) |
| **Output:** | |
| \(T\) | Regulatory table providing a summary of regulator strength in each cluster (\(p_r \times K\) matrix) |
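
To make the conventions for the cluster membership matrix concrete, the following minimal sketch builds a toy Π, checks the two constraints listed in the table (binary entries, column sums of at most 1), and recovers the cluster sets C_i from it. It illustrates the notation only and is not the scregclust implementation: the helper names `is_valid_membership` and `cluster_sets`, the toy sizes, and the 0-based indexing are hypothetical. Reading an all-zero column as a target gene that belongs to no cluster is an assumption based on the "at most 1" constraint.

```python
# Toy illustration of the notation only; not the scregclust implementation.
K = 3      # desired number of clusters
p_t = 5    # number of target genes

# Pi is a K x p_t matrix with entries in {0, 1}; Pi[i][j] == 1 means
# target gene j belongs to cluster i (0-based indices in this sketch).
Pi = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
]


def is_valid_membership(Pi, K, p_t):
    """Check shape, binary entries, and that each column sums to at most 1."""
    if len(Pi) != K or any(len(row) != p_t for row in Pi):
        return False
    if any(entry not in (0, 1) for row in Pi for entry in row):
        return False
    return all(sum(Pi[i][j] for i in range(K)) <= 1 for j in range(p_t))


def cluster_sets(Pi):
    """Return C_i, the set of target-gene indices in each cluster i."""
    return [{j for j, entry in enumerate(row) if entry == 1} for row in Pi]


C = cluster_sets(Pi)

# Assumption: a gene whose column is all zeros is assigned to no cluster.
unassigned = set(range(p_t)) - set().union(*C)

print(is_valid_membership(Pi, K, p_t))
print(C)
print(unassigned)
```

In this toy example, the third target gene (index 2) has an all-zero column, so it appears in `unassigned` rather than in any C_i.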