Table 1 Module detection methods evaluated in this study

From: A comprehensive evaluation of module detection methods for gene expression data

 

Clustering: grouping genes based on a global similarity in gene expression profiles

A. FLAME: fuzzy clustering by selecting cluster-supporting objects based on K-nearest-neighbor density estimation
B. K-medoids: iteratively refines the centers (which are individual genes) and the average dissimilarity within the cluster
C. K-medoids (see B), but with automatic module number estimation
D. Fuzzy c-means: similar to K-means (see F), but using fuzzy instead of crisp cluster memberships
E. Self-organizing maps: maps each gene onto a node embedded in a two-dimensional graph structure
F. K-means: iteratively refines the mean expression within a cluster and the within-cluster sum of squares
G. MCL: simulates random walks within the co-expression graph by alternating steps of expansion and inflation
H. Spectral clustering: applies K-means in the subspace defined by the eigenvectors of the Pearson’s correlation affinity matrix
I. Affinity propagation: clustering by exchange of messages between genes
J. Spectral clustering: applies K-means in the subspace defined by the eigenvectors of the K-nearest-neighbor graph
K. Transitivity clustering: tries to find the transitive co-expression graph in which the total cost of added and removed edges is minimized
L. WGCNA: agglomerative hierarchical clustering (see M), but using the topological overlap measure and a dynamic tree cutting algorithm to implicitly determine the number of modules
M. Agglomerative hierarchical clustering: generates a hierarchical structure by progressively grouping genes and clusters based on their similarity
N. Hybrid hierarchical clustering: combination of agglomerative and divisive hierarchical clustering
O. Divisive hierarchical clustering: generates a hierarchical structure by progressively splitting the genes into clusters
P. Agglomerative hierarchical clustering (see M), but with automatic module number estimation
Q. SOTA: combination of self-organizing maps and divisive hierarchical clustering
R. First finds cluster centers by searching for high-density regions; each gene is then assigned to the cluster of its nearest neighbor of higher density
S. CLICK: uses density estimation to find tight groups of similar genes, after which these are expanded into modules
T. DBSCAN: groups genes into core, non-core and outlier genes based on the number of neighbors
U. Clues: first applies a shrinking procedure that moves each gene towards nearby high-density regions, after which the genes are partitioned into an automatically determined number of clusters using the silhouette width
V. Mean shift: moves each gene towards nearby high-density regions until convergence

Decomposition: extracting the components corresponding to co-expression modules by decomposing the expression matrix into a product of smaller matrices

A. Independent component analysis: decomposes the expression matrix into a set of independent components using the FastICA algorithm; detects potentially overlapping modules within each source signal using false-discovery rate (FDR) estimation
B. Similar to A, but detects two modules per independent component depending on whether genes have positive or negative weights
C. Similar to A, but detects modules within each source signal using z-scores
D. Combination of principal component analysis and independent component analysis; uses FDR estimation to find modules
E. Principal component analysis: decomposes the expression matrix into a set of linearly uncorrelated components; detects potentially overlapping modules within each component using FDR estimation

Biclustering: simultaneous grouping of genes and samples into biclusters based on similar local behavior in expression

A. Spectral biclustering: detects checkerboard patterns within the gene expression matrix
B. ISA: iteratively refines a set of genes and samples based on high or low expression in both the gene and sample dimension
C. QUBIC: finds biclusters in which the genes have similar high or low expression levels in a discretized expression matrix
D. Bi-Force: finds biclusters with over- or under-expression by solving the bicluster editing problem
E. FABIA: builds a multiplicative model of the expression matrix layer by layer; every layer represents a bicluster
F. Plaid: builds an additive model of the expression matrix layer by layer; every layer represents a bicluster
G. MSBE: finds additive biclusters starting from randomly sampled reference genes and conditions
H. Cheng & Church: minimizes the mean squared residue within every bicluster
I. OPSM: searches for biclusters in which the expression changes in the same direction between genes and samples

Iterative network inference: iterative optimization of an inferred network and a set of clusters

A. MERLIN: iteratively refines a direct regulatory network and modules within a probabilistic graphical model framework
B. Genomica: starts from an initial hierarchical clustering and iteratively refines this clustering and an inferred module network using a model based on Bayesian regression trees

Direct network inference: inference of a regulatory network based on gene expression similarity between regulators and target genes

A. GENIE3: predicts the expression of each target gene based on random forest regression
B. CLR: calculates the likelihood of mutual information estimations based on the network neighborhood
C. Pearson’s correlation between regulator and target gene
D. TIGRESS: network inference using a combination of Lasso sparse regression and stability selection

1. Within each category, methods are ranked according to their average test score (Fig. 2). We refer the reader to Supplementary Note 2 for details regarding the implementation and parameters.
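
The one-line descriptions above are necessarily terse, so the sketches below illustrate the general flavor of each category on toy data. They are minimal Python illustrations under clearly stated assumptions, not the implementations benchmarked in this study (those are detailed in Supplementary Note 2). For the clustering category, the simplest representative is K-means (method F): alternate between assigning each gene to its nearest cluster mean and recomputing those means. The matrix dimensions and module number below are arbitrary.

```python
# Toy K-means clustering of a genes-by-samples matrix (clustering, method F).
# The matrix dimensions and the module number are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
expression = rng.normal(size=(200, 30))            # 200 genes x 30 samples (toy)

kmeans = KMeans(n_clusters=8, n_init=10, random_state=0)
labels = kmeans.fit_predict(expression)            # one module label per gene

modules = {m: np.flatnonzero(labels == m) for m in range(8)}
print({m: len(genes) for m, genes in modules.items()})
```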
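
For the decomposition category, the shared pattern is to factorize the expression matrix and call a module from the genes with extreme weights on each component. The sketch below uses PCA via SVD with a plain z-score cutoff on gene weights; the benchmarked variants use ICA and/or FDR estimation (methods A, C, D, E), so this is only a simplified stand-in, and both the number of components and the cutoff are arbitrary choices.

```python
# Toy PCA-based module detection (decomposition category). A z-score cutoff on
# gene weights replaces the FDR estimation used by the benchmarked methods.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 30))                     # genes x samples (toy)
Xc = X - X.mean(axis=1, keepdims=True)             # center each gene

# SVD-based PCA: columns of U hold the gene weights for each component
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

modules = []
for comp in range(5):                              # inspect the first 5 components
    w = U[:, comp]
    z = (w - w.mean()) / w.std()
    modules.append(np.flatnonzero(np.abs(z) > 2.5))  # genes with extreme weights

print([len(m) for m in modules])
```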
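
For the biclustering category, the score minimized by Cheng & Church (method H) is compact enough to write out: the mean squared residue of a submatrix, which is zero for a perfectly additive genes-by-samples pattern. Only the scoring step is sketched here; the full algorithm greedily removes and adds rows and columns to push this score below a threshold.

```python
# Mean squared residue of a candidate bicluster (biclustering, method H).
# The random submatrix below is only a placeholder for a candidate bicluster.
import numpy as np

def mean_squared_residue(sub):
    """MSR of a genes-by-samples submatrix; 0 means a perfectly additive pattern."""
    row_means = sub.mean(axis=1, keepdims=True)
    col_means = sub.mean(axis=0, keepdims=True)
    residue = sub - row_means - col_means + sub.mean()
    return float((residue ** 2).mean())

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 30))                     # toy expression matrix
genes = rng.choice(200, size=20, replace=False)
samples = rng.choice(30, size=8, replace=False)
print(mean_squared_residue(X[np.ix_(genes, samples)]))
```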
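
For the iterative network inference category, the defining idea is the alternation itself: refine modules given regulatory programs, then refine the programs given the modules. The sketch below alternates module assignment with a per-module ridge regression on a fixed set of candidate regulators. It mimics MERLIN and Genomica only in spirit; both work within probabilistic model frameworks, and the regulator set and ridge penalty here are arbitrary assumptions.

```python
# Loose sketch of alternating module assignment and per-module regulatory
# programs (iterative network inference category). Not MERLIN or Genomica;
# a ridge regression stands in for their probabilistic models.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 30))                     # genes x samples (toy)
regulators, targets = X[:10], X[10:]               # pretend the first 10 genes are TFs
n_modules = 5

labels = KMeans(n_clusters=n_modules, n_init=10, random_state=0).fit_predict(targets)
for _ in range(5):                                 # a few alternation rounds
    programs = []
    for m in range(n_modules):                     # one regulatory program per module
        members = targets[labels == m]
        programs.append(None if len(members) == 0
                        else Ridge(alpha=1.0).fit(regulators.T, members.mean(axis=0)))
    # re-assign each gene to the module whose program predicts it best
    errors = np.stack([np.full(len(targets), np.inf) if p is None
                       else ((targets - p.predict(regulators.T)) ** 2).mean(axis=1)
                       for p in programs])
    labels = errors.argmin(axis=0)

print(np.bincount(labels, minlength=n_modules))    # final module sizes
```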
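
For the direct network inference category, the simplest entry is method C: score every regulator–target pair by the Pearson correlation of their expression profiles and rank candidate edges by that score. Which genes count as regulators is an assumption of the sketch.

```python
# Toy correlation-based network inference (direct network inference, method C).
# The first 10 genes are arbitrarily treated as regulators.
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 30))                     # genes x samples (toy)
regulators = np.arange(10)

corr = np.corrcoef(X)                              # gene-by-gene Pearson correlation
scores = np.abs(corr[regulators])                  # regulators x all genes
scores[np.arange(len(regulators)), regulators] = 0.0   # drop self-edges

# report the five highest-scoring regulator -> target edges
flat = np.argsort(scores, axis=None)[::-1][:5]
for r, t in zip(*np.unravel_index(flat, scores.shape)):
    print(f"gene {regulators[r]} -> gene {t}: {scores[r, t]:.2f}")
```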