Fig. 1: Characterization of features extracted by deep-metric learning (DML).
From: Heterogeneity-preserving discriminative feature selection for disease-specific subtype discovery

a A schematic of the DML approach with triplet loss, which is used to analyze the feature space. In this method, a UMAP embedding is generated in which each point represents a feature from a specific condition. Clustering is then applied in the UMAP space. The resulting clusters and the original data are used as input for the triplet loss in DML. The embeddings from the encoder are used to calculate the Euclidean distance between the same feature in different classes, where \({\overrightarrow{l_{1}^{i}}}\) and \({\overrightarrow{l_{2}^{i}}}\) are the embeddings of gene i in the case and control conditions, respectively, and \({d}_{{\overrightarrow{l_{1}^{i}}},{\overrightarrow{l_{2}^{i}}}}\) is the Euclidean distance between these embeddings. Scatter plots of feature embeddings for the Patel data, colored by the distance between the same feature in different conditions (b), the log-transformed p-values for mean differences between conditions (z-test) (c), the IQR difference between conditions (d), and the location of the respective feature types on the plot (e). The p-values in c are adjusted p-values: raw p-values from a two-sided, two-sample z-test comparing the means of the distributions were adjusted with the Benjamini & Hochberg method, and a significance level of 0.01 was applied. f Clustering performance on the Patel data, measured with the adjusted Rand index (ARI) and V-measure (VM), based on DE, HV, IQR Diff., and HD features. g, h, i, j Heat maps of clustering results on the Patel data based on DE, HV, IQR Diff., and HD features, respectively. UMAP visualizations of 1529 HV features (k) (dispersion threshold of >0.5), 1485 DE features (l) (z-test at a significance level of 0.01), 904 ΔIQR features (m) (threshold of >0.4), and 468 HD features (n), which are the intersection of the DE and ΔIQR features.
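The per-feature quantities named in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the array layouts (samples in rows, features in columns), the use of the sample variance in the z-statistic, and the absolute value in the IQR difference are all assumptions made for illustration. Only the 0.01 and 0.4 thresholds come from the caption.

```python
import math
import numpy as np

def embedding_distance(l1, l2):
    """Euclidean distance between the embeddings of the same feature in two
    conditions. l1, l2: arrays of shape (n_features, embed_dim)."""
    return np.linalg.norm(np.asarray(l1, float) - np.asarray(l2, float), axis=1)

def two_sample_z_pvalues(case, control):
    """Two-sided, two-sample z-test on the means, one p-value per feature.
    case, control: arrays of shape (n_samples, n_features)."""
    case, control = np.asarray(case, float), np.asarray(control, float)
    # Standard error of the mean difference (assumes sample variances, ddof=1).
    se = np.sqrt(case.var(axis=0, ddof=1) / case.shape[0]
                 + control.var(axis=0, ddof=1) / control.shape[0])
    z = (case.mean(axis=0) - control.mean(axis=0)) / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvalues, float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest rank.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

def iqr_difference(case, control):
    """Absolute difference in interquartile range per feature (the absolute
    value is an assumption; the caption only says 'IQR difference')."""
    def iqr(x):
        return np.percentile(x, 75, axis=0) - np.percentile(x, 25, axis=0)
    return np.abs(iqr(np.asarray(case, float)) - iqr(np.asarray(control, float)))

def hd_feature_indices(case, control, alpha=0.01, iqr_threshold=0.4):
    """Indices of features that are both DE (adjusted p < alpha) and
    above the IQR-difference threshold, i.e. their intersection."""
    adjusted = benjamini_hochberg(two_sample_z_pvalues(case, control))
    return np.flatnonzero((adjusted < alpha)
                          & (iqr_difference(case, control) > iqr_threshold))
```

With expression matrices `case` and `control`, `hd_feature_indices(case, control)` would return the column indices that pass both filters. The clustering metrics in panel f are available in scikit-learn as `adjusted_rand_score` and `v_measure_score`, should a comparison against known labels be needed.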