Fig. 5: T-distributed Stochastic Neighbor Embedding (t-SNE) visualization of pairwise distances of 3000 randomly selected diagnostic slides from six different primary sites. | npj Digital Medicine

From: Pan-cancer diagnostic consensus through searching archival histopathology images using artificial intelligence

These primary sites were selected to span the top, average, and worst accuracy in Table 2: lung and brain (top two), kidney and liver (middle two), and lymph nodes and pleura (bottom two). Six areas, each containing a majority of points from the same cancer subtype, are labeled with the letters a, b, c, d, e, and f. Random slides from the majority cancer subtype within each labeled area are shown in the Samples box (gray background). The outliers (slides not belonging to the majority cancer subtype or primary site) are shown in the Outliers box (red outline). For example, area a contains mostly scans from brain with glioblastoma multiforme (GBM), whereas its outliers are from lymph nodes with diffuse large B-cell lymphoma (DLBC). Without any explicit training, our technique preserves the semantic categories within the diagnostic slides, as shown by the t-SNE plot of the pairwise distances. The kidney, liver, and brain form distinct, isolated groups, whereas lung, pleura, and lymph nodes are intermixed with each other.
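
The split between Samples and Outliers described in the caption can be illustrated with a minimal sketch. It is not the authors' code: the function name `split_majority_and_outliers`, the tuple layout, and the slide IDs are all hypothetical, and the example data only mirrors the GBM/DLBC case mentioned for area a. It assumes the slides falling inside one labeled area have already been collected.

```python
from collections import Counter


def split_majority_and_outliers(slides):
    """Split the slides of one labeled area into majority samples and outliers.

    `slides` is a list of (slide_id, primary_site, subtype) tuples
    (a hypothetical layout chosen for this illustration).
    """
    # Count how often each (primary site, subtype) pair occurs in the area.
    counts = Counter((site, subtype) for _, site, subtype in slides)
    majority, _ = counts.most_common(1)[0]

    # Slides matching the majority pair are samples; everything else is an outlier.
    samples = [s for s in slides if (s[1], s[2]) == majority]
    outliers = [s for s in slides if (s[1], s[2]) != majority]
    return majority, samples, outliers


# Hypothetical contents of area "a"; the slide IDs are invented.
area_a = [
    ("slide-001", "brain", "GBM"),
    ("slide-002", "brain", "GBM"),
    ("slide-003", "brain", "GBM"),
    ("slide-004", "lymph nodes", "DLBC"),
]

majority, samples, outliers = split_majority_and_outliers(area_a)
print("majority:", majority)
print("outliers:", [s[0] for s in outliers])
```

Keying the count on the (primary site, subtype) pair follows the caption's definition of an outlier as a slide that differs from the majority in either the subtype or the primary site.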
