Table 1 Transcriptional features commonly or potentially used to identify malignant cells from scRNA-seq data

From: Identification of malignant cells in single-cell transcriptomics data

Commonly used features

| Feature/aberration | Readout | Comments |
| --- | --- | --- |
| Expression of cell-of-origin-marker genes | Gene signature score | Not sufficient to distinguish between normal and malignant cells of the same type; usually combined with other features |
| Inter-patient tumor heterogeneity | Index of cluster mixing (e.g. LISI score, entropy) | Requires multiple samples; may be confounded by batch effects |
| Copy-number alterations | Copy-number profile/aneuploidy score | Requires a reference of “normal” ploidy; will not detect malignant cells without chromosomal alterations |
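The entropy readout listed for inter-patient tumor heterogeneity can be illustrated with a minimal sketch. It assumes cells have already been clustered and that each cell carries a patient label; the function name `patient_entropy_by_cluster` and the toy labels are hypothetical. Clusters drawn from a single patient score near 0 (patient-specific, consistent with malignant cells), while well-mixed clusters score near 1. This is a per-cluster simplification and is not the LISI score, which is computed on local neighborhoods.

```python
import math
from collections import Counter, defaultdict


def patient_entropy_by_cluster(clusters, patients):
    """Normalized Shannon entropy of patient labels within each cluster.

    Returns a dict cluster -> value in [0, 1]; 0 means the cluster
    contains cells from one patient only, 1 means all patients are
    equally represented. (Hypothetical helper, for illustration.)
    """
    n_patients = len(set(patients))
    counts_by_cluster = defaultdict(Counter)
    for cluster, patient in zip(clusters, patients):
        counts_by_cluster[cluster][patient] += 1

    scores = {}
    for cluster, counts in counts_by_cluster.items():
        total = sum(counts.values())
        # Shannon entropy written as sum of p * log(1/p).
        entropy = sum(
            (n / total) * math.log(total / n) for n in counts.values()
        )
        # Normalize by the maximum possible entropy, log(number of patients).
        scores[cluster] = entropy / math.log(n_patients) if n_patients > 1 else 0.0
    return scores


# Toy data (made up): one mixed cluster and two patient-specific clusters.
clusters = ["T", "T", "T", "T", "tumor_A", "tumor_A", "tumor_B", "tumor_B"]
patients = ["A", "B", "A", "B", "A", "A", "B", "B"]
scores = patient_entropy_by_cluster(clusters, patients)
```

As the table notes, a low mixing score alone is not conclusive: batch effects can also produce patient-specific clusters.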

 

Supporting features

| Feature/aberration | Readout | Comments |
| --- | --- | --- |
| Single-nucleotide alterations and mutational burden | Mutations in known sites/total number of mutations | Works best when combined with WES of matched samples; limited by the low coverage of scRNA-seq technologies |
| Formation of fusion transcripts | Expression of fused genes | Specific to individual cancer types; limited by the low coverage of scRNA-seq technologies |
| Sustained proliferation | Signature score for cycling gene sets | Commonly measured as cycling enrichment by cluster |
| Pathway dysregulation | Signature score for altered pathway | Specific to individual cancer types |
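The "signature score" readout and its per-cluster summary for cycling gene sets can be sketched as follows. This is a simplified illustration, not the scoring used by any particular tool: the score is the mean expression of the signature genes minus the mean over all genes in the cell, whereas established implementations (for example Scanpy's `score_genes` or Seurat's `AddModuleScore`) subtract expression-matched control gene sets. The helper names, the threshold of 0, and the expression values are assumptions made for the example; `MKI67` and `TOP2A` stand in for a cycling gene set.

```python
from statistics import mean


def signature_score(cell_expr, signature):
    """Mean expression of signature genes minus the cell-wide mean.

    cell_expr: dict gene -> normalized expression for one cell.
    (Hypothetical helper; real tools use matched control genes.)
    """
    sig_values = [cell_expr[g] for g in signature if g in cell_expr]
    if not sig_values:
        return 0.0
    return mean(sig_values) - mean(list(cell_expr.values()))


def cycling_fraction_by_cluster(cells, clusters, signature, threshold=0.0):
    """Fraction of cells per cluster whose score exceeds the threshold."""
    totals, cycling = {}, {}
    for cell_expr, cluster in zip(cells, clusters):
        totals[cluster] = totals.get(cluster, 0) + 1
        if signature_score(cell_expr, signature) > threshold:
            cycling[cluster] = cycling.get(cluster, 0) + 1
    return {c: cycling.get(c, 0) / n for c, n in totals.items()}


# Toy normalized expression values (made up for illustration).
cells = [
    {"MKI67": 3.0, "TOP2A": 2.0, "ACTB": 1.0, "GAPDH": 1.0},
    {"MKI67": 0.0, "TOP2A": 0.0, "ACTB": 2.0, "GAPDH": 2.0},
    {"MKI67": 2.0, "TOP2A": 2.0, "ACTB": 1.0, "GAPDH": 1.0},
    {"MKI67": 0.0, "TOP2A": 1.0, "ACTB": 3.0, "GAPDH": 2.0},
]
clusters = ["c1", "c2", "c1", "c2"]
fractions = cycling_fraction_by_cluster(cells, clusters, ["MKI67", "TOP2A"])
```

The same scoring function applies to the other signature-based readouts in this table by swapping in a different gene set.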

 

Potentially discriminating features

| Feature/aberration | Readout | Comments |
| --- | --- | --- |
| MHC downregulation | Signature score for antigen-presenting machinery | Specific to individual cancer types, TMEs, or individual cancer sub-clones |
| Overexpression of checkpoint molecules | Checkpoint ligand expression | Limited evidence in scRNA-seq |
| Expression of telomerase subunits | Gene or signature score | Limited evidence in scRNA-seq |
| Metabolic signatures | Signature score | Adjacent normal cells may exhibit similar alterations |
| Pro-angiogenic signaling | Gene or signature score | Limited evidence in scRNA-seq |
| Drivers of invasion (EMT) | Signature score | Intermediate EMT states may be difficult to capture |
| Oncofetal reprogramming | Gene or signature score | Specific to individual cancer types |
| Number of unique expressed genes | Gene count | Can be confounded by heterogeneous sequencing depth |
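The depth confound noted for the gene-count readout can be made concrete with a small sketch. One possible mitigation, assumed here and not prescribed by the table, is to downsample every cell to a common number of molecules before counting detected genes. The function names, the fixed seed, and the toy counts are hypothetical.

```python
import random


def n_genes_detected(counts):
    """Number of genes with at least one count in a cell."""
    return sum(1 for c in counts.values() if c > 0)


def downsample(counts, depth, seed=0):
    """Randomly keep `depth` molecules from a cell, without replacement.

    counts: dict gene -> integer count. Cells already at or below the
    target depth are returned unchanged. (Hypothetical helper.)
    """
    molecules = [gene for gene, c in counts.items() for _ in range(c)]
    if depth >= len(molecules):
        return dict(counts)
    rng = random.Random(seed)
    sampled = {}
    for gene in rng.sample(molecules, depth):
        sampled[gene] = sampled.get(gene, 0) + 1
    return sampled


# Toy cells (made up): same underlying profile, sequenced at different depths.
deep_cell = {"G1": 40, "G2": 30, "G3": 20, "G4": 6, "G5": 3, "G6": 1}
shallow_cell = {"G1": 4, "G2": 3, "G3": 2, "G4": 1}

target_depth = sum(shallow_cell.values())  # 10 molecules
deep_down = downsample(deep_cell, target_depth)

raw_difference = n_genes_detected(deep_cell) - n_genes_detected(shallow_cell)
matched_genes = n_genes_detected(deep_down)
```

Comparing `matched_genes` with the shallow cell's gene count, instead of relying on `raw_difference`, puts both cells on a comparable depth before the feature is interpreted.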