Fig. 3: Complementary determining region 3 (CDR3) sequence characteristics and cancer-type-specific patterns. | npj Precision Oncology

From: T-cell receptor clonotypic diversity and specialization in digestive system cancers

A Density plots showing the distribution of CDR3 sequence lengths for T-cell receptor (TCR) β (TRB, left), TCR δ (TRD, middle), and TCR γ (TRG, right) chains. Colors represent different cancer types. Differences between distributions were assessed using the Kolmogorov–Smirnov (KS) test. B Bubble plot showing cancer-type-specific differences in CDR3 lengths. Red indicates enrichment, and green indicates depletion, as described in Zhang et al. [39] (see Materials and methods). C Grouped bar plot showing the average frequency of amino acids across cancer types. Frequency differences between cancers were assessed using a two-sided t-test. D Bubble plot showing cancer-type-specific amino acid distributions. Red indicates enrichment, and green indicates depletion. E Violin plot combined with box plot showing the distribution of conservation indices across cancer types (see Materials and methods). The P value was calculated using a two-sided t-test. F Multiple sequence alignment of the top k most abundant CDR3 sequences in each cancer type. Gaps are denoted by “—”, and “X” indicates unidentified amino acids. G CDR3 sequence logo showing amino acid frequencies at each position (x-axis: positions 1–16; y-axis: information content in bits). Amino acids are depicted as letters, with their size proportional to frequency. Colors indicate chemical properties: acidic (red), basic (blue), hydrophobic (black), neutral (purple), and polar (green). H Heatmap showing cancer-specific CDR3 motifs in colorectal cancer (CRC) and gastric cancer (GC) patients. Redder shades indicate higher motif abundance. Rows and columns were hierarchically clustered using the “complete” method. I Heatmap showing the starting positions of cancer-specific motifs within CDR3 sequences. Redder shades indicate higher relative frequencies. Tau values represent the conservation index of motifs (see Materials and methods).
J Violin plots combined with box plots depicting the clonal fraction distributions of cancer-specific motifs in CRC and GC. P values were calculated using two-sided Wilcoxon tests. K Bubble plot showing cancer-type-specific differences in conserved motif-associated V-J combinations. Red indicates enrichment, and green indicates depletion. L Venn diagram showing unique and shared antigen genes in CRC (purple) and GC (pink), based on VDJdb [28] entries with vdjdb.score > 1. M Bar plot comparing log₂-transformed frequencies of antigen genes in CRC and GC from VDJdb (vdjdb.score > 2). Bars are dodged. N, O Heatmaps showing the frequency of antigen genes versus TCR VJ combinations in (N) GC and (O) CRC from VDJdb (vdjdb.score > 2). Rows represent antigen genes; columns represent VJ pairs. Frequencies are row-scaled and hierarchically clustered. P Network plot illustrating interactions among shared antigen genes (CRC and GC), major histocompatibility complex (MHC) alleles (mhc.a), and TCR VJ combinations from VDJdb (vdjdb.score > 2). Nodes represent genes, MHC alleles, and VJ pairs; edge width reflects interaction frequency, and node size indicates connectivity.
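
For readers unfamiliar with the y-axis in panel G, the following is a minimal sketch of how per-position information content (in bits) is commonly computed for a protein sequence logo. It is not the authors' pipeline: the function name `information_content`, the toy sequences, and the choice to omit the small-sample correction that some logo tools apply are all assumptions made for illustration. Gap and "X" symbols are not treated specially here.

```python
import math
from collections import Counter


def information_content(seqs, alphabet_size=20):
    """Return per-position information content (bits) for aligned sequences.

    Hypothetical helper for illustration only. Uses the plain Shannon form
    IC = log2(alphabet_size) - H(position), with no small-sample correction.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    max_bits = math.log2(alphabet_size)  # about 4.32 bits for 20 amino acids
    bits = []
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs)
        total = sum(counts.values())
        # Shannon entropy of the residue distribution at this position
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        bits.append(max_bits - entropy)
    return bits


# Toy CDR3-like fragments (invented, not data from the paper)
toy = ["CASSL", "CASSF", "CASRL", "CAWSL"]
bits = information_content(toy)
```

A fully conserved position (such as the leading "C" in the toy set) reaches the maximum of log₂(20) bits, while more variable positions score lower; within each position, letter height in a logo is the residue frequency multiplied by that position's information content.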
