Figure 1
From: Harnessing the tissue and plasma lncRNA-peptidome to discover peptide-based cancer biomarkers

Integrated computational workflow. The workflow is divided into several stages. First, the nucleotide sequences of 23,898 long non-coding RNA (lncRNA) transcripts were retrieved from GENCODE V30 (GRCh37.p13) and translated in silico in three reading frames to obtain the hypothetical polypeptide sequences they encode. These sequences were assembled into a custom FASTA database, into which the canonical human reference proteome sequences were also integrated. Second, LC-MS/MS raw files originating from 14 human tissues, 11 cell lines, 92 colon cancer (COAD) samples, 30 normal colon samples, prostate cancer tissues and adjacent tumor-free tissues from three patients, and plasma samples from prostate cancer patients and healthy individuals were retrieved from the PRIDE database. The raw files were re-processed with MaxQuant (version 1.6.0.1), using the custom-built FASTA database for peptide-spectrum matching (PSM) in the Andromeda search. The identified lncRNA peptides were matched against the human proteome, and only the peptides without a match were retained. Finally, these peptides/polypeptides were used for downstream bioinformatics analysis.