Fig. 1: We performed a rigorous quality control (QC) of whole-exome sequencing (WES) data published for 654 canine cases. | Nature Communications

From: Canine tumor mutational burden is correlated with TP53 mutation across tumor types and breeds

a Distributions of total read pairs per sample for the tumor and normal sample sets of each study. Each dot represents a sample, and the median is indicated by a black line. The dashed line specifies the QC cutoff. Each study is represented by the tumor type and the institute name. MT mammary tumor, GLM glioma, BCL B-cell lymphoma, TCL T-cell lymphoma, OM oral melanoma, OSA osteosarcoma, HSA hemangiosarcoma, UCL unclassified. CUK Catholic University of Korea, SNU Seoul National University, JL Jackson Laboratory, SI Sanger Institute, BI Broad Institute, TGen Translational Genomics Research Institute, UPenn University of Pennsylvania. n = 184, 20, 56, 61, 39, 65 (71 tumors), 66, 12, 47, 21 (23 tumors), and 83 independent cases with matched normal and tumor samples for each independent study, listed from left to right.

b–f Distributions of the per-sample rate of read pairs that aligned concordantly and uniquely to the canFam3 reference genome (b) (n = 81 for UCL BI; others the same as a), the fraction of reads with mapping quality of ≥30 (c) (n = 50 and 18 for GLM JL and HSA UPenn, respectively; others the same as b), CDS-targeting rate (the fraction of read pairs that align concordantly and uniquely to the canFam3 CDS regions) (d) (the same sample size as c), mean read coverage in CDS regions (e) (n = 60, 38, and 80 for BCL BI, TCL BI, and UCL BI, respectively; others the same as d), and root-mean-square error (RMSE) between the actual distribution and the theoretical distribution (based on the Poisson distribution) of sequence coverage in CDS regions (f) (n = 183, 49, 58, 43, 8, and 74 for MT CUK, GLM JL, BCL BI, HSA BI, HSA UPenn, and UCL BI, respectively; others the same as e).

g Distributions of the total number of callable bases per case, determined by MuTect. n = 183, 20, 49, 58, 38, 71, 66, 12, 42, 8, and 74 independent tumors from left to right.

h Tumor-normal pairing accuracy. “Self” (in green) is the fraction of germline variants shared between the normal and tumor samples of a dog. “Best nonself” is the fraction of germline variants shared between a normal or tumor sample of one dog and its best-matched sample from another dog. “Self − Best nonself” (in purple) is the difference between the two; a negative difference points to incorrect tumor-normal pairing. The sample size is the same as in (g).

Source data are provided as a Source Data file.
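As an illustration only, the coverage RMSE of panel f and the pairing check of panel h could be sketched in Python as below. This is not the authors' pipeline. The caption does not define either metric in detail, so two assumptions are made here: the Poisson mean is set to the sample's mean depth, and the "fraction of germline variants shared" is taken as intersection over union. All function and variable names are hypothetical.

```python
import math
from collections import Counter


def poisson_pmf(k, lam):
    """Poisson probability of observing depth k given mean depth lam."""
    if lam == 0:
        return 1.0 if k == 0 else 0.0
    # Work in log space so large depths do not overflow the factorial.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def coverage_poisson_rmse(depths):
    """RMSE between the observed depth distribution and a Poisson fit.

    Assumption: the Poisson mean is the sample's mean depth, and the error
    is averaged over depths 0..max(depths).
    """
    if not depths:
        raise ValueError("depths must not be empty")
    n = len(depths)
    lam = sum(depths) / n
    counts = Counter(depths)
    max_depth = max(depths)
    squared_error = 0.0
    for k in range(max_depth + 1):
        observed = counts.get(k, 0) / n
        squared_error += (observed - poisson_pmf(k, lam)) ** 2
    return math.sqrt(squared_error / (max_depth + 1))


def shared_fraction(a, b):
    """Assumed definition of 'fraction shared': |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairing_check(variants, pairs):
    """Return {dog: (self, best_nonself, self - best_nonself)}.

    variants: sample id -> set of germline variant keys
    pairs:    dog id -> (normal sample id, tumor sample id)
    """
    owner = {sample: dog for dog, ids in pairs.items() for sample in ids}
    result = {}
    for dog, (normal, tumor) in pairs.items():
        self_frac = shared_fraction(variants[normal], variants[tumor])
        # Best match of either of this dog's samples against any other dog's sample.
        best_nonself = max(
            (
                shared_fraction(variants[own], variants[other])
                for own in (normal, tumor)
                for other, other_dog in owner.items()
                if other_dog != dog
            ),
            default=0.0,
        )
        result[dog] = (self_frac, best_nonself, self_frac - best_nonself)
    return result
```

With per-sample variant sets such as `{"A_n": {...}, "A_t": {...}}` and a mapping `{"A": ("A_n", "A_t")}`, a dog whose third tuple element is negative would be flagged as a possible tumor-normal mismatch, mirroring the interpretation given for panel h.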
