Extended Data Fig. 2: Detection of duplicated individuals and confounders of RNA-Seq samples. | Nature Genetics

From: A compendium of genetic regulatory effects across pig tissues

a, Distribution of identity-by-state (IBS) distances among 7,008 RNA-Seq samples, calculated using 12,207 LD-independent SNPs (r² < 0.2). b, Density of IBS distances computed from genotypes derived from RNA-Seq only and from whole-genome sequence (WGS) or SNP array data for the same individuals (n = 227). c, Heatmap of IBS distances among 25 RNA-Seq samples from 9 individuals. Identical colors at the top of the panel mark samples from the same individual. True: true individual label; Assigned: individual label assigned using an IBS distance cutoff of 0.9. d, Pearson’s correlation (r) between IBS distances calculated from imputed genotypes and those calculated from WGS or SNP array data across four populations. L×Y: Landrace and Yorkshire cross breed (n = 25); Duroc×DNXE: Duroc and Diannanxiaoer cross breed (n = 11); Duroc: Duroc pure breed (n = 37); D×L×Y: composite population with 1/4 Duroc, 1/2 Landrace and 1/4 Yorkshire (n = 179). e, Duplicated and remaining individuals in each of the 34 pig tissues used for molecular QTL mapping. Sample pairs with IBS > 0.9 were considered duplicated individuals. f, Proportion of variance explained (PVE) by genotype principal components (PCs) in each of the 34 tissues (one line per tissue). g, Factor weight variance of probabilistic estimation of expression residuals (PEER) factors in each of the 34 tissues (one line per tissue). h, Proportion of variance (adjusted R²) of known confounders captured by the top 10 inferred PEER factors, calculated using the lm function in R (v4.0.2).
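
The duplicate-detection rule in panels c and e can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes genotypes coded as 0/1/2 alternate-allele counts with no missing calls, and it assumes the IBS value is the proportion of alleles shared between two samples, so that values above the 0.9 cutoff indicate the same individual. The function names `ibs_similarity` and `assign_individuals` and the toy genotypes are hypothetical.

```python
import numpy as np


def ibs_similarity(genotypes):
    """Pairwise proportion of alleles shared identical-by-state.

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts
    (assumed coding, no missing values).
    """
    g = np.asarray(genotypes, dtype=float)
    n_snps = g.shape[1]
    # At each SNP, two samples share 2 - |g_i - g_j| alleles out of 2.
    diff = np.abs(g[:, None, :] - g[None, :, :]).sum(axis=2)
    return 1.0 - diff / (2.0 * n_snps)


def assign_individuals(ibs, cutoff=0.9):
    """Label samples so that pairs with IBS > cutoff share a label.

    Uses union-find, so samples linked through a chain of duplicates
    end up in one group. Each label is the smallest sample index
    in its group.
    """
    n = ibs.shape[0]
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if ibs[i, j] > cutoff:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[max(ri, rj)] = min(ri, rj)
    return [find(i) for i in range(n)]


# Toy data (hypothetical): samples 0 and 1 differ by one allele at one SNP,
# sample 2 comes from a different individual.
toy = [
    [0, 1, 2, 1, 0, 2, 1, 0, 2, 1],
    [0, 1, 2, 1, 0, 2, 1, 0, 2, 2],
    [2, 1, 0, 1, 2, 0, 1, 2, 0, 1],
]
ibs = ibs_similarity(toy)
labels = assign_individuals(ibs, cutoff=0.9)
print(labels)
```

The broadcasted difference needs memory proportional to n_samples² × n_snps, so for thousands of samples the pairwise computation would have to be chunked or delegated to a dedicated genetics tool.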
