Fig. 1

GTEx_Pro Preprocessing Workflow for Downstream Analysis of GTEx Tissues. This diagram (created in BioRender.com) illustrates the preprocessing steps that GTEx_Pro applies to GTEx tissue datasets. The workflow begins by downloading the raw read count matrix, which is then manipulated and filtered to pre-select a specific set of genes for further processing. Quality control follows: missing or invalid values (Inf/NaN) are imputed and potential outliers are removed manually. The data are then normalized using the TMM and CPM methods, and batch effects are addressed using SVA with sex as a covariate. PCA is used to examine the variance explained by the principal components and the quality of tissue clustering. The preprocessed data can subsequently be used for downstream analyses, such as tissue-specific expression studies.
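
As a rough illustration of two of the steps named in the caption, the sketch below computes CPM values from a small made-up count matrix and then the fraction of variance explained by each principal component. It is not the GTEx_Pro implementation: the function names `cpm` and `pca_explained_variance`, the toy counts, and the `log2(x + 1)` transform are assumptions made for this example, and the TMM scaling factors and the SVA correction are omitted entirely.

```python
import numpy as np

def cpm(counts):
    """Counts per million: scale each sample (column) by its library size."""
    counts = np.asarray(counts, dtype=float)
    lib_sizes = counts.sum(axis=0)  # total reads per sample; assumed non-zero
    return counts / lib_sizes * 1e6

def pca_explained_variance(expr):
    """Fraction of variance explained by each principal component.

    `expr` is genes x samples; PCA is run on the samples.
    """
    x = np.asarray(expr, dtype=float).T      # samples x genes
    x = x - x.mean(axis=0)                   # center each gene
    s = np.linalg.svd(x, compute_uv=False)   # singular values, descending
    var = s ** 2
    return var / var.sum()

# Hypothetical toy matrix: 3 genes (rows) x 3 samples (columns).
counts = np.array([
    [10,  20,  0],
    [30,  60,  5],
    [60, 120, 95],
])

norm = cpm(counts)
log_norm = np.log2(norm + 1)                 # assumed variance-stabilizing step
ratios = pca_explained_variance(log_norm)
print(ratios)
```

In this toy matrix the second sample is the first sequenced at twice the depth, so the two have identical CPM values; removing that kind of depth difference is what library-size normalization is for.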