Fig. 1: Network analysis pipeline (see “Methods” for details). | Nature Communications

From: A network analysis to identify mediators of germline-driven differences in breast cancer prognosis

a Cox models were used to estimate the association between each genetic variant and breast cancer-specific survival in 84,457 patients of the Breast Cancer Association Consortium (BCAC) dataset (discovery set).

b The P values from the survival analyses of the genetic variants (blue diamonds) were used to compute gene scores with the Pascal algorithm. These gene scores were based on the maximum chi-squared signal within a 50-kb window around the gene region and accounted for linkage disequilibrium structure (depicted in a gradient blue scale).

c The HotNet2 method was used to identify gene modules based on the −log10 P value of the computed gene scores.

d The modules found by HotNet2 were filtered to obtain a selection of high-confidence germline-related prognostic modules (GRPMs). For each module, we constructed a polygenic hazard score (PHS) summarizing the prognostic effects of a set of selected genetic variants in the module. We then tested the association of this PHS with survival in both the discovery set (gray) and the independent set (orange).

e We functionally characterized the high-confidence GRPMs by studying their downstream transcriptional effects, using genotype and expression data from The Cancer Genome Atlas (TCGA). We computed the correlation between each GRPM’s polygenic hazard score and the expression of all available genes. Based on these correlation values, a gene set enrichment analysis identified biological processes that were enriched among the genes most correlated with the prognostic variants in the GRPM.
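The scoring and correlation steps in panels d and e can be illustrated with a minimal sketch. It assumes, as is common for such scores but not stated in the legend, that the PHS is a weighted sum of allele dosages with per-variant log hazard ratios as weights, and that the correlation is a Pearson correlation. The function names, the toy dosages, the weights, and the expression values are all hypothetical and are not taken from the study.

```python
import numpy as np


def polygenic_hazard_score(dosages, log_hazard_ratios):
    """Weighted sum of allele dosages (assumed form of a PHS).

    dosages: array of shape (n_patients, n_variants), values 0..2
    log_hazard_ratios: array of shape (n_variants,), e.g. Cox model coefficients
    """
    dosages = np.asarray(dosages, dtype=float)
    betas = np.asarray(log_hazard_ratios, dtype=float)
    return dosages @ betas


def correlate_with_expression(phs, expression):
    """Pearson correlation between the PHS and each gene's expression.

    expression: array of shape (n_patients, n_genes)
    Returns an array of shape (n_genes,).
    """
    phs = np.asarray(phs, dtype=float)
    expression = np.asarray(expression, dtype=float)
    phs_centered = phs - phs.mean()
    expr_centered = expression - expression.mean(axis=0)
    denom = np.sqrt((phs_centered ** 2).sum() * (expr_centered ** 2).sum(axis=0))
    return (phs_centered @ expr_centered) / denom


# Hypothetical toy data: 4 patients, 3 variants in one module.
dosages = np.array([[0, 1, 2],
                    [1, 1, 0],
                    [2, 0, 1],
                    [0, 2, 2]])
betas = np.array([0.10, -0.05, 0.20])  # made-up log hazard ratios
phs = polygenic_hazard_score(dosages, betas)

# Hypothetical expression matrix: 3 genes, two of them built from the PHS.
expression = np.column_stack([
    2.0 * phs + 1.0,                  # perfectly positively correlated
    -phs,                             # perfectly negatively correlated
    np.array([1.0, 2.0, 1.5, 0.5]),   # unrelated toy values
])
correlations = correlate_with_expression(phs, expression)

# Genes ordered from most positively to most negatively correlated;
# a ranking like this is the typical input to a gene set enrichment analysis.
ranking = np.argsort(-correlations)
```

In practice the ranked list would cover all genes with available expression data and would be passed to an enrichment tool; that step is omitted here.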
