Abstract
DNA double-strand breaks (DSBs) can lead to genomic instability in cancer. Cells rely on an efficient DNA damage response (DDR) to maintain their DNA integrity and prevent oncogenic transformation. However, the early events that connect recurrent DNA damage to oncogenesis are not yet fully understood. Here, using next-generation sequencing, we comprehensively surveyed genomes to identify DSBs in primary cells of non-malignant carriers of BRCA1 and BRCA2 mutations (BRCAmut), categorized as high-risk patients, to characterize the effects of homologous recombination (HR) loss on cancer initiation. We demonstrate that the landscape of physiological DSBs in BRCAmut mammary epithelial cells differs from that of healthy controls and more closely resembles the DSB pattern observed in breast cancer cells. Our results reveal that proto-oncogenes and tumor suppressors contain more breaks in BRCAmut samples, and that genes with a high number of DSBs tend to be more highly expressed. These genes containing a high number of DSBs are also often mutated in breast cancer tumors. Finally, genes with a high number of DSBs in mammary epithelial cells from women with BRCAmut exhibit a strong correlation with homologous recombination repair. Together, our findings underscore the impact of BRCA loss on the early stages of carcinogenesis and highlight future possibilities for early cancer detection.

When BRCA is intact, genes that are highly broken are properly repaired via HR, preserving DNA integrity. When BRCA is mutated and its function impaired, highly broken genes carrying transcriptional DSBs emerge that can no longer be efficiently repaired via HR; these genes are related to cancer signaling. The breakome of the high-risk model resembles the breast cancer breakome, and breaks are found in genes known to be frequently mutated in breast cancer.
Introduction
Genomic instability plays a major role in tumorigenesis and is considered a hallmark of biologically aggressive cancers [1, 2]. Cells are exposed to a myriad of endogenous (i.e., physiological [3]) and exogenous mechanisms that can result in DNA damage. Factors that can increase DNA damage include 1) replication and transcriptional stress, 2) reactive oxygen species (ROS), 3) UV radiation, and 4) other biological and chemical agents [4,5,6]. The most deleterious form of DNA damage is the DNA double-strand break (DSB) [7]; DSBs can lead to severe DNA aberrations and are implicated in cancer. Several DNA damage response pathways are responsible for maintaining genome integrity; DSBs specifically are repaired by homologous recombination (HR) and non-homologous end joining (NHEJ) [8]. Loss of, or mutations in, DNA repair proteins are known to cause genomic instability and tumor initiation [9,10,11].
Genomic instability is tightly associated with breast cancer initiation and progression [12, 13]. Approximately 10% of breast cancer is inherited [14, 15]; women with deleterious germline BRCA1 or BRCA2 mutations have a lifetime 60–70% risk of breast cancer [14]. BRCA1mut carriers most frequently develop highly aggressive triple-negative breast cancer (ER-/PR- HER2-wild type; TNBC) while BRCA2mut carriers most frequently develop luminal (ER+) breast cancers [16, 17].
Recent efforts have focused on early detection of breast cancer. Next-generation sequencing techniques were used to define new cancer genes, mutations, and gene rearrangements [18, 19]. DNA damage has been identified through indirect methods, such as staining for DNA-binding damage markers, like γH2AX [20]. While these methods are effective for quantifying relative damage levels, they do not identify specific break sites or the frequencies of specific breaks in the genome. A partial solution is chromatin immunoprecipitation-sequencing (ChIP-seq); however, ChIP-seq remains an indirect method for identifying DNA breaks and lacks high resolution [21].
Sequencing techniques are being pioneered to directly identify DNA breaks at high resolution [3, 22, 23]. These approaches show that the landscape of DNA breaks, henceforth referred to as the breakome, 1) is not random and 2) can be used to understand a multitude of DNA breakage and repair processes. Processes that can be better understood include 1) replication mechanisms and 2) high transcriptional stress and its resolution. In-suspension Breaks Labeling in-situ and Sequencing (sBLISS) is a next-generation sequencing (NGS) method that has been developed to detect DNA DSB frequencies at nucleotide resolution [24,25,26]. Previous work in our lab using this method helped define the oncogenic role of DNA DSBs in genes that regulate oncogenic transcription, by inducing repair in topoisomerase I-regulated transcriptionally active sites, thereby supporting the high burden of oncogenic hypertranscription [27, 28].
Although the connection between mutations in DNA repair factors and oncogenesis has been long established [29], how exactly the loss of BRCA protein function impacts cancer initiation remains unclear. BRCA mutations have been correlated with various known driver events, such as TP53 mutations [30,31,32]; it is speculated that impaired DNA repair may be responsible for initiating this chain of oncogenic events. Here we investigated how specific BRCA mutations impact early changes in DNA break pattern that lead to carcinogenesis. To accomplish this, we collected primary mammary epithelial cells from women with a range of deleterious BRCA1 and BRCA2 mutations [33] (Supplementary Table 1). Healthy control samples were obtained from women who are not carriers of deleterious germline DNA mutations, and defined as average risk. Using these samples, we mapped and characterized the pattern of DNA DSBs or “breakome”, and analyzed how specific BRCA1 or BRCA2 mutations may impact early mammary carcinogenesis, and ultimately could play a role in early cancer detection.
Results
Normal and BRCA-mutated primary cells exhibit different break patterns
Given the role of BRCA in HR repair, we hypothesized that the break pattern of BRCAmut cells would shift toward an oncogenesis-promoting pattern compared to the normal breast breakome. To understand the effects of BRCAmut, we subjected our primary cell samples to the sBLISS method [24,25,26] and set out to characterize the breakome of high-risk BRCA mutation carriers relative to average-risk controls, without treatment with DNA damage inducers. In total, we analyzed the breakomes of eight heterozygous BRCA1 high-risk mutation carriers, four heterozygous BRCA2 high-risk mutation carriers, three PALB2 high-risk mutation carriers and seven healthy average-risk controls, all prepared in technical duplicates and previously confirmed for heterozygosity [34]. Tissue samples were collected from young (≤35 years, 4 normal, 5 BRCAmut), “middle aged” (36–54 years, 2 normal, 5 BRCAmut) and older (≥55 years, 1 normal, 2 BRCAmut) women of diverse heritages. Samples were labeled with a prefix of either N, B1, or B2, referring to samples that are normal, BRCA1 mutated or BRCA2 mutated, respectively (Supplementary Table 1). A global view of breaks across the genome shows that patterns are largely similar (Supplementary Fig. 1A). However, BRCA mutated samples tend to be more similar to each other than to wild-type samples, as is evident from Pearson correlation (Supplementary Fig. 1B, C) and principal component analysis (Fig. 1A). Hierarchical clustering of the whole genome (5677 bins of 0.5 Mb each) according to Spearman correlation identifies two clusters, separating average-risk and high-risk samples almost entirely (Fig. 1B). While many genomic bins are visibly more broken in high-risk samples, there is also a considerable number of bins that are more broken in the average-risk samples. Together, we concluded that the breakomes of average-risk and BRCAmut high-risk breast epithelia are distinct from one another.
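The genome-wide comparison described here rests on two steps: counting breaks in fixed 500 kb bins and correlating the binned profiles between samples. The following is a minimal sketch of those two steps, not the pipeline used in the study. The input format (a list of chromosome and position pairs), the normalization by total break count, and all function names are assumptions made for illustration.

```python
from collections import Counter

BIN_SIZE = 500_000  # 500 kb bins, matching the bin size stated in the text


def bin_breaks(breaks, bin_size=BIN_SIZE):
    """Count breaks per (chromosome, bin index); `breaks` is a list of (chrom, pos)."""
    return Counter((chrom, pos // bin_size) for chrom, pos in breaks)


def binned_profile(counts, bins):
    """Fraction of a sample's breaks in each bin (assumed normalization)."""
    total = sum(counts.values())
    return [counts.get(b, 0) / total for b in bins]


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy break lists for two hypothetical samples.
sample_a = [("chr1", 100), ("chr1", 600_000), ("chr1", 650_000), ("chr2", 10)]
sample_b = [("chr1", 200), ("chr1", 700_000), ("chr1", 900_000),
            ("chr1", 999_999), ("chr2", 5), ("chr2", 1_200_000)]

counts_a, counts_b = bin_breaks(sample_a), bin_breaks(sample_b)
all_bins = sorted(set(counts_a) | set(counts_b))
rho = spearman(binned_profile(counts_a, all_bins), binned_profile(counts_b, all_bins))
```

A matrix of such pairwise correlations is the kind of input that hierarchical clustering of samples would operate on.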
A Principal component analysis of the breakome across the genome, binned in 500 kbp bins. The analysis demonstrates that BRCA-mutated samples exhibit different patterns than average-risk samples. B Heatmap of normalized breaks per 500 kbp bin. Rows (bins) and columns (samples) are clustered according to their Spearman correlation. The right cluster contains almost only average-risk samples, highlighting the unique characteristics of high-risk samples (left cluster). C Distribution of breaks across chromatin states; bars represent break density in each state and sample compared to the representation of each chromatin state in the genome. P values were calculated using Kruskal-Wallis similarity test between BRCAmut and Normal groups. D Breaks are enriched in genes that are early replicating. Time-of-Replication (TOR) analysis depicts gene break density across TOR categories from late replicating genes (left) to early replicating genes (right). P value was calculated using Kruskal-Wallis similarity test between categories one and four and between categories one and eight for each group (BRCA1mut, BRCA2mut, BRCAWT) individually. E Breaks are enriched in short genes. Gene-length analysis depicts break density across gene length categories from short genes (left) to long genes (right). P value was calculated using Kruskal-Wallis similarity test between categories one and eight and between categories five and eight for each group (BRCA1mut, BRCA2mut, BRCAWT) individually. F Breakome vs. expression analysis demonstrates higher break density for genes that are highly expressed. Genes in each expression category were plotted based on their break density levels, each dot representing the median break density per group per expression category. Statistics were measured using Wilcoxon test.
The breakome of primary cells with BRCA mutations is more pronounced in the genome’s open and active regions
To further understand the behavior of physiological DSBs, we analyzed the distribution of breaks in our models across chromatin states as characterized in HMECs. The distribution of observed DSBs relative to the expected bar, which represents the relative distribution of each state across the genome, revealed that breaks are overrepresented both at promoters and at repetitive states (Fig. 1C). Interestingly, the only chromatin state that showed a significant difference between high-risk and average-risk samples was active promoters (p < 0.03, Kruskal-Wallis similarity test), suggesting that breaks in BRCAmut are more affected by transcription. Subsequently, we examined the distribution of breaks across genes based on their time of replication. Genes were categorized into groups based on their time of replication, and breaks were counted per gene per sample. We found that early-replicating genes exhibited a higher break density compared to those categorized as late-replicating (Fig. 1D). Furthermore, we analyzed the distribution of breaks in genes relative to gene length. Interestingly, shorter genes showed significantly higher break density compared to longer ones (Fig. 1E). These findings further suggested that breaks concentrate in highly transcribed regions.
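The observed-versus-expected comparison across chromatin states can be expressed as a simple ratio: the fraction of breaks falling in a state divided by the fraction of the genome that state occupies. The sketch below illustrates that ratio only. The state names, break counts and genome coverage values are invented toy numbers, and `state_enrichment` is a hypothetical helper rather than a function from the study.

```python
def state_enrichment(break_counts, state_bp):
    """Observed/expected ratio of breaks per chromatin state.

    break_counts: breaks observed in each state.
    state_bp: base pairs (any consistent unit) covered by each state.
    A ratio above 1 means breaks are overrepresented in that state.
    """
    total_breaks = sum(break_counts.values())
    total_bp = sum(state_bp.values())
    enrichment = {}
    for state, bp in state_bp.items():
        observed = break_counts.get(state, 0) / total_breaks
        expected = bp / total_bp
        enrichment[state] = observed / expected
    return enrichment


# Toy values: promoters cover 10% of the genome but hold 40% of the breaks.
toy_state_bp = {"active_promoter": 10, "repetitive": 10, "heterochromatin": 80}
toy_breaks = {"active_promoter": 40, "repetitive": 20, "heterochromatin": 40}
toy_enrichment = state_enrichment(toy_breaks, toy_state_bp)
```

In this toy case the promoter state has a ratio of about 4 and heterochromatin about 0.5, which is the kind of contrast the bars in such a plot convey.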
To confirm our hypothesis, we tested whether the breakome correlates with gene expression, utilizing expression data from RNA-seq performed by Shalabi et al. [33] on luminal epithelial cells (LEPs) from the same donors. Genes were binned by their expression levels, starting from 1 for not expressed and going to 10 for highly expressed genes, and break density was measured for genes in each category. Our analysis revealed that higher gene expression correlated with increased break density (Fig. 1F). This phenomenon was significantly more prominent in high-risk samples compared to normal breast cells (p < 2 × 10⁻⁶, Wilcoxon test), and could also be similarly found when binning genes based on their break density and testing for expression distribution (Supplementary Fig. 2A, p < 1.7 × 10⁻⁸, Wilcoxon test), further highlighting the prominence of transcriptionally mediated breaks in BRCA mutated samples.
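The expression binning can be sketched as follows. The text states only that category 1 holds non-expressed genes and category 10 the most highly expressed; how the expressed genes were divided among categories 2 to 10 is not given, so the equal-sized rank-based split below is an assumption, as are the function names and toy values.

```python
from statistics import median


def expression_categories(expression, n_categories=10):
    """Category 1 for unexpressed genes; expressed genes split by rank into 2..n.

    The rank-based, equal-sized split is an assumed scheme for illustration.
    """
    categories = {g: 1 for g, value in expression.items() if value <= 0}
    expressed = sorted((g for g, value in expression.items() if value > 0),
                       key=lambda g: expression[g])
    n_bins = n_categories - 1
    for rank, gene in enumerate(expressed):
        categories[gene] = 2 + rank * n_bins // len(expressed)
    return categories


def median_density_by_category(categories, density):
    """Median break density of the genes in each expression category."""
    grouped = {}
    for gene, category in categories.items():
        grouped.setdefault(category, []).append(density[gene])
    return {category: median(values) for category, values in sorted(grouped.items())}


# Toy data: ten hypothetical genes whose break density rises with expression.
toy_expression = {f"g{i}": float(i) for i in range(10)}
toy_density = {f"g{i}": 2.0 * i for i in range(10)}
toy_categories = expression_categories(toy_expression)
toy_medians = median_density_by_category(toy_categories, toy_density)
```

One median per category and per sample group corresponds to the dots described for the breakome versus expression panel.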
DNA DSBs in BRCA-mutated cells align with cancer pathways from RNA-seq
Next, we shifted to a gene-centric analysis to identify genes more broken in BRCAmut samples, potentially due to pathological breaks. We identified 1233 differentially broken genes, comprising 714 more broken and 519 less broken genes (FDR < 5%, DESeq2; Fig. 2A; Supplementary Tables 2, 3). Many of the genes more broken in BRCAmut are known proto-oncogenes and tumor suppressor genes (Fig. 2B; p < 1.7 × 10⁻⁶, one-sided Fisher’s exact test; Supplementary Fig. 2B); notably, TP53, an established tumor suppressor, was found to be more broken and is also commonly mutated in BRCA1 and BRCA2 germline mutated breast cancer [35]. While more broken genes demonstrate higher breakage across the whole gene body, much of the differential breakage is concentrated at the promoter (Fig. 2C). To estimate the function of differentially broken genes, we performed gene set enrichment analysis using the Molecular Signatures Database (MSigDB). Genes more broken in high-risk samples were enriched for several hallmark gene sets, including MYC targets, UV response, P53 pathway, DNA repair, inflammation, and apoptosis (FDR < 25%; Supplementary Table 4; shown FDR < 1%, Fig. 2D). These pathways not only relate to processes and signaling in cancer, but they also correspond with the pathways enriched in the transcriptome of the same high-risk samples, as previously reported [33], thereby strengthening our earlier observations regarding the correlation between the breakome and gene expression.
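A one-sided Fisher’s exact test for enrichment of annotated genes (for example, proto-oncogenes and tumor suppressors) among a selected gene list is equivalent to an upper-tail hypergeometric probability. The sketch below shows that calculation under stated assumptions: the function name is hypothetical and the counts are toy numbers, not the gene counts from this study.

```python
from math import comb


def fisher_one_sided(k, n_selected, n_annotated, n_total):
    """Upper-tail hypergeometric probability P(X >= k).

    k: annotated genes found in the selected list.
    n_selected: size of the selected list (e.g. more broken genes).
    n_annotated: annotated genes in the whole background.
    n_total: all genes in the background.
    """
    upper = min(n_selected, n_annotated)
    denominator = comb(n_total, n_selected)
    numerator = sum(
        comb(n_annotated, i) * comb(n_total - n_annotated, n_selected - i)
        for i in range(k, upper + 1)
    )
    return numerator / denominator


# Toy example: 20 genes, 5 annotated, 5 selected, 3 of the selected are annotated.
toy_p = fisher_one_sided(k=3, n_selected=5, n_annotated=5, n_total=20)
```

The choice of background (all genes tested versus all annotated genes in the genome) changes the result, so it should be stated alongside any such p value.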
A Heatmap of normalized break density at differentially broken genes (FDR < 5%). Rows (bins) and columns (samples) are clustered according to their Spearman correlation. B Volcano plot shows gene break shifts between BRCA-mutated (positive log2FC) and normal cells (negative log2FC). Dashed line represents the threshold of Padj < 0.05. Genes were marked and labeled if they met the threshold and are considered an oncogene or tumor suppressor in the OncoKB database. C Profile of breaks across gene bodies in differentially broken genes (left) and randomly selected genes (right). Statistics for the region surrounding the TSS were measured using Student’s t-test. D Gene set enrichment analysis of MSigDB pathways enriched in the breakome of BRCA samples (FDR < 1%).
Increased DNA DSBs in BRCA-mutated cells occur at genes undergoing methylation loss
Since methylation status can induce open or closed chromatin, regions of the genome that undergo gain or loss of methylation often overlap with sites exposed to DSBs or other forms of instability, a phenomenon related to shifts in transcription that can act as a catalyst for carcinogenic transformation [36]. To further confirm the relationship between BRCA mutated status and transcription, we compared available methylation data overlapping with our gene break data. Indeed, genes associated with aging-related methylation loss, and therefore increased expression, according to the data of Senapati et al. [37], demonstrated significantly higher break density compared to genes affiliated with methylation gain, i.e., expression downregulation (Supplementary Fig. 2C; p < 1.7 × 10⁻⁶ for BRCAmut, p < 0.0005 for BRCAwt, Wilcoxon test), and especially so for BRCAmut samples compared to BRCAwt (Supplementary Fig. 2C; p < 5.4 × 10⁻¹⁶ for BRCAmut vs BRCAwt methylation loss, p < 1.5 × 10⁻¹³ for BRCAmut vs BRCAwt methylation gain, Wilcoxon test). This was also consistent when we tested promoters (Supplementary Fig. 2D; p < 0.002 for BRCAmut, p < 0.0007 for BRCAwt, p < 1.2 × 10⁻⁵ for BRCAmut vs BRCAwt methylation loss, p < 1.3 × 10⁻⁵ for BRCAmut vs BRCAwt methylation gain, Wilcoxon test). Based on the same data, we also tested the break density of genes specifically associated with transposable element-related methylation loss, found in DCIS breast cancer. Once again, these genes demonstrated higher break density in BRCAmut samples compared to BRCAwt (Supplementary Fig. 2E; p < 0.02, Wilcoxon test). These findings suggest that BRCAmut cells accumulate more DSBs in demethylated, transcriptionally active regions, linking epigenetic deregulation to transcription-associated genome instability.
DNA DSBs shift between malignant and non-malignant breast cell lines
Previously, sBLISS was utilized to demonstrate various features of cell lines [27, 28, 38,39,40], but a general characterization of breast cell lines is lacking. We therefore applied sBLISS to characterize malignant and non-malignant breast cell lines (Supplementary Table 5). In line with the literature [27], the breakome of cell lines was consistent across replicates, while different breast lines exhibited many differences in their breakome (Fig. 3A). Principal component analysis further demonstrated this (Fig. 3B). Both non-malignant cell lines, HMLE and MCF10a, are similar to each other compared to malignant lines, except for DKAT, a triple-negative breast cancer (TNBC) line [41] which is characterized by a lack of aneuploidy. MDA-MB231 and MDA-MB436, two additional TNBC lines, showed similar values in their second principal component, as did the two Luminal A lines, MCF7 and T47D. MDA-MB468 is also a TNBC line, yet its breakome resembles Luminal A samples, possibly due to its basal-like morphology, which differs from the mesenchymal-like morphology of MDA-MB231 and MDA-MB436 [42]. Hierarchical clustering of the cell lines based on the genome-wide breakome further supports these observations, with replicates clustering together first, non-malignant cells clustering together, and similar cells such as MDA-MB231 and MDA-MB436 grouping together (Fig. 3C). The distribution of breaks across chromatin states was consistent with findings in primary cells, demonstrating break enrichment at promoters and repetitive heterochromatic regions, while also being enriched for breaks at strong enhancers (Fig. 3D). Furthermore, we tested the correlation of break sites to the H3K4me3 and H3K27me3 histone modifications in each cell line, by creating profile plots in which break loci were centered, and the density of each histone mark was profiled up to 50 kb from the break site (Fig. 3E).
As expected, the enrichment of breaks in promoters is reflected in a strong H3K4me3 peak surrounding break sites. On the other hand, H3K27me3 did not show a consistent pattern around breaks, suggesting polycomb-repressed regions are not generally enriched in breaks. Genomic site analysis revealed a high break density for genes in general, particularly in the promoter and 5’UTR regions compared to other genomic features, with intergenic regions being devoid of breaks (Supplementary Fig. 3A). As previously observed in primary cells, break density was significantly higher in early-replicating genes (Supplementary Fig. 3B) and in short genes (Supplementary Fig. 3C). Together, the characterization of cell lines confirmed our results in primary cells, and allowed us to draw further conclusions using the breakome of breast cancer cell lines as a point of comparison to pre-malignant cell models.
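A break-centered profile of a histone mark can be built by averaging binned signal at fixed offsets around every break site. The following sketch illustrates the idea for a single chromosome. The signal representation (a dictionary from bin index to signal value), the 1 kb bin size and the function name are assumptions for illustration; the 50 kb flank matches the window described in the text.

```python
def aggregate_profile(break_positions, signal, bin_size=1_000, flank=50_000):
    """Mean binned signal at offsets from -flank to +flank around break sites.

    break_positions: break coordinates on one chromosome.
    signal: dict mapping bin index to signal (e.g. histone mark coverage);
            bins absent from the dict are treated as zero.
    Returns one mean value per offset bin, ordered upstream to downstream.
    """
    n_side = flank // bin_size
    offsets = range(-n_side, n_side + 1)
    sums = [0.0] * len(offsets)
    for pos in break_positions:
        center = pos // bin_size
        for idx, offset in enumerate(offsets):
            sums[idx] += signal.get(center + offset, 0.0)
    return [s / len(break_positions) for s in sums]


# Toy example: signal peaks exactly at the bins containing the two breaks.
toy_signal = {100: 5.0, 200: 3.0}
toy_breaks = [100_500, 200_100]
toy_profile = aggregate_profile(toy_breaks, toy_signal, bin_size=1_000, flank=2_000)
```

A mark that co-localizes with breaks yields a central peak in such a profile, while a mark unrelated to breaks yields a flat or inconsistent one.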
A IGV screenshot of the genome-wide DSB pattern of breast cell lines, demonstrating similarity within each cell line’s breakome while highlighting differences between distinct cells. B Principal component analysis of the breakome across the genome, binned in 500 kbp bins, of three replicates of each line. C Heatmap demonstrates sample clustering (Spearman) of cell lines based on their binned whole genome. Samples are shown with replicates. D Distribution of breaks across chromatin states; bars represent break density in each state and sample compared to the representation of each chromatin state in the genome. E Breaks are enriched at the epigenetic histone mark H3K4me3, which marks gene expression, and depleted at H3K27me3, which marks closed chromatin regions. Profile plots encompass the 100 kb region centered on the DSBs, 50 kb upstream and 50 kb downstream, and profile the distribution of the histone mark in that region relative to breaks.
High-risk primary cells’ DNA double-strand break shifts relate to the breakome of breast cancer cell lines
To test how the breakome of high-risk cells relates to the breakome of breast cancer, we combined our primary cells and cell line data. Primary cells and cell lines display considerable differences, but interestingly, high-risk primary cells were found more similar to the cancer cells than average-risk primary cells were (Fig. 4A, B). This suggests high-risk samples exhibit an intermediate breakome between average risk and cancer cells.
A Principal component analysis of the binned whole-genome breakome of normal primary cells, BRCA-mutated primary cells and breast cell lines highlights the intermediate state of high-risk samples, which cluster between the normal breakome and the breast cancer breakome. Samples are shown with replicates. B Correlation plot highlights similarities and differences between the breakomes of samples across the binned whole genome. C Heatmap (Spearman hierarchical clustering) of the “differentially broken genes” distinguishing BRCA mutated from normal primary cells, with inclusion of breast cancer cell line samples, demonstrates similar behavior of gene breaks between high-risk primary cells and breast cancer cell lines in genes obtained from the DESeq output.
Next, we focused specifically on the list of differentially broken genes between average and high-risk samples. Here, too, malignant cell lines clustered with high-risk BRCAmut samples and distanced themselves from normal primary cell samples, demonstrating genes more broken in high-risk samples are also broken in breast cancer cell lines (Fig. 4C). Together, these data suggest an intermediary phenotype for high-risk samples, demonstrated by similarities and differences at the level of the breakome, implying certain break patterns emerge very early in carcinogenesis and persist into malignancy.
The shift in DNA DSBs in high-risk primary cells correlates with mutation frequency in breast cancer and homologous recombination repair
To test if highly broken genes are more likely to accumulate mutations, we utilized breast cancer mutation data from the COSMIC database [43]. Since exonic mutations are subject to selective pressure, we tested mutations appearing in coding sequences separately (Supplementary Fig. 4A-C). Consequently, we focused on intronic gene regions, analyzing mutational frequency for four types of genetic alterations: single-nucleotide variants (SNVs), deletions, insertions and rearrangements. Overlapping gene break density with gene mutation frequency revealed a higher mutational frequency in genes that were highly broken, for SNVs (Fig. 5A), deletions (Fig. 5B), insertions (Supplementary Fig. 4D) and rearrangements, although not as strongly as the first three (Supplementary Fig. 4F). This phenomenon was more prominent and significant in BRCAmut cells compared to average-risk breast cells (Fig. 5A, B and Supplementary Fig. 4D, F; p < 2 × 10⁻⁴ for BRCAmut SNV, p < 2.35 × 10⁻⁶ for BRCAmut deletions, p < 3 × 10⁻⁴ for average-risk deletions, p < 0.04 for BRCAmut rearrangements, Wilcoxon test). This enrichment was even more evident when we focused specifically on differentially broken genes. Genes more broken in high-risk cells were enriched in SNVs (p < 2 × 10⁻⁶, Wilcoxon test) and deletions (p < 0.019, Wilcoxon test) (Fig. 5C) compared to genes more broken in average-risk cells. These findings suggest that genes experiencing higher levels of structural disruption in BRCA-mutant cells are more prone to accumulating somatic mutations, particularly SNVs and deletions, highlighting a potential mechanistic link between gene fragility and mutational burden in high-risk cancer cells.
A SNV or B Deletion gene mutation frequency is shown in relationship to normal primary cell gene breakome and BRCA-mutated cell gene breakome. Breakome vs. mutation analysis demonstrates a higher mutation frequency for genes that are highly broken. Statistics were measured using Wilcoxon test and Pearson correlation across all of the categories was measured. C SNV or Deletion gene mutation frequency is higher in BRCA-mutated Differential genes’ breakome compared to Normal Differential genes’ breakome. Statistics were measured using the Wilcoxon test. D Homologous recombination marker RAD51 enrichment in MCF7 breast cancer associated with gene breakome in BRCA-mutated samples. Breakome vs. repair analysis demonstrates a higher breast cancer-associated RAD51 binding for genes that are highly broken in BRCA-mutated cells. Statistics were measured using Wilcoxon test.
To ensure that changes in mutational frequency are not merely a by-product of gene expression levels, we divided the genes into two groups: highly broken/highly expressed and lowly broken/highly expressed. The group with the highest mutational frequency was, in fact, composed of highly broken/highly expressed genes for both SNVs and deletions, while there were no significant differences for insertions between the two groups (Supplementary Fig. 4G). This finding indicates that, although highly broken and highly expressed genes showed the greatest mutagenicity, gene expression alone was not a determinant of breast cancer mutations.
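The control described here holds expression roughly constant and varies only break density. A minimal sketch of that stratification follows. The cutoffs that define “highly expressed” and “highly broken” are not given in the text, so they are passed in as parameters; the record layout, function name and toy values are likewise assumptions.

```python
from statistics import mean


def stratified_mutation_frequency(genes, expr_cutoff, break_cutoff):
    """Compare mutation frequency within highly expressed genes only.

    genes: list of dicts with 'expression', 'break_density', 'mutation_freq'.
    Returns (mean frequency of highly broken, mean frequency of lowly broken).
    Both groups must be non-empty, otherwise statistics.mean raises an error.
    """
    high_expr = [g for g in genes if g["expression"] >= expr_cutoff]
    high_broken = [g["mutation_freq"] for g in high_expr
                   if g["break_density"] >= break_cutoff]
    low_broken = [g["mutation_freq"] for g in high_expr
                  if g["break_density"] < break_cutoff]
    return mean(high_broken), mean(low_broken)


# Toy records; the last gene is lowly expressed and is excluded from both groups.
toy_genes = [
    {"expression": 10.0, "break_density": 5.0, "mutation_freq": 0.4},
    {"expression": 12.0, "break_density": 6.0, "mutation_freq": 0.6},
    {"expression": 11.0, "break_density": 1.0, "mutation_freq": 0.1},
    {"expression": 9.0, "break_density": 0.5, "mutation_freq": 0.3},
    {"expression": 1.0, "break_density": 7.0, "mutation_freq": 0.9},
]
toy_high, toy_low = stratified_mutation_frequency(toy_genes, expr_cutoff=8.0, break_cutoff=3.0)
```

If mutation frequency still differs between the two groups after this split, expression level by itself cannot account for the difference.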
Given the role of BRCA in DNA repair, we were curious to explore how repair via HR, or the lack thereof, might contribute to the observed shifts in the breakome landscape of high-risk BRCA-mutated breast cells. To investigate this, we utilized ChIP-seq data of the HR DNA repair protein RAD51 in MCF7 breast cancer cells [44]. Interestingly, genes highly broken in BRCAmut cells exhibited higher correlation with RAD51 occupancy in MCF7 cells (p < 3.18 × 10⁻⁶, Fig. 5D). This was even more evident when focusing only on differentially broken genes (p < 2 × 10⁻⁶). This suggests these genes may require more HR repair even at a high-risk state, and are therefore more susceptible to mutations, explaining the pattern we observe.
Together, our results demonstrate shifts in the dynamics of DNA repair in cells harboring BRCA mutations at very early stages. This might explain why specific genes, shown to be transcriptionally associated, are more susceptible to instability due to the loss of BRCA and competent HR. Consequently, these genes, many of which are considered proto-oncogenes and tumor suppressors, are likely to experience persistent breaks and a higher mutation risk, leading those cells down pathways of malignancy.
DNA DSBs in high-risk PALB2-mutated cells demonstrate partial resemblance to BRCA-mutated cells
To understand whether the DSB patterns we observed are unique to BRCA1/2 mutations or common to other DNA repair mutations, we performed sBLISS on primary cell samples obtained from three women harboring high-risk PALB2 germline mutations (Supplementary Table 1). Their break data were compared to specific replicates of an average-risk sample (N4), which was prepared and sequenced in the same batch. Similar to what we observed for BRCAmut, most PALB2mut samples seem to cluster away from PALB2wt (Supplementary Fig. 5A). Moreover, we searched for differentially broken genes and identified 231 genes more broken in PALB2mut samples and 258 less broken genes at nominal significance, although these did not pass FDR correction (p < 0.05, DESeq2; Supplementary Fig. 5B). While genes that were more broken in PALB2mut samples had better overlap with BRCAmut samples compared to their WT counterparts, we were not able to identify a significant change in broken tumor suppressors and proto-oncogenes for this sample size (Supplementary Fig. 5C-D). Furthermore, we were only able to identify the Hallmark G2M checkpoint pathway as significantly enriched for PALB2mut differentially broken genes (GSEA; FDR < 25%). Breakome vs. expression analysis recapitulated the notion that genes that are more highly broken demonstrate higher expression distribution (Supplementary Fig. 5E). We repeated the analysis on methylation data overlapping with our PALB2mut gene break data to confirm whether other HR disrupting mutations would also relate to transcription. Indeed, both for genes and especially promoters, methylation loss genes in PALB2mut samples demonstrated significantly higher break density compared to methylation gain genes (Supplementary Fig. 5F-G; p < 0.0002 for PALB2mut genes, p < 1.2 × 10⁻⁶ for PALB2mut promoters, Wilcoxon test). PALB2mut primary cells and cell lines demonstrated considerable similarities and differences, but not as strongly as we observed for BRCAmut (Supplementary Fig. 5H).
The list of differential genes for PALB2mut also corresponded with certain breast cancer cell lines (Supplementary Fig. 5I). PALB2mut break density correlated with RAD51 occupancy in MCF7 cells, as observed previously for BRCAmut (p < 1.4 × 10⁻⁹, Supplementary Fig. 5J). Correspondingly, gene mutation frequency analysis exhibited a higher mutational frequency in more broken PALB2mut genes for SNVs, insertions and deletions (Supplementary Fig. 5K), which was more prominent in PALB2mut than in PALB2wt (data not shown). Our data show that several features of the DSBs seen in BRCAmut cells also appear in other settings, indicating a broader phenomenon. However, testing these effects in a larger tissue group is necessary to confirm our results.
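The distinction drawn in this section between nominal p values and FDR-controlled significance can be illustrated with the Benjamini-Hochberg procedure, which is the default multiple-testing adjustment in DESeq2. The sketch below is a generic implementation on toy p values, assuming that this adjustment was the one applied; it is not the analysis code of the study.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: three genes pass a nominal cutoff, one survives adjustment.
toy_pvalues = [0.01, 0.04, 0.03, 0.20]
toy_adjusted = benjamini_hochberg(toy_pvalues)
nominal_hits = sum(p < 0.05 for p in toy_pvalues)
fdr_hits = sum(q < 0.05 for q in toy_adjusted)
```

With few samples and many genes tested, a list of nominally significant genes can shrink sharply or vanish after adjustment, which is why such gene lists call for confirmation in a larger cohort.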
Discussion
Women carrying germline BRCA mutations exhibit seemingly normal breast morphology [33], while underlying processes are already in progress. The prevailing dogma is that BRCA mutations lead to inefficient DNA repair, resulting in subsequent mutations followed by tumor initiation [45,46,47,48]. Despite the roles of BRCA1 and BRCA2 in HR, and although many have linked deficient repair to mutations, a significant gap in the mechanistic relationship remains unexplored. In this study, we utilized a unique model to capture an understudied time point in carcinogenesis that supports this dogma. For the first time, we demonstrate the distinct biology of non–tumor-bearing, high-risk BRCA mutation carriers and its impact on breast tumorigenesis. While prior studies on related HMEC cohorts have confirmed heterozygosity of BRCA1/2 [33, 34], we did not directly assess loss of heterozygosity (LOH) in our samples. Nonetheless, we observed clear shifts in the DNA break pattern compared to non-carrier, average-risk controls. These findings suggest that genomic instability can arise well before LOH, thereby challenging the notion that LOH is the essential initiating event in BRCA-driven tumorigenesis.
Although BRCAmut samples and average-risk samples cluster almost exclusively away from each other, we still observe a few outliers. Notably, all of the primary cell samples we tested are non-malignant, so despite their important differences, we assume they still have much in common. We were not yet able to find a factor that would conclusively explain the outlier behavior (age, ethnicity, mutation site or loss of lineage fidelity markers; data not shown), and a larger cohort may be needed to test any of these hypotheses.
We showed that the breaks resulting from changes in the functionality of BRCA are transcriptional DSBs [49, 50], occurring in genes that constantly endure exceptional stress from the burdens of the transcription machinery, highlighting their need for effective repair to maintain stability. In fact, we found these genes to be enriched for RAD51 [45] in the context of the breast cancer cell line MCF7, which has intact BRCA.
Many of the genes identified as differential between BRCAmut cells and healthy controls are proto-oncogenes and tumor suppressor genes. It is possible to hypothesize that their transcriptional program would shift [46, 51] in response to genomic instability and loss of HR, as a compensatory mechanism to protect the genome, resulting in higher levels of breaks appearing in those regions. This could explain why DNA breaks in normal breast cells do not lead to tumor initiation. Previous work by Shalabi et al. [33] demonstrated a transcriptional shift in BRCA-mutated cells indicative of an accelerated biological age relative to average-risk women. Exploring the breakome of these older average-risk women might offer further insights into the relationship between transcriptional DSBs and oncogenesis.
Previously, studies aimed to understand cancer mechanisms by applying molecular and OMICS approaches to primary breast cancer samples, cell lines, and manipulated cells [47, 48, 51,52,53]. We were interested in determining whether DNA DSBs could explain malignancy by analyzing breakome data of BRCAmut in a non-malignant state from authentic carriers. In our work, we were able to show that the shift in broken genes observed in BRCAmut cells exhibited a break pattern more similar to breast cancer than to healthy breast cells. This could indicate shifts in transcriptional programs in these cells towards cancer-related pathways [33]. Additionally, detecting breaks that behave similarly to breast cancer at such an early stage led us to hypothesize that DNA breaks might predict the genomic instability landscape typical of breast cancer [13, 53,54,55], and suggest likely alterations before they occur. We also showed that higher gene break density in BRCAmut cells correlated more strongly with gene mutation frequencies linked with breast cancer, further reinforcing our conclusions regarding the predictive power of the breakome. Higher gene mutation frequencies were detected in most of the mutation types we tested, namely SNVs, deletions, and gene rearrangements. Insertion mutations did not show an advantage in the BRCAmut setting when we examined differential genes. This was somewhat surprising, as insertions frequently occur in breast cancer [56, 57]. Since insertions alone typically occur in a small number of genes compared to other mutation types, it is possible that the frequencies in our cohort were insufficient to demonstrate any predictive power.
In agreement with our hypothesis, we found that genes which are more highly broken in BRCAmut samples also correlate with higher levels of RAD51 occupancy in MCF7 breast cancer cells, which marks repair by HR [45]. More specifically, genes that were differentially broken in BRCAmut cells were significantly more enriched for MCF7-RAD51, associating them with repair by HR. Our findings suggest that the absence of BRCA1 or BRCA2, and thus deficient repair, leads to gene mutation and tumor initiation through the emergence of transcriptionally associated DSBs in genes categorized as tumor suppressors and proto-oncogenes. We also showed that methylation loss-driven transcriptional activation generates fragile genomic regions, and BRCA deficiency exacerbates their breakage due to impaired repair. This reveals a series of events initiated at the very early stages of malignancy, highlighting the potential for developing predictive and preventative tools for breast cancer diagnosis and management. Our findings support the notion that in non-cancer carriers, normal but genomically fragile cells may persist in a pre-malignant state, requiring additional genetic or environmental hits to drive their malignant transformation.
An important consideration in this study is the relatively small cohort size (22 samples), which naturally limits the breadth of our conclusions and highlights the need for validation in larger cohorts. In addition, because the tissues were available only as finite HMEC strains, we were unable to perform certain follow-up assays, such as repair protein ChIP-seq, directly on the same samples and, therefore, complemented our analyses with alternative models. While these comparisons should be interpreted with some caution, the development of future models that permit consistent longitudinal testing would provide an opportunity to further strengthen and extend our findings.
A few questions remain unexplored. For instance, why do germline mutations in BRCA give rise to tumors only in specific tissues such as the breast and ovaries, despite affecting DNA repair in all tissues [58, 59]? An investigation of the breakome of BRCAmut cells in other tissues in relation to transcription could be an interesting step toward further understanding that selectivity. Additionally, pathogenic BRCA mutations subject carriers to a 60–80% chance of developing tumors [45, 60, 61], raising the question of what is different in the remaining 20–40% of carriers. Many speculate that part of the difference is environmental and hormonal [59, 62], but insights from the breakome may shed more light on the inherent differences. It would also be valuable to explore how our findings could translate to the understanding of other cancer types in the future, further expanding prevention and treatment capabilities.
Methods
Cell culture of primary HMECs
Primary HMECs at passage 4 were grown and maintained in M87A medium as previously described [63]. HMECs from reduction mammoplasties were obtained from the HMEC Bank [64]. HMECs from prophylactic mastectomies and tissues contralateral or peripheral to tumors were obtained at City of Hope. Mycoplasma testing was performed on all cell strains before use.
Cell culture of cell lines
MCF7 (HTB-22) and T47D (HTB-133) cells were grown in RPMI supplemented with 10% (vol/vol) fetal bovine serum (FBS; GIBCO), glutamine, and penicillin/streptomycin. MDA-MB468 (HTB-132) cells were grown in Leibovitz’s L-15 Medium supplemented with 10% (vol/vol) FBS (GIBCO), glutamine, and penicillin/streptomycin. MDA-MB231 (HTB-26) and MDA-MB436 (HTB-130) cells were grown in DMEM supplemented with 10% (vol/vol) FBS, glutamine, and penicillin/streptomycin. HMLE cells were grown in PromoCell mammary epithelial cell basal media (C-21010) with added supplements (c-93110). MCF10A cells were grown in DMEM/F12 supplemented with 5% horse serum, 20 ng/ml EGF, 0.5 mg/ml hydrocortisone, 100 ng/ml cholera toxin, 10 mg/ml insulin, and penicillin/streptomycin. Cells were routinely tested for mycoplasma, and cell aliquots from early passages were used.
In-suspension break labeling in situ and sequencing (sBLISS)
sBLISS was conducted as previously described [25, 26]. In summary, 10⁶ cells were fixed in 2% paraformaldehyde in 10% FCS/PBS for 10 min at room temperature. The fixation was quenched with 125 mM glycine for 5 min at room temperature, followed by another 5 min on ice and two washes in ice-cold PBS. Cells were lysed for 60 min on ice, and their nuclei were permeabilized for 60 min at 37 °C. Next, nuclei were rinsed twice with CutSmart Buffer containing 0.1% Triton X-100 (CS/TX100), and double-strand break (DSB) ends were blunted in situ using NEB’s Quick Blunting Kit for 60 min at room temperature. The blunted nuclei were then washed twice with 1x CS/TX100 before in situ ligation of the sBLISS adapters to the DSB ends. Adapter ligation was carried out with T4 DNA Ligase for 20–24 h at 16 °C, with BSA and ATP added. Following ligation, the nuclei underwent two washes with 1x CS/TX100, and genomic DNA was extracted using Proteinase K at 55 °C for 14–18 h. Proteinase K was then heat-inactivated for 10 min at 95 °C, followed by extraction using Phenol:Chloroform:Isoamyl Alcohol, Chloroform, and ethanol precipitation. The purified DNA was sonicated in 100 μL of ultra-pure water using a Covaris M220 for 60 s. Sonicated samples were concentrated with AMPure XP beads (Beckman Coulter), and fragment sizes were evaluated using a BioAnalyzer 2100 (Agilent Technologies), targeting a range of 300 bp to 800 bp with a peak around 400–600 bp. The sonicated DNA was then in vitro transcribed using the MEGAscript T7 Kit for 14 h at 37 °C. After RNA purification and ligation of the 3’-Illumina adapters, the RNA underwent reverse transcription. The final library indexing and amplification step was performed with NEBNext® Ultra™ II Q5® Master Mix.
sBLISS FASTQ files were first de-multiplexed using sample barcodes. Quality control was performed with Trim Galore to remove residual adapters, trim reads to a base quality of at least 20, and filter out reads shorter than 20 bp. Initial and final sample qualities were assessed with FastQC. Quality-processed FASTQ files were aligned to the GRCh38 assembly with HISAT2, then sorted and indexed using SAMtools. The resulting BAM files were de-duplicated with UMI-tools, utilizing genomic coordinates and Unique Molecular Identifiers (UMIs). Custom Python and R scripts were employed to identify read start positions and convert BAM files to bigWig and BED formats for downstream analysis. An additional custom R script was used to discard a blacklist of positions, primarily within centromeres.
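The de-duplication idea can be illustrated with a minimal Python sketch. This is not the UMI-tools implementation and not the authors' custom script; it only shows how reads sharing the same genomic coordinates and UMI collapse into one break event. The tuple layout and the function name `count_unique_breaks` are assumptions made for illustration.

```python
from collections import Counter

def count_unique_breaks(reads):
    """Collapse reads sharing (chrom, start, strand, UMI); count breaks per position.

    `reads` is an iterable of (chrom, start, strand, umi) tuples, a
    hypothetical stand-in for parsed alignment records.
    """
    # Identical coordinates plus identical UMI are treated as one break event.
    unique_events = set(reads)
    # The number of distinct UMIs at a position is its break count.
    return Counter((chrom, start, strand) for chrom, start, strand, _ in unique_events)

reads = [
    ("chr1", 1000, "+", "ACGTACGT"),
    ("chr1", 1000, "+", "ACGTACGT"),  # PCR duplicate: same position and UMI
    ("chr1", 1000, "+", "TTGCAAGC"),  # independent break at the same position
    ("chr2", 500, "-", "ACGTACGT"),
]
breaks = count_unique_breaks(reads)
```

Exact matching is the simplest possible rule; dedicated tools may additionally merge UMIs that differ by sequencing errors, which this sketch does not attempt.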
BED-format breakome data were associated with either genes or 0.5 Mb bins using BEDTools v2.26.0 [65]. For genes, breaks were counted per gene based on overlapping annotations; GRCh38 gene annotations were downloaded from GENCODE [66]. For bins, the whole genome was tiled in windows of 0.5 Mb each and breaks were counted per tile. Data were subsequently normalized to FPKM or TPM and converted to Z-scores. In some cases, break data were averaged across all samples of the same group and subsequently FPKM or TPM normalized. Further details can be found in the figure legends.
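As a rough illustration of the counting and normalization described here, the following Python sketch tiles break positions into 0.5 Mb windows and applies a TPM-style normalization and a Z-score. It assumes that TPM is computed for break counts the same way as for RNA-seq (length-scaled rates rescaled to sum to one million) and that the Z-score uses the population standard deviation; the function names and input layouts are hypothetical and do not reproduce the BEDTools-based workflow.

```python
from collections import Counter

BIN_SIZE = 500_000  # 0.5 Mb tiles, as in the text

def count_breaks_per_bin(break_positions, bin_size=BIN_SIZE):
    """Count breaks per (chrom, bin_index) tile; positions are 0-based."""
    return Counter((chrom, pos // bin_size) for chrom, pos in break_positions)

def tpm(counts, lengths_bp):
    """TPM-style normalization: per-kb rates rescaled to sum to 1e6."""
    rates = {k: counts[k] / (lengths_bp[k] / 1_000) for k in counts}
    total = sum(rates.values())
    return {k: r / total * 1e6 for k, r in rates.items()}

def zscore(values):
    """Z-score using the population standard deviation (an assumption).

    Raises ZeroDivisionError if all values are identical.
    """
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / sd for v in values]

bins = count_breaks_per_bin([("chr1", 10), ("chr1", 499_999), ("chr1", 500_000)])
gene_tpm = tpm({"GENE_A": 30, "GENE_B": 10}, {"GENE_A": 3_000, "GENE_B": 1_000})
z = zscore([1.0, 2.0, 3.0])
```

In the toy input, GENE_A has three times the breaks of GENE_B but is also three times longer, so both receive the same normalized value; this is the length correction that per-gene normalization is meant to provide.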
Principal component data were obtained using the prcomp function of the R stats package and visualized using the R ggplot2 package [67].
Heatmaps were created with the ComplexHeatmap R package [68, 69].
“Differential break” analysis was performed using DESeq2 [70] in R v.4.2. Gene set enrichment for MsigDB hallmark pathways was subsequently performed on the DESeq output using GSEA 4.3.2 for pre-ranked data, by their “stat” column. Genes were analyzed for corresponding pathways using the built-in chip platform MSigDB v.2023. The ranking method was signal-to-noise, and hits were considered significant if FDR < 0.25.
Profile plots were created using bigWig data. Files were imported using the R package rtracklayer [71], then normalized and averaged per group for the BRCAmut or normal groups. The list of genes to profile was subset to either positively differentially broken genes or a random list, and each gene was then scaled into 10 bins with a 25% extension before the TSS to capture the full promoter region. Breaks were then profiled for each group across the region representing each gene.
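The gene-scaling step can be sketched in Python as follows. The text does not specify whether the 10 bins span the gene body alone or the gene body plus the upstream extension; this sketch assumes the latter, with the extended region split into 10 equal, strand-aware bins. The function name and coordinate convention (half-open, 0-based) are assumptions.

```python
def scaled_bins(start, end, strand, n_bins=10, upstream_frac=0.25):
    """Return n_bins (bin_start, bin_end) intervals ordered 5' to 3'.

    The gene interval [start, end) is extended upstream of the TSS by
    upstream_frac of the gene length before binning (assumed interpretation).
    """
    ext = int((end - start) * upstream_frac)
    if strand == "+":
        lo, hi = max(start - ext, 0), end   # TSS is at `start`
    else:
        lo, hi = start, end + ext           # TSS is at `end`
    # Integer bin edges covering [lo, hi) without gaps or overlaps.
    edges = [lo + (hi - lo) * i // n_bins for i in range(n_bins + 1)]
    bins = list(zip(edges[:-1], edges[1:]))
    # For minus-strand genes, reverse so that bin 0 is the upstream-most bin.
    return bins if strand == "+" else bins[::-1]

plus_bins = scaled_bins(1000, 2000, "+")
minus_bins = scaled_bins(1000, 2000, "-")
```

Averaging the bigWig signal inside each returned interval, and then across genes, would give one profile value per bin and group.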
Chromatin states
Chromatin states for HMEC were obtained from the chromatin state segmentation ChromHMM by ENCODE (GSE38163) as described [72, 73]. States used for plotting are as follows: 1_Active promoter, 8_Insulator, 3_Inactive/poised Promoter, 14/15_Repetitive/CNV, 12_Polycomb-repressed, 4/5_Strong Enhancer, 10_Transcriptional Elongation, 9_Transcriptional Transition, 6/7_Weak/poised Enhancer, 2_Weak promoter. Normalized breaks were counted per selected chromatin state for each sample. The percentage representation of each chromatin state in the genome was determined, then Log2 observed/expected was calculated.
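The enrichment calculation can be written out as a short Python sketch. It assumes that "observed" is the fraction of all normalized breaks falling in a state and "expected" is the fraction of the genome covered by that state; the function name, argument layout, and toy numbers are hypothetical.

```python
import math

def log2_obs_exp(breaks_per_state, total_breaks, bp_per_state, genome_bp):
    """Log2 of (fraction of breaks in a state) / (fraction of genome in that state).

    No pseudocount is added, so a state with zero breaks raises ValueError.
    """
    result = {}
    for state, n_breaks in breaks_per_state.items():
        observed = n_breaks / total_breaks
        expected = bp_per_state[state] / genome_bp
        result[state] = math.log2(observed / expected)
    return result

# Toy numbers for illustration only, not data from the study.
enrichment = log2_obs_exp(
    breaks_per_state={"1_Active promoter": 40, "12_Polycomb-repressed": 10},
    total_breaks=100,
    bp_per_state={"1_Active promoter": 10, "12_Polycomb-repressed": 40},
    genome_bp=100,
)
```

A positive value means a state carries more breaks than its genomic footprint would predict, and a negative value means fewer.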
Time-of-Replication
Samples were analyzed against time-of-replication (TOR) data downloaded from the ReplicationDomain database [74]. The replication timing used was that previously defined for the MCF7 breast cancer cell line. Genes were grouped based on their replication timing category, and normalized break density was plotted.
RNA-seq data
RNA-seq of primary HMECs was performed at City of Hope, and the data are available at GSE182338. Only samples that matched the ones we possessed were used, and the analysis was performed on the LEP data. TPM values were averaged across replicates for each group to determine the expression rank of each gene. Further analysis of the RNA-seq data can be found in ref. [33].
For expression vs. breakome analysis, 22130 TPM normalized genes were divided into 10 categories based on their TPM expression per group (BRCA1mut, BRCA2mut, BRCAWT), category 1 being the least expressed, up to category 10, highly expressed. Genes in each category were plotted based on their break density levels, each dot representing the median break density per group per expression category.
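A minimal Python sketch of this categorization is shown below. It assumes equal-sized categories obtained by ranking genes on TPM, and it summarizes each category by the median break density; the function name and dictionary inputs are hypothetical, and ties are ordered arbitrarily.

```python
from statistics import median

def median_breaks_by_expression_category(tpm_by_gene, breaks_by_gene, n_cat=10):
    """Rank genes by TPM, split into n_cat equal categories, return median breaks."""
    genes = sorted(tpm_by_gene, key=tpm_by_gene.get)  # least to most expressed
    medians = {}
    for cat in range(n_cat):
        lo = len(genes) * cat // n_cat
        hi = len(genes) * (cat + 1) // n_cat
        members = genes[lo:hi]
        if members:  # skip empty categories when there are fewer genes than n_cat
            medians[cat + 1] = median(breaks_by_gene[g] for g in members)
    return medians

# Toy data: 20 genes whose break density rises with expression.
tpm_by_gene = {f"g{i}": float(i) for i in range(20)}
breaks_by_gene = {f"g{i}": 2.0 * i for i in range(20)}
medians = median_breaks_by_expression_category(tpm_by_gene, breaks_by_gene)
```

Running the function once per group (BRCA1mut, BRCA2mut, BRCAWT) would yield the one-dot-per-category summary described in the text.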
For combined breakome and expression vs. mutation analysis, samples’ breakome data were merged for BRCAmut and TPM normalized. Genes were considered highly or lowly broken if their TPM break levels belonged to the top or bottom 10% breakome, respectively. For Expression categories, samples were merged for BRCAmut and TPM normalized. Genes were considered highly expressed if their expression levels were TPM > 20. Genes in each of the two categories had to meet both mentioned conditions. Then, mutation frequency was plotted for the genes in each group.
ChIP-seq data
ChIP-seq data for the histone modifications H3K4me3 and H3K27me3 were downloaded from ChIP-Atlas [75] (GSE85158 [76]) based on the hg38 genome build. The 1000 top-scoring breaks in each cell line were selected, and histone modification peaks were examined within ±50 kb of these breaks.
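The window lookup around top-scoring breaks can be sketched in Python as follows. This is a simple quadratic scan intended only to make the ±50 kb rule concrete; the function name and the tuple layouts for breaks and peaks are assumptions, and a real analysis would typically use an interval index instead.

```python
WINDOW = 50_000  # ±50 kb around each break, as in the text

def peaks_near_breaks(break_sites, peaks, window=WINDOW):
    """For each (chrom, pos) break, list peaks overlapping [pos - window, pos + window)."""
    hits = {}
    for chrom, pos in break_sites:
        lo, hi = max(pos - window, 0), pos + window
        hits[(chrom, pos)] = [
            (c, s, e) for c, s, e in peaks
            if c == chrom and s < hi and e > lo  # half-open interval overlap
        ]
    return hits

peaks = [
    ("chr1", 140_000, 141_000),  # inside the window
    ("chr1", 151_000, 152_000),  # just outside the window
    ("chr2", 100_000, 100_500),  # different chromosome
]
hits = peaks_near_breaks([("chr1", 100_000)], peaks)
```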
ChIP-seq data for RAD51 in MCF7 cells were downloaded from ChIP-Atlas (GSE105597 [44]), and accumulated signal was associated with genes based on their annotations as mentioned above and FPKM normalized. Samples were merged for each group (BRCAmut or normal) and TPM normalized, then split into 10 categories based on their break density, category 1 being the least broken, up to category 10, the most broken. Genes in each category were plotted based on their RAD51 signal. For differential gene analysis, genes were grouped as either “Normal” or “BRCAmut” based on DESeq output (Normal, log2FC < 0 and Padj<0.05; BRCAmut, log2FC > 0 and Padj<0.05). Then, the RAD51 signal was plotted for each group.
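The grouping rule based on the DESeq2 output can be expressed as a short Python sketch. The thresholds follow the text (adjusted P < 0.05, sign of log2 fold change); the function name and the input dictionary are hypothetical, and genes with a missing adjusted P value are assumed to be excluded.

```python
def classify_by_deseq(results, alpha=0.05):
    """Split genes into 'Normal' and 'BRCAmut' groups from (log2FC, padj) pairs."""
    groups = {"Normal": [], "BRCAmut": []}
    for gene, (log2fc, padj) in results.items():
        if padj is None or padj >= alpha:
            continue  # not significant, or padj missing (assumed excluded)
        if log2fc > 0:
            groups["BRCAmut"].append(gene)  # more broken in BRCAmut samples
        elif log2fc < 0:
            groups["Normal"].append(gene)   # more broken in normal samples
    return groups

# Toy results with hypothetical gene names.
results = {
    "GENE_A": (1.8, 0.001),
    "GENE_B": (-0.9, 0.02),
    "GENE_C": (2.5, 0.30),
    "GENE_D": (0.4, None),
}
groups = classify_by_deseq(results)
```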
SNV, deletion and insertion mutations data analysis
The file Cosmic_NonCodingVariants_Tsv_v101_GRCh37.tar, containing all noncoding variants, was downloaded from COSMIC. This file was first filtered for breast cancer samples and separated by mutation type into SNV, deletion, and insertion mutation files. These files were then converted into genomic range objects and filtered to contain only mutations within introns. To calculate mutation density, mutations per gene were counted and normalized by the sum of intron lengths.
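The density calculation for a single gene can be sketched in Python as below. It assumes non-overlapping, half-open intron intervals and reports mutations per kb of intron; the per-kb scaling, function name, and inputs are assumptions for illustration.

```python
def intron_mutation_density(mutation_positions, introns):
    """Mutations per kb of intronic sequence for one gene.

    `introns` is a list of non-overlapping half-open (start, end) intervals.
    """
    total_len = sum(end - start for start, end in introns)
    # Count only mutations that fall inside one of the introns.
    n_intronic = sum(
        any(start <= pos < end for start, end in introns)
        for pos in mutation_positions
    )
    return n_intronic / (total_len / 1_000)

density = intron_mutation_density(
    mutation_positions=[150, 700, 1200, 1499, 1500],
    introns=[(100, 600), (1000, 1500)],
)
```

In the toy input, positions 700 and 1500 fall outside the introns, so three mutations over 1 kb of intron give a density of 3 per kb.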
For Breakome vs. mutation analysis, breakome categories were created by merging samples for each group (BRCAmut or normal), TPM normalizing, then splitting genes into 20 or 10 categories based on their break density, category 1 being the least broken, up to category 20/10, the most broken. Per mutation type, mutation frequencies of breast cancer related genes (either coding or intronic regions) were normalized to kb. Genes in each break category were plotted based on their mutation frequencies.
For differential gene analysis, genes were grouped as either “Normal” or “BRCAmut” based on DESeq output (Normal, log2FC < 0 and Padj<0.05; BRCAmut, log2FC > 0 and Padj<0.05). Then, mutation frequency was plotted for each group.
Rearrangement mutations data analysis
Mutation data were obtained from Nik-Zainal et al. [77]. The list of genes and their respective rearrangement mutation densities were taken as is from Table 13 in the publication. Breakome data was prepared the same as for the analysis of SNVs, deletions, and insertions. Genes in each break category were plotted based on their mutation density.
Methylation data
Methylation data of aging primary cells were obtained from Senapati et al. [37]. The list of genes and their respective methylation statuses was taken as is from Table 5 (5A for methylation loss status and 5B for methylation gain status) in the publication. The gene list for DCIS age-dependent methylation loss was taken as is from Table 9 in the publication.
Statistics
Groups were assessed for normal distribution and variance to determine the relevant statistical test. Figure legends list the statistical test used in each experiment. Unless otherwise specified, all tests reported were two-tailed. R v.4.2 software was used for statistical analysis. Significance was achieved when P < 0.05, or Benjamini-Hochberg adjusted P < 0.05 when testing multiple hypotheses. In the case of GSEA, significance of an enriched pathway was achieved when FDR < 0.25.
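For readers unfamiliar with the Benjamini-Hochberg adjustment, the following Python sketch computes adjusted P values from a list of raw P values. It is a plain re-implementation of the standard step-up procedure for illustration and is not the code used in the study, where the analysis was run in R.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [p < 0.05 for p in adjusted]
```

With these four toy P values, each adjusted value stays below 0.05, so all four tests would be called significant under the threshold stated above.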
Data availability
sBLISS data have been deposited in the Gene Expression Omnibus database under accession no. GSE300163.
References
Castellanos G, Valbuena DS, Pérez E, Villegas VE, Rondón-Lagos M. Chromosomal instability as enabling feature and central hallmark of breast cancer. Breast Cancer Targets Ther. 2023;15:189–211.
Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011;144:646–74.
Oster S, Aqeilan RI. Programmed DNA damage and physiological DSBs: mapping, biological significance and perturbations in disease states. Cells. 2020;9.
Rowe LA, Degtyareva N, Doetsch PW. DNA damage-induced reactive oxygen species (ROS) stress response in Saccharomyces cerevisiae. Free Radic Biol Med. 2008;45:1167–77.
Chatterjee N, Walker GC. Mechanisms of DNA damage, repair, and mutagenesis. Environ Mol Mutagen. 2017;58:235–63.
Liu N, Du J, Ge J, Liu SB. DNA damage-inducing endogenous and exogenous factors and research progress. Nucleosides Nucleotides Nucleic Acids. 2024;44:969–1001.
Trenner A, Sartori AA. Harnessing DNA double-strand break repair for cancer treatment. Front Oncol. 2019;9:1388.
Wozny AS, Alphonse G, Cassard A, Malésys C, Louati S, Beuve M, et al. Impact of hypoxia on the double-strand break repair after photon and carbon ion irradiation of radioresistant HNSCC cells. Sci Rep. 2020;10:1–18.
Yao Y, Dai W. Genomic instability and cancer. J Carcinog Mutagen. 2014;5:1000165.
Eyfjord JE, Bodvarsdottir SK. Genomic instability and cancer: networks involved in response to DNA damage. Mutation Res/Fundam Mol Mech Mutagenesis. 2005;592:18–28.
Tubbs A, Nussenzweig A. Endogenous DNA damage as a source of genomic instability in cancer. Cell. 2017;168:644–56.
Negrini S, Gorgoulis VG, Halazonetis TD. Genomic instability-an evolving hallmark of cancer. Nat Rev Mol Cell Biol. 2010;11:220–8.
Kwei KA, Kung Y, Salari K, Holcomb IN, Pollack JR. Genomic instability in breast cancer: pathogenesis and clinical implications. Mol Oncol. 2010;4:255.
Arun BK, Peterson SK, Sweeney LE, Bluebond RD, Tidwell RSS, Makhnoon S, et al. Increasing referral of at-risk women for genetic counseling and BRCA testing using a screening tool in a community breast imaging center. Cancer. 2022;128:94–102.
Garutti M, Foffano L, Mazzeo R, Michelotti A, Da Ros L, Viel A, et al. Hereditary cancer syndromes: a comprehensive review with a visual tool. Genes. 2023;14:1025.
Choi E, Mun GI, Lee J, Lee H, Cho J, Lee YS. BRCA1 deficiency in triple-negative breast cancer: protein stability as a basis for therapy. Biomed Pharmacother. 2023;158:114090.
Andrew Octavian S, Ying Pei W. Organoids as reliable breast cancer study models: an update. Int J Oncol Res. 2018;1.
Desmedt C, Voet T, Sotiriou C, Campbell PJ. Next-generation sequencing in breast cancer: first take-home messages. Curr Opin Oncol. 2012;24:597.
Rossing M, Sørensen CS, Ejlertsen B, Nielsen FC. Whole genome sequencing of breast cancer. APMIS. 2019;127:303.
Nikitaki Z, Hellweg CE, Georgakilas AG, Ravanat JL. Stress-induced DNA damage biomarkers: applications and limitations. Front Chem. 2015;3:112229.
Clouaire T, Rocher V, Lashgari A, Arnould C, Aguirrebengoa M, Biernacka A, et al. Comprehensive mapping of histone modifications at DNA double-strand breaks deciphers repair pathway chromatin signatures. Mol Cell. 2018;72:250–262.e6.
Rybin MJ, Ramic M, Ricciardi NR, Kapranov P, Wahlestedt C, Zeier Z. Emerging technologies for genome-wide profiling of DNA breakage. Front Genet. 2021;11:610386.
Amente S, Scala G, Majello B, Azmoun S, Tempest HG, Premi S, et al. Genome-wide mapping of genomic DNA damage: methods and implications. Cellular Mol Life Sci. 2021;78:6745–62.
Yan WX, Mirzazadeh R, Garnerone S, Scott D, Schneider MW, Kallas T, et al. BLISS is a versatile and quantitative method for genome-wide profiling of DNA double-strand breaks. Nat Commun. 2017;8:1–9.
Bouwman BAM, Agostini F, Garnerone S, Petrosino G, Gothe HJ, Sayols S, et al. Genome-wide detection of DNA double-strand breaks by in-suspension BLISS. Nat Protoc. 2020;15:3894–941.
Hidmi O, Oster S, Shatleh D, Monin J, Aqeilan RI. Protocol for mapping physiological DSBs using in-suspension break labeling in situ and sequencing. STAR Protoc. 2024;5:103059.
Hazan I, Monin J, Bouwman BAM, Crosetto N, Aqeilan RI. Activation of oncogenic super-enhancers is coupled with DNA Repair by RAD51. Cell Rep. 2019;29:560–572.e4.
Hidmi O, Oster S, Monin J, Aqeilan RI. TOP1 and R-loops facilitate transcriptional DSBs at hypertranscribed cancer driver genes. iScience. 2024;27:109082.
Basu AK. DNA damage, mutagenesis and cancer. Int J Mol Sci. 2018;19:970.
Peng L, Xu T, Long T, Zuo H. Association between BRCA status and P53 status in breast cancer: a meta-analysis. Med Sci Monit. 2016;22:1939.
Bruchim I, Fishman A, Friedman E, Goldberg I, Chetrit A, Barshack I, et al. Analyses of p53 expression pattern and BRCA mutations in patients with double primary breast and ovarian cancer. Int J Gynecol Cancer. 2004;14:251–8.
Annunziato S, de Ruiter JR, Henneman L, Brambillasca CS, Lutz C, Vaillant F, et al. Comparative oncogenomics identifies combinations of driver genes and drug targets in BRCA1-mutated breast cancer. Nat Commun. 2019;10:1–12.
Shalabi SF, Miyano M, Sayaman RW, Lopez JC, Jokela TA, Todhunter ME, et al. Evidence for accelerated aging in mammary epithelia of women carrying germline BRCA1 or BRCA2 mutations. Nat Aging. 2021;1:838–49.
Nee K, Ma D, Nguyen QH, Pein M, Pervolarakis N, Insua-Rodríguez J, et al. Preneoplastic stromal cells promote BRCA1-mediated breast tumorigenesis. Nat Genet. 2023;55:595–606.
Holstege H, Joosse SA, Th van Oostrom CM, Nederlof PM, de Vries A, Jonkers J. High incidence of protein-truncating TP53 mutations in BRCA1-related breast cancer. Cancer Res. 2009;69:3625–58.
Szyf M, Pakneshan P, Rabbani SA. DNA methylation and breast cancer. Biochem Pharm. 2004;68:1187–97.
Senapati P, Miyano M, Sayaman RW, Basam M, Leung A, LaBarge MA, et al. Loss of epigenetic suppression of retrotransposons with oncogenic potential in aging mammary luminal epithelial cells. Genome Res. 2023;33:1229–41.
Bakr A, Della Corte G, Veselinov O, Kelekçi S, Chen MM, Lin YY, et al. ARID1A regulates DNA repair through chromatin organization and its deficiency triggers DNA damage-mediated anti-tumor immune response. Nucleic Acids Res. 2024;52:5698–719.
Sberna S, Filipuzzi M, Bianchi N, Croci O, Fardella F, Soriani C, et al. Senataxin prevents replicative stress induced by the Myc oncogene. Cell Death Dis. 2025;16:187.
Mosler T, Conte F, Longo GMC, Mikicic I, Kreim N, Möckel MM, et al. R-loop proximity proteomics identifies a role of DDX41 in transcription-associated genomic instability. Nat Commun. 2021;12:7314.
D’amato NC, Ostrander JH, Bowie ML, Sistrunk C, Borowsky A, Cardiff RD, et al. Evidence for phenotypic plasticity in aggressive triple-negative breast cancer: human biology is recapitulated by a novel model system. PLoS One. 2012;7:e45684.
Lehmann BD, Bauer JA, Chen X, Sanders ME, Chakravarthy AB, Shyr Y, et al. Identification of human triple-negative breast cancer subtypes and preclinical models for selection of targeted therapies. J Clin Investig. 2011;121:2750–67.
Sondka Z, Dhir NB, Carvalho-Silva D, Jupe S, Madhumita, McLaren K, et al. COSMIC: a curated database of somatic variants and clinical data for cancer. Nucleic Acids Res. 2024;52:D1210–7.
Snyder M. ENCSR442VBJ. ENCODE datasets. 2017; Available from: https://www.encodeproject.org/experiments/ENCSR442VBJ/.
Stoppa-Lyonnet D. The biological effects and clinical implications of BRCA mutations: where do we go from here? Eur J Hum Genet. 2016;24:S3–9.
Yoshida K, Miki Y. Role of BRCA1 and BRCA2 as regulators of DNA repair, transcription, and cell cycle in response to DNA damage. Cancer Sci. 2004;95:866–71.
Jasin M. Homologous repair of DNA damage and tumorigenesis: the BRCA connection. Oncogene. 2002;21:8981–93.
Arun B, Couch FJ, Abraham J, Tung N, Fasching PA. BRCA-mutated breast cancer: the unmet need, challenges and therapeutic benefits of genetic testing. Br J Cancer. 2024;131:1400–14.
Haffner MC, De Marzo AM, Meeker AK, Nelson WG, Yegnasubramanian S. Transcription-induced DNA double-strand breaks: both oncogenic force and potential therapeutic target?. Clin Cancer Res. 2011;17:3858–64.
Min S, Ji JH, Heo Y, Cho H. Transcriptional regulation and chromatin dynamics at DNA double-strand breaks. Exp Mol Med. 2022;54:1705–12.
Zhu Q, Pao GM, Huynh AM, Suh H, Tonnu N, Nederlof PM, et al. BRCA1 tumour suppression occurs via heterochromatin-mediated silencing. Nature. 2011;477:179–84.
Liu S, Ginestier C, Charafe-Jauffret E, Foco H, Kleer CG, Merajver SD, et al. BRCA1 regulates human mammary stem/progenitor cell fate. Proc Natl Acad Sci USA. 2008;105:1680–5.
Angus L, Smid M, Wilting SM, van Riet J, Van Hoeck A, Nguyen L, et al. The genomic landscape of metastatic breast cancer highlights changes in mutation and signature frequencies. Nat Genet. 2019;51:1450–8.
Duijf PHG, Nanayakkara D, Nones K, Srihari S, Kalimutho M, Khanna KK. Mechanisms of genomic instability in breast cancer. Trends Mol Med. 2019;25:595–611.
Lee JK, Choi YL, Kwon M, Park PJ. Mechanisms and consequences of cancer genome instability: lessons from genome sequencing studies. Annu Rev Pathol Mech Dis. 2016;11:283–312.
Kas SM, De Ruiter JR, Schipper K, Annunziato S, Schut E, Klarenbeek S, et al. Insertional mutagenesis identifies drivers of a novel oncogenic pathway in invasive lobular breast carcinoma. Nat Genet. 2017;49:1219–30.
Iengar P. An analysis of substitution, deletion and insertion mutations in cancer genes. Nucleic Acids Res. 2012;40:6401–13.
Elledge SJ, Amon A. The BRCA1 suppressor hypothesis: an explanation for the tissue-specific tumor development in BRCA1 patients. Cancer Cell. 2002;1:129–32.
Moser SC, Jonkers J. Thirty years of BRCA1: mechanistic insights and their impact on mutation carriers. Cancer Discov. 2025;15:461–80.
Rahman N, Stratton MR. The genetics of breast cancer susceptibility. Annu Rev Genet. 1998;32:95–121.
Depypere H. Treatment of women with BRCA mutation. Climacteric. 2023;26:235–9.
Ramus SJ, Gayther SA. The contribution of BRCA1 and BRCA2 to ovarian cancer. Mol Oncol. 2009;3:138–50.
Labarge MA, Garbe JC, Stampfer MR. Processing of human reduction mammoplasty and mastectomy tissues for cell culture. J Vis Exp. 2013; 50011.
HMEC Extended Life Culture. [cited 12 Mar 2025]. https://hmec.lbl.gov/mindex.html.
Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–2.
Frankish A, Diekhans M, Ferreira AM, Johnson R, Jungreis I, Loveland J, et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 2019;47:D766–73.
Wickham H. ggplot2. 2016 [cited 12 Mar 2025]; Available from: https://doi.org/10.1007/978-3-319-24277-4.
Gu Z, Eils R, Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016;32:2847–9.
Gu Z. Complex heatmap visualization. iMeta. 2022;1:e43.
Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:1–21.
Lawrence M, Gentleman R, Carey V. rtracklayer: an R package for interfacing with genome browsers. Bioinformatics. 2009;25:1841–2.
Ernst J, Kellis M. Discovery and characterization of chromatin states for systematic annotation of the human genome. Nat Biotechnol. 2010;28:817–25.
Ernst J, Kheradpour P, Mikkelsen TS, Shoresh N, Ward LD, Epstein CB, et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011;473:43–9.
Weddington N, Stuy A, Hiratani I, Ryba T, Yokochi T, Gilbert DM. ReplicationDomain: a visualization tool and comparative database for genome-wide replication timing data. BMC Bioinforma. 2008;9:530.
Oki S, Ohta T, Shioi G, Hatanaka H, Ogasawara O, Okuda Y, et al. ChIP-Atlas: a data-mining suite powered by full integration of public ChIP-seq data. EMBO Rep. 2018;19:e76255.
Franco HL, Nagari A, Malladi VS, Li W, Xi Y, Richardson D, et al. Enhancer transcription reveals subtype-specific gene expression programs controlling breast cancer pathogenesis. Genome Res. 2018;28:159–70.
Nik-Zainal S, Davies H, Staaf J, Ramakrishna M, Glodzik D, Zou X, et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature. 2016;534:47–54.
Acknowledgements
We sincerely thank the members of the Aqeilan lab for their valuable discussions and insights. We are deeply grateful to Prof. Itamar Simon for his thoughtful guidance and support as a member of the PhD committee of SOF. We also appreciate the invaluable assistance of Dr. Abed Nasereddin and Dr. Idit Shiff from the Core Research Facility of the Hebrew University-Hadassah Medical School. This study was supported by grants from the Israel Science Foundation (ISF) [No. 1056/21]. We also acknowledge the support of the Carole and Andrew Harper Diversity Scholarship Program and the VATAT PhD scholarship to OH, and of the Science Training Encouraging Peace (STEP) Fellowship, which previously supported SOF.
Contributions
SOF, YD and RIA conceived and designed the study. SOF performed all sBLISS experiments. SOF, JM and OH performed computational analysis. SOF, OH, JAAO, VES, MAL, YD and RIA wrote the manuscript. JAAO, VES and MAL provided the BRCAmut and BRCAwt primary cell samples. JAAO grew the primary cell samples and fixed them for sBLISS, which was performed by SOF.
Ethics declarations
Ethics approval and consent to participate
All methods were performed in accordance with the relevant guidelines and regulations. Human mammary epithelial cell samples were obtained under approval of the City of Hope Institutional Review Board (IRB #17185). The study was conducted in accordance with the Declaration of Helsinki and United States federal regulations governing research involving human participants. The trial is subject to United States federal regulations concerning clinical trials and undergoes yearly review; the trial is currently active. Subjects consented in writing and in person.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Edited by Gerry Melino
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Oster Flayshman, S., Hidmi, O., Alva-Ornelas, J.A. et al. The breakome of BRCA1 and BRCA2 pathway mutation carriers reveals early processes in breast oncogenesis. Cell Death Dis 16, 891 (2025). https://doi.org/10.1038/s41419-025-08235-2
DOI: https://doi.org/10.1038/s41419-025-08235-2







