Introduction

Extrachromosomal circular DNA (eccDNA) is a type of circular DNA that exists freely outside the chromosome in various organisms. It is commonly found in viruses1, bacteria2, as well as in the mitochondria and chloroplasts of eukaryotic organisms3. In the 1960s, such circular DNA molecules were first noted in wheat embryos and boar sperm using electron microscopy4. Recent studies have shown that eccDNAs exist in cells, and the chromosomes of various eukaryotic organisms can form eccDNA5,6,7,8,9,10,11,12,13,14,15. It has been found that eccDNA is involved in tumorigenesis and aging through various mechanisms, such as increasing the copy number of oncogenes, promoting oncogene amplification through autonomous replication, and enhancing the plasticity and instability of oncogenes11,15,16,17,18. EccDNAs have also been reported to have transcriptional capabilities and can function as enhancers to enhance gene expression8,19. Additionally, circular DNA derived from viruses (vcDNA) can be transcribed into RNA and then cleaved into small viral RNAs by host Dicer-2, thus inhibiting viral proliferation through the RNAi pathway20. With advancements in scientific techniques, particularly the development of next-generation sequencing and bioinformatics, eccDNAs have been discovered in an increasing number of organisms, sparking widespread interest as functional molecules18,21,22,23.

The mechanism behind eccDNA formation remains largely unclear. Several hypotheses have been proposed, including genome rearrangement, excision and ligation, recombination between tandem repeat sequences, and mRNA reverse transcription24. Yeast, animals, and plants can generate telomeric circular DNA (t-circles), which plays key roles in the alternative lengthening of telomeres (ALT) and telomere rapid deletion (TRD)24. Apoptosis can induce the formation of eccDNA in cancer cells: the DNA fragments generated during apoptosis can be ligated into eccDNA in the presence of DNA ligase 3 (Lig3)25. The role of DNA replication in eccDNA formation remains debated. Some studies have shown that inhibiting ongoing DNA replication increases eccDNA levels26, while other research provides evidence for replication-independent eccDNA generation27. EccDNAs can be produced via intramolecular homologous recombination27, double-strand break (DSB) excision, and microhomology-mediated end joining (MMEJ) repair28. The formation of eccDNA also depends on DNA sequence features. In multiple human cell lines, alpha satellite repeat sequences (approximately 170 bp in length) are significantly enriched in eccDNA10. Genes derived from tandem repeat sequences are likewise notably enriched in Drosophila eccDNA29. A variety of repetitive elements, including satellites and tandem repeats, have been systematically identified in the eccDNA of human HeLa cells, human fibroblasts, and multiple other mammalian species such as mice, rats, hamsters, and monkeys30. Furthermore, the protein complex CTC1/STN1/TEN1 (a trimeric complex that binds single-stranded DNA with high affinity), which is involved in telomere maintenance, also contributes significantly to eccDNA formation31. In yeast cells, loss of the RecQ family DNA helicase slow growth suppressor 1 (SGS1) significantly increases eccDNA levels32. These results suggest that eccDNA formation involves multiple pathways and key proteins.

Silkworm (Bombyx mori) is a model species of Lepidoptera. The silk gland, originating from the ectoderm, is a paired tubular organ composed of silk gland cells and lumens, believed to be the most efficient organ for synthesizing proteins33. We performed eccDNA sequencing of the posterior silk glands of silkworms and detected 35,346 eccDNAs. Motif analysis revealed that dual direct repeats are flanking the 5′ and 3′ break points of eccDNA. The sequences exceeding 1 kb length in eccDNAs presented palindromic sequence characteristics flanking the 5′ and 3′ break points of eccDNAs34. These motifs might support possible models for eccDNA generation. Subsequently, we identified a microDNA, designated as eccDNAfib-L, with a size of 542 bp (Chromosome 14: 9,692,083–9,692,624 nt) (https://silkdb.bioinfotoolkits.net/main/species-info/-1). This eccDNA harbors a partial sequence of the fibroin light chain (Fib-L) gene and is flanked by “GAGT” direct repeats at both the 5′ and 3′ break points. Functional analysis revealed that eccDNAfib-L acts as a positive regulator of fib-L gene expression34. However, the mechanism underlying the formation of eccDNAfib-L is unknown and remains to be studied.

Herein, we investigated the possible formation modes of eccDNAfib-L and found that transfection of linear DNA fragments containing the eccDNAfib-L sequence into heterologous cell lines recapitulated the characteristic eccDNAfib-L junction originally observed in silkworm. The flanking regions of these DNA fragments consist of the sequence upstream of the 5′ break point and the sequence downstream of the 3′ break point of eccDNAfib-L. The formation efficiency of eccDNAfib-L was closely related to the length of the flanking sequence. The direct short repeat “GAGT” near the break points, UV irradiation, and etoposide treatment promoted the formation of eccDNAfib-L. Genetic screening and DNA pull-down experiments showed that factors associated with DNA repair pathways, such as genes involved in microhomology-mediated end joining (MMEJ), participate in the biogenesis of eccDNAfib-L. Furthermore, using a cell-free reaction system, we found that Polθ is the key factor in the formation of eccDNAfib-L: Polθ can mediate the circularization step through direct short repeats, and the DExH-box helicase domain of Polθ is crucial for this process. Overall, this study identifies factors needed for eccDNAfib-L formation, uncovers a pathway for its generation, and complements previous reports on eccDNA formation.

Results

The formation efficiency of eccDNAfib-L is closely associated with the direct short repeats at the break points and the length of its flanking sequence, and the mechanism behind eccDNAfib-L biosynthesis is evolutionarily conserved

The eccDNAfib-L sequence originates from the 9,692,083–9,692,624 nt region of chromosome 14 of the silkworm genome (https://silkdb.bioinfotoolkits.net/main/species-info/-1) and carries part of the fib-L gene sequence. Unlike the vast majority of eccDNAs in the silk gland, which are TA-enriched at the 5′ and 3′ break points34, eccDNAfib-L has a “GAGT” sequence at both the 5′ and 3′ break points, but only one “GAGT” sequence is retained in eccDNAfib-L (Fig. 1A, B). To explore the mechanism behind the formation of eccDNAfib-L, primer pairs were designed at 200, 150, 100, and 50 nt upstream of the 5′ break point and at the same distances downstream of the 3′ break point (Fig. 1B). The fragments mini-200, mini-150, mini-100, and mini-50 were obtained by PCR amplification using silkworm genomic DNA as template, and each yielded a specific band of the expected size (Supplementary Fig. 1A). The products were transfected into Spodoptera frugiperda Sf9 cells (Fig. 1C), grass carp CIK cells (Fig. 1E), human ovarian cancer COV362 cells (Fig. 1G), and BmN cells (Fig. 1I). DNA was extracted 48 h later and amplified by PCR with junction primers. Sanger sequencing showed that the specific junction observed for silkworm eccDNAfib-L could also be detected in cells from fish and mammals (Supplementary Fig. 2A–D), suggesting that the circularization step during eccDNA formation is conserved across different cell types.
Meanwhile, the copy number of eccDNAfib-L in each microgram of total DNA was determined by qPCR, and it was found that the formation efficiency of eccDNAfib-L showed a regular decreasing trend, specifically manifested as the formation efficiency of mini-200 being comparable to that of mini-150, and both were significantly higher than that of mini-100, while the formation efficiency of mini-100 was higher than that of mini-50 (Fig. 1D, F, H, J).
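The layout of the mini-200, mini-150, mini-100, and mini-50 substrates can be illustrated with a short sketch. This is only a schematic aid: the function name `make_mini`, the 1-based inclusive coordinate convention, and the toy sequence are assumptions introduced here, whereas the fragments in this study were produced by PCR from silkworm genomic DNA.

```python
def make_mini(chrom_seq: str, start: int, end: int, flank: int) -> str:
    """Return the region [start, end] plus `flank` nt on each side.

    `start` and `end` are treated as 1-based inclusive break-point
    coordinates (an assumption made for this sketch).
    """
    left = start - 1 - flank   # 0-based index of the first flanking base
    right = end + flank        # 0-based exclusive end index
    if left < 0 or right > len(chrom_seq):
        raise ValueError("flank extends beyond the available sequence")
    return chrom_seq[left:right]


# Toy stand-in for a chromosome; this is NOT real silkworm sequence.
toy_chrom = "A" * 300 + "C" * 542 + "T" * 300
ecc_start, ecc_end = 301, 842  # a 542-nt region, matching eccDNAfib-L in length only

minis = {f"mini-{n}": make_mini(toy_chrom, ecc_start, ecc_end, n)
         for n in (200, 150, 100, 50)}
for name, seq in minis.items():
    print(name, len(seq))
```

Each substrate is the 542-nt region extended symmetrically, so the four fragments differ only in how much flanking sequence surrounds the two break points.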

Fig. 1: The formation of eccDNAfib-L is detected in different cells.

A The sequence of eccDNAfib-L. “GAGTACTC” represents the inverted repeat sequence at the 5′ break site. In addition to the “GAGT” sequences at the 5′ and 3′ break points, there are three additional “GAGT” sequences in the internal region. B Characterization of the sequences flanking the eccDNAfib-L break points and construction of mini-200, mini-150, mini-100, mini-50, mini-50-mut1, mini-50-mut2, and mini-50-mut3. The gray arrow indicates the transition from linear to circular, and the black arrow indicates the primer used for amplifying genomic DNA. The red arrow indicates the divergent primers eccDNAfib-L-junction, which are used for amplifying the sequences flanking the junction site of eccDNAfib-L. C 1 × 10⁵ Sf9 cells were cultured for 24 h and then transfected with 2 μg of mini-50, mini-100, mini-150, or mini-200. DNA was extracted after 48 h, and the sequence flanking the junction site was amplified by PCR with eccDNAfib-L-junction primers. Control, normal cells without treatment; H2O, 1 × 10⁵ Sf9 cells cultured for 24 h and then transfected with 1 μL of H2O. D The efficiency of eccDNAfib-L formation. 1 × 10⁵ Sf9 cells were transfected with 2 μg of mini-50, mini-100, mini-150, or mini-200. After 48 h, cells were collected, DNA was extracted, and the efficiency of eccDNAfib-L formation was determined by qPCR. E, G, I The experimental protocol was the same as for (C). F, H, J The experimental protocol was the same as for (D). The cells used were CIK cells (E, F), COV362 cells (G, H), and BmN cells (I, J). K 1 × 10⁵ Sf9 cells were cultured for 24 h and then transfected with 2 μg of mini-50, mini-50-mut1, mini-50-mut2, or mini-50-mut3. DNA was extracted after 48 h, and the sequence flanking the junction site was amplified by PCR with eccDNAfib-L-junction primers. Control, normal cells without treatment. L 1 × 10⁵ Sf9 cells were cultured for 24 h and then transfected with 2 μg of mini-50, mini-50-mut1, mini-50-mut2, or mini-50-mut3. After 48 h, cells were collected, DNA was extracted, and the efficiency of eccDNAfib-L formation was determined by qPCR (n = 3).

To confirm whether a closed circular DNA is formed, we transfected mini-50 into Sf9 cells. After treating the DNA samples extracted from the transfected cells with nuclease S1, PCR detection was carried out, and the junction PCR products could still be detected, suggesting a closed circular DNA is formed (Supplementary Fig. 3A). Furthermore, in Sf9 cells transfected with mini-50, specific signals representing the formation of circular DNA could also be observed by in situ cell hybridization (Supplementary Fig. 3B). Additionally, positive signal representing eccDNAfib-L formed by intramolecular cyclization was detected in mini-50-transfected Sf9 cells through Southern blot (Supplementary Fig. 3C). These results confirm that the linear DNA fragments transfected into the cells have formed closed-circular DNA.

To understand the effect of the inverted GAGT repeat (GAGTACTC) at the 5′ break point and the GAGT at the 3′ break point (Fig. 1A) on the circularization of eccDNAfib-L, three mutants of mini-50 were generated: mini-50-mut1 by deleting the “GAGT” flanking the 5′ break point, mini-50-mut2 by deleting the “ACTC” flanking the 5′ break point, and mini-50-mut3 by deleting the “GAGT” flanking the 3′ break point (confirmed by PCR and Sanger sequencing) (Fig. 1B and Supplementary Fig. 1B, C). Following transfection into Sf9 cells, total DNA was extracted after 48 h and amplified by PCR using junction primers, and the copy number of eccDNAfib-L per microgram of total DNA was quantified by qPCR. A specific junction was detected in Sf9 cells transfected with mini-50-mut2, which retains the “GAGT” at both break points, indicating that the direct short “GAGT” repeats at the 5′ and 3′ break points are critical for the circularization step during eccDNAfib-L formation (Fig. 1K, L and Supplementary Fig. 2E). Together, these results indicate that the formation of eccDNAfib-L is closely associated with the direct short repeat “GAGT” flanking the 5′ and 3′ break points and with the length of its flanking sequence.

UV irradiation and etoposide treatment of cells promote the formation of eccDNAfib-L

A previous study has shown that UV irradiation and etoposide treatment can induce apoptosis, thereby promoting the production of eccDNA25. To determine the impact of UV irradiation and etoposide treatment on the generation of eccDNAfib-L, BmN cells were treated with etoposide or UV light, and qPCR was used to determine the abundance of eccDNAfib-L in the treated cells. The abundance of eccDNAfib-L increased significantly (Fig. 2A), suggesting that both UV irradiation and etoposide treatment promote the production of eccDNAfib-L. qRT-PCR showed that the expression levels of Apaf-1 and Caspase-3, two pro-apoptotic factors, increased, while the expression level of bcl-2, an inhibitor of apoptosis, decreased in the treated groups (Fig. 2B). Moreover, flow cytometry showed that the percentage of apoptotic cells in the treated groups was significantly higher than that in the control group (untreated BmN cells) (Fig. 2C, D). A terminal deoxynucleotidyl transferase-mediated dUTP-biotin nick end labeling (TUNEL) assay showed that, compared to the control group, cells in the treated groups emitted more red positive signals representing apoptotic cells (Fig. 2E). Together, these results show that apoptosis was induced by the etoposide and UV treatments.

Fig. 2: UV irradiation and etoposide treatment can induce apoptosis and promote the formation of eccDNAfib-L.

A Efficiency of eccDNAfib-L formation. Total DNA extracted from BmN cells treated with etoposide or UV light was used to determine the efficiency of eccDNAfib-L formation by qPCR. B Expression levels of apoptosis-related genes. BmN cells were treated with etoposide or UV light, and total RNA was extracted from the treated cells. The transcription levels of apoptosis-related genes relative to TIF-4A were detected by qRT-PCR. Untreated cells were used as a control. C Proportion of apoptotic cells. D Annexin V-PE/7-AAD double staining assay. BmN cells were treated with etoposide or UV light, followed by Annexin V-PE and 7-AAD double staining, and the proportion of cells in different quadrants was determined by flow cytometry. E TUNEL assay. BmN cells were treated with etoposide or UV light, stained with a TUNEL assay kit (red), and observed under a fluorescence microscope (n = 3).

MMEJ-related genes are involved in eccDNAfib-L formation

It has been reported that DNA damage repair pathways, including MMEJ, are involved in eccDNA formation35. To investigate the effect of MMEJ on eccDNAfib-L formation, we used RNAi to individually silence key MMEJ-associated genes in BmN cells, including X-ray repair cross-complementing protein 1 (XRCC1, LOC101737558), flap endonuclease 1 (Fen1, LOC101741812), DNA polymerase theta (Polθ, LOC101746157), and DNA ligase 3 (Lig3, LOC101739679) (Fig. 3A, C, E, G). The level of eccDNAfib-L was significantly decreased in cells transfected with XRCC1-siRNA, Fen1-siRNA, Polθ-siRNA, or Lig3-siRNA compared to cells transfected with NC-siRNA containing a random sequence (Fig. 3B, D, F, H). Similar results were obtained when these genes were silenced in silkworm (Fig. 3I–P). We therefore suggest that MMEJ-related genes are involved in eccDNAfib-L formation.
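For readers unfamiliar with how knockdown efficiency is typically expressed relative to an internal reference gene, the following sketch shows the widely used 2^(-ΔΔCt) calculation. The text does not state which quantification formula was applied, so the method itself is an assumption here, and the Ct values are hypothetical placeholders rather than data from this study.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene by the 2^(-ddCt) method.

    The reference gene plays the role of TIF-4A and the control sample
    the role of NC-siRNA; applying this formula is an assumption.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)


# Hypothetical Ct values: target gene in siRNA-treated vs. NC-siRNA cells.
fold = relative_expression(26.0, 18.0, 24.0, 18.0)
print(fold)  # 0.25, i.e. 25% of the control expression level
```

A fold change below 1 indicates reduced transcript abundance in the siRNA-treated sample relative to the NC-siRNA control.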

Fig. 3: Effect of silencing MMEJ repair pathway-related genes on eccDNAfib-L formation.

A, C, E, G The interference efficiency of siRNAs in BmN cells. 1 × 10⁵ BmN cells were cultured for 24 h and then transfected with siRNA at a final concentration of 5 µmol/mL. Total RNA was extracted at 48 h post-transfection and reverse-transcribed into cDNA. The relative expression levels of MMEJ-related genes were determined by qPCR, with NC-siRNA as the control and the TIF-4A gene as the internal reference. B, D, F, H The effect of silencing MMEJ repair pathway-related genes on eccDNAfib-L formation in BmN cells. Cells were transfected with siRNAs as described above and collected at 48 h post-transfection for DNA extraction. The efficiency of eccDNAfib-L formation was determined by qPCR. I, K, M, O Interference efficiency of siRNA in silkworm. siRNA (5 µmol/mL, 5 µL) was injected into 5th-instar silkworm larvae. Total RNA of the silk gland was collected at 72 h post-injection and reverse-transcribed into cDNA. The relative expression levels of MMEJ-related genes were determined by qPCR, with NC-siRNA as the control and the TIF-4A gene as the internal reference. J, L, N, P The effect of silencing MMEJ repair pathway-related genes on eccDNAfib-L formation in silkworm. The silk glands of silkworms injected with siRNAs were collected at 72 h post-injection for DNA extraction. Using the eccDNAfib-L-junction primers, the efficiency of eccDNAfib-L formation was determined by qPCR (n = 3).

EccDNAfib-L formation is associated with DNA repair pathway

To identify proteins related to eccDNAfib-L formation, eccDNAfib-L and mini-50 were used as probes in DNA pull-down assays (Fig. 4A). To exclude non-specifically bound proteins, control pull-down assays were performed in two ways: one omitting the DNA probe and the other using a random DNA probe. After the precipitated complexes were separated by SDS-PAGE, a specific protein band with a molecular weight between 70 and 100 kDa in the mini-50 pull-down complex was recovered and identified by mass spectrometry. Proteins related to DNA repair pathways, such as Ku domain-containing protein (Ku, AAEG003684), DNA-(apurinic or apyrimidinic site) endonuclease (APE, LOC101739740), poly (ADP-ribose) glycohydrolase (PARG, LOC101743876), cell division cycle 5-like protein (Cdc5L, LOC101743516), and DNA-directed DNA polymerase (Polθ, LOC101746157), were identified as candidates associated with the formation of eccDNAfib-L (Fig. 4B, Supplementary Table 1). To test this, eccDNAfib-L formation was assessed in cultured BmN cells after silencing these genes with siRNAs. Production of eccDNAfib-L was significantly reduced after silencing the Ku, APE, Cdc5L, and Polθ genes (Fig. 4C–H and Fig. 3E, F), whereas it increased after silencing the PARG gene (Fig. 4I, J), indicating that proteins involved in DNA repair pathways participate in the formation of eccDNAfib-L.

Fig. 4: Efficiency of eccDNAfib-L formation after silencing the genes related to DNA repair pathway.

A A flowchart of the DNA pull-down. Left, DNA pull-down assay targeting mini-50. Right, DNA pull-down assay targeting eccDNAfib-L. B Separation of the DNA pull-down complexes based on mini-50 and eccDNAfib-L by SDS-PAGE. The biotin-labeled probe 5′-Bio-mini-50 targeting mini-50 and the biotin-labeled probe 5′-Bio-eccDNAfib-L targeting the eccDNAfib-L junction site were incubated with streptavidin magnetic beads, followed by incubation with silk gland protein. In parallel, control pull-down assays were conducted either without a DNA probe or with a random DNA probe. The precipitated complexes were separated by SDS-PAGE, and protein bands were visualized by silver staining. The red arrows indicate a specific band. C, E, G, I The interference efficiency of siRNA in BmN cells. 1 × 10⁵ BmN cells were cultured for 24 h and then transfected with siRNA at a final concentration of 5 µmol/mL. Total RNA was extracted at 48 h post-transfection and reverse-transcribed into cDNA. The relative expression levels of genes were determined by qPCR, with NC-siRNA as the control and the TIF-4A gene as the internal reference. D, F, H, J The effect of silencing DNA repair pathway-related genes on eccDNAfib-L formation in BmN cells. Cells were transfected with siRNAs as described above and collected at 48 h post-transfection for DNA extraction. The efficiency of eccDNAfib-L formation was determined by qPCR (n = 3).

Only in the cell-free system does Polθ require exposed direct repeats at both ends to join DNA molecules

Among MMEJ-related genes, Polθ had a more pronounced effect on eccDNAfib-L formation. To confirm the effect of Polθ on eccDNAfib-L production, we constructed the Polθ expression plasmid pIZT-V5-Polθ and found that Polθ was overexpressed in Sf9 cells transfected with pIZT-V5-Polθ (Fig. 5A). To unravel the role of Polθ in the formation of eccDNAfib-L, a DNA fragment, GAGT-eccDNAfib-L-GAGT, containing the linear sequence of eccDNAfib-L flanked by “GAGT”, was constructed by PCR (Supplementary Fig. 4A). The mini-200, mini-150, mini-100, mini-50, and GAGT-eccDNAfib-L-GAGT fragments were added to a cell-free reaction system containing a mixture of recombinant Polθ and its interacting proteins captured by protein A + G beads (Fig. 5B). The specific PCR product representing the junction site was detected only in the reaction containing GAGT-eccDNAfib-L-GAGT (Fig. 5D). These results indicate that, in the cell-free system, Polθ produces the junction of eccDNAfib-L only when the two microhomology sequences are at the terminal ends of the DNA. Next, the fragment GAGT-eccDNAfib-L was generated by deleting the “GAGT” at the 3′ end of GAGT-eccDNAfib-L-GAGT, and the fragment eccDNAfib-L-GAGT was generated by deleting the “GAGT” at the 5′ end of GAGT-eccDNAfib-L-GAGT. The linear sequence of eccDNAfib-L with “GAGT” deleted at both ends (L-eccDNAfib-L) was also prepared by PCR. These DNA fragments were validated by sequencing (Supplementary Fig. 4A). Cellular immunofluorescence showed that strong yellow fluorescence, generated by the co-localization of GAGT-eccDNAfib-L-GAGT (green) and Polθ (red), was mainly distributed around the nuclear envelope. In contrast, co-localization of Polθ with GAGT-eccDNAfib-L, eccDNAfib-L-GAGT, or L-eccDNAfib-L yielded substantially weaker yellow fluorescent signals (Fig. 5E).
In addition, Western blot analysis was performed on cytoplasmic and nuclear proteins isolated from BmN cells. Histone H3 (a nuclear protein marker) was not detected in the cytoplasmic protein samples, and tubulin (a cytoplasmic protein marker) was not detected in the nuclear protein samples. However, Polθ was detected in both types of protein samples (Supplementary Fig. 5), indicating that Polθ is distributed in both the cytoplasm and the nucleus. GAGT-eccDNAfib-L-GAGT, GAGT-eccDNAfib-L, eccDNAfib-L-GAGT, and L-eccDNAfib-L were added to a cell-free reaction system containing a mixture of recombinant Polθ and its interacting proteins captured by protein A + G beads (Fig. 5B). The specific PCR product representing the junction site was detected only in the reaction containing GAGT-eccDNAfib-L-GAGT (Fig. 5F). However, after ART558 (a Polθ inhibitor) was added to the cell-free reaction system, the junction site formed from GAGT-eccDNAfib-L-GAGT could not be detected by PCR (Fig. 5G). These results suggest that Polθ and the direct short repeat “GAGT” are critical factors for the biogenesis of eccDNAfib-L.

Fig. 5: Polθ requires exposed direct repeats on both ends to join DNA molecules.

A The recombinant Polθ expressed in Sf9 cells was detected by Western blot. 1 × 10⁵ Sf9 cells were cultured for 24 h and transfected with 2 μg of pIZT-V5-Polθ or pIZT-V5/His. Cells were collected at 48 h post-transfection for protein extraction. After the proteins were separated by SDS-PAGE, mouse anti-V5 antibody (1:1000) and mouse anti-α-tubulin antibody (1:5000) were used to detect the expressed protein by Western blot. HRP-labeled goat anti-mouse IgG (1:5000) was used as the secondary antibody. 50 µg of protein was loaded per lane. B Flowchart of the preparation of a mixture of recombinant Polθ and its interacting proteins, and of the cell-free reaction system containing this mixture. A subset of elements in this figure was generated by figdraw.com. C Flowchart of recombinant Polθ purification and of the cell-free reaction system containing purified Polθ. A subset of elements in this figure was generated by figdraw.com. D The mini-200, mini-150, mini-100, mini-50, and GAGT-eccDNAfib-L-GAGT fragments were added to a cell-free reaction system containing a mixture of Polθ and its interacting proteins. PCR and Sanger sequencing were used to detect the junction site of eccDNAfib-L. E Co-localization between GAGT-eccDNAfib-L-GAGT and Polθ in BmN cells. GAGT-eccDNAfib-L-GAGT, the linear sequence of eccDNAfib-L flanked by “GAGT”. GAGT-eccDNAfib-L, the linear molecule of eccDNAfib-L with “GAGT” deleted at the 3′ end. eccDNAfib-L-GAGT, the linear molecule of eccDNAfib-L with “GAGT” deleted at the 5′ end. L-eccDNAfib-L, the linear molecule of eccDNAfib-L with “GAGT” deleted at both ends. F 0.25 μg of GAGT-eccDNAfib-L-GAGT, GAGT-eccDNAfib-L, eccDNAfib-L-GAGT, or L-eccDNAfib-L was added to a cell-free reaction system containing a mixture of Polθ and its interacting proteins. PCR was used to detect the junction site of eccDNAfib-L.
G, H 1 μM of ART558 (Polθ inhibitor) and 0.25 μg of GAGT-eccDNAfib-L-GAGT were added to the cell-free reaction system containing a mixture of Polθ and its interacting proteins (G) or the purified Polθ (H) and incubated at 16 °C overnight, after which the junction site of eccDNAfib-L was detected by PCR. The cell-free reaction system without ART558 was used as a control. Line, PCR performed with primers for the linear sequence of eccDNAfib-L to detect the linear sequence. Circular, PCR performed with the eccDNAfib-L-junction primers to detect the junction site of eccDNAfib-L.

However, since additional proteins interacting with Polθ might still exist in the precipitated complex, we could not rule out potential roles of these co-precipitated proteins in the circularization process. Therefore, pIZT-V5-Polθ was transfected into Sf9 cells, followed by selection with zeocin at a final concentration of 2 μg/mL for 30 days. We then purified recombinant Polθ from the lysates of pIZT-V5-Polθ-transformed Sf9 cells using Ni-NTA agarose. The purified recombinant Polθ was used to reconstitute a cell-free system, and the circularization of linear DNA molecules was re-examined (Fig. 5C and Supplementary Fig. 6A, B). When GAGT-eccDNAfib-L-GAGT was added to the cell-free reaction system, the PCR product representing the junction site of eccDNAfib-L was detected. In contrast, when ART558 was added to the cell-free reaction system, the junction site formed from GAGT-eccDNAfib-L-GAGT could not be detected by PCR (Fig. 5H). These results further indicate that Polθ is the primary factor in eccDNAfib-L biogenesis.

To elucidate the role of Polθ in the formation of eccDNAfib-L, we investigated the impact of the Polθ inhibitor ART558 on eccDNAfib-L formation. The CCK-8 assay was used to assess the effect of ART558 on the viability of Sf9 cells. The results showed that a concentration of 1 μM ART558 had no significant effect on cell viability (Supplementary Fig. 7A). Subsequently, Sf9 cells were treated with 1 μM ART558, while untreated cells served as the control. After 24 h, 2 μg of GTAG-eccDNAfib-L-GTAG was transfected into the cells. Total DNA was extracted at 48 h post-transfection, and the copy number of eccDNAfib-L was quantified by qPCR using primers targeting the eccDNAfib-L junction. The results demonstrated a significant reduction in the content of eccDNAfib-L in cells treated with ART558, further confirming that Polθ contributes to the cyclization of eccDNAfib-L (Supplementary Fig. 7B).

Polθ can potentially perform the circularization of eccDNAfib-L via direct short repeats

We propose that the short repeat “GAGT” of GAGT-eccDNAfib-L-GAGT binds to Polθ, facilitating the formation of eccDNAfib-L. To test this hypothesis, the DNA fragments GTAG-eccDNAfib-L-GTAG, AGAG-eccDNAfib-L-AGAG, CATC-eccDNAfib-L-CATC, and CGAT-eccDNAfib-L-CGAT were constructed by replacing the “GAGT” of GAGT-eccDNAfib-L-GAGT with “GTAG”, “AGAG”, “CATC”, and “CGAT”, respectively (Supplementary Fig. 4C). PCR and Sanger sequencing showed that the junction site formed from GTAG-eccDNAfib-L-GTAG in the cell-free reaction system containing the purified Polθ could be detected, with one “GTAG” retained at the junction site. Similar results were found for AGAG-eccDNAfib-L-AGAG, CATC-eccDNAfib-L-CATC, and CGAT-eccDNAfib-L-CGAT (Fig. 6A, B). However, compared to GAGT-eccDNAfib-L-GAGT, these fragments formed their junction sites with lower efficiency (Fig. 6C). In addition, a DNA fragment GAGT-GFP-GAGT was generated by replacing the linear sequence of eccDNAfib-L in GTAG-eccDNAfib-L-GTAG with a green fluorescent protein (GFP) gene sequence. The fragment GAGT-GFP was generated by removing the “GAGT” at the 3′ end of GAGT-GFP-GAGT, and the fragment GFP-GAGT was generated by removing the “GAGT” at the 5′ end of GAGT-GFP-GAGT. The linear molecule of the gfp gene was also prepared. These DNA fragments were confirmed by sequencing (Supplementary Fig. 4B). PCR and Sanger sequencing showed that the junction site generated from GAGT-GFP-GAGT in the cell-free reaction system could be detected, and that only one “GAGT” was retained at the junction site. However, GAGT-GFP, GFP-GAGT, and the linear molecule of the gfp gene failed to circularize (Fig. 6D, E). These results suggest that the direct short repeat “GAGT” flanking the 5′ and 3′ break points of eccDNAfib-L is essential for eccDNAfib-L formation through microhomologous recombination mediated by Polθ.
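The sequence-level outcome described above, in which a fragment with identical 4-nt repeats at both termini circularizes and retains a single copy of the repeat at the junction, can be modelled with a toy sketch. The functions `circularize` and `junction` and the placeholder sequence `body` are hypothetical illustrations of the observed junction structure only; they do not model the biochemistry of Polθ, and the simple "both termini must match" rule is an assumption that mirrors the cell-free results.

```python
from typing import Optional


def circularize(fragment: str, repeat_len: int = 4) -> Optional[str]:
    """Join a linear fragment through identical terminal direct repeats.

    Returns the circle written as a linear string starting at the single
    retained repeat copy, or None if the two termini do not match.
    """
    if len(fragment) < 2 * repeat_len:
        return None
    if fragment[:repeat_len] != fragment[-repeat_len:]:
        return None
    return fragment[:-repeat_len]  # one repeat copy is lost on joining


def junction(circle: str, window: int = 4) -> str:
    """Sequence spanning the join: last `window` nt followed by first `window` nt."""
    return circle[-window:] + circle[:window]


body = "ACCTTGGAACC"  # placeholder insert, not a real sequence
circle = circularize("GAGT" + body + "GAGT")
print(circle)            # GAGTACCTTGGAACC
print(junction(circle))  # AACCGAGT, with a single GAGT at the join

# Fragments lacking a repeat at one or both ends do not circularize here.
print(circularize("GAGT" + body))  # None
print(circularize(body + "GAGT"))  # None
print(circularize(body))           # None
```

Divergent junction primers would amplify across the `junction` window, which is why Sanger sequencing of the PCR product can reveal whether one or two repeat copies were retained.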

Fig. 6: Polθ mediates the formation of eccDNAs by direct short repeats.

A 0.25 μg of GTAG-eccDNAfib-L-GTAG, AGAG-eccDNAfib-L-AGAG, CATC-eccDNAfib-L-CATC, or CGAT-eccDNAfib-L-CGAT was added to a cell-free reaction system containing the purified Polθ. The reaction product was used to detect the junction site by PCR. B Sanger sequencing of the junction site. C 0.25 μg of GTAG-eccDNAfib-L-GTAG, AGAG-eccDNAfib-L-AGAG, CATC-eccDNAfib-L-CATC, or CGAT-eccDNAfib-L-CGAT was added to a cell-free reaction system containing the purified Polθ. qPCR was used to determine the efficiency of junction site formation. D 0.25 μg of GAGT-GFP-GAGT, GAGT-GFP, GFP-GAGT, or the linear molecule of GFP was added to a cell-free reaction system containing the purified Polθ. PCR and Sanger sequencing were used to detect the junction site. E 0.25 μg of GTAG-eccDNAfib-L-GTAG or GAGT-GFP-GAGT was added to a cell-free reaction system containing the purified Polθ. qPCR was used to determine the efficiency of junction site formation. F, G DNA ligases participate in the formation of eccDNAfib-L. An in vitro cell-free reaction system (20 μL) containing 66 mM Tris-HCl (pH 7.6), 6.6 mM MgCl2, 10 mM DTT, 0.1 mM ATP, 0.25 μg of GAGT-eccDNAfib-L-GAGT, and 6 μg of purified Polθ was incubated at 16 °C overnight. 0.25 μg of the reaction product was mixed with 50 nM purified Polθ in a buffer containing 66 mM Tris-HCl (pH 7.6), 6.6 mM MgCl2, 10 mM DTT, 0.1 mM ATP, and 1 mM dNTPs in a total volume of 20 µL and incubated for 120 min at 37 °C for DNA synthesis. After treatment with S1 nuclease at 37 °C for 30 min, PCR and Sanger sequencing were used to detect the junction site (F). The synthesized product was ligated with T4 DNA ligase according to the manufacturer's instructions, followed by treatment with S1 nuclease at 37 °C for 30 min. The resulting product was subjected to PCR with eccDNAfib-L-junction primers, and the PCR product was confirmed by Sanger sequencing (G). The ligated product without S1 treatment was used as a control. H The synthesized product, treated with or without ligase, was subjected to agarose gel electrophoresis (n = 3).

To understand the formation of eccDNAs in silk glands, we analyzed the conservation of flanking sequences of break points of eccDNAs with different sizes. There was a short repeat sequence at the flanking sequences of the break points of 16 eccDNAs smaller than 1000 bp, accounting for 0.05% of all eccDNAs (Table 1). Among eccDNAs larger than 1000 bp, 17 eccDNAs had short repeat sequences at the flanking sequences, accounting for 0.84% of all eccDNAs (Table 2). These findings suggest that not all the circularization steps during eccDNA formation involve short repeat sequences flanking the break points.

Table 1 Conservation analysis of break point flanking sequences of eccDNAs smaller than 1000 bp
Table 2 Conservation analysis of break point flanking sequences of eccDNAs larger than 1000 bp
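The screen for short repeats flanking the break points can be illustrated with a minimal sketch. The paper does not describe its analysis script, so the function name `flanking_direct_repeat`, the 0-based end-exclusive coordinate convention, and the toy sequence are all assumptions made for illustration. The idea is simply to compare the first bases of the circularized segment with the bases immediately downstream of its 3′ break point.

```python
def flanking_direct_repeat(genome: str, start: int, end: int, max_len: int = 10) -> str:
    """Return the longest direct repeat (up to max_len bases) shared by both break points.

    Hypothetical convention: genome[start:end] is the segment found in the
    circle (0-based, end-exclusive). A repeat of length k is reported when the
    first k bases of the segment equal the k bases immediately downstream of it.
    """
    best = ""
    for k in range(1, max_len + 1):
        head = genome[start:start + k]   # bases at the 5' break point
        tail = genome[end:end + k]       # bases just past the 3' break point
        if len(head) < k or len(tail) < k or k > end - start:
            break                        # ran off the sequence or the segment
        if head != tail:
            break                        # a longer prefix cannot match either
        best = head
    return best


# Toy example (invented sequence): GAGT flanks both break points.
toy_genome = "TTT" + "GAGT" + "ACGTACGT" + "GAGT" + "CCC"
print(flanking_direct_repeat(toy_genome, start=3, end=15))
```

An empty string means no shared repeat was found at that pair of break points, which corresponds to the majority of eccDNAs in the analysis above.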

DNA ligase participates in the formation of eccDNAfib-L

The formation of eccDNA requires Lig3, which is essential for sealing the DNA junctions formed during MMEJ25,36,37. We hypothesized that the Polθ-generated intermediate of eccDNAfib-L contains a nick, which Lig3 might seal. To test this hypothesis, we used the reaction product of GAGT-eccDNAfib-L-GAGT from an in vitro cell-free system as a template for Polθ-mediated DNA synthesis. PCR detected the junction site formed by GAGT-eccDNAfib-L-GAGT in the synthesized product, but the junction site could not be detected after the synthesized product was treated with nuclease S1 (Fig. 6F). In addition, when the synthesized product was ligated by DNA ligase, the junction site formed by GAGT-eccDNAfib-L-GAGT could be detected by PCR regardless of whether the product was digested with nuclease S1 (Fig. 6G), indicating that DNA ligase was involved in the formation of eccDNAfib-L in vivo. To further validate the presence of a nick in the eccDNAfib-L generated via Polθ-mediated circularization, we performed agarose gel electrophoresis on both ligase-treated and untreated samples. The ligase-treated eccDNAfib-L migrated significantly faster than the untreated counterpart (Fig. 6H). This difference is most likely attributable to ligase-mediated sealing of the nick within the eccDNAfib-L molecules, which converts the relaxed, nicked circular DNA into a fully closed circular conformation. The resulting topological transition to a more compact structure reduces frictional resistance during electrophoresis, thereby accounting for the accelerated migration. Collectively, these findings provide electrophoretic evidence, from the perspective of conformational dynamics, supporting the existence of a nick in the Polθ-generated eccDNAfib-L.

DExH-box helicase domain of Polθ mediated the cyclization of eccDNAfib-L

Bioinformatics analysis revealed that silkworm Polθ (XP_021206915.3) contains DExH-box helicase (96-299 aa), Ski2 family helicase (298-503 aa), HTH (helix-turn-helix) (497-586 aa), HLD (polymerase θ helicase-like domain) (613-773 aa), and polymerase (1434-1823 aa) domains. To understand the role of these domains in the formation of eccDNAfib-L, we constructed the recombinant vectors pET28-DExH, pET28-Ski2, pET28-HTH, pET28-HLD, and pET28-polymerase to express the DExH-box helicase, Ski2 family helicase, HTH, HLD, and polymerase domains, respectively (Fig. 7A). SDS-PAGE and Western blot revealed specific recombinant protein bands representing purified DExH-box (22.63 kDa), Ski2 family (22.47 kDa), HTH (10.16 kDa), HLD (18.35 kDa), and polymerase (44.27 kDa) domains, respectively (Fig. 7B–F). Fluorescence detection demonstrated that the purified polymerase domain possesses catalytic activity (Supplementary Fig. 8). Subsequently, GAGT-eccDNAfib-L-GAGT was added to cell-free reaction systems containing the purified DExH-box, Ski2 family, HTH, HLD, or polymerase domain. PCR and Sanger sequencing demonstrated that the junction site formed by GAGT-eccDNAfib-L-GAGT could be detected in the presence of the purified DExH-box helicase domain (Fig. 7G). Additionally, the junction site formed by cyclization was detected regardless of whether ART558 was added (Fig. 7H). These findings indicate that the DExH-box helicase domain of Polθ mediates the formation of eccDNAfib-L.
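As a rough plausibility check on the band sizes quoted above, the domain boundaries can be converted to approximate masses. This is only a sketch: the factor of about 0.110 kDa per residue is a common rule of thumb rather than a value from this study, and the reported masses presumably also reflect the actual amino acid composition and any tag residues contributed by the vector.

```python
AVG_RESIDUE_KDA = 0.110  # rule-of-thumb average residue mass (assumption, not from the paper)

# Domain boundaries (first, last residue) as listed in the text for XP_021206915.3.
DOMAINS = {
    "DExH-box": (96, 299),
    "Ski2 family": (298, 503),
    "HTH": (497, 586),
    "HLD": (613, 773),
    "polymerase": (1434, 1823),
}


def approx_mass_kda(first: int, last: int) -> float:
    """Estimate the mass of a domain from its inclusive residue range."""
    n_residues = last - first + 1
    return round(n_residues * AVG_RESIDUE_KDA, 2)


for name, (first, last) in DOMAINS.items():
    print(f"{name}: {last - first + 1} aa, ~{approx_mass_kda(first, last)} kDa")
```

The estimates land within roughly 1.5 kDa of the reported bands, which is consistent with each recombinant protein corresponding to the stated residue range.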

Fig. 7: DExH-box helicase mediated the formation of eccDNAfib-L.

A A schematic diagram of the different domains in Polθ and their prokaryotic expression vectors. SDS-PAGE (left) and Western blot (right) for recombinant DExH-box (B), Ski2 family (C), HTH (D), HLD (E), and polymerase (F), respectively. The primary antibody was an anti-6×His tag mouse polyclonal antibody (diluted 1:1000). HRP-labeled goat anti-mouse IgG was used as the secondary antibody at a 1:5000 dilution. The red arrow indicates the target protein. G GAGT-eccDNAfib-L-GAGT was added to a cell-free reaction system containing the purified DExH-box, Ski2 family, HTH, HLD, or polymerase proteins. PCR and Sanger sequencing were used to detect the junction site of eccDNAfib-L. Control: GAGT-eccDNAfib-L-GAGT was added to a cell-free reaction system containing the purified protein expressed in E. coli transformed with pET28a (+). H ART558 (Polθ inhibitor, 1 μM) and 0.25 μg of GAGT-eccDNAfib-L-GAGT were added to the cell-free reaction system containing the purified DExH-box domain and incubated at 16 °C overnight; the junction site of eccDNAfib-L was then detected by PCR. The cell-free reaction system without ART558 was used as a control.

Discussion

It has been suggested that eccDNAs are produced via genomic rearrangements mediated by homologous recombination30. Previous studies have indicated a close association between short repetitive sequences and the formation of eccDNAs. These sequences play a role in their shedding from chromosomes to generate small DNA loops38. Subsequently, several models of eccDNA formation have been proposed, including the breakage-fusion-bridge (BFB) cycle39, the translocation-deletion-amplification model40, the episome model41 and the chromothripsis model42. However, these models do not offer a comprehensive understanding of eccDNA formation.

Previous studies have shown that contiguous repeat sequences adjacent to eccDNA break points, either in a forward or reverse orientation, can facilitate their formation. The eccDNAfib-L was derived from the 9,692,083–9,692,624 nt region of chromosome 14 of the silk gland genome. We analyzed the flanking sequence of 200 bp adjacent to eccDNAfib-L break points, but did not find any distinctive secondary feature. To understand the effect of flanking sequences on eccDNAfib-L formation, BmN cells were transfected with eccDNAfib-L linear molecules with different lengths in flanking sequences. The results revealed a notable association between the formation efficiency of eccDNAfib-L and the length of its flanking sequence. Of particular note, a marked reduction in formation efficiency was observed when the flanking sequence was truncated to 50 bp, which suggests that a minimum length of flanking sequence may be required for the formation of eccDNAfib-L.

It was reported that intrachromosomal homologous recombination between 9-base direct repeats produces a circular DNA with one copy of the repeats43. In this study, there were 4 bp direct short repeats “GAGT” on both sides of the eccDNAfib-L locus in the chromosome, and only one copy of the “GAGT” repeat was retained in the formed eccDNAfib-L. Therefore, we suggested that eccDNAfib-L formation is mediated by the 4 bp direct repeat “GAGT”. This hypothesis was confirmed by the deletion and mutation of the direct repeat “GAGT”. However, analysis of the eccDNAfib-L sequence revealed three additional “GAGT” sequences in the internal region, beyond those at the 5′ and 3′ break points. What determines the preference for specific 3′ and 5′ break points remains a mystery. This may be related to the formation of a special secondary structure between the upstream and downstream sequences of the 3′ and/or 5′ break points. Interestingly, eccDNAfib-L could be generated in cultured insect, fish, and mammalian cells after transfection with linear eccDNAfib-L molecules with different lengths of flanking sequences, suggesting that the circularization step during eccDNA formation is evolutionarily conserved across organisms.
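The observation that only one copy of the direct repeat survives in the circle can be made concrete with a small sketch. It treats sequences as plain strings and ignores all enzymology; the function names `circularize` and `junction` and the toy fragment are hypothetical and serve only to show what a junction carrying a single repeat copy looks like.

```python
def circularize(fragment: str, repeat: str) -> str:
    """Join a linear fragment flanked by a direct repeat into a circle.

    The circle is returned as a linear string that starts at the retained
    repeat copy; the second copy is lost at the junction.
    """
    if len(fragment) < 2 * len(repeat):
        raise ValueError("fragment too short to carry the repeat at both ends")
    if not (fragment.startswith(repeat) and fragment.endswith(repeat)):
        raise ValueError("fragment is not flanked by the direct repeat at both ends")
    return fragment[:-len(repeat)]


def junction(circle: str, repeat: str, flank: int = 4) -> str:
    """Return the sequence spanning the junction: `flank` bases before the
    retained repeat, the repeat itself, and `flank` bases after it."""
    return circle[-flank:] + circle[:len(repeat) + flank]


# Toy fragment (invented body sequence) flanked by GAGT on both sides.
linear = "GAGT" + "AACCGGTT" + "GAGT"
circle = circularize(linear, "GAGT")
print(circle)                    # circle sequence with one GAGT copy
print(junction(circle, "GAGT"))  # what a junction-spanning PCR product would read
```

In this simplified picture a fragment carrying the repeat at only one end raises an error instead of yielding a junction.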

A previous study has shown that UV irradiation and etoposide treatment can induce apoptosis, thereby promoting the production of eccDNA25. In this study, we found that the abundance of eccDNAfib-L was significantly increased in BmN cells treated with UV and etoposide, suggesting that apoptosis promotes the formation of eccDNAfib-L. However, in addition to apoptosis, UV and etoposide treatments are known to induce both DNA damage and replication stress44,45,46, pathways that may also mediate eccDNA biogenesis. Notably, during the larval stage of silkworms, apoptosis is only activated at a very late stage of silk gland development (wandering stage)47, and the abundance of eccDNAfib-L increases with silk gland development (data not shown). Therefore, apoptosis is unlikely to be a key factor in the formation of eccDNAfib-L in silk glands. The abundance of eccDNAs significantly increases following DNA damage28,48, suggesting that the formation of eccDNA is associated with DNA repair49,50,51. To identify candidate proteins involved in eccDNAfib-L formation, a DNA pull-down assay was used to screen for proteins associated with eccDNAfib-L formation. Five types of DNA repair-related proteins were identified. Among them, Ku protein is recognized as a key factor in the non-homologous end joining (NHEJ) pathway. During the repair of DSBs, Ku recognizes and binds to the end of the DNA strand, forming a Ku-DNA complex that recruits the DNA-dependent protein kinase catalytic subunit (DNA-PKcs) to the damaged site and activates its kinase activity. In addition, Ku-DNA complexes can recruit other proteins to participate in DNA repair processes52,53,54. APE functions as a nuclease that specifically repairs the AP site (apurinic/apyrimidinic site) in the base excision repair (BER) pathway55. Cdc5L, a transcriptional regulator, contributes to DNA damage repair by regulating the expression and splicing efficiency of genes56. 
Polθ, a low-fidelity polymerase with helicase-like activity, plays a central role in MMEJ57,58. In this study, the efficiency of eccDNAfib-L formation was significantly reduced after silencing the Ku, APE, Cdc5L, and Polθ genes in BmN cells. PARG is an enzyme responsible for the hydrolysis of ribose-ribose bonds on poly (ADP-ribose) (PAR)-ylated proteins and is involved in DNA repair59. However, silencing the PARG gene promoted eccDNAfib-L formation. These results indicate that DNA repair-related proteins affect the formation of eccDNAfib-L.

Among various pathways involved in DNA repair, it has been observed that excision following DSB and MMEJ repair can markedly affect the levels of eccDNA28. Nucleotide-resolution identification of eccDNA break points revealed the presence of microhomology sequences on both sides of most eccDNA break points, further demonstrating that MMEJ-mediated eccDNA formation is the main mechanism for eccDNA biogenesis60. In this study, eccDNAfib-L formation was downregulated after silencing the XRCC1, Fen1, Polθ, and Lig3 genes in cultured BmN cells and silkworms. Based on the effect of MMEJ-related genes on the efficiency of eccDNAfib-L formation, we speculated that Polθ plays a key role in the formation of eccDNAfib-L. An in vitro cell-free reaction system was used to assess the role of Polθ in the formation of eccDNAfib-L. We found that in the reaction system containing Polθ, a junction site with one copy of “GAGT”, which was formed from GTAG-eccDNAfib-L-GTAG, could be detected. Consistently, adding Polθ inhibitors to the reaction system inhibited eccDNAfib-L formation. This result was consistent with the Polθ-mediated microhomologous recombination mechanism, strongly suggesting that eccDNAfib-L was produced by MMEJ. Moreover, junction sites with one copy of the short repeat sequence formed from the GAGT-GFP-GAGT, AGAG-eccDNAfib-L-AGAG, CATC-eccDNAfib-L-CATC, and CGAT-eccDNAfib-L-CGAT DNA fragments were found in their corresponding cell-free reaction systems, while GAGT-GFP and the linear molecule of the GFP gene failed to form a junction site, indicating that the direct short repeats are essential for the circularization steps during eccDNA formation through Polθ-mediated MMEJ. Cellular immunofluorescence showed that strong yellow fluorescence, generated by the co-localization of GAGT-eccDNAfib-L-GAGT (green) and Polθ (red), was mainly distributed around the nuclear envelope. 
This observation suggests that Polθ is capable of binding to the linear form of eccDNAfib-L, which contains the short repeats “GAGT” at both sides. This phenomenon could be attributed to the ability of Polθ to recognize and bind to DNA fragments with microhomologous sequences at both ends, exhibiting a strong selectivity for such fragments61,62. Interestingly, in cell-free reaction systems containing mini-200, mini-150, mini-100, or mini-50, junction sites of eccDNAfib-L were undetectable. For DNA fragments with different sequences (possibly combinations of asymmetric DNA fragments), as long as both ends have overlapping sequences of a certain length, nucleic acid exonucleases can cleave the ends into single strands through enzymatic cleavage, thereby enabling complementary pairing and ligation63. Therefore, we believe that prior to initiating eccDNAfib-L formation, it is necessary to remove the upstream sequence of the 5′ short repeat sequence and the downstream sequence of the 3′ short repeat sequence at the eccDNAfib-L break point using an exonuclease.

Polθ, an enzyme belonging to the A family of DNA polymerases, is composed of two primary functional domains with different enzymatic activities: the N-terminal helicase-like domain (Polθ-HD) and the C-terminal polymerase domain (Polθ-PD). These domains are connected by an unstructured central domain64. Domain analysis of silkworm Polθ revealed that it contains five different structural domains. Of these, the DExH-box helicase domain can bind and hydrolyze any nucleoside triphosphate (NTP) to unwind oligonucleotide substrates. It uses the hydrolysis of NTPs to drive cycles of 3′ to 5′ directional movement, resolving and/or unwinding double-stranded RNA, DNA, RNA/DNA hybrids, R-loops, triplex DNA, and G-quadruplexes65. The Ski2-like family is a restricted family of superfamily SF2 helicases that includes RNA and DNA helicases66. In eukaryotic organisms, some Ski2-like enzymes function as DNA helicases involved in homologous recombination67,68. The HTH (helix-turn-helix) domain enables binding to DNA, playing a crucial role in the activation or repression of gene transcription69. Based on sequence and structural conservation, the helicase-like domain (HLD) of DNA Polθ is classified as a member of the SF2 helicases, specifically belonging to the Ski2-like family. It is proposed that the HLD exerts critical functions in the repair of DNA DSBs via the MMEJ pathway, including the removal of single-stranded DNA (ssDNA)-binding proteins such as replication protein A (RPA) and Rad51, as well as the mediation of microhomology sequence alignment70. The polymerase domain possesses the essential DNA polymerase activity required for gap filling71,72,73. In this study, these recombinant domains expressed in Escherichia coli were each added to the cell-free system containing GAGT-eccDNAfib-L-GAGT. The junction site of eccDNAfib-L was generated only in the presence of the DExH-box helicase domain. 
Consequently, we propose that the DExH-box helicase domain unwinds the double-stranded DNA fragment GAGT-eccDNAfib-L-GAGT, searches for microhomologous sequences, anneals the 5′ and 3′ short repeat sequences, and forms a “gap-closed loop” intermediate, leading to the initiation of eccDNAfib-L formation mediated by DNA polymerase.

DNA ligase is required for DNA replication, repair and recombination. Lig3 is specifically required to seal the DNA junction formed during MMEJ36,37, presumably after other enzymes have further processed the end-joining intermediate74, and the formation of eccDNA depends on Lig325. In this study, the reaction products generated from GAGT-eccDNAfib-L-GAGT in the cell-free reaction system were used as templates for Polθ-mediated DNA synthesis. We found that the junction site formed by GAGT-eccDNAfib-L-GAGT could be detected in the synthesized product by PCR, but could not be detected after the synthesized product was digested with nuclease S1. Interestingly, after the synthesized product was ligated by DNA ligase, the junction site could be detected regardless of whether it was digested with nuclease S1. Therefore, we speculated that the eccDNA intermediate formed by Polθ through the MMEJ pathway contains a nick, which can be sealed by DNA ligase.

In summary, our research confirmed that eccDNAfib-L formation is closely related to the length of the upstream sequence of its 5′ break point and the downstream sequence of its 3′ break point. The short direct repeats “GAGT” adjacent to the break points are necessary for eccDNAfib-L formation. The eccDNAfib-L formation is mediated by Polθ through MMEJ. The mechanism of the circularization steps during eccDNA biogenesis is conserved in the different organisms. Our research findings provide a model for the mechanism of the circularization steps during eccDNA formation. Specifically, the broken double-stranded DNA is unwound by the activity of DExH-box helicase of Polθ, followed by searching for micro-homologous regions for annealing, and removing 3′ protruding single strands. Subsequently, a complementary strand is synthesized by DNA polymerase, and the 5′ protruding single strand is removed to form a circular DNA molecule with a nick. Finally, the nick is ligated by ligase, and eccDNAfib-L is formed through crossover75 (Fig. 8). However, this model is a proposed model derived from the existing experimental results, and some details therein remain unclear, which merits further in-depth investigation.

Fig. 8: A model for the mechanism of the circularization steps during eccDNA formation.

The question mark indicates that the step shown is speculative.

Materials and methods

Cell culture

BmN and Sf9 cells were cultured in TC-100 medium (Thermo Fisher Scientific, USA) supplemented with 10% fetal bovine serum (Biological Industries, Israel) at 26 °C. CIK cells were cultured in M199 medium (BasalMedia, China) with 10% fetal bovine serum at 26 °C. COV362 cells were cultured in high-glucose DMEM (Hongsheng Biotechnology, China) containing 10% fetal bovine serum at 37 °C with 5% CO2.

Preparation of eccDNAfib-L linear molecules with different lengths of flanking sequences by PCR amplification

The preparation strategy is shown in Fig. 1B. The chromosomal coordinates were obtained based on the genome of the silkworm database SilkDB3.0 (https://silkdb.bioinfotoolkits.net/main/species-info/-1). Mini-200, mini-150, mini-100, and mini-50 primers (Supplementary Table 2) were designed based on the sequences at 200, 150, 100, and 50 nt upstream of the 5′ break point of eccDNAfib-L and downstream of the 3′ break point of eccDNAfib-L, respectively. The mini-200, mini-150, mini-100, and mini-50 DNA fragments from the silk gland DNA of B. mori were obtained through PCR amplification. mini-50-mut1 was generated by deleting the “GAGT” flanking the 5′ break point in mini-50, mini-50-mut2 by deleting “ACTC” flanking the 5′ break point, and mini-50-mut3 by deleting “GAGT” flanking the 3′ break point using the primer pairs of mini-50-mut1-F and mini-50-R, mini-50-mut2-F and mini-50-R, and mini-50-F and mini-50-mut3-R (Supplementary Table 2), respectively.

Verifying complete digestion of linear DNA

Silkworm gland DNA was extracted and incubated with Plasmid-Safe ATP-Dependent DNase (Lucigen, USA) at 37 °C for 30 min to remove linear genomic DNA, following the manufacturer′s instructions. The DNA sample not treated with this DNase served as a control. To confirm the complete elimination of linear DNA, PCR amplification was performed for verification, using specific primer pairs targeting the silkworm ovarian tumor gene promoter (Potu, derived from linear genomic DNA) and mitochondrial DNA (mtDNA, a circular molecule) (Supplementary Table 2), respectively. PCR products were analyzed by agarose gel electrophoresis to verify whether linear DNA had been completely removed (Supplementary Fig. 3D).

Detection of eccDNAfib-L

BmN, Sf9, CIK or COV362 cells (1 × 10⁵) were cultured in 6-well plates for 24 h. Subsequently, cells were transfected with 2 μg of mini-200, mini-150, mini-100 or mini-50. After 48 h, DNA was extracted and incubated with Plasmid-Safe ATP-Dependent DNase (Lucigen, USA) at 37 °C for 30 min to remove linear genomic DNA, following the manufacturer's instructions. The formation of eccDNAfib-L was detected by PCR using the primer eccDNAfib-L-junction (Supplementary Table 2). PCR products were sequenced after being subcloned into vector pMD-19T (Takara, Japan). Additionally, the total DNA was extracted from the treated cells. The copy number of eccDNAfib-L per microgram of total DNA was determined by qPCR to evaluate the formation efficiency of eccDNAfib-L. Each sample was repeated 3 times.

eccDNAfib-L formation efficiency assay

BmN cells (1 × 10⁵) were cultured in 6-well plates for 24 h and were transfected with 2 μg of mini-150, mini-100, mini-50, or mini-50-delete. After 48 h, cells were collected and DNA was extracted. The efficiency of eccDNAfib-L formation was detected by qPCR using the primer eccDNAfib-L-junction. The efficiency of eccDNAfib-L formation was evaluated using the copy number of eccDNAfib-L per microgram of cellular DNA.
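The conversion from a qPCR Ct value to copies per microgram of DNA can be sketched as follows. The text does not give the standard curve, so the slope, intercept, and DNA input used here are invented placeholders, and the function name `copies_per_microgram` is hypothetical. The sketch assumes the usual linear standard curve Ct = slope × log10(copies) + intercept obtained from a dilution series of a junction-containing standard.

```python
import math


def copies_per_microgram(ct: float, slope: float, intercept: float, input_ng: float) -> float:
    """Convert a Ct value into template copies per microgram of input DNA.

    Assumes a standard curve of the form Ct = slope * log10(copies) + intercept.
    `input_ng` is the amount of DNA (ng) loaded into the qPCR reaction.
    """
    if input_ng <= 0:
        raise ValueError("input_ng must be positive")
    copies_in_reaction = 10 ** ((ct - intercept) / slope)
    return copies_in_reaction / input_ng * 1000.0  # 1 microgram = 1000 ng


# Placeholder standard curve (slope -3.32 corresponds to roughly 100% efficiency).
example = copies_per_microgram(ct=24.72, slope=-3.32, intercept=38.0, input_ng=50.0)
print(f"{example:.3g} copies per microgram")
```

Normalizing to the DNA input is what makes samples with different yields comparable, which matches how formation efficiency is reported above.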

In situ hybridization

The biotin-labeled probe (5′-Bio-eccDNAfib-L) (Supplementary Table 2) targeting the junction site sequence of eccDNAfib-L was synthesized by Sangon Biotech (Shanghai, China). Sf9 cells were seeded in 24-well plates at 10⁴ cells/well and transfected with mini-50. The cells were then collected after 48 h. After washing three times with 1×PBS, cells were fixed in 4% paraformaldehyde and permeabilized with 0.5% Triton X-100 for 20 min. After washing twice with PBS, cells were blocked with hybridization buffer for 4 h at 37 °C, followed by overnight incubation with 5′-Bio-eccDNAfib-L. The next day, slides were sequentially washed with descending concentrations of SSC buffer (5×, 2×, 0.5×, 0.2×), prepared by diluting 20×SSC stock. After blocking with 3% BSA, signals were detected using fluorescein isothiocyanate (FITC)-conjugated streptavidin (Beyotime, China). After washing with PBS, the cells were counterstained with DAPI (1:1,000; Beyotime, China) and examined under a Leica SP8 Confocal Microscope (Leica, Germany) with a 40× objective. The cells transfected with H2O served as the control.

Southern blot

Sf9 cells (1 × 10⁵) were transfected with 2 μg of mini-50, while untransfected Sf9 cells served as controls. After 48 h, DNA was extracted and incubated with Plasmid-Safe ATP-Dependent DNase (Lucigen, USA) at 37 °C for 30 min to remove linear genomic DNA. A 10 μg aliquot of DNA was digested with KpnI and XbaI (Takara, Japan) at 37 °C overnight, and the digested products were separated by 1% agarose gel electrophoresis. Subsequently, the gel was incubated in denaturation buffer (0.5 M NaOH, 1.5 M NaCl) with shaking for 30 min, followed by transfer to neutralization buffer (0.5 M Tris-HCl, pH 7.5, 1.5 M NaCl) for an additional 30 min with shaking. After transferring the DNA from the gel to Hybond-N+ nylon membranes (Roche, Switzerland), Southern blot was performed with a biotin-labeled probe (5′-biotin-eccDNAfib-L) targeting the junction site of eccDNAfib-L. Signals were visualized using a Biotin Chromogenic Detection Kit (Thermo Scientific, USA).

Expression levels of apoptosis-related genes and the efficiency of eccDNAfib-L formation detected by qPCR

BmN cells were treated with 0.5 μM etoposide (Beyotime, China) for 24 h or irradiated with UV light at 3 mJ. Treated cells were cultured for 16 h. BmN cells without treatment were set as controls. The total RNA of cells was extracted using the RNeasy Plus Mini Kit (Qiagen, Germany). After removing the genomic DNA with DNase I (Beyotime, China), the total RNA was reverse transcribed into cDNA using random primers. Subsequently, the expression levels of apoptosis-related genes, including apoptotic protease activating factor-1 (Apaf-1), cysteinyl aspartate specific proteinase-3 (caspase-3), and B-cell lymphoma-2 (bcl-2), were detected by qPCR with the primers listed in Supplementary Table 2. The translation initiation factor eIF-4A (TIF-4A) gene (Supplementary Table 2) of silkworm was used as an internal reference. The relative expression levels of genes were calculated by the 2^(−ΔΔCt) method. In addition, the corresponding cells were collected for DNA extraction, and the efficiency of eccDNAfib-L formation was measured by qPCR using the primer eccDNAfib-L-junction (Supplementary Table 2). Each sample was repeated 3 times.
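The relative expression calculation named above follows the standard 2^(−ΔΔCt) arithmetic, sketched here. The Ct values in the example are invented placeholders, not data from this study, and the function name `relative_expression` is hypothetical; the reference gene corresponds to TIF-4A in the text.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in treated vs control cells by 2^(-ddCt).

    dCt   = Ct(target) - Ct(reference gene), computed per sample
    ddCt  = dCt(treated) - dCt(control)
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)


# Placeholder Ct values: the target amplifies 2 cycles earlier after treatment,
# while the reference gene is unchanged, giving a 4-fold increase.
print(relative_expression(22.0, 18.0, 24.0, 18.0))
```

The calculation assumes that the target and reference amplicons amplify with similar, near-100% efficiency, which is the usual condition for applying this method.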

Detection of apoptosis by TUNEL assay

The cover glasses (WHB, China) with BmN cells were washed three times with 1 × PBS. Cells were fixed in 4% paraformaldehyde and permeabilized with 0.5% Triton X-100 for 20 min. Cells were stained using a one-step TUNEL apoptosis assay kit (Beyotime, China) as described previously76, and observed under a fluorescence microscope (Leica, Germany).

Detection of apoptosis by double-staining with Annexin V-PE and 7-AAD

Normal and apoptotic BmN cells were collected into 1.5 mL tubes and centrifuged at 1000 r/min for 5 min at 4 °C. The supernatant was discarded and cells were washed twice with 1×PBS. Subsequently, cells were stained with an Annexin V-PE/7-AAD apoptosis detection kit (Yeasen, China) and analyzed by flow cytometry (BD Biosciences, USA) as described in a previous study77.

Effect of gene silencing on eccDNAfib-L formation

BmN cells (1 × 10⁵) were cultured for 24 h and transfected with siRNAs (Supplementary Table 2) targeting the DNA repair protein XRCC1 (XRCC1, LOC101737558), flap endonuclease 1 (fen1, LOC101741812), DNA polymerase theta (Polθ, LOC101746157), DNA ligase 3 (lig3, LOC101739679), Ku domain-containing protein (Ku, AAEG003684), DNA-(apurinic or apyrimidinic site) endonuclease (APE, LOC101739740), poly (ADP-ribose) glycohydrolase (PARG, LOC101743876), and cell division cycle 5-like protein (cdc5L, LOC101743516) genes at a final concentration of 5 µmol/mL. Total RNA was extracted 48 h after transfection. Cells treated with NC-siRNA with a random sequence were used as a control. Moreover, 5 μL of siRNA at a concentration of 5 μmol/mL was injected into silkworm larvae on the first day of the fifth instar (strain Qingsong×Haoyue). The total RNA of the silk gland was extracted at 48 h post injection. After the RNA was reverse transcribed into cDNA, the gene silencing efficiency was determined by qPCR. The primers are shown in Supplementary Table 2. The silk gland of NC-siRNA-injected silkworms was used as the control. Each sample was repeated 3 times.

In addition, the total DNA was extracted from treated BmN cells and silk glands. The copy number of eccDNAfib-L per microgram of DNA was determined by qPCR to evaluate the efficiency of eccDNAfib-L formation.

DNA pull-down

Total proteins were extracted from the silk gland of silkworm larvae on the first day of the fifth instar (strain Qingsong×Haoyue), and the concentration was determined with a BCA protein assay kit (Beyotime, China). The biotin-labeled primers (5′-Bio-mini-50-F/5′-Bio-mini-50-R; Supplementary Table 2) were designed based on the linear sequence of eccDNAfib-L and synthesized by Sangon Biotech (Shanghai, China). The biotin-labeled mini-50 fragment was generated via PCR using these primers. In addition, the biotin-labeled probe 5′-Bio-eccDNAfib-L (Supplementary Table 2), designed based on the eccDNAfib-L junction site, was also synthesized (China). Each was incubated with the silk gland extract, followed by the addition of streptavidin-coated magnetic beads (ThermoFisher, USA) for further incubation (Fig. 4A). The precipitated complexes were separated by SDS-PAGE, and the differential protein bands were recovered for mass spectrometry identification (Applied Protein Technology, China). To exclude non-specifically bound proteins, pull-down assays were also conducted in two control formats: one without any DNA probe and the other with a random DNA probe.

Construction of plasmid pIZT-V5-Polθ and transient expression of Polθ in Sf9 cells

The Polθ gene (XP_021206915.3) was synthesized by Genscript (China), and cloned into the KpnI and NotI sites of the vector pIZT-V5/His (Sangon Biotech, China) to construct the plasmid pIZT-V5-Polθ. In this plasmid, the Polθ gene was fused with V5 and 6×His tag sequences. Sf9 cells (1 × 10⁵) were cultured for 24 h and transfected with 2 μg of pIZT-V5-Polθ. The pIZT-V5/His-transfected cells were used as the control. Cells collected at 48 h post-transfection were subjected to Western blot for detection of recombinant Polθ. The primary antibody was a mouse anti-V5 antibody (Beyotime, China), and the secondary antibody was an HRP-labeled goat anti-mouse IgG (Proteintech, USA). α-tubulin (Proteintech, USA) was used as the internal reference. The protein loaded was 50 μg per lane. Each sample was repeated 3 times.

Preparation of DNA fragments flanked by short direct repeats

A DNA fragment GAGT-eccDNAfib-L-GAGT containing the linear sequence of eccDNAfib-L flanked by “GAGT” was obtained by PCR from the silk gland genomic DNA (primers are shown in Supplementary Table 2). Similar methods were used to construct GTAG-eccDNAfib-L-GTAG, AGAG-eccDNAfib-L-AGAG, CATC-eccDNAfib-L-CATC and CGAT-eccDNAfib-L-CGAT. A DNA fragment GAGT-GFP-GAGT was generated by replacing the linear sequence of eccDNAfib-L located at the GTAG-eccDNAfib-L-GTAG with a green fluorescent protein (GFP) gene sequence. Moreover, a DNA fragment GAGT-eccDNAfib-L was generated by deleting the “GAGT” located at the 3′ end of the DNA fragment GAGT-eccDNAfib-L-GAGT. eccDNAfib-L-GAGT was generated by deleting the “GAGT” located at the 5′ end of the DNA fragment GAGT-eccDNAfib-L-GAGT. GAGT-GFP and GFP-GAGT DNA fragment were generated using a similar approach. The PCR products were confirmed by Sanger sequencing.

Preparation for a mixture of recombinant Polθ and its interacting proteins

The recombinant Polθ with a V5 tag was enriched by immunoprecipitation from the total protein of Sf9 cells transfected with pIZT-V5-Polθ, as described in a previous study78. In brief, 1 µL of PMSF and 5 μL of anti-V5 antibody (Beyotime, China) were added to the total protein (400 μg) extracted from Sf9 cells transfected with pIZT-V5-Polθ. The mixture was incubated at 4 °C overnight, followed by incubation with Protein A + G (Beyotime, China) at 4 °C for 4 h. The precipitated complex was washed five times with PBS and was used as a mixture of recombinant Polθ and its interacting proteins.

Polθ protein purification

The pIZT-V5-Polθ plasmid was transfected into Sf9 cells, followed by screening with zeocin at a final concentration of 2 μg/mL for 30 days. Cells (1 × 10⁶) were collected and washed twice with 1×PBS, and then 5 mL of non-denaturing lysis buffer (Beyotime, China) was added and the cells were sonicated (40% intensity, 3.0 s/2.0 s, for 30 min) at 0 °C. After centrifugation at 12,000 rpm for 30 min at 4 °C, the supernatant was collected, and 3 mL of GUNTA-0 (Beyotime, China) was added to the precipitate, followed by sonication at 0 °C (40% intensity, 3.0 s/3.0 s, for 30 min); the supernatant was then obtained by centrifugation. The collected supernatant was adjusted to pH 8.0, after which Ni-NTA agarose (Beyotime, China) was added and slowly mixed at 4 °C for more than 1 h. The mixture was loaded onto a chromatography column and the purified Polθ protein was obtained by stepwise elution with 10, 20, 50, 100, 200, 300, 400 and 500 mmol/L imidazole. After dialysis to remove imidazole, the purified recombinant protein was detected by SDS-PAGE and Western blot. Western blot was conducted using a mouse anti-6×His tag antibody (Proteintech, USA) as the primary antibody and an HRP-labeled goat anti-mouse IgG (Proteintech, USA) as the secondary antibody.

Detection of the generated junction site in an in vitro cell-free reaction system

An in vitro cell-free reaction system (20 μL) containing 66 mM Tris-HCl (pH 7.6), 6.6 mM MgCl2, 10 mM DTT, 0.1 mM ATP, 0.25 μg of DNA fragments flanked by short direct repeats, and 6 μg of the mixture of Polθ and its interacting proteins, the purified Polθ, or a recombinant Polθ domain was incubated at 16 °C overnight79. The reaction products were then used as templates for PCR with Taq DNA polymerase and eccDNAfib-L-junction or GFP-junction primers (Supplementary Table 2). The PCR products were confirmed by Sanger sequencing. A reaction containing 1 μM ART558 (a Polθ inhibitor) was used as the control. As an additional control, PCR was performed with primers designed based on the linear sequence of eccDNAfib-L (Supplementary Table 2).
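For readers assembling a comparable 20 μL reaction, the stock volumes follow from C1·V1 = C2·V2. The sketch below is only an illustration: the final concentrations are those stated above, whereas the stock concentrations in `STOCK_MM` and the function name `stock_volumes` are assumptions that should be replaced with the stocks actually on hand.

```python
# Final concentrations (mM) of the buffer components, as stated in the text.
FINAL_MM = {"Tris-HCl (pH 7.6)": 66.0, "MgCl2": 6.6, "DTT": 10.0, "ATP": 0.1}

# ASSUMED stock concentrations (mM); not given in the text.
STOCK_MM = {"Tris-HCl (pH 7.6)": 1000.0, "MgCl2": 100.0, "DTT": 100.0, "ATP": 10.0}

REACTION_UL = 20.0  # total reaction volume from the text


def stock_volumes(final_mm, stock_mm, total_ul):
    """Volume (uL) of each stock needed so that C_stock * V_stock = C_final * V_total."""
    volumes = {}
    for name, c_final in final_mm.items():
        c_stock = stock_mm[name]
        if c_stock <= c_final:
            raise ValueError(f"stock of {name} must be more concentrated than the final mix")
        volumes[name] = c_final * total_ul / c_stock
    return volumes


volumes = stock_volumes(FINAL_MM, STOCK_MM, REACTION_UL)
remaining_ul = REACTION_UL - sum(volumes.values())

for name, v in volumes.items():
    print(f"{name}: {v:.2f} uL")
print(f"left for DNA, protein and water: {remaining_ul:.2f} uL")
```

With these assumed stocks several volumes are close to or below 1 μL, so in practice a concentrated premixed buffer would likely be easier to pipette accurately.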

Moreover, 0.25 μg of the reaction product generated from GAGT-eccDNAfib-L-GAGT in the in vitro cell-free reaction system was mixed with 50 nM purified Polθ in a buffer containing 66 mM Tris-HCl (pH 7.6), 6.6 mM MgCl2, 10 mM DTT, 0.1 mM ATP, and 1 mM dNTPs in a total volume of 20 µL, and incubated at 37 °C for 120 min. The product was ligated with T4 DNA Ligase (Takara, Japan) according to the manufacturer′s instructions and then treated with S1 nuclease (Thermo Scientific, USA) at 37 °C for 30 min. The resulting product was subjected to PCR with eccDNAfib-L-junction primers, and the PCR product was confirmed by Sanger sequencing. The ligated product without S1 nuclease treatment was used as a control.

Co-localization

Four biotinylated DNA fragments (GAGT-eccDNAfib-L-GAGT, GAGT-eccDNAfib-L, eccDNAfib-L-GAGT, and L-eccDNAfib-L) were generated via PCR using 5′-biotinylated primers (Supplementary Table 2). These primers were identical to those used for amplifying the corresponding DNA fragments, except that biotin was conjugated to their 5′ termini. BmN cells were seeded into 24-well plates at 10⁴ cells/well and cultured for 24 h. Each biotinylated DNA fragment was then co-transfected with the plasmid pIZT-V5-Polθ into the pre-cultured cells, and the cells were collected after 48 h. After washing three times with 1×PBS, the cells were fixed in 4% paraformaldehyde and permeabilized with 0.5% Triton X-100 for 20 min. After washing twice with PBS, the cells were blocked with a blocking solution (3% BSA). For immunofluorescence staining, the cells were incubated with mouse anti-V5 antibody (1:200), followed by incubation with FITC-conjugated anti-biotin antibodies (1:200) and Cy3-conjugated goat anti-mouse IgG (1:200, Servicebio, China). After washing with PBS, the cells were counterstained with DAPI (1:1000; Beyotime, China) and examined under a Leica SP8 Confocal Microscope (Leica, Germany) with a 40× objective.

Subcellular distribution of Polθ in the cytoplasm and nucleus

Cytoplasmic and nuclear proteins from BmN cells were extracted following the protocol provided by the Nuclear and Cytoplasmic Protein Extraction Kit (Beyotime, China). Western blot was performed to verify the separation of cytoplasmic and nuclear fractions using primary antibodies against H3 (Proteintech, USA) and tubulin (Proteintech, USA) at a 1:1000 dilution, with HRP-labeled goat anti-mouse IgG (Proteintech, USA) at a 1:5000 dilution as the secondary antibody. A total of 50 μg of protein was loaded per lane. To investigate the subcellular distribution of Polθ between the nucleus and cytoplasm, an anti-Polθ antibody (Huabio, China) was utilized as the primary antibody, and the HRP-labeled goat anti-rabbit IgG at a 1:5000 dilution (Proteintech, USA) served as the secondary antibody.

Expression of Polθ domains in Escherichia coli

The domains of Polθ were predicted using InterPro (https://www.ebi.ac.uk/interpro/). The DNA sequences encoding the DExH-box (96-299 aa), Ski2 family (298-503 aa), HTH (497-586 aa), HLD (613-773 aa), and polymerase (1434-1823 aa) domains were synthesized by GenScript (China) and cloned into the EcoRI and HindIII sites of the vector pET-28a (+) (Invitrogen, USA) to construct the plasmids pET28-DExH, pET28-Ski2, pET28-HTH, pET28-HLD, and pET28-polymerase. Each recombinant plasmid was transformed into E. coli strain BL21 (DE3) (GenScript, China). E. coli transformed with pET-28a (+) served as a negative control. After induction with IPTG at a final concentration of 1 mmol/L (Supplementary Fig. 6C), the total protein from the induced bacterial cells was collected. The recombinant DExH-box, Ski2 family, HTH, HLD, and polymerase proteins were purified separately using Ni-NTA agarose (Beyotime, China) according to the manufacturer′s instructions. The recombinant proteins were refolded by dialysis in TGN buffer (50 mM Tris-base, 0.5 mM EDTA, 50 mM NaCl, 1% arginine, 10% glycerol, 5 mM GSSG, and 2 mM DTT) and then subjected to SDS-PAGE with a 5% stacking gel and a 10% separating gel. Western blot was conducted using a mouse anti-6×His tag antibody (Proteintech, USA) as the primary antibody and an HRP-labeled goat anti-mouse IgG (Proteintech, USA) as the secondary antibody.
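The domain boundaries listed above can be checked with a few lines of Python. This sketch, with hypothetical helper names, computes the length of each domain in amino acids, the length of the corresponding coding sequence (three nucleotides per residue, assuming no added start or stop codon and ignoring vector-encoded tags), and the overlap between consecutive domains.

```python
# Domain boundaries (first and last residue, inclusive) as listed in the text.
DOMAINS = {
    "DExH-box": (96, 299),
    "Ski2 family": (298, 503),
    "HTH": (497, 586),
    "HLD": (613, 773),
    "polymerase": (1434, 1823),
}


def length_aa(span):
    """Number of residues in an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def overlap_aa(a, b):
    """Number of residues shared by two inclusive spans (0 if they do not overlap)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


lengths = {name: length_aa(span) for name, span in DOMAINS.items()}
# ASSUMPTION: coding length counts only the domain codons (no start/stop codon, no tag).
coding_nt = {name: 3 * n for name, n in lengths.items()}

names = list(DOMAINS)
overlaps = {
    (names[i], names[i + 1]): overlap_aa(DOMAINS[names[i]], DOMAINS[names[i + 1]])
    for i in range(len(names) - 1)
}

for name in names:
    print(f"{name}: {lengths[name]} aa, {coding_nt[name]} nt")
for pair, n in overlaps.items():
    print(f"{pair[0]} / {pair[1]} overlap: {n} aa")
```

The listed boundaries imply small overlaps between the DExH-box and Ski2 family constructs and between the Ski2 family and HTH constructs, which the overlap calculation makes explicit.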

Detection of polymerase activity in purified polymerase domain

Using the linear full-length sequence of eccDNAfib-L as a template, the sense single-stranded eccDNAfib-L-ss (+) was amplified by PCR using the eccDNAfib-L-junction-R primer (Supplementary Table 2). The reaction system contained 1 μM sense single-stranded eccDNAfib-L-ss (+), 10 nM eccDNAfib-L-junction-R primer, 200 μM of each dNTP, 1× SYBR Green I (Beyotime, China), and 1 μg of the purified polymerase domain. The reaction mixture was adjusted to a final volume of 50 μL with PCR reaction buffer (Takara, Japan). The reactions were incubated at 37 °C for 30 min, during which fluorescence intensity was measured every 5 min at excitation/emission wavelengths of 497/520 nm. Positive controls using Taq DNA polymerase (Takara, Japan) and negative controls using bovine serum albumin (BSA) were included.
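One simple way to summarize such a time course is the slope of fluorescence against time. The following Python sketch is not the analysis used in this study. It assumes that a reading is also taken at 0 min, and the fluorescence values are placeholders generated from a straight line purely to exercise the code; they are not measured data. The function names are hypothetical.

```python
def read_times(total_min=30, interval_min=5):
    """Read-out times in minutes, ASSUMING a reading at t = 0 and at every interval."""
    return list(range(0, total_min + 1, interval_min))


def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


times = read_times()

# PLACEHOLDER readings (arbitrary units) lying on a straight line; NOT measured data.
placeholder_signal = [100.0 + 4.0 * t for t in times]

rate = least_squares_slope(times, placeholder_signal)
print(f"{len(times)} readings, slope = {rate:.2f} a.u./min")
```

Comparing the slope of the polymerase-domain reaction with those of the Taq and BSA controls would give a single number per reaction, although the appropriate readout depends on how linear the real signal is.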

Statistics

Data are expressed as means ± standard deviation (SD). Statistical analyses were performed using one-way analysis of variance (ANOVA) and t test to determine statistical significance between the groups using GraphPad Prism 8.3 software. Statistical significance was set at P ≤ 0.05.
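As a rough illustration of these summary statistics, the sketch below computes mean ± SD and a one-way ANOVA F statistic in plain Python. The analyses in this study were performed in GraphPad Prism, so this is not the authors' procedure. The three groups are hypothetical numbers chosen only to make the arithmetic easy to follow, and the function names are illustrative. Converting F into a P value requires the F distribution, which a statistics library such as SciPy (`scipy.stats.f_oneway`) provides; that step is omitted here.

```python
from statistics import mean, stdev

# HYPOTHETICAL example groups; not data from this study.
groups = {
    "control": [1.0, 2.0, 3.0],
    "treatment A": [2.0, 3.0, 4.0],
    "treatment B": [5.0, 6.0, 7.0],
}


def mean_sd(values):
    """Mean and sample standard deviation (n - 1 denominator)."""
    return mean(values), stdev(values)


def anova_f(samples):
    """One-way ANOVA F statistic with its between- and within-group degrees of freedom."""
    all_values = [v for s in samples for v in s]
    grand_mean = mean(all_values)
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    ss_within = sum((v - mean(s)) ** 2 for s in samples for v in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


for name, values in groups.items():
    m, sd = mean_sd(values)
    print(f"{name}: {m:.2f} ± {sd:.2f}")

f_stat, df_b, df_w = anova_f(list(groups.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```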

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.