Main

One of the key mechanisms controlling genome folding in metazoan nuclei involves contacts of the genome with the nuclear lamina (NL)1,2. These contacts are mediated by large genomic regions named lamina-associated domains (LADs)2,3,4,5. In most human and mouse cell types there are about 1,000 LADs, each spanning 0.1–10 Mb and collectively covering about 40% of the genome6,7,8,9. Polymer simulations, supported by experimental evidence, indicate that LAD–NL interactions have a key role in chromosome spatial organization10,11,12,13,14.

Genes within LADs are mostly transcriptionally inactive6,7,15,16,17. Some studies described a causal role for the NL in gene silencing18,19 while others found no evidence for gene repression upon forced NL anchoring of a locus20. Because of the artificial nature of the tethering mechanism used in these studies, it is still unknown how natural NL-tethering mechanisms may contribute to gene repression.

Although several chromatin and NL proteins have been identified that may mediate NL–LAD interactions21,22, a major unresolved question is how these interactions are encoded within LAD DNA. This is particularly intriguing considering the enormous size of LADs. For example, a LAD of 3 Mb consists of ~1 mm of linear DNA or roughly 150 µm of a 10-nm nucleosomal fiber. How is the anchoring of such a giant molecule to the NL instructed by its own DNA sequence?
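The length estimates above can be checked with a few lines of arithmetic. This is a minimal sketch: the 0.34-nm rise per base pair is the standard value for B-form DNA, while the roughly sevenfold compaction of a 10-nm fiber is an assumption chosen for illustration and is not a value taken from this study.

```python
# Back-of-envelope check of the physical length of a 3-Mb LAD.
BP_RISE_NM = 0.34           # rise per base pair of B-form DNA, in nm
FIBER_PACKING_RATIO = 7.0   # ASSUMPTION: ~7-fold compaction in a 10-nm fiber


def linear_length_mm(n_bp: float) -> float:
    """Contour length of naked DNA in millimetres."""
    return n_bp * BP_RISE_NM * 1e-6  # nm -> mm


def fiber_length_um(n_bp: float, packing_ratio: float = FIBER_PACKING_RATIO) -> float:
    """Approximate length of a 10-nm nucleosomal fiber in micrometres."""
    return n_bp * BP_RISE_NM * 1e-3 / packing_ratio  # nm -> µm, then compact


lad_bp = 3_000_000
print(f"linear DNA: {linear_length_mm(lad_bp):.2f} mm")
print(f"10-nm fiber: {fiber_length_um(lad_bp):.0f} µm")
```

With these inputs the estimate lands close to the ~1 mm and ~150 µm quoted in the text; a different assumed packing ratio shifts the second number proportionally.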

Single-cell DamID maps indicate that LAD DNA contacts the NL in continuous stretches23. Several models may explain these broad interactions. First, LADs may have affinity for the NL along their entire length, either because of a dense peppering of sequence elements mediating NL interactions or because of the near-uniform coating of LADs by a histone modification or NL-binding chromatin protein. Alternatively, each LAD may contain one or a few tethering elements, with the remainder of the LADs secured to the NL by a zipper-like mechanism or through passive contacts24,25. Hybrid models are also possible; LADs could contain few tethering elements driving NL contacts with relatively high affinity, complemented by a multitude of weakly interacting elements.

Previous reports described small-scale screens to identify NL-tethering elements26,27,28. Multiple LAD fragments, some only a few kilobases in size, were identified that appeared to have the potential to promote NL interactions. However, in most experiments, the random integration of these sequences hampered interpretation of the results because the ability of some fragments to target to the NL was demonstrated to be context dependent27. Moreover, homing effects29,30 could have biased the integrations toward specific chromatin contexts. Critically, the role of the candidate elements was not further tested in their native LAD context. These early studies also relied strictly on fluorescence microscopy of tagged loci, which was unable to reveal the detailed pattern of NL contacts in the tested loci. Systematic deletions and rearrangements of LAD DNA in its natural location, together with high-resolution mapping of changes in NL contact across the entire locus, would provide valuable insights into the mechanisms that drive LAD–NL interactions.

The interplay between neighboring LADs is also poorly understood. Because LADs contact each other in three-dimensional (3D) space31,32, neighboring LADs could interact with the NL in a coordinated manner, as suggested by single-cell maps33, rather than independently. However, there is no perturbation-based evidence that neighboring LADs may influence each other’s NL interaction.

Here, we report the systematic dissection of LADs in mouse cells, addressing several of the unresolved issues highlighted above, such as the principles of LAD–NL contacts, LAD–LAD cooperativity and the link between NL interaction and gene repression. To study LADs in their native context, we developed an efficient technology to create series of deletions and inversions of up to ~2 Mb. This technology combines local ‘hopping’ of the Sleeping Beauty (SB) transposon with Cre-mediated recombination between two loxP sites33,34,35. Using this approach, we truncated and rearranged LADs in multiple ways and mapped the effects on NL interactions, gene expression and heterochromatin formation.

Results

Experimental design

We adapted an SB transposon-based strategy33,34,35 to generate large deletions and inversions in LADs and the flanking inter-LAD (iLAD) regions. This technique consists of three main steps. First, a cassette consisting of two loxP recombination sites, one of which is embedded in an SB transposon, is integrated into a genomic locus of interest. Second, SB hopping is induced by transient expression of SB transposase; the transposon is excised and typically reintegrates semirandomly within 2 Mb of its original site (Fig. 1a, left)33,36,37. Thus, one loxP (the anchor loxP) remains in the original insertion position while the other (the hopped loxP) is relocated to various positions in the locus. Third, recombination between the two loxP sites is induced by expression of Cre recombinase (Fig. 1a, right). If the loxP sites are in the same orientation, the sequences in between them will be deleted; if the loxP sites are in the opposite orientation, the sequences will be inverted. This powerful approach allows the generation of clonal cell lines with a wide diversity of inversions and deletions in a region of interest.
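The outcome of the third step can be illustrated with a toy model. The sketch below represents a locus as an ordered list of labeled segments and applies the rule stated above: same-orientation loxP sites delete the intervening sequence, opposite-orientation sites invert it. The function name, segment labels and boundary indices are hypothetical and serve only as illustration; they are not part of the authors' pipeline.

```python
from typing import List, Tuple

Segment = Tuple[str, str]  # (label, strand), strand is '+' or '-'


def cre_recombine(locus: List[Segment], anchor: int, hopped: int,
                  same_orientation: bool) -> List[Segment]:
    """Return the locus after Cre recombination between two loxP sites.

    `anchor` and `hopped` are boundary indices (0..len(locus)) marking where
    the anchor loxP and the hopped loxP sit between segments.
    """
    left, right = sorted((anchor, hopped))
    if same_orientation:
        # Direct repeats: the intervening sequence is deleted.
        return locus[:left] + locus[right:]
    # Inverted repeats: the intervening sequence is reversed and each
    # segment changes strand.
    flipped = [(label, '-' if strand == '+' else '+')
               for label, strand in reversed(locus[left:right])]
    return locus[:left] + flipped + locus[right:]


# Hypothetical locus: two iLAD segments followed by two LAD1 segments.
locus = [("iLAD_a", "+"), ("iLAD_b", "+"), ("LAD1_a", "+"), ("LAD1_b", "+")]
deletion = cre_recombine(locus, anchor=1, hopped=3, same_orientation=True)
inversion = cre_recombine(locus, anchor=1, hopped=3, same_orientation=False)
print(deletion)
print(inversion)
```

In this toy case the deletion joins `iLAD_a` directly to `LAD1_b`, whereas the inversion places `LAD1_a` next to `iLAD_a` and moves `iLAD_b` next to `LAD1_b`, which mirrors how a single hopped loxP can yield either a truncated or a split LAD.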

Fig. 1: Strategy to generate large inversions and deletions that disrupt LADs.

a, Left: a cassette containing two loxP recombination sequences, one of which is embedded within an SB transposon, is inserted in an iLAD (orange). After SB hopping, clones with SB integrations in the nearby LAD are selected. In these clones, one loxP is now located inside the LAD, while the other loxP is still at the original iLAD position. NE, nuclear envelope. Right: Cre-mediated recombination generates a deletion (top) or an inversion (bottom), depending on the relative orientation of the two loxP sites. b, Top: CAST allele-specific pA-DamID track of NL interactions, across the locus of interest on chromosome 9, in mES cells. Bin size, 20 kb. The loxP-SB cassette is inserted in the LAD1-flanking iLAD (arrow). iLADs, orange; LADs, blue. Blue horizontal lines mark four LADs. Bottom: gene annotation track. Genes are colored according to their expression level in WT cells (not allele-specific). c, Top: micro-C contact map, using data from a previous study44. Red square, maximum number of counts per bin. Bin size, 25 kb. Dashed lines represent the sub-TAD in which the launch site is located. Bottom: distribution of SB integrations after hopping from the launch site, shown as a histogram of the number of SB integrations per bin of 100 kb (left y axis) and as a density curve (right y axis). Each tick on the x axis represents a unique SB integration event. SB integrations were mapped in both cell pools and clonal cell lines. Red line, launch site; blue rectangles, LADs; n, the number of unique integrations in the plotted window. Blue arrows mark SB integrations used in subsequent rehopping experiments in d,e. d, Top: micro-C contact map, using data from a previous study44. Bottom: distribution of SB integrations after hopping from LAD1, shown as a histogram of the number of SB integrations per bin of 100 kb (left y axis) and as a density curve (right y axis). Each tick on the x axis represents a unique SB integration event. SB integrations were mapped in both cell pools and clonal cell lines. Dark-blue line, launch site; blue rectangle, LAD; n, the number of unique integrations in the plotted window. e, Top: micro-C contact map, using data from a previous study44. Bottom: distribution of SB integrations after hopping from LAD2, shown as a histogram of the number of SB integrations per bin of 100 kb (left y axis) and as a density curve (right y axis). Each tick on the x axis represents a unique SB integration event. SB integrations were mapped in both cell pools and clonal cell lines. Light-blue line, launch site; blue rectangle, LAD; n, the number of unique integrations in the plotted window. f, Plot of the size of Cre-mediated recombinations, as a function of Cre recombination efficiency. Colors indicate the type of recombination (that is, deletion or inversion). For the 12 recombinations reported in this study, Cre recombination efficiency appears largely independent of the distance between the two loxP sites.


Compared to previous work34,38, we implemented several improvements. Firstly, we did not integrate any selectable marker gene together with the loxP-SB cassette, because it is likely to affect the LAD architecture39. Secondly, we used SB transposase SB100x, which is 10–100 times more active than its previous version SB11 (ref. 40). Thirdly, mapping of SB integrations was improved by the use of the sensitive TagMap technique41. Lastly, we selected Cre recombinase-transfected cells, enabling a more efficient screening for recombined clones.

As a model system, we chose mouse embryonic stem cells (mES cells), which have a well-characterized LAD architecture7,9,32,42. We conducted all experiments in F1 hybrid (129/Sv:CAST/EiJ) mES cells, in which the CAST and the 129S1 alleles can be distinguished because of the high density of sequence polymorphisms. We monitored effects of rearrangements in one allele, while the other, unperturbed allele served as a control. Moreover, heterozygous perturbations are less likely to compromise cellular homeostasis or differentiation state than homozygous ones.

Efficient generation of genomic rearrangements involving LADs

We studied a ~10-Mb region on chromosome 9, harboring four LADs (here named LAD1 through LAD4) of varying length and NL interaction frequency (Fig. 1b and Extended Data Fig. 1a). LAD1, LAD2 and LAD4 are facultative LADs, while LAD3 is a constitutive LAD (Extended Data Fig. 1b)9. The locus includes dozens of genes, enabling us to link any effects of the rearrangements on NL interactions to changes in gene expression.

We integrated the loxP-SB cassette in the CAST allele of mES cells, in an intergenic iLAD sequence, ~630 kb away from LAD1 (Fig. 1b, arrow). We then induced SB hopping by transient expression of the SB transposase, isolated clonal cell lines, screened these for SB hopping from the launch site by PCR (Extended Data Fig. 1c) and finally mapped the new integration sites using TagMap41. Of 183 screened clones, this yielded 30 clones with precisely mapped hopping events. Combined with SB integrations mapped in pools of cells, these data showed that most SB integrations remained confined to a region of ~2 Mb around the launch site (Fig. 1c). Of 142 integrations found on chromosome 9, 88 were located less than 1 Mb away from the launch site, in line with earlier studies33,36,37. Surprisingly, we detected only three hopping events within the nearby LADs. To test whether this apparent constraint on SB hopping was specific for the selected locus, we inserted the same loxP-SB cassette either closer to a LAD border (300 kb) (Extended Data Fig. 1d) or in a smaller iLAD surrounded by two large LADs (Extended Data Fig. 1e). Again, most hopped SB integrations remained in iLADs (324/341 and 205/226) and were even strongly skewed away from the nearby LAD in one of the two locations (Extended Data Fig. 1d). These results suggested that LAD chromatin is refractory to SB integration or that hopping from an iLAD to a LAD is less efficient than within iLADs. To test these two hypotheses, we reinduced SB hopping in two clones in which SB had integrated within LAD1 (Fig. 1d) or LAD2 (Fig. 1e). Mapping of SB integration sites in the resulting rehopped clones revealed integrations throughout the respective LADs (114/124 and 61/89), showing that LADs are not refractory to SB transposition. Instead, most integrations occurred within the LAD, with few hopping events to the neighboring iLADs (10/124 and 28/89). 
Hence, SB hopping is efficient within both LADs and iLADs but is rare between the two compartment types. Because LAD–iLAD compartmentalization partially overlaps with topologically associating domain (TAD) structure (Extended Data Fig. 1a,d,e), it appears that SB hopping is constrained by the genome 3D folding.
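The compartment bias described above can be summarized as the fraction of hopped integrations that stay in the compartment type of their launch site. The short sketch below only re-expresses the counts quoted in the text as fractions; the dictionary labels are descriptive names introduced here for illustration.

```python
# (integrations remaining in the launch compartment, total mapped integrations),
# using the counts quoted in the text.
hops = {
    "additional iLAD launch site 1": (324, 341),
    "additional iLAD launch site 2": (205, 226),
    "rehopping from LAD1": (114, 124),
    "rehopping from LAD2": (61, 89),
}


def retained_fraction(same_compartment: int, total: int) -> float:
    """Fraction of integrations that stay in the launch compartment type."""
    return same_compartment / total


fractions = {name: retained_fraction(*counts) for name, counts in hops.items()}
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

In all four experiments the majority of integrations remain in the compartment type they started from, although the bias is visibly weaker for rehopping from LAD2 than for the other launch sites.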

From the thus obtained clones with loxP integrations throughout LAD1 or LAD2, we selected 12 for the generation of inversions and deletions. For each of these clones, we transiently expressed Cre, isolated 30–150 subclones and identified recombined subclones by PCR screening as illustrated in Extended Data Fig. 2a,b. This revealed that the recombination efficiency was remarkably high, for both deletions and inversions, reaching up to 22% for loxP sites as much as 1.8 Mb apart (Fig. 1f). Surprisingly, the recombination efficiency did not correlate with the distance between loxP sites (R = 0.22, P = 0.5).

Only large deletions within LAD1 result in release from the NL

We first focused on six deletions consisting of increasingly large LAD1 truncations, combined with loss of a constant ~630-kb part of the flanking iLAD (Fig. 2a). Each of these deletions resulted in a new iLAD–LAD junction connecting an identical iLAD sequence with a different LAD1 sequence.

Fig. 2: Only large LAD1 deletions result in LAD release from the NL.

a, CAST allele-specific pA-DamID track of NL interactions (bin size: 20 kb), together with the positions of the six loxP integrations within LAD1 that were used in combination with the anchor loxP to generate deletions d1 through d6. b,c, Changes in NL interactions resulting from deletions. Each of the four panels consists of three subpanels. Bottom: gene annotation track. Genes are colored according to their expression level in WT cells (not allele-specific). Middle: pA-DamID tracks (z score) of NL interactions in nonrecombined (control, blue line) and recombined cells (experimental, red line). n, the number of independent biological replicates that were combined. Noise was suppressed by a running mean filter of indicated window size. Shading between the lines corresponds to the color of the sample with the highest value. Horizontal dashed lines mark the 5th and 95th percentiles of genome-wide pA-DamID values. Vertical dashed lines mark the coordinates of the loxP recombination sites. For the experimental dataset, background noise coming from the deleted region on the CAST allele was filtered out. Top: domainogram39; for every window of indicated size (vertical axis) and centered on a genomic position (horizontal axis), the pixel shade indicates the ranking of the change in pA-DamID score (experimental minus control) in this window compared to the genome-wide changes in pA-DamID scores across all possible windows of the same size. Blue, pA-DamID score highest in control samples; red, pA-DamID score highest in experimental samples; gray, no data, where change in pA-DamID score could not be calculated over the deleted region. Data are shown separately for the CAST (left) and 129S1 (right) alleles. b, Smallest deletion d1. c, Largest deletion d6. Arrowheads mark an iLAD region that gains NL interactions in recombined cells. d, Comparison of CAST-allele-specific domainograms of all six deletions involving LAD1 (bottom six panels). 
Top: same pA-DamID track as in a for reference. The domainogram color key is displayed in b. e, Two possible models for LAD1 interactions with the NL. Left: NL-tethering elements are evenly distributed throughout the LAD. Right: critical subregions have stronger NL-tethering activity.

For each deletion, we selected two independent recombined clones and generated detailed maps of NL interactions by pA-DamID43. The parental clone (with hopped but nonrecombined loxP) served as a control. Allele-specific pA-DamID read counts confirmed the expected deletions only in the CAST allele of all recombined clones (Extended Data Fig. 2c–h). We visualized the effects of each deletion on NL interactions in two ways: a plot of the pA-DamID tracks of control (blue) and recombined (red) clones (Fig. 2b,c, middle) and a multiscale ‘domainogram’ (ref. 39) that highlights regions with increases (red) or decreases (blue) in NL contacts that are unlikely to be due to random noise (Fig. 2b,c, top). The independent replicate clones yielded similar NL association patterns for all recombinations (Extended Data Fig. 3a); therefore, we combined the data of both clones.
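The idea behind a multiscale domainogram can be sketched in a few lines. For each window size, the code below averages the change in score (experimental minus control) over every window and converts each window's change into a rank-based percentile among all windows of that size. It is a simplified illustration under stated assumptions: ranking is done only within the bins provided rather than genome-wide, the function names are hypothetical, and the input values are synthetic. It is not the published implementation (ref. 39).

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List, Sequence


def window_means(values: Sequence[float], size: int) -> List[float]:
    """Mean of every full window of `size` consecutive bins (rolling sum)."""
    total = sum(values[:size])
    means = [total / size]
    for i in range(size, len(values)):
        total += values[i] - values[i - size]
        means.append(total / size)
    return means


def domainogram(control: Sequence[float], experimental: Sequence[float],
                sizes: Sequence[int]) -> Dict[int, List[float]]:
    """Map each window size to percentile ranks (0..1) of the score change.

    Values near 1 mark windows where the experimental score is unusually
    high; values near 0 mark windows where the control score is higher.
    """
    diff = [e - c for c, e in zip(control, experimental)]
    result = {}
    for size in sizes:
        changes = window_means(diff, size)
        ordered = sorted(changes)
        n = len(ordered)
        # Mid-rank percentile, so that tied windows share the same value.
        result[size] = [
            (bisect_left(ordered, x) + bisect_right(ordered, x)) / (2 * n)
            for x in changes
        ]
    return result


# Synthetic z scores: a gain over bins 3-5 and a loss at bin 9.
control = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
experimental = [0, 0, 0, 2, 2, 2, 0, 0, 0, -1]
scores = domainogram(control, experimental, sizes=[1, 3])
print(scores[3])
```

Plotting the percentile for each window size as one row of a heat map gives the triangular multiscale view: changes that span many bins remain extreme at large window sizes, whereas isolated noisy bins fade as the window grows.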

The smallest deletion that we generated was 784 kb long and included ~150 kb of LAD1 (recombination d1). Strikingly, near the newly created junction, this resulted in increased NL interactions of the formerly iLAD sequence over a range of ~190 kb (Fig. 2b, left, arrowhead). We did not observe similarly strong changes for the 129S1 allele in the same cells (Fig. 2b, right), indicating that NL interactions spread in cis across the junction. The deletion did not detectably affect the NL contacts of the intact part of LAD1 (Fig. 2b, left). Therefore, the deleted sequences are not required for LAD1 formation.

Next, we analyzed recombination d2, d3 and d4, in which stretches of about 350 kb, 840 kb and 880 kb of LAD1 were deleted, respectively. We observed a phenotype similar to that described for recombination d1 (Fig. 2d and Extended Data Fig. 3b–d): the iLAD region near the new junction gained NL interactions, while the NL interactions of the remaining parts of LAD1 remained unaffected. Thus, ~40% of LAD1 length can be deleted without detectable impact on the association of the remaining ~60% with the NL.

Deletions causing even larger truncations of LAD1 (about 1.1 Mb and 1.3 Mb for recombination d5 and d6, respectively) led to more substantial changes in the NL interaction pattern. While the iLAD region near the new junctions still gained NL interactions, the remaining parts of LAD1 now showed greatly reduced NL interactions (Fig. 2c,d and Extended Data Fig. 3e). Nevertheless, LAD1 NL interactions were never completely lost. Interestingly, in recombination d6, LAD3 NL interactions were also somewhat reduced, suggesting crosstalk between these LADs. No other major changes in NL contacts were detected along the chromosome, suggesting that recombination only locally impacts LAD formation (Extended Data Fig. 3f).

A direct comparison of the domainograms of the six deletions (Fig. 2d) showed that all six deletions caused spreading of NL interactions into the former iLAD region near the junction, over a range of ~190–260 kb. Remarkably, this spreading is confined to a pre-existing TAD44 and extends roughly to a region marked by the active histone marks H3K4me1 and H3K27ac, a peak of accessible chromatin and modest CTCF binding (Extended Data Fig. 4a). Disruption of a natural LAD border may, thus, lead to spreading of NL interactions within the same TAD, up to the next border characterized by open chromatin and active transcription. This is consistent with previous studies showing that LAD borders are frequently demarcated by promoters of active genes6 and that transcriptional activity can prevent NL interactions39,45.

Our data indicate that LAD1 is remarkably robust. Deletion of nearly half of LAD1 does not impair NL association of the remaining LAD part (recombination d1 to d4). However, larger deletions resulted in a notable albeit incomplete release of the remaining LAD sequences from the NL (recombination d5 and d6). These results may be explained by two models. In the first model, NL interactions are mediated by a multitude of elements (which may be DNA sequence elements or a chromatin modification) distributed evenly throughout each LAD. Individually, these elements may interact only weakly with the NL but, collectively, they result in high avidity (Fig. 2e, left). Only when the total mass of the LAD is reduced below a critical threshold may the NL interactions become less robust (as in recombination d5 and d6). In a second model, tethering elements are dispersed unevenly throughout LADs, with critical subregions harboring either a higher density of elements or elements with a stronger NL-tethering activity (Fig. 2e, right). Given that recombination d5 leads to reduced NL interactions while recombination d4 does not, we hypothesized that the region located between deletion breakpoints 4 and 5 may constitute such a critical subregion in LAD1. We hereafter refer to this region as R4-5 (Fig. 2d). We note that the remainder of LAD1 in recombination d5 and d6 would still harbor at least some tethering elements because its contacts with the NL are not entirely lost.

Inversion of LAD1 points to multiple tethering elements

To test whether LAD1 formation is driven by multiple tethering elements distributed throughout the LAD (model 1; Fig. 2e, left), we used our loxP-SB system to create an inversion that splits LAD1 into two halves of roughly equal size (recombination i7). If this first model is correct, the two halves should both maintain interactions with the NL, although possibly with lower avidity. This inversion splits LAD1 into two parts (LAD1.1 and LAD1.2) separated by ~630 kb of iLAD sequences (Fig. 3a,b, top, and Extended Data Fig. 4b). To facilitate the interpretation of the NL interaction pattern after recombination, we plotted the pA-DamID tracks and associated domainograms using chromosomal coordinates of the inverted genome and replotted the nonrecombined control data to this inverted genome. The results show that both LAD1 halves remained associated with the NL after the inversion. However, LAD1.1 exhibited weakened NL interactions, particularly near the new iLAD junction (Fig. 3b, star, and Extended Data Fig. 4c), while LAD1.2 remained associated with the NL (Fig. 3b, triangle). Hence, both LAD1 halves contain tethering elements, which differ in strength (model 2; Fig. 2e, right).

Fig. 3: NL-tethering subregions in LAD1 are autonomous.

a, CAST allele-specific pA-DamID track of NL interactions in WT cells, together with the position of one loxP integration within LAD1 that was used in combination with the anchor loxP to generate recombination i7. b, Effect of splitting LAD1 into two halves. Top: cartoon of a pA-DamID track of NL interactions of the CAST allele to illustrate the i7 inversion. The domainogram (middle) and pA-DamID tracks (bottom) of CAST allele NL interactions are shown, comparing nonrecombined and recombined cells from recombination i7. Data are plotted using the chromosomal coordinates of the inverted genome. The domainogram color key is displayed in d. Star, LAD1.1, which shows reduced NL contacts upon recombination; triangle, LAD1.2, whose NL interactions are largely unaffected by the recombination; arrowheads, iLAD regions that gain NL interactions upon recombination. c,d, Effects of isolating R4-5 (left) and R2-3 (right) in an iLAD. Rearrangements were generated by sequential inversion and deletion (Extended Data Fig. 5a,b). c, Cartoon of pA-DamID tracks of NL interactions of the CAST allele to illustrate the rearrangements. d, Domainogram (top) and pA-DamID tracks (middle) of CAST allele of cells with hopped but not recombined loxP (control, blue) and rearranged cells (experimental, red) in which R4-5 and R2-3 regions were isolated within an iLAD. pA-DamID tracks and associated domainograms are plotted using chromosomal coordinates of the rearranged genome. Arrows, NL interactions of R4-5 and R2-3 upon their isolation; arrowheads, R4-5 neighboring sequences gaining NL interactions. Bottom: schematics of NL interaction changes upon R4-5 and R2-3 isolation in an iLAD. A legend is provided in Fig. 2e. e,f, Same as c,d but after insertion of R4-5 and R2-3 close to LAD4.

At the two new LAD–iLAD junctions created by this inversion, we observed increased NL interactions of the former iLAD sequences (Fig. 3b, arrowheads). This effect was most pronounced for the iLAD region wedged between LAD1.1 and LAD1.2. Thus, as with the deletions described above (Fig. 2), some spreading of NL interactions occurs across the new LAD–iLAD junctions.

LAD1 subregion R4-5 can autonomously drive NL interactions

Next, to test whether R4-5 could autonomously interact with the NL, we devised a strategy to isolate R4-5 into an iLAD environment (Extended Data Fig. 5a). We used our loxP-SB system to generate an inversion within LAD1 that moved R4-5 directly next to iLAD sequences. Next, we used Cas9 editing46,47,48 to delete a 3.7-Mb segment that included all LAD1 sequences except R4-5, resulting in the positioning of R4-5 between two iLAD regions of about 1.2 Mb and 1.6 Mb (Fig. 3c, left). For comparison, we chose another LAD1 subregion, located between breakpoints 2 and 3 (named R2-3), of the same length as R4-5 and inserted it in the same location by a similar approach (Fig. 3c, right, and Extended Data Fig. 5b). pA-DamID mapping of selected clones (Extended Data Fig. 5c,d) showed that R4-5 has a modest but significant ability to autonomously interact with the NL compared to the flanking iLAD regions (Fig. 3d, left, and Extended Data Fig. 5e,f; P < 2.22 × 10−16, one-sided Wilcoxon test). Interestingly, R2-3 also interacts with the NL (P = 5.3 × 10−4) but does so less frequently than R4-5 (P = 3.2 × 10−7) (Fig. 3d, arrows, and Extended Data Fig. 5f). The flanking iLAD sequences gained NL interactions when close to R4-5 only, albeit modestly (Fig. 3d, arrowheads).
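A one-sided rank-based comparison of this kind can be sketched as follows. The text does not specify which Wilcoxon variant was used or how bins were selected, so the code assumes a rank-sum (Mann-Whitney) test on per-bin scores with a normal approximation and no tie or continuity correction. The function names are hypothetical, and the input values are synthetic numbers made up for illustration, not data from the study.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_greater(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """One-sided rank-sum test of 'x tends to be larger than y'.

    Returns (U statistic, approximate P value from the normal approximation).
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability
    return u, p


# Synthetic per-bin z scores for an isolated region and its flanking iLAD bins.
region_bins = [0.8, 1.1, 0.9, 1.3, 1.0, 1.2]
flank_bins = [-0.5, -0.2, 0.1, -0.4, 0.0, -0.1]
u_stat, p_value = rank_sum_greater(region_bins, flank_bins)
print(u_stat, p_value)
```

Because the test uses only ranks, it makes no assumption about the distribution of the scores; with real data containing many bins and ties, an established statistics library with tie correction would be the safer choice.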

Although these results point to some degree of autonomous NL tethering by R4-5, its NL interactions in the iLAD context were weaker than those in the original LAD1 context (Fig. 3d). Possibly, R4-5 requires cooperative interactions with other LAD elements for more efficient NL tethering. To test this, we isolated R4-5 and R2-3 regions in close vicinity of LAD4 (Fig. 3e and Extended Data Fig. 5b,g). In this context, R4-5 exhibited nearly as many NL interactions as in its wild-type (WT) LAD1 context (Fig. 3f, left, arrow, and Extended Data Fig. 5h). Moreover, R4-5 enhanced the NL association of flanking LAD4 sequences over about 400 kb (Fig. 3f, left, arrowhead). The iLAD sequences near the other junction also gained NL interactions, comparable to what we previously observed in the deletion series (recombinations d2, d5 and d6) (Fig. 2d, arrowhead). In contrast, moving R2-3 close to LAD4 only slightly strengthened its NL interactions compared to the iLAD context (Fig. 3f, right, arrow) and did not detectably strengthen NL contacts of its flanking sequences.

R4-5 is the minimal region required for autonomous NL interactions

To identify the minimal sequences required for R4-5 autonomous NL interactions, we further truncated R4-5 in the isolated iLAD environment (Fig. 3c,d). Truncations comprised one quarter, one half or three quarters of R4-5, from the 5′ or the 3′ end (Fig. 4a). We used Cas9 editing to generate two independent clones for each truncation (Extended Data Fig. 6a). Compared to the full R4-5, all truncations resulted in a loss of NL interaction of the remaining R4-5 and flanking sequences (Fig. 4b and Extended Data Fig. 6b). Thus, even though we did not test smaller deletions (<50 kb), we propose that R4-5 is the minimal region required for autonomous NL association.

Fig. 4: R4-5 is the minimal sequence required for autonomous NL interactions.

a, Schematics of R4-5 truncations. Either one quarter (t1 and t2), one half (t3 and t4) or three quarters (t5 and t6) were truncated. b, R4-5 truncation results in a loss of NL interactions of remaining R4-5 and flanking iLAD sequences. Top: schematic of the Cas9-mediated truncation of R4-5. Truncations were performed after R4-5 isolation in iLAD sequences (Fig. 3c,d). Bottom: tracks showing the change in NL interactions in the R4-5 truncated clones compared to the nontruncated control (R4-5 isolated in iLAD sequences) (truncated minus control). Data were smoothed. Loosely dashed vertical lines mark the position of R4-5; densely dashed vertical lines mark the site of Cas9 editing within R4-5. Arrowheads indicate iLAD sequences losing NL interactions upon R4-5 truncation. c, No detectable effect of deletion of R4-5 from LAD1. Top: schematic of the Cas9-mediated deletion of R4-5. The domainogram (middle) and pA-DamID tracks (bottom) of CAST allele NL interactions are shown, comparing WT (control, blue) and Cas9-edited cells in which R4-5 was deleted (red).


R4-5 is not essential for NL interactions of entire LAD1

Next, we tested the contribution of R4-5 to LAD1 NL interactions. For this, we precisely excised R4-5 from LAD1 by means of Cas9 editing with CAST allele-specific guide RNAs (gRNAs) (Extended Data Fig. 6c). Surprisingly, R4-5 deletion had no impact on LAD1 association with the NL (Fig. 4c and Extended Data Fig. 6d). Thus, in the context of the entire LAD1, R4-5 is not essential, underscoring that NL interactions of LAD1 are redundantly controlled by multiple elements.

Multiple tethering elements in LAD2

We next investigated potential tethering mechanisms in LAD2. We generated an inversion with a breakpoint within LAD2, inverting one third of its sequence (recombination i10) (Fig. 5a and Extended Data Fig. 6e). As a result, LAD2 is split in two, with the inverted part being moved close to LAD1, about 1.2 Mb away from its original location. We refer to the noninverted part as LAD2.1 and to the inverted part as LAD2.2 (Fig. 5b, top). Upon recombination, LAD2.2 showed slightly weakened NL interactions; however, at the same time, it promoted NL association of its new neighboring iLAD sequences (Fig. 5b, arrowhead, and Extended Data Fig. 6f). LAD2.1 lost NL interactions more substantially (Fig. 5b, star) but still contacted the NL to some extent. Thus, both subregions must harbor NL-tethering elements and LAD2.2 is able to boost NL interactions of nearby sequences. Because of the high repetitiveness of LAD2 sequences, we could not perform any Cas9-mediated deletion for further detailed dissection.

Fig. 5: Multiple tethering elements within LAD2.

a, CAST allele-specific pA-DamID track of NL interactions in WT cells, together with the position of one loxP integration within LAD2 that was used in combination with the anchor loxP to generate rearrangement i10. b, Top: cartoon of a pA-DamID track of NL interactions of the CAST allele, to illustrate the i10 inversion, which divides LAD2 into two segments that are separated by a large iLAD region. The domainogram (middle) and pA-DamID tracks (bottom) of CAST allele NL interactions are shown, comparing nonrecombined cells (control, blue) to cells with inversion i10 (red). Data are plotted using the chromosomal coordinates of the inverted genome. Star, LAD2.1, which loses NL contacts upon recombination; arrowhead, LAD2.2-flanking iLAD sequences, which gain NL contacts upon recombination.

No obvious characteristics that may explain NL affinity of R4-5 and LAD2.2

We examined R4-5 and LAD2.2 for sequence features that might explain their NL affinity. Earlier studies in mouse embryonic fibroblasts suggested that cKrox and YY1 transcription factors participate in the NL tethering of ectopic LAD fragments26,28. However, neither the motif originally assigned to cKrox (erroneously inferred from the Drosophila homolog Trl)26, nor the up-to-date cKrox binding motif49, nor the YY1 motif49 is enriched in R4-5 and LAD2.2 (Extended Data Fig. 7a). Genome-wide, the cKrox and YY1 motifs are even depleted in LADs compared to iLADs and their density in R4-5 and LAD2.2 is similar to the median value across all LADs (Extended Data Fig. 7b). It is, thus, unlikely that these motifs mediate the NL attachment of R4-5 and LAD2.2, at least in mES cells. Extensive sequence analyses failed to identify any 10-mer motif50, repeat elements6,51,52, high A+T content9 or non-B DNA structures53 that could explain the NL affinity of R4-5 and LAD2.2 (Extended Data Fig. 7c–h).
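Two of the simplest sequence features mentioned above, A+T content and k-mer composition, can be computed as sketched below. This is a generic illustration with hypothetical function names and toy sequences; it does not reproduce the motif-discovery tools or parameters used in the study.

```python
from collections import Counter


def at_content(seq: str) -> float:
    """Fraction of A or T among unambiguous bases (N and others are ignored)."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("A") + seq.count("T")) / called


def kmer_counts(seq: str, k: int = 10) -> Counter:
    """Count overlapping k-mers, skipping any that contain ambiguous bases."""
    seq = seq.upper()
    counts: Counter = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


# Toy sequences standing in for a region of interest.
print(at_content("ATATATGCGC"))
print(kmer_counts("ATATAT", k=2).most_common())
```

Comparing such counts between a candidate region and size-matched background LAD sequence, for example as a ratio of k-mer frequencies, is one straightforward way to look for enriched motifs; reverse-complement handling and multiple-testing correction would be needed in a real analysis.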

Because LADs often coincide with large domains of heterochromatic histone marks such as H3K9me2 and H3K9me3 (refs. 16,23,27,54,55), we analyzed available H3K9me2 (ref. 16) and H3K9me3 (ref. 56) ChIP-seq data. H3K9me3 levels were significantly lower in R4-5 compared to the remaining LAD1 sequences but we did not detect differences in H3K9me3 deposition between LAD2.2 and its flanking LAD sequences (Extended Data Fig. 7i). In contrast, H3K9me2 was significantly enriched in R4-5 and LAD2.2 but the enrichment was very modest, making it unlikely that this histone mark alone endows these regions with the ability to autonomously interact with the NL.

Synergy between LAD1 and LAD2 requires linear proximity

LADs interact with each other within the B compartment as defined by Hi-C mapping57,58,59 and can be engaged in physical ‘cliques’ (ref. 32). Furthermore, neighboring LADs tend to interact with the NL in a coordinated fashion23. We, thus, wondered whether neighboring LADs could promote each other’s interaction with the NL.

We first tested whether LAD2 participates in LAD1 association with the NL. We deleted the entire LAD2 from the CAST allele of mES cells (recombination d11) (Fig. 6a and Extended Data Fig. 8a). This revealed that the interactions of LAD1 with the NL are not affected by LAD2 deletion (Fig. 6b and Extended Data Fig. 8b).

Fig. 6: Synergy between LAD1 and LAD2 when in close proximity.

a, CAST allele-specific pA-DamID track of NL interactions in WT cells, with the positions of five loxP integrations that were used to generate genomic deletions (d6 and d11) or inversions (i8, i9 and i12). b, Deletion of LAD2 does not affect LAD1. A domainogram (top) and pA-DamID tracks (bottom) are shown for the CAST allele of nonrecombined (control, blue) and recombined (d11, red) cells from recombination d11. Data are plotted using WT chromosomal coordinates. The domainogram color key is displayed in c. c, Deletion of a large part of LAD1 does not affect LAD2 and LAD4. This is the same plot as in Fig. 2c, left, but for a wider window. Star, LAD2; triangle, LAD4. d–f, Top: cartoons of pA-DamID tracks of NL interactions to illustrate inversions i12 (d), i8 (e) and i9 (f). Domainograms (middle) and pA-DamID tracks (bottom) are shown, comparing the CAST allele of nonrecombined (control, blue) and respective recombinations (i12 (d), i8 (e) and i9 (f); red). Data are plotted using the chromosomal coordinates of the inverted genome. Star, LAD2, which gains NL interactions in d,e but does not in f. The arrowhead in d indicates iLAD sequences gaining NL interactions. The domainogram color key is displayed in c. g, Left: Hi-C contact matrix generated in control parental cells of recombination i8. Right: Hi-C contact matrix generated in two recombined clones of recombination i8. Data were merged and plotted using the chromosomal coordinates of the inverted genome. Dark-blue and light-blue rectangles below the matrices mark LAD coordinates in the noninverted and inverted genome, respectively. Within the matrices, purple rectangles mark 3D contacts between LAD1.1 and LAD1.2. Dashed lines represent recombination breakpoints. Bin size, 20 kb. h, Top: differential heat map of Hi-C data between the recombined clones i8 and the corresponding parental cell line. Blue rectangles below the matrix mark LAD sequences. 
Within the map: orange rectangle, gain in contact frequency between LAD1.1 and flanking, noninverted, iLAD sequences; mauve rectangle, gain in contact frequency between LAD1.2 and flanking, inverted, iLAD sequences; purple rectangle, loss of contact frequency between LAD1.1 and LAD1.2; blue rectangle, mild gain in contact frequency between LAD2 and LAD1.1. Middle: tracks showing the change in NL interactions in the recombined clones compared to the control (ΔLmnB1, recombined minus control). Data were smoothed. Bottom: CAST allele-specific pA-DamID track of NL interactions in recombined cells. Data are plotted using the chromosomal coordinates of the inverted genome. Dashed lines represent recombination breakpoints. Bin size, 20 kb.

Because LAD1 is longer and stronger than LAD2, it is more likely that it participates in NL contacts of LAD2. To test this, we revisited recombination d6, the largest deletion we generated in LAD1 (Fig. 2c). Although the remainder of LAD1 is much smaller and weaker, LAD2 remained mostly unaffected by this LAD1 deletion (Fig. 6c, star, and Extended Data Fig. 8c). We could not detect any change in LAD4 either (Fig. 6c, triangle). Thus, these experiments suggest that LAD1 and LAD2 independently interact with the NL.

We wondered whether the 2-Mb-long, transcriptionally highly active iLAD region between LAD1 and LAD2 could prevent crosstalk between the two LADs. To shorten the LAD1–LAD2 linear distance, we created an inversion that brought LAD2 within ~700 kb of LAD1 (Fig. 6a, recombination i12, Fig. 6d, top, and Extended Data Fig. 8d). Strikingly, both LAD2 itself and the iLAD sequences wedged between LAD1 and LAD2 gained NL interactions (Fig. 6d, star and arrowhead, and Extended Data Fig. 8e). However, LAD1 did not gain NL interactions. These results suggest that, when LAD1 and LAD2 are close enough, LAD2 benefits from LAD1’s proximity and interacts more strongly with the NL.

To determine which LAD1 component was necessary for enhancing LAD2 NL interactions, we took advantage of recombinations i8 and i9, in which LAD1 is split in two (Fig. 6a and Extended Data Fig. 5a–c). In recombination i8, R4-5 is the closest to LAD2 in the linear space, whereas this is the case for R2-3 in recombination i9. We refer to the inverted and noninverted halves of LAD1 as LAD1.1 and LAD1.2, respectively. When R4-5 was positioned close to LAD2 in recombination i8, LAD2 gained NL interactions (Fig. 6e, star, and Extended Data Fig. 8f). However, LAD2 did not gain NL interactions in recombination i9 when R2-3 was repositioned in its proximity (Fig. 6f, star, and Extended Data Fig. 8g). Together, these data suggest that R4-5 has the ability to promote crosstalk between LADs. We cannot rule out that LAD length has a role in the crosstalk because LAD1.1 is longer in recombination i8 than in recombination i9. We note that recombination i9 showed several other gains and losses of NL interactions; more complex interactions or effects of the overall folding of the chromatin fiber may account for this.

Effects of inversion i8 on 3D contacts

We noted that, in recombination i8, LAD1.2 NL interactions remained unaffected even though it is no longer contiguous with LAD1.1 but separated from it by ~0.5 Mb of iLAD sequence (Fig. 6e). This contrasts with recombination d5 (Fig. 2a), in which the remainder of LAD1 (which corresponds to LAD1.2 in recombination i8) showed reduced NL interactions when LAD1.1 was deleted. This suggests that, even when they are separated, LAD1.1 still interacts with LAD1.2 and promotes LAD1.2 NL interactions. To confirm such 3D contacts, we performed Hi-C on the parental and recombined clones from recombination i8. Our data show that LAD1.1 and LAD1.2 still contacted each other upon inversion, albeit less frequently (Fig. 6g,h, purple rectangles). As expected, we observed a gain in 3D contacts around the recombination breakpoints, showing that LAD1.1 and LAD1.2 contacted their flanking iLAD sequences repositioned after inversion (Fig. 6h, orange and mauve rectangles). Surprisingly, gain of LAD2 NL interactions in recombination i8 was not accompanied by major gains in 3D contacts between LAD2 and LAD1.1 (Fig. 6h, blue rectangle). In conclusion, synergy can occur between two neighboring LADs, provided that they are sufficiently close to one another in linear space.

Changes in NL interactions are partially mirrored by changes in H3K9me3

Because LAD1 and LAD2 are enriched in repressive histone marks such as H3K9me3 (Extended Data Fig. 7i), we performed pA-DamID of this histone mark in several recombined and control cells. In WT cells, H3K9me3 indeed correlates substantially, although not perfectly, with NL interactions (Extended Data Fig. 9a, genome-wide Pearson correlation R = 0.61). Likewise, LAD1 is enriched in H3K9me3, although the fine-scale pattern differs at some positions from that of NL interactions (Extended Data Fig. 9b).

We first focused on recombination d3, particularly on the iLAD region that showed increased NL interactions (Fig. 2d). In recombined cells, this region also gained H3K9me3 (Fig. 7a, arrowhead, and Extended Data Fig. 9c). This extended over ~200 kb but was more modest than the NL interaction gain, perhaps because these iLAD sequences already had substantial H3K9me3 levels in nonrecombined cells. In recombination d6, in which the spreading of NL interactions across the new iLAD–LAD junction was less pronounced (Fig. 2c,d), the increase in H3K9me3 was more subtle than in recombination d3 (Fig. 7b, arrowhead, and Extended Data Fig. 9d). Stronger local increases in H3K9me3 were detected at the iLAD–LAD junction in recombination i7 (Fig. 7c, arrowhead, and Extended Data Fig. 9e). Thus, at least in this region, local spreading of NL interactions and H3K9me3 appeared to correlate at the iLAD–LAD junction.

Fig. 7: Spreading of NL contacts is partially mirrored by H3K9me3 and often linked to gene repression.

a–c, Changes in H3K9me3 caused by recombinations. Domainograms (top) and pA-DamID tracks (bottom) show CAST allele-specific H3K9me3 levels in control (blue) and recombined (d3 (a), d6 (b) and i7 (c); green) cells. Arrowheads, regions that gain both NL interactions and H3K9me3; arrows, regions that gain or lose NL interactions but show no change in H3K9me3. Bottom: tracks showing the change in NL interactions in the recombined clone compared to the control (ΔLmnB1, recombined minus control). Data were smoothed. Dashed vertical lines mark positions of the recombined loxP sites. In c, data are plotted using coordinates of the recombined genome. d–f, Changes in gene expression in recombinations d3, d6 and i7. Bar graphs show the allele-specific log2 fold change in gene expression between the control and one recombined clone from recombinations d3 (d), d6 (e) and i7 (f), for each gene around the recombined sequences (n = 2). Genes are ordered by their chromosomal coordinates. Vertical lines represent the location of loxP recombination sites. Significant differentially expressed genes are colored in blue (P value corrected for multiple testing < 0.05 and fold change > 2). Horizontal dashed lines mark log2(fold change) values of 1 and −1. Crosses, no detectable mRNA in the control cell line (with hopped but nonrecombined loxP). Heat maps under the bar graphs show, for each gene (from top to bottom), (i) the expression level in WT F1 hybrid mES cells; (ii) the change in NL interaction pA-DamID score after recombination (recombined minus control); and (iii) the change in H3K9me3 pA-DamID score after recombination (recombined minus control). g, Correlation between changes in gene expression and the change in H3K9me3 levels (left) or NL interactions (right). The aggregate plot shows the combined data of the CAST alleles of all genes depicted in d–f and Extended Data Fig. 10c. Each dot represents a gene in one tested recombination. 
Dots are colored on the basis of the gene expression level in WT F1 hybrid mES cells. h, Linear model of changes in gene expression as a function of changes in both NL interactions and H3K9me3. The bar plot shows weights of the fitted linear model. ***P = 1.11 × 10⁻⁶ and **P = 0.00824.


We also found instances where changes in NL interactions were not mirrored by H3K9me3 changes. In recombination i7, the iLAD sequences wedged between LAD1.1 and LAD1.2 gained NL interactions (Figs. 3b and 7c) but not H3K9me3 levels (Fig. 7c, arrow). In recombination d6, the reduction in NL interactions of the remaining part of LAD1 (Fig. 2c) was not detectably accompanied by a lowering of H3K9me3 levels (Fig. 7b, arrow). Likewise, in i10 and i12 recombined clones, no significant changes in H3K9me3 levels were detected in regions with losses or gains of NL interactions (Extended Data Fig. 9f,g, arrows). We conclude that changes in NL interaction levels caused by chromosomal structural variants are only partially mirrored by changes in H3K9me3.

We sought to further investigate this partial overlap between LADs and H3K9me3 domains. We found that H3K9me3 domains overlapping LADs (H3K9me3–LAD domains) showed a higher H3K9me3 score than nonoverlapping ones (H3K9me3-only domains) (Extended Data Fig. 9h). Moreover, H3K9me3–LAD domains were significantly larger and located further apart from LADs compared to H3K9me3-only ones (Extended Data Figs. 9i,j). However, H3K9me3 domain size did not correlate with distance to the nearest LAD (Extended Data Fig. 9k). This suggests that LAD–LAD interactions also promote H3K9me3–LAD domain interactions, thus creating large H3K9me3 domains.

We then wondered whether other heterochromatin marks were affected by the recombination. We performed pA-DamID mapping of H3K9me2 and H3K27me3 in the same recombined clones as in Fig. 7a–c. In recombinations d3 and d6, we observed a slight but nonsignificant increase in H3K9me2 and H3K27me3 in the iLAD region that gained NL interactions (Extended Data Fig. 10a,b, arrowheads). Local gains and losses of H3K9me2 and H3K27me3 were also observed in recombination i7 but did not correlate with NL interaction changes (Extended Data Fig. 10a,b).

Spreading of NL contacts and H3K9me3 is accompanied by transcriptional repression

Many studies have found that the localization of genes at the NL is correlated with a lack of transcriptional activity6,7,15,16,17,18,19. We, therefore, examined whether the observed changes in NL interactions and H3K9me3 levels were accompanied by changes in gene expression as detected by mRNA sequencing (mRNA-seq). We first focused on recombination d3. We considered every gene within the window plotted in Fig. 7a and defined differentially expressed genes as those showing a statistically significant change in mRNA abundance and a fold change of more than 2.
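The two-part criterion above can be sketched as a small filter. This is only an illustration under stated assumptions: the function name, the gene names and all values are hypothetical, the 0.05 cutoff on the corrected P value is taken from the Fig. 7 legend, and "fold change of more than 2" is read as applying in either direction (|log2 fold change| > 1).

```python
import math

def is_differentially_expressed(log2_fold_change, adjusted_p,
                                alpha=0.05, min_fold_change=2.0):
    """Return True when a gene passes both thresholds.

    Assumption: the fold-change cutoff applies to up- and downregulation,
    so it is tested on the absolute log2 fold change.
    """
    passes_significance = adjusted_p < alpha
    passes_effect_size = abs(log2_fold_change) > math.log2(min_fold_change)
    return passes_significance and passes_effect_size

# Hypothetical genes with made-up values (not data from this study):
# name -> (log2 fold change, P value corrected for multiple testing)
genes = {
    "geneA": (-1.8, 0.001),  # significant and more than 2-fold down
    "geneB": (-1.2, 0.20),   # more than 2-fold down but not significant
    "geneC": (0.4, 0.01),    # significant but less than 2-fold
}

called = [name for name, (lfc, p) in genes.items()
          if is_differentially_expressed(lfc, p)]
print(called)
```

Only the hypothetical geneA passes both thresholds; a gene at exactly twofold change would not be called, because the criterion is strict.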

Outside of the deleted region, we detected three significantly downregulated genes (Prss50, Tmie and Als2cl) on the recombined chromosome (Fig. 7d). Strikingly, these genes are close to the LAD1 junction and showed the strongest gain in both NL association and H3K9me3 levels after recombination (Fig. 7d, bottom). Only one gene (Fam240a) is closer to the junction but this gene was silent in WT cells. No significantly differentially expressed genes were detected on the 129S1 allele, demonstrating that the observed changes on the CAST allele were the direct consequence of the deletion in cis (Fig. 7d, top).

In recombination d6, gains in NL interactions and H3K9me3 levels were milder compared to recombination d3 (Fig. 7a,b,d,e). Nevertheless, two genes were found significantly downregulated (Prss42 and Als2cl) and two others showed substantial albeit not statistically significant downregulation (Prss45 and Tmie) (Fig. 7e). We conclude that a set of genes close to the deletion junction are responsive to the gain in NL interactions and H3K9me3.

In recombination d6, several genes inside the remainder of LAD1 showed reduced NL interactions (Fig. 7b,e). However, we found no significant upregulation of these genes (Fig. 7e). This could be because their association with the NL and their H3K9me3 levels are still too strong to be compatible with transcriptional activation or because they lack the potential for activation (for example, missing specific transcription factors in mES cells)60.

In recombination i7, the Als2cl, Prss50 and Prss42 genes, which gained both NL interactions and H3K9me3, were significantly downregulated (Fig. 7f). Several nearby genes followed the same trend, albeit nonsignificantly. However, none of the iLAD genes that gained NL interactions only (for example, Lrrc2 and Rpt3) were significantly downregulated. This suggested that H3K9me3 level is a more important determinant for gene repression than NL association. To analyze this systematically, we combined mRNA-seq data from the abovementioned recombinations and additionally included inversions i10 and i12 (Extended Data Fig. 10c). We then plotted the fold change in gene expression relative to the change in NL interactions or H3K9me3 levels (Fig. 7g). The correlation was modestly stronger when expression fold change was compared to H3K9me3 changes than to NL interaction changes (Pearson’s R = −0.49 and R = −0.4, respectively). A combined linear model indicated that changes in H3K9me3 were a slightly more important predictor of the changes in gene expression than changes in NL interactions (Fig. 7h). However, the linear model explains only about one quarter of the variance in gene expression (R² = 0.27, P = 2.3 × 10⁻¹³), suggesting that other unknown features also have a role. In conclusion, the inversions and deletions of iLAD–LAD sequences alter not only NL interactions but also H3K9me3 and gene activity.
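The type of analysis described here (per-predictor Pearson correlations, followed by a linear model with two predictors) can be illustrated with a minimal sketch. It uses synthetic numbers generated on the spot, not data from this study, and all function and variable names are hypothetical. The model is an ordinary least-squares fit with an intercept; whether the weights shown in Fig. 7h were obtained from scaled predictors is not stated in the text, so no scaling is applied.

```python
import math
import random

def cross(x, y):
    """Sum of products of deviations from the means (centered cross-product)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

def pearson_r(x, y):
    return cross(x, y) / math.sqrt(cross(x, x) * cross(y, y))

def fit_two_predictor_model(x1, x2, y):
    """Ordinary least squares for y ~ intercept + b1*x1 + b2*x2.

    Solves the 2x2 normal equations on centered data and returns
    (b1, b2, r_squared).
    """
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y, syy = cross(x1, y), cross(x2, y), cross(y, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    r_squared = (b1 * s1y + b2 * s2y) / syy  # explained / total sum of squares
    return b1, b2, r_squared

# Synthetic per-gene changes (recombined minus control); NOT data from this study.
rng = random.Random(0)
n_genes = 200
delta_nl = [rng.gauss(0, 1) for _ in range(n_genes)]
delta_k9 = [0.5 * nl + rng.gauss(0, 1) for nl in delta_nl]
log2_fc = [-0.3 * nl - 0.4 * k9 + rng.gauss(0, 1)
           for nl, k9 in zip(delta_nl, delta_k9)]

r_nl = pearson_r(delta_nl, log2_fc)
r_k9 = pearson_r(delta_k9, log2_fc)
w_nl, w_k9, r_squared = fit_two_predictor_model(delta_nl, delta_k9, log2_fc)
print(f"R(NL) = {r_nl:.2f}, R(H3K9me3) = {r_k9:.2f}")
print(f"weights: NL = {w_nl:.2f}, H3K9me3 = {w_k9:.2f}, R^2 = {r_squared:.2f}")
```

Because the two predictors are themselves correlated, the combined model can never explain less variance than either predictor alone, which is why the weights of the joint fit are more informative than comparing the two Pearson coefficients in isolation.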

Discussion

The very large sizes of LADs have made it difficult to unravel how they interact with the NL. In this study, we combined SB hopping and Cre-lox recombination to generate a large set of local deletions and inversions of LAD and iLAD sequences. While this work was in revision, another study reported the generation of long-range recombinations between repetitive elements using prime editing to insert loxP sites in the genome61.

Detailed mapping of NL interactions in the recombined clonal cell lines enabled us to uncover several principles of LAD–NL interactions. We found that LADs contact the NL through multiple LAD subregions that differ in NL-tethering strength, in an intricate pattern of cooperativity and redundancies that underscore that LAD–NL interactions are multivalent. Potent subregions can autonomously interact with the NL and enhance the NL association of flanking sequences. Furthermore, we found that neighboring LADs can synergistically interact with the NL, provided that they are close enough on the same chromosome.

Of the two autonomously interacting regions that we identified, one is necessary for the formation of its corresponding LAD (LAD2.2 in LAD2) while the other is not (R4-5 in LAD1). Interestingly, in mES cells LAD1 has a higher NL contact frequency compared to LAD2. Possibly, NL tethering of LAD1 involves several redundantly acting regions, while LAD2 mainly depends on a single region.

At newly created LAD–iLAD junctions, we consistently observed the spread of NL interactions into the iLAD (Figs. 2b–d, 3b,d and 5). This implies that LAD–NL interactions are not a passive process, for example, because of expulsion of transcriptionally silent chromatin from the transcriptionally active nuclear interior toward the NL. Rather, we propose that these interactions involve an active mechanism, keeping neighboring sequences close to the NL.

We considered that the extent of this spreading could reflect the persistence length of the chromatin fiber. However, the persistence length of heterochromatin has been estimated to be only ~100–200 nm (ref. 62) or ~5–10 kb (assuming a compaction of 50 bp per nm)63. This is at least an order of magnitude shorter than the distance over which we observe the spread of NL contacts. Alternatively, NL contacts may spread in a zipper-like manner. Evidence for such a mechanism emerged from single-cell DamID maps, which indicated that NL interactions of LADs occur over stretches of several hundreds of kilobases23. Tethering single loci at the NL resulted in the transcriptional repression of nearby genes over hundreds of kilobases, further consolidating such a model18,19. The molecular basis of such a zipper-like mechanism remains unknown but is consistent with our conclusion that LAD–NL interactions are multivalent.
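The unit conversion behind the persistence-length estimate can be checked with a few lines. The 50 bp per nm compaction and the 100–200 nm range are the values quoted above; the function name is a hypothetical helper.

```python
BP_PER_NM = 50  # compaction assumed in the text (bp per nm of fiber)

def nm_to_kb(length_nm, bp_per_nm=BP_PER_NM):
    """Convert a physical fiber length (nm) to genomic length (kb)."""
    return length_nm * bp_per_nm / 1000

low_kb = nm_to_kb(100)
high_kb = nm_to_kb(200)
print(low_kb, high_kb)  # 5.0 10.0
```

A persistence length of 100–200 nm thus corresponds to 5–10 kb of DNA at this compaction.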

LAD–iLAD border deletion may not always result in spreading of NL interactions. In CD4+CD8+ mouse thymocytes, deletion of a LAD border in the Tcrb locus induced LAD sequence release from the NL and activation of the displaced genes64. An active enhancer marked by the active histone mark H3K27Ac directly flanked the deleted sequence and sequences that lost NL association concomitantly gained H3K27Ac64. Moreover, activation of a long noncoding RNA gene enabled the detachment of a neighboring gene from the NL65. Thus, transcriptional activity can shape the pattern of NL interactions and, in some instances, may limit the spread of NL interactions.

What could mediate the autonomous NL affinity of R4-5 and LAD2.2? An obvious candidate would be a specific DNA sequence. Our results indicate that previously implicated sequence elements26,28 are unlikely to have a role in mES cells (Extended Data Fig. 7). This observation includes GAGA repeats, which were previously thought to interact with the NL through the DNA-binding factor cKrox26. However, a role for cKrox could not be confirmed in another study27 and this protein (now known as ZBTB7B) was later found to bind a very different motif that is not enriched in LADs either. LADs are generally A+T rich9, although our sequence analyses did not identify a high A+T content that could explain the NL affinity of R4-5 and LAD2.2 (Extended Data Fig. 7). Many proteins are known or predicted to bind to A+T-rich sequences. Interactions of some of these proteins with A+T-rich regions could mediate NL anchoring but their identity remains to be established. The NL affinity of R4-5 and LAD2.2 could also be sequence independent. Neither of these two regions seems enriched in secondary-structure-forming motifs but the specific concentration of factors such as HP1 or modified histones could also mediate their anchoring at the NL22,66.

Lastly, we show that NL association and H3K9me3 deposition are mediated by partially independent mechanisms because changes in the former are only partially mirrored by changes in the latter. Moreover, gene expression changes correlate somewhat better with H3K9me3 changes. This is reminiscent of observations in Caenorhabditis elegans, where transcriptional repression also appeared to be more dependent on H3K9 methylation than on NL tethering54.

While our loxP-SB system greatly facilitates the generation of series of deletions and inversions in a locus of interest, it is not easily scalable to multiple loci. Further efforts will be needed to expand our results across the genome and to extend our approach to other cell types, in which heterochromatin compaction and dynamics67 and the composition of the NL68,69,70 may differ. The proteins and molecular mechanisms responsible for LAD–NL affinity remain to be elucidated. A full comprehension of the principles of LAD formation will, thus, require extensive further exploration.

Methods

Experimental procedures

Cell culture

The F121/9 (CAST/EiJ × S129/Sv) female F1 hybrid mES cell line (4DN reference: 4DNSRMG5APUM)71 and derived clones were cultured in 2i + LIF (leukemia-inhibitory factor) conditions, according to the 4D Nucleome guidelines (https://data.4dnucleome.org/biosources/4DNSRMG5APUM/). Briefly, cells were cultured in serum-free DMEM/F12 (Gibco) and Neurobasal (Gibco) medium (1:1), supplemented with N-2 (Gibco), B-27 (Gibco), 0.05% BSA (Gibco), 10× 4 U of LIF (Millipore), 1 μM MEK inhibitor PD0325901 (mirdametinib, MedChemExpress), 3 μM GSK-3β inhibitor CHIR99021 (laduviglusib, MedChemExpress), 1.5 × 10⁻⁴ M 1-thioglycerol (Sigma-Aldrich) and 1% penicillin–streptomycin (Gibco, 15070063) on 0.1% gelatin-coated plates. Cells were passaged every 2 days. Cells were seeded and incubated overnight before transfection for SB hopping, Cre recombination, Cas9 editing, pA-DamID and RNA-seq experiments. Mycoplasma contamination was ruled out by regular testing (LT07-318, Lonza).

Insertion of the loxP-SB cassette

Transfection of the loxP-HyTK cassette

We cloned the vector pLD036 containing a loxP site and the double selection marker HyTK flanked by both FRT and F3 flipping sites and SB 3′ and 5′ inverted terminal repeats (ITRs). This cassette was amplified by PCR and knocked into the genome using Cas9 editing. We designed gRNAs targeting an intergenic sequence close to the LAD1 5′ border using CHOPCHOP72,73 (Supplementary Table 1) and assessed their cutting efficiency by TIDE74. The best gRNA was selected and assembled into a ribonucleoprotein (RNP) together with the AltR CRISPR trans-activating CRISPR RNA (1073189, Integrated DNA Technologies (IDT)) and the Cas9 nuclease (1081058, IDT). We transfected 300,000 cells with the RNP and the PCR products using Lipofectamine CRISPRMAX Cas9 Transfection Reagent (CMAX00001, Thermo Fisher Scientific) according to the manufacturer’s protocol. Then, 3 days after transfection, hygromycin was added to the medium (200 µg ml−1; 10687010, Invitrogen) to select for HyTK-expressing cells. We picked colonies 6 days later and screened clones for cassette integration by PCR and Sanger sequencing.

loxP swapping

We cloned the vector pLD037 containing a loxP site flanked by FRT and F3 flipping sites. We used recombinase-mediated cassette exchange to flip the HyTK gene for a loxP site in the previously selected clone. We transfected 300,000 cells with 1.5 µg of flippase-encoding plasmid (Addgene, 13787) and 0.5 µg of pLD037 using Lipofectamine 2000 (Invitrogen, 11668019). Then, 2 days after transfection, ganciclovir was added to the medium (2.5 µg ml−1) to select for HyTK-nonexpressing cells. Colonies were picked 7 days later and screened by PCR and Sanger sequencing.

SB hopping and SB hopped clone selection

For every hopping experiment, 1 million cells were transfected with 4.5 µg of pME07, a bicistronic vector encoding the SB transposase SB100x and the human nerve growth factor receptor (LNGFR). We used Lipofectamine 2000 according to the manufacturer’s protocol. To select for transfected cells expressing the surface marker LNGFR, ~30 h after transfection, cells were subjected to magnetic-activated cell sorting column selection using MS columns (130-042-201, Miltenyi Biotec) and LNGFR MicroBeads (130-091-330, Miltenyi Biotec). Nontransfected cells were taken along as a negative control. Cells were grown for 1 week and single cells were sorted to grow clones. Crude extracts were prepared for each clone. Briefly, cells from a full well of a 96-well plate were lysed in 25 µl of DirectPCR lysis reagent (302-C, Viagen Biotec) supplemented with 100 µg ml−1 proteinase K and incubated at 55 °C for 150 min and at 85 °C for 45 min. In experiments in which SB hopped from the loxP-SB cassette, the SB launch site was amplified by PCR using 2–3 µl of crude extracts. A shorter band was amplified for hopped clones compared to nonhopped ones. In experiments in which SB hopped from either LAD1 or LAD2, the junction between one SB ITR and the donor site was amplified by PCR. A band was amplified in nonhopped clones only. Over all experiments, SB hopped in 34% of the clones on average. The clone used to hop SB within LAD1 (Fig. 1d) had two SB integrations. The extra SB copy was located 28 Mb away from LAD1 on chromosome 9. Data not shown were filtered out.

Mapping SB integration sites by Tagmentation

To map SB integration sites after hopping, we adapted the Tagmentation protocol from a previous study75 to amplify the junction between SB ITRs and genomic DNA. The overall procedure was similar to that described previously75 except for the primers used in the PCR reactions. The linear enrichment PCR amplification was performed using primers MEP9 (5′ ITR) or LD027 (3′ ITR). PCR1 was performed with primers MEP11 (5′ ITR) or MEP34 (3′ ITR), whereas we used primer Ib569 for PCR2 (both ITRs). SB integrations were validated by PCR in clones of interest.

Cre recombination

A total of 1 million cells were transfected with 4 µg of vector encoding the Cre recombinase fused to cyan fluorescent protein (CFP) using Lipofectamine 2000. The following day, CFP-positive cells were sorted by fluorescence-activated cell sorting (FACS) and grown for 1 week. After that, cells were sorted again, as single cells, to grow clones. To screen for recombined clones, crude extracts were prepared as described above and three PCR reactions were set up for each clone. The procedure is illustrated in Extended Data Fig. 2a. In brief, after SB hopping, the anchor loxP site is flanked by 3′ and 5′ exogenous sequences while the hopping loxP site is flanked by SB 3′ and 5′ ITRs. After recombination, loxP sites are flanked by either the 5′ exogenous sequence and the 5′ SB ITR or by the 3′ exogenous sequence and the 3′ SB ITR. The hopped loxP site flanked by both SB ITRs is amplified in PCR1 with primers o-JOYC222 and o-JOYC186. loxP sites flanked by both exogenous sequences and the recombined junction between the SB 3′ ITR and the 3′ exogenous sequence are amplified in PCR2 with primers o-JOYC187 and o-LD022. Finally, the recombined junction between the 5′ exogenous sequence and the SB 5′ ITR is amplified in PCR3 with primers o-LD087 and o-LD100. For deletion clones, we verified that the deleted fragment could not be amplified from genomic DNA. All three PCR reactions contained 10 µl of MyTaq HS Red mix (BIO-25048, Bioline), 2–3 µl of the crude extract and 1 µl of each primer (10 µM) for a final volume of 20 µl. Cre recombination efficiency was measured as the number of recombined clones over the total number of screened clones. All primers used in this study are listed in Supplementary Table 2.

R4-5 Cas9 deletion

We deleted R4-5 in an mES cell line expressing the ddCas9 construct from the ROSA26 locus. We used CHOPCHOP72,73 to design CAST allele-specific gRNAs (Supplementary Table 1). Pools of five and four gRNAs were designed to target the R4-5 5′ and 3′ end, respectively. gRNAs were cloned into an mCherry-expressing plasmid and pooled. A total of 1 million cells were transfected with 4 µg of the pool of gRNAs using Lipofectamine 2000. Shield-1 (500 nM; AOB1848, Aeobius) was added to the medium to stabilize the ddCas9 protein. The day after, mCherry-positive cells were sorted by FACS and kept in culture for 1 week in Shield-1-containing medium. Single cells were sorted and clones were expanded and screened by PCR and Sanger sequencing.

R4-5 and R2-3 isolation in a strong iLAD and close to LAD4 and R4-5 truncation

We used CHOPCHOP72,73 to design CAST allele-specific gRNAs (Supplementary Table 1). gRNAs targeting the iLAD and close to LAD4 regions in the isolation experiment or R4-5 3′ and 5′ borders in the truncation experiment were cloned in a CFP-expressing plasmid while those targeting R4-5 and R2-3 in the isolation experiment or first quarter, half and three quarters in the truncation experiment were cloned in an mCherry-expressing plasmid. A total of 1 million cells were transfected with the corresponding pools of gRNAs and with the pX458 plasmid encoding Cas9 and GFP proteins (48138, Addgene), using Lipofectamine 2000. The day after, triple-positive cells (GFP+, mCherry+, CFP+) were sorted by FACS and kept in culture for 1 week. After that, triple-negative cells (GFP−, mCherry−, CFP−) were sorted as single cells; clones were expanded and screened by PCR and Sanger sequencing.

pA-DamID (LaminB1, H3K9me3, H3K9me2 and H3K27me3)

pA-DamID on LaminB1 and H3K9me3 was performed as described previously76. A total of 1 million cells were collected by centrifugation (500 g, 3 min) and washed sequentially in PBS and digitonin wash buffer (DigWash) (20 mM HEPES–KOH pH 7.5, 150 mM NaCl, 0.5 mM spermidine, 0.02% digitonin and cOmplete protease inhibitor cocktail). Cells were rotated for 2 h at 4 °C in 200 μl of DigWash with 1:100 LaminB1 antibody (Abcam, ab16048, rabbit), 1:100 H3K9me3 antibody (Diagenode, C15410193, rabbit), 1:100 H3K9me2 antibody (ActiveMotif, 39239, rabbit) or 1:100 H3K27me3 antibody (Millipore, 07-449, rabbit) followed by a wash step with DigWash. This was repeated with a 1:200 pA-Dam solution for 1 h at 4 °C (~60 New England Biolabs (NEB) Dam units), followed by two wash steps. Dam activity was induced by an incubation for 30 min at 37 °C in 100 μl of DigWash supplemented with 80 μM SAM. For every condition, another 1 million cells were processed in only DigWash and during Dam activation incubated with 4 U of Dam enzyme (NEB, M0222L). This Dam control sample served to account for DNA accessibility and amplification biases.

The library preparation was performed as follows. Genomic DNA was isolated (BIO-52067, Bioline) and ~500 ng was digested with DpnI (10 U; NEB, R0176L) in CutSmart Buffer 1× (8 h at 37 °C and 20 min at 80 °C) in a total volume of 10 µl. Poly(A) tailing was performed by the addition of 5 µl of the poly(A) tailing mix (0.5 µl of CutSmart buffer 10×, 0.25 µl of 50 U per µl Klenow (NEB, M0212M), 0.05 µl of 100 mM dATP and 4.2 µl of H2O) and incubation for 30 min at 37 °C followed by 20 min at 75 °C. Adaptors were ligated by adding 15 µl of the ligation mix (3 µl of T4 ligase buffer 10×, 0.5 µl of T4 ligase (5 U per µl; Roche, 10799009001), 0.25 µl of 50 mM x-Gen stubby adaptor (IDT) and 11.25 µl of H2O) and incubating samples for 16 h at 16 °C and 10 min at 65 °C. Finally, methyl-indexed PCR was performed by mixing 4 µl of ligated DNA with x-Gen dual combinatorial indices (IDT) (125 nM final concentration) and MyTaq RedMix (Bioline, BIO-25048) in a final volume of 40 µl. The following PCR program was used: 1 min at 94 °C, followed by 14 cycles of 30 s at 94 °C, 30 s at 58 °C and 30 s at 72 °C, with a final step of 2 min at 72 °C. The resulting amplified material was processed for high-throughput sequencing and sequenced for single-end 100-bp reads on a NovaSeq 6000 platform. Approximately 30 million reads were sequenced for every condition.
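As a bookkeeping aid, the reagent volumes listed for the tailing and ligation mixes can be checked against the stated mix volumes. The sketch below only re-adds the numbers given in the paragraph above; the dictionary keys and helper name are hypothetical labels, and the final-buffer calculation assumes that the full digestion, tailing and ligation volumes are combined in one tube.

```python
# Volumes in µl, copied from the protocol text above.
DIGEST_UL = 10.0
TAILING_MIX_UL = {
    "CutSmart buffer 10x": 0.5,
    "Klenow (50 U per µl)": 0.25,
    "dATP (100 mM)": 0.05,
    "H2O": 4.2,
}
LIGATION_MIX_UL = {
    "T4 ligase buffer 10x": 3.0,
    "T4 ligase (5 U per µl)": 0.5,
    "stubby adaptor": 0.25,
    "H2O": 11.25,
}

def mix_total(mix):
    """Total volume (µl) of a reagent mix."""
    return sum(mix.values())

tailing_total = mix_total(TAILING_MIX_UL)    # stated as 5 µl
ligation_total = mix_total(LIGATION_MIX_UL)  # stated as 15 µl

# Assumption: all volumes end up in the same tube during ligation.
reaction_total = DIGEST_UL + tailing_total + ligation_total
ligase_buffer_x = 10 * LIGATION_MIX_UL["T4 ligase buffer 10x"] / reaction_total

print(round(tailing_total, 2), round(ligation_total, 2),
      round(reaction_total, 2), round(ligase_buffer_x, 2))
```

Under that assumption the component volumes add up to the stated 5 µl and 15 µl, and the 10× ligase buffer ends up at about 1× in the combined 30-µl reaction.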

For LaminB1 pA-DamID experiments, we processed two independent recombined or Cas9-edited clones except for the isolation of R2-3 in an iLAD and close to LAD4 where only one edited clone could be selected.

For H3K9me3 pA-DamID experiments, we processed one recombined clone for recombinations d3, d6, i10 and i12 and two recombined clones for recombination i7.

For H3K9me2 and H3K27me3 pA-DamID experiments, we processed one recombined clone for recombinations d3, d6 and i7.

Hi-C

Hi-C was performed as described previously77. Cells were crosslinked using 2% formaldehyde. DNA was digested with the restriction enzyme MboI, followed by the incorporation of biotin-labeled nucleotides at the overhangs and subsequent blunt-end ligation. Streptavidin pulldown was used to enrich the ligated DNA. Sequencing libraries were constructed with the standard end-repair and poly(A) tailing protocols.

RNA-seq

A total of 1 million cells were harvested, washed once in cold PBS, resuspended in 600 µl of RLT buffer (RNeasy mini kit, 74104, Qiagen) and stored at −80 °C until subsequent RNA isolation. RNA was isolated using the RNeasy mini kit (74104, Qiagen). Libraries were prepared using the TruSeq stranded mRNA library prep (20020594, Illumina) and TruSeq RNA single indices set A (Illumina) kits. Libraries were sequenced for single-end 75-bp reads on a NextSeq 550 platform. Approximately 25 million reads were sequenced for every condition.

For recombinations d6, i10 and i12, one recombined clone was processed, whereas two recombined clones were processed for recombinations d3 and i7.

Computational analysis

Tagmentation mapping

We adapted a previously published pipeline78 and wrapped it in an R package to make it easier to use. The pipeline was modified to map integration sites of the SB transposon and to determine whether each integration occurred on the CAST or on the 129S1 allele.

pA-DamID

To process pA-DamID data, we adapted a previously published pipeline76 to align reads to the mouse reference genome (release M23, mm10)79 and map them to the CAST and 129S1 alleles. We used the WASP toolkit80 for the latter step, providing single-nucleotide polymorphism (SNP) data for the CAST and 129S1 alleles, available from the Mouse Genomes Project (https://www.mousegenomes.org/publications/)81,82.

LAD coordinates were determined using hidden Markov modeling on the average NL interaction profile between biological replicates in WT F1 hybrid mES cells (https://github.com/gui11aume/HMMt).

Differential pA-DamID tracks of NL interactions were smoothed with the ‘runmean’ function from the caTools package, with a moving window of width 9 (k = 9).
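As an illustration of this smoothing step, the following Python sketch computes a centered running mean with a window of nine values. It is not the caTools implementation; the function name `running_mean` is hypothetical, and truncating the window at both ends of the track is an assumption intended to approximate the default edge handling of ‘runmean’.

```python
def running_mean(values, k=9):
    """Centered moving average over k values (k must be odd).

    Assumption: near the ends of the track the window is truncated,
    so fewer than k values are averaged there.
    """
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    half = k // 2
    smoothed = []
    for i in range(len(values)):
        # Slice bounds are clipped to the track, which shrinks the window at the edges.
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

With k = 9, each interior value is replaced by the mean of itself and its four neighbors on either side.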

Domainograms

To visualize changes in NL interactions, we adapted a previously described pipeline39 to analyze genomic deletions and inversions. For genomic deletions, pA-DamID scores computed for each GATC fragment of the nonrecombined genomes were smoothed. For recombined genomes, GATC fragments overlapping the deleted region were removed before smoothing. pA-DamID plots and domainograms are plotted on the reference genome. For genomic inversions, coordinates of the GATC fragments overlapping the inverted region were inverted and data were then smoothed. Because pA-DamID plots and domainograms are plotted on the inverted genome, the same GATC fragments were inverted after smoothing for the nonrecombined genomes. The pipeline is available from GitHub (https://github.com/vansteensellab/domainograms/blob/dauban_etal/plot_domainograms.R).

For combined inversions and Cas9 deletions, the coordinates of GATC fragments overlapping the inverted region were inverted in silico, the GATC fragment overlapping the deleted region was removed and data were smoothed. The pipeline is available from GitHub (https://github.com/vansteensellab/domainograms/blob/dauban_etal/plot_domainograms_combined_recombination.R).
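The coordinate handling described above can be sketched in Python as follows. This is a simplified illustration and not the code of the linked pipeline: the function names and the (start, end, score) tuple layout are hypothetical, and only fragments lying entirely within the inverted region are mirrored, which is an assumption about how boundary fragments are treated.

```python
def invert_fragments(fragments, inv_start, inv_end):
    """Mirror fragments contained in [inv_start, inv_end]; keep the others unchanged.

    fragments: list of (start, end, score) tuples on one chromosome.
    """
    out = []
    for start, end, score in fragments:
        if start >= inv_start and end <= inv_end:
            # Reflect both coordinates around the center of the inverted region.
            start, end = inv_start + inv_end - end, inv_start + inv_end - start
        out.append((start, end, score))
    return sorted(out)


def drop_deleted(fragments, del_start, del_end):
    """Remove every fragment that overlaps the deleted region [del_start, del_end]."""
    return [f for f in fragments if f[1] < del_start or f[0] > del_end]
```

In this sketch, smoothing would be applied to the scores only after the fragments have been reordered or removed, so that the window follows the fragment order of the recombined genome.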

All pA-DamID scores were converted to z scores to minimize variability across experiments and obtain comparable dynamic ranges. Across all experiments, one z score unit corresponds on average to 0.83 log2 units.
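A minimal sketch of such a z-score conversion is given below. It assumes that the mean and the sample standard deviation are computed over all scores of one experiment; the exact set of fragments used for scaling is not specified here, and the function name `z_scores` is hypothetical.

```python
from statistics import mean, stdev


def z_scores(scores):
    """Center scores on their mean and scale them by their sample standard deviation."""
    mu = mean(scores)
    sd = stdev(scores)  # assumption: sample (n - 1) standard deviation
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mu) / sd for s in scores]
```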

Gene expression was computed from publicly available RNA-seq experiments performed in WT F1 hybrid mES cells83 (4DNFIPFKK5LM). Data are not allele specific.

The CAST (129S1) allele in the recombined clone was always plotted against the CAST (129S1) allele in the nonrecombined clone, except in H3K9me2 and H3K27me3 pA-DamID experiments, in which the CAST allele in the recombined clone was plotted against the 129S1 allele of the same clone.

Transcription factor motif scanning

The number of cKrox, YY1 and Trl motifs in R4-5, as well as in the remainder of LAD1, LAD2.1 and LAD2.2, was computed using MotifScan (https://motifscan.readthedocs.io/en/latest/index.html). The weight matrices of cKrox and YY1 motifs were downloaded from the latest version of the Hocomoco database (ZBT7B.H12CORE.0.SM.B and H12CORE.0.PSM.A, respectively). The weight matrix of the Trl motif was downloaded from the JASPAR database (MA0205.1)84. The number of motifs detected in each region was then normalized to the size of that region.

Hi-C data processing

To generate allele-specific Hi-C contact maps, a custom hybrid 129S1/CAST genome was used. Mouse strain-specific variants were obtained from the Mouse Genomes Project81 and homozygous SNPs were filtered using SnpSift85 version 4.3p. Then, bcftools86 version 1.9 was used to insert single-nucleotide variants into the GRCm38/mm10 mouse genome and the resulting fasta files were concatenated to produce one diploid hybrid genome. Sequencing reads were aligned to the hybrid genome using bwa-mem87 version 0.7.17-r1188 with the ‘-SP’ option. BAM files were created using SAMtools88 version 1.19.2. Then, pairtools89 version 1.1.2 was used to perform parsing of the alignments into contact pairs, duplicate removal and phasing. Next, cooler90 version 0.10.3 was used to create normalized contact matrices from the phased read pairs. The matrices were loaded for subsequent analysis and visualization in R using GENOVA91.

Micro-C data from a previous study44 (4DNFI6HG4GP3) were visualized using Juicebox92.

Allele-specific RNA-seq data processing

We obtained the 129S1 (GCA_001624185.1) and CAST/EiJ (GCA_001624445.1) genome fasta files from the UCSC genome browser website (https://hgdownload.soe.ucsc.edu/hubs/mouseStrains/hubIndex.html). Reads were mapped to the two genomes separately (STAR aligner93) and only those with MAPQ > 10 were kept (SAMtools88). Reads coming from each allele were then split using a custom R script. Briefly, the alignment score (AS tag in BAM file) of a given read to the 129S1 genome was compared to its alignment score to the CAST genome. If it was higher for the 129S1 (CAST) genome, the read was assigned to the 129S1 (CAST) genome. If alignment scores were equal and greater than zero, the read was tagged as ambiguous. If both alignment scores were zero, the read was tagged as unmapped. Finally, separate allele-specific, ambiguous and unmapped BAM files were produced as output. To generate allele-specific gene counts, gene annotations from GENCODE (release M23, mm10)79 were first lifted over to 129S1 and CAST genomes using Liftoff (https://github.com/agshumate/Liftoff)94. Then, read coordinates from the allele-specific filtered BAM files were overlapped with the corresponding lifted-over gene coordinates using the ‘summarizeOverlaps’ function from the GenomicAlignments Bioconductor package95. These allele-specific read counts were used for subsequent analyses.
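The alignment-score comparison used to split reads can be summarized by the following Python sketch. It is not the custom R script itself; the function name `assign_allele` is hypothetical, and representing a read that is absent from one alignment by a score of zero is an assumption.

```python
def assign_allele(as_129s1, as_cast):
    """Classify one read from its alignment scores (AS tags) on the two genomes.

    Assumption: a read with no alignment to a genome is given a score of 0.
    """
    if as_129s1 == 0 and as_cast == 0:
        return "unmapped"
    if as_129s1 > as_cast:
        return "129S1"
    if as_cast > as_129s1:
        return "CAST"
    # Equal, nonzero scores: the read does not discriminate between alleles.
    return "ambiguous"
```

Reads overlapping no SNP or indel between the strains align equally well to both genomes and therefore fall into the ambiguous class.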

To assess gene expression changes, we first normalized read counts in the parental cell line over the total number of reads in the nonrecombined one. For each gene of interest, we then compared the number of reads coming from the CAST allele (129S1) in the recombined clone to the number of reads coming from the CAST allele (129S1) in the nonrecombined clone over the total number of reads detected in each allele (129S1). We performed Fisher's exact tests on these gene-specific matrices and P values were corrected for multiple testing by multiplying them by the number of tests performed (Bonferroni correction). Genes were considered as significantly downregulated or upregulated if the Fisher test P value was lower than 0.05 and the fold change was greater than 2.
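The statistical procedure can be illustrated with a small, dependency-free Python sketch. It assumes that each gene-specific matrix is a 2 × 2 table of the form [[gene reads, other reads] in the recombined clone, [gene reads, other reads] in the nonrecombined clone], which is one reading of the description above. The function names are hypothetical, and the exact-arithmetic implementation is meant for small example counts only; a statistics library would be preferable for real read depths.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capping the result at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]
```

A gene would then be called differentially expressed when its corrected P value is below 0.05 and its fold change exceeds the twofold threshold.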

In recombinations d3 and d6, no CAST allele-specific reads were detected for most of the genes within the deleted region. A small number of CAST-specific reads were still detected for the Tdgf1 gene (corresponding to a tenfold decrease in gene expression). These reads were discarded after visual inspection because they resulted from sequencing and mapping issues.

Statistics and reproducibility

No statistical method was used to predetermine sample size. Data distribution was assumed to be normal but this was not formally tested. Sample position (for lysis, sequencing index selection, PCR and high-throughput sequencing) was random. Blinding was not relevant in this study. All the data generated for this manuscript were analyzed in an automated manner. No data were excluded from the analysis. Every experiment was performed in duplicate, following the convention in the field.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.