Extended Data Fig. 2: Computational and in vitro characterization of MSL2 CXC and CTD domains.
From: RNA nucleation by MSL2 induces selective X chromosome compartmentalization

a, MSL2 ChIP–seq analyses (data published in ref. 10). Boxplots display the change in input-normalized ChIP enrichment in roX1ex6 Df(1)roX252 (abbreviated as roX1 roX2) versus wild-type on MSL2-enriched 1 kb bins, HAS (n = 267/26 on X/autosomes) or random regions. The random regions were generated by shuffling the enriched 1 kb bins. A negative value indicates a loss of enrichment in the mutant, while a positive value indicates a gain. The enrichment score was calculated using deepTools multiBigwigSummary. 1 kb bins with a score >0.7 in either genotype were classified as enriched (n = 1,893 enriched/144,916 total bins, of which n = 1,448 reside on X and n = 445 on autosomes). A Welch two-sample t-test for the difference in enrichment of MSL2 at HAS in wild-type versus roX1 roX2 revealed a P-value of 4.387 × 10⁻⁷³. b, Genome-browser snapshot of MSL2 ChIP–seq, as in a. Data normalization is described in Methods. c, Analysis of adult male viability in flies expressing transgenic roX2 (tg1, tg2, tg3) in a roX1SMC17A roX2∆ background. The data were expressed relative to females obtained from the same cross. The barplot represents the mean ± s.e.m. with overlaid datapoints reflecting the results from n = 6 vials/crosses per genotype. Details on genotypes and nature of transgenes are provided in Methods and Supplementary Data 4. d, RT–qPCR analysis of the indicated genes in roX1SMC17A roX2+ (bar 1) or roX1SMC17A roX2∆ ; roX2tg1/tg2/tg3 (bars 2–4, see Methods) or wild-type female (bar 5) L3 larvae. All genotypes additionally contained the msl-2::HA allele. The RNA level of each gene was normalized to RpL32 and expressed relative to roX1SMC17A roX2+ male flies. The barplot represents the mean ± s.e.m. of n = 5 larvae, with each overlaid datapoint reflecting the result from one individual larva. e, Representative Coomassie-stained SDS–PAGE of the purified GST-CXC domains of D. melanogaster (Dmel), D. virilis (Dvir) and H. sapiens (Human). 
GST was used as negative control (−ctrl). For source data, see Supplementary Fig. 1. f, Biolayer interferometry (BLItz) experiments were conducted to quantify binding of the recombinant GST-tagged CXC proteins of different species as in e to the biotinylated ATGAGCGAGATG dsDNA (referred to as S12 in ref. 12). The barplots report the mean ± s.e.m. of the fitted rate constants K(on1) and K(off1) (left); the boxplots display the equilibrium KDs (right). Parameters were determined from two independent protein purifications. In each independent experiment (Dmel CXC n = 12, Dvir CXC n = 7, Human CXC n = 6), the proteins were measured over a concentration range of at least 5 different dilutions. Only measurements at concentrations above 1 μM were taken into account for fitting using a 2:1 binding model (also see Methods). The P-values were obtained by a two-sided Wilcoxon rank sum test. g, Representative BLItz experiments as in f, with observed signals shown in grey and fitted in red-blue colour. h, Sequence alignment of the CTD of MSL2 in different mammalian and Drosophila species abbreviated as Dmel = D. melanogaster, Dsim = D. simulans, Dsec = D. sechellia, Dyak = D. yakuba, Dere = D. erecta, Dana = D. ananassae, Dpse = D. pseudoobscura, Dwil = D. willistoni, Dmoj = D. mojavensis, Dvir = D. virilis, Dgri = D. grimshawi. The numbering at the top refers to the relative amino-acid position, starting from the CTD. i, Evaluation of the poly-proline bias in the MSL2 CTD of Drosophila species abbreviated as in h compared to mammalian species. Prolines are indicated with a grey bar. The amino acid sequence of the D. melanogaster CTD is shown below, with low-complexity regions identified as significantly enriched by the dAPE software shown in colour. j, Top, illustration of the published crystal structure of the MSL2 CXC domain in complex with DNA (PDB: 4RKH). Residues with signs of positive selection are coloured in red, conserved residues in blue. 
Bottom, sequence alignment of the MSL2 CXC domains of different species abbreviated as in e. Residues with signs of positive selection (also see Supplementary Data 2) are coloured in red, conserved residues in blue. k, Representative Coomassie-stained SDS–PAGE of the purified Drosophila dCXC-CTD, as well as the Drosophila dCTD proteins. Purified His-Smt3 was used as negative control in several in vitro assays. For source data, see Supplementary Fig. 1. l, Biolayer interferometry (BLItz) experiments were conducted to quantify binding of Drosophila dCXC-CTD or dCTD proteins to the biotinylated dsDNA CGAATATGAGCGAGATGGATG (CES11D1). The BLItz experiments for the dCXC-CTD (20 and 10 μM) are shown in black/grey, while the dCTD (40 and 20 μM) is shown in blue. The two lines at each concentration indicate two independent measurements. m, Absorbance at 280 nm recorded during anion exchange chromatography conducted after the His-Talon purification of the dCXC-CTD construct. Two peaks were observed, of which the second one corresponds to co-purifying nucleic acids. n, Absorbance at 280 nm recorded during gel filtration of the Drosophila dCXC-CTD and mammalian mCXC-CTD proteins. Final pooled fractions used for in vitro assays are indicated. o, Representative BLItz experiments of the recombinant D. melanogaster GST-dCXC and untagged dCXC-CTD protein binding to the CGAATATGAGCGAGATGGATG dsDNA (referred to as CES11D1 in ref. 12). The experimental signal is coloured in grey and the fitted data overlaid in colour. The concentrations from top to bottom in the panels were 40, 16, 4, 2, 0.66 and 0.33 μM for GST-dCXC and 100, 75, 50, 37.5, 25, 10, 4, 1.5 and 0.75 μM for dCXC-CTD. Fitted rate constants from n = 5 (dCXC-CTD) or n = 4 (GST-dCXC) independent protein purifications are reported in Fig. 1i. 
Significance was evaluated with a two-sided Wilcoxon rank sum test, where the difference in equilibrium constant (KD1) revealed a P-value of 1.26 × 10⁻⁷, n = 29 (dCXC-CTD) versus n = 21 (GST-dCXC) measurements.
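
The bin classification and enrichment change described for panel a can be illustrated with a minimal Python sketch. It assumes per-bin enrichment scores are already available as two equal-length lists (for example, exported from deepTools). The function names, the input layout and the example scores are hypothetical and are not taken from the study. Only the 0.7 cutoff, the "either genotype" rule and the sign convention (mutant minus wild-type) come from the legend. The Welch statistic is computed by hand for illustration; the exact test inputs used by the authors are not specified here.

```python
from math import sqrt
from statistics import mean, variance

THRESHOLD = 0.7  # enrichment cutoff stated in the legend


def classify_enriched(wt, mut, threshold=THRESHOLD):
    """Indices of bins whose score exceeds the threshold in either genotype."""
    return [i for i, (w, m) in enumerate(zip(wt, mut))
            if w > threshold or m > threshold]


def enrichment_change(wt, mut, indices):
    """Mutant minus wild-type score; negative means loss in the mutant."""
    return [mut[i] - wt[i] for i in indices]


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up scores for six 1 kb bins (illustrative only, not data from the study).
wt_scores = [1.2, 0.9, 0.1, 0.8, 0.3, 1.5]
mut_scores = [0.4, 0.2, 0.2, 0.9, 0.1, 0.6]

enriched = classify_enriched(wt_scores, mut_scores)
change = enrichment_change(wt_scores, mut_scores, enriched)
t_stat, dof = welch_t([wt_scores[i] for i in enriched],
                      [mut_scores[i] for i in enriched])
```

In this toy input, bins 0, 1, 3 and 5 pass the cutoff. Bin 3 shows a small gain, and the other three show a loss. Converting the t statistic to a P-value requires the t distribution, which the Python standard library does not provide. A statistics package would be needed for that step.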