Introduction

Ewing sarcomas (ES) are rare and highly aggressive malignant bone and soft tissue tumors that mostly affect adolescents and young adults (https://rarediseases.org/rare-diseases/ewing-sarcoma/). ES are characterized by EWSR1/ETS translocations, frequently involving EWSR1 and FLI1 (hereafter EWS::FLI1), that appear to be the only detectable oncogenic event in as many as 20% of tumors1,2,3. Since the initial description of a new entity called “diffuse endothelioma of the bone” by Dr. James Ewing in 1921 (ref. 4) and the discovery of the fusion oncoprotein in 1992 (ref. 5), the fundamental question in ES research has been the identity of the progenitor cell that gives rise to a tumor with a transcriptional profile characteristic of ES. The current hypothesis is that this rare cancer arises from a stem-cell precursor6. Indeed, among different primary human cells, only pluripotent stem cells display permissiveness for exogenous EWS::FLI1 expression7,8. In addition, ES tumors display transcriptional features of mesenchymal stem cells (MSCs)9, neural crest stem cells10, and endothelial tissues11. Furthermore, the normal tissues that show Ewing-like transcriptomes belong to fetal developmental transitions such as gastrulation12. These data led to the hypothesis that a primitive neural crest-derived progenitor at the transition to mesenchymal and/or endothelial differentiation could be transformed into ES11.

Human MSCs (hMSCs) are multipotent precursors that can be differentiated in vitro into various cell types, typically osteoblasts, chondrocytes, and adipocytes, but also into endothelial lineage13. EWS::FLI1 expression in adult hMSCs (haMSCs) blocks their differentiation and generates a transcriptome profile reminiscent of ES8. Many of these EWS::FLI1-regulated genes are even more induced when the oncogene is expressed in human pediatric MSCs (hpMSCs), while numerous genes that are among the most prominent ES markers are induced in hpMSCs but not in haMSCs14. Notably, although hpMSCs provide a far more permissive environment, hpMSCs that express EWS::FLI1 (EF-hpMSCs) are, like EF-haMSCs, unable to form tumors in vivo, which has led to the suggestion that a cooperating genetic event or a unique cellular phenotype is additionally required. To address these two issues, Gordon and colleagues developed an approach to model ES that exploits the in vitro potential of human embryonic stem cells (heSCs) to differentiate into cells from the three germ layers when forming embryoid bodies (EBs)15. EWS::FLI1 expression in EBs derived from p53-deficient heSCs leads to in vitro transformation, yet EB-derived cells lack tumor formation capacities in vivo15. It should be noted that although p53 mutation has been described in up to 50% of ES cell lines, p53 is rarely mutated in primary ES tumors.

The difficulties in identifying a suitable cellular environment that simultaneously recapitulates the oncogene’s ability to block cell differentiation and induce tumor transformation have hindered the understanding of the underlying mechanisms that trigger ES and the generation of reliable experimental murine models. Regarding the latter, EWS::FLI1 fails to transform immortalized mouse fibroblasts other than NIH3T316, but mouse MSCs transfected with EWS::FLI1 form tumors when injected subcutaneously into nude mice7. More recently, it was shown that ex vivo expression of EWS::FLI1 efficiently transforms embryonic osteochondrogenic progenitors, indicating that embryonic precursor cell plasticity may be critical for uncontrolled cell growth17. Despite this tumorigenic ability, the transcriptional profile of the tumors generated by EWS::FLI1 expression in mouse cells only partially resembles the human ES transcriptome. Besides, all efforts to model a transgenic mouse that generates ES-like tumors have failed18.

Here we have isolated, cultured, and characterized human embryonic mesenchymal stem cells (heMSCs) from heSC-derived experimental teratomas. To investigate whether EWS::FLI1 expression is tolerated, heMSCs were transduced with a Flag-tagged EWS::FLI1 lentivirus. Oncogene expression induced a transcriptional profile indicative of an aberrant differentiation towards an endothelial-neural hybrid phenotype consistent with the ES signature. Chromatin immunoprecipitation sequencing (ChIP-seq) revealed that, in addition to intergenic regions, EWS::FLI1 binds mainly to intronic microsatellites (>10 CA dinucleotides) in heMSCs. Furthermore, we identified BRCA1, an essential protein in the DNA damage repair (DDR) process, as a direct EWS::FLI1 target. An intrinsic DDR defect characterizes EWS::FLI1-infected heMSCs, providing high sensitivity to DNA-damaging agents, a characteristic feature of human ES. Inoculation of the transduced cells into the gastrocnemius of mice led to the development of tumor masses in the spine or abdominal cavity, as well as lungs, liver, and kidneys. Analysis by spatial transcriptomics of these tumors confirmed a transcriptome compatible with ES and revealed the expression of tumor-specific proteins not described before. In this work, we demonstrate that EWS::FLI1 expression is the only genetic event required to induce ES. We describe an innovative experimental method to study critical aspects of the biology of developmental tumors, from leukemia to sarcomas, in which few (even single) genetic alterations are able to transform fetal stem cells.

Results

Characterization of human embryonic mesenchymal stem cells (heMSCs)

ES may arise from early pluripotent precursor cells that tolerate EWS::FLI1 expression, in which the fusion oncoprotein inhibits cell differentiation. Recently, experimental teratomas have been proposed as a useful multilineage model of human development, as they can produce a wide range of cell types from the major developmental lineages that are transcriptionally similar to the corresponding human fetal cell types19. We used this model as a source of early mesenchymal progenitors in which to assess the effects of EWS::FLI1. As expected, inoculation of heSCs into mouse testes resulted in the formation of teratomas (Supplementary Fig. 1A), which contained elements of the three germ layers (ectoderm, mesoderm, and endoderm) (Supplementary Fig. 1B). To isolate mesenchymal progenitor cells, teratomas were disaggregated and adherent cells were maintained in culture. Isolated cells exhibited a fibroblastic morphology and human nuclear staining (Fig. 1A). Serial culture passages showed that proliferation of heMSCs decreased progressively, and cells eventually stopped proliferating and became senescent (Fig. 1B). Next Generation Sequencing (NGS) using the commercial AmpliSeq™ for Illumina Childhood Cancer Panel (Supplementary Fig. 1C) detected no pathogenic variants in any of the heMSC lines. heMSCs expressed CD73, CD105, CD90, and HLA-I; lacked CD45, CD34, and CD31; and were able to differentiate into osteogenic, chondrogenic, and adipogenic precursors, similar to hpMSCs (Fig. 1C, D). Therefore, the surface marker expression profile and differentiation potential displayed by heMSCs fulfilled the definition of MSCs20.

Fig. 1: Isolation, culture, and characterization of heMSCs derived from experimental teratomas.

A Phase contrast microscopy and human nuclei immunohistochemistry of heMSC cultures (n = 2 independent experiments). B Upper panel, proliferative exhaustion of heMSCs during passaging. Data from one representative experiment (from two independent experiments) are presented by plotting population doubling time (PDT) against passages. Lower panel, bright field pictures of the heMSCs cultured for more than 20 passages and senescence-associated β-Galactosidase (SA-β-Gal) staining of these cultures. C Characterization of surface markers in heMSCs by flow cytometry. D heMSC differentiation to osteogenic (alizarin red staining), adipogenic (oil red staining), and chondrogenic (alcian blue staining) lineages upon culture with differentiation media. In both C and D, pediatric hMSCs (hpMSCs) are shown as a positive control (n = 2 independent experiments). Magnification bars: 50 μm. E Plot shows top 5 significant terms in control heMSC-1 when compared to hpMSCs, based on the RNA-seq data analysis, with the genes contributing to pathway-level enrichment (core enriched genes). Color depicts the −log10(p value) × sign(FC). A two-sided test was performed with the pre-ranked gene set enrichment analysis (GSEA) based on limma-derived statistics (−log(p value) × signFC). P values were adjusted for multiple testing using the Benjamini–Hochberg false discovery rate (FDR).
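The ranking statistic and multiple-testing correction named in the legend of Fig. 1E can be illustrated with a minimal Python sketch. It assumes a base-10 logarithm (as in the color scale) and uses hypothetical function names; it is not the authors' limma/GSEA pipeline, only an illustration of the two formulas.

```python
import math

def rank_metric(p_value, log_fc):
    """Signed ranking statistic: -log10(p value) x sign(fold change)."""
    sign = (log_fc > 0) - (log_fc < 0)
    return -math.log10(p_value) * sign

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Genes would be sorted by `rank_metric` before pre-ranked GSEA, and terms would be called significant when the adjusted value falls below the chosen FDR threshold.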

Comparison of heMSCs with hpMSCs isolated from bone marrow aspirates showed that heMSC transcriptomes were enriched in genes involved in mitochondrial function and RNA processing, whereas the transcriptome of hpMSCs was enriched in genes of embryonic skeletal development, which may reflect not so much a marked increase of these genes in the hpMSC transcriptome as their lower expression in heMSCs (Fig. 1E; Supplementary Fig. 2A; Supplementary Data 1). Mitochondria are central metabolic organelles, and their arrangement and fusion-fission state regulate O2 consumption and bioenergetics21,22. Importantly, mitochondria are involved in stemness and differentiation. We assessed the morphology of heMSC and hpMSC mitochondria using Mito Red, a cell-permeable, membrane potential-dependent Rhodamine-based dye. Fluorescence microscopy showed that heMSC mitochondria are punctate and distributed mainly perinuclearly, closely resembling those of A673 cells (Fig. 2A) and heSCs21. In contrast, hpMSC mitochondria show a filamentous network distributed throughout the cell soma (Fig. 2A). This networked mitochondrial pattern was present in all of the hpMSC cultures (Supplementary Fig. 2B). Analysis of the expression levels of mitochondrial metabolic enzymes showed that they are similar overall (Fig. 2B), indicating a conserved mitochondrial mass ratio for both heMSCs and hpMSCs. Mitochondria in heSCs and human pluripotent stem cells (hPSCs) are punctate or fragmented, but switch to filamentous mitochondria upon differentiation21. This morphologic change is associated with a metabolic transition by repression of UCP2, a mitochondrial uncoupling protein that dissociates oxidative phosphorylation from ATP synthesis21. Close inspection of the RNA-seq transcriptional profiles showed that UCP2 and UCP3 (the mitochondrial proton-leak protein primarily expressed in skeletal muscle) were differentially expressed in heMSCs (Supplementary Data 1). The elevated expression levels of UCP2 and, to a lesser extent, UCP3 in heMSCs and ES cells were confirmed by RT-qPCR (Fig. 2C). UCP2 redirects glucose into anabolic pathways by increasing pyruvate oxidation and tricarboxylic acid cycle (TCA) flux22. Accordingly, expression levels of GLUT1, a major glucose transporter, and GLUT4, an insulin-regulated facilitative glucose transporter, were significantly higher in heMSCs than in hpMSCs (Fig. 2C), and heMSCs exhibited enhanced levels of TCA intermediate metabolites (Fig. 2D). Altogether, the underdeveloped appearance of the mitochondria, the elevated UCP2 expression levels, and the enhanced production of TCA metabolites reflect the earlier progenitor status of heMSCs compared to hpMSCs.

Fig. 2: Characterization of the mitochondrial phenotype and metabolic features of heMSCs.

A Representative merged images of mitochondrial morphology and distribution abundance in heMSCs, hpMSCs, and A673 cells incubated with the mitochondrial potential-dependent dye Mito Red and the membrane-permeable live-cell labeling dye Calcein AM (in green), and shown by fluorescence microscopy at two different original magnifications. The pictures correspond to three lines of heMSCs and four lines of hpMSCs from different teratomas and individuals, respectively. Gray images, Mito Red pictures of the merged images above. Magnification bar: 10 μm. Observe the fragmented mitochondrial network in heMSCs and A673 cells. B Expression of metabolic enzymes in hpMSCs, heMSCs, and A673 cells, as determined by RT-qPCR, and referred to the mRNA of the mitochondrial translocon TIMM23. HK1 and GAPDH, glycolysis; ACLY, synthesis of cytosolic acetyl-CoA; Citrate synthase, Aconitase 2, IDH2, OGDH, and Fumarate hydratase, Krebs cycle; SHMT2, glycine synthesis; PCK2, conversion of oxaloacetate to phosphoenolpyruvate. Data obtained from one experiment performed in triplicate, and expressed as mean ± s.d. C Expression of the uncoupling proteins UCP2 and UCP3 and glucose transporters GLUT1 and GLUT4 in hpMSCs, heMSCs, and Ewing sarcoma cells, determined as in (B). In blue, Ewing sarcoma cell lines with the EWS::ERG (CHLA25 and COG-E-352) and EWS::FEV (TC205) fusions. D Levels of TCA metabolites in heMSCs, hpMSCs, and A673 cells. The statistics for comparisons of the heMSC and hpMSC groups in (B–D) were performed using the two-tailed unpaired t-test.
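As an illustration of how RT-qPCR values can be referred to a reference transcript such as TIMM23, the following sketch applies the common 2^-ΔCt calculation to technical triplicates. The legend does not state which quantification formula was used, so the 2^-ΔCt choice, the function names, and any Ct values passed in are assumptions made for this example.

```python
import statistics

def relative_expression(ct_target, ct_reference):
    """Expression of a target relative to a reference gene (2^-dCt).

    Assumes ~100% amplification efficiency for both primer pairs.
    """
    return 2.0 ** -(ct_target - ct_reference)

def summarize_triplicate(ct_targets, ct_references):
    """Return (mean, sample s.d.) of relative expression across replicates."""
    values = [relative_expression(t, r)
              for t, r in zip(ct_targets, ct_references)]
    return statistics.mean(values), statistics.stdev(values)
```

For instance, a target that crosses threshold two cycles later than the reference would be reported at one quarter of the reference level.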

EWS::FLI1 expression in heMSCs induces an ES transcriptional profile

We investigated the possibility that EWS::FLI1 might rewire the transcriptome of heMSCs more efficiently than that of hpMSCs. Expression levels of the Flag-tagged EWS::FLI1 were significantly lower than those in the ES cell line A673 and did not affect the constitutive expression of the ES marker CD99 (Fig. 3A). However, they were sufficient to induce the expression of the tumor suppressor RB and the p53 target gene p21 (Fig. 3B), indicating that p53-p21-RB1 signaling is functionally intact in heMSCs. Consistent with the critical role of p53 in maintaining chromosomal and genomic stability, the DNA content of both control and EWS::FLI1-expressing cells maintained a normal diploid cell cycle profile (Supplementary Fig. 3A, B). As previously described for hMSCs, expression of well-established EWS::FLI1 target genes such as EZH2, IGF, NGFR, or PADI2 (ref. 8), among others, was induced, while ENC1 and LOX expression was repressed when the oncogene was expressed in three different heMSCs, as shown in Fig. 3C and Supplementary Fig. 3C.

Fig. 3: Transcriptional changes in heMSCs expressing EWS::FLI1 (EF-heMSCs).

A EWS::FLI1 and CD99 expression in heMSCs at 48 h after infection and in A673 cells. Actin, loading control. At the bottom, densitometric quantification of western blot signals normalized to Actin intensity. RT-qPCR data obtained from two independent experiments performed in triplicate are expressed as mean ± s.d. Statistics performed by two-tailed unpaired t-test. B Western blot detection of proteins of the p53-p21-RB1 axis in EF-heMSCs (n = 2 independent experiments). Actin, loading control. C RT-qPCR to detect the expression of some of the induced and repressed oncogene targets in heMSC-1 cells at 48 h after EWS::FLI1 infection (n = 2 independent experiments performed in triplicate). Data express mean ± s.d. Statistics performed by two-tailed unpaired t-test. D Gene-concept networks of the top 5 significantly enriched terms in EF-heMSCs vs. control heMSCs. Data were analyzed as indicated in Fig. 1E. E RT-qPCR to determine the expression of 15 of the 30 top genes most potently induced by EWS::FLI1 in heMSC-1 cells at 48 h. Data from three independent experiments performed in triplicate are expressed as mean ± s.d. A two-tailed unpaired t-test was performed. F Expression values obtained from DepMap of the genes shown in (E) in Ewing sarcoma and other cancer cell lines (log2(TPM + 1)). A two-tailed Mann–Whitney U test was performed. G GSEA of EF-heMSCs transcriptomes in EWS::FLI1 signatures (EWS::FLI1 expression in UET-13 mesenchymal progenitors24, rhabdomyosarcoma RD cells10, and hMSCs8). H GSEA of EF-heMSCs transcriptomes in Ewing sarcoma signatures11,25. I, J Heatmap and PCA representation of the unsupervised clustering analysis of gene expression signatures in heSCs, hMSCs, and Ewing sarcoma samples. K GSEA of EF-heMSCs transcriptomes in the 400 most expressed and under-represented genes identified by the unsupervised clustering analysis of Ewing sarcomas. In (G, H, K), a two-sided test was performed with GSEA based on limma-derived statistics (−log(p value) × signFC). Running score plots show the cumulative enrichment of the gene set across the ranked gene list. The peak of the curve indicates the maximum enrichment score (ES). P values are adjusted for multiple testing using the Benjamini–Hochberg procedure (FDR < 0.05).
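The running score described in the legend can be sketched as the weighted running sum used in standard pre-ranked GSEA. The following is a simplified illustration under that assumption; the function name and any gene labels passed in are hypothetical, and the permutation-based significance testing of the real method is omitted.

```python
def running_enrichment(ranked_genes, scores, gene_set, weight=1.0):
    """Return the running-sum curve and its peak (the enrichment score).

    ranked_genes: gene identifiers sorted by the ranking statistic.
    scores: ranking statistic for each gene, in the same order.
    gene_set: set of gene identifiers being tested.
    """
    hits = [g in gene_set for g in ranked_genes]
    hit_weights = [abs(s) ** weight if h else 0.0
                   for s, h in zip(scores, hits)]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(hits)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, genes")
    running, curve = 0.0, []
    for w, h in zip(hit_weights, hits):
        # Step up at set members (weighted), step down otherwise.
        running += w / total_hit if h else -1.0 / n_miss
        curve.append(running)
    peak = max(curve, key=abs)  # maximum deviation from zero
    return curve, peak
```

A gene set concentrated at the top of the ranked list yields a curve that rises early and a positive peak; a set concentrated at the bottom yields a negative peak.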

RNA sequencing and independent analysis of the changes induced by EWS::FLI1 expression in three different heMSC lines (EF-heMSCs) identified 3836 differentially expressed genes (DEGs; 2204 up-regulated and 1632 down-regulated; p < 0.05) (Supplementary Data 2; Supplementary Fig. 4A). Normalized enrichment score (NES) analysis of the DEGs revealed that EWS::FLI1 induced transcriptional changes in genes mainly involved in two specific and interconnected functions: signal transduction cascades of G-protein-coupled receptors (GPCRs) and biological processes involving the circulatory system (Fig. 3D; Supplementary Fig. 4B). In contrast, the repressed functions associated with EWS::FLI1 expression in heMSCs were restricted to autophagocytosis and tRNA methylation (Supplementary Fig. 4C). In addition, and in agreement with the neuroectodermal and endothelial features of ES, EF-heMSCs displayed enhanced expression of proteins involved in signal transduction cascades in neural cells (e.g., NGFR, ALK, NTRK1, RET, GRM2, and NPY1R) and endothelial cells (e.g., UTS2, ACE, CDH5, DLL4, ECE1, and the Angiopoietin-1 receptor TEK). Importantly, most transcriptionally induced genes are previously undescribed EWS::FLI1 targets. A selection of 15 of the 30 genes most potently induced by the oncogene was validated by RT-qPCR (Fig. 3E; Supplementary Fig. 5A). Expression data extracted from DepMap (https://depmap.org/portal/) show that many of these genes exhibit an expression pattern specific to ES cell lines (Fig. 3F). Accordingly, among the most induced genes was PRKCB, described by Surdez et al. as critical for tumor cell survival and development in ES cells and expressed in primary ES tumors but not in other tumor types23 (Supplementary Fig. 5B). Moreover, oncogene-induced gene expression was dependent on the oncogene expression levels and was reversed when EWS::FLI1 expression was down-regulated with siRNAs targeting the fusion breakpoint or Flag, supporting the specificity of the oncogene-induced transcriptional profile (Supplementary Fig. 5C, D).

To further assess the significance of the transcriptional changes induced by the oncogene in our system, we performed gene set enrichment analysis (GSEA) comparing the transcriptome of EF-heMSCs to other cell systems that had previously recapitulated the transcriptional signature of ES to varying degrees. These analyses showed that the EF-heMSC transcriptome is enriched in genes induced by EWS::FLI1 in human bone marrow stromal cells24, and in a rhabdomyosarcoma cell line10 (Fig. 3G; Supplementary Data 2). Furthermore, the EF-heMSC transcriptome is enriched in genes induced by the oncogene in hpMSCs8, which constitutes thus far the most accepted cellular context of ES14 (Fig. 3G; Supplementary Data 2). GSEA using a publicly available database of soft tissue tumor gene expression profiles25 and genes expressed in ES11 demonstrated that EWS::FLI1 expression in heMSCs induced a transcriptional pattern enriched in genes of the ES signature (Fig. 3H; Supplementary Data 2), but not in those of other sarcomas such as osteosarcoma or rhabdomyosarcoma.

To further characterize the overlap of EF-heMSCs and ES transcriptomes, we performed a hierarchical unsupervised analysis, using heSCs and hMSCs jointly as a reference for ES transcriptional profiles deposited in GEO (Fig. 3I, J; Supplementary Data 3), and selected the 400 most up- and down-regulated genes for comparisons. GSEA revealed the significant enrichment of EF-heMSC transcriptomes in the expression of the highly expressed genes in ES, while genes repressed by the oncogene in EF-heMSCs were among those transcripts under-represented in ES (Fig. 3K; Supplementary Data 3). Altogether, these results validated the use of the heMSCs as a model to experimentally recreate a bona fide ES signature.

EWS::FLI1 binds to repetitive genomic regions in heMSCs

To identify binding sites of EWS::FLI1 in heMSCs, we performed chromatin immunoprecipitation with a Flag antibody to pull down only the Flag-tagged oncogene (avoiding contamination with endogenous FLI1-bound regions), followed by genome-wide sequencing (ChIP-seq). This analysis resulted in 3086 peaks (2836 annotated peaks) (Supplementary Data 4). Genome mapping revealed preferential binding of EWS::FLI1 to intergenic and intragenic regions, particularly within introns. Indeed, 1065 peaks in genes corresponded to intronic positions, while 1186 peaks corresponded to intergenic locations (Fig. 4A). Characterization of the chromatin states of heMSCs by profiling five core histone marks26 and overlaying the identified peaks revealed that 85% of the chromatin bound by EWS::FLI1 showed low levels of all histone marks (quiescent chromatin state). Among the genomic regions decorated with any of the histone marks, EWS::FLI1 binding was enriched in regions marked as Polycomb-repressed, heterochromatin, enhancers, and active transcription start sites, but not in regions actively transcribed in heMSCs (Fig. 4B; Supplementary Fig. 6A). Visualization of the distribution relative to the transcription start site (TSS) showed that EWS::FLI1 preferentially binds to regions 10–100 kb from the TSS (Fig. 4C). De novo motif analysis revealed that the primary DNA sequence underlying EWS::FLI1-binding sites upstream and downstream of the TSS is divergent: while the fusion oncoprotein binds to GGAA repeats in intergenic regions, the preferential binding sequence in intron and promoter regions is the tandem iteration of 10 or more CA dinucleotides (Fig. 4D).
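The two repeat classes described above (GGAA repeats and tandem runs of 10 or more CA dinucleotides) can be located in a sequence with a simple regular-expression scan. This is a minimal sketch for illustration, not the de novo motif discovery performed with MEME; the function name is hypothetical, only the given strand is scanned, and because the text gives no minimum length for GGAA repeats, the threshold is left as a caller-supplied parameter.

```python
import re

def find_tandem_repeats(sequence, unit, min_repeats):
    """Return (start, end) spans where `unit` repeats >= min_repeats times.

    Coordinates are 0-based, end-exclusive. Only the given strand is
    scanned; the reverse-complement unit must be searched separately.
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_repeats))
    return [(m.start(), m.end()) for m in pattern.finditer(sequence.upper())]
```

For example, `find_tandem_repeats(seq, "CA", 10)` would report CA microsatellites of the length class described for intronic and promoter peaks.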

Fig. 4: Identification and characterization of EWS::FLI1-binding sites in heMSCs.

A Left, genomic annotation of oncogene-bound regions in heMSC-1 cells 48 h after infection with a Flag-tagged EWS::FLI1, identified by ChIP-seq performed with a Flag antibody. Right, overlapping of peaks. Peak calling using the input as control was performed with MACS266. B Chromatin states associated with EWS::FLI1-bound peaks in heMSC-1 cells, performed with MACS2 tools using five core histone modification marks26: H3K27me3 (Polycomb repression, ReprPC); H3K9me3 (heterochromatin regions, Het); H3K4me1 (enhancer regions, Enh); H3K4me3 (promoter regions, TssA); and H3K36me3 (transcribed regions, Tx). Statistical significance of the relative frequency of EWS::FLI1 peaks in each chromatin state was assessed using a two-sided Fisher’s exact test. Numbers in the bars indicate P values, odds ratios, and confidence intervals. Quiet (Quiescent/Low) chromatin state was excluded from this graph to better visualize the data. C Percentage of EWS::FLI1-binding sites upstream and downstream from the transcriptional start sites (TSS) of the nearest genes. D Identification of EWS::FLI1-binding motifs by MEME tools68. E Overlap of genes associated with EWS::FLI1-binding peaks in heMSC and A673 cells. Bottom panel, annotations of EWS::FLI1 peaks in heMSCs corresponding to oncogene-bound genes. F Genome browser tracks depicting EWS::FLI1 binding to the PRKCB locus in heMSCs (top) and A673 cells (bottom)28. In blue, heMSC-1 cells infected with EWS::FLI1; in gray, heMSC-1 cells infected with control supernatants. Scale, 0-23. Bottom left panel, validation of EWS::FLI1 binding to intron 7 of PRKCB in heMSCs, detected by ChIP-qPCR in EF-heMSC-1 cells. Values referred to the percentage of input and were normalized with respect to the control condition. ACAT1, negative control. Bottom right panel, PRKCB induction is abolished after EWS::FLI1 knockdown. Data from two independent experiments performed in triplicate are expressed as mean ± s.d. 
Statistics performed by two-tailed unpaired t-test. G Sankey plots showing annotations of peaks corresponding to genes bound by EWS::FLI1 in both heMSC and A673 cells. In each of the plots, the transitions from distal intergenic (on the left), first intron (in the middle), and other introns (on the right) of the oncogene peaks in heMSCs to the peaks in A673 cells have been highlighted.
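The enrichment test named in panel B (two-sided Fisher's exact test with odds ratio) can be written out for a single 2×2 table as follows. This is a sketch under an assumed table layout, with peaks inside and outside a chromatin state in the first row and background regions in the second; the function name is hypothetical and confidence intervals are not computed.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns (sample odds ratio, p value). The p value sums the
    probabilities of all tables at least as unlikely as the observed one.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_value = sum(prob(x) for x in range(lo, hi + 1)
                  if prob(x) <= p_obs * (1 + 1e-9))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)
```

An odds ratio above 1 with a small p value would indicate that peaks fall in the chromatin state more often than expected from the background.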

Many of the genes associated with the ES gene signature are not direct targets of EWS::FLI1 in A673 cells27 (Supplementary Data 5). Because EWS::FLI1 induces an ES transcriptome in heMSCs (see Fig. 3), we wanted to know how many of these transcriptional changes in heMSCs were due to direct EWS::FLI1-binding activity. Strikingly, only a few EWS::FLI1-bound genes displayed significant transcriptional changes in heMSCs, accounting for about 5% of DEGs (Supplementary Fig. 6B, C). These results suggest that direct DNA binding is not the main mechanism contributing to the oncogene-induced transcriptional profile in heMSCs. We tested whether EWS::FLI1 directly binds to the same genomic regions as in a transformed ES cell by overlapping EWS::FLI1 peaks in heMSCs with those peaks identified by other investigators in A673 cells28. Although the overlap between peaks was virtually nonexistent (Supplementary Data 6), the gene-based analysis demonstrates that 37% of the genes bound by EWS::FLI1 in heMSCs are also bound in A673 cells (Fig. 4E, Supplementary Data 6). Importantly, EWS::FLI1 binding in heMSCs occurs in intronic regions in more than 50% of these common genes (Fig. 4E). Among these genes is PRKCB, which exhibits an EWS::FLI1-binding peak at intron 7 in heMSCs, whereas in A673 cells the fusion oncoprotein binds to the TSS and the second intron, which are distant chromatin regions; PRKCB expression is directly regulated by the oncogene (Fig. 4F). To determine the transitions from oncogene peaks in heMSCs to oncogene peaks in tumor cells, we generated Sankey plots that show, for each gene shared between samples, which annotations relate to each individual sample (Fig. 4G). The peak annotation in each sample for each shared gene shows that, for a portion (22%) of the genes in which EWS::FLI1 binds intergenic and intronic regions in heMSCs, the oncoprotein is enriched in the 1 kb upstream promoter region in A673 cells (Fig. 4G). In particular, this transition occurs between intron peaks in heMSCs and 1 kb promoters in tumor cells (16%). These data suggest that EWS::FLI1 could prime certain genomic regions in heMSCs and, through diverse mechanisms including altered RNAPII transcriptional activity, R-loop accumulation, or RNA splicing, eventually result in the binding of the oncogene to other genomic regions, further inducing chromatin reprogramming and full transformation.
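The distinction drawn above between peak-level and gene-level overlap can be made concrete with a short sketch. Peaks are represented here as hypothetical `(chromosome, start, end, gene)` tuples, which is an assumed format for illustration; the annotation tools and coordinates used in the study are not reproduced.

```python
def peaks_overlap(p, q):
    """True if two (chrom, start, end, gene) peaks share at least one base."""
    return p[0] == q[0] and p[1] < q[2] and q[1] < p[2]

def overlap_summary(peaks_a, peaks_b):
    """Compare two peak sets at the peak level and at the gene level.

    Returns (fraction of peaks in A overlapping any peak in B,
             fraction of genes in A also bound in B,
             set of shared genes).
    """
    shared_peaks = sum(any(peaks_overlap(p, q) for q in peaks_b)
                       for p in peaks_a)
    genes_a = {p[3] for p in peaks_a}
    genes_b = {q[3] for q in peaks_b}
    shared_genes = genes_a & genes_b
    return (shared_peaks / len(peaks_a),
            len(shared_genes) / len(genes_a),
            shared_genes)
```

Two datasets can therefore share a bound gene without sharing a single overlapping peak, which is the situation described for PRKCB, where binding occurs at different positions of the same locus.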

EWS::FLI1 induces BRCA1 expression and impairs DNA damage response

ES cells have robust BRCA1 mRNA levels29, which are dependent on EWS::FLI1 expression30. Accordingly, BRCA1 expression detected by immunohistochemistry revealed a strong BRCA1 signal in primary ES samples, but not in other developmental tumors, such as rhabdomyosarcoma or neuroblastoma (Supplementary Fig. 7A).

Although the number of peaks in exons identified by ChIP-seq in EF-heMSCs is low (4% of the peaks), one of the most extensive EWS::FLI1-binding regions was found in the large central exon 11 of BRCA1. The oncogene also binds to exon 15 of BRCA1. These enrichments were confirmed by ChIP-qPCR (Fig. 5A). RT-qPCR with different primer pairs to amplify exons 11 and 15, together with Western blot, revealed that EF-heMSCs have significantly higher BRCA1 mRNA and protein levels (Fig. 5B). Depletion of oncogene levels by transfection of cells with an siRNA targeting the EF breakpoint impaired the induction of BRCA1 expression (Fig. 5C), indicating that the oncogene directly regulates its expression. BRCA1 appears in discrete nuclear foci during S phase that are disrupted following DNA damage, due to its relocalization to PCNA replication structures31. Immunodetection of the subnuclear localization of BRCA1 in heMSCs revealed the presence of BRCA1 spots, although the number of these foci (particularly the larger ones) was smaller in EF-heMSCs (Supplementary Fig. 7B). This dispersal of BRCA1 foci might be due to the cellular response to damage-induced transcription, accumulation of R-loops, and increased replicative stress caused by EWS::FLI1, as previously described29. However, despite the upregulation of BRCA1 expression and the dispersion of foci, analysis of DNA damage by the alkaline comet assay revealed that EF-heMSCs harbor significant defects in DDR (Fig. 5D).

Fig. 5: EWS::FLI1 binds to BRCA1.

A Top panel, genome browser screenshot illustrating EWS::FLI1 binding to the BRCA1 locus in control and EF-heMSC cells. The scale of the tracks is the same for the control and the EF. Lower panel, chromatin immunoprecipitation of BRCA1 exons 11 and 15 by EWS::FLI1 in heMSC-1 cells infected with EWS::FLI1. Values were referred to the percentage of input and normalized with respect to the control condition. Data correspond to two independent experiments performed in duplicate and are expressed as mean ± s.d. ACAT1, negative control. A two-tailed unpaired t-test was performed. B BRCA1 expression in heMSC-1 cells infected with EWS::FLI1, detected by RT-qPCR and Western blot. Data were obtained from three independent experiments performed in triplicate and expressed as mean ± s.d. A two-tailed unpaired t-test was performed. C BRCA1 induction is abolished after EWS::FLI1 knockdown. Data obtained from two independent experiments performed in triplicate are expressed as mean ± s.d. A two-tailed unpaired t-test was performed. D Representative images of the alkaline comet assay performed with control and EF-heMSC-1 cells. Magnification bar: 50 μm. Below, Box-Whisker plot representation of the quantification of the product of the tail length and the fraction of total DNA in the tail (Olive tail moment) in control and EF-heMSC cells. Box-Whisker plot represents: center line = median; box = 25th–75th percentiles; the lower whisker corresponds to the minimum and the upper whisker to 1.5 × (75th percentile). Outliers are plotted as individual points. The difference between groups was analyzed by using a multiple regression model and a log(x + 0.1) transformation. E Western blot analysis to detect the expression and phosphorylation status of BRCA1 and kinases involved in DNA damage repair in control and EF-heMSC cells under basal conditions and after treatment with 5 µM etoposide. At the bottom, densitometric quantification of Western blot signals, normalized to Actin intensity (n = 2 independent experiments). F Dose-response curves and IC50 values for etoposide in control heMSC-1 and EF-heMSC-1 cells. Representative values of three independent experiments are expressed as mean ± s.d. A two-tailed unpaired t-test was performed.
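The comet assay quantification in panel D can be expressed in a few lines, following the definition given in the legend (tail length multiplied by the fraction of total DNA in the tail) and the log(x + 0.1) transformation applied before regression. The function names are hypothetical, and the natural logarithm is an assumption, since the legend does not state the base.

```python
import math

def olive_tail_moment(tail_length, tail_dna_fraction):
    """Tail moment as defined in the legend: tail length x DNA fraction in tail."""
    if not 0.0 <= tail_dna_fraction <= 1.0:
        raise ValueError("tail_dna_fraction must be between 0 and 1")
    return tail_length * tail_dna_fraction

def log_transform(moment, offset=0.1):
    """log(x + 0.1) transformation; the offset keeps zero moments finite."""
    return math.log(moment + offset)
```

The offset matters because undamaged cells can have a tail moment of exactly zero, for which a plain logarithm is undefined.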

Since ES cells are highly sensitive to chemotherapeutic agents, we investigated whether defective DDR pathways in EF-heMSCs were related to altered BRCA1 activation at baseline or to an abnormal response to treatment with DNA-damaging agents like etoposide under experimental conditions in which cell viability was not compromised (Supplementary Fig. 7C). EF-heMSCs showed upregulation of BRCA1 expression in all experimental conditions. However, BRCA1 phosphorylation in EF-heMSCs was only slightly increased in basal conditions and, after etoposide treatment, did not reach the phosphorylation levels of control cells (Fig. 5E). Since BRCA1 undergoes hyperphosphorylation during S phase and is transiently dephosphorylated shortly after M phase32, we tested whether the reduced levels of phosphorylated BRCA1 were due to an increase in the percentage of EF-expressing cells in G1 phase. However, etoposide treatment of EF-heMSCs increased the cell population in G2 (Supplementary Fig. 7D), where levels of phosphorylated BRCA1 should be maximal. BRCA1 is the phosphorylation substrate of different kinases, including ATM and ATR, which in turn are activated by phosphorylation at early stages of DNA damage. Expression of EWS::FLI1 in heMSCs induced the phosphorylation of both kinases in basal conditions. However, phosphorylation of ATM and ATR was defective in EF-heMSCs upon etoposide treatment (Fig. 5E). In contrast, the activation of DNA-PKc, the third kinase that, along with ATM and ATR, regulates DDR signaling, was unaffected by EWS::FLI1 expression in heMSCs. In support of EWS::FLI1 expression inducing defective DDR, thereby sensitizing tumor cells to the effects of DNA-damaging agents, viability assays showed that the IC50 of etoposide is over half-fold lower in EF-heMSCs compared to control heMSCs (Fig. 5F).
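As a rough illustration of how an IC50 can be read from viability data, the sketch below interpolates on a log-dose scale between the two tested concentrations that bracket 50% viability. This is a simplified stand-in and an assumption on our part, not the curve-fitting procedure used for Fig. 5F; the function name is hypothetical.

```python
import math

def ic50_by_interpolation(doses, viabilities):
    """Estimate IC50 by log-linear interpolation.

    doses: ascending positive concentrations (e.g. in µM).
    viabilities: viability at each dose as a fraction of untreated control.
    Returns None when 50% viability is never crossed.
    """
    points = list(zip(doses, viabilities))
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        if v0 >= 0.5 >= v1 and v0 != v1:
            frac = (v0 - 0.5) / (v0 - v1)
            log_ic50 = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_ic50
    return None
```

Comparing the estimates for control and oncogene-expressing cells gives the fold difference in sensitivity; a fitted sigmoidal model would normally be preferred when enough dose points are available.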

EF-heMSCs form Ewing-like tumors in vivo

To test the potential in vivo tumorigenic effect, heMSCs were collected 2 days after infection with Flag-tagged EWS::FLI1 and injected into the gastrocnemius of 21-day-old NOD/SCID mice. A few months later (mean latency ~5 months), 40% of the mice inoculated with EF-heMSCs showed health problems: they appeared listless and dull and, in some cases, showed signs of pain and breathing difficulties. Post-mortem examination revealed the variable presence of soft, whitish masses in the spine; hemorrhagic masses in the abdominal cavity affecting the mesothelium, the liver, and other organs; and nodules in the lungs (Supplementary Fig. 8A). Importantly, no such lesions were observed in any of the mice inoculated with control heMSCs. Histologic characterization of an inoculation site revealed the presence of a small nodule with a heterogeneous population of cells among which some aberrant nuclei could be distinguished, while the lumps in the posterior aspect of the spine were composed of tubular structures diffusely infiltrating the fat formed by immature lipoblasts, which at the periphery of the tumor masses showed a myxoid appearance (Fig. 6A). Of note, the nuclei of the cells in the vessels were round to oval and showed a distinctive “salt and pepper” chromatin. The masses in the abdominal cavity showed characteristic features of ES: densely packed, small, uniform, and poorly differentiated cells with scant cytoplasm and round nuclei, with frequent areas of hemorrhage, necrosis, and scattered mitotic figures. In addition, in one of the animals, massive tumor cell growth was observed in the spine (Fig. 6A).

Fig. 6: EF-heMSCs form Ewing-type sarcomas in vivo.

A H&E staining of representative lesions from NOD/SCID mice injected in the gastrocnemius with EF-heMSCs. Images show representative sections at two different magnifications (top bars, 100 µm; bottom bars, 20 µm). Observe the aberrant mitotic figure and the presence of blood lakes shown in the insets of the abdominal tumor masses. B IHC to detect human HLA. Magnification bars: 50 μm. C IHC to detect human HLA and human Ku80 in sequential sections of the same sample. Magnification bars: 50 μm. D IHC of human HLA and Flag in sequential sections of EF-heMSC samples. Magnification bar: 50 μm. E IHC of PRKCB, BRCA1, and hHLA in serial sections of EF-heMSC-derived tumors. Lung and thymus sections correspond to normal tissues. Magnification bars: 50 μm. N = 7 tumors.

Histologic examination of the different organs revealed metastatic dissemination to the lungs, where the cells initially colonized the pleura and spread to the bronchial tree and bronchioles, and to the liver and kidney parenchyma (Fig. 6A). Positive human HLA staining was observed in cells of the vascular-like tubes, and in cells from primary tumors and from pulmonary and renal metastases (Fig. 6B). Subsequent sections subjected to immunohistochemical staining for human HLA and human Ku80 further confirmed the human nature of these cells (Fig. 6C and Supplementary Fig. 8B).

To determine EWS::FLI1 expression in these lesions, sections were immunostained with the Flag antibody. EWS::FLI1 expression could be detected in the vascular formations and in the neoplastic lesions (Supplementary Fig. 8D, E). Immunohistochemical staining of adjacent sections with human HLA or Flag antibodies corroborated that the Flag signal corresponded to human HLA-positive cells (Fig. 6D). In addition, we confirmed the expression of PRKCB and BRCA1, direct targets of EWS::FLI1 in heMSCs, in these lesions (Fig. 6E). Ki67 expression showed that these tumors are composed of highly proliferating cells, and are enriched in cells expressing cyclin A, which is mainly found in S and G2 (Supplementary Fig. 8F). In summary, vascular-like structures and primary and metastatic neoplastic lesions were formed in the animals by human cells expressing the oncogene.

Experimental Ewing tumor transcriptome resembles Ewing sarcoma

To characterize the transcriptional signature of the experimental tumors, we performed single-cell RNA sequencing on formalin-fixed tumors using the 10X Genomics Visium platform, which also allows for the reconstruction of tissue organization. Unsupervised clustering of gene expression led to the identification of tumor cells, as shown in the spatial DimPlots of Fig. 7A. Spatial maps of the expression of some of the most relevant genes by uniform manifold approximation and projection (UMAP) dimensional reduction confirmed the expression of those genes in tumor cells (Fig. 7B; Supplementary Data 7). Interestingly, the expression of these genes in ES cell lines was higher than in any other tumor cell line tested (Fig. 7C), many of which were already described in A673 cells as direct targets of the oncogene28.

Fig. 7: Transcriptional characterization of experimental Ewing sarcoma tumors.

A H&E staining of an abdominal mass and a lung metastasis, and identification of cell clusters based on their corresponding UMAPs. Magnification bars: 400 μm. B Expression maps of some of the most relevant genes that determine clusterization in the above UMAPs. C Expression levels in Ewing sarcoma cell lines and in other tumor cell lines, extracted from the DepMap portal (https://depmap.org/), of the most relevant genes identified by computation of cell clusters from UMAP dimensional reduction. Legends in bold highlight genes directly bound by EWS::FLI1 in A673 cells in promoters or enhancers28. A two-tailed Mann–Whitney U test was performed.

With the aim of finding markers specific for ES, we took advantage of the unsupervised analysis using heSCs and hMSCs to distinguish the ES transcriptional profiles (see above, Fig. 3G and Supplementary Data 3). The top 100 DEGs (adjusted p value < 0.05) considering the log2(FC) were filtered, and spatial expression images were generated for those genes that were cross-represented in both data files. As shown in Fig. 8A, both primary tumors and lung metastasis shared the expression of many of those genes. Among them were BCL11B and ITM2A, previously identified as genes specifically related to neural and endothelial features of ES11, which are expressed in most ES specimens and are undetectable or weakly expressed in other developmental tumors (Supplementary Fig. 9A). Importantly, BCL11B and ITM2A were immunodetected in the experimental tumors, whereas staining of normal tissues such as thymus, spleen, or liver resulted in no or low expression (Fig. 8B).

Fig. 8: ES transcriptional signature of EF-heMSC derived experimental tumors.

A Expression maps of some of the genes differentially expressed in Ewing sarcomas (Supplementary Data 3) (adjusted p value < 0.05, log2(FC), top 100). B IHC of the ES distinctive markers BCL11B and ITM2A in EF-heMSC-derived tumors. Thymus, spleen, and liver sections correspond to normal tissues. Magnification bar, 50 μm. C Expression levels of differentially expressed genes in Ewing sarcoma cell lines and in other tumor cell lines, extracted from the DepMap portal (https://depmap.org/). Legends in bold highlight genes directly bound by EWS::FLI1 in A673 cells in promoters or enhancers28. A two-tailed Mann–Whitney U test was performed. D Spatial images of the gene set activity scores calculated using the singscore R package70,71, which implements a rank-based single-sample scoring method. Scores were computed using unidirectional gene signatures with known direction (knownDirection = TRUE). The resulting scores are directly interpretable as a normalized mean percentile rank. As a reference gene set, the top 400 (for abdominal mass) or 200 (for pulmonary metastasis) differentially expressed genes in Ewing sarcoma (Supplementary Data 3) were ordered by the log2(FC) (adjusted p value < 0.5).

To assess the specificity of the identified ES markers, we explored their expression in cancer cell lines. DepMap data analysis revealed that they were all significantly overexpressed in ES cell lines compared to other tumor cell lines (Fig. 8C). Furthermore, spatial DimPlots representing the gene signature score for each cell using the most differentially expressed ES genes (Supplementary Data 3; adjusted p value < 0.05; log2(FC)) as reference, confirmed the transcriptional similarity of the experimental tumors to human ES (Fig. 8D).

Finally, we investigated the similarity between the transcriptomes of the abdominal nodule and the lung metastasis. Interestingly, both tumor samples share about 70% of their transcriptomes, which are enriched for the expression of genes involved in chromatin organization, mRNA processing, and splicing (Supplementary Fig. 9B). Among the transcripts identified by spatial transcriptomics in both tumor samples we found RING1B, which we have previously identified as a trait of the ES cell of origin33, and UCP2, the mitochondrial respiratory chain uncoupling protein whose expression characterizes the stem phenotype of heMSCs (Supplementary Fig. 9C). Altogether, these findings suggest that ES cells retain the undifferentiated state of the progenitor cell of origin.

Discussion

ES tumors are highly undifferentiated developmental cancers characterized by a translocation as the only pathognomonic genetic feature (for a recent review, see ref. 34). The oncoprotein resulting from this translocation is the driver of tumorigenesis, responsible for blocking cell differentiation and inducing cell transformation. For modeling ES, our hypothesis was that the sole expression of the oncogene should be sufficient to induce an ES-specific transcriptional signature and generate tumors when expressed in an appropriately undifferentiated embryonic stem-cell context, such as progenitor cells at the stage of mesenchymal and/or endothelial differentiation during fetal transitions at gastrulation12. Indeed, mesenchymal cells derived from the mesoderm or the neural crest are currently considered the most likely candidates as cells of origin of ES34,35. Previous work by Riggi et al. using hMSCs showed that the degree of differentiation, or in this case, undifferentiation, of the cells is relevant for oncogene-induced transcriptional reprogramming14. Our experimental approach consisted of using human embryonic mesenchymal cells (heMSCs) from experimental teratomas. These progenitor cells harbor sparse mitochondria with limited perinuclear localization, a clear indication that heMSCs represent earlier progenitors than hpMSCs.

Low levels of the Flag-tagged oncogene in heMSCs were sufficient to profoundly alter the transcriptional profile of these cells. In addition to inducing or repressing previously recognized target genes, EWS::FLI1 induced the transcription of numerous vascular differentiation genes and, to a lesser extent, of genes specific to neural lineage. This EWS::FLI1-induced transcriptional signature in heMSCs recreated the transcriptome of ES, confirming that the oncogene, rather than blocking differentiation, imposes an aberrant hybrid differentiation program, with endothelial and neural characteristics. Our results also confirm previous findings demonstrating that the ES-specific expression profile is enriched in endothelial and neural genes11. Therefore, the ES transcriptome is a direct consequence of the activity of the oncogene on cell differentiation, and does not reflect the histogenesis of these tumors. Strikingly, despite inducing a transcriptional signature characteristic of ES, the chromatin binding pattern of EWS::FLI1 in heMSCs is distinct from that in the ES cell line A673. In primary hMSCs, sites activated by EWS::FLI1 have a closed chromatin conformation that switches to an open chromatin conformation upon EWS::FLI1 expression, which acts as a pioneer factor. The recruitment of the transcriptional machinery results in an active enhancer pattern analogous to the chromatin architecture of ES36. Our results suggest that the ability of EWS::FLI1 to enhance chromatin accessibility in the cell of origin, i.e., the heMSC, may be initially triggered by its binding to CA microsatellites in intronic regions distinct from GGAA multimers, permitting the establishment of long-range interactions which eventually will cause an EWS::FLI1 enrichment in GGAA repeat elements at enhancers in ES cells. 
In this regard, it has been shown that the transcription factor NCoA3 governs the dynamic chromatin landscape through its binding to internal body sequences, probably by keeping the enhancer and promoter in close proximity to the intronic sequences recognized by NCoA337. Since the frequency of microsatellites is not different in distinct chromatin regions, except for those with GC repeats, which are sensitive to methylation38, the question arises as to which factor or factors determine the binding affinity of EWS::FLI1 to specific DNA sequences, depending on the intergenic or intronic locations. In any case, our results further support the notion that neo-enhancer looping is a critical process in ES pathogenesis36,39.

Regulatory GGAA microsatellite repeats are essential for maintaining the oncogenic properties of ES cells36. However, it is unlikely that EWS::FLI1 exclusively uses this mechanism to unleash the plethora of cellular changes that will ultimately result in ES. Moreover, only a fraction of the direct target genes of EWS::FLI1 have been identified so far, mostly in ES cell lines, which limits our understanding of the mechanisms of action of the fusion oncoprotein and its impact in early stages of tumorigenesis39. A well-known but little understood feature of ES is its intrinsic chemo- and radiosensitivity, at least initially. The basal DNA damage detected by the comet assay in EF-heMSCs is reminiscent of observations previously made in ES cells40. Some authors have proposed that ES are BRCA1-deficient tumors, since EWS::FLI1 increases transcription to cause R-loops and block BRCA1 repair29. In contrast, other studies have shown the survival dependence of ES cells on BRCA1 expression and argue for a functional ATM deficiency as a major DDR defect and ATR as a collateral dependency30. Acknowledging the considerable differences between these studies, all based exclusively on cell lines with a broad range of different mutations that may affect DDR, our results would support the latter hypothesis. We demonstrate that BRCA1 is a direct target of EWS::FLI1 in the potential cell of origin, heMSCs. Consistent with this result, heMSCs expressing the oncogene have elevated levels of BRCA1 and are concomitantly deficient in DDR pathways, including defects in BRCA1 phosphorylation by ATM and ATR. Our working hypothesis is that the oncogene could be inducing the expression of phosphatases whose hyperactivity might decrease the ATM/ATR phosphorylation levels. Regardless of the underlying mechanism, these results demonstrate that the mere presence of the oncogene is responsible for the intrinsic vulnerability of ES cells to DNA-damaging agents.

Our efforts to maintain EWS::FLI1-infected heMSCs in culture have been unsuccessful, likely due to the rapid induction of p53-p21-RB1 signaling. It has been reported that EWS::FLI1 induces p53-mediated growth arrest in human primary fibroblasts41, and the growth arrest can be attenuated in p16- or p53-deficient mouse embryonic fibroblasts42. However, the p53 pathway and DNA damage signaling pathway are functionally intact in ES43, and to date, the only cells in which EWS::FLI1 does not induce cell cycle arrest and can be stably expressed are stem-like progenitor cells7. In our model, the discordance between the anti-proliferative effects and the tolerance of EWS::FLI1 expression could be related to the oncogene concomitantly regulating the expression of proliferation and survival genes critical for heMSC viability. Alternatively, this discrepancy could be attributable to a delicate balance of expression between pro-survival and cell cycle arrest genes dependent on the levels of oncogene expression.

In vivo, EWS::FLI1-expressing heMSCs generated frequently hemorrhagic bone and soft tissue tumors, in accordance with the observation that ES arising in soft tissues are morphologically and molecularly indistinguishable from those arising in bones44. These results indicate that the same cell of origin would be capable of giving rise to both clinical presentations. In addition to the expression of markers previously described as molecular discriminants of ES, our experimental tumors display a histology compatible with human ES. Collectively, all these findings would endorse James Ewing’s characterization, 100 years ago, of this group of sarcomas as diffuse endotheliomas of bone and explain his original hypothesis of an endothelial cell of origin4.

In this study, we have isolated and characterized human embryonic mesenchymal cells that fulfill the criteria for the cell of origin of ES. Our results suggest that EWS::FLI1 is primarily a transcription factor with a marked potential to induce an aberrant cell differentiation program during early embryonic development with limited tumorigenic capacity. We hypothesize that changes in expression levels of the oncogene, potentially through hormonal influence during puberty, increase the chances for ES tumors to develop. This model is further supported by a recent study by Kovar et al. showing that EWS::FLI1 expression in murine osteochondral progenitors increases DNA accessibility for transcription factors whose activity must be reinforced by IGF1/Insulin to result in cell transformation45.

Based on Kovar's study and our own results, we propose a two-step model of ES tumorigenesis, similar to the currently accepted model for acute leukemia, in which the first step would involve an in-utero initiating event with the fusion oncogene rewiring the chromatin of a pre-neoplastic ES cell (first hit, initiation). The second step would involve the postnatal acquisition of tumorigenic competence by the pre-neoplastic ES cell induced by high levels of pubertal hormones (second hit, transformation). This model would explain our previous clinical observation of EWSR1 rearrangements in hemangioma samples from prepubertal patients who subsequently developed ES during pubertal growth46.

In conclusion, we propose early heMSCs as the cell of origin of ES. These cells present a distinct binding pattern of the oncogene with intronic microsatellites (>10 CA dinucleotides) as the preferential sites; show defects in DDR with increased expression of BRCA1; and form tumors in mice with characteristics of human ES. Our experimental model reproduces the clinicopathological characteristics of ES, allows for the characterization of early tumorigenesis mechanisms, and the discovery of new markers of the disease.

Methods

Ethical compliance

This study complied with all relevant ethical regulations. The use of human stem cells in this study was approved by the relevant Spanish authorities (Commission on Guarantees concerning the Donation and Use of Human Tissues and Cells of the Carlos III Spanish National Institute of Health; approval: 222012). The human research protocol was approved by the Ethics Committee of the Hospital del Mar Research Institute (Approval no. 2019/8562/I). All human samples were obtained with written informed consent and with the approval of the corresponding ethical committees. Work with mice adhered to the European regulations and was approved by the Comitè Ètic d’Experimentació Animal of the Barcelona University (protocol number 365/18). Animals were euthanized when the tumor volume exceeded 1500 mm³ or if body weight loss was more than 20% of the starting weight.

Cell culture

Commercial cell lines were grown in standard conditions. The heSC line ES4 was previously characterized and maintained as described47. heMSCs, isolated from teratomas by enzymatic digestion, were grown in Eagle’s medium supplemented with 10% FBS and bFGF (1 ng/ml). ES4 and heMSC characterization was performed in November 2018 (qGenomics, Spain). A673 and 293T cells were cultured in standard conditions. hpMSCs were obtained from bone marrow aspirates of HSJD pediatric normal subjects under written informed consent with the approval of the “Committee on Biomedical Investigation at Hospital Sant Joan de Deu” review board.

Teratoma formation

Experimental teratomas in NOD/SCID mice (Charles River) were generated as previously described48. Immunofluorescence characterization of the experimental teratomas was performed with antibodies to detect TuJ1 (Covance), FOXA2 (R&D Systems), AFP (Dako), and α-SMA, α-sarcomeric actin, and NF-200 (Sigma Aldrich).

Characterization of heMSC

heMSC DNA extraction was performed using the Qiagen Gentra Puregene commercial kit (Qiagen). NGS was performed using the commercial AmpliSeq™ Sequencing Panel for Illumina Childhood Cancer Panel, which has a sensitivity of 98.5% for variants with a VAF of 5% and a specificity of 100%49. This panel allows the determination of point variants and small deletions and insertions (130 genes) and copy number variants (24 genes). All genes included in the sequencing panel allow the analysis of pathogenic variants related to pediatric cancer. The minimum sequencing depth established is 300×, and the limit of detection of the technique is greater than 5%. Analysis and interpretation of the results were performed using Illumina’s DNA Amplicon analysis software. Immunophenotypic analysis was carried out using antibodies to detect CD31 (Miltenyi Biotec), and CD73, CD90, CD34, CD45, CD105, and HLA-I (BD Biosciences) with an LSRII flow cytometer running FACSDiva v5.02 Software (BD Biosciences). Trilineage differentiation was performed with the StemMACS™ Trilineage Differentiation Kit human (Miltenyi Biotec), as described before50. For mitochondria staining, cells were grown on glass slides and incubated at 37 °C with Mito Red (Sigma Aldrich) and Calcein AM (Enzo Life Sciences) in serum-free medium for 1 h.

Targeted determination of TCA metabolites

TCA metabolites (pyruvate, citrate, α-ketoglutarate, succinate, fumarate, and malate) were determined in three different heMSCs, four different hpMSCs, and A673 cells (n = 1). Metabolites were determined based on a previously reported method51. Briefly, after addition of the internal standard (consisting of a mixture of labeled analytes), 25 μl of the lysate extract was derivatized with o-benzylhydroxylamine. The reaction was stopped by the addition of 1 ml of water, and the analytes were extracted with ethyl acetate. After evaporation of the organic layer, the extract was reconstituted in 150 μl of water. Then, 10 μl of the reconstituted extract was injected into the LC-MS/MS system consisting of an Acquity I-Class UPLC system coupled to a triple quadrupole (TQS Micro) mass spectrometer (Waters Associates) equipped with an electrospray interface. The separation was achieved at 550 °C using an Acquity BEH C18 column (100 × 2.1 mm i.d., 1.7 μm) (Waters Associates) at a flow rate of 300 μl/min. Water–ammonium formate (1 mM)/formic acid (0.01%) and methanol–ammonium formate (1 mM)/formic acid (0.01%) were used as mobile phases. A gradient program with a linear change in the percentage of organic solvent was applied as follows: 0 min, 30%; 1 min, 30%; 6 min, 55%; 6.8 min, 80%; 8.3 min, 99%; 9 min, 99%; 9.01 min, 30%; 10 min, 30%. The total run time was 10 min. Analytes were determined by selected reaction monitoring using the most specific transition for their determination (see Source data for Fig. 2). The analytical method was validated in different matrices with adequate intra- and inter-assay accuracies (80–120%) and intra- and inter-assay precisions (<20%) for all targeted analytes51. MassLynx software v4.1 and TargetLynx XS (all from Waters Associates) were used for data management. A representative chromatogram can be found in Source data.
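The gradient program listed above can be read as a piecewise-linear function of time. The following minimal sketch, which assumes straight-line interpolation between the listed time points, returns the percentage of organic solvent at any moment of the 10-min run. The names `GRADIENT` and `percent_organic` are illustrative and do not come from the instrument software.

```python
# Time points (min) and % organic solvent, copied from the gradient program in the text.
GRADIENT = [
    (0.0, 30.0), (1.0, 30.0), (6.0, 55.0), (6.8, 80.0),
    (8.3, 99.0), (9.0, 99.0), (9.01, 30.0), (10.0, 30.0),
]


def percent_organic(t_min, program=GRADIENT):
    """Return % organic solvent at time t_min, assuming linear segments."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, p0), (t1, p1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            # Linear interpolation within the current segment.
            return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)
    raise ValueError("gradient program is not sorted by time")


if __name__ == "__main__":
    for t in (0.5, 3.5, 8.3, 9.5):
        print(f"{t:5.2f} min -> {percent_organic(t):.1f}% organic")
```

For instance, 3.5 min falls halfway through the 1 to 6 min segment, so the function returns the midpoint between 30% and 55%.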

In vitro functional assays, cell cycle, and apoptosis analyses

To assess in vitro growth rates, 1 × 10⁵ cells per 10-cm plate were seeded. Every 2 or 3 days, cells were harvested, and 1 × 10⁵ cells were replated until cells stopped proliferating. The population doubling time was calculated as described52. Cell cycle profiles of heMSCs were analyzed by flow cytometry following standard propidium iodide staining. Apoptosis detection was performed according to the manual of Annexin V staining (Biolegend). A positive control for apoptosis was established by exposing the cells to 65 °C for 5 min before analysis by flow cytometry.
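The exact formula of ref. 52 is not reproduced in this section. As an illustration only, the sketch below uses the standard exponential-growth relationship, in which the number of population doublings is log2(N_harvested / N_seeded) and the doubling time is the elapsed time divided by that number. The function names and the example counts are hypothetical.

```python
import math


def population_doublings(cells_seeded, cells_harvested):
    """Doublings between seeding and harvest, assuming exponential growth."""
    if cells_seeded <= 0 or cells_harvested <= 0:
        raise ValueError("cell counts must be positive")
    return math.log2(cells_harvested / cells_seeded)


def doubling_time_hours(cells_seeded, cells_harvested, hours_elapsed):
    """Population doubling time in hours; infinite if there was no net growth."""
    doublings = population_doublings(cells_seeded, cells_harvested)
    if doublings <= 0:
        return math.inf
    return hours_elapsed / doublings


if __name__ == "__main__":
    # Hypothetical passage: 1 x 10^5 cells seeded, 4 x 10^5 harvested 72 h later.
    print(doubling_time_hours(1e5, 4e5, 72.0))
```

Summing the doublings of successive passages gives the cumulative population doublings until the cells stop proliferating.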

RT-qPCR

heMSCs were infected with Flag-tagged EWS::FLI1 lentiviral supernatants and recovered 48 h later. For EWS::FLI1 knockdown experiments, cells were infected with control or Flag-tagged EWS::FLI1 supernatants and 24 h later oligofected with siRNA against the fusion breakpoint or Flag. Cells were harvested 72 h after infection. siRNA sequences were siEWSFLI fusion 2 (GCAGAGUUCACUGCUGGCCUU) and siFLAG (CAAAGACGAUGACGACAAGUU). RNA was isolated with Genelute Total Mammalian RNA Kit (Sigma Aldrich), and cDNA was obtained by using Transcriptor First Strand cDNA Synthesis Kit (Roche). RT-qPCR assays were performed using SYBR Green PCR master mix (Applied Biosystems, Life Technologies). For normalization purposes, RT-qPCR with primers for GAPDH was run simultaneously. The ABI PRISM 7900HT cycler’s software calculated a threshold cycle number (Ct) at which each PCR amplification reached a significant threshold level. Primer sequences used for RT-qPCR are described in Table 1.
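The text states that Ct values were normalized to GAPDH but does not specify the quantification formula. A common choice, shown here purely as an assumption, is the comparative 2^(-ΔΔCt) method, which presumes roughly 100% amplification efficiency for both primer pairs. The function name and the Ct values in the example are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene versus control by the 2^(-ddCt) method.

    The reference gene (GAPDH in the text) corrects for input differences.
    Assumes about 100% amplification efficiency for both primer pairs.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: target gene in EF-heMSCs versus control heMSCs.
    fold = relative_expression(24.0, 18.0, 26.0, 18.0)
    print(f"fold change: {fold:.2f}")
```

A target that crosses the threshold two cycles earlier than in the control, with equal GAPDH Ct values, corresponds to a fourfold induction.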

Table 1 Sequences of the primers used for RT-qPCR

Immunohistochemistry, immunofluorescence, and Western blot

Western blots were performed with protein extracts obtained with RIPA-M buffer53. Briefly, cells were lysed with 1% NP-40, 0.1% sodium deoxycholate, 150 mM NaCl, 1 mM EDTA, 50 mM Tris, pH 7.5, and protease inhibitors EDTA-free 1X (Roche). Samples were centrifuged at 20,000 × g at 4 °C for 15 min, and the supernatants were collected. Pellets were resuspended in Urea-T buffer (50 mM Tris pH 8.1, 75 mM NaCl, 8 M Urea, and protease inhibitors EDTA-free 1X (Roche))54 by sonication, and centrifuged at 20,000 × g at 4 °C for 15 min. The supernatants were collected and mixed with the supernatant of the first step. Immunohistochemistry in randomly selected tumor samples from patients and experimental tumors and immunofluorescence were performed using standard methods55. The number and area of BRCA1 foci were determined by immunofluorescence in randomly selected cells and quantified by confocal microscopy. Antibodies used for these techniques were: FLI1, p53, p21, ATM, hHLA, PRKCB (Santa Cruz); CD99, hKu80 (Cell Signaling); β-actin (Abcam); RB1 (BD Biosciences); human nucleus, BRCA1, pS1981-ATM (Millipore); pS1423-BRCA1, ATR, pT1989-ATR, DNA-PKcs, pS2056-DNA-PKcs (Abclonal); Flag (Merck Life Science); BCL11B (Biolegend) and ITM2A (Fisher Scientific).

RNA sequencing

Total RNA from heMSCs and hpMSCs was purified and reverse transcribed. NGS libraries with polyA capture were prepared according to the manual Protocol for use with NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (New England Biolabs) (version 1.0, 7/2 Illumina TruSeq library preparation (Illumina)). Libraries were sequenced on an Illumina HiSeq 2500. Raw sequencing reads in the fastq files were mapped with STAR version 2.7.1a56 to Gencode release 36, based on the GRCh38.p13 reference genome and the corresponding GTF file. The table of counts was obtained with the FeatureCounts function in the package subread, version 1.6.457. Differential gene expression (DEG) was assessed with voom+limma in the limma package version 3.48.058 using R version 4.1.0. Genes having less than 10 counts in at least 2 samples were excluded from the analysis. Raw library size differences between samples were corrected with the weighted “trimmed mean method” TMM59 implemented in the edgeR package60. The normalized counts were used for the unsupervised analyses (PCA and clustering). For the differential expression (DE) analysis, read counts were converted to log2-counts-per-million (logCPM) and the mean-variance relationship was modeled with precision weights using the voom approach in the limma package.
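The analysis itself was run in R with limma and edgeR. The Python sketch below only illustrates the arithmetic of two steps described above, the low-count filter and the logCPM transformation, and is not a reimplementation of those packages. The filter is interpreted here, as an assumption, as keeping genes with at least 10 counts in at least 2 samples. The logCPM line follows the voom definition, log2((count + 0.5) / (library size + 1) × 10⁶). TMM scaling of the library sizes is omitted, and all names and counts are hypothetical.

```python
import math


def filter_low_counts(counts, min_count=10, min_samples=2):
    """Keep genes with >= min_count reads in >= min_samples samples.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    """
    return {
        gene: values
        for gene, values in counts.items()
        if sum(value >= min_count for value in values) >= min_samples
    }


def log_cpm(counts, library_sizes):
    """voom-style log2 counts per million, with 0.5 and 1 as offsets.

    library_sizes would normally be the TMM-scaled effective library sizes;
    raw totals are used here for simplicity.
    """
    return {
        gene: [
            math.log2((count + 0.5) / (size + 1.0) * 1e6)
            for count, size in zip(values, library_sizes)
        ]
        for gene, values in counts.items()
    }


if __name__ == "__main__":
    raw = {"GENE_A": [12, 15, 0], "GENE_B": [3, 11, 2]}  # hypothetical counts
    kept = filter_low_counts(raw)
    print(sorted(kept))
    print(log_cpm(kept, [999_999, 999_999, 999_999]))
```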

Pre-Ranked Gene Set Enrichment Analysis61 implemented in clusterProfiler62 package version 4.0.0 was used in order to retrieve enriched functional pathways. The ranked list of genes was generated using the -log(p.val)*signFC for each gene from the statistics obtained in the DE analysis with limma58. Functional annotation was obtained based on the enrichment of gene sets belonging to gene set collections in the Molecular Signatures Database (MSigDB). The collection used in this project is c5.bp: Gene sets derived from the Biological Process Gene Ontology (GO), version 7.2. RNA-seq data were deposited in GEO under accession number GSE272957.
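The ranking metric described above, -log(p.val)*signFC, can be sketched as follows. The base of the logarithm is not stated in the text, so base 10 is assumed here. The p value floor that guards against log(0) is also an addition for illustration, and the gene names and statistics are hypothetical.

```python
import math


def rank_genes(stats, p_floor=1e-300):
    """Build a pre-ranked gene list for GSEA.

    stats: dict mapping gene -> (p_value, log_fold_change).
    Score = -log10(p) * sign(logFC), so strongly up-regulated genes come
    first and strongly down-regulated genes come last.
    """
    ranked = []
    for gene, (p_value, log_fc) in stats.items():
        sign = (log_fc > 0) - (log_fc < 0)  # +1, 0, or -1
        score = -math.log10(max(p_value, p_floor)) * sign
        ranked.append((gene, score))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    example = {
        "GENE_UP": (0.001, 2.0),
        "GENE_DOWN": (0.01, -1.5),
        "GENE_FLAT": (0.5, 0.3),
    }
    for gene, score in rank_genes(example):
        print(f"{gene}\t{score:.3f}")
```

The resulting ordered list is the kind of input a pre-ranked GSEA implementation expects.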

Gene expression analyses

Raw Affymetrix CEL files were normalized and processed by RMA using the R/Bioconductor oligo package and Affymetrix63. Unsupervised analyses were performed by PCA and hierarchical clustering using Euclidean metrics. Standard deviation (s.d.) density plots were used to determine the cut-off value for gene expression unsupervised analyses (SD ≥ 1). Supervised gene expression analysis was performed using limma58. Probes were considered significantly differentially expressed when the FDR-adjusted Wilcoxon rank test for independent samples was <0.05 and the absolute logarithmic fold change (|LFC|) was >2.

Gene set enrichment analysis (GSEA)

Functional annotation was obtained based on: (1) Gene sets from collections in MSigDB related to ES; (2) Gene lists provided by Baird et al.25; (3) The TOP 400 up and TOP 400 down-regulated genes derived from public data from GEO datasets GSE6460, GSE7637, GSE7896, GSE8884, GSE9451, GSE9440, GSE9510, GSE9520, GSE9593, GSE10315, GSE13604, GSE13828, GSE17679, GSE34620, GSE37371 and GSE31215 (n = 236). Expression was normalized using Quantile Normalization. For the DE analysis to obtain the list of genes of interest, an empirical Bayes moderated t-statistics model (limma) was built. The GSE ID variable was included in the model as a covariate. To identify the most variable probes, we selected the probes with a standard deviation above 1 (8176 probes). The top 400 up-regulated and top 400 down-regulated genes were obtained by ranking the genes by p.adjusted and logFC values. Gene sets description: MIYAGAWA_TARGETS_OF_EWSR1_ETS_FUSIONS_UP MSigDB; ZHANG_TARGETS_OF_EWSR1_FLI1_FUSION MSigDB; RIGGI_EWING_SARCOMA_PROGENITOR_UP MSigDB; STAEGE_EWING_FAMILY_TUMOR MSigDB; (EWS Baird et al.); heSC-vs-MSC_DN Gene sets derived from GEO datasets (Top 400 down genes); heSC-vs-MSC_UP Gene sets derived from GEO datasets (Top 400 up genes).

ChIP-qPCR, ChIP-seq, and bioinformatic analysis

Two biological replicates of heMSC-1 cells were subjected to ChIP-qPCR following standard procedures. Briefly, cells were fixed with crosslink solution (HEPES 50 mM, pH 8.0; NaCl 100 mM; EDTA 1 mM; EGTA 0.5 mM; 0.5% formaldehyde) for 10 min at room temperature, and incubated with stop solution (10% glycine in Tris-HCl 10 mM, pH 8.0). Cells were lysed for 20 min on ice with 10 mM Tris-HCl, pH 8.0, 0.25% Triton X-100, 10 mM EDTA, 0.5 mM EGTA, 20 mM β-glycerol-phosphate, 100 mM Na-orthovanadate, 10 mM Na butyrate, and complete protease inhibitor cocktail. The lysates were sonicated and centrifuged at 13,000 rpm for 15 min, and the supernatants were incubated overnight with the Flag antibody (Merck Life Science) or with anti-mouse IgGs (Millipore). For the core histone marks, the antibodies used were H3K4me3 (Abcam), H3K27me3, H3K4me1, H3K36me3, and H3K9me3 (Diagenode). Precipitates were captured with protein A/G-Sepharose and extensively washed, and the crosslink was reversed by incubating the samples with Proteinase K. DNA was extracted and analyzed by qPCR using the ABI 7700 sequence detection system and SYBR Green master mix protocol (Applied Biosystems). Each immunoprecipitation was done in triplicate. The reported data represent real-time qPCR values normalized to input DNA and are expressed as a percentage (%) of bound/input signal. Primers used for ChIP-qPCR can be found in Table 2.
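The percentage of bound/input signal can be computed from Ct values with the widely used percent-input calculation, sketched below. The fraction of chromatin set aside as input is not given in the text, so `input_fraction` is a hypothetical parameter, as are the Ct values in the example. The calculation assumes about 100% primer efficiency.

```python
import math
import statistics


def percent_input(ct_ip, ct_input, input_fraction):
    """ChIP signal as a percentage of input chromatin.

    input_fraction: fraction of the chromatin saved as input (0.01 for 1%).
    The input Ct is first adjusted to represent 100% of the chromatin.
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def mean_percent_input(ct_ip_replicates, ct_input, input_fraction):
    """Average percent input across technical replicates of one IP."""
    return statistics.mean(
        percent_input(ct, ct_input, input_fraction) for ct in ct_ip_replicates
    )


if __name__ == "__main__":
    # Hypothetical triplicate Ct values for a Flag IP, with a 1% input sample.
    print(mean_percent_input([27.1, 27.0, 26.9], ct_input=25.0, input_fraction=0.01))
```

As a sanity check, an IP with the same Ct as a 1% input sample yields exactly 1% of input.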

Table 2 Sequences of the primers used for ChIP-qPCR

ChIP-seq libraries were prepared using the NEBNext Ultra DNA Library Prep for Illumina according to the manufacturer’s protocol. Briefly, 5 ng of input and ChIP-enriched DNA were subjected to end repair and addition of “A” bases to 3′ ends, ligation of adapters, and USER excision. All purification steps were performed using Agencourt AMPure XP beads (Qiagen). Library amplification was performed by PCR using NEBNext Multiplex Oligos for Illumina.

Raw sequencing reads in the fastq files were mapped with Bowtie264 version 2.4.2 on the GRCh38.p13 reference genome and the corresponding GTF file. BAM files of replicates were merged using BEDTools65 version 2.30.0 in order to get one file per sample condition. Peaks were called using MACS266 with the inputs of the ChIP experiments as controls. MACS2 was run in narrowPeak mode (default) without building the shifting model (--nomodel) and specifying the size of the binding region (--extsize) using information from QC (phantompeakqualtools). We used BEDTools intersect to extract peaks unique to Flag and not present in controls. Annotation of the peaks was done using the ChIPseeker67 package version 3.30.1. GENCODE version 41 was used to annotate the peaks using version 3.16.0 of the TxDb.Hsapiens.UCSC.hg38.knownGene package. Ensembl version 86 was used to convert Entrez identifiers to gene symbols. Annotation of chromatin states was done with BEDTools. The statistical significance of the relative frequency of EWS::FLI1 peaks in each chromatin state was assessed using Fisher’s exact test.
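To make the enrichment test concrete, the following Python sketch computes a two-sided Fisher's exact test on a 2 × 2 table using only the standard library. The table layout (peaks inside or outside a chromatin state versus a background set inside or outside that state) is an assumption, since the source does not describe the background used; the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p value for the table [[a, b], [c, d]].

    Assumed layout (not specified in the source):
      a = peaks in the chromatin state      b = peaks outside it
      c = background regions in the state   d = background regions outside it
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of observing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum all tables at least as extreme (probability <= observed),
    # with a small tolerance for floating-point ties.
    p_value = sum(p for p in map(prob, range(lowest, highest + 1))
                  if p <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)
```

In practice this test would be repeated once per chromatin state, and a multiple-testing correction may be appropriate when many states are compared.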

Overlapping peaks between EWS::FLI1 in heMSCs and FLI1 in A67328 were identified with the function findOverlapsOfPeaks of the ChIPpeakAnno package (version 3.30.1) using a window of 100 bp. Annotation of the peaks was done using the ChIPseeker67 package version 3.30.1. GENCODE version 39 was used to annotate the peaks using version 3.15.0 from TxDb.Hsapiens.UCSC.hg38.knownGene package.
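The idea of matching peaks within a 100 bp window can be sketched in Python as follows. This is a simplified illustration and not the ChIPpeakAnno implementation: it assumes BED-style half-open coordinates, treats the window as a maximum allowed gap between two peaks, and uses a quadratic scan that is only suitable for small inputs. The function names are hypothetical.

```python
from typing import List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def peaks_overlap(p: Peak, q: Peak, max_gap: int = 100) -> bool:
    """True if two peaks on the same chromosome overlap or lie within max_gap bp."""
    chrom_p, start_p, end_p = p
    chrom_q, start_q, end_q = q
    if chrom_p != chrom_q:
        return False
    # Negative or zero gap means the intervals overlap or touch.
    gap = max(start_p, start_q) - min(end_p, end_q)
    return gap <= max_gap


def shared_peaks(peaks_a: List[Peak], peaks_b: List[Peak],
                 max_gap: int = 100) -> List[Peak]:
    """Peaks from peaks_a that have at least one partner in peaks_b."""
    return [p for p in peaks_a
            if any(peaks_overlap(p, q, max_gap) for q in peaks_b)]
```

For genome-scale peak sets, an interval tree or a sorted sweep would replace the nested scan.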

For the identification of consensus binding sites, the peaks were split according to their annotation, and FASTA files were written using the packages GenomicRanges (version 1.50.2), BSgenome.Hsapiens.UCSC.hg38 (version 1.4.5) and rtracklayer (version 1.58.0). MEME-ChIP was then run over the generated FASTA files using MEME (version 5.1.1)68.

EWS::FLI1 peaks in A67328 were annotated using the annotatePeak function from ChIPseeker (version 1.34.1)67 considering the TSS range from −5000 bp to 100 bp and using the annotation from TxDb.Hsapiens.UCSC.hg38.knownGene package (version 3.15.0). Peak annotation of EWS::FLI1-bound genes shared between heMSCs and A673 was retrieved and analyzed.

All analyses were performed under R version 4.2.1 (R Core Team 2022; https://www.R-project.org/).

Comet assay

Cells were infected with lentiviral supernatants and recovered 72 h later. Alkaline comet assay was performed using reagents from Trevigen in accordance with the manufacturer’s protocol. Imaging was performed with a fluorescence microscope, and the Tail Moment was determined using the ImageJ analysis software. Fifty or more comets were analyzed for each experiment.

MTS

Cells were seeded in 96-well plates (1500 cells/well) and infected with lentiviral supernatants. Twenty-four hours later, cells were treated with serial dilutions of etoposide. Seventy-two hours later, 10% MTS (Promega) was added, and absorbance at 490 nm was read using a Tecan microplate reader. Percent viability was calculated by normalizing absorbance values to those from cells grown in media with vehicle treatment, after background subtraction. IC50 was determined using Prism 8 Software (GraphPad).
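The viability normalization described above amounts to a short calculation, sketched here in Python. The function and argument names are hypothetical, and the sketch assumes that background wells (medium plus MTS without cells) and vehicle-treated wells are averaged before normalization; IC50 fitting, which was done in Prism, is not reproduced.

```python
from statistics import mean
from typing import List, Sequence


def percent_viability(treated: Sequence[float], vehicle: Sequence[float],
                      blank: Sequence[float]) -> List[float]:
    """Normalize A490 readings to vehicle-treated wells after background subtraction.

    treated -- absorbance of drug-treated wells
    vehicle -- absorbance of vehicle-treated wells (defines 100% viability)
    blank   -- absorbance of background wells without cells (assumed layout)
    """
    background = mean(blank)
    vehicle_signal = mean(vehicle) - background
    if vehicle_signal <= 0:
        raise ValueError("vehicle signal must exceed background")
    return [100.0 * (a - background) / vehicle_signal for a in treated]
```

The resulting percentages, one per etoposide concentration, are the values that would be passed to a dose-response fit.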

In vivo tumorigenesis assay

heMSCs were infected with Flag-tagged EWS::FLI1 lentiviral supernatants and recovered 48 h later. 1 × 10^6 cells were embedded in Matrigel and injected into the gastrocnemius of 21-day-old NOD/SCID mice (Charles River). Mice were sacrificed when the animals showed clear signs of distress or the presence of any tumoral growth. Dissected tumors and organs were fixed in 10% formalin and processed routinely for histologic examination.

Spatial transcriptomics

Tumor samples were formalin fixed and paraffin embedded. Spatial transcriptomics was performed with Visium CytAssist Spatial Gene Expression Reagent Kits, following the protocol CG000495. RNA quality of the tissue was tested by calculating the percentage of total RNA fragments >200 nucleotides (DV200) of RNA extracted from tissue sections. Once RNA quality was evaluated, 5 μm tissue sections were placed on SuperFrost slides and placed in a desiccator to ensure proper drying. After overnight drying, samples were deparaffinized and stained with H&E, and images of the tissues were taken, following the protocol Visium CytAssist Spatial Gene Expression for FFPE Deparaffinization, H&E Staining, Imaging and Decrosslinking (CG000520-Rev C). Then, the tissues were decrosslinked and the libraries were constructed following the instructions in the protocol for Visium CytAssist Spatial Gene Expression for FFPE, Human Transcriptome, 6.5 mm (Preparation Guide CG000518-Rev D), and sequenced (slide 6.5 mm 250MR: 5000 spots at 50,000 reads/spot, 40 Gb/sample). The data were processed with the spaceranger count function of the 10x Genomics Space Ranger 3.0.0 tool, using the GRCh38-2020 reference genome. The resulting “filtered_feature_bc_matrix.h5” files were then imported for subsequent analysis with the Seurat package (version 5.0.3)69. The SCTransform function with default parameters was applied to normalize the data, identify variable features, and scale each sample independently. PCA dimensionality reduction, determination of the k.param nearest neighbors, and computation of cell clusters were then performed, followed by UMAP dimensional reduction. Gene set scores were obtained using the singscore package version 1.18.070,71. Data from the transcriptome analysis of human ES versus stem cells were used to create a gene reference; specifically, the top 100, 200, or 400 differentially expressed genes (adjusted p value < 0.05) were ordered by the log2(FC). The scaled data from the Seurat object were ranked using the rankGenes function. The output was then passed to the simpleScore function, with the direction set as known (knownDirection = TRUE). The resulting total score was then added to the Seurat object. All analyses were performed with R version 4.2.1 (R Core Team 2022; https://www.R-project.org/).
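The principle behind a rank-based, directional gene set score can be illustrated with a simplified Python sketch. This is not the singscore package code: it follows the general scheme of averaging normalized ranks for an up-regulated and a down-regulated set, ignores ties, and uses hypothetical function names. Each directional score is rescaled to the range −0.5 to 0.5, so the total lies between −1 and 1.

```python
from typing import Dict, Iterable


def directional_score(expression: Dict[str, float], gene_set: Iterable[str],
                      up: bool = True) -> float:
    """Centered, normalized mean rank of gene_set within one sample or spot.

    For an up set, higher expression gives a higher rank; for a down set,
    the ranking is reversed. Ties are not handled (simplifying assumption).
    """
    ranked = sorted(expression, key=expression.get, reverse=not up)
    rank = {gene: i + 1 for i, gene in enumerate(ranked)}
    members = [g for g in gene_set if g in rank]
    n_total, n_set = len(rank), len(members)
    if n_set == 0 or n_set >= n_total:
        raise ValueError("gene set must be a non-empty proper subset of measured genes")
    raw = sum(rank[g] for g in members) / n_set / n_total
    # Theoretical minimum and maximum of the mean normalized rank.
    s_min = (n_set + 1) / (2 * n_total)
    s_max = (2 * n_total - n_set + 1) / (2 * n_total)
    return (raw - s_min) / (s_max - s_min) - 0.5


def total_score(expression: Dict[str, float], up_set: Iterable[str],
                down_set: Iterable[str]) -> float:
    """Sum of the up-set and down-set directional scores."""
    return (directional_score(expression, up_set, up=True)
            + directional_score(expression, down_set, up=False))
```

A spot in which the up-regulated reference genes are the most highly expressed and the down-regulated genes the least expressed would receive the maximum total score of 1.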

Statistical analysis

Experimental data were analyzed using Prism 10.4 software (GraphPad Prism Software Inc.). Groups were compared using a two-tailed Mann–Whitney U test, and P < 0.05 was considered statistically significant (effect size: Δ with 95% confidence interval). Error bars represent the mean ± standard deviation. Detailed data for each experiment are shown in Source data. No statistical method was used to predetermine sample size. All experimental data were analyzed without subjective exclusions. Investigators performing experiments and outcome assessments were blinded to group allocation, and data analysts remained blinded until completion of statistical analyses, except in those experiments designed to specifically evaluate the effects of different experimental conditions.

Inclusion and ethics statement

All collaborators in this study who met the authorship criteria required by the Nature Portfolio journals have been included as authors, as their participation was essential for the design and implementation of the study. Roles and responsibilities were agreed upon among collaborators prior to the research.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.