Introduction

Gliomas are central nervous system tumors that originate from neuroglial stem or progenitor cells. High histologic grade gliomas—Grade III anaplastic astrocytomas and Grade IV glioblastomas (GBM)—represent half of all primary adult brain malignancies, and are often rapidly fatal due to diffusely invasive growth and a lack of effective therapies. Genomic analyses have identified key glioma gene drivers and targets, e.g., IDH1 and IDH2 neomorphic mutations, that have substantially improved glioma classification and prognosis, though have not as yet led to more effective therapies or improved patient outcomes1.

Work to better understand GBM biology and identify therapeutic vulnerabilities depends on tractable, revealing preclinical disease models. Glioma stem cell (GSC) cultures address this need in several ways: they are initiated from GBM surgical resection tissue; are enriched in tumor stem/repopulating cells; can capture and propagate many GBM molecular and phenotypic features; are faster and cheaper to use than many other GBM disease models; and can support high-throughput in vitro as well as organoid, xenograft and on-chip GBM modeling protocols2,3,4,5.

We have systematically characterized important features of four GSC cultures from unrelated adult patients to serve as GBM cellular disease models. All four resemble ~90% of newly diagnosed adult primary GBM in that they lack IDH neomorphic or other IDH-inactivating mutations and express functional IDH proteins (i.e., are ‘IDH+’)1,6. They were initiated from surgical resection tissue from unrelated adults7,8; resemble previously defined GBM genomic and proteogenomic subtypes; and display reproducible dose-survival curves for the GBM standard-of-care therapies ionizing radiation (IR) and temozolomide (TMZ). Molecular barcoding and single cell gene expression profiling were used to further define GSC cell/cell state heterogeneity, and to track single cell trajectories over 40 days after IR treatment. These well-characterized, experimentally tractable cellular disease models should prove useful for investigating GBM biology, and aid the search for new and more effective GBM therapies and treatment regimens.

Results

Propagation, proliferation rate and colony-forming efficiency

GSC cultures were initiated from surgical resection tissue from unrelated, anonymized adult GBM patients as previously reported7,8. We established standard conditions for propagating, verifying and handling GSCs; generated new short tandem repeat (STR) DNA fingerprinting profiles to identify all four cultures; and verified the absence of contaminating cells and Mycoplasma species prior to generating liquid nitrogen archival stocks and working cultures (Fig. 1A, B and Table S1).

Fig. 1

Glioma stem cell (GSC) proliferation and therapeutic response profiles. (A) GSC culture experimental workflow and culture color code used throughout. (B) Dark field microscopy of GSC cultures in mid-exponential growth in serum-free medium and 5% oxygen on untreated plastic. Scale bars are 200 μm long. (C) Serial cell counting reveals different GSC culture population doubling rates. (D) Colony-forming efficiencies (CFE) of untreated GSC cultures determined by dilution cloning. (E) GSC culture CFE as a function of IR or TMZ dose versus untreated controls. Abbreviations: CellTag, lentiviral molecular barcoding system; IR, ionizing radiation; mtDNA, mitochondrial DNA; TMZ, temozolomide; scRNAseq, single cell RNA sequencing; WES, whole exome sequencing. Error bars represent standard error of the mean in replicate experiments.

All four cultures could be continuously propagated in defined, serum-free media at 37 °C in a 5% O2/5% CO2 atmosphere. We developed a standard protocol for propagating cultures together with new short tandem repeat (STR) DNA fingerprints to identify and authenticate all four cultures. We reasoned that low oxygen growth more closely mimicked in vivo oxygen tension than did ambient air (20% O2), and permitted cultures to be continuously propagated without requiring specialized growth substrates. All four cultures could nonetheless form monolayers when plated on laminin- or polylysine-coated plastic9. Under these conditions GBM4, GBM8 and 0827 grew predominantly in suspension to form multicellular neurosphere- or tumoroid-like aggregates, whereas 0131 grew from substrate-attached single cells to form both attached and suspension aggregates (Fig. 1B).

Proliferation rates were determined by serial cell counting, and colony formation from single cells by dilution cloning. Population doubling times ranged from 24.9 to 59.8 h, with GBM8 consistently displaying the shortest population doubling time, followed by 0827, GBM4 and the substantially slower growing 0131 culture (Fig. 1C). All four cultures generated single cell-derived colonies upon dilution cloning, with colony-forming efficiencies (CFE) of ~10% (GBM4 and 0131) or ~45% (GBM8 and 0827) (Fig. 1D). Flow cytometry and scRNAseq analyses subsequently documented a high growth fraction of putative cycling cells in all four GSC cultures (Fig. 3C, S5).
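
As a minimal sketch of the two measurements described above, the following Python snippet computes a population doubling time from two serial cell counts and a colony-forming efficiency from a dilution cloning plate. It assumes simple exponential growth between the two counts; the function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def doubling_time_hours(n0, n1, elapsed_hours):
    """Population doubling time, assuming exponential growth between two counts."""
    if n0 <= 0 or n1 <= n0 or elapsed_hours <= 0:
        raise ValueError("need n1 > n0 > 0 and a positive time interval")
    return elapsed_hours * math.log(2) / math.log(n1 / n0)


def colony_forming_efficiency(colonies, cells_plated):
    """Fraction of plated single cells that gave rise to a colony."""
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    return colonies / cells_plated


# Hypothetical counts: 1e5 cells growing to 8e5 cells in 72 h is three doublings.
print(round(doubling_time_hours(1e5, 8e5, 72), 1))  # 24.0
# Hypothetical plate: 45 colonies from 100 plated cells.
print(colony_forming_efficiency(45, 100))  # 0.45
```

In practice a doubling time would be fit over several serial counts rather than two; the two-point form is used here only to keep the arithmetic visible.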

Radiation and drug dose-survival analyses

Both population-based proliferation suppression and CFE assays were used to assess the sensitivity of GSC cultures to radiation, drugs and small molecules. We reasoned that these assays when used in concert gave a more accurate and nuanced view of GSC responses than more popular short term (typically 3 day) growth suppression or viability assays. This is especially true for IR, where delayed radiation effects can only be accurately scored using culture times longer than 3 days and sensitive assays such as colony formation that remain radiation biology mainstays. IR doses of 1.0–1.5 Gy reduced survival by ~67%, with culture-specific differences in CFE ranging up to ~100-fold at 6 Gy (Fig. 1E left panel). Cultures could be further IR-sensitized by treating with ATM kinase inhibitors such as KU-60019 or AZD-1390 3 h prior to irradiation, with 3- to 40-fold reductions in CFE as a function of IR dose; only GBM4 displayed ATM inhibitor-dependent CFE suppression prior to IR (Fig. S1A).
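
The dose-survival values quoted above are ratios of colony-forming efficiencies. A hedged sketch of that arithmetic follows; the function names and all CFE values are illustrative assumptions, not measurements from this study.

```python
def surviving_fraction(cfe_treated, cfe_control):
    """CFE after treatment expressed relative to the untreated control."""
    if cfe_control <= 0:
        raise ValueError("control CFE must be positive")
    return cfe_treated / cfe_control


def fold_reduction(value_without, value_with):
    """Fold reduction in CFE caused by adding a sensitizer at a fixed IR dose."""
    if value_with <= 0:
        raise ValueError("value_with must be positive")
    return value_without / value_with


# Hypothetical culture: untreated CFE of 0.45 falling to 0.15 after irradiation.
sf = surviving_fraction(0.15, 0.45)
print(round(1 - sf, 2))  # 0.67, i.e. survival reduced by about 67%

# Hypothetical sensitization: CFE of 0.15 with IR alone, 0.05 with IR plus inhibitor.
print(round(fold_reduction(0.15, 0.05), 1))  # 3.0
```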

GSC cultures were either sensitive or effectively resistant to temozolomide (TMZ), the GBM standard-of-care chemotherapeutic agent. Sensitive cultures had LD50 values in the low μM range, whereas TMZ-resistant cultures were not killed over the full range of TMZ concentrations tested (≤ 64 μM in Fig. 1E, or up to 250 μM) after continuous exposure for up to 6 days. Expression of the DNA repair protein methylguanine methyltransferase (MGMT) is a major determinant of TMZ sensitivity10,11. Only GSC 0131 expressed MGMT in conjunction with TMZ resistance (Fig. S1B). When grown in the demethylating agent decitabine in an attempt to induce MGMT expression, all four GSC cultures were effectively sterilized by sub-micromolar decitabine concentrations, leaving too few surviving cells to permit MGMT Western analysis (Fig. S1C). GBM4, GBM8 and 0827 were also killed by low nM concentrations of ST-401, a novel small molecule targeting microtubules already shown to be active against mouse GBM xenografts12,13 (Fig. S1D).

Genomic characterization

All four GSC cultures were aneuploid with mean metaphase chromosome counts of 51–86. GBM4 and GBM8 were near-tetraploid, whereas 0827 was near-triploid. All four cultures displayed previously reported recurrent GBM chromosomal abnormalities. There were many additional numerical and structural abnormalities and, in GBM4 and GBM8 metaphases, ‘double minute’ chromosomes that often indicate gene amplification (Fig. S2, Table S12)14,15.

Whole exome sequencing (WES) identified GSC-specific single nucleotide variants (SNVs) and small insertion/deletion (indel) variants of ≤ 10 bp to compare with 2,539 GBM samples that were almost exclusively (97%) primary GBM tumors (see: ‘CNS/Brain – Cancer Subtype Designation’ ‘Glioblastoma multiforme’ samples, https://www.cbioportal.org/, accessed 7 Apr 2025). This comparison identified 212 genes by OncoKB Cancer Gene criteria that were adequately sampled (≥ 10 variant samples/2,539 tumors) and were altered in > 1% of GBM samples. Among the top 25 genes altered in GBM, 22 were altered or variant in one or more GSC cultures (Fig. 2A, Table S2).

Fig. 2

GSC culture genomic and mtDNA variants identified by whole exome and Duplex mtDNA sequencing. (A) GSC variants identified by whole exome sequencing in the top 25 genes altered by frequency in 2,539 cBioportal GBM IDH+ and IDH-mutant samples. Numbers in cells indicate GSC variants/gene, with the cell color code indicating the worst predicted consequence of gene-specific variants. (B) Barplot of fractional representation of GSC culture single base substitution (SBS) mutational signatures among top-ranked COSMIC SBS signatures. GBM4 and GBM8 had virtually identical mutation distributions attributable to SBS5 (446 and 445 mutations, respectively) and to SBS96 (987 and 972 mutations, respectively) (see Methods for detail). (C) mtDNA single nucleotide (SNV) and insertion/deletion (indel) variant frequencies in GSC cultures. Note Y-axis difference in SNV versus indel units. (D) Plot of GSC culture-specific mtDNA variants by location, molecular type and predicted functional consequences on human mtDNA complementary strands (center) with coding regions shown for proteins (light green), rRNAs (purple) and tRNAs (blue).

Most of the variants we identified by exome sequencing (88.5%) were missense variants in genes that participate in receptor tyrosine kinase (RTK), RAS and PI3K signaling pathways with low or no predicted functional impact. No IDH1 or IDH2 neomorphic or MGMT-inactivating variants were identified, although 15 variants in 0131, 0827 and GBM8 were predicted pathogenic from their high variant effect prediction scores in REVEL16 or AlphaMissense17 analyses. Pathogenic variants were predicted to induce frameshifts and/or truncate open reading frames (n = 11); promote mis-splicing (n = 1); or generate internal duplication-deletions (n = 3) (see Methods for scoring details) (Table S2). Eleven of the ‘Top 25’ GBM-mutated genes were potentially actionable as assessed by ‘OncoKB Actionable Genes – Glioma’ and COSMIC ‘Mutation Actionability in Precision Oncology’ ‘CNS gliomas and related neoplasms’ resources (www.oncokb.org/actionable-genes#cancerType=Glioma&sections=Tx, update API 27 February 2025; cancer.sanger.ac.uk/cosmic/download/actionability, version 16, 21 May 2025, respectively) (Table S2).

Exome sequencing data also allowed us to estimate GSC culture tumor mutational burden (TMB) and perform culture-specific mutation signature analyses. The median GSC culture TMB was 3.76 mutations/Mb (range 3.36–67.86 mutations/Mb), or 1.4–3.6-fold higher than median TMB estimates for 10,294 glioma samples (2.6 mutations/Mb)18 or for a TCGA cohort of 359 IDH+ samples (1.04 mutations/Mb). GSC 0827 had a substantially higher median TMB of > 50.8 mutations/Mb, more closely resembling the TMB observed in hypermutated IDH+ GBM samples. We did not identify an obvious genomic driver of TMB in 0827, e.g., mismatch repair loss and/or the presence of POLD1/E1 mutator alleles19,20,21. However, a 0827 mutational signatures analysis revealed a dominant SBS11 (Single Base Substitution) signature and a strong SBS5 signature. SBS11 has been attributed to TMZ DNA base damage, and SBS5 to an as-yet unidentified clock-like source (per SigProfiler, Human Cancer version 3.4, accessed 14 April 2025) (Fig. 2B). Consistent with these results, upon further investigation we learned that 0827 had been initiated from recurrent GBM surgical resection tissue from a patient treated 4 years earlier with IR and TMZ (personal communication Drs. Patrick Paddison, P.J. Cimino and Jeongwu Lee).
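
The fold differences quoted above follow directly from the reported medians, as the short Python sketch below shows. The `tmb_per_mb` helper and its example inputs (mutation count and callable exome footprint) are assumptions added for illustration; the source does not state the footprint used.

```python
def tmb_per_mb(n_mutations, callable_bases):
    """Tumor mutational burden as mutations per megabase of callable sequence."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return n_mutations / (callable_bases / 1e6)


# Median values reported in the text (mutations/Mb).
gsc_median = 3.76
reference_medians = {"glioma_cohort": 2.6, "tcga_idh_plus": 1.04}

fold = {name: gsc_median / ref for name, ref in reference_medians.items()}
print({name: round(value, 1) for name, value in fold.items()})
# {'glioma_cohort': 1.4, 'tcga_idh_plus': 3.6}

# Hypothetical example: 376 mutations over a 100 Mb callable footprint.
print(round(tmb_per_mb(376, 100_000_000), 2))  # 3.76
```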

Mitochondrial DNA variation in GSC cultures

Mitochondrial DNA (mtDNA) mutations have been reported in GBM and many other tumor types (see, e.g.,22), where they have the potential to disrupt mitochondrial function, metabolism, drug and apoptosis sensitivity22,23,24. In order to deeply characterize mtDNA variants in GSC cultures, we used ultra-sensitive and accurate Duplex DNA Sequencing to detect and quantify 479 unique mtDNA variants in GSC cultures25. Most mtDNA variants (~90%) were single nucleotide substitutions (SNVs), with from 74 (0131) to 122 (GBM4) unique variants/culture having Variant Allele Fractions (VAFs) ranging from 0.291 to 8.26 × 10⁻⁵. We also found mtDNA insertions and deletions (indels) of 1 bp, > 1 bp or more complex variants involving 2–19 bp (Fig. 2C). Among mtDNA variants shared between at least two GSC cultures (16 SNV and 9 indel variants), 18 of 25 were predicted not to alter coding and likely represent common population polymorphisms (Tables S3, S4). GSC mtDNA SNVs were strongly biased to G>A/C>T transitions, with low levels of putative oxidative damage-linked G>T/C>A transversions. Variants were distributed over the whole mtDNA genome; 59% were in coding regions, and two-thirds of these were predicted to alter or terminate a protein open reading frame (Fig. 2D). SNV molecular type distributions were similar across GSC cultures, and did not change after 1–2 Gy of IR (Fig. S4). These results collectively provide the deepest look to date at mtDNA variation and subclonal architecture in GBM-derived GSC cultures, and resemble findings in our prior analyses of somatic mtDNA variation in normal human brain25,26,27. GSC mtDNA variants also provided a way to define the subclonal architecture of GSC cultures: all cultures contained multiple mtDNA variant subclones identified by ≥ 2 Duplex sequence reads that contained identical mtDNA variants. These subclone-defining reads represented ~23% of all 0827 mtDNA reads, versus 8–10% in GBM4, GBM8 or 0131 (Fig. S3).
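
The strand-collapsed substitution classes used above (for example G>A/C>T) and the variant allele fraction can be illustrated with a small Python sketch. This is not the study's pipeline: the function names, the toy variant list and the read counts are hypothetical, and the VAF is taken simply as variant-supporting reads over total reads at a position.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
PURINES = {"A", "G"}


def substitution_class(ref, alt):
    """Return a strand-collapsed label such as 'G>A/C>T' for one SNV."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    forward = f"{ref}>{alt}"
    reverse = f"{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"
    # List the purine-reference form first so both strands share one label.
    return f"{forward}/{reverse}" if ref in PURINES else f"{reverse}/{forward}"


def is_transition(ref, alt):
    """Transitions stay within purines or within pyrimidines."""
    return (ref in PURINES) == (alt in PURINES)


def variant_allele_fraction(variant_reads, total_reads):
    """Fraction of reads at a position that carry the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


# Toy SNV list as (reference base, variant base) pairs.
snvs = [("G", "A"), ("C", "T"), ("T", "C"), ("G", "T")]
spectrum = Counter(substitution_class(ref, alt) for ref, alt in snvs)
print(spectrum["G>A/C>T"])  # 2
print(is_transition("G", "T"))  # False: G>T is a transversion
```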

Gene expression profiling

Bulk and single cell RNA sequencing (scRNAseq) were used to identify previously defined GBM subtype signatures in GSC cultures. GBM4 and GBM8 had similar expression profiles in both bulk and single cell RNA sequencing analyses and were more closely related to 0827 than to 0131, a distinctive outlier (Fig. 3A, B). Mesenchymal subtype expression states MES1/MES2 were identified in 0131 together with, to a lesser extent, an AC (astrocytic)-like state, while GBM4, GBM8 and 0827 displayed proneural/NPC (neural progenitor)-like gene expression signatures28 (Figs. 3C, S6, Table S8).

Fig. 3

Transcriptional profiling defines GSC cellular subtypes and functional states. (A) Gene expression in replicate bulk GSC samples displayed in a PCA plot, and (B) from scRNAseq expression analyses displayed in a UMAP plot. Both reveal the high reproducibility of gene expression analyses together with GSC culture-specific differences and heterogeneity. (C) Bulk (left) and single-cell (right) analysis data identify gene programs for cell cycling/growth fractions (as G1.S, G2.M phases), and quantify expression of previously identified Wang GBM intrinsic and Garofano metabolic subtypes and cell states (Neftel ‘6’ and Xie CT1-CT6 states). Additional enrichment scorings are shown for gene expression signatures reflecting potential/activity for invasion, glutamate-mediated cell signaling via AMPAR ionotropic transmembrane receptors, cell–cell connectivity and signaling (see text for additional detail and Table S8 for signature sources). (D) GSC culture gene expression data analyzed using a Functional Module States (FMS) framework reveal enrichment for metabolic pathways/processes including cell signaling, proliferation, metabolism, senescence and death. Abbreviation key: AC, astrocytic; GPM, glycolytic/plurimetabolic; MES1/2, mesenchymal states; MTC, mitochondrial; NEU, neuronal; NPC, neural progenitor cell; OPC, oligodendrocyte progenitor cell; PPR, proliferative-progenitor related.

All four cultures variably expressed cell surface proteins (e.g., CD15/FUT4, CD44, CD133/PROM1 and CD171/L1CAM) and genes or gene sets (e.g., CD44, MYC, CD133/PROM1, CTNNB1, EZH2, and SOX2) that have been used to define GBM stem cell-like features (Figs. S5, S7, S8; Table S8). Recently described transcriptional subtypes and cell states were also readily identified in GSC culture gene expression data, e.g., cell states ranging from stem-like CT1 cells most abundant in GBM4 to potentially post-mitotic CT6 cells most abundant in GBM829 (Fig. 3C). Our results thus paralleled other reported GBM gene expression data30,31,32 including reports for 0131, 08279 and GBM833,34. GBM proteometabolic and functional subtypes could also be identified35,36,37,38 (Fig. 3C): GSC 0131 most strongly resembled the glycolytic/plurimetabolic (GPM) subtype with high activity in glycan biosynthesis (Fig. 3C, D), whereas 0827 displayed mitochondrial/oxidative phosphorylation-dependent (MTC) features consistent with high mitochondria-associated gene expression (Fig. 3C, S9). GBM8 displayed both neuronal (NEU) and proliferative/progenitor (PPR)-like features, whereas GBM4 displayed greater NEU than PPR character (Fig. 3C)36.

These analyses were extended using a Functional Module States (FMS) analysis39. Harmonized GSC and TCGA-GBM gene expression data were captured as three Functional Module factor-defined clusters in which GBM4, GBM8 and 0827 displayed proneural, and 0131 mesenchymal, subtype features (Fig. S10A). When compared with TCGA GBM samples, GBM4, GBM8 and 0827 had high expression scores for amino acid metabolism, transcription and protein translation modules, consistent with high cell cycling potential; higher module scores for cell cycle, replication and repair, transcription, translation, and amino acid metabolism; and lower scores for apoptosis, energy metabolism, and membrane transport (Fig. S10B). GSC 0131, in contrast, displayed high expression of modules associated with cellular senescence, apoptosis, TP53 signaling, cellular community, cell cycle, replication and repair together with lower energy metabolism scores (Fig. S10C). In a four culture comparison, GBM4, GBM8 and 0827 again displayed similarities together with some key culture-specific differences: e.g., GBM4 had lower cell cycle, replication and repair module expression, whereas 0131 showed higher activity of many metabolism-related functional modules as well as cell motility and cellular community (Fig. 3D).

GBM gene expression data are increasingly being used to develop signatures that capture and/or predict important, non-cell-autonomous (tissue or cell population-level) features of GBM5,40, e.g., cell proliferation and invasion38,41,42,43,44,45; cell–cell communication and electrical coupling that may drive proliferation44,46, and local as well as more distant tumor cell and tumor/non-tumor cell–cell communication and signaling44,45,47,48 (Fig. 3C). GSC cultures with the exception of GBM4 displayed expression and module enrichment scores that favored invasion (Fig. 3C). Consistent with these predictions, GBM8 displayed diffusely invasive xenograft growth in contrast to GBM4, which displayed a more compact nodular growth phenotype in mouse orthotopic xenografts8. GBM8 followed by 0131 displayed the highest cell signaling/cell–cell communication/synaptic coupling signatures as assessed by glutamate (or AMPAR) receptor expression and connectivity ‘up/down’ signatures49 (Fig. 3C). GBM8 displayed the highest fraction of OPC-like cells where elevated PDGFRA and KCND2 expression has been linked to cortical bursting, synchronous electrical discharges and potential tumor-associated seizure activity47.

Proteomic and phosphoproteomic responses to radiation

Proteomic analyses have provided an important additional way to link gene and protein expression to GBM metabolic and functional phenotypes31,35,38. We thus used tandem mass spectrometry (MS–MS) to profile GSC cultures prior to and after IR, identifying 4000 protein groups and 10,000 phosphopeptides with high (R > 0.9) reproducibility (Fig. 4, S11 and S12). The experimental design we used aimed to identify key immediate-early proteomic responses that dominate much of the DNA damage response to IR. We thus performed experiments using IR doses that strongly induced γ-H2AX, a key early DNA damage response marker, at 1 h but led to minimal cell killing, with sampling at 1 h after 1 Gy (Fig. 1E). We reasoned, given the central role IR plays in GBM therapy, that this protocol would be most revealing of GSC IR responses while avoiding artefacts due to high levels of cell killing or sampling later to reveal post-radiation responses.

Fig. 4

GSC culture proteomic/phosphoproteomic profiling by mass spectrometry. (A) Workflow for tandem mass spectrometry (MS–MS) profiling using rapid-robotic (phospho) proteomic sample prep protocols (R2-P1/P2). (B) Comparative PCA plots of GSC-specific protein/phosphosite abundances as a function of IR. (C) Gene Ontology (GO) term enrichment analysis of GSC-specific changes by cellular compartment (left) and biological process (right). (D) Mean Spearman rank correlations between GSC proteomes and untreated GBM samples grouped by IDH mutation status and Wang proteogenomic subtypes. (E) Amino acid enrichment/depletion surrounding protein phosphosites in IR-treated GSC cultures. (F) Violin plots of log2-fold changes of phosphosite abundance in control versus IR-treated cells. All phosphosites (left column), and phosphosites corresponding to ATM/PRKDC (center) or CDK (right column) consensus motifs, were subjected to a Wilcoxon test of group median comparisons that identified 0131 as an outlier. (G) Kinase substrate enrichment analysis (KSEA) identifies IR-induced phosphosite changes. Negative/positive z-scores imply predicted decrease/increase, respectively, in kinase overall activity in IR-treated versus control cells. Significant differences, corrected for multiple testing, are noted by asterisks (*). Gray scale code (shown right) identifies kinases associated with DNA damage response (DDR), cell cycle regulation (cell cycle), or both pathways.

Principal component analysis (PCA) of protein peptide and phosphosite MS–MS abundances displayed tight clustering in biological replicates with clear separation among cultures (Fig. 4B): GBM8 and 0131 most closely resembled classical and mesenchymal proteogenomic GBM subtypes, respectively, while GBM4 and 0827 displayed a mix of classical and mesenchymal subtype features31 (Fig. 4D). All four cultures expressed both IDH1 and IDH2 proteins with the highest steady state IDH peptide levels in GBM8 and 0131 (Table S11). GBM8, the most rapidly growing of our four GSC cultures, displayed strong enrichment in proliferation-related processes such as nucleic acid metabolism, gene expression and cell cycling. In contrast, 0131 had the highest content of cytosolic, ER and membrane proteins associated with secretory processes (Fig. 4C).

IR treatment led to numerous protein phosphorylation changes, most notably among DNA damage response proteins and phosphorylation response pathways (Fig. S12). These changes included significant post-IR [pS/pT]Q motif enrichment (Fig. 4E) with quantitative induction in all cultures except 0131 (Fig. 4F): the [pS/pT]Q motif corresponds to the consensus motif of the DNA damage-responsive kinases ATM and DNA-PK, and the [pS/pT]Px[R/K] motif to that of CDKs (Fig. S12C). Many other IR-modulated phosphosites linked to specific biological functions were also identified (Fig. S12D). IR-induced phosphorylation site changes in XRCC4 and H2AFX, in contrast, were observed only in 0131, our most radio-resistant GSC culture.
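
The two consensus motifs named above can be expressed as simple sequence patterns. The sketch below is illustrative only: it assumes a plain amino acid string plus a 0-based index for the phosphorylated residue, and the peptide sequences are made up rather than taken from the data.

```python
import re

# Patterns are anchored at the phosphorylated residue (position 0 of the match).
MOTIFS = {
    "[pS/pT]Q": re.compile(r"[ST]Q"),            # ATM / DNA-PK consensus
    "[pS/pT]Px[R/K]": re.compile(r"[ST]P.[RK]"),  # CDK consensus
}


def classify_site(sequence, position):
    """Return the names of motifs matched by the phosphosite at `position`."""
    if not 0 <= position < len(sequence):
        raise IndexError("position is outside the sequence")
    downstream = sequence[position:]
    # re.match only matches at the start of the string, i.e. at the phosphosite.
    return [name for name, pattern in MOTIFS.items() if pattern.match(downstream)]


# Hypothetical peptides with the phosphorylated residue at the given index.
print(classify_site("AAASQEAA", 3))  # ['[pS/pT]Q']
print(classify_site("GGTPKKAA", 2))  # ['[pS/pT]Px[R/K]']
print(classify_site("AASAA", 2))     # []
```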

A Kinase-Substrate Enrichment Analysis (KSEA) identified many of the IR-induced phosphorylation site kinases. ATM was among the top predicted active kinases across all GSC cultures and was most active in GBM8, followed by TTK in all but GBM4 and by NEK2 and CDK1 in GBM8 and 0131 (Fig. 4G). ATM signaling may play an important general role in maintaining GBM4, as suggested by the ability of ATM kinase inhibition to suppress colony formation in both irradiated and control cultures (Fig. S1A). Several activated MAPK members were predicted in GBM4 and 0827, together with differential IR activation of protein kinase C isozymes, PRKDC (DNA-PKcs) and PAK1/PAK2. All of these kinases have recently been linked to functional GBM proteogenomic subtypes, with PRKDC identified as an apical or ‘master’ kinase regulating many GBM phenotypic hallmarks37. PRKDC together with PTPN11 participates in a protein tyrosine phosphatase regulatory signaling hub in high grade gliomas42. We identified a key PTPN11 regulatory phosphosite, Y62, in our data and significant post-IR phosphorylation changes in all four GSC cultures, with 0827 having the highest Y62 phosphorylation levels in association with an activating EGFR mutation9. PTPN11-S591, a phosphosite with few known functional roles, was also identified, though not the PTPN11-Y546 phosphosite reported in IDH-mutant gliomas42 (Table S11).
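
For readers unfamiliar with KSEA, the sketch below shows one commonly used form of the kinase z-score: the mean log2 fold change of a kinase's known substrate sites is compared with the mean of all quantified sites and scaled by the number of substrates and the overall standard deviation. Whether the analysis reported here used exactly this formulation is an assumption, as are the function name and the toy values.

```python
import math
from statistics import mean, stdev


def ksea_z_score(substrate_log2fc, all_log2fc):
    """z = (mean of substrates - mean of all sites) * sqrt(m) / sd of all sites."""
    m = len(substrate_log2fc)
    if m == 0 or len(all_log2fc) < 2:
        raise ValueError("need at least one substrate and two background sites")
    background_sd = stdev(all_log2fc)  # sample standard deviation (an assumption)
    if background_sd == 0:
        raise ValueError("background log2 fold changes have zero variance")
    return (mean(substrate_log2fc) - mean(all_log2fc)) * math.sqrt(m) / background_sd


# Toy data: log2(IR/control) for all sites, two of which are substrates of one kinase.
all_sites = [-2.0, -1.0, 0.0, 1.0, 2.0]
kinase_substrates = [1.0, 2.0]
z = ksea_z_score(kinase_substrates, all_sites)
print(z > 0)  # True: a positive z-score implies a predicted increase in activity
```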

Unsupervised hierarchical clustering of untreated cultures identified large differences in phosphosite abundances between 0827 and 0131 (Fig. S11C). A parallel analysis of protein abundances across the cultures identified significant coordinated differences among proteins annotated to specific protein complexes. These complexes are associated with chromatin remodeling, DNA replication, cell cycle control, DNA mismatch repair, and POL3-dependent transcription (Fig. S11D). Proteins in these complexes were most abundant in GBM8, and least abundant in 0131. Protein markers of GBM therapeutic response could also be identified in our MS–MS data35,36,37: e.g., 0827, the most IR-sensitive GSC culture followed by GBM4, displayed elevated expression of oxidative phosphorylation-related proteins and reduced expression of the poor prognostic biomarker FKBP prolyl isomerase 9 (FKBP9), but no corresponding trend was observed for the favorable prognostic biomarker phosphoglycerate dehydrogenase (PHGDH).

GSC population heterogeneity and cellular dynamics post-IR

Morphologic and behavioral heterogeneity have been defining features since GBM was first recognized as a discrete CNS disease entity50. Subsequent GBM pathologic, genomic and proteomic analyses have progressively refined our understanding of GBM heterogeneity, and identified important biologic and prognostic GBM subgroups1,51. More recent efforts have focused on cell state heterogeneity and plasticity as drivers of GBM structure and evolution over time, and in response to therapy38,52,53.

In order to better define and track GSC culture cell state heterogeneity, we used scRNAseq data in conjunction with lentiviral DNA barcoding and intrinsic mtDNA variants. The lentiviral DNA barcodes (referred to hereafter as ‘tags’) were embedded and expressed with EGFP protein to allow capture by bulk and scRNAseq54. The GSC-specific mtDNA variants were characterized as described above. Both barcode types were tracked over 40 days in control and IR-treated replicate cultures (Fig. 5A). Pilot experiments were used to identify IR doses predicted to kill 80% of cells in a GSC culture, as we reasoned that moderate cell killing might reveal an initial gene expression signature linked to time/treatment-dependent enrichment or depletion, while avoiding dominant, less informative ‘jackpot’ subclones that can arise following high fractional cell kills and/or repeated treatment cycles (see, e.g.,34).

Fig. 5

Molecular barcodes identify and track cellular heterogeneity after IR treatment. (A) Design of time course sampling experiment to follow lentiviral barcode and mtDNA variant frequencies in control (-IR), irradiated (+IR), and mock-treated, reduced complexity (mock) experimental arms (see Methods for additional details). (B) Lentiviral DNA barcodes identified in GSC cultures by amplicon sequencing (green), scRNA sequencing (blue) or both sequencing methods (teal). (C–F) Unique sequence barcode tag trajectories over 40 days ± IR identify subsets with significant enrichment (red), potential enrichment (pink), depletion (blue), potential depletion (cyan) or no significant change (gray) by Wilcoxon test p value of IR/control ratios. (G–J) Unique sequence enriched/depleted barcode tags across multiple time points (all), or Day 40 vs. Day 0 (D40/D0). (K–N) Unique cells with at least half of cell-associated barcode tags enriched (pink) or depleted (cyan) by at least 50% at D40. (O–Q) Cells enriched (red) or depleted (blue) as a function of IR treatment mapped onto individual cells identified by scRNAseq UMIs. (R–T) Fraction of cells and level of expression of genes differentially expressed in enriched vs. depleted cells as a function of IR treatment. At least two of three GSC cultures were required to display the same associations to be included. A comparable analysis of differentially expressed genes in enriched versus depleted cells in control cultures is shown in Fig. S16.

Flow-sorted cells expressing EGFP-linked DNA barcodes were expanded and used to seed experimental arms of our time course experiment after bulk and scRNAseq analyses and DNA sequencing of mtDNA variants and integrated lentiviral barcodes (Fig. 5A, B). This design allowed us to identify starting gene expression programs, and link them to time and IR treatment-dependent growth trajectories (Figs. S14, S15). Starting EGFP+ cells had transduced mean tag counts of 2–3/cell (range 2.17–3.34), with 0131 an outlier with over 90% of tagged cells having a mean tag count of 17.35 (Fig. S13). Mean unique tag counts linked to a scRNAseq cell-identifying UMI tag ranged from 1.33 to 6.04 (Figs. S13, S15). The high tag counts in most 0131-transduced cells led us to focus on GBM4, GBM8 and 0827, where lower tag counts and simpler tag mapping topologies allowed us to unambiguously track tag and cellular frequency changes over time.

Five tag and cellular trajectory classes were identified in untreated GSC cultures by K-means clustering of corrected, log-transformed tag count ratios of Day 40 versus Day 0 counts (Fig. S16A–D). IR-enriched or -depleted cells were identified in IR-treated cultures by a Wilcoxon test (P < 0.1), using tag read ratios of IR/control of > 2 (for enriched tags) or < 0.5 (for depleted tags) (Figs. 5C–F, S16E–G). Tag depletions were more common than enrichments in all cultures except 0827 (Fig. 5G–N). Gene expression analyses of cells enriched or depleted as a function of time and/or IR treatment identified 16 genes in common in control cultures and 6 in IR-treated cultures (Figs. 5R–T, S16H–J). Top genes enriched in GBM4, GBM8 and 0827 control cultures were involved in ribosomal or mitochondrial structure and function (Figs. 5R–T, S16H–J). In IR-treated cultures the top-ranked genes were also involved in ribosome structure or function (e.g., RPL23A and RPS12) (Fig. 5R–T) or in mitochondrial electron transport (MT-CO3, MT-ND3, MT-ND4, MT-CYB). These results suggest that ribosomal and mitochondrial content and activity are important determinants of proliferation and of survival after IR. We could not perform a comparable, well-powered analysis using expressed mtDNA variants due to their low frequencies in GBM4, GBM8 and 0131 cultures (range 0.13–0.19% of cells) (Figs. S4, S17).
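
The enrichment and depletion thresholds stated above (IR/control ratio > 2 or < 0.5, with P < 0.1) can be written as a small decision rule. The sketch below is a simplified illustration: the p value is assumed to come from a separate Wilcoxon test, and the pseudocount and the use of the median ratio across time points are assumptions added here, not details from the source.

```python
from statistics import median


def classify_tag(ir_counts, control_counts, p_value, alpha=0.1, pseudocount=1.0):
    """Label one barcode tag from paired IR and control read counts over time."""
    if len(ir_counts) != len(control_counts) or not ir_counts:
        raise ValueError("need matched, non-empty count series")
    # The pseudocount (an assumption) avoids division by zero for absent tags.
    ratios = [
        (ir + pseudocount) / (control + pseudocount)
        for ir, control in zip(ir_counts, control_counts)
    ]
    typical_ratio = median(ratios)
    if p_value < alpha and typical_ratio > 2:
        return "enriched"
    if p_value < alpha and typical_ratio < 0.5:
        return "depleted"
    return "no significant change"


# Hypothetical read counts for three tags at three time points.
print(classify_tag([30, 50, 90], [10, 12, 15], p_value=0.05))  # enriched
print(classify_tag([2, 1, 0], [20, 25, 30], p_value=0.05))     # depleted
print(classify_tag([30, 50, 90], [10, 12, 15], p_value=0.5))   # no significant change
```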

Discussion

We characterized genomic, gene expression, proteomic and phenotypic features of four GSC cultures from unrelated adult patients to better enable their use as versatile GBM cellular disease models. All four cultures were initiated from surgical resection tissue of IDH+ GBMs, the genomic subtype that represents ~90% of adult GBMs. All four GSC cultures also displayed features of previously defined GBM genomic, gene expression and proteomic subtypes. We found that when continuously propagated under defined conditions in vitro, cultures retained their GBM subtype identities with large fractions of stem-like or tumor-repopulating cells. A practically important finding was that all four cultures, when propagated in more physiologic 5% oxygen tension, did not require specialized growth substrates. All four nonetheless retained the ability to grow as monolayer cultures on plastic coated with laminin or poly-lysine. This ability considerably lessened the cost of maintaining GSC cultures, while still enabling protocols that require monolayer cultures55. We also established new short tandem repeat DNA typing signatures that should prove useful for identifying and authenticating all four GSC cultures.

The close resemblance of many GSC culture features to their origin GBM tumors together with experimental tractability have fueled growing interest in GSC use as a GBM disease model. GSC cultures can support rapid, high throughput combinatorial profiling of ‘therapeutic space’ (see, e.g.,56). They are also amenable to common gene transfer and genome engineering protocols, and have usefully high colony forming efficiencies that make it possible to engineer isogenic GSC genotypes of considerable complexity as well as allelic series, followed by the isolation of clonally-derived sublines if needed for functional analyses.

GSC cultures can also be used to initiate many hybrid or in vivo modeling protocols including organoid, xenograft and ‘on-chip’ disease models. These extensions allow the use of GSC cultures for analyses of additional GBM therapies that are difficult to model in vitro, e.g., antibody, antibody–drug conjugate (ADC) and cellular therapies4,5,57,58,59. These examples illustrate how GSC cultures can rapidly establish treatment potency and help identify potential mechanisms of action, while enabling the search for useful synergies with IR or new drug candidates more rapidly and inexpensively than when using other popular GBM disease models.

Several features of GSC cultures need to be kept in mind to take full advantage of these desirable features. Chief among these are GSC culture origins; their intrinsic heterogeneity and plasticity in culture53; the absence of a cognate tumor microenvironment; and the potential impact of aneuploidy and genomic instability on different end uses. The origins of few GSC cultures, including the four profiled here, are known in any detail. GSC cultures generated using now-standard explant protocols60 most likely originate as oligo- or polyclonal cell populations that can variably capture and propagate key cells and features of their originating GBM tumors. Thus GSC characterizations of the types reported here provide a useful way to locate individual GSC cultures on the landscape of GBM heterogeneity, and provide data to identify the most useful cultures for specific experiments.

Absence of a cognate tumor microenvironment is often cited as a weakness of GSC disease models, and used to argue in favor of GBM preclinical models such as genetically engineered mouse models (GEMMs) despite their considerable cost, long development times and low throughput. However, many key features of GBM are largely cell autonomous, and can be captured and usefully analyzed using GSC cultures. We provide examples of this by using GSC gene expression data to identify potential tissue- or tumor-level phenotypes, and to predict experimentally verified in vivo growth phenotypes in mouse orthotopic xenografts8. Related analyses using, e.g., cerebral organoid models, are providing further ways to study population heterogeneity, plasticity and tissue- or cell population-level phenotypes, while facilitating the search for useful new GBM therapies that depend on important though still incompletely defined features of the tumor microenvironment53,61.

GSC cultures are aneuploid and genomically unstable, as are the GBM tumors from which they originate14,15,30. While few systematic data exist to help quantify instability rates by type and origin62, many chromosome-level and smaller genomic changes appear to be relatively stable and may help sustain GSC culture in vivo growth and continuous proliferation in culture9,63. The potential impact of genomic instability of different origins can be minimized by using replicate, minimally passaged GSC cultures propagated under standardized conditions together with contemporary cellular, genomic and proteomic controls.

Additional features of our and other GSC cultures remain to be characterized, and may be important for specific applications. These features include DNA methylation profiles, miRNA and lc/lncRNA expression, and the presence and functional importance of extra-chromosomal circular DNA molecules (ecDNAs)52,64,65. GBM4 and GBM8 likely both harbor ecDNAs, as their karyotypes display double-minute chromosomes, a classical marker of gene amplification, together with MYC amplification or MYCN, PDGFRA and MDM2 amplification with homozygous CDKN2A/B deletion (Table S12). These and other new data will enable the continued imaginative use and development of GSC cultures as cellular disease models with the potential to improve our understanding of GBM biology, and to aid the search for more effective GBM therapies and treatment regimens.

Methods*

Glioma stem cell cultures

GSC cultures were initiated from resection tissue from unrelated adult GBM patients as previously described7,8, with initial cultures provided by Drs. Robert Rostomily and Andrei Mikheev (University of Washington and Houston Methodist Hospital; GBM4 and GBM8) and Patrick Paddison (Fred Hutchinson Cancer Center, Seattle WA; 0131 and 0827). For consistency and brevity we use the designations GBM4, GBM8, 0131 and 0827 throughout.

Cultures were propagated in serum-free medium at 37 °C in a low oxygen (5% O2/5% CO2) humidified atmosphere on untreated cell culture plastic. Cell suspensions were generated using Accutase™ with pipetting for counting, dilution and further propagation. GBM4 and GBM8 were grown in B-27 and N-2 supplemented Neurobasal medium (ThermoFisher) with added EGF (PeproTech, 20 ng/mL), FGF (PeproTech, 20 ng/mL) and heparin (Sigma, 5 µg/mL). GSC cultures 0131 and 0827 were grown in Neurocult medium (StemCell Technologies, Inc) supplemented with recombinant human EGF (20 ng/mL), FGF (20 ng/mL) and heparin (0.8 µg/mL). We established new DNA authentication profiles for all four cultures using the CellCheck 9 Plus STR panel (IDEXX BioAnalytics, Columbia, MO), as none existed in DSMZCellDive or CLIMA 2.2 databases. Cultures were repeatedly PCR-screened, and shown to be negative for cell line and species cross-contamination and for Mycoplasma infection. GSC culture donor sex assignments were confirmed by karyotyping, comparative genomic hybridization and/or molecular fingerprinting, but could not be independently confirmed in light of IRB restrictions. The use of these anonymized human tumor-derived GSC cell cultures from now-deceased individuals is not considered human subjects research as defined by federal and state guidelines.

Proliferation, colony formation and dose–response profiling

Cell proliferation rates were determined by serial cell counting over 6 days after plating as described66. Colony-forming efficiencies (CFE) were determined by dilution cloning in replicate 96-well plates followed by growth without refeeding for 2–3 weeks before scoring wells containing colonies of ≥ 50 cells. Colony forming efficiency was calculated from colony-negative and -positive well counts conditioned on the cell seeding density. Cell cycle phase distributions were determined by flow cytofluorometric analysis of fixed and DAPI-stained, exponentially growing GSC cultures as previously described67.
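
The CFE calculation from colony-negative and -positive well counts can be illustrated with a short sketch. It assumes a single-hit Poisson model, in which the fraction of colony-negative wells equals exp(-CFE × cells seeded per well). That model, the function name and the example well counts are illustrative assumptions and are not taken from the study.

```python
import math

def colony_forming_efficiency(negative_wells, total_wells, cells_per_well):
    """Estimate CFE from the fraction of colony-negative wells.

    Assumption (not stated in the text): single-hit Poisson model,
    P(negative well) = exp(-CFE * cells_per_well).
    """
    if not 0 < negative_wells <= total_wells:
        raise ValueError("need at least one negative well and negative <= total")
    if cells_per_well <= 0:
        raise ValueError("cells_per_well must be positive")
    # -ln(negative/total) written as ln(total/negative)
    return math.log(total_wells / negative_wells) / cells_per_well

# Hypothetical plate: 96 wells seeded at 2 cells/well, 48 wells without colonies.
cfe = colony_forming_efficiency(negative_wells=48, total_wells=96, cells_per_well=2)
print(f"Estimated CFE: {cfe:.3f}")
```

The estimate is undefined when every well contains a colony, which is one reason to seed several dilutions so that at least one gives a mix of positive and negative wells.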

Ionizing radiation, drug and small molecule treatment effects were quantified using bulk population and CFE assays. Bulk population assays used 500–1000 cells/well seeded in 200 µl of complete growth medium in 96-well plates prior to irradiation in an RS-2000 X-irradiation unit (Rad Source Technologies, Buford, GA). Irradiated and control plates were grown for 7–10 days without refeeding, followed by WST-1 staining to quantify relative cell numbers in IR-treated and control wells. CFE assays were performed by dilution cloning in duplicate 96-well plates as described above. ATM kinase inhibitors KU-60019 or KU-59933 were added 3 h prior to irradiation to determine their ability to modify IR dose-dependent population growth or CFE. All treated cultures were grown for a minimum of 6 days in the presence of drug or inhibitor without refeeding.

Dose-dependent suppression of GSC proliferation and CFE by temozolomide, 5-aza-2'-deoxy-cytidine (decitabine) and ST-401 was determined using bulk population and CFE protocols as above. Methylguanine methyltransferase (MGMT) expression was analyzed by gene expression profiling and Western blot analysis using MGMT Rabbit 2739 primary antibody and Anti-Rabbit IgG (7074) secondary antibody (CellSignaling). All inhibitors, drugs or small molecules were obtained from SelleckChem (Houston TX) or from Sigma-Aldrich (St Louis MO) with the exception of ST-401, a gift from Dr. Nephi Stella (University of Washington Department of Pharmacology, Seattle WA).

Genomic characterization

G-banded metaphase karyotypes were generated by the University of Washington Cytogenetics and Genomics Laboratory Research Service using a standard protocol68, with complete analysis of a minimum of 20 cells/culture by an ABMG-certified cytogeneticist.

Whole exome sequencing was performed by targeted exon capture and sequencing of GBM4, GBM8 and 0827 libraries starting from 150 to 200 ng of DNA (Illumina TruSeq Exome Preparation kit, Illumina, San Diego, CA). Amplified library capture used the Roche NimbleGen SeqCap EZ Exome (v2) kit (Roche Sequencing and Life Sciences, Indianapolis, IN) after substituting IDT xGEN dual index blocking oligos (Integrated DNA Technologies, San Diego, CA). Library hybridizations were at 47 °C for 72 h, followed by capture washes on a Perkin Elmer Sciclone G3 NGSx workstation prior to PCR amplification for 13 cycles using Kapa HiFi HotStart Polymerase. GBM4, GBM8 and 0827 genomic libraries (1 sample/lane) were then subjected to paired-end, 75 bp sequencing on an Illumina HiSeq 4000. An equivalent 0131 library prep used DNA sonicated to ~ 200 bp, KAPA A-tailed with end repair and adapter ligation, and hybridization to a NimbleGen SeqCap EZ HGSC VCRome capture panel (Roche Sequencing and Life Sciences, Indianapolis, IN). Two lanes were then loaded for sequencing on an Illumina HiSeq2500.

All raw sequence data were analyzed jointly, with primary QC and mapping using the nf-core sarek pipeline (Nextflow v 22.04.3; sarek v3.0). Read pre-processing, adapter trimming and quality filtering were done using fastp, with QC analyses and metrics from both fastp and FastQC. All samples that passed QC metrics were then aligned to reference sequence GATK.GRCh37 with decoy sequences (human_g1k_v37_decoy.fasta) using bwa and manufacturer-provided target capture regions specific for SeqCapEZ v2 and VCRome v2.1 capture libraries. FreeBayes, Mutect2 and Strelka were used for somatic mutation calling against a panel of normals (Broad Institute: https://www.broadinstitute.org/genomics). Final mutation calls were based on a consensus of at least two of three callers followed by Ensembl Variant Effect Predictor annotation (https://grch37.ensembl.org/info/docs/tools/vep/index.html).
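
The two-of-three caller consensus rule can be sketched as a simple vote over variant keys. This is an illustration only: the function name, the (chromosome, position, ref, alt) key format and the toy calls are assumptions, and the sketch ignores the variant normalization and VCF parsing that a real workflow would need before comparing callers.

```python
from collections import Counter

def consensus_calls(calls_by_caller, min_callers=2):
    """Return variants reported by at least `min_callers` callers.

    `calls_by_caller` maps a caller name to an iterable of variant keys,
    here hypothetical (chrom, pos, ref, alt) tuples.
    """
    support = Counter()
    for variants in calls_by_caller.values():
        # set() so a variant listed twice by one caller is counted once
        support.update(set(variants))
    return sorted(v for v, n in support.items() if n >= min_callers)

# Toy calls, not real data
calls = {
    "freebayes": [("chr1", 100, "A", "G"), ("chr3", 300, "G", "T")],
    "mutect2": [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")],
    "strelka": [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")],
}
consensus = consensus_calls(calls)
print(consensus)
```

In this toy input the chr3 variant is dropped because only one caller reports it.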

Final results were summarized across tools and methods using MultiQC. Comparison GBM tumor data consisted of 2,539 IDH-wildtype and IDH neomorphic-mutant GBM samples from seven different TCGA and Cancer Genome Atlas projects (cBioportal ‘CNS/Brain—Cancer Subtype Designation Glioblastoma multiforme’ sample set, accessed 7 April 2025). Mutational signatures were analyzed using Mutect2 calls and post-processed, combined .maf files. COSMIC SigProfiler Assignment (https://cancer.sanger.ac.uk/signatures, v3.4, accessed 25 April 2025) was used to assign Human Cancer Signatures after excluding known artifactual signatures SBS27, SBS43, SBS45, SBS46, SBS47, SBS48, SBS49, SBS50, SBS51, SBS52, SBS53, SBS54, SBS55, SBS56, SBS57, SBS58, SBS59, SBS60, SBS95, and DBS14.

Duplex DNA sequencing was performed as previously described69 with mtDNA-specific modifications70, followed by targeted capture using a human mtDNA target capture set (Integrated DNA Technologies, Coralville, IA) and the IDT xGen Lockdown protocol. Libraries were indexed and sequenced using ~ 150-cycle paired-end reads (300 cycles total) on an Illumina HiSeq4000 to generate ~ 20 × 10^6 reads per sample. Raw sequencing data were processed using an in-house bioinformatics pipeline with default consensus-making parameters, and variant calling by VarDict-Java with left alignment for insertion-deletions (indels). Unique variants were counted and plotted once per genome position and sample, with variant frequencies quantified as described70.

Bulk RNA library generation and sequencing

RNA gene expression was analyzed using triplicate biological samples of 5 × 10^5 cells/culture, with RNA isolation, library prep and quality controls following Seattle Genomics in-house protocols (https://www.seattlegenomics.com/). In brief, total RNA was extracted and purified using the miRNeasy micro kit (Qiagen, Germantown MD), with RNA-seq library preparation using the rRNA depletion KAPA RNA HyperPrep Kit with RiboErase (HMR) (Roche Sequencing and Life Science, Indianapolis, IN). Resulting cDNA libraries were pooled and sequenced on a NextSeq using a mid- and high output, 150 cycle protocol (Illumina, San Diego, CA).

Raw (FASTQ) files were analyzed using Nextflow (version 22.04.3) and nf-core/rnaseq (version 3.7) as part of an integrated workflow. Read QC analyses were performed using FastQC, with filtering of ribosomal RNA reads by SortMeRNA. Genomic contaminants were removed by BBSplit prior to QC assessment, with adapter trimming by TrimGalore. STAR was used to align filtered reads to the human GRCh37 reference genome, SAMtools to sort and index alignments, and Picard to mark duplicate reads.

Expression was quantified using RSEM and Salmon pseudo-alignment, with processing, alignments and quality metrics summarized using MultiQC prior to creating read coverage files using BEDTools and bedGraphToBigWig. Read count normalization, QC analyses and identification of differentially expressed genes were performed using the DESeq2 R package, with a Benjamini–Hochberg adjusted p value of 0.1 and an absolute log2 fold change of 1.3 set as thresholds to identify significantly differentially expressed genes.

Single cell RNA sequencing used 20,000 cells/culture taken on Day 0 of a time course sampling experiment, prepared using the 10X Chromium Next GEM Single Cell v3.1 3ʹ Reagent Kit and protocol according to manufacturer instructions. The resulting cDNA was sequenced without further enrichment using a standard protocol (10X Genomics, Pleasanton, CA) on an Illumina NextSeq, with mtDNA sequences retained to allow expressed mtDNA variants to be captured for analysis. Gene-specific expression was identified in raw base call (BCL) files after demultiplexing into fastq files, followed by count matrix-mapping to a human hg19 reference genome using Cell Ranger with an ‘expected cell number’ = 20,000. Final expression counts were based on genes with ≥ 20 reads/cell in cells that expressed ≥ 200 genes, further qualified by UMI counts ≤ 25,000 and mitochondrial read percentages of ≤ 20%. These thresholds collectively identified 5618–7371 unique cells/culture, expressing 15,477–16,887 genes. Downstream analyses including data visualization, cleaning, geneset scoring etc. were performed using the Python scanpy package (version 1.10.4).
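
The per-cell quality thresholds above (≥ 200 expressed genes, ≤ 25,000 UMIs and ≤ 20% mitochondrial reads) can be sketched as a simple filter. The study performed this step with scanpy; the plain-Python version below only illustrates the logic. The function name, the dictionary layout and the example cells are hypothetical, and the gene-level read filter is not modeled.

```python
def passes_qc(cell, min_genes=200, max_umis=25_000, max_mito_pct=20.0):
    """Return True if a cell meets all three per-cell thresholds."""
    if cell["total_umis"] == 0:
        return False  # empty barcode, nothing to score
    mito_pct = 100.0 * cell["mito_umis"] / cell["total_umis"]
    return (
        cell["n_genes"] >= min_genes
        and cell["total_umis"] <= max_umis
        and mito_pct <= max_mito_pct
    )

# Hypothetical per-cell summaries, not real data
cells = [
    {"id": "cell_a", "n_genes": 3200, "total_umis": 12000, "mito_umis": 600},   # passes
    {"id": "cell_b", "n_genes": 150, "total_umis": 400, "mito_umis": 20},       # too few genes
    {"id": "cell_c", "n_genes": 6000, "total_umis": 40000, "mito_umis": 2000},  # too many UMIs
    {"id": "cell_d", "n_genes": 2500, "total_umis": 9000, "mito_umis": 2700},   # 30% mitochondrial
]
kept = [c["id"] for c in cells if passes_qc(c)]
print(kept)
```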

Single sample gene set enrichment (ssGSEA) analyses were implemented in the GSVA package in Bioconductor R version 4.2.1; GSVA version 1.46.0. Single cell transcriptional profiling data were analyzed using the ‘score_genes’ function in scanpy (Python version 3.11.7, scanpy version 1.9.6) that reimplements Seurat ‘module scoring’. Bulk RNA-seq enrichment scores (ES) represent normalized enrichment scores (NES) from ssGSEA, with scRNA-seq ES scores used as module scores, as implemented in scanpy (version 1.9.6)71. We looked for expression enrichment in key glioma biology-related gene sets, and functional enrichment for previously defined gene expression signatures related to GBM subtypes and transcriptional cellular states (Table S8).
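
The idea behind module scoring can be sketched in a few lines: a cell's score is the mean expression of the signature genes minus the mean expression of a control gene set. This is a deliberate simplification. scanpy's ‘score_genes’ draws its control genes from expression-matched bins, which is not reproduced here, and the function name, gene names and expression values below are hypothetical.

```python
from statistics import mean

def module_score(expression, signature, control):
    """Mean signature expression minus mean control expression for one cell.

    `expression` maps gene name to a normalized expression value.
    Genes missing from `expression` are skipped.
    """
    sig = [expression[g] for g in signature if g in expression]
    ctl = [expression[g] for g in control if g in expression]
    if not sig or not ctl:
        raise ValueError("signature and control sets must each match at least one gene")
    return mean(sig) - mean(ctl)

# Hypothetical single-cell expression values
cell = {"G1": 4.0, "G2": 2.0, "G3": 1.0, "G4": 1.0}
score = module_score(cell, signature=["G1", "G2"], control=["G3", "G4"])
print(score)
```

A positive score indicates that the signature is expressed above the control background in that cell.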

GSC transcriptional states were further analyzed using a Functional Module States framework39 and TCGA GBM data. ssGSEA enrichment scoring was used for FMS module analyses, with ssGSEA scores for classical, mesenchymal, and proneural GBM gene sets used to identify three consensus clusters. GBM neurodevelopmental and metabolic subtypes were after36, and ‘Neftel states’ after28. Recently identified gene expression signatures for cell–cell communication, electrical coupling and synaptic connectivity between tumor cells and with non-tumor cell types were after44,45, and functional states enabling invasion and proliferation after 38,41,42 (Table S8).

MS–MS proteomic and phosphoproteomic profiling

Tandem mass spectrometry (MS–MS)-based proteomic and phosphoproteomic profiling were used to characterize GSC cultures, and to determine responses to IR treatment. Pilot experiments were used to define IR dose-survival curves for all four GSC cultures, after which analyses were performed that used 12 replicate T-25 flasks (6 IR-treated and 6 mock-treated control flasks) seeded with 2.5 × 10^6 cells/flask and allowed to recover for 24 h in a 5% oxygen atmosphere prior to 1.0 Gy of X-radiation in an RS-2000 X-irradiator (Rad Source Technologies, Buford, GA). Samples for MS profiling analyses were prepared following72, then analyzed on an Orbitrap Eclipse Tribrid Mass Spectrometer equipped with an Easy1200 nanoLC system (Thermo Fisher Scientific) operating in data-dependent acquisition mode.

MS–MS spectra were searched with MaxQuant (v.1.6.14) against the Uniprot human canonical + isoform protein sequence database. A target-decoy database search strategy was used to guide filtering and estimate FDRs to 1% at both peptide and protein levels. Significantly altered phosphosites and proteins were identified using the limma package with filtering for Benjamini–Hochberg multiple hypothesis-corrected p values of < 0.05 and absolute fold changes of > 1.5. GO enrichment analyses were performed using the whole human proteome as background and Fisher exact testing with Benjamini–Hochberg multiple hypothesis correction at the protein level for an FDR < 0.02 to identify enriched terms. Protein annotations were from Gene Ontology (GO) Biological Processes, GO Cellular Components and the Kyoto Encyclopedia of Genes and Genomes (KEGG). Kinase-substrate enrichment analyses were performed using KSEA, and protein motif enrichment analyses with iceLogo.
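
The Benjamini–Hochberg correction and fold change filter used above can be illustrated with a short sketch. The study used limma for this step; the plain-Python version below is not that implementation. The function names and example values are hypothetical, and the sketch assumes that 'absolute fold change > 1.5' means a change of more than 1.5-fold in either direction.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p value so that adjusted
    # values never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def is_significant(adj_p, fold_change, alpha=0.05, min_fold=1.5):
    """Assumed reading of the filter: adjusted p < alpha and a change of
    more than `min_fold` in either direction (fold_change is a ratio > 0)."""
    return adj_p < alpha and max(fold_change, 1.0 / fold_change) > min_fold

# Hypothetical p values and fold changes for four proteins
p = [0.01, 0.04, 0.03, 0.005]
fc = [2.0, 1.2, 0.5, 1.6]
adj = benjamini_hochberg(p)
hits = [i for i in range(len(p)) if is_significant(adj[i], fc[i])]
print(adj, hits)
```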

Molecular barcode analysis of population heterogeneity and response to IR

A lentiviral pSMAL-CellTag-V1 plasmid library (Addgene Library 115643) with ~ 20,000 unique molecular barcodes was used to transduce and label GSC cultures54. PCR amplification of the plasmid barcode region generated a 134 bp amplicon for short read NGS sequencing to define a starting library barcode ‘white list’; the same primer pair was used to amplify cell-integrated CellTag-V1 vector copies from GSC cultures. CellTag barcodes were identified in fastq files using a custom script, then extracted using bcl2fastq using primer pairs as seed sequences. Base quality was checked using FastQC, followed by use of a custom script to identify unique sequence tags. Downstream analysis used unique sequence tags from each GSC culture with ≥ 10 reads in at least one sample.

Time course sampling experiments were initiated using 20,000 EGFP + flow-sorted cells/culture that were expanded to ≥ 2 × 10^6 cells/culture to seed Day 0 control (untreated), IR-treated and ‘reduced complexity’ experimental arms designed to mimic the effect of an 80% reduction in viability but without IR exposure (mock). IR doses of 1.6–2.5 Gy were used in the + IR arm to reduce cell numbers by an estimated ~ 80%, with IR survivals quantified by CFE at Day 21 after IR treatment. Day 0 samples were analyzed using bulk (triplicate samples of 500,000 cells/GSC culture) and single cell (20,000 cells/culture) RNA sequencing; CellTag barcode amplicon sequencing; and mtDNA variant detection by Duplex DNA sequencing (the latter two used 200,000 cells/assay/culture). Cultures were propagated after Day 0 with repeat sampling of 200,000 cells/culture up to Day 40, allowing sufficient growth between samplings to minimize ‘bottle-necking’ of population complexity.

Lentiviral barcode and mtDNA variant read frequencies were normalized to Day 0 samples and used to determine Day-specific (Tn/T0) read count ratios, which were then log-transformed prior to K-means clustering using the Python ‘sklearn’ package to identify barcode clusters with similar trajectories. Barcode read ratios at Day 40 were used to identify potentially enriched (ratios > 2 of enriched/ctrl) and depleted (ratios < 0.5, depleted/ctrl) barcodes prior to a Wilcoxon test to identify significant enrichment or depletion under either alternative hypothesis (> /enriched vs. < /depleted).
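
The ratio calculations above can be sketched as follows. The example covers only the normalization, log transformation and > 2 or < 0.5 ratio screening steps. The K-means clustering and Wilcoxon testing, done in the study with sklearn and scipy, are left out. The pseudocount, the choice of log base 2, the function names and the toy read counts are all assumptions made for illustration.

```python
import math

def frequencies(counts, pseudocount=1):
    """Convert raw tag read counts to frequencies (pseudocount is an assumption)."""
    total = sum(counts.values()) + pseudocount * len(counts)
    return {tag: (n + pseudocount) / total for tag, n in counts.items()}

def log2_ratios(day_n, day_0):
    """log2(Tn/T0) frequency ratio for tags present at both time points."""
    f_n, f_0 = frequencies(day_n), frequencies(day_0)
    return {tag: math.log2(f_n[tag] / f_0[tag]) for tag in f_0 if tag in f_n}

def classify(ir_counts, ctrl_counts, hi=2.0, lo=0.5):
    """Label tags by their IR/control frequency ratio on a given day."""
    f_ir, f_ctrl = frequencies(ir_counts), frequencies(ctrl_counts)
    labels = {}
    for tag in f_ctrl:
        if tag not in f_ir:
            continue
        ratio = f_ir[tag] / f_ctrl[tag]
        labels[tag] = "enriched" if ratio > hi else "depleted" if ratio < lo else "unchanged"
    return labels

# Toy Day 40 read counts, not real data
ctrl = {"tag1": 99, "tag2": 99, "tag3": 99}
ir = {"tag1": 699, "tag2": 249, "tag3": 49}
labels = classify(ir, ctrl)
print(labels)
```

Tags flagged this way are only candidates; in the workflow described above, a statistical test follows before a tag is treated as significantly enriched or depleted.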

Single cell sequencing with UMI indexing on the 10X Chromium platform was used to identify and link single cells to time course trajectories and initial gene expression programs. Expressed tags that showed apparent enrichment or depletion in IR-treated versus control samples were further assessed using the one-sided Wilcoxon test implemented in the ‘scipy.stats’ Python library. A ratio of < 0.5 or > 2 between IR/control (Ctrl) samples on Day 40 was used to identify depleted or enriched tags with a threshold p value of < 0.1. Cells carrying these tags were then plotted on a Scanpy UMAP and analyzed for differential gene expression using the ‘scanpy.tl.rank_genes_groups’ function with Wilcoxon rank sum testing.