Challenging the Astral mass analyzer to quantify up to 5,300 proteins per single cell at unseen accuracy to uncover cellular heterogeneity

Bubis, Julia A.; Arrey, Tabiwang N.; Damoc, Eugen; Delanghe, Bernard; Slovakova, Jana; Sommer, Theresa M.; Kagawa, Harunobu; Pichler, Peter; Rivron, Nicolas; Mechtler, Karl; Matzinger, Manuel

doi:10.1038/s41592-024-02559-1

Download PDF

Article
Open access
Published: 16 January 2025

Challenging the Astral mass analyzer to quantify up to 5,300 proteins per single cell at unseen accuracy to uncover cellular heterogeneity

Nature Methods volume 22, pages 510–519 (2025)Cite this article

29k Accesses
67 Citations
20 Altmetric
Metrics details

Subjects

Abstract

Despite significant advancements in sample preparation, instrumentation and data analysis, single-cell proteomics is currently limited by proteomic depth and quantitative performance. Here we demonstrate highly improved depth of proteome coverage as well as accuracy and precision for quantification of ultra-low input amounts. Using a tailored library, we identify up to 7,400 protein groups from as little as 250 pg of HeLa cell peptides at a throughput of 50 samples per day. Using a two-proteome mix, we check for optimal parameters of quantification and show that fold change differences of 2 can still be successfully determined at single-cell-level inputs. Eventually, we apply our workflow to A549 cells, yielding a proteome coverage ranging from 1,801 to a maximum of >5,300 protein groups from a single cell depending on cell size and search strategy used, which allows for the study of dependencies between cell size and cell cycle phase. Additionally, our workflow enables us to distinguish between in vitro analogs of two human blastocyst lineages: naive human pluripotent stem cells (epiblast) and trophectoderm-like cells. Our data harmoniously align with transcriptomic data, indicating that single-cell proteomics possesses the capability to identify biologically relevant differences within the blastocyst.

Pick-up single-cell proteomic analysis for quantifying up to 3000 proteins in a Mammalian cell

Article Open access 10 February 2024

Mass spectrometry-based proteomics data from thousands of HeLa control samples

Article Open access 23 January 2024

Initial recommendations for performing, benchmarking and reporting single-cell proteomics experiments

Article 02 March 2023

Main

Single-cell proteomics (SCP) by mass spectrometry (MS) has evolved into a powerful technique to investigate cellular heterogeneity with increasing coverage and throughput. To tackle the challenge of ultra-low input amounts, a multitude of miniaturized and automatized sample preparation workflows have been developed by several labs^{1,2,3,4,5,6,7,8,9}. The success of SCP, however, is also tightly connected to technological improvements in high-performance nanoflow liquid chromatography (LC) and high-resolution, high-sensitivity MS. The SCP field initiated its success by using isobaric labels for multiplexing to improve sensitivity and throughput¹. With more sensitive and fast mass spectrometers being commercialized, more and more groups focus on label-free quantitation workflows¹⁰ and use data-independent acquisition (DIA) combined with short LC gradients. With these DIA workflows, optimal reproducibility across runs is achieved while maintaining throughput at acceptable levels ranging from 40 to up to 180 samples per day (SPD)^11,12,13. Improvements in data analysis algorithms allow for confident identification of very-low-abundance peptides hidden within highly chimeric spectra.

The recent introduction of the Thermo Scientific Orbitrap Astral mass spectrometer¹⁴ further extends the reachable sensitivity and acquisition speed by combining the established Thermo Scientific Orbitrap analyzer with the newly developed Thermo Scientific Astral analyzer in a single instrument. The Astral analyzer combines high speed (up to 200 Hz) at high resolution and sensitivity, with nearly lossless transmission and a high dynamic range¹⁴. The instrument allows for high acquisition speed by simultaneously using the Orbitrap analyzer for MS¹ spectra and the Astral analyzer for MS² spectra at a resolution sufficient to resolve tandem mass tag reporter ions and at a sensitivity superior to the Orbitrap analyzer¹⁴.

Here, we investigate the performance of the Orbitrap Astral mass spectrometer in combination with the Aurora Ultimate TS 25-cm nanoflow ultra-high-performance LC (UHPLC) column that comes with a fully integrated source interface to minimize peak broadening and yield the highest possible signal intensity. We optimize the gradient length to yield maximum proteome coverage and assess the accuracy and precision of quantification.

We apply our improved workflow to investigate the cellular heterogeneity of A549 human lung cancer cells. Additionally, we used optimized SCP on naive human pluripotent stem (hPS) cells, which recapitulate the preimplantation epiblast (EPI), and trophectoderm (TE)-like cells differentiated from naive hPS cells. During the in vitro fertilization process, only about 40% of fertilized eggs are estimated to reach the blastocyst stage with sufficient quality for transfer to the mother’s uterus¹⁵. The criterion for selecting blastocysts largely depends on morphological characteristics, such as the expansion of the outer TE and a well-clustered inner EPI population^16,17. However, our understanding of the mechanisms regulating blastocyst morphology remains limited. This limitation is primarily due to restricted access to human embryos and the lack of technologies capable of low-input detection methods. In this context, high-sensitivity SCP emerges as a powerful tool. It holds promise for unraveling the mechanisms underlying the regulation and functionalization of blastocyst morphology, including aspects like TE epithelialization, cavity formation and tissue segregation between the EPI and TE.

Results

Reaching unprecedented proteomic coverage

We combined the Orbitrap Astral mass spectrometer with the Thermo Scientific field asymmetric waveform ion mobility spectrometry (FAIMS) Pro interface and the Auroa Ultimate 25-cm TS column with an integrated emitter tip and zero dead volume, aiming to maximize sensitivity for ultra-low-input proteomic samples. To have a representative standard of comparable input to a single cell for evaluation and benchmarking, we decided to inject 250 pg of commercial HeLa and K562 cell peptides. The bulk digests were diluted in 0.1% trifluoroacetic acid (TFA) containing 0.015% N-dodecyl-β-d-maltoside (DDM) to improve peptide solubility, reproducibility and quantitation accuracy¹⁸. Gradient lengths were varied from 14 to 38 min (Supplementary Table 1), which corresponds to reachable throughputs of 30–80 SPD, including sample loading as well as column equilibration and washing time. During loading and washing, flow rates were increased to improve throughput, and lower flow rates were applied during peptide elution to enhance sensitivity^13,19,20. As shown in Fig. 1a,e, we found a sweet spot around 50 SPD, yielding maximal identifications. Of note, the data were analyzed library free. When replicates were searched together, the number of protein groups (PGs) identified was boosted by 6.3–10.9% and 4.5–7.2% for HeLa cell peptides and K562 cell peptides, respectively. In addition, by allowing for matching, replicate measurements nicely align, leading to a data completeness close to 100% and vanishing error bars for ID numbers compared to the method evaluation mode.

**Fig. 1: Finding optimal parameters to maximize PG IDs and maintain data quality.**

Based on our results, we hypothesize that shorter gradients are advantageous for ultra-low-input samples as they produce sharper, and hence more intense, chromatographic elution peaks. Once a maximum is reached at 50 SPD, faster gradients potentially suffer from insufficient separation power and greater spectrum complexity, which adversely affects identification numbers. Fast gradients with very narrow elution peaks further reduce the number of data points per peak (Fig. 1c,g). We decided to focus on 50 SPD as the gradient length for all further experiments within this study as it still shows the best proteome coverage with five data points per peak on median (MS¹ level), enough for proper integration. We further decided to quantify on the MS¹ level as we see not only more data points there but also a lower median coefficient of variation (CV) clearly below 10% (Fig. 1b,f).

Next, we created a tailored library from 10 ng of the very same HeLa and K562 digests to further improve proteome coverage. The libraries were also recorded in DIA mode and resulted in more than 62,000 precursors identified within the library (Fig. 1d,h). Using these libraries, we performed a library search for the 250-pg runs, which improved PG identifications by 41.5% and 31.7% compared to a library-free search in method evaluation mode and by 26.3% and 25.6% compared to the library-free search with matching for HeLa and K562 cell peptides, respectively (Fig. 1d,h). Using default settings for the Spectronaut search engine, we identified more than 7,500 proteins from a single run with as little as 250 pg of HeLa cell peptide input and more than 6,800 proteins with 250 pg of K562 cell peptides. These numbers increased further to a total of 7,800 and 7,060 unique PGs identified from three replicates of HeLa and K562 cell peptides, respectively (Supplementary Fig. 1a,b). Of note, we identified a total number of 6,126 PGs from HeLa cell peptides using DirectDIA+ in method evaluation mode and only 6,017 using DirectDIA+ with matching replicates (DirectDIA+). We hypothesize that this is reasoned by a global false discovery rate (FDR) control applied over all three replicates on top of the run-specific control while only very few additional proteins could be matched in those technical replicates. Of those 6,017 total proteins, 5,989 were found in all replicates, indicating excellent data completeness (Supplementary Fig. 1a). We additionally checked for potential FDR inflation by using a library to boost ID numbers and performed a cosearch of 250-pg and 10-ng files instead (Supplementary Fig. 1c). When doing so, the PG numbers were similar to each other (difference of ~0.4%), and at the precursor level, the difference was ~5%, suggesting that FDR inflation is no problem in the Spectronaut version used (18.6).

Importantly, our results depicted a single hit wonder and were also successfully reproduced on a second Orbitrap Astral run using a different batch for the analytical column and K562 digest (Supplementary Fig. 1d).

To exclude potential issues occurring from carryover when using shorter gradients, we assessed the degree of residual protein and precursor identifications in the first five washes after 250-pg injections. (Supplementary Fig. 2). In method evaluation mode, we were not able to identify any proteins in wash runs where only 0.1% TFA was injected, as too little spectra for calibration were available. When allowing for matching across the recorded replicates, the first wash run after a sample injection yielded <2% of precursor IDs compared to the previous 250-pg injection. The total quantity of protein in those first washes was more than two orders of magnitude lower than the previous 250-pg injections independent of the gradient length used. Although those quantities are further reduced following continued washing, we conclude that the level of carryover is neglectable for such single-cell-level samples and for the throughput range tested in this study. Furthermore, we did not see any accumulation of material on the column over time (that is, no increase in quantity from the first to fifth replicate of a 250-pg injection; Supplementary Fig. 2c).

We assume that 250 pg of HeLa cell peptides is equivalent to single-cell-level input for methodological comparisons in this study; however, protein contents ranging from 100 to 500 pg were reported for HeLa cells depending on their size and cultivation time²¹. We therefore checked reachable proteome coverage for a broader range of input amounts ranging from 50 pg to 10 ng (relevant when using libraries or carrier proteomes). This resulted in ~4,700 to ~8,300 PGs identified from 50 pg to 10 ng, respectively (Supplementary Fig. 3). To consider the large difference in sample input, we adopted an isolation window and maximal injection time ranging from 5 to 20 Th and 10 to 80 ms, respectively. We consider such an adoption as highly relevant also for the generation of tailored libraries from higher input (see Fig. 4 without an adopted acquisition method and Fig. 5 with an adopted acquisition method for real single cells).

Validation of the FDR strategy

To estimate the real FDR, we generated a shuffled target database by shuffling all sequences of our target database while maintaining the positions of all protease cleavage sites (P, K and R). Both databases were used as targets in the subsequent search. They have less than 0.01% shared peptides and a similar size, allowing for a fair FDR estimation with a very low risk of finding peptides in the shuffled target database that represent true-positive hits (Fig. 1i). Of note, randomization of peptide sequences led to a slightly increased number of unique peptides in the shuffled target versus the target database, which we opted for deliberately to ensure a conservative estimate of the FDR.

When examining the HeLa cell data presented in Fig. 1d, we found that the FDR at the PG level was slightly above the expected 1%, indicating that default settings in Spectronaut version 18 are too relaxed (Fig. 1k). This is in line with earlier results²² demonstrated in Spectronaut versions 16 and 17, where the FDR was reported to be higher than expected when using default parameters.

We additionally used a Caenorhabditis elegans database as a second entrapment strategy for FDR validation (Fig. 1j). The estimated FDR when using C. elegans as entrapment was a bit higher and above 2%, confirming our initial observation of potentially too relaxed settings. We therefore tested the application of a more stringent cutoff of 0.01 for the run-wise protein q value, which did, however, not alter the results. Instead we applied fully stringent settings suggested by Baker et al.²². Their optimized settings set all cutoff values to 0.01. As shown in Fig. 1d,h, the additional protein q value cutoff did not change the results in DirectDIA mode, and numbers dropped only slightly for the library search. Using the settings suggested by Baker et al., PG numbers for HeLa cells dropped by ~3% for runs in DirectDIA+ and by ~16% for library searches.

Of note, reporting total ID numbers by summing up all unique IDs from three replicates (Supplementary Fig. 1a,b), without an additional FDR filter step, further accumulates wrongly identified precursors and proteins. This leads to an elevated FDR of up to 5.5% (Supplementary Fig. 1e). Hence, we advise reporting average ID numbers rather than total ID numbers.

The FAIMS Pro interface improves signal to noise, CV and coverage

Aiming for the best possible sensitivity, we screened for the optimal compensation voltage in high FAIMS using the FAIMS Pro interface. From previous experience, –48 V yields optimal results on the used device, which is why that voltage was used for all other runs. Screening the compensation voltage from –38 to –88 V in steps of 10 suggests that the optimal setting is between –48 and –68 V, with few more PGs and clearly more precursors detected at –58 V than at any other setting. Although using the FAIMS Pro interface bears the risk of losing some ions, it still seems advantageous due to an improved signal-to-noise ratio and dynamic range^23,24. As reported earlier by our lab²⁵, a clear gain in sensitivity, especially for limited-sample LC–MS/MS analyses, can be seen by using the FAIMS Pro interface, as the relative contribution of singly charged background ions gets more substantial at low sample loads.

In our hands, use of the FAIMS Pro interface at optimal settings resulted in a clearly improved signal-to-noise ratio (Supplementary Fig. 4), with up to 42.6% more precursors and 55.3% more PGs identified than measurements without a FAIMS interface (Fig. 2a). In line with our expectation from the improved signal-to-noise ratio, we quantified more low-abundance precursors/proteins when using FAIMS (Fig. 2d,e). However, some of the abundant precursors were lost, resulting in only 49% and 63% of peptides identified in runs without FAIMS being found at the most promising compensation voltages of –48 V and –58 V, respectively (Fig. 2f). Among shared peptides between runs without the FAIMS Pro interface and with a compensation voltage of –48 V, fewer close to 0 or missed values were present in the data recorded with the FAIMS interface (Supplementary Fig. 5b,c). Almost all (98%) peptides found in runs without FAIMS were present in one of the compensation voltage runs (Supplementary Fig. 5a). The observed effect was less dramatic at the protein level, where almost all proteins quantified without FAIMS were also quantified with FAIMS (Fig. 2g). In addition, FAIMS usage yielded more identified proteins. Global sequence coverage with FAIMS was slightly reduced (that is, on average, 7.8 peptides/protein were identified without FAIMS but only 7.6 and 6.0 peptides/protein at compensation voltages of –48 V and –58 V, respectively). Acquisition speed was also not affected as there was no influence on the number of data points per peak (Fig. 2c), and the CV (Fig. 2b) was strongly improved by FAIMS. Overall, we believe that FAIMS usage is highly advantageous for most single-cell studies mainly due to the improvement of signal-to-noise and CV that results in more proteins being identifiable and quantifiable.

**Fig. 2: Tuning coverage and signal to noise by FAIMS.**

Quantitative performance

We next evaluated the quantitative performance of our workflow at single-cell-level inputs and benchmarked the data to our Thermo Scientific Orbitrap Exploris 480 mass spectrometer using similar settings, the same input and the same gradient length.

Although the CV gives a reasonable estimate of the quantitative precision across technical replicates, we further evaluated the accuracy and precision of proteins quantified from a two-proteome mix (Fig. 3). By comparing two proteome mixes with different, but known, human-to-yeast mixing ratios, we assessed the accuracy of quantification. Of note, less than 50 pg of yeast, present in one of the samples, still enabled the quantification of 1,782 yeast proteins. As shown in Fig. 3a, the Orbitrap Astral mass spectrometer does an excellent job delivering a fold change in protein abundance very close to the expected value and outperforms the Orbitrap Exploris 480 mass spectrometer, especially for the more sparse yeast samples (Fig. 3). We further checked accuracy by looking at the distribution around the expected fold change and found that higher-abundance proteins within our sample tended to be quantified very accurately with a local CV of less than 5%. The local CV, however, rose with lowered protein abundance and approached 35% at the limit of quantification. We observed an exponential drop of CV with increasing protein abundance.

**Fig. 3: Human–yeast proteome mix to assess quantitative performance.**

Our results show that on the Orbitrap Astral mass spectrometer, more than twice the number of proteins was quantified compared to on the Orbitrap Exploris 480 mass spectrometer, which is likely due to its improved speed and sensitivity. When filtering for commonly found proteins (Fig. 3b,h,i), predominantly lower-abundance proteins were missing, highlighting the increased sensitivity of the Orbitrap Astral mass spectrometer. Furthermore, almost all proteins identified with the Orbitrap Exploris 480 mass spectrometer were also identified using the Orbitrap Astral mass spectrometer. The number of common proteins was very close to the protein number quantified using the Orbitrap Exploris 480 mass spectrometer. The commonly quantified proteins allow for a fair comparison of accuracy and precision between the Orbitrap Astral mass spectrometer and the Orbitrap Exploris 480 mass spectrometer and shows that the distribution of fold change values is clearly smaller on the Orbitrap Astral mass spectrometer. The same is true for the local CV and is most distinct for the proteins of lowest abundance, where the CV drops from ~20% for the Orbitrap Exploris 480 mass spectrometer to ~12% for the Orbitrap Astral mass spectrometer. Because quantification was performed at the MS¹ level, data from both instruments originated from an Orbitrap analyzer of the same construction type. We hypothesized that the increased CV even for the very same proteins is due to the lower speed of the Orbitrap Exploris 480 mass spectrometer. The Orbitrap Exploris mass spectrometer uses the Orbitrap analyzer for both MS¹ and MS² data acquisition, whereas the Orbitrap Astral mass spectrometer simultaneously uses the Orbitrap analyzer for MS¹ and the Astral analyzer for MS², which gives the latter an enormous advantage in speed. This is also reflected in more data points per elution peak recorded on the Orbitrap Astral mass spectrometer (median of three for Exploris and five on the Astral for this dataset).

Applying an additional filter to exclude proteins quantified based on only one peptide had no impact on the accuracy but further improved the quantitative precision, independent of the instrument used (Supplementary Fig. 6). As a tradeoff, 18% (Orbitrap Astral mass spectrometer) to 7% (Orbitrap Exploris 480 mass spectrometer) fewer proteins were quantified for a 2:1 ratio mixture. We investigated the quantitative performance using larger fold changes of 5:1 and 10:1 in yeast quantity (Fig. 3d–i), which showed similar results.

We additionally checked quantitation performance at the MS² level, yielding a bit wider deviation around the expected fold change ratio than seen at the MS¹ level (Supplementary Fig. 7).

Proof-of-principle studies using single cells

Heterogeneity of two human non-small cell lung cancer models

We first benchmarked several previously reported sample preparation workflows to decide on an optimal strategy for our single-cell studies (Supplementary Information and Supplementary Fig. 8). We decided on the One-Pot⁷ workflow as it delivered optimal coverage and low background levels. We applied our optimal workflow conditions for sample preparation and data acquisition to single cells from lung cancer. The A549 and H460 cell lines are epithelial-like human non-small cell lung cancer cell lines commonly used for basic research and drug discovery. Although the same size range (20–30 µm) of cells was selected for both cell types, their protein content seemed surprisingly different. Although we yielded 4,879 average PGs and 32,417 precursors for A549 cells, coverage for H460 cells was clearly lower at 3,166 average PGs and 17,619 precursors (Fig. 4a). In line with our previous results (Supplementary Fig. 8), blank controls showed only low background signals besides the abundant trypsin peaks. Linear dimensionality reduction through principal component analysis (PCA) and heat map analysis showed a clear separation of both cell lines and from the blank control runs based on their protein abundance profiles, proving the capability of the method to investigate cellular heterogeneity (Fig. 4b,c).

**Fig. 4: Heterogeneity of single A549 and H460 epithelial-like human lung cancer cells.**

Size: cell cycle dependency of A549 cells

To investigate our A549 model in more detail, three different size groups of cells ranging from 15–30 µm in diameter were collected using the cellenONE robot. Sixty-six single A549 cells and three blanks were measured using the Orbitrap Astral mass spectrometer. As expected, the number of PGs identified correlated with the size of the cells (Fig. 4d) and ranged from 1,801 PGs for small cells to 2,870 PGs for large cells, on average, when searching library free and in method evaluation mode. When allowing for matching across single cells (DirectDIA+), ID numbers were improved to 3,304 (+83%) and 4,439 (+55%) for smaller and larger cells, respectively. These average ID numbers are very close to numbers we obtained for A549 cells in our first single-cell dataset (Fig. 4a), which was measured on a different Astral instrument and in a different lab, demonstrating the reproducibility of the workflow. We additionally investigated if the use of a spectral library might further increase ID numbers and created a tailored library either from 20 or 40 A549 cells isolated in a single well of a 384-well plate, processed and measured using the exact same settings as for single cells. We could not significantly improve proteome coverage using a library search instead of DirectDIA+, although the used 20- and 40-cell libraries contained much more protein than found in single cells (6,395 PGs for 20 cells and 6,762 PGs for 40 cells). Using the 40-cell library, ID numbers even dropped a bit (Fig. 4d). This is why we assume that there is a sweet spot in optimal library size. MS acquisition parameters, optimized for single-cell input, as accumulation time or m/z windows are likely not perfectly optimized anymore for larger amounts as used for a large library. In addition, we believe that the high number of replicates already serves as an excellent basis for matching. In line with those assumptions, we observed a more pronounced effect of using a library when optimizing the MS method for the higher input and when using fewer replicates of single cells, as done in our second study on TE-like cells and naive hPS cells (Fig. 5). In our hands, using a 20-cell library yielded the best results and enabled us to identify, on average, 3,351 PGs from small single A549 cells and 4,689 PGs from large single A549 cells. Maximum coverage was 5,300 proteins from a single A549 cell. We also examined the performance of DirectDIA+ with optimized settings, which yielded slightly more ID numbers than the method evaluation approach.

**Fig. 5: Analysis of single TE and hPS cells.**

We investigated the technical characteristics of the dataset, data completeness and CV between single cells at the PG level. Data completeness dropped dramatically, with the number of files reaching only 1,053 of 5,805 PGs (~18%) identified in all 66 files (Supplementary Fig. 9a). There was also a slight dependence on cell size; for the 15- to 20-µm cell diameter range, we observed slightly lower data completeness across 19 files than in the two other groups (20–25 µm and 25–30 µm; Supplementary Fig. 9b–d). As expected, mainly due to biological heterogeneity, the CV distributions for this dataset also looked completely different from HeLa or K562 cell peptide dilution series, where the CV was below 10%. The median CV distribution for single-cell runs was 65%, whereas for stock runs of 20 and 40 cells the median CV distributions were 17% and 16%, respectively (Supplementary Fig. 9e–g). This difference in CV aligns with our assumption that cells have a unique protein abundance distribution, which results in a high CV among single cells. In 20- and 40-cell samples, the CV was much lower because we averaged among 20 or 40 cells; however, it was not close to HeLa or K562 cell peptide dilution series. Several factors might be the issue: (1) 20 or 40 cells is not enough to average all biological differences between cells, and (2) for real datasets, sample lysis and digestion cause some fluctuations.

We next examined the extent of cellular heterogeneity of these cultured and untreated A549 cells by PCA and found that cells clustered differently dependent on their diameter (Fig. 4e). This is even more pronounced when performing a nonlinear uniform manifold approximation and projection (UMAP; Fig. 4f). We took a deeper look at the most abundant proteins identified in the single cells, which were histone H4 (H4C1) and the actins ACTB and ACTC1. Indeed, their abundance profiles were nicely reflected in the clustering seen in the PCA and UMAP (Extended Data Fig. 1). We suspect that their abundance differences might correlate with not only the recorded cell sizes but also the different cell cycle stages. G2/M phase cells are reported to be the largest²⁶, which is why we annotated our largest cell size cluster as presumably G2/M. Next, we tried to confirm this by checking the expression pattern of abundant and known marker proteins. Actins are known to play a crucial role in cell division and in the formation of cell junctions and maintenance of cell shape, which is highly relevant especially for cancer cells and therefore why it seems logical that their expression levels correlate to cell size and cell cycle stage^27,28. Histone levels are known to be elevated during G2/M phase when DNA synthesis occurs. To determine whether the cells with elevated H4C1 and H14 levels as determined by PCA and UMAP (Extended Data Fig. 1) could indeed be primarily in G2/M phase, we investigated the abundance levels of the nuclear ubiquitous casein and cyclin-dependent kinase substrate 1 (NUCKS1), which was also quantified in this dataset. NUCKS1 is known to be expressed at high levels in S phase^29,30 and is indeed present at lower expression levels in those cells presumably in G2/M phase but is upregulated in another cell cluster (presumably in S phase). We also assessed the expression of the classical cell cycle marker CDK1. CDK1 was reported to have similar protein expression levels as NUCKS1 (ref. ²⁹), but we could not see a clear trend in the data. MKI67, another reporter protein expected to be maximally expressed in G2 phase³¹, also showed results correlating to the size distribution (Extended Data Fig. 1).

Of note, no cell cycle control was performed in this study, which is why our cellular population reflects a wild mix of all phases, impeding a clear separation but still allowing for the successful study of the heterogeneity of this system.

Characterization of human hPS cells and TE-like derivatives

To challenge our analytical LC–MS/MS setup and the abilities of SCP further, we analyzed naive hPS cells and induced from them TE-like cells. On day 4, TE-like cells were collected and sorted for expression of the TROP2 marker protein using fluorescence-activated cell sorting (FACS). We evaluated different analysis strategies (Fig. 5a). With our library-based approach, we could identify 2,339 and 2,544 PGs in TE-like and naive hPS cells, respectively. Data completeness at the PG level was 53% for hPS cells and 60% for TE-like cells across all files (Supplementary Fig. 10a,b). The CV distribution of PG abundances for TE-like and hPS single cells had a median of 44%, whereas 100-cell runs had abundances of 13% and 13%, respectively (Supplementary Fig. 10c–f).

To confirm that SCP can recapitulate known differences between naive hPS cells and TE-like cells, we performed PCA and UMAP clustering analysis on all cells in the dataset. We observed strong separation between these two cell populations (Fig. 5b,c) with both approaches, which aligned with cell type. PCA reported an explained variance of 51.8% (PC1) and 4.9% (PC2).

Gene Ontology (GO) analysis of proteins upregulated in TE-like cells showed enrichment in the following biological processes: cytoskeleton organization (GO:0007010), actin cytoskeleton organization (GO:0030036), epithelial cell differentiation (GO:0030855), epithelium development (GO:0060429), positive regulation of cell differentiation (GO:0045597) and cell differentiation (GO:0030154), which rightfully corresponds to the previously described biological processes underlying the differentiation of hPS cells into TE-like cells³² (Supplementary Fig. 11).

TE cell analogs express the proteins GATA2, GATA3, CDX2 and TROP2 (ref. ³³). Transcription factors are typically of low abundance in the cell and are often challenging to detect with proteomics techniques. They are also partly localized in the nucleus, which can make them less accessible for extraction and analysis. Additionally, many transcription factors are modified by phosphorylation or other post-translational modifications that can affect their detection. Here, we were able to identify the proteins GATA2 and GATA3 at very low abundance in some TE-like cells. Among differentially expressed proteins between hPS cells and TE-like cells, we found many TE markers well known from transcriptome data: KRT18 (epithelium cytoskeleton), KRT19 (epithelium cytoskeleton), RAB25 (RAS pathway), DAB2 (ERK pathway), YAP1 (Hippo pathway), S100A16, SP6, CDH1, ENPEP, SLC7A2, HAVCR1 and PDLIM1, which are associated with TE development, and DPPA4, DPPA2, SUSD2 and DNMT3L, which are well-established naive hPS cell markers^32,34 (Extended Data Fig. 2). Abundances for these proteins correlate with transcriptome data known from the literature^32,34.

Single-cell proteome data of naive hPS cells and TE-like cells demonstrate the capacity to distinguish primary cells and to extract meaningful biological information that complements and enriches results from transcriptome data.

Comparison with pseudobulk data (100 cells)

We compared the single-cell dataset of TE-like and hPS cells to pseudobulk data generated from 100-cell stocks. PCA distinguished different cell types with explained variances of 54.1% (PC1) and 15.5% (PC2; Supplementary Fig. 12a). We examined protein abundances of marker proteins, which were found for single-cell data. For all proteins but one, the trend was the same. ENPEP was found to be abundantly expressed in both TE-like and hPS cells in bulk data; however, at the single-cell level, it was found at high abundance only in TE-like cells.

GO analysis of proteins upregulated in the TE-like bulk dataset showed much more biologically relevant pathways, even though the total number of upregulated proteins in pseudobulk data (from 100 cells) was lower than in single-cell data (415 versus 243 proteins). If we compare the same pathways that were identified in single-cell and bulk data (Supplementary Figs. 11 and 13), the number of proteins in enriched pathways was similar; however, the P values of the pathways for the pseudobulk data were lower. This is likely because of more intense, and hence more confident, signals and results in a much clearer picture originating from the bulk dataset than the noisier single-cell data. The intersection of upregulated proteins between bulk and single-cell data is 60 proteins.

Discussion

Despite tremendous improvements already made in the field of ultra-low-input proteomics by MS in the past decade, achieving a high sample throughput, reproducibility, quantitative accuracy, precision and high sensitivity at the same time is still very challenging. The presented workflow aims to fulfill those criteria by combining a high-performance zero-dead-end volume column at a short gradient with one of the most advanced, fast and sensitive MS instruments, FAIMS-based noise reduction and a DIA-based acquisition. The used packed bead column has a pulled emitter directly at the end of the column, which minimizes peak broadening and is further supported by the use of high flow rates during sample loading, increasing the throughput. LC flow rates are reduced during the active gradient to enhance sensitivity. We found that total throughputs of 50 SPD represent a sweet spot between maximizing speed of sample acquisition and maintaining enough measurement time to detect a median of five data points per elution peak, enabling proper quantification.

Here, we assess quantitative accuracy and precision at the single-cell level using a two-proteome mix. At five data points per elution peak as reached on the Orbitrap Astral mass spectrometer at 50 SPD, successful separation of twofold changes in protein abundance and quantification of as little material as 10 pg in the most extreme case still worked. However, the CV of quantified proteins within a sample increased with decreased protein abundance. Hence, relative abundance within a single-cell sample is an important consideration for future biological studies as the confident clustering of cellular populations can currently only rely on more abundant proteins within single cells. Despite this limitation, the Orbitrap Astral mass spectrometer clearly outperforms the Orbitrap Exploris 480 mass spectrometer in terms of number of quantified proteins, sensitivity and quantitative accuracy, allowing for the investigation of heterogeneity of untreated cultured cells based on their size and cell cycle phase, as shown for A549 cells. Furthermore, the distinct separation we observed between TE-like cells and naive hPS cells highlights the potential of this SCP workflow for application to human blastocysts. It could clearly aid in exploring TE development mechanisms, blastocyst morphology regulation and the identification of molecular markers of high-quality blastocysts. Ultimately, this knowledge could contribute to our understanding of human blastocyst development and also potentially improve in vitro fertilization success rates.

In conclusion, we are convinced that our comprehensive workflow with optimized and improved parameters from sample preparation to data interpretation is a highly valuable contribution to evolve SCP to the next level in terms of sensitivity, reproducibility and quantitative performance, helping to transition the field of SCP from the developmental phase to a technique for the biologist’s toolbox.

Methods

Ethical approvals

The Wicell line H9 was used under agreement 20-WO-341 for a research program entitled ‘Modeling early human development: establishing a stem cell-based 3D in vitro model of human blastocyst (blastoids)’. Blastoid generation was approved by the Commission for Science Ethics of the Austrian Academy of Sciences. This work did not exceed a developmental stage normally associated with 14 consecutive days in culture after fertilization even though this is not forbidden by the International Society for Stem Cell Research (ISSCR) Guidelines as far as embryo models are concerned. All experiments complied with all relevant guidelines and regulations, including the 2021 ISSCR guidelines that forbid the transfer of human blastoids into any uterus³⁵.

Cultivation of A459, H460 and HeLa cells

Cells were cultured at 37 °C in a humidified atmosphere at 5% CO₂. A549 and H460 cells were grown in RPMI 1640 medium, and HeLa cells were grown in DMEM. Cell medium was supplemented with 10% fetal bovine serum (FBS; 10270, Fisher Scientific), 1× penicillin–streptomycin (P0781-100ML, Sigma-Aldrich), 100× l-glutamine (200 mM, 250030-024, Thermo Scientific) and 1 mM sodium pyruvate (for RPMI only; 4275, Sigma-Aldrich). Cells were grown to around 75% confluency before trypsinization with 0.05% Trypsin-EDTA (25300-054, Thermo Scientific), followed by washing three times with PBS. Cells were resuspended in PBS at a density of 200 cells per μl for isolation with the cellenONE (Cellenion).

Sample preparation of A459, H460 and HeLa cells

A549, H460 and HeLa cell isolation, lysis and digestion were performed within a 384-well plate (Thermo Scientific Armadillo 45PCR Plate, 384 wells, 12657516) using the cellenONE robot as previously described⁷; the cellenONE robot was operated using its control software (v2.0-1143). Briefly, cells were sorted into wells containing 1 µl of master mix (0.2% DDM (D4641-500MG, Sigma-Aldrich), 100 mM triethylammonium bicarbonate (TEAB; 17902-500ML, Fluka Analytical), 3 ng µl^–1 trypsin (Trypsin Gold, V5280, Promega), 0.01% enhancer (ProteaseMAX, V2071, Promega) and 1% DMSO). For single-cell samples, cells were deposited into individual wells, whereas for 20- or 40-cell libraries, the respective cell number was sorted into a single well. Humidity and temperature were controlled at 50% and 15 °C during cell sorting. A549 and H460 cells were isolated at a given diameter of 15–30 µm and HeLa cells at 18–25 μm. The maximum elongation was set to 1.5. Cell lysis and protein digestion were performed at 50 °C and 85% relative humidity for 30 min before an additional 500 nl of 3 ng μl^–1 trypsin was added. After lysis and digestion, 3.5 μl of 0.1% TFA was added to the respective wells for quenching and storage at −20 °C. For LC–MS/MS analysis, samples were directly injected from the 384-well plate.

For benchmarking studies to proteoCHIP-based sample preparation, cells were isolated and prepared exactly as described in the manufacturer’s manual. Standard (nonstringent) washing of chips was performed by sonication in methanol for 20 min, followed by extensive flushing with Milli-Q water and drying in the fume hood. A stringent washing protocol was established by Cellenion and is considered proprietary. In brief, 300 nl of master mix (0.2% DDM, 10 ng µl^–1 Trypsin Gold and 100 mM TEAB) was dispensed into each well of a LF48 proteoCHIP or an EVO 96 proteoCHIP (Cellenion) using the cellenONE robot. The LF48 chip was prefilled with 2 µl of hexadecane oil (H6703-100ML, Sigma-Aldrich) per well. ProteoCHIPs were cooled to 8 °C for cell isolation, which freezes the hexadecane (LF48). Cells were isolated as described above following stepwise heating to a final temperature of 50 °C at 85% relative humidity for lysis and digestion. Constant addition of water by the cellenONE robot kept the samples hydrated and at the approximate constant volume. After that, 3.5 µl of 0.1% TFA was added, and the proteoCHIP LF48 was cooled on wet ice to refreeze the hexadecane and manually separated from the sample containing the aqueous phase by transferring the sample to individual wells of a low-binding 96-well PCR plate (EP0030129512, Twin.tec PCR Plate 96 LoBind, skirted). The EVO 96 proteoCHIP did not contain hexadecane and was directly placed on top of a 96-well PCR plate to transfer the sample by centrifugation (500g, 30 s).

Naive hPS cell culturing

Naive PS cells were cultured on gelatin-coated plates with a feeder layer of gamma-irradiated mouse embryonic fibroblasts (MEFs). The coating and feeder layer preparation methods were as previously described³². Cells were cultured in PXGL medium composed of N2B27 basal medium supplemented with 1 µM PD0325901 (MedChemExpress, HY-10254), 1 µM XAV-939 (MedChemExpress, HY-15147), 2 µM Gö 6983 (MedChemExpress, HY-13689) and 10 ng ml^–1 human leukemia inhibitory factor (made in-house). Formulation of the N2B27 basal medium included 50% DMEM/F12, 50% neurobasal medium, N-2 and B-27 supplements, GlutaMAX supplement, nonessential amino acids and 100 µM 2-mercaptoethanol. Cells were maintained in a hypoxic chamber (5% CO₂, 5% O₂) and passaged every 3–4 days. All cell lines were routinely tested negative for mycoplasma.

TE differentiation from naive hPS cells

Naive PS cells were dissociated using Accutase (Biozym, B423201) at 37 °C for 5 min. Gentle mechanical dissociation was performed using a pipette, followed by centrifugation to collect the cell pellet. The pellet was resuspended in PXGL medium supplemented with 10 µM Y-27632 (MedChemExpress, HY-10583). To exclude MEFs, the cell suspension was transferred onto gelatin-coated plates and incubated at 37 °C for 70 min. Cell counting and viability were determined using a Countess automated cell counter (Thermo Fisher Scientific) with trypan blue staining. Cells were seeded at a density of 1.2 × 10⁵ cells per well on Geltrex (0.5 µl cm^–2)-coated six-well plates. Culturing was performed in a hypoxic chamber (5% CO₂, 5% O₂). On induction day 0, the medium was replaced with PDA83 medium comprising N2B27 basal medium, 1 µM PD0325901, 1 µM A83-01 (MedChemExpress, HY-10432) and 2% FBS. On day 1, 2 µM XMU-MP-1 was added. On days 2 and 3, the medium was switched to N2B27 with 2% FBS. On day 4, FACS was conducted using the following protocol.

TE and hPS cell FACS sorting and sample preparation

TE-like cells and naive PS cells were dissociated using Accutase (B423201, Biozym) at 37 °C for 10 min and 5 min, respectively. Gentle mechanical dissociation was performed using a pipette. Cells were then stained with antibodies to TROP2 and SUSD2, respectively. For sorting TE-like cells, TROP2⁺ cells were selected, ensuring exclusion of nondifferentiated cells. Similarly, for naive hPS cell sorting, SUSD2⁺ cells were sorted into six-well plates to exclude MEFs. Under both conditions, dead cells were identified and excluded using DAPI staining. The FACS gating strategy is shown in Supplementary Figs. 14 and 15 for TE and hPS cells. Cells were sorted into 384-well plates (Thermo Scientific Armadillo 45PCR Plate, 384 well, 12657516), which already contained 1 μl of lysis buffer (100 mM TEAB (17902-500ML, Fluka Analytical), 0.2% DDM (D4641-500MG, Sigma-Aldrich), 3 ng µl^–1 trypsin (Trypsin Gold, V5280, Promega) and 0.01% enhancer (ProteaseMAX, V2071, Promega)) in each well. After sorting, well plates were moved to humidity- and temperature-controlled cellenONE chambers. Samples were incubated for 30 min at a humidity of 85% and a temperature of 50 °C. An additional 500 nl of 3 ng μl^–1 trypsin was added. Constant addition of water by the cellenONE robot kept the samples hydrated and at the approximate constant volume. After lysis and digestion, 3.5 μl of 0.1% TFA and 5% DMSO was added to the respective wells for quenching and storage at −20 °C. For LC–MS/MS analysis, samples were directly injected from the 384-well plate.

Two-proteome mixes

HeLa (Thermo Scientific, Pierce HeLa Protein Digest Standard, 88328) and yeast (Promega, MS Compatible Yeast Protein Extract, Digest, Saccharomyces cerevisiae, 100 µg, V7461) were combined in 0.1% TFA at the following ratios and at a concentration of 250 pg µl^–1: HeLa:yeast = 200 pg:50 pg, 240 pg:10 pg and 150 pg:100 pg. One microliter (250 pg) of each two-proteome mix was injected to assess the limits of quantification at ultra-low-input levels.

Diluted bulk digests

HeLa (Thermo Scientific, Pierce HeLa Protein Digest Standard, 88328) or K562 (Promega, Mass Spec-Compatible Human Protein Extract, V6951) cell peptides were dissolved in 0.1% TFA and 0.015% DDM (D4641-500MG, Sigma-Aldrich) at a concentration of 5 ng µl^–1 for injection into the LC–MS system.

LC–MS analysis

Samples were analyzed using the Thermo Scientific Vanquish Neo UHPLC system. Thermo Tune software version 0.4 or higher was used to acquire data.

Peptides were separated on an Aurora Ultimate TS 25-cm nanoflow UHPLC column with an integrated emitter (Ion Optics) at 50 °C using direct injection mode.

Peptide separation was performed at 30–80 SPD with details provided in Supplementary Table 1 and with all single-cell measurements performed at 50 SPD. Fast sample loading was performed at a maximum pressure of 1.4 × 10⁸ Pa and at a maximum flow rate of 1 µl min^–1.

For MS measuring, the Orbitrap Astral mass spectrometer and the Orbitrap Exploris 480 mass spectrometer (both Thermo Scientific) were coupled to the LC. Both instruments were equipped with a FAIMS Pro interface (Thermo Scientific) and an EASY-Spray source. Data recorded for reproducibility checks (Supplementary Fig. 1d) were recorded on a second Orbitrap Astral instrument equipped with a FAIMS Pro Duo interface. A compensation voltage of –48 V (Orbitrap Astral) or –50 V (Orbitrap Exploris 480) was used, if not indicated differently. An electrospray voltage of 1.85 kV was applied for ionization and adopted to slightly higher voltages for aged emitters to ensure spray stability.

On the Orbitrap Astral mass spectrometer, MS¹ spectra were recorded using the Orbitrap analyzer at a resolution of 240,000 from m/z 400 to 900 using an automated gate control (AGC) target of 500% and a maximum injection time of 100 ms. For MS² in DIA mode using the Astral analyzer, nonoverlapping isolation windows ranging from an m/z of 5 for 100 cells and 5- to 10-ng bulks to an m/z of 8 for 5-ng bulks to an m/z of 10 for 700-pg to 1-ng bulks and to an m/z of 20 for 1–40 cells, blanks and 50- to 500-pg bulks. A scan range from an m/z of 400 to 800 was chosen. The precursor accumulation time ranged from 10 ms for 100 cells and 5- to 10-ng bulks to 14 ms for 700-pg to 1-ng bulks to 40 ms for 250- to 500-pg bulks to 60 ms for 100- to 150-pg bulks, 1–40 cells and blanks and to a maximum of 80 ms for 50-pg bulks. The AGC target was set to 800%.

On the Orbitrap Exploris 480 mass spectrometer, the MS method was altered due to the lower speed and sensitivity of that instrument than the Orbitrap Astral mass spectrometer. However, comparable settings for high-throughput analysis as tested previously¹³ were used with the following details: MS¹ spectra were recorded using the Orbitrap analyzer at a resolution of 120,000 from an m/z of 400 to 800 using an AGC target of 300%. For MS² in DIA mode, overlapping (m/z of 1) isolation windows of an m/z of 40 and a scan range from an m/z of 400 to 800 were chosen. The precursor accumulation time was set to a maximum of 118 ms, and the AGC target was set to 1,000%.

Data analysis

All raw data were analyzed using Spectronaut (version 18.6.231227.55695, Biognosys)³⁶. For results marked as ‘method evaluation’, DirectDIA+ was used in method evaluation mode without cross-normalization and with every raw file defined as a separate condition to ensure that Spectronaut treats each file as if they were analyzed individually. ‘DirectDIA+’ indicates that results were analyzed in DirectDIA+ mode with replicates defined as the same condition in the search settings. For library searches, higher-input DIA recorded results (as given) were first analyzed in DirectDIA+ mode, and a library was created from those files using Spectronaut. Quantification was performed at the MS¹ level. Carbamidomethylation of cysteines as a static modification was removed for single-cell searches as no alkylation step was performed. Factory settings were used for all analyses and for library generation unless otherwise indicated. ‘Optimized settings’ indicates that searches were performed with a more stringent cutoff value of 0.01 (for precursor q-value cutoff, precursor posterior error probability (PEP) cutoff, protein q-value cutoff (experiment/run) and protein PEP cutoff) as described by Baker et al.²². Searches were performed against the human proteome (UniProt proteome UP000005640, reviewed, 20,408 protein entries, downloaded 4 August 2023) and the CRAPome³⁷ (118 protein entries). Searches for the two-proteome analysis were run against the same human database and the yeast proteome (UniProt proteome UP000002311, reviewed, 6,727 protein entries, downloaded 11 December 2023).

If not otherwise stated, FDR filtering was based on default settings of Spectronaut and at 1% at the protein level. For FDR checks, a decoy (‘shuffled target’) database was generated using the pyteomics³⁸ Python package in shuffled mode with fixed positions for arginine, proline and lysine to make sure that peptide mass and length distributions were the same as in a target database. A C. elegans (UniProtKB, downloaded 13 June 2024, 27,084 entities) database was also used as an entrapment database. The FDR was estimated using the following equation³⁹:

$$\widehat{\rm{FDR}}=\frac{d(1+1/r)}{t+d},$$

where d is the number of decoy proteins, t represents target proteins, and r represents the ratio between decoy and target databases.

Postanalysis

Human and yeast dataset

Raw quantity values were used. All proteins with missing quantitative values were filtered out.

Single-cell datasets

For the statistical analysis of single-cell data, DirectDIA+ results were used. All proteins with missing values in more than 20 cells for A459 cells and 5 cells for the TE with hPS cell dataset were filtered out. For H460 and A549 cells (and blank files), dataset proteins with 20 or more missing values were filtered out. The rest of the missing values were substituted with intensities equal to the minimum in the dataset. Quantitative data were then transformed to log₂ format and normalized to normal distribution. The cell diameter data for PCA and UMAP clustering were taken from the cellenONE robot. PCA and UMAP clustering were performed using sklearn⁴⁰ (version 1.2.0) and umap⁴¹ (version 0.5.6) Python packages. Cluster mapping of A549 and H460 cells and blanks was performed using the seaborn⁴² (0.12.2) Python package.

To find statistical differences between hPS cells and TE cells, a Student’s t-test for two independent samples was performed (scipy (version 1.11.4) Python package). The Benjamini–Hochberg FDR was used to correct P values for multiple comparisons (statsmodel (v. 0.14.1) Python package). The level of significance for corrected P values was set to 0.05. Proteins with fold change values (TE/hPS cells) greater than 1 were considered upregulated. GO analysis of upregulated proteins was performed using String-db.org⁴³ (version 12.0).

Pseudobulk dataset of TE-like and hPS cells

As a ‘bulk’ dataset, we considered 100-cell runs, both TE-like and hPS cells analyzed together (six runs in total). We filtered out all proteins that were missing in three or more runs. The missing values were substituted with intensities equal to the minimum in the dataset. Quantitative data were then transformed to log₂ format and normalized to normal distribution. GO analysis of bulk data was performed the same way as for single-cell data.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The MS raw proteomics data, Spectronaut search results and fasta files used have been deposited to the ProteomeXchange Consortium via the PRIDE⁴⁴ partner repository with the dataset identifier PXD049412. Source data are provided with this paper.

Code availability

Data analysis of Spectronaut search results and figure plotting is available on GitHub at https://github.com/SimpleNumber/SCP_on_AstralMS and figshare at https://figshare.com/projects/SCP_on_AstralMS/226800 (ref. ⁴⁵).

References

Budnik, B., Levy, E., Harmange, G. & Slavov, N. SCoPE-MS: mass spectrometry of single mammalian cells quantifies proteome heterogeneity during cell differentiation. Genome Biol. 19, 161 (2018).
Article PubMed PubMed Central Google Scholar
Leduc, A., Huffman, R. G., Cantlon, J., Khan, S. & Slavov, N. Exploring functional protein covariation across single cells using nPOP. Genome Biol. 23, 261 (2022).
Article PubMed PubMed Central CAS Google Scholar
Ctortecka, C. et al. An automated nanowell-array workflow for quantitative multiplexed single-cell proteomics sample preparation at high sensitivity. Mol. Cell. Proteomics 22, 100665 (2023).
Article PubMed PubMed Central CAS Google Scholar
Zhu, Y. et al. Proteomic analysis of single mammalian cells enabled by microfluidic nanodroplet sample preparation and ultrasensitive NanoLC–MS. Angew. Chem. Int. Ed. 57, 12370–12374 (2018).
Article CAS Google Scholar
Liang, Y. et al. Fully automated sample processing and analysis workflow for low-input proteome profiling. Anal. Chem. 93, 1658–1666 (2021).
Article PubMed CAS Google Scholar
Johnson, K. R., Gao, Y., Greguš, M. & Ivanov, A. R. On-capillary cell lysis enables top–down proteomic analysis of single mammalian cells by CE–MS/MS. Anal. Chem. 94, 14358–14367 (2022).
Article PubMed PubMed Central CAS Google Scholar
Matzinger, M., Müller, E., Dürnberger, G., Pichler, P. & Mechtler, K. Robust and easy-to-use one-pot workflow for label-free single-cell proteomics. Anal. Chem. 95, 4435–4445 (2023).
Article PubMed PubMed Central CAS Google Scholar
Ye, Z. et al. One-Tip enables comprehensive proteome coverage in minimal cells and single zygotes. Nat. Commun. 15, 2474 (2024).
Article PubMed PubMed Central CAS Google Scholar
Li, Z.-Y. et al. Nanoliter-scale oil–air-droplet chip-based single cell proteomic analysis. Anal. Chem. 90, 5430–5438 (2018).
Article PubMed CAS Google Scholar
Matzinger, M., Mayer, R. L. & Mechtler, K. Label-free single cell proteomics utilizing ultrafast LC and MS instrumentation: a valuable complementary technique to multiplexing. Proteomics 23, 2200162 (2023).
Article PubMed PubMed Central CAS Google Scholar
Guzman, U. H. et al. Narrow-window DIA: ultra-fast quantitative analysis of comprehensive proteomes with high sequencing depth. Preprint at bioRxiv https://doi.org/10.1101/2023.06.02.543374 (2023).
Ye, Z. et al. High-throughput and scalable single cell proteomics identifies over 5000 proteins per cell. Preprint at bioRxiv https://doi.org/10.1101/2023.11.27.568953 (2023).
Zheng, R. et al. A high-sensitivity low-nanoflow LC–MS configuration for high-throughput sample-limited proteomics. Anal. Chem. 95, 18673–18678 (2023).
Article PubMed PubMed Central CAS Google Scholar
Stewart, H. I. et al. Parallelized acquisition of orbitrap and astral analyzers enables high-throughput quantitative analysis. Anal. Chem. 95, 15656–15664 (2023).
Article PubMed PubMed Central CAS Google Scholar
Chadid, M. L. et al. Percentage of blastulation on the number and function of metaphase II oocytes. JBRA Assist. Reprod. 19, 111–113 (2015).
Article PubMed Google Scholar
Hill, M. J. et al. Trophectoderm grade predicts outcomes of single-blastocyst transfers. Fertil. Steril. 99, 1283–1289 (2013).
Article PubMed Google Scholar
Chen, X. et al. Trophectoderm morphology predicts outcomes of pregnancy in vitrified-warmed single-blastocyst transfer cycle in a Chinese population. J. Assist. Reprod. Genet. 31, 1475–1481 (2014).
Article PubMed PubMed Central Google Scholar
Tsai, C.-F. et al. Surfactant-assisted one-pot sample preparation for label-free single-cell proteomics. Commun. Biol. 4, 265 (2021).
Article PubMed PubMed Central CAS Google Scholar
Luo, Q. et al. More sensitive and quantitative proteomic measurements using very low flow rate porous silica monolithic LC columns with electrospray ionization–mass spectrometry. J. Proteome Res. 5, 1091–1097 (2006).
Article PubMed CAS Google Scholar
Emmett, M. R. & Caprioli, R. M. Micro-electrospray mass spectrometry: ultra-high-sensitivity analysis of peptides and proteins. J. Am. Soc. Mass. Spectrom. 5, 605–613 (1994).
Article PubMed CAS Google Scholar
Volpe, P. & Eremenko-Volpe, T. Quantitative studies on cell proteins in suspension cultures. Eur. J. Biochem. 12, 195–200 (1970).
Article PubMed CAS Google Scholar
Baker, C. P., Bruderer, R., Abbott, J., Arthur, J. S. C. & Brenes, A. J. Optimizing Spectronaut search parameters to improve data quality with minimal proteome coverage reductions in DIA analyses of heterogeneous samples. J. Proteome Res. 23, 1926–1936 (2024).
Article PubMed PubMed Central CAS Google Scholar
Sweet, S. et al. The addition of FAIMS increases targeted proteomics sensitivity from FFPE tumor biopsies. Sci. Rep. 12, 13876 (2022).
Article PubMed PubMed Central CAS Google Scholar
Swearingen, K. E. & Moritz, R. L. High field asymmetric waveform ion mobility spectrometry (FAIMS) for mass spectrometry-based proteomics. Expert Rev. Proteom. 9, 505–517 (2012).
Article CAS Google Scholar
Stejskal, K., Op de Beeck, J., Dürnberger, G., Jacobs, P. & Mechtler, K. Ultrasensitive NanoLC–MS of subnanogram protein samples using second generation micropillar array LC technology with orbitrap Exploris 480 and FAIMS PRO. Anal. Chem. 93, 8704–8710 (2021).
Article PubMed PubMed Central CAS Google Scholar
Wang, Z. in Cell-Cycle Synchronization: Methods and Protocols (ed. Wang, Z.) 3–23 (Springer, 2022).
Rao, J. Y. et al. Cellular F-actin levels as a marker for cellular transformation: relationship to cell division and differentiation. Cancer Res. 50, 2215–2220 (1990).
PubMed CAS Google Scholar
Rizzelli, F., Malabarba, M. G., Sigismund, S. & Mapelli, M. The crosstalk between microtubules, actin and membranes shapes cell division. Open Biol. 10, 190314 (2020).
Article PubMed PubMed Central CAS Google Scholar
Zhao, S. et al. NUCKS1 promotes proliferation, invasion and migration of non-small cell lung cancer by upregulating CDK1 expression. Cancer Manag. Res. 12, 13311–13323 (2020).
Article PubMed PubMed Central Google Scholar
Hume, S. et al. The NUCKS1–SKP2–p21/p27 axis controls S phase entry. Nat. Commun. 12, 6959 (2021).
Article PubMed PubMed Central CAS Google Scholar
Uxa, S. et al. Ki-67 gene expression. Cell Death Differ. 28, 3357–3370 (2021).
Article PubMed PubMed Central CAS Google Scholar
Kagawa, H. et al. Human blastoids model blastocyst development and implantation. Nature 601, 600–605 (2022).
Article PubMed CAS Google Scholar
Ralston, A. et al. GATA3 regulates trophoblast development downstream of TEAD4 and in parallel to CDX2. Development 137, 395–403 (2010).
Article PubMed CAS Google Scholar
Gerri, C. et al. Initiation of a conserved trophectoderm program in human, cow and mouse embryos. Nature 587, 443–447 (2020).
Article PubMed PubMed Central CAS Google Scholar
Clark, A. T. et al. Human embryo research, stem cell-derived embryo models and in vitro gametogenesis: considerations leading to the revised ISSCR guidelines. Stem Cell Rep. 16, 1416–1424 (2021).
Article Google Scholar
Bruderer, R. et al. Extending the limits of quantitative proteome profiling with data-independent acquisition and application to acetaminophen-treated three-dimensional liver microtissues. Mol. Cell. Proteomics 14, 1400–1410 (2015).
Article PubMed PubMed Central CAS Google Scholar
Mellacheruvu, D. et al. The CRAPome: a contaminant repository for affinity purification–mass spectrometry data. Nat. Methods 10, 730–736 (2013).
Article PubMed PubMed Central CAS Google Scholar
Levitsky, L. I., Klein, J. A., Ivanov, M. V. & Gorshkov, M. V. Pyteomics 4.0: five years of development of a Python proteomics framework. J. Proteome Res. 18, 709–714 (2019).
Article PubMed CAS Google Scholar
Wen, B. et al. Assessment of false discovery rate control in tandem mass spectrometry analysis using entrapment. Preprint at bioRxiv https://doi.org/10.1101/2024.06.01.596967 (2024).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Google Scholar
Sainburg, T., McInnes, L. & Gentner, T. Q. Parametric UMAP embeddings for representation and semisupervised learning. Neural Comput. 33, 2881–2907 (2021).
PubMed PubMed Central Google Scholar
Waskom, M. L. seaborn: statistical data visualization. J. Open Source Softw. 6, 3021 (2021).
Article Google Scholar
Szklarczyk, D. et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).
Article PubMed CAS Google Scholar
Perez-Riverol, Y. et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res. 47, D442–D450 (2019).
Article PubMed CAS Google Scholar
Bubis, J. SCP_on_AstralMS. figshare https://figshare.com/projects/SCP_on_AstralMS/226800 (2024).

Download references

Acknowledgements

This work was supported by the infrastructure funding fourth call 2022/01 (AT-SCP) of the Austrian Research Promotion Agency (FFG) and the project LS20-079 of the Vienna Science and Technology Fund (WWTF). This work was further funded by the P35045-B project (grant DOI 10.55776/P35045) of the Austrian Science Fund (FWF). This research was funded in whole, or in part, by the FWF. J.A.B.’s work was funded by FWF (grant DOI 10.55776/ESP497). We thank the laboratory of J. Penninger at Institute of Molecular Biotechnology for providing A549 cells. For the purpose of open access, we have applied a CC BY public copyright license to this publication.

Author information

These authors contributed equally: Julia A. Bubis, Manuel Matzinger.

Authors and Affiliations

Research Institute of Molecular Pathology (IMP), Vienna BioCenter, Vienna, Austria
Julia A. Bubis, Peter Pichler, Karl Mechtler & Manuel Matzinger
Thermo Fisher Scientific, Bremen, Germany
Tabiwang N. Arrey, Eugen Damoc & Bernard Delanghe
Institute of Molecular Biotechnology (IMBA), Austrian Academy of Sciences, Vienna BioCenter, Vienna, Austria
Jana Slovakova, Theresa M. Sommer, Harunobu Kagawa, Nicolas Rivron & Karl Mechtler
Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
Theresa M. Sommer
Gregor Mendel Institute of Molecular Plant Biology (GMI), Austrian Academy of Sciences, Vienna BioCenter, Vienna, Austria
Karl Mechtler

Authors

Julia A. Bubis
View author publications
Search author on:PubMed Google Scholar
Tabiwang N. Arrey
View author publications
Search author on:PubMed Google Scholar
Eugen Damoc
View author publications
Search author on:PubMed Google Scholar
Bernard Delanghe
View author publications
Search author on:PubMed Google Scholar
Jana Slovakova
View author publications
Search author on:PubMed Google Scholar
Theresa M. Sommer
View author publications
Search author on:PubMed Google Scholar
Harunobu Kagawa
View author publications
Search author on:PubMed Google Scholar
Peter Pichler
View author publications
Search author on:PubMed Google Scholar
Nicolas Rivron
View author publications
Search author on:PubMed Google Scholar
Karl Mechtler
View author publications
Search author on:PubMed Google Scholar
Manuel Matzinger
View author publications
Search author on:PubMed Google Scholar

Contributions

J.A.B. and M.M. conceptualized the study, designed and performed experiments, analyzed the data and wrote the paper. T.N.A., E.D. and B.D. performed experiments and designed experimental settings. J.S. and H.K. prepared TE and hPS cells. J.S., H.K., T.M.S. and N.R. interpreted TE and hPS cell data. P.P. contributed to the data analysis strategy. K.M. conceptualized the study and performed data analysis. All authors revised and agreed on the paper.

Corresponding authors

Correspondence to Julia A. Bubis, Karl Mechtler or Manuel Matzinger.

Ethics declarations

Competing interests

T.N.A., E.D. and B.D. are employees of Thermo Fisher Scientific. The other authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks Vadim Demichev and Catherine Wong for their contribution to the peer review of this work. Primary Handling Editor: Arunima Singh, in collaboration with the Nature Methods team. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 PCA & UMAP of A549 cells with relative abundancies of selected cell-cycle specific proteins color coded.

a–n, Individual A549 cells were prepared using the One-Pot workflow. Digested cells were analyzed at 50 SPD using the previously optimized settings for LC and MS on the Orbitrap Astral MS. The color scale bars depict the log2-transformed protein abundance. PCA (a-g) and UMAP (h-n) are based on protein quantities with each dot representing a cell and colors reflecting the relative abundance of the protein given.

Source data

Extended Data Fig. 2 PCA & UMAP of hPSC & TE cells with relative abundancies of selected cell-type specific proteins color coded.

a–p, Individual hPSC and TE cells were prepared in a 384-well plate using the FACS. Digested cells were analyzed at 50 SPD using the previously optimized settings for LC and MS on the Orbitrap Astral MS. PCA on protein quantities with each dot representing a cell and colors reflecting the relative abundance of the protein given. The color bar corresponds to log₂-transformed protein abundance. Selected markers of TE cells (a-l) and hPSC (m-p).

Source data

Supplementary information

Supplementary Information

Supplementary Notes, Table 1 and Figs. 1–15.

Reporting Summary

Peer Review File

Source data

Source Data Fig. 1

Original numbers as received from the data processing software and used to generate all plots shown in Fig. 1a–k.

Source Data Fig. 2

Original numbers as received from the data processing software and used to generate all plots shown in Fig. 2a–f.

Source Data Fig. 3

Obtained fold change and CV values used to generate plots in Fig. 3a–f.

Source Data Fig. 4

Original numbers as received from the data processing software and cell diameters as received from the cellenONE robot that were used to generate all plots in Fig. 4a–f.

Source Data Fig. 5

Original numbers as received from the data processing software and used to generate all plots shown in Fig. 5a–c.

Source Data Extended Data Fig. 1

Original numbers as received from data processing and used to generate all plots in Extended Data Fig. 1.

Source Data Extended Data Fig. 2

Original numbers as received from data processing and used to generate all plots in Extended Data Fig. 2.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Bubis, J.A., Arrey, T.N., Damoc, E. et al. Challenging the Astral mass analyzer to quantify up to 5,300 proteins per single cell at unseen accuracy to uncover cellular heterogeneity. Nat Methods 22, 510–519 (2025). https://doi.org/10.1038/s41592-024-02559-1

Download citation

Received: 22 February 2024
Accepted: 06 November 2024
Published: 16 January 2025
Version of record: 16 January 2025
Issue date: March 2025
DOI: https://doi.org/10.1038/s41592-024-02559-1

This article is cited by

Single-cell proteomics of Arabidopsis leaf mesophyll reveals dynamic protein responses to water-deficit stress
- James M. Fulcher
- Pranav Dawar
- Ying Zhu
Genome Biology (2026)
Year in review 2025

Nature Methods (2026)
Towards unmanned proteomics data generation: a fully automated sample-to-data system for proteomic experiments
- Dongxue Wang
- Wendong Chen
- Fuchu He
Cell Discovery (2025)
Entering the era of deep single-cell proteomics
- Ludwig R. Sinn
- Vadim Demichev
Nature Methods (2025)
Breaking barriers in crosslinking mass spectrometry with enhanced throughput and sensitivity using Orbitrap Astral
- Fränze Müller
- Micha J. Birklbauer
- Karl Mechtler
Nature Communications (2025)

Subjects

Abstract

Similar content being viewed by others

Main

Results

Reaching unprecedented proteomic coverage

Validation of the FDR strategy

The FAIMS Pro interface improves signal to noise, CV and coverage

Quantitative performance

Proof-of-principle studies using single cells

Heterogeneity of two human non-small cell lung cancer models

Size: cell cycle dependency of A549 cells

Characterization of human hPS cells and TE-like derivatives

Comparison with pseudobulk data (100 cells)

Discussion

Methods

Ethical approvals

Cultivation of A459, H460 and HeLa cells

Sample preparation of A459, H460 and HeLa cells

Naive hPS cell culturing

TE differentiation from naive hPS cells

TE and hPS cell FACS sorting and sample preparation

Two-proteome mixes

Diluted bulk digests

LC–MS analysis

Data analysis

Postanalysis

Human and yeast dataset

Single-cell datasets

Pseudobulk dataset of TE-like and hPS cells

Reporting summary

Data availability

Code availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Peer review

Peer review information

Additional information

Extended data

Supplementary information

Source data

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Search

Quick links