Background & Summary

Neutrophils are the most abundant white blood cells in human blood and are crucial for the early immune defense against infections1. Upon activation, neutrophils utilize various mechanisms to combat pathogens, including degranulation, phagocytosis, and the release of neutrophil extracellular traps (NETs)2. NETs, first described by Brinkmann et al. in 2004, are extracellular web-like DNA structures composed of decondensed chromatin and antimicrobial proteins that trap and kill pathogens3,4. The process of NET formation, known as NETosis, has been shown to play a vital role in innate immunity5,6.

Recent studies have demonstrated that diverse stimuli, such as bacteria, fungi, and viruses, induce NET formation through distinct mechanisms. Early research identified only a limited set of proteins in NETs induced by phorbol 12-myristate 13-acetate (PMA), a general immune stimulator7. However, advances in proteomics have enabled a more comprehensive understanding of the NET proteome. Notably, NETs formed in response to different stimuli exhibit distinct proteomic profiles, as seen with LPS- and PMA-induced NETs8. The nature of the stimulus is the primary determinant of NET protein composition, although disease-related cellular properties may also have some subtle influence on their composition9,10. However, the specific proteins involved and their roles remain unclear.

To comprehensively analyze the proteomic profiles of NETs formed under various stimuli, Data-Independent Acquisition (DIA) mass spectrometry is a promising approach. DIA offers several advantages over traditional Data-Dependent Acquisition (DDA), including higher throughput, better quantitative accuracy, and greater reproducibility11. DIA collects data from all ions within predefined m/z windows, providing a more comprehensive and unbiased analysis of complex samples12,13. These features make DIA especially suited to large-scale studies, such as those characterizing the dynamic proteomes involved in NET formation under various stimuli. The technique allows precise identification and quantification of proteins, as demonstrated in the present study.

In this study, we induced NET formation in neutrophils using five different stimuli: β-glucan (BG), lipopolysaccharide (LPS), polyinosinic-polycytidylic acid (Poly(I:C)), resiquimod (R848), and severe fever with thrombocytopenia syndrome bunyavirus (SFTSV). BG and LPS elicit canonical responses to fungal and bacterial infections, respectively; Poly(I:C) and SFTSV model broad-spectrum viral challenge versus authentic infection by highly pathogenic viruses; and R848, a high-affinity synthetic TLR7 agonist, mimics viral single-stranded RNA to elucidate TLR7-dependent signaling pathways14,15,16,17. Collectively, these agents represent defined pathogen-associated molecular patterns that activate intrinsic innate immune responses. Using DIA mass spectrometry, we identified 5,868 NET-associated proteins. Significant differences in protein composition were observed across different stimuli, reflecting the diversity of immune responses involved in NET formation. This study provides a comprehensive NET proteome dataset that enhances our understanding of the protein composition of NETs in response to various pathogens. By comparing the proteomic profiles of NETs induced by different stimuli, we identified key proteins involved in immune responses. These findings may serve as a basis for exploring immune evasion strategies and potential therapeutic targets in immune-related diseases.

Although the study does not investigate the specific biological functions of individual proteins, the presented dataset offers valuable insights into how different stimuli impact the NET proteome. These findings can guide future studies on the role of NETs in pathogen clearance and immune regulation.

Methods

Animals

Male C57BL/6J mice (6–8 weeks old) were obtained from Beijing HuaFuKang Biotechnology Co., Ltd. All animal experiments in this study were approved by the Animal Care Research Ethics Committee of the Capital Medical University (AEEI-2023-302) and conducted following their ethical guidelines.

Neutrophil isolation and culture

In this study, mouse neutrophils were used to circumvent the individual variability associated with human sample collection and to ensure sufficient cell numbers and high purity. While differences exist between human and mouse neutrophils, this model remains well suited to the experimental objectives of this study18,19. Neutrophils were isolated from the bone marrow of C57BL/6J mice by magnetic bead separation (MACS® Separation Buffer and Anti-Ly6G MicroBeads Ultrapure; Miltenyi Biotec; Bergisch Gladbach, Germany). Each experimental group included three independent biological replicates. To ensure data reliability and reproducibility, neutrophils were isolated from different mice for each experiment. The neutrophils were cultured in RPMI-1640 medium supplemented with 5% heat-inactivated fetal bovine serum (FBS) at a density of 6.5 × 10^6 cells per well in a 6-well plate.

In vitro NET induction and protein extraction

Neutrophils were stimulated with BG (50 μg/mL), LPS (100 ng/mL), Poly(I:C) (50 μM), R848 (100 ng/mL), or SFTSV (MOI = 1) for 3 h at 37 °C in a 5% CO2 incubator, according to previous reports9,20,21. After stimulation, the cells were treated with 5 U/mL DNase-I and incubated for 20 min at 37 °C. DNase-I digestion was stopped with 5 mM EDTA. The samples were centrifuged to remove intact cells and debris, followed by protein precipitation using acetone at −20 °C. The precipitated proteins were stored at −20 °C overnight7. Note that acetone precipitation can introduce mass-spectrometric interference, chromatographic shifts, and quantification errors, because residual acetone modifies Gly-X peptides; this affects ~5% of peptides and reduces reproducibility22.

Sample preparation for mass spectrometry

The precipitated proteins were resuspended in 200 μL of protein lysis buffer (8 M urea, protease inhibitors), followed by reduction with 10 mM dithiothreitol (DTT) at 56 °C for 1 h. Alkylation was performed with 55 mM iodoacetamide (IAM) at room temperature for 45 min in the dark. Protein concentration was measured using the Bradford method. Proteins were digested overnight using filter-aided sample preparation (FASP). The resulting peptides were vacuum-dried and resuspended in 10 μL of 0.1% formic acid in water.

Mass Spectrometry Detection

Mass spectrometry analysis was performed using the FAIMS Pro interface coupled with the Orbitrap Exploris 480 mass spectrometer (Thermo Scientific). Peptides were subjected to LC-MS/MS detection. The samples were loaded onto the pre-column at a flow rate of 3 μL/min using the EASY-nLC 1200 chromatography system (Thermo Scientific) and then separated on the analytical column at a flow rate of 300 nL/min. The spray voltage of the ion source was set to 2.2 kV. FAIMS Pro used two compensation voltages, −45 V and −65 V. The primary (MS1) scan range was 398–902 m/z with a resolution of 60,000 (at m/z 200), an AGC target of 3e6, and a maximum injection time (IT) of 50 ms. The secondary (MS2) scan used DIA mode with 25 windows per voltage, a window overlap of 0.5 m/z, an AGC target of 1e6, automatic maximum IT, and MS2 activation by HCD (collision energy: 30).

MS quantitative analysis

MS quantitative analysis was performed using DIA-NN v1.8.1, wherein a spectral library was constructed from DIA data and the UniProt mouse proteome database (22,354 reviewed and unreviewed sequences). Peptide feature matching was refined via nonlinear gradient retention-time calibration. Fragment-ion co-elution peaks were then extracted from raw DIA data and assembled into spectra based on fragment intensity correlation (Pearson’s r > 0.8), spectral similarity (dot product > 0.7), and mass accuracy within a preset precursor tolerance (±10 ppm), all within a dynamic retention-time window (±0.5 min, RMS < 5 ppm). These criteria were integrated into a multidimensional scoring matrix, and an iterative linear classifier (200 iterations, learning rate = 0.05) was employed to select optimal elution peaks, ensuring precise peptide and protein characterization and quantification. Finally, stringent quality control filters (peptide Q-value ≤ 0.01; protein-group Q-value ≤ 0.01) were applied to guarantee the statistical robustness of the identification results.
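The final filtering step can be illustrated with a minimal Python sketch. It assumes the search results are available as a DIA-NN main report in tab-separated form, and that the peptide-level and protein-group-level Q-values correspond to the `Q.Value` and `PG.Q.Value` columns of that report; this column mapping, the toy rows, and the helper names are illustrative assumptions, not a description of the exact pipeline used for this dataset.

```python
import csv
import io

Q_CUTOFF = 0.01  # 1% FDR threshold stated in the text


def filter_report(rows, q_cutoff=Q_CUTOFF):
    """Keep rows that pass both the peptide-level and protein-group-level cutoff."""
    return [
        row for row in rows
        if float(row["Q.Value"]) <= q_cutoff
        and float(row["PG.Q.Value"]) <= q_cutoff
    ]


def count_identifications(rows):
    """Return (number of protein groups, number of distinct peptide sequences)."""
    proteins = {row["Protein.Group"] for row in rows}
    peptides = {row["Stripped.Sequence"] for row in rows}
    return len(proteins), len(peptides)


# Toy report with hypothetical entries (not taken from the deposited data).
EXAMPLE_TSV = (
    "Run\tProtein.Group\tStripped.Sequence\tQ.Value\tPG.Q.Value\n"
    "CTR_1\tP1\tAAAK\t0.001\t0.002\n"
    "CTR_1\tP1\tCCCK\t0.02\t0.002\n"
    "CTR_1\tP2\tDDDK\t0.005\t0.03\n"
    "BG_1\tP3\tEEEK\t0.01\t0.01\n"
)

rows = list(csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t"))
passing = filter_report(rows)
n_proteins, n_peptides = count_identifications(passing)
```

In this toy input, the second row fails the peptide-level cutoff and the third fails the protein-group cutoff, so only two rows are retained.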

Data Records

The raw mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE20,23 partner repository with the dataset identifier PXD060569 (ref. 24). A total of 21 data files were uploaded for this experiment, comprising 3 SEARCH format files and 18 RAW format files. For consistent file naming, control group samples carry the prefix ‘CTR’, while experimental group samples are named after their respective stimuli.

Technical Validation

Quality evaluation of mass spectrometry data

DIA proteomic analysis was performed on six groups: control (CTR), BG, LPS, Poly(I:C), R848, and SFTSV, with three biological replicates per group and rigorous quality control. We assessed the average number of proteins and peptides identified in each sample group to ensure comprehensive coverage of the NET proteome (Fig. 1a,b). In total, we identified 5,868 proteins and 51,798 peptides across all groups at a 1% false discovery rate (FDR). A total of 3,590 proteins were identified in all collected samples, accounting for 61.2% of the 5,868 identified proteins (Fig. 1c). Other proteins were detected in only one or a few samples; they are still counted as identified, but some have less than 10% coverage across samples. This result indicates that the protein composition of NETs varies under different stimuli, consistent with the findings of Petretto et al.8, which supports the credibility of this dataset. Additionally, the peptide distribution across samples was consistent, with the majority of proteins identified by ≥3 peptides (72.3%), supporting the reliability of protein quantification (Fig. 1d). Most proteins were identified with multiple peptides across samples, indicating that they were detected stably in different samples.
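The coverage figure quoted above can be reproduced with a short Python sketch. It assumes a protein-by-sample intensity table in which a missing value is stored as `None`, and it reads "identified in all collected samples" as "no missing value in any sample"; the table layout, the toy values, and the function name are hypothetical.

```python
def completeness(matrix):
    """Count proteins quantified in every sample.

    matrix maps a protein ID to a list of per-sample intensities,
    with None marking a missing value.
    Returns (complete proteins, total proteins, percentage complete).
    """
    total = len(matrix)
    complete = sum(
        1 for values in matrix.values()
        if all(v is not None for v in values)
    )
    return complete, total, 100.0 * complete / total


# Toy table: 4 proteins x 3 samples (hypothetical intensities).
toy = {
    "P1": [1.0e6, 1.2e6, 0.9e6],
    "P2": [5.0e4, None, 4.0e4],
    "P3": [None, None, 2.0e5],
    "P4": [3.0e5, 3.1e5, 2.9e5],
}
complete, total, pct = completeness(toy)  # 2 of 4 proteins are complete

# Check of the percentage reported in the text: 3,590 of 5,868 proteins.
reported_pct = round(100.0 * 3590 / 5868, 1)
```

Applied to the reported counts, the same ratio gives 61.2%, matching the value stated for Fig. 1c.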

Fig. 1

Quality evaluation of mass spectrometry data. (a) The average number of protein groups quantified in each set of samples at 1% FDR. (b) The average total number of peptides identified in each set of samples. (c) Coverage of protein groups in each sample, with an analysis of missing values. (d) Variance analysis of all quantified proteins, showing the distribution of proteins quantified with ≥3 peptides, 2 peptides, and only 1 peptide. (e) Distribution of identified peptide lengths. (f) The relationship between peptide length and its corresponding mass-to-charge ratio (m/z). The majority of peptides fall within the expected range, indicating high-quality mass spectrometry data.

The peptide length distribution indicated that most peptides were within the expected range of 7–20 amino acids, a characteristic consistent with typical enzymatic digestion processes (Fig. 1e). The mass-to-charge (m/z) ratio distribution of the peptides further demonstrated that the data were of high quality, with a significant proportion of peptides exhibiting a charge of +2, which is typical for LC-MS/MS analyses (Fig. 1f).
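The link between peptide length, charge state, and m/z can be made concrete with a small Python sketch. It uses standard monoisotopic residue masses and assumes that every cysteine carries the carbamidomethyl group introduced by IAM alkylation; the function names and the example sequence are illustrative only, and the 398–902 m/z limits are the MS1 scan range given in the Methods.

```python
# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056            # added once per peptide (N- and C-terminus)
PROTON = 1.007276           # mass of one charge-carrying proton
CARBAMIDOMETHYL = 57.02146  # assumed fixed modification on cysteine (IAM)


def peptide_mz(sequence, charge, alkylated_cys=True):
    """Monoisotopic m/z of a peptide ion carrying `charge` protons."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    if alkylated_cys:
        mass += CARBAMIDOMETHYL * sequence.count("C")
    return (mass + charge * PROTON) / charge


def in_scan_range(mz, low=398.0, high=902.0):
    """True if the ion falls inside the MS1 scan range from the Methods."""
    return low <= mz <= high


mz_2plus = peptide_mz("PEPTIDE", 2)  # 7-residue example peptide, charge +2
mz_3plus = peptide_mz("PEPTIDE", 3)  # same peptide, charge +3
```

For this 7-residue example, the doubly charged ion lies just inside the scan range while the triply charged ion falls below it, which illustrates why short tryptic peptides are mostly observed at charge +2.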

Data reproducibility and consistency

Reproducibility across biological replicates was evaluated by analyzing the distribution of protein intensity (MaxLFQ) in the six groups. The box plots revealed that protein intensities within each group were highly consistent, confirming the reliability of the experimental replicates (Fig. 2a). We also compared the averaged protein intensities across sample groups: levels were relatively uniform across treatment conditions, with significant upregulation exclusively in the SFTSV group (Fig. 2b). We hypothesize that the other stimuli elicit limited responses via single receptor-mediated signaling pathways, whereas infection with live SFTSV may induce a more robust host immune reaction through the activation of a broader receptor network.

Fig. 2

Reproducibility within the replicates. (a) Distribution of protein abundances (MaxLFQ) in each acquired sample group. The box plot illustrates consistency in protein intensities within each group, indicating good reproducibility. (b) Averaged protein intensity across all samples in each treatment group. Variations between groups reflect different immune responses triggered by each stimulus. (c,d) PCA plot of the complete proteomic dataset. Samples from the same treatment group cluster together, indicating minimal variation within groups and significant differences between groups. (e) Correlation plot showing the relationships between biological replicates. The correlation heatmap (generated using ChiPlot, https://www.chiplot.online/) reveals high intra-group reproducibility and distinct inter-group differences.

To assess overall sample consistency and identify any significant differences between conditions, principal component analysis (PCA) was performed. The PCA results demonstrated excellent reproducibility, with samples from the same treatment group clustering closely together and clearly separating from other groups (Fig. 2c). To better visualize the separation among four of the groups (CTR, Poly(I:C), R848, and LPS), we plotted these four groups separately within the original six-group PCA framework. The results revealed a certain degree of overlap among them in the principal component space, providing a more detailed perspective on their distribution (Fig. 2d). Overall, these results suggest minimal variation within each group and significant proteomic differences between stimulus conditions.

Furthermore, correlation analysis was conducted to measure the biological consistency within each group and the degree of difference between groups (Fig. 2e). The high Pearson correlation coefficients (R) within each group indicated that the proteins identified in the biological replicates were highly reproducible, while the correlation between the groups demonstrated the expected differences in protein profiles based on the stimulus.
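A replicate-to-replicate correlation of this kind can be sketched in a few lines of Python. The sketch computes Pearson's r on log2-transformed intensities, using only proteins quantified in both replicates; the log2 transform, the handling of missing values, and the toy intensities are assumptions made for illustration, since the text does not specify them.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def replicate_correlation(rep_a, rep_b):
    """Pearson r on log2 intensities of proteins quantified in both replicates.

    Missing values (None or 0) are dropped pairwise before the transform.
    """
    pairs = [
        (math.log2(a), math.log2(b))
        for a, b in zip(rep_a, rep_b)
        if a and b
    ]
    xs, ys = zip(*pairs)
    return pearson_r(xs, ys)


# Toy replicates (hypothetical intensities); the third protein is missing in rep_1.
rep_1 = [1.0e5, 2.0e5, None, 4.0e5, 8.0e5]
rep_2 = [2.0e5, 4.0e5, 3.0e5, 8.0e5, 1.6e6]
r_within = replicate_correlation(rep_1, rep_2)
```

Because the second toy replicate is a constant multiple of the first, the log2 intensities differ only by an offset and the correlation is essentially 1; real replicates would give values below that.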

Protein differential expression analysis

Differentially expressed proteins (DEPs) were identified through a t-test with a significance threshold of p < 0.05 and a fold change ≥1.5 (upregulated) or ≤0.66667 (downregulated). In total, 3,883 DEPs were detected across all the comparisons. Specifically, 1,532 DEPs were identified in the BG vs. CTR comparison, with 247 upregulated and 1,285 downregulated. In the LPS vs. CTR group, 124 DEPs were detected, including 71 upregulated and 53 downregulated. The Poly(I:C) vs. CTR analysis revealed 267 DEPs, with 112 showing increased expression and 155 decreased. Similarly, the R848 vs. CTR comparison identified 237 DEPs, consisting of 115 upregulated and 122 downregulated proteins. Notably, the SFTSV vs. CTR comparison exhibited the highest number of DEPs, totaling 1,747 with 863 upregulated and 884 downregulated (Fig. 3). These data clearly demonstrated the impact of different stimuli on the NET proteome.
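The DEP criteria can be written as a small decision rule. The Python sketch below takes a fold change and a t-test p-value that have already been computed for each protein and applies the thresholds stated above; how the fold change is calculated (for example, as a ratio of group means) is not specified in the text, and the function names and toy values are hypothetical.

```python
P_CUTOFF = 0.05    # significance threshold from the text
UP_FC = 1.5        # fold change for upregulation
DOWN_FC = 0.66667  # fold change for downregulation (approximately 1 / 1.5)


def classify_dep(fold_change, p_value):
    """Return 'up', 'down', or 'ns' (not significant) for one protein."""
    if p_value >= P_CUTOFF:
        return "ns"
    if fold_change >= UP_FC:
        return "up"
    if fold_change <= DOWN_FC:
        return "down"
    return "ns"


def tally(results):
    """Count up- and downregulated proteins in a list of (fold change, p) pairs."""
    counts = {"up": 0, "down": 0, "ns": 0}
    for fold_change, p_value in results:
        counts[classify_dep(fold_change, p_value)] += 1
    return counts


# Toy comparison with hypothetical (fold change, p-value) pairs.
toy_results = [(2.1, 0.003), (0.40, 0.010), (1.2, 0.001), (3.0, 0.200), (0.66, 0.049)]
toy_counts = tally(toy_results)  # 1 up, 2 down, 2 not significant
```

Note that a protein must pass both criteria: a large fold change with p ≥ 0.05, or a small p-value with a fold change between the two cutoffs, is not counted as a DEP.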

Fig. 3

Protein differential expression analysis. Volcano plots (upper panels) and hierarchical clustering heatmaps (lower panels) showing differentially expressed proteins in the following comparisons: (a) BG_vs_CTR, (b) LPS_vs_CTR, (c) PolyIC_vs_CTR, (d) R848_vs_CTR, (e) SFTSV_vs_CTR. Notes: The volcano plots (upper) display each protein as a point, with differential expression defined by p < 0.05 and fold changes ≥1.5 (upregulated) or ≤0.66667 (downregulated). The hierarchical clustering maps (lower) represent proteins (rows) and samples (columns), with color gradients indicating the level of protein expression. Proteins with distinct expression patterns across different treatment conditions are clearly visible.

Hierarchical clustering of these DEPs across the six groups showed clear patterns of protein expression, reflecting the impact of different stimuli on the NET proteome (Fig. 3). The heatmaps and volcano plots (Fig. 3a–e) provide a visual representation of protein abundance changes and allow easy identification of significantly upregulated or downregulated proteins in each comparison. Detailed data are provided in Table S1.

We conducted enrichment analysis of the differential proteins in this dataset to examine changes in the biological functions and cellular localization of NET protein components under varying stimulation conditions. Specifically, we analyzed the Gene Ontology (GO) enrichment results for five comparisons (BG vs. CTR, LPS vs. CTR, Poly(I:C) vs. CTR, R848 vs. CTR, and SFTSV vs. CTR) across three dimensions: biological processes, cellular components, and molecular functions (Fig. 4). To further refine the analysis, we counted the number of proteins associated with each GO category, highlighting distinct functional regulatory patterns of NET components across different stimuli. These findings provide a valuable reference for the systematic investigation of NET functional mechanisms and offer deeper insights into their roles in metabolic pathways, molecular functions, and intracellular localization.

Fig. 4

Functional classification analysis of differential proteins. GO enrichment analysis of differentially expressed proteins: (a) BG_vs_CTR, (b) LPS_vs_CTR, (c) PolyIC_vs_CTR, (d) R848_vs_CTR, (e) SFTSV_vs_CTR. Bar plots of GO enrichment for biological processes (red), cellular components (green), and molecular functions (blue). The X-axis represents the number of enriched proteins per GO term, and the Y-axis lists significantly enriched terms (ranked by adjusted p-value).

Together, these analyses indicate that the proteomic dataset is of high quality, with robust reproducibility and reliable DEP identification and enrichment analyses, supporting its use in further functional and comparative analyses.