Introduction

Primary central nervous system lymphoma (PCNSL) is an uncommon extranodal lymphoma. Because its pathogenesis is unknown, no set of clinical indicators exists for early screening and warning, which contributes to its high mortality and recurrence rates1. At present, early clinical diagnosis relies mainly on imaging methods such as computed tomography (CT) and magnetic resonance imaging (MRI), but each method has its limitations2. First, owing to their high cellularity, lesions usually appear isodense to hyperdense and can be mistaken for blood on a CT scan. Second, lympholytic corticosteroids can mask the pathological findings on MRI scans. In addition, most tumors are deep or multifocal, and a stereotactic biopsy, which is difficult and complicated, is usually required to obtain tissue3. Therefore, developing specific and highly sensitive techniques for early diagnosis is an effective route to early intervention in PCNSL. Notably, determining the clinical relevance of biomarkers and therapeutic targets associated with lymphoma metabolism remains challenging. However, screening metabolites and related proteins in serum with combined multi-omics technology is a favorable tool for finding lymphoma-specific biomarkers4. In particular, amino metabolomics can not only identify lymphoma biomarkers effectively but also comprehensively reveal tumor-related metabolic network disorders, contributing to a deeper understanding of the pathogenesis of lymphoma5. For instance, down-regulating serine and glycine levels can inhibit the growth and proliferation of cancer cells and slow tumor growth6. In addition, arginine regulates various cellular functions by forming hydrogen bonds with proteins and enhancing contact with target proteins, with relevance to lymphoma treatment7. L-arginine, D-lysine, L-aspartic acid and D-glutamic acid can promote tumor growth, whereas D-aspartic acid, L-glutamic acid, D-arginine and L-lysine inhibit it8.
In previous studies, we found that the expression of tumor protein 53 in PCNSL cells decreased as cysteine and homocysteine increased (p < 0.05). Notably, these metabolites are required to activate the target of rapamycin, which regulates protein translation, tumor cell growth and autophagy. Thus, monitoring the levels of amino metabolites and related proteins with high sensitivity and selectivity is of great clinical and biological significance for the prevention and treatment of PCNSL.

Some amino metabolites have no chromophore, but their amino groups can react selectively with a labeling probe to generate stable derivatives with strong detection signals. This improves detection sensitivity in capillary electrophoresis and high-performance liquid chromatography (HPLC) coupled to ultraviolet9, fluorescence10 and mass spectrometry (MS)11 detection. Among these methods, given the complex matrix of cell and serum samples and the need for simultaneous determination of multiple amino metabolites with good selectivity and sensitivity, the preferred approach is derivatization of the amino group with a labeling probe, followed by separation and MS detection. Furthermore, high-resolution MS (HRMS), such as triple time-of-flight (Triple TOF) and Orbitrap Eclipse Tribrid mass spectrometry, offers unparalleled overall performance, greatly improving sensitivity, resolution and scanning speed (45 Hz; resolution of 500,000 FWHM at m/z 200)12. The mass range extends to m/z 8,000, and MSn experiments can be carried out simultaneously on amino metabolites, protein complexes and their components (including subunits and noncovalently bound species). In addition, a high-field asymmetric waveform ion mobility spectrometry (FAIMS) Pro interface can be fitted to significantly improve proteomic coverage. Finally, the SureQuant quantitative workflow allows amino metabolites and proteins to be quantified with ultra-high sensitivity13.

At present, chemical labeling is a feasible and accepted approach for determining amino metabolites with fluorescence, ultraviolet and MS detection. The most commonly used labeling reagents are phenyl isothiocyanate (PITC), 9-fluorenylmethoxycarbonyl chloride (Fmoc-Cl), 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC), dansyl chloride (Dansyl-Cl), o-phthalaldehyde (OPA) and 2,4-dinitrofluorobenzene (DNFB)14, but each reagent has its limitations. OPA labels amino groups quickly and is easily automated online, but it cannot react with secondary amino metabolites15. PITC may remain on the analytical column and shorten its life16. DNFB is explosive, highly toxic and strongly carcinogenic, which limits its application17. Post-column derivatization with ninhydrin can detect most amino metabolites, but its sensitivity is limited18. Dongri Jin and Toshimasa Toyo’oka reported a fluorescent tagging reagent, 4-(3-isothiocyanatopyrrolidin-1-yl)-7-(N,N-dimethylaminosulfonyl)-2,1,3-benzoxadiazole (DBD-PyNCS), which was applied to the determination of cysteine and homocysteine in plasma and urine with an LOD of 0.4–2.4 pmol19. Ravi Bhushan and Rituraj Dubey used (S)-naproxen-benzotriazole to label cysteine and homocysteine for HPLC-UV, with an LOD of 0.0001–0.0015 nmol20. Sano et al. established an HPLC-FL method for the simultaneous quantification of cysteine and homocysteine based on a combination of OPA and L-amino metabolites, with an LOD of approximately a few picomoles21. Ito et al. used an HPLC-UV method to detect cysteine and homocysteine after labeling with 2,3,4,6-tetra-O-acetyl-β-D-glucopyranosyl isothiocyanate (GITC), with an LOD of 5 ng22. Liu Ping et al. reported a novel isotope labeling strategy (ω-bromoacetonylquinolinium-d0/d7-bromide, d0/d7-BQB) combined with high-performance liquid chromatography-double precursor ion scan mass spectrometry for nontargeted profiling of amino-thiols, with an LOD of 0.012–5.9 nM23,24. In previous studies, our research group developed a novel MS probe, NCS-OTPP, built on triphenylphosphine (TPP) carrying a permanent positive charge as the basic structure, and used it to separate and determine GSH, Cys and Hcy with an LOD of 2.40–7.20 fmol25. However, reagents such as DBD-PyNCS and GITC were developed for FL or UV detection rather than MS, and they cannot target amino groups for labeling in the absence of standards, which limits their use for trace analysis in complex biological samples. Therefore, more MS probes still need to be developed to address the current situation (Table S1).

Given this, a novel, highly selective mass spectrometry probe, (3-bromopropyl)triphenylphosphonium (3-BMP), was used to label amino groups in a targeted manner, and a highly sensitive UHPLC-HRMS method was established for the simultaneous quantification of 20 amino metabolites in human cells and serum. The method was then applied to distinguish PCNSL patients from healthy volunteers (HV) using principal component analysis (PCA). Further, upstream proteins were traced to complete the picture of the amino metabolism pathway. Ultimately, a predictive classification model for monitoring PCNSL was developed with machine learning algorithms. It supports in-depth exploration of the relationship between these markers and pathogenesis, and is helpful for early diagnosis and warning while patients are still in a disease-free state (Fig. 1).

Fig. 1

Intelligent monitoring model of PCNSL based on amino-associated protein combined with machine learning.

Results

Chemical labelling of amino metabolites with 3-BMP

Mass spectrometry probes are widely used to detect trace substances because of their high selectivity and sensitivity. Here, trace amino metabolites were separated by reversed-phase chromatography on a C18 column. 3-BMP was selected as the mass spectrometry probe for amino metabolites because the permanent positive charge in its structure improves ionization efficiency and therefore the detection sensitivity of electrospray mass spectrometry (Fig. 2). Probes based on triphenylphosphonium have also become a focus of current research. The probe performed well, with limits of detection (LOD) for Leu, Asp and Glu of 4.00–12.00 fmol. Combining the probe with the UHPLC-HRMS system allows better analysis and detection of amino metabolites in cells and serum. Notably, no interference peaks were observed in the UHPLC-HRMS spectra of the 3-BMP probe. Twenty amino metabolites were selected to explore the labelling reaction time at 60 °C, and their content was then determined by UHPLC-HRMS. The time course of labeling amino metabolites with 3-BMP is shown in Fig. 3. The peak areas of the amino derivatives increased gradually with reaction time, leveled off at 100 min and remained constant up to 180 min for all 20 amino metabolites. We therefore conclude that labeling efficiency is highest when derivatization is performed at 60 °C for 100 min.
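One simple way to read a plateau off such a time course is to take the earliest time point whose peak area lies within a small tolerance of the maximum. The sketch below is illustrative only: the peak areas and the 2% tolerance are hypothetical and are not taken from the study.

```python
def plateau_time(times_min, peak_areas, tolerance=0.02):
    """Return the earliest time whose peak area is within `tolerance`
    (as a fraction) of the maximum observed peak area."""
    if len(times_min) != len(peak_areas) or not times_min:
        raise ValueError("times and peak areas must be non-empty and equal length")
    max_area = max(peak_areas)
    for t, area in sorted(zip(times_min, peak_areas)):
        if area >= (1.0 - tolerance) * max_area:
            return t


# Hypothetical time course (arbitrary peak-area units), for illustration only.
times = [20, 40, 60, 80, 100, 120, 180]
areas = [2.1e5, 4.0e5, 5.6e5, 6.7e5, 7.4e5, 7.45e5, 7.5e5]
print(plateau_time(times, areas))
```

With these made-up values the function selects the 100 min point, because it is the first one within 2% of the largest peak area.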

Fig. 2

Profiling reaction and chemical structures of amino metabolites with 3-BMP.

Fig. 3

Time courses of the labelling reaction of amino metabolites with 3-BMP at 60 °C.

The fragmentation patterns of amino metabolites labeled by 3-BMP on UHPLC-HRMS

Because 3-BMP reacts not only with amino groups in the body but also with sulfhydryl groups under alkaline conditions, the product ion scan mode is a favorable way to analyze amino metabolites in this situation. Since the permanent positive charge of 3-BMP increases ionization efficiency, the product ion scan was performed in positive ion mode. The amino metabolites were therefore analyzed qualitatively by detecting their characteristic fragment ions at the optimal collision energy. Fig. 4 shows that each amino derivative formed with the 3-BMP probe yields the fragment ion at m/z 320.16, which was used for qualitative and quantitative purposes. Moreover, unknown differential amino metabolites were identified from the transition between the protonated precursor of the 3-BMP-labeled derivative and the characteristic fragment ion at m/z 320.15. In summary, the m/z 320.16 fragment includes an m/z 28.02 moiety (R-C-NH-R’) and can be considered the characteristic fragment ion of amino metabolites labeled by 3-BMP.
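The screening idea described here, flagging a precursor as a 3-BMP-labeled amino derivative when its product ion spectrum contains the characteristic fragment, can be sketched as follows. This is a minimal illustration rather than the authors' processing pipeline: the spectra, the 0.02 Da tolerance and the 5% relative-intensity threshold are assumptions chosen for the example.

```python
DIAGNOSTIC_MZ = 320.16  # characteristic fragment quoted in the text (two decimals)


def has_diagnostic_ion(spectrum, target_mz=DIAGNOSTIC_MZ, tol_da=0.02,
                       min_rel_intensity=0.05):
    """spectrum: list of (m/z, intensity) pairs from one product ion scan.
    Returns True if a peak within tol_da of target_mz reaches at least
    min_rel_intensity of the base peak. Tolerance and threshold are
    hypothetical defaults, not values from the study."""
    if not spectrum:
        return False
    base = max(intensity for _, intensity in spectrum)
    if base <= 0:
        return False
    return any(
        abs(mz - target_mz) <= tol_da and intensity / base >= min_rel_intensity
        for mz, intensity in spectrum
    )


# Hypothetical spectra keyed by precursor m/z, for illustration only.
spectra = {
    378.16: [(320.158, 9500.0), (262.09, 1200.0)],
    401.20: [(183.05, 8000.0), (320.40, 300.0)],
}
labeled = [precursor for precursor, peaks in spectra.items()
           if has_diagnostic_ion(peaks)]
print(labeled)
```

Only the first hypothetical precursor is kept, since the second spectrum has no peak close enough to the diagnostic m/z.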

Fig. 4

MS/MS product ion spectra and proposed fragmentation pattern with molecular formula and theoretical m/z of amino metabolites derivative of 3-BMP.

Validation of the proposed method

Calibration curves for the 20 amino metabolites were generated at different concentrations, with each concentration measured three times. Table 1 shows the calibration curve equations and detection limits obtained with this procedure. The calibration curves of all amino metabolites showed good linearity (R2 ≥ 0.9995). The limit of detection (S/N = 3) was between 4.00 and 12.00 fmol, and the limit of quantification (S/N = 10) was between 12.00 and 24.25 fmol. Precision (CV, %) was evaluated by inter-day determinations over three days and intra-day determinations at different concentrations. The UHPLC-HRMS chromatograms of the 20 amino metabolites in cells and serum of PCNSL patients are shown in Fig. 5. As shown in Table 2, the intra-day RSD values ranged from 0.98 to 6.60% and the inter-day RSD values from 0.63 to 5.18%, all of which met the requirements. According to Table 3, the average recovery of amino metabolites in serum was 87.09–95.82%, and the inter-day and intra-day CVs in serum were 1.43–5.22% and 1.22–5.87%, respectively. In conclusion, this method meets the requirements of biological sample analysis.
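Since LOD and LOQ are defined here at S/N = 3 and S/N = 10, a rough estimate can be extrapolated from one low-level standard. The sketch below assumes that signal is proportional to the injected amount and that noise is constant, which is a simplification; the 20 fmol standard and its S/N of 15 are hypothetical.

```python
def amount_at_sn(amount_injected, observed_sn, target_sn):
    """Linearly scale an injected amount to the amount expected to give target_sn.
    Assumes signal is proportional to amount and noise is constant."""
    if amount_injected <= 0 or observed_sn <= 0:
        raise ValueError("amount and observed S/N must be positive")
    return amount_injected * target_sn / observed_sn


# Hypothetical low-level standard: 20 fmol on column giving S/N = 15.
lod = amount_at_sn(20.0, 15.0, target_sn=3)
loq = amount_at_sn(20.0, 15.0, target_sn=10)
print(f"LOD ~ {lod:.2f} fmol, LOQ ~ {loq:.2f} fmol")
```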

Fig. 5

UHPLC-HRMS chromatograms of 20 kinds of amino acids in PCNSL patients. (A): Standards; (B): cell; (C): serum.

Table 1 Calibration curves equations and limit of detection (LOD) of amino metabolites.
Table 2 Accuracy and precision of the proposed method by intra-day and inter-day assays.
Table 3 Determination of amino acid in serum fluid of PCNSL patients.

Determination of amino metabolites related proteins in PCNSL patients’ bio-samples

An optimized and validated UHPLC-MS/MS method was used to analyze the amino metabolites in the cells, serum and cerebrospinal fluid of 30 PCNSL patients and 30 healthy volunteers (HV). In cells (Fig. 6A), the contents of Arg, Gln, Asn, Lys, Thr, Met, Ser, Trp, Leu, Tyr, Val and His were increased to varying degrees in PCNSL patients: these 12 amino metabolites were 3.78, 1.44, 1.35, 1.23, 1.22, 1.12, 1.11, 1.09, 1.08, 1.06, 1.03 and 1.02 times the levels in normal cells, respectively (p < 0.05). In contrast, the contents of Cys, Glu, Ala, Asp, Gly and Pro were decreased to varying degrees in PCNSL patients: their levels in normal cells were 1.92, 1.60, 1.34, 1.29, 1.20 and 1.03 times those in PCNSL cells, respectively (p < 0.05). Similarly, in serum (Fig. 6B), the contents of Glu, Gly, Gln, Cys, Asn, Leu and Val in PCNSL patients were 1.31, 1.17, 1.11, 1.10, 1.08, 1.08 and 1.02 times those in healthy volunteers, respectively (p < 0.05). However, serum levels of Ser, Met, His, Pro, Trp, Phe, Lys, Ala, Thr, Asp, Ile, Arg and Tyr were decreased to different degrees in PCNSL patients: their levels in HV serum were 1.82, 1.52, 1.38, 1.27, 1.19, 1.19, 1.13, 1.13, 1.12, 1.11, 1.06, 1.02 and 1.01 times those in PCNSL serum, respectively (p < 0.05). In summary, the contents of Gln, Cys, Asn, Leu and Val in the cells and serum of PCNSL patients differed significantly from those of HV.

Fig. 6

Determination of 20 kinds of amino metabolites of PCNSL and HV (n = 30). (A): Cell; (B): serum. (*p < 0.05, **p < 0.01, ***p < 0.001).

Multiplex analysis of amino metabolites and proteins in PCNSL patients’ bio-samples

In this work, patients and the control group could be distinguished with this method (Fig. 7). PCA separated the serum amino metabolite profiles of HV and PCNSL patients, as shown in Fig. 7A. In addition, 22 differential amino metabolites were identified from the characteristic fragment ion (m/z 320.16) of the 3-BMP-labeled derivatives (Fig. 7B). Fig. 8A illustrates that these amino metabolites may be involved in glutathione, tyrosine, glycine, serine and threonine metabolism, the pentose phosphate pathway and aminoacyl-tRNA biosynthesis. Subsequently, 654 related proteins in these metabolic pathways were enriched and tracked, and 20 of them showed significant differences in expression (p < 0.05; Fig. 8B). The expression of platelet glycoprotein Ib alpha chain (P07359), platelet basic protein (P02775), platelet factor 4 (P02776) and immunoglobulin heavy variable 6-1 (A0A0B4J1U7) in PCNSL serum was significantly down-regulated, whereas the others were up-regulated. Similarly, the volcano plot showed that catalase (P04040), coagulation factor VII (P08709), haptoglobin (P00738) and large ribosomal subunit protein eL20 (P0DJ18) were higher in PCNSL patients than in HV (Fig. 8C). A joint analysis of these proteins and amino metabolites then showed that increased expression of haptoglobin (P00738), coagulation factor VII (P08709), large ribosomal subunit protein eL20 (P0DJ18) and catalase (P04040) could positively regulate Ala, Lys and Phe in the lysine degradation; alanine, aspartate and glutamate metabolism; and phenylalanine, tyrosine and tryptophan biosynthesis pathways, respectively. In contrast, decreased expression of platelet basic protein (P02775), platelet factor 4 (P02776) and platelet glycoprotein Ib alpha chain (P07359) could negatively regulate Ser and Phe in the glycine, serine and phenylalanine metabolism pathways (Fig. 8D).

Fig. 7

Multiplex analysis of amino metabolites in HV and PCNSL patients’ serum. (A): PCA analysis; (B): heat map.

Fig. 8

Multiplex analysis of amino related protein in HV and PCNSL patients’ serum. (A) Amino metabolic pathway; (B) heat map; (C) volcano map; (D) amino related proteins.

Machine learning model based on amino-associated proteins

We performed independent ML on the entire dataset of nine differential amino metabolites associated with seven proteins from 30 PCNSL patients and 30 HV, constructed classification models based on random forest and logistic regression to predict PCNSL, and identified the most important features for classification (Table S2). Set-I (the learning dataset) contained 60% of the stratified samples and was used to build the ML model and compute feature importance scores, whereas Set-II (the hold-out set) contained the remaining 40% and was used to test model performance independently, as shown in Fig. 9A. With 5-fold cross-validation, the accuracy of the random forest classification model fluctuated between 85.71 and 94.29% over 2000 training iterations. The model performed extremely well on the training set, with an average accuracy of 99.59% (standard deviation 0.99%), indicating a strong fit to the training data. The average accuracy on the test set was 93.68% (standard deviation 2.79%), reflecting the generalization ability of the model on unseen data. Feature significance was determined in MATLAB R2023b. The error remained below 0.1 throughout the decision-making process. The results of the analysis are shown in Figs. 8D and 9B. The Sankey plot in Fig. 9C visualizes the differences in proteins and metabolites between patients and healthy volunteers (HV). Twenty-four items were randomly selected as the training set and the remaining 16 items as the test set, and 2000 trials were performed (Table S3), giving an average accuracy of 93.68%. These results highlight the robustness of amino-associated proteins as potential diagnostic biomarkers for PCNSL, achieving high accuracy in distinguishing cancer from normal groups.

Fig. 9

(A) Machine learning strategy diagram. (B) Importance. (C) Protein Sankey diagram. (D) Error curve.

Materials and methods

Inclusion and exclusion criteria for patients

The inclusion criteria were as follows: all patients were diagnosed by MRI in our hospital, there were no obvious contraindications to surgery or anesthesia, and preoperative and postoperative pathological, clinical, imaging, laboratory and follow-up data were complete. Exclusion criteria were lesions limited to the central nervous system, exclusion of involvement of other systems within 6 months of diagnosis, HIV antibody positivity, and history of malignancy, autoimmune disease, or organ transplantation. The baseline information, preoperative examination results, treatment, postoperative follow-up results, and discharge information of the patients were obtained from the electronic medical record system of our hospital. Patients with incomplete follow-up data were followed up by telephone. Patient characteristics were as follows. A total of 60 patients with PCNSL were enrolled in the study, all of whom were HIV-negative and had no history of immunosuppression. The median age at diagnosis was 54 years (range, 25 to 74 years). Of these patients, 36 were men and 24 were women, giving a male-to-female ratio of 3:2. The prevalence increased with age and peaked in the 60 to 71 age group.

Materials

(3-Bromopropyl)triphenylphosphonium (3-BMP), glycine (Gly), alanine (Ala), valine (Val), leucine (Leu), isoleucine (Ile), proline (Pro), methionine (Met), serine (Ser), cysteine (Cys), asparagine (Asn), glutamine (Gln), threonine (Thr), phenylalanine (Phe), tyrosine (Tyr), tryptophan (Trp), aspartic acid (Asp), glutamic acid (Glu), arginine (Arg), lysine (Lys) and histidine (His) were purchased from Shanghai Bid Pharmaceutical Technology Corporation. Methanol (MeOH), acetonitrile (ACN), ethylenediaminetetraacetic acid (EDTA), formic acid (FA) and trifluoroacetic acid (TFA) were of mass spectrometry grade (Fisher, USA). All reagents were chromatography grade and were used without further purification.

UHPLC-HRMS conditions

An ultra-high performance liquid chromatography system coupled to a Triple TOF 5600+ (AB SCIEX, USA) was used for UHPLC-Triple TOF high-resolution mass spectrometry. Chromatographic separation was performed at 40 °C on a Kinetex C18 column (2.0 × 100 mm, 1.7 μm). The 20 amino compounds were separated by gradient elution with 0.1% formic acid in water as mobile phase A and 0.1% formic acid in acetonitrile as mobile phase B. The gradient was 10-19-19-22-45% mobile phase B at 0-5-20-25-35 min. The autosampler was kept at 4 °C, the flow rate was 0.40 mL/min and the injection volume was 1 µL.
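The gradient program stated above can be read as a lookup from time to mobile phase composition. The sketch below encodes the five time points and %B values from the text and interpolates between them; linear ramps between program points are an assumption, since the text does not say whether the changes are ramped or stepped.

```python
from bisect import bisect_right

# Gradient program as stated in the text: (time in min, % mobile phase B).
GRADIENT = [(0.0, 10.0), (5.0, 19.0), (20.0, 19.0), (25.0, 22.0), (35.0, 45.0)]


def percent_b(t_min, program=GRADIENT):
    """Return %B at time t_min, assuming linear ramps between program points."""
    times = [t for t, _ in program]
    if t_min <= times[0]:
        return program[0][1]
    if t_min >= times[-1]:
        return program[-1][1]
    i = bisect_right(times, t_min)          # index of the next program point
    (t0, b0), (t1, b1) = program[i - 1], program[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 2.5, 10, 30, 35):
    print(t, percent_b(t))
```

Mobile phase A is simply `100 - percent_b(t)` at any time point.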

The UHPLC-Triple TOF MS/MS conditions are shown in Table 4. The mass spectrometer was equipped with an electrospray ionization source operated in positive ion mode. The ion source gas was 50 L/min and the curtain gas was 30 L/min. The declustering voltage was 5.5 eV, the ion release delay voltage was 67 eV, the ion release width voltage was 25 eV and the source temperature was 500 °C. Data were acquired in information-dependent acquisition (IDA) mode at a resolution of 100,000 FWHM over the range m/z 50 to 1000; target analytes were acquired with SCIEX OS 2.0.3 software, and all MS data were imported and processed with SCIEX OS-MQ software. This allows accurate mass measurements of product ions in both IDA and product ion modes. Selecting the best normalized collision energy for the product ions is key to obtaining the highest chromatographic peak response. To ensure reliability and accuracy, both qualitative and quantitative results were based on accurate mass measurements of molecular ions with a mass error of less than 1 ppm. Each derivative was monitored using IDA. For each of the 20 amino metabolite derivatives, the product ion monitored was the characteristic fragment ion at m/z 320.15 (formula C21H22NP+) produced by 3-BMP labeling of the amino group. This ion was used for qualitative and quantitative detection because it is characteristic of the labeled amino group and gives increased signal strength. For protein monitoring, a Thermo Scientific™ EASY-nLC™ 1200 liquid chromatography system was used with a positive-ion ESI source combined with quadrupole, Orbitrap and linear ion trap mass analyzers, at a scanning speed of 45 Hz and a resolution of 500,000 FWHM. The mass accuracy was set to 1 ppm over the range m/z 50–8000.
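The 1 ppm mass-error criterion mentioned above follows the usual definition, error (ppm) = (measured - theoretical) / theoretical × 10^6. A minimal sketch of that check is given below; the measured value is hypothetical, and the theoretical value is simply the two-decimal fragment m/z quoted in the text.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, tol_ppm=1.0):
    """True if the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


# Hypothetical measurement against the fragment m/z quoted in the text.
theoretical = 320.15
measured = 320.1502
print(round(ppm_error(measured, theoretical), 2))
print(within_tolerance(measured, theoretical))
```

A deviation of 0.0002 at this m/z corresponds to roughly 0.6 ppm, which passes the 1 ppm criterion, whereas a deviation of 0.001 (about 3 ppm) would not.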

Table 4 UHPLC-HRMS conditions.

Preparation and derivatization of amino metabolites

Gly (0.75 mg), Ala (0.89 mg), Val (1.17 mg), Leu (1.31 mg), Ile (1.31 mg), Pro (1.15 mg), Met (1.49 mg), Ser (1.05 mg), Cys (1.21 mg), Asn (1.32 mg), Gln (1.46 mg), Thr (1.19 mg), Phe (1.65 mg), Tyr (1.81 mg), Trp (2.04 mg), Asp (1.33 mg), Glu (1.47 mg), Arg (1.74 mg), Lys (1.46 mg) and His (1.55 mg) were weighed, transferred to a centrifuge tube and diluted to 1 mL with acetonitrile-water (1:1, v/v) to prepare a 10 µM solution. Then, 50 µL of the amino acid solution was added to an EP tube, followed by 50 µL of 5% triethylamine and 50 µL of 5 mM 3-BMP solution. The tube was placed in a thermostatic oscillator at 60 °C and 1000 revolutions per minute, and finally 1.0 µL of the sample was injected into the ultra-high performance liquid chromatography-tandem mass spectrometry system.

Collection and pretreatment of cell and serum from PCNSL patients

Cell and serum samples from HV and PCNSL patients were selected for the detection of amino metabolites. Samples were collected without any analyte supplementation. All HV were in good health, all volunteers read and signed informed consent before collection, and this work was approved by the institutional research ethics committee of Shandong Cancer Hospital (SDTHEC2023011029). To extract cells from PCNSL patients, fresh PCNSL tissue was placed in DMEM/F12 medium in a sterile environment and cut into blocks of 1–2 mm3. The tissue blocks were suspended and centrifuged, the supernatant was discarded, and a suspension was obtained by mixing with medium and Matrigel. A 20 µL aliquot of suspension was rapidly added to the tissue in a concave pit; the pit containing the suspension was placed in a Petri dish, PBS was added to keep it moist, and the dish was placed in a CO2 incubator until the Matrigel cured. The solidified Matrigel-embedded PCNSL tissue blocks were removed intact from the concave pits and placed into 12-well plates, and medium was added to each well for primary cell culture until primary lymphoma cells migrated out of the Matrigel-embedded pellets. The resulting cells were collected and frozen for subsequent experiments. Blood was collected from fasting HV and PCNSL patients into medical vacuum tubes containing separation gel and coagulant, and the tubes were centrifuged at 1500 rpm and 4 °C for 15 min to obtain bio-samples. The supernatant was then transferred in 100 µL aliquots to small tubes with rubber-plug caps, and was tested immediately or stored at -80 °C until analysis. For deproteinization, 100 µL of biological sample was mixed with 300 µL of acetonitrile and centrifuged at 10,000 g and 4 °C for 15 min. The supernatant (100 µL) was transferred to a new tube, and 50 µL of 5% TEA and 50 µL of 5 mM 3-BMP were added successively. The tubes were then incubated in a thermostatic oscillator at 60 °C for 60 min.

Validation of the method

Amino metabolite levels in cells or serum were quantified with 7-point linear calibration curves (peak area ratio of amino metabolite to internal standard (IS), 6-aminocaproic acid, versus concentration) for each of the 20 analytes. Intra-day and inter-day precision and accuracy of the 20 quality control samples were determined by measuring three times at calibration points in the range of 0.01 to 1000 pmol, and the limit of detection (LOD) and limit of quantitation (LOQ) were defined at signal-to-noise ratios of 3:1 and 10:1, respectively. The precision (coefficient of variation, CV, %) at each concentration was calculated from six repeated assays. Cell and serum samples without added standards served as the blank group, and standards at three concentrations (7.81, 31.25 and 62.50 µM) were mixed with these biological samples to give the spiked solutions. The IS and 5% triethylamine were added as the control group, and the spike recoveries for serum and cells were calculated to confirm that they met the requirements.
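The validation quantities named above (calibration linearity, CV and spike recovery) can be computed as in the following sketch. It uses an ordinary least-squares fit and the conventional formulas CV = SD / mean × 100 and recovery = (found in spiked sample - found in blank) / amount added × 100. All numbers are hypothetical, and the function names are illustrative rather than taken from the software used in the study.

```python
from statistics import mean, stdev


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.
    Returns (slope, intercept, r_squared)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def cv_percent(values):
    """Coefficient of variation (sample SD / mean) in percent."""
    return stdev(values) / mean(values) * 100.0


def recovery_percent(found_spiked, found_blank, amount_added):
    """Spike recovery in percent."""
    return (found_spiked - found_blank) / amount_added * 100.0


# Hypothetical 7-point curve: concentration vs analyte/IS peak-area ratio.
conc = [1, 5, 10, 50, 100, 500, 1000]
ratio = [0.021, 0.099, 0.202, 1.004, 1.998, 10.01, 19.99]
slope, intercept, r2 = linear_fit(conc, ratio)
print(f"slope={slope:.5f} intercept={intercept:.5f} R2={r2:.5f}")

# Hypothetical replicate measurements and one spiked sample.
print(f"CV = {cv_percent([9.0, 10.0, 11.0]):.2f}%")
print(f"recovery = {recovery_percent(38.0, 10.0, 31.25):.1f}%")
```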

Determination of amino metabolites related proteins in PCNSL patients’ bio-samples

During the treatment period of PCNSL patients, cells and serum were collected to observe and record changes in amino metabolites and related differential proteins. Samples were also obtained from HV and PCNSL patients upon admission without the use of medical supplies, and blood pressure, heart rate and physical health index were measured for each individual. Initially, lumbar puncture and venous blood collection were performed to determine the content of amino metabolites, as described in the section “Collection and pretreatment of cell and serum from PCNSL patients”. For protein analysis, 10 µL of plasma sample was mixed with lysis buffer, thoroughly vortexed for 10 min and centrifuged at 1000 g for 2 min at 4 °C to remove impurities, after which the supernatant was collected. Protein concentration was determined using the BCA method, where 10 µL of protein solution was mixed with 110 µL of PBS and 120 µL of BCA working solution (total dilution factor of 12×), incubated at 37 °C for 105 min, and the absorbance was measured. Subsequently, 100 µg of protein was added to freshly prepared UA buffer and brought to a final volume of 250 µL, followed by urea denaturation in an ultrafiltration tube (13,600 rpm, 4 °C, 20 min) and two washes with UA buffer. Sequential steps included DTT reduction (200 µL, 37 °C, 1.5 h) and IAA alkylation (50 mM, protected from light for 30 min), with centrifugation at 16,000 rpm after each step to remove the liquid. The samples were then washed with urea buffer and 50 mM ammonium bicarbonate, followed by the addition of 0.25 µg/µL trypsin for overnight digestion at 37 °C. The digests were desalted on a C18 column, which was activated with acetonitrile and 0.1% TFA before sample loading, washed with 0.1% TFA and eluted with 0.1% TFA-50% acetonitrile. A 600 µL volume of eluent was collected and concentrated by freezing for 90 min before LC-MS/MS analysis.

Establishment of machine learning model and data processing

In this work, machine learning (ML) was performed independently on the amino metabolite-related protein data from the two cohorts (PCNSL and HV). Stratified sampling was used to randomly divide the samples into two groups. The first group (Set-I) consisted of 36 samples and was used to identify the corresponding amino-associated proteins (p < 0.05) and determine their importance scores using random forest and logistic regression algorithms. The ML classification model was then constructed with the random forest and logistic regression supervised learning algorithms. Data acquisition and processing were carried out with SCIEX OS and SCIEX OS-MQ software (AB SCIEX, USA), respectively. IBM SPSS Statistics 21.0 was used for statistical analyses with Student’s t-test. Results with p < 0.05 (*), p < 0.01 (**) and p < 0.001 (***) were considered significant. GraphPad Prism 9.0 was used to draw the line charts, bar charts and box-and-whisker plots.
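The stratified split and repeated hold-out evaluation described here can be sketched with the Python standard library alone. To stay dependency-free, the example swaps in a simple nearest-centroid classifier as a stand-in; it is not the random forest or logistic regression model used in the study. The data are synthetic, and the seven features, class means, seeds and 200 trials are all assumptions made for illustration.

```python
import random
from statistics import mean, stdev


def stratified_split(labels, train_fraction, rng):
    """Return (train_idx, test_idx), sampling train_fraction of each class."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(len(idx) * train_fraction)
        train += idx[:n_train]
        test += idx[n_train:]
    return train, test


def nearest_centroid_fit(X, y):
    """Stand-in classifier (NOT the study's random forest): per-class feature means."""
    centroids = {}
    for cls in set(y):
        rows = [x for x, label in zip(X, y) if label == cls]
        centroids[cls] = [mean(col) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda cls: dist2(centroids[cls]))


def repeated_holdout(X, y, train_fraction=0.6, trials=200, seed=0):
    """Mean and SD of hold-out accuracy over repeated stratified splits."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(trials):
        tr, te = stratified_split(y, train_fraction, rng)
        model = nearest_centroid_fit([X[i] for i in tr], [y[i] for i in tr])
        hits = sum(nearest_centroid_predict(model, X[i]) == y[i] for i in te)
        accuracies.append(hits / len(te))
    return mean(accuracies), stdev(accuracies)


# Synthetic data: 30 "PCNSL" and 30 "HV" samples with 7 hypothetical features.
data_rng = random.Random(42)
y = ["PCNSL"] * 30 + ["HV"] * 30
X = [[data_rng.gauss(1.0 if label == "PCNSL" else 0.0, 1.0) for _ in range(7)]
     for label in y]

train_idx, test_idx = stratified_split(y, 0.6, random.Random(1))
print(len(train_idx), len(test_idx))
mean_acc, sd_acc = repeated_holdout(X, y)
print(f"hold-out accuracy: {mean_acc:.3f} (SD {sd_acc:.3f})")
```

With 30 samples per class and a 60% training fraction, the split yields 36 training and 24 hold-out samples, with both classes equally represented in each set.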

Conclusions

In summary, this work established a highly sensitive and selective UHPLC-HRMS method, based on 3-BMP labeling, for the simultaneous quantitative determination of 20 amino metabolites and the tracing of differential proteins in the serum of PCNSL patients. Furthermore, up-regulated P00738, P08709 and P04040 could directly negatively regulate Ala, Lys and Phe, so that Gln, Cys, Asn, Leu and Val in the amino metabolic pathway of PCNSL patients were significantly lower than those of HV (p < 0.05). Ultimately, a highly accurate classification and prediction model of PCNSL was developed for the first time based on amino metabolite-associated proteins, with an accuracy of 93.68%. In short, this study provides a novel mass spectrometry probe, 3-BMP, for the targeted identification of trace amino functional groups. This intelligent analysis strategy was demonstrated to be a promising tool for monitoring amino metabolites and differential proteins in the lysine degradation and serine and phenylalanine metabolism pathways, which offers the opportunity for in-depth investigation of pathogenesis and for prognostic monitoring while PCNSL patients are still in a disease-free state. Further determinations of many patients’ bio-samples are currently underway in our laboratory.