Introduction

Multiple myeloma (MM) is a lymphoid neoplasm characterized by the abnormal proliferation of a pathological clone of B-cells at a specific stage of differentiation1. As in other B-lymphoid malignancies, immunoglobulin heavy-chain (IGH) gene rearrangement begins at the earliest stages of B-cell development and involves a series of recombination events between V, D, and J genes, as well as somatic hypermutation (SHM), resulting in the formation of unique V(D)J sequences2,3. This process confers extremely high diversity on B-cell receptor repertoires, enabling them to recognize different antigens effectively4. The detection of IGH gene rearrangements via next-generation sequencing (NGS) has further facilitated the study and identification of clonal B-cell populations in the bone marrow. This technology has helped shed light on the biology of myeloma cells, indicating that they are post-switch B-cells that have undergone antigen selection after traversing the germinal center and that the usage of certain IGHV genes is biased.

Over the past decade, significant efforts have been made to improve clinical outcomes in MM. However, MM remains largely incurable, with most patients experiencing multiple relapses and eventually becoming refractory to treatment. MM is a heterogeneous disease, and previous studies support its development through a multistep process involving the sequential acquisition of genetic hits. A significant proportion of myelomas (50–70%) carry translocations targeting the switch regions of the IGH genes located at chromosome 14q32 (ref. 5). These aberrant rearrangements juxtapose oncogenes with the powerful IGH enhancers, driving abnormal expression of the translocated oncogenes. Several studies have hypothesized that these characteristics, together with the biased usage of certain VH gene segments, establish a molecular archaeology for myeloma and shape the IG repertoire prior to the acquisition of the malignant phenotype6. Notably, disease incidence and clinical behavior differ among patients from different ethnic and geographical backgrounds7,8. For instance, the median age at onset of MM in China is 57.9–59 years8,9, which is notably lower than that observed in Western countries10,11,12. While the IGH repertoire might be ethnicity-specific, neither this repertoire nor the applicability of NGS-based IGH clonality testing has been reported in Chinese populations with MM, except in our previous work13,14.

Another notable point is that antigen stimulation is considered a key environmental driver of malignant transformation. In this regard, IGH gene segment usage in myeloma appears to reflect some degree of both positive and negative selection by environmentally encountered antigens15. Molecular studies of IGH gene rearrangements have provided valuable insights into the pathogenesis of MM in the context of epidemiology.

In this study, we aimed to better understand the repertoire of IG gene usage and molecular archaeology among Chinese patients with MM. We analyzed the IGH repertoire in 301 patients with newly diagnosed MM (NDMM) to investigate the distribution and usage of clonal IGH gene rearrangements in this population. We identified a unique subgroup with a biased usage of certain IGHV genes. We then investigated the usage of D and J genes, the length of the heavy-chain complementarity-determining region 3 (HCDR3), and the SHM status in this subgroup. Lastly, we explored the correlation between a specific VH gene and recurrent IGH translocations, aiming to uncover the relationship between antigen selection and chromosomal abnormalities in MM.

Materials and methods

Sample and clinical data

A total of 301 bone marrow samples were obtained from the MM specialty biobank of the National Clinical Research Center for Hematologic Diseases, sourced from NDMM patients treated at the First Affiliated Hospital of Soochow University between February 2019 and July 2024. Diagnosis and treatment response were assessed according to the International Myeloma Working Group (IMWG) consensus criteria16.

Cytogenetics analysis by fluorescence in situ hybridization

The fluorescence in situ hybridization (FISH) panel included the IGH translocations t(4;14)(p16.3;q32), t(11;14)(q13;q32), t(14;16)(q32;q23), and t(14;20)(q32;q21), as well as other aberrations such as gain(1q21), deletion (Del)(13q14), and Del(17p13). Purified CD138+ plasma cells were isolated as previously reported17. High-risk cytogenetic abnormality (HRCA) was defined by the presence of any of t(4;14), t(14;16), t(14;20), or Del(17p13) at diagnosis18. Standard-risk cytogenetic abnormality (SRCA) was defined by the absence of these abnormalities18.
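
The HRCA/SRCA definition above can be expressed as a simple rule. The following Python sketch is illustrative only: the function name and the string labels used to encode FISH findings are assumptions made for this example, not part of the study pipeline.

```python
# Abnormalities that define high risk, as listed in the Methods.
# The lowercase string labels are a hypothetical encoding for this sketch.
HIGH_RISK_ABNORMALITIES = frozenset({"t(4;14)", "t(14;16)", "t(14;20)", "del(17p13)"})


def classify_cytogenetic_risk(abnormalities):
    """Return "HRCA" if any high-risk abnormality is present, else "SRCA".

    `abnormalities` is an iterable of FISH findings for one patient who
    actually underwent FISH testing.
    """
    found = {label.strip().lower() for label in abnormalities}
    return "HRCA" if found & HIGH_RISK_ABNORMALITIES else "SRCA"


print(classify_cytogenetic_risk(["t(4;14)", "gain(1q21)"]))   # HRCA
print(classify_cytogenetic_risk(["t(11;14)", "del(13q14)"]))  # SRCA
```

Patients without FISH results should be excluded before applying such a rule, because an empty list of findings is classified as SRCA.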

Clonality determination via NGS

IGH-based NGS of the V(D)J rearrangements was performed according to the Chinese expert consensus on laboratory standardized technical specifications using NGS in MM19. Briefly, genomic DNA was isolated from fresh bone marrow aspirates at diagnosis, and purified DNA samples were quantified using the Qubit DNA Assay Kit (Qubit 4.0, Thermo Fisher Scientific, Inc.). Library preparation and clonality testing were conducted using the commercially available LymphoTrack assay kit-PGM (Invivoscribe, Inc., San Diego, CA, USA), targeting the IGH framework region 1 (FR1) fragment. Sequencing results were analyzed using the LymphoTrackS5-PGM Software version 2.4.5 (Invivoscribe, Inc.). Sequence information was analyzed using the IgBLAST (https://www.ncbi.nlm.nih.gov/igblast) databases and the international ImMunoGeneTics information system (IMGT) (http://www.imgt.org). The criteria for IGH clonality were determined as previously described13. The SHM status of IGHV was assessed against the closest germline gene20.
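
The exact formula used for the SHM rate is not spelled out here. A common convention, assumed in the sketch below, is the percentage of aligned nucleotide positions in the IGHV region that differ from the closest germline gene. The function name and the toy sequences are hypothetical and serve only to illustrate the calculation.

```python
def shm_rate(aligned_query, aligned_germline):
    """Percentage of compared positions that differ from the germline.

    Both strings must already be aligned to the same length; positions where
    either sequence has a gap ('-') are skipped. This mismatch-based
    definition is an assumption, not the documented method of the study.
    """
    if len(aligned_query) != len(aligned_germline):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for q, g in zip(aligned_query.upper(), aligned_germline.upper()):
        if q == "-" or g == "-":
            continue  # ignore alignment gaps
        compared += 1
        if q != g:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Toy example: one mismatch in ten compared positions.
print(shm_rate("ACGTACGTAA", "ACGTACGTAC"))
```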

Statistical analysis

R (version 4.2.3, https://www.r-project.org/) in RStudio and GraphPad Prism 8.3.0 were used for all statistical analyses and plotting. Categorical variables were compared using the chi-square test or Fisher’s exact test. Medians and interquartile ranges were calculated for quantitative variables. The Mann–Whitney U test and Student’s t-test were used to analyze intergroup differences in SHM rates and heavy-chain complementarity-determining region 3 (HCDR3) length. All statistical tests were two-tailed, and P-values < 0.05 were considered statistically significant.
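
For readers unfamiliar with the Mann–Whitney U test, the sketch below shows one way to compute a two-tailed P-value with the normal approximation and a tie correction, using only the Python standard library. It is not the implementation used in this study (analyses were run in R and GraphPad Prism), and the input values are hypothetical numbers chosen purely for illustration.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, P-value). No continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        raise ValueError("all observations are identical")
    z = (u_x - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value


# Hypothetical SHM rates (%) for two groups; not data from this study.
group_a = [10.2, 11.5, 12.1, 9.8, 13.0]
group_b = [7.9, 8.8, 9.1, 8.2, 10.0]
u_stat, p = mann_whitney_u(group_a, group_b)
print(u_stat, p)
```

For small samples, statistical packages typically report an exact P-value instead of the normal approximation, so results may differ slightly.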

Results

Baseline characteristics of the study population

A total of 301 patients were enrolled in the study, comprising 156 males and 145 females, with a median age of 61 years. Approximately 66.7% of patients were classified as Revised International Staging System (R-ISS) stage II. The predominant immunoglobulin isotype was IgG (58.2%), followed by IgA (27.1%), IgD (3.0%), and IgM (0.3%). Thirty-one patients had light-chain-only MM and 3 had non-secretory MM. FISH detected t(4;14) in 78 patients (26.4%), t(11;14) in 45 patients (15.2%), t(14;16) in 6 patients, and t(14;20) in only 1 patient. HRCA was detected in 105 patients (35.5%). Other detailed characteristics of the 301 NDMM patients are presented in Table 1.

Table 1 Clinical characteristics of 301 newly diagnosed multiple myeloma patients.

Distribution of clonal rearrangements

A total of 322 clonal IGH rearrangements were identified in the 301 patients, comprising 282 productive and 40 unproductive rearrangements. The analysis flowchart of this study is summarized in Fig. 1. Among these patients, 262 cases (87.0%) presented only one productive rearrangement, 16 cases (5.3%) presented one productive and one unproductive rearrangement, 18 cases (6.0%) presented only one unproductive rearrangement, 2 cases (0.7%) presented two productive rearrangements, and 3 cases (1.0%) presented two unproductive rearrangements (see Supplementary Fig. S1 online).
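
The per-patient patterns above can be cross-checked against the reported totals. The short Python sketch below uses only the counts given in this paragraph; the variable and function names are chosen for illustration.

```python
# (productive, unproductive) rearrangements per patient -> number of patients
patients_by_pattern = {
    (1, 0): 262,
    (1, 1): 16,
    (0, 1): 18,
    (2, 0): 2,
    (0, 2): 3,
}


def summarize(patterns):
    """Derive cohort-level totals from the per-patient rearrangement patterns."""
    patients = sum(patterns.values())
    productive = sum(p * n for (p, _), n in patterns.items())
    unproductive = sum(u * n for (_, u), n in patterns.items())
    with_productive = sum(n for (p, _), n in patterns.items() if p > 0)
    return {
        "patients": patients,
        "productive": productive,
        "unproductive": unproductive,
        "rearrangements": productive + unproductive,
        "pct_with_productive": round(100 * with_productive / patients, 1),
    }


summary = summarize(patients_by_pattern)
print(summary)
# patients 301, productive 282, unproductive 40, rearrangements 322,
# pct_with_productive 93.0
```

The totals agree with the 322 rearrangements (282 productive, 40 unproductive) reported above, and 93.0% of patients carry at least one productive clone.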

Fig. 1
Analysis flowchart. Including one case with an unproductive rearrangement.

IGH gene repertoire

To profile the IGH rearrangement repertoire, we analyzed the V, D, and J gene rearrangements. We identified 43 functional IGHV genes in 282 productive rearrangements. Among these, IGHV3 was the most frequent subgroup (53.5%), followed by IGHV4 (23.8%), IGHV1 (8.9%), IGHV2 (8.9%), IGHV5 (3.5%), IGHV7 (1.1%), and IGHV6 (0.4%). At the gene level, IGHV3-30 was the most frequently used (12.4%), followed by IGHV3-23 (8.2%), IGHV4-39 (8.2%), IGHV4-59 (6.7%), and IGHV3-21 (6.0%); together these five genes accounted for 41.5% of the series (Fig. 2a). Compared with two previously reported studies21,22, our series showed significantly higher usage of IGHV3-21 and IGHV4-39 (6.0% vs. 3.1% and 8.2% vs. 3.4%, respectively; P < 0.05). In particular, IGHV4-39 usage was significantly more frequent in our study than in the Western MM patients reported by Gkoliou et al.21 (8.2% vs. 3.4%, P = 0.0037) (Fig. 2b). In contrast, usage of IGHV3-9 and IGHV4-31 was significantly lower than in Korean MM patients22 (2.8% vs. 11.6% and 2.1% vs. 9.3%, respectively; P < 0.05). IGHV4-39 was not observed in the 43 MM patients from Korea (see Supplementary Table S1 online).
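
As a worked illustration of the IGHV4-39 comparison, the sketch below runs a Pearson chi-square test (1 degree of freedom, no continuity correction) on a 2 × 2 table. The raw counts are reconstructed and should be treated as assumptions: 23 of 282 productive rearrangements in our series (8.2%), and 18 of 523 in the Western series, inferred from the reported 3.4%. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]].

    Returns (chi-square statistic, two-sided P-value) with 1 degree of
    freedom and without Yates' continuity correction.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Reconstructed counts (assumptions): IGHV4-39 vs. all other IGHV genes.
this_series = (23, 282 - 23)
western_series = (18, 523 - 18)
chi2, p = chi_square_2x2(*this_series, *western_series)
print(f"chi2 = {chi2:.2f}, P = {p:.4f}")
```

If the reconstructed counts are right, the resulting P-value should be close to the reported P = 0.0037; a continuity-corrected test would give a somewhat larger value.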

Fig. 2
IGH gene repertoire by NGS-based assay in multiple myeloma. (a) Bar plot showing the frequency of IGHV gene usage in our MM series: the 43 IGHV genes are shown on the X-axis, with the five most abundant IGHV genes depicted in blue. (b) Comparison of the five most abundant IGHV genes between our series and two other previously reported studies. P-values were calculated using the chi-square test. (c) Bar plot showing IGHD gene usage in our cohort: the 25 IGHD genes are shown on the X-axis, with the five most abundant IGHD genes depicted in blue. (d) Comparison of the five most frequent IGHD genes in the three groups. P-values were calculated using the chi-square test. (e) Bar plot showing IGHJ gene usage in our group and the comparison with the two other groups. P-values were calculated using the chi-square test. * P < 0.05, *** P < 0.005. § 523 patients from collaborating institutions in Greece (n = 176), Italy (n = 72), Spain (n = 201), and the IMGT/LIGM-DB public database (n = 74). The sequence datasets from the Italian and Spanish groups have been reported previously23,24.

Within the IGHD family, we identified 25 IGHD genes in our series. IGHD3 and IGHD2 were the two most abundant subgroups, comprising 36.4% and 16.0%, respectively. IGHD3-10 (10.9%), IGHD6-13 (9.5%), IGHD3-16 (8.4%), IGHD1-26 (6.9%), and IGHD5-12 (6.9%) were the five predominant genes (Fig. 2c). Usage of the predominant IGHD genes did not differ significantly from that in the other two groups (Fig. 2d). Detailed information on the comparison of IGHD genes among the three groups is summarized in Supplementary Table S2.

Regarding IGHJ gene usage, IGHJ4 (45.6%) and IGHJ6 (26.3%) were the most frequent segments in our cohort. The frequency of IGHJ3 usage in our cohort was significantly lower than that in Western MM (7.8% vs. 13.6%, P = 0.015), whereas IGHJ6 usage was significantly higher than that in the Western series (26.3% vs. 19.7%, P = 0.030) (Fig. 2e). There was no significant difference in the other IGHJ genes among the three groups (see Supplementary Table S3 online).

Antigen selection imprint on IGHV4-39

Considering the remarkable bias of IGHV4-39 usage in our series, we subsequently investigated the antigen selection imprint on IGHV4-39. In the 23 MM patients with productive IGHV4-39 rearrangements, no common antigen epitopes were found (see Supplementary Fig. S2 online). Each patient had a unique HCDR3 sequence (see Supplementary Table S4 online). A biased pairing of certain functional IGHD and IGHJ genes was identified, with 7 out of 23 cases (30.4%) recombining IGHV4-39 with IGHD3 and IGHJ4. Further subgroup analysis revealed 3 cases with IGHD3-10 and IGHJ4 (Fig. 3a). The median IGHV SHM rate was significantly higher in patients with IGHV4-39 than in those without (11.1% vs. 9.0%, P = 0.0056). Among the five most abundantly used IGHV genes, the SHM rate of IGHV4-39 was significantly higher than that of IGHV3-21 (median: 11.1% vs. 7.8%; P = 0.0005) (Fig. 3b). The HCDR3 region was significantly shorter in patients with IGHV4-39 than in those without (median: 14 vs. 16 amino acids, P = 0.0321; Fig. 3c). In comparisons between IGHV genes, the HCDR3 of IGHV4-39 was significantly shorter than that of IGHV3-30 (mean: 14.5 vs. 17.2 amino acids; P = 0.0041) and IGHV4-59 (mean: 14.5 vs. 17.1 amino acids; P = 0.0182) (Fig. 3d).

Fig. 3
Characteristics of biased pairings, somatic hypermutation rates, and HCDR3 lengths in multiple myeloma patients with IGHV4-39 clonal rearrangement. (a) Heatmap showing the number of biased pairings of productive IGHD and IGHJ genes in IGHV4-39 rearrangements (n = 23). (b) Violin plot showing the distinct somatic hypermutation rates in patients with IGHV4-39 compared to other IGHV genes. P-values for each IGHV subgroup were calculated using the Mann–Whitney U test, P < 0.05. (c) Line chart showing the percentage of distinct HCDR3 lengths in patients with and without IGHV4-39. Statistical differences were calculated by the Mann–Whitney U test, P < 0.05. (d) Comparison of the HCDR3 lengths between IGHV4-39 and four other IGHV gene subgroups. IGHV3-21 group, n = 17; IGHV3-23 group, n = 23; IGHV3-30 group, n = 34; IGHV4-59 group, n = 19. P-values for each comparison group were calculated using Student’s t-test, P < 0.05.

Cytogenetic abnormalities and IGHV4-39

FISH analysis was performed in 296 out of 301 patients (98.3%) at the time of diagnosis. The relationship between IGHV genes and cytogenetic abnormalities is illustrated in Fig. 4a. The proportion of IGHV4-39 cases with HRCA was significantly higher than that of IGHV3-23 cases (58.3% vs. 26.9%; P = 0.0438) and of the other IGHV group (58.3% vs. 31.0%, P = 0.0115) (Fig. 4b). When comparing baseline characteristics between patients with and without IGHV4-39, we observed that cytogenetic abnormality was the only factor that differed significantly (high-risk: 58.3% vs. 33.8%; P = 0.0294), especially the high-risk t(4;14) alteration (45.8% vs. 24.6%; P = 0.0435) (Table 2). Furthermore, a multivariate analysis was performed to adjust for potential confounders such as age, sex, and R-ISS stage. We found that R-ISS stage (III vs. I/II: OR 5.234, 95% CI 2.75–10.28, P < 0.0001) and IGHV4-39 (Yes vs. No: OR 2.778, 95% CI 1.129–7.017, P = 0.0268) contributed to HRCA (Fig. 4c). These results support an independent association between IGHV4-39 and HRCA in our MM cohort.
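
To make the t(4;14) comparison concrete, the sketch below computes an unadjusted odds ratio and a two-sided Fisher’s exact test for a 2 × 2 table using only the Python standard library. The counts are reconstructed from figures reported in this article and are assumptions: 11 of 24 IGHV4-39 cases with t(4;14), and 67 of 272 among the remaining patients with FISH data (78 t(4;14) cases among 296 tested in total). The Methods name both the chi-square and Fisher’s exact tests without stating which produced each P-value, so the result need not equal the reported P = 0.0435. The function names are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Unadjusted odds ratio for the table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def probability(x):
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = probability(a)
    tail = sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, tail)


# Reconstructed counts (assumptions): rows are IGHV4-39 yes/no,
# columns are t(4;14) yes/no.
table = (11, 13, 67, 205)
print(f"OR = {odds_ratio(*table):.2f}, P = {fisher_exact_two_sided(*table):.4f}")
```

This odds ratio is unadjusted and concerns t(4;14) only, so it is not comparable to the multivariate odds ratio for HRCA reported above.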

Fig. 4
Cytogenetic abnormalities and multivariate analysis in 301 multiple myeloma patients. (a) Landscape of IGHV gene subgroups and cytogenetic abnormalities. (b) Bar plot displaying the comparison of HRCA proportion between cases with IGHV4-39 clonal rearrangement and five other IGHV subgroups. Statistical differences were calculated by Fisher’s exact test, P < 0.05. (c) Forest plot displaying the odds ratios (ORs) (95% CIs) and P-values of clinical features derived from multivariate logistic regression analyses for HRCA in the overall cohort. P < 0.05. NS, not significant.

Table 2 Baseline characteristics in MM patients with and without clonal IGHV4-39 gene rearrangements.

We subsequently analyzed the correlation between the unique IGHV usage and IGH/14q32-related chromosomal translocations. IGHV4-39 had the highest frequency (14.1%, 12/85) among patients with IGH/14q32-related HRCA and was the most frequently used gene (14.1%; 11/78) in the t(4;14) group. Conversely, only 1 case with IGHV4-39 (2.2%; 1/45) was detected in the t(11;14) group (Fig. 5). However, the difference between the two groups did not reach statistical significance.

Fig. 5
Analysis of IGHV genes and IGH/14q32-related chromosomal translocations. Waterfall plots showing the association between IGHV genes and IGH/14q32-related cytogenetic abnormalities, including t(4;14), t(11;14), t(14;16), and t(14;20). The red bar plot on the right side shows the frequency of 14q32-related HRCA. NA, not available.

To explore how SHM rate and HCDR3 length relate to t(4;14) and IGHV4-39, we divided all patients into four groups according to t(4;14) and IGHV4-39 status. Patients with both t(4;14) and IGHV4-39 had a significantly higher mutation rate than those with neither (median SHM: 12.0% vs. 8.9%, P = 0.0278). There was no statistical difference among the other groups (see Supplementary Fig. S3a online). Furthermore, we analyzed the distribution of HCDR3 lengths and found no significant correlation between IGHV mutation rate or t(4;14) and HCDR3 length (see Supplementary Fig. S3b and S3c online).

Clinical relevance (an exploratory study)

To further investigate the potential clinical implications of IGHV4-39, we conducted an exploratory analysis of the association between IGHV4-39 and clinical outcome. A total of 274 patients were available for follow-up, including 24 cases with IGHV4-39. The median follow-up time was 28 months (range: 1–75 months). No statistically significant differences were found between patients with and without IGHV4-39 in either progression-free survival (PFS) or overall survival (OS) (P > 0.05, see Supplementary Fig. S4a and S4b online). However, these results should be interpreted with caution because the overall patient population is highly heterogeneous, spanning all stages and different treatments.

Discussion

For the first time, we investigated the IGH gene repertoire at diagnosis in a cohort of 301 Chinese patients with NDMM using an NGS assay. Our results represent the largest IGH dataset from an MM series in Asia. Consistent with previous studies25, the vast majority of patients (93.0%) displayed at least one productive clone. Gkoliou et al.21 studied the IGH gene repertoire of Western MM patients in the largest such population, comprising patients from Greece (n = 176), Italy (n = 72), Spain (n = 201), and the IMGT/LIGM-DB public database (n = 74). Medina et al.24 characterized IGH gene rearrangements in 413 myeloma patients treated in Spanish trials, including 113 patients characterized by NGS. From Asia, only 43 Korean patients have been reported22. The repertoire of IGH gene usage in Chinese patients with MM has not been reported to date. We compared the VDJ gene usage among Chinese, Western, and Korean patients with MM. MM patients in our cohort displayed a skewed IGH gene repertoire. Consistent with the Western population, IGHV3-30, IGHD3-10, and IGHJ4 were the most abundant IGHV, IGHD, and IGHJ genes, respectively. However, variations were noted in the VDJ gene usage. Notably, IGHJ6 was more prevalent, whereas IGHJ3 was relatively rare, in Chinese patients compared to their Western counterparts. Remarkably, IGHV4-39 (8.2%) was prominently used in our patients but was less frequent in Western MM cohorts (3.4%)21 and absent in Korean patients22. This gene segment appears to be specifically favored in Chinese myeloma cases. Intriguingly, both our study26 and previous research27 indicated a higher incidence of IGHV4-39 usage in Chinese chronic lymphocytic leukemia (CLL) compared to Western CLL. Medina et al.24 reported that gene selection was biased in MM, with a significant overrepresentation of IGHV3, IGHD2, and IGHD3, as well as of the IGHJ4 gene group, compared to the normal B-cell repertoire. Kim et al.22 found that IGHV3-9, IGHV4-31, and IGHD3-3 were common in Korean patients with MM but not in Western patients. Furthermore, our own and other previous studies have demonstrated that Chinese patients exhibit distinct IGHV gene usage and gene mutation profiles in CLL and lymphoplasmacytic lymphoma/Waldenström macroglobulinemia compared to Western cohorts28,29. These findings suggest that ethnicity and geography play roles in shaping IGHV usage. Therefore, the slightly different IGH gene repertoire, particularly the elevated frequency of IGHV4-39 gene usage in Chinese MM patients, may be attributed to their distinct ethnic backgrounds and antigenic stimulation within the same disease context.

We further investigated the characteristics of IGHV4-39 gene rearrangements, including the HCDR3 sequence, D-J gene pairing, SHM rate, and HCDR3 length. Notably, we observed that 30.4% of patients exhibited biased pairing of IGHD3 and IGHJ4 genes. IGHV genes play a crucial role in determining epitope binding affinity and in B-cell differentiation in B-cell neoplasms15. Extensive studies have examined IGHV sequences in various human B-cell tumors. Previous research has shown that IGHV4-39 exhibits the highest affinity for non-muscle myosin II, an autoantigen potentially implicated in CLL promotion30. Furthermore, the usage of IGHV4-39 along with stereotyped HCDR3 has been identified as an independent risk factor for Richter syndrome transformation31. However, MM cases with stereotyped HCDR3 have not been identified so far21. Interestingly, a higher SHM rate and shorter HCDR3 length were observed in patients with IGHV4-39 compared to those without IGHV4-39 in our cohort. The SHM status serves as evidence of the origin of tumor B cells and reflects the status of tumor clones before and after transformation32. It is established that the SHM status of IGHV is a prognostic factor in CLL33. Mutated IGHV genes in myeloma suggest derivation from B cells that have undergone a germinal center reaction in response to antigen stimulation. The relatively higher mutation rate observed in IGHV4-39 among MM patients raises the question of whether it reflects developmental antigen selection induced by a specific exogenous antigen.

To the best of our knowledge, this study is the first to examine the association between clonal IGH rearrangement and cytogenetic alterations in MM. We investigated the correlation between IGHV and cytogenetic abnormalities in MM patients using cytogenetic data obtained at diagnosis. The t(4;14) translocation was the most common IGH translocation in our cohort, consistent with a report by Fan et al. in Chinese MM patients34; its frequency is higher than that reported in Western cohorts (~15–20%), as previously described24. We observed for the first time that HRCA was more frequent in MM patients with IGHV4-39 than in those with other IGHV genes. Importantly, the IGHV4-39 gene was more frequently detected in cases with t(4;14) than in those without. The presence of specific cytogenetic abnormalities confers heterogeneity in prognosis. The t(4;14) defines a unique subtype associated with poor clinical outcomes, as previously reported35. Previous studies have identified certain genetic lesions, such as recurrent IGH translocations and hyperdiploidy, as oncogenic drivers in MM36. These translocations typically result in overexpression of partner genes. Among them, only the t(4;14) generates a fusion protein involving MMSET and FGFR3 genes, which significantly influences disease behavior, including transforming ability, impact on MM cell growth and survival, and clinical outcome37. However, it has been reported that the t(4;14) alone is insufficient to directly lead to myeloma. Recent research identified true high-risk t(4;14) patients by analyzing the coordinates of translocation breakpoints in the NSD2 gene38. Similarly, in our study, we observed that cases with t(4;14) were more likely to carry the IGHV4-39 gene segment, suggesting that distinct IGHV repertoires in MM patients may contribute to secondary genetic changes.

We also explored the potential clinical implications of IGHV4-39 gene rearrangement. No statistically significant differences in survival were found between patients with and without IGHV4-39. A previous Spanish study showed that the use of the IGHD2 and IGHD3 groups was associated with improved prognostic features and prolonged progression-free survival in MM patients24. These findings suggest that IGH gene rearrangements could be considered as new molecular markers for PFS and OS in MM. Whether IGHV4-39 could serve as a prognostic marker requires further research. Additional studies are needed to better understand the nature and mechanism of IGHV4-39.

Several limitations of the present study should be considered. Although the overall cohort is large, only 24 cases carried an IGHV4-39 rearrangement, so some analyses (e.g., SHM rate and HCDR3 length) may be underpowered. Additionally, the relatively small sample size and short follow-up period may be insufficient for assessing the impact of IGHV4-39 on clinical outcomes. Lastly, as this was a single-center retrospective study, the results may not be fully representative of the Chinese population. Further studies involving a larger number of Chinese patients with MM, especially those with IGHV4-39, and more comprehensive data are needed to confirm our observations.

In conclusion, we have presented the distinct clonal profiles of IGH gene rearrangements in Chinese NDMM patients. The clonal IGHV4-39 rearrangement, as a particular subgroup, was characterized by predominant usage, high SHM status, and short HCDR3 length, as well as an association with high-risk cytogenetic abnormalities involving t(4;14) in our cohort. This study is the first to profile the IGH repertoire and to investigate the correlation between specific IGHV genes and cytogenetic lesions in Chinese MM patients.