Introduction

Acute lactational mastitis refers to acute purulent infection of the breast in lactating women1. It is caused mainly by bacterial infection, most commonly by Staphylococcus aureus and Streptococcus species. These bacteria can invade the breast tissue by way of nipple fissures, duct blockage, milk stasis and other routes2,3. The main symptoms are breast pain, redness, swelling and fever3. These symptoms not only cause severe discomfort but can also disrupt breastfeeding, which may reduce the infant’s nutritional intake. Recent epidemiological data show that the incidence of acute mastitis in lactating women ranges from 2.2% to 10% and is rising steadily4.

Although the pathophysiological mechanism of acute mastitis is relatively clear, its etiology remains complex. Its occurrence involves multiple risk factors5, including breastfeeding technique6, abnormal breast structure and immune status7. Currently, clinical prevention of acute mastitis depends mainly on health education, and diagnosis and treatment are usually based on clinical manifestations and laboratory test results8. However, because the early symptoms are non-specific and onset is difficult to predict, diagnosis and treatment are often delayed. Such delays complicate early intervention and may lead to complications such as abscess formation, which in turn impair treatment outcomes9. Identifying the key risk indicators for acute mastitis in lactating women is therefore crucial for both effective prevention and timely diagnosis and treatment.

In recent years, big data and machine learning (ML) technologies have progressed. These advances allow more detailed analysis of medical data and, in turn, the prediction of disease risk10. ML algorithms can extract valuable information from complex data sets and identify potential risk factors, which supports the early detection of and intervention in disease11. Previous studies have explored the prediction of acute mastitis using retrospective data. However, many of these studies have limitations, such as small sample sizes, incomplete consideration of risk factors, and models with insufficient generalization ability12. Logistic Regression (LR) and Naive Bayes (NB) are widely used algorithms, but both are relatively simple and may have difficulty screening out the relevant indicators from complex data sets13. XGBoost, an enhanced version of gradient boosting, performs better in terms of accuracy, stability, scalability and customizability14,15. Multilayer Perceptron (MLP), an artificial neural network model, is another powerful and flexible option. It excels at learning complex patterns through multiple layers of nonlinear transformations and is especially good at capturing complex relationships in high-dimensional data16,17, which makes it suitable for tasks such as regression and classification. Unlike relatively simple models such as LR or NB, an MLP can extract complex features and interactions between variables.

In this study, we adopted four ML models, namely LR, NB, XGBoost and MLP. We carried out a retrospective analysis of twelve relevant indicators in 369 patients with acute mastitis and 447 healthy controls. The aim was to develop a more accurate and reliable prediction model for acute mastitis in lactating women; the MLP model showed the most promise. Such a model is intended to flag the risk of disease before the clinical manifestations of acute mastitis appear, enabling earlier intervention and, ultimately, reducing disease severity and improving the overall health of mothers and infants.

Materials and methods

Research object

In recent years, given the rapid advancement of big data and ML technologies, a growing number of studies have employed these technologies to carry out in-depth analyses of medical data for predicting disease risks. The present work was a retrospective case-control study. A total of 369 patients with acute lactational mastitis admitted from January 2016 to December 2023 were included, and 447 healthy individuals without breast disease seen during the same period were selected as the control group.

Inclusion and exclusion criteria

This study focused on breastfeeding women aged 18 to 50 years whose acute mastitis status (present or absent) was clearly established. Participants were required to have a complete medical history, including information on age, primiparity status, history of breast surgery, nipple trauma, puerperium, gestational diabetes, and relevant biomarkers, e.g., C-reactive protein (CRP), procalcitonin (PCT), neutrophils (NE) and white blood cells (WBC). Symptoms such as elevated body temperature, general pain, headache, chills, and abnormal nipple discharge were also considered in the diagnosis.

Exclusion criteria were age outside the 18 to 50 year range, non-lactating status, and incomplete medical data (more than 10% missing). Women with other breast diseases that could interfere with the diagnosis of acute mastitis, with severe systemic diseases, or taking medications that affect the occurrence or diagnosis of mastitis were also excluded. Additionally, women whose acute mastitis occurred during pregnancy rather than during lactation were excluded. The study received ethical approval from the Ethics Committee of Liuzhou Hospital of Guangzhou Women and Children’s Medical Center (approval number: 2024-212).

Determination time of biomarkers

During routine postpartum examinations, blood samples for biomarker measurement were collected from both the control and case groups, with additional sampling for the case group at the time of acute mastitis diagnosis. These examinations, which are standard procedures for lactating women, involve a systematic assessment of maternal health. The measured indicators include CRP and PCT levels and routine blood tests, which together evaluate the inflammatory status and help exclude diseases that may affect breastfeeding and maternal health.

Establishment of machine learning models

To develop the predictive models, we used four algorithms: LR, NB, XGBoost, and MLP. The dataset was divided into training and testing subsets to evaluate model effectiveness, and cross-validation was employed to enhance model robustness. Models were assessed using ROC curves, accuracy, sensitivity, specificity, and F1 score, and confusion matrices were analyzed to understand each model’s strengths and weaknesses in predicting acute mastitis.
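
As a minimal illustration of the evaluation metrics named above, the following Python sketch derives accuracy, sensitivity, specificity, and F1 score from the four cells of a binary confusion matrix. The function name and the counts in the usage example are hypothetical and are not taken from the study; the sketch also assumes that every denominator is non-zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion-matrix counts.

    tp/fn: cases predicted positive/negative; tn/fp: controls predicted
    negative/positive. Assumes no denominator is zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
print(metrics)
```

Sensitivity and specificity are reported separately because a screening model for mastitis may tolerate false alarms more readily than missed cases; accuracy alone would hide that trade-off.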

During model construction, we performed 5 rounds of 10-fold cross-validation combined with grid search for parameter optimization. This approach aimed to ensure that models would not overfit the training data while maintaining performance and predictive ability on unseen data. Through this method, we confirmed that the models exhibited good generalization performance. Prior to evaluation, the optimal decision threshold was determined by maximizing the Youden index on the ROC curve of the training data.
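
The threshold selection described above can be sketched as follows. The Youden index is J = sensitivity + specificity - 1, and the chosen threshold is the candidate cut-off that maximizes J on the training data. This is an illustrative sketch in plain Python, not the code used in the study; the function name, the labels, and the scores are hypothetical, and ties in J are resolved here by keeping the lowest threshold.

```python
def youden_threshold(y_true, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample is predicted positive when its score is >= threshold.
    Assumes y_true contains at least one 1 and at least one 0.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    best_threshold, best_j = None, -1.0
    # Every distinct score is a candidate cut-off.
    for threshold in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # strict comparison keeps the lowest tied threshold
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Hypothetical training labels (1 = mastitis) and predicted probabilities.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
threshold, j = youden_threshold(y_true, scores)
print(threshold, j)
```

Fixing the threshold on the training data and then applying it unchanged to the testing data keeps the test metrics free of threshold tuning.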

Data statistics

During data preprocessing, counts and proportions were computed for count data to show the distribution across categories. The chi-squared test, a standard statistical method for categorical data, was then used to evaluate associations between categorical variables.
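
For a 2 × 2 table, the chi-squared test described above reduces to a closed-form expression. The sketch below computes the Pearson statistic without continuity correction and its p-value for one degree of freedom. The function name and the counts are hypothetical (rows could be, for example, cases and controls; columns, nipple cracking present and absent), and whether the study applied a continuity correction is not stated in the text.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts, for illustration only.
statistic, p_value = chi_squared_2x2(30, 70, 10, 90)
print(statistic, p_value)
```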

For continuous data that did not follow a normal distribution, the non-parametric Mann-Whitney U test was used to compare the two independent groups. Because this test does not assume a particular data distribution, it is well suited to continuous data that are not normally distributed. This approach supported the robustness and reliability of the analysis, providing data support for subsequent clinical applications and decision-making.
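
To make the rank-based nature of the Mann-Whitney U test concrete, the sketch below computes the U statistic for two samples, assigning average ranks to tied values. It stops at the statistic; a p-value would additionally require the exact null distribution or a normal approximation. The function name and the values are hypothetical and are not data from the study.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples, using average ranks for ties."""
    pooled = [(value, 0) for value in sample_a] + [(value, 1) for value in sample_b]
    pooled.sort(key=lambda pair: pair[0])

    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for k in range(i, j):
            if pooled[k][1] == 0:
                rank_sum_a += average_rank
        i = j

    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical CRP values (mg/L), for illustration only.
crp_cases = [5.0, 12.0, 20.0, 31.0]
crp_controls = [2.0, 3.5, 5.0, 6.5]
u_cases, u_controls = mann_whitney_u(crp_cases, crp_controls)
print(u_cases, u_controls)
```

Here U for the case sample counts the case-control pairs in which the case value is larger, with ties counted as one half.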

Results

General data comparison

The mean age of the non-acute mastitis group was 32.776 years (standard deviation 7.23 years), and that of the acute mastitis group was 34.36 years (standard deviation 6.06 years). The mean CRP value was 6.7 mg/L in the non-acute mastitis group and 21.1 mg/L in the acute mastitis group.

In the acute mastitis group, the mean WBC count was 9.239 × 10⁹/L. In the non-acute mastitis group, the mean WBC count was 7.219 × 10⁹/L, with a maximum value of 9.329 × 10⁹/L. The mean lymphocyte count was 3.72 × 10⁹/L in the non-acute mastitis group, and the corresponding value in the acute mastitis group was 3.29 × 10⁹/L.

Age, nipple cracking (cracked), CRP, NE, and WBC all differed significantly between the two groups (p < 0.05) (Table 1).

Table 1 Baseline characteristics of the acute mastitis and non-acute mastitis groups.

Model evaluation and assessment

Machine learning models were constructed from variables such as age, nipple cracking (cracked), CRP, NE, and WBC using four algorithms: logistic regression, Naive Bayes, XGBoost, and Multilayer Perceptron. The multilayer perceptron model showed a relatively high area under the receiver operating characteristic curve (AUROC) on both the training set and the test set, highlighting its ability to distinguish between the two classes (Figs. 1A-B). Table 2 gives the detailed performance parameters of the models built with the different algorithms. Model performance is reported separately for the training set (used for model optimization) and the testing set (used for generalization validation); the testing set metrics represent the predictive efficacy expected in clinical application.

Table 2 Performance comparison of four algorithms on training and testing datasets.

Fig. 1 Evaluation of the machine learning models. (A) ROC curve and AUROC for the training set; (B) ROC curve and AUROC for the testing set.
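
The AUROC values summarized in Fig. 1 have a simple interpretation: the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control. The sketch below computes AUROC directly from that definition by comparing every case-control pair. It is a small illustrative implementation with hypothetical labels and scores; its quadratic cost makes it suitable only for small samples.

```python
def auroc(y_true, scores):
    """AUROC as the fraction of case-control pairs ranked correctly.

    Tied pairs count as one half. Assumes both classes are present.
    """
    case_scores = [s for y, s in zip(y_true, scores) if y == 1]
    control_scores = [s for y, s in zip(y_true, scores) if y == 0]
    correct = 0.0
    for case_score in case_scores:
        for control_score in control_scores:
            if case_score > control_score:
                correct += 1.0
            elif case_score == control_score:
                correct += 0.5
    return correct / (len(case_scores) * len(control_scores))


# Hypothetical labels (1 = mastitis) and predicted probabilities.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
auc_value = auroc(y_true, scores)
print(auc_value)
```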

DCA curve analysis

Decision curve analysis (DCA) is a tool for evaluating the clinical value of a model. It conveys the practical value of a model by comparing the expected net benefit at different decision thresholds. As shown in Fig. 2, the MLP model had the highest net benefit at most thresholds on the test set. This advantage indicates that the model not only performs well statistically but also offers considerable clinical value in practice, supporting the MLP model as one of the most suitable models to use.

Fig. 2 DCA curves of multiple models in the test set.
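
The net benefit plotted in a decision curve is conventionally defined, at threshold probability p_t, as TP/n - (FP/n) × p_t / (1 - p_t), where n is the number of subjects. The sketch below computes this quantity for a model and for the "treat all" reference strategy. It is a minimal sketch of the standard definition, not the code used to produce Fig. 2; the function names, labels, and probabilities are hypothetical.

```python
def net_benefit(y_true, probabilities, threshold):
    """Net benefit of treating subjects whose predicted probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probabilities) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probabilities) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every subject."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical labels (1 = mastitis) and predicted probabilities.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
probabilities = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
model_nb = net_benefit(y_true, probabilities, threshold=0.5)
treat_all_nb = net_benefit_treat_all(y_true, threshold=0.5)
print(model_nb, treat_all_nb)
```

A model is clinically useful at a given threshold only if its net benefit exceeds both the "treat all" value and zero, the net benefit of treating no one.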

Discussion

This study aimed to construct a machine learning based predictive model and to identify the risk factors associated with acute mastitis in lactating women. The data set came from a case-control study involving 369 lactating women diagnosed with acute mastitis and 447 breastfeeding women without this condition. Data on various potential risk indicators, such as age, primiparity status, history of breast surgery, nipple trauma, external breast trauma, postpartum status, history of gestational diabetes, and abnormal nipple discharge, were collected from these participants.

Machine learning techniques, especially XGBoost and MLP, are used to build prediction models. XGBoost is a powerful variant of gradient boosting and is known for its excellent performance characteristics, including high precision, robustness, scalability, and customizability18. On the other hand, MLP is an artificial neural network that is good at capturing complex patterns through multilayer non-linear transformations, making it especially good at learning complex relationships in high-dimensional data16. This ability makes MLP an efficient tool for regression and classification tasks and it often outperforms traditional algorithms in terms of flexibility and predictive ability19.

The application of these advanced machine learning methods in the field of medical diagnosis, especially in the prediction and diagnosis of acute mastitis, demonstrates their ability to handle complex high-dimensional data sets and reveal underlying disease patterns20,21. For example, XGBoost can assist clinicians by analyzing clinical symptoms, diagnosis results, and imaging data, so as to improve the accuracy and timeliness of mastitis diagnosis22. Meanwhile, MLP can reveal non-linear relationships in a wide range of data sets, including patients’ medical history and genetic information, providing more personalized treatment options23. These machine learning methods not only improve the diagnostic efficiency of acute mastitis but also help promote personalized medicine, making clinical decisions more precise24.

In this research, the logistic regression model was also fitted to the data, and the results highlighted several key variables significantly associated with the onset of acute mastitis: age, nipple cracking, CRP level, NE count, and WBC count. Using these insights, a comprehensive machine learning based prediction model was developed, which demonstrated high accuracy and precision. Various machine learning algorithms, including logistic regression, were used to train and validate this model, ultimately resulting in a powerful tool for predicting the probability of acute mastitis in lactating women.

The predictive model in this study showed an outstanding ability to discriminate, effectively separating patients with acute mastitis from healthy breastfeeding women, and it was especially adept at identifying individuals at high risk. This ability is of great clinical significance because it supports clinical decision-making. Compared with earlier research, the model showed better sensitivity and AUROC5, highlighting the promise of machine learning techniques for improving clinical risk assessment for acute mastitis25.

In this study, age emerged as an important factor related to the development of acute mastitis, with a distinct connection to both the onset and the progression of the disease. Acute mastitis affects between 2% and 30% of breastfeeding women globally, and most cases occur in the first three weeks after childbirth26. Younger women who have recently given birth, particularly those who start breastfeeding soon after delivery, are at a greater risk of developing acute mastitis. Prior research has noted that the average age of women with lactational mastitis or breast abscesses is 29 years, reinforcing the notion that younger women are more prone to this health issue5.

Damage to the nipple was found to elevate the risk of acute mastitis. Such damage includes cracks as well as abnormalities such as inversions or deformities. These issues might block the milk ducts and hinder the normal flow of milk27,28. The results of this study confirm earlier findings and underscore the importance of keeping the nipple in good condition to prevent the onset of acute mastitis.

CRP is a significant indicator of inflammation, and its level is associated with the severity of acute mastitis. In mastitis, CRP levels are typically elevated in both blood serum and breast milk, reflecting the degree of systemic inflammation29,30. In a study of patients with idiopathic granulomatous mastitis (IGM), serum CRP and interleukin-6 (IL-6) levels were markedly higher in patients with severe IGM than in those with less severe disease30,31. These outcomes emphasize the role of CRP as a marker of inflammation that is relevant to both the diagnosis and the management of mastitis.

Neutrophils (NE) are pivotal in the inflammatory response to mastitis, acting as the main cells that combat pathogens such as Escherichia coli32. In cows with mastitis, changes in neutrophil DNA methylation have been observed; these changes are linked to shifts in gene expression and microRNA regulation and may contribute to the development of mastitis33. Pro-inflammatory cytokines also play a part. For example, tumor necrosis factor-alpha (TNF-α), produced by macrophages in the mammary gland, helps recruit neutrophils to the site of infection, and molecules such as interleukin-8 (IL-8) further increase neutrophil recruitment34. Nitric oxide (NO) also affects neutrophil function and may regulate their movement and activity during mastitis35.

WBC, notably polymorphonuclear neutrophils (PMNs), are crucial in the immune response during mastitis, migrating from the blood to the mammary gland to fight infection36. While the inflammatory response is necessary to remove pathogens, excessive inflammation can damage tissue, so the immune response must be properly controlled to avoid additional harm to mammary tissue37. This process is tightly regulated by both the innate and adaptive immune systems: T and B lymphocytes are activated when they encounter antigen-presenting cells and also respond to other immune signals38.

The predictive models developed in this study have considerable clinical value, enabling the identification of women at high risk of acute mastitis. This allows for the timely implementation of interventions and preventive measures, thereby reducing the incidence and severity of mastitis and improving patient outcomes. The models also provide new insights into the etiology and pathophysiological mechanisms of acute mastitis, pointing the way for future research.

However, the clinical application of the models is constrained by the timing of biomarker measurements (e.g., CRP), which were only conducted during routine postpartum examinations and at the time of diagnosis. This may hinder their ability to effectively predict risk before the onset of clinical symptoms. If these biomarkers only exhibit significant changes after symptom occurrence, the models’ early predictive capacity will be limited. Future studies should explore the changes in these biomarkers prior to symptom onset to enhance the models’ early prediction capability.

Although the findings of this study are encouraging, certain limitations must be acknowledged. First, the data were gathered from a single medical institution, which could restrict the applicability of the results to other regions or populations.

In addition, the design of the study did not account for every possible risk factor associated with acute mastitis. As a result, the predictive model may not encompass all variables that affect the onset of the disease. Consequently, it is necessary to validate the model in larger and more varied populations to evaluate how widely applicable it is.

Moreover, subsequent research should investigate additional potential risk factors, such as genetic tendencies, environmental exposures, and lifestyle choices. Such exploration could refine and improve the model’s predictive power.

Conclusion

This study applied machine learning techniques to successfully construct a prediction model for acute mastitis in lactating women, providing important references for clinical practice and scientific research. The research identified age, nipple fissures, CRP, neutrophils, and WBC as the five key predictive indicators. However, the biomarkers used in the current model are only measured during routine postpartum examinations and diagnosis, which may limit the model’s predictive ability before the onset of clinical symptoms. Future studies should further optimize the model based on multicenter, large-sample data and explore the temporal dynamic changes of these biomarkers before symptom occurrence, so as to enhance the early predictive value of the model and the accuracy of its practical clinical application.