Introduction

Mastitis is one of the most prevalent and costly diseases in dairy herds in developed countries1 and a significant economic burden for dairy farms. It is often associated with a constellation of behavioral changes, e.g. reduced activity, decreased feed intake, and diminished daily social interactions2. Identifying the specific behavioral alterations associated with mastitis can facilitate early detection and treatment3. Several key risk factors influence mastitis, e.g. poor hygiene practices, mastitis-causing pathogens4, nutritional imbalances5, stress, and suboptimal milking techniques6. Developing countries face additional challenges due to limited resources and educational disparities. Although subclinical mastitis does not typically result in visible changes in milk or udder appearance, it is characterized by reduced milk production, elevated somatic cell counts, and altered milk composition8. Reproductive performance is also susceptible to the effects of mastitis, which has been shown to affect both production and reproductive traits7. Advanced data analysis techniques have been used to improve the accuracy of mastitis detection. Fadul-Pacheco et al.9 explored various machine learning algorithms, demonstrating their efficacy in early prediction of clinical mastitis with high accuracy and reliability. Similarly, Khatun et al.10 developed a novel detection method for automatic milking systems, integrating sensor data to improve diagnostic precision and system compatibility. Kamphuis et al.11 emphasized the effectiveness of decision-tree induction in analyzing sensor data, significantly enhancing clinical mastitis detection compared to traditional methods. Cavero et al.12 used artificial neural networks (ANNs) to detect mastitis, achieving high sensitivity and specificity, while Polat et al.13 demonstrated the utility of infrared thermography as a non-invasive tool for identifying subclinical mastitis. Sun et al.14 further advanced the field by employing ANNs to detect mastitis and its progression stages in automatic milking systems, underscoring the importance of early intervention.

Fuzzy logic was introduced in 1965 by Lotfi Zadeh15. There are two main types of fuzzy logic systems. In Type-1 Fuzzy Logic Systems (T1FLS), membership grades are crisp numbers in the interval [0, 1]16,17,18,19. T1FLS operate through three key processes: (1) fuzzification20; (2) inference with a rule base21; and (3) defuzzification21. T1FLS are noted for their simplicity in design and implementation compared to type-2 systems18. In Interval Type-2 Fuzzy Logic Systems (IT2FLS), membership functions are themselves represented as fuzzy sets with crisp boundaries21. IT2FLS offer enhanced capabilities for handling uncertainty and imprecision but are computationally more complex than T1FLS18,19,22, whereas T1FLS are easier to design and implement but offer limited flexibility in managing uncertainty. Both T1FLS and IT2FLS are broader frameworks that can incorporate either Mamdani or Sugeno fuzzy models23. The sole difference between these two models is their defuzzification process24; Sugeno fuzzy inference uses singleton output membership functions25. Fuzzy logic enables data granulation for informed decision-making and enhances the efficiency of the multifactorial production processes associated with dairy cattle26,27,28,29. It has been used for estrus detection and its improvement26,29, building models for animal culling27, subclinical mastitis detection12,30,31,32, lameness31,33, milk production, feed consumption34, carcass weight, udder health6, qualitative analysis of the physical and chemical characteristics of milk35, heat stress20, cattle disease36, and automatic monitoring of livestock status. The Adaptive Neuro-Fuzzy Inference System (ANFIS) is a hybrid intelligent system that combines the capabilities of fuzzy logic and neural networks27,37,38. ANFIS has been used to optimize methane production from pig dung18 and to model mastitis, which involves the analysis of a range of input parameters39,40,41.

This study pioneers the application of ANFIS models to improve the accuracy of mastitis classification in Iranian Holstein dairy cattle, as evaluated by veterinary experts. By using expert knowledge as the ground truth, this novel approach seeks to minimize uncertainty in mastitis diagnosis, which enables farm managers to make well-informed and timely interventions. Ultimately, this will lead to better udder health, higher quality milk, and greater productivity, with significant positive economic effects for the dairy sector as a whole.

Materials and methods

Data

This study analyzed a dataset comprising 184,412 records with 31 features, collected from 69 dairy herds in Isfahan, Iran. The primary objective was to employ fuzzy-based classification to categorize mastitis cases into three codes: I (healthy), II (suspected/subclinical), and III (treated by veterinarian).

Data preprocessing

This study analyzed 69,034 Holstein cattle from 69 herds in Isfahan province, with data collected over a 10-year period (2011–2021). The dataset was compiled from veterinary records, dairy herd improvement test-day records, and automated milking systems, ensuring comprehensive coverage of production and health indicators. The full dataset contained the following fields: Herd ID, Animal ID, Sire ID, Dam ID, Last recorded parity, Date of birth, Parity, First service sire ID, Conception sire ID, Last insemination date, Number of Inseminations per Conception, Calving type, Days open, Calving date, Calving interval, Dystocia score, Birth type (single or multiple), Placental expulsion date, Culling date, Placental expulsion notes, Milk yield, Days in milk, Calf sex, Calf birth weight, Average milk yield, Fat percentage, Protein percentage, Somatic cell count, Age at first calving, Dry period length, Days to first service. Table 3 in the supplementary material provides detailed descriptive statistics for the selected features. To further decrease data dimensionality and minimize the number of rules within the fuzzy inference system, two feature reduction techniques were applied: principal component analysis (PCA) and Pearson-based correlation.

Feature reduction

In this section, we employed two distinct methodologies to reduce data dimensionality and minimize the number of rules within the fuzzy inference system: Principal Component Analysis (PCA) and Pearson-based correlation. PCA was applied to identify the most significant variables contributing to the variance in the output. Specifically, PCA revealed that five variables—protein percentage (PP), fat percentage (FP), calving interval (CI), number of inseminations per conception (NIC), and somatic cell count (SCC)—accounted for 90% of the variance in the output variable. This reduction was achieved by transforming the original dataset into a set of orthogonal components, retaining those that explained the majority of the variance. On the other hand, Pearson correlation was used to assess the linear relationships between variables, yielding a different set of key variables: SCC, milk period (MP), total milk (TM), PP, and age at first calving (AFC). The Pearson correlation coefficients were calculated to determine the strength and direction of these relationships, ensuring that only the most relevant features were retained for further analysis. All calculations, including the implementation of PCA and Pearson correlation, were performed using MATLAB 2022 software, which facilitated the efficient processing and analysis of the dataset. This methodological approach ensured that the dimensionality of the data was effectively reduced, enabling the development of more streamlined and interpretable fuzzy inference systems.
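The two reduction steps described above can be illustrated with a minimal Python sketch. It is not the MATLAB code used in the study: the function names, the 90% threshold argument, and the synthetic data in the usage note are assumptions made only for illustration. PCA is computed from the correlation matrix of the standardized features, and the Pearson step simply ranks features by the absolute value of their correlation with the class label.

```python
import numpy as np

def pca_explained_variance(X):
    """Eigenvalues (descending) and explained-variance ratios of standardized X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # z-score each feature
    corr = np.cov(Z, rowvar=False)                     # correlation matrix of X
    eigvals = np.linalg.eigvalsh(corr)[::-1]           # sort eigenvalues descending
    return eigvals, eigvals / eigvals.sum()

def n_components_for(ratios, threshold=0.90):
    """Smallest number of leading components whose cumulative ratio reaches threshold."""
    cum = np.cumsum(ratios)
    return min(int(np.searchsorted(cum, threshold)) + 1, len(ratios))

def pearson_rank(X, y, names, k=5):
    """Top-k features ranked by |Pearson r| with the target (hypothetical helper)."""
    r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    order = np.argsort(-np.abs(r))[:k]
    return [(names[j], float(r[j])) for j in order]
```

With a feature matrix `X` (rows are records) and a label vector `y`, `n_components_for(pca_explained_variance(X)[1])` gives the number of components needed to reach 90% of the variance, and `pearson_rank(X, y, names)` returns the five features most strongly correlated with the label.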

Fuzzification and class imbalance

The selected variables were fuzzified into two groups, with the bell function serving as the membership function for each variable. The classification results showed an imbalanced class distribution, with 83% of data labeled as code I, 11% as code II, and 6% as code III. To address this class imbalance, a centroid undersampling algorithm was applied 1000 times using MATLAB, resulting in an equal dataset (n = 1000) for each group. Undersampling helps reduce overfitting to the majority class by decreasing the majority class data and enabling the model to better learn the minority class features. By following this structured approach, the study effectively processed and prepared the data for fuzzy-based classification, ensuring that the model could learn from a balanced dataset and make accurate predictions about mastitis cases in Holstein dairy cattle.
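The text does not spell out the exact centroid undersampling procedure, so the following Python sketch assumes the common cluster-centroid variant: each class is summarized by the centroids of a k-means clustering, so that every class ends up with the same number of representative points. The function names, the number of iterations, and the seeding are illustrative assumptions, not the study's MATLAB implementation.

```python
import numpy as np

def kmeans_centroids(X, k, n_iter=20, seed=0):
    """Plain Lloyd k-means; returns k centroids (fine for small illustrative data)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance of every point to every centroid, shape (n, k)
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):                 # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def centroid_undersample(X, y, n_per_class, seed=0):
    """Replace each class by at most n_per_class k-means centroids (assumed variant)."""
    Xs, ys = [], []
    for cls in np.unique(y):
        Xc = X[y == cls]
        k = min(n_per_class, len(Xc))
        Xs.append(kmeans_centroids(Xc, k, seed=seed))
        ys.append(np.full(k, cls))
    return np.vstack(Xs), np.concatenate(ys)
```

Calling `centroid_undersample(X, y, n_per_class=1000)` on data with the 83% / 11% / 6% class split would return 1000 points per class, provided each class holds at least 1000 records. The distance computation builds an array of size n × k × d, so a real dataset of this size would need a chunked or library-based k-means.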

The Takagi–Sugeno fuzzy model in ANFIS

The first-order Takagi–Sugeno (TS) fuzzy inference system was employed as the foundation for our modeling framework. Specifically, we used the adaptive neuro-fuzzy inference system (ANFIS) architecture, which enhances the TS model by integrating neural network principles to improve learning efficiency and adaptability. The ANFIS architecture employed in this study is illustrated in Fig. 1. With five input variables and two fuzzy sets (A and B) per input, 2^5 = 32 rules can be generated. In ANFIS, the fuzzy sets associated with the input variables are predominantly combined using the AND operator, because rules in a TS fuzzy inference system typically represent conditions that must be satisfied concurrently. For the sake of clarity and brevity, only a subset of 9 rules is presented here.

Fig. 1

Schematic diagram of the 5-layer ANFIS architecture employed in this study, corresponding to a first-order TS fuzzy model for five input variables. Each layer performs a specific function.

Rule 1. if \({x}_{1}\) is A and \({x}_{2}\) is A and \({x}_{3}\) is A and \({x}_{4}\) is A and \({x}_{5}\) is A, then \({y}_{1}={p}_{1}x+{q}_{1}y+{r}_{1}\).

Rule 2. if \({x}_{1}\) is A and \({x}_{2}\) is A and \({x}_{3}\) is A and \({x}_{4}\) is A and \({x}_{5}\) is B, then \({y}_{2}={p}_{2}x+{q}_{2}y+{r}_{2}\).

Rule 3. if \({x}_{1}\) is A and \({x}_{2}\) is A and \({x}_{3}\) is A and \({x}_{4}\) is B and \({x}_{5}\) is A, then \({y}_{3}={p}_{3}x+{q}_{3}y+{r}_{3}\).

Rule 4. if \({x}_{1}\) is A and \({x}_{2}\) is A and \({x}_{3}\) is B and \({x}_{4}\) is A and \({x}_{5}\) is A, then \({y}_{4}={p}_{4}x+{q}_{4}y+{r}_{4}\).

Rule 5. if \({x}_{1}\) is A and \({x}_{2}\) is B and \({x}_{3}\) is A and \({x}_{4}\) is A and \({x}_{5}\) is A, then \({y}_{5}={p}_{5}x+{q}_{5}y+{r}_{5}\).

Rule 6. if \({x}_{1}\) is A and \({x}_{2}\) is A and \({x}_{3}\) is A and \({x}_{4}\) is B and \({x}_{5}\) is B, then \({y}_{6}={p}_{6}x+{q}_{6}y+{r}_{6}\).

Rule 7. if \({x}_{1}\) is A and \({x}_{2}\) is A and \({x}_{3}\) is A and \({x}_{4}\) is A and \({x}_{5}\) is A, then \({y}_{7}={p}_{7}x+{q}_{7}y+{r}_{7}\).

Rule 8. if \({x}_{1}\) is A and \({x}_{2}\) is A and \({x}_{3}\) is B and \({x}_{4}\) is B and \({x}_{5}\) is B, then \({y}_{8}={p}_{8}x+{q}_{8}y+{r}_{8}\).

Rule 9. if \({x}_{1}\) is A and \({x}_{2}\) is A and \({x}_{3}\) is B and \({x}_{4}\) is A and \({x}_{5}\) is B, then \({y}_{9}={p}_{9}x+{q}_{9}y+{r}_{9}\).
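The size of the full rule base follows from combining every fuzzy set of every input. The short Python sketch below enumerates all antecedent combinations for five inputs with two labels each; the names `INPUTS`, `LABELS`, `enumerate_rules`, `format_rule`, and `rule_count` are hypothetical and serve only to illustrate the counting argument.

```python
from itertools import product

INPUTS = ["x1", "x2", "x3", "x4", "x5"]   # five input variables
LABELS = ["A", "B"]                       # two fuzzy sets per input

def enumerate_rules(inputs=INPUTS, labels=LABELS):
    """All antecedent combinations: one label per input, joined with AND."""
    return [dict(zip(inputs, combo)) for combo in product(labels, repeat=len(inputs))]

def format_rule(k, antecedent):
    """Render one rule in the same textual form as the rules listed above."""
    conds = " and ".join(f"{name} is {lab}" for name, lab in antecedent.items())
    return f"Rule {k}. if {conds}, then y{k} = f{k}(x1, ..., x5)"

def rule_count(n_inputs, n_mfs):
    """Number of grid-partition rules for n_inputs inputs with n_mfs sets each."""
    return n_mfs ** n_inputs
```

`len(enumerate_rules())` is 32, matching the rule count stated for Fig. 1, whereas `rule_count(5, 12)` is 248,832, which shows how quickly the rule base grows when more membership functions per input are used.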

Layer 1—Fuzzification: This layer transforms input data into fuzzy sets by using membership functions. For notational simplicity, the layer functions are written for a generic case with two inputs, x and y, each with two fuzzy sets. Every node i in this layer is an adaptive node with node function \(O_{1,i}=\mu_{A_i}(x)\) for \(i=1,2\), or \(O_{1,i}=\mu_{B_{i-2}}(y)\) for \(i=3,4\). The membership function can be any appropriate parameterized membership function.

Layer 2—Rule Evaluation/Implication: This layer evaluates the firing strength of each rule, i.e., how well the input data match the fuzzy sets in the rule’s antecedent. Each node in this layer is a fixed node, labeled Π, whose output is the product of all incoming signals: \(O_{2,i}=w_i=\mu_{A_i}(x)\,\mu_{B_i}(y)\), \(i=1,2\). The output of each node represents the firing strength of a rule. Any T-norm operator that performs a fuzzy AND operation may be used as the node function in this layer.

Layer 3—Normalization: This layer normalizes the firing strengths so that the total of all rule activations equals one, which is crucial for the weighted summation in the following layers. Each node in this layer is a fixed node labeled N. The ith node computes the ratio of the ith rule’s firing strength to the sum of all rules’ firing strengths: \(O_{3,i}=\overline{w}_i=\frac{w_i}{w_1+w_2}\), \(i=1,2\).

Layer 4—Consequent: Each node i in this layer is an adaptive node with node function \(O_{4,i}=\overline{w}_i f_i=\overline{w}_i(p_i x+q_i y+r_i)\), where \(\overline{w}_i\) is the normalized firing strength from layer 3 and \(\{p_i, q_i, r_i\}\) is the parameter set of this node. Parameters in this layer are referred to as consequent parameters.

Layer 5—Defuzzification: The final layer carries out the defuzzification process, combining the weighted rule outputs into a single crisp model output. This layer contains a single fixed node, labeled Σ, which calculates the overall output by summing all incoming signals: \(O_{5,1}=\sum_i \overline{w}_i f_i=\frac{\sum_i w_i f_i}{\sum_i w_i}\).
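The five layers can be traced in a short Python sketch of a single forward pass, written for the two-input, two-rule form of the layer equations. The generalized bell membership function \(1/(1+|(x-c)/a|^{2b})\) is assumed here because the study reports the bell function as its membership function; the parameter layout and the names `gbell` and `anfis_forward` are illustrative assumptions, and no training step is shown.

```python
import numpy as np

def gbell(x, a, b, c):
    """Generalized bell membership function with width a, slope b, center c."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def anfis_forward(x, y, premise_x, premise_y, consequent):
    """One forward pass of a two-input, two-rule first-order TS ANFIS.

    premise_x, premise_y: [(a, b, c), (a, b, c)] for sets A1, A2 and B1, B2.
    consequent: [(p, q, r), (p, q, r)] for rules 1 and 2.
    """
    # Layer 1: membership grades of each input in each fuzzy set
    mu_A = [gbell(x, *p) for p in premise_x]
    mu_B = [gbell(y, *p) for p in premise_y]
    # Layer 2: firing strength w_i = mu_Ai(x) * mu_Bi(y) (product T-norm)
    w = np.array([mu_A[i] * mu_B[i] for i in range(2)])
    # Layer 3: normalized firing strengths
    w_bar = w / w.sum()
    # Layer 4: weighted first-order consequents w_bar_i * (p_i x + q_i y + r_i)
    f = np.array([p * x + q * y + r for p, q, r in consequent])
    o4 = w_bar * f
    # Layer 5: overall output is the sum of all incoming signals
    return float(o4.sum())
```

When both rules fire with equal strength, the output reduces to the plain average of the two rule consequents, which gives a quick sanity check on the normalization in layer 3.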

The data were divided into training (80%) and testing (20%) sets and used to develop three ANFIS models trained with gradient descent (GD-ANFIS), particle swarm optimization (PSO-ANFIS), and a genetic algorithm (GA-ANFIS). To this end, different ANFIS configurations were trained and tested, varying the input membership function type (triangular, trapezoidal, Gaussian), the number of input membership functions (2 to 12), and the number of training epochs (50 to 500). To estimate the performance metrics (Table 2), a confusion matrix derived from the ANFIS outputs was employed. Because the output variable had three distinct levels, a 3 × 3 confusion matrix was generated. The True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN) values were calculated following the estimation method outlined by Fahmy Amin42. From the confusion matrix, the following metrics were computed to assess classification accuracy and reliability: Accuracy, Precision, Recall, F1-score, Specificity, Error rate, Type 1 error, and Type 2 error. Accuracy measures the overall correctness of the model as the proportion of correctly classified instances (both true positives and true negatives) relative to the total number of instances. Precision quantifies the model’s ability to identify positive cases (mastitis) without misclassifying negative cases (non-mastitis) as positive. Recall, also known as sensitivity, reflects the model’s capability to detect all actual positive cases, minimizing false negatives. The F1-score, the harmonic mean of precision and recall, is particularly useful in scenarios with imbalanced class distributions. Specificity evaluates the model’s ability to correctly identify negative cases, ensuring that non-mastitis instances are not misclassified as mastitis. The Error rate quantifies the overall proportion of misclassified instances. Type 1 error, the false positive rate, is the proportion of non-mastitis cases incorrectly classified as mastitis. Conversely, Type 2 error, the false negative rate, is the proportion of mastitis cases incorrectly classified as non-mastitis. These metrics collectively offer a comprehensive evaluation of the model’s predictive accuracy, robustness, and reliability in classifying mastitis cases in dairy cattle. All calculations were performed using MATLAB 2022 software. Figure 2 illustrates the overall framework of this study.
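As an illustration of how these metrics follow from a 3 × 3 confusion matrix, the Python sketch below uses a standard one-vs-rest reduction: each class in turn is treated as "positive" and the other two as "negative". This is an assumption for illustration and may differ in detail from the estimation method of ref. 42; rows are assumed to hold actual classes and columns predicted classes, and the function names are hypothetical.

```python
import numpy as np

def safe_div(num, den):
    """Division that returns 0.0 instead of failing on an empty denominator."""
    return float(num / den) if den else 0.0

def confusion_metrics(cm):
    """Overall and per-class metrics from a square confusion matrix.

    Assumes cm[i, j] counts records of actual class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    accuracy = safe_div(np.trace(cm), total)
    per_class = []
    for k in range(cm.shape[0]):
        tp = cm[k, k]
        fn = cm[k, :].sum() - tp          # actual k, predicted otherwise
        fp = cm[:, k].sum() - tp          # predicted k, actually otherwise
        tn = total - tp - fn - fp
        precision = safe_div(tp, tp + fp)
        recall = safe_div(tp, tp + fn)
        per_class.append({
            "precision": precision,
            "recall": recall,
            "f1": safe_div(2 * precision * recall, precision + recall),
            "specificity": safe_div(tn, tn + fp),
            "type1_error": safe_div(fp, fp + tn),   # false positive rate
            "type2_error": safe_div(fn, fn + tp),   # false negative rate
        })
    return {"accuracy": accuracy, "error_rate": 1.0 - accuracy, "per_class": per_class}
```

For example, `confusion_metrics([[8, 1, 1], [2, 7, 1], [0, 1, 9]])` treats the first row and column as code I and reports its precision, recall, specificity, and both error types alongside the overall accuracy.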

Table 1 PCA results: variance, eigenvalues, cos2, and contribution percentages.
Fig. 2

The overall framework of this study.

Results and discussion

This study employed fuzzy-based inference models to analyze mastitis in dairy cattle. Specifically, three ANFIS classification methodologies (GD-ANFIS, PSO-ANFIS, and GA-ANFIS) were used in conjunction with two feature reduction techniques, the Pearson-based method and PCA, to model categorized mastitis data. The variables selected by the Pearson reduction technique did not precisely match those retained by PCA, although the number of variables was the same (n = 5). The results of the PCA-based method are illustrated in Fig. 3. The left biplot in Fig. 3 shows the projection of the variables on the first two principal components, with PC1 on the x-axis and PC2 on the y-axis. The red arrows (vectors) indicate the contributions of the different variables (e.g., SCC, NIC, MP) to PC1 and PC2. Variables whose vectors lie close to one another are correlated, and those farther apart or opposite are less correlated or negatively correlated. For instance, SCC, AFC, and NIC cluster near the origin, suggesting that their contributions to PC1 and PC2 are small. MP contributes more along PC2, while PP contributes more toward PC1. The right biplot shows the projection of the variables on the third and fourth principal components, with PC3 on the x-axis and PC4 on the y-axis. It indicates that MP contributes significantly to PC4, whereas other variables (e.g., TM and FP) influence PC3 more strongly. Variables close to the origin (e.g., SCC and AFC) have minimal contributions to these PCs. The bottom panel in Fig. 3 shows the percentage of total variance explained by each principal component. PC1 explains the largest proportion of variance (around 22%), followed by PC2 (roughly 20%). Together, PC1 and PC2 capture about 42% of the variability in the data, more than any other pair of components, which makes them the most important for describing the dataset. Table 1 provides further detail on the PCA.

Fig. 3

The results of PCA analysis. SCC (somatic cell count), NIC (number of inseminations per conception), MP (milk period), PP (protein percentage), FP (fat percentage), TM (total milk), and AFC (age at first calving).

Our objective was to minimize the number of input variables used in the development of the ANFIS-based models. Each additional input variable generally increases the number of fuzzy rules that must be generated and evaluated. This proliferation of rules raises the computational demands of both training and inference and consequently extends the processing time43,44. In our investigations, the bell function provided a better fit to the data than the other types of membership functions. Previous research has underscored the higher prevalence of subclinical mastitis compared to its clinical manifestations, emphasizing the urgent need for reliable detection techniques to mitigate its impacts on dairy farming operations45. A comprehensive understanding of the etiology and risk factors associated with mastitis is essential for the development of improved management strategies46,47. ANFIS has demonstrated the capability to classify mastitis by analyzing multiple inputs, such as SCC and observable clinical signs, facilitating prompt assessments of mastitis severity and enabling targeted interventions48.

In the comparison of the three ANFIS classification methodologies (Table 2), the highest accuracy was achieved by the GD and PSO algorithms and the lowest by the GA algorithm. To assess the validity of these algorithms, three metrics (precision, recall, and F1-score) were calculated and compared. The GD-ANFIS algorithm consistently yielded the highest values for these metrics, whereas GA-ANFIS produced the lowest. Error was evaluated with three measures: overall error rate, type 1 error (false positives), and type 2 error (false negatives). Consistent with the previous comparisons, GD-ANFIS exhibited the lowest error rates and GA-ANFIS the highest. Based on these calculations, GD-ANFIS emerged as the most effective algorithm for this study. For GA-ANFIS, the GA parameters (maximum iterations, population size, crossover probability, mutation probability, and mutation rate) were determined first, and the GA fitness function was then formulated. The optimization problem treated the values of the premise and consequent parameters, as well as the number of membership functions, as decision variables. The following parameters were employed for the proposed GA-ANFIS algorithm: crossover percentage = 0.7, mutation percentage = 0.5, mutation rate = 0.1, and selection pressure = 8, by means of Trainlm. Trainlm is an artificial neural network (ANN) training function that updates the values of weights and biases based on the Levenberg–Marquardt (LM) algorithm. Additionally, the learning rate is a critical parameter in the training of multilayer perceptron (MLP) networks; it can be adjusted so that the weights converge rapidly enough to yield a response without inducing oscillations. The specifications of the GD-ANFIS algorithm are shown in bold in Table 2.

As illustrated in Table 2, the GD-ANFIS model integrated with the Pearson method demonstrated superior performance compared to PSO-ANFIS and GA-ANFIS across key evaluation metrics such as accuracy and error rates. However, due to the interplay of multiple evaluation criteria and the closely clustered fitted values, determining a definitive “best” model remained challenging.

Table 2 The result of ANFIS based classification models.

The high accuracy obtained in this study is likely attributable to two main factors. First, the expert who coded the mastitis status of the cows may have been influenced, consciously or unconsciously, by the SCC records, which were also used as an input variable in our model. Second, and more critically, our analysis was based on a single run on one train/test split, without k-fold cross-validation (e.g., 5-fold cross-validation). The absence of cross-validation likely resulted in the model being evaluated on data that were not sufficiently independent from the training set, leading to overfitting and an overestimation of model performance. K-fold cross-validation is a more robust and reliable approach for assessing the generalizability of classification algorithms, as it involves repeatedly partitioning the data and evaluating the model across multiple subsets. In future work, we plan to implement k-fold cross-validation and expand our dataset to ensure a more accurate and unbiased evaluation of our methods.
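The partitioning behind k-fold cross-validation can be sketched in a few lines of Python. The study itself used a single 80/20 split, so this only illustrates the procedure named as future work; the function name, the shuffling seed, and the example size of 3000 records (three balanced classes of 1000) are assumptions made for illustration.

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k disjoint, shuffled folds."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)     # shuffle once so folds are random
    folds = np.array_split(perm, k)       # k nearly equal parts
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx
```

Each record appears in exactly one test fold, so averaging a metric over the k folds uses every record for testing once while never testing on data seen during that fold's training.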

Soft computing has increasingly been applied to modeling bovine data. A study aimed at predicting subclinical mastitis used both ANFIS and ANN methodologies39 and incorporated variables such as lactation number, milk production, electrical conductivity, and average lactation duration, with SCC as the output variable. The two models were evaluated on their sensitivity, specificity, and error rates. The ANN model exhibited a sensitivity of 80%, specificity of 91%, and an error rate of 64%. In contrast, the ANFIS approach demonstrated a sensitivity of 85%, specificity of 91%, and a significantly lower error rate of 35%. These findings suggest that the ANN method may be less effective than the ANFIS method for the prediction of subclinical mastitis. In a separate investigation, ANN and ANFIS were both employed to determine the estimated breeding values (EBV) for milk and fat production. The results indicated that both approaches can be effective in calculating EBV in dairy cattle49. In a study conducted by Yaganoglu (2022), the factors influencing the milk production period were assessed using the ANFIS method. Triangular membership functions yielded the highest coefficient of determination (R²) of 0.848 and the lowest RMSE of 0.361. This outcome contrasts with the current study, which identified the Gaussian Bell membership function as the most suitable50. In a separate investigation, Mohadesi et al.51 compared the ANN method with the GA-ANFIS algorithm for predicting mineral indicators related to water quality. The GA-ANFIS algorithm was more accurate than the ANN approach in reproducing outputs from experimental data. These findings are consistent with the results of the current study, highlighting the efficacy of the GA-ANFIS algorithm in achieving higher predictive accuracy.
Additionally, Salehi et al.23 modeled the biological absorption of blood serum triglycerides using ANFIS. They reported RMSE values of 3.88 for the experimental dataset and 3.66 for the training dataset. These figures are notably higher than those observed in the current study, which ranged from 0.05 to 0.19. This difference highlights the improved performance and reduced error rates achieved in the current study, indicating progress in the use of ANFIS models for biological data modeling. ANFIS has also been employed to forecast 305-day milk production, using Gaussian membership functions and 150 iterations. The results indicated no significant disparity between the actual and predicted milk production figures. The training and testing datasets yielded R² values of 0.940 and 0.865, respectively, with RMSE values of 0.393 and 0.576. These findings suggest that ANFIS is an effective predictor of milk production over the 305-day timeframe52. Likewise, when ANFIS was used alongside multilayer perceptron techniques to build models for forecasting food production, the ANFIS model, which incorporated Gaussian Bell membership functions53, exhibited the least error. These results align with the current study, which also uses the ANFIS model and Gaussian Bell membership functions.

Figures 4, 5 and 6 illustrate extensive comparisons across all modeling tasks conducted in this study. For each data reduction technique (Pearson correlation or PCA), the performance of all three models is displayed for both training and test datasets. Each panel in these figures includes the distribution of mastitis classes, the confusion matrix, and the error results. Figure 4 shows that Pearson correlation outperforms PCA on both training and testing datasets. Classification accuracy is marginally higher with Pearson.
Both methods generalize well, with minimal performance degradation from training to testing. However, the PCA-based model shows a larger gap between training and testing performance, suggesting that it may be slightly more prone to overfitting. For the mastitis dataset examined in this study, Pearson correlation as a data reduction method therefore yields better classification performance for the GD-ANFIS model than PCA. In Fig. 5, the Pearson correlation technique again performs better than PCA on both training and testing datasets. Both techniques show good model performance, with errors centered around zero and high diagonal values in the confusion matrices. The PSO-ANFIS algorithm effectively classifies mastitis with both data reduction techniques, with Pearson correlation providing slightly better results. Similarly, in Fig. 6, the Pearson correlation technique performs slightly better than PCA on both datasets, again with errors centered around zero and high diagonal values in the confusion matrices. The GA-ANFIS model classifies mastitis well with both data reduction techniques, and Pearson correlation offers a slight edge in terms of lower error metrics. Taken together, Figs. 4, 5, and 6 show that the Pearson correlation technique consistently outperforms PCA in classifying mastitis across the ANFIS models. While both data reduction methods generalize well, Pearson correlation provides marginally better classification accuracy and lower error metrics.

Fig. 4

The performance of GD-ANFIS classification for mastitis across PCA and Pearson data reduction techniques.

Fig. 5

The performance of PSO-ANFIS classification for mastitis across PCA and Pearson data reduction techniques.

Fig. 6

The performance of GA-ANFIS classification for mastitis across PCA and Pearson data reduction techniques.

The top row in Fig. 7 shows the GD-ANFIS outputs. Panel (a) is a 3D surface plot of the output of the GD-ANFIS model using PCA for data reduction; it illustrates how the output varies with two input variables (input1 and input2). The surface is relatively smooth with some variations, indicating the model’s response to different input combinations. Panel (b) shows the output of the GD-ANFIS model using Pearson correlation for data reduction. The surface is smoother than that of the PCA-based model, suggesting a more stable and potentially more accurate model response, and the output values transition more gradually across the input space. The middle row in Fig. 7 shows the PSO-ANFIS outputs. Panel (c) displays the output of the PSO-ANFIS model using PCA for data reduction. The surface exhibits more pronounced variations and sharp transitions, indicating a more complex relationship captured by the model, and the output values change significantly with small variations in the input variables. Panel (d) shows the output of the PSO-ANFIS model using Pearson correlation for data reduction. The surface is smoother and more uniform than that of the PCA-based PSO-ANFIS model, and the output values change more gradually, suggesting a more stable model response. The bottom row in Fig. 7 shows the GA-ANFIS outputs. Panel (e) presents the output of the GA-ANFIS model using PCA for data reduction. The surface shows a peak, indicating a strong response to certain input combinations, and the output values vary more dramatically, suggesting that the model captures complex interactions between the input variables. Panel (f) shows the output of the GA-ANFIS model using Pearson correlation for data reduction. The surface is relatively smooth with gradual transitions, indicating a stable model response, and the output values change more uniformly across the input space.
In summary, for GD-ANFIS both the PCA-based and Pearson-based models show smooth surfaces, with the Pearson-based model exhibiting more stability. For PSO-ANFIS, the PCA-based model shows more complex variations, while the Pearson-based model is smoother and more stable. For GA-ANFIS, the PCA-based model shows pronounced peaks, indicating complex interactions, while the Pearson-based model is smoother and more uniform. Pearson correlation-based models thus tend to produce smoother and more stable output surfaces than PCA-based models across all optimization techniques. This suggests that Pearson correlation may be more effective in reducing data dimensionality while maintaining model stability and accuracy.

Fig. 7

The output of the different ANFIS algorithms used in this study was analyzed using MATLAB software.

The PSO-ANFIS and GA-ANFIS models classify mastitis effectively, with Pearson correlation offering a slight but notable advantage over PCA. Overall, the results indicate that using Pearson correlation as the data reduction method enhances the performance of ANFIS models for mastitis classification. As it turned out, GD-ANFIS outperformed the other two methods in our study. Several potential reasons can explain this outcome. First, GD algorithms are known for their efficiency in finding the optimal solution, especially when the objective function is smooth and differentiable, so GD-ANFIS may have converged more quickly and accurately to the optimal parameters than the other methods54. Second, the GA and PSO algorithms are designed to explore the solution space globally; this global search can be less efficient and may require more iterations to converge, whereas GD focuses on local optimization, which can be more effective when the initial parameters are reasonably close to the optimal solution55,56. Third, the performance of GA and PSO can be highly sensitive to the choice of parameters such as population size, mutation rate, and crossover probability, and suboptimal settings can lead to slower convergence or suboptimal solutions. GD typically has fewer parameters to tune, making it more robust and easier to implement effectively57,58. Fourth, GD generally has lower computational complexity than population-based algorithms such as GA and PSO, which can result in faster training times and more efficient use of computational resources; this is particularly advantageous when dealing with large datasets59. Fifth, the objective function in our study may have been relatively smooth and well-behaved, which favors GD-ANFIS: GD can take advantage of this smoothness to make large steps towards the optimum, whereas GA may not exploit this property as effectively60. 
Sixth, GD often converges more quickly than GA, especially in the early stages of optimization, which can be crucial in practical applications where computational resources and time are limited. Finally, the specific characteristics of the mastitis modeling problem, including the nature of the input data and the relationships between variables, may have been better suited to the strengths of GD-based optimization; different problems have different characteristics that favor one optimization method over another61. In summary, the superior performance of GD-ANFIS in our study can be attributed to its efficient optimization, local search capabilities, lower parameter sensitivity, reduced computational complexity, and the suitability of the problem’s characteristics to the GD algorithm. These factors collectively contributed to GD-ANFIS outperforming GA-ANFIS and PSO-ANFIS in terms of the evaluation metrics. Additionally, the choice of the Gaussian Bell membership function likely played a crucial role in the model’s effectiveness. It is worth mentioning that, as shown in the supplementary file, classical algorithms such as linear SVM and quadratic SVM provided a strong baseline for mastitis detection, particularly in scenarios where computational efficiency and interpretability are critical. However, for more complex datasets with inherent uncertainty, ANFIS models, especially GD-ANFIS, offer superior performance and flexibility. In this study, we hypothesized that fuzzy logic, along with various linguistic variables, would enhance the identification of animals with clinical mastitis that are not easily recognizable.
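The argument that GD benefits from a smooth, differentiable objective and a reasonable starting point can be illustrated with a small sketch. The quadratic loss below is a hypothetical stand-in chosen only because its minimum is known; it is not the ANFIS training loss, and the learning rate and tolerance are arbitrary assumptions.

```python
def gradient_descent(grad, start, lr=0.1, tol=1e-8, max_iter=10_000):
    """Plain gradient descent. Returns (parameters, number of update steps)."""
    x = list(start)
    for step in range(max_iter):
        g = grad(x)
        # Stop once every gradient component is (numerically) zero.
        if max(abs(gi) for gi in g) < tol:
            return x, step
        # Move against the gradient, i.e. in the direction of steepest descent.
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x, max_iter


# Hypothetical smooth objective: f(a, b) = (a - 2)^2 + 3 * (b + 1)^2,
# with its minimum at (a, b) = (2, -1).
def loss(p):
    return (p[0] - 2.0) ** 2 + 3.0 * (p[1] + 1.0) ** 2


def loss_grad(p):
    return [2.0 * (p[0] - 2.0), 6.0 * (p[1] + 1.0)]


params, steps = gradient_descent(loss_grad, start=[0.0, 0.0])
print(params, steps, loss(params))
```

Each step here costs one gradient evaluation, whereas a population-based method such as GA or PSO evaluates the objective once per individual or particle in every iteration. On a non-smooth or strongly multimodal objective the comparison could turn out differently, since plain GD only finds the nearest local optimum.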

Fig. 8 shows the architectures of the six models. In every model, five input variables (input1 to input5), each associated with a membership function, are combined through a set of fuzzy rules to form a single output (output1); the models differ in the number of rules. Fig. 8a shows the architecture of the GD-ANFIS model using PCA for data reduction, which has 5 inputs, 1 output, and 5 rules. Fig. 8b shows the GD-ANFIS model using Pearson correlation for data reduction, which has 5 inputs, 1 output, and 30 rules, indicating a more complex rule base than the PCA-based model. Fig. 8c shows the PSO-ANFIS model using PCA, which has 5 inputs, 1 output, and 10 rules. Fig. 8d shows the PSO-ANFIS model using Pearson correlation, which also has 5 inputs, 1 output, and 10 rules. Fig. 8e shows the GA-ANFIS model using PCA, which has 5 inputs, 1 output, and 5 rules. Finally, Fig. 8f shows the GA-ANFIS model using Pearson correlation, which has 5 inputs, 1 output, and 30 rules. 
In total, for GD-ANFIS and GA-ANFIS the Pearson-based model has a more complex rule base (30 rules) than the PCA-based model (5 rules), whereas for PSO-ANFIS both models have 10 rules, indicating a moderately complex rule base. Thus, for two of the three optimization techniques, the Pearson correlation-based models have a more complex rule base than the PCA-based models. This suggests that Pearson correlation may capture more intricate relationships in the data, leading to a more detailed rule set in the ANFIS models. In many other studies, however, the PSO-based ANFIS algorithm has performed better across different domains. Li et al.62 employed GA- and PSO-based ANFIS algorithms to estimate bio-oil yield, and the PSO-ANFIS algorithm demonstrated superior performance compared to the GA-ANFIS model. Furthermore, the ANFIS-PSO model has been reported to outperform the GA-ANFIS model (based on R² and RMSE)17. These findings align with the current research, which underscores the advantages of PSO-ANFIS over GA-ANFIS in terms of optimization and predictive accuracy. Abdelaziz et al.63 also showed the effectiveness of PSO-ANFIS in forecasting the production of biochar as a form of renewable energy. Additionally, in assessing various grades of non-alcoholic fatty liver, PSO-ANFIS (RMSE: 0.5048) was shown to be a more appropriate model64. PSO-ANFIS has likewise been reported to be the most appropriate algorithm for predicting peak ground acceleration65 and gas density66. 
However, Sharma et al.67 concluded that GA-ANFIS achieved a sensitivity of 96.6%, a specificity of 95.3%, and an accuracy of 98.67%, whereas Azad et al. showed that ANFIS-GA performed worse than the ANFIS model optimized using differential evolution (DE-ANFIS) in evaluating river quality68. In a separate study aimed at forecasting wheat grain yield based on energy inputs, an ANFIS model using Gaussian Bell input membership functions predicted wheat grain yield more accurately than an ANN model, which is consistent with the methodology of the current study69. The ANFIS model has also been reported to outperform the autoregressive integrated moving average (ARIMA) model70. Haznedar and Kalinli71, using GA-ANFIS together with the backpropagation algorithm, identified GA as the most effective optimization method.
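The rule-based architectures described for Fig. 8 (five inputs, a membership function per input, a set of rules, one output) can be sketched as a Sugeno-type forward pass. This is a simplified illustration under assumptions, not the MATLAB models trained in this study: it uses plain Gaussian membership functions, only two rules, and hypothetical centers, widths, and consequent coefficients. Training (by GD, PSO, or GA) would consist of adjusting exactly these numbers.

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x, in the range (0, 1]."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def anfis_forward(inputs, rules):
    """Sugeno-type forward pass.

    Each rule is a dict with:
      'centers', 'sigmas': one membership-function parameter per input
      'coeffs': linear consequent, one weight per input followed by a bias
    """
    strengths, consequents = [], []
    for rule in rules:
        # Firing strength: product of the membership degrees of all inputs.
        w = 1.0
        for x, c, s in zip(inputs, rule["centers"], rule["sigmas"]):
            w *= gaussian_mf(x, c, s)
        # Rule consequent: linear function of the inputs plus a bias term.
        y = rule["coeffs"][-1] + sum(a * x for a, x in zip(rule["coeffs"], inputs))
        strengths.append(w)
        consequents.append(y)
    total = sum(strengths)
    if total == 0.0:
        return 0.0  # no rule fires (possible only through floating-point underflow)
    # Output: firing-strength-weighted average of the rule consequents.
    return sum(w * y for w, y in zip(strengths, consequents)) / total


# Two hypothetical rules over five inputs scaled to [0, 1]:
# "all inputs low -> output 0" and "all inputs high -> output 1".
rules = [
    {"centers": [0.0] * 5, "sigmas": [0.5] * 5, "coeffs": [0.0] * 5 + [0.0]},
    {"centers": [1.0] * 5, "sigmas": [0.5] * 5, "coeffs": [0.0] * 5 + [1.0]},
]

out_low = anfis_forward([0.0] * 5, rules)
out_mid = anfis_forward([0.5] * 5, rules)
out_high = anfis_forward([1.0] * 5, rules)
print(out_low, out_mid, out_high)
```

With more rules, as in the 30-rule Pearson-based models, the same computation simply sums over a longer list, which is why a larger rule base can represent finer structure at the cost of more parameters to fit.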

Fig. 8

Architectures of the different ANFIS models learned from the data in this study.

The primary objective of this research was to evaluate the effectiveness of various ANFIS algorithms in classifying mastitis in dairy cattle. The study focused on assessing the accuracy, precision, and overall predictive performance of ANFIS models in forecasting variables related to dairy cattle health. By training and evaluating the models on expert-labeled data, the ANFIS-based models were able to learn and generalize the underlying patterns and relationships characteristic of mastitis infections. Our findings highlight the practical applications of these models in real-world settings, offering tangible benefits for precision breeding and management in dairy farms. Additionally, the study explored the capability of the ANFIS model to accurately predict various parameters crucial to dairy cattle production, as well as the potential advantages of ANFIS modeling for decision-making within the dairy sector. We present a novel application of ANFIS-based models for mastitis classification in Holstein dairy cattle, aiming to reduce uncertainty in mastitis diagnosis and enhance the reliability of detection methods. To this end, we used expert-labeled mastitis cases to ensure the accuracy of the classification process. The study offers potential practical benefits for dairy farms, including improved mastitis management and reduced economic losses, and it has significant implications for the dairy industry by offering a reliable and early detection method for mastitis. These aspects underscore the relevance, novelty, and potential impact of the study for the dairy industry. The approach is expected to speed up the detection of mastitis, facilitating timely interventions and treatments that can prevent the spread of infection and reduce the severity of the condition. 
It aims to provide farmers with a reliable tool for monitoring herd health, thereby encouraging better management practices and lessening the economic repercussions associated with mastitis. Additionally, it may assist in identifying cattle with lower susceptibility to mastitis, thereby improving the genetic quality of the herd and increasing overall productivity. Furthermore, the approach supports data-driven decision-making, empowering farmers to make informed choices regarding treatment, culling, and breeding strategies, while also contributing to a reduction in the unnecessary use of antibiotics and thus fostering sustainable farming practices. The potential of feature engineering to enhance model performance was not explored in the present study. Our focus was deliberately directed toward evaluating feature selection methods applied to the raw input variables. This approach was chosen to minimize the risks of overfitting, increased model complexity, and redundancy that can arise when engineered features are introduced, especially in conjunction with the feature selection techniques already employed. Future investigations will consider the impact of feature engineering, examining time-based variables, interaction effects, and normalization strategies to further refine the classification of mastitis.

Conclusion

Our research demonstrated the superiority of the ANFIS model optimized using the GD algorithm. The advantages of GD can be attributed to several key factors: its efficient convergence, simplicity, scalability, and fine-tuning capabilities. GD is known for its effective convergence characteristics, particularly in high-dimensional settings. By iteratively updating the model parameters in the direction of steepest descent of the loss function, GD often converges faster than population-based approaches such as PSO. This efficiency is crucial in scenarios requiring prompt decision-making, such as the management of dairy cattle. These features make GD a robust and effective optimization method for real-world applications, particularly in the field of dairy cattle management. The findings underscore the potential of GD-ANFIS models to provide accurate and timely predictions, thereby supporting informed decision-making and improving the overall efficiency and sustainability of dairy farming practices.