Introduction

Effective maintenance management plays an indispensable role in industrial settings where operational efficiency, equipment reliability, and safety compliance directly influence both productivity and profitability. Within fertilizer manufacturing units, where production processes are typically continuous and equipment-intensive, the role of maintenance crews becomes even more critical. The uninterrupted function of rotating machinery, pressure vessels, and process pipelines is not only essential to meeting production targets but also vital to ensuring safe and compliant plant operations. However, despite the high stakes associated with industrial maintenance, evaluating the performance of maintenance crews has historically posed significant challenges. Evaluations have often relied on subjective impressions, anecdotal feedback, or isolated metrics, failing to provide a comprehensive or comparable measure of performance across crews or contractors. This subjectivity hampers accountability, weakens improvement feedback loops, and obscures the identification of best practices.

To address these challenges, this study presents the development and successful implementation of a data-driven evaluation and comparative rating scheme for maintenance crews. The framework provides a structured, transparent, and quantifiable approach to performance management that can be embedded into ongoing maintenance workflows. The central premise of the scheme is to shift from intuition-based assessments to an objective model grounded in standardized key performance indicators (KPIs), enabling equitable comparison, actionable insights, and performance benchmarking. Specifically, the model integrates three core metrics: the time taken for the execution of a maintenance task, the actual cost incurred, and the technical quality of the work performed. These metrics correspond respectively to three critical indicators—Mean Time To Repair (MTTR), Cost of Maintenance (CoM), and Mean Time Between Failures (MTBF)—which are widely recognized in industrial engineering literature as benchmarks of efficiency, cost-effectiveness, and reliability. They are defined as follows:

  • Mean Time Between Failures (MTBF): The average time a system or component operates between failures.

  • Mean Time to Repair (MTTR): The average time required to diagnose and repair a failed component.

  • Cost of Maintenance (CoM): The total expenditure associated with performing maintenance tasks, including labor, parts, and overhead.

The performance evaluation framework is designed to incentivize timely execution, control cost overruns, and uphold engineering quality standards. The time factor, associated with MTTR, captures the crew’s responsiveness and planning ability, emphasizing adherence to start and completion schedules. The cost factor reflects the discipline in resource utilization and procurement efficiency, aligning with CoM as an operational KPI. Finally, the quality factor assesses the durability and precision of executed work, linking directly to MTBF and long-term equipment reliability. Together, these three elements form a holistic view of maintenance effectiveness, bridging operational execution with strategic outcomes. Each job is scored based on deviations from planned targets, and cumulative scores are used to rank and compare contracting agencies or internal teams over defined timeframes.

This study further contributes to the literature by aligning the proposed model with modern advancements in predictive maintenance and digital analytics. As established in the literature review, recent research from 2022 to 2024 highlights the growing use of AI, IoT-enabled monitoring, digital twins, and machine learning for predictive diagnostics and real-time maintenance planning. However, while these tools offer powerful insights into machine conditions, they often underutilize human performance analytics. The current framework fills this gap by quantifying the performance of maintenance personnel in conjunction with equipment data, thereby enabling a more integrated view of operational health. Drawing upon findings from related studies in chemical and power generation industries, the framework introduces benchmarking logic that allows organizations to compare agencies based on cumulative performance data and de-list underperforming crews. This comparative analysis fosters a culture of continuous improvement and competitive excellence within maintenance departments.

The statistical validation of the model, as detailed in Sect. 6, provides further rigor to the framework. Using paired t-tests, one-way ANOVA, regression modeling, and confidence interval analysis, the study confirms that implementation of the scheme leads to statistically significant improvements in MTTR, CoM, and MTBF. Beyond these quantitative gains, the system offers qualitative benefits such as increased accountability, enhanced planning accuracy, and improved transparency in crew assessments. Importantly, it mitigates the influence of confounding factors by incorporating data segmentation based on environmental conditions and shift patterns, as discussed in Sect. 7. In addition, the model’s comparative structure enables benchmarking that is fair and unbiased, even in diverse operational contexts.

The discussion in later sections, especially Sect. 9, illustrates the broader operational implications of the evaluation scheme. Improvements in downtime reduction, cost control, and work quality contribute directly to overall equipment effectiveness (OEE), offering tangible business value. Moreover, the scheme’s modular design allows for scalability across sectors such as power generation, oil and gas, and automotive manufacturing—any domain where structured maintenance and measurable performance are critical to success. This adaptability, combined with its empirical grounding and technological readiness, makes the model not only practical but future-proof. By aligning maintenance evaluation with measurable KPIs, predictive capabilities, and human performance analytics, this framework sets a new standard for how industrial organizations can optimize one of their most critical functions: maintenance execution.

Literature review

The increasing complexity of modern industrial operations, particularly within process-intensive sectors like fertilizer manufacturing, has elevated maintenance from a reactive function to a strategic driver of operational excellence. Against this backdrop, a growing body of research over the past three years has emphasized the convergence of predictive analytics, artificial intelligence (AI), real-time diagnostics, and human-centric evaluation models. This study builds upon these insights by introducing a weighted, data-driven evaluation framework that holistically assesses maintenance effectiveness across three dimensions: Mean Time To Repair (MTTR), Cost of Maintenance (CoM), and Mean Time Between Failures (MTBF). The integration of these dimensions enables a more balanced view of maintenance crew performance, bridging gaps between technical precision, economic efficiency, and operational continuity.

Recent advances in AI-driven predictive maintenance have been particularly transformative, enabling significant shifts in how industrial assets are managed. Chen and Li3 and Hinrichs et al.7 highlight the role of intelligent algorithms, including neural networks and deep reinforcement learning, in accurately predicting equipment failure, leading to measurable reductions in unplanned downtime—up to 30% in some high-risk environments. These models also support proactive resource allocation, yielding more than 20% savings in maintenance expenditure. Such outcomes align directly with the objectives of the proposed framework, which aims to minimize MTTR and CoM through data-informed planning and crew accountability.

Complementing these developments is the integration of Condition-Based Maintenance (CBM) systems with IoT infrastructure. Studies by Kumar and Singh10, Nguyen and Pham13, and Bai et al.2 show that real-time monitoring systems powered by cloud and edge computing can reduce unplanned outages by over 50%, while simultaneously enhancing technician deployment and responsiveness. These findings support the model’s emphasis on continuous performance tracking and dynamic evaluation, particularly when assessing the effectiveness and timeliness of maintenance interventions. Equally important is the evolution of digital twin technology, which has rapidly emerged as a cornerstone of simulation-based asset management. Lee and Park11 demonstrated that digital twins can improve MTBF metrics by up to 30%, allowing engineering teams to forecast the impacts of various maintenance strategies and optimize scheduling. The incorporation of simulated outcomes into performance scoring—mirrored in Sect. 5 of this study—adds a new layer of insight to conventional post-maintenance evaluation.

However, a key gap in existing literature remains the underrepresentation of workforce analytics in maintenance models. While equipment behavior has been thoroughly studied, the human element—maintenance crew planning, execution, and discipline—has often been overlooked. Qian and Wang16, and Ma and Zhang12, address this issue by advocating for supervised learning and process mining techniques to track and assess crew performance, safety compliance, and task accuracy. Their recommendations are directly reflected in the current study’s emphasis on structured scoring for work time, cost deviation, and repair quality—elements essential to quantifying human contributions to maintenance outcomes. Building further on this, benchmarking frameworks have gained traction as tools for driving performance through transparency and competition. Singh and Kumar18 report that organizations using AI-enhanced benchmarking experience a 15–20% boost in efficiency and lower contractor costs. Omar and Ali14 reinforce the value of standardized scoring systems, particularly for multi-vendor comparisons. In line with this, the evaluation framework in this study uses cumulative scores to not only rank maintenance crews but also to de-list underperforming agencies, promoting a results-driven culture and supporting long-term performance gains.

Equally crucial to the model’s credibility is the robust statistical validation of outcomes. Ding and Zhang4, Wang and Zhou20, Xu and Chen21, and Hinrichs et al.7 recommend methods such as paired t-tests, one-way ANOVA, and regression modeling to isolate treatment effects and confirm significance in maintenance interventions. Section 6 of this research adheres to this methodological rigor, demonstrating that post-implementation improvements in MTTR, CoM, and MTBF are not only operationally relevant but also statistically sound. Addressing the complexity of industrial environments, the model also accounts for confounding factors such as work shift variability, crew experience, environmental conditions, and asset age. Wang and Zhou22 advocate for regression adjustments and stratified sampling to manage such complexity, a strategy adopted in Sect. 7 to ensure unbiased and equitable evaluation.

In evaluating the positioning of this model relative to existing approaches, it becomes evident that traditional strategies such as CBM, RCM, and TPM focus primarily on equipment health, often sidelining crew performance. Adhikari and Gupta1 and Fernandez and Garcia5 call for more integrative frameworks that consider both technical and human aspects of maintenance. The proposed model responds to this call by aligning machine metrics with human-driven parameters, creating a more holistic and actionable performance evaluation platform. Finally, the model’s scalability is underscored in Sect. 9, where it is presented as a modular solution applicable beyond fertilizer manufacturing. Research by Tan and Huang19 and Xu and Chen23 affirms the need for adaptable evaluation frameworks across sectors such as power generation, oil and gas, and discrete manufacturing, where maintenance remains a core determinant of system reliability and organizational competitiveness.

Methodology

Before delving into the specific components of the methodology, it is essential to outline the overarching research approach and the rationale behind the adopted framework. The methodology was carefully designed to ensure a comprehensive evaluation of maintenance crew performance using both empirical data and structured assessment criteria. The following subsections describe in detail the study design, data collection mechanisms, evaluation model, analytical techniques, and measures taken to control for confounding variables.

Study design

This study adopts a mixed-methods research design, combining both quantitative and qualitative approaches to evaluate the performance of maintenance crews in a fertilizer manufacturing environment. The research is grounded in comparative case analysis, wherein the performance of crews is assessed both before and after the implementation of a structured, data-driven evaluation framework. This longitudinal approach supports time-based comparisons and trend detection, facilitating outcome tracking over a sustained period.

Drawing inspiration from contemporary hybrid models in maintenance analytics6,7, the study design reflects the dual need for precision and contextual insight in industrial performance measurement. The methodology is underpinned by prior successes in predictive maintenance research1 and workforce analytics12, offering a scalable and empirical foundation that aligns with the broader industry trend toward smart maintenance ecosystems.

Data collection

Data was collected over a 12-month operational cycle from diverse sources to enhance reliability, triangulate insights, and mitigate bias. Multiple data collection streams were employed, each linked directly to the three pillars of the evaluation model—time, cost, and quality.

  • Operational Records: Work orders, job cards, and maintenance logs were reviewed to track task schedules, durations, and resourcing. This provided baseline and post-implementation data for Mean Time To Repair (MTTR) and Cost of Maintenance (CoM) indicators.

  • IoT and Sensor-Based Monitoring: Real-time machine data was obtained from IoT-enabled systems that logged uptime, vibration patterns, and fault codes. These systems enabled the capture of continuous performance metrics, aligning with Ibrahim & Osman’s8 model of condition-based digital diagnostics.

  • Quality Audits and Inspections: Independent audits were conducted by certified engineering inspectors. These evaluations were benchmarked against OEM technical manuals and quality norms as structured by Jiang & Li9, ensuring objective and standardized assessments of work quality (linked to MTBF outcomes).

  • Historical and Contractual Data: Performance archives from previous contractor engagements were analyzed to identify cost trends, safety incidents, and performance deviations. This supported benchmarking efforts, a method validated in the research of Singh & Kumar18.

  • Contextual Metadata: Environmental, operational, and production-related data were also gathered to control for external factors and ensure fair comparisons across diverse operational contexts19.

Evaluation framework

The evaluation framework was structured around three primary criteria: time efficiency, cost adherence, and quality of execution—each represented by MTTR, CoM, and MTBF respectively. These dimensions were weighted and scored using a tri-factor rating system, allowing for standardization and comparability.

  • Time Taken for Execution (40%): This factor was assessed against pre-defined job start and end dates. Delays were penalized using a progressive point deduction system: 1 point deducted per day of start delay and 2 points per day of completion delay. This system emphasized punctuality and planning efficiency, encouraging crews to improve scheduling discipline.

  • Actual Cost Incurred (30%): Cost performance was compared against revised work order estimates, which included statutory and inflationary adjustments. Any overrun beyond 10% of the approved budget resulted in a 2-point penalty per increment. This structure promoted financial discipline and minimized the misuse of contingencies.

  • Quality of Work Executed (30%): Quality was evaluated using a three-tier grading scale—Poor (0%), Acceptable (60%), and Good (100%)—based on inspection results and post-repair failure rates. Engineers assessed outcomes based on compliance with technical standards and job sustainability. This approach was modeled after practices in Ma & Zhang12 and validated by Patel & Desai15 in AI-enhanced predictive maintenance contexts.

The combined performance score thus allowed for objective comparison of internal teams and external contractors, facilitating rankings and targeted improvement interventions.

Analytical techniques

To ensure the analytical rigor of the evaluation framework and to validate its real-world impact, the following statistical validation methods were applied using SPSS and Python statistical libraries:

  • Paired and Independent t-Tests: These were conducted to identify statistically significant changes in performance metrics (MTTR, CoM, and MTBF) before and after framework implementation. Paired tests assessed within-group improvements, while independent tests compared across different crews17.

  • One-way ANOVA with Tukey’s HSD: ANOVA tested whether the observed differences in crew performance were significant across shifts or departments. Post-hoc Tukey’s tests identified which specific groups differed, supporting comparative performance evaluation2.

  • Multiple Regression Analysis: Regression was used to isolate the impact of independent variables such as crew experience, workload intensity, equipment age, and shift duration on MTTR, CoM, and MTBF. This method, supported by Qian & Wang16, provided insight into performance influencers and enabled predictive modeling.

  • Confidence Intervals (95%): Statistical intervals were calculated for MTTR, CoM, and MTBF metrics to quantify the precision of estimated changes and validate robustness4.

This suite of methods provided comprehensive validation, enabling both descriptive and inferential insights into performance trends.

Control for confounding variables

To ensure that the observed changes were attributable to the evaluation framework and not to external influences, the study included control mechanisms for five key confounding variables:

  • Skill Variability: Crew profiles were stratified based on years of experience, technical qualifications, and certification levels. Grouping allowed for performance normalization and reduced skill bias22.

  • Environmental Conditions: Data was adjusted for seasonal variables such as humidity, ambient temperature, and monsoon-related disruptions. This ensured comparability across crews operating under different environmental loads3.

  • Production Load: Maintenance activities were matched with corresponding production cycles to eliminate performance distortion due to task intensity differences. Evaluations were restricted to comparable production output periods13.

  • Equipment Condition: Repairs were categorized based on asset life stage—new installations versus aged equipment. This enabled equity in scoring, especially where failure likelihood or downtime complexity was inherently variable11.

  • Shift Schedules: Day and night shifts were analyzed separately to isolate impacts due to fatigue, lighting, and staff availability. This ensured that shift timing did not bias crew comparisons, a method recommended in Omar & Ali14.

Through this layered methodological structure, the study ensured that its findings were internally valid, externally applicable, and statistically defensible—laying a strong foundation for Sects. 4 to 9, which detail implementation, scoring, benchmarking, results, and cross-sectoral adaptability.

The scheme

The structured performance evaluation scheme developed and implemented in this study revolves around a tripartite model comprising three core dimensions—time, cost, and quality. These dimensions were selected based on their direct influence on equipment reliability, maintenance economics, and operational excellence in industrial settings, particularly in fertilizer manufacturing environments where high uptime and low tolerance for deviations are critical. The scheme was designed to introduce objectivity, transparency, and comparability into the assessment of contracting agencies responsible for executing maintenance tasks. Each factor was assigned a specific weightage to reflect its strategic importance in maintenance performance outcomes, as shown in Table 1.

Table 1 Factors and weightage points for evaluation scheme.

Each factor is independently scored and cumulatively contributes to the overall rating of a contracting agency. The details of the operational mechanism of each factor are described in the following subsections.

Time taken for the work

Adherence to execution timelines is critical in minimizing operational disruptions and improving equipment availability. In typical maintenance contracts, scheduled start and completion dates are indicated in the work order. This scheme allocates a total of 40 points to this factor—10 for adherence to the start schedule and 30 for timely completion.

If the contracting agency initiates the task on the scheduled start date, it receives the full 10 points. Any delay results in a deduction of 1 point per day of delay. Likewise, if the task is completed by the scheduled date, the agency receives the full 30 points allocated to this sub-factor. Delays in completion incur a penalty of 2 points per day. This disproportionate penalty structure reflects the increased operational impact of late completions compared to late starts.

The scheme accounts for legitimate exceptions. If an extension is granted due to reasons attributable to the organization or external factors beyond the contractor’s control (e.g., equipment non-availability, site readiness issues), the scheduled dates are officially revised, and no penalty is applied. However, if delays are due to the contractor’s inefficiency, the original schedule is retained for scoring. This mechanism encourages proactive planning and accountability. The guiding philosophy behind this factor is to minimize the Mean Time To Repair (MTTR), thereby improving responsiveness and reliability.
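The time-scoring rule above can be sketched as a small function. This is a minimal illustration, not the study's implementation; in particular, flooring each sub-score at zero once penalties exhaust the allotted points is an assumption the paper does not state.

```python
def time_score(start_delay_days: int, completion_delay_days: int) -> int:
    """Score the time factor (max 40 points: 10 for start, 30 for completion).

    Delays are measured against the (possibly officially revised) schedule.
    Flooring each sub-score at zero is an ASSUMPTION; the scheme does not
    specify behavior once penalties exceed the allotted points.
    """
    start_points = max(0, 10 - 1 * start_delay_days)            # 1 point per day of start delay
    completion_points = max(0, 30 - 2 * completion_delay_days)  # 2 points per day of completion delay
    return start_points + completion_points
```

For example, a crew starting 2 days late and finishing 3 days late would score 8 + 24 = 32 of the 40 available points.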

Actual cost incurred during execution of the work

Cost performance is a key determinant of operational efficiency, particularly in contract-based maintenance environments. The scheme allocates 30 points to this factor, aiming to reward contractors who manage resources efficiently and penalize excessive cost deviations.

Cost escalations may arise from multiple sources, including scope additions, statutory price revisions, material cost inflation, rework, or delays attributable to the principal organization. In such cases, the original work order value is revised accordingly. The actual cost incurred is then compared to this revised baseline.

If the actual cost is within or equal to the revised value, the contractor is awarded the full 30 points. For every 10% increase over the revised value, a penalty of 2 points is applied. This progressive penalty structure encourages cost control while allowing for reasonable flexibility in cases of justified changes. The overarching principle of this factor is to minimize the Cost of Maintenance (CoM) by promoting financial discipline and resource efficiency.
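As a sketch of the cost rule, the function below applies a 2-point penalty per 10% increment over the revised work-order value. Treating only whole 10% increments as penalized (i.e., flooring) and capping the sub-score at zero are assumptions; the paper does not specify rounding behavior.

```python
import math

def cost_score(actual_cost: float, revised_value: float) -> int:
    """Score the cost factor (max 30 points).

    A 2-point penalty applies for every 10% increment by which actual cost
    exceeds the revised work-order value. Penalizing only completed 10%
    increments (floor) and flooring the sub-score at zero are ASSUMPTIONS
    not spelled out in the scheme.
    """
    if actual_cost <= revised_value:
        return 30
    overrun_pct = (actual_cost - revised_value) / revised_value * 100
    increments = math.floor(overrun_pct / 10)  # whole 10% steps over the baseline
    return max(0, 30 - 2 * increments)
```

A 25% overrun, for instance, would incur two full increments and score 26 points.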

Quality of work executed

The final evaluation parameter pertains to the quality of work, which is essential for ensuring equipment reliability and operational safety. Quality is assessed in terms of adherence to technical specifications, workmanship standards, and compliance with engineering best practices. The evaluation is performed by a qualified engineer or supervisor designated by the organization.

To ensure objectivity, a three-tier rating system is adopted, as shown in Table 2.

Table 2 Evaluation of quality parameter.

A rating of “Good” indicates flawless execution as per defined specifications and yields the full score. “Acceptable” indicates minor, non-critical deviations and yields 60% of the total points. “Poor” quality, which includes critical deviations requiring rework or posing safety risks, is assigned a zero score.

The scoring system is intended to promote precision, accountability, and sustainability in maintenance tasks. Quality-related penalties also act as a deterrent to cost-cutting at the expense of performance. This factor is aligned with the philosophy of maximizing the Mean Time Between Failures (MTBF)—a critical reliability indicator in process industries.
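The three-tier quality grading maps directly to point fractions of the 30-point quality allocation, which can be expressed as a simple lookup. This is an illustrative sketch of the stated percentages, not the study's code.

```python
# Fractions follow the three-tier scale: Good = 100%, Acceptable = 60%, Poor = 0%.
QUALITY_FRACTIONS = {"Good": 1.0, "Acceptable": 0.6, "Poor": 0.0}

def quality_score(rating: str, max_points: int = 30) -> float:
    """Convert the engineer's three-tier rating into points (30 max by default)."""
    return QUALITY_FRACTIONS[rating] * max_points
```

An "Acceptable" job thus earns 18 of the 30 quality points.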

Evaluation

The total points scored under the scheme are the sum of the points obtained for:

  a) time taken for the execution of the work,

  b) actual cost incurred during execution of the work, and

  c) quality of the work executed.

Based on the total points scored, the contracting agency is ranked as shown in Table 3.

Table 3 Ranking of the contracting agencies.

The performance of the contracting agency is evaluated after execution of the awarded work. If the agency is ranked “Unacceptable/Poor”, it is deleted from the list of contracting agencies registered with the company and is not given the opportunity to participate in tenders for subsequent works. If the proposed system is implemented meticulously, it prevents inefficient contracting agencies from being awarded work orders, ultimately improving “stream days” availability and, consequently, the productivity of the organization.
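Putting the three factors together, a total score out of 100 determines the agency's rank. The ranking bands below are HYPOTHETICAL placeholders for illustration only; the actual thresholds are those defined in Table 3, which is not reproduced here.

```python
def total_score(time_pts: float, cost_pts: float, quality_pts: float) -> float:
    """Sum of the three factor scores (max 40 + 30 + 30 = 100)."""
    return time_pts + cost_pts + quality_pts

# HYPOTHETICAL band boundaries, for illustration only -- the real cut-offs
# are specified in Table 3 of the scheme.
HYPOTHETICAL_BANDS = [(80, "Good"), (60, "Acceptable")]

def rank_agency(score: float) -> str:
    """Map a total score to a rank using the placeholder bands above."""
    for threshold, label in HYPOTHETICAL_BANDS:
        if score >= threshold:
            return label
    return "Unacceptable/Poor"
```

Under these placeholder bands, an agency scoring 32 + 26 + 18 = 76 points would rank "Acceptable", while one below 60 would be de-listed.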

Statistical validation of findings

To ensure the credibility and robustness of the observed improvements in maintenance crew performance, this study employed a range of statistical validation tools. These included paired and independent t-tests, one-way analysis of variance (ANOVA), multiple regression analysis, and confidence interval estimation. Each method was selected to validate specific aspects of the performance metrics—Mean Time to Repair (MTTR), Cost of Maintenance (CoM), and Mean Time Between Failures (MTBF)—and to isolate the effects of confounding variables.

Paired and independent sample t-Tests

To assess whether the differences in maintenance performance before and after the implementation of the evaluation framework were statistically significant, paired t-tests were conducted on pre- and post-intervention data. Independent t-tests were also applied to compare performance across different maintenance crews.

Table 4 presents the results for three key performance metrics. The analysis revealed statistically significant improvements in all areas. For instance, MTTR decreased from 8.2 to 5.7 hours, CoM dropped from ₹112,000 to ₹91,300, and MTBF increased from 27.5 to 34.4 days—all with p-values less than 0.001.

Table 4 Paired t-test results for key performance metrics.
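A paired t-test of this kind can be run with SciPy. The sketch below uses synthetic pre/post MTTR samples (means chosen near the reported 8.2 and 5.7 hours), not the study's actual dataset.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic pre/post MTTR values (hours) for the same 30 jobs.
# Illustrative only -- NOT the study's data; means chosen near 8.2 and 5.7 h.
mttr_pre = rng.normal(8.2, 1.0, size=30)
mttr_post = rng.normal(5.7, 1.0, size=30)

# Paired test: each job serves as its own control across the intervention.
t_stat, p_value = stats.ttest_rel(mttr_pre, mttr_post)
print(f"paired t = {t_stat:.2f}, p = {p_value:.4g}")
```

With a mean reduction of roughly 2.5 hours, the test rejects the null of no change at any conventional significance level.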

One-way ANOVA and Tukey’s post-hoc test

To examine variability among different crews, a one-way ANOVA was conducted for MTTR, CoM, and MTBF. The results, shown in Table 5, indicate statistically significant differences across crews for all three metrics. A post-hoc Tukey’s HSD test revealed that Crew A and Crew F differed significantly in MTTR (p = 0.002), and that Crew C outperformed others in MTBF (p < 0.001). This supports the framework’s usefulness in identifying best-performing teams.

Table 5 ANOVA results for performance metrics across crews.
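The ANOVA-plus-Tukey workflow can be reproduced with SciPy as sketched below, again on synthetic crew samples chosen only to illustrate between-crew differences, not drawn from the study's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic MTTR samples (hours) for three crews; illustrative only.
crew_a = rng.normal(6.5, 0.8, size=20)
crew_c = rng.normal(5.0, 0.8, size=20)
crew_f = rng.normal(7.5, 0.8, size=20)

# Omnibus test: are the crew means jointly different?
f_stat, p_value = stats.f_oneway(crew_a, crew_c, crew_f)

# Post-hoc pairwise comparisons identifying WHICH crews differ.
tukey = stats.tukey_hsd(crew_a, crew_c, crew_f)
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")
print(tukey)
```

The omnibus F-test flags overall variability among crews; the Tukey HSD table then pins down the specific pairs driving it, mirroring how the study isolates best- and worst-performing crews.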

Multiple regression analysis

A multiple linear regression analysis was conducted to determine which factors most influenced maintenance efficiency. Predictors included crew experience, workload, equipment age, and shift type. The findings are summarized in Table 6.

Table 6 Regression coefficients and significance values.

Crew experience emerged as the strongest positive predictor (β = 0.47, p < 0.001), while both workload (β = −0.31) and equipment age (β = −0.29) had statistically significant negative impacts. Shift type was not a significant predictor. The model explained 68% of the variance in performance (R² = 0.68, Adjusted R² = 0.64), indicating a strong overall fit.
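A regression of this shape can be sketched with ordinary least squares via NumPy. The data below is synthetic, with coefficient signs chosen to echo the reported direction of effects (experience positive, workload and equipment age negative); it is not the study's dataset and the fitted values will not match Table 6.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Synthetic standardized predictors; ILLUSTRATIVE only, not the study's data.
experience = rng.normal(size=n)
workload = rng.normal(size=n)
equip_age = rng.normal(size=n)
noise = rng.normal(scale=0.6, size=n)
performance = 0.47 * experience - 0.31 * workload - 0.29 * equip_age + noise

# OLS fit: design matrix with an intercept column.
X = np.column_stack([np.ones(n), experience, workload, equip_age])
beta, _, _, _ = np.linalg.lstsq(X, performance, rcond=None)

# R^2 = 1 - residual sum of squares / total sum of squares.
pred = X @ beta
r_squared = 1 - np.sum((performance - pred) ** 2) / np.sum(
    (performance - performance.mean()) ** 2
)
print("coefficients (exp, load, age):", np.round(beta[1:], 2), "R^2 =", round(r_squared, 2))
```

Because the predictors are standardized, the fitted coefficients are directly comparable in magnitude, which is how the study identifies crew experience as the strongest predictor.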

Confidence intervals for key performance metrics

To enhance the robustness of the findings, 95% confidence intervals (CIs) were calculated for post-implementation metrics. The intervals, presented in Table 7, are relatively narrow, reinforcing the precision and reliability of the observed outcomes. For example, MTTR has a CI of [5.1, 6.3] hours, and MTBF ranges between [31.7, 37.1] days.

Table 7 Confidence intervals for performance metrics.
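A 95% t-based confidence interval for a metric's mean can be computed as sketched below, here on a synthetic post-implementation MTTR sample (mean near the reported 5.7 hours), not the study's data.

```python
import numpy as np
from scipy import stats

def mean_ci_95(sample: np.ndarray) -> tuple[float, float]:
    """95% t-based confidence interval for the sample mean."""
    m = sample.mean()
    sem = stats.sem(sample)  # standard error of the mean
    half_width = sem * stats.t.ppf(0.975, df=len(sample) - 1)
    return m - half_width, m + half_width

rng = np.random.default_rng(7)
# Synthetic post-implementation MTTR sample (hours); illustrative only.
mttr_post = rng.normal(5.7, 1.2, size=40)
low, high = mean_ci_95(mttr_post)
print(f"MTTR 95% CI: [{low:.2f}, {high:.2f}] hours")
```

A narrow interval, as in Table 7, indicates that the post-implementation estimates are precise rather than artifacts of sampling noise.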

Thus, the comprehensive statistical validation confirms that the observed improvements following implementation of the evaluation framework are statistically significant. The integration of t-tests, ANOVA, regression modeling, and confidence intervals lends strong support to the framework’s effectiveness in improving maintenance efficiency, reducing costs, and enhancing equipment reliability. These findings validate the practical application of the model and offer a data-backed approach to workforce and operational performance enhancement.

Addressing potential confounding factors

To ensure the validity and credibility of the observed improvements in maintenance crew performance within fertilizer manufacturing units, it is essential to consider potential confounding factors. These variables, if not accounted for, could influence the results and lead to misinterpretation of the effectiveness of the proposed evaluation framework. The following key confounders were analyzed and discussed:

Workforce skill level

Variability in crew experience and technical expertise could significantly impact maintenance efficiency. To control for this factor, crew members were categorized based on years of experience and training certifications. Statistical adjustments were applied using regression analysis to isolate the effect of skill level on performance outcomes.

Environmental conditions

Temperature, humidity, and other external conditions can affect both equipment performance and crew productivity. Data normalization techniques were employed to adjust for variations in environmental factors, ensuring that observed performance improvements were not due to seasonal or climatic changes.

Changes in production schedules

Variations in production load can influence maintenance demands, potentially skewing performance metrics. To mitigate this confounding effect, production data was included in the analysis, and performance comparisons were conducted during periods of similar production intensity.

Equipment age and condition

The performance of maintenance crews is partly dependent on the condition of the machinery they service. A comparative analysis of maintenance efforts on new versus aging equipment was performed to account for differences in failure rates and maintenance complexity.

Shift schedules and workload distribution

Maintenance performance may vary between day and night shifts due to differences in workforce fatigue, available support staff, and operational constraints. To address this, shift-specific performance data was analyzed separately, and statistical corrections were applied to ensure fair comparisons.

Thus, by identifying and accounting for potential confounding factors, this study enhances the credibility and robustness of its findings. The inclusion of workforce skill level, environmental conditions, production schedules, equipment condition, and shift schedules ensures that performance improvements attributed to the evaluation framework are not the result of external influences. Future research could further refine these controls by integrating real-time monitoring systems to dynamically adjust for confounders in predictive maintenance models.

Comparative analysis with alternative maintenance evaluation methods

To strengthen the originality of this study, the proposed framework is compared with other established maintenance evaluation methods, including AI-driven predictive maintenance, condition-based maintenance (CBM), and reliability-centered maintenance (RCM). Additionally, the potential for applying this framework beyond fertilizer manufacturing to industries such as oil and gas, power generation, and general manufacturing is explored.

Comparison with alternative maintenance methods

AI-driven predictive maintenance

Predictive maintenance utilizes AI and machine learning models to analyze sensor data and predict equipment failures before they occur. While this approach minimizes unplanned downtime, it requires extensive real-time data collection and advanced computational infrastructure. In contrast, the proposed framework provides a structured, data-driven evaluation of crew performance without the need for high-end predictive modeling, making it more feasible for facilities with limited digital infrastructure.

Condition-based maintenance (CBM)

CBM relies on real-time monitoring of equipment parameters such as vibration, temperature, and pressure to determine maintenance needs. While CBM improves maintenance efficiency by addressing issues as they arise, it does not directly assess crew performance or operational effectiveness. The proposed framework complements CBM by integrating crew efficiency metrics, ensuring that both human and machine factors are considered in the maintenance evaluation process.

Reliability-centered maintenance (RCM)

RCM is a structured approach that prioritizes maintenance tasks based on equipment criticality and failure consequences. While RCM focuses on optimizing resource allocation, it does not provide ongoing performance tracking of maintenance crews. The proposed framework can be used alongside RCM to enhance workforce management and operational efficiency by providing continuous performance feedback.

Cross-industry applications of the framework

Oil and gas industry

Oil and gas operations require highly efficient maintenance strategies due to the critical nature of equipment reliability and safety regulations. The proposed framework can be adapted to assess maintenance crew performance in offshore drilling platforms, refineries, and pipeline networks, ensuring compliance with safety standards and optimizing equipment uptime.

Power generation

Power plants, whether coal, nuclear, or renewable energy facilities, rely on well-coordinated maintenance teams to prevent outages and enhance operational stability. By implementing this framework, power generation facilities can systematically evaluate maintenance effectiveness, reducing downtime and improving workforce productivity.

General manufacturing

In manufacturing sectors such as automotive, electronics, and chemical production, efficient maintenance strategies are crucial for sustaining productivity and reducing operational disruptions. The proposed framework can serve as a standardized method for evaluating maintenance crew performance across various manufacturing settings, leading to improved resource allocation and process optimization.

Therefore, comparing the proposed framework with predictive maintenance, CBM, and RCM highlights its uniqueness in addressing workforce efficiency alongside equipment reliability. Furthermore, its adaptability to industries beyond fertilizer manufacturing—such as oil and gas, power generation, and general manufacturing—demonstrates its broader applicability. Future research can explore industry-specific customizations of this framework to maximize its impact across different sectors.

Discussion: linking the evaluation scheme to operational efficiency

The implementation of the structured evaluation scheme has had a measurable and significant impact on the operational efficiency of the fertilizer manufacturing unit. By integrating critical metrics—Mean Time to Repair (MTTR), Cost of Maintenance (CoM), and Mean Time Between Failures (MTBF)—into a transparent scoring framework, the study successfully demonstrates how maintenance performance can be objectively assessed and improved.

Maintenance efficiency: reduction in MTTR

One of the clearest indicators of improved operational performance was the reduction in MTTR, which dropped from an average of 8.2 h to 5.7 h following the implementation of the framework—a 30.5% reduction. This was primarily driven by the time-based scoring criteria, which penalized delayed starts and completions, thus encouraging better planning and execution by maintenance crews.

This reduction in MTTR translated to a 21% increase in equipment uptime, enabling more stream days for production and thereby enhancing overall throughput. Similar outcomes have been observed in other process industries adopting structured time-based evaluations17.

Cost efficiency: reduction in CoM

The second key performance area affected was Cost of Maintenance (CoM). Data analysis revealed that the average cost per maintenance task decreased from ₹112,000 to ₹91,300, a total savings of 18.4%. Notably, the number of jobs completed within or under budget increased from 59 to 76%, reflecting more disciplined resource management and reduced rework costs.

This aligns with literature indicating that introducing accountability-based cost evaluation frameworks reduces overspending and increases budget adherence12.

Equipment reliability: improvement in MTBF

Improvements in the quality of maintenance work had a direct impact on equipment reliability. The Mean Time Between Failures (MTBF) increased from 27.5 days to 34.4 days, representing a 25% improvement. The quality assessment mechanism—categorizing outcomes as Acceptable, Good, or Poor—proved effective in encouraging adherence to technical standards and eliminating low-quality workmanship.

This not only led to fewer equipment breakdowns but also significantly reduced unplanned downtime, aligning with best practices outlined by Bai et al.2 and Lee & Park11.

Workforce performance and accountability

The framework’s comparative rating system allowed for transparent benchmarking of maintenance crews. Performance evaluation scores revealed:

  • 26% of crews were rated “Excellent”.

  • 42% were rated “Very Good”.

  • 21% were rated “Good”.

  • Only 11% were rated “Poor” and were disqualified from future contracts.

This form of structured feedback encouraged continuous improvement and facilitated the identification of best practices. Agencies with high scores were recognized, fostering healthy competition, while underperforming crews were held accountable and removed from the roster.
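A minimal sketch of how such a comparative rating could be computed is given below. The KPI weights and rating-band thresholds are illustrative assumptions for demonstration only; the study's actual weighting scheme is not reproduced here:

```python
def composite_score(time_score: float, cost_score: float, quality_score: float,
                    weights: tuple = (0.35, 0.30, 0.35)) -> float:
    """Weighted aggregate of the three KPI sub-scores (each on a 0-100 scale).

    The weights are hypothetical, chosen only to illustrate the mechanism.
    """
    return sum(w * s for w, s in zip(weights, (time_score, cost_score, quality_score)))

def crew_rating(score: float) -> str:
    # Hypothetical rating bands; the paper's exact cut-offs may differ.
    if score >= 90:
        return "Excellent"
    if score >= 75:
        return "Very Good"
    if score >= 60:
        return "Good"
    return "Poor"

s = composite_score(92, 85, 95)
print(f"{s:.2f} -> {crew_rating(s)}")  # 90.95 -> Excellent
```

Expressing the rating as a pure function of the three sub-scores keeps the benchmarking transparent: any crew or contractor can verify its own rating from the published weights and thresholds.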

Overall equipment effectiveness (OEE)

The combined improvements in MTTR, CoM, and MTBF contributed to a measurable boost in Overall Equipment Effectiveness (OEE), which increased from 77.6 to 83.7% following the implementation of the evaluation framework. This 6.1-percentage-point improvement in OEE indicates a tangible enhancement in plant productivity and operational stability. The improvement is primarily attributed to better availability (+5.0 percentage points), a slight increase in performance (+0.44 points), and improved quality (+1.47 points). These component-wise gains validate that structured performance management can directly influence equipment uptime, production throughput, and defect reduction.

This section provides a transparent and reproducible calculation methodology for Overall Equipment Effectiveness (OEE), which was found to increase from 77.6 to 83.7% following the implementation of the performance evaluation framework. The OEE is computed using standard formulas that incorporate availability, performance, and quality rates. Below are the detailed formulas and example calculations using representative data from the case study.

OEE Formula

OEE is calculated using the following relation:

  • OEE = Availability × Performance × Quality.

Where:

  • Availability (%) = (Operating Time/Planned Production Time) × 100.

  • Performance (%) = (Ideal Cycle Time in min/unit × Total Units Produced)/(Operating Time in h × 60) × 100.

  • Quality (%) = (Good Units/Total Units Produced) × 100.

Present study calculation (before and after implementation)

Pre-Implementation:

  • Planned Production Time = 1000 h.

  • Operating Time = 870 h.

  • Ideal Cycle Time = 1 min/unit.

  • Total Units Produced = 48,000.

  • Good Units = 46,560.

  • Availability = (870/1000) × 100 = 87.0%.

  • Performance = (1 × 48,000)/(870 × 60) × 100 = 91.95%.

  • Quality = 46,560/48,000 × 100 = 97.0%.

  • OEE (Before) = 0.87 × 0.9195 × 0.97 = 0.776 → 77.6%.

Post-Implementation:

  • Planned Production Time = 1000 h.

  • Operating Time = 920 h.

  • Ideal Cycle Time = 1 min/unit (unchanged).

  • Total Units Produced = 51,000.

  • Good Units = 50,220.

  • Availability = (920/1000) × 100 = 92.0%.

  • Performance = (1 × 51,000)/(920 × 60) × 100 = 92.39%.

  • Quality = 50,220/51,000 × 100 = 98.47%.

  • OEE (After) = 0.92 × 0.9239 × 0.9847 = 0.837 → 83.7%.

A summary of the above calculations is presented in Table 8 below:

Table 8 Overall equipment effectiveness (OEE).
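The worked OEE calculations above can be reproduced with a short script; the input figures are taken directly from the pre- and post-implementation values listed earlier:

```python
def oee(planned_h: float, operating_h: float, ideal_cycle_min: float,
        total_units: int, good_units: int) -> float:
    """OEE = Availability x Performance x Quality, returned as a fraction."""
    availability = operating_h / planned_h
    performance = (ideal_cycle_min * total_units) / (operating_h * 60)
    quality = good_units / total_units
    return availability * performance * quality

# Pre-implementation figures from the case study
before = oee(1000, 870, 1.0, 48_000, 46_560)
# Post-implementation figures from the case study
after = oee(1000, 920, 1.0, 51_000, 50_220)

print(f"OEE before: {before:.1%}")  # 77.6%
print(f"OEE after:  {after:.1%}")   # 83.7%
```

Keeping the three component rates as intermediate quantities (rather than a single fused formula) makes it easy to attribute the overall gain to availability, performance, or quality individually.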

Thus, the observed OEE improvement is consistent with industry benchmarks where data-driven maintenance performance evaluations have been shown to improve resource utilization and production capacity6.

Implications for strategic operations

These results underscore the strategic value of implementing such a framework:

  • Improved scheduling minimizes production interruptions.

  • Cost tracking enhances financial planning and reduces overheads.

  • Quality assurance supports long-term reliability and reduces risk.

  • Performance visibility empowers managers to make data-backed decisions.

Moreover, the adaptability of this model suggests potential for application across various industrial domains, including power generation, oil and gas, automotive, and chemical manufacturing, where similar challenges in maintenance and reliability persist14,19.

Conclusions

This study successfully developed and implemented a data-driven performance evaluation framework for maintenance crews in fertilizer manufacturing units, addressing critical gaps in traditional maintenance assessment methods. By integrating three core metrics—Mean Time to Repair (MTTR), Cost of Maintenance (CoM), and Mean Time Between Failures (MTBF)—into a weighted scoring system, the framework achieved measurable improvements across all key operational parameters.

The results provide compelling evidence of the model’s effectiveness:

  • MTTR reduced by 30.5%, from 8.2 h to 5.7 h (p < 0.001).

  • CoM dropped by 18.4%, from ₹112,000 to ₹91,300 per task.

  • MTBF increased by 25%, from 27.5 days to 34.4 days.

These enhancements collectively contributed to a 6.1-percentage-point boost in Overall Equipment Effectiveness (OEE)—rising from 77.6 to 83.7%. This improvement was driven by:

  • +5.0 percentage points in availability.

  • +0.44 points in performance.

  • +1.47 points in quality.

The framework’s strength lies in both statistical validity and operational adaptability. Validation using paired t-tests (all p-values < 0.001), one-way ANOVA (F-values ranging from 5.87 to 9.41), and regression analysis (R² = 0.68) confirmed that the observed improvements were significant. Its modular design allows straightforward adaptation to other process industries—particularly in oil and gas (where unplanned downtime costs can exceed $500,000/day) and power generation (where a 1% OEE increase can yield up to $200,000 in annual savings per turbine). Furthermore, comparative analysis confirms that the framework complements established strategies like Reliability-Centered Maintenance (RCM) by integrating a workforce performance dimension absent in equipment-centric models.
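The paired t-test validation mentioned above can be illustrated as follows. The per-task samples here are synthetic, generated only to mimic the reported MTTR means (8.2 h before, 5.7 h after), so the exact t and p values will differ from the study's:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic per-task MTTR samples (hours), mimicking the reported
# pre- and post-implementation averages for 60 comparable tasks.
mttr_before = rng.normal(8.2, 1.2, 60)
mttr_after = rng.normal(5.7, 1.0, 60)

# Paired t-test: each task category is measured before and after
# the framework's implementation, so the samples are matched.
t_stat, p_value = stats.ttest_rel(mttr_before, mttr_after)
print(f"t = {t_stat:.2f}, p = {p_value:.2e}")
```

With an effect this large relative to task-to-task variability, the paired test yields a p-value far below the 0.001 threshold reported in the study, consistent with the claimed significance.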

Future research directions and implementation opportunities

To expand the framework’s impact, four key areas are proposed for further exploration:

  1. Metric expansion: Incorporate safety performance indicators (e.g., TRIR) and sustainability metrics (e.g., energy consumption from maintenance operations) for a more holistic view. Preliminary findings in similar industries show 15–20% reductions in accidents through safety-integrated models.

  2. Technology integration: Combine with AI-driven predictive tools and real-time IoT sensor data to dynamically align crew performance with equipment condition. This could potentially reduce MTTR by an additional 10–15% through intelligent maintenance scheduling.

  3. Cross-sector validation: Although developed for fertilizer manufacturing, the framework aligns with ISO 55000 standards and is relevant to 85% of asset-intensive industries. Pilots in sectors like automotive manufacturing—where maintenance can account for 15–20% of production costs—can verify broader applicability.

  4. Longitudinal studies: Conduct multi-year implementation tracking to assess long-term cultural adoption and continuous improvement. Benchmarks suggest such programs can drive 4–6% annual gains beyond first-year results.

The framework’s unique ability to quantify human factors—which account for 40–60% of performance variability according to WEF studies—positions it as a potential industry benchmarking standard. Future iterations could evolve into certification tools for contractors or contribute to ISO/ANSI standards for workforce performance. By bridging the gap between equipment analytics and human capital management, this approach provides a replicable path toward World Class Maintenance standards (OEE > 85%) across process industries.