Abstract
Sewer systems play a crucial role in protecting public health and mitigating flood risk. This study proposes a framework that integrates survival analysis and spatial data management to predict sewer failure time. Taking Hong Kong as a case study, comprehensive sewer data are incorporated into the ArcGIS database. The methodology employs Kaplan–Meier analysis to determine a critical time threshold (T0) at a 95% survival probability. Group differences are assessed using log-rank tests, and cumulative hazard rates are estimated via Nelson–Aalen estimation. The study investigates post-T0 degradation patterns in physical, functional, and environmental factors. Based on cumulative hazard rates after T0, a tertile-based classification system defines two boundaries (T1 and T2). This system categorizes pipelines into four risk levels, enabling decision-makers to select an appropriate failure time. The results are visualized through GIS mapping and supported by an iterative forecasting system that optimizes maintenance strategies through operational feedback.
Introduction
Sewage and stormwater drainage networks play a critical role in safeguarding public health, preserving environmental quality, and sustaining the resilience of modern cities1,2,3. These vital systems demand substantial public investments, with replacement costs often reaching billions of dollars even in medium-sized cities4. As these networks age, pipeline failures occur more frequently5,6,7 and can have increasingly severe consequences for surrounding infrastructure8,9,10,11.
Traditionally, utility companies primarily relied on closed-circuit television (CCTV) inspections to passively identify visible defects, such as cracks, corrosion, and deposits, in drainage pipelines9,11,12,13. Although valuable, CCTV-based assessments are often prohibitively expensive, limiting inspection frequency and reducing overall coverage14. Consequently, many pipelines classified as lower priority remain insufficiently examined, allowing potential issues to develop unnoticed until severe deterioration or outright failure occurs5,14,15.
To address these limitations, predictive maintenance approaches have emerged as promising solutions for forecasting sewer failures before they occur. In this field, researchers have developed three primary categories of predictive models, namely physical models, statistical models, and artificial intelligence (AI) models, each designed to anticipate potential failures. Physical models rely on the principles of mechanics and corrosion to simulate the degradation of components under both environmental and operational stresses. For example, Teplỳ, Yoon16,17, and Shadabfar18 employed corrosion theory, whereas Davis and Frank19,20 utilized Linear Elastic Fracture Mechanics, and Zamanian21 applied Nonlinear Continuum Mechanics to predict failure. Statistical models leverage historical failure data to estimate future failures using various methods. For example, Ebrahimi and Teplỳ16,22 and Jiang and Li23,24 have used techniques such as eXplainable Inference Models (XIM) and multiple linear regression, while other studies, including those by Altarabsheh25 and Ghavami26, have applied methods like Markov chain models and Bayesian networks for failure prediction. AI and machine learning techniques provide another data-driven solution for sewer failure prediction. Artificial Neural Networks (ANN) have demonstrated exceptional capability in handling multidimensional input data, as shown in the works of Khan and Sousa11,27, while Support Vector Machines (SVM) have proven effective in addressing nonlinear prediction problems11,28. Advances in ensemble learning methods have yielded significant improvements in prediction accuracy, with studies by Fontecha and Santos8,29 implementing Random Forest, XGBoost, and CART algorithms. Additionally, rule-based systems, including fuzzy logic24, expert systems30,31, and rule-based simulation32,33 have shown effectiveness in incorporating expert knowledge and handling uncertainty analysis. 
These physical, statistical, and AI models typically integrate various factors, including pipe age, material properties, environmental conditions, and operational stresses, to deliver comprehensive failure predictions11,12. Despite recent advances in modeling techniques, multi-factor approaches often encounter significant challenges due to the extensive volume and stringent quality requirements of data. By contrast, concentrating on a single, high-impact factor offers several advantages. First, it streamlines implementation in environments where only a few types of data are available. Second, it isolates the independent influence of that factor on system deterioration, thereby providing direct guidance for risk-based decision-making. Consequently, this analytical perspective not only provides a pragmatic tool for identifying critical thresholds but also lays a solid foundation for implementing timely intervention strategies.
Building on this rationale, the present study introduces a factor-specific survival analysis framework designed to systematically isolate and quantify the time-dependent effect of a key determinant on pipeline deterioration. To achieve this, the framework utilizes the Kaplan–Meier estimator and the Nelson–Aalen technique, methods originally developed for predicting patient survival in medical research34,35,36. These nonparametric methods offer distinct advantages over machine learning and hydraulic-based approaches. Specifically, they require only survival age and categorical attributes, making them suitable for utilities that hold only a few types of data; they rely on straightforward mathematical computations instead of complex, iterative algorithms; and they yield clear visual outputs that facilitate immediate insights and prompt, actionable maintenance decisions while maintaining robust predictive capability. Having been proven effective across various domains, including infrastructure reliability assessment37,38, this framework equips utilities with a practical tool to prioritize maintenance through focused analysis of critical indicators such as pipe material, diameter, or age, ultimately supporting efficient system management even when few types of data are available.
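For reference, the two estimators named above have the following standard textbook forms. The notation is an assumption introduced here and is not taken from the study: $t_i$ are the distinct ages at which failures were observed, $d_i$ is the number of failures at $t_i$, and $n_i$ is the number of pipelines still in service (at risk) just before $t_i$.

```latex
\hat{S}(t) = \prod_{t_i \le t} \left( 1 - \frac{d_i}{n_i} \right),
\qquad
\hat{H}(t) = \sum_{t_i \le t} \frac{d_i}{n_i}
```

Here $\hat{S}(t)$ is the Kaplan–Meier survival estimate and $\hat{H}(t)$ is the Nelson–Aalen cumulative hazard estimate. Both need only each pipeline's age at failure or at last observation, plus a flag indicating which of the two occurred.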
To demonstrate its real-world applicability, this study uses Hong Kong as a case example. Hong Kong’s densely built environment and extensive sewer infrastructure underscore both the complexity of pipeline deterioration and the practical need for efficient maintenance planning39,40. By integrating survival analysis with local pipeline data, this research illustrates how a focus on a single key factor can support proactive and streamlined management.
Results
The results of this study are presented in three main sections. The first section provides comprehensive statistical indicators derived from survival analysis, serving as quantitative references for industry practitioners. Section two discusses the process of determining the pipeline failure boundary time by integrating key statistical indicators with specific operational contexts. Section three summarizes the key indicators and maintenance strategies.
Physical factor-based analysis
Pipeline length classification utilizes a trisection method to divide pipelines into long, medium, and short spans. The Log-Rank test detected significant differences in survival patterns among these groups, with a p-value smaller than 0.01. Specifically, long-span pipelines reached a critical point at 42 years, medium-span at 47 years, and short-span at 49 years, all with similar initial hazard rates of approximately 0.052 (Table 1). Beyond these critical points, risk progression patterns diverged considerably. Long-span pipelines exhibited the most pronounced risk escalation, with hazard rates increasing from 0.1578 to 0.3156 between 60 and 70 years, ultimately reaching 0.5401. This accelerated deterioration likely results from factors such as greater deflection, stress concentration, and multiple joint-related risks. Medium-span pipelines showed moderate progression, with a final hazard rate of 0.3640, whereas short-span pipelines demonstrated the slowest risk accumulation (final hazard rate: 0.3226), which may be attributed to their enhanced structural stability.
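The critical points quoted above are the ages at which the Kaplan–Meier survival estimate reaches 95%. A minimal sketch of that calculation follows, under stated assumptions: the function names, the toy records, and the use of `<=` at the threshold are illustrative choices, not the study's implementation.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return [(age, survival)] at each distinct age with at least one failure.

    durations: age at failure or at last observation (censoring)
    events:    1 if the pipeline failed at that age, 0 if censored
    """
    failures = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)          # failures + censorings per age
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]             # everyone leaving at t drops out of the risk set
    return curve

def critical_time(curve, threshold=0.95):
    """First age at which survival is at or below the threshold (None if never)."""
    for t, s in curve:
        if s <= threshold:
            return t
    return None

# Hypothetical toy group of 30 pipelines: failures at ages 35 and 42,
# one pipeline censored at 40, the remaining 27 still in service at age 60.
toy_durations = [35, 40, 42] + [60] * 27
toy_events = [1, 0, 1] + [0] * 27
toy_curve = kaplan_meier(toy_durations, toy_events)
toy_t0 = critical_time(toy_curve)
```

In this toy group the survival estimate is 29/30 after the first failure and (29/30)(27/28) after the second, so the 95% threshold is crossed at the second failure age.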
Material analysis comparing vitrified clay and concrete pipelines revealed significant differences between them, as determined by the Log-Rank test (p < 0.01). Concrete pipelines reached their critical point at 45 years (survival probability 0.9498, hazard rate 0.0515), while vitrified clay pipelines lasted slightly longer to 47 years (0.9498, 0.0516) (Table 1). Both materials showed similar final cumulative hazard rates (concrete: 0.3970, clay: 0.4053). However, their risk patterns differed throughout service life. Concrete pipelines showed consistently higher hazard rates, possibly due to acid-base corrosion, cracking, and carbonation. Vitrified clay, despite better initial durability from corrosion resistance, may have experienced faster deterioration later, likely attributed to brittleness and joint seal failures.
The Log-Rank test (p < 0.01) performed in the diameter analysis demonstrated significant differences among the three categories. Small-diameter pipelines reached their critical point at 46 years (survival probability 0.9498, hazard rate 0.0515), as did medium-diameter pipelines (0.9473, 0.0541), followed by large-diameter pipelines at 48 years (0.9474, 0.0540) (Table 1). While initial hazard rates were comparable, long-term risk patterns differed significantly. Small-diameter pipelines showed the steepest risk increase, with hazard rates rising from 0.1475 to 0.3256 between years 60 and 70, ultimately reaching 0.4386. This accelerated deterioration likely reflects their vulnerability to sediment blockages and structural deformation due to smaller cross-sections and thinner walls. Medium-diameter pipelines displayed moderate risk progression (final hazard rate 0.3805), while large-diameter pipelines showed the slowest deterioration (final hazard rate 0.3511), which may be attributed to superior flow characteristics and structural stability.
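The survival probabilities and cumulative hazard rates reported at the critical points can be cross-checked against the identity H(t) = -ln S(t). This holds exactly for a continuous lifetime distribution and only approximately between Kaplan–Meier and Nelson–Aalen estimates. The sketch below applies it to the material and diameter pairs quoted above. The 0.0005 tolerance is an assumption chosen to absorb four-decimal rounding and the small gap between the two estimators.

```python
import math

# (group, survival probability at the critical point, reported cumulative hazard)
reported = [
    ("concrete", 0.9498, 0.0515),
    ("vitrified clay", 0.9498, 0.0516),
    ("small diameter", 0.9498, 0.0515),
    ("medium diameter", 0.9473, 0.0541),
    ("large diameter", 0.9474, 0.0540),
]

TOLERANCE = 5e-4  # assumed allowance for rounding, not a value from the study

def implied_hazard(survival):
    """Cumulative hazard implied by a survival probability via H = -ln(S)."""
    return -math.log(survival)

# Absolute gap between the implied and the reported hazard for each group.
gaps = {name: abs(implied_hazard(s) - h) for name, s, h in reported}
consistent = all(gap < TOLERANCE for gap in gaps.values())
```

This also explains why every group starts from a hazard rate near 0.05: a 95% survival threshold implies a cumulative hazard of about -ln(0.95).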
A comprehensive maintenance strategy emerges from analyzing physical pipeline factors. Long-span pipelines need intensive preventive maintenance between years 60 and 70 with replacement considerations after 70 years, while medium spans require moderate maintenance focused on high-stress areas, and short spans allow extended intervals. Concrete pipelines demand intensive preventive care from their first alert year, whereas vitrified clay types need regular early inspections with increased later-stage frequency. Small-diameter pipelines require frequent maintenance, medium diameters need moderate intervals, and large diameters can have longer inspection gaps with regular structural checks.
Function factor-based analysis
The Log-Rank test revealed significant differences between foul sewer and stormwater systems (p < 0.01). Stormwater systems reached their critical point at 45 years, while foul sewer systems reached theirs at 47 years, with comparable initial survival probabilities (0.9494 for foul sewer, 0.9499 for stormwater) and hazard rates (0.0520 and 0.0514, respectively) (Table 2). The systems’ risk progression patterns differed markedly thereafter. Stormwater pipelines showed accelerated deterioration, with hazard rates rising sharply from 0.1525 to 0.4163 after year 60, due to variable loading conditions, seasonal flow changes, and debris accumulation. In contrast, foul sewer pipelines maintained more stable risk levels, benefiting from regular daily flow patterns despite carrying complex wastewater.
Based on these risk progression patterns, a targeted maintenance strategy is proposed. For stormwater pipes, it is recommended to increase inspection frequency after the age of 60. For foul sewer pipes, while a relatively stable maintenance cycle can be maintained, continuous monitoring of wastewater composition and flow patterns remains necessary.
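The two-group comparison used for foul sewer versus stormwater pipelines can be sketched as a standard log-rank test. The version below is a plain-Python illustration, not the study's code: the function name and toy ages are hypothetical. The p-value uses the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math

def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi-square statistic, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    failure_ages = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in failure_ages:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)   # at risk, group B
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group-A failure count at age t.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))   # survival function of chi-square, 1 df
    return chi2, p_value

# Hypothetical toy data: group A fails early, group B fails late, no censoring.
early = [1, 2, 3, 4, 5]
late = [6, 7, 8, 9, 10]
chi2_diff, p_diff = logrank_two_groups(early, [1] * 5, late, [1] * 5)

# Two identical groups should show no difference at all.
chi2_same, p_same = logrank_two_groups([1, 2, 3, 4], [1] * 4, [1, 2, 3, 4], [1] * 4)
```

With more than two groups, as in the length and diameter analyses, the same idea extends to a multi-group statistic or to pairwise comparisons.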
Environment factors-based analysis
Based on Table 3, each land use category exhibits distinct hazard patterns. Green open space pipelines show three phases: slow growth from ages 46 to 61 (cumulative hazard 0.0531 to 0.1155), steady growth from 61 to 76 (reaching 0.2333), and rapid acceleration after age 76 (exceeding 0.4628). Residential pipelines demonstrate a gradual risk increase from 0.0548 at age 47 to 0.3753 at age 80. Commercial pipelines display the most stable progression, uniformly increasing from 0.0539 at age 48 to 0.4164 by age 80. Industrial pipelines show a unique pattern: increasing by 0.1142 over 20 years from age 41, accelerating sharply between ages 60 and 70 to 0.3612, and reaching 0.5390 after age 75. This pattern correlates with harsh industrial conditions, including corrosive wastewater and fluctuating ground loads. To statistically verify the differences among land use categories, the Log-Rank test was conducted. The results (Table 4) confirmed that industrial areas differed significantly from other land use categories (p < 0.01), while no significant differences were found among commercial areas, green open spaces, and residential areas (p > 0.01). These findings suggest differentiated maintenance strategies: intensive monitoring of corrosion and external loads in industrial areas from age 60, with replacement when necessary; regular periodic inspections in residential and commercial areas; and increased monitoring frequency in green open spaces after age 76.
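The cumulative hazard values traced above come from the Nelson–Aalen estimator, and phase-wise statements such as "increasing by 0.1142 over 20 years" are differences of that curve between two ages. A minimal sketch follows; the function names and toy records are hypothetical and do not reproduce Table 3.

```python
from collections import Counter

def nelson_aalen(durations, events):
    """Return [(age, cumulative hazard)] at each distinct age with a failure."""
    failures = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)          # failures + censorings per age
    at_risk = len(durations)
    cumulative = 0.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)
        if d:
            cumulative += d / at_risk   # hazard increment d_i / n_i
            curve.append((t, cumulative))
        at_risk -= exits[t]
    return curve

def hazard_at(curve, age):
    """Step-function value: last estimate at or before `age` (0 before any failure)."""
    value = 0.0
    for t, h in curve:
        if t > age:
            break
        value = h
    return value

def hazard_increase(curve, start_age, end_age):
    """Growth of the cumulative hazard between two ages."""
    return hazard_at(curve, end_age) - hazard_at(curve, start_age)

# Hypothetical toy group of 30 pipelines (failures at 35 and 42, one censored at 40).
toy_durations = [35, 40, 42] + [60] * 27
toy_events = [1, 0, 1] + [0] * 27
toy_hazard = nelson_aalen(toy_durations, toy_events)
growth_35_to_50 = hazard_increase(toy_hazard, 35, 50)
```

Comparing such increments across age windows is one way to label a phase as slow, steady, or accelerating.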
The study analyzed pipeline survival characteristics across low, medium, and high humidity levels. Statistical analysis revealed significant differences between medium humidity and other conditions (p < 0.01), while low and high humidity environments showed no significant variations (p = 0.26) (Table 5). Using 95% survival probability as a critical threshold, high humidity environments reached this point first at age 43, followed by low (age 45) and medium humidity (age 48). According to Table 3, risk dynamics monitoring showed distinct patterns: high-humidity environments exhibited accelerated risk between ages 60 and 70, and low-humidity environments displayed rapid early-stage risk increases before stabilizing around age 63. Medium-humidity environments were distinctive: risk accumulated slowly from ages 48 to 60 and then accelerated markedly. This pattern likely results from combined moderate humidity erosion and wet-dry cycling effects. Based on these findings, targeted maintenance strategies are recommended: intensive monitoring around age 60 for medium humidity environments, consistent maintenance for high humidity pipelines, and regular standardized inspections for low-humidity systems.
The study analyzed pipeline survival characteristics across four districts: New Territories, Kowloon, Hong Kong Island, and the Islands. Statistical analysis revealed significant differences between all district pairs (p < 0.01) except between Hong Kong Island and Islands (p = 0.82) (Table 6). The survival analysis showed distinct patterns among districts. New Territories reached the warning level earliest at age 42, experiencing a steady increase until age 63 before stabilizing at 0.1777. Kowloon entered the warning period at age 47, showing moderate growth until age 59, then sharply rising to 0.4047. Islands reached the warning level at age 49, maintaining slow growth until age 68, after which the risk stabilized at 0.2248. Hong Kong Island, despite entering the warning period at age 49, displayed the most dramatic increase after age 76, reaching the highest risk level of 0.4133 (Table 3). Based on these patterns, the following maintenance strategies are proposed: (1) New Territories: routine maintenance between ages 42 and 63, followed by enhanced preventive measures; (2) Kowloon: moderate maintenance ages 47–59, with intensified interventions thereafter; (3) Islands: standard maintenance before age 68, with continued regular monitoring thereafter as risk stabilizes; (4) Hong Kong Island: moderate interventions before age 76, followed by intensive maintenance and renewal plans after age 76.
The study analyzed pipeline survival characteristics across three temperature categories: high, medium, and low. Statistical analysis revealed significant differences between high-temperature environments and others (p < 0.01), while low and medium-temperature environments showed no significant differences (p = 0.79) (Table 7). Using 95% survival probability as the warning threshold, low-temperature pipelines reached this level first (43 years), followed by medium-temperature (46 years) and high-temperature pipelines (49 years). Risk progression analysis (Table 3) showed distinct patterns: Low-temperature pipelines experienced gradual risk increase (0.0514–0.1466) between 43 and 61 years, followed by accelerated deterioration to 0.4177. Medium-temperature pipelines demonstrated the most stable progression, reaching 0.3606. High-temperature pipelines, despite the latest breach of 49 years, showed superior reliability to 75 years, likely due to stricter design standards. However, after 75 years, their failure probability exceeded others, suggesting accelerated aging from prolonged heat exposure. Based on these patterns, the following maintenance strategy is proposed: (1) High-temperature pipelines: Begin prevention at 49 years, focusing on material degradation, with accelerated replacement after 75 years; (2) Medium-temperature pipelines: Implement regular maintenance from 46 years; (3) Low-temperature pipelines: Institute intensive monitoring from 61 years with consistent lifecycle maintenance.
The study analyzed pipeline survival across four traffic categories: none, light, moderate, and heavy. Statistical analysis showed significant differences between heavy traffic and light/moderate conditions (p < 0.01), and between no-traffic and light/moderate conditions (p = 0.01) (Table 8). Pipelines enter critical monitoring when the survival probability drops below 95%. Tracking data revealed distinct warning threshold patterns: heavy traffic pipelines reached warning levels first at age 44 (survival probability 0.9472), followed by no traffic (age 46, 0.9500), light traffic (age 47, 0.9484), and moderate traffic (age 48, 0.9475). Heavy traffic pipelines showed the highest final cumulative risk (0.7812), while others maintained moderate levels (0.3393–0.3975). Risk analysis revealed accelerated deterioration in heavy traffic pipelines, with risks increasing from 0.0542 to 0.1365 between ages 44 and 60, then rapidly doubling. This acceleration likely stems from increased vertical pressure and fatigue effects, leading to accelerated structural damage and crack propagation. No traffic and moderate traffic pipelines showed similar risk patterns, with final risks of 0.3975 and 0.3393, respectively: statistically different but practically comparable (Table 3). Based on these findings, maintenance recommendations are: (1) Heavy traffic pipelines: Begin preventive maintenance at age 44, with intensive monitoring through age 60. Consider replacement after age 60. (2) Light/moderate traffic pipelines: Start inspections at ages 47–48, focusing on stress points. Longer maintenance intervals are acceptable given lower risks. (3) No-traffic pipelines: Implement regular maintenance from year 46, despite higher cumulative risk (0.3975).
The study analyzed pipeline survival characteristics across three rainfall categories: high, medium, and low rainfall regions. Statistical analysis revealed significant differences between all rainfall regions (p < 0.01). Medium rainfall regions reached the critical point earliest (0.9476 at age 44), followed by low rainfall regions (0.9485 at age 46), and high rainfall regions (0.9488 at age 48). Initial cumulative risks at warning thresholds were comparable across regions (Table 3): high rainfall at 0.0525 (age 48), medium at 0.0538 (age 44), and low at 0.0529 (age 46). Risk patterns remained similar until age 64, then diverged significantly. High rainfall regions showed the most dramatic increase, from 0.1972 to 0.4186 between ages 65 and 77, due to increased flow impacts and accelerated corrosion from constant moisture exposure. Low rainfall regions maintained the most stable progression (final risk 0.2225), while medium rainfall areas showed a moderate increase (final risk 0.3735). Recommendations: Implement intensive preventive maintenance after 65 years in high rainfall regions, focusing on corrosion protection with replacement consideration when necessary. Medium-rainfall regions require moderate maintenance with regular inspections, while low-rainfall areas can extend maintenance cycles with optimized inspection frequencies.
Pipeline performance was evaluated across six geological settings: graphite-bearing strata, backfilled areas, tuff and lava zones, granodiorite zones, surface sediment zones, and granite zones. Statistical analysis revealed significant differences between multiple geological conditions (p < 0.01 to p = 0.03) (Table 9). Data analysis reveals distinct patterns across geological settings, with critical points reached at different ages: graphite-containing formations (age 33, risk: 0.0517–0.0678), filled areas (age 42, risk: 0.0537–0.4005), tuff and lava regions (age 44, risk: 0.0553–0.4010), granodiorite areas (age 45, risk: 0.0590–0.1643), superficial deposits (age 47, risk: 0.0517–0.4448), and granitic rocks regions (age 49, risk: 0.0522–0.3922). Risk evolution patterns vary significantly under different foundation conditions. Filled areas consistently show higher risks than granite foundations due to compaction and settlement issues. Granite and tuff/lava foundations exhibit similar trends until age 65, after which granite shows higher risks due to stress concentration. Granodiorite areas stabilize after age 65, while granite areas continue deteriorating due to brittleness and stress sensitivity. Before the age of 70, filled areas maintain higher risks than tuff and lava areas. Post-70, tuff and lava areas experience sudden risk increases due to strength degradation and accelerated weathering. Surface sediment areas show late-stage risk surges (ages 70–75), while backfilled regions deteriorate earlier (Table 3). 
Based on these patterns, the maintenance framework prioritizes: (1) Superficial deposits: Monitor from age 47; implement stabilization before 72–75 years. (2) Filled areas: Preventive maintenance from age 42; intensive care during 50–70 years. (3) Tuff/lava regions: Monitor from age 44; focus on stabilization during 55–70 years. (4) Granitic regions: Start monitoring at age 49; emphasize stress management post-65 years. (5) Granodiorite areas: Monitor from age 45; intensive inspection 45–58 years. (6) Graphite formations: Basic monitoring from age 33; minimal intervention needed.
Failure boundary time definition and risk-based spatiotemporal analysis
The failure boundary time determination approach uses a tertile-based classification system developed specifically for pipelines that have surpassed their critical thresholds. In this framework, T0 is defined as the time point at which pipelines exhibit a 95% survival probability. Beyond T0, the cumulative risk level is divided into three segments using tertile boundaries. T1 marks the first tertile boundary, and T2 marks the second. Based on these thresholds, pipelines are categorized into four distinct risk levels. The Safe Level (Green) group includes pipelines operating before T0 with a high survival probability. The Low Risk group (Yellow) comprises pipelines falling between T0 and T1. The Medium Risk group (Orange) consists of pipelines between T1 and T2. The High Risk group (Red) encompasses pipelines beyond T2.
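The four-level scheme above can be sketched as follows. The text does not spell out how the tertile boundaries are computed, so this sketch assumes one reading: the cumulative hazard range between T0 and the last observed age is cut into three equal parts, and T1 and T2 are the first ages at which the curve reaches the one-third and two-thirds levels. The function names, the toy curve, and the handling of ages that fall exactly on a boundary are likewise assumptions.

```python
def tertile_boundaries(curve, t0):
    """Return (T1, T2) from a cumulative hazard curve [(age, hazard)] sorted by age.

    Assumed reading: split the hazard range from T0 to the final age into
    three equal parts and take the first age reaching each cut level.
    """
    post = [(age, h) for age, h in curve if age >= t0]
    h_start, h_end = post[0][1], post[-1][1]
    step = (h_end - h_start) / 3.0

    def first_age_reaching(level):
        return next(age for age, h in post if h >= level)

    return first_age_reaching(h_start + step), first_age_reaching(h_start + 2.0 * step)

def risk_level(age, t0, t1, t2):
    """Map a pipeline age to one of the four risk levels."""
    if age < t0:
        return "Safe"      # green
    if age < t1:
        return "Low"       # yellow
    if age < t2:
        return "Medium"    # orange
    return "High"          # red

# Hypothetical cumulative hazard curve, not taken from the study's tables.
toy_curve = [(42, 0.05), (50, 0.09), (58, 0.16), (64, 0.24),
             (70, 0.32), (75, 0.41), (80, 0.53)]
toy_t0 = 42
toy_t1, toy_t2 = tertile_boundaries(toy_curve, toy_t0)
levels = [risk_level(a, toy_t0, toy_t1, toy_t2) for a in (30, 50, 66, 78)]
```

Under a different reading, for example tertiles of the distribution of post-T0 hazard values, only `tertile_boundaries` would change and the classification step would stay the same.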
In practice, the selection of failure boundary time should align with risk acceptance levels and be supported by statistical indicators that serve as quantitative benchmarks. Different strategies can be adopted based on organizational needs and risk tolerance: A conservative approach may adopt T0 as the failure boundary time, restricting operations to pipelines within the Safe Level zone; a balanced strategy might select T1, allowing operations within the low-risk zone; while an aggressive strategy could opt for T2, permitting continued operation into the Medium Risk zone. Pipelines falling into the high-risk category warrant immediate monitoring, further assessment, and potential replacement. This flexible framework, combining statistical evidence with specific operational contexts and expert knowledge, enables practitioners to define appropriate failure criteria and modify risk classifications as specific requirements demand.
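The mapping from risk tolerance to failure boundary time described above can be written down directly. The sketch below is illustrative only: the strategy keys, pipeline identifiers, boundary ages, and the choice to flag a pipeline whose age equals the boundary are all hypothetical.

```python
# Which boundary each strategy adopts as the failure boundary time.
BOUNDARY_BY_STRATEGY = {
    "conservative": "T0",   # operate only within the Safe Level zone
    "balanced": "T1",       # allow operation within the Low Risk zone
    "aggressive": "T2",     # allow operation into the Medium Risk zone
}

def past_failure_boundary(pipe_ages, boundaries, strategy):
    """Return sorted IDs of pipelines at or beyond the chosen failure boundary."""
    limit = boundaries[BOUNDARY_BY_STRATEGY[strategy]]
    return sorted(pid for pid, age in pipe_ages.items() if age >= limit)

# Hypothetical boundary ages (years) and pipeline ages.
toy_boundaries = {"T0": 42, "T1": 64, "T2": 75}
toy_pipes = {"P1": 30, "P2": 50, "P3": 66, "P4": 78}

flagged = {
    strategy: past_failure_boundary(toy_pipes, toy_boundaries, strategy)
    for strategy in BOUNDARY_BY_STRATEGY
}
```

Moving from the conservative to the aggressive setting shrinks the flagged set, which is the trade-off between intervention cost and accepted risk that the text describes.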
A GIS map (Fig. 1) visually illustrates the spatial distribution of these risk categories for the long-span pipeline factor across Hong Kong. The green-bordered panel in the upper right displays the overall distribution of pipelines with enlarged sections indicated by colored boxes. The red-bordered panel provides a detailed view of Hong Kong Island and Kowloon, where Medium Risk and High Risk pipelines can be seen congregating in central Kowloon and the urban areas of northern Hong Kong Island. The black-bordered panel in the upper left focuses on the Tai Po district, predominantly showing pipelines in the Safe Level category. The blue-bordered panel in the lower left highlights the northwestern New Territories, where Safe Level pipelines prevail with occasional Low Risk segments. A purple-bordered detail of central Hong Kong clearly shows all four risk levels represented by red, orange, yellow, and green. This visualization enables management personnel to identify clusters of varying risk levels and to make informed decisions regarding resource allocation.
The figure presents a multi-panel map showing the risk assessment of long-span pipeline networks in Hong Kong. The central panel (red border) displays the main overview of Hong Kong Island, Kowloon, and surrounding areas, with pipeline segments color-coded according to risk levels: high-risk (red), medium-risk (orange), low-risk (yellow), and safe-level (green). Two detailed inset maps are shown in black and blue borders, focusing on specific districts with dense pipeline networks. The top-right corner includes a smaller overview map (green border) showing the geographical context of the study area, and a legend indicating the risk level classification. A north arrow is provided for orientation. The purple boxes in the main map highlight urban areas with detailed views.
Integrating statistical analysis with spatial visualization yields a robust decision support framework for infrastructure management. By identifying geographic clusters of high-risk pipelines with GIS mapping, resources can be directed more efficiently toward areas needing urgent intervention while standard maintenance protocols apply to lower-risk areas. This combined approach promotes a proactive and strategic pipeline management system that optimizes resource allocation and enhances public safety.
Summary of key indicators and maintenance strategies
Based on the single-factor survival analyses of physical, functional, and environmental factors, critical thresholds, cumulative hazard rate evolution patterns, and corresponding maintenance priorities and recommendations have been identified for each subfactor. These core findings have been condensed into Tables 10–12 to facilitate quick reference and comparison by industry practitioners.
Table 10: Summarizes key risk indicators and maintenance strategies for physical characteristics (length, material, diameter), facilitating quick identification of pipelines requiring priority attention during routine inspections and major maintenance planning.
Table 11: Focuses on functional group performance differences (stormwater vs. foul sewer), listing warning thresholds, acceleration phases, and targeted inspection frequency recommendations for each pipeline type.
Table 12: Covers multiple environmental factors, including land use, climate, traffic load, and soil conditions, concisely presenting “recommended monitoring initiation ages,” “critical protection periods,” and key concerns.
These tables translate the extensive survival analysis results (e.g., Kaplan–Meier survival curves, Nelson–Aalen cumulative hazard rates, Log-rank tests) into practical maintenance decision-making guidelines. Based on these guidelines, when critical thresholds are reached, monitoring can be intensified and part or all of the infrastructure in the ‘district’ under consideration can be replaced. Such replacement should be scheduled within appropriate maintenance planning tools and supported by appropriate economic and financial considerations.
Important considerations for table interpretation include referencing detailed results from the corresponding analysis for comprehensive curve interpretations and group differences, as well as adapting the recommendations to local conditions since they are based primarily on the Hong Kong case study. The “warning thresholds” and “risk characteristics” should be adjusted according to specific pipeline characteristics, climate conditions, and available resources in different regions.
Discussion
The present study examines the impact of individual factors on failure patterns by employing a framework that integrates group comparison tests with uncertainty quantification. Log-rank tests serve as the cornerstone of this framework, rigorously assessing differences between groups and revealing highly significant variations (p < 0.05) across factor categories. These differences confirm that the grouping criteria successfully capture meaningful distinctions in failure behavior, as demonstrated by the unique survival patterns observed in various traffic loading categories and pipe materials. Confidence intervals derived from Kaplan–Meier survival curves and Nelson–Aalen cumulative hazard functions further validate the findings by quantifying the uncertainty in the estimates. These intervals not only demonstrate the stability of the estimates under sample variation but also underscore the reliability of the predictions, offering practitioners clear insights into the precision of failure forecasts. Although the single-factor framework does not permit direct comparisons of relative importance among different factors, the observed variations offer valuable guidance for maintenance planning.
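The text does not state which interval construction was used for the Kaplan–Meier curves. As one common possibility, the sketch below attaches a plain (linear) 95% interval based on Greenwood's variance formula. The function name, the toy records, and the interval type are assumptions, not the study's method.

```python
import math
from collections import Counter

def km_with_greenwood(durations, events, z=1.96):
    """Return [(age, survival, lower, upper)] using Greenwood's variance formula."""
    failures = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)
    at_risk = len(durations)
    survival, var_sum = 1.0, 0.0
    rows = []
    for t in sorted(exits):
        d = failures.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            if at_risk > d:                       # term is undefined once everyone has failed
                var_sum += d / (at_risk * (at_risk - d))
            std_err = survival * math.sqrt(var_sum)
            lower = max(0.0, survival - z * std_err)   # clip to the valid range [0, 1]
            upper = min(1.0, survival + z * std_err)
            rows.append((t, survival, lower, upper))
        at_risk -= exits[t]
    return rows

# Hypothetical toy group of 30 pipelines (failures at 35 and 42, one censored at 40).
toy_durations = [35, 40, 42] + [60] * 27
toy_events = [1, 0, 1] + [0] * 27
toy_rows = km_with_greenwood(toy_durations, toy_events)
```

With only 30 pipelines the interval around the first estimate is several percentage points wide, which illustrates why interval width matters when a threshold as tight as 95% is used to set T0.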
Building upon these statistical validations, the study conducted comprehensive analyses across all groups, with primary focus placed on those showing significant differences (log-rank p < 0.05) for detailed maintenance strategy development. For groups exhibiting non-significant differences (log-rank p > 0.05), although basic analyses were performed, opportunities exist for enhancing their management strategies. First, the current classification system could be refined by merging categories with similar statistical characteristics, potentially leading to a more efficient management structure. Second, the resource allocation could be optimized based on this refined classification, allowing for more focused distribution of maintenance resources to high-risk areas while maintaining appropriate monitoring levels for merged categories. Third, the monitoring strategies could be streamlined while ensuring system safety through simplified yet effective inspection procedures. This approach provides a scientific foundation for future management refinements while maintaining practical feasibility.
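The category-merging idea can be sketched as grouping categories whose pairwise log-rank comparison is non-significant. The example uses the district comparison, where only Hong Kong Island and Islands were not significantly different (p = 0.82). The value 0.009 is a placeholder standing in for the reported "p < 0.01" and is not a reported figure; the function name is hypothetical. Non-significance is not transitive, so any merge suggested this way would still need expert review.

```python
def merge_similar(categories, pairwise_p, alpha=0.05):
    """Group categories linked by non-significant pairwise tests (p > alpha)."""
    parent = {c: c for c in categories}

    def find(c):
        # Follow parent links to the group representative, compressing the path.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (a, b), p in pairwise_p.items():
        if p > alpha:
            parent[find(a)] = find(b)

    groups = {}
    for c in categories:
        groups.setdefault(find(c), []).append(c)
    return sorted(sorted(g) for g in groups.values())

districts = ["New Territories", "Kowloon", "Hong Kong Island", "Islands"]
PLACEHOLDER = 0.009  # stands in for the reported "p < 0.01"; not an actual value
district_p = {
    ("New Territories", "Kowloon"): PLACEHOLDER,
    ("New Territories", "Hong Kong Island"): PLACEHOLDER,
    ("New Territories", "Islands"): PLACEHOLDER,
    ("Kowloon", "Hong Kong Island"): PLACEHOLDER,
    ("Kowloon", "Islands"): PLACEHOLDER,
    ("Hong Kong Island", "Islands"): 0.82,   # reported in the district analysis
}
merged = merge_similar(districts, district_p)
```

Here the four districts collapse to three management groups, with Hong Kong Island and Islands sharing one.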
The framework’s effectiveness was validated using Hong Kong’s extensive sewer pipe data. The framework is also suitable for cities with limited data type availability. Its core statistical methods (the nonparametric Kaplan–Meier method and Nelson–Aalen estimator) exhibit an inherent advantage when working with limited types of data. By focusing on single-factor effects instead of complex multi-factor interactions, the framework only requires key data types for decision-making rather than comprehensive data categories. This characteristic makes it valuable for drainage authorities in rural districts or areas far from densely populated centers, where data collection capabilities may be limited. In these regions, authorities only need to collect single-factor data types essential for decision-making, significantly reducing the data collection burden.
The framework also demonstrates strong transferability, as its methodological structure can be adapted by substituting local data into the established analytical framework. This transferability stems from its fundamental statistical design and adaptive mechanisms. The framework employs widely established statistical methods that do not depend on region-specific assumptions. Risk boundary calculations employ relative metrics instead of absolute values, so the risk classification thresholds calibrate automatically to local conditions without extensive modifications. Moreover, the integration of GIS capabilities provides spatial analysis functionality independent of geographic context.
While demonstrating these advantages, the framework exhibits certain limitations in infrastructure risk assessment. Its factor-specific analytical approach enables systematic evaluation of individual variables. However, this method fails to capture crucial interactions among multiple factors, such as material properties, environmental conditions, and usage patterns that collectively influence failure mechanisms. While the simplified methodology reduces data requirements and computational complexity, it sacrifices accuracy in capturing real-world interaction effects. Additionally, because the framework primarily relies on time-dependent deterioration patterns and historical data, it struggles to predict sudden infrastructure failures triggered by exceptional events such as natural disasters or severe weather conditions. These limitations become particularly pronounced in dynamic operating environments: where infrastructure usage patterns evolve rapidly or where new technologies and materials alter traditional deterioration trends, predictive outcomes frequently lag behind actual risk conditions. Although the framework incorporates a self-updating mechanism, updates inevitably lag behind real-time system changes and emerging risk developments. To address these limitations, future research should focus on enhancing the framework’s real-time monitoring and predictive capabilities. Integration with SCADA (Supervisory Control and Data Acquisition) systems could enable continuous monitoring and dynamic maintenance planning41,42, while incorporation of Building Information Modeling (BIM) tools could improve data visualization and decision support capabilities43. Such technological integration could help bridge the gap between predicted and actual risk conditions. Given these significant limitations, the framework should not be used as a standalone risk assessment tool.
Instead, it should be integrated with other evaluation methods that can compensate for its weaknesses in capturing factor interactions, sudden failures, and emerging risks. Future research may extend this methodology by incorporating multi-factor models, such as Cox proportional hazards regression or other multivariate techniques, to provide further insights into the relative influence of these factors. In summary, while the framework effectively captures the relationship between variations in individual factors and failure patterns, its application should be complemented with other assessment methods to provide a comprehensive risk evaluation.
Methods
This section presents the overarching research methodology employed in this study, including the survival analysis techniques (Kaplan–Meier, Nelson–Aalen) and the dynamic failure analysis framework for sewer pipelines. The details regarding data acquisition, preprocessing, and database management are subsequently introduced in the “Data sources and requirements”, “Data cleaning and interoperability”, “Data integration and management”, and “Database construction” sections.
Survival analysis
Survival analysis is a specialized statistical approach used to study the time duration from a defined starting point to the occurrence of a specific event44,45. In medical research, such events commonly include death, disease recurrence, or other clinical endpoints46,47. A notable feature of survival analysis is its capacity to manage censored data, which represents instances where the event of interest has not occurred by the study’s conclusion48. This method is particularly well-suited for analyzing failure time data in infrastructure systems49,50, as it effectively accounts for censored data, where assets remain functional at the observation period’s end. Given that many pipelines in sewer networks remain operational during the study period, survival analysis methods can comprehensively utilize data from these non-failed pipes to assess service life. This section will introduce two fundamental functions in survival analysis: the survival function \(S(t)\) and the cumulative hazard function \(H\left(t\right)\), along with their estimation methods.
Survival function
The survival function \(S(t)\) is a fundamental function in survival analysis, defined as the probability that a subject survives beyond time \(t\):
\[S(t)=P(T>t)\]
Where \(T\) is a random variable representing survival time. The survival function \(S(t)\) is a monotonically non-increasing, right-continuous function with two important properties: \(S(0)=1\), indicating that all subjects are alive at the start of the study; and \(S(t)\) approaches 0 as \(t\) approaches infinity, reflecting that all subjects will eventually experience the target event.
The Kaplan–Meier method provides a nonparametric estimation approach for the survival function, with its estimator defined as:
\[\hat{S}(t)=\prod_{{t}_{(i)}\le t}\left(1-\frac{{d}_{i}}{{n}_{i}}\right)\]
Where \({t}_{(i)}\) represents the \(i\)-th observed failure time, \({d}_{i}\) is the number of failures at time \({t}_{(i)}\), and \({n}_{i}\) is the number of individuals in the risk set at time \({t}_{(i)}\). This method requires no prior assumptions about the distribution of survival times and can effectively handle censored data.
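As an illustration of the product-limit calculation, the following is a minimal sketch in plain Python, not the authors' implementation. The function name `kaplan_meier` and the sample ages are hypothetical; pipes censored at a given age are assumed to remain in the risk set at that age, which is the usual convention.

```python
from collections import Counter

def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct failure time.

    durations: age at failure, or age at last observation if censored
    observed:  True if the pipe failed, False if it was censored
    """
    failures = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)      # every pipe leaves the risk set at its own age
    at_risk = len(durations)        # n_i: pipes still under observation
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)      # d_i: failures at this age
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # drop failed and censored pipes after time t
    return curve

# Hypothetical ages in years; False marks pipes still in service at last inspection.
ages = [5, 8, 8, 12, 15, 20]
failed = [True, True, False, True, False, True]
km_curve = kaplan_meier(ages, failed)
```

The censored pipes at ages 8 and 15 never trigger a drop in the curve, but they still reduce the risk set for later failure times, which is how the estimator uses information from non-failed pipes.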
The log-rank test is a nonparametric method used to compare survival curves between two or more independent groups. Its test statistic is based on the difference between observed and expected values at each failure time point:
\[{\chi }^{2}=\sum_{i=1}^{k}\frac{{\left({O}_{i}-{E}_{i}\right)}^{2}}{{V}_{i}}\]
Where \({O}_{i}\) represents the observed number of failures for group \(i\) at each time point, \({E}_{i}\) is the expected number of failures under the null hypothesis, and \({V}_{i}\) is the corresponding variance. Under the null hypothesis, this statistic approximately follows a \({\chi }^{2}\) distribution with \(k-1\) degrees of freedom, where \(k\) is the number of groups being compared. The corresponding P-value is obtained from the calculated \({\chi }^{2}\) value, and P < 0.05 indicates a statistically significant difference in survival curves between groups.
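For the two-group case, the test can be sketched as below. This is an illustrative implementation of the standard two-sample log-rank statistic with the hypergeometric variance, which may differ in detail from the software used in the study. The function name and arguments are hypothetical, and the P-value uses the closed form for a \({\chi }^{2}\) distribution with one degree of freedom.

```python
import math

def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi2, p_value) with 1 degree of freedom."""
    event_times = sorted(
        {t for t, e in zip(times_a + times_b, events_a + events_b) if e}
    )
    obs_a = exp_a = var_a = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)   # group A pipes at risk at t
        n_b = sum(1 for x in times_b if x >= t)   # group B pipes at risk at t
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        obs_a += d_a                              # observed failures in group A
        exp_a += d * n_a / n                      # expected under the null hypothesis
        if n > 1:
            var_a += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if var_a == 0:
        return 0.0, 1.0                           # no information to separate groups
    chi2 = (obs_a - exp_a) ** 2 / var_a
    p_value = math.erfc(math.sqrt(chi2 / 2.0))    # chi-square survival function, 1 df
    return chi2, p_value

# Hypothetical failure ages for two pipe groups (True = failed, False = censored).
chi2, p = logrank_two_groups([2, 4, 6], [True, True, True],
                             [5, 7, 9], [True, True, False])
```

With only three pipes per group the test has little power, so the example is meant to show the mechanics rather than a realistic result.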
Cumulative hazard function
The cumulative hazard function \(H(t)\) is a fundamental function in survival analysis, defined as the total accumulated risk of experiencing the target event from the start of the study up to time \(t\):
\[H(t)={\int }_{0}^{t}h(u)\,du\]
Where \(h\left(u\right)\) is the hazard function at time \(u\), representing the instantaneous risk of the event occurring at that specific time.
The cumulative hazard function is commonly estimated using the Nelson–Aalen method, which has gained widespread acceptance in survival analysis. The Nelson–Aalen estimator provides a robust nonparametric estimation method for the cumulative hazard function:
\[\hat{H}(t)=\sum_{{t}_{(i)}\le t}\frac{{d}_{i}}{{n}_{i}}\]
Where \({d}_{i}\) is the number of failures at time \({t}_{(i)}\), and \({n}_{i}\) is the number of individuals in the risk set.
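The running sum of \({d}_{i}/{n}_{i}\) can be sketched in a few lines of plain Python. The function name `nelson_aalen` and the sample data are hypothetical and serve only to illustrate the estimator.

```python
from collections import Counter

def nelson_aalen(durations, observed):
    """Return [(t, H(t))] at each distinct failure time."""
    failures = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)
    at_risk = len(durations)           # n_i: size of the risk set
    cum_hazard = 0.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)         # d_i: failures at this age
        if d:
            cum_hazard += d / at_risk  # add the hazard increment d_i / n_i
            curve.append((t, cum_hazard))
        at_risk -= exits[t]            # remove failed and censored pipes
    return curve

# Hypothetical ages in years; False marks censored pipes.
ages = [5, 8, 8, 12, 15, 20]
failed = [True, True, False, True, False, True]
na_curve = nelson_aalen(ages, failed)
```

Unlike the survival function, the cumulative hazard is not bounded by 1, which is why it can keep growing after most pipes in a group have failed.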
Development of the dynamic failure analysis and maintenance strategy framework
To address the uncertainty and evolving nature of pipeline conditions, this study established an integrated data analysis framework (Fig. 2) that adopts a dynamic, multi-dimensional approach to continuously update risk metrics through the latest data integration and systematic feedback loops. The framework comprises three primary stages: data integration, core statistical function generation, and time-based failure analysis. In the data integration stage, historical data are collected and consolidated using the ArcGIS platform, yielding a unified and standardized dataset for subsequent univariate analyses.
Fig. 2: The workflow consists of three main components and a feedback loop. The data integration process begins with historical data processing through the ArcGIS platform for feature-based data extraction. This feeds into two parallel analytical approaches: Kaplan–Meier function generation (including survival analysis, log-rank tests, and critical time threshold identification) and Nelson–Aalen function generation (comprising estimation and cumulative hazard rate function calculation). These analyses converge into a comprehensive time-based failure analysis and strategy component, which includes six sequential steps: time-threshold-based segmentation, post-threshold survival analysis, temporal risk boundary calculation, risk categorization and spatial visualization, feature-based failure time determination, and key indicators and maintenance strategies summary. The process concludes with new installation and failure data collection, which updates the GIS database, creating a continuous improvement cycle through data feedback.
In the core statistical function generation stage, both the Kaplan–Meier and Nelson–Aalen methods are employed. The Kaplan–Meier pathway (depicted on the left side of Fig. 2) conducts survival analysis by constructing survival curves, examining inter-group differences under various conditions via the log-rank test, and identifying a key temporal threshold (T0) at a 95% survival probability level. Concurrently, the Nelson–Aalen estimation method (shown on the right side of Fig. 2) quantifies the cumulative hazard function, providing a quantitative basis for defining the system failure boundary time.
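One way to read T0 off a Kaplan–Meier curve is sketched below. It assumes that T0 is the earliest failure time at which the estimated survival probability falls to or below 95%; the text does not spell out the exact rule, so this reading, the function name, and the sample curve are all assumptions.

```python
def critical_threshold(curve, level=0.95):
    """Return the first time at which S(t) <= level, or None if never reached.

    curve: list of (t, S(t)) pairs sorted by time, e.g. a Kaplan-Meier estimate
    """
    for t, s in curve:
        if s <= level:
            return t
    return None  # survival stays above the level over the observed range

# Hypothetical Kaplan-Meier curve: (age in years, estimated survival probability).
example_curve = [(10, 0.99), (18, 0.97), (25, 0.94), (30, 0.90)]
t0 = critical_threshold(example_curve)
```

Returning `None` rather than the last observed age keeps a group whose survival never drops to 95% from being assigned a misleading threshold.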
Building on these statistical underpinnings, a time-based failure analysis and strategy framework was developed. First, the data are segmented according to the predefined temporal threshold (T0). Next, a post-threshold survival analysis is conducted and summarized to evaluate pipeline degradation trends after T0. Key risk boundaries are then established through tertile analysis of the cumulative hazard function beyond T0: T0 marks the onset of the risk period, while T1 (the lower tertile) and T2 (the upper tertile) delineate risk boundaries within the cumulative hazard data. In the risk classification and spatial visualization stage, pipelines are categorized into four risk levels based on these boundaries: safe (below T0, green), low risk (between T0 and T1, yellow), moderate risk (between T1 and T2, orange), and high risk (above T2, red). A GIS map then offers an intuitive visual representation. This classification supports the determination of the failure boundary time by defining three critical moments. The conservative failure boundary time T0 is identified when failure features first appear, making it suitable for conservative management strategies with low risk tolerance. The balanced failure boundary time T1 is determined when failure features become significant without showing deterioration, offering a balance between risk and benefit. The aggressive failure boundary time T2 is recognized when failure features approach a state of functional failure, guiding management strategies that accept higher risks. Overall, this framework provides a concise summary of key performance indicators and maintenance strategies, allowing decision-makers to select the appropriate failure boundary time based on their risk tolerance.
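The tertile step and the four-level classification can be sketched as follows. This is one possible interpretation, not the authors' exact procedure: it assumes T1 and T2 are the earliest times at which the post-T0 cumulative hazard reaches the lower and upper tertile cut points of its own values, and it treats each risk band as closed on the left. The function names and the sample hazard curve are hypothetical.

```python
import statistics

def risk_boundaries(hazard_curve, t0):
    """Return (T1, T2) from the cumulative hazard values observed after T0."""
    post = [(t, h) for t, h in hazard_curve if t > t0]
    # Two cut points splitting the post-T0 hazard values into three equal parts
    # (Python's default "exclusive" quantile method).
    q1, q2 = statistics.quantiles([h for _, h in post], n=3)
    t1 = next(t for t, h in post if h >= q1)
    t2 = next(t for t, h in post if h >= q2)
    return t1, t2

def risk_level(age, t0, t1, t2):
    """Map a pipe age to one of the four risk levels."""
    if age < t0:
        return "safe"       # green
    if age < t1:
        return "low"        # yellow
    if age < t2:
        return "moderate"   # orange
    return "high"           # red

# Hypothetical Nelson-Aalen curve: (age in years, cumulative hazard).
hazard_curve = [(20, 0.05), (25, 0.10), (30, 0.20), (35, 0.30),
                (40, 0.45), (45, 0.60), (50, 0.80)]
t1, t2 = risk_boundaries(hazard_curve, t0=20)
levels = [risk_level(age, 20, t1, t2) for age in (10, 28, 40, 60)]
```

Because the cumulative hazard never decreases, tertiles of its values map onto two ordered time points, so T0 ≤ T1 ≤ T2 holds by construction.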
Finally, the framework incorporates a self-updating mechanism for continuous model evolution. As new pipeline installation and failure data are added to the database, the model automatically updates its parameters. The survival curves and risk thresholds are recalculated dynamically, and comparative analyses are refreshed with the latest failure data. This adaptive process transforms the framework from a static tool into a dynamic system that effectively captures the evolving failure patterns as the dataset grows.
Data sources and requirements
Single-factor analysis focuses on how individual factors independently affect pipeline failure. This approach requires two essential data elements: (1) failure age and (2) the specific factor under analysis. Using these basic records, Kaplan–Meier and Nelson–Aalen estimations can establish reliable time-to-failure relationships.
The primary dataset was sourced from the Drainage Services Department (DSD) of Hong Kong. It covers a broad range of pipeline attributes in GIS format, including district, installation date, length, material, age, diameter, spatial layout, and connection relationships, along with detailed CCTV inspection reports from 2007 to 2021.
Numerous studies have highlighted the critical influence of external environmental factors on pipeline deterioration. In response to these findings, the present study systematically expanded the dataset to incorporate a range of environmental variables from various Hong Kong government departments. Annual Average Daily Traffic (AADT) data were acquired from the Transport Department to capture the load conditions above or adjacent to pipeline segments. Climatic measurements, including daily temperature extremes (maximum, minimum, and average), humidity levels, and rainfall records, were retrieved from the historical database of the Hong Kong Observatory. Land use information, sourced from the Planning Department and Lands Department, was derived from statutory plans and land utilization databases. Soil condition data, including geological and ground investigation records, were collected from the Geotechnical Engineering Office of the Civil Engineering and Development Department. These governmental bodies maintain regular data collection and quality control procedures to ensure the reliability and consistency of their respective datasets.
Data cleaning and interoperability
During the data cleaning process, the first major challenge was the inconsistency in data format, which made the collected data unsuitable for direct analysis. CCTV inspection records were provided in PDF format, containing inspection dates and condition details, whereas pipeline information was stored in GIS format, which included attributes such as district, installation date, length, material, age, diameter, and spatial layout. Environmental data from various government departments were presented in diverse digital formats. To resolve these inconsistencies, a Python script was developed to standardize the diverse data formats, with manual verification performed to ensure accuracy. The standardization process involved converting all data into a uniform tabular structure, standardizing date formats and measurement units, and harmonizing naming conventions for pipeline attributes.
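The date standardization step might look like the sketch below. The source does not list the formats that were actually encountered, so the `DATE_FORMATS` tuple and the function name are hypothetical; unparseable values are returned as `None` so they can be routed to manual verification.

```python
from datetime import datetime

# Hypothetical input formats; the real records may use others.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y/%m/%d")

def normalize_date(text):
    """Return the date as an ISO 8601 string, or None if no format matches."""
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None       # leave for manual verification

normalized = [normalize_date(s) for s in ("2015-03-07", " 07/03/2015 ", "n/a")]
```

The order of `DATE_FORMATS` matters when a string is ambiguous, so day-first and month-first conventions should not both appear in the same list.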
The second challenge in data cleaning involved the management of maintenance records. The inspection records included information on pipes with maintenance activities mixed with those without. To tackle this issue, maintenance records were identified through keyword filtering, such as “renew,” “replacement,” and “rehabilitation.” For pipes with maintenance history, their records were meticulously processed to generate two separate entries: one covering the period from installation to the maintenance date, and the other from the maintenance date to the latest inspection. This method ensured that the age and condition parameters accurately represented the pipeline’s true state at specific time points.
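The keyword filter and the record split described above can be sketched as follows. The field names, the function names, and the age calculation (days divided by 365.25) are assumptions made for illustration; only the three keywords come from the text.

```python
from datetime import date

KEYWORDS = ("renew", "replacement", "rehabilitation")

def is_maintenance(remark):
    """Flag an inspection remark that mentions a maintenance activity."""
    text = remark.lower()
    return any(keyword in text for keyword in KEYWORDS)

def split_at_maintenance(installed, maintained, last_inspected):
    """Return two entries: installation to maintenance, maintenance to last inspection."""
    def years(start, end):
        return round((end - start).days / 365.25, 2)

    return [
        {"start": installed, "end": maintained,
         "age_years": years(installed, maintained)},
        {"start": maintained, "end": last_inspected,
         "age_years": years(maintained, last_inspected)},
    ]

# Hypothetical pipe: installed 1990, rehabilitated 2010, last inspected 2020.
entries = split_at_maintenance(date(1990, 1, 1), date(2010, 1, 1), date(2020, 1, 1))
```

Resetting the clock at the maintenance date means the second entry describes the renewed pipe, so its age is not inflated by the service life of the original one.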
Data integration and management
A robust data integration and management process was established on the ArcGIS platform to aggregate the cleaned datasets from multiple sources into a centralized repository. This repository not only consolidates historical records but also accommodates new pipeline installation and failure information in real time.
Specifically, each maintenance or inspection entry is spatially linked with the corresponding pipeline segment via unique pipeline identifiers. ArcGIS spatial analysis tools enable automatic matching of pipeline locations, ensuring efficient extraction and organization of data. Environmental inputs, such as traffic loads, climate indicators, and land use classifications, are likewise joined to pipeline layers based on geographic coordinates.
To keep pace with ongoing sewer network changes, the ArcGIS database is regularly updated with newly acquired data. This continuous update mechanism ensures that the latest operational information—for example, recent maintenance activities or sudden failures—immediately feeds back into the maintenance decision framework. As such, the dynamic repository streamlines further analysis, allowing managers to monitor risk indicators and intervene promptly.
Database construction
Following the data integration process, a comprehensive database was constructed to facilitate in-depth failure time analysis. The final database is organized into three main categories—physical parameters, functional factors, and environmental factors—and each category encompasses multiple fields that collectively support subsequent modeling efforts. Specifically, physical parameters comprise attributes such as pipeline length, material (e.g., concrete, vitrified clay), diameter, and age, which directly link to the structural aspects influencing pipeline deterioration. Meanwhile, functional factors label each pipeline segment as stormwater or foul sewer to enable targeted analyses based on service function, further including operational aspects like flow rates and ownership, where available. In addition, environmental factors incorporate external variables such as temperature, humidity, rainfall, land use categories (industrial, commercial, residential, green/open spaces), district classification (Hong Kong Island, Kowloon, New Territories, Islands), traffic load (AADT), and soil characteristics. Twenty-year averages of traffic and climate data are also included to represent stable background conditions. Each data record in the database is timestamped for time-series analysis, ensuring traceability of pipeline conditions over multiple inspection intervals. Figure 3 illustrates the major steps of this construction process, from data cleaning to geospatial alignment and final attribute segmentation. By systematically consolidating all relevant parameters, the ArcGIS-based database not only provides a robust foundation for survival analysis and hazard estimation but also lays the groundwork for future expansions—new fields or data sources can be readily added as additional factors are identified.
Fig. 3: The framework illustrates three main stages of data processing: data collection, data cleaning, and database construction. In the data collection phase, two primary data sources are integrated: Internal Pipeline Condition Records (including pipeline physical information and CCTV inspection reports) and External Environment Data (comprising annual average daily traffic, climatic data, land use information, and soil condition data). The data cleaning stage consists of Format Standardization (uniform tabular structure, standardized date formats.
Data availability
All data supporting the conclusions of this study are available on request from the corresponding author.
References
Abdelkhalek, S. & Zayed, T. A multi-tier deterioration assessment models for sewer and stormwater pipelines in Hong Kong. J. Environ. Manag. 345, 118913 (2023).
Benbachir, M., Cherrared, M. & Chenaf, D. Managing sewerage networks using both failure modes, effects and criticality analysis (FMECA) and analytic hierarchy process (AHP) methods. Can. J. Civ. Eng. 48, 1683–1693 (2021).
Steele, W. & Legacy, C. Critical urban infrastructure. Urban Policy Res. 35, 1–6 (2017).
Lehman, M. The American Society of Civil Engineers’ report card on America’s infrastructure. In Women in Infrastructure 5–21 (Springer, 2022).
Angkasuwansiri, T. & Sinha, S. Development of a robust wastewater pipe performance index. J. Perform. Constr. Facil. 29, 04014042 (2015).
Wang, N. et al. Automatic damage segmentation framework for buried sewer pipes based on machine vision: case study of sewer pipes in Zhengzhou, China. J. Infrastruct. Syst. 29, 04022046 (2023).
Zamanian, S. & Shafieezadeh, A. Age-dependent failure probabilities of corroding concrete sewer pipes under traffic loads. Structures 52, 524–535 (2023).
Fontecha, J. E. et al. A two-stage data-driven spatiotemporal analysis to predict failure risk of urban sewer systems leveraging machine learning algorithms. Risk Anal. 41, 2356–2391 (2021).
Hawari, A., Alamin, M., Alkadour, F., Elmasry, M. & Zayed, T. Automated defect detection tool for closed circuit television (cctv) inspected sewer pipelines. Autom. Constr. 89, 99–109 (2018).
Kaddoura, K. & Zayed, T. Criticality model to prioritize pipeline rehabilitation decisions. In Pipelines 2018 75–85 (American Society of Civil Engineers, Reston, VA, 2018).
Sousa, V., Matos, J. P. & Matias, N. Evaluation of artificial intelligence tool performance and uncertainty for predicting sewer structural condition. Autom. Constr. 44, 84–91 (2014).
Hawari, A., Alkadour, F., Elmasry, M. & Zayed, T. A state of the art review on condition assessment models developed for sewer pipelines. Eng. Appl. Artif. Intell. 93, 103721 (2020).
Salihu, C. et al. A deterioration model for sewer pipes using CCTV and artificial intelligence. Buildings 13, 952 (2023).
Gokhale, S. & Graham, J. A. A new development in locating leaks in sanitary sewers. Tunn. Undergr. Space Technol. 19, 85–96 (2004).
Kuliczkowska, E. An analysis of road pavement collapses and traffic safety hazards resulting from leaky sewers. Balt. J. Road. Bridge Eng. 11, 251–258 (2016).
Teplý, B., Rovnaníková, M., Řoutil, L. & Schejbal, R. Time-variant performance of concrete sewer pipes undergoing biogenic sulfuric acid degradation. J. Pipeline Syst. Eng. Pract. 9, 04018013 (2018).
Yoon, H.-S., Yang, K.-H., Lee, K.-M. & Kwon, S.-J. Service life evaluation for RC sewer structure repaired with bacteria mixed coating: through probabilistic and deterministic method. Materials 14, 5424 (2021).
Shadabfar, M., Mahsuli, M., Xue, Y., Zhang, Y. & Wu, C. Time-variant system reliability analysis of concrete sewer pipes under corrosion considering multiple failure modes. ASCE ASME J. Risk Uncertain. Eng. Syst. Part A Civ. Eng. 9, 04023002 (2023).
Davis, P., Burn, S., Moglia, M. & Gould, S. A physical probabilistic model to predict failure rates in buried PVC pipelines. Reliab. Eng. Syst. Saf. 92, 1258–1266 (2007).
Frank, A., Freimann, W., Pinter, G. & Lang, R. W. A fracture mechanics concept for the accelerated characterization of creep crack growth in PE-HD pipe grades. Eng. Fract. Mech. 76, 2780–2787 (2009).
Zamanian, S., Hur, J. & Shafieezadeh, A. Significant variables for leakage and collapse of buried concrete sewer pipes: A global sensitivity analysis via Bayesian additive regression trees and Sobol’ indices. Struct. Infrastruct. Eng. 17, 676–688 (2021).
Ebrahimi, M., Jalali, H. H. & Sabatino, S. Probabilistic condition assessment of reinforced concrete sanitary sewer pipelines using LiDAR inspection data. Autom. Constr. 150, 104857 (2023).
Jiang, G., Keller, J., Bond, P. L. & Yuan, Z. Predicting concrete corrosion of sewers using artificial neural network. Water Res. 92, 52–60 (2016).
Li, X. et al. Evaluation of data-driven models for predicting the service life of concrete sewer pipes subjected to corrosion. J. Environ. Manag. 234, 431–439 (2019).
Altarabsheh, A., Ventresca, M., Kandil, A. & Abraham, D. Markov chain modulated Poisson process to stimulate the number of blockages in sewer networks. Can. J. Civ. Eng. 46, 1174–1186 (2019).
Ghavami, S. M., Borzooei, Z. & Maleki, J. An effective approach for assessing risk of failure in urban sewer pipelines using a combination of GIS and AHP-DEA. Process Saf. Environ. Prot. 133, 275–285 (2020).
Khan, Z., Zayed, T. & Moselhi, O. Structural condition assessment of sewer pipelines. J. Perform. Constr. Facil. 24, 170–179 (2010).
Mashford, J., Marlow, D., Tran, D. & May, R. Prediction of sewer condition grade using support vector machines. J. Comput. Civ. Eng. 25, 283–290 (2011).
Santos, P., Amado, C., Coelho, S. T. & Leitão, J. P. Stochastic data mining tools for pipe blockage failure prediction. Urban Water J. 14, 343–353 (2017).
Hahn, M. A., Palmer, R. N., Merrill, M. S. & Lukas, A. B. Expert system for prioritizing the inspection of sewers: knowledge base formulation and evaluation. J. Water Resour. Plan. Manag. 128, 121–129 (2002).
Ortolano, L., Le Coeur, G. & MacGilchrist, R. Expert system for sewer network maintenance: Validation issues. J. Comput. Civ. Eng. 4, 37–54 (1990).
Hawari, A., Alkadour, F., Elmasry, M. & Zayed, T. Simulation-based condition assessment model for sewer pipelines. J. Perform. Constr. Facil. 31, 04016066 (2017).
Ruwanpura, J., Ariaratnam, S. T. & El-Assaly, A. Prediction models for sewer infrastructure utilizing rule-based simulation. Civ. Eng. Environ. Syst. 21, 169–185 (2004).
Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Am. Stat. Assoc. 53, 457–481 (1958).
Nelson, W. Theory and applications of hazard plotting for censored failure data. Technometrics 14, 945–966 (1972).
Aalen, O. Nonparametric inference for a family of counting processes. Ann. Stat. 6, 701–726 (1978).
Stevens, N.-A., Lydon, M., Marshall, A. H. & Taylor, S. Identification of bridge key performance indicators using survival analysis for future network-wide structural health monitoring. Sensors 20, 6894 (2020).
Beng, S. S. & Matsumoto, T. Survival analysis on bridges for modeling bridge replacement and evaluating bridge performance. Struct. Infrastruct. Eng. 8, 251–268 (2012).
Salihu, C., Hussein, M., Mohandes, S. R. & Zayed, T. Towards a comprehensive review of the deterioration factors and modeling for sewer pipelines: a hybrid of bibliometric, scientometric, and meta-analysis approach. J. Clean. Prod. 351, 131460 (2022).
Elmasry, M., Hawari, A. & Zayed, T. Defect based deterioration model for sewer pipelines using Bayesian belief networks. Can. J. Civ. Eng. 44, 675–690 (2017).
Pugliese, F., De Paola, F., Fontana, N., Giugni, M. & Marini, G. Experimental characterization of two pumps as turbines for hydropower generation. Renew. Energy 99, 180–187 (2016).
Fernández, J., Barrio, R., Blanco, E., Parrondo, J. & Marcos, A. Experimental and Numerical Investigation of a Centrifugal Pump Working as a Turbine. In Proceedings of the ASME 2009 Fluids Engineering Division Summer Meeting. Volume 1, pp. 471–479 (Vail, Colorado, USA, 2009). https://doi.org/10.1115/FEDSM2009-78524
De Paola, F., Speranza, G., Ascione, G. & Marrone, N. New Digital Tool for Optimal Design of Water Distribution Network: Integration of Dynamo-Epanet-Harmony Search Algorithm (Dyehs). Available at SSRN: https://ssrn.com/abstract=5013185 or https://doi.org/10.2139/ssrn.5013185
Hosmer Jr, D. W., Lemeshow, S. & May, S. Applied Survival Analysis: Regression Modeling of Time-to-Event Data, Vol. 618 (John Wiley & Sons, 2008).
Lee, E. T. & Wang, J. Statistical Methods for Survival Data Analysis, Vol. 476 (John Wiley & Sons, 2003).
Collett, D. Modelling Survival Data in Medical Research (Chapman and Hall/CRC, 2023).
Singh, R. & Mukhopadhyay, K. Survival analysis in clinical trials: basics and must know areas. Perspect. Clin. Res. 2, 145–148 (2011).
Klein, J. P. & Moeschberger, M. L. Survival Analysis: Techniques for Censored and Truncated Data (Springer Science & Business Media, 2006).
Syachrani, S., Jeong, H. & Chung, C. Advanced criticality assessment method for sewer pipeline assets. Water Sci. Technol. 67, 1302–1309 (2013).
Xie, Q., Bharat, C., Nazim Khan, R., Best, A. & Hodkiewicz, M. Cox proportional hazards modelling of blockage risk in vitrified clay wastewater pipes. Urban Water J. 14, 669–675 (2017).
Acknowledgements
This work was supported by the Research Grants Council (Hong Kong) - General Research Fund under grant number 15209022. The authors would also like to thank the Hong Kong Drainage Services Department (DSD) for the data support and the Hong Kong Utility Training Institute (UTI) for providing the Hong Kong Conduit Condition Evaluation Codes technical reference materials and for their valuable industry communication and guidance.
Author information
Contributions
J.C. Yang and X.Y. Liu contributed to conceptualization, methodology, software, investigation, and wrote the original draft. T.Z. provided resources, investigation, supervision, funding acquisition, and review & editing of the manuscript. D.A., M.N., and A.I. participated in the investigation and original draft preparation. D.A. was responsible for revising the manuscript in response to reviewers' comments. All authors reviewed and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Yang, J., Arimiyaw, D., Zayed, T. et al. Survival analysis framework for sewer failure time: evidence from Hong Kong. npj Clean Water 8, 91 (2025). https://doi.org/10.1038/s41545-025-00479-x