Abstract
As the global initiative for carbon neutrality in the construction sector accelerates, the low-carbon retrofitting of existing buildings is emerging as a critical pathway to combat climate change. This paper proposes a systematic framework that integrates explainable machine learning with multi-objective optimization to support the fine-grained optimization of carbon emissions in renovation projects. The framework is centered on three core objectives: Material Carbon Emission Intensity (MCEI), Operational Carbon Emission Intensity (OCEI), and Seasonal Carbon Emission Balance (SCEB). Leveraging high-resolution carbon emission simulation data, predictive models were developed using six machine learning algorithms, among which CatBoost demonstrated superior performance. Subsequently, SHAP values were employed to identify key design variables influencing carbon emissions, such as FLH, WWR1, NOF, and WWR2, thereby providing an evidence-based foundation for strategic decision-making. The framework’s utility was validated through a case study of a three-story industrial building retrofit, where the NSGA-II algorithm was applied for multi-objective optimization. This process yielded four distinct sets of feasible solutions. The most balanced solution achieved a 71.06% reduction in MCEI, a 37.20% reduction in OCEI, and a 24.75% improvement in SCEB compared to the baseline scenario. The study concludes with a series of recommended low-carbon strategies, including material reuse, promotion of low-carbon materials, optimization of partition walls, enhancement of the thermal performance of the building envelope, and improvements in atrium design. Overall, this research provides a systematic, scalable, and replicable technical pathway for the low-carbon retrofitting of buildings, holding significant practical value for achieving carbon neutrality goals.
Introduction
Background
The escalating global climate crisis unequivocally necessitates a low-carbon transformation within the building sector1. The Intergovernmental Panel on Climate Change (IPCC)’s Sixth Assessment Report, released on March 20, 2023, comprehensively outlines the catastrophic consequences of sustained greenhouse gas emissions2. Among the four primary global sources of greenhouse gas emissions—buildings, industrial manufacturing, power generation, and transportation—the building sector alone accounts for approximately 40% of total carbon emissions3.
Over the past few decades, China has undergone rapid industrialization and urbanization, with its urbanization rate soaring from 7.3% in 1949 to 66.16% in 2023⁴. This extensive new construction activity has made China one of the world’s largest carbon emitters. In light of severe climate challenges, the Chinese building industry urgently needs to transition from a development model centered on new construction to a low-carbon pathway prioritizing carbon emission reduction and efficient resource utilization.
In recent years, policy directives globally have advocated for a shift in the construction industry from “incremental new construction” to “existing stock optimization.” This emphasizes redirecting focus from new builds to existing structures, particularly the green renovation of aging industrial buildings5. Many industrial cities, especially those with significantly outdated infrastructure, are grappling with the challenge of revitalizing old industrial buildings that no longer meet modern production demands6. Renovating these structures not only effectively reduces carbon emissions but also enhances resource efficiency, yielding substantial environmental, economic, and social benefits.
Among China’s diverse climate zones, the Hot-Summer and Cold-Winter (HSCW) region presents particularly complex challenges. This region encompasses 16 provinces, municipalities, and autonomous regions, covering approximately 1.8 million square kilometers, with a permanent population of around 550 million, contributing 48% of the national GDP7. Characterized by high temperatures and humidity with low wind speeds in summer, and cold and damp conditions in winter, the HSCW region experiences a year-round “hot-humid summer, cold-damp winter” climate, leading to extensive use of air conditioning systems in both summer and winter. This significantly escalates building energy consumption and carbon emissions. Furthermore, extreme weather events exacerbated by climate change intensify seasonal energy demand fluctuations and carbon emission pressures8.
Related work
Driven by the global “Dual carbon” targets and the imperative for high-quality building development, low-carbon building retrofits have become a central focus in international research. Existing studies extensively explore carbon emission optimization across various dimensions, including low-carbon technology applications, dynamic carbon emission factor management, material carbon accounting, and operational carbon reduction strategies. For instance, Liu et al.9 evaluated the emission reduction potential and cost-effectiveness of 16 typical low-carbon technologies for net-zero building retrofits. Mulya et al.10 developed a low-carbon retrofit pathway for public buildings based on the Green Building Rating System (GBRS) framework. Lou et al.11,12 proposed using dynamic carbon emission factors to replace traditional static values for improved assessment accuracy and analyzed their performance in long-term retrofit strategies. Furthermore, Hao et al.13 quantified carbon emissions from material waste during the demolition phase, while Craft et al.14 compared the carbon reduction capabilities of different bio-based materials in building reuse. For the operational phase, García-López et al.15 assessed the total carbon emissions of retrofitted buildings from a life cycle perspective, and Li et al.16 proposed a refined carbon reduction pathway for commercial building operations.
Simultaneously, data-driven multi-objective optimization methods have matured significantly in the building carbon emission field, finding widespread application in balancing various objectives such as life cycle cost, energy consumption, and carbon emissions17. Luo et al.18 and Thrampoulidis et al.19 utilized large-scale datasets focusing on office and residential buildings, respectively, while Justo Ascione et al.20 employed simulation results from EnergyPlus and MATLAB for building performance optimization. Such research typically encompasses multiple optimization targets, including prioritized life cycle cost21, energy consumption22, and carbon emissions23. Similarly, Li et al.24 considered carbon emissions, thermal comfort hours, and overall cost, while Li et al.25 optimized for carbon emissions, life cycle cost, and energy consumption. More complex studies, such as those by Mostafazadeh et al.26 and Bayer et al.27, have incorporated objectives like energy consumption, retrofit cost, operational cost, carbon emissions, water use, life cycle cost, and discomfort hours. Regarding optimization techniques, Liu et al.28 employed regression analysis, and Song et al.29 applied particle swarm optimization. Most studies30,31,32,33 have adopted the NSGA-II algorithm due to its versatility and stability in multi-objective optimization tasks.
Furthermore, machine learning (ML) demonstrates broad potential in building carbon emission prediction and strategy generation. Zhou et al.34 investigated predicting building carbon emissions, photovoltaic carbon offsets, and net carbon emissions based on energy efficiency parameters, comparing the effectiveness of LightGBM and XGBoost models. Wang et al.35 utilized a Random Forest (RF) algorithm to quantify the impact of various factors on carbon emissions and reveal regional differences. Pan et al.36 proposed an ensemble learning model combining Bayesian Optimization (BO) and XGBoost for residential building life cycle carbon emission prediction. Yuan et al.37 employed an XGBoost-SHAP model to elucidate the impact of urban green-blue landscape (UGBL) spatial patterns on carbon sequestration. Zhang et al.38 integrated the spatial interpretability of Geographically Weighted Regression (GWR) with the CatBoost algorithm to enhance the accuracy and interpretability of carbon emission spatial prediction. Qu et al.39 focused on the correlation between residential building energy consumption and building characteristics, proposing a hybrid model combining the CatBoost algorithm with various meta-heuristic algorithms to improve energy consumption prediction accuracy. Papadopoulos S et al.41 evaluated three tree-based ensemble learning algorithms (Random Forest, Extra Trees, Gradient Boosting Regressor) for predicting building heating and cooling loads, demonstrating superior performance compared to traditional methods. Giannelos S et al.42 applied multiple ML models (linear regression, ARIMA, shallow neural networks, deep neural networks) for carbon emission prediction in the building sector. Martinez-Comesana M et al.43 proposed a spatiotemporal interpolation method based on optimized Multi-Layer Perceptron (MLP) neural networks for real-time estimation of indoor environmental conditions (temperature, relative humidity, and CO2 concentration) in existing buildings.
Despite significant advancements in carbon emission assessment and optimization research, several key limitations persist: (1) Limited regional applicability: Most existing studies focus on cold or tropical regions, failing to adequately address the bimodal load characteristics of China’s HSCW region, where high temperatures and humidity in summer coexist with low temperatures and high humidity in winter. This makes it challenging to effectively tackle its unique climatic challenges. (2) Lack of seasonal dynamics: The majority of carbon emission optimization models are static or conduct only single-season analyses, lacking a comprehensive approach to address the dynamic imbalance of carbon emissions across different seasons. (3) Design translation gap: A significant disconnect exists between optimization tools and practical architectural design, creating an “analysis-strategy-design” chasm. Consequently, research findings are often difficult to translate into concrete retrofit strategies or design language. Therefore, there’s an urgent need to establish a building low-carbon retrofit optimization framework that is regionally adaptable, seasonally responsive, and strategically translatable.
Research aims
Unlike most studies that focus on tropical or frigid zones, this research centers on the low-carbon retrofit optimization of existing buildings in China’s Hot-Summer and Cold-Winter (HSCW) region. The objective is to develop a versatile carbon emission optimization framework applicable to multiple building types, thereby facilitating the transition of regional low-carbon strategies from theoretical research to practical application. Meanwhile, addressing the seasonal characteristics of the climate, we will introduce a three-dimensional optimization objective system encompassing Material Carbon Emission Intensity (MCEI), Operational Carbon Emission Intensity (OCEI), and Seasonal Carbon Emission Balance (SCEB), which is specifically designed to tackle the unique “dual-peak” energy consumption demand in the HSCW region.
To bridge the gap between algorithmic optimization and design strategies, we will construct the CatBoost-SHAP-NSGA-II framework to enhance the prediction accuracy and speed of carbon performance and optimization algorithms. By integrating interpretable machine learning analysis with multi-objective optimization, we will analyze the optimization results to extract specific retrofit strategies or design languages. Empirical analysis will be conducted on typical old industrial building retrofit projects in the HSCW region: real-world data will be utilized to validate the model’s effectiveness in addressing issues such as extreme seasonal energy consumption fluctuations and carbon emission imbalances, ultimately formulating tailored design optimization strategies for this category of buildings.
Methodology
The research workflow comprises four stages, as illustrated in Fig. 1: (1) Carbon emission modeling and target system establishment: The initial stage involves developing carbon emission simulation models for the retrofit process and establishing a three-dimensional carbon emission target system. (2) Machine learning prediction and key parameter identification: This stage uses machine learning models to predict retrofit carbon emissions, then applies the SHAP (SHapley Additive exPlanations) method to identify key design parameters and understand their influencing mechanisms. (3) Accelerated multi-objective optimization: The best-performing prediction model is used to accelerate multi-objective optimization with the NSGA-II algorithm, generating a set of Pareto optimal solutions. (4) Clustering and strategy analysis: The final stage clusters and analyzes the optimized solutions to extract building retrofit strategies tailored to the climatic characteristics of the HSCW region.
Research framework.
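To make the notion of a Pareto optimal set in stage (3) concrete, the following minimal Python sketch filters the non-dominated solutions from a list of candidate objective vectors. It illustrates only the dominance test that underlies NSGA-II, not the full algorithm (no crowding distance, selection, or variation operators). The function names and sample values are hypothetical, and treating all three objectives, including the magnitude of SCEB, as "lower is better" is an assumption made for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is no worse than `b` everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (input order is preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (MCEI, OCEI, |SCEB|) triples, all treated as "lower is better".
candidates = [
    (10.0, 30.0, 5.0),
    (12.0, 28.0, 4.0),
    (11.0, 31.0, 6.0),  # dominated by the first candidate
    (9.0, 35.0, 7.0),
]
front = pareto_front(candidates)
```

In this toy set, only the third candidate is removed, because the first candidate is better in all three objectives; the remaining three represent different trade-offs.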
Setting of design parameters and carbon calculation
Building baseline model and input parameters for low-carbon retrofit
This study utilizes an industrial building in Guanghan City, Deyang, Sichuan Province, China, as a case study (Fig. 2). The local climate is typical of the HSCW region, with summer temperatures reaching up to 35 °C and winter lows of −0.8 °C. The annual average relative humidity is approximately 85.3%44, with meteorological data sourced from the China Standard Weather Data (CSWD). The map in Fig. 2 was generated using QGIS 3.36 (https://qgis.org) with base data from OpenStreetMap (OpenStreetMap contributors, ODbL license: https://www.openstreetmap.org/copyright). The map was stylized for color and layout purposes without substantive modifications to the underlying geodata.
Figure 3 illustrates the selected industrial building, which covers an area of 3600 square meters and is surrounded by other industrial structures, with a vehicle-accessible road immediately to its east. Figure 4 presents the baseline retrofit conditions of the building, where internal partitions were added on each floor to accommodate basic office functions, while the building’s overall form, exterior wall structure, and window dimensions remained unchanged. The building measures 120 m in length, 30 m in width, and 12.6 m in height, comprising three stories.
Geographical position (This image was created by ArcGIS Pro 3.1, Annotations by the corresponding author).
The selected industrial building (Satellite imagery provided by Google Maps, Annotations by the corresponding author).
Baseline building of retrofitting (This image was created by Rhinoceros 3D, Annotations by the corresponding author).
To comprehensively characterize the retrofitted industrial building, this study established a total of 20 input parameters. These comprise 12 building form parameters: area, aspect ratio, orientation, floor height, number of floors, atrium ratio, atrium location, atrium length-width ratio, number of atria, window-to-wall ratios (WWR1 and WWR2), and room width. The remaining 8 structural parameters cover the construction types of outer walls, interior walls, floors, ground, roofs, doors, windows, and railings, with each structural parameter linked to specific materials through construction indicators. All 20 parameters serve as simulation inputs for calculating building carbon emissions under different scenarios. Those that can also serve as optimization parameters are marked in the last column of Table 1. To keep the trained prediction model valid, every optimization parameter must fall within the scope of the simulation input parameters; parameters not included in the simulation cannot be optimized. Detailed information for all parameters is presented in Table 1.
Figure 5(a) illustrates the primary building geometry defined by its area and aspect ratio. The ranges of area and aspect ratio are derived from a survey of relevant industrial buildings in the Hot-Summer and Cold-Winter (HSCW) region, with consideration limited to rectangular-plan industrial buildings of conventional scale. In contrast, the window-to-wall ratios (WWR) for the north-south and east-west orientations are denoted as WWR1 and WWR2, respectively, and their ranges comply with China’s building energy efficiency codes.
The number of courtyards specifies the courtyard spaces incorporated into the retrofitted building, while the room width refers to the width of a single office room; both parameters are determined through a balance of functional requirements and energy-efficient design. The vertical dimensions of the building are characterized by floor height and number of floors, which are inherently correlated: their product, i.e., the total building height, is consistently constrained to no more than 24 m. This constraint stems from the fact that non-single-story buildings exceeding 24 m are classified as high-rise buildings. Given that the original old industrial buildings are not high-rise structures, there is no necessity to convert them into high-rise buildings during retrofitting—additionally, the code requirements for structure and fire protection applicable to high-rise buildings cannot be satisfied by low-rise and mid-rise buildings.
Figure 5(b) shows the courtyard features, including the courtyard ratio, which is the ratio of the total courtyard area to the original building footprint. The courtyard aspect ratio represents the overall layout of the courtyards. The courtyard extreme position is defined by coordinates: [0,0] places all courtyards at the bottom-left corner of the building, while [1,1] places them at the top-right corner. In these extreme positions, a minimum interior space of 5 m is maintained to accommodate a 1.5-meter-wide corridor and a 3.5-meter-deep small office. An orientation of −10° indicates that the building faces 10° west of south. When the north parameter is set to −90°, 0°, or 90°, the building faces east, south, and west, respectively.
Building parameter schematics. (a) Building morphology is defined by area and aspect ratio. WWR1 and WWR2 represent the north-south and east-west facades, respectively. Other parameters include number of courtyards, room width, floor height, and number of floors. (b) ATR and ALR are shown. An orientation of −10° means 10° west of south; other settings face east, south, or west. Extreme courtyard positions ([0,0] for bottom-left, [1,1] for top-right) maintain a minimum 5-meter internal circulation space. (This image was created by Rhinoceros 3D, Annotations by the corresponding author).
When these parameters are designated as optimization parameters, the optimization of the overall building form operates under specific constraints, categorized into two types: additive and subtractive modifications. Additive modifications include: increasing the number of floors, limited to a maximum of one additional floor; raising the floor height, provided that the total building height does not meet the criteria for high-rise structures; or adding an extension to one side of the building, with the stipulation that the increased floor area does not exceed 20% of the original. Subtractive modifications encompass: reducing the number of floors by a maximum of one floor; lowering the floor height, while ensuring each floor remains above the minimum required height; or demolishing a section of one side of the building, with the requirement that the reduced floor area similarly does not exceed 20% of the original. Furthermore, the scenario of adding an extension to one side of the building is simplified in our study: all such cases are treated as internal room expansion, with no overlapping external walls.
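The form constraints described above can be expressed as a simple feasibility check. The sketch below is illustrative only: the `Form` class, the function name, and the sample candidates are hypothetical, and the minimum floor height is passed in as an argument because the text does not state its value (the 3.0 m used in the example is a placeholder, not a figure from this study). The 24 m height limit, the one-floor limit, and the 20% area limit are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Form:
    floors: int
    floor_height_m: float
    area_m2: float


def is_feasible(base: Form, cand: Form, min_floor_height_m: float,
                max_total_height_m: float = 24.0,
                max_area_change: float = 0.20) -> bool:
    """Check a candidate form against the additive/subtractive retrofit limits."""
    # At most one floor added or removed relative to the baseline.
    floors_ok = cand.floors >= 1 and abs(cand.floors - base.floors) <= 1
    # Each floor stays above the minimum height; the building stays below high-rise.
    height_ok = (cand.floor_height_m >= min_floor_height_m
                 and cand.floors * cand.floor_height_m <= max_total_height_m)
    # Extension or demolition changes the area by no more than 20% of the original.
    area_ok = abs(cand.area_m2 - base.area_m2) <= max_area_change * base.area_m2
    return floors_ok and height_ok and area_ok


# Baseline from the case study: 3 floors, 12.6 m total height, 3600 m2.
baseline = Form(floors=3, floor_height_m=4.2, area_m2=3600.0)
MIN_FLOOR_HEIGHT_M = 3.0  # placeholder assumption, not a value from the study

one_more_floor = is_feasible(baseline, Form(4, 4.2, 3600.0), MIN_FLOOR_HEIGHT_M)
too_tall = is_feasible(baseline, Form(4, 6.5, 3600.0), MIN_FLOOR_HEIGHT_M)
too_large = is_feasible(baseline, Form(3, 4.2, 4500.0), MIN_FLOOR_HEIGHT_M)
```

A check of this kind could be applied to each candidate before it is evaluated, so that infeasible forms never enter the optimization population.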
Output parameters for Building decarbonization retrofits
This study defines three key objectives for the output data from building renovation carbon emission simulations: MCEI, OCEI, and SCEB. Each objective represents a distinct optimization direction, enabling seamless integration of the prediction model into a multi-objective optimization process.
MCEI quantifies the carbon emissions associated with materials used in the renovated building. This typically encompasses emissions from material production, transportation, construction, use, and end-of-life phases. For this study, we specifically utilize the material production phase to represent total material carbon emissions, as it constitutes the primary contributor45. MCEI is expressed in [kgCO2e/(m2·a)]. This metric is derived by distributing the total material carbon emissions across the building’s usable area and expected service life, reflecting the average annual carbon emissions per square meter attributed to material use.
OCEI measures the carbon emissions generated from energy consumption during the operational phase of the renovated building46, also expressed in [kgCO2e/(m2·a)]. This indicator reflects the average annual carbon emissions per square meter resulting from energy use. It is calculated as the sum of carbon emission intensities from four components: cooling carbon emission intensity (CCEI), heating carbon emission intensity (HCEI), lighting carbon emission intensity (LCEI), and equipment carbon emission intensity (ECEI).
SCEB assesses the difference in carbon emissions between the winter and summer seasons for the renovated building. This metric is calculated by subtracting HCEI from CCEI, thereby revealing the seasonal fluctuations in carbon emissions.
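As a minimal illustration of how the two derived indicators follow from the four operational components, the sketch below computes OCEI as the sum of CCEI, HCEI, LCEI, and ECEI, and SCEB as CCEI minus HCEI. The function names and numeric values are hypothetical and are not results from this study.

```python
def ocei(ccei: float, hcei: float, lcei: float, ecei: float) -> float:
    """Operational intensity: cooling + heating + lighting + equipment."""
    return ccei + hcei + lcei + ecei


def sceb(ccei: float, hcei: float) -> float:
    """Seasonal balance: cooling intensity minus heating intensity."""
    return ccei - hcei


# Hypothetical component intensities in kgCO2e/(m2*a).
components = {"ccei": 12.0, "hcei": 7.5, "lcei": 9.0, "ecei": 6.5}
total_operational = ocei(**components)
seasonal_gap = sceb(components["ccei"], components["hcei"])
```

A positive SCEB indicates that cooling dominates the seasonal emissions, a negative value indicates that heating dominates, and a value near zero indicates a balanced seasonal profile.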
Assessment of carbon emissions in Building retrofit
Material carbon emission intensity
MCEI is calculated in three steps: the building model is created in Rhino, model data are extracted using the Grasshopper platform, and material carbon emissions are computed with external material carbon emission factors. The calculation formula47 is:

\[ C_{sc} = \frac{1}{a} \sum_{i} \frac{M_i F_i}{A_i} \]

Where: \(C_{sc}\) represents the building’s MCEI in [kgCO2e/(m2·a)]. \(M_i\) denotes the quantity of the i-th main material used. \(F_i\) represents the carbon emission factor of the i-th main material in (kgCO2e/material unit). \(A_i\) indicates the service life of the i-th main material in (years). \(a\) is the total area of the building in (m²).
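The following sketch shows one way to evaluate MCEI from the symbols defined above, assuming that each material's emissions (quantity times emission factor) are annualized over its own service life and the sum is then divided by the total building area. The function name and the two sample materials are hypothetical; the quantities and factors are not values from Appendix A.

```python
from typing import Iterable, Tuple


def mcei(materials: Iterable[Tuple[float, float, float]], total_area_m2: float) -> float:
    """Material carbon emission intensity in kgCO2e/(m2*a).

    Each material is (quantity M_i, emission factor F_i, service life A_i in years).
    """
    if total_area_m2 <= 0:
        raise ValueError("total_area_m2 must be positive")
    annualized = sum(m * f / life for m, f, life in materials)
    return annualized / total_area_m2


# Hypothetical inputs: (quantity, kgCO2e per unit, service life in years).
sample_materials = [
    (1000.0, 300.0, 50.0),  # contributes 6000 kgCO2e per year
    (200.0, 90.0, 30.0),    # contributes 600 kgCO2e per year
]
intensity = mcei(sample_materials, total_area_m2=3600.0)
```

With these placeholder inputs the annualized emissions are 6600 kgCO2e per year, which gives roughly 1.83 kgCO2e/(m2·a) for a 3600 m² building.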
To calculate MCEI, the building was divided into eight primary components: exterior walls, interior walls, roofs, floors, ground, windows, doors, and railings. The carbon emissions for each component were calculated using the Material Carbon Emission of Unit Construction (MCEUC) indicator, which adapts to various material dimensions and specifications. The detailed calculation data, including the specific construction methods, material layers, and corresponding thermal properties for each component variant, are comprehensively provided in Appendix A. The material thermal properties and construction methods detailed in this appendix refer to the following national building standard design atlases: Energy-Efficient Building Doors and Windows (06J607-1) and Building Construction with External Wall Internal Insulation (07J924). Material carbon emission factors (CEF) primarily cite the national standard Standard for Calculation of Building Carbon Emissions (GB/T51366-2019) and the official emission factor database40.
The key results of these calculations are summarized in Table 2 below, which presents the MCEUC values and key thermal performance indicators for each component option.
MCEUC represents the total carbon emissions generated per unit area or unit length over the component’s expected service life (e.g., 30 years for walls). The compilation of this data provides a crucial reference for the carbon emission analysis in this study.
Operational carbon emission intensity
According to Article 4.1.4 of the “Standard for Building Carbon Emission Calculation (GB/T 51366-2019),” carbon emissions during the building operation phase should be calculated based on the energy consumption of various system types and their corresponding carbon emission factors. Systems included in the operational phase calculation are HVAC, domestic hot water, lighting, elevators, renewable energy systems, and carbon sequestration systems. The operational phase carbon emissions per unit building area (CM) and the annual energy consumption (Ei) are calculated using the following formulas41:

\[ C_M = \frac{\left[ \sum_{i=1}^{n} \left( E_i \, EF_i \right) - C_p \right] y}{A} \]

\[ E_i = \sum_{j=1}^{n} \left( E_{i,j} - ER_{i,j} \right) \]

Where: \(C_M\) represents the operational phase carbon emissions per unit building area [kgCO2/m²]. \(E_i\) denotes the annual consumption of the i-th type of energy in the building [kWh/a]. \(EF_i\) is the carbon emission factor for the i-th type of energy. Because the hot summer and cold winter region spans multiple provinces, this study uses the national average power grid carbon emission factor of 0.5703 [kgCO2/kWh], as provided by the Ministry of Ecology and Environment in its 2023 “Research on China’s Regional Power Grid Carbon Emission Factors,” to convert electricity usage into carbon emissions. \(E_{i,j}\) indicates the annual consumption of the i-th type of energy by the j-th system [kWh/a]. \(ER_{i,j}\) represents the amount of the i-th type of energy supplied by the renewable energy system to the j-th system [kWh/a]. i refers to the final energy type, including electricity and natural gas; this study primarily focuses on electricity. j refers to the type of building energy system, such as HVAC, lighting, and domestic hot water. In this study, HVAC, lighting, and equipment usage are simulated using dedicated software. \(C_p\) is the annual carbon reduction from the building’s green area carbon sink system [kgCO2/a]; this factor is not considered in the current study. y is the building’s design life [years]. A is the building area [m²].
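The operational calculation can be sketched in a few lines of Python, assuming the form given in GB/T 51366-2019: net energy per energy type is system consumption minus renewable supply, annual emissions are net energy times the emission factor minus the carbon sink, and the result is multiplied by the design life and divided by the building area. The grid factor of 0.5703 kgCO2/kWh is taken from the text; the function names, the energy quantities, and the 50-year design life are hypothetical placeholders.

```python
from typing import Dict

GRID_FACTOR_KG_PER_KWH = 0.5703  # national average grid factor cited in the text


def net_energy(consumption_by_system: Dict[str, float],
               renewable_by_system: Dict[str, float]) -> float:
    """E_i: sum over systems j of (E_ij - ER_ij), in kWh/a."""
    return sum(use - renewable_by_system.get(system, 0.0)
               for system, use in consumption_by_system.items())


def operational_emissions_per_area(net_energy_by_type: Dict[str, float],
                                   factor_by_type: Dict[str, float],
                                   design_life_years: float,
                                   area_m2: float,
                                   carbon_sink_kg_per_year: float = 0.0) -> float:
    """C_M in kgCO2/m2 over the design life."""
    annual = sum(energy * factor_by_type[kind]
                 for kind, energy in net_energy_by_type.items())
    return (annual - carbon_sink_kg_per_year) * design_life_years / area_m2


# Hypothetical annual electricity use per system (kWh/a); no renewables, no carbon sink.
electricity = net_energy({"hvac": 100000.0, "lighting": 60000.0, "equipment": 40000.0}, {})
c_m = operational_emissions_per_area(
    {"electricity": electricity},
    {"electricity": GRID_FACTOR_KG_PER_KWH},
    design_life_years=50.0,  # placeholder design life
    area_m2=3600.0,
)
annual_intensity = c_m / 50.0  # kgCO2/(m2*a), comparable in unit to OCEI
```

Dividing the life-cycle figure by the design life recovers an annual intensity, which is the unit in which OCEI is reported.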
We adhered to the “Standard for Building Carbon Emission Calculation” to define the simulation conditions for HVAC, lighting, and equipment carbon emissions. The simulation settings are categorized into four main aspects: thermal performance of the building envelope, indoor occupancy patterns, temperature control setpoints, and fresh air requirements.
The thermal performance of the building envelope is benchmarked against the “General Code for Energy Efficiency and Renewable Energy Utilization in Buildings (GB 55015-2021),” which is applicable to office buildings in Zone A of hot summer and cold winter regions. Key thermal parameters include: roof U-value of 0.4 W/(m²·K), external walls 1.0 W/(m²·K), partition walls 1.5 W/(m²·K), floors 1.8 W/(m²·K), exposed floors 1.0 W/(m²·K), and entrance doors 2.0 W/(m²·K). External windows have a U-value of 2.0 W/(m²·K) and a Solar Heat Gain Coefficient (SHGC) of 0.4, while skylights have a U-value of 2.8 W/(m²·K) and an SHGC of 0.2. These values align with the default settings of normative index 1 in building construction practices, with other thermal parameters adjusted based on specific building modifications.
Indoor occupancy levels are based on an average resting metabolic rate of 120 W per person (see Fig. 6). Occupancy rates are set as follows: 95% from 9 AM to 11 AM and 2 PM to 5 PM; 80% at 12 PM and 1 PM; 10% at 7 AM; 50% at 8 AM; 30% at 6 PM and 7 PM; and no occupancy during other times. Temperature control setpoints are set to 26℃ for summer and 18℃ for winter, with time-based adjustments.
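The occupancy pattern described above can be encoded as a 24-value hourly schedule, as is common for energy simulation inputs. The sketch below assumes that each stated clock time denotes the hour starting at that time and that the stated ranges include their end hour (9 AM to 11 AM covers hours 9, 10, and 11); these interpretations, and the function name, are assumptions rather than details confirmed by the text.

```python
from typing import List

METABOLIC_RATE_W = 120.0  # per-person heat output stated in the text


def office_occupancy_schedule() -> List[float]:
    """Hourly occupancy fractions for one workday, index 0 = midnight to 1 AM."""
    schedule = [0.0] * 24
    schedule[7] = 0.10
    schedule[8] = 0.50
    for hour in (9, 10, 11, 14, 15, 16, 17):
        schedule[hour] = 0.95
    for hour in (12, 13):
        schedule[hour] = 0.80
    for hour in (18, 19):
        schedule[hour] = 0.30
    return schedule


schedule = office_occupancy_schedule()
# Full-occupancy-equivalent hours per day and the resulting heat per design occupant.
equivalent_hours = sum(schedule)
daily_heat_wh_per_person = METABOLIC_RATE_W * equivalent_hours
```

Under these assumptions the schedule amounts to about 9.45 full-occupancy hours per day, or roughly 1134 Wh of occupant heat gain per design occupant per workday.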
The energy efficiency ratio for cooling and heating is standardized to 2.5. Lighting power density is set at 18 W/m², and equipment power density at 13 W/m². Both lighting and equipment are assumed to convert 30% of their power into heat, with operation schedules synchronized with occupancy times. The fresh air supply rate is set at 30 cubic meters per person per hour, based on office building standards, and ventilation schedules align with occupancy patterns. Lighting in corridor areas is calculated separately, with a lighting power density of 2.5 W/m² and continuous operation throughout the workday.
Lighting rate and air conditioning setting temperature.
Seasonal carbon emission balance
To quantify the difference in carbon emissions between winter and summer, we introduce the SCEB. SCEB is calculated as the difference between the CCEI during summer and the HCEI during winter, both derived from the operational carbon emission components. The calculation formula is as follows:

\[ \mathrm{SCEB} = \mathrm{CCEI} - \mathrm{HCEI} \]
We integrated this calculation step into Grasshopper, utilizing CCEI and HCEI values obtained from the renovated building’s operational carbon emission simulation results. The simulation conditions for operational carbon emissions remain consistent with those previously described.
Simulation tools
For the carbon emission simulation process, Microsoft Excel was used to organize carbon emission and thermal performance data. Rhino 7.19 (https://www.rhino3d.com/) facilitated the 3D modeling of building geometries. Energy consumption simulations (cooling, heating, lighting, and equipment) were conducted using the Ladybug and Honeybee plugins (version 1.5.0, https://www.ladybug.tools/) for Grasshopper. Additionally, a custom Grasshopper-based workflow was developed to convert energy consumption data into carbon emissions. To manage large-scale parametric simulations, the Design Space Exploration (DSE) plugin was employed for input sampling and result collection.
Machine learning model training and transfer
To develop a robust carbon emission prediction model with strong generalization capabilities, this study employed a comparative analysis of various mainstream machine learning algorithms. These algorithms, encompassing ensemble learning (Random Forest, Extra Trees), gradient boosting frameworks (XGBoost, LightGBM, CatBoost), and Multilayer Perceptron (MLP), were selected for their proficient non-linear modeling capabilities and multi-variable adaptability.
During the initial stage of model construction, the performance of each algorithm on the research dataset is systematically evaluated to determine the primary model. Because the subsequent analysis relies on SHAP-based interpretability, the model’s transparency and its ability to output variable importance also serve as critical selection criteria. Table 3 summarizes the fundamental principles, advantages, and limitations of each model, providing a basis for algorithm selection and model evaluation.
To comprehensively assess model performance, we utilized several key metrics: coefficient of determination (R²), Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), and Mean Squared Error (MSE).
R² quantifies the goodness-of-fit of the model, representing the proportion of the total variance in the dependent variable that is explained by the model. It typically lies in [0, 1], although it can become negative on held-out data when the model fits worse than a constant mean predictor. An R² closer to 1 indicates a better fit of the model to the data. An R² of 0 means the model performs no better than always predicting the mean of the true values. The formula for R² is:

\[ R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2} \]

Where \(\bar{y}\) is the mean of the true values, n is the number of samples, \(y_i\) is the true value for the i-th sample, and \(\hat{y}_i\) is the predicted value for the i-th sample.
MAE measures the average magnitude of the errors between predicted and true values. It is calculated by averaging the absolute differences between each predicted value and its corresponding true value. A smaller MAE indicates that the model’s predictions are closer to the true values. The formula for MAE is:

\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \]

Where n is the number of samples.
MAPE is a relative error metric that expresses the average percentage error of predictions relative to the true values. It provides a more intuitive understanding of prediction accuracy in percentage form. The formula for MAPE is:

\[ \mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \]
MSE quantifies the average squared difference between predicted and true values. It is used to measure the accuracy of model predictions: a smaller MSE indicates that predictions are closer to the true values. Because the errors are squared, MSE is particularly sensitive to large prediction errors. The formula for MSE is:
\[ \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2} \]
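As a minimal sketch of how the four metrics relate to one another, the following standard-library Python function computes them from paired true and predicted values. The function name and the sample values are hypothetical and serve only as an illustration; they are not taken from the study's dataset or code.

```python
from math import fsum


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, MAPE (in percent) and MSE for paired samples.

    Assumes y_true is not constant (needed for R2) and contains
    no zeros (needed for MAPE).
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = fsum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = fsum(r * r for r in residuals)               # residual sum of squares
    ss_tot = fsum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": fsum(abs(r) for r in residuals) / n,
        "mape": 100.0 * fsum(abs(r / t) for r, t in zip(residuals, y_true)) / n,
        "mse": ss_res / n,
    }


# Hypothetical values, for illustration only
y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [2.5, 3.5, 6.0, 9.0]
metrics = regression_metrics(y_true, y_pred)
```

For these illustrative values, the single error of 1.0 contributes two thirds of the squared-error total, which shows why MSE reacts more strongly to outliers than MAE.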
The trained machine learning models will be integrated into the Python Script component within Grasshopper. This integration, combined with the multi-objective optimization plugin Wallacei, enables rapid optimization computations. This establishes a simplified, highly integrated workflow achievable within Rhino, facilitating continuous analysis and optimization of the design process itself.
SHapley additive explanations
SHAP, an acronym for SHapley Additive exPlanations, draws its core principle from the Shapley value in cooperative game theory. Essentially, SHAP decomposes a model’s prediction into the sum of each feature’s marginal contribution. In cooperative game theory, the Shapley value quantifies each player’s contribution to the team’s total payoff. SHAP extends this concept to machine learning model interpretability, treating features as “players” and the model’s prediction as the “total payoff.” For a machine learning model f(x), given a sample x=(x1, x2,…, xn), the difference between its prediction and a baseline value can be decomposed into the sum of each feature’s SHAP value:
\[ f(x) - f(x_0) = \sum_{i=1}^{n} \phi_i(x) \]
Where x0 represents the baseline sample, and ϕi(x) is the SHAP value for feature xi on sample x.
Through an additive attribution framework, SHAP enables both local and global explanations for machine learning models. For a single sample x, the SHAP value ϕi(x) indicates the marginal impact of feature xi on the prediction result relative to the baseline value. The sign (positive or negative) of each feature’s SHAP value reflects its “pushing” or “inhibiting” effect on the prediction. Furthermore, SHAP supports interaction analysis, such as quantifying the synergistic effect of features i and j through the SHAP interaction value ϕi,j(x), defined for i ≠ j as:
\[ \phi_{i,j}(x) = \sum_{S \subseteq N \setminus \{i,j\}} \frac{|S|!\,\left(n - |S| - 2\right)!}{2\,(n-1)!}\,\delta_{ij}(S), \qquad \delta_{ij}(S) = f(S \cup \{i,j\}) - f(S \cup \{i\}) - f(S \cup \{j\}) + f(S) \]
Where N={1,2,…,n} is the set of feature indices, S is any subset of features excluding i and j, |S| is the size of the subset, and f(S) denotes the model’s prediction when only the features in S are known. This formula measures how the effect of adding feature i to subset S changes when feature j is also present, averaged across all possible feature subsets and weighted by their permutation count.
The core value of SHAP lies in its ability to decompose the prediction results of black-box models into interpretable feature contributions using an additive attribution approach. This provides crucial support for the trustworthiness and transparency of artificial intelligence.
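The additive decomposition can be illustrated with a brute-force computation of Shapley values for a small toy model. This is a sketch under stated assumptions: features outside a subset are replaced by the values of a single baseline sample, the function `shapley_values` and the model `toy_model` are hypothetical, and the enumeration is exponential in the number of features. Tree-based models in practice rely on specialized algorithms rather than this enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every feature subset.

    Features outside the subset are set to their baseline value.
    """
    n = len(x)

    def value(subset):
        mixed = [x[k] if k in subset else baseline[k] for k in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [k for k in range(n) if k != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(v):
    # Hypothetical stand-in for a trained regressor, with one interaction term
    return 2.0 * v[0] + v[1] * v[2]


x = [1.0, 2.0, 3.0]    # sample to explain
x0 = [0.0, 0.0, 0.0]   # baseline sample
phi = shapley_values(toy_model, x, x0)
```

In this toy case the product term is shared equally between the two interacting features, and the contributions sum to the difference between the prediction for `x` and the prediction for the baseline `x0`.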
Multi-objective carbon emission optimization
The final step of this research framework is multi-objective optimization (MOO), which employs MOO algorithms to identify solutions that can simultaneously satisfy multiple objectives. Multi-objective optimization is a class of algorithms designed to optimize several conflicting objective functions concurrently. Such problems are very common in real-world applications, including fields like engineering design, economic dispatch, and portfolio management, where decision-makers need to make trade-offs between various goals. Unlike single-objective optimization, which yields a single optimal solution, multi-objective optimization produces a set of “Pareto optimal solutions.” The fundamental mathematical components of a multi-objective optimization algorithm include the objective functions and decision variables. The objective functions are formulated as follows:
\[ \min_{x} \; F(x) = \left[ f_1(x), f_2(x), \ldots, f_m(x) \right]^{T} \]
Where F(x) comprises m objective functions, fi(x), which represent multiple objectives requiring simultaneous optimization. Each fi(x) can be either minimized or maximized, depending on the specific problem; the minimization form is written here because a maximized objective can be converted by negation. These objective functions often conflict, meaning that improving one objective may lead to a decline in the performance of another. Consequently, the optimal outcome is not a single solution but rather a set of Pareto optimal solutions. The formula for the decision variables is as follows:
\[ x = \left( x_1, x_2, \ldots, x_n \right)^{T} \in \Omega \]
Where x comprises n decision variables, xi, which represent a specific solution within the problem’s solution space, and Ω denotes the feasible region defined by the constraints and variable bounds. A solution x∗ is considered Pareto optimal if, after satisfying all relevant constraints and ensuring the decision variable values are within the feasible region, no other solution x exists where all objectives fi(x) are at least as good as the current solution fi(x∗), and at least one objective performs strictly better with x. For minimization, the formula is defined as follows:
\[ \nexists \, x \in \Omega : \; f_i(x) \le f_i(x^{*}) \;\; \forall i \in \{1, \ldots, m\} \;\; \text{and} \;\; f_j(x) < f_j(x^{*}) \;\; \text{for some } j \in \{1, \ldots, m\} \]
Any x∗ that satisfies the given formula is considered a Pareto optimal solution. The collection of all such x∗ within the domain forms the Pareto optimal set, which collectively defines the Pareto front. This front visually represents the best possible trade-offs between conflicting objectives.
NSGA-II is a widely adopted algorithm in modern multi-objective optimization49. Fundamentally, NSGA-II is a genetic algorithm that significantly improves upon its predecessor, NSGA, by incorporating mechanisms like fast non-dominated sorting, crowding distance calculation, and elitist preservation to enhance its efficiency and performance.
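The dominance relation and the ranking of solutions into successive fronts, which underlie non-dominated sorting, can be sketched in a few lines of Python. This is a simple front-peeling illustration and not the bookkeeping-based fast sort used inside NSGA-II; the function names and the candidate triples are hypothetical, and all objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(ai <= bi for ai, bi in zip(a, b)) and any(
        ai < bi for ai, bi in zip(a, b)
    )


def non_dominated_fronts(points):
    """Split points into successive Pareto fronts; returns lists of indices."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        # A point belongs to the current front if nothing remaining dominates it
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical (MCEI, OCEI, SCEB) triples, for illustration only
candidates = [
    (1.8, 30.2, 6.6),
    (2.1, 29.6, 6.5),
    (2.6, 29.2, 6.7),
    (2.2, 30.3, 6.8),   # dominated by the first two candidates
    (2.7, 30.4, 6.9),   # dominated by every other candidate
]
fronts = non_dominated_fronts(candidates)
```

The first list in `fronts` holds the indices of the Pareto-optimal candidates; later lists hold solutions that become non-dominated only after the earlier fronts are removed.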
In our case study, we aimed to optimize carbon emissions from retrofitted buildings. To achieve this, we defined a vector of three objective functions: MCEI, OCEI, and SCEB, all of which we sought to minimize. Our decision variables included the shape control parameters of the retrofitted buildings and the building selection criteria. Within our workflow, we integrated the trained prediction model into Grasshopper. This significantly boosted computational efficiency: each prediction takes around 2.5 s, whereas each simulation calculation takes approximately 180 s. For the multi-objective optimization, we utilized the Wallacei plugin in Grasshopper, which implements the NSGA-II algorithm. The algorithm’s parameters were configured as follows: Iterations: 50; Population Size: 50; Crossover Rate: 0.8; Mutation Rate: 0.9; Elitism Rate: 0.5.
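The practical effect of replacing simulation with prediction can be estimated with simple arithmetic. The sketch below assumes that every individual in every generation is evaluated exactly once and sequentially, which is an assumption made here for illustration; the timings are the approximate figures quoted above, and the names are hypothetical.

```python
GENERATIONS = 50                 # "Iterations" in the settings above
POPULATION_SIZE = 50
SECONDS_PER_PREDICTION = 2.5     # approximate, from the text
SECONDS_PER_SIMULATION = 180.0   # approximate, from the text


def total_hours(seconds_per_evaluation, evaluations):
    """Wall-clock hours for a number of sequential evaluations."""
    return seconds_per_evaluation * evaluations / 3600.0


evaluations = GENERATIONS * POPULATION_SIZE
surrogate_hours = total_hours(SECONDS_PER_PREDICTION, evaluations)
simulation_hours = total_hours(SECONDS_PER_SIMULATION, evaluations)
speedup = SECONDS_PER_SIMULATION / SECONDS_PER_PREDICTION
```

Under these assumptions, the 2500 evaluations take under two hours with the prediction model, compared with roughly 125 hours of direct simulation.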
Results
Analysis of carbon emission datasets for building renovation
Figure 7 illustrates the distributional characteristics of various Carbon Emission Intensity (CEI) metrics and presents a comparison of these distributions against their respective benchmark values.
Figure 7(a) shows that the MCEI values across all sample groups range from 2 to 15 kgCO₂e/(m²·a), with most values concentrated between 2 and 7 kgCO₂e/(m²·a) and an overall average of 5.94 kgCO₂e/(m²·a). The MCEI for the baseline renovated building is 6.98 kgCO₂e/(m²·a).
Figures 7(b) and 7(c) show CCEI and HCEI in HSCW regions. CCEI ranges from 5 to 17.5 kgCO₂e/(m²·a), with most values between 10 and 14 kgCO₂e/(m²·a). The average CCEI is 11.96 kgCO₂e/(m²·a), which is below the benchmark of 13.99 kgCO₂e/(m²·a). HCEI spans 0.5–5.5 kgCO₂e/(m²·a), with most values between 1 and 3 kgCO₂e/(m²·a). The average HCEI is 2.32 kgCO₂e/(m²·a), also below the benchmark of 3.40 kgCO₂e/(m²·a). Notably, summer emissions are much higher than winter emissions in these regions. This is due to climate: winter temperatures stay near 0 °C without becoming extreme, while summer temperatures rise sharply under the subtropical high-pressure system. Solar radiation also plays a role: stronger summer sun increases cooling loads and raises CCEI, while winter sun offsets part of the heating demand and lowers HCEI.
Figures 7(d), 7(e), and 7(f) further detail the distributions of LCEI, ECEI, and OCEI. LCEI primarily clusters between 5 and 6 kgCO₂e/(m²·a), with an average of 5.64 kgCO₂e/(m²·a), slightly below its benchmark of 5.82 kgCO₂e/(m²·a). ECEI ranges from 8 to 16 kgCO₂e/(m²·a), with most values falling between 12 and 16 kgCO₂e/(m²·a). The average ECEI is 14.90 kgCO₂e/(m²·a), which is lower than its benchmark of 16.23 kgCO₂e/(m²·a). Finally, OCEI values are concentrated between 30 and 40 kgCO₂e/(m²·a), averaging 34.82 kgCO₂e/(m²·a), a figure notably below its benchmark of 39.44 kgCO₂e/(m²·a).
Overall, the distributions of all CEI metrics approximate a normal distribution. This suggests that the parameter sampling was adequately random and provides a solid data foundation for subsequent machine learning model training. In addition, every benchmark value exceeds the corresponding sample mean, although the margin varies from slight (LCEI) to substantial (OCEI). This suggests that the benchmark buildings likely have not implemented low-carbon retrofitting measures, resulting in their generally higher carbon emission levels.
Carbon emission simulation results. Panels (a) to (f) illustrate the simulation results for MCEI, CCEI, HCEI, LCEI, ECEI, and OCEI, respectively.
Performance evaluation of six machine learning models
Figure 8(a) presents a comparative analysis of six distinct machine learning algorithms, evaluated across four performance metrics: R², MAE, MAPE, and MSE. The results show that CatBoost achieves the best performance on all four metrics, followed closely by LightGBM and XGBoost. The strong showing of these three algorithms underscores the advantage of gradient boosting methods on the dataset used in this investigation.
Figure 8(b) provides a detailed analysis of the learning curves for each model, using R² as the scoring metric to compare the fit on both the training and cross-validation sets. For most models, the R² values on the validation set progressively increased with training iterations, indicating a favorable learning trajectory. While the MLP showed noticeable improvement, its overall accuracy remained lower than that of the other models, suggesting that the MLP might require a larger dataset to achieve higher predictive precision. XGBoost performed slightly better than Random Forest and Extra Trees, but it still lagged behind LightGBM and CatBoost. It is noteworthy that LightGBM exhibited a declining trend in performance on the training set, which may suggest overfitting. In contrast, CatBoost maintained a consistently high R² on both the training and validation sets, demonstrating superior stability.
Considering both predictive accuracy and generalization ability, CatBoost emerges as the most prominent performer. Its superior efficacy can likely be attributed to its robust handling of categorical variables. For instance, certain input variables in this dataset, such as OTW, while numerically represented, are inherently categorical in nature. CatBoost’s native support for such data types significantly enhances model fitting effectiveness. Consequently, CatBoost has been selected as the core regressor for the subsequent multi-objective optimization model.
(a) In the comparative bar chart analysis of the six algorithms, the top three performers across all evaluation scores are consistently CatBoost, LightGBM, and XGBoost; (b) The model training learning curves for the six algorithms show that CatBoost, LightGBM, and XGBoost consistently deliver the best performance.
SHAP analysis of input variables
To further understand the optimal CatBoost model, we conducted a SHAP (SHapley Additive exPlanations) analysis. SHAP quantifies the contribution of each feature to the model’s output, providing insights into feature importance. Its visualizations primarily consist of two key forms: the SHAP Feature Importance Bar Plot and the SHAP Summary Plot. The SHAP Feature Importance Bar Plot displays features on the y-axis and the mean absolute SHAP value on the x-axis. The mean represents the average absolute magnitude of a feature’s impact on the model’s output: a larger value indicates a more critical role for that feature in the model’s decision-making process. The SHAP Summary Plot uses features as its y-axis and the SHAP value as its x-axis. The SHAP value signifies a feature’s marginal contribution to the model’s output: positive values indicate an increase in the predicted value, while negative values suggest a suppressive effect. A color gradient is mapped to the feature value itself. The results of this SHAP analysis for the three objectives are presented below.
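The quantity shown in the feature importance bar plot, the mean absolute SHAP value per feature, can be sketched as follows. The function name and the small SHAP matrix are hypothetical and purely illustrative; they are not the values obtained in this study.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value (the bar-plot quantity)."""
    n_samples = len(shap_rows)
    scores = {
        name: sum(abs(row[k]) for row in shap_rows) / n_samples
        for k, name in enumerate(feature_names)
    }
    # Largest average magnitude first, as on the bar plot
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values: one row per sample, one column per feature
features = ["FLH", "WWR1", "NOF"]
shap_rows = [
    [1.5, -0.5, 0.25],
    [-2.0, 1.0, -0.25],
    [1.0, -0.75, 0.5],
]
ranking = mean_abs_shap(shap_rows, features)
```

Because absolute values are averaged, a feature whose SHAP values alternate in sign still ranks highly, which is why the summary plot is needed to read the direction of the effect.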
(a) SCEB is primarily influenced by building form parameters, including WWR1, FLH, WWR2, NOF, NOA, and ORI; (b) OCEI is also predominantly impacted by building form parameters: FLH, WWR1, NOF, WWR2, AR, and ASR; (c) MCEI is primarily driven by envelope structure parameters.
For the SCEB target (Fig. 9(a)), the SHAP Feature Importance Bar Plot indicates that WWR1 has the highest mean absolute SHAP value, approaching 0.8, followed by FLH and WWR2. This marks a significant deviation from the feature ranking observed for MCEI, signifying that WWR1 becomes the core driving feature for the SCEB model. This finding underscores the critical role of envelope thermal performance in controlling operational carbon emissions. The SHAP Summary Plot for WWR1 shows a broad distribution of SHAP values, ranging from −2 to 1.5. This suggests that a larger WWR1 tends to increase SCEB. Conversely, a moderate reduction in WWR1 can effectively decrease the energy consumption difference between winter and summer, thereby contributing to lower SCEB.
Turning to the OCEI target (Fig. 9(b)), the SHAP Feature Importance Bar Plot shows that FLH holds the highest mean absolute SHAP value, approaching 1.6, with WWR1 and NOF being the next most influential, consistent with the MCEI analysis. This indicates that FLH retains its core driving position within the OCEI model. The SHAP Summary Plot for FLH exhibits an extremely broad distribution of SHAP values, ranging from −4 to 4. This demonstrates an even stronger response amplitude and a clearer positive-negative relationship, thereby validating its significant regulatory effect on overall building carbon performance.
For the MCEI target (Fig. 9(c)), FLH is identified as the core driving feature. The SHAP Feature Importance Bar Plot clearly shows that the mean absolute SHAP value for FLH is significantly higher than that of other features, approaching 1.2. INW, WIN, and AR follow in importance. The SHAP Summary Plot for FLH reveals a wide distribution of SHAP values, spanning from −3 to 3. Higher FLH values generally correspond to positive SHAP values, indicating that a larger FLH significantly increases the predicted MCEI. This is because MCEI is averaged per unit area, and a higher floor height necessitates more materials for walls, windows, and other components per unit area. Conversely, lower FLH values tend to suppress carbon emission intensity, demonstrating a clear positive and negative regulatory mechanism.
The comprehensive analysis reveals that different carbon emission targets are driven by distinct key influencing features. For the SCEB and OCEI models, spatial morphological characteristics predominantly dictate the carbon emissions, whereas the MCEI model is more heavily reliant on envelope structure parameters. It is noteworthy that some dominant variables, such as FLH and WWR1, maintain a consistent influence across multiple tasks, highlighting their strong generalizability. The SHAP visualization results elucidate the direction and intensity of each feature’s contribution to the model’s output. This understanding is crucial for defining effective low-carbon optimization pathways and provides robust scientific support for multi-objective building retrofit designs.
Optimal performance and parametric analysis of building retrofits
The SHAP analysis only decomposes the prediction model itself to identify the relatively more important parameters and their positive or negative effects. The effects of parameter combinations, as well as the design approach needed to achieve the three objectives simultaneously in practice, therefore remain unclear. To address this, the carbon emission prediction model trained with CatBoost is implemented in Grasshopper and optimized using the multi-objective genetic algorithm plugin Wallacei. This setup enables rapid optimization and previews of the optimized model, making it possible to identify shared features among the optimization results and effectively aiding design decision-making.
Figure 10 illustrates the trends across different optimization objectives. In Fig. 10(a), (b), and (c), the progression of fitness values for 2500 solutions is shown, with a clear downward convergence trend over generations, indicating substantial improvement as the optimization objectives decrease. Figure 10(d), (e), and (f) present the standard deviation trends for each objective, where curve width reflects data dispersion; broader curves indicate more significant variability in objective values within an iteration. Over successive generations, the objective values show a tendency to concentrate. Lastly, Fig. 10(g), (h), and (i) depict the trend in the average value for each objective, with each blue dot representing the mean of 50 solutions per iteration. As generations progress, the MCEI average initially decreases and then slightly increases, while SCEB and OCEI averages continue to decline. Figure 10 demonstrates that multi-objective optimization substantially reduces all three objective values, significantly improving building carbon emissions post-renovation.
Optimization trends across objectives. (a-c) Fitness value progression for 2500 solutions, showing convergence towards lower values over generations. (d-f) Standard deviation trends, with broader curves indicating greater variability, becoming more concentrated over time. (g-i) Average fitness values per iteration, with MCEI initially decreasing and then slightly increasing, while SCEB and OCEI steadily decline. The optimization significantly reduces all three objective values, improving post-renovation carbon emissions.
As shown in Fig. 11, the average MCEI dropped from an initial 3.67 kgCO₂e/(m²·a) to a minimum of 1.93 kgCO₂e/(m²·a), a reduction of 1.74 kgCO₂e/(m²·a), which is 5.05 kgCO₂e/(m²·a) below the baseline value of 6.98 kgCO₂e/(m²·a). Similarly, the average SCEB decreased by approximately 20.31%, falling from 8.27 kgCO₂e/(m²·a) to 6.59 kgCO₂e/(m²·a), which is 4.00 kgCO₂e/(m²·a) below the baseline of 10.59 kgCO₂e/(m²·a). The average OCEI dropped from 31.55 kgCO₂e/(m²·a) to 29.74 kgCO₂e/(m²·a), which is 9.70 kgCO₂e/(m²·a) below the baseline value of 39.44 kgCO₂e/(m²·a).
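The reductions can be recomputed from the rounded averages quoted in this paragraph. The sketch below does so for each objective, relative both to the initial generation and to the baseline building; the helper name and the dictionary layout are hypothetical, and small differences from published percentages may arise from rounding of the inputs.

```python
def reduction(reference, value):
    """Absolute and percentage reduction of value relative to reference."""
    absolute = reference - value
    return absolute, 100.0 * absolute / reference


# (baseline, initial average, lowest average) in kgCO2e/(m2·a), as quoted above
quoted = {
    "MCEI": (6.98, 3.67, 1.93),
    "SCEB": (10.59, 8.27, 6.59),
    "OCEI": (39.44, 31.55, 29.74),
}

summary = {}
for name, (baseline, initial, final) in quoted.items():
    summary[name] = {
        "vs_initial": reduction(initial, final),    # gain from the optimization run
        "vs_baseline": reduction(baseline, final),  # gain over the baseline building
    }
```

Each entry of `summary` holds an (absolute, percentage) pair, so the gain from the optimization run can be read separately from the gain over the unoptimized baseline.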
Histogram of carbon emission optimization effect.
To analyze the optimal solutions, we applied k-means clustering with four categories. At least three categories are needed because there are three optimization objectives, and a fourth was added to capture solutions in which all three objectives are relatively balanced. Clustering is necessary because the multi-objective optimization results are scattered and the characteristics of individual optimal solutions are fragmented; grouping them makes it possible to analyze how the parameters vary within each category and to summarize strategies that can be applied at the building scheme phase. As shown in Fig. 12, the clustering results reveal distinct characteristics across the four groups: Category A, Category B, Category C, and Category D. Among them, Category A represents the balanced scenario, Category B represents the scenario with relatively poor OCEI, Category C represents the scenario with relatively weak SCEB, and Category D represents the scenario with relatively weak MCEI. The specific descriptions of the four categories are as follows.
Category A exhibits SCEB values from 6.42 to 6.68 kgCO₂e/(m²·a), capturing both the optimal and mid-range SCEB levels. The MCEI values range from 1.90 to 2.21 kgCO₂e/(m²·a), indicating mid-level performance, while OCEI values fall between 29.37 and 29.98 kgCO₂e/(m²·a), also reflecting moderate levels. Category A can therefore be characterized as the balanced category, with the best SCEB performance.
Category B shows MCEI values from 1.69 to 1.90 kgCO₂e/(m²·a), the best MCEI performance among the clusters. Its SCEB values, between 6.48 and 6.68 kgCO₂e/(m²·a), are moderate, overlapping with some values from Categories A and D. The OCEI ranges from 29.98 to 30.38 kgCO₂e/(m²·a), the poorest level among the clusters, making Category B the best for MCEI but the least favorable for OCEI.
Category C has MCEI values from 1.80 to 2.00 kgCO₂e/(m²·a), performing relatively well, with some overlap with Categories A and B. However, its SCEB values range from 6.72 to 6.81 kgCO₂e/(m²·a), the poorest performance across all clusters. The OCEI ranges from 29.55 to 30.01 kgCO₂e/(m²·a), a moderate performance. Category C can therefore be considered the worst category for SCEB.
Category D demonstrates MCEI values from 2.15 to 2.93 kgCO₂e/(m²·a), the poorest performance among all categories. Its SCEB values are between 6.56 and 6.74 kgCO₂e/(m²·a), showing moderate performance, while OCEI values range from 29.16 to 29.37 kgCO₂e/(m²·a), the best among all categories. Therefore, Category D is the worst for MCEI but the best for OCEI.
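The grouping step can be sketched with a plain implementation of Lloyd's k-means algorithm. The code below is an illustration under stated assumptions: the eight objective triples are hypothetical points loosely placed within the ranges reported above, the first k points seed the centroids so that the result is deterministic, and no feature scaling is applied. The clustering tool actually used in the study is not specified here.

```python
from math import dist


def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if not members:                  # keep an empty cluster's centroid
                updated.append(centroids[c])
                continue
            updated.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
            )
        if updated == centroids:             # converged
            break
        centroids = updated
    return labels, centroids


# Hypothetical (MCEI, OCEI, SCEB) triples, for illustration only
solutions = [
    (2.0, 29.6, 6.5), (1.7, 30.3, 6.6), (1.9, 29.8, 6.8), (2.8, 29.2, 6.7),
    (2.1, 29.7, 6.5), (1.8, 30.2, 6.6), (1.9, 29.9, 6.8), (2.7, 29.3, 6.6),
]
labels, centroids = kmeans(solutions, k=4)
```

Each centroid can then be compared with the overall objective ranges to decide which category is balanced and which one trades a single objective against the others.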
Pareto front solution and cluster diagram.
Category characterization and low carbon retrofit strategies for buildings
Table 4 shows that all four categories demonstrate notable improvements over the baseline across various metrics, with Category A representing a balanced approach to renovation. Category B suggests optimizing materials for low-carbon emissions may lead to higher operational carbon emissions. Category D, which achieves the lowest operational carbon emissions, inevitably shows increased material carbon emissions due to high-performance insulation materials. In Category C, balancing both low-carbon materials and operational efficiency results in increased seasonal carbon emission fluctuations.
Figure 13 illustrates the primary building forms for all optimal solutions, categorized into four types. The left column shows the three objective values for each category’s most representative Pareto front solution, along with their rankings among the 2500 solutions. The closer each vertex of the red triangle is to the center, the higher the ranking of that value. The right side reveals distinct characteristics among the four categories: Category A includes both additions and demolitions, Category B primarily involves adding floors, Category C generally maintains the exact floor count, while Category D focuses on reducing the number of floors.
Multi-objective optimization results.
(This image was created by Rhinoceros 3D, Annotations by the corresponding author)
The detailed variable parameter analysis for each optimization category is illustrated in the figure below (Fig. 14). A total of 11 building shape parameters and 8 structural choice parameters are considered, though height and north orientation are fixed. This results in 19 parameters displayed as box plots.
Notably, while the structural choice parameters were initially sampled as decimal values, they were converted to integers for model input. Thus, these eight structural parameters were rounded before analysis. The results are summarized as follows:
- OTW: Categories C and D primarily selected outer wall 3, while A and B favored outer walls 2 and 3.
- INW: Inner wall 2 was consistently selected across all categories.
- FLR: Floor 2 was the dominant choice across all categories, with a few selections for floor 1 in Category D.
- GRD: Ground 2 was universally chosen.
- ROF: Categories A, B, and C opted for roof 2, while Category D included roof 1 and roof 2.
- DOR: Door 2 was the standard choice in Categories A, B, and D, with Category C showing a mix of door 1 and door 2.
- RLG: The second type of railing was selected in all categories.
- WIN: Window 2 was universally preferred.
This analysis indicates that type 2 structural choices generally result in lower material carbon emissions than type 1 except for the outer wall. Consequently, type 2 choices are significantly more common in practical applications, as reflected in subsequent case optimizations.
Box plots of the structural choice parameters of the Pareto optimal solutions.
Although decimal values were used during the shape parameter sampling stage, specific parameters, such as the number of atriums (yard_num) and floors (NOF), were rounded to integers when input into the model. Therefore, the analysis of these parameters reflects the rounded values (Fig. 15):
- NOA: Uniformly set to 3 across all categories.
- ALR: Typically between 0.7 and 0.8, with minimum values above 0.5, indicating an elongated building form. The range from 0 to 1 represents narrow to long courtyard layouts, suggesting that elongated courtyards are compatible with the original structure.
- ATL: Position_y generally exceeds 0.5, indicating a northern position across categories. Position_x shows more variation: it is around 0.2 in Categories A and B (westward), about 0.1 in Category D (also westward), and approximately 0.8 in Category C (eastward). An eastward position aids seasonal carbon emission balance, while a northern position optimizes overall emissions.
- WWR: WWR2 is concentrated around 0.10 and WWR1 around 0.20 in all categories.
- RMW: The single office width is approximately 18.5 m, suggesting that wider rooms reduce the need for interior walls, lowering carbon emissions.
- ATR: Concentrated around 0.24 in Categories A and D, ranging from 0.14 to 0.22 in Category B, and about 0.18 in Category C.
- AR: Approximately 3880 m² in Categories A, B, and C, with Category D ranging from 3500 to 3900 m².
- ASR: Relatively stable across all categories, as additions or demolitions are limited to maintain the building’s adaptive reuse potential.
- NOF: This parameter allows at most one floor to be added or removed. Category A shows a balanced approach, Category B primarily adds one floor, Category D tends to reduce the number of floors, and Category C typically reflects reductions or no change.
Box plots of the shape parameters of the Pareto optimal solutions.
Discussion
The results indicate that, compared to the baseline, our proposed CatBoost-NSGA-II framework achieves its best overall outcome in the balanced Category A. Specifically, reductions of 71.06% in material carbon emissions, 37.20% in operational carbon emissions, and 24.75% in seasonal carbon emissions were achieved. This confirms that our framework successfully integrates multi-objective optimization to reduce overall building carbon emissions.
Reuse of building materials and application of low-carbon building materials
The foremost strategy for reducing MCEI is to prioritize reusing existing building materials. From the optimization results of OTW, DOR, WIN, and RLG, we can see that OTW2, OTW3, OTW4, DOR1, WIN2, and RLG2 are all structures that utilize existing materials or recyclable building materials. This renovation fully leveraged the original structure, excluding the carbon emissions associated with these materials from the assessment50. Furthermore, to ensure minimal lifecycle carbon emissions, promoting the use of low-carbon materials is essential and represents a key MCEI reduction measure. In the MCEI optimization process, the B-type construction scheme proved particularly effective, as the materials chosen for INW, FLR, GRD, WIN, ROF, and RLG were selected for their minimal material carbon emissions.
Importantly, this construction scheme significantly reduces material carbon emissions and strictly complies with national energy efficiency standards, ensuring that the renovated building meets thermal performance requirements. Consequently, by capitalizing on these low-carbon material choices, the final optimized results achieved the lowest carbon emissions while maintaining energy efficiency. This pathway exemplifies the practical application of green building principles in renovation, offering replicable low-carbon strategies for similar future projects.
Appropriate reduction of internal partitions
Adjusting building functions inevitably requires modifications to the internal spatial layout. For instance, when repurposing industrial buildings for office use, additional internal walls are often needed to divide the space effectively. However, increasing the number of internal partitions raises material usage and significantly escalates the building’s MCEI. Therefore, reducing interior partitions can be an effective strategy to lower material consumption and carbon emissions, aligning with the objectives of green transformation.
At the same time, excessive partition reduction may result in disproportionately large spaces, which can impose additional cooling and heating demands on HVAC systems51, thereby increasing operational carbon emissions. It is therefore essential to strike a balance between minimizing partitions and preserving a functional spatial layout. By carefully optimizing room width, it is possible to balance material savings and energy efficiency.
In the case study’s Pareto optimal solution, an optimal room width of 18.3 m was identified, which is below the maximum set value of 20 m. This outcome suggests that a moderate reduction in interior partitions reduces material carbon emissions while avoiding the additional HVAC energy consumption caused by excessively large rooms, thereby achieving concurrent optimization of carbon emission reductions and spatial utilization efficiency.
Appropriately enhance the thermal performance of the building envelope
Building envelope renovations include components such as the roof, external walls, doors, windows, and cantilevered floor slabs, whose thermal performance directly impacts winter heating and summer cooling carbon emissions, influencing OCEI and SCEB. OCEI comprises CCEI, HCEI, LCEI, and ECEI, while SCEB represents the difference between CCEI and HCEI. Effective optimization of both metrics necessitates a focus on reducing carbon emissions from summer air conditioning.
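The two composite metrics can be checked against the intensities reported in the Results. The sketch below assumes that "comprises" means a simple sum of the four components, an interpretation that is consistent with the quoted benchmark and sample-average values; the function and variable names are hypothetical.

```python
def ocei(ccei, hcei, lcei, ecei):
    """Operational intensity as the sum of its four components (assumed)."""
    return ccei + hcei + lcei + ecei


def sceb(ccei, hcei):
    """Seasonal balance as the cooling-heating difference."""
    return ccei - hcei


# Benchmark and sample-average intensities quoted in the Results, kgCO2e/(m2·a)
benchmark = {"ccei": 13.99, "hcei": 3.40, "lcei": 5.82, "ecei": 16.23}
sample_mean = {"ccei": 11.96, "hcei": 2.32, "lcei": 5.64, "ecei": 14.90}

benchmark_ocei = ocei(**benchmark)
sample_mean_ocei = ocei(**sample_mean)
benchmark_sceb = sceb(benchmark["ccei"], benchmark["hcei"])
```

Rounded to two decimals, the sums reproduce the reported OCEI benchmark and sample average, and the difference reproduces the SCEB baseline, which illustrates why reducing cooling emissions improves both metrics at once.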
In general, higher thermal performance in the building envelope, indicated by greater thermal resistance, enhances insulation against external temperature fluctuations and thus decreases both summer and winter air conditioning demands52. However, the multi-objective optimization results revealed that, aside from WIN, where thermal performance across options was relatively similar, other envelope components such as OTW, ROF, and DOR did not employ the highest thermal resistance choices. This is because high thermal resistance not only blocks external heat but also limits the dissipation of internally generated heat (e.g., from occupants, lighting, and equipment)53. In regions with hot summers and cold winters, especially those with high humidity, additional energy is also required for dehumidification, which increases cooling-related energy consumption and carbon emissions and can result in elevated total carbon emissions.
Therefore, to optimize both OCEI and SCEB, it is essential to maintain the building envelope’s thermal performance within an optimal range that balances insulation with the ability to dissipate excess indoor heat. In the case study’s Pareto optimal solution (Category A), OTW2, with a thermal resistance of 0.849, was selected. Although this resistance is lower than OTW3’s value of 1.2, OTW2 demonstrated a superior balance in energy efficiency and carbon emissions. This suggests that optimizing the thermal performance of the building envelope should be guided by an appropriate thermal resistance range rather than by indiscriminate pursuit of maximum resistance.
Adjust the location and size of the atrium to control solar heat gain
Adjusting the atrium’s area and location effectively controls solar heat gain, reducing summer cooling-related carbon emissions and optimizing the SCEB and OCEI54,55. Expanding the atrium makes the building thinner overall, decreasing the roof area and, consequently, solar heat gain, lowering summer cooling demand. This strategy reduces cooling carbon emissions and enhances natural lighting, leading to reduced lighting energy consumption and further optimization of the OCEI.
However, an increased atrium area can pose challenges for winter carbon emissions. A larger atrium expands the surface available for heat exchange with the exterior, which, despite reducing roof solar heat gain, may increase winter heating demand and thus adversely affect the OCEI. It is therefore essential to keep the atrium area within a suitable range that balances carbon emissions across seasons. In the Pareto optimal solution of the case study, Class A features an atrium occupying 24% of the total building area, achieving a favorable balance.
The atrium’s location also significantly influences the building’s carbon emissions. It shapes the internal layout, dividing the building into eastern, western, southern, and northern sections with varying solar exposures and, consequently, different cooling and heating demands. For instance, the afternoon sun affects western rooms more strongly, typically increasing their summer energy consumption. In the case study, the optimal solution (Class A) situates the atrium on the northwest side at coordinates (0.18, 0.76), which balances solar heat distribution and minimizes overall carbon emissions.
It is important to note that these strategies are derived from the analysis of a specific project and are suited to projects with similar geographic locations, building forms, and renovation objectives. For other regions or building types, a comparable optimization framework, combined with cluster analysis, should be employed to identify the parametric characteristics of each project type and to develop tailored design optimization strategies.
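The notion of a Pareto optimal solution used in this discussion can be illustrated with a minimal sketch of non-dominated filtering over the three objectives. This is only the dominance test that underlies NSGA-II, not the full algorithm (no crossover, mutation, or crowding distance), and it assumes that all three objectives are minimized. The candidate values and the function names `dominates` and `pareto_front` are hypothetical and are not taken from the study's results.

```python
from typing import List, Sequence, Tuple

# (MCEI, OCEI, SCEB); all three are assumed to be minimized.
Objectives = Tuple[float, float, float]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Return the non-dominated subset of points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidate designs, for illustration only.
candidates: List[Objectives] = [
    (5.0, 3.7, 30.0),
    (4.0, 4.5, 35.0),
    (6.0, 4.0, 31.0),  # worse than the first candidate on every objective
    (5.5, 3.6, 28.0),
    (7.0, 5.0, 40.0),  # worse than every other candidate
]

front = pareto_front(candidates)
print(front)
```

The surviving candidates each trade one objective against another, which is why a further selection step is needed to pick a single balanced design from the front.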
Research limitations and future directions
This study establishes an optimization framework for the low-carbon renovation of industrial buildings, yet many avenues for application and extension remain to be explored. First, the current research is primarily based on atrium-based building forms. Future studies should consider expanding this framework to encompass a broader range of forms, such as side courtyards and curved structures, thereby enhancing the generalizability of the conclusions and improving the framework’s adaptability across building types. Second, this study focuses on transforming industrial buildings into office spaces. However, industrial buildings are also frequently repurposed as art galleries, exhibition spaces, and other uses. Further research should therefore explore carbon emission optimization strategies specific to these building types to address the increasingly urgent need for low-carbon solutions.
Furthermore, while carbon emissions have been the primary focus thus far, future research should incorporate multi-dimensional objectives, such as cost control and occupant comfort, leading to a more holistic approach in renovation assessments and better alignment with the diverse needs of real-world projects. Finally, this study is based on typical meteorological data for China. As global climate change intensifies, future studies should integrate climate change scenarios into the model, enhancing the applicability and robustness of the optimization results under future climate conditions56.
Conclusions
This study develops a multi-objective optimization framework integrating CatBoost with NSGA-II to address the complex challenges of low-carbon building renovation. The framework enables building designers to make data-driven decisions for low-carbon renovations. It provides a scientific basis and technical support for the industry’s transition towards low-carbon practices and has been validated through a real-world case study.
In this study, a three-story industrial building of 3,600 square meters, with a 4:1 aspect ratio and a 4.2-meter floor height, was repurposed as office space. The optimization framework was tailored to the building’s form and construction parameters, with material carbon emissions, operational carbon emissions, and seasonal carbon emission differences as the primary objectives. Parametric modeling and optimization yielded four sets of optimized solutions, with Class A presenting the most balanced strategy. Compared to the baseline design, Class A achieved reductions of 71.06% in material carbon emissions, 37.20% in operational carbon emissions, and 24.75% in seasonal carbon emission difference, reaching values of 4.96, 3.74, and 29.76 kgCO2/(m²·a), respectively.
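As a quick consistency check on these figures, the following sketch back-calculates the baseline value implied by each reported pair of optimized value and percentage reduction, using baseline = optimized / (1 − reduction). The resulting baselines are derived here for illustration and are not values reported in the study. The sketch also assumes that the three optimized values are listed in the same order as the three reductions; the function name `implied_baseline` is hypothetical.

```python
def implied_baseline(optimized: float, reduction_pct: float) -> float:
    """Back-calculate the baseline from an optimized value and a % reduction."""
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction must be in the range [0, 100)")
    return optimized / (1.0 - reduction_pct / 100.0)


# (optimized value, reduction in %) as quoted for the most balanced solution.
reported = {
    "MCEI": (4.96, 71.06),
    "OCEI": (3.74, 37.20),
    "SCEB": (29.76, 24.75),
}

# Derived values, not reported in the source.
baselines = {
    name: implied_baseline(value, pct) for name, (value, pct) in reported.items()
}

for name, baseline in baselines.items():
    print(f"{name}: implied baseline = {baseline:.2f}")
```

The same relation can be applied in reverse to check any other solution set against the baseline, provided its reductions are expressed relative to the same baseline design.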
To optimize carbon emissions in building renovation, priority should be given to reusing existing materials and promoting low-carbon materials, both of which reduce MCEI effectively. Appropriate reductions in internal partitions can also lower material use and the associated carbon emissions. However, spatial layout must be balanced against HVAC energy consumption to avoid the increased emissions caused by overly large open spaces. The thermal performance of the building envelope should be maintained within a suitable range, because excessively high thermal resistance hinders the dissipation of indoor heat and can increase cooling-related carbon emissions. By adjusting the atrium’s area and location, solar heat gain can be controlled, which reduces summer cooling carbon emissions, enhances natural lighting, and decreases lighting energy consumption. These strategies outline a low-carbon optimization pathway for building renovations that is applicable to similar projects and offer a reference framework for future renovation work.
Future research should explore renovation strategies across diverse climate conditions and integrate cost control and occupant comfort objectives to support the building sector’s green transformation and global carbon neutrality targets more comprehensively. With these advancements, the optimization framework can help reduce lifecycle carbon emissions and provide scientific and technical guidance for sustainable building design and renovation.
Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Abbreviations
- MCEI: Material carbon emission intensity
- OCEI: Operational carbon emission intensity
- SCEB: Seasonal carbon emission balance
- GBRS: Green building rating systems
- CSWD: China standard weather data
- CCEI: Cooling carbon emission intensity
- HCEI: Heating carbon emission intensity
- LCEI: Lighting carbon emission intensity
- ECEI: Equipment carbon emission intensity
- DSE: Design space exploration
- LightGBM: Light gradient boosting machine
- MAPE: Mean absolute percentage error
- MAE: Mean absolute error
- MOO: Multi-objective optimization
- NSGA-II: Non-dominated sorting genetic algorithm II
- MCEUC: Material carbon emission of unit construction
References
Ang, Y. Q., Berzolla, Z. M., Letellier-Duchesne, S. & Reinhart, C. F. Carbon reduction technology pathways for existing buildings in eight cities. Nat. Commun. 14, 1689. https://doi.org/10.1038/s41467-023-37131-6 (2023).
Johnson, E. J., Sugerman, E. R., Morwitz, V. G., Johar, G. V. & Morris, M. W. Widespread misestimates of greenhouse gas emissions suggest low carbon competence. Nat. Clim. Chang. 14, 707–714. https://doi.org/10.1038/s41558-024-02032-z (2024).
Weber, R. E., Mueller, C. & Reinhart, C. A hypergraph model shows the carbon reduction potential of effective space use in housing. Nat. Commun. 15, 8327. https://doi.org/10.1038/s41467-024-52506-z (2024).
Xing, Z., Ma, Y., Luo, L. & Wang, H. Harmonizing economies and ecologies: towards an equitable provincial carbon quota allocation for china’s peak emissions. Humanit. Soc. Sci. Commun. 11, 964. https://doi.org/10.1057/s41599-024-03478-4 (2024).
Ministry of Housing and Urban-Rural Development of the People’s Republic of China. Fourteenth Five-Year Plan for the Development of the Construction Industry. (2022). https://www.gov.cn/zhengce/zhengceku/2022-01/27/content_5670687.htm
National Development and Reform Commission. National Old Industrial Base Adjustment and Transformation Plan (2013–2022): NDRC Northeast [2013] No. 543. https://www.ndrc.gov.cn/xxgk/zcfb/ghwb/201304/t20130402_962134.html (2013).
Xiao, W., Zhong, W., Wu, H. & Zhang, T. Multiobjective optimization of daylighting, energy, and thermal performance for form variables in atrium buildings in china’s hot summer and cold winter climate. Energy Build. 297, 113476. https://doi.org/10.1016/j.enbuild.2023.113476 (2023).
Zhou, B. & Wang, D. Integrated performance optimization of industrial buildings in relation to thermal comfort and energy consumption: A case study in hot summer and cold winter climate. Case Stud. Therm. Eng. 46, 102991. https://doi.org/10.1016/j.csite.2023.102991 (2023).
Liu, Y. et al. Towards the goal of zero-carbon Building retrofitting with variant application degrees of low-carbon technologies: mitigation potential and cost-benefit analysis for a kindergarten in Beijing. J. Clean. Prod. 393, 136316. https://doi.org/10.1016/j.jclepro.2023.136316 (2023).
Mulya, K. S. Decarbonizing the high-rise office building: A life cycle carbon assessment to green building rating systems in a tropical country (Building and Environment, 2024).
Lou, Y., Yang, Y., Ye, Y., Zuo, W. & Wang, J. The effect of Building retrofit measures on CO2 emission reduction – A case study with U.S. Medium office Buildings. Energy Build. 253, 111514. https://doi.org/10.1016/j.enbuild.2021.111514 (2021).
Lou, Y., Ye, Y., Yang, Y. & Zuo, W. Long-term carbon emission reduction potential of Building retrofits with dynamically changing electricity emission factors. Build. Environ. 210, 108683. https://doi.org/10.1016/j.buildenv.2021.108683 (2022).
Hao, J. L. & Ma, W. Evaluating carbon emissions of construction and demolition waste in Building energy retrofit projects. Energy 281, 128201. https://doi.org/10.1016/j.energy.2023.128201 (2023).
Craft, W. et al. Towards net-zero embodied carbon: investigating the potential for ambitious embodied carbon reductions in Australian office buildings. Sustainable Cities Soc. 113, 105702. https://doi.org/10.1016/j.scs.2024.105702 (2024).
García-López, J., Hernández-Valencia, M., Roa-Fernández, J., Mascort-Albea, E. J. & Herrera-Limones, R. Balancing construction and operational carbon emissions: evaluating neighbourhood renovation strategies. J. Building Eng. 94, 109993. https://doi.org/10.1016/j.jobe.2024.109993 (2024).
Li, K. et al. Carbon reduction in commercial Building operations: A provincial retrospection in China. Appl. Energy 306, 118098. https://doi.org/10.1016/j.apenergy.2021.118098 (2022).
Zhang, Y., Teoh, B. K. & Zhang, L. Data-driven optimization for mitigating energy consumption and GHG emissions in buildings. Environ. Impact Assess. Rev. 107, 107571. https://doi.org/10.1016/j.eiar.2024.107571 (2024).
Luo, X. J. & Oyedele, L. O. A data-driven life-cycle optimisation approach for Building retrofitting: A comprehensive assessment on economy, energy and environment. J. Building Eng. 43, 102934. https://doi.org/10.1016/j.jobe.2021.102934 (2021).
Thrampoulidis, E., Mavromatidis, G., Lucchi, A. & Orehounig, K. A machine learning-based surrogate model to approximate optimal Building retrofit solutions. Appl. Energy 281, 116024. https://doi.org/10.1016/j.apenergy.2020.116024 (2021).
Justo Alonso, M., Dols, W. S. & Mathisen, H. M. Using Co-simulation between energyplus and CONTAM to evaluate recirculation-based, demand-controlled ventilation strategies in an office Building. Build. Environ. 211, 108737. https://doi.org/10.1016/j.buildenv.2021.108737 (2022).
Luo, X. J. & Oyedele, L. O. Assessment and optimisation of life cycle environment, economy and energy for Building retrofitting. Energy. Sustain. Dev. 65, 77–100. https://doi.org/10.1016/j.esd.2021.10.002 (2021).
Uribe, D., Vera, S. & Perino, M. Development and validation of a numerical heat transfer model for PCM glazing: integration to energyplus for office Building energy performance applications. J. Energy Storage 91, 112121. https://doi.org/10.1016/j.est.2024.112121 (2024).
Gao, Y., Luo, S., Jiang, J. & Rong, Y. Environmental-thermal-economic performance trade-off for rural residence retrofitting in the Beijing–Tianjin–Hebei region, Northern china: A multi-objective optimisation framework under different scenarios. Energy Build. 286, 112910. https://doi.org/10.1016/j.enbuild.2023.112910 (2023).
Li, J. et al. Cooking-related thermal comfort and carbon emissions assessment: comparison between electric and gas cooking in air-conditioned kitchens. Build. Environ. 265, 111992. https://doi.org/10.1016/j.buildenv.2024.111992 (2024).
Li, C. et al. Optimal design of Building envelope towards life cycle performance: impact of considering dynamic grid emission factors. Energy Build. 323, 114770. https://doi.org/10.1016/j.enbuild.2024.114770 (2024).
Mostafazadeh, F., Eirdmousa, S. J. & Tavakolan, M. Energy, economic and comfort optimization of Building retrofits considering climate change: A simulation-based NSGA-III approach. Energy Build. 280, 112721. https://doi.org/10.1016/j.enbuild.2022.112721 (2023).
Bayer, D. R. & Pruckner, M. Data-driven heat pump retrofit analysis in residential buildings: carbon emission reductions and economic viability. Appl. Energy 373, 123823. https://doi.org/10.1016/j.apenergy.2024.123823 (2024).
Liu, S. et al. How does future Climatic uncertainty affect multi-objective Building energy retrofit decisions? Evidence from residential Buildings in subtropical Hong Kong. Sustainable Cities Soc. 92, 104482. https://doi.org/10.1016/j.scs.2023.104482 (2023).
Song, J. et al. Framework on low-carbon retrofit of rural residential buildings in arid areas of Northwest china: A case study of Turpan residential buildings. Build. Simul. 16, 279–297. https://doi.org/10.1007/s12273-022-0941-9 (2023).
Lao, W. L., Li, M., Wong, B. C. L., Gan, V. J. L. & Cheng, J. C. P. BIM-based constructability-aware precast Building optimization using optimality criteria and combined non-dominated sorting genetic II and great deluge algorithm (NSGA-II-GD). Autom. Constr. 155, 105065. https://doi.org/10.1016/j.autcon.2023.105065 (2023).
Galimshina, A. et al. What is the optimal robust environmental and cost-effective solution for Building renovation? Not the usual one. Energy Build. 251, 111329. https://doi.org/10.1016/j.enbuild.2021.111329 (2021).
Wu, C., Pan, H., Luo, Z., Liu, C. & Huang, H. Multi-objective optimization of residential Building energy consumption, daylighting, and thermal comfort based on BO-XGBoost-NSGA-II. Build. Environ. 254, 111386. https://doi.org/10.1016/j.buildenv.2024.111386 (2024).
Shi, Y. & Chen, P. Energy retrofitting of hospital buildings considering climate change: an approach integrating automated machine learning with NSGA-III for multi-objective optimization. Energy Build. 319, 114571. https://doi.org/10.1016/j.enbuild.2024.114571 (2024).
Zhou, C. et al. Deciphering the nonlinear and synergistic role of building energy variables in shaping carbon emissions: A LightGBM-SHAP framework in office buildings. Build. Environ. 266, 112035 (2024).
Wang, Z., Zhao, Z. & Wang, C. Random forest analysis of factors affecting urban carbon emissions in cities within the Yangtze River Economic Belt. PLoS One 16(6), e0252337 (2021).
Pan, H. & Wu, C. Bayesian optimization + XGBoost based life cycle carbon emission prediction for residential buildings: an example from Chengdu, China. Build. Simul. 16, 1451–1466 (2023).
Yuan, Y. et al. Effects of patterns of urban green-blue landscape on carbon sequestration using XGBoost-SHAP model. J. Clean. Prod. 476, 143640 (2024).
Zhang, W. et al. Leveraging AI techniques for predicting spatial distribution and determinants of carbon emission in China’s Yangtze River Delta. Sci. Rep. 14(1), 15392 (2024).
Qu, X. & Liu, Z. Forecasting the total building energy based on its architectural features using a combination of CatBoost and meta-heuristic algorithms. Energy Environ. https://doi.org/10.1177/0958305X241241029 (2024).
Construction and construction services. China Prod. Carbon Footpr. Factors Database, https://lca.cityghg.com/pages/topic/1 (2023).
Papadopoulos, S. et al. Evaluation of tree-based ensemble learning algorithms for building energy performance estimation. J. Build. Perform. Simul. 11(3), 322–332 (2018).
Giannelos, S. et al. Machine learning approaches for predictions of CO2 emissions in the building sector. Electr. Power Syst. Res. 235, 110735 (2024).
Martínez-Comesaña, M. et al. Use of optimised MLP neural networks for spatiotemporal estimation of indoor environmental conditions of existing buildings. Build. Environ. 205, 108243 (2021).
Li, B., Wang, Z., Liu, Y., Zhu, Y. & Ouyang, Q. Investigation of winter indoor thermal environment and heating demand of urban residential buildings in china’s hot summer–Cold winter climate region. Build. Environ. 101, 9–18. https://doi.org/10.1016/j.buildenv.2016.02.005 (2016).
Gutowski, T. G., Allwood, J. M., Herrmann, C. & Sahni, S. A global assessment of manufacturing: economic development, energy use, carbon emissions, and the potential for energy efficiency and materials recycling. Annu. Rev. Environ. Resour. 38(1), 81–106. https://doi.org/10.1146/annurev-environ-041112-110510 (2013).
Peng, C. Calculation of a Building’s life cycle carbon emissions based on ecotect and Building information modeling. J. Clean. Prod. 112, 453–465. https://doi.org/10.1016/j.jclepro.2015.08.078 (2016).
Liu, T. et al. Carbon emissions from accumulated stock of Building materials in China. Build. Environ. 240, 110451. https://doi.org/10.1016/j.buildenv.2023.110451 (2023).
Ke, G. et al. Lightgbm: A highly efficient gradient boosting decision tree. Adv. Neural. Inf. Process. Syst. 30 https://doi.org/10.5555/3294996.3295074 (2017).
Hou, F., Ma, J., Kwok, H. H. & Cheng, J. C. Prediction and optimization of thermal comfort, IAQ and energy consumption of typical air-conditioned rooms based on a hybrid prediction model. Build. Environ. 225, 109576. https://doi.org/10.1016/j.buildenv.2022.109576 (2022).
Orsini, F. & Marrone, P. Approaches for a low-carbon production of Building materials: A review. J. Clean. Prod. 241, 118380 (2019).
Yılmaz, Z. Evaluation of energy efficient design strategies for different climatic zones: comparison of thermal performance of buildings in temperate-humid and hot-dry climate. Energy Build. 39(3), 306–316 (2007).
Yuan, J., Farnham, C., Emura, K. & Alam, M. A. Proposal for optimum combination of reflectivity and insulation thickness of Building exterior walls for annual thermal load in Japan. Build. Environ. 103, 228–237 (2016).
Zhang, G., Wu, H., Liu, J., Huang, H. & Liu, Y. A multi-objective design optimization for the exterior wall coatings of residential buildings in hot summer and warm winter regions. Build. Environ. 262, 111776. https://doi.org/10.1016/j.buildenv.2023.111776 (2024).
Diz-Mellado, E., Ruiz-Pardo, Á., Rivera-Gómez, C., de la Flor, F. J. S. & Galán-Marín, C. Unravelling the impact of courtyard geometry on cooling energy consumption in buildings. Build. Environ. 237, 110349. https://doi.org/10.1016/j.buildenv.2023.110349 (2023).
Sánchez de la Flor, F. J., Ruiz-Pardo, Á., Diz-Mellado, E., Rivera-Gómez, C. & Galán-Marín, C. Assessing the impact of courtyards in cooling energy demand in buildings. J. Clean. Prod. 320, 128742 (2021).
Ashrafian, T. Enhancing school buildings energy efficiency under climate change: A comprehensive analysis of energy, cost, and comfort factors. J. Building Eng. 80, 107969. https://doi.org/10.1016/j.jobe.2023.107969 (2023).
Acknowledgements
This research was supported by the Philosophy and Social Science Planning Project of Zhejiang Province (No. 24NDQN042YB) and the Scientific Research and Nurturing Foundation of Hangzhou City University (No. J-202308).
Author information
Contributions
H. Z.: Conceptualization, Validation, Supervision and Project administration. C. H.: Conceptualization, Resources, Visualization and Writing-original draft preparation. C. Z.: Conceptualization, Writing-original draft preparation, Writing-review and editing and Funding acquisition. Z. W.: Funding acquisition. X. W.: Methodology, Software, Data curation and Writing-review and editing. Y. W.: Formal analysis, Validation and Supervision.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhu, H., Hu, C., Zhou, C. et al. An explainable machine learning framework for multi-objective carbon reduction targeting material operational seasonal emissions in building retrofits. Sci Rep 16, 272 (2026). https://doi.org/10.1038/s41598-025-29418-z
DOI: https://doi.org/10.1038/s41598-025-29418-z

















