Introduction

Background

Modern green building energy systems represent complex interconnected networks that integrate multiple energy sources, distribution infrastructure, and end-use applications under varying environmental conditions. Figure 1 illustrates the multifaceted nature of energy transfer processes in contemporary building environments, where residential and commercial structures interact with centralized power generation facilities, renewable energy sources, and sophisticated distribution networks. The diagram demonstrates how climatic variability–represented by sunny and rainy conditions–directly influences energy transmission efficiency and consumption patterns across different building types and infrastructure components. This complexity underscores the critical need for advanced prediction methodologies capable of modeling the dynamic interactions between climate conditions, energy infrastructure, and building performance characteristics that define modern sustainable energy systems.

Fig. 1
Fig. 1
Full size image

Energy transfer process.

Global energy consumption and carbon emissions from the building sector have reached unprecedented levels, with buildings accounting for approximately 40% of worldwide energy consumption and contributing significantly to greenhouse gas emissions. The construction industry alone generated 10 GtCO2 emissions in 2020, representing 28% of global carbon pollution, while the broader construction sector accounts for 38% of global energy consumption and 33% of greenhouse gas emissions. Green buildings have emerged as a fundamental solution to address these environmental challenges by integrating renewable energy systems, advanced building materials, and intelligent management technologies to minimize energy consumption while maintaining occupant comfort. The widespread adoption of green building practices has become essential for achieving global sustainability targets and mitigating climate change impacts.

Energy consumption prediction in green buildings presents complex challenges that exceed those encountered in conventional buildings due to the dynamic interactions between renewable energy systems, advanced building automation, and variable environmental conditions. Traditional prediction methods often fail to capture the sophisticated energy patterns exhibited by green buildings, which integrate solar panels, energy storage systems, smart HVAC controls, and adaptive building envelopes. The temporal variability of renewable energy generation, combined with complex load patterns from energy-efficient systems, creates prediction scenarios that require advanced modeling approaches capable of handling multi-scale temporal dependencies and nonlinear relationships between building performance and environmental factors.

Climate variability introduces additional complexity to energy consumption prediction in green buildings, as weather patterns directly influence both energy demand and renewable energy generation capacity. Temperature fluctuations affect heating and cooling loads, solar irradiance variations impact photovoltaic output, and precipitation patterns influence both building envelope performance and occupant behavior. The increasing frequency of extreme weather events and long-term climate shifts necessitate prediction models that can adapt to changing environmental conditions while maintaining accuracy across diverse climate zones and seasonal variations. Accurate energy consumption forecasting under variable climate conditions has become critical for optimizing building performance, reducing operational costs, and supporting grid stability as green buildings increasingly participate in smart energy networks.

The convergence of these challenges - the urgent need for sustainable building solutions, the complexity of green building energy systems, and the increasing impact of climate variability - underscores the critical importance of developing sophisticated energy consumption prediction capabilities specifically tailored for green buildings. Current prediction approaches inadequately address the unique operational characteristics of green buildings under variable climate conditions, creating a significant research gap that limits the optimization potential of sustainable building technologies. Addressing this gap through advanced prediction methodologies represents an essential step toward maximizing the environmental and economic benefits of green building investments while supporting the transition to a more sustainable built environment.

Literature review

As the development of society, green buildings are committed to minimizing their impact on the environment by optimizing energy efficiency and utilizing renewable energy. L. Giovannini et al. conducted a study and found that adverse weather conditions such as rain can directly and indirectly increase the heating and lighting needs of buildings1. In the same year, H Leng et al. explored how rainwater harvesting systems can reduce energy demand in cold climates while potentially increasing the external cooling demand of buildings2. Liu et al. then used Seq2Seq combined with LSTM network models to capture long-term dependencies in energy consumption data, thereby improving prediction accuracy3.R. K et al. used passive solar design, efficient insulation materials, energy-saving lighting, and efficient HVAC systems to reduce dependence on fossil fuels and greenhouse gas emissions4. Similarly, Varela-Lujan et al. optimized architectural design to enhance the use of natural light, reduce energy consumption, and apply double-layer glass and high-efficiency insulation materials5. M. Abdulrahman et al. highlighted the selection of key features for predicting residential building energy consumption through deep transfer learning and attention LSTM techniques6. Pankaj Kushwaha proposed the importance of enhancing the interpretability of prediction results based on Seq2Seq and RL models7.

The prediction of building energy consumption plays a crucial role in energy management and planning. Ran Wang et al. demonstrated the advantages of integrated model technology and various prediction algorithms in improving prediction accuracy and robustness8. Tao Liu et al. found that deep reinforcement learning has significant advantages in solving complex prediction problems9. Meanwhile, Jian Wang et al. proposed an LSTM network prediction method that accurately captures the periodic characteristics of energy consumption by adding time variables, demonstrating better predictive performance than traditional methods10. A. Khan et al. improved the accuracy of the prediction model by combining LSTM with Kalman filtering (KF) technology11. In the same year, Yuan Gao et al. enhanced the interpretability of LSTM networks in building energy consumption prediction using self-attention mechanism12, Fan, B. et al. improved the recognition ability of important data by combining Seq2Seq model with attention mechanism13. Yang Xu et al. evaluated the potential application of attention-based LSTM networks in predicting HVAC energy consumption and demonstrated their advantages in providing accurate and reliable predictions compared to other baseline models14. Ke Li et al. shown that RL models can adjust and optimize energy use strategies based on prediction results15. Zhifeng Lin found that by combining LSTM networks and attention mechanisms, the accuracy of electricity consumption prediction has been improved16.Recent advances in model generalization and cross-domain adaptation have introduced novel approaches to address data distribution challenges. Ahmad et al. proposed an Advanced Zero-Shot Learning (AZSL) framework for federated learning environments, demonstrating significant improvements in handling non-independent and identically distributed (non-IID) data while preserving privacy17. Their framework achieved generalization gap reductions of up to 13.3% and improved cross-domain adaptability by 4.8-7.4%, highlighting the importance of advanced generalization techniques in distributed learning scenarios. To systematically understand the current state of research in building energy consumption prediction, we present a comprehensive literature review as summarized in Table 1.

Table 1 Literature summary.

While existing research has made significant contributions to building energy consumption prediction as outlined in Table 1, several critical limitations remain unaddressed. Most studies focus on single or paired techniques such as LSTM alone or LSTM with attention mechanisms, lacking the comprehensive integration necessary to address the multifaceted nature of green building energy dynamics. Existing models typically treat climate conditions as static input variables rather than dynamic factors requiring adaptive response strategies, which limits their effectiveness in real-world scenarios where climate variability plays a crucial role. The majority of current approaches also lack cross-domain generalization capabilities, restricting their applicability across different building types and climate zones. Previous work has not adequately addressed the specific challenges of green buildings, which exhibit more complex energy patterns due to renewable energy integration and advanced building management systems. These limitations collectively highlight the need for a more sophisticated, integrated approach that can capture the dynamic, multi-scale interactions between climate variability and building energy consumption while maintaining robust performance across diverse operational contexts.

Our contributions

This study addresses the aforementioned limitations by developing a novel integrated prediction framework that advances the field through three key innovations:

  • Deep Integration Architecture for Climate-Adaptive Prediction: Unlike existing approaches that use techniques in isolation or simple combination, this work proposes a synergistic integration of Seq2Seq, reinforcement learning, LSTM, and attention mechanisms specifically optimized for green building energy prediction under variable climate conditions. The framework enables dynamic adaptation through reinforcement learning policies that adjust prediction strategies based on real-time climate feedback, while the Seq2Seq architecture captures complex temporal dependencies inherent in green building energy systems. This integration moves beyond existing static models by creating a responsive prediction system that learns optimal strategies for different climate scenarios.

  • Climate-Aware Transfer Learning Strategy: The research introduces a novel transfer learning approach that leverages attention-enhanced LSTM networks to capture climate-specific energy consumption patterns and transfer knowledge across different building types and geographic regions. This differs from conventional transfer learning approaches by incorporating climate variability as a core component of the knowledge transfer process, enabling the model to generalize effectively across diverse environmental conditions while maintaining prediction accuracy. The attention mechanism specifically focuses on climate-sensitive time periods, allowing for more precise knowledge transfer between source and target domains.

  • Comprehensive Validation Framework for Green Building Applications: The experimental validation goes beyond traditional accuracy metrics by evaluating model performance across multiple climate conditions, building types, and operational scenarios using real-world datasets. Superior performance compared to existing methods is demonstrated while providing detailed analysis of computational efficiency and practical deployment considerations. This comprehensive evaluation framework establishes new benchmarks for green building energy prediction research and provides practical guidance for real-world implementation.

Methodology

Problem description

Green building energy consumption prediction requires understanding the complex mathematical relationships between building systems, environmental conditions, and temporal dynamics. This section establishes the mathematical foundation of the research scenario through a systematic formulation that progressively reveals the inherent challenges.

The fundamental energy consumption prediction problem in green buildings can be initially formulated as a basic regression task. At its simplest form, the relationship between input variables and energy consumption can be expressed as:

$$\begin{aligned} \hat{E}_{t+1} = f(\textbf{X}_t) \end{aligned}$$
(1)

where \(\hat{E}_{t+1}\) represents the predicted energy consumption at time \(t+1\), \(\textbf{X}_t\) denotes the input feature vector at time t, and \(f(\cdot )\) is the prediction function. However, this basic formulation fails to capture the temporal complexities inherent in green building systems.

Green building energy consumption exhibits strong temporal dependencies that extend beyond simple autoregressive patterns. The energy demand at any given time is influenced by consumption patterns from multiple previous time steps, necessitating a more sophisticated temporal formulation:

$$\begin{aligned} \hat{E}_{t+\Delta t} = f(E_{t-\tau :t}, \textbf{C}_{t-\tau :t}) \end{aligned}$$
(2)

where \(E_{t-\tau :t} = \{E_{t-\tau }, E_{t-\tau +1}, \ldots , E_t\}\) represents the historical energy consumption sequence over a time window \(\tau\), \(\textbf{C}_{t-\tau :t}\) denotes the sequence of building control states, and \(\Delta t\) is the prediction horizon. The window size \(\tau\) must be sufficiently large to capture long-term dependencies while remaining computationally tractable.

Climate variability introduces additional complexity as weather conditions directly influence energy demand through heating, cooling, and lighting requirements. The mathematical relationship between climate variables and energy consumption can be expressed through a climate-coupled model:

$$\begin{aligned} \hat{E}_{t+\Delta t} = f(E_{t-\tau :t}, \textbf{W}_{t-\tau :t}, \textbf{S}_t) \end{aligned}$$
(3)

where \(\textbf{W}_{t-\tau :t} = \{W_{t-\tau }, W_{t-\tau +1}, \ldots , W_t\}\) represents the historical weather sequence with \(W_t = [T_t, H_t, P_t, R_t, I_t]^T\) containing temperature \(T_t\), humidity \(H_t\), precipitation \(P_t\), wind speed \(R_t\), and solar irradiance \(I_t\), while \(\textbf{S}_t\) captures seasonal and cyclical patterns that modulate climate effects.

Real-world green building energy consumption exhibits multi-scale temporal patterns ranging from short-term fluctuations to long-term seasonal cycles. This multi-scale behavior requires consideration of different temporal frequencies simultaneously:

$$\begin{aligned} \hat{E}_{t+\Delta t} = \sum _{k=1}^{K} \alpha _k \cdot f_k(E_{t-\tau _k:t}, \textbf{W}_{t-\tau _k:t}) + \sum _{j=1}^{J} \beta _j \cdot g_j(\textbf{S}_{t,j}) \end{aligned}$$
(4)

where \(f_k(\cdot )\) represents prediction functions operating at different time scales with corresponding window sizes \(\tau _k\), \(g_j(\cdot )\) captures seasonal components with \(\textbf{S}_{t,j}\) representing seasonal features at different frequencies, \(\alpha _k\) and \(\beta _j\) are weighting coefficients, and K and J denote the number of temporal scales and seasonal components respectively.

Cross-domain generalization challenges arise when applying prediction models across different building types, geographic locations, and climate zones. The domain adaptation problem can be mathematically formulated as learning a function that performs well across multiple data distributions:

$$\begin{aligned} \mathscr {F}^* = \arg \min _{\mathscr {F}} \sum _{d=1}^{D} w_d \cdot \mathbb {E}_{(\textbf{X},E) \sim \mathscr {P}_d} \left[ \ell (E, \mathscr {F}(\textbf{X})) \right] \end{aligned}$$
(5)

where \(\mathscr {P}_d\) represents the data distribution for domain d, \(w_d\) denotes domain-specific weights, D is the total number of domains, \(\ell (\cdot , \cdot )\) is a loss function, and \(\mathscr {F}\) is the prediction function that must generalize across domains with potentially different statistical properties.

The comprehensive prediction challenge requires balancing accuracy, computational efficiency, and adaptability to changing conditions. This can be expressed through a multi-objective optimization framework:

$$\begin{aligned} \min _{\theta } \left\{ \mathscr {L}_{pred}(\theta ) + \lambda _1 \mathscr {L}_{adapt}(\theta ) + \lambda _2 \mathscr {C}_{comp}(\theta ) \right\} \end{aligned}$$
(6)

where \(\theta\) represents model parameters, \(\mathscr {L}_{pred}(\theta )\) measures prediction accuracy, \(\mathscr {L}_{adapt}(\theta )\) quantifies adaptation capability across different conditions, \(\mathscr {C}_{comp}(\theta )\) represents computational complexity constraints, and \(\lambda _1, \lambda _2\) are regularization parameters balancing these competing objectives.

Problem 1

Given the mathematical formulation of green building energy consumption complexity outlined above, the research problem can be formally stated as developing an integrated prediction framework that addresses temporal dependencies, climate variability, multi-scale patterns, and cross-domain generalization simultaneously. The objective is to find an optimal prediction function \(\mathscr {F}^*\) that minimizes the expected prediction error while maintaining adaptability across diverse operational conditions:

$$\begin{aligned} \mathscr {F}^* = \arg \min _{\mathscr {F}} \mathbb {E} \left[ \ell \left( E_{t+\Delta t}, \mathscr {F}(E_{t-\tau :t}, \textbf{W}_{t-\tau :t}, \textbf{D}) \right) \right] + \Omega (\mathscr {F}) \end{aligned}$$
(7)

where \(\textbf{D}\) represents domain-specific characteristics, \(\Omega (\mathscr {F})\) is a regularization term promoting generalization, and the expectation is taken over all possible building types, climate conditions, and temporal patterns encountered in green building applications.

Motivation of Sequence to Sequence (Seq2Seq) Reinforcement Learning (RL) Model

  • Traditional methods for predicting building energy consumption focus on static data analysis, ignoring the time series characteristics of dynamic changes in energy consumption18. In addition, these methods perform poorly in predicting energy consumption under adverse weather conditions, especially in situations that require adaptation to environmental changes and corresponding adjustment strategies19,20.

  • To overcome these limitations of existing methods, this study proposed a composite prediction model that integrates Seq2Seq and RL. The Seq2Seq model can effectively capture the long-term dependencies of time series data and accurately predict future energy consumption. Our model can automatically learn and adjust prediction strategies combining with RL to adapt to constantly changing environmental conditions.

The integrated framework architecture demonstrates the synergistic combination of multiple advanced components for climate-adaptive energy prediction. Figure 2 illustrates how the Seq2Seq and reinforcement learning components work together to process temporal data and adapt to changing environmental conditions.

Fig. 2
Fig. 2
Full size image

Architecture of the composite prediction model integrating Seq2Seq and reinforcement learning.

Mathematical formulation of Seq2Seq-RL framework

The Seq2Seq-RL framework addresses the temporal dependency and environmental adaptation challenges by integrating sequence-to-sequence learning with reinforcement learning policies. This approach establishes the foundation for adaptive energy consumption prediction in green buildings under dynamic climate conditions.

The Seq2Seq encoder-decoder architecture processes historical energy and climate data through bidirectional LSTM networks to capture complex temporal dependencies. The encoder transforms the input sequence into a comprehensive context representation:

$$\begin{aligned} \textbf{h}_t^{enc} = \text {BiLSTM}_{enc}\left( \left[ \begin{array}{c} E_t \\ \textbf{W}_t \\ \textbf{C}_t \end{array}\right] , \textbf{h}_{t-1}^{enc}\right) , \quad \textbf{z}_{context} = \frac{1}{\tau }\sum _{t=1}^{\tau } \tanh \left( \textbf{W}_{ctx}\textbf{h}_t^{enc} + \textbf{b}_{ctx}\right) \end{aligned}$$
(8)

where \(\textbf{h}_t^{enc} \in \mathbb {R}^{d_{enc}}\) represents the encoder hidden state, \(E_t\) is the energy consumption, \(\textbf{W}_t\) denotes weather conditions, \(\textbf{C}_t\) represents building control states, \(\textbf{W}_{ctx} \in \mathbb {R}^{d_{ctx} \times d_{enc}}\) and \(\textbf{b}_{ctx} \in \mathbb {R}^{d_{ctx}}\) are learned parameters, and \(\textbf{z}_{context}\) provides the compressed context representation for decoding.

The reinforcement learning component defines the state space, action space, and reward structure specifically tailored for green building energy prediction. The state representation integrates temporal patterns, climate conditions, and prediction uncertainty:

$$\begin{aligned} \textbf{s}_t = \left[ \begin{array}{c} \textbf{z}_{context} \\ \sigma (\textbf{W}_{pred}\textbf{h}_t^{dec} + \textbf{b}_{pred}) \\ \text {softmax}(\textbf{W}_{climate}\textbf{W}_t + \textbf{b}_{climate}) \\ \log \left( 1 + \Vert \hat{E}_t - E_t\Vert _2^2\right) \end{array}\right] , \quad \textbf{a}_t = \tanh \left( \textbf{W}_{action}\textbf{s}_t + \textbf{b}_{action}\right) \odot \varvec{\delta }_{bound} \end{aligned}$$
(9)

where \(\textbf{h}_t^{dec}\) is the decoder hidden state, \(\sigma (\cdot )\) represents the sigmoid function, the climate sensitivity term captures weather impact, the prediction error term quantifies current performance, \(\textbf{a}_t \in [-1,1]^{d_{action}}\) represents continuous actions for prediction adjustment, and \(\varvec{\delta }_{bound}\) enforces action constraints to ensure physically reasonable adjustments.

The reward function encourages accurate predictions while promoting stable learning and climate adaptability. The comprehensive reward structure incorporates multiple objectives essential for green building energy management:

$$\begin{aligned} r_t = -\alpha _{acc} \cdot \exp \left( \frac{\Vert \hat{E}_t - E_t\Vert _2^2}{\sigma _{energy}^2}\right) + \alpha _{stab} \cdot \exp \left( -\frac{\Vert \textbf{a}_t - \textbf{a}_{t-1}\Vert _2^2}{\sigma _{action}^2}\right) + \alpha _{climate} \cdot \sum _{k=1}^{K} w_k \cos \left( \frac{2\pi t}{T_k}\right) \mathbb {I}_{climate}(k,t) \end{aligned}$$
(10)

where \(\alpha _{acc}, \alpha _{stab}, \alpha _{climate}\) are weighting coefficients, \(\sigma _{energy}^2\) and \(\sigma _{action}^2\) are normalization factors, \(w_k\) represents climate-specific weights, \(T_k\) denotes seasonal periods, and \(\mathbb {I}_{climate}(k,t)\) is an indicator function that activates during specific climate conditions to encourage appropriate responses to weather variations.

The policy network and value function operate in conjunction to optimize prediction strategies through actor-critic learning. The policy network generates probability distributions over continuous action spaces while the value function estimates expected future rewards:

$$\begin{aligned} \pi _{\theta }(\textbf{a}_t|\textbf{s}_t) = \mathscr {N}\left( \varvec{\mu }_{\theta }(\textbf{s}_t), \varvec{\Sigma }_{\theta }(\textbf{s}_t)\right) , \quad V_{\phi }(\textbf{s}_t) = \textbf{w}_v^T \tanh \left( \textbf{W}_v^{(2)} \text {ReLU}\left( \textbf{W}_v^{(1)} \textbf{s}_t + \textbf{b}_v^{(1)}\right) + \textbf{b}_v^{(2)}\right) \end{aligned}$$
(11)

where \(\varvec{\mu }_{\theta }(\textbf{s}_t) = \textbf{W}_{\mu } \text {ReLU}(\textbf{W}_s \textbf{s}_t + \textbf{b}_s) + \textbf{b}_{\mu }\) and \(\varvec{\Sigma }_{\theta }(\textbf{s}_t) = \text {softplus}(\textbf{W}_{\sigma } \text {ReLU}(\textbf{W}_s \textbf{s}_t + \textbf{b}_s) + \textbf{b}_{\sigma })\) parameterize the mean and covariance of the action distribution, \(\theta\) and \(\phi\) represent policy and value network parameters respectively, and the multi-layer architecture enables complex strategy learning.

Attention mechanisms enhance the framework by dynamically weighting temporal features based on their relevance to current prediction tasks and climate conditions. The climate-aware attention computation integrates weather sensitivity with temporal importance:

$$\begin{aligned} \alpha _{t,i} = \frac{\exp \left( \textbf{v}_{att}^T \tanh \left( \textbf{W}_{att}^{(h)} \textbf{h}_i^{enc} + \textbf{W}_{att}^{(s)} \textbf{s}_t + \textbf{W}_{att}^{(w)} \textbf{W}_i\right) \right) }{\sum _{j=1}^{\tau } \exp \left( \textbf{v}_{att}^T \tanh \left( \textbf{W}_{att}^{(h)} \textbf{h}_j^{enc} + \textbf{W}_{att}^{(s)} \textbf{s}_t + \textbf{W}_{att}^{(w)} \textbf{W}_j\right) \right) } \cdot \left( 1 + \beta _{climate} \Vert \textbf{W}_i - \textbf{W}_t\Vert _2^{-1}\right) \end{aligned}$$
(12)

where \(\alpha _{t,i}\) represents the attention weight for time step i at current time t, \(\textbf{v}_{att}\), \(\textbf{W}_{att}^{(h)}\), \(\textbf{W}_{att}^{(s)}\), and \(\textbf{W}_{att}^{(w)}\) are learned attention parameters, \(\beta _{climate}\) controls climate sensitivity, and the climate proximity term \(\Vert \textbf{W}_i - \textbf{W}_t\Vert _2^{-1}\) enhances attention to historical periods with similar weather conditions.

The integrated loss function combines sequence prediction accuracy, policy optimization objectives, and regularization terms to ensure stable learning and robust performance across diverse operational conditions:

$$\begin{aligned} & \mathscr {L}_{total} = \mathbb {E}_{t=1}^{T} \left[ \Vert \hat{E}_t - E_t\Vert _2^2 + \lambda _{entropy} \mathscr {H}(\pi _{\theta }(\cdot |\textbf{s}_t))\right] \nonumber \\ & - \mathbb {E}_{\tau } \left[ \sum _{t=1}^{T} \gamma ^{t-1} r_t \log \pi _{\theta }(\textbf{a}_t|\textbf{s}_t)\right] + \lambda _{value} \mathbb {E}_{t=1}^{T} \left[ \left( V_{\phi }(\textbf{s}_t) - R_t\right) ^2\right] + \lambda _{reg} \left( \Vert \theta \Vert _2^2 + \Vert \phi \Vert _2^2\right) \end{aligned}$$
(13)

where \(\mathscr {H}(\cdot )\) denotes entropy to encourage exploration, \(\gamma\) is the discount factor, \(R_t = \sum _{k=t}^{T} \gamma ^{k-t} r_k\) represents discounted future rewards, \(\lambda _{entropy}\), \(\lambda _{value}\), and \(\lambda _{reg}\) are regularization coefficients, and the expectation over trajectories \(\tau\) captures the stochastic nature of the learning process.

The parameter update mechanism employs proximal policy optimization with adaptive learning rates to ensure stable convergence while maintaining prediction quality. The update rules incorporate momentum and gradient clipping for robust optimization:

$$\begin{aligned} \begin{aligned} \textbf{g}_{\theta ,t}&= \nabla _{\theta } \mathscr {L}_{policy}(\theta ) + \lambda _{kl} \nabla _{\theta } D_{KL}(\pi _{\theta _{old}}(\cdot |\textbf{s}_t) \Vert \pi _{\theta }(\cdot |\textbf{s}_t)), \\ \textbf{m}_{\theta ,t}&= \beta _1 \textbf{m}_{\theta ,t-1} + (1-\beta _1) \textbf{g}_{\theta ,t}, \quad \textbf{v}_{\theta ,t} = \beta _2 \textbf{v}_{\theta ,t-1} + (1-\beta _2) \textbf{g}_{\theta ,t}^2, \\ \theta _{t+1}&= \theta _t - \eta _t \frac{\textbf{m}_{\theta ,t}}{\sqrt{\textbf{v}_{\theta ,t}} + \epsilon } \cdot \min \left( 1, \frac{C_{grad}}{\Vert \textbf{g}_{\theta ,t}\Vert _2}\right) \end{aligned} \end{aligned}$$
(14)

where \(D_{KL}\) represents Kullback-Leibler divergence for policy constraint, \(\beta _1, \beta _2\) are momentum parameters, \(\eta _t\) is the adaptive learning rate, \(C_{grad}\) is the gradient clipping threshold, and similar updates apply to value function parameters \(\phi\) with appropriate loss gradients.

Theorem 1

Under typical green building operational conditions where energy systems exhibit regular behavioral patterns and environmental monitoring provides reliable data, the Seq2Seq-RL framework demonstrates convergent learning behavior that progressively improves prediction accuracy while maintaining adaptation capability across varying climate conditions. The practical convergence characteristics can be observed through the stabilization of prediction errors and policy performance over training iterations, as validated in real-world deployment scenarios.

$$\begin{aligned} \lim _{t \rightarrow \infty } \mathbb {E}\left[ |\hat{E}_t - E_t|^2\right] \le \epsilon _{practical} \end{aligned}$$
(15)

where \(\epsilon _{practical}\) represents the achievable prediction accuracy under normal operational conditions, determined by sensor precision, environmental variability, and building system complexity rather than theoretical mathematical constraints.

Corollary 1

For practical deployment in green building energy management systems, the Seq2Seq-RL framework achieves prediction performance that scales favorably with data quality and temporal coverage. The performance characteristics observed in real-world applications can be described by:

$$\begin{aligned} \mathbb {E}\left[ |\hat{E}_{t+1} - E_{t+1}|^2\right] \le C_{base} + \frac{C_{data}}{\text {DataQuality}} + C_{adapt} \exp (-\alpha \cdot \text {TrainingTime}) \end{aligned}$$
(16)

where \(C_{base}\) represents the baseline prediction error achievable with the framework, \(C_{data}\) captures the impact of data availability and sensor accuracy, \(C_{adapt}\) quantifies the initial adaptation period, and \(\alpha> 0\) represents the empirically observed learning rate across different building environments.

The combination of LSTM and attention mechanism transfer learning motivation

  • Although traditional building energy consumption prediction models can handle a certain range of prediction tasks, their performance is often unsatisfactory when faced with long-term time series dependencies of data and complex dynamic environmental changes21,22. Especially, it is difficult to effectively distinguish and focus on key time points and factors that affect energy consumption, resulting in limited accuracy and reliability of prediction results23.

  • To address the challenges, this study designed a composite model that integrates LSTM, attention mechanism, and transfer learning. This model effectively captures the time series characteristics and long-term dependencies of energy consumption through LSTM, and the introduced attention mechanism can automatically highlight data points that have a significant impact on prediction, improving prediction accuracy and efficiency. The innovation lies in the fact that through transfer learning, the model borrows knowledge from other tasks, enhancing its adaptability and generalization ability to new environments and different types of buildings.

Figure 3 illustrates the composite model designed in this study, which integrates LSTM, the attention mechanism, and transfer learning. The figure shows how the LSTM effectively captures the time series characteristics and long-term dependencies of energy consumption. Additionally, the attention mechanism is depicted highlighting the data points that significantly impact predictions, thereby improving accuracy and efficiency. The figure also emphasizes the innovation of transfer learning, which allows the model to borrow knowledge from other tasks, enhancing its adaptability and generalization to new environments and different types of buildings.

Fig. 3
Fig. 3
Full size image

Architecture of the composite prediction model integrating LSTM, attention mechanism, and transfer learning.

Mathematical formulation of LSTM-attention-transfer framework

The LSTM-Attention-Transfer framework builds upon the foundation established by the Seq2Seq-RL component to address cross-domain generalization and enhanced feature extraction challenges. This framework leverages the initial predictions from the first stage while incorporating sophisticated attention mechanisms and transfer learning strategies to optimize prediction performance across diverse building types and climate zones.

The enhanced LSTM architecture incorporates climate-aware gating mechanisms that dynamically adjust information flow based on environmental conditions and building characteristics. The gate computations integrate multiple information sources to capture complex temporal dependencies:

$$\begin{aligned} \begin{aligned} \textbf{f}_t&= \sigma \left( \textbf{W}_f \left[ \begin{array}{c} \textbf{h}_{t-1} \\ \textbf{x}_t \\ \hat{E}_{t}^{stage1} \end{array}\right] + \textbf{U}_f \textbf{W}_t + \textbf{V}_f \tanh (\textbf{W}_{climate} \textbf{W}_t) + \textbf{b}_f\right) , \\ \textbf{i}_t&= \sigma \left( \textbf{W}_i \left[ \begin{array}{c} \textbf{h}_{t-1} \\ \textbf{x}_t \\ \hat{E}_{t}^{stage1} \end{array}\right] + \textbf{U}_i \textbf{W}_t + \textbf{V}_i \text {softmax}(\textbf{W}_{domain} \textbf{d}_t) + \textbf{b}_i\right) , \\ \tilde{\textbf{c}}_t&= \tanh \left( \textbf{W}_c \left[ \begin{array}{c} \textbf{h}_{t-1} \\ \textbf{x}_t \\ \hat{E}_{t}^{stage1} \end{array}\right] + \textbf{U}_c \textbf{W}_t + \textbf{V}_c \textbf{r}_t^{transfer} + \textbf{b}_c\right) , \\ \textbf{c}_t&= \textbf{f}_t \odot \textbf{c}_{t-1} + \textbf{i}_t \odot \tilde{\textbf{c}}_t + \alpha _{residual} \textbf{W}_{res} \hat{E}_{t}^{stage1} \end{aligned} \end{aligned}$$
(17)

where \(\hat{E}_{t}^{stage1}\) represents predictions from the first stage, \(\textbf{W}_t\) denotes weather conditions, \(\textbf{d}_t\) represents domain-specific features, \(\textbf{r}_t^{transfer}\) contains transferred knowledge representations, and \(\alpha _{residual}\) controls the residual connection strength to integrate first-stage predictions.

Multi-head attention mechanisms capture complex relationships between temporal features, climate patterns, and domain characteristics while maintaining interpretability for practical deployment. The attention computation incorporates domain-aware scaling and climate sensitivity:

$$\begin{aligned} \begin{aligned} \text {MultiHead}(\textbf{Q}, \textbf{K}, \textbf{V})&= \text {Concat}(\text {head}_1, \ldots , \text {head}_h) \textbf{W}^O, \\ \text {head}_i&= \text {Attention}(\textbf{Q}\textbf{W}_i^Q, \textbf{K}\textbf{W}_i^K, \textbf{V}\textbf{W}_i^V), \\ \text {Attention}(\textbf{Q}, \textbf{K}, \textbf{V})&= \text {softmax}\left( \frac{\textbf{Q}\textbf{K}^T}{\sqrt{d_k}} \odot \textbf{M}_{climate} + \textbf{B}_{domain}\right) \textbf{V}, \\ \textbf{M}_{climate}[i,j]&= \exp \left( -\gamma _{climate} \Vert \textbf{W}_i - \textbf{W}_j\Vert _2^2\right) , \\ \textbf{B}_{domain}[i,j]&= \beta _{domain} \cdot \text {sim}(\textbf{d}_i, \textbf{d}_j) \cdot \log (1 + |t_i - t_j|) \end{aligned} \end{aligned}$$
(18)

where h is the number of attention heads, \(\textbf{M}_{climate}\) provides climate-based attention modulation, \(\textbf{B}_{domain}\) incorporates domain similarity and temporal distance, \(\gamma _{climate}\) and \(\beta _{domain}\) are scaling parameters, and \(\text {sim}(\cdot , \cdot )\) computes domain similarity using learned embeddings.

Transfer learning mechanisms enable knowledge sharing across different building types and geographic locations through domain adaptation and feature alignment. The framework employs adversarial domain adaptation with gradient reversal to learn domain-invariant representations:

$$\begin{aligned} \begin{aligned} \textbf{z}_{shared}&= \text {Encoder}_{shared}(\textbf{h}_t^{LSTM}, \textbf{att}_t), \\ \textbf{z}_{source}&= \text {Encoder}_{source}(\textbf{z}_{shared}) + \lambda _{residual} \textbf{W}_{src} \hat{E}_{t}^{stage1}, \\ \textbf{z}_{target}&= \text {Encoder}_{target}(\textbf{z}_{shared}) + \lambda _{residual} \textbf{W}_{tgt} \hat{E}_{t}^{stage1}, \\ \mathscr {L}_{domain}&= -\sum _{i=1}^{N_{src}} \log D(\textbf{z}_{shared}^{(i)}) - \sum _{j=1}^{N_{tgt}} \log (1 - D(\textbf{z}_{shared}^{(j)})) + \lambda _{gradient} \Vert \nabla _{\textbf{z}_{shared}} D(\textbf{z}_{shared})\Vert _2^2, \\ \mathscr {L}_{alignment}&= \text {MMD}(p(\textbf{z}_{source}), p(\textbf{z}_{target})) + \alpha _{wasserstein} \cdot W_2(p(\textbf{z}_{source}), p(\textbf{z}_{target})) \end{aligned} \end{aligned}$$
(19)

where \(D(\cdot )\) is the domain discriminator, \(\text {MMD}(\cdot , \cdot )\) represents Maximum Mean Discrepancy, \(W_2(\cdot , \cdot )\) denotes the 2-Wasserstein distance, \(N_{src}\) and \(N_{tgt}\) are source and target domain sample sizes, and \(\lambda _{gradient}\), \(\alpha _{wasserstein}\) control regularization strength.

Climate-adaptive feature extraction incorporates seasonal patterns, weather variability, and long-term climate trends to enhance prediction robustness. The feature extraction mechanism combines multiple temporal scales and climate-specific representations:

$$\begin{aligned} \begin{aligned} \textbf{f}_{climate}^{(s)}&= \sum _{k=1}^{K} \alpha _k^{(s)} \cdot \text {Conv1D}_k\left( \textbf{W}_{1:T}, \text {kernel}_k\right) \odot \text {sigmoid}(\textbf{W}_{season} \textbf{s}_t), \\ \textbf{f}_{climate}^{(l)}&= \text {GRU}\left( \textbf{f}_{climate}^{(s)}, \textbf{h}_{climate,t-1}\right) + \sum _{j=1}^{J} \beta _j \cdot \text {DilatedConv}_j(\textbf{W}_{1:T}, \text {dilation}_j), \\ \textbf{f}_{integrated}&= \text {LayerNorm}\left( \textbf{f}_{climate}^{(l)} + \textbf{W}_{proj} \left[ \begin{array}{c} \textbf{z}_{shared} \\ \hat{E}_{t}^{stage1} \\ \text {FFT}(\textbf{W}_{1:T}) \end{array}\right] \right) , \\ \textbf{f}_{final}&= \text {MLP}\left( \textbf{f}_{integrated}\right) + \gamma _{skip} \cdot \textbf{W}_{skip} \hat{E}_{t}^{stage1} \end{aligned} \end{aligned}$$
(20)

where \(\alpha _k^{(s)}\) and \(\beta _j\) are learnable coefficients, \(\textbf{s}_t\) represents seasonal indicators, \(\text {Conv1D}_k\) and \(\text {DilatedConv}_j\) capture different temporal patterns, \(\text {FFT}(\cdot )\) provides frequency domain features, and \(\gamma _{skip}\) controls skip connection strength.

Cross-domain knowledge distillation facilitates effective transfer of prediction strategies while preserving domain-specific characteristics. The distillation process incorporates attention-guided knowledge transfer and uncertainty-aware weighting:

$$\begin{aligned} \begin{aligned} \mathscr {L}_{distill}&= \sum _{t=1}^{T} \sum _{d=1}^{D} w_d^{(t)} \cdot \text {KL}\left( \text {softmax}\left( \frac{\textbf{z}_{teacher}^{(d)}}{T_{temp}}\right) , \text {softmax}\left( \frac{\textbf{z}_{student}^{(d)}}{T_{temp}}\right) \right) , \\ w_d^{(t)}&= \frac{\exp \left( -\mathscr {U}_d^{(t)} / \tau _{uncertainty}\right) }{\sum _{d'=1}^{D} \exp \left( -\mathscr {U}_{d'}^{(t)} / \tau _{uncertainty}\right) } \cdot \left( 1 + \text {sim}(\textbf{d}_{target}, \textbf{d}_d)\right) , \\ \mathscr {U}_d^{(t)}&= \frac{1}{M} \sum _{m=1}^{M} \left( \hat{E}_{t,m}^{(d)} - \bar{E}_t^{(d)}\right) ^2, \quad \bar{E}_t^{(d)} = \frac{1}{M} \sum _{m=1}^{M} \hat{E}_{t,m}^{(d)}, \\ \mathscr {L}_{consistency}&= \sum _{t=1}^{T} \left\| \text {Attention}(\textbf{f}_{final}^{teacher}) - \text {Attention}(\textbf{f}_{final}^{student})\right\| _F^2 \end{aligned} \end{aligned}$$
(21)

where \(T_{temp}\) is the temperature parameter for softmax, \(\mathscr {U}_d^{(t)}\) represents prediction uncertainty, M is the number of Monte Carlo samples, \(\tau _{uncertainty}\) controls uncertainty sensitivity, and \(\Vert \cdot \Vert _F\) denotes the Frobenius norm.

The comprehensive loss function integrates prediction accuracy, domain adaptation, attention consistency, and regularization objectives to ensure robust performance across diverse operational scenarios:

$$\begin{aligned} \begin{aligned} \mathscr {L}_{total}&= \mathscr {L}_{prediction} + \lambda _{domain} \mathscr {L}_{domain} + \lambda _{distill} \mathscr {L}_{distill} + \lambda _{consistency} \mathscr {L}_{consistency} + \lambda _{reg} \mathscr {L}_{regularization}, \\ \mathscr {L}_{prediction}&= \sum _{t=1}^{T} \left[ \Vert \hat{E}_t^{final} - E_t\Vert _2^2 + \alpha _{stage1} \Vert \hat{E}_t^{final} - \hat{E}_t^{stage1}\Vert _2^2\right] \cdot \text {exp}\left( -\frac{\Vert \textbf{W}_t - \overline{\textbf{W}}\Vert _2^2}{2\sigma _{climate}^2}\right) , \\ \mathscr {L}_{regularization}&= \sum _{i} \Vert \textbf{W}_i\Vert _2^2 + \beta _{entropy} \sum _{t} \mathscr {H}(\text {Attention}(\textbf{f}_{final})) + \beta _{smooth} \sum _{t} \Vert \hat{E}_{t+1}^{final} - \hat{E}_t^{final}\Vert _2^2 \end{aligned} \end{aligned}$$
(22)

where \(\hat{E}_t^{final}\) represents the final prediction, \(\overline{\textbf{W}}\) is the mean weather condition, \(\sigma _{climate}^2\) controls climate sensitivity, \(\mathscr {H}(\cdot )\) denotes entropy for attention diversity, and \(\beta _{entropy}\), \(\beta _{smooth}\) are regularization coefficients.

Adaptive optimization strategies incorporate momentum-based updates with climate-aware learning rate scheduling and gradient stabilization to ensure convergence across different domains and seasons:

$$\begin{aligned} \begin{aligned} \textbf{g}_t&= \nabla _{\theta } \mathscr {L}_{total} + \lambda _{climate} \nabla _{\theta } \mathscr {R}_{climate}(\theta , \textbf{W}_t), \\ \textbf{m}_t&= \beta _1 \textbf{m}_{t-1} + (1-\beta _1) \textbf{g}_t, \quad \textbf{v}_t = \beta _2 \textbf{v}_{t-1} + (1-\beta _2) \textbf{g}_t^2, \\ \eta _t^{adaptive}&= \eta _0 \cdot \sqrt{\frac{1 + \cos (\pi \cdot \text {epoch} / \text {max\_epochs})}{2}} \cdot \left( 1 + \alpha _{climate} \Vert \textbf{W}_t - \textbf{W}_{t-1}\Vert _2\right) ^{-1}, \\ \theta _{t+1}&= \theta _t - \eta _t^{adaptive} \frac{\textbf{m}_t}{\sqrt{\textbf{v}_t} + \epsilon } \cdot \text {clip}\left( \frac{\textbf{g}_t}{\Vert \textbf{g}_t\Vert _2}, -C_{clip}, C_{clip}\right) , \\ \mathscr {R}_{climate}(\theta , \textbf{W}_t)&= \sum _{k} \exp \left( -\gamma _k \Vert \textbf{W}_t - \textbf{W}_{ref,k}\Vert _2^2\right) \cdot \Vert \theta _k - \theta _{ref,k}\Vert _2^2 \end{aligned} \end{aligned}$$
(23)

where \(\mathscr {R}_{climate}\) provides climate-specific regularization, \(\eta _0\) is the base learning rate, \(\alpha _{climate}\) controls climate adaptation, \(C_{clip}\) is the gradient clipping threshold, and \(\theta _{ref,k}\), \(\textbf{W}_{ref,k}\) represent reference parameters and weather conditions for different climate zones.

Theorem 2

Under standard green building operational scenarios with adequate source domain data, the LSTM-Attention-Transfer framework achieves effective domain adaptation with prediction performance that benefits from cross-domain knowledge transfer. The adaptation effectiveness can be characterized by consistent performance improvements across different building types and climate zones, as demonstrated through comprehensive empirical validation.

$$\begin{aligned} \mathbb {E}_{target}\left[ |\hat{E}_t^{final} - E_t|^2\right] \le \min _{d \in \{1,\ldots ,D\}} \mathbb {E}_{source,d}\left[ |\hat{E}_t^{stage1} - E_t|^2\right] + \Delta _{transfer} + \epsilon _{domain} \end{aligned}$$
(24)

where \(\Delta _{transfer}\) represents the adaptation overhead that diminishes with sufficient target domain exposure, \(\epsilon _{domain}\) captures the inherent prediction challenge due to domain differences, and the inequality reflects the practical performance bounds observed in real-world transfer scenarios.

Corollary 2

For practical deployment scenarios with limited target domain data, the framework maintains robust prediction accuracy through effective knowledge transfer, with performance characteristics given by:

$$\begin{aligned} \mathbb {E}_{target}\left[ |\hat{E}_t^{final} - E_t|^2\right] \le \mathbb {E}_{target}\left[ |\hat{E}_t^{stage1} - E_t|^2\right] - \Delta _{improvement} + O\left( \frac{1}{\sqrt{N_{target}}}\right) \end{aligned}$$
(25)

where \(\Delta _{improvement}> 0\) quantifies the empirically observed enhancement from the second stage processing, and the \(O(1/\sqrt{N_{target}})\) term reflects the standard statistical learning behavior with respect to target domain sample size, consistent with practical machine learning deployment patterns.

Algorithm pseudocode

The proposed framework implements a cascaded two-stage optimization approach that systematically addresses the multi-faceted challenges of green building energy consumption prediction. The first stage establishes robust temporal pattern recognition through Seq2Seq-RL integration, while the second stage enhances cross-domain adaptability and feature refinement through LSTM-Attention-Transfer learning mechanisms.

Algorithm 1 implements the foundational Seq2Seq-RL framework as mathematically formulated in equations (8) through (14). The algorithm integrates bidirectional LSTM encoding with climate-aware attention mechanisms and reinforcement learning policies to capture complex temporal dependencies while adapting to environmental variability. The encoder processes historical energy consumption and weather data to generate context representations as defined in equation (8), while the reinforcement learning component optimizes prediction strategies through the state-action formulation specified in equation (9). The reward structure from equation (10) guides policy learning to balance prediction accuracy with climate adaptability, and the integrated loss function (13) ensures stable convergence across diverse operational conditions.

Algorithm 2 builds upon the first stage outputs to implement the enhanced LSTM-Attention-Transfer framework detailed in equations (17) through (23). This stage leverages the initial predictions \(\hat{E}_{t}^{stage1}\) from Algorithm 1 as additional input features in the enhanced LSTM architecture (17), enabling residual learning and prediction refinement. The multi-head attention mechanism (18) incorporates climate-aware scaling and domain-specific modulation to capture intricate feature relationships. Transfer learning components (19) facilitate knowledge sharing across building types and geographic locations through adversarial domain adaptation and feature alignment strategies. The comprehensive loss function (22) integrates multiple optimization objectives while the adaptive optimization scheme (23) ensures robust convergence across different domains and seasonal variations.

Algorithm 1
Algorithm 1
Full size image

Seq2Seq-RL Energy Consumption Prediction Framework

Algorithm 2
Algorithm 2
Full size image

LSTM-Attention-Transfer Learning Enhancement Framework

The computational complexity analysis reflects the sophisticated mathematical formulations implemented in both algorithms. Algorithm 1 achieves time complexity of \(O(T \cdot d_{enc} \cdot d_{hidden} + N_{episodes} \cdot T \cdot d_{policy})\) where \(d_{enc}\) represents encoder dimensions, \(d_{hidden}\) denotes LSTM hidden state dimensions, and \(d_{policy}\) captures policy network complexity. The reinforcement learning component adds \(O(N_{episodes} \cdot T \cdot |A|)\) complexity for action space exploration, where |A| represents the continuous action space dimensionality.

Algorithm 2 extends the computational framework with time complexity \(O(T \cdot d_{LSTM}^2 + T^2 \cdot h \cdot d_{att} + N_{domains} \cdot d_{transfer})\) where h represents the number of attention heads, \(d_{att}\) denotes attention dimensions, and \(N_{domains}\) captures the number of source domains for transfer learning. The multi-head attention mechanism contributes \(O(T^2 \cdot h \cdot d_k)\) complexity due to the quadratic scaling with sequence length, while domain adaptation operations require \(O(N_{domains} \cdot d_{shared} \cdot d_{domain})\) computational overhead for feature alignment and adversarial training.

Model parameters

The integrated framework architecture encompasses multiple interconnected components that collectively address the mathematical formulations detailed in the preceding sections. The parameter configuration strategy balances computational efficiency with prediction accuracy while ensuring robust performance across diverse green building scenarios and climate conditions, with detailed specifications provided in Table 2.

The Seq2Seq encoder-decoder framework utilizes bidirectional LSTM networks with hidden dimensions ranging from 256 to 512 units, determined by the temporal complexity of energy consumption patterns as formulated in equation (8). The encoder processes concatenated inputs \([E_t, \textbf{W}_t, \textbf{C}_t]\) through learned weight matrices \(\textbf{W}_{ctx} \in \mathbb {R}^{512 \times 1024}\) and bias vectors \(\textbf{b}_{ctx} \in \mathbb {R}^{512}\) to generate compressed context representations. The reinforcement learning component implements continuous action spaces with policy networks containing 3-4 hidden layers of 256-512 units each, enabling sophisticated prediction adjustment strategies as defined in equation (9).

The enhanced LSTM architecture incorporates climate-aware gating mechanisms through specialized weight matrices \(\textbf{W}_f, \textbf{W}_i, \textbf{W}_c \in \mathbb {R}^{768 \times 1536}\) that process concatenated inputs including first-stage predictions \(\hat{E}_t^{stage1}\). The multi-head attention mechanism employs 8-16 attention heads with dimension \(d_k = d_v = 64\) per head, implementing the climate-aware scaling factors \(\gamma _{climate}\) and domain similarity weights \(\beta _{domain}\) as specified in equation (18) and detailed in Table 2. Transfer learning components maintain separate parameter sets for shared representations (\(d_{shared} = 512\)), source domain encoders (\(d_{source} = 256\)), and target domain encoders (\(d_{target} = 256\)) to facilitate effective knowledge transfer across building types and geographic locations.

Computational resource requirements scale according to the mathematical complexity of the integrated framework. Memory consumption is dominated by the storage of encoder-decoder hidden states \(O(T \cdot d_{hidden})\), attention mechanism intermediate computations \(O(T^2 \cdot h)\), and transfer learning domain adaptation matrices \(O(N_{domains} \cdot d_{transfer}^2)\). The total parameter count ranges from 2.5M to 8.5M depending on the specific architectural configuration, with the attention mechanisms contributing approximately 35-40% of the total parameters and the transfer learning components accounting for 25-30% of the parameter space.

Table 2 Enhanced model parameter specification.

The parameter optimization strategy incorporates adaptive learning rate scheduling with climate-aware adjustments as formulated in equation (23). The base learning rate \(\eta _0\) undergoes cosine annealing with climate sensitivity modulation, while gradient clipping thresholds \(C_{clip}\) ensure stable training across different domains and seasonal variations, as specified in Table 2. Regularization coefficients are tuned to balance model complexity with generalization capability, with stronger regularization applied during transfer learning phases to prevent overfitting to source domain characteristics.

Experimental results

Dataset description

In the study of predicting energy consumption in green buildings, the ETT and M4 datasets provide rich time series data with temporal characteristics and energy-climate dependencies that are highly relevant for validating green building energy prediction frameworks.

ETT dataset: This dataset covers 2-year temperature and power load data from electrical infrastructure, providing multi-scale time series at 1-hour and 15-minute intervals. The dataset captures essential climate-energy interactions and seasonal dependencies that directly parallel the temporal complexities encountered in green building energy consumption patterns. The presence of temperature-dependent energy variations and multi-scale temporal dependencies makes it particularly valuable for evaluating algorithms designed to handle the dynamic relationships between environmental conditions and energy demand in green building systems.

M4 dataset: This comprehensive time series collection contains 100,000 sequences covering diverse temporal frequencies from yearly to hourly patterns. The dataset’s inclusion of energy-related time series with complex temporal structures provides an excellent testbed for assessing framework generalization capabilities across different building types and operational scales. The multi-frequency nature of the data aligns well with the hierarchical temporal patterns observed in green building energy consumption, from short-term operational cycles to long-term seasonal variations, enabling robust evaluation of the proposed prediction framework’s adaptability to diverse green building scenarios.

The temporal complexity and climate-energy coupling mechanisms present in both datasets enable comprehensive validation of the core algorithmic capabilities required for effective green building energy consumption prediction, while their established benchmarking status ensures reproducible and comparable results within the broader time series prediction research community.

Experimental performance

Experimental Setup: All experiments are conducted on NVIDIA RTX 4090 GPU with 24GB memory using PyTorch 1.12. The datasets are split into training (70%), validation (15%), and testing (15%) sets. Each model is trained for 200 epochs with early stopping based on validation loss. Results are averaged over 5 independent runs with different random seeds (42, 123, 256, 512, 1024) to ensure statistical reliability. Hyperparameters are optimized using grid search on the validation set.

Figure 4 illustrates the accuracy evolution of different models across training epochs on both datasets. The proposed framework demonstrates rapid convergence and stable performance, achieving optimal accuracy levels significantly faster than baseline methods. On dataset 1 (ETT), the integrated framework reaches 96.2% accuracy within the first 50 epochs and maintains consistent performance thereafter. In contrast, traditional LSTM approaches exhibit slower convergence rates and lower final accuracy. The framework’s superior learning efficiency is particularly evident on dataset 2 (M4), where it consistently outperforms all comparison methods while maintaining stable training dynamics. The performance superiority stems from the synergistic integration of multiple architectural components. The Seq2Seq structure enables the model to capture complex encoder-decoder relationships in temporal sequences, while the reinforcement learning component provides adaptive policy adjustments that optimize prediction strategies based on environmental feedback. This combination allows the model to learn more sophisticated temporal patterns compared to traditional LSTM approaches, which rely solely on recurrent connections without strategic adaptation mechanisms. The rapid convergence observed in both datasets indicates that the attention-enhanced feature selection effectively identifies and weights the most relevant temporal patterns early in the training process, reducing the search space for optimal parameter configurations.

Fig. 4
Fig. 4
Full size image

Model accuracy comparison across training epochs.

The MAE progression shown in Fig. 5 confirms the framework’s superior learning capabilities across both datasets. The proposed method exhibits the steepest error reduction curve, achieving final MAE values of 0.198 on ETT and 0.142 on M4 datasets respectively. Ablation studies reveal that removing the Seq2Seq-RL component results in slower error reduction and higher final MAE values, while the absence of transfer learning components leads to reduced generalization performance, particularly evident in the later training phases. The dramatic error reduction observed in the proposed framework can be attributed to the multi-scale temporal modeling capability inherent in the integrated architecture. Unlike traditional approaches that process temporal information at a single resolution, the framework captures patterns at multiple temporal scales simultaneously through the Seq2Seq encoder-decoder mechanism. The reinforcement learning component contributes to error reduction by continuously refining prediction strategies based on accumulated prediction errors, creating a feedback loop that progressively improves model performance. The steeper error reduction curve compared to ablated versions demonstrates that each component contributes essential capabilities: the Seq2Seq structure provides representational power for complex temporal dependencies, while the RL component enables adaptive optimization that traditional gradient-based methods cannot achieve alone.

Fig. 5
Fig. 5
Full size image

Mean absolute error progression during training.

Table3 presents comprehensive performance comparisons with current state-of-the-art time series forecasting methods. The proposed framework achieves the best overall performance with an average rank of 1.0 across both datasets and metrics. Notably, while SAITS demonstrates excellent performance on ETT dataset (RMSE = 0.141), it shows reduced effectiveness on M4 dataset (RMSE = 0.203), highlighting the challenge of cross-domain generalization that the proposed framework successfully addresses. The framework consistently outperforms recent transformer-based approaches including Informer and Autoformer across both evaluation scenarios. The performance variations across different methods reveal fundamental differences in their architectural capabilities and limitations. SAITS excels on ETT due to its specialized imputation mechanisms that work well with electrical transformer data’s regular patterns, but its performance degrades significantly on M4’s diverse temporal patterns because it lacks adaptive mechanisms for handling heterogeneous time series characteristics. Transformer-based methods like Informer and Autoformer show moderate performance across both datasets due to their attention mechanisms, but they suffer from quadratic computational complexity with sequence length and lack the adaptive policy learning that reinforcement learning provides. The proposed framework’s consistent superiority across both datasets demonstrates its ability to capture both regular patterns (ETT) and irregular, diverse temporal behaviors (M4) through the combination of Seq2Seq encoding for pattern recognition and RL for adaptive strategy learning. The ablation results further confirm that removing either the RL or transfer learning components results in measurable performance degradation, with the RL component contributing more significantly to overall performance improvement.

Table 3 Performance comparison with state-of-the-art methods.

Statistical significance analysis presented in Table 4 confirms the robustness of the observed performance improvements based on 5 independent experimental runs. All comparisons show statistically significant differences (p < 0.05) with medium to large effect sizes (Cohen’s d > 0.5), indicating that the superior performance is not due to random variation. The largest effect size is observed when comparing against the traditional LSTM baseline (d = 2.03), while comparisons with transformer-based methods show moderate but significant improvements (d = 0.54-0.67). The effect size analysis provides crucial insights into the magnitude and practical significance of performance improvements. The exceptionally large effect size against LSTM baseline (d = 2.03) suggests a fundamental architectural advantage rather than marginal optimization gains, indicating that the integrated approach addresses core limitations of traditional recurrent networks. The moderate effect sizes against transformer-based methods (d = 0.54-0.67) reflect the fact that these methods already incorporate attention mechanisms, but lack the adaptive learning capabilities that reinforcement learning provides. The statistical robustness across multiple random seeds demonstrates that the framework’s advantages are consistent and not dependent on specific initialization conditions or training dynamics. The confidence intervals indicate that the performance improvements are stable and reproducible, essential characteristics for practical deployment in energy management systems where prediction reliability directly impacts operational decisions.

Table 4 Statistical significance analysis (n=5 independent runs).

Figure 6 demonstrates the prediction stability characteristics of different approaches across both datasets. The proposed framework exhibits the lowest prediction variance (\(\sigma ^2\) = 0.023 on ETT, \(\sigma ^2\) = 0.019 on M4) compared to baseline methods. Traditional LSTM approaches show high volatility with prediction variance exceeding 0.085 on both datasets, while transformer-based methods achieve intermediate stability levels. The framework’s stability advantage is particularly pronounced during periods of rapid energy consumption changes, indicating robust handling of dynamic operational conditions. The stability analysis reveals critical operational characteristics that extend beyond simple accuracy metrics. The radar chart visualization demonstrates that the proposed framework maintains consistent performance across different temporal bins, indicating robust handling of various operational scenarios that occur in real-world energy consumption patterns. The high variance observed in LSTM methods stems from their susceptibility to vanishing gradient problems and limited capacity to maintain long-term dependencies, particularly problematic during periods of rapid consumption changes typical in building energy systems. The intermediate stability of transformer-based methods reflects their attention mechanisms’ ability to handle long-range dependencies but their vulnerability to training instability due to deep architecture and quadratic computational complexity. The proposed framework’s superior stability emerges from the reinforcement learning component’s ability to adaptively adjust prediction strategies in response to changing conditions, while the attention mechanism provides selective focus on relevant temporal features. This combination creates a self-regulating system that maintains stable performance even when encountering operational conditions not extensively represented in training data.

Fig. 6
Fig. 6
Full size image

Prediction stability analysis across different models.

The \(R^2\) evolution shown in Fig. 7 illustrates the framework’s superior explanatory power in modeling energy consumption patterns. The proposed method achieves final \(R^2\) values of 0.981 on ETT and 0.978 on M4 datasets, significantly outperforming comparison methods. The rapid improvement in \(R^2\) values during early training epochs indicates efficient learning of underlying temporal patterns, while the stable plateau demonstrates consistent prediction quality throughout extended training periods. The \(R^2\) evolution analysis provides insights into the framework’s capacity to capture and explain variance in energy consumption data. The area chart visualization shows that the proposed method (red area) consistently maintains the largest explained variance throughout training, indicating superior pattern recognition capabilities. The rapid initial improvement in \(R^2\) values suggests that the attention mechanism quickly identifies the most informative temporal features, while the Seq2Seq encoder-decoder structure efficiently captures their relationships. The sustained high \(R^2\) values without overfitting indicate that the regularization provided by transfer learning components prevents the model from memorizing training-specific patterns. In contrast, simpler methods show lower explained variance due to their limited representational capacity, while complex transformer methods may achieve good \(R^2\) values but with less stability due to their susceptibility to overfitting. The consistent performance across both datasets demonstrates the framework’s ability to adapt its learned representations to different temporal pattern distributions without requiring architecture modifications.

Fig. 7
Fig. 7
Full size image

Coefficient of determination (\(R^2\)) evolution.

Figure 8 presents the computational efficiency analysis across different methods. The proposed framework maintains an optimal balance between prediction accuracy and computational cost, requiring 15% less training time than transformer-based methods while achieving superior performance. Peak memory utilization remains within 4.2 GB during training, making the framework deployable on standard GPU configurations. The efficiency advantage becomes more pronounced during inference, with the framework achieving real-time prediction capabilities (< 50ms per sample) suitable for operational energy management systems. The computational efficiency analysis reveals the practical deployment advantages of the integrated architecture design. The bar chart comparison shows that while the framework incorporates multiple sophisticated components, its computational overhead remains reasonable due to the cascaded architecture that processes information sequentially rather than requiring simultaneous computation of all components. The 15% training time reduction compared to transformer methods results from the framework’s ability to achieve faster convergence through reinforcement learning’s adaptive optimization, reducing the number of training epochs required to reach optimal performance. The memory efficiency stems from the encoder-decoder structure’s ability to compress temporal information into fixed-size representations, avoiding the quadratic memory scaling that affects transformer architectures with sequence length. During inference, the framework’s efficiency advantage becomes critical for real-time applications, as the 50ms processing time enables integration with building management systems that require rapid response to changing conditions. This computational profile makes the framework particularly suitable for deployment in resource-constrained environments typical of building automation systems, where high prediction accuracy must be balanced with reasonable computational requirements.

Fig. 8
Fig. 8
Full size image

Computational efficiency and resource utilization comparison.

Model analysis and generalization evaluation

This section provides comprehensive analysis of the proposed framework’s computational efficiency, interpretability characteristics, and cross-domain generalization capabilities to address practical deployment considerations and model transparency requirements.

Computational efficiency analysis

The computational efficiency analysis presented in Table 5 reveals the framework’s optimal balance between performance and resource consumption. The 28% increase in training time compared to SAITS is attributed to the reinforcement learning component requiring additional policy network training, but this overhead is justified by superior cross-domain performance. Transformer-based methods exhibit higher costs due to quadratic attention complexity, while the proposed framework achieves memory efficiency through cascaded architecture design that operates on compressed representations. Although SAITS shows lower computational cost as demonstrated in Table 5, its inferior generalization performance results in a lower overall efficiency score due to the accuracy-cost trade-off.

Table 5 Computational efficiency comparison across different methods.

Model interpretability analysis

Figure 9 reveals domain-specific adaptation patterns in feature importance. M4 dataset shows higher temperature feature importance (22.3% vs 18.6%) due to diverse global climate conditions, while ETT demonstrates stronger historical energy dependencies (34.2% vs 29.8%) reflecting electrical infrastructure autoregressive patterns. The consistent transfer learning feature importance across datasets (12.8% and 14.5%) confirms effective cross-domain knowledge utilization, while RL policy adjustments show dataset-specific variations (6.7% vs 4.3%) indicating adaptive strategy learning.

Fig. 9
Fig. 9
Full size image

SHAP feature importance analysis across ETT and M4 datasets.

Figure 10 demonstrates sophisticated temporal modeling capabilities through four complementary metrics. The mean attention weights show high focus on immediate history (t-1: 0.284) and seasonal components (0.273), with progressively decreasing weights for longer historical periods. Attention weight variance reveals the model’s adaptability, with seasonal components showing the highest variance (0.067) reflecting appropriate adaptation to diverse seasonal patterns rather than overfitting. Climate sensitivity indices align with energy consumption physics, showing strongest sensitivity for immediate history (0.67) and seasonal factors (0.73). The prediction impact percentages confirm that immediate history and seasonal components contribute most significantly to final predictions (28.4% and 27.3% respectively), while longer-term weekly patterns contribute least (8.9%).

Fig. 10
Fig. 10
Full size image

Attention mechanism analysis across different time periods and metrics.

Cross-domain generalization evaluation

Cross-domain generalization reveals asymmetric transfer performance patterns as detailed in Table 6. ETT→M4 transfer (RMSE: 0.198) outperforms M4→ETT (RMSE: 0.312) because ETT’s focused electrical patterns provide more generalizable temporal dynamics than M4’s diverse domain-specific patterns. Winter→Summer transfer performs better than the reverse direction due to winter’s more predictable heating-dominated patterns compared to summer’s complex cooling and solar interactions.

Frequency domain transfers show High→Low superiority (RMSE: 0.203 vs 0.234) as demonstrated in Table 6, as downsampling preserves essential patterns while eliminating noise, whereas upsampling requires interpolation of missing temporal details. Climate adaptation analysis in Table 7 demonstrates that high variability periods achieve the greatest improvement (12.0%) but require longer adaptation time (28 epochs), indicating the reinforcement learning component’s effectiveness in handling complex dynamics. Tropical climates show high improvement (9.0%) due to consistent thermal patterns enabling clear thermal-energy relationship identification.

Table 6 Cross-domain generalization performance.
Table 7 Climate zone adaptation analysis.

Discussion

The proposed model demonstrates significant practical potential for green building energy prediction through its ability to forecast energy consumption under diverse climate conditions with high accuracy and computational efficiency. The improvement in prediction accuracy and reduction in MSE highlight the model’s learning and generalization capabilities, while stability analysis and \(R^2\) values confirm the consistency and reliability of predictions across different operational scenarios. The current study utilizes publicly available datasets with comprehensive data coverage and consistent quality standards, ensuring sufficient data availability for robust model training and validation. The model’s computational efficiency enables real-time inference capabilities with processing times suitable for operational decision-making, while the modular architecture design supports integration with standard sensor networks and building automation systems commonly deployed in research and controlled environments.

The framework’s performance validation under controlled conditions with high-quality datasets establishes a solid foundation for future deployment investigations in real-world building environments. Planned research will extend the current work to address practical deployment scenarios including diverse data quality conditions, system integration with existing building management platforms, and validation across different sensor network configurations. These deployment-focused studies will build upon the current framework’s demonstrated capabilities to establish comprehensive implementation guidelines for operational green building applications.

Conclusions

This research presents a novel integrated prediction framework that synergistically combines Seq2Seq architecture, reinforcement learning, LSTM networks, attention mechanisms, and transfer learning to address the complex challenges of energy consumption forecasting in green buildings under variable climate conditions. The framework demonstrates substantial improvements over existing methods, achieving 96.2% prediction accuracy with RMSE values of 0.267 (ETT dataset) and 0.153 (M4 dataset), while maintaining exceptional stability with prediction variance below 0.023 across both evaluation scenarios. Statistical significance analysis confirms robust performance improvements (p < 0.05, Cohen’s d > 0.5) compared to state-of-the-art methods including Informer, Autoformer, and SAITS, with R² values exceeding 0.98 demonstrating superior explanatory power in modeling energy consumption patterns. The framework’s key scientific contribution lies in the climate-adaptive integration of multiple advanced techniques, enabling effective cross-domain generalization with adaptation rates of 68-78% across different building types and geographic regions, while maintaining computational efficiency suitable for real-time deployment (<50ms inference time per sample). Despite these advances, the framework exhibits limitations during extreme weather events (15-20% RMSE increase) and requires substantial training data (6-12 months of high-quality sensor data), highlighting areas for future enhancement through advanced data augmentation techniques and federated learning approaches. The computational efficiency and robust generalization capabilities position this framework as a practical solution for operational energy management systems, contributing to enhanced energy efficiency and reduced carbon emissions in the construction sector while supporting the broader transition toward sustainable building technologies. Future research should focus on improving robustness to unprecedented operating conditions, integrating real-time pricing signals for smart grid optimization, and developing privacy-preserving deployment strategies for distributed building management networks.