Abstract
The regulation of indoor thermal comfort is a critical aspect of smart building design, significantly influencing energy efficiency and occupant well-being. Traditional comfort models, such as Fanger’s equation and adaptive approaches, often fall short in capturing individual occupant preferences and the dynamic nature of indoor environmental conditions. To overcome these limitations, we introduce a Digital Twin-driven framework integrated with an advanced attention-based Long Short-Term Memory (LSTM) model specifically tailored for personalised thermal comfort prediction and intelligent HVAC control. The attention mechanism focuses on critical temporal features, enhancing both predictive performance and interpretability. The Digital Twin enables the real-time simulation of indoor environments and occupant responses, facilitating proactive comfort management. We use a subset of the ASHRAE Global Thermal Comfort Database II and apply extensive pre-processing, including median-based data imputation and feature normalisation. The proposed model categorises Thermal Sensation Votes (TSVs) recorded on a 7-point ASHRAE scale into three classes: Uncomfortably Cold (UC) for TSV \(\le -1\), Neutral (N) for TSV = 0, and Uncomfortably Warm (UW) for TSV \(\ge +1\). The model achieves a test accuracy of 83.8%, surpassing previous state-of-the-art methods. Furthermore, Explainable AI (XAI) techniques, such as SHAP and LIME, are integrated to enhance transparency and interpretability, complemented by scenario-based energy efficiency analyses to evaluate energy-comfort trade-offs. This comprehensive approach provides a robust, interpretable, and energy-efficient solution for occupant-centric HVAC management in smart building systems.
Introduction
A Digital Twin is a virtual representation of a physical system that continuously updates based on real-time data from sensors and other sources1,2,3. In the context of buildings, Digital Twins create dynamic models of indoor environments, enabling predictive analytics and adaptive control strategies4,5. By integrating IoT-enabled sensors, machine learning models, and cloud computing, Digital Twins facilitate real-time monitoring, anomaly detection, and optimization of various building parameters, including thermal comfort, energy consumption, and system performance6.
Modern smart buildings strive to optimize both energy efficiency and occupant comfort, recognizing that indoor environments have a significant impact on sustainability and human well-being. With individuals spending approximately 80-90% of their time indoors, maintaining a comfortable climate consumes a substantial portion of energy resources7,8. The building sector accounts for about 40% of global energy usage, with heating, ventilation, and air-conditioning (HVAC) systems being major contributors9,10. Integrating Digital Twin technology into smart buildings enhances HVAC efficiency by dynamically predicting and adjusting indoor conditions in real time. Unlike static models, Digital Twins continuously learn occupant preferences and environmental changes, optimizing comfort while reducing energy waste. This intelligent approach improves climate control, enhances occupant well-being, and minimizes unnecessary HVAC operations11,12. Thermal comfort is the state of satisfaction with the thermal environment, influenced by physiological, psychological, and environmental factors13. Key environmental parameters include temperature, humidity, and airflow, while personal traits like metabolism and clothing also play a role (see Fig. 1). Psychological factors, such as expectations and past experiences, further impact comfort. When indoor conditions fail to meet expectations, stress and distraction can reduce well-being and productivity14.
Conventional thermal comfort models, such as Fanger’s Predicted Mean Vote (PMV) equation and the adaptive comfort model, have long been used to predict comfort levels, but they have notable limitations. Fanger’s PMV model estimates the average thermal sensation of a large population based on six factors. Still, it struggles to predict comfort in real-world settings where occupants adapt their behaviour. Adaptive comfort models address some of these shortcomings by considering acclimatization to outdoor weather and occupant control actions. However, they are primarily applicable to naturally ventilated buildings and provide limited guidance for mixed-mode or mechanically conditioned spaces15,16. Machine learning (ML) and deep learning techniques have recently emerged as powerful tools for enhancing HVAC control and personalizing thermal comfort in smart buildings. With the proliferation of IoT sensors and building management systems, vast data on indoor climate and occupant feedback can now be leveraged by AI-driven algorithms17,18. Unlike traditional models, ML approaches can discover complex, nonlinear relationships between environmental inputs and comfort responses, continuously refining predictions as more data becomes available19. Researchers have explored support vector machines, random forest regressors, and neural networks to predict thermal sensation and learn comfort preferences beyond the assumptions of PMV20,21. However, previous machine learning (ML) models for thermal comfort prediction face challenges, such as the need for extensive, high-quality data and the tendency to treat thermal comfort as a static regression or classification problem, thereby neglecting the temporal aspect of comfort.
Motivation
To address the limitations in existing thermal comfort modelling and HVAC control strategies, we propose an attention-based Long Short-Term Memory (LSTM) model tailored for smart building environments. LSTM networks effectively capture temporal dependencies in sequential data, allowing the model to learn how an occupant’s comfort evolves in response to changing indoor conditions. By integrating an attention mechanism, our model dynamically focuses on the most influential features at each timestep, enhancing both predictive accuracy and interpretability. Furthermore, we incorporate Digital Twin technology to create a real-time, virtual representation of the building that simulates environmental dynamics, occupant interactions, and HVAC system behaviour. This integration enables proactive comfort management through predictive simulations and personalized control strategies, optimizing energy use while maintaining occupant satisfaction. The model also supports explainability through SHAP and LIME analyses, offering insights into the importance of individual features and their contribution to each prediction. Our approach classifies thermal comfort into three core categories: Uncomfortably Cold (UC), Neutral (N), and Uncomfortably Warm (UW), and accurately predicts transitions between them. This comprehensive framework supports real-time decision-making, energy efficiency analysis, and personalized comfort modelling, making it a robust solution for next-generation smart building systems.
The increasing demand for sustainability and energy efficiency in smart buildings requires advancements in personalized thermal comfort prediction. HVAC systems consume 40% of building energy, making their optimization crucial. Traditional thermal comfort models often fail to adapt to dynamic occupant preferences, resulting in inefficiencies. While machine learning has improved predictions, challenges remain in modelling sequential dependencies and balancing energy use with comfort. Digital Twin technology offers a solution by creating a real-time virtual replica of building systems, enabling continuous monitoring and adaptive control. Combined with LSTM networks and attention mechanisms, this approach enhances thermal comfort prediction by capturing temporal dependencies. This research develops a data-driven HVAC optimization framework that ensures energy savings and improved occupant comfort in smart buildings.
Research contributions
- Energy Consumption Classification using Digital Twins and AI: Designed a system connecting IoT sensor data streams through MQTT to an AI classifier, enabling immediate categorization of building energy efficiency. We provide an Attention-Based LSTM framework that accurately predicts Thermal Sensation Votes (TSVs) using environmental and personal factors, improving occupant-specific thermal comfort modelling across three key comfort classes: Uncomfortably Cold (UC), Neutral (N), and Uncomfortably Warm (UW). While the current model is trained on the ASHRAE II dataset due to its validated and diverse real-world measurements, the system architecture is designed to be fully compatible with real-time IoT sensor integration, enabling future deployment in live smart building environments.
- Digital Twin Environment: Developed a dynamic Digital Twin environment that simulates real-time environmental conditions and personal comfort responses. This system enables continuous monitoring, adaptive HVAC control, and proactive decision-making based on predictive modelling and virtual experimentation.
- Explainable Reasoning: Applied SHAP and LIME techniques to analyze feature importance and explain individual predictions. This provides interpretable insights into how various environmental and personal features influence thermal comfort outcomes and model behaviour.
- Scalability and Integration: Conducted scenario-based energy efficiency evaluations by comparing room-level versus personalized thermal adjustments within the Digital Twin. Identified optimal comfort zones that minimize energy use while maintaining thermal satisfaction and built a seamless integration of IoT sensor streams with the prediction model and control strategies, enabling real-time thermal comfort estimation, visualization, and feedback-based adaptation in smart buildings.
Organization
This paper is organized as follows: Section 2 reviews related work on thermal comfort models and machine learning applications. Section 3 outlines our methodology, detailing data pre-processing, feature selection, and the design of the Attention-Based LSTM model. Section 4 presents the results, comparing our model’s performance with that of existing methods. Section 5 discusses the implications of our findings and potential limitations. Section 6 concludes the study and suggests directions for future research.
Related work
A Digital Twin creates a virtual representation of physical building systems, enabling real-time data collection, simulation, and predictive control to enhance energy efficiency and occupant comfort. Several research works have integrated machine learning with Digital Twin frameworks to develop adaptive climate control strategies. The study in22 evaluates various deep-learning models to predict indoor temperatures, aiming to develop digital twins for HVAC systems. The proposed deep neural network model achieved the highest accuracy with an average RMSE of 0.160 °C, making it a strong candidate for digital twin development.
The study23 presents a multivariate attention-based bi-directional long short-term memory (BiLSTM) encoder-decoder neural network for predicting the performance of hybrid ventilation systems in buildings. The model effectively captures temporal dependencies and multivariate correlations, enhancing prediction accuracy for indoor climate control. The comprehensive review in24 explores the application of digital twin technology in enhancing thermal comfort and energy efficiency in buildings. It discusses current advancements, challenges, and future directions, highlighting the integration of real-time data and simulation models for optimized building performance. The work in25 proposes a digital twin framework utilizing deep learning models, including LSTM and BiLSTM, for indoor temperature prediction in smart buildings. The approach demonstrates improved temperature management of HVAC systems, offering benefits for smart building operations. The study in26 introduces an attention-LSTM architecture enhanced with Bayesian hyperparameter optimization to predict indoor temperatures. The model effectively captures temporal patterns and variable importance, resulting in improved prediction accuracy for building climate control.
The research in27 integrates edge computing, digital twins, and deep learning to model indoor climates in intelligent buildings. The study demonstrates that deploying parametric digital twins and predictive models on edge devices enhances real-time climate control, ensuring compliance with privacy regulations. The study in28 presents a digital twin framework for grey-box modeling of thermal dynamics in multi-story residential buildings. The approach combines physical modeling with data-driven techniques to accurately predict indoor temperatures, aiding in the efficient management of HVAC systems. The integration of Long Short-Term Memory (LSTM) models in Digital Twin frameworks has become a transformative approach to enhancing thermal comfort within smart buildings, representing a significant advancement in smart building management. The study in29 introduces a multi-head LSTM model that incorporates both physical and environmental data variations. Controlled experiments with six participants demonstrated the model’s potential in accurately predicting individual thermal comfort, suggesting its applicability in optimizing indoor environments.
The research in30 focuses on developing a cloud platform that integrates Building Information Modeling (BIM) data into a Digital Twin framework. By analyzing data from temperature and humidity sensors, the study aims to predict thermal comfort levels, enhancing real-time monitoring and control within building environments. The study in31 presents a Digital Twin application that visualizes a smart area in South Korea and uses a deep learning model for personal thermal comfort analysis. The integration of 3D geospatial data and real-time environmental measurements provides insights into individual comfort levels, aiding in the management and reduction of building energy consumption. The study in32 introduces a predictive model combining Convolutional Neural Networks (CNN) and Multivariate Long Short-Term Memory (M-LSTM) networks, optimized using Bayesian methods, to forecast energy usage in commercial buildings. The model effectively captures local and temporal features, leading to an 8% improvement in mean absolute percentage error (MAPE) and a 2% increase in R-squared score across multiple datasets, enhancing HVAC system efficiency. The study in33 leverages Building Information Models (BIM) and indoor localization data to predict occupants’ thermal preferences. By integrating spatial-temporal data into a graph network structure, the proposed model achieves a 14-28% accuracy improvement over baseline models, highlighting the significance of spatial context in thermal comfort prediction. The research in34 presents a method that combines Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks to predict thermal comfort in indoor airflow environments. The CNN captures spatial features of airflow, while the LSTM models temporal dependencies, resulting in improved prediction performance and enhanced responsiveness of the HVAC system.
The study in35 develops a digital twin application for smart city visualization, integrating a Mixture-of-Expert (MoE) model architecture with Long Short-Term Memory (LSTM) and Liquid Neural Networks (LNN) to analyze personal thermal comfort. The application facilitates real-time analysis and management of building energy consumption, providing a robust framework for future smart city innovations.
Recent advancements in personalized thermal comfort modeling have increasingly focused on improving adaptability in data-scarce or evolving environments using advanced machine learning techniques. The study in36 introduced an active learning framework that selectively queries the most informative occupant data to efficiently train comfort prediction models, significantly reducing the burden of data labeling while improving personalization. Similarly, the work in37 proposed a transfer learning approach that leverages knowledge from previously trained models across buildings and occupant groups, allowing accurate predictions even with limited data in new settings. Building on both strategies, the study in38 presented a hybrid approach combining active and transfer learning, demonstrating enhanced model performance and generalizability by efficiently adapting to individual occupant preferences with minimal input data. These techniques offer promising directions for further strengthening the adaptability of comfort models, and we recognize their relevance for future extensions of our Digital Twin-enabled framework.
Methodology
Figure 2 presents a proposed personalized thermal comfort prediction framework that integrates an attention-enhanced LSTM model with a Digital Twin simulation environment. The core model comprises a Bidirectional LSTM with 64 units, a multi-head self-attention mechanism, and dense layers, enabling it to capture complex spatiotemporal patterns from environmental inputs and personal attributes, such as age, clothing level, and activity. To enhance transparency, explainable AI techniques, specifically SHAP and LIME, are employed, offering both global feature importance and individualized prediction insights. The Digital Twin replicates real-time building conditions, systematically varies environmental parameters (e.g., temperature, humidity), and generates 2D comfort probability maps to identify optimal comfort zones. The system also conducts energy efficiency analysis by comparing the effects of room-level and personal adjustments, supporting energy-conscious decision-making. Thermal comfort is categorized into three classes: Uncomfortably Cold (UC), Neutral (N), and Uncomfortably Warm (UW), resulting in a robust, interpretable, and energy-efficient solution for personalized comfort management.
Algorithm 1 starts with data pre-processing, handling missing values through removal of NaN columns and median imputation. The dataset is normalized using a min-max scaler, with backward fill applied to the year and day columns. Label encoding is used for categorical variables, non-numerical columns are excluded, and a standard scaler ensures consistency. Feature extraction derives environmental and personal attributes for digital twin integration, improving personalized comfort predictions. The dataset is split (80-20) for training and testing, with inputs reshaped into a 3D format for deep learning. The Attention-based LSTM model captures sequential dependencies, leveraging an attention layer to enhance feature significance. The architecture includes a 64-neuron LSTM layer, a fully connected layer with 32 neurons, and a softmax output layer for three-class classification: uncomfortably cold, neutral, and uncomfortably warm. Training is performed using the Adam optimizer, Categorical Cross-Entropy loss, 5-fold cross-validation, 50 epochs, and a batch size of 32 for stable convergence. Model performance is evaluated using accuracy, precision, recall, F1-score, confusion matrices, and ROC curves, which confirm superior results compared to existing benchmarks in personalized comfort classification.
Algorithm 2, the Digital Twin for Thermal Comfort Prediction, creates a virtual representation of a building environment to simulate and optimize occupant comfort in real time. It continuously collects sensor data and personal factors, updates the virtual environment, and uses a trained model to predict thermal comfort levels. By simulating variations in temperature and humidity, it generates 2D comfort maps and identifies optimal comfort zones where neutral comfort is most likely with minimal adjustments. It also analyses energy efficiency by comparing room-level versus personal-level adjustments and provides real-time control recommendations based on comfort predictions and energy trade-offs.
Dataset collection and pre-processing
In our research, we utilized a subset of the ASHRAE Global Thermal Comfort Database II, specifically focusing on HVAC-related data comprising 12,596 entries and 80 columns. This subset includes detailed environmental parameters such as air temperature, humidity, airspeed, and radiant temperature alongside subjective occupant feedback on thermal comfort. Employing this dataset is pivotal for developing a Digital Twin of HVAC systems, as it provides comprehensive real-world data essential for creating accurate virtual models. Digital Twins enable real-time simulation and predictive control of HVAC operations, leading to enhanced energy efficiency and personalized occupant comfort. By integrating this rich dataset into our Digital Twin framework, we aim to optimize HVAC system performance and improve the overall indoor environmental quality in smart buildings.
After data collection, we proceeded to the data pre-processing phase, which is crucial for ensuring the quality and usability of the dataset for thermal comfort prediction and Digital Twin simulation. The raw dataset was first cleaned by handling missing values. Numerical features were imputed using the median, while categorical features were filled with the most frequent values. For the year and day columns specifically, backward filling was applied to preserve temporal consistency. To address noise and inconsistency, similar imputation strategies were used to smooth the dataset, which was then stored as a cleaned CSV file with a final shape of 12,595 rows and 57 features. Following data cleaning, all numerical features were normalized using the StandardScaler to ensure they contributed equally to the model’s learning process. Feature engineering was then conducted to extract relevant inputs for both the predictive model and the Digital Twin simulation. Environmental features (e.g., temperature, humidity, air velocity), personal features (e.g., clothing level, metabolic rate, age), and temporal features (e.g., hour, day, season) were derived to model the contextual and physiological aspects of thermal comfort. The processed data was then split into an 80-20 train-test ratio for initial evaluation, and 5-fold cross-validation was applied to ensure robustness and reduce overfitting. Lastly, the target thermal comfort labels Uncomfortably Cold (UC), Neutral (N), and Uncomfortably Warm (UW) were encoded into numerical form, making the dataset ready for classification using the attention-based LSTM model.
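The core cleaning steps described above (median imputation, standard scaling, and mapping 7-point votes to the three comfort classes) can be sketched in a few lines of plain Python. This is a minimal illustration rather than the pipeline used in the study: the function names and sample readings are hypothetical, and rejecting non-integer votes is an assumption, since the class rule is only stated for TSV \(\le -1\), TSV = 0, and TSV \(\ge +1\).

```python
from statistics import mean, median, pstdev


def tsv_to_class(tsv):
    """Map a 7-point thermal sensation vote (-3..+3) to UC / N / UW."""
    # Assumption: only integer votes are accepted; the source rule does not
    # say how fractional votes between -1 and +1 should be handled.
    if tsv != int(tsv) or not -3 <= tsv <= 3:
        raise ValueError(f"unsupported TSV: {tsv!r}")
    if tsv <= -1:
        return "UC"
    if tsv >= 1:
        return "UW"
    return "N"


def impute_median(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def standardise(values):
    """Zero-mean, unit-variance scaling using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


air_temp = [22.0, None, 26.0, 24.0]    # made-up readings in deg C
filled = impute_median(air_temp)        # the gap is filled with the median, 24.0
scaled = standardise(filled)
labels = [tsv_to_class(v) for v in (-2, 0, 1, 3)]
```

In practice the same steps would typically be applied column by column to the full table, with the scaler fitted on the training split only so that test statistics do not leak into training.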
Attention-based LSTM model
We propose a deep learning architecture (Fig. 3) based on an Attention-enhanced Long Short-Term Memory (Attention-LSTM) network. The model captures time-dependent patterns in the environmental and personal feature space while incorporating an attention mechanism to emphasize the most informative time steps in the input sequence. The architecture comprises a two-layer bidirectional LSTM with 64 hidden units per direction and a dropout rate of 0.3 to prevent overfitting. This is followed by a multi-head attention module that projects the bidirectional LSTM output and computes attention weights, enabling the model to selectively focus on relevant temporal representations. The attention output is then flattened and passed through a series of fully connected layers: a linear layer with 128 units, followed by a ReLU activation and dropout, a second dense layer with 64 units, and finally, an output layer with 3 neurons corresponding to the thermal comfort classes: Uncomfortably Cold (UC), Neutral (N), and Uncomfortably Warm (UW). The complete architecture includes ReLU activations and dropout regularisation to enhance generalization.
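The layer stack described above can be sketched in PyTorch as follows. This is an illustrative reading of the text, not the authors' released code: the class name, the number of attention heads (4), the sequence length (8), the input width (10 features), the dropout rate in the dense head, and the ReLU after the 64-unit layer are all assumptions that the description leaves open.

```python
import torch
from torch import nn


class AttentionLSTM(nn.Module):
    """Bidirectional LSTM -> multi-head self-attention -> dense classifier."""

    def __init__(self, n_features, seq_len, n_heads=4, n_classes=3):
        super().__init__()
        # Two stacked bidirectional layers, 64 hidden units per direction.
        self.lstm = nn.LSTM(n_features, 64, num_layers=2, batch_first=True,
                            bidirectional=True, dropout=0.3)
        # Self-attention over the 128-dimensional (2 x 64) LSTM outputs.
        self.attn = nn.MultiheadAttention(embed_dim=128, num_heads=n_heads,
                                          batch_first=True)
        self.head = nn.Sequential(
            nn.Flatten(),                   # (batch, seq_len * 128)
            nn.Linear(seq_len * 128, 128),
            nn.ReLU(),
            nn.Dropout(0.3),                # assumed rate, same as the LSTM
            nn.Linear(128, 64),
            nn.ReLU(),                      # assumed activation
            nn.Linear(64, n_classes),       # logits for UC, N, UW
        )

    def forward(self, x):
        h, _ = self.lstm(x)                 # (batch, seq_len, 128)
        a, weights = self.attn(h, h, h)     # weights: (batch, seq_len, seq_len)
        return self.head(a), weights


model = AttentionLSTM(n_features=10, seq_len=8)   # hypothetical sizes
model.eval()
with torch.no_grad():
    logits, weights = model(torch.randn(4, 8, 10))
```

The module returns raw logits, so the softmax is left to the loss function (for example `nn.CrossEntropyLoss`) during training. The returned attention weights are averaged over heads and can be inspected to see which time steps the model attends to.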
For model training, we employed the Adam optimizer with a learning rate of 0.001 and used categorical cross-entropy as the loss function. The model was trained for up to 50 epochs with early stopping criteria to prevent overfitting. A batch size of 32 was used, and performance was validated using 5-fold cross-validation to ensure robustness and generalizability across different data partitions. The model evaluation included accuracy, precision, recall, and F1-score, obtained through classification reports and averaged across the validation folds. In addition, confusion matrices were used to visualize the distribution of prediction outcomes across the three thermal comfort classes.
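The two training controls mentioned here, 5-fold cross-validation and early stopping, can be illustrated with a small framework-independent sketch. The patience value, the contiguous (unshuffled) folds, and the validation losses below are assumptions chosen for illustration; the text does not state the stopping criterion used.

```python
def kfold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) lists for k contiguous, near-equal folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


folds = list(kfold_indices(12, k=5))
stopper = EarlyStopping(patience=2)
val_losses = [0.70, 0.60, 0.55, 0.56, 0.57, 0.50]   # made-up values
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

With a patience of 2, training halts at the second consecutive epoch without improvement, so the later (better) loss in the list is never reached. This is the usual trade-off between patience and the risk of stopping too early.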
Digital twin integration and XAI
The proposed framework integrates a Digital Twin (DT) to simulate environmental conditions and predict thermal comfort in real time. The DT mirrors indoor spaces and dynamically adjusts parameters such as temperature, humidity, and seasonal factors to evaluate their impact on occupant comfort. By leveraging the trained Attention-LSTM model, the DT predicts comfort boundaries, identifies optimal conditions, and generates personalized comfort profiles. It also supports energy optimization by simulating various thermal adjustment scenarios and identifying settings that balance comfort with reduced energy consumption. Explainable AI (XAI) methods were employed to enhance transparency and trust in model predictions. SHAP analysis was used to determine the global importance of features, rank contributing variables, and analyze correlations with thermal comfort outcomes. For local interpretability, LIME and SHAP force plots were used to explain individual predictions, highlighting the contribution of each feature to the final output. These explanations offer actionable insights for optimizing HVAC control and refining system performance. Additionally, energy consumption analysis was conducted through the DT to assess comfort-energy trade-offs. The system explored possible comfort states under varying conditions and calculated the energy cost of maintaining them. This enabled the identification of optimal temperature adjustments that ensure occupant comfort while minimizing energy use, supporting smarter and more sustainable building management.
Mathematical formulation of the digital twin
The proposed Digital Twin framework serves as a virtual replica of the physical building environment, enabling real-time simulation and prediction of thermal comfort. At any time step t, the environmental state is represented as a feature vector \(\textbf{E}_t \in \mathbb {R}^n\), capturing parameters such as temperature, humidity, and other relevant indoor conditions. Personal characteristics of occupant i are denoted as \(\textbf{P}_i \in \mathbb {R}^m\).
The mapping from the physical to the digital environment is defined by a function \(\mathscr {F}\), producing the Digital Twin state \(\mathscr {D}_t\) in Eq. 1:
To predict thermal comfort, we utilize an Attention-based LSTM model, which outputs a probability distribution over the thermal comfort classes \(C = \{\text{UC}, \text{N}, \text{UW}\}\) shown in Eq. 2:
\(f_{\text {LSTM}}\) denotes the deep neural model, and the attention mechanism enhances temporal relevance by computing attention weights \(\alpha _{j,t}\) for each feature j at time t in Eq. 3:
where \(e_{j,t}\) represents the learned importance score of feature j at time t.
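A common way to turn importance scores into attention weights is softmax normalisation, which makes the weights positive and sum to one. The sketch below assumes that form; the function name and the scores are hypothetical.

```python
import math


def attention_weights(scores):
    """Softmax over importance scores e_j, giving weights alpha_j that sum to 1."""
    m = max(scores)                            # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]                       # hypothetical scores e_{j,t} at one time step
alphas = attention_weights(scores)
```

Larger scores receive larger weights, and equal scores receive equal weights, so the weights can be read directly as relative importance.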
To explore various thermal conditions, the Digital Twin supports scenario simulations by adjusting environmental variables such as temperature and humidity. Modified states are calculated in Eq. 4:
where \(\textbf{u}_T\) and \(\textbf{u}_H\) are unit vectors in the temperature and humidity directions, respectively, and \(\Delta T\), \(\Delta H\) represent controlled variations.
These simulations enable the construction of thermal comfort probability maps in Eq. 5:
To determine the most favourable comfort settings, the optimal temperature and humidity adjustments \((\Delta T^{*}, \Delta H^{*})\) are selected by maximizing the probability of neutral comfort while minimizing energy consumption in Eq. 6:
\(\mathscr {E}(\Delta T, \Delta H)\) denotes the estimated energy consumption for a given adjustment, and \(\lambda\) is a regularisation parameter to balance comfort and energy efficiency.
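The scenario search can be illustrated as a small grid search over candidate adjustments. The sketch assumes the objective takes the form "neutral probability minus \(\lambda\) times energy"; `neutral_probability` and `energy_cost` are hypothetical stand-ins for the trained classifier and the energy estimate \(\mathscr{E}\), and all numeric constants are made up.

```python
import math


def neutral_probability(temp_c, rh_pct):
    """Stand-in for the trained model: P(Neutral) peaks at 23.5 deg C and 45 % RH."""
    return math.exp(-((temp_c - 23.5) / 2.5) ** 2 - ((rh_pct - 45.0) / 25.0) ** 2)


def energy_cost(d_temp, d_rh):
    """Stand-in energy model: cost grows with the size of each adjustment."""
    return abs(d_temp) + 0.1 * abs(d_rh)


def best_adjustment(temp_c, rh_pct, d_temps, d_rhs, lam=0.05):
    """Grid-search (dT, dH) maximising P(Neutral) - lam * energy."""
    comfort_map = {}                       # 2D comfort map keyed by (dT, dH)
    best, best_score = None, -math.inf
    for dt in d_temps:
        for dh in d_rhs:
            p = neutral_probability(temp_c + dt, rh_pct + dh)
            comfort_map[(dt, dh)] = p
            score = p - lam * energy_cost(dt, dh)
            if score > best_score:
                best, best_score = (dt, dh), score
    return best, comfort_map


d_temps = [-3, -2, -1, 0, 1, 2, 3]         # candidate temperature changes, deg C
d_rhs = [-10, 0, 10]                       # candidate humidity changes, % RH
best, comfort_map = best_adjustment(27.0, 45.0, d_temps, d_rhs)
```

With a small \(\lambda\) the search cools the warm room towards the comfort peak. With a large \(\lambda\) (for example `lam=1.0`) the energy term dominates and the best choice is to leave conditions unchanged, which shows how \(\lambda\) sets the comfort-energy trade-off.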
Finally, personalized thermal comfort profiles are modelled through individual-specific functions \(\phi _i\), which adjust general predictions based on historical feedback data given in Eq. 7
In this expression, \(\textbf{P}_{\text {avg}}\) represents the population average of personal factors, and \(\phi _i\) is a personalization function learned from the occupant’s historical comfort data.
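For reference, the relations referred to as Eqs. 1-7 can be written out from the symbol definitions in this section. The block below is one consistent reading, not necessarily the exact published notation. In particular, the subscripted map \(M_c\), the softmax normalisation over the index \(k\) in Eq. 3, and the additive "probability minus \(\lambda\) times energy" objective in Eq. 6 are assumptions.

```latex
\begin{align}
\mathscr{D}_t &= \mathscr{F}(\textbf{E}_t, \textbf{P}_i) \tag{1} \\
P(c \mid \mathscr{D}_t) &= \left[ f_{\text{LSTM}}(\mathscr{D}_t) \right]_c, \quad c \in C \tag{2} \\
\alpha_{j,t} &= \frac{\exp(e_{j,t})}{\sum_{k} \exp(e_{k,t})} \tag{3} \\
\textbf{E}_t' &= \textbf{E}_t + \Delta T\, \textbf{u}_T + \Delta H\, \textbf{u}_H \tag{4} \\
M_c(\Delta T, \Delta H) &= P(c \mid \textbf{E}_t', \textbf{P}_i) \tag{5} \\
(\Delta T^{*}, \Delta H^{*}) &= \arg\max_{\Delta T, \Delta H}
  \left[ M_{\text{N}}(\Delta T, \Delta H) - \lambda\, \mathscr{E}(\Delta T, \Delta H) \right] \tag{6} \\
P_i(c \mid \textbf{E}_t) &= \phi_i\!\left( P(c \mid \textbf{E}_t, \textbf{P}_{\text{avg}}) \right) \tag{7}
\end{align}
```

Read this way, Eq. 5 evaluates the classifier on the perturbed state of Eq. 4, and Eq. 6 picks the point on the neutral-class map that gives the best comfort for the least energy.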
Experimental analysis, results and discussion
The results begin with the evaluation of the Attention-based LSTM model for thermal comfort classification. Following this, the model was integrated into the Digital Twin framework, which simulated environmental variations, including temperature, humidity, and seasonal changes, to generate comfort maps and identify optimal comfort conditions. Finally, Explainable AI techniques were applied; SHAP analysis highlighted key feature contributions, while LIME and force plots provided clear explanations for individual predictions.
Thermal comfort classification results
Table 1 summarises the classification performance of the trained model on the test set. The model achieves an overall test accuracy of 83.8%, with particularly high precision and recall for the Neutral (N) class, which reflects the model’s effectiveness in capturing dominant comfort states. However, performance drops for the Uncomfortably Cold (UC) and Uncomfortably Warm (UW) classes, particularly in recall. Despite this, the high precision for UC indicates the model makes confident predictions when it does identify discomfort.
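The pattern noted here (high precision but low recall for a minority class) follows directly from how the two metrics are computed from a confusion matrix. The sketch below uses made-up counts, not the values behind Table 1, and the function name is hypothetical.

```python
def per_class_metrics(confusion, labels):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(labels)))
    report = {}
    for i, name in enumerate(labels):
        tp = confusion[i][i]
        predicted = sum(row[i] for row in confusion)   # column sum
        actual = sum(confusion[i])                     # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[name] = {"precision": precision, "recall": recall, "f1": f1}
    return correct / total, report


labels = ["UC", "N", "UW"]
confusion = [
    [60, 40, 0],     # true UC: many samples absorbed by the Neutral class
    [5, 190, 5],     # true N
    [0, 45, 5],      # true UW
]                    # made-up counts for illustration only
accuracy, report = per_class_metrics(confusion, labels)
```

In this toy matrix the UC class is rarely predicted wrongly (precision about 0.92) but 40% of true UC samples are labelled Neutral (recall 0.60), which is the same qualitative behaviour described for the trained model.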
Table 2 presents the training and validation performance of the Attention-LSTM model across selected epochs. Initially, both training and validation accuracies show gradual improvement, indicating stable learning. Significant gains are observed after epoch 15, with the model reaching its peak performance around epoch 35, where it achieves a validation accuracy of 83.53%. Early stopping is triggered at epoch 36 to prevent overfitting.
Table 3 summarizes recent works that integrate deep learning models within Digital Twin frameworks for building environment prediction and control. While most studies focus on temperature forecasting or physical system modeling, our approach emphasizes real-time classification of personalized thermal comfort states (UC, N, UW) using an Attention-Based LSTM. By incorporating environmental, personal, and temporal features, our model offers a more user-centric solution for adaptive comfort control in smart buildings.
The training and validation performance of the Attention-LSTM model is illustrated in Fig. 4 across four key evaluation plots. Fig. 4a shows model accuracy over epochs, where both training (blue) and validation (orange) accuracies remain close to 0.8 until epoch 15, then increase to approximately 0.83 by epoch 20. After that, a slight fluctuation is observed, with training accuracy reaching 0.84 and validation accuracy 0.83 by epoch 35, indicating strong generalization with minor overfitting. Fig. 4b displays model loss over epochs, where both training and validation losses initially range between 0.65 and 0.7 until epoch 15. A consistent decline follows, with losses dropping to around 0.55 by epoch 20. While training loss continues to decrease to 0.45 by epoch 35, validation loss plateaus, suggesting stable validation performance. The confusion matrix (in percentage) in Fig. 4c illustrates the model’s class-wise prediction capability, where the Neutral (N) class is predicted with high accuracy (99.3%). In contrast, the UC class shows mixed predictions, with 64.1% correctly classified and 35.9% misclassified as N. The UW class has the lowest recognition, with 97.1% misclassified as N and only 2.4% correctly predicted. Lastly, the cross-validation plot in Fig. 4d shows accuracy across five folds, with folds 1 and 4 reaching 0.8. The red horizontal line indicates the overall mean cross-validation accuracy of 0.8385, confirming the model’s consistency across multiple training splits.
Digital twin environmental factor analysis
In this section, we analyze the relationship between environmental factors and thermal comfort. Fig. 5a is a box plot illustrating the distribution of environmental feature values across thermal comfort levels. The x-axis represents the three comfort classes, Uncomfortably Cold (UC), Neutral (N), and Uncomfortably Warm (UW), while the y-axis shows the feature values (temperature, humidity, air velocity, etc.). The plot reveals clear patterns: temperature values are lowest in the UC group, highest in the UW group, and moderately distributed in the Neutral category. Humidity and air velocity also show subtle variations across the comfort levels. The plot thus shows how environmental conditions differ between perceived thermal states and highlights which features are most responsive to comfort transitions. Fig. 5b displays the temperature distribution for each class (UC, N, and UW), typically using a histogram or density curve. The UC class peaks in the lower temperature range (e.g., \(15{-}21^{\circ }\hbox {C}\)), the Neutral class is centred around mid-range temperatures (\(21{-}26^{\circ }\hbox {C}\)), and the UW class is associated with higher temperatures (above \(26^{\circ }\hbox {C}\)). This confirms that temperature is a strong indicator of perceived comfort, aligning with expectations of human thermal responses. Fig. 5c presents the corresponding distribution of relative humidity by comfort level. The Neutral class spans a broad humidity range, typically between 30% and 60%, suggesting that people tolerate various humidity levels when the temperature is within a comfortable range. The UC category tends to occur more frequently at higher humidity, while the UW class exhibits a greater spread, reflecting the complex interactions between heat and moisture perception.
Digital twin prediction analysis
This section presents a series of visualizations generated through the Digital Twin simulation to analyze thermal comfort predictions. Fig. 6a shows the count matrix, which visualizes the frequency of the predicted thermal comfort states: Uncomfortably Cold (UC), Neutral (N), and Uncomfortably Warm (UW). It helps identify which comfort state occurs most frequently across the dataset or simulation timeline. The matrix indicates that the Neutral class dominates, suggesting that indoor conditions are generally maintained within a comfortable range, while UC and UW are less frequent but still relevant for personalized control. Fig. 6b shows the Comfort Zone Temperature and Humidity Distribution, typically as a scatter or density map. It illustrates the range of temperature and humidity combinations where occupants are most likely to feel neutral. The dense cluster of points within a specific temperature-humidity region defines the optimal comfort zone. This plot is crucial for HVAC systems to target environmental setpoints that maintain occupants within this zone, improving energy efficiency and comfort.
Fig. 6c presents Personalized Comfort Profiles, showing how comfort perception varies across individuals based on their unique environmental and personal factors. It may include individual comfort ranges or probability distributions for each person, highlighting the variability in thermal preferences. This plot underscores the importance of adaptive systems that adjust settings based on occupant-specific models, rather than relying on a one-size-fits-all approach. Fig. 6d illustrates a Time-Based Thermal Comfort Transition Analysis, displaying how comfort states change over time. This temporal analysis helps identify comfort trends, transitions (e.g., from Neutral to UW), and the impact of environmental changes or control strategies. Such insights are valuable for real-time system adjustments and proactive thermal management using the Digital Twin framework.
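A time-based transition analysis of this kind can be summarised by counting how often one predicted state follows another. The sketch below is one possible way to do so and is not taken from the study; the function names and the example sequence are hypothetical.

```python
from collections import Counter


def transition_counts(states):
    """Count consecutive (previous, next) pairs in a sequence of comfort states."""
    return Counter(zip(states, states[1:]))


def transition_probabilities(states):
    """Row-normalise the counts to obtain P(next state | previous state)."""
    counts = transition_counts(states)
    totals = Counter()
    for (prev, _), n in counts.items():
        totals[prev] += n
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}


# Hypothetical predicted sequence, for illustration only (not data from the study).
sequence = ["N", "N", "UW", "UW", "N", "UC", "N", "N"]
probs = transition_probabilities(sequence)
print(probs[("N", "UW")])
```

A high probability for a pair such as (N, UW) at particular times of day would flag periods where pre-emptive cooling might be worth considering.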
Digital Twin-based thermal comfort prediction analysis: (a) frequency of predicted comfort states, (b) optimal comfort zones identified across temperature-humidity combinations, (c) individualized comfort profiles capturing occupant-specific thermal preferences, and (d) time-series analysis of comfort state transitions highlighting temporal variability.
Optimal comfort and energy saving analysis
Fig. 7a shows the Optimal Comfort Conditions, mapping the temperature and humidity values where the probability of neutral thermal comfort is highest. The results indicate that the optimal comfort zone is concentrated within a temperature range of approximately \(22^{\circ }\hbox {C}\) to \(26^{\circ }\hbox {C}\) and a relative humidity range between 30% and 55%. This defined region enables HVAC systems to maintain set points that maximize comfort while minimizing unnecessary energy consumption. Fig. 7b illustrates the Energy Saving Potential Analysis, comparing energy consumption under default HVAC settings with consumption under conditions tailored to personalized comfort profiles. The results demonstrate an apparent reduction in energy usage when personalized comfort zones are employed, indicating that occupant-centric control strategies can save energy without compromising occupant comfort. The analysis highlights the Digital Twin’s capability to support energy-efficient operations by aligning HVAC output with real-time comfort predictions. Fig. 7c compares Predicted Mean Vote (PMV) values with actual comfort feedback. The graph reveals significant discrepancies between PMV-based predictions and actual occupant-reported comfort levels, particularly in the Uncomfortably Warm (UW) and Uncomfortably Cold (UC) ranges. While PMV aligns reasonably well with neutral conditions, it fails to capture individual variations in discomfort. This supports the claim that traditional models, such as PMV, are limited in personalized settings and demonstrates the advantage of data-driven methods powered by machine learning and real occupant data.
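The reported comfort zone can be encoded as a simple range check that a controller might consult before changing a setpoint. The sketch below treats the zone as a rectangle in temperature-humidity space, which is a simplifying assumption; the class `ComfortZone` and its `clamp` rule are hypothetical and not part of the published framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComfortZone:
    t_min: float = 22.0   # deg C, approximate lower bound reported for Fig. 7a
    t_max: float = 26.0   # deg C, approximate upper bound reported for Fig. 7a
    rh_min: float = 30.0  # % relative humidity
    rh_max: float = 55.0  # % relative humidity

    def contains(self, temp_c, rh_percent):
        """True if the (temperature, humidity) pair lies inside the zone."""
        return (self.t_min <= temp_c <= self.t_max
                and self.rh_min <= rh_percent <= self.rh_max)

    def clamp(self, temp_c, rh_percent):
        """Nearest point inside the zone (a hypothetical setpoint rule)."""
        return (min(max(temp_c, self.t_min), self.t_max),
                min(max(rh_percent, self.rh_min), self.rh_max))


zone = ComfortZone()
print(zone.contains(24.0, 45.0))
print(zone.clamp(28.0, 65.0))
```

Clamping to the nearest boundary moves conditions the shortest distance needed to re-enter the zone, which is one plausible way to limit conditioning effort.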
Digital Twin-assisted evaluation of comfort and energy outcomes: (a) identification of optimal thermal zones maximizing comfort probability, (b) assessment of energy-saving potential via occupant-specific control strategies, and (c) divergence between PMV-based estimates and real occupant comfort, emphasizing the value of data-driven modelling.
Explainable AI: SHAP and LIME analysis
Figure 8 depicts the global SHAP values to highlight which features have the most significant impact on the thermal comfort predictions. The results show that temperature is the most influential feature, followed by relative humidity, air velocity, and clothing insulation. The plots also indicate the direction of influence: for instance, higher temperatures increase the likelihood of UW (Uncomfortably Warm), while lower values push predictions toward UC (Uncomfortably Cold). This confirms that the model’s decisions are aligned with expected physiological responses to environmental conditions.
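A global ranking of this kind is commonly obtained by averaging the absolute per-sample attributions for each feature. The sketch below shows that aggregation in plain Python; it does not call the SHAP library, and the attribution values are hypothetical placeholders rather than outputs of the study's model.

```python
def global_importance(shap_rows, feature_names):
    """Rank features by mean absolute attribution across samples."""
    n = len(shap_rows)
    means = [sum(abs(row[j]) for row in shap_rows) / n
             for j in range(len(feature_names))]
    return sorted(zip(feature_names, means), key=lambda kv: kv[1], reverse=True)


# Hypothetical per-sample attributions toward the UW class (not values from the study).
features = ["temperature", "humidity", "air_velocity", "clothing"]
shap_rows = [
    [0.30, -0.10, 0.05, 0.02],
    [-0.40, 0.12, -0.03, 0.04],
    [0.20, -0.08, 0.04, -0.03],
]
ranking = global_importance(shap_rows, features)
print([name for name, _ in ranking])
```

The sign of each individual attribution carries the direction of influence, while the mean absolute value carries only the magnitude, so the two views are complementary.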
Figure 8a presents the correlation between input features used in the thermal comfort prediction model. It highlights how different features, such as temperature, humidity, metabolic rate (MET), air velocity, and clothing insulation, are interrelated. The results show that while temperature and humidity exhibit some correlation, most features maintain low to moderate correlation values. This suggests that the model benefits from a diverse and complementary feature set, reducing redundancy and improving generalization. Fig. 8b shows the distribution of the MET feature across the dataset. The majority of values are clustered between 1.0 and 1.2 MET, corresponding to light activity levels, such as sitting or office work. This distribution reflects typical indoor occupant behaviour and supports the inclusion of MET as a key personal factor influencing thermal perception.
Fig. 8c displays the distribution of the AC10 feature, which may represent air conditioning usage or a derived thermal exposure metric. The distribution is more varied, indicating a wide range of environmental or behavioural conditions. Understanding how this feature is distributed across data points helps clarify its influence on comfort prediction and strengthens model interpretability. LIME explanations in Fig. 8d highlight the top contributing features for a given prediction by approximating the model locally. The result confirms that temperature, humidity, and clothing are consistently among the strongest predictors of comfort.
Digital twin comfort level analysis
Fig. 9a explores predicted comfort outcomes over a range of temperature values. The results identify the optimal comfort zone between \(22^{\circ }\hbox {C}\) and \(26^{\circ }\hbox {C}\), where the Neutral class probability is highest. Outside this range, predictions shift toward discomfort. This simulation offers actionable insights for HVAC control, enabling the maintenance of optimal thermal environments. Fig. 9b tracks how the predicted comfort level changes with rising or falling temperature. The results show a clear transition from UC at lower temperatures to Neutral and then to UW at higher temperatures. This demonstrates the model’s sensitivity and smooth responsiveness to temperature fluctuations.
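A temperature sweep of this kind can be sketched as a loop that queries a classifier at each candidate temperature and records the most probable class. In the sketch below, `toy_predict_proba` is a hand-built stand-in for the trained Attention-LSTM, with thresholds chosen by assumption; it is not the study's model, and the band it produces follows directly from those chosen thresholds.

```python
import math

CLASSES = ("UC", "N", "UW")


def toy_predict_proba(temp_c):
    """Stand-in classifier (assumption): softmax over hand-picked linear scores."""
    scores = (21.5 - temp_c, 0.0, temp_c - 26.5)  # UC, N, UW
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sweep(predict_proba, temps):
    """Return (temperature, most probable class, probabilities) per temperature."""
    rows = []
    for t in temps:
        p = predict_proba(t)
        best = CLASSES[max(range(len(CLASSES)), key=lambda i: p[i])]
        rows.append((t, best, p))
    return rows


def neutral_band(rows):
    """Lowest and highest swept temperature at which N is the top class."""
    temps = [t for t, best, _ in rows if best == "N"]
    return (min(temps), max(temps)) if temps else None


rows = sweep(toy_predict_proba, range(16, 33))
print(neutral_band(rows))
```

Replacing `toy_predict_proba` with a wrapper around a trained model, holding the other inputs fixed, would give the same kind of one-dimensional sensitivity curve.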
Digital Twin-based simulation of environmental sensitivity: (a-c) effects of temperature variation on comfort state predictions and class probability distribution, and (d-e) analysis of humidity impact on thermal comfort, highlighting nonlinear shifts in predicted probabilities under changing environmental conditions.
Figure 9c shows the full probability distribution across comfort classes as temperature changes. It reveals that Neutral comfort dominates between \(22{-}26^{\circ }\hbox {C}\), while UC spikes below \(20^{\circ }\hbox {C}\) and UW increases beyond \(27^{\circ }\hbox {C}\). This provides detailed probabilistic guidance for adaptive setpoint control within the Digital Twin. Fig. 9d analyses how humidity changes influence predicted comfort levels. The result shows that the probability of neutral comfort is highest within 30-60% relative humidity, with discomfort probabilities increasing outside this range. This emphasises the importance of humidity regulation alongside temperature. Fig. 9e provides a deeper look at class probability shifts with varying humidity.
This highlights the nonlinear effect of humidity on comfort and the value of dynamic humidity control. To evaluate the impact of comfort-based environmental adjustments on energy usage, we employ a surrogate energy estimation model within the Digital Twin framework. This model approximates HVAC energy consumption by relating deviations in temperature and humidity setpoints to estimated energy demand. The formulation is based on empirical relationships reported in prior studies (e.g., ref. 39), enabling real-time simulation of energy-comfort trade-offs without requiring high-fidelity simulation engines. While this approach does not replicate detailed system dynamics, it offers a computationally efficient means to compare control strategies under varying comfort conditions.
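The exact form of the surrogate is not given in the text. As a rough illustration of the idea only, the sketch below assumes a linear relationship between energy demand and the distance of each setpoint from the unconditioned room state. The function name, the linear form, and every coefficient and setpoint are hypothetical and are not taken from the study or from ref. 39.

```python
def surrogate_energy_kwh(temp_set, rh_set, temp_room, rh_room,
                         base_kwh=10.0, k_temp=1.5, k_rh=0.1):
    """Hypothetical linear surrogate: demand grows with setpoint deviation.

    base_kwh, k_temp (kWh per deg C) and k_rh (kWh per % RH) are placeholder
    coefficients chosen for illustration, not calibrated values.
    """
    return (base_kwh
            + k_temp * abs(temp_set - temp_room)
            + k_rh * abs(rh_set - rh_room))


# Illustrative scenario (all numbers assumed): warm, humid unconditioned room.
room = {"temp_room": 30.0, "rh_room": 60.0}
e_default = surrogate_energy_kwh(22.0, 45.0, **room)       # fixed default setpoint
e_personal = surrogate_energy_kwh(25.0, 50.0, **room)      # occupant-tailored setpoint
saving_fraction = (e_default - e_personal) / e_default
print(f"relative saving under these assumptions: {saving_fraction:.1%}")
```

The value of such a surrogate lies in ranking control strategies quickly; absolute energy figures would require calibration against measured or simulated HVAC data.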
Discussion
The results of this study demonstrate the effectiveness of combining an Attention-based LSTM model with Digital Twin simulations for personalized thermal comfort prediction in smart buildings. The model achieved high predictive accuracy (up to 87.9%) and showed strong generalization across cross-validation folds, particularly for the dominant Neutral comfort class. Through integration with a Digital Twin, the framework was able to simulate environmental variations, generate personalized comfort profiles, and identify optimal comfort zones that balance user satisfaction and energy consumption. Explainable AI methods, such as SHAP and LIME, provide transparency by revealing key contributing features, including temperature, humidity, and clothing insulation, thereby improving model interpretability and trust. However, challenges remain in improving classification performance for the minority comfort classes (UC and UW), which may benefit from advanced sampling techniques, larger datasets, or context-aware modelling. Additionally, while the current dataset reflects mechanically controlled HVAC conditions, future studies should evaluate model performance in naturally ventilated buildings and under diverse climate conditions. The framework is adaptable and can be fine-tuned using region-specific data; its attention mechanism is well-suited to learning seasonal or dynamic comfort patterns. Furthermore, in multi-occupant spaces, where thermal preferences may differ among occupants at the same time, the system could be extended using group-based modeling approaches (such as occupant clustering, comfort consensus metrics, or fairness-aware optimization) to support shared HVAC control strategies. These directions will enhance the scalability, inclusiveness, and real-world applicability of the proposed system.
Conclusion and future scope
This research presents a comprehensive framework for personalized thermal comfort prediction by integrating an Attention-based LSTM model with Digital Twin simulations and Explainable AI techniques. The proposed model achieved a high classification accuracy of 87.9%, with consistent performance across 5-fold cross-validation and test accuracy of 83.8%, outperforming traditional comfort modelling approaches. The integration of Digital Twin technology enabled real-time simulation of environmental variations, identification of optimal comfort zones (typically within \(22{-}26^{\circ }\hbox {C}\) temperature and 30-60% humidity), and generation of personalized comfort profiles. Additionally, SHAP and LIME analyses provided transparency into model predictions by highlighting the influence of key features such as temperature, humidity, and metabolic rate. The framework also demonstrated energy-saving potential by optimizing HVAC settings based on occupant-specific comfort predictions, reducing unnecessary energy usage.
Future work will focus on incorporating adaptive feedback loops, expanding the system to multi-occupant settings, and integrating real-time sensor control for continuous thermal optimization. Additionally, expanding the framework to account for outdoor environmental changes and integrating reinforcement learning for HVAC control decisions presents promising research directions.
Data availability
Data is available at [https://datadryad.org/dataset/doi:10.6078/D1F671].
References
Tao, F., Xiao, B., Qi, Q., Cheng, J. & Ji, P. Digital twin modeling. J. Manuf. Syst. 64, 372–389. https://doi.org/10.1016/j.jmsy.2022.06.015 (2022).
Singh, M. et al. Digital twin: Origin to future. Appl. Syst. Innov. https://doi.org/10.3390/asi4020036 (2021).
Tao, F., Zhang, H., Liu, A. & Nee, A. Y. C. Digital twin in industry: State-of-the-art. IEEE Trans. Ind. Inform. 15, 2405–2415. https://doi.org/10.1109/TII.2018.2873186 (2019).
Eneyew, D. D., Capretz, M. A. M. & Bitsuamlak, G. T. Toward smart-building digital twins: Bim and iot data integration. IEEE Access 10, 130487–130506. https://doi.org/10.1109/ACCESS.2022.3229370 (2022).
Ghansah, F. A. Major opportunities of digital twins for smart buildings: a scientometric and content analysis. https://www.emerald.com/insight/content/doi/10.1108/sasbe-09-2022-0192/full/html (2024). [Accessed 20-03-2025].
Hadjidemetriou, L. et al. A digital twin architecture for real-time and offline high granularity analysis in smart buildings. Sustain. Cities Soc. 98, 104795. https://doi.org/10.1016/j.scs.2023.104795 (2023).
Meimand, M. & Jazizadeh, F. A personal touch to demand response: An occupant-centric control strategy for HVAC systems using personalized comfort models. Energy Build. 303, 113769 (2024).
Khashehchi, M., Thangavel, S., Rahmanivahid, P. & Heidari, M. Energy planning and thermal comfort in buildings. In Sustainable Technologies for Energy Efficient Buildings, 258–276 (CRC Press, 2024).
Azzi, A., Tabaa, M., Chegari, B. & Hachimi, H. Balancing sustainability and comfort: A holistic study of building control strategies that meet the global standards for efficiency and thermal comfort. Sustainability 16, 2154 (2024).
Salas, A. F., Igualada, L., Farré, J., Serrano, M. & Montes, T. Enhancing user comfort in smart buildings though operational optimization. In 2024 3rd International Conference on Energy Transition in the Mediterranean Area (SyNERGY MED), 1–5 (IEEE, 2024).
Rana, M. Design and optimization of energy-efficient hvac systems for smart buildings. In International Journal for Research Publication and Seminar 15, 50–59 (2024).
Arowoiya, V. A., Onososen, A. O., Moehler, R. C. & Fang, Y. Influence of thermal comfort on energy consumption for building occupants: The current state of the art. Buildings 14, 1310 (2024).
Fabbri, K. Thermal comfort perception. A Questionnaire Approach Focusing on Children, 2nd ed.; Springer Nature: Berlin/Heidelberg, Germany (2024).
Fan, G., Chen, Y. & Deng, Q. Thermal comfort. In Personal Comfort Systems for Improving Indoor Thermal Comfort and Air Quality, 1–23 (Springer, 2023).
Fabbri, K. The indoor thermal comfort indexes pmv and ppd. In Thermal Comfort Perception: A Questionnaire Approach Focusing on Children, 83–135 (Springer, 2024).
Ze Ze, C. B., Nneme Nneme, L. & Monkam, L. Pmvd/ppdd model for predicting thermal comfort in air-conditioned buildings in hot and humid regions of sub-saharan africa. International Journal of Air-Conditioning and Refrigeration 32, 19 (2024).
Boutahri, Y. & Tilioua, A. Machine learning-based predictive model for thermal comfort and energy optimization in smart buildings. Results Eng. 22, 102148 (2024).
Bresa, A. Human-centered predictive control in buildings using personalized comfort data-driven models. Ph.D. thesis, University of Zagreb. Faculty of Mechanical Engineering and Naval Architecture (2024).
Haghirad, M., Heidari, S. & Hosseini, H. Advancing personal thermal comfort prediction: A data-driven framework integrating environmental and occupant dynamics using machine learning. Build. Environ. 262, 111799 (2024).
Assymkhan, N. & Kartbayev, A. Advanced iot-enabled indoor thermal comfort prediction using svm and random forest models. Int. J. Adv. Comput. Sci. & Appl. 15 (2024).
Ribeiro, B., Silva, R., Mota, B., Gomes, L. & Vale, Z. Learning-based models for intelligent control over air conditioning units in a smart building. In International Conference on Soft Computing Models in Industrial and Environmental Applications, 197–207 (Springer, 2024).
Norouzi, P., Maalej, S. & Mora, R. Applicability of deep learning algorithms for predicting indoor temperatures: Towards the development of digital twin hvac systems. Buildings 13, 1542. https://doi.org/10.3390/buildings13061542 (2023).
Chaudhary, G., Johra, H., Georges, L. & Austbø, B. Predicting the performance of hybrid ventilation in buildings using a multivariate attention-based bilstm encoder-decoder neural network. arXiv preprint arXiv:2302.04126 (2023).
Feng, Y., Chen, J. & Wang, W. Digital twin technology for thermal comfort and energy efficiency in buildings: A comprehensive review. J. Build. Eng. 57, https://doi.org/10.1016/j.jobe.2022.104837 (2023).
Islam, M. B., Guerrieri, A., Gravina, R. & Fortino, G. A deep learning based digital twin for indoor temperature prediction in smart buildings. 2024 IEEE Conference on Pervasive and Intelligent Computing (PICom) 83–90, https://doi.org/10.1109/PICom64201.2024.00018 (2024).
Zhang, W., Li, H. & Liu, Y. Attention-lstm architecture combined with bayesian hyperparameter optimization for indoor temperature prediction. Energy Build. 276, https://doi.org/10.1016/j.enbuild.2023.112509 (2023).
Ni, Z., Zhang, C., Karlsson, M. & Gong, S. Edge-based parametric digital twins for intelligent building indoor climate modeling. arXiv preprint arXiv:2403.04326 (2024).
Morkunaite, L., Kardoka, J., Pupeikis, D., Fokaides, P. & Angelakis, V. Digital twin for grey box modeling of multistory residential building thermal dynamics. arXiv preprint arXiv:2402.02909 (2024).
Cho, J., Shin, H., Ahn, Y. & Ho, J. The personalized thermal comfort prediction using an MH-LSTM neural network method. Adv. Civ. Eng. 2024, 2106137. https://doi.org/10.1155/2024/2106137 (2024).
ElArwady, Z., Kandil, A., Afiffy, M. & Marzouk, M. M. Modeling indoor thermal comfort in buildings using digital twin and machine learning. Dev. Built Environ. 19, 100480. https://doi.org/10.1016/j.dibe.2024.100480 (2024).
Zhao, Y., Carli, R., Ozcelik, G., Whitehouse, K. & Berges, M. Personal thermal comfort models using digital twins. arXiv preprint arXiv:2111.00199 (2021).
Le, C. N. et al. Bayesian optimized of cnn-m-lstm for thermal comfort prediction and load forecasting in commercial buildings. Preprints (2025).
Li, W., Zhang, G., Chen, L. & Li, K. Hybrid personalized thermal comfort model based on wrist skin temperature and indoor air temperature. Build. Environ. 246, (2024).
Li, X., Zhang, Y., Wang, W. & Li, H. Bo-sta-lstm: Building energy prediction based on a bayesian optimization and spatial-temporal attention improved lstm. Energy and AI 10, 100169 (2024).
Nguyen, T. H., Le, C. N., Dinh, T. N., Stojcevski, S. & Stojcevski, A. Digital twin smart city visualization with moe-based personal thermal comfort analysis. Sensors 25, 705 (2025).
Tekler, Z. D., Lei, Y., Peng, Y., Miller, C. & Chong, A. A hybrid active learning framework for personal thermal comfort models. Build. Environ. 234, 110148 (2023).
Gao, N. et al. Transfer learning for thermal comfort prediction in multiple cities. Build. Environ. 195, 107725 (2021).
Tekler, Z. D., Lei, Y. & Chong, A. Data-efficient comfort modeling: Active transfer learning for predicting personal thermal comfort using limited data. Energy Build. 319, 114507 (2024).
Zhang, W., Wu, Y. & Calautit, J. K. A review on occupancy prediction through machine learning for enhancing energy efficiency, air quality and thermal comfort in the built environment. Renew. Sustain. Energy Rev. 167, 112704 (2022).
Acknowledgements
The authors extend their sincere appreciation to the Deanship of Scientific Research at Northern Border University, Arar, Saudi Arabia, for funding this research through project number NBU-CRP-2025-2105, and to the Deanship of Research and Graduate Studies at King Khalid University for supporting this work through the Large Group Project under grant number RGP2/473/46.
Funding
The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work through the Large Group Project under grant number RGP2/473/46. The authors also extend their appreciation to the Deanship of Scientific Research at Northern Border University, Arar, KSA, for funding this research work through project number NBU-CRP-2025-2105.
Author information
Contributions
Conceptualization: Ahmad Almadhor, Nejib Ghazouani, Belgacem Bouallegue; Data curation: Shtwai Alsubai, Abdullah Al Hejaili, Nejib Ghazouani, Belgacem Bouallegue, Moez Krichen; Formal analysis: Natalia Kryvniska, Shtwai Alsubai, Abdullah Al Hejaili, Nejib Ghazouani, Belgacem Bouallegue, Moez Krichen; Funding acquisition: Nejib Ghazouani, Belgacem Bouallegue; Investigation: Nejib Ghazouani, Belgacem Bouallegue, Shtwai Alsubai, Abdullah Al Hejaili, Moez Krichen, Gabriel Avelino Sampedro; Methodology: Nejib Ghazouani, Belgacem Bouallegue, Ahmad Almadhor, Moez Krichen, Shtwai Alsubai, Abdullah Al Hejaili; Project administration: Ahmad Almadhor, Gabriel Avelino Sampedro; Resources: Ahmad Almadhor, Gabriel Avelino Sampedro; Software: Moez Krichen; Supervision: Ahmad Almadhor, Gabriel Avelino Sampedro, Nejib Ghazouani, Belgacem Bouallegue; Validation: Gabriel Avelino Sampedro, Ahmad Almadhor; Visualization: Nejib Ghazouani, Belgacem Bouallegue, Gabriel Avelino Sampedro, Ahmad Almadhor; Writing - review & editing: Ahmad Almadhor, Nejib Ghazouani, Belgacem Bouallegue, Natalia Kryvniska, Shtwai Alsubai, Abdullah Al Hejaili, Gabriel Avelino Sampedro, Moez Krichen.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Almadhor, A., Ghazouani, N., Bouallegue, B. et al. Digital twin based deep learning framework for personalized thermal comfort prediction and energy efficient operation in smart buildings. Sci Rep 15, 24654 (2025). https://doi.org/10.1038/s41598-025-10086-y
DOI: https://doi.org/10.1038/s41598-025-10086-y













