Abstract
The rapid growth of Internet of Things (IoT) devices in smart grids and industrial control systems marks a significant stage of technological evolution. Yet this growth has also created an enormous attack surface with millions of vulnerable endpoints, exposing inherent weaknesses of traditional security models. These conventional systems suffer from data integrity challenges and single points of failure, and they offer no proactive defense against new, adaptive cyber-physical threats. To overcome these limitations, this paper presents "Causio-TwinChain," a new security model that integrates three technologies, namely digital twins, a permissioned blockchain, and a dual machine-learning engine, to establish a proactive, self-diagnostic, and tamper-resistant security framework for critical IoT infrastructure. A digital twin is a virtual replica that monitors physical devices in real time within a sandboxed environment. A permissioned blockchain provides an immutable, tamper-proof ledger for all device data and transactions, ensuring data integrity and auditability. Two machine-learning engines form the core intelligence: contrastive learning, which detects subtle anomalies by modeling normal operations; and structural causal learning, which diagnoses root causes of security incidents and predicts their potential impact. The model's efficacy is demonstrated on an industrial IoT dataset. Causio-TwinChain achieved a 15.3% higher F1-score in novel attack detection and reduced the mean time for incident diagnosis by 68% compared to benchmark intrusion detection systems. It also reduced the false-positive rate by 22%, demonstrating its robustness in noisy environments. By moving beyond mere attack detection to explainable diagnosis and predictive mitigation, this work establishes a new benchmark for building proactive, resilient, and self-healing security frameworks that safeguard the most critical IoT applications and enhance trust and continuity in operational services.
Introduction
The spread of IoT into critical infrastructure has dramatically enlarged the cyber-attack surface, pushing traditional security models to their limits. Most traditional systems, which are based on centralized architectures and the signature-based detection common to traditional Intrusion Detection Systems (IDSs), are inherently reactive1. They therefore struggle to guarantee data integrity and resistance to single points of failure, let alone to withstand new, zero-day, or advanced cyber-physical attacks2. This paradigm is especially insufficient in settings such as smart grids or industrial control systems, where a security breach can cause physical damage or even bring operations to a halt3. The fundamental challenge is to move away from this weak, detection-focused model toward an active, robust, and intelligent diagnostic security architecture that provides assurance and sustainability in the face of emerging threats4.
Advanced technologies have driven a paradigm shift in IoT security. Recent studies have increasingly leveraged blockchain to enhance the security of IoT systems5,6. Blockchain provides a decentralized, immutable ledger that securely stores all device transactions, ensuring data integrity and auditability while reducing tampering and single-point-of-failure risks7,8. Digital Twins (DTs) are high-fidelity virtual models of physical objects that enable real-time, sandboxed monitoring and simulation, so that actions can be tested and device state assessed continuously without disrupting the operational technology9,10. On top of this foundation of trust and transparency, a dual machine-learning engine provides the core intelligence. Contrastive learning addresses new attacks by learning, without supervision, a model of normal device behavior11. Because it captures the complex patterns of legitimate operations, it can detect subtle, previously unseen anomalies without labelled attack data, which is frequently limited and expensive. Detection alone, however, is not enough to respond effectively12. This is where structural causal learning becomes important. Rather than asking what constitutes an anomaly, it diagnoses why the anomaly occurred by modeling the cause-and-effect relationships within the system itself13,14. This enables accurate investigation of security incidents and, most importantly, allows their possible cascading effects to be predicted15. Together, these four technologies complement one another: blockchain creates trust, DTs provide visibility, and the contrastive-causal learning pair supplies diagnostic intelligence, forming a proactive, self-diagnostic, and resilient whole for IoT security.
Traditional IoT security has to date been based on centralized architectures and signature-based methodologies. This approach relies on conventional firewalls and IDSs, which operate by comparing network traffic and system activity against a database of stored attack signatures16. Although effective against known threats, it is essentially a reactive strategy that fails against novel or zero-day attacks, for which no signatures exist. Moreover, centralized control introduces a critical vulnerability: a single point of failure whose compromise can bring down the security posture of the entire network17,18. These approaches also lack diagnostic capability: they raise alerts without explaining the underlying cause or possible effect of an incident, which severely limits mitigation and response in complex, interconnected IoT environments19,20.
The motivation for this work arises from the inability of existing security models to meet the sophisticated requirements of modern IoT infrastructures. Although some recent research has begun to investigate the integration of blockchain and DTs, the resulting frameworks often remain limited to data integrity and detection, without proactive diagnostic intelligence21. What is urgently needed is an integrated model that identifies an anomaly, explains its root cause, and projects its consequences to support pre-emptive mitigation. This work is therefore motivated by the immediate need to bridge this gap, and it proposes a novel synthesis of contrastive and causal learning with blockchain and DTs. The aim is a tamper-resilient, self-diagnosing, and proactively intelligent security framework that moves beyond simple detection to protect critical IoT applications. The main contributions of the research are:
- This study presents Causio-TwinChain, a comprehensive Industrial IoT security architecture that integrates DT modeling, contrastive anomaly detection, structural causal reasoning, and permissioned blockchain for proactive threat identification and safe telemetry management.
- An anomaly detection module based on contrastive learning is developed to learn robust latent representations from high-dimensional telemetry streams, thereby enhancing anomaly separability and robustness to noisy, heterogeneous IoT sensor data.
- Structural causal modeling combined with DT simulations facilitates root-cause analysis via intervention and counterfactual reasoning, offering interpretable explanations for identified abnormalities in intricate Industrial IoT settings.
- A permissioned blockchain layer provides tamper-proof storage and validation of telemetry records and security incidents, enabling reliable auditing, maintaining data integrity, and enabling safe forensic tracing.
- Comprehensive experimental evaluations against current Industrial IoT security frameworks reveal enhancements in anomaly detection precision, root-cause analysis proficiency, and secure telemetry management effectiveness.
Main objective of the research
1. To design and propose a new security framework, "Causio-TwinChain," which integrates DTs, blockchain, and a dual machine-learning engine in a synergistic manner for critical IoT infrastructure.
2. To develop, within the framework, a dual-ML engine that uses contrastive learning for unsupervised anomaly detection and structural causal learning for root-cause diagnosis.
3. To ensure end-to-end data integrity and auditability by using an immutable, permissioned blockchain ledger.
4. To empirically validate the performance of the framework against conventional systems by showing superior attack detection, quick diagnosis, and robustness in a noisy environment.
5. To establish a new benchmark for proactive, self-diagnosing, tamper-resistant security frameworks that go beyond mere detection to explainable diagnosis and predictive mitigation.
Preliminaries and problem statement
This section provides the theoretical basis and a rigorous statement of the security problem. It describes the essential technologies, namely DTs, blockchain, contrastive learning, and causal learning, that form the building blocks of the proposed Causio-TwinChain framework. It also describes methodically the specific constraints of current security models in sensitive IoT settings, thereby providing a clear justification for the new model presented in this research.
Preliminaries
Digital twins
A DT is a formal cyber-physical mapping \(\psi : P \to V\) between a physical entity \(P\) and its virtual counterpart \(V\), under stringent temporal synchronization by means of continuous data assimilation. The differential equation determines the evolution of the state \(\frac{d\overrightarrow{{x}_{v}}}{dt}= f(\overrightarrow{{x}_{p}}, \overrightarrow{u}, t)+\epsilon (t)\), where \(\overrightarrow{{x}_{v}}\) is the virtual state vector, \(\overrightarrow{{x}_{p}}\) the physical state vector, \(\overrightarrow{u}\) the control inputs, and \(\epsilon (t)\) the synchronization error. As such, DTs can act as secure sandboxes for continuous monitoring, state simulation, and behavioral analysis, enabling thorough security testing and parallel execution of attack scenarios in an isolated virtual environment without perturbing the operations of the physical system.
Blockchain technology
The framework employs a permissioned blockchain \(B = \{{B}_{1}, {B}_{2}, \ldots, {B}_{n}\}\), where each block \({B}_{i}\) embeds a cryptographic hash of its predecessor, formally represented as \(H({B}_{i}) = Hash\left(H\left({B}_{i-1}\right) \Vert {T}_{i} \Vert Nonce\right)\), wherein \({T}_{i}\) represents the Merkle root of the transactions in \({B}_{i}\), \(\Vert\) denotes concatenation, and Nonce meets the Proof-of-Authority consensus condition. This cryptographic chaining ensures immutability and provides a tamper-proof audit trail for all device data and their interactions. Smart contracts \(\Lambda\) enforce automated policy compliance by implementing state transitions \(\Lambda :S \times A\to {S}^{\prime}\), in which a valid action \(A\) triggers a transition from state \(S\) to state \({S}^{\prime}\), ensuring data integrity, non-repudiation, and single-point-of-failure resilience in an IoT network.
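The chaining rule can be illustrated with a minimal sketch. It assumes SHA-256 as the hash function and a `|` delimiter for concatenation; the helper names (`merkle_root`, `block_hash`, `append_block`, `verify_chain`) and `GENESIS_HASH` are hypothetical and not taken from the framework. Validator signing under Proof-of-Authority is omitted, so the nonce is carried only as a plain field.

```python
import hashlib

GENESIS_HASH = "0" * 64  # assumed placeholder predecessor for the first block


def merkle_root(transactions):
    """Return a hex Merkle root T_i over a list of transaction strings."""
    if not transactions:
        return hashlib.sha256(b"").hexdigest()
    level = [hashlib.sha256(tx.encode()).hexdigest() for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [
            hashlib.sha256((level[i] + level[i + 1]).encode()).hexdigest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def block_hash(prev_hash, root, nonce):
    """H(B_i) = Hash(H(B_{i-1}) || T_i || Nonce), with '|' as the delimiter."""
    return hashlib.sha256(f"{prev_hash}|{root}|{nonce}".encode()).hexdigest()


def append_block(chain, transactions, nonce=0):
    """Append a block that commits to the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    root = merkle_root(transactions)
    block = {
        "prev_hash": prev_hash,
        "transactions": list(transactions),
        "merkle_root": root,
        "nonce": nonce,
        "hash": block_hash(prev_hash, root, nonce),
    }
    chain.append(block)
    return block


def verify_chain(chain):
    """Recompute every Merkle root and hash; any edit to stored data fails."""
    prev_hash = GENESIS_HASH
    for block in chain:
        root = merkle_root(block["transactions"])
        if block["prev_hash"] != prev_hash or block["merkle_root"] != root:
            return False
        if block["hash"] != block_hash(prev_hash, root, block["nonce"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each hash covers the predecessor's hash, altering any stored transaction changes the recomputed Merkle root and invalidates that block and every later one.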
Contrastive learning
The framework uses contrastive learning to learn a representation function \(\phi : X \to Z\) that minimizes the normalized temperature-scaled cross-entropy loss, \({\mathcal{L}}_{contrast}=-\text{log}\frac{exp\left(\frac{sim\left({z}_{i}, {z}_{j}\right)}{\tau }\right)}{\sum_{k=1}^{2N}{1}_{k\ne i}\,exp\left(\frac{sim\left({z}_{i}, {z}_{k}\right)}{\tau }\right)}\), with \({z}_{i}=\phi ({x}_{i})\), \(sim(u,v)=\frac{{u}^{T}v}{\left|\left|u\right|\right|\left|\left|v\right|\right|}\) as cosine similarity, \(({z}_{i}, {z}_{j})\) a positive pair of augmented views, and \(\tau\) as a temperature parameter. This objective clusters the augmented views of normal operational data in the latent space \(Z\) and systematically separates anomalous instances. The learned representations can thus detect zero-day attacks by distance-based thresholding: an instance is considered anomalous if \(d\left(z, \mu \right)>\delta\), where \(\mu\) denotes the centroid of normal embeddings and \(\delta\) is the detection threshold. This approach therefore supports robust anomaly detection without using any pre-labeled malicious data.
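As a minimal sketch of the two ingredients above, the following computes the loss for one positive pair and applies the centroid-distance test. It assumes embeddings are plain lists of floats and that \(d\) is Euclidean distance (the text does not fix the metric); the function names are hypothetical, and no encoder \(\phi\) is trained here.

```python
import math


def cosine(u, v):
    """Cosine similarity sim(u, v) = u.v / (||u|| ||v||)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(z, i, j, tau=0.5):
    """Loss for the positive pair (i, j) among the 2N embeddings in z."""
    numerator = math.exp(cosine(z[i], z[j]) / tau)
    # The denominator sums over every other embedding k != i.
    denominator = sum(
        math.exp(cosine(z[i], z[k]) / tau) for k in range(len(z)) if k != i
    )
    return -math.log(numerator / denominator)


def centroid(embeddings):
    """Mean embedding mu of the normal training data."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def is_anomalous(z, mu, delta):
    """Flag z when its (assumed Euclidean) distance to mu exceeds delta."""
    return math.dist(z, mu) > delta
```

The loss is small when the two views of the same sample are close relative to all other samples, which is what pulls normal behavior into a compact cluster around \(\mu\).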
Structural causal learning
The causal framework employs a Structural Causal Model (SCM) defined by the tuple \(\mathcal{M}=\langle U,V,F,P(u)\rangle\), where \(U\) represents exogenous variables, \(V=\{{V}_{1}, {V}_{2}, \ldots, {V}_{n}\}\) endogenous variables, \(F=\{{f}_{1}, {f}_{2}, \ldots, {f}_{n}\}\) structural functions, and \(P(u)\) the probability distribution over \(U\). Each structural equation takes the form \({v}_{i}\leftarrow {f}_{i}(p{a}_{i}, {u}_{i}),\) where \(pa_i\subseteq V\backslash \left\{{V}_{i}\right\}\) denotes the causal parents of \({V}_{i}\). This formalization enables comprehensive causal reasoning across three hierarchical levels: observational inference \(P\left({v}_{i} \mid {v}_{j}\right)\) for detection, interventional analysis \(P({v}_{i} \mid do({v}_{j}))\) for root cause identification, and counterfactual reasoning \(P\left({v}_{i}^{\prime} \mid {v}_{i}, {v}_{j}\right)\) for impact prediction, where \({v}_{i}^{\prime}\) represents the potential outcome under intervention. By leveraging Pearl's do-calculus, the framework allows the identification of causal relations from purely observational data, which, in turn, can accurately attribute security incidents to their root causes and predict their possible cascading effects throughout the system.
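The difference between the interventional and counterfactual levels can be seen in a toy linear SCM. The three variables (`traffic`, `cpu`, `latency`), their coefficients, and the function names below are purely illustrative assumptions, not the causal graph used in the framework. The counterfactual follows the usual abduction, action, prediction steps, which is straightforward here because the noise is additive.

```python
def simulate(u, do=None):
    """Evaluate v_i <- f_i(pa_i, u_i); entries in `do` replace the equation."""
    do = do or {}
    v = {}
    v["traffic"] = do.get("traffic", u["traffic"])
    v["cpu"] = do.get("cpu", 0.8 * v["traffic"] + u["cpu"])
    v["latency"] = do.get(
        "latency", 0.5 * v["cpu"] + 0.2 * v["traffic"] + u["latency"]
    )
    return v


def abduct(observed):
    """Abduction: recover the exogenous terms from a fully observed state."""
    return {
        "traffic": observed["traffic"],
        "cpu": observed["cpu"] - 0.8 * observed["traffic"],
        "latency": observed["latency"]
        - 0.5 * observed["cpu"]
        - 0.2 * observed["traffic"],
    }


def counterfactual(observed, do):
    """Action + prediction: rerun the model with the same noise under do()."""
    return simulate(abduct(observed), do=do)


if __name__ == "__main__":
    obs = simulate({"traffic": 10.0, "cpu": 1.0, "latency": 0.5})
    print("observed:", obs)
    print("had traffic been 5:", counterfactual(obs, {"traffic": 5.0}))
    print("had cpu been 0:", counterfactual(obs, {"cpu": 0.0}))
```

Intervening on `cpu` leaves its parent `traffic` untouched, whereas intervening on `traffic` propagates downstream; this asymmetry is what allows a root cause to be separated from its symptoms.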
Problem statement
Consider an Industrial IoT network formed by a collection of physical devices \(\mathcal{D}=\left\{{d}_{1},{d}_{2},\ldots ,{d}_{n}\right\}\), each producing a continuous multivariate telemetry stream \({X}_{t}\). Traditional security systems \(S,\) which are often centralized and signature-based, are inadequate for this environment. They are susceptible to novel attacks \({A}_{novel},\) lack data integrity guarantees, and provide alerts without diagnostic insight. The main issue is the absence of a unified security framework that is simultaneously:
- Proactive and resilient: capable of identifying new, non-signature-based attacks, and resilient to single points of failure and data tampering.
- Self-diagnosing: capable not only of detecting anomalies but also of automatically diagnosing their root causes \({R}_{c}\) and predicting their potential impact \({I}_{p}.\)
- Explainable and auditable: able to provide transparent reasoning for its security decisions and to keep an immutable record of system state and events.
Therefore, the aim is to build a security model \(\mathcal{M}\) that leverages the synergy of a DT, a permissioned blockchain (BC), and a dual machine-learning engine combining contrastive learning \((CL)\) and structural causal learning \((SCL).\) Formally, the model is defined as: \(\mathcal{M}=\left\{DT\left(\mathcal{D}\right),BC\left(\mathcal{T}\right),CL\left({X}_{t}\right),SCL\left(X,\mathcal{G}\right)\right\}\), where \(\mathcal{T}\) is the set of all transactions and states, and \(\mathcal{G}\) is the causal graph. The model \(\mathcal{M}\) must maximize detection accuracy for \({A}_{novel}\), minimize the time taken to diagnose the root cause \({R}_{c}\), and guarantee the integrity of the whole security log \(\mathcal{T}\).
Literature review
Digital twin security frameworks
Thakur et al.22 proposed a three-factor, privacy-preserving authentication scheme for DT environments that is secure against the weaknesses of a recent blockchain-based protocol. This work integrates password, smart-card, and biometric authentication factors and ensures security through informal analyses, BAN logic, and the ROR model. The results demonstrate enhanced resistance to impersonation, password-guessing, and session-specific attacks at reduced computational cost. However, it still relies on good-quality biometric acquisition and involves slightly higher complexity at the registration stage than lightweight two-factor authentication models. Shaikh et al.23 developed a formal verification-based framework to assess security in DT systems by modeling them as state-transition systems across physical, virtual, and application layers. It uses temporal logic and probabilistic model checking to evaluate attack success probabilities and associated costs. Results demonstrate effectiveness through a detailed healthcare case study that provides clear insights into multi-layered vulnerabilities. However, the approach requires accurate system modeling and can become computationally intensive for large-scale or highly dynamic DT environments. Ababio et al.24 proposed a self-optimizing DT system that integrates federated Learning, blockchain, and explainable AI to achieve greater trust, privacy, and intelligence in the IIoT environment. Edge-based Federated Learning (FL) ensures data confidentiality during collaborative model training, whereas blockchain enables data exchange and management. It has resulted in greater transparency and interpretability, as well as improved operational efficiency. Nonetheless, it comes with drawbacks, such as higher computational cost on edge devices, communication overhead due to federated updates, and latency in large-scale industrial implementation. 
Empl and Pernul25 proposed the DT2SA model, which merges DTs with security analytics to generate shareable cybersecurity knowledge applicable to industrial IoT systems. It leverages semantic modeling, lifecycle mapping, and virtual-physical synchronization to enhance threat detection and prediction. Practical feasibility in real-world industrial settings was demonstrated with the open-source TWINSIGHT microservice architecture. However, the study also reports challenges such as semantic complexity, integration overhead, and limited scalability when deploying DTs for large, diverse IoT ecosystems.
Blockchain–IoT security mechanisms
Sasikumar et al.26 proposed integrating DTs with a blockchain-enabled Proof-of-Authority (PoA) mechanism to improve security, privacy, and trust in IIoT environments. The approach uses PoA consensus and a Deterministic Pseudo-Random Generation method to generate a secure genesis block, followed by a simulation of digital-twin-assisted blockchain networks formed by IIoT sensor nodes. Results indicate reduced energy consumption and improved data security. However, this model relies on trusted authority nodes and could be limited by scalability and centralization issues when deployed at scale in IIoT. Onwubiko et al.27 proposed a DT blockchain framework implemented on Ethereum and the InterPlanetary File System (IPFS) to share information among the stakeholders of a DT securely and without tampering. Smart contracts control access to assets and their data, and IPFS provides decentralized storage. The design is applied to a case study of smartphone production lines. The findings are encouraging, showing enhanced security, an 8% cut in transaction costs, and a significant reduction in the ether cost. However, the system is more expensive to execute and inherits the latency and scalability limitations of Ethereum, potentially hindering its use in large industries.
Salim et al.28 proposed a blockchain-powered DT system to detect botnets at an early stage in a smart factory IIoT setup. It uses DTs at the edge to observe device behavior, inspects packet headers using deep learning, and synchronizes data with a Packet Auditor that is authenticated using smart contracts. Findings show that it enhances data integrity and privacy protection and detects bot activity much faster than current methods. However, the framework entails additional synchronization costs, requires precise DT modeling, and may be limited by scalability constraints in large IIoT networks. Suleiman et al.29 also addressed security threats in DT systems and proposed a blockchain-based architecture to improve integrity, transparency, and trust in real-time DT settings. The proposed structure secures the DT's interactions using decentralized ledger systems, smart contracts, and reliable data-sharing protocols. A case study is performed on vehicles and traffic infrastructure. The outcomes demonstrate higher data authenticity, secure coordination, and robust communication. Nonetheless, latency, additional processing requirements, and scalability issues emerge when blockchain is adopted in large, fast-moving DT networks that demand ultra-low response times. The DT architecture developed by Cuñat Negueroles et al.30 leverages blockchain technologies, such as FIWARE Canis Major, to optimize vehicle distribution in a transport and logistics context. It combines decentralized data storage with DT-based monitoring and blockchain-based trust to reduce operational costs and improve decision-making. The case study demonstrates sustainable performance, low implementation cost, and viability in actual fleet management. Nevertheless, keeping the DT up to date depends on network stability, blockchain throughput, and access to accurate real-time data, all of which may influence the overall performance of the system.
Kumar et al.31 presented BCE-IoT, a blockchain-enabled, explainable IDS that enhances IoT intrusion detection by integrating blockchain security, lightweight cryptography, federated-style local training, and SHAP-based interpretability. BCE-IoT detects various attacks, including DDoS, DoS, XSS, scanning, injection, and backdoor, using ML and AI techniques in real time. Results confirm its potential to reduce false alerts with enhanced accuracy compared to content integrity systems. However, blockchain overhead, network latency for consensus, and scalability remain limiting factors for large, high-traffic IoT deployments.
AI-based intrusion detection for IoT and CPS
Kumar et al.32 combined blockchain-based authentication, a Software-Defined Networking backbone, DT modelling, and a deep-learning intrusion detection system with self-attention and Bi-GRU networks to create a secure Smart grid framework. The strategy provides a robust communication channel that can be monitored in real time and increases the level of accuracy in detecting attacks. The findings revealed the successful execution of blockchain transactions and the high performance of intrusion detection on the N-BaIoT dataset. However, implementing the framework in low-resource smart grid conditions can be complicated due to its complex nature, the vital need to properly estimate a DT’s state, and increased computational costs. Meena and Indian33 proposed an enhanced LSTM-based intrusion detection system specifically designed to improve accuracy across various IoT environments. The deep sequential Learning of the model captures temporal attack patterns across four benchmark datasets: KDD-Cup’99, NSL-KDD, UNSW-NB15, and CICIoT2023. All these experimental datasets have shown superior performance compared to AdaBoost, DNN, RNN, and Logistic Regression, achieving accuracies above 95%. Still, the approach is computationally intensive, struggles to handle highly imbalanced traffic, and requires additional optimization for real-time, resource-constrained IoT deployments. Mohale and Obagbuwa34 introduced an XAI-enhanced ML-based IDS that establishes transparency and trust in IoT security by integrating LIME, SHAP, and ELI5 into models such as XGBoost, CatBoost, Random Forest, and MLP. Based on these insights, the system identifies important attack indicators and provides interpretable explanations using the UNSW-NB15 dataset. XGBoost and CatBoost outperformed all the selected models, achieving 87% accuracy and very low false-positive and false-negative rates. 
Still, the models’ performance remains moderate, and scalability to larger, more complex IoT datasets and evolving cyberattacks remains a limitation.
Resilient control and cyber–physical system security
Recent research has investigated resilient control measures to enhance the security of cyber-physical systems under adversarial conditions. One example is an attack-compensation control mechanism that uses neural networks with a Takagi–Sugeno fuzzy system to reduce the effects of actuator attacks without destabilizing the system, relying on dynamic event-triggered solutions35. Probability-density-dependent load–frequency control methods have also been proposed to enhance power system resilience against cyber-attacks such as denial-of-service and false data injection36. In addition, Takagi–Sugeno fuzzy control systems have been developed to enable networked medical cyber-physical systems, e.g., artificial pancreas systems, to stabilize glucose regulation in the face of false data injection attacks on sensor measurements37. Although these approaches confirm the value of intelligent control and adaptive learning mechanisms for CPS resilience, they primarily focus on control-level robustness. By comparison, the proposed Causio-TwinChain approach focuses on system-level security by incorporating anomaly detection, causal reasoning, DTs, and blockchain-based integrity assurance for Industrial IoT settings.
As shown in Table 1, the proposed Causio-TwinChain framework outperforms traditional security mechanisms across various evaluation metrics. Traditional intrusion detection systems, which are largely based on traffic signatures or supervised learning, lack strong integrity guarantees and causal interpretability. Blockchain frameworks increase both trust and data immutability but introduce additional latency and offer no clear explanation of sophisticated attacks. DT monitoring systems enhance situational awareness but often lack robust anomaly detection tools. By combining DT simulation, blockchain-supported telemetry integrity, contrastive anomaly detection, and structural causal reasoning, the proposed framework achieves higher detection accuracy (97.8%), fewer false positives (2.9%), and lower detection latency (82 ms), while also enabling causal root-cause identification, which the majority of current solutions do not implement. This benchmarking demonstrates the efficiency and practical benefits of the proposed architecture for securing Industrial IoT environments.
Proposed methodology
The proposed Causio-TwinChain solution creates a self-diagnosing and proactive security framework for critical IoT infrastructures by synergistically integrating four fundamental technologies (shown in Fig. 1). DTs form the foundation: synchronized virtual replicas of physical devices enable real-time, sandboxed state simulation and monitoring. A permissioned blockchain provides an immutable ledger that ensures the integrity and transparency of all transactions and device interactions. The core intelligence is a dual machine-learning engine. First, an unsupervised contrastive learning model learns normal device behavior in order to detect subtle deviations and novel attacks without prior knowledge of attack patterns. Second, a structural causal learning model diagnoses the underlying cause of identified anomalies by modeling the cause-effect relationships within the system, and predicts their potential cascading effects through counterfactual reasoning. Together, these components combine real-time monitoring and immutable logging with intelligent anomaly detection, explainable root-cause diagnosis, and impact forecasting in a closed loop. Because the methodology goes beyond mere detection to diagnostic intelligence and consequence prediction, it provides a resilient, self-healing security paradigm for critical IoT environments against advanced, adaptive cyber-physical threats.
Fig. 1. Overall architecture of the proposed Causio-TwinChain framework.
The workflow of the proposed Causio-TwinChain architecture is a structured, step-by-step process. First, IoT devices deployed in the physical infrastructure continuously produce telemetry data, including sensor readings, device status, and network traffic logs. Second, these data streams are mirrored in the DT layer, which maintains a synchronized virtual image of the physical system for real-time monitoring and behavior analysis. Third, the collected data are preprocessed and converted into feature representations, which are then fed to the contrastive learning module to identify abnormal behavioral patterns in the latent space. Fourth, once an anomaly is detected, the causal learning module uses a structural causal model to analyze the dependencies among system variables and determine the root cause and propagation of the detected event. Finally, authenticated security incidents, together with their causes and effects, are recorded on the permissioned blockchain layer through smart contracts, which guarantee data integrity, secure exchange of threat data, and decentralized trust throughout the IoT network. This sequential workflow enables proactive detection, explanation, and analysis, and provides tamper-proof security for critical IoT infrastructures.
Phase 1: data acquisition and digital twin synchronization
The foundational phase of the proposed Causio-TwinChain framework rigorously couples physical IoT devices with their digital counterparts through advanced state synchronization methods. It starts with high-frequency multivariate telemetry acquisition from heterogeneous physical devices \(\mathcal{D}=\left\{{d}_{1},{d}_{2}, \ldots, {d}_{n}\right\}\) operating within critical infrastructure environments. Under the observation model, each device \({d}_{i}\) generates a continuous measurement vector \(\overrightarrow{{x}_{p}}\left(t\right)={\left[{m}_{1}\left(t\right), {m}_{2}\left(t\right),\ldots, {m}_{k}\left(t\right)\right]}^{T}\) that includes both cyber and physical dimensions, such as packet statistics, computational load metrics, and environmental sensor readings, each with its own measurement noise characteristics depending on the device class and the communication channel.
Figure 2 visualizes the continuous-discrete Extended Kalman Filter-based synchronization of the DT, where raw IoT telemetry goes through nonlinear state prediction and discrete measurement updates to maintain an accurate virtual state. A Mahalanobis-distance-based synchronization error quantifies the deviations that enable early anomaly detection while ensuring high-fidelity mirroring of the physical system’s behaviour. The synchronization mechanism between the physical and virtual states employs an advanced data assimilation framework based on nonlinear estimation theory. The DT \({v}_{i}\) maintains its state representation through a continuous-discrete extended Kalman filtering formulation, where the state propagation follows the nonlinear dynamics defined in Eq. (1).
where \(f(\cdot )\) represents the known physical dynamics of the system, \(w(t)\) is process noise with covariance \(Q(t), h(\bullet)\) defines the observation model, and \(v({t}_{k})\) characterizes measurement noise with covariance \(R({t}_{k}).\) The mechanism for updating the state combines continuous time predictions and discrete measurement updates in the following recursive formulation shown in Eqs. (2) and (3).
Here, \(\text{F}(t)=\frac{\partial f}{\partial \overrightarrow{{x}_{v}}}{\mid }_{\widehat{\overrightarrow{{x}_{v}}}(t)}\) and \(\text{H}({t}_{k})=\frac{\partial h}{\partial \overrightarrow{{x}_{v}}}{\mid }_{\widehat{\overrightarrow{{x}_{v}}}({t}_{k}^{-})}\) represent the Jacobian matrices of the system and observation models, respectively, while \(\text{P}(t)\) denotes the error covariance matrix. The synchronization fidelity metric \(\epsilon \left(t\right)={\Vert {\overrightarrow{x}}_{p}\left(t\right)-h\left({\overrightarrow{x}}_{v}\left(t\right)\right)\Vert }_{{P}^{-1}}\) provides a Mahalanobis distance measure of state discrepancy, serving as a basic indicator for anomaly detection in subsequent security analysis phases. This mathematical formulation enables the DT to maintain a probabilistically optimal estimate of the physical system state and to create a high-fidelity virtual environment for validating security, accounting for both system uncertainties and measurement imperfections in an industrial IoT deployment.
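For orientation, the textbook continuous-discrete extended Kalman filter, written with the symbols defined above, has the following form. This is the standard formulation and is offered as an assumption about what Eqs. (1) to (3) express, not as a transcription of them. The gain \(K\), the identity matrix \(I\), and the superscripts \(-\) and \(+\) (before and after the measurement update) are notation introduced here.

```latex
\begin{align}
% Model: continuous-time dynamics, discrete-time measurements
\dot{\overrightarrow{x_v}}(t) &= f\big(\overrightarrow{x_v}(t), \overrightarrow{u}(t), t\big) + w(t), &
\overrightarrow{x_p}(t_k) &= h\big(\overrightarrow{x_v}(t_k)\big) + v(t_k) \\
% Prediction between measurements
\dot{\widehat{\overrightarrow{x_v}}}(t) &= f\big(\widehat{\overrightarrow{x_v}}(t), \overrightarrow{u}(t), t\big), &
\dot{P}(t) &= F(t)P(t) + P(t)F(t)^{T} + Q(t) \\
% Measurement update at t_k
K(t_k) &= P(t_k^-)H(t_k)^{T}\big[H(t_k)P(t_k^-)H(t_k)^{T} + R(t_k)\big]^{-1} \nonumber \\
\widehat{\overrightarrow{x_v}}(t_k^+) &= \widehat{\overrightarrow{x_v}}(t_k^-)
  + K(t_k)\big[\overrightarrow{x_p}(t_k) - h\big(\widehat{\overrightarrow{x_v}}(t_k^-)\big)\big], &
P(t_k^+) &= \big[I - K(t_k)H(t_k)\big]P(t_k^-)
\end{align}
```

The bracketed residual in the update step is the same quantity whose weighted norm defines the synchronization error \(\epsilon(t)\).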
Digital twin synchronization model based on continuous–discrete extended Kalman filtering.
Digital Twin Synchronization & Data Acquisition
Algorithm-1 formalizes DT synchronization in a continuous-discrete Extended Kalman Filter framework. It initializes the state estimate and covariance matrices and then enters its main operational loop. In each iteration, the algorithm acquires physical sensor data, propagates the virtual state forward in time using the system dynamics, and corrects the prediction with a measurement update. A key element is the real-time calculation of the Mahalanobis distance as a synchronization-error metric, which gives state discrepancies a statistical interpretation. This error is compared against thresholds for automatic anomaly detection, and all critical state information and alerts are immutably recorded on the blockchain. The implementation thus provides probabilistic state estimation together with a cryptographic audit trail, forming a mathematically rigorous basis for the subsequent security analysis phases, including contrastive learning and causal inference.
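The synchronization loop described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's Algorithm-1: the continuous-time prediction is approximated by a single Euler step, the Mahalanobis distance is weighted by the innovation covariance (a common choice that coincides with a \({P}^{-1}\) weighting only in special cases), and the one-dimensional device model, noise levels, and alert threshold are all hypothetical values chosen for the example.

```python
import numpy as np

def ekf_step(x, P, z, f, h, F_jac, H_jac, Q, R, dt):
    """One predict/update cycle. Returns (x_new, P_new, sync_error)."""
    # Prediction: one Euler step of dx/dt = f(x) and dP/dt = F P + P F^T + Q.
    F = F_jac(x)
    x_pred = x + dt * f(x)
    P_pred = P + dt * (F @ P + P @ F.T + Q)

    # Measurement update at the discrete sample time t_k.
    H = H_jac(x_pred)
    innovation = z - h(x_pred)
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ innovation
    P_new = (np.eye(len(x)) - K @ H) @ P_pred

    # Synchronization error: Mahalanobis distance of the innovation.
    sync_error = float(np.sqrt(innovation @ np.linalg.solve(S, innovation)))
    return x_new, P_new, sync_error

# Hypothetical one-dimensional device model: exponential decay, observed directly.
f = lambda x: -0.5 * x
h = lambda x: x
F_jac = lambda x: np.array([[-0.5]])
H_jac = lambda x: np.array([[1.0]])
Q, R = np.array([[0.01]]), np.array([[0.04]])
x0, P0 = np.array([1.0]), np.array([[0.1]])
THRESHOLD = 3.0  # illustrative alert threshold, not taken from the paper

# A measurement close to the prediction versus a strongly deviating one.
_, _, d_normal = ekf_step(x0, P0, np.array([0.96]), f, h, F_jac, H_jac, Q, R, dt=0.1)
_, _, d_attack = ekf_step(x0, P0, np.array([3.00]), f, h, F_jac, H_jac, Q, R, dt=0.1)
print("normal flagged:", d_normal > THRESHOLD, "| attack flagged:", d_attack > THRESHOLD)
```

A production implementation would integrate the prediction equations with a proper ODE solver between samples; the Euler step is used here only to keep the sketch short.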
Figure 3 shows real-time synchronization and anomaly detection in the DT framework. The virtual replica tracks network behaviour closely and flags a deviation as soon as the synchronization error exceeds the statistical threshold. The three-level visualization presents the entire security pipeline, including traffic monitoring, error quantification, and automated binary classification, and illustrates how the detection mechanism converts raw network metrics into actionable security alerts without prior knowledge of attack patterns.
Temporal synchronization performance.
The DT-blockchain interface of the proposed Causio-TwinChain model is based on a coordinated workflow for monitoring and verification. The DT continuously mirrors the physical state of the operating IoT devices by gathering telemetry data (network traffic, device behavior, system events, etc.). This virtual representation works in real time to perform analysis, detect anomalies, and generate security-related observations. After identifying suspicious behavior or state updates in the system, the DT sends authenticated event summaries and integrity metadata to the permissioned blockchain layer. Smart contracts log these events on the immutable, distributed ledger, which makes tampering with security evidence impractical. This integration allows security knowledge to be exchanged reliably among distributed nodes while preventing manipulation of the monitoring history. Accordingly, the DT provides dynamic observability of the system and early threat identification, while the blockchain layer ensures security-related data transparency, traceability, and integrity, creating a stable feedback loop for the secure monitoring of IoT infrastructure.
Phase 2: immutable logging on the blockchain
The second phase of the Causio-TwinChain framework creates a cryptographically secure, tamper-proof audit trail of all critical system events and state transitions reported by the DT layer, converting volatile telemetry data and security alerts into immutable records within a structured blockchain protocol. The process begins with Transaction Creation, where critical device states \({\overrightarrow{{x}_{v}}}^{*}\), significant anomaly events \(A({t}_{k})\), and aggregated telemetry hashes are packaged into formal transactions. Figure 4 shows how telemetry and anomaly events from the DT are packaged into transactions, broadcast to the validator nodes, ordered using the PBFT consensus algorithm, and appended to the blockchain as immutable records. The workflow ensures auditability, tamper-resistance, and traceability for any event related to IoT devices within the Causio-TwinChain framework.
Immutable blockchain logging workflow in the Causio-TwinChain framework.
Each transaction \(T{X}_{i}\) follows a standard structure that includes metadata, a payload, and cryptographic signatures, as defined mathematically by the following equation.
where \(\mathcal{H}\left(\bullet \right)\) is a cryptographically secure hash function, and a digital signature ensures authentication and non-repudiation. Consensus and immutable storage begin with broadcasting transactions to the permissioned blockchain network. The authorized validator nodes \(\mathcal{V}=\left\{{v}_{1},{v}_{2},\ldots,{v}_{m}\right\}\) verify the transactions and order them through a consensus mechanism inspired by Practical Byzantine Fault Tolerance (PBFT). Block formation follows the cryptographic protocol expressed in Eq. (5).
where \(\mathcal{M}(\cdot )\) computes the Merkle tree root hash, providing efficient transaction verification. The consensus protocol ensures block finality through a three-phase commit process given in Eq. (6).
where \(v\) is the view number, \(n\) is the sequence number, and \(\sigma\) is the validator’s signature. When the number of matching commit messages reaches \(\left\lfloor \frac{2m}{3}\right\rfloor +1\), the block is appended to the distributed ledger. The resulting chain is immutable in practice, since changing any transaction would require recomputing all hashes downstream of that transaction and compromising a majority of the validator nodes, which is infeasible in practical terms. This cryptographic immutability provides an unforgeable historical record that is vital for forensic analysis, regulatory compliance, and establishing trust among stakeholders in critical IoT infrastructure applications.
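The hash-linking argument above can be made concrete with a small sketch using only the Python standard library. It is an illustration under stated assumptions rather than the paper's transaction format: the transaction fields, the SHA-256 choice for \(\mathcal{H}\), the pairing rule for odd Merkle levels, and the block-header layout are all hypothetical, and digital signatures are omitted.

```python
import hashlib
import json

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def tx_hash(tx: dict) -> str:
    # Canonical JSON (sorted keys) so that equal transactions hash equally.
    return sha256_hex(json.dumps(tx, sort_keys=True).encode())

def merkle_root(leaf_hashes: list) -> str:
    """Pairwise-hash the leaves upward until one root remains."""
    if not leaf_hashes:
        return sha256_hex(b"")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [sha256_hex((level[i] + level[i + 1]).encode())
                 for i in range(0, len(level), 2)]
    return level[0]

def block_hash(prev_hash: str, root: str, height: int) -> str:
    # The header binds this block to its predecessor and to its transactions.
    return sha256_hex(f"{height}|{prev_hash}|{root}".encode())

# Hypothetical transactions emitted by the DT layer.
txs = [
    {"device": "sensor-01", "event": "state", "payload_hash": sha256_hex(b"batch-1")},
    {"device": "sensor-02", "event": "anomaly", "payload_hash": sha256_hex(b"batch-2")},
    {"device": "gateway-01", "event": "state", "payload_hash": sha256_hex(b"batch-3")},
]
root = merkle_root([tx_hash(t) for t in txs])
genesis = "0" * 64
h1 = block_hash(genesis, root, height=1)

# Altering a single logged event changes the Merkle root and hence the block hash.
tampered = [dict(txs[0], event="benign")] + txs[1:]
root_tampered = merkle_root([tx_hash(t) for t in tampered])
print("tampering detected:", block_hash(genesis, root_tampered, height=1) != h1)
```

Because every later block header includes the hash of its predecessor, a change to one transaction would propagate to all subsequent block hashes, which is what makes silent modification detectable.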
Immutable Blockchain Logging Protocol
Algorithm-2 implements the PBFT consensus mechanism for immutable logging on the permissioned blockchain. A primary node, selected in round-robin fashion, batches transactions, constructs the Merkle tree, and broadcasts a PRE-PREPARE message. Validators then verify the proposed block and send PREPARE messages if verification succeeds. The algorithm requires a quorum of \(2f+1\) matching replies (where \(f\) is the maximum number of faulty nodes that can be tolerated) to complete the PREPARE and COMMIT phases and reach consensus, even in the presence of malicious actors.
The block is added to the chain once a sufficient number of COMMIT messages has been received. Each block is cryptographically bound to its predecessor through a hash function, so any later modification is detectable. The protocol provides deterministic finality, high throughput, and robust security, making it well-suited for real-time, trustworthy logging of IoT data in critical infrastructure where data integrity and auditability are necessary.
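The quorum rule can be illustrated with a short sketch. It assumes the usual PBFT sizing, in which \(m\) validators tolerate \(f=\lfloor (m-1)/3\rfloor\) faulty nodes; the function names and the vote representation (validator ID mapped to the block digest it committed to) are hypothetical. Note that \(2f+1\) coincides with \(\lfloor 2m/3\rfloor +1\) when \(m=3f+1\).

```python
def max_faulty(m: int) -> int:
    """Largest number of Byzantine validators tolerated by m validators."""
    return (m - 1) // 3

def quorum(m: int) -> int:
    """Matching messages needed to finish a PREPARE or COMMIT phase."""
    return 2 * max_faulty(m) + 1

def is_committed(commit_votes: dict, block_digest: str, m: int) -> bool:
    # Only votes for the same block digest count toward the quorum.
    matching = sum(1 for digest in commit_votes.values() if digest == block_digest)
    return matching >= quorum(m)

# Four validators tolerate one faulty node and need three matching commits.
votes = {"v1": "abc", "v2": "abc", "v3": "abc", "v4": "xyz"}  # v4 is faulty
print(quorum(4), is_committed(votes, "abc", m=4))
```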
Phase 3: contrastive learning for anomaly detection
This stage deploys an unsupervised learning paradigm that detects security anomalies by modeling the normative behavior of IoT systems, without using labeled attack data. Latent Space Modeling initiates the process: the normalized telemetry vectors \(\overrightarrow{{x}_{v}}\) from the DTs are passed through a deep neural network encoder \({\phi }_{\theta }\), parameterized by weights \(\theta\).
Figure 5 shows a contrastive learning pipeline for unsupervised anomaly detection in the Causio-TwinChain framework. The source IoT telemetry is projected into a latent space where all positive pairs, i.e., augmented views of the same normal instance, are attracted while other samples are repelled. Through this NT-Xent optimization, the encoder learns a compact cluster representative of normal operation. During deployment, such latent representations that fall outside of this learned region exhibit high Mahalanobis distance and are flagged as anomalies for precise zero-day attack detection. The encoder projects input data into a lower-dimensional latent space \(\mathcal{Z},\) where semantically similar instances are closely embedded. The training objective minimizes a temperature-scaled, normalized cross-entropy loss (NT-Xent) via contrastive optimization, as given in Eq. (7).
where, \({\mathcal{P}}_{pos}\) represents the distribution of positive pairs, namely augmented views of the same instance, \(\tau\) is a temperature parameter which controls the separation sharpness, and \(N\) denotes the batch size. The encoder, through iterative optimization, learns to map the normal operational patterns to a compact region in \(\mathcal{Z}\), characterized by a cluster centroid \(\overrightarrow{\mu }=\frac{1}{N}{\sum }_{i=1}^{N}{\overrightarrow{z}}_{i}\) and covariance matrix \(\Sigma .\) The operational Deviation Identification phase computes the Mahalanobis distance for the incoming latent representations \({\overrightarrow{z}}_{new}\) defined in the following Eq. (8).
where, threshold \(\delta\) is statistically derived from training data percentiles. This multivariate distance metric accounts for feature correlations and offers better anomaly sensitivity than the Euclidean distance. The system maintains dynamic model updates with exponential moving averages of the cluster parameters given in Eq. (9), enabling continuous adaptation to gradual system drift while maintaining detection sensitivity against abrupt malicious deviations.
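The deviation-identification step and the moving-average update can be sketched as follows. This is a minimal illustration assuming Gaussian stand-in data for the benign latent vectors; the 99th-percentile threshold, the smoothing factor `beta`, and the particular exponential-moving-average form are assumptions made for the example and are not taken from Eq. (9).

```python
import numpy as np

def fit_cluster(Z):
    """Centroid and covariance of the latent vectors of normal operation."""
    mu = Z.mean(axis=0)
    sigma = np.cov(Z, rowvar=False) + 1e-6 * np.eye(Z.shape[1])  # ridge keeps sigma invertible
    return mu, sigma

def mahalanobis(z, mu, sigma):
    diff = z - mu
    return float(np.sqrt(diff @ np.linalg.solve(sigma, diff)))

def ema_update(mu, sigma, z, beta=0.99):
    """Slowly track drift by blending in a new benign observation."""
    diff = z - mu
    mu_new = beta * mu + (1 - beta) * z
    sigma_new = beta * sigma + (1 - beta) * np.outer(diff, diff)
    return mu_new, sigma_new

rng = np.random.default_rng(0)
Z_train = rng.normal(size=(500, 3))  # stand-in for encoder outputs on benign traffic
mu, sigma = fit_cluster(Z_train)

# Threshold delta taken from a percentile of the training distances.
delta = np.percentile([mahalanobis(z, mu, sigma) for z in Z_train], 99)

z_far = mu + 10.0  # a latent vector far from the normal cluster
is_anomaly = mahalanobis(z_far, mu, sigma) > delta
print("flagged as anomaly:", bool(is_anomaly))
```

Updating the cluster only with samples that were not flagged is one way to follow gradual drift without absorbing abrupt malicious deviations into the model of normal behavior.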
Contrastive learning framework for unsupervised anomaly detection.
The contrastive and causal learning modules of the proposed Causio-TwinChain framework form a sequential analytical pipeline designed for effective anomaly detection and root-cause inference. First, telemetry gathered from IoT devices and mirrored in the DT environment is processed to extract behavioral characteristics, including network traffic patterns, transitions between device states, and system logs. The encoded features are then mapped to latent representations by a contrastive learning model, which is trained to cluster normal operating states and to separate abnormal patterns from them in the latent space. Detected anomalies are passed to the causal learning module, where a structural causal model examines interdependencies among the system’s variables to determine possible root causes of the abnormal behavior. Through joint representation learning and causal inference, the framework not only identifies abnormal behaviour but also provides interpretable explanations of how attacks spread and how the system fails, enabling proactive monitoring and mitigation decisions in the Industrial IoT setting.
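The contrastive objective used to train the encoder can be illustrated with a NumPy version of the NT-Xent loss. This is a sketch of the standard formulation (cosine similarity, temperature scaling, every other sample in the batch treated as a negative) and not the paper's training code; the batch, the embedding dimension, and the temperature value are illustrative.

```python
import numpy as np

def nt_xent_loss(z_a, z_b, tau=0.5):
    """NT-Xent loss; z_a[i] and z_b[i] embed two augmented views of sample i."""
    n = len(z_a)
    z = np.concatenate([z_a, z_b], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit vectors give cosine similarity
    sim = (z @ z.T) / tau
    np.fill_diagonal(sim, -np.inf)                    # a sample is never its own negative

    # Index of the positive partner for each of the 2n rows.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax over each row, read off at the positive.
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    log_prob_pos = sim[np.arange(2 * n), pos] - log_denominator
    return float(-log_prob_pos.mean())

rng = np.random.default_rng(1)
a = rng.normal(size=(8, 4))  # stand-in embeddings of one view
aligned = nt_xent_loss(a, a + 0.01 * rng.normal(size=a.shape))  # correct pairing
shuffled = nt_xent_loss(a, np.roll(a, 1, axis=0))               # wrong pairing
print("aligned pairs give lower loss:", aligned < shuffled)
```

The loss is lowest when the two views of each sample are the most similar pair in the batch, which is what pulls normal operating patterns into a compact region of the latent space.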
Phase 4: causal learning for diagnosis and prediction
The framework initiates causal analysis when the contrastive learning module reports an anomaly, triggering the SCM and transitioning to diagnostic intelligence. Figure 6 shows how the Causio-TwinChain framework moves from anomaly detection to causal reasoning to provide accurate diagnosis and predictive analysis. When an anomaly is detected, the Structural Causal Model identifies the system variables with a significant causal impact, isolating the actual root of the problem. The framework then reasons counterfactually: it changes the identified cause and predicts the resulting cascading effects across components. This generates an impact prediction vector that describes future risks and enables explainable, proactive, and system-wide security decisions.
Structural causal learning pipeline for root-cause diagnosis and counterfactual impact prediction in Causio-TwinChain.
The SCM is a formal representation of the system’s causal architecture, defined by the quadruple \(\mathcal{M}=\langle U,V,F,P(u)\rangle\), where \(U\) is the set of exogenous variables, \(V=\left\{{V}_{1},{V}_{2},\dots, {V}_{n}\right\}\) represents the set of endogenous system variables, \(F=\left\{{f}_{1},{f}_{2},\dots, {f}_{n}\right\}\) are the structural functions that define the causal relationships, and \(P(u)\) characterizes the probability distribution over exogenous variables. Each structural equation follows the functional form \({v}_{i}\leftarrow {f}_{i}\left(p{a}_{i},{u}_{i}\right),\) with \(p{a}_{i}\subseteq V\setminus \left\{{V}_{i}\right\}\) representing the causal parents of the variable \({V}_{i}.\) Root cause diagnosis leverages do-calculus to compute interventional distributions that identify the main fault by quantifying the causal effect. The causal effect \({\psi }_{j\to i}\) of variable \({V}_{j}\) on target variable \({V}_{i}\) is calculated as in Eq. (10),
Variables with \(\left|{\psi }_{j\to i}\right|>\delta\), where \(\delta\) is a statistical significance threshold, are identified as root causes \({\mathcal{R}}_{cause}\). This approach allows anomalies to be attributed precisely to a given system component or to environmental factors.
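The thresholding of interventional effects can be illustrated on a toy SCM. The sketch below assumes that \({\psi }_{j\to i}\) is measured as the difference in the expected target value between \(do({V}_{j}=1)\) and \(do({V}_{j}=0)\), which is one common definition and may differ from Eq. (10); the three variables, the linear structural equations, and the threshold are hypothetical.

```python
import random

def simulate(n, do=None, seed=0):
    """Sample a toy linear SCM: load -> traffic -> cpu, plus load -> cpu."""
    rng = random.Random(seed)
    do = do or {}
    rows = []
    for _ in range(n):
        # Exogenous noise terms, drawn identically regardless of the intervention.
        u = {name: rng.gauss(0, 1) for name in ("load", "traffic", "cpu")}
        v = {}
        # An intervention do(V = value) replaces that variable's structural equation.
        v["load"] = do.get("load", u["load"])
        v["traffic"] = do.get("traffic", 2.0 * v["load"] + u["traffic"])
        v["cpu"] = do.get("cpu", 0.5 * v["load"] + 1.5 * v["traffic"] + u["cpu"])
        rows.append(v)
    return rows

def causal_effect(cause, target, n=2000):
    """E[target | do(cause=1)] - E[target | do(cause=0)], estimated by simulation."""
    mean = lambda rows: sum(r[target] for r in rows) / len(rows)
    return mean(simulate(n, do={cause: 1.0})) - mean(simulate(n, do={cause: 0.0}))

DELTA = 1.0  # illustrative threshold
effects = {c: causal_effect(c, "cpu") for c in ("load", "traffic")}
root_causes = [c for c, psi in effects.items() if abs(psi) > DELTA]
print(effects, root_causes)
```

In this toy model the total effect of `load` on `cpu` combines the direct path and the path through `traffic`, which is why interventions rather than raw correlations are needed to rank candidate root causes.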
The framework then proceeds with counterfactual reasoning to predict the potential cascading effects across the system. The process operates across three hierarchical levels of inference, observational \(P({v}_{i}\mid {v}_{j})\), interventional \(P({v}_{i}\mid do({v}_{j}))\), and counterfactual, enabling comprehensive impact assessment. The counterfactual analysis follows a formal three-step process as defined in Eq. (11).
where \(E\) is the observed evidence of the anomaly. This rigorous methodology generates an impact vector \({\mathcal{I}}_{impact}={\left[P\left({V}_{k}^{*}\notin {\mathcal{N}}_{k}\mid {\mathcal{R}}_{cause},E\right)\right]}_{k=1}^{n}\) that quantifies the propagation risk to each system component, with \({\mathcal{N}}_{k}\) representing the normal operating region for the variable \({V}_{k}\). Continuous model refinement is achieved through causal discovery, \({\mathcal{G}}_{updated}=\text{arg}\,\underset{\mathcal{G}}{\text{max}}\,BIC\left(\mathcal{G}\mid {\mathcal{D}}_{historical}\cup \left\{{\overrightarrow{z}}_{anomaly}\right\}\right)\), ensuring the framework adapts to evolving system dynamics and emerging threat patterns while maintaining explainable, causally grounded security intelligence.
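A three-step counterfactual query of this kind is commonly carried out as abduction, action, and prediction. The sketch below shows those steps on a toy linear SCM with additive noise; the variables, coefficients, and observed evidence are hypothetical, and the example returns point predictions rather than the probabilities that make up \({\mathcal{I}}_{impact}\).

```python
def abduct(obs):
    """Step 1 (abduction): recover the exogenous terms consistent with the evidence."""
    return {
        "load": obs["load"],
        "traffic": obs["traffic"] - 2.0 * obs["load"],
        "cpu": obs["cpu"] - 0.5 * obs["load"] - 1.5 * obs["traffic"],
    }

def predict(u, do=None):
    """Steps 2 and 3 (action, prediction): replace equations by do(), then propagate."""
    do = do or {}
    v = {}
    v["load"] = do.get("load", u["load"])
    v["traffic"] = do.get("traffic", 2.0 * v["load"] + u["traffic"])
    v["cpu"] = do.get("cpu", 0.5 * v["load"] + 1.5 * v["traffic"] + u["cpu"])
    return v

def counterfactual(obs, do):
    return predict(abduct(obs), do)

# Observed anomaly evidence E: traffic and CPU usage are unusually high.
evidence = {"load": 1.0, "traffic": 5.0, "cpu": 9.0}

# What would CPU usage have been had traffic been held at a normal level?
cf = counterfactual(evidence, do={"traffic": 2.0})
print(cf["cpu"])  # 4.5
```

Holding the recovered noise terms fixed is what distinguishes this query from a plain intervention: the answer is specific to the incident that was actually observed.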
The Structural Causal Model (SCM) is subject to a multi-stage validation mechanism to ensure that the framework supports valid, consistent, and stable causal reasoning in dynamically evolving IoT environments. First, DT-based intervention testing applies controlled perturbations to the putative causal variables in the virtual environment to determine whether the predicted effects align with the observed system responses. Second, the framework performs periodic sliding-window causal re-estimation on recent telemetry data to identify changes in device interactions and environmental conditions. Lastly, the counterfactual predictions provided by the SCM are compared with actual system outcomes recorded in the blockchain audit trail. Through this repeated validation process, the causal graph can adapt to the system’s evolving dynamics while preserving correct, reliable root-cause inference.
Phase 5: mitigation and feedback loop
This final phase closes the security control loop by translating diagnostic insights into actionable responses and enabling continuous model improvement through experiential learning. The system executes Proactive Mitigation through a policy-driven automation framework that translates causal insights into containment actions. When the causal model identifies the root cause \({\mathcal{R}}_{cause}\) together with its predicted impact \({\mathcal{I}}_{impact}\), mitigation policies \(\Pi\) are activated through smart contract executions as defined in Eq. (12).
The quarantine operation follows a formal revocation protocol defined in Eq. (13).
where \({\mathcal{P}}_{critical}\) represents critical network ports and \(t\) represents the quarantine timestamp. Smart contracts implement all these policies through deterministic execution, in which every action creates an immutable transaction on the chain. The Model Update and Self-Healing mechanism implements continuous learning through Bayesian model updating. The contrastive learning model parameters \(\theta\) are refined using Eq. (14).
where \({\mathcal{P}}_{incident}\) represents the distribution of the newly observed incident data, and \(\alpha\) is the adaptation rate. The SCM is further refined in parallel by learning the causal structure using Eq. (15).
DTs enable safe mitigation validation through virtual testing using the following mathematical expression in Eq. (16).
This feedback loop creates a self-improving security system, whereby each incident improves future responses, while the blockchain provides an immutable audit trail of all mitigation actions and model updates for compliance and forensic analytics.
Mitigation and Feedback Loop
Algorithm-3 implements a self-healing security mechanism formulated as a closed loop. It initiates risk-based automated mitigation, executing measures such as quarantining a device or throttling network bandwidth according to the severity of the diagnosed impact, and records these actions permanently on the blockchain. The core intelligence is then improved through a dual-model update process. The contrastive learning model is continually adjusted using both historical and new incident data, which enhances its ability to identify anomalies. Meanwhile, the structural causal model is updated using Bayesian structure learning to incorporate new causal knowledge derived from observed incidents.
Lastly, every mitigation measure is first tested in the DT sandbox. A measure is authorized for deployment only if it is shown to reduce the impact of the anomaly and to justify its response cost, which keeps operations safe and cost-effective. This establishes a proactive feedback loop in which the system learns from threats and adapts to them, strengthening its defenses over time.
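The risk-based selection and sandbox approval described above can be sketched as a small policy function. Everything specific here is an assumption made for illustration: the action names, the severity thresholds, the cost values, and the approval rule (impact reduction must exceed response cost) are hypothetical and are not taken from Eqs. (12) to (16).

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    cost: float  # normalized response cost, hypothetical scale

def select_action(impact, high=0.8, medium=0.5):
    """Pick a containment action from the worst predicted component risk."""
    risk = max(impact)
    if risk >= high:
        return Action("quarantine_device", cost=0.3)
    if risk >= medium:
        return Action("throttle_bandwidth", cost=0.1)
    return Action("monitor_only", cost=0.0)

def approve(action, impact_before, impact_after_in_twin):
    """Deploy only if the twin shows a risk reduction larger than the cost."""
    reduction = max(impact_before) - max(impact_after_in_twin)
    return reduction > action.cost

impact = [0.2, 0.9, 0.4]            # predicted risk per component
action = select_action(impact)      # severe risk, so quarantine is proposed
simulated = [0.1, 0.2, 0.1]         # risk after replaying the action in the DT sandbox
print(action.name, approve(action, impact, simulated))
```

In the framework itself, an approved action would additionally be executed through a smart contract so that the decision and its outcome are recorded on the ledger.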
Results and discussion
The proposed Causio-TwinChain framework was experimentally evaluated using the N-BaIoT dataset, which provides realistic IoT network traffic containing various types of attacks: DDoS, DoS, scanning, and injection. Preprocessing of the dataset involved feature normalization and temporal separation to create structured telemetry inputs. To model a realistic Industrial IoT setting, approximately 150–200 heterogeneous IoT nodes were simulated, including sensors, gateways, and DT simulators that produce high-frequency telemetry streams. The experiments were conducted on a system with an Intel Core i7 processor, 32 GB of RAM, and an NVIDIA RTX 3060 graphics card, running Ubuntu 22.04. The implementation was done in Python with TensorFlow and PyTorch, and Hyperledger Fabric was used for the permissioned blockchain layer. The contrastive learning model used a latent dimension of 128, a batch size of 64, a learning rate of 0.001, and 100 training epochs. A sliding-window causal discovery mechanism was used to perform dynamic causal inference, supporting reproducible and reliable performance evaluation.
Dataset description
Experiments are conducted on the N-BaIoT dataset [38], a comprehensive collection of IoT botnet traffic. It contains network-flow statistics for nine commercial smart devices. The dataset was recorded under controlled laboratory conditions, with the devices exposed to both benign workloads and malicious activities caused by botnets. Specifically, the captured malicious activities belong to the Mirai and Bashlite families. With over 70 million data instances, it is one of the largest publicly available IoT security datasets.
Data samples include 115 time-windowed statistical features, including packet-level and flow-level behavioral metrics, distributions of bytes and packet intervals, TCP/UDP header values, and traffic ratios. These aggregated properties capture patterns of device behavior in regular use and in coordinated attack situations, enabling the detection of anomalies and effective analysis of their security posture.
The device set is broad, including doorbells, thermostats, baby monitors, webcams, and security cameras, which makes cross-device generalization and robustness testing feasible. Traffic volumes vary across devices due to differences in hardware, firmware, and background activities. The Mirai attacks (ACK, SYN, and UDP floods, plus SCAN) and the Bashlite attacks (junk, UDP, TCP, and COMBO floods) are represented across devices and cover a wide range of real attack patterns.
The proposed Causio-TwinChain framework is trained on a continuous stream of telemetry data from IoT devices and cyber-physical infrastructure, including network traffic logs, machine and system conditions, and operational measurements. These data streams are mirrored in the DT environment, where they are preprocessed to remove noise, normalize features, and produce labeled behavioral patterns indicative of normal operations and attack situations. Records of historical security events verified on the permissioned blockchain are also used for training and validating the learning models.
The analytics layer of the architecture hosts the trained contrastive learning and causal inference models, which are typically deployed on edge or fog nodes connected to the DT platform. In real-time operation, the contrastive learning model first analyzes incoming telemetry data to detect anomalies, and any suspicious patterns are then sent to the causal model for root-cause analysis. The resulting security information is then securely stored on the blockchain, which enables distributed verification and tamper-resistant logging across the IoT network. Table 2 summarizes the key dataset properties used in this research, including device type, benign instances, training time, traffic object size, and botnet attack type.
Experimental setup
To test the proposed Causio-TwinChain framework, the N-BaIoT dataset was used in a controlled computational environment, with a consistent setup for each step of DT synchronization, blockchain logging, anomaly detection, and causal diagnosis. The tests were performed on a workstation running Ubuntu 22.04 with an Intel i9 processor, an NVIDIA RTX 3090, 64 GB of RAM, and 2 TB of NVMe storage. Python 3.10 was the main development platform, with PyTorch used to implement contrastive learning, DoWhy and CausalML used to implement causal inference, SciPy state-space modeling used to implement the DTs, and Hyperledger Fabric used to implement the permissioned blockchain. Docker containers were used to maintain reproducibility throughout the experiment. The dataset was split so that only benign samples were used to train the unsupervised model, and only attack instances were evaluated. Performance across diverse IoT devices was assessed through device-level cross-validation and standard evaluation metrics.
This work carefully selects the hyperparameters in Tables 3, 4, and 5 to ensure optimal performance across all components of the Causio-TwinChain framework. The DT parameters in Table 3 enable stable-state synchronization by balancing process and measurement uncertainties, thereby allowing timely flagging of anomalies. Likewise, the contrastive learning settings in Table 4 enable strong representation learning through appropriate latent dimensionality, temperature scaling, and data augmentation, resulting in high sensitivity to minute deviations. The causal learning parameters in Table 5 are set to enable reliable structural discovery, precise intervention estimation, and accurate counterfactual reasoning. These concerted capabilities yield robust anomaly detection, dependable root-cause diagnosis, and consistency across a wide range of IoT devices.
Causio-TwinChain uses a distributed edge-cloud architecture in which DTs run on clusters of devices, telemetry messages are aggregated before blockchain commitment, and anomalies are identified on compact latent embeddings, which makes scalable processing of high-frequency data streams from thousands of heterogeneous devices possible. The Causio-TwinChain framework outperformed the baseline methods, including LSTM-based IDS [24], XAI-based ML-IDS [25], DT2SA [26], and BCE-IoT [27], achieving an accuracy of 98.7% and a very low error rate. This exceeds the reported capabilities and reliability of these intrusion detection approaches, indicating the strength and adaptability of the proposed model across heterogeneous IoT environments.
F1-score improvement
The F1-Score improvement measure assesses the relative performance gain the proposed model achieves over baseline methods by examining how well the system balances detection reliability and completeness. This metric is based on model-level performance scores rather than on the confusion-matrix components. Let \({\Phi }_{prop}\) denote the F1-Score of the proposed model and \({\Phi }_{base}\) the F1-Score of a baseline model. The improvement is given by \({\Delta \Phi }=\frac{{\Phi }_{prop}-{\Phi }_{base}}{{\Phi }_{base}}\times 100\), which expresses the percent change in overall detection ability. A larger \(\Delta \Phi\) indicates that the proposed framework provides more stable and consistent detection performance in the heterogeneous IoT setting, effectively modeling complex behavioral anomalies that cannot be captured by traditional IDS, ML-powered IDS, DT-only systems, or blockchain-only detectors. This measure indicates better generalization and durability of the proposed system across various operating conditions.
Figure 7 provides the full analysis of F1-Score improvements across various evaluation dimensions. Fig. 7a shows that the proposed Causio-TwinChain performs better than all the baseline IDS models across various IoT devices. Figure 7b shows that the F1-score increases steadily as training continues, reaching a maximum of 0.97. Fig. 7c shows that Causio-TwinChain is largely insensitive to higher noise levels, whereas all baselines degrade drastically. Lastly, Fig. 7d shows that, at varying attack intensities, F1-scores are higher and more concentrated for the proposed model, indicating stability and the ability to generalize across different adversarial conditions. Together, these findings support the reliability and robustness of detection.
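As a small worked example, the relative improvement \(\Delta \Phi\) can be computed as below. The function name and the two F1 values are hypothetical and chosen only to illustrate the arithmetic; they are not results from the evaluation.

```python
def f1_improvement(phi_prop: float, phi_base: float) -> float:
    """Percent change of the proposed model's F1-score relative to a baseline."""
    if phi_base <= 0:
        raise ValueError("baseline F1-score must be positive")
    return (phi_prop - phi_base) / phi_base * 100.0

# Illustrative values only: a baseline at 0.80 and a proposed model at 0.92.
print(round(f1_improvement(0.92, 0.80), 2))  # 15.0
```

Because the improvement is relative to the baseline, the same absolute gain counts for more against a weak baseline than against a strong one.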
F1-Score improvement evaluation across devices, methods, noise conditions, and attack intensities.
False-positive reduction
False-positive reduction quantifies the extent to which the proposed system reduces unnecessary or incorrect alerts by correctly separating benign IoT patterns from malicious deviations. Rather than relying on raw error counts, the metric evaluates false-alert reduction through well-defined alert-rate variables. Let \({\lambda }_{base}\) be the false alert rate produced by a baseline IDS, and let \({\lambda }_{prop}\) be the rate produced by the proposed model. This reduction can be expressed as \({\Psi }_{FP}=\frac{{\lambda }_{base}-{\lambda }_{prop}}{{\lambda }_{base}}\times 100,\) which gives the percent reduction in false alarms. A higher \({\Psi }_{FP}\) reflects sharper anomaly boundaries, better contextual awareness of device behavior, and more reliable interpretation of benign variations. This metric is particularly important in real-world IoT deployments, where an overabundance of false alerts burdens resources and inhibits rapid response. The results show that the proposed system enhances operational trustworthiness compared to competing approaches.
Figure 8 compares the false-positive reduction (\({\Psi }_{FP}\)) for the Doorbell, PT Camera, Thermostat, and Baby Monitor devices. The baseline models show broader and less consistent distributions in all cases, indicating inconsistent filtering of benign variations. The proposed Causio-TwinChain consistently reaches high and compact \({\Psi }_{FP}\) values, reflecting narrower anomaly boundaries in Fig. 8a, enhanced contextual robustness in Fig. 8b, and better handling of periodic telemetry in Fig. 8c and of benign acoustic activity in Fig. 8d. Overall, the proposed framework is more precise and more reliable across devices.
False-positive reduction (\({\Psi }_{FP}\)) distribution for four IoT devices: (a) Doorbell, (b) PT Camera, (c) Thermostat, and (d) Baby Monitor, comparing baseline IDS models with the proposed Causio-TwinChain framework.
Diagnosis time speed-up
Diagnosis time speed-up measures how much faster the proposed system detects, analyzes, and interprets anomalies than baseline models, focusing on computational responsiveness rather than event counts. Let the average diagnosis time of a baseline method be \({\tau }_{base}\) and that of the proposed model be \({\tau }_{prop}\). The speed-up is calculated as \({\Omega }_{diag}=\frac{{\tau }_{base}-{\tau }_{prop}}{{\tau }_{base}}\times 100\), which yields the percentage improvement in diagnostic latency. A larger \({\Omega }_{diag}\) means that the system provides faster anomaly localization and causal reasoning through DT synchronization, efficient computations in the latent space, and organized causal tracing. This measure is highly significant in IoT settings, where attacks can propagate very rapidly and diagnostic delays are costly. The resulting improvement in real-time detection and response indicates that the proposed model outperforms the baseline mechanisms.
Figure 9 collectively depicts the superiority of the proposed Causio-TwinChain framework in terms of diagnosis time speed-up across various analytical perspectives. First, the peak of the proposed model is clearly right-shifted in Fig. 9a, reflecting higher and more stable speed-up values. Its dominance in upper quantiles and consistent clustering in high-performance regions are further highlighted by Fig. 9b and c. Broadened and upward distributions, along with right-shifted cumulative curves in Fig. 9d and e, further confirm its uniform gains across all attack types. The same trend is evident in Fig. 9f, with higher means and tighter confidence intervals. Next, Fig. 9g shows densely packed high-speed clusters, while device-wise plots in Fig. 9h further confirm consistent acceleration across all IoT devices. Lastly, Fig. 9i depicts the concentrated swarms of high values for the proposed model, in contrast to the scattered, lower-performing baselines. Overall, this integrated visualization underscores the model’s strong reliability, cross-device robustness, and substantial latency reduction.
Visualization of Diagnosis Time Speed-Up (\({\Omega }_{diag}\)) achieved by the proposed Causio-TwinChain model compared to baseline intrusion detection systems across multiple IoT devices and attack categories.
Robustness under noise
Robustness to noise assesses the proposed model’s ability to maintain high-quality detection even when IoT streams are substantially distorted, for example, by noise, missing information, or adversarial perturbations. Define \({\Gamma }_{clean}\) as the system’s performance (accuracy or F-score) on clean data and \({\Gamma }_{noise}\) as performance under noisy conditions. Robustness is given by \({\Theta }_{noise}=\frac{{\Gamma }_{noise}}{{\Gamma }_{clean}}\times 100\), which shows how well performance in noise approximates ideal operation. If two systems are compared, the robustness gain is defined as \({\Delta }_{\Theta }=\frac{{\Theta }_{prop}-{\Theta }_{base}}{{\Theta }_{base}}\times 100\). Larger values reflect the system’s resilience against sensor drift, communication interference, and natural variability in IoT conditions. This indicates that the proposed system maintains the integrity of its detection significantly better than the baseline systems, while effectively fusing contrastive learning, DT validation, and structural causal filtering.
Table 6 shows that, under realistic conditions, the baseline IDS models degrade noticeably, with robustness ranging from 70 to 85% as distortion increases. Conversely, the proposed Causio-TwinChain exhibits very steady performance, at 98 to 99% across all devices, noise types, and mixed distortion cases. This indicates high resilience to sensor noise, packet loss, jitter, and adversarial perturbations. The model offers on average 18–20% greater robustness than the most effective baseline, demonstrating its noise tolerance in dynamic IoT deployments.
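The two-stage robustness calculation can be illustrated as follows. The function names and all performance values are hypothetical and serve only to show how \({\Theta }_{noise}\) and \({\Delta }_{\Theta }\) are chained; they are not taken from Table 6.

```python
def robustness(gamma_noise: float, gamma_clean: float) -> float:
    """Share of clean-data performance retained under noise, in percent."""
    return gamma_noise / gamma_clean * 100.0

def robustness_gain(theta_prop: float, theta_base: float) -> float:
    """Relative robustness advantage of the proposed model over a baseline, in percent."""
    return (theta_prop - theta_base) / theta_base * 100.0

# Illustrative values only: (score under noise, score on clean data).
theta_base = robustness(0.72, 0.90)  # baseline retains about 80% of its clean score
theta_prop = robustness(0.96, 0.98)  # proposed model retains about 98%
print(round(theta_base, 1), round(theta_prop, 1),
      round(robustness_gain(theta_prop, theta_base), 1))
```

Normalizing by clean-data performance first means the gain compares how gracefully each model degrades, independent of how well it performs without noise.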
Novel attack detection rate
The novel attack detection rate indicates how effectively the proposed system detects unseen or zero-day intrusions that differ from previously monitored attack patterns. Let \({\alpha }_{prop}\) be the detection capability of the proposed model for novel threats, and let \({\alpha }_{base}\) be that of a corresponding baseline model. The improvement is then expressed as \({\Delta }_{\Upsilon}=\frac{{\alpha }_{prop}-{\alpha }_{base}}{{\alpha }_{base}}\times 100\), which captures the relative increase in the detection of unfamiliar anomalies. This metric reflects the model’s generalization strength, which stems from contrastive feature embeddings, DT state deviations, and causal inconsistency detection. A high detection rate for novel attacks indicates that the system will remain effective against evolving threat behaviors, which generally go undetected by traditional IDS, ML-based IDS, DT-only models, and blockchain-only architectures. The metric is therefore of prime importance for future-proofing IoT security in rapidly changing cyber environments.
Figure 10 summarizes NADR performance under varying attack intensity, threshold settings, noise levels, and new attack types, and further corroborates the reliability of the proposed Causio-TwinChain. The proposed model consistently maintains values above 80–90%, whereas baseline IDS models exhibit clear degradation or fluctuations under these conditions. The stability across thresholds demonstrates well-separated latent representations, while the robustness under noise shows strong resistance to sensor and communication distortions. Moreover, higher results on unseen attack types indicate strong generalization capability. In summary, Fig. 10 confirms the model’s adaptability, resilience, and operational superiority in dynamic IoT environments.
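As an illustration of this metric, the sketch below assumes that \({\alpha }\) is measured as the percentage of samples from attack types absent from training that the detector flags; the text does not spell out this procedure, so it is an assumption. The label names, the `BENIGN` marker, and the toy predictions are all hypothetical.

```python
BENIGN = "benign"  # hypothetical label for normal traffic


def novel_attack_detection_rate(attack_types, flagged, seen_types):
    """Percentage of novel-attack samples (types not seen in training) that were flagged."""
    novel_flags = [
        f for t, f in zip(attack_types, flagged)
        if t != BENIGN and t not in seen_types
    ]
    if not novel_flags:
        raise ValueError("no novel-attack samples to evaluate")
    return 100.0 * sum(1 for f in novel_flags if f) / len(novel_flags)


def relative_improvement(alpha_prop, alpha_base):
    """Delta_Upsilon: relative increase in novel-attack detection, in percent."""
    if alpha_base <= 0:
        raise ValueError("baseline detection rate must be positive")
    return (alpha_prop - alpha_base) / alpha_base * 100.0


# Toy evaluation set: one known attack type, one type unseen during training.
types = ["benign", "known_flood", "new_scan", "new_scan", "new_scan", "new_scan"]
seen = {"known_flood"}
flags_prop = [False, True, True, True, True, False]   # hypothetical detector output
flags_base = [False, True, True, False, True, False]  # hypothetical baseline output

alpha_prop = novel_attack_detection_rate(types, flags_prop, seen)
alpha_base = novel_attack_detection_rate(types, flags_base, seen)
delta = relative_improvement(alpha_prop, alpha_base)
```

Only the four `new_scan` samples enter the rate here; benign traffic and known attack types are excluded so that the number reflects generalization rather than memorized signatures.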
Novel Anomaly Detection Rate (NADR) analysis of the proposed Causio-TwinChain framework compared with baseline IDS models under multiple operational stressors.
Ablation study for contrastive learning–based anomaly detection
To assess the strength of the contrastive learning-based anomaly detection module, an ablation experiment was conducted on two key hyperparameters: the dimensionality of the latent representation and the anomaly detection threshold. These parameters directly affect the separability of normal and anomalous telemetry patterns in the learned embedding space. The encoder network was evaluated with latent dimensions of 64, 128, and 256 under identical training settings. Detection performance was measured by accuracy, precision, recall, and F1-score. The findings in Table 7 demonstrate that a 64-dimensional embedding achieves lower anomaly separability because it compresses the feature space too much. The 128-dimensional latent space performs considerably better, improving both representational capacity and detection stability. Although increasing the embedding dimensionality to 256 enhances detection accuracy slightly, the difference is not significant and comes at a greater computational cost. Thus, the 128-dimensional latent space provides the most balanced trade-off between representational expressiveness and computational efficiency.
In addition to the latent dimensionality, the sensitivity of the anomaly decision threshold was assessed using percentile-based Mahalanobis distance boundaries derived from normal telemetry embeddings. Thresholds at the 90th, 95th, and 99th percentiles were tested to assess their effects on false-positive rates and anomaly detection. Table 8 demonstrates that the 90th percentile threshold provides greater detection sensitivity but leads to a high false-positive rate. Conversely, the stricter 99th percentile threshold minimises false positives but also reduces the detection rate, so some anomalies can go undetected. The 95th percentile threshold yields the most balanced performance, combining strong anomaly detection with an effective limit on false alarms. These findings demonstrate that the chosen 128-dimensional latent embedding with a 95th percentile anomaly threshold provides consistent detection performance across heterogeneous IoT telemetry trends while keeping the computational cost moderate.
To cope with real-world telemetry imperfections, the proposed framework combines several mechanisms that improve robustness against missing, noisy, or partially observable data. The DT layer performs continuous state estimation and reconstruction of system variables, enabling it to generate consistent virtual system states even with incomplete or corrupted sensor readings. The contrastive learning encoder maps raw telemetry into latent representations that capture temporal patterns, making it less susceptible to noise and measurement variation. The structural causal model uses probabilistic reasoning to infer dependencies from the available evidence, and blockchain-backed records facilitate data imputation and validation. Together, these mechanisms support reliable anomaly detection and causal diagnosis in dynamic industrial IoT settings.
Although the incorporation of DTs and a permissioned blockchain adds system overhead, experimental results indicate that the proposed framework incurs approximately 9–12% additional computation overhead and 6–8% additional network communication overhead compared to traditional IDS models. In return, the framework delivers significant improvements in anomaly detection accuracy, root-cause explanation, and secure auditability.
In the current research, the proposed framework is assessed on a representative Industrial IoT dataset to verify the accuracy of anomaly detection and causal diagnostics. The Causio-TwinChain architecture is designed to be dataset-independent, since the contrastive learning encoder is trained on generalized telemetry representations and the structural causal model learns relationships between system variables rather than system-specific features. Thus, the framework can be applied to other Industrial IoT security datasets with similar telemetry designs. As part of future work, the framework will be evaluated on several heterogeneous datasets, such as ToN-IoT and Edge-IIoTset, to investigate cross-dataset generalization performance.
The proposed Causio-TwinChain design supports a hierarchical edge-cloud architecture for deployment in large-scale Industrial IoT projects. Lightweight workloads, such as telemetry monitoring and DT synchronization, run at edge gateways, which minimizes communication latency and bandwidth consumption. More complex processes, such as causal inference and blockchain ledger operations, are deployed to cloud or regional nodes with greater processing resources. Permissioned blockchains reduce consensus overhead and improve transaction efficiency compared to public blockchains. Moreover, the contrastive learning module operates on a small latent space, which facilitates near-real-time anomaly detection with moderate computational power. These design decisions enable scalability, lower latency, and practical deployment in large IoT infrastructures such as smart grids and industrial automation systems.
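The selection logic described here, preferring the smallest embedding whose F1-score is not meaningfully worse than the best one, can be sketched as follows. The tolerance value and the scores are hypothetical placeholders, not the values reported in Table 7, and the helper name is an assumption.

```python
def pick_latent_dim(f1_by_dim, tolerance=0.005):
    """Return the smallest latent dimension whose F1 is within `tolerance` of the best F1."""
    if not f1_by_dim:
        raise ValueError("no ablation results supplied")
    best_f1 = max(f1_by_dim.values())
    eligible = [dim for dim, f1 in f1_by_dim.items() if best_f1 - f1 <= tolerance]
    return min(eligible)  # smaller dimension means lower computational cost


# Hypothetical ablation results (latent dimension -> F1-score).
ablation_f1 = {64: 0.910, 128: 0.965, 256: 0.968}
chosen_dim = pick_latent_dim(ablation_f1)
```

With these placeholder scores, the 256-dimensional encoder is only marginally better than the 128-dimensional one, so the smaller model is selected; tightening `tolerance` would shift the choice toward the larger embedding.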
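A percentile-based Mahalanobis threshold of the kind described above can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions: the synthetic 4-dimensional Gaussian "embeddings" stand in for real encoder outputs, the small ridge term and the nearest-rank percentile rule are choices made here, and all function names are hypothetical.

```python
import math
import random


def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def covariance_matrix(rows, mu, ridge=1e-6):
    """Sample covariance with a small ridge on the diagonal for numerical stability."""
    n, d = len(rows), len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for row in rows:
        c = [x - m for x, m in zip(row, mu)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += c[i] * c[j] / (n - 1)
    for i in range(d):
        cov[i][i] += ridge
    return cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def mahalanobis(x, mu, cov):
    c = [a - b for a, b in zip(x, mu)]
    y = solve(cov, c)  # y = cov^{-1} (x - mu)
    return math.sqrt(max(0.0, sum(ci * yi for ci, yi in zip(c, y))))


def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def fit_threshold(normal_embeddings, q=95):
    """Fit mean, covariance, and the q-th percentile distance on normal data only."""
    mu = mean_vector(normal_embeddings)
    cov = covariance_matrix(normal_embeddings, mu)
    distances = [mahalanobis(e, mu, cov) for e in normal_embeddings]
    return mu, cov, percentile(distances, q)


def is_anomalous(x, mu, cov, threshold):
    return mahalanobis(x, mu, cov) > threshold


rng = random.Random(0)  # synthetic stand-in for normal telemetry embeddings
normal = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(500)]
thresholds = {q: fit_threshold(normal, q)[2] for q in (90, 95, 99)}
mu, cov, t95 = fit_threshold(normal, 95)
flagged_normal = sum(is_anomalous(e, mu, cov, t95) for e in normal)
```

By construction, at most 5% of the normal training embeddings exceed the 95th percentile boundary, which is the mechanism behind the sensitivity versus false-alarm trade-off discussed for the 90th, 95th, and 99th percentiles.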
Conclusion, limitations, and future works
This paper proposes Causio-TwinChain, a comprehensive security framework that enhances the resilience of Industrial IoT and cyber-physical infrastructures against advanced cyber threats. The proposed architecture combines DT technology, anomaly detection through contrastive learning, structural causal reasoning, and a permissioned blockchain to create a proactive, self-diagnostic, and tamper-resilient security mechanism. The DT component maintains constant synchronization between physical and virtual devices and enables real-time monitoring, state estimation, and controlled intervention testing. The contrastive learning module learns robust latent representations from high-dimensional telemetry streams to enhance the identification of abnormal activities in heterogeneous IoT environments. Moreover, a structural causal model offers interpretable root-cause diagnosis through causal dependency analysis and counterfactual reasoning, allowing security analysts to determine the root causes of detected anomalies. The permissioned blockchain layer further strengthens the framework with secure logging, integrity checks, and traceable auditing of security events and telemetry data.
The experimental analysis shows that the proposed framework achieves higher accuracy in anomaly detection, reliable causal diagnosis, and secure telemetry management than traditional intrusion detection and monitoring methods. These findings indicate that AI-based learning methods can be effectively used together with distributed trust approaches to protect vital Industrial IoT networks.
Despite these contributions, several limitations remain. The experimental validation used a small number of datasets and controlled simulation environments, which may not fully capture the complexity of large-scale real-world deployments. Moreover, the inclusion of DT simulations and blockchain mechanisms could add extra computational and communication overhead in resource-constrained IoT contexts. Future research will focus on assessing the framework across varied Industrial IoT data, optimizing system scalability, and improving adaptive causal learning processes to enhance responses to changing cyber-threat situations.
Table 9 lists the key symbols, variables, and parameters used in the formulation of the Digital Twin synchronization, contrastive anomaly detection, and structural causal learning components of the proposed framework.
Data availability
The data that support the findings of this study are openly available at https://archive.ics.uci.edu/dataset/442/detection+of+iot+botnet+attacks+n+baiot.
Acknowledgements
This research was supported by the Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2026R259), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. This study is supported via funding from Prince Sattam bin Abdulaziz University project number (PSAU/2026/R/1447). Ashit Kumar Dutta would like to thank AlMaarefa University for supporting this research under project number MHIRSP2025017.
Funding
This work was also supported by the Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Korea government (MOTIE) (RS-2023–00303559, Study on developing cyber-physical attack response system and security management system to maximize real-time distributed resource availability, 40%); This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (RS 2024–00400955, Development of Core Security Technology to Respond to International Smart Ship Regulations, 60%).
Author information
Contributions
Ashit Kumar Dutta: Methodology, Resources, Writing - Review & Editing, Visualization, Funding acquisition. Mohd Anjum: Conceptualization, Methodology, Software, Writing - Original Draft, Writing - Review & Editing. Hong Min: Conceptualization, Methodology, Resources, Writing - Original Draft, Writing - Review & Editing, Funding acquisition. Yousef Ibrahim Daradkeh: Methodology, Validation, Formal analysis, Resources, Data Curation, Funding acquisition. Jung Taek Seo: Methodology, Validation, Formal analysis, Resources, Writing - Review & Editing, Visualization. Sana Shahab: Conceptualization, Methodology, Software, Data Curation, Writing - Original Draft, Writing - Review & Editing, Visualization, Funding acquisition.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
Not applicable.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Dutta, A.K., Anjum, M., Min, H. et al. Digital twin-assisted blockchain IoT security model using contrastive and causal learning techniques. Sci Rep 16, 15732 (2026). https://doi.org/10.1038/s41598-026-47104-6