Introduction

The roll-to-roll (R2R) manufacturing process is a continuous production method in which flexible web materials are transported through rollers while undergoing various fabrication processes1. Compared to conventional batch-based manufacturing, the R2R process takes advantage of a high-speed continuous approach, with the potential for cost efficiency and increased productivity. Despite these advantages, ensuring precision and reliability in high-speed R2R systems remains a critical challenge, particularly in industries where microscale accuracy is required2. Owing to its productivity benefits, the R2R process has been actively adopted across a variety of industrial fields, including semiconductor manufacturing3,4,5, display technology6,7, batteries8,9, and photovoltaics10,11,12, thereby driving extensive efforts toward technological commercialization. In general, the R2R manufacturing process consists of multiple sub-processes, e.g., web transport, alignment, coating, and patterning, each of which plays a key role in ensuring high production efficiency and maintaining product quality.

The web transporting process is one of the most critical sub-processes, which presents several challenges in achieving high-precision R2R manufacturing. A primary challenge arises from the complex interaction between the physical properties of the web material and the mechanical characteristics of the roller system, leading to non-uniform web stress distribution and corresponding deformation13. Deviations in web tension during processing can generate stresses that compromise the surface roughness and linewidth accuracy of applied functional layers14. In particular, excessive tension variations may exceed the elasticity threshold, resulting in winding defects such as telescoping and wrinkles. Moreover, heat exposure from drying and sintering units induces further disturbances during the R2R manufacturing process2. Therefore, enhanced operation and control strategies for web tension optimization are highly required to overcome these limitations and ensure high-quality, defect-free production.

Conventional approaches for web tension control can be broadly categorized into feedback-based and model-based methods15,16,17,18,19. First, closed-loop feedback control utilizes tension sensors or load cells to regulate web tension. For example, a load-cell-based approach measures web tension using load cells mounted on idle rollers, with the measured data being fed back to the tension controller17. However, feedback-based methods suffer from limitations such as response delay and measurement error, which reduce their effectiveness in high-speed manufacturing environments. Second, model-based approaches such as model predictive control (MPC) offer enhanced accuracy and stability by employing mathematical models to predict tension variables and optimize control strategies18. While MPC improves precision over conventional feedback control, it inherently requires accurate modeling of the system and involves high computational costs, making real-time implementation challenging, particularly in nonlinear manufacturing environments. Accordingly, these constraints prevent the methods from being scalable and adaptable for dynamic R2R manufacturing processes.

Recently, machine learning (ML) methods have yielded significant advancements in the field of autonomous manufacturing optimization. Unlike conventional approaches, ML-based methods leverage collections of data and probabilistic surrogate models to approximate nonlinear relationships, thereby offering a more adaptive and efficient optimization framework. For instance, Kanarik et al. (2023)20 proposed a Bayesian optimization method that can significantly reduce the costs and time required for developing chemical plasma processes in semiconductor fabrication, outperforming traditional approaches. Moreover, Deneault et al. (2021)21 introduced an ML-based automated optimization method designed for additive manufacturing. Alajmi & Almeshal (2021)22 explored ML-based tool wear optimization for machining parameters, highlighting its cost-effectiveness and sustainability. Compared with other ML methods for optimizing control, such as genetic algorithms (GA)23 and particle swarm optimization (PSO)24, Bayesian optimization has demonstrated superior efficiency in converging to an optimal solution while requiring fewer function evaluations. This characteristic makes it particularly suitable for R2R controller optimization, where minimizing computational cost and achieving rapid adaptation are crucial.

In addition, the integration of ML methods with digital twins (DTs) has emerged as a pivotal research direction for autonomous manufacturing optimization25,26,27,28,29. Digital twins provide a virtual model of a physical system, enabling optimal control strategies for manufacturing processes to be derived in advance. To be effective, the models within digital twins must be adaptable, continuously updating to reflect changing physical environments30. This adaptability is a core advantage of Bayesian optimization, making it particularly well-suited for integration into digital twin frameworks31. As this research area is still developing, there remains significant potential to refine ML approaches tailored specifically for R2R manufacturing and further enhance optimization outcomes through digital twin integration.

While both MPC and PSO have been applied in tension control and optimization tasks, each method comes with practical trade-offs. MPC offers precise control by predicting system behavior using dynamic models but requires accurate modeling and introduces high computational load during runtime, which limits its applicability in fast or nonlinear web handling processes. PSO, on the other hand, is typically used for offline controller tuning and lacks the ability to adapt once deployed. Traditional methods such as grid search or manual tuning are simple and intuitive but suffer from poor sample efficiency, are time-consuming, and provide no mechanism for uncertainty handling or adaptation.

In contrast, Bayesian optimization enables sample-efficient tuning through probabilistic modeling and can be executed periodically via a digital twin interface, offering real-time responsiveness without requiring system models. This balance between adaptability, simplicity, and low computational demand makes BO particularly advantageous for low-tension precision control in dynamic R2R environments. A summary comparison of these methods is presented in Table 1.

Table 1 Qualitative comparison of control strategies for web tension optimization.

Based on the aforementioned challenges and motivations, this study proposes an AI-driven digital twin framework for autonomous web tension control in R2R manufacturing systems. The proposed method integrates Bayesian optimization with Gaussian process modeling to efficiently identify and optimize control parameters, mitigating web tension instability. Unlike conventional feedback and model-based approaches, our method dynamically adjusts control parameters in response to real-time data, which enables an adaptive and self-optimizing manufacturing system. By incorporating the proposed optimization strategy into a digital twin framework, we validate the proposed model’s optimization performance, autonomous control capability, and digital twin operability. Our findings suggest the applicability of the proposed method for various autonomous manufacturing tasks in real-world applications.

Experimental setup

This section describes the physical R2R manufacturing system for flexible and printed electronics manufacturing with precise tension and registration control, and its DT counterpart for autonomous web tension control optimization.

Digital twin for Roll-to-Roll system

A critical component of the DT platform is its capability for real-time operation and data exchange between the DT and the physical twin (PT). This interaction is facilitated by client-server communication. The high-level system diagram illustrating this process is provided in Fig. 1.

Fig. 1 High-level system diagram of the roll-to-roll system digital twin for autonomous optimization of web tension controller.

In the proposed setup, the OPC UA server is hosted on the dedicated server computer and configured using KEPServerEX 6 software. This server acts as a bridge for cross-platform communication, allowing the configuration of OPC UA clients both in the PT operation software and the DT environment.

The process for commanding the PT from the DT is as follows:

  1. Session Initialization: When the DT requires the PT to perform a task, it establishes a session with the OPC UA server.

  2. Node Updates and Trigger Activation: The DT updates the relevant nodes on the server, including the activation of a trigger node that acts as an event indicator.

  3. PT Monitoring and Execution: The PT continuously monitors the trigger node for updates. Upon detecting a trigger, it adjusts the R2R system parameters accordingly.

  4. Real-Time Status Monitoring: While the PT performs the step response acquisition, the DT monitors the PT’s motor status and processing states in real time through OPC UA. This monitoring ensures synchronization, accurate execution, and immediate detection of any anomalies during the process.
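The trigger-based handshake in the steps above can be sketched with a small in-memory simulation. This is only an illustrative sketch: `FakeNodeServer` is a hypothetical stand-in for the OPC UA address space and does not use any real OPC UA library, the node names (`Kp`, `Ki_recip`, `trigger`, `status`) are assumptions, and the choice to let the PT clear the trigger as an acknowledgement is also an assumption not stated in the text.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNodeServer:
    """In-memory stand-in for the server's nodes (NOT a real OPC UA API)."""
    nodes: dict = field(default_factory=lambda: {
        "Kp": 0.0, "Ki_recip": 0.0, "trigger": False, "status": "idle"})

    def write(self, name, value):
        self.nodes[name] = value

    def read(self, name):
        return self.nodes[name]


def dt_send_command(server, kp, ki_recip):
    """DT side: update the parameter nodes first, then raise the trigger."""
    server.write("Kp", kp)
    server.write("Ki_recip", ki_recip)
    server.write("trigger", True)


def pt_poll_once(server, applied):
    """PT side: one polling cycle; apply the gains only if the trigger is set."""
    if not server.read("trigger"):
        return False
    applied["Kp"] = server.read("Kp")
    applied["Ki_recip"] = server.read("Ki_recip")
    server.write("status", "running")   # status node the DT can monitor
    server.write("trigger", False)      # assumed acknowledgement: run once
    return True


server = FakeNodeServer()
applied_gains = {}
idle_cycle = pt_poll_once(server, applied_gains)   # no command pending yet
dt_send_command(server, kp=7.5, ki_recip=730.0)
executed = pt_poll_once(server, applied_gains)     # trigger detected
repeat = pt_poll_once(server, applied_gains)       # trigger already cleared
```

Writing the parameter nodes before the trigger node means the PT never observes a raised trigger together with stale gains, which is the main reason to use a separate event indicator.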

For sensor data acquisition, the PT records step-response data and transfers it to the server via secure file transfer protocol (SFTP), saving the results as a .csv file. The DT retrieves the .csv data from the SFTP server, enabling the subsequent stages of the Bayesian optimization process. After the data is processed, the DT sends commands to restore the R2R system to its initial conditions and halts further operations.

Roll-to-roll manufacturing system

The R2R manufacturing system consists of several units: a web transporting unit, a printing and coating unit, a drying and sintering unit, and other units such as an inspection unit. The web transporting unit consists of three zones, as depicted in Fig. 1: the unwinder zone, the rewinder zone, and the main operation zone. The web unwinds from the unwinder roller and passes through a web edge guider, auxiliary ionizers, and web cleaners. The web then proceeds to the main operation zone, which is isolated by infeeder and outfeeder modules. These modules are equipped with back-up rollers to ensure no-slip web movement. Finally, the web leaving the main operation zone enters the rewinder zone, passes through another web edge guider and idle rollers with load cells attached, and is wound onto the rewinder.

Tension controller design

The tension of these zones is designed to be controlled independently with the help of nip rollers. In the current system implementation, the unwinder zone tension is controlled in an open-loop, fixed-torque regime by adjusting a powder clutch located between the unwinder motor and the unwinder roller. The rewinder zone uses a clutch mechanism similar to that of the unwinder, except that the tension is controlled in closed loop using the tension signal from a load cell attached to a guide roller. The main operation zone tension is controlled by phase shifting the outfeeder motor. The amount of shift is computed with a PI scheme with an additional moving average to reduce the steady-state error. The design of the tension controller for the main operation zone is shown in Fig. 2.

Fig. 2 R2R system outfeeder module (main operation zone) tension control block diagram.
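As a rough illustration of a PI scheme combined with a moving average, the following sketch averages the tension error over a short window before applying the proportional and integral terms. Several details are assumptions rather than the actual controller: the text does not state whether the moving average acts on the error or on the measured tension, the class name `PIWithMovingAverage`, the window length, and the sample time are hypothetical, and the integral gain is written directly as `ki` although the study reports the reciprocal of Ki.

```python
from collections import deque


class PIWithMovingAverage:
    """Discrete PI controller acting on a moving average of the error."""

    def __init__(self, kp, ki, dt, window):
        self.kp = kp                          # proportional gain
        self.ki = ki                          # integral gain (assumed form)
        self.dt = dt                          # sample time in seconds
        self.errors = deque(maxlen=window)    # most recent error samples
        self.integral = 0.0                   # accumulated averaged error

    def update(self, setpoint, measured):
        """Return the control output (e.g. a phase shift) for one sample."""
        self.errors.append(setpoint - measured)
        avg_error = sum(self.errors) / len(self.errors)
        self.integral += avg_error * self.dt
        return self.kp * avg_error + self.ki * self.integral


# Hypothetical gains and signals, chosen only to make the arithmetic easy.
controller = PIWithMovingAverage(kp=2.0, ki=0.5, dt=0.1, window=3)
u1 = controller.update(setpoint=1.0, measured=0.0)   # error 1.0
u2 = controller.update(setpoint=1.0, measured=0.5)   # errors 1.0 and 0.5
```

Averaging the error smooths sensor noise before it reaches the proportional term, at the cost of a small additional lag that grows with the window length.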

Proposed method

This section describes the design of the autonomous optimization of the R2R system controller. It covers step response acquisition and preprocessing, quality score calculation, the description of the search space, modeling of the quality score, and the proposal of new controller gains for testing.

Workflow of autonomous control optimization

The flowchart of autonomous optimization for the R2R system tension controller is shown in Fig. 3. The process starts with initial sampling in the search space to acquire an initial dataset. For this, various methods can be used, ranging from a simple grid search to random sampling methods such as Latin hypercube sampling. Additionally, the search space was divided into a finite number of points to restrict the optimization process. Each member of the initial experimental set, comprising various proportional (Kp) and integral (Ki) controller parameters, is tested on the R2R system. The system is subjected to a tension step change to produce a step response. This step response is then processed to extract the following control dynamics: time constant, overshoot, and settling time. Then, a quality score is calculated as a weighted sum of the extracted features. If the termination criteria, such as a maximum number of experiments or a convergence threshold, are not met, the Gaussian process model is updated using the current data, and a new experimental condition is proposed. This process is repeated, testing the new condition on the R2R system and recalculating the quality score after each experiment. Finally, the optimization process concludes by selecting the trial with the highest quality score as the optimal set of controller parameters for the R2R system.

Fig. 3 Flowchart of autonomous optimization of R2R system PI tension controller.

Step response acquisition and preprocessing

To evaluate the control performance of a roll-to-roll system, the Ziegler-Nichols method can be employed. This approach involves applying a step input to the system and assessing its ability to track the input signal. The method is particularly effective for analyzing the system’s response speed and accuracy in adapting to changes in input. In the present study, we apply this methodology to the tension control of a roll-to-roll machine, examining how the system’s output (tension) responds to a step increase. The resulting step response is then processed to extract key features, including the time constant, settling time, and overshoot. The time constant represents the speed at which the system initially responds to changes, settling time indicates how quickly the system stabilizes, and overshoot quantifies any excessive deviation beyond the target value.

Steady-state error, which measures the final deviation from the target value, is not considered in this study, as our system exhibits high accuracy consistently approaching 100%. Additionally, oscillatory characteristics in the steady-state region are also excluded from this analysis. These aspects were comprehensively modeled in our previous research, where we focused on steady-state characteristics and published the findings separately. Therefore, in this work, we concentrate on the transient response features that are more relevant to control performance optimization.

In each experiment, the step response of the tension control was acquired for the tension controller parameter set under investigation. The data collection consists of two phases. In the first phase, the web is accelerated to the speed set for the experiment and the tension is stabilized using the proposed controller parameters. Then, in the second phase, the step response is performed and the tension signal data is recorded. Finally, the web is stopped using suboptimal control parameters and the web tension is returned to its initial value. The step response experiment parameters are shown in Table 2.

Table 2 Step response processing parameters.

After acquiring the tension step response, three features were extracted: time constant, overshoot, and settling time. The dynamics of a first-order system can be described by Eq. (1), where \(T\left(t\right)\) is the current tension, \({T}_{initial}\) is the initial tension, \({T}_{final}\) is the after-step tension, \(t\) is time, and \(\tau\) is the time constant. At a time equal to one time constant (\(t=\tau\)), the tension has covered 63.2% of the step, i.e., \(T\left(\tau\right)={T}_{initial}+0.632\left({T}_{final}-{T}_{initial}\right)\). The tension signal was linearly interpolated between the two data points nearest to this tension level, and the time constant was taken as the corresponding time.

$$T\left(t\right)={T}_{final}+\left({T}_{initial}-{T}_{final}\right){e}^{-t/\tau}.$$
(1)

Overshoot was calculated as the difference between the maximum value of the tension signal and the after-step tension setting value (2). Settling time was defined as the duration from the application of the step input to the point where the output stays within a specified tolerance of the after-step tension setting value (3,4). An example of a step response and the extracted features is shown in Fig. 4.

$$\:overshoot={T}_{max}-{T}_{final}$$
(2)
$$\:{T}_{lower\:tolerance}={T}_{final}\times0.975,$$
(3)
$$\:{T}_{upper\:tolerance}={T}_{final}\times1.025.$$
(4)
Fig. 4 R2R system tension control step response.
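The feature extraction described by Eqs. (1) to (4) can be sketched as follows. This is a minimal sketch under stated assumptions, not the processing code used in the study: the function names are hypothetical, the step is assumed to be upward, the first sample is assumed to coincide with the application of the step, and the settling time is taken as the first sample after which the signal never leaves the ±2.5% band again.

```python
import math


def interp_time(t, y, level):
    """Time of the first upward crossing of `level`, by linear interpolation."""
    for i in range(1, len(y)):
        if y[i - 1] < level <= y[i]:
            frac = (level - y[i - 1]) / (y[i] - y[i - 1])
            return t[i - 1] + frac * (t[i] - t[i - 1])
    raise ValueError("level never reached")


def step_features(t, y, t_initial, t_final, tol=0.025):
    """Return (time constant, overshoot, settling time) of a step response."""
    # Tension reached after one time constant: 63.2% of the step, from Eq. (1).
    level = t_initial + (1.0 - math.exp(-1.0)) * (t_final - t_initial)
    tau = interp_time(t, y, level) - t[0]
    overshoot = max(y) - t_final                               # Eq. (2)
    lower, upper = t_final * (1 - tol), t_final * (1 + tol)    # Eqs. (3), (4)
    last_out = -1                       # index of last sample outside the band
    for i, yi in enumerate(y):
        if not (lower <= yi <= upper):
            last_out = i
    if last_out == len(y) - 1:
        settling = None                 # never settles within the record
    else:
        settling = t[last_out + 1] - t[0]
    return tau, overshoot, settling


# Synthetic first-order response (hypothetical values, tau = 0.2 s).
times = [i * 0.01 for i in range(301)]
tension = [2.0 + (1.0 - 2.0) * math.exp(-ti / 0.2) for ti in times]
tau_est, overshoot_est, settling_est = step_features(times, tension, 1.0, 2.0)
```

For the synthetic signal the estimated time constant recovers the value used to generate it, and the settling time is close to 0.2 ln(20) ≈ 0.6 s, the time at which an ideal first-order response enters the 2.5% band.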

Quality score calculation

The primary objective of this study is to minimize the time constant, overshoot, and settling time by optimizing the PI control parameters. However, no single combination of Kp and Ki can simultaneously minimize all three metrics. To address this trade-off, a quality score is developed to balance these competing features, enabling the identification of an optimal set of gains that minimizes overall performance criteria.

The quality score was calculated as a weighted sum of three signal features: time constant, overshoot, and settling time. To determine the weights of this linear combination, an experiment with near-optimal control parameters was repeated 10 times, and the signal features were obtained for each run. The weights were defined as \({\sigma}_{i}^{-1}\) for each feature and then normalized to sum to one (5), where \({\widehat{w}}_{i}\) is a normalized weight. The weights were obtained as \({\widehat{w}}_{time\:constant}=0.58\), \({\widehat{w}}_{overshoot}=0.32\), and \({\widehat{w}}_{settling\:time}=0.10\). The repeated experimental results for quality score weights are shown in Table 3.

$${\widehat{w}}_{i}=\frac{{\sigma}_{i}^{-1}}{{\sum}_{j=1}^{3}{\sigma}_{j}^{-1}}.$$
(5)
Table 3 Repeated experiment results for quality score weights.
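The weight normalization of Eq. (5) and the resulting weighted sum can be illustrated with a short sketch. The function names and the `sigmas` values are hypothetical placeholders (the measured values belong to Table 3); the reported weights and the step-response features of the best trial are taken from the text. Evaluating the weighted sum with those reported numbers gives 0.193, which matches the quality score reported for that trial and suggests that the features enter the sum in their raw units (s, kgf, s).

```python
def inverse_sigma_weights(sigmas):
    """Normalized weights of Eq. (5): w_i = (1/sigma_i) / sum_j (1/sigma_j)."""
    inverse = [1.0 / s for s in sigmas]
    total = sum(inverse)
    return [v / total for v in inverse]


def quality_score(features, weights):
    """Weighted sum of (time constant, overshoot, settling time)."""
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical sigma values, used only to show the normalization.
example_weights = inverse_sigma_weights([1.0, 2.0, 4.0])   # 4/7, 2/7, 1/7

# Weights and best-trial features as reported in the text.
reported_weights = [0.58, 0.32, 0.10]
best_features = [0.207, 0.105, 0.396]      # s, kgf, s
score = quality_score(best_features, reported_weights)
```

Because a feature with a small spread receives a large inverse-sigma weight, the time constant dominates the score with a weight of 0.58, while the settling time contributes only 0.10.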

Bayesian optimization for autonomous tension control

A Gaussian process was used for quality score modeling. A Gaussian process is defined as a collection of random variables, any finite number of which have a joint Gaussian distribution. It can be thought of as a distribution over functions, fully specified by its mean function \(m\left(x\right)\) and covariance function \(k\left(x,{x}^{\prime}\right)\), also known as the kernel (6).

$$f\left(x\right)\sim GP\left(m\left(x\right),\,k\left(x,{x}^{\prime}\right)\right)$$
(6)

In practice, measurements from real mechanical systems are subject to noise, so the observations become noisy. This can be modeled by assuming that the observed values \(\:y\left(x\right)\) are related to the underlying function \(\:f\left(x\right)\) by Eq. (7):

$$y\left(x\right)=f\left(x\right)+\varepsilon$$
(7)

where \(\varepsilon\) is the observation noise, typically assumed to be independent and identically distributed (i.i.d.) Gaussian with zero mean, \(\varepsilon\sim N\left(0,{\sigma}_{n}^{2}\right)\).

One advantage of using Gaussian processes is that the predictions they provide are probabilistic, giving both a mean and a variance for the predictions. The covariance function, or kernel, is a crucial component of the GP model, as it encodes assumptions about the function to be learned. Kernels define the similarity between data points, and one important property is that the sum or product of two valid kernels is also a valid kernel, which allows for the construction of more complex kernels by combining simpler ones.

In this work, a combination of a radial basis function (RBF) kernel and a white noise kernel was used; see Eq. (8) and Eq. (9), respectively.

$${k}_{RBF}\left(x,{x}^{{\prime}}\right)=cov(f\left(x\right),\:f\left({x}^{{\prime}}\right))={a}^{2}\text{exp}\left(-\frac{{\parallel\:x-x^{\prime}\parallel}^{2}}{2{l}^{2}}\right)$$
(8)
$$\:{k}_{white\:noise}\left(x,{x}^{{\prime\:}}\right)={\sigma\:}_{n}^{2}\delta\:\left(x,{x}^{{\prime\:}}\right)$$
(9)

where \(\delta\left(x,{x}^{\prime}\right)\) is the Kronecker delta function, which is 1 if \(x={x}^{\prime}\) and 0 otherwise. The combined kernel is (10):

$$\:{k}_{sum}={k}_{RBF}+{k}_{white\:noise}$$
(10)

To determine the optimal hyperparameters of the Gaussian process model, such as the length scale \(\:l\) and amplitude \(\:a\) of the RBF kernel, and the noise level \(\:{\sigma\:}_{n}^{2}\) of the white noise kernel, the log-marginal likelihood (LML) is maximized. The log-marginal likelihood is given by Eq. (11):

$$\:log\:p\left(y|X,\:\theta\:\right)=\:-\frac{1}{2}{y}^{\top\:}{{K}_{y}}^{-1}y-\:\frac{1}{2}log\:\left|{K}_{y}\right|-\:\frac{n}{2}log2\pi\:$$
(11)

where \({K}_{y}={K}_{f}+{\sigma}_{n}^{2}I\) is the covariance matrix of the noisy observations \(y\), with \({K}_{f}\) being the covariance matrix of the latent function values. The three terms in this expression correspond to the data fit, a complexity penalty, and a normalization constant, respectively. By optimizing the LML with respect to the hyperparameters \(\theta\) using gradient-based methods, the model parameters can be learned.

Once the model is trained, predictions can be made for new inputs \(\:{x}_{*}\) by deriving the posterior distribution of the function values given the observed data (12).

$$p\left({y}_{*}|{x}_{*},y,x\right)=N\left({y}_{*}|{m}_{*},{S}_{*}\right)$$
(12)

with the posterior mean and variance estimations given by Eq. (13) and Eq. (14) respectively:

$$\:{m}_{*}=K\left({x}_{*},x\right){{K}_{y}}^{-1}y$$
(13)
$${S}_{*}=K\left({x}_{*},{x}_{*}\right)-K\left({x}_{*},x\right){K}_{y}^{-1}K\left(x,{x}_{*}\right)$$
(14)

where \(\:{m}_{*}\) is the predictive mean and \(\:{S}_{*}\) is the predictive variance.
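To make Eqs. (8) to (14) concrete, the following sketch evaluates the log-marginal likelihood and the posterior mean and variance using only the Python standard library. It is a minimal illustration under stated assumptions, not the implementation used in the study: the function names are hypothetical, the prior mean is taken as zero (as Eq. (13) implies), the hyperparameters are fixed by hand rather than optimized, the prior term \(K\left({x}_{*},{x}_{*}\right)\) uses the RBF part only, and \({K}_{y}^{-1}\) is applied through a Cholesky factorization instead of an explicit inverse.

```python
import math


def rbf(x1, x2, a, l):
    """RBF kernel of Eq. (8) for two points given as sequences of floats."""
    d2 = sum((u - v) ** 2 for u, v in zip(x1, x2))
    return a * a * math.exp(-d2 / (2.0 * l * l))


def noisy_cov(X, a, l, noise_var):
    """K_y = K_f + sigma_n^2 I over the training inputs (Eqs. (9), (10))."""
    n = len(X)
    return [[rbf(X[i], X[j], a, l) + (noise_var if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def cholesky(A):
    """Lower-triangular L with L L^T = A for a positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def chol_solve(L, b):
    """Solve (L L^T) x = b by forward and back substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def log_marginal_likelihood(X, y, a, l, noise_var):
    """Eq. (11): data fit + complexity penalty + normalization constant."""
    L = cholesky(noisy_cov(X, a, l, noise_var))
    alpha = chol_solve(L, y)                                # K_y^{-1} y
    data_fit = -0.5 * sum(yi * ai for yi, ai in zip(y, alpha))
    complexity = -sum(math.log(L[i][i]) for i in range(len(y)))  # -0.5 log|K_y|
    return data_fit + complexity - 0.5 * len(y) * math.log(2.0 * math.pi)


def gp_posterior(X, y, x_star, a, l, noise_var):
    """Posterior mean (Eq. (13)) and variance (Eq. (14)) at one query point."""
    L = cholesky(noisy_cov(X, a, l, noise_var))
    k_star = [rbf(x_star, xi, a, l) for xi in X]            # K(x_*, x)
    alpha = chol_solve(L, y)                                # K_y^{-1} y
    v = chol_solve(L, k_star)                               # K_y^{-1} K(x, x_*)
    mean = sum(k * al for k, al in zip(k_star, alpha))
    var = rbf(x_star, x_star, a, l) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var


# Hypothetical one-dimensional data, used only to exercise the functions.
X = [[0.0], [1.0], [2.0]]
y = [0.0, 1.0, 0.5]
mean_at_data, var_at_data = gp_posterior(X, y, [1.0], 1.0, 1.0, 1e-6)
mean_far, var_far = gp_posterior(X, y, [100.0], 1.0, 1.0, 1e-6)
lml_single = log_marginal_likelihood([[0.0]], [0.5], 1.0, 1.0, 0.0)
```

At a training input the posterior mean reproduces the observation and the variance collapses towards the noise level, whereas far from the data the prediction reverts to the zero prior mean with variance \({a}^{2}\). This growth of uncertainty away from sampled gains is what the acquisition function later uses to drive exploration.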

The next query point is proposed using an acquisition function, a critical component of Bayesian optimization that is built on the posterior distribution provided by the GP model. Acquisition functions are designed to balance the trade-off between exploration (sampling in regions of high uncertainty) and exploitation (sampling in regions expected to yield high objective values) with the goal of minimizing the number of evaluations required to find the optimum. Among the various acquisition functions, the most common ones are the Probability of Improvement (PI), Expected Improvement (EI), and Upper Confidence Bound (UCB). In this study, EI was used for all experiments. EI chooses the next point by maximizing the expected value of the improvement over the current best observation \({f}_{best}\) as \(EI=\mathbb{E}\left[\max\left(0,f\left(x\right)-{f}_{best}\right)\right]\), considering both the predicted mean \(\mu\left(x\right)\) and the uncertainty \(\sigma\left(x\right)\). It can be computed as Eq. (15):

$$EI\left(x\right)=\left(\mu\left(x\right)-{f}_{best}-\xi\right){\Phi}\left(\frac{\mu\left(x\right)-{f}_{best}-\xi}{\sigma\left(x\right)}\right)+\sigma\left(x\right){\varphi}\left(\frac{\mu\left(x\right)-{f}_{best}-\xi}{\sigma\left(x\right)}\right)$$
(15)

where \({\Phi}\left(\cdot\right)\) and \({\varphi}\left(\cdot\right)\) are the cdf and pdf of the standard normal distribution, respectively, and \(\xi\) is a hyperparameter that adjusts the exploration-exploitation balance. The Bayesian optimization process for the R2R manufacturing system is illustrated in Fig. 5. Figure 5a-b show the mean and standard deviation predictions of the GP model fitted on 18 data points. Figure 5c shows the EI acquisition function with its maximum at Kp = 8.75 and a reciprocal of Ki = 890. Figure 5d illustrates the step response with the corresponding controller gains.
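Equation (15) can be evaluated directly with the standard library, as in the sketch below. The function names, the default value of `xi`, and the candidate list are illustrative assumptions; the formula itself is the maximization form given in Eq. (15), with the zero-uncertainty case handled separately to avoid division by zero.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, f_best, xi=0.01):
    """EI of Eq. (15) for one candidate with GP mean `mu` and std `sigma`."""
    gain = mu - f_best - xi
    if sigma <= 0.0:
        return max(0.0, gain)       # no uncertainty: improvement is known
    z = gain / sigma
    return gain * normal_cdf(z) + sigma * normal_pdf(z)


# Hypothetical candidates as (label, predicted mean, predicted std).
candidates = [("A", 1.0, 0.1), ("B", 1.0, 2.0), ("C", 0.5, 0.1)]
f_best = 1.0
scores = {name: expected_improvement(mu, sd, f_best, xi=0.0)
          for name, mu, sd in candidates}
next_point = max(scores, key=scores.get)
```

In this toy case candidates A and B have the same predicted mean, so the one with the larger predictive uncertainty receives the larger EI, which is how the acquisition function rewards exploration. Increasing `xi` shifts the balance further towards exploration.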

Fig. 5 Gaussian process modeling for autonomous web tension control: (a) Gaussian process mean, (b) Gaussian process standard deviation, (c) Expected improvement acquisition function with the maximum at Kp = 8.75 and a reciprocal of Ki = 890, and (d) R2R system web tension signal acquired for the proposed controller gains.

Safety constraints for Bayesian optimization

In classical control theory, gain margin and phase margin are key indicators used to evaluate the robustness and stability of feedback systems. Gain margin refers to how much the open-loop gain can increase before the system reaches the point of instability (i.e., where the phase crosses −180°), while phase margin indicates how much additional phase lag can be tolerated before instability occurs at the gain crossover frequency. These margins are particularly important in systems subject to model uncertainty, sensor noise, or time delays, and help ensure that small variations in system dynamics do not lead to uncontrolled behavior.

However, in this work, the system is optimized experimentally without a full analytical model, making direct frequency-domain analysis impractical. Therefore, we introduce empirical safety constraints into the Bayesian optimization process to fulfill a similar role as classical stability margins. Specifically, two strategies were implemented to mitigate the risk of instability and ensure safe exploration.

First, if integral gain is too dominant, the controller reacts too slowly initially and builds up a large control action over time. It was empirically observed that the ratio between the reciprocal of Ki and the Kp gains serves as a practical proxy for stability behavior in this system. When this ratio falls below 40, the web tension control loop becomes highly sensitive to noise, leading to persistent oscillations and unstable responses. To avoid such unsafe regions, the Bayesian optimization algorithm was configured to reject candidate gains falling below this threshold without executing the physical test.

Second, in cases where experiments failed to complete due to unstable tension control—resulting in no meaningful performance data—those trials were assigned the worst score observed in the current optimization run. This penalization discourages the algorithm from sampling similar regions in future iterations.

Through these measures, the proposed method maintains safe operation during optimization, despite the absence of a full system model, effectively replicating the stabilizing function of gain and phase margins in conventional control design.
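The two safety measures can be sketched as a candidate filter and a penalty rule. This is a hedged illustration: the function names and the candidate list are hypothetical, the threshold of 40 on the ratio between the reciprocal of Ki and Kp is taken from the text, and the direction of the score (whether a higher or lower value is better) is left as an explicit parameter because the penalty only needs to know which end of the observed range counts as worst.

```python
MIN_RATIO = 40.0   # threshold on (reciprocal of Ki) / Kp, from the text


def is_safe(kp, ki_recip, min_ratio=MIN_RATIO):
    """True if the candidate gains satisfy the empirical ratio constraint."""
    return kp > 0 and ki_recip / kp >= min_ratio


def filter_candidates(candidates):
    """Drop unsafe (Kp, 1/Ki) pairs before any physical test is run."""
    return [(kp, r) for kp, r in candidates if is_safe(kp, r)]


def penalized_score(score, history, higher_is_better=True):
    """Return `score`, or the worst score seen so far if the trial failed.

    A failed trial is represented by `score is None` (an assumption).
    """
    if score is not None:
        return score
    if not history:
        raise ValueError("no completed trials to derive a penalty from")
    return min(history) if higher_is_better else max(history)


# Gains quoted in the text plus one hypothetical unsafe pair (5.0, 100.0).
grid = [(1.25, 60.0), (7.5, 730.0), (5.0, 100.0)]
safe_grid = filter_candidates(grid)

history = [0.4, 0.9, 0.6]            # hypothetical completed scores
failed = penalized_score(None, history, higher_is_better=True)
completed = penalized_score(0.7, history, higher_is_better=True)
```

Filtering before execution keeps the physical system out of the region observed to be unstable, while the penalty gives the GP model a pessimistic data point so that the acquisition function is steered away from gains that caused a failed run.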

Results and discussion

This section delivers experimental data description, proposed model’s optimization performance and autonomous control capability, followed by demonstration for the digital twin operation of the proposed approach.

Data description

The web tension responses acquired during the optimization process are shown in Fig. 6. The responses can be categorized into three types: slow response, optimal response, and fast responses with exaggerated overshoot.

The slow response (Fig. 6a) was obtained using Kp = 1.25 and a reciprocal of Ki = 780. It is characterized by a smooth tension curve where the error is gradually compensated, slowly approaching the set value. This type of response results in an extended time constant and settling time (2.187 s and 5.968 s, respectively) while maintaining a small overshoot (0.115 kgf), and often occurs when the reciprocal of Ki is excessively large compared to Kp.

The optimal response (Fig. 6b) demonstrates rapid stabilization to the target tension, achieved with Kp = 7.5 and a reciprocal of Ki = 730. This configuration results in a short time constant (0.207 s), a settling time of 0.396 s, and minimal overshoot (0.105 kgf), achieving a well-balanced trade-off between response speed and stability.

The fast response with exaggerated overshoot (Fig. 6c) was observed with Kp = 1.25 and a reciprocal of Ki = 60, where the small reciprocal of Ki (i.e., strong integral action) led to excessive overshoot (1.545 kgf). Although the system responded quickly (0.134 s), it exhibited significant oscillations due to instability, resulting in a long settling time (1.219 s), which may degrade manufacturing quality.

These observations underscore the importance of selecting appropriate controller gains for achieving optimal performance in the R2R system, and emphasize the effectiveness of optimization methods, such as Bayesian optimization, in refining these gains for improved system behavior. To ensure robust optimization, a total of 100 experiments were conducted. The initial 12 experiments were generated using a grid search method, with step responses collected from the R2R system.

Fig. 6 R2R manufacturing system tension controller responses during autonomous controller gains optimization: (a) slow controller response, (b) optimal controller response, and (c) fast controller response with overshoot.

Optimization process of the proposed method

Figure 7 visualizes the optimization process and Bayesian optimization behavior. The Gaussian process (GP) model was used to estimate the quality score function and guide the selection of control parameters.

Figure 7a–c shows the mean, variance, and acquisition function of the GP model for the initial 12 samples. The mean prediction suggested that the highest expected quality score was around Kp = 5.5 and a reciprocal of Ki = 570, while the variance map indicated unexplored regions at higher Kp and Ki values.

The GP model was initialized using a radial basis function (RBF) kernel, where kernel length scale and diagonal noise influenced the smoothness of predictions. With the expected improvement (EI) acquisition function, the 13th sample was selected primarily based on quality score expectations rather than exploration incentives.

At the 100th iteration (Fig. 7d–f), the Bayesian optimization process had converged. The final GP model accurately approximated the quality score function, with optimal values aligned along the diagonal. The results confirm that alternative control coefficient combinations led to suboptimal step responses, either due to excessive overshoot or overly slow performance. Additionally, the high-performance region was densely sampled, while suboptimal regions remained sparsely explored, demonstrating the efficiency of Bayesian optimization.

Fig. 7 The results of Gaussian process model mean, Gaussian process model variance, and acquisition function of Bayesian optimization for 50 mm/s speed: (a-c) for the initial 12 samples and (d-f) for the final 100 samples.

Autonomous control capability

Figure 8 presents the autonomous control optimization for the R2R web tension controller operating at 50 mm/s. Over the course of 100 experiments, the quality score was iteratively updated, with the 81st experiment yielding the optimal controller parameters.

At this stage, the optimal Kp and reciprocal of Ki values were 7.5 and 730, respectively, resulting in a quality score of 0.193. The corresponding step response exhibited a time constant of 0.207 s, an overshoot of 0.105 kgf, and a settling time of 0.396 s. Compared to the best-performing initial sample, the time constant and settling time were reduced from 0.247 s to 0.207 s and from 0.447 s to 0.396 s, respectively, demonstrating improved response speed.

However, a slight increase in overshoot from 0.085 kgf to 0.105 kgf was observed, likely due to signal variance caused by steady-state noise. During the early phase of optimization (before the 35th experiment), quality score variance was relatively high due to exploration of poorly performing regions. After the 35th experiment, the algorithm focused more on refining the most promising parameter regions. From the 81st to the 84th iteration, the optimization process consistently proposed the same controller gains (Kp = 7.5, reciprocal of Ki = 730), indicating strong convergence toward an optimal solution. These results suggest that the autonomous optimization is practical for the R2R system and adaptable to various R2R system designs.

Fig. 8
Bayesian optimization results for 50 mm/s speed: (a) Kp gain, (b) Reciprocal of Ki gain, (c) time constant, (d) overshoot, (e) settling time, (f) quality score.

Digital twin operation

Supplementary Video S1 and Fig. 9 provide a detailed visualization of the DT platform’s graphical user interface (GUI) and its operation during the optimization process. The GUI is distributed across two monitors, as shown in Fig. 3. Monitor 1 features the DT simulation module, which includes a 3D simulation of the R2R system, and the DT operation module. The DT operation module orchestrates the optimization loop: it executes the optimization algorithm, sends commands through OPC UA to the R2R system to perform step responses, receives real-time web tension data from the R2R system for visualization, and retrieves the tension step response files via SFTP for use in the optimization process. Monitor 2 displays the PT operation module and the PT surveillance camera module, enabling real-time monitoring of the physical system’s operational status and visual oversight of the R2R manufacturing process. The supplementary video shows the integrated system autonomously managing and optimizing the R2R controller in real time, illustrating the practical capability of the DT framework for autonomous R2R system control.
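The loop orchestrated by the DT operation module can be sketched as follows. This is only a structural outline under stated assumptions: the OPC UA command, the SFTP retrieval, the scoring, and the optimizer are passed in as plain callables with hypothetical names (`propose`, `run_step_test`, `fetch_response`, `score`), and the stand-ins below use made-up values instead of talking to real hardware.

```python
from typing import Callable, Dict, List, Tuple

Gains = Tuple[float, float]  # (Kp, reciprocal of Ki)
History = List[Tuple[Gains, float]]


def run_optimization(
    propose: Callable[[History], Gains],
    run_step_test: Callable[[Gains], str],
    fetch_response: Callable[[str], List[float]],
    score: Callable[[List[float]], float],
    iterations: int,
) -> History:
    """Generic propose -> excite -> retrieve -> score loop."""
    history: History = []
    for _ in range(iterations):
        gains = propose(history)           # optimizer picks the next gains
        run_id = run_step_test(gains)      # e.g. a command sent over OPC UA
        response = fetch_response(run_id)  # e.g. a file retrieved over SFTP
        history.append((gains, score(response)))
    return history


# Stand-ins with made-up values so the sketch runs without hardware.
fake_candidates: List[Gains] = [(1.0, 100.0), (2.0, 200.0), (3.0, 300.0)]
fake_files: Dict[str, List[float]] = {}


def propose(history: History) -> Gains:
    return fake_candidates[len(history) % len(fake_candidates)]


def run_step_test(gains: Gains) -> str:
    run_id = f"run_{len(fake_files):03d}.csv"
    fake_files[run_id] = [gains[0], gains[1]]  # placeholder "trace"
    return run_id


def fetch_response(run_id: str) -> List[float]:
    return fake_files[run_id]


def score(response: List[float]) -> float:
    return abs(response[0] - 2.0)  # arbitrary toy score, lower is better


history = run_optimization(propose, run_step_test, fetch_response, score, iterations=3)
best_gains, best_score = min(history, key=lambda item: item[1])
```

Passing the communication steps in as callables keeps the loop independent of any particular OPC UA or SFTP client library, which would be substituted for the stand-ins in a real deployment.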

Fig. 9
Screen capture of the Digital Twin (DT) platform interface during web tension controller optimization. (a) Monitor 1: DT simulation module with a 3D simulation of the R2R system and the DT operation module, managing optimization, real-time data visualization, and file retrieval. (b) Monitor 2: Physical Twin (PT) operation module and PT surveillance camera module, providing real-time system monitoring and visual oversight.

Latency and scalability analysis of the DT platform

In our implementation, the latency of the DT platform is primarily determined by two components: the data acquisition and transmission time between the physical system and the DT interface, and the computational time for Bayesian optimization using Gaussian process modeling. The DT platform leads the optimization process by controlling the R2R system through a defined step-profile sequence consisting of three phases: acceleration (10 s), step excitation (20 s), and return to baseline (5 s). Communication latency between the DT and the R2R system via OPC UA was measured at roughly 100 ms per command-response cycle, which supports timely command execution. Measurements sampled at a 47 ms resolution generate approximately 70 kB of data per iteration, which is transferred securely via SFTP. Under typical network conditions, data transfer latency is negligible; however, delays can arise from connection overhead or limited bandwidth.
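The figures quoted above can be combined into a rough per-iteration time budget. The sketch below is back-of-the-envelope arithmetic, not a measurement: the number of OPC UA commands per iteration (`COMMANDS_PER_ITER`) is a hypothetical value, and the GP computation time and SFTP transfer time are left out because they are not quantified in the text.

```python
# Values taken from the text.
ACCEL_S, STEP_S, RETURN_S = 10.0, 20.0, 5.0  # step-profile phases
OPCUA_ROUND_TRIP_S = 0.100                   # per command-response cycle
DATA_PER_ITER_KB = 70.0
ITERATIONS = 100

# Assumption for illustration: one command per profile phase.
COMMANDS_PER_ITER = 3

profile_s = ACCEL_S + STEP_S + RETURN_S          # physical test time per iteration
comm_s = COMMANDS_PER_ITER * OPCUA_ROUND_TRIP_S  # command latency per iteration
iteration_s = profile_s + comm_s

comm_share = comm_s / iteration_s                # fraction spent on OPC UA latency
total_minutes = ITERATIONS * iteration_s / 60.0  # whole optimization run
total_data_mb = ITERATIONS * DATA_PER_ITER_KB / 1000.0

print(f"per iteration: {iteration_s:.1f} s, OPC UA share: {comm_share:.2%}")
print(f"100 iterations: {total_minutes:.1f} min, {total_data_mb:.1f} MB of data")
```

Under these assumptions the 35 s physical step profile dominates each iteration, and command latency accounts for less than 1% of it, so a 100-iteration run takes a little under an hour of machine time.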

On the computational side, Gaussian process modeling, which is central to Bayesian optimization, has a cubic time complexity of O(n³) with respect to the sample size n, which poses scalability challenges as data accumulate. To address this, techniques such as sparse Gaussian processes, low-rank approximation, and incremental updates can be considered to reduce computation time with little loss of accuracy. Overall, at the sample sizes used in this study, the platform supports real-time interaction and scales well for practical use in R2R optimization tasks.
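The cubic cost comes from factorizing the n × n kernel matrix each time the GP is refitted. The sketch below makes this concrete with a plain-Python Cholesky factorization that counts its inner-loop multiplications. The RBF kernel, its length scale, the noise variance, and the evenly spaced 1-D inputs are assumptions chosen for illustration only; the paper's GP operates on the two-dimensional gain space, and its kernel is not specified in this section.

```python
import math


def rbf_kernel_matrix(xs, length_scale=1.0, noise_var=1e-2):
    """Kernel matrix K + noise * I for 1-D inputs (illustrative hyperparameters)."""
    n = len(xs)
    return [
        [
            math.exp(-0.5 * ((xs[i] - xs[j]) / length_scale) ** 2)
            + (noise_var if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


def cholesky_with_count(a):
    """Return (L, mults) with L lower triangular, L @ L.T == a.

    mults counts the inner-loop multiplications, which grow as roughly n**3 / 6.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j]
            for k in range(j):
                s -= low[i][k] * low[j][k]
                mults += 1
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low, mults


counts = {}
for n in (12, 100):
    xs = [0.5 * i for i in range(n)]
    _, counts[n] = cholesky_with_count(rbf_kernel_matrix(xs))
growth = counts[100] / counts[12]  # roughly (100 / 12) ** 3
```

Going from the 12 initial samples to 100 samples multiplies the factorization work by several hundred, which is still small in absolute terms but illustrates why sparse or incremental GP variants become relevant for much longer runs.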

Generalization tests across varying speeds and loads

Generalization tests were performed to evaluate whether the proposed DT framework can adaptively identify effective control parameters under varying operating conditions. Specifically, the proposed optimization process was executed at multiple web speeds (25, 50, and 75 mm/s) and tension step transitions (3 to 5 kgf, 3 to 7 kgf, and 3 to 9 kgf). For each condition, the DT coordinated multiple optimization iterations to explore and identify suitable PI gains based on real-time step response data.

As shown in Table 4, the optimization process consistently converged to stable and effective control parameters, maintaining good performance across the tested variations. These results demonstrate that the framework can generalize its optimization capabilities to different speed and load scenarios without requiring prior tuning, confirming its adaptability and robustness for a broader range of R2R manufacturing environments.

Table 4 Optimized Kp and reciprocal of Ki values, along with step response features acquired during independent optimization runs under varying speeds and tension setpoints.

Conclusion

This study demonstrated autonomous optimization of a roll-to-roll (R2R) web tension controller using a digital twin with real-time communication. By leveraging a Bayesian optimization framework, the Kp and Ki gains of the PI controller were rapidly optimized through iterative Gaussian process modeling and sample selection via the expected improvement acquisition function. Controller performance was evaluated using a quality score derived from the time constant, overshoot, and settling time of the tension step response. A total of 100 experiments were conducted at a web speed of 50 mm/s, with 12 initial experiments defined on a grid and the remaining experiments proposed by the optimization algorithm. The 81st experiment achieved the best quality score of 0.193, with corresponding Kp and reciprocal of Ki values of 7.5 and 730, respectively. The extracted features included a time constant of 0.207 s, an overshoot of 0.105 kgf, and a settling time of 0.396 s. This study highlights the potential of the proposed approach in autonomous manufacturing, where more adaptive, scalable, and intelligent industrial systems are required. Future work will extend this framework to other manufacturing systems and incorporate more advanced learning techniques for further performance improvement.