Introduction

In roll-to-roll (R2R) manufacturing, varied inspection systems have been explored to measure coating thickness over large substrate areas. Existing thin-film inspection systems provide opportunities to perform measurements whilst the material is processed; however, they are not robust enough to provide reliable in-process quality control1. Imaging ellipsometry is an optical technique proven to measure coating thickness on large substrates of 300 mm width, but it shows spatial resolution issues in the central 100 mm of the web2,3. Atomic force microscopy (AFM) is a physical technique that has been studied for its potential application in R2R systems, but it requires high-precision and significantly large equipment to position the AFM tip above the coating surface4. Interferometry-based techniques such as wavelength scanning interferometry (WSI)5 and coherence scanning interferometry (CSI)6 have overcome the well-known 2\(\pi\) phase ambiguity, and others are implementing multi-wavelength polarisation to overcome the limitations of earlier interferometric developments7; however, scaling these techniques to cover the full width of the substrate in R2R processing would incur significant cost and floor space in manufacturing, in addition to the technical challenges of handling the imaging size and adapting to ink changes. Others have created a promising approach that combines hyperspectral and RGB cameras with spectroscopic reflectometry (SR) and ellipsometry in a probabilistic sensor fusion framework to create virtual mappings of coating surfaces. Still, these techniques require offline physical mapping of the samples and thousands of measurements to map the wafer coating surface8,9.

SR is an alternative technique that measures a single point of the coated surface and has the advantage of “seeing” through the material to perform coating thickness measurements. Nevertheless, it suffers from local-minimum limitations when optimisation algorithms are used to estimate thickness values10, its accuracy decreases when inspecting rough surfaces, and it is normally used as an offline quality assurance tool1. However, SR remains attractive due to its accuracy and low cost compared to the techniques mentioned above. Commercially available SR systems can perform multi-point in-line measurements in R2R processes but are limited by the optical loss in the reflectance splitters11. The physical dimensions of the light sources and spectrometers required to perform the measurements also limit system expansion to cover larger inspection areas. Despite these disadvantages, researchers are working on newer approaches that use machine learning methods to predict coating thickness for varied coatings and substrates using commercially available reflectometers. Machine learning is a viable alternative to the single-point measurement disadvantage of SR systems; however, it still requires a considerable training data set to enable immediate thickness measurements12.

In 2021, Doo-Hyun Cho confirmed that the single-point SR disadvantage was still present in the scientific community and that its use in a potential in-process inspection system for large areas would require an excessive number of points, suggesting that it is not feasible with existing SR technologies due to size and cost constraints and unknowns in terms of data analytics9. In 2023, Sánchez-Arriaga13 presented a miniaturised lab-based reflectometer that could potentially be stacked to create a multi-sensor array with integrated light sources, challenging the existing single-point SR limitations and expanding the inspection of large-area substrates.

Although R2R processing offers high throughput and low production cost, it is also prone to process failures. These include roll starring and displacements that manifest as misalignment or flutter of flexible substrates during operation. To simulate this scenario and understand whether the sensor array can detect substrate angle variations dynamically, a robotic arm sequence is used to validate the sensor array's measurement accuracy and tilt detection capabilities for potential failures in R2R processing. This mimics the use of manipulators in thin-film wafer manufacturing to improve the automation of wafer inspection from a fabrication chamber. Table 1 shows the state of the art of recent advances and challenges in wafer inspection with in-process potential for R2R manufacturing.

Table 1 State-of-the-art - Recent improvements and challenges in wafer metrology with in-process potential for R2R manufacturing.

As observed, techniques combined with computer vision or imaging technologies are improving surface mapping; however, they are slow, requiring thousands of measurements to train models, and remain offline metrology techniques. Others, based on interferometry, offer quick measurements, but the methods require validation for thin-film thicknesses < 400 nm or are challenging to implement in R2R conditions. Additionally, such solutions could be difficult to incorporate under in-process conditions, because expansion with robotic arms or R2R processing would require multiple fibre optics, light sources and spectrometers.

As mentioned, robotic arms are commonly used to transfer these wafers from a fabrication chamber to inspection systems17. However, robotic manipulation remains a difficult task for applications that impose constraints on the motion of the end-effector, due to the high dimensionality and complexity of the end-effector and joint spaces18. Incorporating robotics for dynamic adjustments during wafer inspection introduces numerous practical challenges. Principally, robot-induced motion can corrupt optical metrology signals while manipulating objects, driving the need for carefully engineered trajectories that satisfy motion constraints to suppress measurement blur and pose drift19,20. In this work, a novel representation of these constraints is learned for maintenance purposes to ensure that the wafer remains level during manipulation, as rotations in the pitch and roll directions lead to the wafer being dropped during transportation. A Variational Autoencoder (VAE) is used to learn a lower-dimensional representation of the robot joint space so that constraints can be examined efficiently. This ensures the correct positioning of the wafer at its calibration point for wafer inspection during manufacturing and enables dynamic sequences that simulate R2R process scenarios.

This paper presents a novel spectrometer multi-sensor array capable of measuring thin-film thickness across the width of Si:SiO\(_2\) semiconductor wafers. The design has high potential for scaling to larger areas, a desired feature that contributes to the global manufacturing efficiency improvements required to achieve carbon emission reduction targets by 205013. The sensor array covers a linear width of 74 mm using seven sensors positioned strategically to reduce inspection gaps and detect angle variations. Root mean squared error (RMSE) values lower than 0.02, \(R^2\) greater than 0.9 and thickness errors below 2% were observed per sensor, which is comparable to commercially available SR systems21. Thin-film measurements are performed via the curve fitting method (CFM), calculating the RMSE between the measured reflectance curve of the coated samples (SiO\(_2\)) and a mathematically modelled curve. The thickness estimation was performed with the dogbox optimisation algorithm from the Python library SciPy, and a single thickness output of the sensor array was created by simple averaging of the individual sensor outputs22. This novel methodology for thin-film thickness measurements is shown in Fig. 1, where the offline training of the VAE is used to evaluate the constraints of the robot trajectory during wafer inspection. An overview of the process is provided in Algorithm 1.

Fig. 1
figure 1

System overview. Robot manipulator-assisted wafer transportation enables in-line thin-film thickness estimation using a multi-sensor reflectance array. All sensors are calibrated simultaneously with three intensity exposures: uncoated \(I_{\textrm{u}}\), noise \(I_{\textrm{n}}\) and coated \(I_{\textrm{c}}\), used to calculate the reflectance \(R_{\textrm{c}}\). The \(R_c\) spectrum is used to estimate the thickness measurement per sensor; the per-sensor thicknesses are then fused to give a single sensor-array thickness output. See section “Methods” for more details of the measurement procedure.

Algorithm 1
figure a

Process flow of the constrained wafer inspection

Methods

Multi-sensor array

Array hardware architecture

The sensor architecture shown in Fig. 2 is based on an original design proposed by Sánchez-Arriaga13.

Fig. 2
figure 2

Schematic of the sensor array with seven sensors: (a) Sensor sub-assembly cross section showing the STM Nucleo L432KC board, the DC converter, the pin socket, the sensor C12666MA and an LED; (b) 3D models of the array sensor holder and the lateral and front locks of the sensor devices. The top picture shows sensor zones A1-A2, B and C, and sensor positions 1-7; (c) Front and back isometric views of the array full assembly; (d) Front view of the multi-sensor array in static measurement configuration.

Figure 2a shows the improved sensor sub-assembly, designed to occupy the least space possible, with overall dimensions of 95.59 × 27 × 13.5 mm. Figure 2b shows the sensor array backbone, which holds the sensor assemblies in numbered positions (1\(\rightarrow\)7) and strategic zones (A1, A2, B and C). Zones A1 and A2 were designed to detect left and right tilt, and zones B and C to detect rear and front tilt, respectively. Sensor locks were designed to hold the sensor assemblies in a fixed position. Figure 2c shows the front and back views of the sensor array with the sensor assemblies locked in the testing position. The locks allow an M2.5 bolt to complete the array assembly. Figure 2d shows the novel sensor array assembly held by a Dinolite microscope stand RK-10A and a sample wafer on a compact five-axis stage (Thorlabs PY005/M). All sensors were connected to a DELL PC through a StarTech 7-port self-powered USB-C hub.

The sensor C12666MA is a CMOS spectrometer with 256 pixels. Each pixel corresponds to a predefined wavelength defined by the vendor as follows:

$$\begin{aligned} \lambda = A_0 + \sum _{i=1}^{5} B_i x^i, \end{aligned}$$
(1)

where \(A_0\) is the intercept, \(B_1 \rightarrow B_5\) are coefficients provided by the vendor and x is the pixel under study. Each pixel reads a relative intensity per wavelength in “counts,” defined by the microprocessor analog-to-digital converter (ADC) (max counts: \(1023 = 2^{n}-1\), where n = 10 is the ADC resolution). The sensor is enabled with an STM Nucleo-L432KC board via the Arduino IDE. The integration time was set to 0.11 s so that all 256 sensor pixels can capture the light intensity. The light gathered across the pixels forms a reflectance spectrum, which is then sent through the COM port for processing.
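
As an illustration of Eq. 1, the sketch below converts pixel indices to wavelengths. The coefficient values are placeholders, not the calibration constants Hamamatsu supplies for a specific C12666MA unit:

```python
import numpy as np

# Hypothetical calibration coefficients (the vendor supplies unit-specific values)
A0 = 300.0                                      # intercept A_0 (nm)
B = [2.7, -1.3e-3, -8.5e-6, 9.0e-9, 1.2e-11]    # B_1 .. B_5

def pixel_to_wavelength(x):
    """Map pixel indices x to wavelengths (nm) via the vendor polynomial, Eq. (1)."""
    return A0 + sum(b * x**i for i, b in enumerate(B, start=1))

wavelengths = pixel_to_wavelength(np.arange(256))  # one wavelength per pixel
```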

The sensor video output should be connected to an operational amplifier (OPAMP) buffer, as per the manufacturer's recommendations, before sending data to the microcontroller ADC. The Nucleo L432KC board was therefore selected, as it includes a configurable OPAMP at the input of its ADC. Additional power supplies were used to provide a reference voltage of 2.8 V for the Nucleo board \(V_{\text {REF}}\) input and 3.3 V to supply the LEDs. The LED intensity was regulated externally with 2.6 k\(\Omega\) potentiometers to achieve 90% of the available counts.

Reflectance curve modelling

The reflectometer's principle of operation is based on the interference phenomena described by Heavens for a single thin-film coating deposited on a semi-transparent substrate23. When light is reflected from an isotropic coated surface, a reflectance \(R'\) data point is calculated per wavelength with:

$$\begin{aligned} R' = \frac{r^2_{01}+r^2_{12}+2r_{01}r_{12}\cos 2\varphi _1}{1+r^2_{01}r^2_{12}+2r_{01}r_{12}\cos 2\varphi _1}, \end{aligned}$$
(2)

where \(r_{ij}\) is the reflection coefficient at each interface and \(\varphi _1\) is the phase change of light in the coating, given by \(\varphi _1 = k'dN_1\cos \theta _1\), where \(N_1\) is the coating refractive index, d is the coating thickness, \(\theta _1\) is the angle of incidence and \(k'\) is the wave number in vacuum. In this work, \(k'\) is determined as \(k'=\frac{2\pi }{\lambda }\), with \(\lambda\) being the wavelength under study. When a broadband light source is used, a reflectance point is calculated per wavelength, resulting in a modelled reflectance curve, as shown in Fig. 3. The modelled reflectance curve is then compared to a measured reflectance curve per sensor. The detailed process is described in13.

Fig. 3
figure 3

Reflectance curve formed by the model formula in Eq. 2 compared against the measured reflectance from Eq. 3.
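
The model in Eq. 2 translates directly into code. The sketch below treats the reflection coefficients \(r_{01}\) and \(r_{12}\) as wavelength-independent inputs for simplicity; in practice they follow from the refractive indices of the layers:

```python
import numpy as np

def modelled_reflectance(wavelength_nm, d_nm, n1, theta1_rad, r01, r12):
    """Single-layer reflectance R' per wavelength (Eq. 2), after Heavens.

    r01 and r12 are the reflection coefficients at the air/coating and
    coating/substrate interfaces; d_nm is the coating thickness in nm.
    """
    k = 2.0 * np.pi / wavelength_nm               # wave number k' = 2*pi/lambda
    phi1 = k * d_nm * n1 * np.cos(theta1_rad)     # phase change in the coating
    num = r01**2 + r12**2 + 2.0 * r01 * r12 * np.cos(2.0 * phi1)
    den = 1.0 + (r01 * r12)**2 + 2.0 * r01 * r12 * np.cos(2.0 * phi1)
    return num / den
```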

Measuring reflectance

Reflectance measurements require a coated sample (i.e. Si:SiO\(_2\)) and an uncoated sample (i.e. Si) and involve three measurements. First, the uncoated sample reflected intensity (\(I_u\)) is captured (calibrated to 90% of the available ADC counts, as shown in Fig. 1), then the dark noise intensity (\(I_n\)) and finally the coated sample reflected intensity (\(I_c\)). Once the intensities have been measured, the reflectance of the coated surface (\(R_c\)) can be calculated per pixel as follows:

$$\begin{aligned} R_c = \frac{I_c - I_n}{I_u - I_n}R_u, \end{aligned}$$
(3)

where \(R_u\) is the absolute reflectance of the uncoated sample; in practice, this value is close to one and does not noticeably affect the \(R_c\) value13. Once \(R_c\) is measured per pixel, a reflectance curve can be generated and compared to a modelled reflectance curve, as described in the previous subsection (see Fig. 3). The comparison between the modelled and the measured reflectance curves is performed in this work with two curve-fitting approaches. The first approach quantifies the root mean squared error (RMSE) between the modelled and the measured reflectance values. The second approach estimates the thickness by fitting a curve to the measured values using an optimisation algorithm; the fitted curve is then evaluated using the coefficient of determination \(R^2\). The RMSE is calculated as follows24:

$$\begin{aligned} \text {RMSE} = \sqrt{\frac{1}{n}\sum _{i=1}^{n} \left( Y_i - \hat{Y}_i \right) ^2}, \end{aligned}$$
(4)

where n is the number of pixels under comparison, \(Y_i\) is the measured value per wavelength and \(\hat{Y}_i\) is the modelled value per wavelength. The RMSE values must be close to zero to ensure reliable data. The quality threshold is defined by the user21,25; based on previous work, an RMSE value < 0.04 per sensor is considered sufficient in this study to ensure a reliable thickness estimation.
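
A minimal sketch of Eqs. 3 and 4 is shown below, assuming the three intensity arrays have already been captured per pixel:

```python
import numpy as np

def coated_reflectance(I_c, I_n, I_u, R_u=1.0):
    """Per-pixel reflectance of the coated sample (Eq. 3); R_u ~ 1 in practice."""
    return (I_c - I_n) / (I_u - I_n) * R_u

def rmse(measured, modelled):
    """Goodness of fit between measured and modelled curves (Eq. 4)."""
    residuals = np.asarray(measured) - np.asarray(modelled)
    return np.sqrt(np.mean(residuals**2))

# A sensor reading is accepted when rmse(R_meas, R_model) < 0.04.
```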

Coating thickness estimation

The thickness estimation was performed via the Python SciPy curve-fitting function using the dogbox optimisation algorithm26. This function fits the model curve to the measured reflectance curve per sensor and returns the optimised parameter values that best describe the measured data. Dogbox showed the best capability performance in a Minitab six-pack analysis compared with the other optimisation algorithms available in the SciPy library (the well-known Levenberg-Marquardt and Trust Region Reflective (TRF) algorithms). The function requires a target function to optimise, Eq. 2, and two groups of data. The first group comprises the x and y values, which are the pixel wavelengths and the measured reflectance values per pixel. The second group comprises initial estimates of the thickness, the refractive index and the angle of incidence; these are the parameters that the curve-fitting function optimises to achieve the least error between the measured reflectance values and the fitted reflectance curve.
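
A sketch of this fitting step using SciPy's `curve_fit` is shown below. It reuses `modelled_reflectance` and `wavelengths` from the earlier sketches and synthesises a noisy measured curve for demonstration; the interface coefficients, initial guesses and bounds are illustrative values only:

```python
import numpy as np
from scipy.optimize import curve_fit

R01, R12 = 0.2, -0.3   # hypothetical interface reflection coefficients

def target(wl_nm, d_nm, n1, theta1_rad):
    """Eq. 2 with fixed interface coefficients; free parameters as in the text."""
    return modelled_reflectance(wl_nm, d_nm, n1, theta1_rad, R01, R12)

# Synthetic "measured" curve standing in for a real sensor reading
rng = np.random.default_rng(0)
R_meas = target(wavelengths, 300.0, 1.46, 0.0) + rng.normal(0.0, 0.01, wavelengths.size)

popt, _ = curve_fit(
    target, wavelengths, R_meas,
    p0=[310.0, 1.45, 0.01],                      # initial thickness (nm), index, angle (rad)
    bounds=([100.0, 1.4, 0.0], [500.0, 1.5, 0.1]),
    method="dogbox",                             # the dogbox optimiser used in this work
)
d_estimated = popt[0]                            # fitted thickness per sensor (nm)
```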

Once the curve fitting function process is completed, the coefficient of determination \(R^2\) is used to understand the goodness-of-fit between the measured and the fitted values24:

$$\begin{aligned} R^2 = 1 - \frac{\sum (Y - Y_f)^2}{\sum (Y - \bar{Y})^2} = 1 - \frac{SSR}{SST}, \end{aligned}$$
(5)

where Y is the measured value per wavelength, \(Y_f\) is the fitted value per wavelength and \(\bar{Y}\) is the mean of the measured values. The numerator is also known as the sum of squared residuals (SSR) and the denominator as the total sum of squares (SST). In this work, for exploratory purposes, an \(R^2\) value > 0.7 is considered reliable for the thickness estimation per sensor. The estimated thickness value from the curve-fitting function is used as the measured thickness per sensor.
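
Expressed in code, Eq. 5 is:

```python
import numpy as np

def r_squared(measured, fitted):
    """Coefficient of determination (Eq. 5): 1 - SSR/SST."""
    measured, fitted = np.asarray(measured), np.asarray(fitted)
    ssr = np.sum((measured - fitted) ** 2)           # sum of squared residuals
    sst = np.sum((measured - measured.mean()) ** 2)  # total sum of squares
    return 1.0 - ssr / sst
```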

Sensor fusion & noise handling

The definition of “sensor fusion” has been debated for the last three decades and has recently regained controversy in academia due to the increased complexity and evolution of technology, applications and fusion algorithms. Nevertheless, following the definitions by Elmenreich22 and Klein27, this paper considers “sensor fusion” to be the combination of n sensors to obtain a better representation of the wafer area thickness under inspection. According to the central limit theorem, the thickness measurements of the individual sensors should converge close to a normal distribution, an assumption that has been verified for this sensor array28. Therefore, the sensor fusion technique was a simple averaging (SA) performed to obtain the average thickness \(\bar{Y}\)22,29:

$$\begin{aligned} \bar{Y} = \frac{1}{n}\sum _{i=1}^{n} Y_i, \end{aligned}$$
(6)

where \(Y_i\) is the individual thickness measurement from sensor \(i \in \{1, \ldots , n\}\) and \(n = 7\) is the number of sensors in the array.

To mitigate the effect of sensor and USB noise on the readings, a convolution filter was applied to smooth the intensity values.
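
A sketch of the smoothing filter and the simple-averaging fusion of Eq. 6 follows; the moving-average kernel and window length are assumptions for illustration:

```python
import numpy as np

def smooth_intensities(counts, window=5):
    """Moving-average convolution filter over the raw pixel counts.
    The window length is illustrative; the value used on the rig may differ."""
    kernel = np.ones(window) / window
    return np.convolve(counts, kernel, mode="same")

def fused_thickness(per_sensor_thickness):
    """Simple-averaging sensor fusion (Eq. 6) over the n = 7 sensor outputs."""
    return float(np.mean(per_sensor_thickness))
```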

Learning constraint manifolds in robotics

Constraint manifolds

Manifolds represent a subset of geometry dealing with curvature in higher dimensions30. Many aspects of robotic manipulation can be considered to operate on manifolds, such as symmetric positive definite (SPD) matrices for joint stiffness and unit quaternions (UQ) for orientation31. These manifolds, denoted \(\mathcal {M}\), can be considered Riemannian and dictate the capabilities of specific robotic platforms. Many robotic applications require custom end-effectors, such as the wafer transportation tool used in this work. Some platforms may, however, be restricted in the positions the end-effector can take, with tasks such as opening doors or drawers18 imposing end-effector constraints on rotation and position. Traditionally, these constraint manifolds are identified iteratively during the path-planning process through the sampling of joint configurations that satisfy the desired constraint:

$$\begin{aligned} \mathcal {M} := \{\theta \in \Phi \ |\ \Phi ^- \preceq \theta \preceq \Phi ^+ \text { and } \textbf{f}(\theta ) = 0\}, \end{aligned}$$
(7)

where \(\Phi\) is the joint space bounded by the robot's joint limits \(\Phi ^-\) and \(\Phi ^+\), \(\theta\) is a set of joint positions from \(\Phi\) and \(\textbf{f}(\theta )\) is the constraint function. For each new joint configuration sampled, the manifold grows and is identified during the planning stage. However, for complex constraint functions that limit movement on multiple axes, this process can increase planning time and reduce the efficiency of many algorithms. Additionally, many repetitive tasks maintain constraints that do not change during product life cycles, meaning that manifolds need only be identified once to ensure compliant robot movement.

Manifold learning represents a method to identify the properties of a high-dimensional manifold through prior offline data collection. Formally, it is the process of defining a function f that maps some Euclidean space into a lower dimensional manifold32:

$$\begin{aligned} \mathcal {M} = \textbf{f}(\textbf{x}) \text { with } \textbf{f}:\mathcal {X} \rightarrow \mathcal {Z}, \end{aligned}$$
(8)

where \(\mathcal {X}\) represents the Euclidean space and \(\mathcal {Z}\) represents a closed subset of \(\mathcal {X}\) lying on \(\mathcal {M}\). Determining the mapping function \(\textbf{f}\) becomes complex in robotics tasks as there is a coupling between the Euclidean motion group SE(3) and the joint configuration space Q. Approximating the function \(\textbf{f}\) is seen as a way to avoid the “curse of dimensionality” within high-dimensional models, where the function approximator interpolates between the data points to construct the manifold. Another benefit of using manifold learning for robotics is the ability to evaluate the relationship between the joint space \(\Phi\) and the pose space. Once this relationship is known, it can be used directly to evaluate when the robot drifts off-manifold and maintenance is required.

Constraint manifold identification

For the constraint function, the original problem statement of this paper is to transport a wafer sample from a fabrication chamber to the sensing array for inspection. As the wafer is deposited on the end-effector of the robot and is not locked in place, a horizontal constraint is imposed on the end-effector18. Consider the transformation matrix \(\textbf{T}^0_e\) that relates the pose in the end-effector frame \(\mathcal {F}^e\) to the base frame \(\mathcal {F}^0\), for a robot with n joints:

$$\begin{aligned} \textbf{T}^0_e = \textbf{T}^0_1\ \textbf{T}^1_2\ ...\ \textbf{T}^{n}_e = \left( \prod _{i=0}^{n-1}\ \textbf{T}^{i}_{i+1} \right) \ \textbf{T}^{n}_e = \begin{pmatrix} \textbf{R}^{0}_e & \textbf{P}^0_e\\ 0 & 1 \end{pmatrix}. \end{aligned}$$
(9)

The rotations and translations in Eq. 9 are determined using the forward kinematics of the robot manipulator from the joint angles \(\theta\). The rotation of the end-effector can be expressed in three main ways, the first being the rotation matrix \(\textbf{R}^{0}_e\) shown in Eq. 9. In this work, the rotation of the end-effector is represented in the roll-pitch-yaw (RPY) Euler angle form \([\beta \ \alpha \ \gamma ]^\intercal\) for constraint manifold identification. This yields a binary constraint vector of the form:

$$\begin{aligned} \mathbb {C}_{RPY} = \left[ c_x\ c_y\ c_z\ c_\beta \ c_\alpha \ c_\gamma \right] ^\intercal \end{aligned}$$
(10)

to constrain the 6-degree-of-freedom pose of the end-effector of a manipulator operating in \(\mathbb {R}^3 \times SO(3)\), henceforth denoted \(\mathcal {X}\) as shown in Eq. 8.

Using the defined transformation matrix \(\textbf{T}^0_e\) and constraint vector \(\mathbb {C}_{RPY}\), the constraint function \(\textbf{f}(\theta )\) can be formulated. Using the joint positions \(\theta\), the forward kinematics of the manipulator are computed to find the pose of the end-effector \(\textbf{x}^0_e\) relative to the base frame \(\mathcal {F}^0\). The constraint function is then constructed as the \(\ell _2\)-norm of the element-wise product of the constraint vector and the pose vector:

$$\begin{aligned} \textbf{f}(\theta ) = \Vert \mathbb {C}\odot \textbf{x}^0_e \Vert _2 \text { where } \textbf{x}^0_e \equiv \textbf{T}^0_e (\theta ). \end{aligned}$$
(11)

The joint positions are sampled through a Monte Carlo method, whereby increasing the number of samples improves the overall estimation of the manifold.
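
A sketch of this sampling step is given below. Here `fk` stands in for any forward kinematics routine for the manipulator (e.g. one derived from its kinematic model), and the sample count and tolerance are illustrative:

```python
import numpy as np

# Binary constraint vector (Eq. 10): constrain pitch and roll only
C = np.array([0, 0, 0, 1, 1, 0], dtype=float)

def constraint(theta, fk):
    """Eq. 11: l2-norm of the element-wise product of C and the pose."""
    pose = fk(theta)               # pose = [x, y, z, beta, alpha, gamma]
    return np.linalg.norm(C * pose)

def sample_manifold(fk, joint_lo, joint_hi, n_samples=100_000, tol=1e-3):
    """Monte Carlo identification of M (Eq. 7): keep samples with f(theta) ~= 0."""
    rng = np.random.default_rng(0)
    thetas = rng.uniform(joint_lo, joint_hi, size=(n_samples, len(joint_lo)))
    return np.array([t for t in thetas if constraint(t, fk) < tol])
```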

Learning manifolds from data

Learning manifolds from data relies on reducing a higher-dimensional manifold into a lower-dimensional space through a projection function \(\phi (z)\). This projection function can be modelled as the latent space of a function that learns representations between the joint positions, thereby learning a Riemannian manifold on that latent space. The VAE33 learns a latent space representation of the data input \(\textbf{x}\). Deep VAE models seek to maximise the evidence lower bound (ELBO), which can be modified to use the joint space of the robot manipulator to produce a lower-dimensional manifold of the joint operating positions32:

$$\begin{aligned} \mathcal {L}_{ELBO} = \mathbb {E}_{q_{\zeta }(z | \theta )}\left[ \log (p_{\Theta }(\theta | z)) \right] - \text {KL}\left[ q_{\zeta }(z | \theta ) \parallel p_\phi (z, \theta ) \right] , \end{aligned}$$
(12)

where KL denotes the Kullback-Leibler divergence between the encoder distribution \(q_\zeta (z | \theta , \textbf{f}(\theta ))\) and the Gaussian latent variables, and \(p_\Theta\) is the joint space conditional density. Once the model has been trained, the Riemannian metric first derived in32 can be used in combination with the predicted constraint value on the manifold \(\hat{\textbf{f}}(\theta )\) for the manipulator joint positions to generate a Riemannian metric corresponding to the joint constraint value:

$$\begin{aligned} \textbf{M}^\theta _{\textbf{f}}(z) = \zeta \left| -1 + \exp [\hat{\textbf{f}}(\mu (z))] \right| . \end{aligned}$$
(13)

This metric takes large values in areas where the model has high uncertainty regarding the joint positions and where the estimated constraint function \(\hat{\textbf{f}}(\hat{\theta })\) takes a large value. The model architecture, shown in Fig. 4, was deployed into the MoveIt planning interface so that plans can be evaluated to determine whether maintenance is required to calibrate the joint positions.

Fig. 4
figure 4

Model of the VAE system for generating a latent space Riemannian manifold. The latent space \(q_\zeta (z | \theta )\) is used to generate the Riemannian metric \(\textbf{M}\), which is used to determine whether the manipulator is experiencing joint drift. The estimate of the constraint function \(\hat{\textbf{f}}(\theta )\) is computed from decoding the latent space and computing the manifold constraint function.
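
A minimal sketch of such a model in PyTorch is given below, assuming a 6-joint manipulator and a 2-D latent space; the layer sizes, activations and training snippet are illustrative and not the exact architecture of Fig. 4:

```python
import torch
import torch.nn as nn

class JointVAE(nn.Module):
    """VAE embedding the joint space into a low-dimensional latent manifold."""
    def __init__(self, n_joints=6, latent_dim=2, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_joints, hidden), nn.Tanh())
        self.mu_z = nn.Linear(hidden, latent_dim)
        self.logvar_z = nn.Linear(hidden, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, hidden), nn.Tanh())
        self.mu_x = nn.Linear(hidden, n_joints)       # decoder mean head
        self.logvar_x = nn.Linear(hidden, n_joints)   # decoder variance head

    def forward(self, theta):
        h = self.enc(theta)
        mu, logvar = self.mu_z(h), self.logvar_z(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterisation
        g = self.dec(z)
        return self.mu_x(g), self.logvar_x(g), mu, logvar

def elbo_loss(theta, mu_x, logvar_x, mu_z, logvar_z):
    # Gaussian reconstruction log-likelihood of the joint positions ...
    rec = -0.5 * ((theta - mu_x) ** 2 / logvar_x.exp() + logvar_x).sum(dim=1)
    # ... minus the KL divergence to the unit-Gaussian prior (cf. Eq. 12)
    kl = -0.5 * (1 + logvar_z - mu_z.pow(2) - logvar_z.exp()).sum(dim=1)
    return -(rec - kl).mean()  # negated so that minimising maximises the ELBO

# Single illustrative training step on uniformly sampled joint configurations
vae = JointVAE()
theta = torch.rand(512, 6) * 2 * torch.pi - torch.pi
loss = elbo_loss(theta, *vae(theta))
loss.backward()
```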

Results

Wafer inspection results

Static experiments - inspection box definition

The sensor array was first validated statically to understand its capabilities. A full factorial design of experiments was performed with three samples made of a Si substrate and a layer of SiO\(_2\) coating (Si:SiO\(_2\)). The samples had the following coating thicknesses: SAMPLE1: 300 nm, SAMPLE2: 286 nm and SAMPLE3: 164 nm. The set of experiments consisted of calibration at 2 mm above the sample surface, then modifying the array height by -1 mm/+2 mm and the wafer angle up to 0.498\(^{\circ }\) (rounded to 0.5\(^{\circ }\)) in increments of 0.166\(^{\circ }\), as shown in Fig. 5.

Fig. 5
figure 5

Array height and angle experiments. (a) Sensor array calibration point at 2mm above wafer surface (red dotted line) and height variations from the calibration point -1mm/+2mm (green arrows). Thorlabs base PY005/M, reproduced with permission. The numbers are the sensor numbers i.e. SENSOR1 = 1. (b) Angle variations from 0\(^{\circ }\) to 0.498\(^{\circ }\) (rounded to 0.5\(^{\circ }\)) in steps of 0.166\(^{\circ }\). (c) The hardware setup.

Table 2 Full factorial DOE of SAMPLE1 (300nm) showing the averaged RMSE, \(R^2\) and Thickness (nm) per combination of factors (Angle vs Height). Notes: (i) The Thorlabs PY005/M base was re-positioned when measuring RIGHT-LEFT and FRONT-BACK positions. (ii) Each data value is an average of thirty readings performed by all the sensors. (iii) *Calibration point @ height = 2mm from the sample surface.

Table 2 shows the SAMPLE1 sensor-array measurements (RMSE, \(R^2\) and thickness) for angle increments of 0.166\(^{\circ }\), rounded to two decimal places. It was observed that when the RMSE was \(\le\) 0.022, the \(R^2\) was > 0.9. This is considered a good result for the sensor array as it shows that the measured reflectance curve for each sensor fits correctly to the modelled reflectance curve, as explained in the Methods section. By contrast, when the RMSE is > 0.022, the \(R^2\) tends to drop below or towards 0.9. When this occurs, some of the individual sensors present a loss of performance.

The loss of performance was first observed when the sensor array was positioned at the calibration point with a 0.5\(^{\circ }\) tilt on the right side of the sensor array. Reviewing the individual sensor performance shows that when the array tilts to the right side, SENSOR6 exhibits an RMSE > 0.04 and \(R^2\) < 0.7. When the height was varied to 1 mm below the calibration point and the sensor array was tilted to the right by 0.33\(^{\circ }\), SENSOR5 and SENSOR6 showed the same behaviour. Similarly, when there was a tilt on the left side below the calibration point, SENSOR1 and SENSOR7 showed an increase in RMSE and a decrease in \(R^2\) at an angle of 0.33\(^{\circ }\). Comparably, SENSOR6 and SENSOR7 detected variation when the wafer was tilted towards the back side of the sensor array. Finally, all sensors except SENSOR2 detected variation on the front side of the sensor array. This behaviour was repeatable for all the samples; the data can be found in Supplementary 1.

The reduced RMSE and \(R^2\) performance per sensor is caused by a change in the reflected intensity received by the sensor slit: the angle and height variations distort the measured reflectance curve. When this occurs, the RMSE increases above 0.04 and the \(R^2\) calculation is affected because the optimisation algorithm fails to fit the model curve to the measured reflectance curve, driving the \(R^2\) below 0.7. See the Methods section for more details on the \(R^2\) calculation.

Finally, despite the loss of the fit-quality metrics, all combinations showed an estimated thickness within 2% (<6 nm) of the expected thickness value of 300 nm. However, when the RMSE and \(R^2\) fail beyond the expected levels, the thickness values are not reliable, and sensor alignment to the calibration point must be performed. Based on the static experiment results, an inspection box for the presented sensor array was defined, as shown in Fig. 6.

Fig. 6
figure 6

Sensor array inspection box.

The virtual inspection box defines the limits within which the sensor array can measure thin-film thickness, and serves as the baseline for an automated inspection procedure using a robot manipulator.

Sample comparison with filmetrics F20

An additional test was performed to evaluate the capability of the sensor array to meet manufacturing tolerances. The performance of the array was compared to a Filmetrics F20 reflectometer using the three aforementioned samples, see Fig. 7.

Fig. 7
figure 7

Comparison of the proposed sensor array against the Filmetrics F20 reflectometer.

As observed, the sensor array measurements match the Filmetrics F20 measurements, all being within the vendor's tolerances. Table 3 shows that, per sample, two hundred and forty measurements were performed with the F20 in the same positions as the sensor array.

Table 3 Data of Fig. 7 including uncertainty calculation. All the tests were performed under controlled room temperature environment and using a Filmetrics reference standard for calibration purposes. *Estimated uncertainty.

The F20 showed similar thickness measurements, demonstrating that the sensor array is capable of measuring thickness values with an error \(\le\) 2.1% compared to the F20 reflectometer. Table 3 additionally shows the uncertainty calculations based on National Physical Laboratory (NPL) guidance34. The Type A uncertainty was calculated from the standard deviation of the measurements, in this case \(u = \sigma / \sqrt{n}\), where \(\sigma\) is the standard deviation and n is the sample size. The combined uncertainty (\(u_c\)) combines the Type A uncertainty with Type B factors such as the spectrometer reproducibility per pixel and temperature (0.8 nm and 0.08 nm, respectively, for the Hamamatsu sensor): \(u_c = \sqrt{u_{A}^2 + u_{B,1}^2 + \cdots + u_{B,m}^2}\). In Table 3, an additional 1 nm factor was added when calculating \(u_c\) for the sensor array to compensate for time delays and disturbances in the precision of the measurements caused by mechanical misalignment and USB connections. Additional Type B factors for the F20 were unknown, so they were estimated assuming behaviour similar to the Hamamatsu sensors; the F20 uncertainty calculation is therefore an estimate.
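
As an illustration of this combination, the sketch below reproduces the calculation with hypothetical repeat readings; the Type B values are those quoted above for the Hamamatsu sensor plus the 1 nm allowance:

```python
import numpy as np

rng = np.random.default_rng(1)
readings_nm = 300.0 + rng.normal(0.0, 1.5, 30)   # hypothetical repeat measurements

u_a = readings_nm.std(ddof=1) / np.sqrt(readings_nm.size)  # Type A: sigma / sqrt(n)
u_b = np.array([0.8, 0.08, 1.0])  # Type B (nm): reproducibility, temperature, 1 nm allowance
u_c = float(np.sqrt(u_a**2 + np.sum(u_b**2)))    # combined uncertainty (nm)
```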

Dynamic measurement accuracy

Four automated trajectories were implemented with a robot manipulator, shown in Fig. 8.

Fig. 8
figure 8

Wafer inspection experiments with robot manipulator: (a) Robot arm movement from an idle/load position to the calibration point. The green rectangle shows the wafer and a 3D-printed end effector. (b) Zoom-in of the end effector and the wafer. (c) Circular trajectory. (d) Back & forth trajectory.

Figure 8a shows the robot arm trajectory to the calibration position. First, the uncoated Si wafer is placed on the end effector and the arm moves it to the calibration point to start the calibration process. After measuring the uncoated intensity, the arm returns to its loading position so that the Si wafer can be manually removed and the coated Si:SiO\(_2\) wafer loaded, completing the calibration process. Figure 8b shows the wafer loaded on the end effector. Figure 8c shows a circular motion used to perform an area scan of the surface, and Fig. 8d shows a back-and-forth trajectory that simulates an R2R motion. The final positions, shown in Fig. 5b, were used to replicate the static experiment results and evaluate the inspection box calculations.

Figure 9 shows the sensor array results for each one of the programmed trajectories per sample.

Fig. 9
figure 9

Sensor array results (From left to right: RMSE, \(R^2\) and Thickness). (a) SAMPLE1: 300nm, (b) SAMPLE2: 286nm, (c) SAMPLE3: 164nm. Notes: (i) The green zone in the RMSE and \(R^2\) is the discovered safe zone for the array output. (ii) Each data point represents the averaged value of 50 measurements of all the sensors. (iii) The x-axis shows the test sequences: CALIB POINT, CIRCULAR, FRONT-BACK-RIGHT-LEFT and BACK&FORTH.

Figure 9a shows the SAMPLE1 (300 nm) results. As observed in the figure, from left to right, Sequence 1 (CALIB POINT) shows an RMSE < 0.02, \(R^2 > 0.9\) and a thickness measurement of 302.57 nm. This represents a thickness error of 0.86% vs the expected thickness value of 300 nm, which is close to the 0.4% accuracy offered by the F20 and within the ±2% reported by Yersak et al. for a potential R2R application35. The behaviour was similar for Sequence 4 (BACK AND FORTH), which showed an error of 0.79%. This result is similar in Samples 2 and 3, as seen in Figs. 9b and 9c respectively. Nevertheless, Sequence 2 (CIRCULAR) showed an increase in the RMSE \(> 0.02\), meaning that the sensor array detected height variations below the calibration point whilst performing the sequence. In this case, it was observed that SENSOR1, SENSOR2 and SENSOR6 started failing, showing an RMSE \(> 0.04\) and an \(R^2 < 0.7\) after half of the circular motion sequence, which suggests a misalignment of the end effector. When the RMSE and \(R^2\) go beyond these limits, the individual thickness measurements can deviate by more than 10% from the expected thickness value. For instance, when SENSOR6 showed an RMSE = 0.07 and \(R^2\) = 0.63, the individual thickness reading was 368.26 nm, affecting the overall average metric score. Sequence 3 was designed to replicate the static experiments for potential in-motion tilt detection. The array could detect front and back tilting, with SENSOR5 and SENSOR7 detecting RMSE variations \(>0.02\) and \(R^2\) close to 0.9; however, it was not possible to detect significant variations beyond the set limits when the robot arm tilted the wafer to the right and left sides. Similar behaviour was observed in SAMPLE2 (286 nm) and SAMPLE3 (164 nm) across all the programmed robot arm sequences; the data can be found in Supplementary 2.

Verification of robot movement

Constraint manifold identification

To identify the constraint manifold, the global manipulation manifold was obtained first. The results of the training process are shown in Fig. 10a, and the variance measure in Fig. 10b shows the confidence of the VAE in its reconstruction of joint positions from the latent space. In Fig. 10c, the magnification factor of the metric is defined as:

$$\begin{aligned} J = \log \sqrt{\det [ \textbf{M}_{J_\mu } + \textbf{M}_{J_\sigma }]}, \end{aligned}$$
(14)

where \(\mathbf {M}_{J_\mu }\) and \(\mathbf {M}_{J_\sigma }\) are the metric terms formed from the Jacobians of the VAE decoder's mean and variance outputs, respectively, governing the model's confidence in its estimation of the manifold. This metric shows the boundary of the manifold based on the data, indicating that the lower dimensions constitute a manifold \(\mathcal {M}_\theta\) encompassing the joint positions. The latent space representation in Fig. 10b correlates with the distribution of embedded points on the manifold in Fig. 10c, indicating that the lower-dimensional representation is sufficient for studying the manipulator's kinematics.
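
As a sketch, the magnification factor can be computed with automatic differentiation; the pull-back form \(J^\intercal J\) of the metric terms is an assumption here, and `vae` refers to the VAE sketch above:

```python
import torch
from torch.autograd.functional import jacobian

def magnification_factor(vae, z):
    """Eq. 14 at a latent point z, assuming M terms of the pull-back form J^T J."""
    mu = lambda z: vae.mu_x(vae.dec(z))                       # decoder mean head
    sigma = lambda z: (0.5 * vae.logvar_x(vae.dec(z))).exp()  # decoder std head
    J_mu = jacobian(mu, z)                                    # (n_joints, latent_dim)
    J_sigma = jacobian(sigma, z)
    M = J_mu.T @ J_mu + J_sigma.T @ J_sigma                   # latent-space metric
    return torch.log(torch.sqrt(torch.det(M)))

# Example: J = magnification_factor(vae, torch.zeros(2))
```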

Fig. 10
figure 10

Training results from the Riemannian manifold VAE: (a) ELBO loss averaged across 10 runs, indicating convergence on a stable manifold; (b) Variance measure of the latent space. The variance takes low values in areas where the manifold has high confidence and high values in areas of high uncertainty; (c) The magnification factor J of Eq. 14 applied to the variance measure metric. The white dots indicate the training data, with a boundary of high variance around those points indicating the edge of the manifold.

The latent space in Fig. 10b has a circular distribution similar to that of the joint space of the robot, which exists on a torus \(\mathbb {T}^n = \mathcal {S}^1 \times \mathcal {S}^1 \times \cdots \times \mathcal {S}^1\) (n copies), with n being the number of joints and \(\mathcal {S}^1\) being the 1-sphere. This distribution is mimicked in Fig. 10c, where the circular nature is repeated in the boundary of the learned latent space manifold. This shows that the sampling of the joint space when training the VAE can capture the original kinematics of the manipulator; hence, the uniform random sampling plan used is sufficient to model the manipulator kinematics in the latent space. For higher-dimensional or redundant manipulators where \(n > 6\), alternative sampling plans should be used that encode the geometry of the kinematics into the sampling36.

The learned manifold domain can be modified by applying the constraint function metric \(\textbf{M}^\theta _{\textbf{f}}\) to the variance measure plot. This creates a sub-manifold \(\tilde{\mathcal {M}}\) that satisfies the constraint function \(\textbf{f}(\hat{\theta }) = 0\), where \(\hat{\theta }\) indicates a predicted value from the VAE. As this application requires the manipulator to maintain horizontal motion during operation, a constraint vector \(\mathbb {C} = \begin{bmatrix} 0&0&0&1&1&0 \end{bmatrix} ^\intercal\) is applied, and the value of \(\textbf{f}(\hat{\theta })\) is computed for each point in the latent space. This is then used to determine the constraint metric \(\textbf{M}^\theta _{\textbf{f}}\), which is projected onto a three-dimensional variance measure plot.
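
A minimal sketch of evaluating Eq. 13 at a latent point follows; `vae` is the model from the sketch above, and `f_hat` is a stand-in for the constraint estimate computed from the decoded joint configuration (e.g. via Eq. 11):

```python
import torch

def constraint_metric(vae, z, f_hat, zeta=1.0):
    """Eq. 13 at latent point z: zeta * | -1 + exp(f_hat(mu(z))) |."""
    theta_hat = vae.mu_x(vae.dec(z))   # decoded mean joint configuration mu(z)
    return zeta * torch.abs(-1.0 + torch.exp(f_hat(theta_hat)))
```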

As shown in Fig. 11, there exists a sub-manifold, shown in dark blue, that corresponds to areas that satisfy the constraint vector and maintain horizontal motion. As these areas lie on the low-variance regions of the manifold, the learned model can be used to examine the manipulator movement and determine with a high degree of confidence whether the constraints are being met during motion.

Fig. 11
figure 11

Projected value of the constraint metric \(\textbf{M}^\theta _{\textbf{f}}\), normalised to between 0 and 1, onto the variance measure in three-dimensions. Areas corresponding to zero (dark blue) are latent space points that satisfy the constraint. The colour bar on the left represents the value of the constraint metric, with the Z-axis value being the variance metric.

Movement inspection

To examine the ability of the latent space manifold to detect variations, motion plans generated on hardware can be embedded into the latent space of the VAE. This is done using the MoveIt motion planner in ROS37, where ground truths are obtained for plans that no longer adhere to the constraints. These plans are then encoded into the latent space of the VAE and evaluated with the constraint metric \(\textbf{M}^\theta _{\textbf{f}}\). For each path generated in MoveIt, points were sampled from the path and labelled with the ground truth as to whether they satisfy the constraints. The paths were generated from the starting position to the calibration box shown in Figs. 5a and 6, with three off-manifold paths induced with increasing deviation from the desired ground-truth trajectory.

Fig. 12
figure 12

Projection of trajectories from the manipulator into the latent space of the VAE. LEFT: Projection of the trajectories onto the manifold; RIGHT: Zoomed-in view of the region where the trajectories lie on the manifold. The trajectory in white is the ground-truth trajectory that maintains horizontal motion. Trajectories in red deviate from the constraint sub-manifold.

As shown in Fig. 12, the trajectories appear continuous in the latent space, indicating that the continuity of joint positions across single trajectories is maintained when embedding the high-dimensional joint space onto the latent space Riemannian manifold. This continuity arises because the VAE preserves the geometric relationships between the joint positions in Euclidean space when embedding them in the latent space. Furthermore, this preservation of the robot geometry is present when examining the trajectories that violate the constraint, shown in red in Fig. 12: there is significant deviation from the desired trajectory in white, indicating that trajectories violating the constraint imposed on the manipulator can be detected in the latent space. This deviation is determined by examining the value of the metric \(\textbf{M}^\theta _{\textbf{f}}\), which allows the curvature of the manifold to be built from the value of the constraint in the latent space.

Table 4 Evaluation of the trajectory reconstruction and the direct constraint estimation difference between the ground truth and the values from the latent space manifold. The mean value \(\mu\) is the average difference between the ground truth and the reconstructed output from the VAE, with \(\sigma\) the standard deviation over the trajectory.

Using this projection, the estimated values of the constraint function can be determined directly from the manifold without needing to calculate them explicitly. The accuracy of the reconstructed points from the VAE decoder can also be evaluated to determine the ability of the VAE architecture to learn the lower-dimensional manifold. This is done by passing random joint configurations through the encoder and decoder architectures and comparing the resulting estimates of the joint configurations against the ground truth. The estimated \(\textbf{M}_\textbf{f}^\theta\) based on the reconstructed joint configurations can also be compared against the actual values of \(\textbf{M}_\textbf{f}^\theta\) computed using the ground-truth joint configurations. Table 4 presents the mean and standard deviation of the two reconstructions; it is clear that the VAE can reproduce the encoded positions with a high degree of accuracy, indicating that the latent space of the VAE is an accurate representation of the joint configurations of the manipulator. Furthermore, the manifold constraint estimator shown in Fig. 11 can determine the value of the constraint function directly from the manifold, allowing complete evaluation of whether a repeated trajectory is starting to experience deviation.

Discussion

SR has been overlooked in the past because of its known limitations. However, recent advances in component miniaturisation are allowing the exploration of novel approaches to overcome its technical disadvantages. The sensor array solution presented here challenges the existing single-point and physical expansion restrictions by integrating a spectrometer and a light source into a single reflectometer package. Additionally, the proposal provides a linear wafer inspection coverage of 74 mm, which is promising for a potential expansion of the inspection area in semiconductor and/or R2R manufacturing.

However, the lab-based SR sensor array requires high precision in the angle of incidence and sensor alignment, precise control of the light intensity and sensor integration times, and premium-quality USB devices. These requirements represent a limitation of the presented solution, as they are sources of variation and potential noise contributors that could affect the sensors' readings. Each sensor had to be mechanically adjusted using a Dinolite microscope and a spirit level to ensure proper alignment during the calibration process. This step required at least 30 minutes before the calibration could be attempted and, thus, before measurements could begin. This could be solved in the future by modifying the sensor assembly design, incorporating higher-quality moulded parts and adding precise positioning devices.

By contrast, commercially available reflectometers use stable halogen and deuterium light sources, which cover the full spectrum from UV to infrared. In this case, for demonstration purposes, commercial LEDs were used, which are limited to the visible (VIS) spectrum (450 nm to 700 nm). The light intensity of the LEDs was controlled via hardware (HW) and software (SW). HW-wise, potentiometers adjusted to 1.34 k\(\Omega\) were used to deliver a light intensity of 164 lux (±20%). SW-wise, the integration time was adjusted to 0.11 s when measuring the SiO\(_2\) coating to reduce an observed offset relative to the uncoated Si reference. Although this is not a widespread practice according to the publicly available manuals21,25 and the NPL guidance38, this was the best combination to ensure a good fit of the measured reflectance curve under the stated conditions.

All the aforementioned factors could also be impacted by the quality of the 3D-printed parts. The 3D printer nozzle had a tolerance of 0.2 mm, and each part of the reflectometer was printed in stereolithography (SLA) material designed to hold the components in place with a “snap” assembly; however, the parts presented deformation that did not allow a tight assembly. The combination of this deformation with the tight tolerances of the current design left an unstable structure that was sensitive to misalignment caused by the USB cables coming out of each reflectometer. This made it impossible to ensure normal positioning of the sensor array with respect to the sample (zero-degree alignment) when performing the measurements, which sometimes caused catastrophic errors. This is critical because the angle of incidence affects the measurements, as seen in Eq. 2. To overcome the misalignment issues, a cable handler support was added during testing, which successfully reduced the number of catastrophic errors. The sources of variation of the printed parts are out of the scope of this paper and will be investigated in future work.

The USB hub was also a source of random noise added to the system. The hub presented noise when more than one sensor was connected to it. After Fourier and Lomb-Scargle analyses, it was not possible to locate a frequency at which to apply a noise reduction filter. Even if a frequency could have been isolated, applying a filter per pixel would have been computationally expensive; therefore, the best noise reduction strategy was to apply a convolution procedure using the Python NumPy library. All the aforementioned sources of variation are accounted for in the uncertainty calculation and explain the uncertainty difference between the F20 and the sensor array in Table 3.

Knowing the sensor array's limitations is key to understanding the development requirements to improve its performance and to test its capabilities with flexible substrates. Although there is room for improvement, the results are promising: the sensor array can measure dry thin-film thickness with less than 2.1% thickness variation compared to a well-established SR system when positioned at the calibration point and under angle variations below 0.5\(^{\circ }\). Moreover, the sensor array can detect RMSE and \(R^2\) variations when the sample moves 1 mm below the calibration point, which is a desirable feature for detecting web fluttering failures in R2R manufacturing. Future work with the sensing array includes reducing the sources of variation, establishing its capability in R2R variable-speed environments, testing with flexible solar cell materials and adding potential feedback control for the robot arm testing sequence. For the manipulator, the VAE can be used to produce new trajectories that satisfy the desired constraint; additionally, new constraints can be imposed directly within the Riemannian metric, such as limits on rotations or positional translation. Furthermore, whilst this work has focused on wafer transportation for inspection through a custom end-effector, in future approaches the sensing array could be attached to the end-effector of the manipulator to allow R2R inspection with linear motion constraints.

In this research, an automated inspection system using a novel sensor array and a novel constrained robot manipulation approach has been presented. The sensor array is capable of measuring coating thickness with \(\le\) 2.1% error in static and dynamic environments when the wafer remains at the calibration point, and can detect angle variations of 0.5\(^\circ\) in the positions described in Fig. 5b. This can be combined with the robot manipulator to position a wafer at its calibration point whilst adhering to a novel learned constraint manifold, whereby the array can potentially perform surface mapping of the wafer.

This work lays the foundation for more size-reduced SR solutions for expanded inspection in R2R systems, and for the implementation of SR with more advanced sensor fusion techniques. Within the robotic manipulation field, this method allows the design of constraint manifolds that are flexible to the constraint being imposed, whilst maintaining the underlying kinematics of the manipulator. Furthermore, the method allows direct evaluation of whether the manipulator is deviating from the desired constraint.