Introduction

With the rapid development of computer vision technology, 3D light field reconstruction has become increasingly advanced, and its unique 3D display effects have attracted widespread attention1. This technology holds significant scientific and practical value in both everyday life and military fields.

Common techniques in 3D display include holographic display2,3,7, lenticular lens display4,5, volumetric 3D display6,7,8, and projection-based 3D light field display9,13. Among these, projection-based 3D light field display is considered to have broad prospects due to features such as large display size, wide viewing angles, and glasses-free viewing10. In 2006, the Balogh team from France first developed a 3D display system using a planar scattering screen and projectors11.

In projection-array-based light field display systems, which share a similar display principle with integral imaging, the typical algorithm employed is the SPOC algorithm. This approach reconstructs three-dimensional scenes from a large number of disparity images12,18, preserving the scene's true details without the need for complex 3D modeling. A notable example is the study conducted in 2020 by Yu Haiyang, Jiang Xiaoyu, and colleagues from the PLA Army Armored Force Engineering College, China13. Their research focused on achieving light field reconstruction17,18,19 using the SPOC algorithm14,15,16,20,21,22 within a fixed ring sampling system. The algorithm proposed in this paper aims to overcome the limitations of the ring (symmetric) framework and the need for dense camera sampling.

This novel algorithm diverges from conventional projection-array-based 3D light field reconstruction approaches by eschewing circular structural simplifications. Rather, it incorporates parallel camera arrays from real-world scenarios, introducing an innovative light field encoding paradigm. The algorithm encodes imagery based on camera intrinsic parameters and their world coordinate system positions, thereby overcoming the limitations of traditional frameworks. Its core innovation resides in the elimination of constraints imposed by circular (symmetric) architectures. By leveraging camera intrinsic parameters and positional data, it accurately determines the optimal pixel locations in disparity images captured by parallel cameras that correspond to object points. This methodology enables 3D light field reconstruction using parallel camera arrays under sparse sampling scenarios. Additionally, the flexible mapping between display and sampling pixels substantially improves the display accuracy and photorealism of the reconstructed light field.

This paper's Sect. "Structure of the projection array and traditional light field encoding algorithms" introduces the fundamental principles of projection-array-based 3D light field display systems and conventional light field encoding algorithms. Section "Projection-based 3D light field reconstruction using parallel camera arrays" delves into the detailed explanation of the 3D reconstruction algorithm based on parallel camera arrays and its optimization strategies. Section "Experimental results and analysis" validates the algorithm's effectiveness through experimental verification and presents the corresponding results. Section "Problems and prospects" summarizes the experimental findings, discusses the current challenges, and outlines prospective directions for future research.

Structure of the projection array and traditional light field encoding algorithms

Projection array display system

The projection display system is composed of a projection array and a holographic scattering screen. The optical axes of the projectors converge on the geometric center of the screen, where light emitted by the laser projectors is focused. The screen’s anisotropic light modulation characteristics induce distinct scattering angles for transmitted light in the horizontal and vertical directions, enabling 3D light field reconstruction. Given that the human eye is more sensitive to horizontal disparity changes than vertical ones—and considering bandwidth limitations—both the traditional SPOC algorithm and the proposed algorithm disregard vertical disparity, retaining and reconstructing only horizontal disparity to enhance stereoscopic visual effects.

SPOC algorithm

The SPOC (Smart Pseudoscopic-to-Orthoscopic Conversion) algorithm operates on the principle of utilizing disparity images acquired through sampling as the elemental image array for simulated display. It employs this image array to encode the light field and synthesize an elemental image array that aligns with the parameters of the real display lens array. When the reference plane coincides with the elemental image plane, this process can be characterized as a direct mapping from the sampled pixel space to the display pixel space.

Fig. 1 Schematic diagram of the principle of the traditional SPOC algorithm.

As illustrated in Fig. 1, the geometric center of the holographic scattering screen is designated as the origin of the coordinate system. The unit length of this coordinate system is defined as the size of a unit pixel projected by the projector onto the scattering screen. The formula can be expressed as:

$$e=\frac{R_p}{N \cdot K}$$
(1)

Where \(R_p\) denotes the radius of the projector array, \(N\) represents the horizontal resolution of the projector, and \(K\) is the projection ratio of the projector. \(\Delta PAB\) encompasses all light information emitted by projector P. For specific pixels \(P_1\) and \(P_2\), suppose the pixel projected by projector P is \(P_n\), with the coordinates of point P being \(({x_p},{z_p})\) and the coordinates of point \(P_n\) being \((0,{z_n})\).

$$\left\{ {\begin{array}{*{20}{c}} {{x_{\text{p}}}={R_p} \times \cos (\phi - p \cdot {\theta _p})} \\ {{z_{\text{p}}}={R_p} \times \sin (\phi - p \cdot {\theta _p})} \end{array}} \right.$$
(2)

Where \(p\) denotes the projector index, and \(\phi\) is the angle between the negative half-axis of the x-axis and the projector with index 0. \({\theta _p}\) represents the angular separation between adjacent projectors, which are evenly arranged on the arc.

$${z_n}=(n - \frac{N}{2}) \cdot e,n \in \left[ {0,N - 1} \right]$$
(3)

The equation for the display light ray can be expressed as:

$$z=\frac{{{z_p} - {z_n}}}{{{x_p}}} \times x+{z_n}$$
(4)

By combining the equation of the display light ray with the equation of the camera trajectory, their intersection point can be denoted as \(({x_{ideal}},{z_{ideal}})\). If a camera is positioned at this intersection (for instance, camera \({C_1}\) at the intersection of line \(L_1\) with the camera trajectory), the information of object point \({O_1}\) conveyed by the display light ray \(P{P_2}\) can be completely sampled by that camera. However, because the spacing between sampling cameras is non-zero, when no camera lies at an intersection such as \({C_2}\) between line \(L_2\) and the camera trajectory, the information of object point \({O_2}\) cannot be collected. The algorithm is designed to synthesize sampling rays that precisely coincide with the display light rays; in other words, it seeks to generate sampling pixels that perfectly align with the display pixels, thereby enabling the assignment of pixel values.
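
The following sketch illustrates the SPOC geometry of Eqs. (1)-(4) in code. The projection ratio K, the reference angle \(\phi\), the camera-trajectory radius R_c and the choice of intersection root are illustrative assumptions rather than values taken from this paper.

```python
import numpy as np

R_p = 1.7                    # radius of the projector array (m), from the experimental setup
N = 1280                     # horizontal resolution of the projector
K = 1.2                      # projection ratio (assumed value)
phi = np.deg2rad(117.0)      # angle of projector index 0 (assumed)
theta_p = np.deg2rad(0.5)    # angular separation between adjacent projectors

e = R_p / (N * K)            # Eq. (1): unit pixel size on the scattering screen

def projector_position(p):
    """Eq. (2): position (x_p, z_p) of projector index p on the arc."""
    ang = phi - p * theta_p
    return R_p * np.cos(ang), R_p * np.sin(ang)

def screen_pixel_z(n):
    """Eq. (3): coordinate z_n of screen pixel n."""
    return (n - N / 2) * e

def ideal_camera(p, n, R_c=2.5):
    """Intersect the display ray of Eq. (4) with an assumed circular camera
    trajectory x^2 + z^2 = R_c^2 and return (x_ideal, z_ideal)."""
    x_p, z_p = projector_position(p)
    z_n = screen_pixel_z(n)
    m = (z_p - z_n) / x_p                    # slope of the display ray z = m*x + z_n
    a, b, c = 1 + m ** 2, 2 * m * z_n, z_n ** 2 - R_c ** 2
    disc = b ** 2 - 4 * a * c
    if disc < 0:
        return None                          # ray misses the camera trajectory
    x = (-b - np.sqrt(disc)) / (2 * a)       # root on the camera side (assumption)
    return x, m * x + z_n

print(e, projector_position(0), ideal_camera(p=30, n=900))
```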

Projection-based 3D light field reconstruction using parallel camera arrays

The SPOC algorithm relies on a ring-shaped system framework, leading to insufficient collection of light information and poor light field reconstruction performance. To overcome the limitations of the ring (symmetric) framework and the challenges of dense camera array sampling, this paper proposes a projection-based 3D light field reconstruction algorithm using parallel camera arrays. This algorithm breaks away from the traditional framework, aiming to capture more comprehensive light field information and significantly enhance the reconstruction results.

In practical applications, installation errors of parallel camera arrays may lead to deviations in camera positions, optical axes, and other parameters, thereby introducing systematic errors. To address this, this paper proposes a method integrating camera calibration and coordinate transformation to effectively mitigate the impact of such errors. In this algorithm, the asymmetry of the system framework—resulting from departures from traditional symmetric structures—exacerbates the incompleteness of captured light field information. To tackle this, an approximate point matching algorithm based on the Lambertian properties of objects is further proposed, aiming to enhance the accuracy and stability of light field reconstruction.

Camera parameters and coordinate transformation method

To eliminate errors, this algorithm first necessitates the acquisition of both intrinsic and extrinsic parameters of the free-moving camera. The intrinsic parameters encompass the camera's focal length and the coordinates of the optical center, while the extrinsic parameters denote the rotation and translation offsets of the camera with respect to the coordinate system. The camera calibration and coordinate transformation process involves conversions among multiple coordinate systems, such as the world coordinate system, the camera coordinate system, and the image coordinate system, as illustrated in Fig. 2. Through these transformations, the spatial position and orientation of the camera can be accurately calibrated, effectively mitigating the impact of errors on light field reconstruction.

Fig. 2 World Coordinate System (a), Camera Coordinate System (b), Pixel Coordinate System (c).

Any point \(({x_w},{y_w},{z_w})\) in the world coordinate system can be transformed into its camera-coordinate-system counterpart via Eq. (5).

$$\left[ {\begin{array}{*{20}{c}} {{x_c}} \\ {{y_c}} \\ {{z_c}} \end{array}} \right]=R * \left[ {\begin{array}{*{20}{c}} {{x_w}} \\ {{y_w}} \\ {{z_w}} \end{array}} \right]+T$$
(5)

Similarly, for a point \(({x_c},{y_c},{z_c})\) in the camera coordinate system, the corresponding point \(({{\text{x}}_w},{y_w},{z_w})\) in the world coordinate system can be calculated based on Eq. (6).

$$\left[ {\begin{array}{*{20}{c}} {{x_w}} \\ {{y_w}} \\ {{z_w}} \end{array}} \right]={R^{ - 1}} * \left[ {\begin{array}{*{20}{c}} {{x_c}} \\ {{y_c}} \\ {{z_c}} \end{array}} \right] - {R^{ - 1}}T$$
(6)

Let \({R^{ - 1}}\) be \({R_{{\text{cw}}}}\) and \({R^{ - 1}}T\) be \({T_{cw}}\).

Any point \(({x_c},{y_c},{z_c})\) on the imaging plane in the camera coordinate system can be mapped to pixel coordinates based on Eq. (7).

$$\left\{ {\begin{array}{*{20}{c}} {u=\frac{{{x_c}}}{{dx}}+{u_0}} \\ {v=\frac{{{y_c}}}{{dy}}+{v_0}} \end{array}} \right.$$
(7)

By combining Eqs. (5) and (7), the formula for directly transforming from the world coordinate system to the pixel coordinate system can be derived as:

$${Z_c}\left[ {\begin{array}{*{20}{c}} u \\ v \\ 1 \end{array}} \right]=\left[ {\begin{array}{*{20}{c}} {\frac{1}{{dx}}}&0&{{u_0}} \\ 0&{\frac{1}{{dy}}}&{{v_0}} \\ 0&0&1 \end{array}} \right]\left[ {\begin{array}{*{20}{c}} f&0&0&0 \\ 0&f&0&0 \\ 0&0&1&0 \end{array}} \right]\left[ {\begin{array}{*{20}{c}} R&T \\ 0&1 \end{array}} \right]\left[ {\begin{array}{*{20}{c}} {{x_w}} \\ {{y_w}} \\ {{z_w}} \\ 1 \end{array}} \right]$$
(8)

Similarly, if the pixel coordinates are given as (u, v), the coordinates in the world coordinate system can be expressed as:

$$\left[ {\begin{array}{*{20}{c}} {{x_w}} \\ {{y_w}} \\ {{z_w}} \end{array}} \right]={R_{cw}}\left[ {\begin{array}{*{20}{c}} {(u - {u_0}) \cdot dx} \\ {(v - {v_0}) \cdot dy} \\ f \end{array}} \right] - {T_{cw}}$$
(9)

Here f denotes the focal length of the camera.
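
A minimal numerical sketch of the transformations in Eqs. (5)-(9) is given below. The calibration values (R, T, f, dx, dy, u0, v0) are placeholders, and Eq. (7) is applied to points lying on the imaging plane, following the convention above.

```python
import numpy as np

R = np.eye(3)                    # rotation, world -> camera (placeholder calibration)
T = np.array([0.0, 0.0, 2.0])    # translation, world -> camera (m, placeholder)
f = 0.035                        # focal length (m)
dx = dy = 5e-6                   # physical pixel size (m)
u0, v0 = 640.0, 360.0            # principal point (pixels)

R_cw = np.linalg.inv(R)          # R^{-1}
T_cw = R_cw @ T                  # R^{-1} T, as defined after Eq. (6)

def world_to_camera(p_w):
    """Eq. (5): camera coordinates of a world point."""
    return R @ np.asarray(p_w) + T

def camera_to_world(p_c):
    """Eq. (6): world coordinates of a camera-frame point."""
    return R_cw @ np.asarray(p_c) - T_cw

def camera_to_pixel(p_c):
    """Eq. (7): pixel coordinates of a point on the imaging plane."""
    x_c, y_c, _ = p_c
    return x_c / dx + u0, y_c / dy + v0

def pixel_to_world(u, v):
    """Eq. (9): world coordinates of the imaging-plane point behind pixel (u, v)."""
    p_img = np.array([(u - u0) * dx, (v - v0) * dy, f])
    return camera_to_world(p_img)

p_w = pixel_to_world(700, 360)                       # imaging-plane point of pixel (700, 360)
print(p_w, camera_to_pixel(world_to_camera(p_w)))    # round trip recovers (700.0, 360.0)
```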

Approximate point matching algorithm based on parallel camera arrays

In the ideal scenario, we assume that the parallel camera array is positioned at a height h from the geometric central plane of the object. The camera trajectory is then expressed as:

$${\text{z}}=h$$
(10)

As shown in Fig. 3, the intersection of the camera trajectory and the display light ray \(P{P_n}\) is denoted as \(({x_i},{z_i})\). The equation for the display light ray \(P{P_n}\) is given by:

$$z=\frac{{{z_p}}}{{{x_p} - {x_n}}}(x - {x_n})$$
(11)

Where \(({x_p},{z_p})\) are the coordinates of point \(P\), and \(({x_n},0)\) are the coordinates of point \({P_n}\).

$$\left\{ {\begin{array}{*{20}{c}} {{x_i}=\frac{{h \times ({x_p} - {x_n})}}{{{z_p}}}+{x_n}} \\ {{z_i}=h} \end{array}} \right.$$
(12)
Fig. 3 Simplified schematic of the parallel camera array.

The intersection of the camera trajectory and the display light ray denotes the position of the ideal camera. The parallel camera array is uniformly distributed, with the horizontal coordinates of two adjacent cameras forming an interval \(({x_j},{x_{j+1}})\). When \({x_i} \in ({x_j},{x_{j+1}})\) is satisfied, the two cameras closest to the ideal camera can be identified. Let these two closest cameras be denoted as \({C_A}\) and \({C_B}\). Given that the distance from the object to the sampling cameras is significantly larger than the interval between adjacent cameras, according to the Lambertian properties of the object, for the same object point, the pixel pair formed by the light rays passing through cameras \({C_A}\) and \({C_B}\) is the most similar pixel pair.
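
A short sketch of Eqs. (10)-(12) and the neighbour search follows; the camera height h, the camera positions and the example ray are assumptions used only for illustration.

```python
import numpy as np

h = 1.5                                  # height of the parallel camera trajectory z = h (assumed)
cam_x = np.linspace(-0.6, 0.6, 9)        # horizontal camera positions, sparse and uniform (assumed)

def ideal_camera_x(x_p, z_p, x_n):
    """Eq. (12): x-coordinate where the display ray P(x_p, z_p) -> P_n(x_n, 0)
    crosses the camera trajectory z = h."""
    return h * (x_p - x_n) / z_p + x_n

def neighbouring_cameras(x_i):
    """Indices (A, B) of the two cameras enclosing x_i, or None if x_i falls
    outside the array."""
    j = np.searchsorted(cam_x, x_i) - 1
    if j < 0 or j + 1 >= len(cam_x):
        return None
    return j, j + 1

x_i = ideal_camera_x(x_p=0.3, z_p=-1.7, x_n=0.05)   # projector on the far side of the screen (assumed)
print(x_i, neighbouring_cameras(x_i))               # ideal camera position and the pair (C_A, C_B)
```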

Since the position of object point \(O\) is unknown, its location must be determined to compute the corresponding pixel information. To enhance computational efficiency, the algorithm first defines the range of the virtual active space. As illustrated in Fig. 4, the coordinates of the near and far points on the plane \(y=0\) are derived by solving the system of equations formed by the display light ray and the virtual boundary equation:

$$z=l,z= - l$$
(13)

Here \(l\) denotes the distance from the virtual boundary to the screen plane \(z=0\).

Let the near and far points be denoted as \(({x_{nA}},0,{z_{nA}})\) and \(({x_{fA}},0,{z_{fA}})\), respectively. In the camera coordinate system, the coordinates of the camera's optical center are \((0,0,0)\), and the coordinates of the principal point are \((0,0,f)\). Based on Eq. (6), the coordinates of the camera's optical center \(C({x_{cw}},{y_{cw}},{z_{cw}})\) and the principal point \(F({x_{iw}},{y_{iw}},{z_{iw}})\) in the world coordinate system can be determined.
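
As a sketch, the near and far points of Eq. (13) can be obtained by evaluating the display ray of Eq. (11) at the two boundary depths; the ray parameters and l below are assumed values.

```python
def active_space_endpoints(x_p, z_p, x_n, l):
    """Clip the display ray of Eq. (11) to the virtual active space of Eq. (13)
    and return the two boundary points (x, 0, z) on the plane y = 0."""
    def x_at(z):                          # invert Eq. (11): x = z*(x_p - x_n)/z_p + x_n
        return z * (x_p - x_n) / z_p + x_n
    return (x_at(l), 0.0, l), (x_at(-l), 0.0, -l)

# Example with assumed ray parameters; which endpoint is "near" depends on the camera side.
print(active_space_endpoints(x_p=0.3, z_p=-1.7, x_n=0.05, l=0.2))
```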

Fig. 4 Virtual Active Space and Imaging Pixel Space.

Since the optical axis of the camera is perpendicular to the imaging plane, the normal vector of the imaging plane can be expressed as:

$$\overrightarrow n =({x_{nw}},{y_{nw}},{z_{nw}})=({x_{iw}} - {x_{cw}},{y_{iw}} - {y_{cw}},{z_{iw}} - {z_{cw}})$$
(14)

The mathematical expression for the imaging plane is:

$${x_{nw}} \cdot (x - {x_{cw}})+{y_{nw}} \cdot (y - {y_{cw}})+{z_{nw}} \cdot (z - {z_{cw}})=0$$
(15)

When camera \({C_B}\) samples the near point \(({x_{nA}},0,{z_{nA}})\), the expression for the sampling ray in the world coordinate system is:

$$\left\{ {\begin{array}{*{20}{c}} {x=t \cdot ({x_{cwB}} - {x_{nA}})+{x_{nA}}} \\ {y=t \cdot {y_{cwB}}} \\ {z=t \cdot ({z_{cwB}} - {z_{nA}})+{z_{nA}}} \end{array}} \right.$$
(16)

Where \(({x_{cwB}},{y_{cwB}},{z_{cwB}})\) are the coordinates of camera \({C_B}\) in the world coordinate system.

The intersection point of the sampling ray with the imaging plane represents the camera’s sampling information. The coordinates of the intersection point can be obtained by calculating t, where:

$$t=\frac{{{x_{nw}}({x_{cw}} - {x_{nA}})+{y_{nw}}{y_{cw}}+{z_{nw}}({z_{cw}} - {z_{nA}})}}{{{x_{nw}}({x_{cwB}} - {x_{nA}})+{y_{nw}}{y_{cwB}}+{z_{nw}}({z_{cwB}} - {z_{nA}})}}$$
(17)

By combining the coordinate transformation, the pixel coordinates \({I_{BN}}({u_{NB}},{v_{NB}})\) corresponding to the near-field point can be determined. Similarly, the pixel coordinates \({I_{BF}}({u_{FB}},{v_{FB}})\) for the far-field point can be obtained. The pixel coordinates \((u,v)\) corresponding to the object point \(O\) should satisfy that \(u\) lies within the interval defined by \({u_{NB}}\) and \({u_{FB}}\), and \(v\) lies within the interval defined by \({v_{NB}}\) and \({v_{FB}}\).
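
The ray-plane intersection behind Eqs. (14)-(17) can be sketched as follows. The camera pose is a placeholder, and the imaging plane is anchored here at a point on the optical axis (the principal point in this sketch, an assumption); the intersection is subsequently converted to pixel coordinates with Eqs. (7)-(8).

```python
import numpy as np

C_B = np.array([0.20, 0.0, 1.50])      # optical centre of camera C_B in world coordinates (assumed)
F_B = np.array([0.20, 0.0, 1.465])     # principal point of C_B in world coordinates (assumed)
near = np.array([0.04, 0.0, 0.20])     # near point (x_nA, 0, z_nA) from Eq. (13)

n_vec = F_B - C_B                      # Eq. (14): normal of the imaging plane (optical axis direction)

def intersect_ray_plane(origin, through, plane_point, normal):
    """Solve for the parameter t of Eq. (17) at which the ray origin + t*(through - origin)
    meets the plane of Eq. (15) through plane_point with the given normal, and return
    the intersection point itself."""
    origin, through = np.asarray(origin), np.asarray(through)
    d = through - origin                               # direction of the sampling ray, Eq. (16)
    t = normal @ (np.asarray(plane_point) - origin) / (normal @ d)
    return origin + t * d

# Sampling ray of Eq. (16): from the near point towards C_B's optical centre.
hit = intersect_ray_plane(near, C_B, F_B, n_vec)
print(hit)                             # world-coordinate point that maps to pixel I_BN(u_NB, v_NB)
```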

Due to the lack of symmetry in the system framework, the relative order of \({u_{NB}}\) and \({u_{FB}}\), as well as of \({v_{NB}}\) and \({v_{FB}}\), cannot be determined in advance, so the cases must be discussed separately. Assume that \({u_{NB}}<{u_{FB}}\) and \({v_{NB}}<{v_{FB}}\); then:

$$u \in [{u_{NB}},{u_{FB}}],v \in \left[ {{v_{NB}},{v_{FB}}} \right]$$
(18)

Let the pixel index of the object point \(O\) after being sampled by camera \({C_B}\) be denoted as \({I_{OB}}\), and the pixel indices corresponding to the near and far points be denoted as \({I_{nB}}\) and \({I_{fB}}\), respectively. Then:

$$0 \leqslant {I_{{\text{n}}B}}<{I_{OB}}<{I_{fB}} \leqslant K - 1$$
(19)

K is the horizontal resolution of the camera.

The imaging interval of the object point \(O\) has been determined above. Next, we need to determine the actual imaging pixel of the object point \(O\). Let the actual imaging pixel be denoted as \({I_{SB}}\), then \({I_{SB}} \in [{I_{nB}},{I_{fB}}]\). The pixel coordinates of \({I_{SB}}\) can be converted to world coordinates through Eq. (9). Since the camera's optical center \(C({x_{cw}},{y_{cw}},{z_{cw}})\) is known, the equation of the sampling ray \({P_n}{I_{SB}}\) is known. Let the equation be:

$$y={k_c} \cdot x+{b_c}$$
(20)

By combining Eqs. (20) and (11), the position of \({P_n}\) can be determined. Then, by using the position of \({P_n}\) and the location of \({C_A}\), the mathematical expression of the line \({P_n}{C_A}\) can be established. The intersection of the line \({P_n}{C_A}\) with the imaging plane of \({C_A}\) gives the pixel \({I_{SA}}\), thus determining a set of pixel pairs. By traversing each imaging pixel in the range \([{I_{nB}},{I_{fB}}]\), multiple sets of pixel pairs can be determined.

Figure 5(a) shows the schematic diagram of the overall algorithm, and Fig. 5(b) is a partial enlarged view. When \({I_{OB}} \ne {I_{SB}}\), the actual object points corresponding to \({I_{SB}}\) and \({I_{SA}}\) are \({O_B}\) and \({O_A}\), respectively. Since \({O_B}\) and \({O_A}\) are unrelated points, the corresponding imaging pixels differ significantly. When \({I_{SB}}={I_{OB}}\), it implies that \({I_{SA}}={I_{OA}}\). Since \({I_{OA}}\) and \({I_{OB}}\) both sample the same object point \(O\), the sampled pixels are very similar. Based on the above analysis, this algorithm uses variance to determine the pixel corresponding to the real object point \(O\). The imaging pixels of the real object point after being sampled by two adjacent cameras can be obtained by traversing and calculating multiple sets of pixel pairs: the pixel pair with the smallest variance is taken as the pixel pair corresponding to the real object point \(O\).
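
A compact sketch of this search is given below: each candidate column \(I_{SB}\) of camera \(C_B\) is paired with a column \(I_{SA}\) of camera \(C_A\) through the geometry of Eqs. (9), (11) and (20), and the pair with the smallest variance of Eq. (21) is kept. The helper column_pair_for stands in for that geometric mapping and is hypothetical, as are the toy images.

```python
import numpy as np

def pick_pixel_pair(img_A, img_B, I_nB, I_fB, column_pair_for):
    """Return the column pair (I_SA, I_SB) that minimises the variance of Eq. (21)."""
    best, best_var = None, np.inf
    for I_SB in range(I_nB, I_fB + 1):
        I_SA = column_pair_for(I_SB)          # geometric mapping from C_B's column to C_A's column
        a = img_A[:, I_SA].astype(float)      # J vertical pixels of column I_SA in C_A's image
        b = img_B[:, I_SB].astype(float)      # J vertical pixels of column I_SB in C_B's image
        mean = (a + b) / 2.0
        var = np.sum(((a - mean) ** 2 + (b - mean) ** 2) / 2.0)   # Eq. (21)
        if var < best_var:
            best, best_var = (I_SA, I_SB), var
    return best

# Toy demonstration with random images and an identity column mapping (both assumptions).
rng = np.random.default_rng(0)
img_A = rng.integers(0, 256, (720, 1280))
img_B = rng.integers(0, 256, (720, 1280))
print(pick_pixel_pair(img_A, img_B, 600, 640, lambda c: c))
```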

Fig. 5 Schematic diagram of the light field reconstruction algorithm based on approximate point matching (a) and the zoomed-in view (b).

Similarly, the light field reconstruction algorithm remains unchanged for different vertical heights. The real object point \(O\) corresponds to a continuous column of light field information, which, after camera sampling, is discretized into \(J\) imaging pixels, where \(J\) is the vertical resolution of the camera. Therefore, the variance \({s^2}\) is:

$${s^2}=\sum\limits_{j=0}^{J-1} \frac{\left(I_{SB}^{j} - \frac{I_{SB}^{j}+I_{SA}^{j}}{2}\right)^2+\left(I_{SA}^{j} - \frac{I_{SB}^{j}+I_{SA}^{j}}{2}\right)^2}{2}$$
(21)

Here, \(I_{{SB}}^{j}\) represents the jth pixel in column \({I_{SB}}\) of the disparity image captured by camera \({C_B}\), and \(I_{{SA}}^{j}\) represents the jth pixel in column \({I_{SA}}\) of the disparity image captured by camera \({C_A}\).

By combining the above formula with the pixel coordinate system, we get:

$${s^2}=\sum\limits_{m=0}^{J-1} \frac{\left(P_{A}^{m} - \frac{P_{A}^{m}+P_{B}^{m}}{2}\right)^2+\left(P_{B}^{m} - \frac{P_{A}^{m}+P_{B}^{m}}{2}\right)^2}{2}$$
(22)

Where \(P_{A}^{m}\) represents the pixel value at pixel coordinates \(({u_{NAm}},{v_{NAm}})\) on the imaging plane of camera \({C_A}\), and \(P_{B}^{m}\) represents the pixel value at pixel coordinates \(({u_{NBm}},{v_{NBm}})\) on the imaging plane of camera \({C_B}\).

Let the pixel coordinates that minimize the variance for each row of contiguous object points be denoted as \(({u_{nam}},{v_{nam}})\) and \(({u_{nbm}},{v_{nbm}})\). A weighted fusion method is then applied to assign values to the composite pixel. The resulting value of the pixel in the m-th row is:

$$P_{n}^{m}=0.5 * P_{{OA}}^{m}+0.5*P_{{OB}}^{m}$$
(23)

Where \(P_{{OA}}^{m}\) represents the pixel value at pixel coordinates \(({u_{nam}},{v_{nam}})\) on the imaging plane of camera \({C_A}\), and \(P_{{OB}}^{m}\) represents the pixel value at pixel coordinates \(({u_{nbm}},{v_{nbm}})\) on the imaging plane of camera \({C_B}\).
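
A minimal sketch of the equal-weight fusion of Eq. (23), which also feeds the row-wise mapping of Eq. (24), might look like the following; the example columns are arbitrary.

```python
import numpy as np

def fuse_columns(col_A, col_B):
    """Eq. (23): P_n^m = 0.5 * P_OA^m + 0.5 * P_OB^m for every row m; the result is
    written directly into the display column, as in Eq. (24)."""
    return 0.5 * np.asarray(col_A, dtype=float) + 0.5 * np.asarray(col_B, dtype=float)

print(fuse_columns([10, 20, 30], [20, 30, 40]))   # -> [15. 25. 35.]
```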

In computing the composite pixel, assigning each row of pixels in the display column the value calculated for the corresponding sampled row of the composite pixel is equivalent to completing the vertical light field mapping. Therefore, the mapping relationship between the display column pixels and the composite column pixels can be expressed as follows:

$$dP_{n}^{m}=P_{n}^{m}$$
(24)

Where \(dP_{n}^{m}\) represents the pixel value of the m-th row pixel in the display column \({P_n}\).

The above derivation assumes \({u_{NB}}<{u_{FB}}\) and \({v_{NB}}<{v_{FB}}\), so that \(u \in [{u_{NB}},{u_{FB}}]\) and \(v \in \left[ {{v_{NB}},{v_{FB}}} \right]\). The algorithm principle for the other cases, \(u \in [{u_{NB}},{u_{FB}}],v \in \left[ {{v_{FB}},{v_{NB}}} \right]\); \(u \in [{u_{FB}},{u_{NB}}],v \in \left[ {{v_{NB}},{v_{FB}}} \right]\); and \(u \in [{u_{FB}},{u_{NB}}],v \in \left[ {{v_{FB}},{v_{NB}}} \right]\), is similar.

Improvement of the approximate point matching algorithm

The approximate point matching algorithm assumes \({u_{NB}}<{u_{FB}}\), \({v_{NB}}<{v_{FB}}\), with \(u \in [{u_{NB}},{u_{FB}}]\) and \(v \in \left[ {{v_{NB}},{v_{FB}}} \right]\). In practice, however, the horizontal distance between the two pixels may be smaller than one pixel, i.e., \(\left| {{u_{NB}} - {u_{FB}}} \right|<1\). In this case the horizontal position of the real object point on that plane is already determined, since the whole segment of the display ray within the active space projects into a single pixel column.

Fig. 6 Schematic diagram of the improved algorithm.

As illustrated in Fig. 6, the display light ray \(P{P_n}\) can be fully sampled by camera \({C_B}\). Thus, it is unnecessary to determine the real position of the object point \(O\). Instead, we only need to compute the intersection of \(P{P_n}\) with the imaging plane using Eqs. (11) and (15), and determine the pixel information corresponding to the object point \(O\) on the horizontal plane \(y=0\) through coordinate transformation. For the pixels in a column projected by the projector, it is only necessary to calculate the projection height of the projector onto the holographic scattering screen, namely:

$${z_h}= - e * (i - M/2),i \in \left[ {0,M - 1} \right]$$
(25)

Where \(M\) denotes the vertical resolution of the projector. Next, based on the intersection of the display light ray passing through the point \(({x_n},0,{z_h})\) with the imaging plane, the pixel information corresponding to this intersection can be computed. This simplification of the computational process allows for faster calculation of the projection image array.
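
For illustration, the vertical projection heights of Eq. (25) can be tabulated directly; the values of e and M below are assumptions consistent with the earlier definitions (M taken as a vertical projector resolution of 720).

```python
e = 0.0011          # unit pixel size from Eq. (1) (assumed numeric value)
M = 720             # vertical resolution of the projector (assumed)

# Eq. (25): projection height z_h of row i of one projector column on the screen.
z_h = [-e * (i - M / 2) for i in range(M)]

# Each height defines a point (x_n, 0, z_h[i]); the ray through it is then intersected
# with the imaging plane (Eq. 15) and converted to a pixel as before.
print(z_h[:3], z_h[-1])
```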

Experimental results and analysis

The projection array employed in this system is depicted in Fig. 7. It comprises 108 projectors with a resolution of 1280 × 720 and a holographic scattering screen. The projectors are evenly distributed along an arc with a 1.7-meter radius at an angular interval of 0.5 degrees, covering an azimuthal angle range of [− 27°, 27°]. The center of the holographic scattering screen coincides with the arc center, and its plane is perpendicular to the optical axis of the central projector at an azimuthal angle of 0°.

Fig. 7 Real-life image of the arc-shaped projection array display device.

Using the disparity images obtained from sampling as the input, the projection image array is computed by the approximate point matching algorithm and the traditional SPOC algorithm (with the placement and rotation of the projection array taken into account). The imaging results are shown in Fig. 8.

Fig. 8 Projection image arrays calculated using the parallel camera approximate point algorithm (a) and the traditional SPOC algorithm (b).

Owing to the discrepancies in the acquisition processes of the two methods, the input image arrays exhibit substantial variations, and the shooting angles also vary significantly, introducing numerous variables. Hence, this paper refrains from discussing the occurrence of image deformation and instead focuses on performance metrics such as the overall clarity after image reconstruction and the computational speed of the algorithm.

Table 1 Parameters of the two Algorithms.

As tabulated in Table 1, the parallel camera-based approximate point matching algorithm exhibits a twofold speedup compared to the SPOC algorithm. Overall, both algorithms yield favorable results for the computed projection image arrays, with both clearly resolving object contours. Notable discrepancies emerge in image details, however. Compared to Fig. 8(b), the image in Fig. 8(a) is free from mosaic artifacts and stripe distortion. Relying on a single reference plane, the SPOC algorithm struggles to capture continuous object information under complex depth variations. By contrast, the proposed algorithm dispenses with the need for a reference plane, instead determining optimal pixel matches through object point similarity comparisons to assign values to the projection image array. This approach enables more accurate reconstruction of object information across diverse depth layers under sparse sampling conditions, substantially enhancing image detail representation.

To further quantify the image display quality, the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM), both widely adopted in image quality assessment, are introduced. The projection image array derived from the disparity images sampled under dense conditions serves as the reference, whereas the projection image arrays generated by the two algorithms from disparity images sampled under sparse conditions serve as the experimental images. PSNR and SSIM are calculated for two cases (a small evaluation sketch follows the list below):

1) Compare the first projected image in the reference images with the first projected image in the experimental images.

2) Compare all projected images in the reference images with all projected images in the experimental images.
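
A brief evaluation sketch under the above protocol, assuming scikit-image as the metric implementation and placeholder arrays in place of the actual projection images:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def compare(reference, experimental):
    """PSNR and SSIM between one reference projection image and one experimental image."""
    psnr = peak_signal_noise_ratio(reference, experimental, data_range=255)
    ssim = structural_similarity(reference, experimental, channel_axis=-1, data_range=255)
    return psnr, ssim

def compare_all(references, experimentals):
    """Case 2): average PSNR/SSIM over all projector images."""
    scores = [compare(r, e) for r, e in zip(references, experimentals)]
    return tuple(np.mean(scores, axis=0))

# Toy RGB arrays standing in for the dense-reference and sparse-experiment images.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (720, 1280, 3), dtype=np.uint8)
exp = ref.copy()
exp[::2] //= 2                         # crude degradation so the metrics are non-trivial
print(compare(ref, exp))               # case 1): first image only
print(compare_all([ref], [exp]))       # case 2): all images (one here)
```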

Table 2 Comparison of PSNR and SSIM numerical results in the first case.
Table 3 Comparison of PSNR and SSIM numerical results in the second case.

Both Tables 2 and 3 demonstrate that the parallel camera-based approximate point matching algorithm outperforms the SPOC algorithm on both metrics.

Finally, the projection image array is loaded into the projector array to observe the light field reconstruction effect. The display result is presented in Fig. 9.

Fig. 9 Images captured by cameras at corresponding angles using the approximate point matching algorithm and the SPOC algorithm. (a) Left side of the holographic scattering screen. (b) Center of the holographic scattering screen. (c) Right side of the holographic scattering screen.

As depicted in Fig. 9, the image details produced by the SPOC algorithm are significantly more blurred than those of the parallel camera algorithm, with severe aliasing evident in Fig. 9(a). This arises because the SPOC algorithm directly uses pixel values from the closest cameras to assign pixel values to object points in the projection array. When camera positions deviate from ideal positions, this pixel assignment method introduces substantial errors. Multiple misassigned pixels corresponding to light rays converge at the holographic scattering screen, readily leading to pronounced ghosting artifacts.

As the viewing angle changes, the parallel camera-based approximate point matching algorithm demonstrates less pronounced variations in perspective compared to the SPOC algorithm. This stems from the relatively narrow angular capture range of the parallel camera array. Theoretically, the maximum viewing angle achievable by the parallel array is restricted, whereas a ring-shaped camera array can cover a far broader angular span. Specifically, the parallel camera array is limited to a 180-degree maximum viewing angle, while the ring-shaped configuration can theoretically achieve a full 360-degree coverage.

Problems and prospects

The approximate point matching algorithm based on parallel camera sampling proposed in this study breaks through the technical bottleneck of the traditional SPOC algorithm, which relies on a symmetric architecture for the display and acquisition systems. By leveraging the Lambertian properties of objects, the algorithm achieves high-quality light field reconstruction under sparse sampling conditions. Experimental data show that the computational efficiency of the approximate point matching algorithm is approximately twice that of the SPOC algorithm with the same number of disparity images. However, the current algorithm still has a long running time, and its computational efficiency can be further improved in the future by exploiting the CUDA architecture.

Although the algorithm breaks the constraints of the ring (symmetric) architecture and outperforms the SPOC algorithm in terms of display effect and computational efficiency, its viewing angle variation range is narrower than that of the SPOC algorithm. It is worth noting that as the sparsity of the camera array increases, the algorithm may suffer from problems such as image aliasing and color boundary blurring, and excessively sparse sampling also leads to substantial information loss. Future research can explore deep learning-based automatic multi-view image generation to address information completion in sparse sampling scenarios.