Introduction

Due to the inherently low absorption contrast of low-atomic-number materials, phase-contrast techniques have long been of interest in the field of hard X-ray imaging1.

The first and arguably most popular method is to defocus the sample, which highlights the sample edges2,3,4. While such edge detection helps in understanding the internal structures of the sample, directly retrieving the sample phase shift is challenging without prior knowledge of the sample5. To alleviate this limitation, multiple acquisitions at different sample-to-detector distances have been used6,7. However, translating either the sample or the detector along the beam direction often introduces additional technical issues such as image registration8.

Another popular approach is to introduce a modulator between the sample and the detector. A representative example is the Hartmann sensor, which uses a pinhole array as a modulator9. The displacement in each pinhole projection pattern provides a local phase gradient map, and the quantitative phase can be obtained by numerical integration10. Although it remains popular in beam characterization11, it has rarely been used for phase imaging due to the trade-off between image resolution and crosstalk between subregions.

Grating shearing interferometry is often preferred for phase imaging applications12,13. The shear between the grating diffraction orders allows the local phase gradient to be obtained from the interference without sacrificing spatial resolution. In optical microscopy with sufficient magnification, it is possible to isolate the interference term from a single image by using off-axis interferometry14. In contrast, projection X-ray imaging typically uses an analyzer grating to demodulate the interferometric fringe, requiring multiple acquisitions to isolate the interference term13. The use of physical interference ensures the credibility of the measured phase gradient, which is a distinct advantage8. Despite its usefulness, the requirement for high-aspect-ratio X-ray gratings significantly increases the technical difficulty of reproduction, especially for the two-dimensional gratings needed for unambiguous phase determination15,16.

Recently, speckle-tracking methods have been proposed that introduce a diffuser as a modulator17,18,19,20. The use of ordinary paper or sandpaper instead of special pinhole arrays or gratings significantly reduces the technical difficulty. Similar to the Hartmann sensor, the local phase gradient is derived from the displacements of subregions, which in this case are random speckle patterns. Due to the sharply peaked autocorrelation function of the speckle, it is possible to uniquely determine the local displacements without crosstalk21. Since accurate local displacement estimation is crucial for robust phase retrieval, several phase retrieval algorithms have been developed. One popular approach is direct cross-correlation calculation based on subregion partitioning, similar to that of Hartmann sensors22,23,24. Another popular approach is to use the transport intensity equation (TIE) concept, originally introduced in defocusing methods25,26,27,28. Although both approaches work reasonably well in practical situations29,30,31, they rely on assumptions or approximations that simplify the physical imaging model. We summarized the assumptions, required measurements of speckle-tracking methods in Table S1. As a result, it is difficult to determine whether they fully capture the sample field information encoded in the speckle patterns.

Fortunately, speckle-based imaging has already been studied in various disciplines for both coherent32,33,34,35,36,37 and incoherent imaging systems38,39,40,41. In particular, the phase retrieval from a single speckle pattern has been extensively explored in the fields of computational imaging33,34,36,37 and applied mathematics42,43,44,45. By exploiting the mathematical properties of random variables, robust phase retrieval from a single speckle pattern has been consistently reported without additional approximations or assumptions33,43. In contrast to speckle-tracking methods, significant sample-induced speckle decorrelation is usually assumed, and the phase information is retrieved from the random self-interference of the incident field37. Interestingly, these multidisciplinary works share common requirements regarding the number of measured speckle grains relative to the number of unknown variables. This is referred to as the resolution ratio36, the oversampling rate42, or the oversampling ratio33,45.

Despite significant discrepancies in experimental conditions, the mathematical principles and algorithms of speckle-based imaging should prevail for conventional speckle-based X-ray microtomography (Fig. 1a), since both methods decode the sample field information from measured speckle patterns. Then, is it possible to find a phase retrieval algorithm that incorporates all the lessons learned from the aforementioned speckle-related methods? In this paper, we present the preconditioned Wirtinger flow (PWF) as the solution (Fig. 1b). Exploiting the Wirtinger flow as a backbone43, we introduce three core ideas: a reliable physical model that includes the decoherence effect of the partially coherent X-ray source (Fig. 1a); an preconditioner inspired from speckle-tracking methods (\({{\bf{P}}}^{-1}\) in Fig. 1b); and a Tikhonov regularizer to satisfy the oversampling ratio criterion (\({{\mathbf{\Gamma }}}^{\dagger }{\mathbf{\Gamma }}\) in Fig. 1b). We successfully demonstrate the effectiveness of PWF in experiments (Fig. 2). From the corresponding tomogram results (Fig. 3), we find that PWF, with a single speckle measurement, outperforms existing speckle-tracking methods, which uses 12 different speckle measurements, in both refractive index accuracy and resolution. We also demonstrate the versatility of PWF by reconstructing phase tomograms in various samples (Fig. 4).

Fig. 1: Speckle–based X-ray microtomography with a partially coherent source.
figure 1

a Schematic of the experimental setup. Partially coherent source provides an additional blurring effect on the intensity image. The symbol indicates image convolution. \({L}_{1}=3\,\text{mm}\) and \({L}_{2}=20\,\text{mm}\) are used throughout the paper. An experimentally measured speckle image of the toothpick with glass beads sample is presented on the detector plane. b Schematic of preconditioned Wirtinger Flow (PWF). The vectors are defined as \({\boldsymbol{\psi }}:={\left[{\psi }_{r}\right]}_{1\le r\le m}\), where m is the number of pixels in the image. Although the variables and operators are represented in vector-matrix form here for visualization purposes, the actual algorithm is performed based on element-wise calculations and Fourier transforms (see Algorithm S1). \(z=z^{\prime} +{z}^{\prime\prime}\) is a complex number. c Reconstructed phase and attenuation from the measured speckle pattern (see Fig. 2 for details)

Fig. 2: Field reconstruction results of a toothpick with glass beads.
figure 2

a–d Reconstructed phase (a), horizontal phase gradient (b), and attenuation (c), and the remaining root-mean-square error (RMSE, d) from PWF with a single measurement (K = 1). eh The reconstructed counterparts from LCS with 12 measurements at different diffuser positions (K = 12)27

Fig. 3: Tomographic reconstruction results of a toothpick with glass beads.
figure 3

a, b The real (a) and imaginary (b) parts of the reconstructed 3D refractive index distribution, \(n\left({\bf{r}}\right)=1-\delta \left({\bf{r}}\right)+i\beta \left({\bf{r}}\right)\), from PWF with a single measurement (K = 1). Please refer to Video S1 for the results from different cross sections. The yellow arrows indicate the interface between the toothpick and the adhesive. The white arrow symbols indicate the rotation axis orientations. The intersection of two orthogonal slices is shown as a solid line in the background. c, d The reconstructed counterparts from a LCS with 12 measurements in different diffuser positions (K = 12)27. Note that the color scale is different from the PWF results (a, b). The red arrows indicate the reconstruction artifacts. e Line profiles along the intersection lines shown in the backgrounds. The expected δ and β values of the beads are shown as dotted lines. f The Fourier shell correlation (FSC) of the reconstructed tomograms. The intersections with the 1/4 criterion (dotted line) are pointed by the arrows with the corresponding spatial resolutions

Fig. 4: Reconstructed quantitative phase tomogram \(\delta ({\bf{r}})\) of various samples.
figure 4

a A cumin seed. b Dried shrimp. c Dried anchovy. d Piece of cork. The red and yellow arrows highlight the structural contrast of δ, and fine structures, respectively. All colorbar units are 10−6. Please refer to Videos S2S5 for the results from different cross sections. The reconstructed \(\beta ({\bf{r}})\) are shown in Fig. S11

Results

Preconditioned Wirtinger flow

We propose PWF to retrieve the complex sample field from the conventional speckle-based X-ray microtomography (Fig. 1a). The reconstruction requires a single sample speckle pattern (\({y}_{r}\)) with a reference speckle pattern that is acquired without the sample. The X-ray wavelength (0.124 nm, 10 keV), and the distances before and after the diffuser (L1 = 3 mm and L2 = 20 mm) are known constants. The detailed PWF algorithm is described in Algorithm S1. Throughout the paper, we use vector indices r and k for real and reciprocal (or Fourier) space, respectively.

Instead of the sample field (\({x}_{r}\)), we directly reconstruct the complex phase (\({\psi }_{r}=\log {x}_{r}\)) to avoid potential instability caused by phase wrapping46. Note that the real and imaginary parts of \({\psi }_{r}\) represent the attenuation (with a negative sign) and the unwrapped phase of the sample, respectively. Now, the physics-based forward model (\(f:{\psi }_{r}\mapsto {y}_{r}\)) can be defined as follows:

$$f({\psi }_{r})={\left|{{\mathcal{P}}}_{2}\left\{{t}_{r}{{\mathcal{P}}}_{1}\left\{{e}^{{\psi }_{r}}\right\}\right\}\right|}^{2}* {\mathrm{IPSF}}_{r}$$
(1)

where \({{\mathcal{P}}}_{1,2}\left\{\cdot \right\}\) are the free-space propagation operators for the distances L1,2 based on angular-spectrum method, \({t}_{r}\) is the diffuser transmission function, IPSFr is the intensity point spread function, and is the image convolution operator (Fig. 1b).

We derive the diffuser transmission function (\({t}_{r}\)) from the reference speckle. A robust sample field reconstruction is found even with the phase ambiguity of the reference speckle (see Methods). The effect of partially coherent source is applied by convoluting an intensity point spread function (IPSFr) as described in Fig. 1a; or equivalently, by applying the intensity optical transfer function (IOTFk) as a window function. We experimentally obtain IOTFk from the power spectral density of the reference speckle, based on the Siegert relation (see Methods). Retrieved spatial horizontal and vertical coherence lengths (xcoh and ycoh) are 3.47 µm and 4.31 µm, respectively (Fig. S2). It is noteworthy that IOTFk here is identical to the coherence envelope” in ref. 4.

At this point, the phase retrieval problem can be understood as finding the ψr that best reproduces the measured speckle pattern based on the defined physical model:

$$\mathop{{\rm{minimize}}}\limits_{{\psi }_{r}}\mathop{\sum }\limits_{r}{\left({y}_{r}-f({\psi }_{r})\right)}^{2}$$
(2)

PWF is a gradient-based solver for Eq. (2), that seeks the solution by error backpropagation \(\nabla f({e}_{r})\), where \({e}_{r}={y}_{r}-f({\psi }_{r})\) (Fig. 1b). We use the Wirtinger derivative to compute the gradient of non-holomorphic complex functions to perform this minimization. As the measured intensity images are already a nonholomorphic function, most iterative algorithms have employed the Wirtinger derivative to backpropagate the error, either explicitly or implicitly. For instance, ptychographic iterations can be regarded as a variant that utilizes the Wirtinger derivative47,48, albeit with a modified loss function44.

Preconditioning is a popular technique for improving the convergence properties of iterative solvers, especially for ill-conditioned problems49. For a linear problem Ax = b, preconditioning can be understood as the compensation of the inhomogeneous singular value distribution of A, effectively making its gradient direction A more parallel to A−1. For complex nonlinear problems, however, the error propagation properties are often obscure, making it difficult to define an appropriate preconditioner.

Here, we determine the preconditioner with the help of speckle-tracking methods. Based on the working principle of speckle-tracking methods and their robustness, we understand that our imaging system is well-conditioned for the sample phase gradient rather than the phase itself. Since the phase gradient is equivalent to the filtered phase with linear spatial frequency ramps, the phase retrieval iteration is expected to be highly ill-conditioned, leading to large convergence rate discrepancy between high and low frequency (Fig. S3). Accordingly, we introduce an inverse quadratic preconditioning filter on the imaginary part of \({\psi }_{r}\) to invert the linear spatial frequency ramps and improve the convergence properties (see Methods).

The regularizer imposes the oversampling criterion in speckle-based imaging, which requires a sufficient oversampling ratio to ensure unique determination of the phase and amplitude in field reconstruction33,43. The oversampling ratio is defined as \(\gamma =M/N\), where M and N are the number of measured and reconstructed spatial modes37,50. We compute the space-bandwidth product (SBP) to estimate M and N, which then determine the regularization window for a given γ (see Methods).

The oversampling ratio (γ) and the regularization parameter (α) are determined heuristically. Since they usually depend strongly on the experimental signal-to-noise ratio (SNR), these parameters would depend on the imaging setup. Comparative results with different γ and α are shown in Figs. S4 and S5, respectively. Although different parameters would give different results, we did not find a strong dependence on the parameters between samples. Therefore, a fixed oversampling ratio (\(\gamma =1\)) and regularization parameter (\(\alpha =0.1+0.01i\)) are used throughout the paper. The real and imaginary parts of α are applied separately to the attenuation and phase of the sample field. The detailed calculations for the physical model, preconditioner, and regularizer are presented in Algorithm S2. The MATLAB code has been uploaded online to facilitate reproduction (see Data availability). To further accelerate the convergence, Nesterov’s accelerated gradient method was employed51 (Algorithm S1). A fixed step size (η = 1) is used throughout the paper.

Field reconstruction results

The experiments are performed at the X-ray microtomography beamline (6C, PLS-II) using a 10 keV partially coherent X-ray from a wiggler source (see Methods). A toothpick with glass beads is chosen as the first sample (see Methods). The measured raw speckle image is shown in Fig. 1a and the corresponding reconstruction results are shown in Fig. 2. The raw speckle and reconstructed images have the same number of pixels (2560 × 2160). The PWF algorithm takes 5 s to 6 s to converge on a personal computer (see Methods).

Along with PWF, we employ the low coherence system (LCS) implicit speckle-tracking method for a comparison27. It is the best-performing and assumption-free speckle-tracking method among the algorithms we tested24,25,27 (Fig. S6). Unlike PWF, which requires a single measurement (K = 1), speckle-tracking methods typically require multiple measurements for reliable phase reconstruction24,27. Although a few speckle-tracking methods use a single speckle, they usually rely on strong assumptions such as no attenuation25 and single material52. Accordingly, the acquisition process is repeated at 12 different diffuser positions (K = 12).

Despite fewer measurements, we find that PWF consistently outperforms LCS (Fig. 2). Significantly greater phase shifts are obtained (Fig. 2a, e), where phase shift here refers to the magnitude of phase values. Note that phase shifts greater than 30π are obtained, indicating that the sample is optically very thick. This result is meaningful in itself, as robust field reconstruction of such a thick sample is usually difficult due to the complicated loss function landscape induced by severe phase wrapping53. Interestingly, perhaps due to the existence of the preconditioner, we find that the PWF also exhibits lower sensitivity to slow phase variations as in phase-gradient sensing techniques54 (see Fig. S7). Fortunately, it is known that such low-frequency inaccuracy is largely mitigated by averaging measurements taken at different projection angles during the tomographic reconstruction55. Since our goal is to obtain a quantitative 3D phase image, the fidelity of the measured phase values is discussed in the next section alongside the tomographic reconstruction results (Fig. 3e). Less noisy phase gradient values are also observed, especially for the toothpick interior containing fine structures (Fig. 2b, f). Although the PWF algorithm directly provides an integrated phase image, the phase gradient is calculated for direct comparison with LCS to eliminate potential effects of the integration method used.

Notably reduced edge-enhancement effect is observed in the attenuation result of PWF (Fig. 2c, g). Since the enhancement at the edges originates from the abrupt local sample phase modulation4, the edge-enhancement artifacts in the attenuation image indicate an inefficient decoupling of phase shift and attenuation effects. We find that PWF successfully decouples the sample attenuation from the phase contribution despite the non-negligible propagation distance between the sample and the detector. This result strengthens the credibility of PWF in both phase and attenuation images.

Perhaps the most straightforward metric is the residual error, which directly indicates the discrepancy between the reconstructed results and the measurements. To quantify the residual error, we compute estimated speckle patterns based on the model and then obtain the root-mean-square error (RMSE) with respect to the measured speckle patterns (see Methods). Unlike PWF, which shows a uniform and noise level RMSE (Fig. 2d), LCS shows an RMSE image that is similar to a “dark-field” image (Fig. 2h)22,26,31. This similarity makes sense as speckle-tracking methods often obtain the dark-field images by introducing an additional term to explain the part that cannot be interpreted by the original speckle displacement model26,31. An important question arises here: if the RMSE corresponds to the dark-field signal, why is it absent from PWF?

The answer lies in the physical model of PWF. Note that the dark-field signal in projection X-ray imaging represents the fraction of sample diffraction that exceeds the coherence volume56. In other words, the dark-field is an alternative representation of the high-angle sample diffraction that cannot be acquired due to sample-induced decoherence. We find that PWF can incorporate some of the dark-field signal into the reconstructed sample field as high spatial frequency components. Despite the speckle blur, PWF is able to pick up the displacement as the forward model includes the decoherence effect (Eq. 1). This is analogous to incoherent speckle-based imaging methods that can reconstruct speckle displacements even from completely blurred speckle patterns38,39,40,41. Such high-angle diffraction signal reconstruction capability of PWF suggests an enhancement in resolution, which is discussed in the next section (Fig. 3f).

Tomogram results

From the field reconstruction results at various projection angles, we reconstruct the three-dimensional (3D) refractive index distribution of the sample, \(n({\bf{r}})=1-\delta ({\bf{r}})+i\beta ({\bf{r}})\), where r is 3D spatial position (see Methods). The \(\delta ({\bf{r}})\) and \(\beta ({\bf{r}})\) are separately reconstructed from the retrieved phase and attenuation images, respectively (Fig. 3a–d). Please refer to Video S1 for the results from different cross sections.

For both PWF and LCS, the \(\delta ({\bf{r}})\) values are consistently higher than \(\beta ({\bf{r}})\), resulting in better visualization of the detailed internal structure of the toothpick (Fig. 3a, c). In PWF, the sample boundaries are clearly reconstructed, including the interface between the toothpick and the adhesive (Fig. 3a, yellow arrows). In LCS, although the sample structure is reconstructed to some extent, significant artifacts are present, suggesting inaccurate phase reconstruction (Fig. 3c, red arrows). A clearer contrast between PWF and LCS can be seen in the \(\beta ({\bf{r}})\) results. Unlike the clear distinction between the glass beads and the toothpick in PWF (Fig. 3b), the LCS result shows large fluctuation along the edges (Fig. 3d). Such fluctuations are mainly found at sample edges that produce large phase gradient (Fig. 2g) and can be understood as phase-coupling artifacts.

We also compare the reconstructed δ and β values of the soda–lime glass beads with those reported in the literature (Fig. 3e). We calculate the expected δ and β values from the density provided by the manufacturer57. In PWF, we find that both δ and β agree with the expected values, confirming the credibility of the reconstructed phase and attenuation values (Fig. 2a, c). In LCS, however, significantly lower δ values are observed due to phase underestimation (Fig. 2e). Highly fluctuating β also provides physically incorrect negative β values. These results also agree well with numerical simulation (Fig. S8).

The 3D resolutions of the reconstructed tomograms are estimated based on Fourier shell correlation (FSC) analysis (Fig. 3f, see Methods). We find that conventional resolution criterion (i.e., 1/7) is not directly applicable here due to nonlinear noise propagation in PWF algorithm. Instead, we use the 1/4 criterion based on our derivation (see Methods). Note that K = 6 is used here for the LCS results in order to achieve two independent reconstructions from a total of 12 measurements (see Methods). Although FSC is useful for analyzing image resolution and comparing methods, it is a very sample-specific analysis. It does not present the general properties of an imaging system or algorithm. Therefore, direct extrapolation of the FSC results in Fig. 3f to other samples or setups may be invalid.

For \(\delta ({\bf{r}})\), we obtain a spatial resolution of 1.50 µm and 2.23 µm for PWF and LCS, respectively. The significant improvement in spatial resolution strongly supports the direct reconstruction capability of the high-angle diffraction signal in PWF, which is considered as dark-field signals in LCS (Fig. 2f). Although PWF reconstructs part of the dark field signal, we would like to clarify that not all scattering signals can be retrieved.

The theoretical spatial resolution of PWF is determined by the oversampling ratio criterion and the regularization window used. Since the regularization window is calculated based on the measured sample transfer function (STFk, see Methods), the achievable resolution is closely related to the speckle grain size. We experimentally demonstrate that the resolution of the reconstructed image degrades as the speckle grain size increases with different L2 values of 90 mm and 150 mm (Fig. S9). Despite the clear trend, establishing an universal theoretical limit on spatial resolution is difficult. Due to the smooth boundary of STFk (Eq. 14), reconstructed spatial bandwidth would highly depend on the practical noise level and the fineness of the sample structure. This is a common feature of projection-based X-ray imaging systems, which do not have band-limiting optical element between the sample and detector.

For \(\beta ({\bf{r}})\), we have spatial resolutions of 1.67 µm and 1.22 µm from PWF and LCS, respectively. In PWF, we achieve better spatial resolution in \(\delta ({\bf{r}})\) than in \(\beta ({\bf{r}})\), which encourages phase-contrast imaging not only for better contrast but also for better resolution. This also aligns well with the numerical simulation results (Fig. S8). Interestingly, LCS shows better FCS results than PWF in \(\beta ({\bf{r}})\). We observe the same behavior in the numerical simulation and determine that it is primarily caused by the strong, high-frequency phase-coupling artifacts of the LCS, which do not occur for the PWF (Fig. S10). Further, the simulation results suggest that PWF has superior resolution in both \(\delta ({\bf{r}})\) and \(\beta ({\bf{r}})\) as shown in Figs. S8 and S10.

To demonstrate the versatility of PWF, we also observe other samples—a cumin seed, a dried shrimp, a dried anchovy, and a piece of cork are chosen—with the same setup (Fig. 4). Due to the limited beamtime available, the additional samples are simply purchased from a local store and observed without any additional treatment (see Methods). All the experimental and reconstruction parameters are kept identical.

We get surprisingly good reconstruction results for all the samples (Fig. 4). Although we obtain results for both \(\delta ({\bf{r}})\) and \(\beta ({\bf{r}})\) for each sample, we focus more on \(\delta ({\bf{r}})\) here, since \(\beta ({\bf{r}})\) is achievable using traditional methods. It is noteworthy, however, significant differences between the \(\delta ({\bf{r}})\) and \(\beta ({\bf{r}})\) are found, which highlight the potential specificity enhancement of complex-field imaging (Fig. S11).

Fine sample structures are well visualized, as highlighted in the insets. Especially for the cumin seed, not only the outer fine structures but also the inner fine structures are clearly depicted (Fig. 4a, yellow arrows). The fine and complex structures of the shrimp are also well reconstructed (Fig. 4b, yellow arrows). Lipids and the exoskeleton are naturally highlighted by the higher δ values (Fig. 4b, red arrows).

We can also see the clear boundaries between the internal organs of the anchovy (Fig. 4c, red arrows). The fine cell structures of the cork are successfully reconstructed (Fig. 4d, yellow arrows). However, a certain amount of background artifacts are found in the cork result. We believe this is due to the mostly fine structure of cork, which effectively requires more projection angles for complete reconstruction58. Based on the Crowther criterion, at least 1047 projection angles in the range of 0° to 180° are required to fully reconstruct a 1 mm sample with a spatial resolution of 1.5 µm, while we used 801 projection angles throughout the experiments.

Discussion

We successfully demonstrate speckle-based X-ray microtomography via PWF. To extract the sample field information from random speckle images, we combine three lessons learned from different disciplines: the Wirtinger flow is adapted from the random-measurement-based phase retrieval studies in signal processing and information theory43,44; the preconditioner is inspired by the phase gradient retrieval of speckle-tracking methods studied in the field of X-ray phase contrast imaging17,18; and the regularizer is inspired by the oversampling condition studied in the field of coherent imaging33,36.

We present the detailed PWF algorithm, along with additional reconstruction results obtained without preconditioner or regularizer. In the experiments, we show that PWF provides good reconstruction results for both the phase and attenuation of the sample. The reconstruction fidelity and spatial resolution are confirmed by comparing the measured refractive index of a glass bead with the literature and FSC analysis. We also compare PWF with an existing speckle-tracking method (LCS) and find that PWF clearly outperforms LCS in terms of reconstruction fidelity and spatial resolution, despite the fact that all the comparison results are based on a single measurement for PWF, while 12 different measurements are used for LCS. Finally, we demonstrate the versatility of PWF by successfully visualizing 3D refractive index distributions of various samples without changing the setup and reconstruction parameters.

We believe that this work will facilitate speckle-based X-ray imaging in various applications. In synchrotron facilities, conventional microtomography beamlines can easily reproduce the same results by simply introducing a diffuser after the sample. We expect PWF to be applicable to a wide range of experimental conditions, regardless of source size and source-sample distance. We also expect that the spectral dimension can also be explored by scanning the X-ray energies without any changes to the imaging setup. The chemical or material composition can be inferred from the absorption and dispersion relations59.

We are also optimistic about the application of PWF to benchtop X-ray sources, as the decoherence effects of partially coherent sources are already accounted for in the model. This extends the applicability of speckle-based phase-contrast X-ray imaging to a wider range of on-field applications. The single-shot nature of PWF also provides a significant practical advantage in terms of measuring time, stability, and radiation dose, especially in clinical applications60. We believe that the applicability to polychromatic sources should be explored first, since most benchtop sources provide polychromatic emission and the current forward model only considers monochromatic light.

Materials and methods

Sample preparation

Glass beads (≤106 µm, 2.5 g cm−3, G8893, Merck KGaA) are attached to a toothpick with an adhesive for the sample shown in Figs. 2 and 3. The samples shown in Fig. 4 (a cumin seed, a dried anchovy, a dried shrimp, and a piece of cork) are procured from a local store. The piece of cork is obtained from a wine cork.

Experimental setup

The experiments are performed at the 6 C beamline of PLS-II in Korea. A wiggler source is used (500 µm × 30 µm full width at half maximum), followed by a double multilayer monochromator (DMM) tuned to 10 keV (∆E/E = 2.1% measured at 16 keV). The sample is placed 36 m after the source. The diffuser is four layers of P3000 grit sandpaper (991 A, Starcke GmbH & Co. KG) placed L1 = 3 mm after the sample. The scintillator is 50-µm thick LuAG:Ce placed L2 = 20 mm downstream of the diffuser. Note that the shortest possible L1 and L2 are chosen to minimize sample-induced decoherence (Supplementary Text). Specifically, L1 is chosen to ensure that it does not physically interfere with the sample rotation stage, and L2 is determined to ensure that the spatial bandwidth of the speckle pattern is well sampled without aliasing. Note that the speckle grains become coarser as L2 increases due to the partial coherence (Fig. S2). The image on the scintillator is observed with an optical microscope equipped with an objective lens (NA = 0.25, LMPLFLN 10X, Olympus Corp.) and a sCMOS camera (6.5 µm, 2560 × 2160, pco.edge 5.5, Excelitas PCO GmbH). The corresponding spatial sampling period and area are 650 nm and 1.664 mm × 1.404 mm, respectively.

Data acquisition

For each sample, data are acquired in the following order: (i) dark frame, (ii) flat field for the beam image, (iii) reference speckle from the diffuser, (iv) sample projection images. In step (iv), the samples are rotated continuously from 0° to 180°, while the projection images are taken on the fly at each 0.225° rotation angle, resulting in a total of 801 images. The camera exposure time is kept at 100 ms throughout the measurements. For speckle-tracking methods, which typically require 10 to 20 independent measurements27,29, the acquisition process is repeated 12 times at different lateral positions of the diffuser. In all subsequent reconstruction steps, all speckle images are considered to have been pre-processed with dark frame subtraction and flat-field correction. The entire acquisition process, including all stage movements and pauses, is completed in approximately 4 min (see Video S6).

Intensity optical transfer function (IOTF)

The coherence length at the detection plane is calibrated from the autocorrelation function of the reference speckle pattern based on the Siegert relation

$${g}^{(2)}(\Delta x)=1+{\left|{g}^{(1)}(\Delta x)\right|}^{2}$$
(3)

while \({g}^{(1)}(\Delta x)\) and \({g}^{\left(2\right)}\left(\Delta x\right)\) are first- and second-order normalized autocorrelation functions of a speckle pattern61. Note the \({\left|{g}^{(1)}(\Delta x)\right|}^{2}\) term is closely related to the spatial coherence of light62. Due to the inherent asymmetry of the synchrotron source, horizontal and vertical coherence lengths (xcoh and ycoh) are estimated separately. For example, for xcoh, we compute the one-dimensional normalized autocorrelation function along the x-axis, average it along the y-axis, and estimate the coherence length by fitting it to the Gaussian model \({g}^{(2)}(\Delta x)=1+\exp \left(-\pi \Delta {x}^{2}/{x}_{{\rm{coh}}}^{2}\right)\)62. Corresponding intensity point spread function (IPSF) becomes

$${{\rm{IPSF}}}_{r}=\frac{2}{{x}_{{\rm{coh}}}{y}_{{\rm{coh}}}}\exp \left[-2\pi \left(\frac{{x}^{2}}{{x}_{{\rm{coh}}}^{2}}+\frac{{y}^{2}}{{y}_{{\rm{coh}}}^{2}}\right)\right]$$
(4)

where r is the index of vectorized image in (x, y) space. By the Fourier transform of Eq. (4), we can calculate the intensity optical transfer function (IOTF),

$${{\rm{IOTF}}}_{k}=\exp \left[-\frac{\pi }{2}\left({u}^{2}{x}_{{\rm{coh}}}^{2}+{v}^{2}{y}_{{\rm{coh}}}^{2}\right)\right]$$
(5)

where u and v are the spatial frequencies of x and y, respectively, and k is the index of vectorized image in (u, v).

Diffuser transmission function

We derived the transmission function of the diffuser (tr) from the reference speckle. First, we removed the spatial incoherence effect using Wiener deconvolution with the calibrated IOTF (Eq. 5). This step estimates the ‘coherent’ reference speckle intensity at the camera plane. Then, the tr can be obtained by the numerical backpropagation (L2) of the deconvoluted reference speckle. One problem here is that its phase is unknown.

We originally attempted to retrieve the sample and diffuser fields simultaneously, as done in ptychographic iterations47,48. However, we found that the obtained diffuser phase is usually unreliable (e.g., slowly varying, spatially inhomogeneous) and changes drastically depending on the initial guess (Fig. S12). Despite this instability, the sample field was retrieved in a stable manner with indiscernible tomogram results. It is noteworthy that this result agrees with near-field ptychography results32,63,64. This may suggest that there is only one sample field that satisfies both the sample and reference speckle simultaneously, regardless of the phase of the reference speckle. In other words, the sample field can be retrieved with any of the possible diffuser phase. Based on this, we simply fix the phase of the reference speckle to zero.

This is a significant advantage over conventional speckle-based imaging methods, which usually require a well-defined complex diffuser transmission function34,50. We expect this advantage to hold only for weakly scattering samples with a near-field setup that does not induce significant overlap between speckle grains. Further investigation is needed to determine these conditions.

Preconditioning filter

The phase retrieval problem in speckle-based X-ray microtomography is well conditioned on the phase gradient rather than the phase. As such, the gradient with respect to the phase gradient would have a better convergence property. Thus, at each iteration, we want to perform the gradient descent step to the phase gradient such as

$$\begin{array}{l}\begin{array}{l}\frac{\partial {\phi }_{r}}{\partial x}\leftarrow \frac{\partial {\phi }_{r}}{\partial x}-\eta \frac{\partial {\mathcal{L}}}{\partial (\partial {\phi }_{r}/\partial x)}\\ \frac{\partial {\phi }_{r}}{\partial y}\leftarrow \frac{\partial {\phi }_{r}}{\partial y}-\eta \frac{\partial {\mathcal{L}}}{\partial (\partial {\phi }_{r}/\partial y)}\end{array}\\ \end{array}$$
(6)

where L is the loss function defined in Eq. (2), \({\phi }_{r}={\rm{Im}}({\psi }_{r})\) is the sample phase, and η is a step size. Applying additional partial derivative to both sides of Eq. (6) and adding them together, we get

$${\nabla }_{\perp }^{2}{\phi }_{r}\leftarrow {\nabla }_{\perp }^{2}{\phi }_{r}-\eta \left(\frac{\partial }{\partial x}\frac{\partial {\mathcal{L}}}{\partial (\partial {\phi }_{r}/\partial x)}+\frac{\partial }{\partial y}\frac{\partial {\mathcal{L}}}{\partial (\partial {\phi }_{r}/\partial y)}\right)$$
(7)

where \({\nabla }_{\perp }^{2}={\partial }^{2}/\partial {x}^{2}+{\partial }^{2}/\partial {y}^{2}\) is a two-dimensional Laplacian. The second term on the right-hand side can be simplified by the chain rule

$$\frac{\partial }{\partial x}\frac{\partial {\mathcal{L}}}{\partial (\partial {\phi }_{r}/\partial x)}=\frac{\partial }{\partial x}\frac{\partial {\phi }_{r}}{\partial (\partial {\phi }_{r}/\partial x)}\frac{\partial {\mathcal{L}}}{\partial {\phi }_{r}}=\frac{\partial (\partial {\phi }_{r}/\partial x)}{\partial (\partial {\phi }_{r}/\partial x)}\frac{\partial {\mathcal{L}}}{\partial {\phi }_{r}}=\frac{\partial {\mathcal{L}}}{\partial {\phi }_{r}}$$
(8)

which leads

$${\nabla }_{\perp }^{2}{\phi }_{r}\leftarrow {\nabla }_{\perp }^{2}{\phi }_{r}-\eta \frac{\partial {\mathcal{L}}}{\partial {\phi }_{r}}$$
(9)

where the factor of 2 is omitted here as it is coupled with the step size. Applying inverse Laplacian to both sides of Eq. (9), we obtain

$${\phi }_{r}\leftarrow {\phi }_{r}-\eta {\nabla }_{\perp }^{-2}\frac{\partial {\mathcal{L}}}{\partial {\phi }_{r}}$$
(10)

Compared to the conventional gradient-based phase iteration, \({\phi }_{r}\leftarrow {\phi }_{r}-\eta \partial {\mathcal{L}}/\partial {\phi }_{r}\) there is an additional inverse Laplacian term before the gradient term, which is the preconditioner. The preconditioner is introduced by applying the inverse quadratic preconditioning filer \({P}_{k}^{-1}:=1/({u}^{2}+{v}^{2})\) to the update term,

$${\phi }_{r}\leftarrow {\phi }_{r}-{{\mathcal{F}}}^{-1}\left\{{P}_{k}^{-1}{\mathcal{F}}\left\{\frac{\partial {\mathcal{L}}}{\partial {\phi }_{r}}\right\}\right\}$$
(11)

where u and v are the spatial frequencies of x and y, and \(\mathcal{F}\left\{ \cdot \right\}\) is the Fourier transform. For the zero frequency (u = 0 and v = 0) \({P}_{k}^{-1}\) is set to zero, and is determined separately later by zeroing the background phase value. Since \({\phi }_{r}={\rm{Im}}({\psi }_{r})\), the preconditioning filter is applied only to the imaginary part of \({\psi }_{r}\). The detailed calculation can be found in Algorithm S2.

Regularization window

In this work, the number of measured spatial modes is defined by the IOTFk, which effectively limits the measurable spatial bandwidth. Since Eq. (5) gives the effective bandwidths of \(\sqrt{2}{x}_{{\rm{coh}}}^{-1}\) and \(\sqrt{2}{y}_{{\rm{coh}}}^{-1}\) 62, we can estimate the number of measured spatial modes from the space-bandwidth product (SBP) of the measured images50,

$$M=\frac{1}{2}\left(\frac{\sqrt{2}{{\rm{FOV}}}_{x}}{{x}_{{\rm{coh}}}}\right)\left(\frac{\sqrt{2}{{\rm{FOV}}}_{y}}{{y}_{{\rm{coh}}}}\right)=\left(\frac{{{\rm{FOV}}}_{x}}{{x}_{{\rm{coh}}}}\right)\left(\frac{{{\rm{FOV}}}_{y}}{{y}_{{\rm{coh}}}}\right)$$
(12)

where FOVx and FOVy are the horizontal and vertical extents of acquired images (i.e., field of view). Note that the factor 1/2 is introduced here to correctly estimate the SBP of speckle field from the SBP of intensity speckle image. Assuming the Gaussian angular spectrum, the intensity speckle would have greater bandwidth by the factor of \(\sqrt{2}\), which results in the SBP increase by the factor of 2. The number of reconstructed spatial modes (N) can be defined from given oversampling ratio γ and Eq. (12),

$$N=\left(\frac{{{\rm{FOV}}}_{x}}{\sqrt{\gamma }{x}_{{\rm{coh}}}}\right)\left(\frac{{{\rm{FOV}}}_{y}}{\sqrt{\gamma }{y}_{{\rm{coh}}}}\right)$$
(13)

Since the reconstructed fields have the same spatial extent as the measured images, Eq. (13) directly leads to additional constraints on their Fourier space by the bandwidth of \({x}_{{\rm{coh}}}^{-1}/\sqrt{\gamma }\) and \({y}_{{\rm{coh}}}^{-1}/\sqrt{\gamma }\). Assuming a Gaussian profile, we can define the sample transfer function (STFk),

$${{\rm{STF}}}_{k}=\exp \left[-\pi \gamma \left({u}^{2}{x}_{{\rm{coh}}}^{2}+{v}^{2}{y}_{{\rm{coh}}}^{2}\right)\right]$$
(14)

and subsequently the regularization window,

$${\Gamma }_{k}^{2}=1-{{\rm{STF}}}_{k}$$
(15)

The used oversampling ratio (γ = 1) may seem inconsistent with previous works that empirically report γ ≥ 4 for robust field reconstruction33,43. The difference is that here we use a Gaussian STF, which provides a soft boundary, while previous works used a well-defined domain, which is equivalent to binary STFs. We find that the soft boundary relaxes the dependence of the reconstruction fidelity on γ (Fig. S4), and we prefer to use a smaller γ to obtain the best possible resolution.

Due to the Gaussian shape of the STFk, it does not have a distinct limit on the resolution; rather, it depends heavily on the fineness of the sample structure and the noise level. For instance, if the sample bandwidth is retrievable up to STFk = 0.1 boundary, the expected spatial resolution can be calculated \(\sqrt{\pi }{x}_{{\rm{coh}}}/\left(2\log 10\right)\approx 0.385{x}_{{\rm{coh}}}\) from Eq. (14). Using the measured coherence lengths in Fig. S2, the expected horizontal and vertical resolution becomes 1.34 µm and 1.66 µm, respectively. The root-mean-square resolution is 1.51 µm, which agrees with the acquired resolution in FSC analysis (Fig. 3f).

Reconstruction procedure

The PWF iteration is stopped when the normalized correlation between the retrieved field of the current iteration and the previous iteration is greater than 10−0.00001. Typically, 700 to 1300 iterations are required to meet the criterion, which takes 5 s to 6 s on a personal computer equipped with a graphics processing unit (GPU; GeForce RTX 4090, NVIDIA Corp.) using MATLAB software (The MathWorks, Inc.). Thus, for each sample, it took 70 min to 80 min to process all 801 projection angles. Note that all the computations in Algorithm S1 are element-wise operations and Fourier transforms, which were effectively accelerated by the GPU.

In both PWF and LCS, the root-mean-square error (RMSE) image is calculated from the reconstructed sample fields. By applying reconstructed sample fields to the forward model, we calculate the estimated speckle patterns, \(f({x}_{r})\) and subtract it from the measured speckle patterns, \({y}_{r}\). In PWF, there is only one speckle pattern to compare (K = 1), so the RMSE image simply is \(\left|{y}_{r}-f({x}_{r})\right|\). In LCS, we have twelve speckle patterns to compare (K = 12), so the RMSE image becomes \(\sqrt{\sum_{m=1}^{K}{\left({y}_{r}^{m}-{f}_{{\rm{LCS}}}\left({x}_{r}^{m}\right)\right)}^{2}}\), where fLCS is the forward model of LCS. The RMSE images are normalized with the mean intensity of the reference speckle and present in % (Fig. 2d, h).

The tomogram results are reconstructed using the standard filtered back projection (FBP) algorithm with Ramachandran-Lakshminarayanan (“Ram–Lak”) filter. The iradon function of MATLAB is independently applied at each vertical position of the sample. A constant \(\lambda /p/(2\pi )\) is then multiplied to convert the tomogram results to \(\delta ({\bf{r}})\) and \(\beta ({\bf{r}})\), where p is a pixel size of the reconstructed sample tomogram. We often experience low-frequency background curvature due to the intrinsic lower sensitivity on slow phase variation of phase-gradient sensing techniques54. We mitigate such artifacts by fitting the background area with a two-dimensional quadratic function after tomographic reconstruction.

Fourier shell correlation (FSC)

For FSC, we utilize two tomograms of the toothpick sample (Fig. 3) that are reconstructed from the measurements at different lateral positions of the diffuser. Defining either the real or imaginary parts of two tomograms as \({g}_{1}({\bf{r}})\) and \({g}_{2}({\bf{r}})\), the FSC can be calculated from the normalized cross-correlation of their Fourier transforms over the shell,

$${\rm{FSC}}(\kappa )=\frac{{\left\langle {\tilde{g}}_{1}^{* }({\bf{k}}){\tilde{g}}_{2}({\bf{k}})\right\rangle }_{{{||}{\bf{k}}{||}}_{2}=\kappa }}{\sqrt{{\left\langle {\tilde{g}}_{1}^{* }({\bf{k}}){\tilde{g}}_{1}({\bf{k}})\right\rangle }_{{{||}{\bf{k}}{||}}_{2}=\kappa }{\left\langle {\tilde{g}}_{2}^{* }({\bf{k}}){\tilde{g}}_{2}({\bf{k}})\right\rangle }_{{{||}{\bf{k}}{||}}_{2}=\kappa }}}$$
(16)

where \({\tilde{g}}_{1,2}({\bf{k}})\) is the Fourier transform of \({g}_{1,2}({\bf{r}})\), and κ is the radius of the shell in the 3D Fourier space65. For PWF, a total of 12 tomograms are independently reconstructed from the K = 12 measurements, and two are randomly selected for FSC calculation. Five FSC are calculated from different pairs of tomograms and averaged. For LCS, the K = 12 measurements are randomly divided into two sets of K = 6, and the tomograms reconstructed from the two sets are used for the FSC. Similarly, five FSC are calculated from different random splits and averaged.

Regarding the resolution criterion, we followed the derivation in Ref. 65, which suggests the 0.143 (or 1/7) criterion for the “full-set averaged” result. For example, if there are K repetitive measurements, the FSC is calculated between the two half-set averaged results of K/2 non-overlapping measurements. To extrapolate this FSC result to the resolution of the full-set averaged result, an improvement in signal-to-noise ratio (SNR) by \(\sqrt{2}\) was assumed. Unfortunately, this result cannot be applied directly to our case for two reasons: first, PWF requires only a single measurement (K = 1), which cannot be divided into two sets; and more importantly, the SNR may not be simply proportional to K because noise propagation in PWF is more complex due to the nonlinear reconstruction process. Therefore, we redo the derivation without the full set extrapolation.

Let us divide the measured tomogram into the sample signal and noise terms, \({g}_{1,2}({\bf{r}})={s}_{r}({\bf{r}})+{n}_{1,2}({\bf{r}})\). Note that \({s}_{r}({\bf{r}})\) is invariant across measurements, while \({n}_{1,2}({\bf{r}})\) fluctuates randomly. Here we can consider the result to be reliable if the normalized correlation between the data and the ideal sample signal is greater than 1/2,

$$\frac{1}{2}\leqq \frac{\left\langle {\tilde{g}}^{* }\tilde{s}\right\rangle }{\sqrt{\left\langle {\tilde{g}}^{* }\tilde{g}\right\rangle \left\langle {\tilde{s}}^{* }\tilde{s}\right\rangle }}=\sqrt{\frac{\left\langle {\tilde{s}}^{* }\tilde{s}\right\rangle }{\left\langle {\tilde{s}}^{* }\tilde{s}\right\rangle +\left\langle {\tilde{n}}^{* }\tilde{n}\right\rangle }}$$
(17)

where the \({{||}{\bf{k}}{||}}_{2}=\kappa\) condition of the average operator, and the argument of \(\tilde{g}({\bf{k}})\), \(\tilde{s}({\bf{k}})\), and \(\tilde{n}({\bf{k}})\) are dropped for brevity. Note that the \(\left\langle {\tilde{n}}^{* }\tilde{s}\right\rangle\) and \(\left\langle {\tilde{s}}^{* }\tilde{n}\right\rangle\) terms have disappeared since the noise should be random and uniform over space, which leads \(\left\langle {\tilde{s}}^{* }\tilde{n}\right\rangle =\left\langle {\tilde{s}}^{* }\right\rangle \left\langle \tilde{n}\right\rangle\) and \(\left\langle \tilde{n}\right\rangle =0\) for the non-zero spatial frequencies (\(\kappa > 0\)), respectively. Similarly, we can rewrite Eq. (16) as

$${\rm{FSC}}(\kappa )=\frac{\left\langle {\tilde{s}}^{* }\tilde{s}\right\rangle }{\left\langle {\tilde{s}}^{* }\tilde{s}\right\rangle +\left\langle {\tilde{n}}^{* }\tilde{n}\right\rangle }$$
(18)

Substituting Eq. (18) into Eq. (17), we get \({\rm{FSC}}(\kappa)\geqq 1/4\), which suggests 1/4 as the resolution criterion. Again, the only difference from the derivation in Ref. 65 is the full set extrapolation that introduces an additional 1/2 factor for \(\left\langle {\tilde{n}}^{* }\tilde{n}\right\rangle\) in Eq. (17).