Abstract
Long-term and high-spatiotemporal-resolution 3D imaging of living cells remains an unmet challenge for super-resolution microscopy, owing to the noticeable phototoxicity and limited scanning speed. While emerging light-field microscopy can mitigate this issue by capturing biological dynamics in three dimensions with merely a single snapshot, its resolution remains insufficient for resolving subcellular structures. Here we propose an Adaptive Learning PHysics-Assisted Light-Field Microscopy (Alpha-LFM) with a physics-assisted deep learning framework and adaptive-tuning strategies capable of light-field reconstruction of diverse subcellular dynamics. Alpha-LFM delivers sub-diffraction-limit spatial resolution (up to ~120 nm) while maintaining high temporal resolution and low phototoxicity. It enables rapid and mild 3D super-resolution imaging of diverse intracellular dynamics at hundreds of volumes per second with exceptional detail. Using the Alpha-LFM approach, we finely resolve lysosome-mitochondria interactions, capture the rapid motion of peroxisomes and the endoplasmic reticulum at 100 volumes per second, and reveal the variations in mitochondrial fission activity throughout two complete cell cycles of 60 h.
Introduction
In living cells, different organelles work together to execute diverse and intricate physiological functions. Elucidating the fast dynamics and interactions of organelles across the cell cycle requires microscopes with sufficiently high spatiotemporal resolution in four dimensions (time + 3D space) and sufficiently low phototoxicity for long-term observation1,2,3,4. This is highly challenging because current 3D microscopy techniques are constrained by the maximal photon budget a sample permits, which results in an inevitable trade-off among imaging speed, spatial resolution, and photon efficiency5,6,7,8,9,10,11,12,13,14. A series of 3D microscopy implementations, including confocal microscopy11,15, 3D structured illumination microscopy (3D SIM)5,7,12 and light-sheet microscopy (LSM)9,16,17, have been intensively developed for live-cell imaging beyond the diffraction limit, e.g., ~150 nm lateral and ~280 nm axial resolution by the SIM mode of the lattice light-sheet microscope (LLSM-SIM)9. However, these scanning-based super-resolution (SR) approaches often require recording hundreds of frames at different planes to reconstruct a super-resolved volume, so their temporal resolution is limited to a few seconds and their relatively high phototoxicity limits acquisition to hundreds of volumes9. Therefore, it is difficult for these approaches to observe either long-term evolution during the cell cycle or instantaneous subcellular processes occurring on a millisecond timescale.
Unlike scanning-based microscopy, light-field microscopy (LFM) provides photon-efficient direct volumetric imaging by encoding both position and angular information of 3D signals on single 2D camera snapshots without time-consuming axial scanning18,19,20,21,22,23,24. Benefiting from high-speed scanning-free 3D imaging, LFM has facilitated various biological studies on neural activities, cardiac hemodynamics, and live cells21,22,25. Although its superior temporal resolution and low phototoxicity are well-suited for live imaging, the spatial resolution of LFM is often unsatisfactory, owing to the insufficient pixel sampling and the limited sub-numerical aperture of each simultaneously captured light-field view. Many efforts have been made to overcome this limitation of LFM22,26,27; for example, DAOSLIMIT can yield a spatial resolution of ~220 nm after a 9-times aperture scanning that slightly lowers the imaging speed22. Meanwhile, the use of iterative light-field deconvolution for 2D-to-3D image reconstruction is vulnerable to artifacts and incapable of surpassing the diffraction limit. There thus remains an “impossible performance triangle” that greatly limits the design space of current 3D fluorescence microscopy techniques aiming for high speed, high spatial resolution, and high photon efficiency.
The recent advent of deep neural networks for image reconstruction enlarges the design space of microscopy by introducing prior knowledge of high-resolution data for learning and inference28,29. We previously reported view-channel-depth light-field microscopy (VCD-LFM), in which a VCD network model is trained to learn the nonlinear relationships between the 3D confocal ground truths (GTs) and their 2D light-field projections. Once trained, the model can directly reconstruct a high-resolution 3D volume from a single 2D light-field image by using its channels to transform the implicit features of the light-field views into depth information of a 3D stack. By using a deep-learning model to bring the high resolution of scanning microscopy to the high-speed imaging of LFM, VCD-LFM holds promise for high-speed and high-resolution 4D imaging of live samples30,31,32,33. However, current end-to-end supervised networks are constrained both in how far they can enhance the data and by their requirement for large datasets. When dealing with complex inverse problems, ensuring accuracy is challenging. For instance, reconstructing a 3D SR image from a 2D undersampled light-field image with spatial bandwidth compressed by around 600 times requires reversing the various degradations brought by noise, resolution loss, and dimensionality reduction. This presents a highly intricate, ill-posed problem with a huge solution space. The fitting ability of the network depends on the model complexity, the amount of prior data available, and loss constraints34. Traditional one-stage end-to-end networks, with a limited amount of label data and model complexity, struggle to find precise SR solutions in such a huge space, resulting in either limited resolution or reduced fidelity. Another typical challenge that supervised learning has to face is the need for massive, high-quality data and extensive training time. This arises from the network’s predominant focus on learning high-dimensional features of specific training samples from a large amount of high-quality data, which limits its applicability to new samples.
Here we propose an Adaptive Learning PHysics-Assisted Light-Field Microscopy (Alpha-LFM), capable of accurate super-resolved reconstruction of diverse subcellular structures. To enhance the network’s fitting capability, we established an adaptive-learning physics-assisted network framework (Alpha-Net) that increases model complexity and data constraints by decoupling the complex light-field inverse problem into multiple subtasks with multi-stage data guidance. Instead of simply combining a 3D SR Net with VCD-Net, our decomposition strategy is designed to progressively denoise, de-alias, and reconstruct, facilitating a more precise 3D reconstruction by effectively leveraging the angular information in light-field (LF) views. To implement this decomposition strategy, we developed a physics-assisted hierarchical data synthesis pipeline to introduce multi-stage data priors and a decomposed-progressive optimization (DPO) strategy to enable the convergence of multi-stage networks. We demonstrate that, through careful design of the model and training strategies, Alpha-Net can notably narrow the inversion space and find the correct solution more efficiently, enabling 3D SR reconstruction of diverse samples with improved fidelity. To tackle unseen structures, we further developed an adaptive tuning strategy that allows fast optimization for new live samples with the physics assistance of in situ 2D wide-field images. Alpha-LFM images the dynamics of intracellular structures in live cells at an isotropic spatial resolution of up to ~120 nm and a volume rate of hundreds of hertz, facilitating the analysis of lysosome-mitochondria interactions as well as the rapid motion of peroxisomes and the endoplasmic reticulum (ER). With the minimized phototoxicity of Alpha-LFM, we achieve 3D live-cell imaging over 60 h for the tracking of mitochondrial evolution across two complete cell cycles.
Results
Principle and implementation of Alpha-LFM
LFM encodes the spatial-angular patterns from a 3D sample into a single 2D image34,35. This procedure contains multiple imaging degradations, including dimension compression from the diffraction-unlimited 3D object to a diffraction-limited 2D LF projection, the frequency aliasing induced by the undersampling of the microlens array (MLA) during the encoding of spatial-angular information, and the noise arising mainly from camera exposure (Fig. 1a). In our study, light-field reconstruction beyond the diffraction limit needs to invert this compression with a space-bandwidth product (SBP) expansion of over 600 times (Methods). This intricate and ill-posed inversion problem results in an extensive solution space that maps the undersampled LF measurement to the possible 3D SR solutions, thereby posing a major challenge to both deconvolution-based approaches without data priors and standard end-to-end DL models constrained by limited label data and model complexity30 (Supplementary Note 1). This leads to unsatisfactory reconstruction performance, especially when imaging fine subcellular structures, where both high spatial resolution beyond the diffraction limit and improved fidelity close to the GT are required for downstream tasks.
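The spatial-angular encoding described above can be illustrated with a minimal NumPy sketch that rearranges a raw 2D light-field image into its angular views. This is not the authors' code: the function name `extract_views`, the parameter `n_ang`, and the (u, v, y, x) axis order are hypothetical, and the sketch assumes an idealized spatial LFM in which every microlens covers exactly `n_ang` × `n_ang` pixels and is aligned with the pixel grid.

```python
import numpy as np

def extract_views(lf_image, n_ang):
    """Rearrange a 2D light-field image into a 4D (u, v, y, x) stack of views.

    Assumes (hypothetically) that every microlens covers exactly
    n_ang x n_ang camera pixels and that the lenslet grid is aligned
    with the pixel grid, so pixel (u, v) under each lenslet belongs
    to angular view (u, v).
    """
    h, w = lf_image.shape
    if h % n_ang or w % n_ang:
        raise ValueError("image size must be a multiple of n_ang")
    # Split each axis into (lenslet index, pixel-under-lenslet index).
    tiles = lf_image.reshape(h // n_ang, n_ang, w // n_ang, n_ang)
    # Reorder to (u, v, y, x): angular axes first, spatial axes last.
    return tiles.transpose(1, 3, 0, 2)

lf = np.arange(6 * 6, dtype=float).reshape(6, 6)  # toy 6x6 "sensor"
views = extract_views(lf, n_ang=3)                # 3x3 views of 2x2 pixels each
```

Each view holds only one pixel per microlens, which is the undersampling that gives rise to the frequency aliasing mentioned in the text. A real system would additionally need rectification of the lenslet grid before this step.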
The principle and architecture of Alpha-LFM and its training strategy. a The stepwise network is designed based on the physical model of the light-field imaging process. The light-field imaging process encodes multidimensional degradation mainly including: (i) the frequency aliasing caused by the limited NA of objectives and sub-apertures of microlens, (ii) dimension compression during light field encoding, and (iii) the noise addition by the camera recording. b Three light-field-aware sub-networks with view-attention denoising, spatial-angular de-aliasing, and VCD 3D reconstruction restoration tasks are designed to disentangle and sequentially solve the complex light-field inversion problem. Multi-stage training data, including De-aliased LF, Clean LF, and Noisy LF, are synthesized from the same 3D SR data using a “physics-embedded hierarchical data synthesis” to guide the training of the sub-networks. The decomposed-progressive optimization strategy ensures the collaboration of the sub-networks when training the multi-stage data. c Direct Alpha-Net reconstruction when the training and testing datasets are of the same type of structure. d The schematic illustrates the adaptive inference strategy of Alpha-LFM. When inferring previously unseen types of structures, experimentally-obtained 2D WF and corresponding LF images of the new structure are used to instantly tune the pre-trained model, making it adaptive to the unseen structures.
To improve the solving capability of the network and reduce the solution space, we established a physics-assisted deep-learning framework that enhances the model complexity and the data constraint by disentangling the complex light-field inverse problem into multiple subtasks with multi-stage data guidance and joint optimization (Fig. 1b, Supplementary Note 1, Supplementary Movie 1, “Methods”). Instead of simply using a 3D SR network to further improve the diffraction-limited resolution of VCD results, we devised a multi-stage network encompassing LF denoising, LF de-aliasing, and 3D reconstruction to progressively solve the light-field inverse problem. This design lets us introduce more angular constraints in the de-aliasing network, which notably enhances reconstruction fidelity. Meanwhile, our decomposition strategy also achieves four-orders-of-magnitude higher inference speed by avoiding the use of complex 3D blocks (Supplementary Fig. 1). To implement this decomposition strategy, we first developed a sub-aperture shifted light-field projection (SAS LFP) strategy to generate light-field images without frequency aliasing (De-aliased LF) to guide the LF de-aliasing task, in which we projected the 3D SR images containing sub-aperture shifts into a series of clean light-field images (Clean LF) and rearranged them into a single one (Supplementary Fig. 2, Supplementary Note 1). This strategy serves as the cornerstone of our physics-assisted hierarchical data synthesis pipeline, which allows semi-synthetic multi-stage data priors to be conveniently generated from the same 3D SR data based on the light-field model and then used to progressively guide the sub-networks (Fig. 1b, “Methods”).
To achieve the SR light-field reconstruction, which involves a 30-fold increase in sampling and a transformation from 2D to 3D, we devised light-field-aware networks for each inversion task and a DPO strategy to efficiently facilitate the collaboration of multiple sub-networks, together ensuring that the whole network can find an optimal solution with improved resolution and fidelity (Fig. 1b, Supplementary Note 1). To fully exploit the angular information from the multiple views of an LF image, we incorporated view-attention denoising modules, spatial-angular convolutional feature extraction operators, and disparity constraints for denoising and de-aliasing of 4D (x, y, u, v) LF images, showing performance superior to modules that rely solely on spatial information (Supplementary Figs. 3 and 4). We also optimized the VCD 3D reconstruction sub-network by incorporating multi-res blocks to extract features from more dimensions, ensuring the high quality of the reconstruction results. Because the tasks differ in difficulty, the three sub-networks are jointly optimized using our DPO strategy: each sub-network is first optimized independently to ensure a high-quality solution for its sub-task, and the sub-networks are then grouped into “denoising,” “denoising and de-aliasing” and “denoising, de-aliasing and reconstruction” and optimized progressively (Supplementary Fig. 5). DPO facilitates the collaboration of sub-networks while maintaining their independent training, thus achieving reconstructions with improved resolution and fidelity (Supplementary Fig. 6). Notably, the DPO strategy is also valuable to other image restoration networks, such as RCAN36, where it delivers SR results with higher fidelity (Supplementary Fig. 7). With this strategy, Alpha-Net can successfully resolve 137 nm line pairs of the synthetic resolution board (Supplementary Fig. 8) and directly reconstruct SR images of diverse samples (Fig. 1c, Supplementary Fig. 9).
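The ordering of the DPO stages described above can be written out as a short Python sketch. It only enumerates which sub-networks are optimized at each step; the function name `dpo_schedule` and the phase labels are hypothetical, and the epochs, losses, and learning rates used in the actual training are not given in this section and are not modeled here.

```python
def dpo_schedule(subnets):
    """Return the ordered list of (phase, trainable sub-networks) steps.

    Phase "independent": each sub-network is optimized on its own.
    Phase "progressive": growing groups are optimized jointly, each
    group starting from the first sub-network in the chain.
    """
    steps = [("independent", [name]) for name in subnets]
    steps += [("progressive", list(subnets[:k]))
              for k in range(1, len(subnets) + 1)]
    return steps

SUBNETS = ["denoising", "de-aliasing", "reconstruction"]  # order from the text
for phase, group in dpo_schedule(SUBNETS):
    print(f"{phase:12s} -> optimize {' + '.join(group)}")
```

In a real training loop each step would select the parameters of the listed sub-networks for the optimizer and supervise them with the matching stage of the synthesized data (Noisy LF, Clean LF, De-aliased LF, 3D SR).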
Another inherent limitation of supervised learning is the requirement for large datasets and extensive training time. Common solutions, which either train on a large amount of data from diverse structures26 or use a few 3D data points for transfer learning30, require 3D high-resolution data to be acquired on static samples in advance and therefore suffer from low flexibility. We developed an adaptive-tuning strategy in Alpha-LFM to reconstruct types of structure that were not included in the training datasets. When facing unseen types of structures, Alpha-LFM uses a few wide-field (WF) 2D images and corresponding LF measurements of the new samples to rapidly tune the pre-trained model, adapting it to the new structure (Fig. 1d, Supplementary Note 1 and “Methods”). In the adaptive-tuning phase, the WF images of the new samples were readily obtained using the regular port of the inverted microscope and then deconvolved to serve as the lateral constraint of the 3D reconstructions: the mean-square error (MSE) was calculated between these images and the maximum-projected and down-sampled 3D inferences of the network. To maintain the mapping function from LF to 3D reconstructions and prevent over-fitting to the lateral constraint, we employed an alternating training strategy that incorporates a small amount of the raw data used for the base model into the training process, acting as the volumetric constraint. The lateral and volumetric constraints were alternately optimized during the fine-tuning phase so that together they contribute to the optimized results in 3D. Through this adaptive-tuning strategy, we tuned the network trained on lysosomes for reconstructing light-field images of fluorescent beads, yielding a resolution of ~120 nm (Supplementary Fig. 10). Additionally, we successfully transferred the model trained on the outer membrane of mitochondria to the outer membrane of lysosomes and to the mitochondrial matrix, respectively (Supplementary Fig. 10). The finely-tuned Alpha-Net showed significantly reduced artifacts and enhanced fidelity compared with VCD-Net and Alpha-Net without fine-tuning, which yielded sharp but distorted reconstructions. The finely-tuned Alpha-Net reconstructions allowed us to capture the subcellular dynamics of mitochondrial fission and fusion, as well as lysosomal movements.
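The lateral constraint used in adaptive tuning can be sketched in NumPy as a forward computation only. The function name `lateral_mse`, the (z, y, x) axis order, the integer `factor`, and the block-average down-sampling are all assumptions made for illustration; this section does not specify the down-sampling kernel, and an actual implementation would live inside a differentiable deep-learning framework.

```python
import numpy as np

def lateral_mse(volume, wf_image, factor):
    """MSE between a deconvolved WF image and the projected network output.

    volume:   3D inference with axes (z, y, x)  [axis order is an assumption]
    wf_image: 2D deconvolved wide-field image on the coarser WF grid
    factor:   integer lateral down-sampling factor (hypothetical parameter)
    """
    mip = volume.max(axis=0)                      # maximum projection along z
    h, w = mip.shape
    if h % factor or w % factor:
        raise ValueError("lateral size must be a multiple of factor")
    # Block-average down-sampling; the kernel is an assumption of this sketch.
    coarse = mip.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    if coarse.shape != wf_image.shape:
        raise ValueError("down-sampled projection and WF image differ in shape")
    return float(np.mean((coarse - wf_image) ** 2))

vol = np.zeros((4, 8, 8))
vol[2, 2:4, 2:4] = 1.0                            # one bright 2x2 patch at z = 2
wf = np.zeros((4, 4))
wf[1, 1] = 1.0                                    # matching WF observation
loss = lateral_mse(vol, wf, factor=2)             # zero: projection matches WF
```

Because the projection discards depth, this loss alone cannot constrain the axial structure, which is why the text alternates it with a volumetric constraint from the base-model training data.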
Characterization of Alpha-LFM
We demonstrated the performance advances of Alpha-Net through a comparison with end-to-end VCD on both Argolight resolution board and organelle data. To quantitatively illustrate the performance of the network in the process of solving inverse problems, we developed a network comprehensive performance pyramid (NCPP) method based on the comprehensive evaluation of the fidelity and resolution changes during the network training (Fig. 2a, Supplementary Movie 1, “Methods”). We quantified the fidelity and resolution of the reconstruction results during the network’s convergence process by calculating the differences in structural similarity (ΔSSIM, with 0 indicating the best similarity) and cut-off frequency (ΔKc, with 0 indicating the highest resolution) between the inference results of 196 regions of interest (ROIs) and the GTs. The position of a data point on the coordinate axes reflects the network’s fitting ability, while the concentration of the distribution represents the robustness of the network across different datasets. Using the NCPP metric, we verified that it is indeed difficult for the original VCD to find high-fidelity SR solutions through one-stage supervision, as evidenced by its premature convergence, halting at inferior resolution and fidelity (Fig. 2a, NCPP map, left). In contrast, Alpha-Net under multi-stage data supervision achieved significant improvements in fidelity and resolution right after the initial task optimization (Initial stage, Fig. 2b). The progressive optimization strategy further reduced the unnatural high-frequency artifacts caused by such local optimization in the sub-networks. With the decomposed and progressive optimization of Alpha-Net, the network rapidly approached the global optimum (Fig. 2a, NCPP map, right), yielding SR reconstructions with improved fidelity in the last stage (Last stage, Fig. 2b).
a The comparative NCPP map evaluating the fitting capability of previous end-to-end VCD-Net and Alpha-Net during the optimization process (n = 196 ROIs). b The 2 charts show the network performance at the initial optimization stage (10th epoch) and final stage (well-convergence at 250th epoch). c Resolution characterization of Alpha-LFM using Argolight resolution board featuring adjacent lines with known distances ranging from 120 to 360 nm. d Regions of interest (ROIs) showing lines spaced 120 and 240 nm apart, imaged using VCD-Net and Alpha-Net. e Intensity profiles along the lines indicated in (d), quantifying the resolution of Alpha-Net and VCD-Net. f The experimental Noisy LF image and extracted views of lysosome outer membranes in a fixed U2OS cell. g The Noisy, Denoised, De-aliased LF views (indicated by dotted boxes in f) and h the 3D SR results from three sub-networks of Alpha-Net. i The comparison of reconstructions by Alpha-Net, LFD, VS-LFD, and VCD. The white arrows indicate the noticeable errors in LFD, VS-LFD, and VCD results whereas being accurately resolved by Alpha-Net. j Structure similarity (SSIM) metric quantitatively comparing the fidelity of Alpha-Net and other approaches using enhanced Airyscan data (GT) as reference (n = 15 volumes). k Decorrelation analysis quantifying the lateral and axial resolution of reconstructions by Alpha-Net, LFD, VS-LFD, VCD, and GT (n > 30). l MIPs of microtubules in a fixed COS-7 cell obtained by 3D SIM under high (1.6 J/cm2) and low (0.02 J/cm2) light dose, as well as by VCD-LFM and Alpha-LFM under a low light dose (0.05 J/cm2) used for volumetric acquisition. The magnified view of ROI indicated by the white box and the Fourier spectra are shown at the bottom. m SSIM and PSNR metrics quantifying the fidelity of 3D SIM and Alpha-LFM across varying light doses, using 3D SIM with AI denoising under high light dose as reference. Boxes are median ± i.q.r.; whiskers represent the min and max values (j, k). 
Scale bar, 5 μm (f–h), 2 μm (i), 10 μm (spatial domain) and 1/100 nm−1 (Fourier domain) (l).
Through a simple retrofit of a commercial inverted microscope (Olympus IX73) using a compact light-field add-on of our own design (~220 × 140 mm in size, full design in Supplementary Fig. 11 and “Methods”), we conducted LF imaging of the Argolight resolution board, which features adjacent lines with known distances, to quantify our resolution. Alpha-Net successfully resolved adjacent lines with distances ranging from 120 nm to 360 nm (Fig. 2c–e). In contrast, VCD-Net could only resolve lines spaced 240 nm apart and produced ambiguous reconstructions for lines spaced 120 nm apart (Fig. 2d, e). These results validate the capability of Alpha-LFM to achieve a resolution of 120 nm.
Furthermore, we imaged the lysosomes in a fixed U2OS cell and reconstructed their 3D distributions using Alpha-LFM (Fig. 2f–h). The raw LF image of the lysosome was effectively denoised, de-aliased, and finally transformed into fine 3D structures (Fig. 2g, Supplementary Figs. 12 and 13). To validate the resolution enhancement and fidelity by Alpha-LFM, we also obtained the in situ Airyscan15,37 images of lysosomes in the same fixed U2OS cell and enhanced them through an axial-to-lateral isotropic learning (Methods). As verified by calculating the structure similarity (SSIM) using enhanced Airyscan microscope’s results as references, the fine structures of lysosomes were accurately reconstructed throughout the 3D volume (Fig. 2i). The fidelity of Alpha-Net reconstruction was quantified to be significantly higher when compared to current leading light-field reconstruction techniques, including LFD, virtually-scanning LFD (VS-LFD)26, and VCD (Fig. 2j, “Methods”). Additionally, while LFD, VS-LFD, VCD yielded average lateral resolutions of 780 nm, 315 nm, 261 nm, and average axial resolution of 1021 nm, 440 nm, 277 nm, respectively, Alpha-Net achieved a near isotropic resolution of 120 nm, which is far superior to all the alternative approaches and close to the resolution of GT (Fig. 2k, “Methods”).
We further evaluated the performance of Alpha-Net across different LFM configurations, including various spatial LFM and Fourier LFM systems21,25. To validate its adaptability, we first constructed an LFM system using an MLA with a pitch size of 45.5 µm and a focal length of 1.6 mm (Methods). When reconstructing LF images of lysosomes acquired with this setup, Alpha-Net still demonstrated superior resolution and fidelity compared to state-of-the-art light-field reconstruction algorithms, including LFD, VS-Net, and VCD-Net (Supplementary Fig. 14). In addition to spatial LFM, we built a Fourier LFM system by incorporating a Fourier lens to perform an optical Fourier transform of the image at the native image plane and placing the MLA (pitch = 3.25 mm, f = 120 mm) at the back focal plane of the Fourier lens (Methods). By adapting the data synthesis strategy and view extraction method in the network to align with the wave optics model of Fourier LFM, Alpha-Net accurately reconstructed fine microtubule structures, using 3D SIM results acquired under high light dose as GT (Fig. 2l, “Methods”). In contrast, VCD-Net introduced high-frequency artifacts and produced discontinuous signals, resulting in low fidelity.
The high fidelity achieved by Alpha-LFM under the low light dose required for volumetric acquisition highlights its advantages in imaging speed and reduced photobleaching. To evaluate these benefits, we assessed the imaging performance of Alpha-LFM across varying total light doses used for volumetric acquisition and compared it to the state-of-the-art live-cell imaging technique, 3D SIM7,38, under identical light dose conditions (Fig. 2l, “Methods”). While 3D SIM achieves high-quality reconstructions with a superior resolution of ~123 ± 5 nm under high light dose conditions, its reconstruction fidelity deteriorates as the light dose decreases. In contrast, Alpha-LFM fully utilizes photons emitted from the entire volume, achieving a higher signal-to-noise ratio (SNR) under the same low light dose. The SSIM and peak signal-to-noise ratio (PSNR) metrics presented in Fig. 2m confirm the more stable reconstruction fidelity and resolution of Alpha-LFM compared to SIM and SIM-denoise34 under low light dose conditions (Fig. 2m, Supplementary Fig. 15). The resolution of Alpha-LFM was quantified to be ~126 ± 6 nm under high light dose and varied only slightly (~135 ± 15 nm) across the light doses tested. The combination of high photon efficiency and robust denoising capabilities enables Alpha-LFM to deliver superior resolution and reconstruction fidelity, making it particularly well-suited for high-speed or long-term imaging applications.
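For readers unfamiliar with the PSNR metric used alongside SSIM in Fig. 2m, a minimal NumPy sketch of its standard definition follows. The function name `psnr` and the `data_range` argument are choices of this sketch; how the images were normalized before scoring is not stated in this section and is assumed here to be the range [0, 1].

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")                       # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))

reference = np.zeros((4, 4))
estimate = np.full((4, 4), 0.1)                   # uniform error of 0.1
score = psnr(reference, estimate)                 # MSE = 0.01 -> about 20 dB
```

A higher score means the reconstruction is closer to the reference, here 3D SIM with AI denoising under high light dose according to the figure caption.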
Comparative performances of Alpha-LFM and other approaches
Alpha-LFM enabled 4D imaging of large-scale mitochondrial dynamics in dozens of cells with a volumetric imaging rate of up to 333 Hz and a field of view (FOV) of ~220 × 220 × 10 µm3 (Fig. 3a). We compared the reconstruction of Alpha-LFM with those from alternative LFM reconstruction approaches, including VS-LFD26 and VCD30 based on an end-to-end network model (Fig. 3b, Supplementary Movie 2). While the Alpha-Net reconstruction clearly showed the outer membranes of mitochondria, these structures remained unresolvable in LFD, VS-LFD, and VCD reconstructions, owing to their suboptimal resolutions and noticeable artifacts. Alpha-LFM yielded near-isotropic lateral and axial resolutions of ~120 nm, compared with 230 nm and 370 nm by VS-LFD and 180 nm and 350 nm by VCD, respectively (Fig. 3c). The ultrafast light-field imaging rate together with subcellular-resolution reconstruction thus allowed the visualization of fast morphological changes of the mitochondria, such as fission and fusion, occurring in three dimensions and on a millisecond timescale (Fig. 3d). Since LFM requires only one light exposure per volume, its photobleaching is notably lower than that of scanning-based 3D microscopy (Fig. 3e). Meanwhile, the inclusion of a denoising module in Alpha-LFM ensures stable light-field reconstruction from low-exposure, noisy measurements (Supplementary Fig. 16). As a result, low-photobleaching imaging combined with low-exposure reconstruction capability permitted long-term live-cell imaging at high spatiotemporal resolution, yielding over 40,000 SR volumes with less than 50% photobleaching (Supplementary Movie 3). In contrast, plane-scanning-based light-sheet fluorescence microscopy (LSFM)39, 3D SIM7,38 (15 exposures for each plane) and point-scanning-based Airyscan microscopes15 suffered from noticeable photobleaching after imaging merely 500, 6, and 50 volumes, respectively (Fig. 3e, f, “Methods”).
We compared the resolution and volumetric speed of Alpha-LFM with those of current leading 3D fluorescence microscopy techniques for live-cell imaging, including scanning-based SR microscopes such as the Airyscan microscope, Instant SIM (iSIM)8 and LLSM-SIM9, and LFM modalities such as sLFM22, Fourier light-field microscopy (FLFM)25, and VCD-LFM30 (Fig. 3g). While scanning-based SR microscopes exhibit a trade-off between spatial resolution and imaging speed, Alpha-LFM clearly relaxes this limitation, showing spatial resolution as well as volumetric speed far superior not only to SR microscopes but also to other LFM modalities (Fig. 3g).
a The volume rendering of the large-scale mitochondrial dynamics in dozens of live U2OS cells reconstructed by Alpha-Net within a large FOV of 220 × 220 × 10 μm3 using a 60×/1.3 NA objective. Scale bar, 20 μm. b The volume rendering of the reconstructed results of mitochondrial outer membranes in live U2OS cells by Alpha-Net, VCD-Net, and VS-LFD. Insets show the magnified volume renderings of the ROI indicated by the blue dotted box. Scale bar, 5 μm. c Decorrelation analysis quantifying the lateral and axial resolution of VS-LFD, VCD-Net, and Alpha-Net. n = 20 volumes were analyzed at both xy and xz planes. The center line represents the median, the box limits represent the lower and upper quartiles, and the whiskers represent the min and max values. d Time-lapse 3D visualization capturing the rapid morphological transformations of the mitochondrial outer membrane occurring on a millisecond timescale, illustrating both mitochondrial fission and fusion. Scale bar, 2 μm. e Comparison of the photobleaching rates of Alpha-LFM, LSFM, Airyscan, and 3D SIM. f Maximum intensity projections of lysosomes in live U2OS cells imaged via 3D SIM (30 s every volume for a whole cell), Airyscan (4 min every volume for a whole cell), LSFM (4 s every volume for a whole cell), and Alpha-LFM (3 ms every volume for at least a whole cell). Scale bar, 2 μm. g Comparison of the resolution and volumetric imaging speed of Airyscan, iSIM, LLSM-SIM, FLFM, sLFM, VCD-LFM, and Alpha-LFM.
High-speed 3D imaging of peroxisomes and ER enabled by Alpha-LFM
The ultrafast volumetric imaging rate combined with subcellular-resolution reconstruction provided by Alpha-LFM enabled the visualization of rapid dynamics of peroxisomes and the ER in three dimensions.
Without requiring scanning, Alpha-LFM successfully demonstrated its capability to capture peroxisomes (tagged with SKL-mApple) in live U2OS cells at 100 volumes per second (vps) over a 1-min duration, yielding 6000 volumes (Fig. 4a, Supplementary Movie 4). This high imaging rate allowed the extraction of spatiotemporal patterns of peroxisome motion in 3D on a millisecond timescale (Fig. 4b). In contrast, current scanning-based 3D microscopy techniques9,40 achieve imaging speeds up to 10 vps. To evaluate the impact of imaging speed on peroxisome analysis, the 100-vps data was downsampled to 10 vps, and peroxisome tracking was performed on both datasets. While analysis of the 10-vps data revealed a velocity limited to 1 μm/s, the 100-vps data captured velocities as high as 10 μm/s (Fig. 4c, d). This discrepancy arose from missed movements and reduced trajectory accuracy caused by the lower imaging speed (Fig. 4e, f).
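The effect of temporal downsampling on measured velocity can be reproduced with a toy Python example. The trajectory below is synthetic and chosen purely for illustration (a zigzag along x with a net drift); it is not data from the paper, and the helper names `max_speed` and `downsample` are hypothetical.

```python
import math

def max_speed(track, rate_hz):
    """Peak speed in um/s from consecutive 3D positions (um) sampled at rate_hz."""
    return max(math.dist(a, b) for a, b in zip(track, track[1:])) * rate_hz

def downsample(track, step):
    """Keep every step-th position, e.g. step=10 turns 100 vps into 10 vps."""
    return track[::step]

# Synthetic zigzag: 6 steps of +100 nm then 4 steps of -100 nm, repeated 5 times.
steps_nm = ([100] * 6 + [-100] * 4) * 5
x_nm = [0]
for s in steps_nm:
    x_nm.append(x_nm[-1] + s)
track = [(x / 1000.0, 0.0, 0.0) for x in x_nm]    # positions in um

fast = max_speed(track, rate_hz=100)              # frame-to-frame motion resolved
slow = max_speed(downsample(track, 10), rate_hz=10)  # only net drift remains
```

In this toy case every 100-vps step covers 0.1 µm in 10 ms, while the 10-vps samples see only the 0.2 µm net displacement per 100 ms, so the back-and-forth motion is averaged out and the apparent peak speed drops fivefold.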
a Volume rendering of peroxisomes (tagged with SKL-mApple) in a live U2OS cell, acquired with Alpha-LFM. b Time-lapse MIPs of the ROI indicated by orange boxes in (a). Arrows indicate the rapid motion of peroxisomes. c 3D tracking of peroxisomes imaged at 100 vps and those downsampled to 10 vps, with velocity encoded by color. d Velocity plots of the peroxisome highlighted by a box in (c). e Trajectory of the peroxisome highlighted by box in (c). f Comparison of trajectories tracked using the 100-vps and 10-vps results. g x-y MIP and x-z, y-z slices of the ER (tagged with Sec61β-EGFP) in a live COS-7 cell captured by Alpha-LFM. h Projection of skeletonized images over 1 s, with time encoded by color, visualizing ER dynamics. Scale bar, 5 μm. i Comparison of ER dynamics of the ROI (marked by white boxes in g, h) acquired at 100 vps versus 1 vps (downsampled). White arrows and circles highlight dynamics resolved at 100 vps but blurred at 1 vps. Scale bar, 2 μm. j, k Magnified time-lapse images of the ROI marked by the orange box in (g). White arrows in (j) indicate the stretching of an ER tubule within 40 ms, while orange arrows in (k) show the formation of a new tubule within 40 ms. Scale bar, 2 μm.
Additionally, we recorded the dynamics of the ER (tagged with Sec61β-EGFP) in a live COS-7 cell at 100 vps using Alpha-LFM (Fig. 4g, Supplementary Movie 5). The rapid growth and remodeling of the ER, with velocities reaching ~3.5 μm/s, have previously been observed in 2D using 2D SIM microscopes8,29,41. Alpha-LFM’s volumetric imaging capability at high resolution provided a clear visualization of ER tubules in 3D. The 100-vps speed allowed us to resolve the remodeling of individual ER tubules within milliseconds, which appeared blurred at 1 Hz, the maximum volumetric imaging rate achievable by current 3D SR microscopes12,29 (Fig. 4h, i). Leveraging this high speed, we observed the stretching of an ER tubule and the formation of a new ER tubule, both occurring in less than 50 ms (Fig. 4j, k). These experiments underscore the essential role of Alpha-LFM in capturing highly dynamic biological processes.
Dual-color Alpha-LFM for 5D in-toto imaging and quantification of lysosome-mitochondria interactions
The high spatiotemporal resolution of Alpha-LFM allowed the visualization of rapid morphological changes in mitochondria, such as fission and fusion, in three dimensions. We validated the fidelity of dynamic Alpha-LFM imaging for identifying mitochondrial activities using in situ WF images (Supplementary Fig. 17) and co-expression of Drp1 (Supplementary Fig. 18). Notably, we observed 48 mitochondrial fission events, with ~96% of these events (n = 46) marked by Drp1, consistent with previous studies42. These results confirm that the fission events identified by Alpha-LFM are genuine.
Using Alpha-LFM, we conducted simultaneous 5D (3D space + time + spectrum) SR imaging of the outer membranes of lysosomes (tagged with Rab7-mCherry2) and mitochondria (tagged with Tomm20-EGFP) in live U2OS cells (Fig. 5a, Supplementary Movie 6). The isotropic subcellular resolution presented in five dimensions then permitted in-toto visualization of the Lysosome-Mitochondria (Lyso-Mito) interactions (Fig. 5b, c). While 2D microscopes merely provided projection images of Lyso-Mito contacts, our ability to reconstruct 3D processes eliminated false judgments and yielded more accurate analytic results (Fig. 5d). By measuring the distance between lysosome and mitochondria in 2D and 3D (Methods), respectively, we identified a 24% margin of inaccuracy in 2D results (Fig. 5e, 11 false cases in 41 events from 17 cells), proving the significance of Alpha-LFM for investigating organelle interactions in 4D. Recent research has revealed that lysosome-mitochondria contact sites may serve as a marker for mitochondrial fission43. We scrutinized 35 fission cases from 17 cells reconstructed by Alpha-Net to measure the proportion of Lyso-Mito contact in the mitochondrial fission events. We found that lysosomes contacted mitochondria at 48% of mitochondrial fission sites, which was significantly lower than the rate obtained via 2D analysis of our own images (71%) (Fig. 5f). This was presumably due to the tendency of 2D imaging methods to misidentify the Lyso-Mito contact. We also analyzed the Lyso-Mito contacts at mitochondrial fusion sites, and the proportion was 41% in our statistics. With the high spatiotemporal resolution of Alpha-LFM, we also perform 3D tracking of the endpoints of mitochondrial fission and fusion and analyze whether Lyso-Mito contact would affect the speed of mitochondrial fission and fusion (Fig. 5g, Supplementary Movie 7). 
While no significant difference in fission speed was found when 2D analysis was performed (P > 0.05, one-way ANOVA), the results based on our 3D images showed a positive correlation between lysosome contacts and fission (P < 0.05)/fusion velocity (P < 0.001) (Fig. 5h, i). These findings suggest that interactions with lysosomes might accelerate mitochondrial fission and fusion. However, this hypothesis requires further validation through systematic physiological and biochemical experiments.
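The difference between 2D and 3D contact calls can be sketched as follows. This assumes that a contact is called when the distance between a lysosome and a mitochondrial site falls below the 240-nm proximity threshold given in the Fig. 5e caption. The coordinates and the function name `is_contact` are hypothetical, and the actual distance measurement used in the study is described in Methods rather than reproduced here.

```python
import math

CONTACT_NM = 240.0  # proximity threshold quoted in the Fig. 5e caption

def is_contact(p, q, use_z=True):
    """True if points p and q (x, y, z in nm) lie closer than CONTACT_NM."""
    if not use_z:  # 2D analysis: distance in the xy projection only
        p, q = p[:2], q[:2]
    return math.dist(p, q) < CONTACT_NM

# Hypothetical (lysosome, fission-site) coordinate pairs in nm.
pairs = [
    ((0, 0, 0), (100, 50, 80)),    # close in 3D and in projection
    ((0, 0, 0), (120, 60, 900)),   # overlaps in projection, far apart in z
    ((0, 0, 0), (600, 300, 0)),    # far apart in both
]
calls_2d = [is_contact(p, q, use_z=False) for p, q in pairs]
calls_3d = [is_contact(p, q) for p, q in pairs]
false_positives = sum(a and not b for a, b in zip(calls_2d, calls_3d))
print(calls_2d, calls_3d, false_positives)
```

The second pair is the kind of case shown in Fig. 5d: it passes the threshold in projection but not once the axial separation is taken into account.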
a A dual-color volume rendering of the outer membranes of mitochondria (Tomm20-EGFP) and lysosomes (Rab7-mCherry2) in a live U2OS cell. Scale bar, 10 μm. b, c Time-lapse images of typical lysosomes-mediated mitochondrial fusion and fission events, respectively. Volume rendering shows the first volume of the events in 3D. The xy and xz slices of the ROI indicated by white boxes in the volume rendering show the interactions between the lysosome and the fission/fusion sites of mitochondria. d A false positive case where the lysosome appears to be in contact with the fission sites in a 2D projection, but in fact there is a significant distance between them in a 3D observation. e The 3D distances between the lysosomes and mitochondrial contact sites reveal the inaccuracies in the identification of Lyso-Mito contacts in 2D. The dashed line indicates the close proximity (<240 nm) between lysosomes and mitochondrial constriction sites that can be identified as Lyso-Mito contacts. f The proportion of mitochondrial fission and fusion events occurred with and without Lyso-Mito contacts identified in 3D and 2D projections, respectively. The results of 3D indicate a lower contact rate during mitochondrial fission and fusion than the rate obtained via 2D measurements (n = 35 fissions and 31 fusions from 17 cells). g 3D tracking of mitochondrial fission and fusion with and without Lyso-Mito contact. The fission and fusion velocities are represented by color. The hotter color, when in contact with the lysosome, indicates a higher fission and fusion velocity compared to without the lysosome’s contact. Scale bar, 500 nm. h, i The velocity of mitochondrial fission and fusion with or without Lyso-Mito contact via 2D and 3D analysis (n = 35 fission and 31 fusion events). 
The velocity is significantly higher when in contact with lysosomes via 3D analysis, while there is no significant difference in fission velocity based on 2D results (P = 0.062410; 0.039430 in h, P = 0.03074; 0.00017 in i). Boxes are median ± i.q.r.; whiskers represent the min and max values. ns P > 0.05, *P < 0.05, ***P < 0.001 (one-way ANOVA).
Long-term imaging and quantitative analysis of mitochondrial fates across cell cycles
The low phototoxicity and robust denoising capability of Alpha-LFM enable 3D imaging of mitochondria evolution (tagged with Cox4-EGFP) in live U2OS cells across a long timescale up to 60 h. Considering both rapid mitochondrial dynamics and their long-term fates need to be studied, we specifically designed an automatic imaging strategy to track the high-speed fission and fusion process throughout the entire cell cycle with minimal phototoxicity (Fig. 6a, “Methods”). During each cycle of 4 h, we continuously imaged the mitochondria for 30 min across 8 FOVs, with exposure time of 500 ms and a temporal resolution of 10 s (Fig. 6b). During this 30-min mid-term observation window, the imaging pipeline included the following steps: (i) Implementing an Alpha-Net-based focusing strategy to correct sample drift along z-axis. Mitochondria were imaged and quickly reconstructed by Alpha-Net to calculate the z-drift distance, which would be further corrected by moving the objective. This process typically takes merely 1 s. (ii) Imaging the mitochondria dynamics at fixed depth of interest for 30 min, with an exposure time of 500 ms and an interval time of 10 s. Meanwhile, we imaged in situ chromosomes of the same FOV for identifying the related cell stage with an exposure time of 500 ms and an interval time of 100 s (Supplementary Movie 8). After 36-h observation containing 9 cycles of 30-min continuous recording, we discovered diverse viabilities of the cells within one batch, identifying both inertia without cell division and active 2-generation divisions across one entire cell cycle (Fig. 6c).
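The acquisition schedule above implies some simple bookkeeping, sketched here. The function name `schedule` and the dictionary keys are hypothetical. The feasibility check assumes that all FOVs are visited once within each 10-s mitochondria interval, which the text does not state explicitly.

```python
def schedule(total_h=36, cycle_h=4, window_min=30, mito_interval_s=10,
             chrom_interval_s=100, n_fov=8, exposure_s=0.5):
    """Counts implied by the imaging parameters quoted in the text."""
    windows = total_h // cycle_h                      # number of 30-min windows
    mito_per_window = window_min * 60 // mito_interval_s
    chrom_per_window = window_min * 60 // chrom_interval_s
    return {
        "windows": windows,
        "mito_volumes_per_fov_per_window": mito_per_window,
        "chrom_frames_per_fov_per_window": chrom_per_window,
        "mito_volumes_per_fov_total": windows * mito_per_window,
        # Assumption: every FOV is exposed once per mitochondria interval.
        "exposure_s_per_round": n_fov * exposure_s,
        "round_fits_interval": n_fov * exposure_s <= mito_interval_s,
    }

plan = schedule()
for key, value in plan.items():
    print(key, value)
```

Calling `schedule(total_h=60)` gives the corresponding counts for the full 60-h experiment.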
a Automatic Alpha-LFM imaging pipeline designed for the long-term imaging (up to 60 h) of mitochondria (tagged with Cox4-EGFP) and chromosomes (tagged with H2B-mCherry). In each cycle of 4 h, the cells were continuously imaged for 30 min. The 30-min observation majorly includes the following steps: (i) An instant VCD-based auto-focusing strategy to correct sample drift along the z-axis. Quick light-field imaging and reconstruction of mitochondria were performed to calibrate the z-axis drift of the system. The drift was then corrected by moving the objective using a piezo scanner. (ii) Alpha-LFM imaging of the mitochondria dynamics and chromosomes in cell nuclei for 30 min. Eight FOVs were sequentially imaged in each cycle with an interval time of 10 s for mitochondria and 100 s for chromosomes and an exposure time of 500 ms for both. b The MIPs of 8 FOV imaged in the first cycle. c The volume renderings of time-lapse images visualizing diverse cell viabilities with one undergoing 2-generation divisions (the first FOV in b) and another showing inertia without cell division (the second FOV in b) during the same 60-h imaging. The inset numbers indicate three cells and their later generations. d The 3D visualization of a representative mitochondrion during 30 min. The mitochondria are highlighted in the 3D rendering of the whole cell. e Detailed lineage tracing of the selected single mitochondrion with seven generations traced. The mitochondria daughters rendered by magenta were from the same source mitochondrion (leftmost one), while the yellow rendering was a foreign mitochondrion. f Schema depicting the different fates of the daughter mitochondria in different stages of interphase G1 and G2 in the cell cycle. Mitochondria were tracked for 30 min. Both the quantities and portions of the specific mitochondrial events were calculated. 
g The morphology changes (Major axis length and minor axis length) of mitochondria during two entire cell cycles throughout 60 h (n = 255). The solid lines represent the median, and the dashed lines represent quartiles. The spots indicate all calculated values. Scale bar, 10 μm.
Smart Alpha-LFM imaging of live cells thus included the capture of instantaneous mitochondrial fission/fusion events (500 ms), lineage tracing of changes in single mitochondria in the middle term (30 min), and evolution mapping of all mitochondria in the long term (60 h), thereby enabling in-toto investigation of mitochondrial fates during the entire cell cycle. For example, up to seven generations of the fissions/fusions of an individual mitochondrion were recorded during the 2nd 30-min observation. Each fission/fusion activity of the selected mitochondrion and its resulting later generations were successfully visualized in three dimensions (Fig. 6d). We performed lineage tracing of these mitochondrial activities (Fig. 6e, Supplementary Movie 9). In this case, we observed that the majority of subsequent generations of the source mitochondrion (the magenta ones in Fig. 6e) fused with their relatives, with only one exception that fused with a foreign mitochondrion (the yellow one in Fig. 6e). Furthermore, we followed the fates of the daughter mitochondria from peripheral and midzone fissions at whole-cell scale. After analyzing 108/113 mitochondria at G1/G2 stages, we validated that most of the small peripheral daughter mitochondria were excluded from further fusions or divisions, consistent with a previous study44. Also, we observed a significant difference in the rates of no event occurring in these small peripheral daughters between the G1 (92%) and G2 (80.7%) phases (Fig. 6f, Supplementary Fig. 19). This is probably because G2 is the phase before cell division, in which the mitochondria are more active in preparation for division45. Furthermore, Alpha-LFM results also allowed us to study the morphology changes of mitochondria, including variations in major and minor axis length, throughout two entire cell cycles (Fig. 6g, Supplementary Fig. 20). 
The significant variations in mitochondria morphology and event rates highlight the importance of long-term imaging facilitated by Alpha-LFM in advancing biological research and applications.
Discussion
Sustained observation of subcellular dynamics in their physiological state is essential to investigating diverse organelle functions and their interactions in cell biology. This is difficult for either scanning-based SR microscopes, with their suboptimal speed and phototoxicity, or high-speed light-field microscopes, whose resolution is limited to the single-cell level. Alpha-LFM circumvents this compromise among speed, resolution, and photon efficiency through a newly developed deep-learning VCD pipeline with efficient solution-seeking abilities. The technical advances of Alpha-LFM, enabled by our physics-assisted network framework and adaptive-tuning strategy, can be summarized as follows. First, Alpha-LFM includes a new physics-assisted model design, a hierarchical data synthesis procedure, and a DPO strategy to maximize the network’s ability to solve the light-field inversion problem, which has a huge solution space, thereby efficiently pushing the resolution of LFM beyond the diffraction limit. Meanwhile, the strong denoising by the network, in conjunction with the photon-efficient LFM modality, enables ultra-long-term cell observation under physiological conditions, removing the phototoxicity issue common in scanning-based SRM and allowing over 10-fold more measurements. We also demonstrated the broad adaptability of the Alpha procedure to various LFM configurations, including spatial LFM, which excels at imaging large fields of view, and Fourier LFM, which is well suited to resolving denser signals. Furthermore, the requirement of a supervised network for a large dataset and extensive training time is mitigated in Alpha-LFM by using readily accessible WF and LF measurements of live samples to rapidly tune the model to new types of signals. It is worth noting that Alpha-LFM has been wrapped into a fully open-source program with a user-friendly GUI. 
With Alpha-LFM, we four-dimensionally imaged the rapid motion of peroxisomes and the ER at 100 vps and highly dynamic interactions between lysosomes and mitochondrial membranes at isotropic 120-nm spatial resolution. We also demonstrated SR fluorescent imaging of live cells with probably the longest observation time, 60 h. The image results support lineage tracing of both short-term changes and long-term fates of mitochondria at single-mitochondrion resolution and across an entire cell cycle.
While Alpha-LFM demonstrates improved performance compared to existing LFM approaches, it still faces inherent limitations stemming from supervised deep-learning approaches. Although the adaptive-tuning strategy mitigated the requirement of a large dataset and training cost of the supervised network, the fidelity on unseen data is still constrained by limited information. It is envisaged that the incorporation of more data, especially those with higher resolution and volumetric prior, will further improve the performance of fine-tuning. Meanwhile, it’s noted that Alpha-LFM still requires a considerable amount of high-resolution label data to initiate the training of the base model and encounters the inherent generalization issue of supervised networks. Like most deep-learning-based computational super-resolution approaches, whose reconstruction quality depends on signal properties, Alpha-LFM also experiences reduced resolution and fidelity when reconstructing highly dense or low-SNR signals (Supplementary Fig. 15). It should also be noted that Alpha-LFM primarily enhances the imaging of subcellular structures within live cells. The in vivo imaging of animals involves light scattering and requires additional optimization of the approach, which is beyond the scope of this study.
We anticipate that the generalization issue and dependency on external data of Alpha-LFM will be circumvented by full-cycle self-supervision reconstruction strategies, in which the in situ WF measurements of the samples can be used as internal training labels to further improve the resolution of LF views and lead to reconstruction with higher quality. A completely unsupervised deep-learning light-field reconstruction could also be made possible by using implicit network representation (INR)46,47. It’s promising because the intrinsic view synthesis capabilities in INR are indeed helpful to the enhancement of axial resolution and reduction of reconstruction artifacts. But the learning-and-representing mode for each single LF image is very time-consuming and computationally demanding. INR-based reconstruction needs to solve this efficiency problem before it becomes as practical as DL and deconvolution approaches. Given that a lot of in vivo biological dynamics occur in the deep tissues, we also expect the combination of Alpha-LFM with more advanced LFM techniques, e.g., LFM with AO or two photon excitations, or even a system working at the NIR-II window, to observe the subcellular dynamics inside living animals. Taken together, we believe Alpha-LFM has strongly pushed the spatiotemporal limit of current SRM/3D fluorescence microscopy and could be a powerful and accessible tool that helps a broad range of biology research to demystify the worlds inside the cell. The paradigm shift it shows could also be beneficial to propelling other microscopy techniques towards deeper, faster, and clearer imaging.
Methods
LFM optical setup
An add-on device was designed to provide easy light-field imaging of live cells on a commercial inverted fluorescent microscope (Olympus IX73), thus enabling wider applications for biology researchers without an optics background. The device only requires the assembling of off-the-shelf lenses into a customized mounting base without any complex alignment of the optical path. To precisely position MLA at the native image plane of the microscope, a customized lens tube 1 was designed to fix the MLA (RPC Photonics, MLA-S100-f28) at the Flange focal distance away from the camera port. Then, a pair of relay lenses (Thorlabs, TTL100-A) was integrated into another customized lens tube 2 to 1:1 relay the back focal plane of MLA (light-field image plane) onto the camera sensor. A high-precision zoom housing (SM1ZM, Thorlabs) was used to interconnect the 2 lens tubes and allow fine focusing of the light-field image plane (Supplementary Fig. 11). The complete add-on device containing the two lens tubes and the zoom housing was then mounted between the microscope’s camera port and the sCMOS camera (Photometrics Prime BSI Express), converting the ordinary inverted microscope into an advanced light-field microscope. This setup contains 15 × 15 views (Nnum = 15). An alternative LFM configuration was implemented by replacing MLA with one featuring a pitch size of 45.5 µm and a focal length of 1.6 mm, containing 7 × 7 views (Nnum = 7) (Supplementary Fig. 14).
Additionally, a Fourier LFM system was constructed by adding a Fourier lens (Thorlabs, AC508-250-A) to perform a Fourier transform of the native image plane, placing MLA (pitch = 3.25 mm, f = 120 mm) at the back focal plane of the Fourier lens, and using a 100×/NA1.5 objective lens (Olympus UPL, APO100XOHR). This setup was used for imaging microtubules and ER in Figs. 2 and 4, and Supplementary Fig. 15.
Training data processing pipeline
A data processing pipeline was developed to generate all training data (including Noisy LF, Clean LF, De-aliased LF, and 3D SR) from the 3D SR data. It’s noteworthy that the 3D SR data could be acquired from any SR microscopy with a resolution of around 120 nm. In our study, we provide an easily accessible solution that is acquired from commercial Airyscan confocal microscopy15,37 (Zeiss, LSM900) by imaging fixed cells and enhanced through applying a well-established self-learning network12,28,48, thereby achieving an isotropic resolution of 120 nm. We also demonstrate that our strategy can perform well on raw Airyscan data (Supplementary Fig. 21), 3D SIM data acquired from a commercial SIM microscope (HIS-SIM, Guangzhou Computational Super-resolution Biotech) (Fig. 2l, Supplementary Fig. 15) or SR data acquired from 4-beam SIM12 (Supplementary Fig. 22).
A SAS LFP pipeline was designed to create De-aliased LF images to guide the LF de-aliasing task. The 3D SR volumes were shifted laterally (along x and y) by distances smaller than the microlens size and were then projected by convolving with the 5D PSF of the LFM, yielding multiple light-field projections that included more spatial information through multiple sampling. The number of shifts depended on the undersampling rate of the microlens: 5 × 5 shifts with a step of 3 pixels for the Nnum = 15 setup and 3 × 3 shifts with a step of 2 for the Nnum = 7 setup in this paper. The SAS LF projections were then realigned to generate the De-aliased LF images according to the arrangement of the light.
Then, Clean LF images (the center ones with shift = 0 of the De-aliased LF images) were used to guide the denoising task. To generate Noisy LFs with SNRs matching the experimental LFs, the Clean LF images were normalized to the same range as the experimental LFs, and noise at levels spanning the lowest to the highest noise observed in the experimental LFs during long-term observation was added to the Clean LFs. The SNR was calculated by:
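The shift-and-realign idea can be illustrated with a deliberately simplified toy. It replaces the PSF convolution with plain point sampling at the microlens pitch, assumes the shifts are centred on zero (suggested by the later reference to a centre image with shift = 0), and uses wrap-around boundaries for convenience. The function names `shift_offsets` and `sas_sample` are hypothetical and are not taken from the released code.

```python
def shift_offsets(n_shifts, step):
    """Offsets centred on zero, e.g. 5 shifts with step 3 -> [-6, -3, 0, 3, 6]."""
    half = n_shifts // 2
    return [(k - half) * step for k in range(n_shifts)]

def sas_sample(scene, pitch, offsets):
    """Sample a 2D scene once per microlens-sized cell for every (dy, dx) shift,
    then realign all samples onto one denser grid."""
    h, w = len(scene), len(scene[0])
    dense = {}
    for dy in offsets:
        for dx in offsets:
            for y0 in range(0, h, pitch):
                for x0 in range(0, w, pitch):
                    y, x = (y0 + dy) % h, (x0 + dx) % w  # toy wrap-around boundary
                    dense[(y, x)] = scene[y][x]
    ys = sorted({y for y, _ in dense})
    xs = sorted({x for _, x in dense})
    return [[dense[(y, x)] for x in xs] for y in ys]

scene = [[y * 30 + x for x in range(30)] for y in range(30)]
offsets = shift_offsets(5, 3)             # Nnum = 15 case: 5 x 5 shifts, step 3
dealiased = sas_sample(scene, pitch=15, offsets=offsets)
print(len(dealiased), len(dealiased[0]))  # grid is 5x denser than one sample per cell
```

With a pitch of 15 pixels and a step of 3 pixels, the realigned samples cover the scene on a 3-pixel grid, five times denser along each axis than a single unshifted sampling.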
where S is the average signal intensity value in the image, \({I}_{{{\rm{background}}}}\) and \({\sigma }_{{{\rm{background}}}}\) are the mean and the standard deviation of the background, respectively. Finally, we obtained De-aliased LF images, Clean LF images, and Noisy LF images, which were all conveniently generated from the same 3D SR images, for the network training of Alpha-LFM. The rationality of the synthetic pipeline was validated by the high similarity between synthetic LFs and experimental LFs (Supplementary Fig. 23).
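A minimal sketch of the noise-matching step follows. It assumes the SNR takes the common form (S − I_background)/σ_background, which is consistent with the symbols defined above but is an assumption here. It also assumes additive Gaussian noise, whereas the noise model actually used may differ. The function names are hypothetical.

```python
import random
import statistics

def snr(signal_pixels, background_pixels):
    """Assumed form: (mean signal - mean background) / std of background."""
    s = statistics.fmean(signal_pixels)
    i_bg = statistics.fmean(background_pixels)
    sigma_bg = statistics.pstdev(background_pixels)
    return (s - i_bg) / sigma_bg

def add_gaussian_noise(pixels, sigma, rng):
    """Additive zero-mean Gaussian noise (a simplifying assumption)."""
    return [v + rng.gauss(0.0, sigma) for v in pixels]

rng = random.Random(0)
clean_signal = [110.0] * 1000       # hypothetical signal region of a Clean LF
clean_background = [10.0] * 1000    # hypothetical background region

snr_by_sigma = {}
for sigma in (5.0, 20.0):           # spanning a low and a high noise level
    noisy_signal = add_gaussian_noise(clean_signal, sigma, rng)
    noisy_background = add_gaussian_noise(clean_background, sigma, rng)
    snr_by_sigma[sigma] = snr(noisy_signal, noisy_background)
    print(f"sigma = {sigma:4.1f} -> SNR = {snr_by_sigma[sigma]:.1f}")
```

In practice the noise level would be swept between the lowest and highest values measured in the experimental LFs, so that each synthetic Noisy LF has an SNR within the experimental range.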
For Fourier LFM, the LF projections were generated by using the PSF of FLFM. The high-resolution (HR) LF images used to guide the LF de-aliasing network were generated by squaring the PSF of the Fourier LFM and convolving it with 3D SR data.
For adaptive tuning shown in Supplementary Fig. 10, the training dataset for the base model included outer membranes of lysosomes (panels b–d) and outer membranes of mitochondria (panels e–h). For all other figures, the training and testing datasets consisted of the same sample type.
The network design of Alpha-LFM
Previous deep-learning-based light-field reconstruction networks enabled resolution enhancement by training on paired light-field projections and high-resolution GTs in an end-to-end manner. Such a one-step network produces hallucinations or artifacts when solving a hard inverse problem. Simply increasing the number of model parameters leads to over-fitting without incorporating physical priors (Supplementary Fig. 24). Considering the physical process of light-field imaging, our Alpha-Net disentangles this inverse problem into three sub-problems: denoising, de-aliasing, and 3D reconstruction. The three sub-problems are divided and conquered by jointly optimizing task-specific networks with a weighted objective function.
The architecture of the denoising sub-network was based on an attention mechanism49,50, which consisted of dual branches (channel-attention and view-attention) to extract the view-wise features from input views with different SNR. The denoising module can be formulated as
where \({{\rm{Con}}}{{{\rm{v}}}}_{{{\rm{dilated}}}}\) denotes a dilated convolution operation that enlarges the receptive field, \({{\rm{Fusion}}}\) is element-wise feature addition, and CAB and VAB are the channel-attention branch and view-attention branch. To suppress noise fluctuation, L1 and L2 losses are used as the loss function during the denoising stage:
For LF de-aliasing, effectively utilizing the angular and spatial information encoded in LFs is challenging when conventional 2D super-resolution networks, such as ResNet51 and RCAN52, are applied directly to the extracted views. These approaches often fail to fully leverage the rich angular features inherent in LFs. In FLFM, we used the same dual-attention network structure as the denoising sub-network to extract angular features from FLFM views, taking advantage of the distinct 4D representation in the Fourier domain21. To increase the sampling rate of LFs, an additional upsampling layer was integrated at the end of the dual-attention network, forming the FLFM de-aliasing network. For spatial LF de-aliasing, considering the interleaved arrangement of angular and spatial information in spatial light-field images, we employed two types of dilated convolution operations to disentangle 2D spatial information and 2D angular information53. These two convolutions and a series of activation functions and feature fusion layers form the Spatial-Angular block (SAB). Through the sequential connection of four SABs and an upsampling convolution, the aliased spatial information caused by the sparse sampling rate of the MLA can be de-aliased:
where \({{\rm{Fusion}}}\) is a channel concatenate operation. Considering the potential parallax information across different views, epipolar constraint (\({{\rm{EP}}}{{{\rm{I}}}}_{{{\rm{consistency}}}}\)) on de-aliasing prediction is proposed to keep the geometry relationship among various views. The objective function of de-aliasing can be formulated as:
where \({{\rm{MSE}}}\) means mean squared error between network output and de-aliased LF.
For the 3D reconstruction model, we extended the original VCD to handle super-resolution reconstruction. By adding MultiRes blocks54 and modifying the activation function, the capability of the VCD network was enhanced, achieving lower fitting errors when solving SR reconstruction:
where \({{\rm{SubpixelConv}}}\) refers to the subpixel convolution operation of the input.
Correspondingly, for the 3D reconstruction task, we adopt MSE loss and lateral gradients loss to preserve the structural information of the GT.
These three networks are optimized synchronously based on the designed weighted objective function:
The architecture of the three modules exhibited task-specific capabilities (Supplementary Fig. 25). Benefiting from the rationality of the model design, repeated training yielded highly consistent predictions, indicating low model uncertainty (Supplementary Fig. 26).
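The structure of such a weighted joint objective can be sketched as below. The weights are hypothetical placeholders, the epipolar-consistency term of the de-aliasing loss is omitted because its form is not given here, and 2D lists stand in for the LF and volume tensors. The sketch only illustrates how the three task losses named above (L1 + L2 for denoising, MSE for de-aliasing, MSE + lateral gradient loss for reconstruction) could be combined.

```python
def flatten(img):
    return [v for row in img for v in row]

def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def grad_x(img):
    return [[r[i + 1] - r[i] for i in range(len(r) - 1)] for r in img]

def grad_y(img):
    return [[b[i] - a[i] for i in range(len(a))] for a, b in zip(img, img[1:])]

def gradient_loss(pred, gt):
    """MSE between lateral finite differences of prediction and ground truth."""
    return (mse(flatten(grad_x(pred)), flatten(grad_x(gt)))
            + mse(flatten(grad_y(pred)), flatten(grad_y(gt))))

def joint_loss(denoised, clean_lf, dealiased, dealiased_gt, recon, sr_gt,
               weights=(1.0, 1.0, 1.0)):  # hypothetical weights
    l_dn = l1(flatten(denoised), flatten(clean_lf)) + mse(flatten(denoised), flatten(clean_lf))
    l_da = mse(flatten(dealiased), flatten(dealiased_gt))  # EPI term omitted
    l_rec = mse(flatten(recon), flatten(sr_gt)) + gradient_loss(recon, sr_gt)
    w_dn, w_da, w_rec = weights
    return w_dn * l_dn + w_da * l_da + w_rec * l_rec

lf = [[1.0, 2.0], [3.0, 4.0]]
recon = [[1.0, 2.0], [3.0, 4.0]]
sr_gt = [[0.0, 1.0], [2.0, 3.0]]   # constant offset: MSE > 0, gradient loss = 0
total = joint_loss(lf, lf, lf, lf, recon, sr_gt, weights=(1.0, 1.0, 0.5))
print(total)
```

The toy input shows the complementary roles of the two reconstruction terms: a constant intensity offset is penalized by the MSE term only, whereas the gradient term responds to structural differences.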
For Alpha-Net training, all models were trained on an NVIDIA GeForce RTX 3090 or 4090 GPU with Python 3.8 and TensorFlow 1.15.0. The training of an Alpha-Net model comprised two stages. First, the three sub-networks (a denoising network, a de-aliasing network, and a 3D reconstruction network) were pretrained. Specifically, the denoising network was trained with a learning rate of 1 × 10⁻⁴ for 51 epochs. The de-aliasing network and the 3D reconstruction network used the same learning rate (5 × 10⁻⁴) but different numbers of training epochs (51 and 151, respectively). The pretraining process lasted for 16 h. Second, Alpha-Net was trained from these pretrained models under the designed weighted loss. During this DPO, we adopted a step-based learning rate schedule with a decay factor of 0.5 and a decay step of 25 to enhance model performance. The initial learning rate was set to 1 × 10⁻⁴ and the total number of training epochs was 101. The training time of the second stage was roughly 2 h.
Once the model was trained, the captured LFs could be reconstructed into 3D volumes through network inference. The inference time was determined by the computational capacity of the device and the voxel count of the reconstructed volume. For example, reconstructing a 3D volume of 2040 × 2040 × 161 voxels (height × width × depth) from a captured LF (1020 × 1020 pixels, height × width) took ~0.540 s (~0.062 s for LF denoising, ~0.080 s for LF de-aliasing, and ~0.398 s for 3D reconstruction). For more details about Alpha-Net implementations, see Supplementary Note 1 and our open-source code.
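The step-based schedule of the second stage can be written out as a short sketch. It assumes a staircase decay, that the decay step of 25 is counted in epochs, and that epochs are indexed from 0. None of these details is stated explicitly, and the function name `step_lr` is hypothetical.

```python
def step_lr(epoch, initial_lr=1e-4, decay_factor=0.5, decay_step=25):
    """Staircase decay: the rate is halved after every `decay_step` epochs."""
    return initial_lr * decay_factor ** (epoch // decay_step)

for epoch in (0, 24, 25, 50, 75, 100):
    print(epoch, step_lr(epoch))
```

Under these assumptions the learning rate is halved four times over the 101 epochs, ending at one sixteenth of its initial value.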
Adaptive-tuning strategy for reconstruction of unseen samples
Unlike the previous supervised learning strategy30, which solves the inverse problem only for samples similar to the training data, Alpha-LFM adopts an adaptive-tuning strategy to overcome the hallucinations and abnormal structural features that a trained model produces when facing unseen data. This strategy reconstructs 3D signals from LF captures of structure types not included in the training data by fine-tuning the model, which requires only an in situ 2D WF image captured together with the LF image, rather than supervision by 3D label data from another scanning microscope. During the adaptive-tuning phase, we adopted an alternating optimization approach to update the parameters of the trained model. The paired synthetic training data (e.g., the outer membrane of mitochondria) containing 3D stacks were used to provide volumetric priors that ensure the 3D fidelity of the original model’s predictions, while the experimental LF captures and corresponding WF images acted as prior knowledge of the new domain (e.g., the outer membrane of lysosomes) to prevent data bias introduced by the trained model. For synthetic data training, the designed weighted loss of the original model was adopted:
For the new domain transfer, a reprojection loss was used to constrain the differing structures of unseen samples:
where \({{\rm{reprojection}}}\) was computed from the maximum projection and down-sampling of the network’s 3D prediction for \({{{\rm{LF}}}}_{{{\rm{unseen}}}}\). The WF images were processed by a deconvolution algorithm to suppress out-of-focus signals. To ensure the accuracy of this loss computation, the two detection modalities (LF and WF) were pre-aligned with the aid of fluorescent beads to yield paired LF images and registered WF images. The paired LF-WF dataset in the fine-tuning phase contained ~3 cells. In this work, \({{\rm{Los}}}{{{\rm{s}}}}_{{{\rm{synthetic}}}}\) and \({{\rm{Los}}}{{{\rm{s}}}}_{{{\rm{projection}}}}\) were calculated alternately on consecutive iterations, N and N + 1, respectively. The effectiveness of the alternating optimization strategy was validated through an ablation study on the reconstruction of unseen samples (Supplementary Fig. 27). The fine-tuning phase typically took 70–100-fold less time than building a new model, with a total of ~30,000 mini-batch iterations finished in 10–15 min.
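A minimal sketch of the reprojection constraint and the alternating schedule is given below. It assumes mean pooling for the down-sampling step and an MSE comparison with the deconvolved WF image, neither of which is specified here. The assignment of the synthetic loss to even iterations is likewise an arbitrary choice, and all function names are hypothetical.

```python
def max_projection(volume):
    """Maximum-intensity projection along z of a volume indexed as volume[z][y][x]."""
    depth, h, w = len(volume), len(volume[0]), len(volume[0][0])
    return [[max(volume[z][y][x] for z in range(depth)) for x in range(w)]
            for y in range(h)]

def mean_pool(img, f):
    """Down-sample a 2D image by averaging f x f blocks (assumed method)."""
    h, w = len(img) // f, len(img[0]) // f
    return [[sum(img[y * f + i][x * f + j] for i in range(f) for j in range(f)) / (f * f)
             for x in range(w)] for y in range(h)]

def reprojection_loss(volume, wf, factor):
    """MSE between the down-sampled projection of the prediction and the WF image."""
    proj = mean_pool(max_projection(volume), factor)
    diffs = [(p - t) ** 2 for pr, tr in zip(proj, wf) for p, t in zip(pr, tr)]
    return sum(diffs) / len(diffs)

def loss_for_iteration(n):
    """Alternate between the two objectives on consecutive iterations."""
    return "synthetic" if n % 2 == 0 else "reprojection"

volume = [[[1.0, 0.0], [0.0, 0.0]],     # z = 0
          [[0.0, 2.0], [3.0, 4.0]]]     # z = 1
print(reprojection_loss(volume, [[2.5]], factor=2))
print([loss_for_iteration(n) for n in range(4)])
```

Because the reprojection term only compares 2D projections, it cannot constrain axial structure on its own, which is why the synthetic-data loss is retained to supply the volumetric prior.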
Network comprehensive performance pyramid (NCPP)
The NCPP visualized the model’s ability by quantifying the resolution and fidelity of network inference results between Alpha-LFM and VCD-LFM during model optimization. To clarify the fitting ability of these two models on label data, such resolution and fidelity metrics were derived from the cut-off frequency difference (ΔKc) and the structural similarity discrepancy (ΔSSIM = 1−SSIM) between the 3D reconstruction of the network and corresponding GT, where Kc is computed by decorrelation analysis55. Both ΔKc and ΔSSIM range between 0 and 1, where values close to 0 indicate a favorable combination of resolution and fidelity, while values close to 1 signify a low-quality image reconstruction. In our study, 196 LF patches (Size: 360 × 360) of mitochondrial outer membrane data were reconstructed by VCD and Alpha-Net during the whole model training process. Specifically, we chose five different epochs (10, 100, 200, 210, and 300) to track this comprehensive performance of two models: For Alpha-Net, 10–100 epochs denoted the decomposed optimization on each sub-network while 200–300 epochs represented the progressive optimization process with multi-stage data (Noisy LFs, Clean LFs, De-aliased LFs, and 3D SR); For VCD, 10–300 epochs denoted the optimization process under SR stacks and corresponding Noisy LFs. By calculating ΔKc and ΔSSIM of network inferences under the epochs, the performance scattering plots in the network optimization process were produced (Fig. 2a). Besides, the convex hull computation of these scattering plots was used to generate the boundary line of the statistics distribution and evaluate the performance deviation.
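The convex-hull step can be sketched with a standard monotone-chain algorithm. The (ΔKc, ΔSSIM) points below are hypothetical, and using the hull area as a single summary of the spread is an illustrative choice rather than the metric reported in this work.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Shoelace formula."""
    n = len(vertices)
    s = sum(vertices[i][0] * vertices[(i + 1) % n][1]
            - vertices[(i + 1) % n][0] * vertices[i][1] for i in range(n))
    return abs(s) / 2

# Hypothetical (delta_Kc, delta_SSIM) pairs for one epoch, with delta_SSIM = 1 - SSIM.
scatter = [(0.0, 0.0), (0.5, 0.0), (0.5, 0.5), (0.0, 0.5), (0.25, 0.25)]
hull = convex_hull(scatter)
print(hull, polygon_area(hull))
```

A hull that is small and close to the origin corresponds to patches that are consistently reconstructed with both high resolution and high fidelity.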
Assessment of the resolution, fidelity, and SBP of Alpha-LFM
In Fig. 2, we used the structural similarity (SSIM) and PSNR functions in MATLAB to assess the fidelity of our Alpha-Net reconstructions and of other LFM techniques, including LFD, VS-LFD, and VCD, using the Airyscan data as the reference in both the xy and yz planes. In Supplementary Fig. 10, the resolution-scaled Pearson coefficient was quantified by SQUIRREL analysis56 using WF as the reference. We applied decorrelation analysis, conducted in MATLAB, to quantify the resolution of Alpha-Net results and other 3D microscopy implementations. The axial resolution was measured by the sectorial resolution mode of the decorrelation analysis55, in which only the resolution in a narrow sectorial region along the z direction was calculated. The SBP was calculated as \({\rm{SBP}}={\rm{FOV}}/{(0.5\delta )}^{3}\), with δ being the system’s resolution, the factor 0.5 stemming from the Nyquist-Shannon sampling theorem, and the exponent 3 representing the three dimensions.
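For concreteness, the SBP definition (FOV volume divided by the cube of half the resolution) can be written as a small Python helper. The function name and the choice of micrometre units are assumptions for illustration; a single isotropic resolution δ is used, as in the formula.

```python
def space_bandwidth_product(fov_volume_um3, resolution_um):
    """SBP = FOV / (0.5 * delta)^3.

    fov_volume_um3: imaged volume in cubic micrometres.
    resolution_um:  system resolution delta in micrometres (assumed isotropic).
    """
    if resolution_um <= 0:
        raise ValueError("resolution must be positive")
    voxel_edge = 0.5 * resolution_um   # Nyquist sampling: two samples per resolution element
    return fov_volume_um3 / voxel_edge ** 3
```

Because δ enters as a cube, halving the resolution value at a fixed FOV raises the SBP eightfold.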
In Fig. 2l and Supplementary Fig. 15, the fidelity of Alpha-LFM was validated under varying light doses by adjusting the optical power and exposure time. The comparison between Alpha-LFM and 3D SIM was conducted using the same objective (Olympus UPL, APO100XOHR) and identical light doses from 1.6 to 0.02 J/cm² to image the same volume with a depth of 4 μm.
Cell culture and fluorescence labeling
U2OS cells were grown in culture medium containing McCoy’s 5A medium (Thermo Fisher Scientific) supplemented with 1% antibiotic-antimycotic (Thermo Fisher Scientific) and 10% fetal bovine serum (Thermo Fisher Scientific) at 37 °C with 5% CO2 in a humidified incubator.
For labeling lysosomes in fixed U2OS cells, cells were first transfected with EGFP-Rab7A using Lipofectamine 2000 according to the standard protocol and cultured at 37 °C with 5% CO2 for an additional 24 h. Before imaging, the cells were fixed with 2% glutaraldehyde for 20 min.
For labeling tubulin in fixed U2OS cells, cells were seeded onto coverslips at 37 °C with 5% CO2 for 12 h. Before fixation, the cells were washed with phosphate buffered saline (PBS, Thermo Fisher Scientific) at 37 °C and then treated with fixing buffer (containing 3% paraformaldehyde (Electron Microscopy Sciences), 0.1% glutaraldehyde (Electron Microscopy Sciences), 0.2% Triton X-100 (Sigma-Aldrich)) for 15 min, then incubated with 0.2% Triton X-100 for 15 min and blocked with blocking buffer (3% bovine serum albumin (Sigma-Aldrich) and 0.05% Triton X-100 (Sigma-Aldrich)) for 20 min at room temperature. After that, cells were incubated with an anti-alpha tubulin antibody (Abcam, 1:500 dilution) overnight at 4 °C. Subsequently, the primary antibody was removed, and the cells were washed twice with PBS. Next, the cells were incubated with a secondary antibody (Abcam, labeled with Alexa Fluor 488, 1:400 dilution) for another 2 h at room temperature. The antibody was then removed, and the cells were washed three times with PBS.
For labeling outer membranes of mitochondria and lysosomes in live U2OS cells, cells were first transfected with Tomm20-EGFP and Rab7A-mCherry2 using Lipofectamine 2000 according to the standard protocol and cultured at 37 °C with 5% CO2 for an additional 24 h. Before imaging, the old medium was removed and replaced with fresh medium.
For labeling mitochondrial matrix and chromosomes in live U2OS cells, cells were first transfected with Cox4-EGFP and H2B-mCherry2 using Lipofectamine 2000 according to the standard protocol and cultured at 37 °C with 5% CO2 for an additional 8 h. After 6–8 h of transfection, the cells were digested with 0.25% trypsin, seeded on cell culture dishes (20 mm diameter), and incubated for 36 h at 37 °C in 5% CO2.
For labeling peroxisomes in live U2OS cells, cells were first transfected with SKL-mApple using Lipofectamine 2000 according to the standard protocol and cultured at 37 °C with 5% CO2 for an additional 24 h. Before imaging, the old medium was removed and replaced with fresh medium. For GT image acquisition, cells were fixed with 4% paraformaldehyde (PFA) before imaging.
For labeling ER in live COS-7 cells, cells were first transfected with Sec61β-EGFP using Lipofectamine 2000 according to the standard protocol and cultured at 37 °C with 5% CO2 for an additional 24 h. Before imaging, the old medium was removed and replaced with fresh medium. For GT image acquisition, cells were fixed with 2% glutaraldehyde before imaging.
During the imaging process, cells were cultivated in phenol red-free McCoy’s 5A medium (customized, Boster Biological Technology) within the confocal dishes. To ensure a stable environment, the cells in confocal dishes were cultured in the live-cell microscope incubation system (TOKAIHIT) to maintain a consistent temperature of 37 °C and a 5% CO2 atmosphere.
Live-cell imaging
Light-field imaging of the lysosome outer membrane in live U2OS cells was implemented using our add-on light-field device with a ×60/1.3 NA objective (Olympus UPlanSApo60XS2) at a volumetric imaging rate of up to 333 Hz. This high-speed light-field imaging of live cells continued for 2 min with less than 50% photobleaching, yielding ~40,000 SR reconstructions using Alpha-LFM. As a result, fine deformations of lysosome outer membranes occurring within milliseconds could be observed by Alpha-LFM in three dimensions. To compare the photobleaching rate with those of other 3D microscopy implementations, we also imaged the lysosome outer membrane with Airyscan, 3D SIM, and LSFM. The laser intensity and exposure time were carefully adjusted to ensure that the image SNRs of these microscopes were similar (SNR = 5.98, 5.87, 5.74, and 5.99 for LSFM, Airyscan, 3D SIM, and Alpha-LFM, respectively). To maintain sufficient SNR, LSFM, Airyscan, and 3D SIM required 4 s, 4 min, and 30 s to image a whole cell, respectively. The bleaching rates were then calculated using MATLAB. Multiple areas were cropped, and a threshold was used to discriminate the signal areas from the background areas automatically. The photobleaching rate over time was finally calculated using the following equation:
where \({I}_{{\rm{signal}}}\) and \({I}_{{\rm{background}}}\) are the mean intensity values of the cropped signal and background regions, respectively.
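A minimal Python sketch of a bleaching-curve computation built on these two quantities is given here. The exact formula is not reproduced in this text, so the normalization used (background-subtracted signal divided by its value in the first frame) is an assumption, as are the user-supplied threshold and the function names.

```python
import numpy as np

def background_subtracted_signal(frame, threshold):
    """Mean of pixels above `threshold` minus mean of pixels at or below it."""
    signal = frame[frame > threshold]
    background = frame[frame <= threshold]
    if signal.size == 0 or background.size == 0:
        raise ValueError("threshold does not separate signal from background")
    return float(signal.mean() - background.mean())

def bleaching_curve(frames, threshold):
    """Assumed definition: remaining signal relative to the first frame."""
    values = np.array([background_subtracted_signal(f, threshold) for f in frames])
    return values / values[0]
```

With this convention a value of 0.5 at a given time point means that half of the initial background-subtracted signal remains.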
Quantitative analysis of the interaction between mitochondria and lysosomes
An interaction between a mitochondrion and a lysosome was identified when the measured distance between the lysosome and the constriction site of the mitochondrion was smaller than 240 nm (close proximity) in at least three consecutive frames. We four-dimensionally captured such interactions between mitochondria and lysosomes in 17 live cells, where 41 ROIs that met the Lyso-Mito contact criteria in 2D MIP mode were identified, with 24 ROIs containing lysosome-mediated mitochondrial fission and the other 17 ROIs containing mitochondrial fusion. Then, the actual distance between lysosomes and mitochondria in these ROIs was measured in true 3D mode using the commercial Imaris software. As a result, 10 false contacts were found in the 2D results. We further calculated the fission and fusion velocity using self-written MATLAB code that included the following steps:
(i) Obtaining the coordinates of the mitochondrial endpoints. For each frame of the time-series data, the image was first binarized with the MATLAB “imbinarize()” function. The binary image was then skeletonized with the “bwskel()” function. Finally, the “endpoints” operation of the “bwmorph3()” function was used to mark the endpoints of the skeletonized image, and the endpoint coordinates of all mitochondria were recorded.
(ii) Tracking the endpoints and identifying the trajectories related to mitochondrial fission and fusion events. The code used the MATLAB “assignDetectionsToTracks” function to track the recorded endpoint coordinates along the time axis for the cropped ROIs, providing the motion trajectories of all endpoints over time. The two longest motion trajectories were then kept as the trajectories related to the fission or fusion event, while the discrete trajectories were removed. Wherever tracking was discontinuous between frames, the “interp1()” function was applied to complete the trajectories.
(iii) Velocity acquisition. For each motion trajectory, the code calculated the distance between endpoints in all adjacent frames. Because the gaps had already been filled by interpolation, the instantaneous velocity was calculated by dividing each distance by the frame interval. The resulting sequence of instantaneous velocities for each motion trajectory was then averaged to obtain the overall fission or fusion velocity of the ROI.
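The contact criterion and steps (ii)–(iii) can be sketched in Python as follows. This is not the authors' MATLAB code: `numpy.interp` stands in for “interp1()”, linear interpolation is assumed, and the function names and data layout (frame indices plus per-frame 3D coordinates) are hypothetical.

```python
import numpy as np

def is_contact(distances_nm, limit_nm=240.0, min_frames=3):
    """True if the distance stays below `limit_nm` for `min_frames` consecutive frames."""
    run = 0
    for d in distances_nm:
        run = run + 1 if d < limit_nm else 0
        if run >= min_frames:
            return True
    return False

def fill_gaps(frames, coords):
    """Linearly interpolate a trajectory onto every frame between first and last."""
    coords = np.asarray(coords, dtype=float)
    all_frames = np.arange(frames[0], frames[-1] + 1)
    cols = [np.interp(all_frames, frames, coords[:, k]) for k in range(coords.shape[1])]
    return np.column_stack(cols)

def mean_velocity(frames, coords, frame_interval):
    """Average of per-frame instantaneous speeds along one endpoint trajectory."""
    if len(frames) < 2:
        raise ValueError("need at least two tracked frames")
    track = fill_gaps(frames, coords)
    steps = np.linalg.norm(np.diff(track, axis=0), axis=1)  # distance per frame step
    return float(np.mean(steps / frame_interval))
```

Here `frames` must be sorted in increasing order, and the velocity is returned in coordinate units per time unit of `frame_interval`.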
Experimental setup for long-term live-cell imaging
To observe the mitochondrial dynamics and use chromosomes to identify the cell stages, we used a 488/561 nm dual-band optical filter block (Chroma, 59904) in the microscope. Given that cell division is highly sensitive to phototoxicity, the power of the LED (CoolLED, PE-800) was adjusted to 1% at 470 nm (30 μW) and 1% at 550 nm (30 μW). Since tracking high-speed fission and fusion events requires imaging at intervals of 10 s or shorter and the entire cell cycle needs at least 48 h of observation, we designed a non-uniform imaging strategy for this long-term experiment. During each 30-min mid-term observation window, the imaging pipeline included the following steps: (i) Implementing an Alpha-Net-based focusing strategy to correct sample drift along the z-axis. The 3D distribution of the mitochondria was quickly reconstructed by Alpha-Net and compared with the original distribution to calculate the z-drift distance, which was then corrected by moving the objective with a piezo scanner (COREMORROW, P73.Z). This process typically took only 1 s. (ii) Imaging the mitochondrial dynamics at a fixed depth of interest for 30 min, with an exposure time of 500 ms and an interval of 10 s. Meanwhile, the cell stage was identified by imaging the chromosomes in the cell nuclei with an exposure time of 500 ms and an interval of 100 s.
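To make the timing of one observation window concrete, the following Python sketch lists the acquisition start times implied by the stated intervals (10 s for mitochondria, 100 s for chromosomes, over 30 min). How the two channels are interleaved on the hardware is not described in the text, so the merged event list and the function names are assumptions for illustration only.

```python
def acquisition_times(window_s, interval_s):
    """Start times (in seconds) of acquisitions at a fixed interval within one window."""
    return list(range(0, window_s, interval_s))

def window_schedule(window_s=30 * 60, mito_interval_s=10, chrom_interval_s=100):
    """Merged, time-ordered list of (start_time_s, channel) events for one window."""
    events = [(t, "mitochondria") for t in acquisition_times(window_s, mito_interval_s)]
    events += [(t, "chromosomes") for t in acquisition_times(window_s, chrom_interval_s)]
    return sorted(events)
```

Under these settings one window holds 180 mitochondrial volumes and 18 chromosome frames, which keeps the light dose of the cell-stage channel an order of magnitude below that of the dynamics channel.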
Tracking of the fates and morphology changes of mitochondria
We applied Mitometer57 to track the mitochondrial dynamics reconstructed by Alpha-Net. The confident tracks and the corresponding morphology parameters were then exported as a mat file. To obtain the morphology changes of mitochondria throughout the entire cycle, we first extracted the morphology parameters, including major axis length, minor axis length, volume, and solidity, from the mat file at each time point. We then plotted these parameters over time and observed their changes throughout the entire cycle. To further track the fate of each mitochondrion, we used self-written MATLAB code to extract the fission and fusion events from the mat file of the confident tracks. The mitochondria that exhibited a first fission were then extracted, and their fission sites were identified. We compared the volumes of the two divided mitochondria in the frame immediately after fission, and the smaller one was defined as the smaller daughter. If the volume of the smaller one was less than 25% of the total length of the two divided mitochondria, the fission was defined as a peripheral fission; otherwise, it was defined as a midzone fission. The number of fission and fusion events that occurred in these daughters was then extracted, and the daughters that underwent no fission or fusion were labeled as no event. Finally, the event rates of the peripheral and midzone daughters were calculated and mapped.
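The classification rule and the event-rate tally can be sketched in Python as shown here. The sketch takes a generic size measure for each daughter (the paragraph refers to both volume and length, so the choice is left to the caller), and the function names and the three event labels are assumptions rather than the authors' MATLAB implementation.

```python
from collections import Counter

def classify_fission(size_a, size_b, cutoff=0.25):
    """Label a fission by the smaller daughter's share of the combined size."""
    total = size_a + size_b
    if total <= 0:
        raise ValueError("daughter sizes must sum to a positive value")
    fraction = min(size_a, size_b) / total
    return "peripheral" if fraction < cutoff else "midzone"

def event_rates(daughter_events):
    """Fraction of daughters with each subsequent fate.

    daughter_events: list of labels, each 'fission', 'fusion', or 'none'.
    """
    counts = Counter(daughter_events)
    n = len(daughter_events)
    return {label: counts.get(label, 0) / n for label in ("fission", "fusion", "none")}
```

Calling `event_rates` separately on the daughters of peripheral and of midzone fissions gives the two sets of rates that are compared.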
Statistics and reproducibility
Each experiment was repeated independently at least three times using distinct biological replicates. Representative quantitative results shown in Figs. 2j, k, 3c, 5e–i and 6g were consistent across these independent experiments.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The source data that support the findings of this study are provided with the paper (including figures in the manuscript and Supplementary Information). A representative training dataset for network training and raw data from figures (e.g., Figs. 4a and 5a) are publicly available on the figshare database (https://doi.org/10.6084/m9.figshare.29492231). Additional datasets (time-lapse live-cell data and training data of diverse samples for deep learning) are available from the corresponding author upon request due to their large file size. Source data are provided with this paper.
Code availability
The customized Alpha-Net program and the code for the quantitative analyses implemented in the current study are available (https://github.com/feilab-hust/Alpha-LFM, https://doi.org/10.5281/zenodo.15779137).
References
Valm, A. M. et al. Applying systems-level spectral imaging and analysis to reveal the organelle interactome. Nature 546, 162–167 (2017).
Sahl, S. J., Hell, S. W. & Jakobs, S. Fluorescence nanoscopy in cell biology. Nat. Rev. Mol. Cell Biol. 18, 685–701 (2017).
Choquet, D., Sainlos, M. & Sibarita, J. B. Advanced imaging and labelling methods to decipher brain cell organization and function. Nat. Rev. Neurosci. 22, 237–255 (2021).
Schermelleh, L. et al. Super-resolution microscopy demystified. Nat. Cell Biol. 21, 72–84 (2019).
Schermelleh, L. et al. Subdiffraction multicolor imaging of the nuclear periphery with 3D structured illumination microscopy. Science 320, 1332–1336 (2008).
Huang, B., Wang, W., Bates, M. & Zhuang, X. Three-dimensional super-resolution imaging by stochastic optical reconstruction microscopy. Science 319, 810–813 (2008).
Shao, L., Kner, P., Rego, E. H. & Gustafsson, M. G. Super-resolution 3D microscopy of live whole cells using structured illumination. Nat. Methods 8, 1044–1046 (2011).
York, A. G. et al. Instant super-resolution imaging in live cells and embryos via analog image processing. Nat. Methods 10, 1122–1126 (2013).
Chen, B. C. et al. Lattice light-sheet microscopy: imaging molecules to embryos at high spatiotemporal resolution. Science 346, 1257998 (2014).
Li, D. et al. Extended-resolution structured illumination imaging of endocytic and cytoskeletal dynamics. Science 349, aab3500 (2015).
Wu, Y. et al. Multiview confocal super-resolution microscopy. Nature 600, 279–284 (2021).
Li, X. et al. Three-dimensional structured illumination microscopy with enhanced axial resolution. Nat. Biotechnol. 41, 1307–1319 (2023).
Zhou, Y., Mao, S. & Fei, P. Light sheet fluorescence microscopy: advancing biological discovery with more dimensions, higher speed, and lower phototoxicity. Innovation 5, 100692 (2024).
Wang, Z. et al. 3D live imaging and phenotyping of CAR-T cell mediated-cytotoxicity using high-throughput Bessel oblique plane microscopy. Nat. Commun. 15, 6677 (2024).
Huff, J. The Airyscan detector from ZEISS: confocal imaging with improved signal-to-noise ratio and super-resolution. Nat. Methods 12, i–ii (2015).
Gao, L. et al. Noninvasive imaging beyond the diffraction limit of 3D dynamics in thickly fluorescent specimens. Cell 151, 1370–1385 (2012).
Zhao, Y. et al. Isotropic super-resolution light-sheet microscopy of dynamic intracellular structures at subsecond timescales. Nat. Methods 19, 359–369 (2022).
Broxton, M. et al. Wave optics theory and 3-D deconvolution for the light field microscope. Opt. Express 21, 25418–25439 (2013).
Cohen, N. et al. Enhancing the performance of the light field microscope using wavefront coding. Opt. Express 22, 24817–24839 (2014).
Prevedel, R. et al. Simultaneous whole-animal 3D imaging of neuronal activity using light-field microscopy. Nat. Methods 11, 727–730 (2014).
Guo, C., Liu, W., Hua, X., Li, H. & Jia, S. Fourier light-field microscopy. Opt. Express 27, 25573–25594 (2019).
Wu, J. et al. Iterative tomography with digital adaptive optics permits hour-long intravital observation of 3D subcellular dynamics at millisecond scale. Cell 184, 3318–3332.e3317 (2021).
Zhang, Z. et al. Imaging volumetric dynamics at high speed in mouse and zebrafish brain with confocal light field microscopy. Nat. Biotechnol. 39, 74–83 (2021).
Yi, C., Zhu, L., Li, D. & Fei, P. Light field microscopy in biological imaging. J. Innov. Opt. Health Sci. 16, 2230017 (2023).
Hua, X., Liu, W. & Jia, S. High-resolution Fourier light-field microscopy for volumetric multi-color live-cell imaging. Optica 8, 614–620 (2021).
Lu, Z. et al. Virtual-scanning light-field microscopy for robust snapshot high-resolution volumetric imaging. Nat. Methods 20, 735–746 (2023).
Han, K. et al. 3D super-resolution live-cell imaging with radial symmetry and Fourier light-field microscopy. Biomed. Opt. Express 13, 5574–5584 (2022).
Weigert, M. et al. Content-aware image restoration: pushing the limits of fluorescence microscopy. Nat. Methods 15, 1090–1097 (2018).
Qiao, C. et al. Rationalized deep learning super-resolution microscopy for sustained live imaging of rapid subcellular processes. Nat. Biotechnol. 41, 367–377 (2022).
Wang, Z. et al. Real-time volumetric reconstruction of biological dynamics with light-field microscopy and deep learning. Nat. Methods 18, 551–556 (2021).
Zhu, L., Yi, C. & Fei, P. A practical guide to deep-learning light-field microscopy for 3D imaging of biological dynamics. STAR Protoc. 4, 102078 (2023).
Zhu, T. et al. High-speed large-scale 4D activities mapping of moving C. elegans by deep-learning-enabled light-field microscopy on a chip. Sens. Actuators B Chem. 348, 130638 (2021).
Yi, C. et al. Video-rate 3D imaging of living cells using Fourier view-channel-depth light field microscopy. Commun. Biol. 6, 1259 (2023).
Guo, Y. et al. Closed-loop matters: dual regression networks for single image super-resolution. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 5407–5416 (IEEE, 2020).
Lempitsky, V., Vedaldi, A. & Ulyanov, D. Deep image prior. In Proc. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition 9446–9454 (IEEE, 2018).
Chen, J. et al. Three-dimensional residual channel attention networks denoise and sharpen fluorescence microscopy image volumes. Nat. Methods 18, 678–687 (2021).
Huff, J. et al. The new 2D Superresolution mode for ZEISS Airyscan. Nat. Methods 14, 1223–1223 (2017).
Gustafsson, M. G. et al. Three-dimensional resolution doubling in wide-field fluorescence microscopy by structured illumination. Biophys. J. 94, 4957–4970 (2008).
Sapoznik, E. et al. A versatile oblique plane microscope for large-scale and high-resolution imaging of subcellular dynamics. eLife 9, e57681 (2020).
Voleti, V. et al. Real-time volumetric microscopy of in vivo dynamics and large-scale samples with SCAPE 2.0. Nat. Methods 16, 1054–1062 (2019).
Guo, Y. et al. Visualizing intracellular organelle and cytoskeletal interactions at nanoscale resolution on millisecond timescales. Cell 175, 1430–1442.e1417 (2018).
Boutry, M. & Kim, P. K. ORP1L mediated PI(4)P signaling at ER-lysosome-mitochondrion three-way contact contributes to mitochondrial division. Nat. Commun. 12, 5354 (2021).
Wong, Y. C., Ysselstein, D. & Krainc, D. Mitochondria-lysosome contacts regulate mitochondrial fission via RAB7 GTP hydrolysis. Nature 554, 382–386 (2018).
Kleele, T. et al. Distinct fission signatures predict mitochondrial degradation or biogenesis. Nature 593, 435–439 (2021).
Mishra, P. & Chan, D. C. Mitochondrial dynamics and inheritance during cell division, development and disease. Nat. Rev. Mol. Cell Biol. 15, 634–646 (2014).
Chen, A. et al. MVSNeRF: fast generalizable radiance field reconstruction from multi-view stereo. In Proc. IEEE/CVF International Conference on Computer Vision 14124–14133 (2021).
Mildenhall, B. et al. NeRF: representing scenes as neural radiance fields for view synthesis. Commun. ACM 65, 99–106 (2021).
Zhao, F. et al. Deep-learning super-resolution light-sheet add-on microscopy (Deep-SLAM) for easy isotropic volumetric imaging of large biological specimens. Biomed. Opt. Express 11, 7273–7285 (2020).
Zhang, Y. et al. Image super-resolution using very deep residual channel attention networks. In Proc. European Conference on Computer Vision (ECCV) 286–301 (Springer, 2018).
Mo, Y., Wang, Y., Xiao, C., Yang, J. & An, W. Dense dual-attention network for light field image super-resolution. IEEE Trans. Circuits Syst. Video Technol. 32, 4431–4443 (2021).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (IEEE, 2016).
Qiao, C. et al. Evaluation and development of deep neural networks for image super-resolution in optical microscopy. Nat. Methods 18, 194–202 (2021).
Wang, Y. et al. Spatial-angular interaction for light field image super-resolution. In Proc. Computer Vision—ECCV 2020 290–308 (Springer, 2020).
Ibtehaz, N. & Rahman, M. S. MultiResUNet: rethinking the U-Net architecture for multimodal biomedical image segmentation. Neural Netw. 121, 74–87 (2020).
Descloux, A., Grußmayer, K. S. & Radenovic, A. Parameter-free image resolution estimation based on decorrelation analysis. Nat. Methods 16, 918–924 (2019).
Culley, S. et al. Quantitative mapping and minimization of super-resolution optical imaging artifacts. Nat. Methods 15, 263–266 (2018).
Lefebvre, A. E. Y. T., Ma, D., Kessenbrock, K., Lawson, D. A. & Digman, M. A. Automated segmentation and tracking of mitochondria in live-cell time-lapse images. Nat. Methods 18, 1091–1102 (2021).
Acknowledgements
We are grateful to X. Duan for providing us with the fluorescent cell samples and to S. Mao for discussing the biological applications with us. This work was supported by funding from the National Natural Science Foundation of China (T2225014, 62375095, 82470239, 32201132), the National Key Research and Development Program of China (2023ZD0519900, 2022YFC3401100), the Key Research and Development Project of Hubei Province (2024BCB011), the Interdisciplinary Research Program of HUST (5003540153, 2024JCYJ064), and the National Research Center for Translational Medicine at Shanghai (NRCTM (SH)-2023-04).
Author information
Contributions
P.F., L.Z., and C.Y. conceived the idea. P.F. and D.L. oversaw the project. L.Z. and J.S. developed the optical setups and acquired the experimental images. L.Z., J.S., and C.Y. developed the programs. L.Z. and J.S. processed the images. M.Z. prepared all the biological samples. L.C., Y.Z., C.Z., J.T., Y.Z., L.Z., C.Y., J.S., M.H., Y.H., S.W., H.C., and D.L. analyzed the data. L.Z., D.L., and P.F. discussed and wrote the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhu, L., Sun, J., Yi, C. et al. Adaptive-learning physics-assisted light-field microscopy enables day-long and millisecond-scale super-resolution imaging of 3D subcellular dynamics. Nat Commun 16, 7132 (2025). https://doi.org/10.1038/s41467-025-62471-w
DOI: https://doi.org/10.1038/s41467-025-62471-w