Introduction

In recent years, interest in realistic three-dimensional (3D) modeling has surged, accompanied by significant advancements in algorithms that utilize multiple photographs, such as photogrammetry1, neural radiance field (NeRF)2, and 3D Gaussian splatting (3DGS)3,4. These technologies have driven innovations in virtual production and game development, particularly for background assets5. However, absolute coordinates cannot be determined from photographs alone, making ground control points or additional modalities essential for accurate georeferencing6,7. The integration of light detection and ranging (LiDAR) has garnered significant attention for enhancing geometric fidelity and improving 3D models8,9,10. LiDAR is particularly valuable in scenarios where generating a point cloud via structure-from-motion (SfM) is impractical, such as for objects beneath dense canopies or those with limited distinctive features11. However, insufficient point cloud density often poses a challenge for dense reconstruction12.

Recently, algorithms integrating physical models of nature with 3DGS have been explored13. Interest in 4D Gaussian splatting, driven by augmented reality (AR) and virtual reality (VR) applications, is also growing14. Therefore, the integration of LiDAR is expected to play a crucial role in handling dynamic objects. This could lead to a need for higher point cloud density corresponding to higher frame rates. Drones have become essential tools for scalable 3D modeling, particularly in inaccessible environments for photogrammetry. However, integrating photogrammetry with LiDAR is often limited to large aircraft owing to payload constraints. Therefore, miniaturizing LiDAR technology is also key to enhancing the versatility and applicability of these systems.

While algorithms for photogrammetry–LiDAR fusion continue to evolve, LiDAR systems used for 3D modeling were primarily intended for advanced driver-assistance systems (ADAS)15 or unmanned aerial vehicles (UAVs)16. In these applications, characteristics such as long-range detection and a wide field of view (FoV) are prioritized; however, the point cloud density and geometric fidelity required for high-quality 3D modeling are frequently lacking. Typical single-photon LiDAR (SPL) systems are used in aerospace17 and aviation applications18,19, where they are capable of measuring distance with a single photon per shot. The fundamental principle of SPL, which achieves the lowest photon cost per measurement point and excels in capturing fast motion, is critical for 3D modeling, as it supports high point cloud density and motion tracking while maintaining compact size and low power consumption. However, improving distance precision from a single shot requires the temporal precision of both the light source and the timer to match the levels found in SPL systems used in space and aviation applications.

The purpose of this study is to resolve the trade-offs among dense point clouds, distance precision, ultra-miniaturization, and low cost in LiDAR systems. Therefore, we introduce an ultralight solid-state SPL (SS-SPL) system comprising the following devices, each addressing specific conventional issues: a pulsed semiconductor laser enhanced by bandgap renormalization (BGR) provides low beam quality factors (M²), which was difficult to achieve with conventional high-intensity pulsed semiconductor lasers20,21,22,23, and contributes to improved angular resolution of point clouds. A large-aperture two-dimensional (2D) micro-electro-mechanical system (MEMS) mirror achieves more than twice the figure of merit (FoM) compared to conventional devices24,25,26. As a result, it provides the FoV required for 3D modeling, while being more compact than traditional mechanical optical systems. A proposed feedforward frequency control using an up-down direct digital synthesis (UD-DDS) eliminates the beam trajectory instability caused by changes in the resonant frequency in conventional feedback methods27,28,29, thereby improving the positional accuracy and spatial filling rate of point clouds. A high-precision time-to-digital converter (TDC) reflects a state-of-the-art architecture with picosecond-level accuracy30,31, unlike the conventional TDCs with around 100 ps resolution typically used in LiDAR systems32. This contributes to improved distance precision and also shows that such advanced architecture exhibits sufficient electrical noise tolerance.

In this work, the SS-SPL, reflecting the features of the introduced devices and maintaining a high optical efficiency carried by the principle of SPL, solves the critical issues discussed above, such as insufficient point cloud density for 3D modeling and payload limitations in drone applications. To demonstrate the capabilities of SS-SPL, we compare the ground truth target with the corresponding point clouds and present colored 3D models obtained by fusing the super-dense point clouds generated by SS-SPL with camera data. The demonstration suggests new possibilities for introducing LiDAR into a wide range of fields, such as 3D modeling and compact drones that represent rapidly growing and large-scale markets, with applications spanning industries including entertainment, sports, construction, logistics, and other sectors.

Results

Concept

The ranging principle of SS-SPL is illustrated in Fig. 1a. A single short-pulse laser beam is directed at an object, and weak light scattered by the object is detected by a silicon photomultiplier (SiPM). The detected light corresponds to photons ranging from a single to several tens, which can be converted into quantized weak electrical pulses. This occurs because the SiPM consists of thousands of single-photon avalanche diode (SPAD) elements connected to a common cathode and a common anode, resulting in a linear output for weak signals33,34. If the temporal resolution of both the pulsed laser and the SiPM is sufficiently high, a single electrical pulse can exhibit quantized levels, as illustrated in the figure. These weak electrical pulses are usually buried in common mode noise. However, these electrical pulses can be extracted by employing a replica of the SiPM to cancel out only the common mode noise. The rising edge timing detection is performed using a threshold voltage corresponding to the quantized number of photoelectrons. A time of flight (ToF) is measured from the difference between the reference rising edge timing and the rising edge timing of the round trip from the SS-SPL to the object. Since the single short-pulse laser beam is continuously scanned in two dimensions (Fig. 1b), measurements at each beam position are performed in a single shot.
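The conversion from rising-edge timing to distance described above can be sketched as follows. This is a minimal illustration of the standard round-trip relation, not the authors' firmware; the function names are hypothetical, and the 47 ps figure is the timing precision quoted later in the TDC subsection.

```python
# Minimal sketch of direct time-of-flight ranging (hypothetical helper names).
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def tof_to_distance_m(t_ref_s, t_return_s):
    """One-way distance from the reference and return rising-edge times."""
    return C_M_PER_S * (t_return_s - t_ref_s) / 2.0


def timing_to_range_sigma_m(sigma_t_s):
    """Distance precision that corresponds to a given timing precision."""
    return C_M_PER_S * sigma_t_s / 2.0


if __name__ == "__main__":
    round_trip_s = 2.0 * 25.0 / C_M_PER_S          # target at 25 m
    print(tof_to_distance_m(0.0, round_trip_s))      # distance in metres
    print(timing_to_range_sigma_m(47e-12) * 1e3)     # precision in millimetres
```

With a timing precision of 47 ps, the second call gives roughly 7 mm, consistent with the single-shot precision reported for short distances.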

Fig. 1: Principle and schematic of the ultralight solid-state single-photon light detection and ranging (SS-SPL).

a Simplified ranging principle. b Schematic of Lissajous trajectory generated by beam scanning of a 2D micro-electro-mechanical system (MEMS) mirror. c The large receiving aperture of 13.17 mm² comprises the large diameter 2D MEMS mirror and the freeform mirror. The transmitter (TX) unit comprises a single Q-switching semiconductor laser and small optics. The receiver (RX) unit includes the silicon photomultipliers (SiPMs), isolation transfers (~55 V on the input side, less than 1.0 V on the output side), and differential amplifiers (a part of the complementary metal-oxide-semiconductor (CMOS) chip). The four channels of the two-step time-to-digital converter in the CMOS chip are connected to the individual differential amplifiers. The field-programmable gate array (FPGA) is not included in the module. d Laser chip and CMOS driver are die-bonded on the small half cavity package with a volume of 0.033 cc (5.5 mm × 4.0 mm × 1.5 mm). e Ultralight module comprises TX on the left side, along with RX and time-to-digital converter (TDC) on the right side. The module comprises resin, resulting in a package volume of 11.7 cc (25 mm × 26 mm × 18 mm) and a weight of 9.97 g.

A schematic illustration of the SS-SPL system is shown in Fig. 1c. The Q-switching semiconductor laser chip and a dedicated complementary metal-oxide semiconductor (CMOS) driver were die-bonded onto a compact 0.033 cc half-cavity package (Fig. 1d). A tail-free laser pulse with an energy of 0.2 nJ and a width of ~50 ps was output from a window. The emitted beam was collimated, passed through a perforated freeform mirror with a 2.3 mm diameter hole (equivalent to a 4.15 mm² TX aperture), and was incident on a 2D MEMS mirror with a large reflection area of 20 mm² at an angle of 30°. Scattered light from an object irradiated by the beam passed through an RX aperture of 13.17 mm², formed by the perforated freeform mirror in a coaxial optical path.

A timer to measure the ToF was designed based on 40 nm CMOS technology and consisted of four two-stage TDCs, each incorporating a differential amplifier, a comparator, and a low-voltage differential signaling (LVDS) output. The rising edge detection threshold voltages for the stop terminals of channels Ch-1, Ch-3, and Ch-4 were Th-1, Th-3, and Th-4, respectively, while that for the common start terminals as the reference was Th-ref. The TDC provided a temporal resolution of ~2 ps and a single-shot timing jitter of 3 ps. To mitigate erroneous signals such as ambient sunlight or dark count of the SiPM, the TDC provided multiple acquisition modes: single-photon mode (SP-mode), double-photon mode (DP-mode), and triple-photon mode (TP-mode). Furthermore, it captured both the first and second events by switching to Ch-2 from Ch-1. Real-time time-walk error correction and vector operations on the point cloud were performed by the external field-programmable gate array (FPGA).

Finally, the ultralight module, which integrated all of the aforementioned components—including voltage regulators, folded circuit boards, and a resin body—achieved a total weight of 9.97 g (inset image in Fig. 1e).

Q-switching semiconductor laser

Performance requirements for semiconductor lasers in SS-SPL are unique, as they necessitate the coexistence of pulse widths on the order of several tens of picoseconds, nJ-class pulse energies, tail-free pulse shaping, and low two-axis M² values. It is challenging for conventional semiconductor lasers to satisfy all of these simultaneously. Building upon an edge-emitting platform that inherently provides low M² values, we adopted a Q-switching structure to achieve the remaining three stringent requirements. Fifty-two pairs of alternately arranged gain and Q-switch regions are laterally electrically isolated within a 4-mm-long cavity, and both regions are actively driven, as illustrated in Fig. 2a, b.

Fig. 2: Schematic and characteristics of edge-emitting, actively Q-switched semiconductor laser.

a The alternately placed gain and Q-switch regions (52 pairs) in the cavity of 4 mm are electrically separated in the lateral direction. b A voltage (VLD) ranging from 4.45 to 5.04 V was applied to the gain region. Then, a pulsed current is input through a low-side n-type metal–oxide–semiconductor (NMOS) driver. Three states exist for the Q-switch anode: high impedance with parasitic capacitance, VDD (5 V) via a p-type metal–oxide–semiconductor (PMOS), or ground via an NMOS. Each timing is precisely controlled by a pulse signal generator embedded in the complementary metal-oxide-semiconductor (CMOS) driver with a least significant bit (LSB) equivalent to ~100 ps. c Streak camera (Hamamatsu Photonics C5680) images of the laser pulses at \(\Delta {t}_{{{\mathrm{lsd}}}}=3.17\) ns and \(\Delta {t}_{{{\mathrm{lsd}}}}=5.22\) ns. The rainbow scale represents the linear intensity. Notably, the time offsets observed in the images are induced by a jitter of the streak camera and do not reflect the actual timing of the pulsed laser oscillation. d Temporal waveform of the laser pulse at \(\Delta {t}_{{{\mathrm{lsd}}}}=5.22\) ns. e, f Peak power, average power, pulse width, wavelengths of pulse and tail, and ratio of tail depending on \(\Delta {t}_{{{\mathrm{lsd}}}}\). Pulse energy is calculated by dividing the average power by the repetition rate of 3.16 MHz. The peak power is determined from the laser pulse waveform. g Temporal waveform of seven laser pulses, demonstrating stable pulse generation. h Beam quality factor (M²) of a similar semiconductor laser not integrated into the module. Focal position shift is attributed to the large aspect ratio of the beam divergence angles. The distance from the collimator lens to an f = 40 mm lens for the M² measurement is ~250 mm. Therefore, the lens is positioned within the Rayleigh length for the vertical component and beyond the Rayleigh length for the horizontal component. The focal point along the horizontal axis is theoretically determined to be 47.6 mm, in agreement with the experimental results. For the module, the aspect ratio is adjusted using anamorphic optics.

The characteristics of the laser beam emitted outside the module, as a function of the pulse width controlled by the low-side driver NMOS (denoted as \(\Delta {t}_{{{\mathrm{lsd}}}}\)), are shown in Fig. 2c–f. The value of \(\Delta {t}_{{{\mathrm{lsd}}}}\) was estimated through simulation. In Fig. 2c, the color scale image represents the linear intensity measured using a streak camera (Hamamatsu Photonics C5680). For each condition, the timing of the PMOS and NMOS driving the anode of the Q-switch was individually optimized to minimize the laser pulse tail. The laser waveform at \(\Delta {t}_{{{\mathrm{lsd}}}}=5.22\) ns is shown in Fig. 2d. A peak power of 3.04 W, a pulse width of 48.7 ps (defined as full width at half maximum, FWHM), and a 289-ps delay between the main laser pulse and its tail were observed. The peak power, average power, and pulse width are illustrated in Fig. 2e. A clear lasing threshold was observed at \(\Delta {t}_{{{\mathrm{lsd}}}}=2.70\) ns; the peak power increased while the laser pulse width decreased with increasing \(\Delta {t}_{{{\mathrm{lsd}}}}\). The center wavelength of the laser pulse red-shifted rapidly as the peak power increased, then shifted more gradually when \(\Delta {t}_{{{\mathrm{lsd}}}}\) exceeded 4 ns, while the tail transitioned to a shorter wavelength, as shown in Fig. 2f. At \(\Delta {t}_{{{\mathrm{lsd}}}}=5.22\) ns, the wavelength difference between the laser pulse and the tail reached 5.02 nm (equivalent to 9.2 meV). The ratio of the tail to total pulse energy decreased with increasing \(\Delta {t}_{{{\mathrm{lsd}}}}\), as shown in Fig. 2f. Further details are shown in Supplementary Notes 1.
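As a rough consistency check on the quoted equivalence between 5.02 nm and 9.2 meV, the energy shift for a small wavelength difference can be estimated as below. The center wavelength is not stated in this passage; the value used here is an assumption inferred from the photon energy \(E_0 = 2.41\times 10^{-19}\) J quoted in the LiDAR subsection, and the first-order approximation \(\Delta E \approx E_0\,\Delta\lambda/\lambda\) is used.

$$\lambda \approx \frac{hc}{E_0}=\frac{\left(6.626\times 10^{-34}\ \mathrm{J\,s}\right)\left(2.998\times 10^{8}\ \mathrm{m\,s^{-1}}\right)}{2.41\times 10^{-19}\ \mathrm{J}}\approx 824\ \mathrm{nm},\qquad \Delta E\approx E_0\,\frac{\Delta \lambda }{\lambda }\approx 1.504\ \mathrm{eV}\times \frac{5.02\ \mathrm{nm}}{824\ \mathrm{nm}}\approx 9.2\ \mathrm{meV}$$

Under that assumed wavelength, the estimate reproduces the stated 9.2 meV.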

The red-shift and the 289 ps delay between the laser pulse and its tail were attributed to high carrier density in BGR35,36,37 and its associated recovery time36. The recovery time provided a temporal margin to turn off the active Q-switch, allowing the low-side driver to be deactivated and the Q-switch anode to return to the ground state. Therefore, the laser pulse peak power was maximized while simultaneously suppressing the pulse tail. The temporal waveform measured in single-shot mode using an oscilloscope with a 256 GSa s⁻¹ sampling rate is shown in Fig. 2g. The pulse jitter, including the trigger timing jitter, was calculated as 43 ps from the temporal waveform. An example of the beam quality (M²) of a similar semiconductor laser not integrated into the module is presented in Fig. 2h. The M² value was calculated using the D4σ method, based on the second moment of the intensity distribution. The M² values perpendicular (fast) and parallel (slow) to the semiconductor layers were 1.3 and 1.15, respectively. While BGR has been widely studied, its transient response has not been exploited in semiconductor laser operation; here, we demonstrate a device that utilizes this regime.

A comparison among different types of semiconductor lasers20,21,22,23 is provided in Supplementary Notes 3. Although some other semiconductor lasers emit higher-power pulses, none of the conventional semiconductor lasers satisfies all four characteristics required for SS-SPL: a pulse width on the order of several tens of picoseconds, nJ-class pulse energy, a tail-free pulse, and low two-axis M² values.

2D laser beam scanner

In beam steering for the SS-SPL, priority is given to capturing more photons, while also ensuring a wider FoV, uniform scanning across the entire FoV, and a compact size. Beam steering methods can be categorized into mechanically driven motor systems, MEMS mirrors, and fully solid-state nanophotonics with no moving parts38,39. In this study, we designed a low-Q resonant 2D MEMS mirror that operates in air, for three fundamental reasons. First, it consists of a single gold-coated mirror, resulting in minimal optical loss. Second, it scans the entire FoV within a short period by employing a relatively high slow-axis frequency. Third, owing to its mechanical inertia, it is more robust against electrical noise than other linear scanners. Owing to these advantages, the risk of missing the target in dynamic 3D modeling applications can be reduced, and uniform non-repetitive scanning of the entire FoV can be achieved with only stable frequency control when the exposure time is increased. The trajectory generated by the two-axis resonant MEMS mirror forms a Lissajous pattern (Fig. 1b), as described in the Methods section.
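How strongly a Lissajous pattern depends on the frequency pair can be illustrated with a short sketch. For two sinusoidal axes, the beam position repeats in time with period 1/gcd(f_fast, f_slow). The fast-axis values of 2075.0 Hz and 2075.5 Hz are those reported in the UD-DDS paragraph of this section; the slow-axis frequency of 593 Hz is purely a hypothetical placeholder chosen for illustration, since it is not given here.

```python
# Sketch: time period after which an ideal two-axis sinusoidal scan repeats.
# The slow-axis frequency used in the demo is a hypothetical value.
from fractions import Fraction
from math import gcd


def lissajous_repeat_period(f_fast_hz, f_slow_hz):
    """Return the repeat period in seconds as an exact Fraction.

    Frequencies are given as strings (or integers) so that they are
    converted to exact rational numbers.
    """
    a, b = Fraction(f_fast_hz), Fraction(f_slow_hz)
    # Bring both frequencies to a common denominator (least common multiple).
    den = a.denominator * b.denominator // gcd(a.denominator, b.denominator)
    na = a.numerator * (den // a.denominator)
    nb = b.numerator * (den // b.denominator)
    base_frequency = Fraction(gcd(na, nb), den)  # gcd of the two frequencies
    return 1 / base_frequency


if __name__ == "__main__":
    for f_fast in ("2075.0", "2075.5"):
        period = lissajous_repeat_period(f_fast, "593")
        print(f_fast, "Hz ->", float(period) * 1e3, "ms")
```

With the assumed slow axis, a 0.5 Hz change in the fast axis moves the repeat period between about 3.4 ms and 1 s, which is why frequency stability matters for the spatial filling of the field of view.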

We prioritized feedforward controllability at the desired frequency over energy efficiency in the design of the low-Q 2D MEMS mirror with a mirror area of 20 mm² and Pb-free piezoelectric actuators40. The frequency dependences of the optical scan full angle when driven independently in the vertical (fast) and horizontal (slow) directions are shown in Fig. 3a, b, respectively. The resonance amplitude decay plot is presented in Fig. 3c. The black lines represent fitted results, with Q values of 172 (fast) and 112 (slow).

Fig. 3: Characteristics of low Q-value micro-electro-mechanical system (MEMS) mirror and driver.

a Frequency response of optical scanning full angle for two samples. A 20 V peak-to-peak voltage is applied to lead-free piezoelectric actuators of the MEMS mirror to perform vertical scanning. b Optical scanning angle in the horizontal direction under the same conditions as the vertical direction. c Resonance amplitude decay scanned in vertical direction (blue), horizontal direction (red), and fitted curves (black). d Schematic for up-down direct digital synthesis (UD-DDS). e Sine wave generation by the UD-DDS. f Experimental frequency analysis of the UD-DDS. g Non-repetitive Lissajous trajectories. h Repetitive Lissajous trajectories. Both trajectories are captured at an exposure time of 198 ms. The field-of-view is 42° × 26° when a 28 V peak-to-peak voltage is applied.

We propose a UD-DDS system (Fig. 3d) that uses a high-precision reference clock \({f}_{{{\mathrm{sys}}}}\) from a crystal oscillator to generate the driving waveform for the MEMS mirror. Because changes in the digital-to-analog converter (DAC) input value were gradual and did not involve multi-bit transitions, as shown in Fig. 3e, the lookup table (LUT) capacity was significantly reduced, to a level comparable to a DDS with a computational unit41, by storing the DAC input threshold values (timing of one bit up or down) in the LUT instead of the waveform shape. Despite this minimal LUT storage, single-shot noise below 0.2 Hz in bandwidth was achieved at a reference clock \({f}_{{{\mathrm{sys}}}}\) of 20 MHz, as shown in Fig. 3f. Non-repetitive scanning at 2075.0 Hz and repetitive scanning at 2075.5 Hz along the vertical axis are shown in Fig. 3g, h, respectively. The optical scanning full angles reached 42° horizontally and 26° vertically. Since the piezoelectric actuators in the proposed MEMS mirror were shared between vertical and horizontal drives, the 2D scanning angles were smaller than those in the 1D cases, as shown in Fig. 3a, b. Although the trajectory changed significantly with only a 0.5 Hz frequency shift, the proposed UD-DDS system enabled stable feedforward control.
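The idea of storing one-bit up/down timings rather than the waveform itself can be illustrated with a toy model. This is only a sketch of the storage principle under simplifying assumptions, not the authors' UD-DDS implementation: the number of clock ticks per period, the DAC amplitude, and all function names are hypothetical, and frequency tuning against \({f}_{{{\mathrm{sys}}}}\) is not modeled.

```python
# Toy model: store the clock ticks at which the DAC code steps by +1 or -1,
# then rebuild the sine with an up/down counter. All parameters hypothetical.
import math


def build_updown_lut(ticks_per_period, amplitude_lsb):
    """Return [(tick, step)] with step = +1 or -1 DAC LSB at that tick."""
    events = []
    previous_code = 0  # sin(0) = 0
    for tick in range(1, ticks_per_period + 1):
        phase = 2.0 * math.pi * tick / ticks_per_period
        code = round(amplitude_lsb * math.sin(phase))
        step = code - previous_code
        if abs(step) > 1:
            raise ValueError("multi-bit transition: waveform too steep")
        if step != 0:
            events.append((tick, step))
        previous_code = code
    return events


def replay_updown_lut(events, ticks_per_period):
    """Rebuild the DAC code sequence using only the stored up/down events."""
    steps = dict(events)
    code, codes = 0, []
    for tick in range(1, ticks_per_period + 1):
        code += steps.get(tick, 0)
        codes.append(code)
    return codes


TICKS = 10_000   # hypothetical clock ticks per output period
AMPLITUDE = 127  # hypothetical signed 8-bit DAC amplitude

if __name__ == "__main__":
    lut = build_updown_lut(TICKS, AMPLITUDE)
    print(len(lut), "stored events instead of", TICKS, "waveform samples")
```

In this toy case the table holds one entry per LSB step (four times the amplitude per period), which is far fewer than one entry per clock tick as long as the waveform changes by at most one LSB per tick.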

The shock robustness42,43 of the MEMS fixed to the resin frame exceeded 2771 G during operation, as discussed in Supplementary Notes 2 and demonstrated in Supplementary Video 1. Since no hysteresis was observed in the resonant spectrum, as shown in Fig. 3a, b, the Lissajous trajectories were maintained after the impact.

Supplementary Notes 3 shows a comparison of large 2D MEMS mirrors. The FoM44,45 of the proposed MEMS mirror is more than twice as high as that of conventional ones24,25,26 operating at atmospheric pressure, and is more than half that of those operating in a vacuum46.

High functional time-to-digital converter

The TDC integrated into the SS-SPL module required precise measurement under a common-mode noise environment of up to several hundred millivolts and supported advanced functionalities such as time–walk error correction based on the number of incident photons. A two-stage TDC was implemented, comprising an inverter ring operating at ~1.1 GHz for coarse timing and the pulse-shrinking gated buffering ring (PSGBR) architecture30,47, which provided ~2 ps of pulse shrink per buffer stage, for fine timing. A relatively strong signal waveform measured at the node between the isolation transfer and the differential amplifier is shown in Fig. 4a. The signal pulse width was estimated to be 920 ps. The base-level voltage increased gradually over the first ~20 ns and stabilized thereafter, which suppressed erroneous measurements caused by stray laser light. The distributions of time differences between Ch-1 and Ch-3, Ch-1 and Ch-4, and Ch-3 and Ch-4, with standard deviations of \({\sigma }_{{{\mathrm{TDC}}}\left({{\mathrm{ch}}}3-{{\mathrm{ch}}}1\right)}=4.2\) ps, \({\sigma }_{{{\mathrm{TDC}}}\left({{\mathrm{ch}}}4-{{\mathrm{ch}}}1\right)}=4.7\) ps, and \({\sigma }_{{{\mathrm{TDC}}}\left({{\mathrm{ch}}}4-{{\mathrm{ch}}}3\right)}=4.8\) ps, respectively, are shown in Fig. 4b. Since each channel operated independently, the standard deviation for each channel was obtained from these time-difference distributions, which represented the convolution of the noise distribution from each channel. Accordingly, the estimated standard deviations were \({\sigma }_{{{\mathrm{TDC}}}\left({{\mathrm{ch}}}1\right)}=2.9\) ps, \({\sigma }_{{{\mathrm{TDC}}}\left({{\mathrm{ch}}}3\right)}=3.1\) ps, and \({\sigma }_{{{\mathrm{TDC}}}\left({{\mathrm{ch}}}4\right)}=3.6\) ps. The resulting single-shot noise levels were sufficiently low to allow estimation of the number of incident photons based on inter-channel correlations.
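The step from pairwise time-difference spreads to per-channel spreads can be sketched as follows, assuming independent channel noise so that variances add (for example, \(\sigma_{3-1}^{2}=\sigma_{1}^{2}+\sigma_{3}^{2}\)). The function name is hypothetical, and this is one plausible way to carry out the estimate rather than a confirmed description of the authors' procedure.

```python
# Sketch: solve sigma_ij^2 = sigma_i^2 + sigma_j^2 for three channels,
# assuming statistically independent channel noise.
import math


def per_channel_sigma(s31, s41, s43):
    """Return (sigma_ch1, sigma_ch3, sigma_ch4) from pairwise sigmas."""
    v31, v41, v43 = s31 ** 2, s41 ** 2, s43 ** 2
    v1 = (v31 + v41 - v43) / 2.0
    v3 = (v31 + v43 - v41) / 2.0
    v4 = (v41 + v43 - v31) / 2.0
    return math.sqrt(v1), math.sqrt(v3), math.sqrt(v4)


if __name__ == "__main__":
    # Pairwise standard deviations in picoseconds, as quoted in the text.
    s1, s3, s4 = per_channel_sigma(4.2, 4.7, 4.8)
    print(f"ch1 = {s1:.2f} ps, ch3 = {s3:.2f} ps, ch4 = {s4:.2f} ps")
```

Using the rounded pairwise values from the text, this gives results within about 0.1 ps of the reported 2.9, 3.1, and 3.6 ps; the small differences presumably stem from rounding of the inputs.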

Fig. 4: Characteristics of receiver (RX) and time-to-digital converter (TDC).

a Signal waveform measured at the node between the isolation transfer and the differential amplifier when white paper is used as the target. b Distributions of time differences between Ch-1 and Ch-3, Ch-1 and Ch-4, and Ch-3 and Ch-4. c Probability \(P\left(x\ge {k}\right)\) that at least \(k\) photons are detected, depending on the distance to the target for single-photon (SP-mode) (black), double-photon (DP-mode) (red), and triple-photon (TP-mode) (blue). d Detection timing of Ch-4 against the difference of detection timing between Ch-4 and Ch-3 with the distance of 6.486 m, before (red dots) and after (blue dots) time–walk error correction. e Distance precision of single-shot after the time–walk error correction. Open circles represent the case corresponding to the minimum photoelectron number for each mode. f Distance accuracy of the single-shot after the time–walk error correction. The values are calculated from 50,000 measurements at each distance.

Three detection modes, single-photon (SP-mode) (\(m=1\)), double-photon (DP-mode) (\(m=2\)), and triple-photon (TP-mode) (\(m=3\)), were incorporated, and their thresholds (Th-1, Th-3, and Th-4) were set to \(m\), \(m+1\), and \(m+2\) photoelectrons, respectively. When \(m\) or \(m+1\) photoelectrons were obtained as output from the SiPM, the number of photoelectrons could be determined because only Ch-1 or only Ch-1 and Ch-3 operated. When Ch-4 was measured, at least \(m+2\) photoelectrons were obtained as output from the SiPM. The probability \(P\left(x\ge k,\overline{{N}_{{RX}}}\,\right)\) of detecting at least \(k\) photons, depending on the distance to the target for SP-mode (circular), DP-mode (triangle), and TP-mode (rhombus), is illustrated in Fig. 4c. The dashed lines were calculated by the Poisson distribution as:

$$P\left(x\ge k\right)=1-\sum_{i=0}^{k-1}\frac{{\overline{N_{RX}}}^{\,i}\,e^{-\overline{N_{RX}}}}{i!}$$
(1)
$$\overline{N_{RX}}=\frac{\alpha }{D^{2}}$$
(2)

where \(\overline{{N}_{{RX}}}\) denotes the expectation of the detected number of photons, \(D\) is the distance to the target, and \(\alpha\) is a constant throughout the measurement. As the base level voltage was low at short distances, more photoelectrons were required for the signal to exceed the threshold. However, the probability of detection did not significantly decrease owing to the increased number of returned photons.
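Equations (1) and (2) can be evaluated numerically as in the sketch below. The constant \(\alpha\) is not given in this passage; the value used here is a hypothetical placeholder chosen so that the expected photon number equals 0.273 at 25 m, the worked value quoted in the LiDAR subsection, and the function names are illustrative only.

```python
# Sketch of Eqs. (1) and (2): Poisson probability of detecting at least k
# photons as a function of distance. ALPHA_M2 is a hypothetical constant.
import math


def prob_at_least(k, mean_photons):
    """P(x >= k) for a Poisson-distributed photon number."""
    below = sum(
        mean_photons ** i * math.exp(-mean_photons) / math.factorial(i)
        for i in range(k)
    )
    return 1.0 - below


def mean_photons_at(distance_m, alpha_m2):
    """Expected detected photon number, Eq. (2)."""
    return alpha_m2 / distance_m ** 2


ALPHA_M2 = 0.273 * 25.0 ** 2  # assumption: 0.273 photons expected at 25 m

if __name__ == "__main__":
    for d in (3.0, 10.0, 25.0):
        n = mean_photons_at(d, ALPHA_M2)
        probs = [prob_at_least(k, n) for k in (1, 2, 3)]
        print(d, "m:", [round(p, 3) for p in probs])
```

Because \(\overline{N_{RX}}\) falls with the square of the distance, the higher-threshold modes lose detection probability first as the target moves away.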

The time–walk error caused by the weak SiPM output introduced considerable time delays, thereby requiring corrections even in real-time usage. As reflected in the waveform in Fig. 4a, the time–walk error increased as the difference of detection timing between Ch-4 and Ch-3 increased. For example, red dots in Fig. 4d show the raw data of the detection timing of Ch-4 versus the difference of detection timing between Ch-4 and Ch-3 under the condition that the distance to the target was 6.486 m. By applying the estimated correction value induced by the quantized photon number, the raw data were corrected, as shown by blue dots.

The distance precision of a single shot after the time–walk error correction is shown in Fig. 4e. The standard deviations were ~7 mm (47 ps) at a distance of 3 m or less, where signals of at least \(m+2\) photoelectrons were dominant, as shown in Fig. 4c. At distances longer than 3 m, the standard deviations increased with distance owing to the reduction in the number of photoelectrons to \(m\) or \(m+1\). The values approached the lines with open circles, each of which represented the standard deviation extracted for \(m\) photoelectrons, where only Ch-1 operated. The distance precision at 25 m is discussed in the following LiDAR subsection. The distance accuracy of the single shot after the time–walk error correction was within 5 mm, as shown in Fig. 4f.

The performance of the built-in TDC is compared with that of conventional ones30,31 in Supplementary Notes 3. The obtained results based on 40 nm technology fell between those of the introduced 180 nm and 28 nm technologies in terms of both precision and FoM. The commonly used FoM did not include precision, but in LiDAR applications, designing both with a good balance is crucial.

Light detection and ranging (LiDAR)

Geometric accuracy was verified using a stepped target as the ground truth (Fig. 5). The divergence angles of the laser beam were 0.44 mrad and 1.2 mrad in the fast and slow directions, respectively. As shown in Fig. 5a, the fast–slow axes of the laser beam were oriented 60° counterclockwise from the xy axes. A point cloud that clearly reproduced the ground truth was obtained at a distance of 3 m. At a distance of 25 m, the point cloud shape became smoother due to the beam divergence; however, the outline of the ground truth, derived from the distribution of the number of points counted per anisotropic voxel, showed good agreement. The standard deviations of the distances at 3 m and 10 m were nearly the same as the results shown in Fig. 4e. At 25 m, they were 47.7 mm (x) and 45.8 mm (y), which lie on the extended trend line of the SP-mode results for \(m=1\) from 5 to 10 m shown in Fig. 4e.

Fig. 5: Validation of geometric accuracy in 3D point clouds.

Ground truth image (a) and 3D point clouds at distances of 3 m (b), 10 m (c), and 25 m (d). White lines on the point clouds indicate the outline of the ground truth. For each anisotropic voxel, a representative point is assigned at the center of the square cross-section (x, y), with z equal to the mean distance of the enclosed points. The number of points in an anisotropic voxel at the edge of the ground truth is 50% of that in the interior, and point clouds with values less than 30%, indicating regions outside the ground truth, are eliminated. The cross-section sizes are 2 mm, 4 mm, and 8 mm for distances of 3 m, 10 m, and 25 m, respectively. The target occupies 10.2%, 0.92%, and 0.15% of the field of view for the three cases. e Dimensions of our original ground truth. f–k Means (black dots), standard deviations (gray lines), and the ground truth (red lines) along the cross-sections in the x and y axes, as shown in the red dashed boxes in (a).
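The voxel-based outline extraction described in the caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' processing code: the function names are hypothetical, and the interior reference count is passed in explicitly because the caption does not say how it is obtained.

```python
# Sketch: bin points into square (x, y) cells, keep the cell center and the
# mean z as the representative point, and drop sparsely populated cells.
import math
from collections import defaultdict


def voxelize(points, cell_m):
    """Map (ix, iy) -> (x_center, y_center, z_mean, count)."""
    bins = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / cell_m), math.floor(y / cell_m))
        bins[key].append(z)
    return {
        (ix, iy): ((ix + 0.5) * cell_m, (iy + 0.5) * cell_m,
                   sum(zs) / len(zs), len(zs))
        for (ix, iy), zs in bins.items()
    }


def keep_dense(voxels, interior_count, min_fraction=0.3):
    """Discard voxels holding fewer than min_fraction of the interior count."""
    threshold = min_fraction * interior_count
    return {k: v for k, v in voxels.items() if v[3] >= threshold}


if __name__ == "__main__":
    # Ten points on the target and two stray points, 2 mm cells (3 m case).
    target = [(0.001, 0.001, 3.0 + 0.002 * (i % 2)) for i in range(10)]
    stray = [(0.011, 0.011, 3.5), (0.011, 0.011, 3.6)]
    voxels = voxelize(target + stray, cell_m=0.002)
    print(keep_dense(voxels, interior_count=10))
```

With an interior count of 10, the stray cell holds only 20% of that count and is removed, while the target cell is kept with its mean distance.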

The characteristics of SS-SPL are summarized in Table 1. The laser pulse output from the module \({{PE}}_{{{\mathrm{TX}}}}\) was 0.2 nJ, which corresponds to the transmit laser energy per point for single-shot measurements. The TX aperture \({A}_{{{\mathrm{TX}}}}\) was 1.25 mm², which was smaller than the optical component limit of 4.15 mm². This was due to the insufficient expansion of the slow beam width by the anamorphic optical components, resulting in the increased divergence angle. The RX aperture \({A}_{{RX}}\) was 13.17 mm², realized by the relatively large MEMS mirror and the coaxial optical system. The expected number of photons to be detected is expressed as:

$$\overline{{N}_{{RX}}}=\frac{{{PE}}_{{TX}}\cdot \rho \cdot {\eta }_{{RX}}}{{E}_{0}}\cdot \frac{{A}_{{RX}}}{\pi \cdot {D}^{2}}$$
(3)

where \({E}_{0}\) is the photon energy, \(\rho\) is the reflectance of the target, and \({\eta }_{{RX}}\) is the total optical efficiency of RX, including the PDE of the SiPM. The probability of detecting a wrong signal induced by the dark count of the SiPM is defined by

$${P}_{{WS}}=1-{e}^{-{{{R}}}_{{DC}}\cdot \frac{2{{D}}}{{{c}}}}$$
(4)

where \({R}_{{DC}}\) is the dark count rate of the SiPM. For example, the expected number of photons \(\overline{{N}_{{RX}}}\) is 0.273 with the parameters \({E}_{0}=2.41\times {10}^{-19}\) J, \({{PE}}_{{{\mathrm{TX}}}}=2\times {10}^{-10}\) J, \(\rho=0.7\), \({\eta }_{{RX}}=0.07\), \({A}_{{RX}}=13.17\) mm², \(D=25\) m, and \({R}_{{DC}}=7\times {10}^{5}\) cps. The probability of detecting at least one photon, excluding wrong signals, is

$$\left(1-{P}_{{WS}}\right)\cdot P\left(x\ge 1,\,0.273\right)={e}^{-{R}_{{DC}}\cdot \frac{2D}{c}}\cdot \left(1-{e}^{-\overline{{N}_{{RX}}}}\right)=0.213$$
(5)

Table 1 Comparison of solid-state single-photon LiDAR (SS-SPL) and frequency modulated continuous wave (FMCW) LiDAR

The result shows good agreement with those illustrated in Fig. 4c. The SS-SPL operates at nearly the lower limit of transmit laser energy required for distance measurement with single photons.
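The worked example based on Eqs. (3) to (5) can be reproduced numerically with the parameters quoted in the text. The sketch below is a plain evaluation of those formulas; the function names are hypothetical, and the RX aperture of 13.17 mm² is taken from the description of the module.

```python
# Sketch: evaluate Eqs. (3)-(5) with the parameter values quoted in the text.
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def expected_photons(pe_tx_j, rho, eta_rx, e0_j, a_rx_m2, d_m):
    """Eq. (3): expected number of detected photons per shot."""
    return (pe_tx_j * rho * eta_rx / e0_j) * a_rx_m2 / (math.pi * d_m ** 2)


def wrong_signal_prob(r_dc_cps, d_m):
    """Eq. (4): probability of a dark count within the round-trip time."""
    return 1.0 - math.exp(-r_dc_cps * 2.0 * d_m / C_M_PER_S)


def detection_prob(n_rx, p_ws):
    """Eq. (5): at least one photon detected and no wrong signal."""
    return (1.0 - p_ws) * (1.0 - math.exp(-n_rx))


if __name__ == "__main__":
    n_rx = expected_photons(2e-10, 0.7, 0.07, 2.41e-19, 13.17e-6, 25.0)
    p_ws = wrong_signal_prob(7e5, 25.0)
    print(round(n_rx, 3), round(p_ws, 3), round(detection_prob(n_rx, p_ws), 3))
```

The expected photon number evaluates to about 0.273, and the detection probability to about 0.21, in line with Eq. (5) to within rounding of the intermediate values.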

Table 1 also includes the pioneering frequency modulated continuous wave (FMCW) LiDAR based on the silicon photonics platform48. A transmit laser energy per point of 200 nJ and a distance precision of 3.1 mm were demonstrated. Notably, the authors of that work argued that the scalability of point cloud density in FMCW LiDAR can be achieved by increasing the output power and pixel count, and further suggested that ~200 μm precision is feasible with the use of 50-GHz silicon photonic modulators.

The advantage of SS-SPL over the FMCW LiDAR is that the transmit laser energy per point is lower by two to three orders of magnitude (a factor of 1/1000 to 1/100), so the photon budget can be allocated flexibly between point cloud density and precision. Additionally, since TDCs and pulsed lasers have achieved high temporal resolution, improving the RX’s temporal resolution directly enhances distance precision without spending additional photons on it.

From an alternative perspective, traditional direct ToF systems require averaging processes, such as histogramming, to improve distance precision, which consumes optical output. In contrast, FMCW LiDAR benefits from enhancements in laser linewidth and chirp bandwidth characteristics, while SS-SPL benefits from improvements in device temporal precision, both of which directly contribute to enhanced distance precision.

Miniaturization of LiDAR systems mounted on UAVs has been progressing16. As shown in Table 2, SS-SPL enables significant weight reduction compared to these LiDAR systems, even when the FPGA board is included.

Table 2 Comparison of solid-state single-photon LiDAR (SS-SPL) and conventional LiDARs

3D modeling

We also expect that substituting super-dense point clouds for the initial processing in 3D modeling will reduce the number of photographs required for fusion, making it easier to acquire the data necessary for 3D modeling in the future. Instead of performing advanced fusion of multiple photographs in post-processing, we propose a fusion system based on super-dense point clouds with high geometric fidelity and a single-view full-color photograph, and we visualize the expressive power of the resulting colored point clouds. Table 3 summarizes the conversion parameters used for the fusion system in this study. The point clouds acquired by non-repetitive scanning with a 20-s exposure were post-processed by filtering and averaging within 0.2 mrad grids, as shown in Fig. 6. The maximum number of points per grid before the filtering was ~3 (at the center of the FoV) and ~7 (average of the entire FoV). The camera’s FoV encompassed that of the SS-SPL, enabling full-color data acquisition at approximately 0.1 mrad grid intervals. Consequently, roughly one-fourth of the full-color pixel data within the SS-SPL’s FoV after distortion correction using Zhang’s algorithm49 were assigned to the unique point clouds. All processes were performed in real time.
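The grid-based post-processing can be sketched as follows. This is a minimal illustration, assuming points are given as (azimuth, elevation, range) tuples; the median-based outlier rejection and its tolerance `max_dev_m` are hypothetical stand-ins, since the actual filter is not specified here.

```python
import math
from collections import defaultdict
from statistics import mean, median_low

GRID_RAD = 0.2e-3  # angular grid pitch from the text (0.2 mrad)

def grid_average(points, grid_rad=GRID_RAD, max_dev_m=0.05):
    """Average the range of the points falling into each angular grid cell.

    points: iterable of (azimuth_rad, elevation_rad, range_m).
    max_dev_m: hypothetical tolerance around the cell median (not from the paper).
    Returns {(ix, iy): averaged range in metres}.
    """
    cells = defaultdict(list)
    for az, el, rng in points:
        key = (math.floor(az / grid_rad), math.floor(el / grid_rad))
        cells[key].append(rng)
    averaged = {}
    for key, ranges in cells.items():
        med = median_low(ranges)  # always an actual sample, so `kept` is never empty
        kept = [r for r in ranges if abs(r - med) <= max_dev_m]
        averaged[key] = mean(kept)
    return averaged

demo = [
    (0.00001, 0.00001, 10.00),
    (0.00005, 0.00015, 10.02),
    (0.00012, 0.00003, 12.00),  # outlier within the same cell
    (0.00025, 0.00001, 5.00),   # falls into the neighbouring cell
]
result = grid_average(demo)
```

Each occupied cell yields one averaged point, which is what makes the reconstructed cloud a set of unique points on a regular angular grid.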

Fig. 6: Overview for reconstruction and fusion.

Table 3 Parameter conversion for fusion

A meticulously stitched 3D model of a restaurant floor (Fig. 7a–e) obtained by the fusion system demonstrated geometric accuracy in global coordinates, including in scenes with little texture information, and sufficient density to project fine textures (Fig. 7e). Cherry blossom trees captured by the fusion system from a single viewpoint under several conditions are shown in Fig. 7f–k, Supplementary Notes 5, and Supplementary Video 2. The front and side views were aligned with and perpendicular to the shooting direction, respectively. The petals and branches behind were captured through the gaps between the petals, a typical scene demonstrating the advantages of the coaxial optical system in the SS-SPL. A dynamic view of a basketball shot acquired by the SS-SPL without fusion is shown in Fig. 7l and Supplementary Video 3 to illustrate another point of interest: how the dense point clouds model dynamic scenes. The point clouds continuously acquired along the Lissajous trajectory were packed into frames at 60 Hz to reproduce the motion of the shooter and the trajectory of the ball.

Fig. 7: 3D modeling by the fusion system and the solid-state single-photon light detection and ranging (SS-SPL).

a–c Stitched and colored point clouds of the building interior. Point clouds were captured indoors by the fusion system from different viewpoints and locations, with 114 shots stitched together to create the 3D model. Interior view of the restaurant (d) and fine texts on the white wall (e), including the ground truth on the left side and the point clouds on the right side. f–k Point clouds acquired outdoors by the fusion system under environmental illumination of ~3000 lx or 30,000 lx in single-photon (SP-mode), double-photon (DP-mode), or triple-photon (TP-mode). From left to right: the ground truth, the front view of the point cloud (same as the shooting direction), and the side view of the point cloud (perpendicular to the shooting direction). l Basketball shot photo sequence extracted from the side view of a point cloud video captured by the SS-SPL from behind a shooter.

Discussion

For SPL to be effective, the detection probability must approach unity at an acceptable distance, with sufficient precision and accuracy for the intended application. In this work, we introduced the 10 g SS-SPL module capable of single-shot operation via single-photon detection. The prospects for further enhancing the performance of each device and SS-SPL, and the advantages and challenges of the resulting super-dense point cloud in the context of 3D modeling are discussed in this section.

To increase laser pulse energy of the Q-switching semiconductor laser while addressing nonlinear gain saturation, an approach involves expanding the beam cross-sectional area by adding a tapered amplifier section23. However, exceeding the pulse energy of broad-stripe EELs or PCSELs is difficult, making the approach unsuitable for conventional LiDAR applications. The method of current-induced BGR is not limited to narrow-stripe EELs and can be applied to any semiconductor laser to increase pulse energy and reduce pulse tail.

Despite the MEMS mirror operating at atmospheric pressure, a high FoM and strong shock robustness suggest that applications are not limited to SS-SPL and may extend to a wide range of fields. Although the large 2D MEMS mirror was designed to achieve both TX and RX apertures using the perforated freeform mirror, it enables a large TX aperture in biaxial optical systems and wide FoV, while maintaining small laser divergence angles for long-range applications.

The experimental demonstration of the PSGBR architecture of the TDC operating reliably under noisy conditions and achieving the single-shot timing jitter of 3 ps alleviates concerns regarding the limitations associated with the use of fine-TDCs. Even if drifts arise from longer measurement times, they can be reduced either directly or through calibration by using a temperature-compensated crystal oscillator (TCXO)50.

To increase the distance precision, improving the signal-to-noise ratio at the differential amplifier input and enhancing its response are crucial. The pulse current corresponding to a single photoelectron is approximately several μA, which is not substantially low compared with previous reports51,52, indicating that timing jitter can be minimized without increasing power consumption. Therefore, if an ideal short pulse laser and low jitter rise timing detection are achieved, the standard deviation of the distance can be reduced to a few millimeters or less.

We discuss extending the measurement range from two perspectives: indoor and outdoor. In the proposed coaxial optical system, further improvement of the RX aperture is challenging; therefore, increasing the laser pulse energy is the dominant factor for indoor use. An approximately five times greater measurement range can be achieved owing to the existence of high-power light sources such as PCSELs. In Supplementary Notes 4, we theoretically analyzed the influence of sunlight and demonstrated good agreement with the experimental results53,54,55. A 10 nm bandwidth optical filter was used in the experiment, taking into account the spectral width of 2.56 nm of the present laser. However, by narrowing the filter bandwidth to, for example, 0.2 nm, outdoor ranging beyond 100 m becomes feasible. Notably, Supplementary Notes 3 presents a report that DFB-type semiconductor lasers23 can achieve a sufficiently narrow spectral width. The maximum pulse energy achievable with state-of-the-art chip-scale high-peak-power semiconductor/solid-state vertically integrated lasers56 may determine the ultimate measurement range limit of SS-SPL. Conversely, applying SS-SPL to ultra-long-range measurements is not feasible57, owing both to limitations in achievable laser power and to the influence of sunlight.

The 3D modeling results of this study showed that SS-SPL enables the generation of highly accurate point clouds from a single viewpoint. Furthermore, the projection of color information onto these point clouds substantially improved their visual fidelity. By mounting SS-SPL on the camera, precise three-dimensional positional information was assigned to the images acquired by the camera. In the context of data fusion, this approach is expected to reduce the number of images required for preprocessing methods such as SfM, decrease the amount of training data required for machine learning, and shorten the processing time for camera pose estimation. Furthermore, the method improves the accuracy of estimated camera positions even for small and densely clustered objects, such as cherry blossom petals. Conversely, excessively high point cloud density degraded the convergence accuracy of colored iterative closest point (CICP). In the future, SS-SPL is expected to enable experimental investigation of the balance between point cloud density and geometric fidelity. While SS-SPL enables the generation of highly accurate point clouds even for structures with minimal texture, which are challenging for photo-based 3D modeling, point cloud generation becomes significantly more difficult for objects with low reflectance or specular surfaces. Therefore, the fusion of SS-SPL with photogrammetry, NeRF, or 3DGS can serve diverse requirements.

Finally, demonstrating super-dense point clouds using the ultralight module opens new avenues for weight and size-constrained platforms, such as small to medium-sized drones. This will enable a range of applications, including virtual production, game design, sports, construction, logistics, and exploration or inspection in challenging environments58,59,60.

Methods

Key criteria for pulsed semiconductor laser design

As in optical discs and optical communication technologies, semiconductor lasers are considered optimal compact light sources. Since class-1 laser safety at near-infrared wavelengths is evaluated at the same pulse energy for irradiation times of 10 ps to 5 μs, shortening the pulse width within this range reduces timing jitter without lowering the permissible pulse energy. For two-axis laser beam scanning, achieving a low \(M^{2}\) (beam quality) factor is essential to improve angular resolution.

The Q-switching technique is widely employed to generate short-pulse laser beams from semiconductor lasers. Notable examples include narrow-stripe edge-emitting lasers (EELs) with an 18.6 ps pulse width61, bowtie lasers with a 15 ps pulse width and 100 pJ pulse energy62, bowtie lasers with a 45 ps pulse width and 39 pJ pulse energy63, broad-stripe EELs comprising asymmetric thick cladding layers with 80 ps pulse widths and 35 W peak power21,64, and photonic-crystal surface-emitting lasers (PCSELs) with a pulse width of less than 30 ps and 200 W class peak power65. In conventional EELs, the output power increases with the propagation mode area, whereas \(M^{2}\) increases owing to the generation of higher-order transverse modes. In contrast, PCSELs offer an increased beam cross-section while maintaining a low \(M^{2}\), which is a significant breakthrough.

For accurate ranging using single-photon detection in a single shot, both the absence of laser pulse tail and a short pulse width are necessary. However, eliminating laser pulse tails is challenging owing to the longer current injection time compared with the laser pulse width. We propose a laser pulse tail suppression technique based on the following mechanism: First, suppressing amplified spontaneous emission (ASE) achieves high carrier density, following which BGR is induced before Q-switching laser pulse generation. Second, BGR disappears after Q-switching laser pulse generation, reducing gain in conjunction with the transient response of the disappearance. Third, a dedicated circuit rapidly changes the Q-switch state to a low level during the transient response. This method was applied to the narrow-stripe EEL with reliable process reproducibility.

Structure of the Q-switching semiconductor laser and operation

The schematic cross-section of the semiconductor laser along the 4 mm cavity is shown in Fig. 2a. The front section comprises 51 pairs of 21 μm Q-switch and 42 μm gain regions, and the rear end includes 200 μm Q-switch and 100 μm gain regions. These p-doped gain and Q-switch regions above the active layer, comprising an n-type AlGaAs single quantum well (SQW), were electrically isolated by 4 μm n-doped grooves. The composition ratio of the active layer was adjusted to achieve a wavelength of approximately 820–830 nm, where the photon detection efficiency of the SiPM is relatively high. Transverse modes were weakly confined by an 8 μm ridge stripe created via dry etching. All gain regions were connected to a single gain-region anode, and all Q-switch regions to a single Q-switch-region anode. The Q-switch region anode was driven by independently controlled PMOS and NMOS, and the cathode was low-side-driven by NMOS. They were driven precisely by a pulse signal generator embedded in the CMOS driver, which controlled the timing with an LSB of approximately 100 ps. The gain region anode was adjusted between 4.45 and 5.04 V to keep the cathode voltage within the CMOS drive voltage of 5 V. This applied voltage is significantly lower than that of conventional Q-switching semiconductor lasers, which is crucial for minimizing common-mode noise and reducing power consumption in the compact module. During the current injection into the gain region before Q-switching, the NMOS for the Q-switch remains on. The drain voltage caused by the on-resistance of the low-side driver negatively biases the Q-switch region. In this state, the high absorption coefficient of the Q-switch region disrupts the propagation mode. Spontaneous emission from the gain region is rapidly absorbed either into the Q-switch region or the substrate via leakage. Therefore, ASE is more effectively suppressed in shorter gain regions due to the rapid absorption and limited amplification of spontaneous emission.

Bandgap renormalization in a quantum well

The red shift induced by BGR was observed in GaAs/AlGaAs quantum wells via photoluminescence35,36 and current injection37. Specifically, Bobrysheva et al. investigated GaAs/AlGaAs multiple quantum wells (MQW) employing photoluminescence spectra at high-level picosecond excitation. They observed that the e–h plasma-stimulated luminescence line red-shifted due to BGR, with a recovery time of 300 ps as the plasma density decreased36. Comparing the previous results with those of the present study indicated that BGR plays a significant role in laser pulse formation in semiconductor lasers.

Silicon photomultiplier (SiPM)

MPPC® S15369-1325PS, manufactured by Hamamatsu Photonics, was used. It consisted of 2120 SPAD elements. We designed a dedicated CMOS driver that was controlled by the FPGA.

Lissajous trajectory of resonant 2D micro-electro-mechanical system mirror

The optical scanning angle of the 2D MEMS mirror, featuring a fast beam on the inner circumference and a slow beam on the outer circumference as shown in Fig. 1b, can be expressed as:

$$\mathbf{R}_{\mathbf{x}}\left(\theta\right)=\left(\begin{array}{ccc}1 & 0 & 0\\ 0 & \cos\theta & -\sin\theta\\ 0 & \sin\theta & \cos\theta\end{array}\right)$$
(6)
$$\mathbf{R}_{\mathbf{y}}\left(\varphi\right)=\left(\begin{array}{ccc}\cos\varphi & 0 & -\sin\varphi\\ 0 & 1 & 0\\ \sin\varphi & 0 & \cos\varphi\end{array}\right)$$
(7)
$$\mathbf{n}=\mathbf{R}_{\mathbf{y}}\left(\varphi\right)\mathbf{R}_{\mathbf{x}}\left(\theta\right)\mathbf{n}_{\mathbf{0}}$$
(8)
$$\mathbf{r}=\mathbf{s}-2\left(\mathbf{s}\cdot\mathbf{n}\right)\mathbf{n}$$
(9)

where \(\mathbf{R}_{\mathbf{x}}\left(\theta\right)\) and \(\mathbf{R}_{\mathbf{y}}\left(\varphi\right)\) are the rotation matrices, \(\mathbf{n}_{\mathbf{0}}\) represents the normal of the fixed MEMS mirror, \(\mathbf{n}\) represents the normal during rotation, \(\mathbf{s}\) denotes the direction of laser incidence, and \(\mathbf{r}\) denotes the direction of laser reflection. The trajectory is referred to as a Lissajous curve.
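Equations (6) to (9) can be evaluated directly, as in the minimal sketch below. The rest normal `n0`, the incidence direction `s`, and the sinusoidal drive with its frequencies and amplitudes in `lissajous_direction` are illustrative assumptions, not the parameters of the actual mirror; the sign convention of the y-rotation follows Eq. (7) as written.

```python
import math

def rot_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return ((1.0, 0.0, 0.0), (0.0, c, -s), (0.0, s, c))   # Eq. (6)

def rot_y(phi):
    c, s = math.cos(phi), math.sin(phi)
    return ((c, 0.0, -s), (0.0, 1.0, 0.0), (s, 0.0, c))   # Eq. (7), sign as in the text

def mat_vec(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

def reflect(s, n):
    d = sum(a * b for a, b in zip(s, n))
    return tuple(a - 2.0 * d * b for a, b in zip(s, n))   # Eq. (9)

def beam_direction(theta, phi, n0=(0.0, 0.0, 1.0), s=(0.0, 0.0, -1.0)):
    """Reflected direction for mechanical tilts theta, phi (rad).
    n0 and s are assumed values (normal incidence on the mirror at rest)."""
    n = mat_vec(rot_y(phi), mat_vec(rot_x(theta), n0))    # Eq. (8)
    return reflect(s, n)

def lissajous_direction(t, fx=1000.0, fy=1300.0, ax=0.05, ay=0.05):
    """Hypothetical sinusoidal drive of both axes (frequencies in Hz, amplitudes in rad)."""
    theta = ax * math.sin(2.0 * math.pi * fx * t)
    phi = ay * math.sin(2.0 * math.pi * fy * t)
    return beam_direction(theta, phi)

r = beam_direction(0.05, 0.0)
```

For a pure x-axis tilt of 0.05 rad under these assumptions, the reflected beam is deflected by 0.1 rad, which reflects the usual factor of two between mechanical and optical angles.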

Up-down direct digital synthesis in micro-electro-mechanical system mirror driver

Because Lissajous trajectories vary significantly with frequency shifts of even 1 Hz, frequency control must be maintained in steps of 0.2 Hz or less. Conventional MEMS mirrors with high FoM typically exhibit Q values exceeding several thousand27,28,29. The resonant frequency of MEMS is temperature-dependent; hence, a phase-locked loop (PLL) is typically used to align the driving frequency with resonance. However, for uniform spatial scanning employing Lissajous trajectories, maintaining a fixed frequency ratio between horizontal and vertical scans is preferable. Considering the recent advances in piezoelectric devices47, we designed a low-Q MEMS mirror, prioritizing feed-forward controllability at the target frequency over energy efficiency.

When stabilizing and fixing the frequency of the low-Q MEMS mirror, direct digital synthesis (DDS)66 is preferred over PLLs used in the high-Q MEMS mirror. However, typical DDS implementations require larger LUTs to store waveform information for achieving precise frequency control. To address this issue, incorporating a computational unit into the DDS and storing coefficients in the LUT instead of waveform information significantly reduced the size of the read-only memory (ROM)41. We propose an up-down-DDS (UD-DDS) system equipped with a high-precision reference clock \({f}_{{{\mathrm{sys}}}}\) of 20 MHz from a crystal oscillator as shown in Fig. 3d. To achieve the desired step, the number of counter bits, denoted \({N}_{{{{\rm{c}}}}}\), is set to 27. An output frequency is defined as:

$${f}_{{{\mathrm{out}}}}={f}_{{{\mathrm{sys}}}}\cdot {{{\mathrm{Count}}}}_{{{\mathrm{inc}}}}/{2}^{{N}_{{{\rm{c}}}}}$$
(10)

where \({{{\mathrm{Count}}}}_{{{\mathrm{inc}}}}\) determines the maximum frequency. Since the MEMS mirror resonates only near specific frequencies, harmonic components of the MEMS driving wave exert minimal influence. Therefore, the number of bits required for the DAC (denoted as \({N}_{{{{\rm{d}}}}}\) in Fig. 3d) can be relatively low and is estimated to be 9 bits. In standard DDS, storing \({N}_{{{{\rm{d}}}}}\)-bit DAC values for each counter value (\({2}^{{{{{\rm{N}}}}}_{{{{\rm{c}}}}}}\) addresses) requires 1.2 billion bits (\({N}_{{{{\rm{d}}}}}\times {2}^{{{{{\rm{N}}}}}_{{{{\rm{c}}}}}}=9\times {2}^{27}\)). However, when DAC inputs change gradually rather than by multiple bits at once, as shown in Fig. 3e, the LUT capacity can be significantly reduced to levels comparable to the DDS equipped with a computational unit by storing the DAC threshold values (timing of bit transitions) in the LUT instead of the waveform shape. The architecture in which the up or down DAC input values are generated when the counter value crosses the threshold from the LUT is denoted as the If block in Fig. 3d. The number of bits stored in the LUT of the UD-DDS is approximately 14 thousand (\({N}_{{{{\rm{c}}}}}\times {2}^{{{{{\rm{N}}}}}_{{{{\rm{d}}}}}}=27\times {2}^{9}=13{,}824\)), storing the threshold values of \({N}_{{{{\rm{c}}}}}\) bits for each DAC value (\({2}^{{{{{\rm{N}}}}}_{{{{\rm{d}}}}}}\) addresses). In recent years, digital components, which strongly benefit from advances in semiconductor process miniaturization, and analog components, which benefit from conventional stable processes, have often been implemented on separate chips. In these cases, the UD-DDS reduces the baud rate between chips to 2-bit/9-bit.
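The frequency resolution of Eq. (10) and the two LUT sizes can be verified with a few lines of arithmetic. The sketch below uses the stated values of the reference clock, counter width, and DAC width; the helper names and the 1234.5 Hz target frequency are hypothetical and serve only as an example.

```python
F_SYS_HZ = 20_000_000  # reference clock from the text
N_C = 27               # phase-counter bits
N_D = 9                # DAC bits

def f_out(count_inc: int) -> float:
    """Output frequency of Eq. (10) for a given counter increment."""
    return F_SYS_HZ * count_inc / 2**N_C

def count_inc_for(f_target_hz: float) -> int:
    """Nearest integer increment that approximates a target frequency."""
    return round(f_target_hz * 2**N_C / F_SYS_HZ)

freq_step_hz = f_out(1)             # about 0.149 Hz, below the 0.2 Hz requirement
standard_lut_bits = N_D * 2**N_C    # one N_D-bit sample per counter value
ud_dds_lut_bits = N_C * 2**N_D      # one N_C-bit threshold per DAC value

inc = count_inc_for(1234.5)         # hypothetical target frequency
print(freq_step_hz, standard_lut_bits, ud_dds_lut_bits, f_out(inc))
```

Under these values, storing thresholds instead of samples shrinks the table from 1,207,959,552 bits to 13,824 bits, a reduction of more than four orders of magnitude.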

Key criteria for time-to-digital converter architecture selection

Timers in LiDAR are classified into time-correlated single-photon counting and rise-time measurement. The latter, which directly measures ToF, is used for SPL because it can measure in a single shot. In a coaxial optical system, a high-precision TDC with a relatively large core unit area can be introduced because measurements are processed sequentially for each point. Several types of fine-TDC have been reported, such as the vernier buffer ring31,67, pulse shrinking buffer ring68, and PSGBR30,47. The fine-TDC was combined with the coarse-TDC to create the two-stage TDC, thereby expanding the measurable range30. To further support long-range operation, the stable clock from a TCXO was introduced directly or via the method of ref. 50. In addition, TDCs designed for laser range-finder applications have been verified50,51,52. The temporal resolution of the PSGBR is determined by the difference in gate sizes of closely arranged paired inverters within the buffer. This architecture offers superior tolerance to manufacturing variations and electrical noise. In developing the two-stage TDC embedded in the SS-SPL based on the PSGBR, the layout of the 40 nm CMOS process was carefully designed to ensure the TDC usability in environments with several hundred milliwatts of common-mode noise.

The operation of the two-stage TDC with PSGBR is as follows. The coarse-TDC is initiated by inputting the start signal and is stopped by detecting the first counting-up of the inverter ring immediately following the stop signal input. An initial pulse width, which is a time difference between the stop signal input and the coarse-TDC stop, is determined by counting the number of gates traversed before the pulse disappears. The time difference between the start and stop signals is obtained by subtracting the estimated period of the fine-TDC from that of the coarse-TDC.
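The combination of the two stages described above can be sketched as follows. The coarse period of 900 ps and the fine resolution of 5 ps per gate are hypothetical placeholders chosen for readability, not the values of the actual TDC.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def two_stage_tof_ps(coarse_count: int, fine_gate_count: int,
                     coarse_period_ps: int = 900, fine_lsb_ps: int = 5) -> int:
    """Start-to-stop interval of a two-stage TDC, in picoseconds.

    coarse part: start signal -> first ring count-up after the stop signal
    fine part:   stop signal  -> coarse stop (the initial pulse width),
                 measured as the number of gates traversed before the pulse vanishes
    Both default resolutions are assumed values.
    """
    coarse_ps = coarse_count * coarse_period_ps
    fine_ps = fine_gate_count * fine_lsb_ps
    return coarse_ps - fine_ps

def distance_m(tof_ps: float) -> float:
    """Convert a round-trip time of flight to a one-way distance."""
    return 0.5 * C * tof_ps * 1e-12

tof = two_stage_tof_ps(coarse_count=186, fine_gate_count=120)
print(tof, distance_m(tof))
```

The subtraction reflects that the coarse counter overshoots the stop signal by up to one ring period, and the fine stage measures exactly that overshoot.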

Estimation of the timing jitter of the time-to-digital converter

The TDC was evaluated in its integrated state within the module. During testing, the embedded semiconductor laser was active, and the MEMS mirror remained stationary. To analyze TDC performance, the rise timing distributions were obtained for Ch-1, Ch-3, and Ch-4. The start signal was input by the laser pulse, the stop signal was generated by a synchronized electrically strong waveform instead of the ToF, and the thresholds of the stop signal were set significantly above the baseline noise. The coarse count was 176, which corresponded to a time duration of 158 ns. The standard deviations were \({\sigma }_{{{\mathrm{ch}}}1}=42.28\) ps, \({\sigma }_{{{\mathrm{ch}}}3}=43.22\) ps, and \({\sigma }_{{{\mathrm{ch}}}4}=42.34\) ps, respectively. Subsequently, as shown in Fig. 4b, the standard deviation of each channel was calculated from the measured value of the standard deviation of the time difference between the channels. Notably, the standard deviation for single-photon measurements increased from 229 ps at 10 m to 311 ps at 25 m, corresponding to a coarse count difference of 111. This difference between the cases of the reproducible strong electrical signal sequence and the actual ToF signals suggests that the coarse TDC is influenced by the stability of the input signal across the entire FoV.
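The quoted coarse count difference can be cross-checked from the other numbers in this paragraph. The sketch below assumes a uniform coarse period equal to 158 ns divided by 176 counts (both rounded values from the text); variable names are illustrative.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

coarse_period_ns = 158.0 / 176  # about 0.90 ns per coarse count (assumed uniform)

def round_trip_ns(distance_m: float) -> float:
    """Round-trip time of flight in nanoseconds."""
    return 2.0 * distance_m / C * 1e9

delta_tof_ns = round_trip_ns(25.0) - round_trip_ns(10.0)  # about 100 ns
delta_counts = delta_tof_ns / coarse_period_ns
print(round(delta_counts))  # 111
```

The result of roughly 111 counts is consistent with the stated difference between the 10 m and 25 m measurements.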

Time-walk error correction

Time-walk errors, which must be corrected to improve distance precision, were reported to primarily occur within the SPAD elements and the electrical readout circuitry51,52,69. By sufficiently expanding the cross-sectional area of the light illuminating the SiPM, as shown in Fig. 1c, the probability of two or more photons simultaneously arriving at a single SPAD element is minimized, thereby suppressing the occurrence of the former time-walk error. However, similar to an avalanche photodiode (APD)51, a time-walk error of several hundred picoseconds, caused by the slope of the output pulse waveform, was observed. In this configuration, the time-walk error caused by variations in SiPM output intensity was corrected through dedicated signal processing that leveraged slight differences in rise timing across channels with different threshold voltages. The multi-threshold approach was adopted in APD-based systems51,52. In this study, the threshold voltage difference corresponded to the signal difference generated by a single photoelectron. For example, in the SP-mode, single-photon, two-photon, and three-photon or more detections can be distinguished based on the responding channels. Particularly for the case of three-photon or more detections, the nonlinear time-walk error correction value was retrieved from the LUT containing the correction factor determined from the calibration and the time difference between Ch-4 and Ch-3. As shown in Fig. 4d, the measured single-shot timing jitter of 3 ps was sufficient to correct the time-walk errors on the order of several hundred picoseconds.
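The multi-threshold correction can be illustrated with the sketch below. Everything numerical here is hypothetical: the calibration table `CAL_LUT`, the per-class offsets, the sign convention, and the assumption that the rise times are supplied in order of increasing threshold, with the gap between the two highest responding thresholds standing in for the Ch-4 minus Ch-3 difference.

```python
from bisect import bisect_right

# Hypothetical calibration: (rise-time gap between the two highest thresholds in ps,
# time-walk correction in ps). A steeper pulse (smaller gap) gets a larger correction.
CAL_LUT = [(0.0, 300.0), (50.0, 180.0), (100.0, 90.0), (200.0, 0.0)]

# Hypothetical constant corrections for one- and two-photon events (ps).
CLASS_OFFSET_PS = {1: 0.0, 2: 60.0}

def lut_correction_ps(gap_ps, lut=CAL_LUT):
    """Piecewise-linear lookup, clamped at both ends of the table."""
    xs = [x for x, _ in lut]
    if gap_ps <= xs[0]:
        return lut[0][1]
    if gap_ps >= xs[-1]:
        return lut[-1][1]
    i = bisect_right(xs, gap_ps)
    (x0, y0), (x1, y1) = lut[i - 1], lut[i]
    return y0 + (y1 - y0) * (gap_ps - x0) / (x1 - x0)

def corrected_time_ps(crossings_ps):
    """crossings_ps: rise times for thresholds ordered low to high; None = no response."""
    hits = [t for t in crossings_ps if t is not None]
    if not hits:
        return None                        # no detection
    if len(hits) < 3:
        return hits[0] + CLASS_OFFSET_PS[len(hits)]
    return hits[0] + lut_correction_ps(hits[2] - hits[1])

print(corrected_time_ps([1000.0, 1020.0, 1095.0]))  # 1135.0
```

The point of the design is that the number of responding channels classifies the photon count, while the inter-channel timing gap encodes the pulse slope needed for the nonlinear correction.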

Equipment and postprocessing

The measurement system comprises the SS-SPL module housed in a die-cast shell, a circuit board equipped with the FPGA, and a 12 V AC adapter. The circuit board and a GPGPU-equipped PC are connected via USB3. Multiple buck converters are mounted on the circuit board, whereas low-dropout regulators and boost converters are embedded inside the module. The time difference data outputs from the four-channel TDC are input to the FPGA for rearrangement, noise filtering, time-walk-error correction, and conversion to position vectors. The PC displays the point cloud in real-time from the received position vectors and reconstructs them after accumulating data for approximately 20 s. Timestamps can be inserted at arbitrary intervals between position vector data for synchronization or analysis purposes. By fixing the SS-SPL to the digital camera (Sony ILX-LR1), an RGB image can be acquired simultaneously to create a colored point cloud by projecting the points onto the RGB image70. Although photo brightness and tint must be adjusted as needed, parameters must remain constant across the measurement for the stitched point clouds. Ambient illumination is measured by orienting an instrument toward the zenith. For point cloud stitching, the digital camera and SS-SPL are mounted on a gimbal where polar coordinates are controlled by the PC.
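The coloring step, in which points are projected onto the RGB image, can be sketched with an ideal pinhole model. This is a simplified illustration, assuming the points are already expressed in the camera frame (z forward) and lens distortion has been removed; the intrinsic parameters `fx`, `fy`, `cx`, `cy` and the toy image are hypothetical.

```python
def colorize(points_cam, image, fx, fy, cx, cy):
    """Attach an RGB value to each 3D point that projects inside the image.

    points_cam: iterable of (x, y, z) in the camera frame, z > 0 in front of the camera.
    image: rows of (r, g, b) tuples, indexed as image[v][u].
    fx, fy, cx, cy: assumed pinhole intrinsics in pixels.
    """
    height, width = len(image), len(image[0])
    colored = []
    for x, y, z in points_cam:
        if z <= 0.0:
            continue  # behind the camera
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            colored.append(((x, y, z), image[v][u]))
    return colored

# Toy 5x5 image whose pixel value encodes its own coordinates.
toy_image = [[(u, v, 0) for u in range(5)] for v in range(5)]
toy_points = [(0.0, 0.0, 5.0), (1.0, 0.0, 2.0), (10.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
colored_points = colorize(toy_points, toy_image, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
```

In a real setup the fixed mounting of the SS-SPL on the camera would additionally require a rigid transform between the two frames, which is omitted here.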

The stitched 3D model of the restaurant by the fusion system

A meticulously stitched 3D model was constructed from colored point clouds captured from 114 distinct viewpoints and locations, as shown in Fig. 7a–e. All measurements were taken at a height of 1.58 m and in the SP-mode. The point clouds above 3 m in height from the floor were filtered out (Fig. 7a). Initially, mid-range point clouds up to 25 m were stitched together, followed by the alignment of finer short-range point clouds. No significant geometric mismatches were observed between the point clouds, confirming the achievement of geometric fidelity in the 3D model. The ground truth (photograph) of the restaurant’s interior and the corresponding stitched point clouds are illustrated in Fig. 7d, highlighting clear visibility of kitchen utensils and the neon lights of the logo. Even the structure of the dark ceiling was captured within the point cloud data. Furthermore, Fig. 7e presents the ground truth (photograph) of the text on the white wall and the naturally patterned concrete walls, alongside their corresponding stitched point clouds. Certain objects posed challenges for point cloud acquisition, such as the mirror-surfaced metal enclosure and the black couch with low reflectivity.

The stitched super-dense point cloud contained a large volume of data, such as 2.15 GB in point cloud data format for the case shown in Fig. 7a. Voxelization compressed the data to 345 MB at 10 mm voxel size and to 90 MB at 20 mm voxel size. Adaptive voxelization is expected to further compress the data while preserving fine structural details. Unfortunately, at substantially high point cloud densities, CICP failed to perform effectively, requiring manual stitching of all segments.
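Fixed-size voxelization of the kind referred to above can be sketched as a centroid filter. This is a generic illustration rather than the exact procedure used for the reported file sizes; it handles positions only and ignores color.

```python
import math
from collections import defaultdict

def voxel_downsample(points, voxel_m):
    """Replace all points inside each cubic voxel with their centroid.

    points: iterable of (x, y, z) in metres; voxel_m: voxel edge length in metres.
    """
    bins = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / voxel_m) for c in p)
        bins[key].append(p)
    return [tuple(sum(axis) / len(ps) for axis in zip(*ps)) for ps in bins.values()]

cloud = [(0.001, 0.002, 0.003), (0.004, 0.006, 0.005), (0.015, 0.0, 0.0)]
at_10mm = voxel_downsample(cloud, 0.010)  # first two points merge
at_20mm = voxel_downsample(cloud, 0.020)  # all three points merge
```

Doubling the voxel edge merges up to eight neighbouring voxels, so the achievable compression depends on how the points are distributed over surfaces rather than on the voxel volume alone.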

The 3D model of cherry blossom by the fusion system

The colored point clouds of cherry blossoms acquired by the first event in SP-mode, DP-mode, and TP-mode are shown in Fig. 7f–k. Following a 20-s exposure from a single viewpoint, the post-process reconstruction was performed. At 2500 lx in SP-mode, the tree trunk (7 m ahead) was visible, while cherry blossoms and branches up to 8 m ahead could be captured through the gaps between the petals. At 30,000 lx, the point cloud density decreased throughout the space, reflecting the increase in point cloud noise caused by sunlight, as discussed in Supplementary Notes 4. In particular, much of the bush in front of the tree disappeared. In the DP-mode, visibility beyond the tree trunk (7 m away) was low, and the difference in environmental illumination caused minimal change in the point cloud density. In TP-mode, the visible distance became shorter, but the point cloud density was not affected by environmental illumination. The point cloud of cherry blossoms, obtained in DP-mode from a slightly closer distance and processed to remove points farther than 5 m, is shown in Supplementary Notes 5. Detailed cherry blossoms could be captured, even though the point cloud was acquired from a single viewpoint. Supplementary Video 2 shows an omnidirectional view of a 3D model acquired in SP-mode under environmental illumination of 2500 lx.

The dynamic 3D model of basketball shot by the solid-state single-photon light detection and ranging

The basketball shot was captured by the SS-SPL under conditions including SP-mode, the first event, ambient illumination of 500 lx, and a single viewpoint on an indoor court. A noise filter was applied during post-processing, but no reconstruction was performed. The front view was nearly coaxial with the shooting direction of the SS-SPL, and the top and side views were obtained from the same point cloud at different angles (Supplementary Video 3). Colors were added subsequently to highlight the movement and were not the actual colors. Walls and other structures behind the basketball goal were eliminated using an area filter during post-processing.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.