Adaptative machine vision with microsecond-level accurate perception beyond human retina

Li, Ling; Li, Shasha; Wang, Wenhai; Zhang, Jielian; Sun, Yiming; Deng, Qunrui; Zheng, Tao; Lu, Jianting; Gao, Wei; Yang, Mengmeng; Wang, Hanyu; Pan, Yuan; Liu, Xueting; Yang, Yani; Li, Jingbo; Huo, Nengjie

doi:10.1038/s41467-024-50488-6

Article
Open access
Published: 24 July 2024

Ling Li ORCID: orcid.org/0009-0001-4398-6848¹,
Shasha Li²,
Wenhai Wang¹,
Jielian Zhang¹,
Yiming Sun¹,
Qunrui Deng¹,
Tao Zheng¹,
Jianting Lu³,
Wei Gao¹,
Mengmeng Yang¹,
Hanyu Wang¹,
Yuan Pan¹,
Xueting Liu¹,
Yani Yang¹,
Jingbo Li^4,5 &
…
Nengjie Huo ORCID: orcid.org/0000-0003-2520-6243^1,5

Nature Communications volume 15, Article number: 6261 (2024)

Subjects

Electronic devices

Abstract

Visual adaptive devices have the potential to simplify the circuits and algorithms that machine vision systems need to adapt to and perceive images with varying brightness levels; this potential is, however, limited by a sluggish adaptation process. Here, avalanche tuning as feedforward inhibition in a bionic two-dimensional (2D) transistor is proposed for fast and high-frequency visual adaptation behavior with microsecond-level accurate perception. The adaptation speed is over 10⁴ times faster than that of the human retina and reported bionic sensors. As light intensity changes, the bionic transistor spontaneously switches between the avalanche and photoconductive effects, varying its responsivity in both magnitude and sign (from 7.6 × 10⁴ to −1 × 10³ A/W), thereby achieving ultra-fast scotopic and photopic adaptation processes of 108 and 268 μs, respectively. By further combining convolutional neural networks with the avalanche-tuned bionic transistor, an adaptative machine vision is achieved with microsecond-level adaptation and robust image recognition with over 98% precision in both dim and bright conditions.

Introduction

In the human retina, the photoreceptors (rod and cone cells) first transmit light stimulation to the horizontal and bipolar neurons. The brain receives signals through a feedforward excitation circuit formed with the bipolar neurons, while the horizontal neurons output inhibitory signals to the receptors through a feedback inhibitory circuit, depending on the magnitude of the light stimulus. Notably, because rod cells are about three orders of magnitude more photosensitive than cone cells, the rod cells act as the primary photoreceptor in weak light, and the inhibitory signal serves as a selector that switches the primary photoreceptor to cone cells when shifting to strong light^1,2. This spontaneous visual behavior is known as visual adaptation^3,4; it prevents the brain from continuously receiving overstimulated information when the environmental brightness changes excessively and rapidly. However, this visual behavior relies on horizontal neurons modulated through feedback inhibition with significant time hysteresis, which introduces serious hazards in daily life, such as car accidents, blindness and high difficulty in searching at night^5,6,7. Therefore, it is crucial to optimize this visual behavior, in particular the feedback inhibition circuit, for fast and high-frequency visual adaptation.

Machine vision technology, built on deep learning with convolutional neural networks, achieves highly accurate image recognition. This technology shows great potential in areas such as autonomous driving, facial recognition, and medical imaging, and can replace the human retina for perception and judgment in hazardous environments^8,9,10,11. However, currently developed vision perception systems struggle to adapt to varying brightness levels because image quality is highly sensitive to brightness conditions, so machine vision requires complex circuits and advanced algorithms. Recently, Prof. Chai Yang’s team proposed an innovative concept that can significantly reduce the requirements on circuits and algorithms by combining a two-dimensional (2D) bionic vision sensor with visual adaptation capabilities and a convolutional neural network¹². Ultrathin 2D materials have been shown to be appropriate for developing bionic visual sensors owing to their excellent electrostatic doping, fast photoresponse, high avalanche performance and ferroelectric properties^{12,13,14,15,16,17,18}. Although 2D bionic vision sensors with bio-inspired photopic and scotopic adaptation have been well developed, they still have disadvantages that limit their combination with machine vision. For example, the bilayer MoS₂ transistor with surface trap states exhibits a visual adaptation function enabled by the charge trapping and detrapping mechanism under gate voltage modulation, which takes up to several minutes and requires manual configuration of different gate voltages under photopic and scotopic adaptation conditions⁴. The γ-InSe device can only simulate the photopic adaptation function, through photo-pyroelectric and photo-thermoelectric effects⁵. The ReS₂/U6N bionic device uses interfacial defects as feedback inhibition to induce visual behavior at a limited wavelength, also with a prolonged adaptation time⁷.
Therefore, defect tuning as feedback inhibition in commonly reported 2D bionic visual devices always leads to a slow adaptation process, just as in the retina. A new visual bionic mechanism needs to be explored to optimize visual adaptation beyond both the human retina and cutting-edge machine vision systems.

To solve the above issues and introduce a distinctive working mechanism for visual adaptation, we developed a bionic visual 2D transistor. A significant avalanche effect, which can be tuned by the external voltages (V_DS and V_GS) and by the light illumination conditions, is enabled by impact ionization in the depletion region formed in the ultrathin MoS₂ channel with a top WSe₂ gate. At V_GS = −3 V, the breakdown voltage (V_EB) is as low as 5.48 V and the multiplication factor reaches 5.29 × 10⁵, superior to the reported 2D avalanche transistors. When the environmental illumination changes from dim to bright, the output current increases and then decreases, showing retina-like visual behavior. This is caused by the spontaneous transition of the device’s operating mechanism from a high-sensitivity avalanche effect to a low-sensitivity photoconductivity effect, which resembles the switchover between rod and cone cells in the retina when the environment changes. Taking advantage of avalanche tuning as a feedforward inhibition circuit in a bionic neural network, the device can emulate high-frequency visual behavior at 6 and 3 kHz under simulated scotopic and photopic adaptation conditions, with fast adaptation processes of 108 and 268 μs, respectively, far beyond the human retina and currently developed 2D bionic sensors. The −3 dB bandwidth reaches 10.5 kHz at a weak light power of 125.62 pW owing to defect filling by the carrier avalanche, which also surpasses the dynamic response of the retina (500 Hz). Leveraging this microsecond-level adaptation capability, the bionic avalanche transistor is further combined with a convolutional neural network to build an ultra-fast adaptative machine vision system, which achieves image recognition with a precision exceeding 98% in both dim and bright environmental conditions.

Results

Retina and neural circuit motifs

As shown in Fig. 1a, the human retinal architecture consists of rich visual cells whose autonomic processing of light information is known as visual adaptation¹⁹. The types of neurons and the connection paths can form a variety of circuit motifs that are key to visual adaptation behavior. Figure 1b shows three typical examples in the visual system. Light information is transmitted from the retina to the cerebral cortex via feedforward excitation, which enables information convergence and divergence²⁰. Convergence improves the signal-to-noise ratio and divergence allows the signal to be processed by multiple channels to enhance computational performance. The feedback inhibition is an important circuit for visual adaptation, that involves the regeneration/bleaching photopigment and causes a long-time adaptation process^21,22. Feedforward inhibition exists in the cerebral cortex that is faster and more predictive than feedback inhibition²³. In this circuit, the target cell receives an integration of excitatory and inhibitory information, effectively avoiding the long adaptation process. However, almost all reported bionic visual systems adopted a feedback inhibition such as charge trapping or detrapping process to emulate the visual adaption, while the faster feedforward inhibition has never been employed in bio-inspired devices so far.

**Fig. 1: Classical core circuit motifs in retina.**

Figure 1c, d show the photopic/scotopic adaptation process of the bionic devices relying on feedback inhibition and feedforward inhibition circuit, respectively. In the former case, the visual adaptation involves an overexcited signal change at the primary stage (Process 1) by the application of extra gate voltages (V_GS), and a long-time adaptation process with a timescale of minutes (Process 2) caused by the prolonged charge trapping/detrapping process. As a more advantageous situation, the feedforward inhibitory circuit-based visual adaptation avoids receiving overexcited signals and proceeds much faster with a timescale of microseconds. In addition, the range of current variation is larger, indicating a more obvious excitation (inhibition) effect. To be different from previous bionic devices based on feedback inhibition, this work will employ a faster and more predictive feedforward inhibition in an avalanche transistor for ultra-fast and high-frequency visual adaptation behavior.

Device scheme and characterization

Figure 2a and Supplementary Fig. 1 illustrate the schematic diagram and optical microscopy image of the device structure and electrical connections, respectively. This junction field effect transistor (JFET) consists of an ultrathin MoS₂ transport channel and top WSe₂ gate, where the depletion region and vertical electric field at the MoS₂/WSe₂ interface can be modulated by top gate voltages to control the switching behavior and avalanche effect. The detailed device fabrication process is presented in the Methods. The cross-sectional high-resolution transmission electron microscopy image is shown in Fig. 2b, demonstrating the clear lattice fringe and ultrathin thickness of MoS₂ (3.93 nm) and WSe₂ (3.04 nm), ensuring a smooth and clean van der Waals (vdW) interface. To further identify the composition of the stacking layers, Fig. 2c and Supplementary Fig. 2 show the energy-dispersive X-ray spectroscopy (EDX) elemental mapping and analysis plots corresponding to the assembled WSe₂ and MoS₂, respectively.

**Fig. 2: Characterization of bionic visual device based on MoS₂/WSe₂ vdW heterostructure.**

Modulation of the electric field is crucial to the device operation mechanism, so it is essential to confirm the formation of a depletion region and built-in electric field at the interface. The work functions of WSe₂ and MoS₂ were measured by Kelvin probe force microscopy (KPFM) to be 4.81 and 4.68 eV, respectively, with a surface potential difference (SPD) of 135 mV between the two, as shown in Fig. 2d and Supplementary Fig. 3. The calculation for the KPFM measurement is presented in Methods. Because MoS₂ has the lower work function, electrons transfer from MoS₂ to WSe₂ after contact until an equilibrium state is reached, forming a built-in electric field pointing from MoS₂ to WSe₂. According to the above analysis and previous reports^24,25, the WSe₂/MoS₂ heterojunction presents a typical type-II energy band alignment, as shown in Fig. 2e²⁶. The I_GS-V_GS curve exhibits diode characteristics with a current rectification ratio as high as 10⁴ (Fig. 2f) and an ideality factor of unity (Supplementary Fig. 4), further demonstrating the formation of the depletion region in the high-quality heterojunction.

Gate-tuned avalanche properties and operation mechanism

Acting as a JFET, our device exhibits good switching behavior, as demonstrated by its transfer characteristics (Supplementary Fig. 5). Interestingly, on further increasing the drain voltage (V_DS), an avalanche phenomenon is observed, as shown in Fig. 3a. Under a fixed gate voltage and with increasing V_DS, the drain current (I_DS) deviates from linear behavior and saturates due to the pinch-off effect, then increases rapidly through carrier avalanche multiplication once V_DS exceeds the electric breakdown voltage (V_EB). Notably, the gate voltage (V_GS) strongly modulates V_EB and the multiplication factor, as shown in Fig. 3b. The definition of V_EB is shown in Supplementary Fig. 6, and the multiplication factor is defined as \(M=\frac{I}{I_{\rm s}}\), where I and I_s are the avalanche and saturation currents, respectively. As V_GS decreases, V_EB is lowered while the multiplication factor improves significantly, reaching 5.48 V and 5.29 × 10⁵, respectively, at V_GS = −3 V. Figure 3c summarizes V_EB and the multiplication factor of reported 2D avalanche transistors, showing the superior avalanche performance and lower power consumption of our device, which has the highest multiplication factor and a low breakdown voltage^{27,28,29,30,31,32,33,34,35}.

**Fig. 3: Avalanche properties and operation mechanism.**

Detection sensitivity is a key parameter for evaluating avalanche properties and is inversely proportional to noise. Figure 3d and its inset show the noise density spectrum normalized to current (S_n/I) at different V_DS, which changes from 1/f noise to stable white noise with increasing frequency. Interestingly, when an avalanche occurs at V_DS > V_EB, the sensitivity is improved, with S_n/I decreasing from 2.41 × 10⁻⁴ to 1.83 × 10⁻⁴ Hz^{−1/2}, which is consistent with the Geiger mode characteristics of conventional avalanche transistors and implies that the devices have potential for weak light detection. S_n is acquired from the Fourier transformation of dark current traces, as shown in Supplementary Fig. 7. The ionization index (n), which is related to the ionization rate at the drain side, can be calculated from the formula \(1-\frac{1}{M}=\left(\frac{V_{\rm DS}}{V_{\rm EB}}\right)^{n}\). From the slope of the linear region in Fig. 3e, n is found to remain nearly constant at ~12.2 for different V_GS, as demonstrated in Supplementary Fig. 8. Figure 3f shows the ionization rate mapping as a function of V_DS and V_GS, which can be calculated by \(\alpha\left(V_{\rm DS}\right)=\frac{1}{n_e}\frac{{\rm d}n_e}{{\rm d}x}=\frac{1}{L}\left(1-\frac{1}{M}\right)\), where n_e is the electron density and L is the channel length. As V_GS decreases, the breakdown voltage is reduced and the ionization rate increases, because the gate regulates the electric field at the MoS₂/WSe₂ junction. Due to space charge limitation³⁶, the ionization rate gradually saturates at 1.2 × 10⁵ cm⁻¹ with increasing V_DS.
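
As a hedged illustration of how the ionization index and ionization rate could be extracted from measured curves, the following minimal Python sketch fits n as the slope of ln(1 − 1/M) against ln(V_DS/V_EB) and evaluates α = (1/L)(1 − 1/M). The function names, the synthetic data, and the channel length used in the usage note are assumptions for illustration; they are not taken from the paper's analysis code.

```python
import math

def multiplication_factor(i_avalanche, i_saturation):
    """M = I / I_s, the ratio of avalanche current to saturation current."""
    return i_avalanche / i_saturation

def fit_ionization_index(v_ds_values, m_values, v_eb):
    """Least-squares slope of ln(1 - 1/M) versus ln(V_DS / V_EB).

    Follows from 1 - 1/M = (V_DS / V_EB)**n after taking logarithms.
    Every M must be greater than 1 so that the logarithm is defined.
    """
    xs = [math.log(v / v_eb) for v in v_ds_values]
    ys = [math.log(1.0 - 1.0 / m) for m in m_values]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

def ionization_rate(m, channel_length_cm):
    """alpha = (1 / L) * (1 - 1 / M), in cm^-1 when L is given in cm."""
    return (1.0 - 1.0 / m) / channel_length_cm

# Synthetic check: build M values from a known index, then recover it.
V_EB = 5.48        # breakdown voltage quoted in the text (V)
N_TRUE = 12.2      # ionization index quoted in the text
v_ds = [4.0, 4.4, 4.8, 5.2]   # hypothetical bias points below V_EB (V)
m_syn = [1.0 / (1.0 - (v / V_EB) ** N_TRUE) for v in v_ds]
n_fit = fit_ionization_index(v_ds, m_syn, V_EB)
print(f"fitted ionization index: {n_fit:.2f}")
```

With real data, `m_values` would come from `multiplication_factor` applied to the measured output curves, and `ionization_rate(m, L)` would use the actual channel length of the device, which is not stated in this section.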

A strong electric field is necessary for the avalanche effect whose strength can be represented by the impact ionization rate. Therefore, the avalanche operation mechanism can be analyzed by technology computer-aided design (TCAD) simulation on the intensity and distribution of electric field and ionization rate in the MoS₂ channel. The applied bias causes a voltage gradient across the channel, with the potential increasing from the source to the drain side. In this work, due to the voltage gradient within the channel, the higher potential at the drain side induces a thicker depletion region, causing the channel to form an unaligned depletion region vertically. As shown in Fig. 3g, with increasing V_DS, the electric field gradually increases to reach a pinch-off point that then moves towards the source end, which corresponds to the three working regions (linear, saturation, and ionization) in Fig. 3a, respectively. When the pinch-off point moves, the intensity and distribution of the ionization rate increase rapidly, which means that more electrons with high kinetic energy impact ionization to induce an avalanche effect. Here, the regulation of the avalanche effect by the bias primarily involves two factors, one is the increasing carrier drift velocity, and another is a thicker depletion layer at the drain side through the voltage gradient. Figure 3h illustrates V_GS regulation on avalanche effect at ionization region. The distribution of both electric field and ionization rate is positively correlated with |V_GS|, indicating that increasing |V_GS| can strengthen the avalanche effect. Notably, at V_DS = 0 V, as the -V_GS increases, the depletion layer thickness increases in an aligned manner. Consequently, under the same V_DS, a larger -V_GS results in a quicker transition to the avalanche state and forms a larger pinch-off region. 
However, because the avalanche mainly occurs near the pinch-off area, where the ionization rate is higher, a larger -V_GS does not further increase the ionization rate at the drain side, which corresponds to the constant ionization index discussed above. These TCAD simulation results are consistent with the bias- and gate-tuned avalanche effect observed in our device.

Light intensity-dependent avalanche and bionic neural network

Now, we turn to how our device can perform a bionic visual function. As shown in Fig. 4a, the output current and avalanche effect can also be strongly tuned by the incident light stimulus. With increasing illumination, the photocurrent increases, showing positive photoconductivity (PPC), in the linear and saturation regions; in the ionization region, however, the PPC gradually transitions to negative photoconductivity (NPC). Figure 4b shows the photocurrent and avalanche gain as a function of light intensity in the ionization region (V_DS = 7.5 V). The photocurrent first increases to 5.1 μA and then decreases to −2.2 μA with increasing light power, which resembles spontaneous visual adaptation that prevents the output of overstimulated information. The avalanche gain is defined as \({\rm Ava\_Gain}=\frac{I_{\rm light}-I_{\rm dark}}{I_{\rm light0}-I_{\rm dark0}}\), where I_light and I_dark are the photocurrent and dark current, and I_light0 and I_dark0 are the photocurrent and dark current at V_DS = V_EB, respectively. As light intensity increases, the avalanche gain decreases from 1.5 × 10⁴ to −8, indicating that the dominant photo-sensing mechanism shifts from the avalanche to the photoconductivity effect. Figure 4c shows the sensor responsivity as a function of light intensity in both the ionization and saturation regions. Significantly, the responsivity in the ionization region changes greatly in both magnitude and sign, ranging from 7.6 × 10⁴ to −1 × 10³ A/W, while that in the saturation region varies only from 158 to 5 A/W. The sensitivity evolution in the ionization region is similar to that in the retina, which demonstrates the reliability of the device model³⁷. Notably, the drain current is more than 10³ times higher than the leakage current, confirming the validity of the avalanche effect and the high reliability of the device, as shown in Supplementary Fig. 9.
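
The avalanche gain defined above, together with a responsivity calculation, can be sketched in a few lines of Python. The responsivity formula used here (net photocurrent divided by incident optical power) is the standard definition and is an assumption, since this section does not state it explicitly; the function names and all numerical inputs are hypothetical and chosen only to show the sign change between PPC and NPC.

```python
def avalanche_gain(i_light, i_dark, i_light0, i_dark0):
    """(I_light - I_dark) / (I_light0 - I_dark0); the '0' currents are taken at V_DS = V_EB."""
    return (i_light - i_dark) / (i_light0 - i_dark0)

def responsivity(net_photocurrent_a, optical_power_w):
    """Assumed standard definition: net photocurrent (A) per incident power (W)."""
    return net_photocurrent_a / optical_power_w

def photoconductivity_type(net_photocurrent_a):
    """Label the response by the sign of the net photocurrent."""
    if net_photocurrent_a > 0:
        return "PPC"
    if net_photocurrent_a < 0:
        return "NPC"
    return "none"

# Hypothetical readings, not values from the paper.
weak = {"i_net": 4.0e-6, "power": 1.0e-10}     # dim light, current rises
strong = {"i_net": -2.0e-6, "power": 2.0e-9}   # bright light, current falls

for name, s in (("weak", weak), ("strong", strong)):
    r = responsivity(s["i_net"], s["power"])
    print(name, photoconductivity_type(s["i_net"]), f"{r:.3g} A/W")
```

A negative responsivity in this sketch simply reflects a light-on current below the dark current, which is how the sign reversal in Fig. 4c would show up numerically.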

**Fig. 4: Light intensity-dependent avalanche and operation mechanism.**

In the human retina, as the environment changes from dim to bright, the photoreceptors, rod cells (high sensitivity) and cone cells (low sensitivity), dominate the perceptual function alternately. The transition between the avalanche and photoconductivity effects in our device resembles the switchover between rod and cone cells in the retina. In this way, avalanche tuning under varying illumination conditions endows the JFET with visual behavior. The retinal sensitivity changes gradually over a long visual adaptation process because the switchover of the photoreceptor cells is controlled by feedback inhibition and the regeneration/bleaching of photopigment^3,4. In contrast, the sensitivity evolution of our device in the avalanche ionization region is light-adaptive and real-time, ensuring immediate perception of environmental changes and avoiding the potential harm caused by the long scotopic and photopic adaptation processes of the human retina. The switchover of the photo-sensing mechanism is accompanied by a sign reversal and by magnitude changes of over five orders in both avalanche gain and responsivity. The resulting large difference in sensitivity between weak and strong light stimuli can benefit image contrast enhancement, which is superior to the retina and to reported bionic devices with visual adaptation.

By comparing the sensitivity and avalanche gain with reported avalanche detectors in Fig. 4d, our device demonstrates superior avalanche photodetection characteristics with an avalanche gain of 1.5 × 10⁴ and responsivity up to 7.6 × 10⁴ A/W, exhibiting great potential in weak-light detection and clear visualization at dim environment acting as bionic visual sensor^{27,29,31,34,35,38,39}. As shown in Fig. 4e, the ionization index increases first and then decreases with increasing light intensity, whose trend is consistent with the photocurrent evolution as observed in Fig. 4b, indicating an efficient modulation of light on the ionization rate at the drain side⁴⁰. Figure 4f shows the ionization rate mapping, which again verifies the operation mechanism shifting from avalanche to photoconductivity. The external quantum efficiency (EQE) and V_EB as a function of light illumination are discussed in Supplementary Figs. 10–12.

To further explain the light-tuning avalanche effect, the TCAD simulated electric field and ionization rate under different illumination conditions are shown in Fig. 4g. The built-in electric field is inversely proportional to the light power and nearly disappears under strong light, which is due to the reversed photo-generated voltage at MoS₂/WSe₂ junction to counteract the built-in electric field. On further analysis, the ionization region area first increases and then decreases with increasing light, depending on the inhibition degree of the photo-generated voltage on the avalanche gain in the MoS₂ channel. In weak light condition, the electric field is slightly influenced, and the avalanche gain of the photo-generated carriers dominates. Under strong light illumination, the electric field is significantly weakened and accompanied by a gradual disappearance of the depletion region, which can subsequently inhibit the avalanche effect and reduce the photoresponse sensitivity. The TCAD simulations on electric field and ionization rate can match well with the experimental results, manifesting that the light-tuning avalanche effect can emulate the visual adaptation function.

Distinct from previous 2D bionic devices, our device can optimize visual adaptation beyond the retina by introducing an efficient bionic neural network, as shown in Fig. 4h. In this network, the avalanche and photoconductivity effect can emulate the function of rod and cone cells, respectively, because the photosensitivity by avalanche effect is four orders higher than that by photoconductivity effect. The photo-generated voltage is opposite in direction to the built-in electric field at the MoS₂/WSe₂ junction, which can be seen as an inhibition cell modulating the avalanche effect. Light (Stimulation), “Photo-generated voltage” (Inhibition cell), and “Avalanche” (Rod) can form a feedforward inhibitory circuit where the “Avalanche” receives both stimulus and inhibitory information to avoid it outputting over-current under photopic adaptation conditions. At strong light illumination conditions, the avalanche effect is inhibited, turning the photo-sensing mechanism to “Photoconductivity” (Cone). On the contrary, the mechanism switches from “Photoconductivity” to “Avalanche” by changing light stimulus to weak condition, corresponding to the scotopic adaptation process. For both photopic and scotopic adaptation, the switchover between “Photoconductivity” and “Avalanche” is much faster than the switching process between cone and rod through chemical reactions in the retina. Notably, the feedforward excitation circuit formed by the “Output current”, “Avalanche” and “Photoconductivity” exhibits multiplexed modulation characteristics and effectively improves the signal-to-noise ratio. Thus, our device offers great advantages in visual adaptation compared to human retina and previously reported bionic sensors by introducing a feedforward circuit as fast-switching mechanism.

High-frequency visual adaptation

The dynamic photoresponse at different V_DS and modulation frequencies of the incident light was measured, as shown in Supplementary Fig. 13, and the rise/fall times are extracted and plotted in Fig. 5a. Before breakdown, the device works in photoconductive mode and exhibits a relatively slow response of ~15 ms due to long-lived trap states. When V_DS > V_EB, the device switches to avalanche mode with a much faster response of 88 μs, which can be attributed to the large number of avalanche-generated carriers filling the trap states in the MoS₂ channel. To verify this, we measured the time-varying I_DS at different fixed V_DS under continuous light illumination, as shown in Fig. 5b. Over time, the current drops significantly in the range of 10¹–10² ms, which corresponds to the release of trapped holes from shallow-level defects in the MoS₂ channel⁴¹. When V_DS exceeds V_EB, the current drop is more noticeable because more holes are released in the avalanche region.

**Fig. 5: Avalanche photoresponse properties.**

Figure 5c shows the response bandwidth in weak and strong light conditions, which represents the timescale of visual perception in bionic vision devices. The dynamic response of the human retina is 2 ms, corresponding to a bandwidth of 500 Hz. The −3 dB bandwidth of our device in the avalanche region reaches 10.5 and 3.7 kHz under weak and strong light, respectively, far superior to the retina and to reported 2D JFET detectors^42,43,44. The bandwidth under weak light is nearly three times that under strong light, which is related to the inhibition of the avalanche effect by the photo-generated voltage under strong illumination. At V_DS of 3 V, the device operates in photoconductive mode, and its dynamic response shows negligible dependence on light intensity (Supplementary Figs. 14, 15). Notably, the device exhibits PPC and NPC effects under weak and strong illumination, corresponding to current excitation and inhibition, respectively, which enables the visual adaptation function.
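
To make the bandwidth figures concrete, the sketch below shows one way a −3 dB point could be read off a sampled frequency response, alongside the reciprocal relation the text uses for the retina (2 ms corresponds to 500 Hz). It assumes the −3 dB point is where the response amplitude falls to 1/√2 of its low-frequency value and uses linear interpolation between samples. The first-order roll-off used as test data is synthetic and only illustrative; the function names are hypothetical.

```python
import math

def bandwidth_from_response_time(t_seconds):
    """Bandwidth taken as the reciprocal of the response time, as in the text."""
    return 1.0 / t_seconds

def minus_3db_frequency(freqs_hz, responses):
    """Frequency where the response first drops to 1/sqrt(2) of its first sample.

    Uses linear interpolation between the two bracketing samples and
    returns None if the response never crosses the threshold.
    """
    threshold = responses[0] / math.sqrt(2.0)
    for i in range(1, len(freqs_hz)):
        if responses[i] <= threshold < responses[i - 1]:
            f_lo, f_hi = freqs_hz[i - 1], freqs_hz[i]
            r_lo, r_hi = responses[i - 1], responses[i]
            return f_lo + (r_lo - threshold) / (r_lo - r_hi) * (f_hi - f_lo)
    return None

# Synthetic first-order response with a 10.5 kHz corner (illustrative only).
F_C = 10.5e3
freqs = [10.0, 1e3, 5e3, 10e3, 11e3, 20e3]
resp = [1.0 / math.sqrt(1.0 + (f / F_C) ** 2) for f in freqs]

f_3db = minus_3db_frequency(freqs, resp)
retina_bw = bandwidth_from_response_time(2e-3)
print(f"-3 dB point: {f_3db:.0f} Hz, retina bandwidth: {retina_bw:.0f} Hz")
```

The interpolated value depends on how densely the response is sampled near the corner, so measured data with sparse frequency points would give a coarser estimate.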

To further verify the advantage of the bionic device in visual adaptation, Fig. 5d, e shows the normalized real-time dependent current of the device under the simulated high-frequency scotopic and photopic adaptation conditions. For scotopic adaptation (Fig. 5d), the environmental illumination is switched from bright (2158.43 pW) to dim (125.62 pW) conditions, subsequently the current increases over time because the avalanche effect takes over the photo-sensing mechanism for higher sensitivity, which is analogous to the role of rod cells for higher visual sensitivity of photoreceptor over time under the dim-light condition. To emulate photopic adaptation (Fig. 5e), the dark background is suddenly turned to be bright, the current decreases quickly over time under continuous strong light irradiation due to the inhibition of “Avalanche” by the “Photo-generated voltage” as discussed above, which is like the inhibitory effect on rod cells by photopigment bleaching in the retina during the photopic adaptation process.

Unlike the long adaptation process of the retina, the current of our bionic device reaches saturation within 108 and 268 μs for scotopic and photopic adaptation conditions, respectively, at a light frequency of 500 Hz, indicating ultra-fast visual adaptation, which is of great importance for the rapid response and emergency operation of machine vision systems in complex environments. Owing to this fast adaptation process, our device can emulate scotopic and photopic adaptation at 6 and 3 kHz, respectively, without much signal loss. The high-frequency signal loss ratio (HLR) is a crucial parameter in the vision system, defined as \({\rm HLR}=\frac{I_{\rm ini}-I_{\rm ini0}}{I_{\rm fin0}-I_{\rm ini0}}+\frac{I_{\rm fin0}-I_{\rm fin}}{I_{\rm fin0}-I_{\rm ini0}}\), where \(I_{\rm ini}\) and \(I_{\rm fin}\) represent the initial and final currents at high frequency (6 or 3 kHz), respectively, and \(I_{\rm ini0}\) and \(I_{\rm fin0}\) represent the initial and final currents at the low frequency of 500 Hz, respectively. The HLR values for scotopic and photopic adaptation are calculated to be only 15% and 14%, respectively, again verifying the efficient image information processing capability under high-frequency conditions.
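
The HLR definition translates directly into code. The following minimal sketch implements the formula as written; the function name and the normalized current values in the example are hypothetical and were picked only so that the result lands near the 15% scale mentioned in the text.

```python
def high_frequency_loss_ratio(i_ini, i_fin, i_ini0, i_fin0):
    """HLR = (I_ini - I_ini0)/(I_fin0 - I_ini0) + (I_fin0 - I_fin)/(I_fin0 - I_ini0).

    i_ini, i_fin   : initial and final currents at high frequency
    i_ini0, i_fin0 : initial and final currents at the 500 Hz reference
    """
    span = i_fin0 - i_ini0
    return (i_ini - i_ini0) / span + (i_fin0 - i_fin) / span

# Hypothetical normalized currents: the reference swings from 0 to 1,
# while the high-frequency trace only covers 0.08 to 0.93 of that swing.
hlr = high_frequency_loss_ratio(i_ini=0.08, i_fin=0.93, i_ini0=0.0, i_fin0=1.0)
print(f"HLR = {hlr:.0%}")
```

Read this way, the HLR is the fraction of the low-frequency current swing that the device fails to cover at high frequency, summed over both ends of the swing.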

The current change ratio (CCR), defined as \({\rm CCR}=\frac{I_{\rm Fin}}{I_{\rm Ini}}\), where \(I_{\rm Ini}\) and \(I_{\rm Fin}\) represent the current at the initial and final states, is also proposed to quantitatively analyze the current excitation and inhibition effects. At 500 Hz, the CCR values are 2.62 for scotopic adaptation and 0.47 for photopic adaptation, i.e., greater than and less than 1, indicating current excitation and inhibition, respectively. At higher frequency (6 or 3 kHz), the CCR values vary only slightly, suggesting that the visual adaptation function of our device remains reliable at high frequencies of environmental light. Supplementary Table 1 summarizes the reported negative photoconductive devices with potential applications in visual adaptation to highlight the advantages of our device. Overall, benefiting from the avalanche tuning mechanism as a feedforward inhibition circuit, the bionic device exhibits fast and high-frequency visual adaptation behavior, which is far superior to the human retina and to widely reported bionic sensors relying on a feedback inhibition circuit. More details about the frequency-, light intensity-, and durability-dependent photoresponse measurements are shown in Supplementary Figs. 16–21.
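
A short sketch of the CCR and its interpretation follows. The ratio itself is as defined in the text; the helper that labels a value as excitation or inhibition, and the unit initial current used in the example, are illustrative assumptions.

```python
def current_change_ratio(i_initial, i_final):
    """CCR = I_Fin / I_Ini."""
    return i_final / i_initial

def adaptation_effect(ccr):
    """CCR > 1 means the current was excited; CCR < 1 means it was inhibited."""
    if ccr > 1.0:
        return "excitation"
    if ccr < 1.0:
        return "inhibition"
    return "no change"

# Using a unit initial current so the final current equals the CCR quoted at 500 Hz.
scotopic = current_change_ratio(1.0, 2.62)
photopic = current_change_ratio(1.0, 0.47)
print(adaptation_effect(scotopic), adaptation_effect(photopic))
```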

Machine vision with ultra-fast and accurate image recognition

Deep learning with convolutional neural networks (CNNs) plays a crucial role in the image recognition function of machine vision, but image brightness has a significant impact on accuracy. In Fig. 6a, to assess the image recognition performance of a typical three-layered CNN under varying brightness conditions, we used 60,000 MNIST dataset images with different brightness levels as the training set and trained the CNN 30 times. Notably, during the training process, we input image brightness as an explicit additional parameter into the network. The detailed process of CNN deep learning is presented in Methods. As shown in Fig. 6b, the network exhibits excellent robustness and maintains an accuracy of 98.3% during almost the whole brightness-decreasing process. However, the accuracy declines significantly with increasing brightness, because the neural network struggles to accurately capture crucial features of overexposed images. To quantify the influence of increased brightness on image recognition, the confusion matrices in Fig. 6c, d present the results of 10,000 image recognition trials under standard and 20% increased brightness conditions, respectively. Under the +20% brightness condition, the accuracy is only 83%, indicating that the CNN is unable to improve its classification accuracy even though it has received brightness parameters on a fixed dataset. More confusion matrices at various brightness levels are presented in Supplementary Fig. 22.
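
The effect of overexposure on image features can be illustrated with a small, self-contained Python sketch. It assumes that a brightness change is a multiplicative scaling of 8-bit pixel values followed by clipping at 255, and that the brightness parameter is appended to the flattened pixels as one extra input; both are assumptions made for illustration, since the actual preprocessing and network input format are described in the paper's Methods, not in this section. The tiny 2 × 3 "image" is a hypothetical stand-in for an MNIST digit.

```python
def adjust_brightness(image, factor, max_value=255):
    """Scale every pixel by `factor` and clip to [0, max_value] (assumed model)."""
    return [
        [min(max_value, max(0, int(round(p * factor)))) for p in row]
        for row in image
    ]

def saturated_fraction(image, max_value=255):
    """Fraction of pixels stuck at the maximum value, where contrast is lost."""
    pixels = [p for row in image for p in row]
    return sum(1 for p in pixels if p >= max_value) / len(pixels)

def with_brightness_feature(image, factor, max_value=255):
    """Flatten to [0, 1] and append the brightness factor as an extra input."""
    flat = [p / max_value for row in image for p in row]
    return flat + [factor]

# Hypothetical toy image standing in for an MNIST digit.
toy = [[0, 100, 220],
       [250, 255, 30]]

brighter = adjust_brightness(toy, 1.2)          # +20% brightness
features = with_brightness_feature(brighter, 1.2)
print(brighter, saturated_fraction(toy), saturated_fraction(brighter))
```

In this toy case the pixels at 220, 250 and 255 all collapse to 255 after the +20% scaling, so their original differences can no longer be recovered from the image even when the network is told the brightness factor.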

To broaden the brightness range of image perception and improve the recognition accuracy under bright conditions, we combine a convolutional neural network with our bionic MoS₂/WSe₂ transistors to construct an adaptative machine vision system. This system obtains precise image recognition from the CNN and ultra-fast visual adaptation from the bionic transistor. In Fig. 6e, f, we select the number “7” as the test image feature; the illumination is provided by the laser, and the brightness data obtained by current mapping are subsequently imported into the CNN for processing. Under scotopic adaptation, the accuracy reaches 98.3% in 9.5 μs, and under photopic adaptation it reaches 98.2% in 174 μs. This validates the adaptative machine vision system, with rapid adaptation and precise image recognition in different brightness environments. The inset depicts the contrast changes of MNIST images during the adaptation process, emphasizing the image contrast required for the adaptative machine vision system to efficiently capture and analyze image features. In summary, the adaptative machine vision system, with its microsecond-level adaptation time, significantly enhances image recognition accuracy at ultra-fast speed under varying brightness and environmental conditions. It holds potential in critical application scenarios such as facial recognition and autonomous driving, where it allows swift adjustment to changing brightness and atmospheric conditions and thereby improves the efficiency of real-time image processing. Furthermore, it offers the prospect of expanding the brightness range of image perception and reducing the complexity of hardware and algorithms, thereby enhancing the image processing capabilities of sensor terminals and advancing machine vision technology.
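The step in which current-mapping data are handed to the CNN can be sketched as follows. This is only an assumed preprocessing: the text does not state how the current map is scaled, so the min-max normalization, the Michelson contrast measure, and all names and values below are hypothetical choices made for illustration.

```python
import numpy as np


def current_map_to_image(current_map: np.ndarray) -> np.ndarray:
    """Rescale a 2D photocurrent map to grayscale values in [0, 1].

    Assumption: simple min-max normalization.
    """
    cmin, cmax = current_map.min(), current_map.max()
    if cmax == cmin:
        # A flat map carries no contrast; return an all-zero image.
        return np.zeros_like(current_map, dtype=np.float64)
    return (current_map - cmin) / (cmax - cmin)


def michelson_contrast(current_map: np.ndarray) -> float:
    """(max - min) / (max + min) of the raw map, one common contrast measure."""
    cmin, cmax = current_map.min(), current_map.max()
    return float((cmax - cmin) / (cmax + cmin))


# Hypothetical 2 x 2 current map in amperes.
demo_map = np.array([[1e-9, 3e-9], [2e-9, 3e-9]])
print(current_map_to_image(demo_map))
print(michelson_contrast(demo_map))
```

A low contrast value on the raw map would correspond to the washed-out images at the start of adaptation, and a higher value to the recovered images once the device has adapted.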

Discussion

We have fabricated a bionic visual transistor that emulates visual adaptation with performance beyond that of the retina. The avalanche effect arises from impact ionization in the MoS₂ channel when the bias exceeds the breakdown voltage (V_EB ~5.48 V). TCAD simulations show that the electric field and ionization rate can be tuned by both V_DS and V_GS, which makes the avalanche performance tunable. Under illumination, the dominant mechanism of photocurrent generation is controlled by both V_DS and light intensity. By varying V_DS (above or below V_EB) and the light intensity (from weak to strong), the photo-sensing mechanism of the device switches from the avalanche effect to the photoconductive effect, changing the sensitivity from high to low, which resembles the function of the rod and cone photoreceptor cells in the retina. Feedback inhibition makes visual adaptation in the retina a slow process. To overcome this, we introduce feedforward inhibition into our bionic device-based neural network through the avalanche tuning operation, realizing light-adaptive and real-time vision behavior. This enables the visual device to perceive rapid changes in brightness in time, helping to avoid potential harm such as car accidents. TCAD simulations of the distribution of the electric field and impact ionization rate further verify the bias- and light-tuned avalanche effect. In addition to the high avalanche gain of 1.5 × 10⁴ and responsivity of 7.6 × 10⁴ A/W in the avalanche ionization region, our device exhibits a large −3 dB bandwidth of up to 10.5 kHz at weak light (125.62 pW). The bionic device achieves ultra-fast and high-frequency visual adaptation behavior at 6 and 3 kHz for simulated scotopic and photopic adaptation conditions. Importantly, by combining convolutional neural networks with bionic avalanche transistors, an adaptative machine vision system has been achieved. This system demonstrates microsecond-level adaptation, enabling image recognition with over 98% precision in both dim and bright lighting conditions. Thus, the avalanche-tuning-based bio-inspired visual device avoids a long visual adaptation process by introducing a more predictive and faster feedforward inhibition circuit. It holds great promise for widespread applications in machine vision, offering ideas and designs for bio-inspired visual systems while avoiding excessive reliance on complex circuits and algorithms.

Methods

Device fabrication

Few-layer MoS₂ and WSe₂ were mechanically exfoliated from bulk crystals (Shanghai OnWay Technology Co., Ltd). The 3.93 nm MoS₂ and 3.04 nm WSe₂ flakes were successively stacked on a 300 nm SiO₂/Si substrate using the PVA (polyvinyl alcohol, MACKLIN Co., Ltd, Shanghai)/PDMS (poly-dimethylsiloxane) assisted dry transfer technique. The photoresist (an ARP-5350 positive photoresist from Taizhou SUNANO New Energy Co., Ltd.) was spin-coated onto the substrate at 3000 rpm for 60 s and then baked on a hot plate for 4 min at 100 °C. The Cr/Au (5/50 nm) electrodes were fabricated using an ultraviolet maskless lithography machine (TuoTuo Technology, UV Litho-ACA) and electron beam evaporation. Finally, the device was annealed for 2 h at 150 °C under N₂ atmosphere to improve the interfacial contact.

Device characterization

The potential difference of the heterojunction interface was measured by a Scanning Probe Microscope (SPM) with a KPFM functional module (Oxford Cypher S AFM. Co., Ltd.). It is calculated from the following equation: \(e\cdot \mathrm{SPD}=e\phi_{\mathrm{tip}}-e\phi_{\mathrm{sample}}\), where SPD is the measured potential difference, e is the elementary charge, and \(e\phi_{\mathrm{tip}}\) and \(e\phi_{\mathrm{sample}}\) are the work functions of the AFM tip and the sample, respectively. The STEM images and energy-dispersive X-ray spectroscopy (EDX) elemental maps of MoS₂ and WSe₂ were acquired using a STEM (Thermo Fisher Talos F200X). The electrical characteristics of the device were measured on a four-probe station (PSAICPB6A, PRECISION SYSTEM INDUSTRIAL Co., Ltd.) equipped with KEITHLEY 2636B and 2611B semiconductor source meters. The noise spectral density was measured with the KEITHLEY 2636B and an oscilloscope and then calculated by Fourier transformation. The photoresponse was tested using a 635 nm optical fiber laser, and the spot diameter of all the excitation lasers was ~3 mm. The response time was extracted using an electric shutter system and an oscilloscope. The rise and fall times are defined as the time it takes for the current to rise from 10 to 90% and to fall from 90 to 10%, respectively.
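The 10 to 90% rise time and 90 to 10% fall time defined above can be extracted from a sampled oscilloscope trace along the following lines. This is a minimal sketch, not the authors' analysis code: it assumes a single clean edge per trace, takes the baseline and plateau as the trace minimum and maximum, and uses linear interpolation between samples. The function names and the synthetic trace are hypothetical.

```python
import numpy as np


def crossing_time(t: np.ndarray, y: np.ndarray, level: float) -> float:
    """Time at which y first reaches `level` on a rising edge."""
    above = y >= level
    idx = int(np.argmax(above))  # first sample at or above the level
    if not above[idx]:
        raise ValueError("trace never reaches the requested level")
    if idx == 0:
        return float(t[0])
    # Linear interpolation between the two samples around the crossing.
    frac = (level - y[idx - 1]) / (y[idx] - y[idx - 1])
    return float(t[idx - 1] + frac * (t[idx] - t[idx - 1]))


def rise_time(t: np.ndarray, current: np.ndarray) -> float:
    """Time for the current to go from 10% to 90% of its swing."""
    low, high = current.min(), current.max()  # assumed baseline and plateau
    span = high - low
    t10 = crossing_time(t, current, low + 0.1 * span)
    t90 = crossing_time(t, current, low + 0.9 * span)
    return t90 - t10


def fall_time(t: np.ndarray, current: np.ndarray) -> float:
    """Time to fall from 90% to 10%: a rising edge of the negated trace."""
    return rise_time(t, -current)


# Synthetic rising edge with a hypothetical 50 microsecond time constant.
t = np.linspace(0.0, 1e-3, 2001)
trace = 1.0 - np.exp(-t / 50e-6)
print(f"rise time: {rise_time(t, trace) * 1e6:.1f} us")
```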

Deep learning

In the deep learning process of convolutional neural networks, the input encompasses grayscale information from 784 pixels of an image. This information undergoes processing through fully connected layers, enabling the synthesis and extraction of individual pixel features. Eventually, the input pixel information is mapped into the network, forming a hidden layer. Through this deep learning process, the network can accurately classify training images and learn the patterns and features within them. The role of the fully connected layer is to interactively process various parts of the image, forming a holistic understanding of the image. This comprehensive processing equips the network with the ability to recognize complex patterns and abstract features, enabling it to learn both surface-level image characteristics and deeper layers of image structure during the training process.
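The structure described above can be sketched as a forward pass in NumPy: 784 grayscale pixels plus the brightness value enter as the input, a fully connected hidden layer mixes them, and an output layer scores the ten digit classes. The hidden-layer width, the ReLU and softmax functions, the random initial weights, and all names are assumptions made for illustration; the text specifies none of them, and training is omitted.

```python
import numpy as np

N_PIXELS = 784    # 28 x 28 grayscale MNIST image, as stated in the text
N_HIDDEN = 128    # hypothetical hidden-layer width
N_CLASSES = 10    # digits 0-9


def init_params(rng: np.random.Generator) -> dict:
    """Random (untrained) weights for an input -> hidden -> output network."""
    return {
        # One extra input row for the explicit brightness parameter.
        "w1": rng.normal(0.0, 0.05, size=(N_PIXELS + 1, N_HIDDEN)),
        "b1": np.zeros(N_HIDDEN),
        "w2": rng.normal(0.0, 0.05, size=(N_HIDDEN, N_CLASSES)),
        "b2": np.zeros(N_CLASSES),
    }


def forward(params: dict, pixels: np.ndarray, brightness: float) -> np.ndarray:
    """Return class probabilities for one image and its brightness value."""
    x = np.concatenate([pixels.ravel(), [brightness]])
    hidden = np.maximum(0.0, x @ params["w1"] + params["b1"])  # ReLU
    logits = hidden @ params["w2"] + params["b2"]
    logits = logits - logits.max()  # numerical stability for softmax
    exp = np.exp(logits)
    return exp / exp.sum()


rng = np.random.default_rng(0)
params = init_params(rng)
probs = forward(params, rng.random((28, 28)), brightness=0.2)
print(probs.shape, probs.sum())
```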

Data availability

The data supporting the findings of this study are available within the article and its supplementary files. Any additional requests for information can be directed to, and will be fulfilled by, the corresponding authors. Source data are provided with this paper.

References

Ding, H., Smith, R. G., Poleg-Polsky, A., Diamond, J. S. & Briggman, K. L. Species-specific wiring for direction selectivity in the mammalian retina. Nature 535, 105–110 (2016).
Article ADS CAS PubMed PubMed Central Google Scholar
Masland, R. H. The fundamental plan of the retina. Nat. Neurosci. 4, 877–886 (2001).
Article CAS PubMed Google Scholar
MOVSHON, J. A. & LENNIE, P. Pattern-selective adaptation in visual cortical neurones. Nature 278, 850–852 (1979).
Article ADS CAS PubMed Google Scholar
Snowden, R. J. & Hammett, S. T. Subtractive and divisive adaptation in the human visual system. Nature 355, 248–250 (1992).
Article ADS CAS PubMed Google Scholar
Story, D. F., McCulloch, M. W., Rand, M. J. & Standford-Starr, C. A. Conditions required for the inhibitory feedback loop in noradrenergic transmission. Nature 293, 62–65 (1981).
Article ADS CAS PubMed Google Scholar
Doiron, B., Chacron, M. J., Maler, L., Longtin, A. & Bastian, J. Inhibitory feedback required for network oscillatory responses to communication but not prey stimuli. Nature 421, 539–543 (2003).
Article ADS CAS PubMed Google Scholar
Pan, F. et al. Feedback inhibition of calcineurin and Ras by a dual inhibitory protein Carabin. Nature 445, 433–436 (2007).
Article ADS CAS PubMed Google Scholar
Li, X. et al. Power-efficient neural network with artificial dendrites. Nat. Nanotechnol. 15, 776–782 (2020).
Article ADS CAS PubMed Google Scholar
Kar, K., Kornblith, S. & Fedorenko, E. Interpretability of artificial neural network models in artificial intelligence versus neuroscience. Nat. Mach. Intell. 4, 1065–1067 (2022).
Article Google Scholar
Krogh, A. What are artificial neural networks? Nat. Biotechnol. 26, 195–197 (2008).
Article CAS PubMed Google Scholar
Kasai, H., Ziv, N. E., Okazaki, H., Yagishita, S. & Toyoizumi, T. Spine dynamics in the brain, mental disorders and artificial neural networks. Nat. Rev. Neurosci. 22, 407–422 (2021).
Article CAS PubMed Google Scholar
Liao, F. et al. Bioinspired in-sensor visual adaptation for accurate perception. Nat. Electron 5, 84–91 (2022).
Article Google Scholar
Ma, S. et al. An artificial neural network chip based on two-dimensional semiconductor. Sci. Bull. 67, 270–277 (2022).
Article CAS Google Scholar
Mennel, L. et al. Ultrafast machine vision with 2D material neural network image sensors. Nature 579, 62–66 (2020).
Article ADS CAS PubMed Google Scholar
Li, S. et al. Wafer-scale 2D hafnium diselenide based memristor crossbar array for energy-efficient neural network hardware. Adv. Mater. 34, 2103376 (2022).
Article CAS Google Scholar
Huh, W., Lee, D. & Lee, C.-H. Memristors based on 2D materials as an artificial synapse for neuromorphic electronics. Adv. Mater. 32, 2002092 (2020).
Article CAS Google Scholar
Moon, G. et al. Atomically thin synapse networks on Van der Waals photo-memtransistors. Adv. Mater. 35, 2203481 (2023).
Article CAS Google Scholar
Ci, W. et al. All-in-one optoelectronic neuristor based on full-vdW two-terminal ferroelectric p–n heterojunction. Adv. Funct. Mater. 34, 2305822 (2023).
Benucci, A., Saleem, A. B. & Carandini, M. Adaptation maintains population homeostasis in primary visual cortex. Nat. Neurosci. 16, 724–729 (2013).
Article CAS PubMed PubMed Central Google Scholar
Douglas, R. J., Koch, C., Mahowald, M., Martin, K. A. C. & Suarez, H. H. Recurrent excitation in neocortical circuits. Science 269, 981–985 (1995).
Article ADS CAS PubMed Google Scholar
Wang, T.-M., Holzhausen, L. C. & Kramer, R. H. Imaging an optogenetic pH sensor reveals that protons mediate lateral inhibition in the retina. Nat. Neurosci. 17, 262–268 (2014).
Article CAS PubMed PubMed Central Google Scholar
Le Masson, G., Renaud-Le Masson, S., Debay, D. & Bal, T. Feedback inhibition controls spike transfer in hybrid thalamic circuits. Nature 417, 854–858 (2002).
Article ADS PubMed Google Scholar
Kim, J.-H., Ma, D.-H., Jung, E., Choi, I. & Lee, S.-H. Gated feedforward inhibition in the frontal cortex releases goal-directed action. Nat. Neurosci. 24, 1452–1464 (2021).
Article CAS PubMed Google Scholar
Tang, H.-L. et al. Multilayer graphene–WSe ₂ heterostructures for WSe₂ transistors. ACS Nano 11, 12817–12823 (2017).
Article CAS PubMed Google Scholar
Lee, S. Y. et al. Large work function modulation of monolayer MoS₂ by ambient gases. ACS Nano 10, 6100–6107 (2016).
Article CAS PubMed Google Scholar
Doan, M.-H. et al. Charge transport in MoS₂/WSe₂ van der Waals heterostructure with tunable inversion layer. ACS Nano 11, 3832–3840 (2017).
Article CAS PubMed Google Scholar
Lei, S. et al. An atomically layered InSe avalanche photodetector. Nano Lett. 15, 3048–3055 (2015).
Article ADS CAS PubMed Google Scholar
Seo, J. et al. Ultrasensitive photodetection in MoS₂ avalanche phototransistors. Adv. Sci. 8, 2102437 (2021).
Article CAS Google Scholar
Jia, J. et al. Avalanche carrier multiplication in multilayer black phosphorus and avalanche photodetector. Small 15, 1805352 (2019).
Article Google Scholar
Kim, J. et al. Channel-length-modulated avalanche multiplication in ambipolar WSe₂ field-effect transistors. ACS Nano 16, 5376–5383 (2022).
Article CAS PubMed Google Scholar
Sangwan, V. K. et al. Intrinsic carrier multiplication in layered Bi2O2Se avalanche photodiodes with gain bandwidth product exceeding 1 GHz. Nano Res. 14, 1961–1966 (2021).
Article CAS Google Scholar
Pak, J. et al. Two-dimensional thickness-dependent avalanche breakdown phenomena in MoS ₂ field-effect transistors under high electric fields. ACS Nano 12, 7109–7116 (2018).
Article CAS PubMed Google Scholar
Son, B. et al. Efficient avalanche photodiodes with a WSe₂/MoS₂ heterostructure via two-photon absorption. Nano Lett. 22, 9516–9522 (2022).
Article ADS CAS PubMed Google Scholar
Meng, L. et al. Low-voltage and high-gain WSe₂ avalanche phototransistor with an out-of-plane WSe₂/WS₂ heterojunction. Nano Res. 16, 3422–3428 (2023).
Article ADS CAS Google Scholar
Deng, W. et al. Strain effect enhanced ultrasensitive MoS₂ nanoscroll avalanche photodetector. J. Phys. Chem. Lett. 11, 4490–4497 (2020).
Article CAS PubMed Google Scholar
Surdi, H., Thornton, T., Nemanich, R. J. & Goodnick, S. M. Space charge limited corrections to the power figure of merit for diamond. Appl. Phys. Lett. 120, 223503 (2022).
Article ADS CAS Google Scholar
Schneeweis, D. M. & Schnapf, J. L. Photovoltage of rods and cones in the macaque retina. Science 268, 1053–1056 (1995).
Article ADS CAS PubMed Google Scholar
Son, B. et al. Efficient avalanche photodiodes with a WSe₂/MoS₂ heterostructure via two-photon absorption. Nano Lett. 22, 9516–9522 (2022).
Yang, Y. et al. Plasmonic transition metal carbide electrodes for high-performance InSe photodetectors. ACS Nano 13, 8804–8810 (2019).
Article CAS PubMed Google Scholar
Miller, S. L. Avalanche breakdown in germanium. Phys. Rev. 99, 1234–1241 (1955).
Article ADS CAS Google Scholar
Jiang, J. et al. Defect engineering for modulating the trap states in 2D photoconductors. Adv. Mater. 30, 1804332 (2018).
Article Google Scholar
Wang, B. et al. Mixed‐dimensional MoS₂/Ge heterostructure junction field‐effect transistors for logic operation and photodetection. Adv. Funct. Mater. 32, 2110181 (2022).
Article CAS Google Scholar
Guo, N. et al. Light‐driven WSe₂‐ZnO junction field‐effect transistors for high‐performance photodetection. Adv. Sci. 7, 1901637 (2020).
Article CAS Google Scholar
Wang, H. et al. Junction field‐effect transistors based on PdSe₂ /MoS₂ heterostructures for photodetectors showing high responsivity and detectivity. Adv. Funct. Mater. 31, 2106105 (2021).
Article CAS Google Scholar

Download references

Acknowledgements

We acknowledge financial support from the Guangdong Basic and Applied Basic Research Foundation (No. 2024A1515030107), the National Natural Science Foundation of China (Nos. 11904108 and 62004071), the China Postdoctoral Science Foundation (No. 2020M672680), the “Pearl River Talent Recruitment Program” (No. 2019ZT08X639), and the Scientific Research Start-up Foundation for PhD of Chaohu University (No. KYQD-2023012). We acknowledge BioRender.com for helping us to create Figs. 1a, 1b, 4h.

Author information

Authors and Affiliations

School of Semiconductor Science and Technology, South China Normal University, Foshan, 528225, P.R. China
Ling Li, Wenhai Wang, Jielian Zhang, Yiming Sun, Qunrui Deng, Tao Zheng, Wei Gao, Mengmeng Yang, Hanyu Wang, Yuan Pan, Xueting Liu, Yani Yang & Nengjie Huo
School of Electronic Engineering, Chaohu University, Hefei, 238000, China
Shasha Li
National Key Laboratory of Science and Technology on Reliability Physics and Application of Electronic Component, China Electronic Product Reliability and Environmental Testing Research Institute, Guangzhou, 510610, China
Jianting Lu
College of Optical Science and Engineering, Zhejiang University, Hangzhou, 310027, China
Jingbo Li
Guangdong Provincial Key Laboratory of Chip and Integration Technology, Guangzhou, 510631, P.R. China
Jingbo Li & Nengjie Huo

Contributions

L.L. and S.L. contributed equally to this work. L.L. and N.H. conceived the idea and supervised the work. W.W., Y.S. and M.Y. designed the experiments. J.Z., Q.D., T.Z., J.L. and W.G. helped with mechanism analysis and discussion. H.W. and Y.P. analyzed the physical model. X.L. performed device fabrication and characterization. Y.Y. supported the characterization of materials. L.L., J.L. and N.H. co-wrote the manuscript, and all authors contributed to the revision of the manuscript.

Corresponding author

Correspondence to Nengjie Huo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Li, L., Li, S., Wang, W. et al. Adaptative machine vision with microsecond-level accurate perception beyond human retina. Nat Commun 15, 6261 (2024). https://doi.org/10.1038/s41467-024-50488-6

Received: 10 January 2024
Accepted: 12 July 2024
Published: 24 July 2024
Version of record: 24 July 2024
DOI: https://doi.org/10.1038/s41467-024-50488-6
