Introduction

Conventional computing, based on Von Neumann’s architecture, has been the cornerstone of computing systems for several decades. This traditional computing paradigm, combined with complementary metal-oxide semiconductor (CMOS) technology, has been paramount in laying the foundation of high-performance computing and driving technological innovation across various fields. However, traditional computers encounter difficulties in processing massive amounts of data efficiently because of the physical separation between the central processing unit (CPU) and the memory unit. The constant movement of data between memory and processing units is a core issue that results in considerable latency and energy consumption, a problem commonly referred to as the Von Neumann bottleneck. In recent years, the explosive growth of data, driven by advancements in machine learning applications, artificial intelligence (AI), and various sources such as online platforms, scientific research, and the Internet of Things, has posed significant challenges for conventional computing architectures1. The further advancement of conventional computing is also hindered by the slowing down of Moore’s Law2,3, which predicted the doubling of transistor counts on integrated circuits every 2 years4 and has been the driving force behind the exponential growth in computing. However, as transistor size approaches physical limits, sustaining the exponential growth in computing capabilities becomes increasingly challenging5,6. As a result, there is a pressing need for novel computing approaches to address the inefficiencies of conventional systems and meet the growing demands of data-intensive applications while mitigating energy consumption6,7. This is where brain-inspired neuromorphic computing has gained prominence to overcome the limitations of conventional computing8,9. 
The brain’s inherent features, such as massive parallelism, in-memory computing, high integration density, and synaptic plasticity, enable it to process vast amounts of information efficiently, exhibiting remarkable abilities such as learning, recognition, decision-making, and adaptability, all while minimizing energy consumption10.

Over the years, artificial neural networks (ANNs) have been developed to overcome the Von Neumann bottleneck inherent in conventional computing architectures, leading to significant advancements in the field11,12. However, challenges remain in reducing computation costs and achieving brain-like efficiency because ANN algorithms and the associated software are executed on conventional computers. One of the main issues is the considerable power consumption of these computers when running AI applications, which far exceeds that of the human brain. For example, the AlphaGo system, which consists of 1200 CPUs and 180 graphics processing units (GPUs), consumes several hundred thousand Watts13, whereas the human brain, with ~10¹¹ neurons and 10¹⁵ synapses14, consumes only 20 Watts of power, with each stimulus requiring only 1–100 femtojoules per operation15. Synaptic plasticity, the ability of synapses to adjust their strength based on neuronal spiking activity14, is fundamental to the brain’s computational capabilities and includes long-term plasticity, short-term plasticity, and spike-timing-dependent plasticity (STDP). In the neural system, short-term plasticity is responsible for computational functions associated with spatiotemporal information16, while long-term plasticity establishes the fundamental framework for memory and learning17. STDP, which follows the principles of Hebbian theory, represents the learning rule and includes various forms such as asymmetric STDP, symmetric STDP, and anti-STDP18,19,20. Emulating synaptic plasticity at the device level promises energy-efficient, higher-performance, and scalable neuromorphic computing. Different device architectures, including CMOS transistors, memristors, gate-tunable transistors, and ferroelectric transistors, have been utilized to emulate the functionality of biological synapses through electronic and optoelectronic devices.

Among these, memristors21,22,23,24, initially conceptualized by Leon Chua25 and later experimentally observed at HP Labs, have immense potential for emulating synaptic plasticity because their conductance can be modulated by their programming history15,26. These two-terminal electronic devices exhibit similar functionality to synapses and offer advantages such as excellent scalability, simple geometry, low fabrication cost, non-volatility, analog switching, low power consumption, fast switching speed, CMOS compatibility, and high integration density. Furthermore, they can be seamlessly integrated into large-scale crossbar arrays, which are crucial for the vector-matrix multiplication at the core of neuromorphic computing. Memristors based on a large variety of materials, including binary oxides (TiOX27,28,29, SiOX30,31, NiOX32,33, etc.), perovskites (MAPbI334,35, (C4H9NH3)2PbBr436, MAPbBr337, etc.), organic materials (organic small molecules SU-838, monochloro copper phthalocyanine (ClCuPc)39, fluoropolymer40, etc.), and 2D materials41,42,43,44 (graphene, hBN, 2D transition metal dichalcogenides (TMDs), etc.), have been extensively explored. Among all these, 2D TMDs emerge as exceptionally desirable materials for a broad spectrum of electronic and optoelectronic devices, particularly cutting-edge memristive and synaptic devices tailored for neuromorphic computing. Due to their nearly atomic thickness and reduced screening effects, 2D TMDs offer tunable physical properties through diverse techniques, including electrostatic doping45, defect-engineering46, chemical intercalation47, and strain-engineering48. Various mechanisms, including interface electrostatic modulation, charge transfer, phase change, energy band shifts enabled by gate tunability, and defect migration, facilitate the realization of memristive behavior and synaptic plasticity in these materials. 
2D TMDs-based memristors have successfully emulated crucial synapse functions, including long-term potentiation (LTP)32,49,50,51, long-term depression (LTD)49,50,51, and STDP49,51,52,53,54. However, existing challenges persist, including non-linearity and asymmetry in synaptic weight updates, limited dynamic range, high programming current, and variability from cycle to cycle and device to device. To achieve brain-like efficiency, it is imperative to address these challenges. An ideal synaptic device should possess attributes such as a wide dynamic range, linear and symmetric synaptic weight updates, low programming current, and minimal variability both within cycles and across devices. These traits are pivotal for ensuring accurate and efficient neural network operations closely mimicking the human brain’s functioning. Addressing these hurdles through physics-driven device engineering could pave the way for improved synaptic functionality.

Here, we studied the influence of on-chip Argon (Ar) plasma treatment on the memristive behavior and synaptic plasticity of chemical vapor deposition (CVD)-grown polycrystalline MoS2 monolayer devices. Plasma treatment has emerged as a versatile method for modifying 2D materials on a large scale, providing a time-efficient and cost-effective approach55,56,57. Ar plasma treatment has been utilized to achieve controlled atomic layer thinning and desulfurization of MoS2 flakes without introducing external atoms or chemical doping58,59,60. It has also been used previously to improve performance metrics, such as the dynamic range and linearity of synaptic weight updates, of exfoliated few-layer MoS2-based memristors61. Here, we show that optimal-power Ar plasma can significantly enhance the performance metrics of CVD-grown polycrystalline monolayer MoS2-based memristors. Our work first reveals that memristors fabricated from as-grown polycrystalline MoS2 monolayers exhibit limitations, including a poor switching ratio (less than 10²), non-linearity and asymmetry in synaptic weight updates, substantial cycle-to-cycle variability of the switching ratio, and poor STDP behavior. Following optimal-power plasma treatment, which leads to a sulfur vacancy density of ~3.4 × 10¹⁴ cm−2 in as-grown MoS2, we observed significant improvements in memristor performance. These improvements included a significantly expanded switching ratio (10²–10⁴), more linear and symmetric synaptic weight updates, a reduction of the non-linear factor (NLF) from 21.79 to 2.81, and enhanced STDP behavior. Furthermore, a three-layer ANN simulation employing plasma-treated memristors shows a learning accuracy of ~97% for recognizing hand-written MNIST digits. 
Our approach to enhancing resistive switching and synaptic functionality in 2D semiconductor devices opens up exciting possibilities for further research and development in the realm of synaptic devices.

Results and discussion

For the present study, polycrystalline monolayers of MoS2 were synthesized on a SiO2/Si substrate using the CVD technique (see Methods for details). The synthesized materials were characterized using Raman spectroscopy, photoluminescence (PL), X-ray photoelectron spectroscopy (XPS), atomic resolution high-angle annular dark-field scanning transmission electron microscopy (HAADF-STEM), and optical microscopy. Optical micrograph and Raman spectroscopy of the as-grown polycrystalline MoS2 monolayer are presented in Fig. 1a, b, respectively. The Raman spectrum showed two prominent peaks: the in-plane vibration mode \(\mathrm{E}^{1}_{2\mathrm{g}}\) at ~386 cm−1 and the out-of-plane mode A1g at ~404 cm−1. In addition, a third peak at ~378 cm−1 was also observed, attributed to a disorder-induced peak62. The difference between the A1g and \(\mathrm{E}^{1}_{2\mathrm{g}}\) peak positions was found to be ~18.5 cm−1, confirming the monolayer nature of the synthesized MoS263. Figure 1c displays the PL with a peak observed at ~1.82 eV attributed to A excitons64. Figure 1d depicts the HAADF-STEM image of MoS2 monolayer. The image reveals that the Mo and S atoms are arranged in a hexagonal pattern, indicating the presence of a hexagonal crystal structure within the MoS2 monolayer. Figure 1e presents the fast Fourier transform (FFT) analysis of the HAADF-STEM image. The FFT image demonstrates the hexagonal symmetry of the MoS2 crystal, confirming the arrangement observed in Fig. 1d. Figure 1f provides additional information by indicating an interplanar distance of 0.27 nm, corresponding to the (100) planes. Figure 1g represents the Inverse Fast Fourier Transform (IFFT) image of Fig. 1d. Figure 1h is the zoomed image of the selected area in Fig. 1g, and the inset shows the atomic model of MoS2. Figure 1i shows the intensity profile along the line in Fig. 1h. Additionally, XPS analysis of CVD-grown MoS2 is detailed in Supplementary Fig. S1. 
We have confirmed the thickness of as-grown MoS2 using atomic force microscopy (AFM). Supplementary Fig. S2a presents the AFM image of a CVD-grown MoS2 flake, while Supplementary Fig. S2b displays the corresponding height profile, revealing the thickness of the as-grown MoS2 to be 0.65 nm and confirming its monolayer nature. The memristors were fabricated in a field-effect geometry using the polycrystalline MoS2 monolayer as the semiconducting channel material. Source/drain electrodes were patterned using photolithography, followed by Cr/Au electrode deposition through thermal vapor deposition and a subsequent lift-off process. In addition, the heavily doped Si substrate was utilized as the back gate electrode. It should be noted that we have fabricated devices on monolayer MoS2 flakes shaped like four-point and six-point stars. Figure 1a and Supplementary Fig. S3 illustrate the various shapes of our CVD-grown MoS2 flakes, including six-point, five-point, four-point, and three-point stars. It is well known that six-point, five-point, and four-point stars have grain boundaries (GBs)45,65,66. Zheng et al. employed second harmonic generation microscopy to examine these GBs in CVD-grown MoS2 flakes and determined that only the three-point star MoS2 flakes are single crystals, while the other shapes are inherently polycrystalline67. Figure 2a shows the device schematic, with a channel length of 5 μm and a width of 25 μm, and Supplementary Fig. S3b shows the optical image of the fabricated device. Figure 2b presents the output characteristics, i.e., the drain current (ID) versus drain voltage (VDS) for different back gate voltages (VGS). The good linear relationship between ID and VDS indicates that the field effect transistor (FET) device possesses ohmic contacts. Figure 2c displays the transfer characteristic, i.e., the drain current (ID) versus gate voltage (VGS) on linear and logarithmic scales at VDS = 0.5 V. 
It shows that the fabricated device is n-type, and the field-effect mobility is found to be  ~16 cm2V−1s−1. Figure 2d exhibits the hysteresis observed in the transfer characteristics, i.e., ID versus VG at VDS = 0.5 V. Several potential causes for the observed hysteresis in transfer characteristics have been identified, including electron trapping and de-trapping by adsorbed molecules on the MoS2 surface68, charge trapping at the SiO2/MoS2 interface69, and intrinsic defects within the MoS2 material70.
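As a rough illustration of how a field-effect mobility of this order relates to the device geometry quoted in the text (channel length 5 μm, width 25 μm, 285 nm SiO2, VDS = 0.5 V), the following Python sketch uses the common linear-regime expression μ = (L/W) · gm / (Cox · VDS), where gm = dID/dVG. The text does not state which extraction formula was used, so this expression, the SiO2 relative permittivity of 3.9, and the transconductance value `gm_example` are assumptions made only for illustration.

```python
# Sketch only: linear-regime field-effect mobility of a back-gated FET.
# The formula, eps_r = 3.9 and gm_example are assumptions, not values from the paper.

EPS0 = 8.854e-12  # vacuum permittivity in F/m


def oxide_capacitance(t_ox_m, eps_r=3.9):
    """Parallel-plate gate capacitance per unit area in F/m^2."""
    return EPS0 * eps_r / t_ox_m


def field_effect_mobility(gm_s, length_m, width_m, c_ox, v_ds):
    """Mobility in m^2/(V s) from the transconductance gm = dI_D/dV_G (in S)."""
    return (length_m / width_m) * gm_s / (c_ox * v_ds)


c_ox = oxide_capacitance(285e-9)   # 285 nm SiO2 back-gate oxide
gm_example = 4.85e-7               # hypothetical transconductance in S
mu = field_effect_mobility(gm_example, 5e-6, 25e-6, c_ox, 0.5)
mu_cm2 = mu * 1e4                  # convert m^2/(V s) to cm^2/(V s)
print(f"C_ox = {c_ox:.3e} F/m^2, mobility = {mu_cm2:.1f} cm^2/(V s)")
```

With these assumed inputs, a transconductance of roughly half a microsiemens corresponds to a mobility close to the ~16 cm2V−1s−1 reported above; a real extraction would take gm from the slope of the measured transfer curve.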

Fig. 1: Structural characterization of CVD-Grown MoS2.

a Optical image of CVD-grown polycrystalline MoS2 (scale bar 20 μm). b Raman and (c) Photoluminescence (PL) of CVD-grown MoS2. d HAADF-STEM image and (e) corresponding FFT pattern. f Interplanar spacing corresponding to 100 planes is shown. g IFFT of (d). h Zoomed image of selected area marked in (g). Inset shows the atomic model of MoS2 (Mo: yellow, S: green). i Intensity profile along the line in (h).

Fig. 2: Electrical characteristics of CVD-grown monolayer MoS2.

a Schematic of a MoS2 memristor using a 285 nm SiO2/Si substrate (b) Output characteristics (ID versus VD) of the device, VG changes from 0 V to 60 V in steps of 10 V. c Transfer characteristics (ID versus VG) of the device left in logarithmic scale and right in linear scale. d Hysteresis in transfer characteristics, ID versus VG at VD = 0.5 V, black arrows show the sweeping direction.

We first studied the switching behavior of pristine devices, as depicted in Fig. 3. Figure 3a shows the current-voltage (I-V) characteristics at zero gate voltage. Initially, the device starts in a high resistance state (HRS) when the drain voltage (VD) is swept from 0 to 10 V (sweep 1). As the voltage sweep progresses, the device gradually transitions to a low resistance state (LRS) and maintains this state during the sweep from 10 V back to 0 V (sweep 2). Subsequently, during the sweep from 0 V to −10 V (sweep 3), the device resets to the HRS. Finally, the device remains in the HRS during the sweep from −10 V to 0 V (sweep 4). The corresponding logarithmic graph is given in Supplementary Fig. S4. The observed behavior exhibits a non-linear pinched hysteresis loop in the I-V characteristics, a distinctive feature of memristive behavior25,26. As shown, the hysteresis loops exhibit a counterclockwise direction in both the right and left branches. It should be noted that the MoS2 memristor does not require an electroforming process. The underlying mechanisms of the resistive switching phenomenon in lateral TMDs-based memristors are still debated, with different types of switching mechanisms proposed based on experimental results. One mechanism involves the dynamics of charging and discharging of trap states, which can originate from defects in the channel material, gate oxide, or semiconductor-electrode interfaces. These states can capture and release electrons or holes based on the applied voltage and therefore modify the electrostatic potential, free charge carrier density, and contact barriers71,72,73. Lee et al. reported a dual-gated MoS2 memtransistor using Al2O3 as the top gate and SiO2 as the back gate, attributing the memristive loop to traps filling and emptying at the MoS2/Al2O3 interface74. Yang et al. 
observed memristive behavior in monocrystalline MoS2 FETs using HfO2 as a gate dielectric, attributing this behavior to charge trapping and de-trapping at the MoS2/HfO2 interface75. When the device length is close to or shorter than 1 μm, a high drain bias increases the electric field strength, providing sufficient energy for injected carriers to overcome the channel bandgap and become trapped in the mid-gap states of the Al2O3 or HfO2 dielectric. In both cases, the resistive switching effects were associated with transport behaviors primarily controlled by space-charge-limited current and the trap-filled limit (TFL)74,75. Another mechanism involves charged point defects that are mobile enough to move when an external electric field is applied. Experiments have demonstrated that sulfur vacancies accumulate and migrate along GBs in the conducting channels of lateral monolayer MoS2 devices, leading to changes in channel resistance and causing hysteresis in the ID-VD curve65. Electrostatic force microscopy and cryogenic transport measurements revealed dynamic variations in the Schottky barrier height (SBH) in polycrystalline CVD-grown MoS2 monolayer-based memtransistors45. Based on these experimental results, it was hypothesized that the hysteresis in the ID-VD curve originates from local redistribution of defects under an external field, facilitated by GBs, which varies the dopant density near the contact edge45. These changes lead to dynamic variation of the SBH through image-charge lowering, resulting in a pinched hysteresis loop in the ID-VD curve45. Li et al. performed a systematic investigation into the switching characteristics of lateral memristors based on mechanically printed few-layer MoS2. Their study identified two distinct DC-programmed switching modes: rectification-mediated and conductance-mediated. These modes were associated with variations in MoS2/Ti Schottky barriers and the redistribution of ionic vacancies within the MoS2 channels. 
Their findings were supported by results from Kelvin probe force microscopy, Auger electron spectroscopy, and electronic characterization76. Spetzler et al. introduced a semi-classical charge transport model to study the role of defect dynamics in the switching process and validated it with experimental data from lateral memristive devices based on exfoliated few-layer MoS276. According to this model, the hysteresis in I-V curves is attributed to the dynamics of mobile charged vacancies, which lead to the formation and annihilation of a vacancy depletion region. Additionally, it has been demonstrated that the I-V curve is significantly affected by Schottky barrier lowering, although this lowering does not have a notable impact on the hysteresis area. This model effectively explains the I-V curve and pulse behavior, including distinct features such as different hysteresis directions, hysteresis crossing, and asymmetry observed experimentally77. Regardless of the cause of hysteresis in MoS2 memtransistors, this article focuses on engineering it to enhance memristive and synaptic functionality. Figure 3b shows the endurance characteristic of the MoS2 memristor, which was switched 180 times between HRS and LRS using full-sweep cycles. Notably, around the 70th cycle, we observed a sudden drop in the drain current (ID). This phenomenon might be attributed to the release of oxide-related traps activated by the high electric field at the source electrode45. Previous studies have linked such dip and rise patterns in endurance curves to the combined effects of gas or water molecule adsorption/desorption and charge trapping and de-trapping at the semiconductor-dielectric interface78. Considerable cycle-to-cycle variability is evident in both the HRS and LRS, and the switching ratio remains below 10² at VD = 0.5 V. 
During the online training of neural networks, the use of devices exhibiting cycle-to-cycle variation leads to escalated training costs and significant accuracy deterioration, because training requires a large number of writing and erasing operations, often exceeding millions of repetitions79. Figure 3c shows the LTP-LTD characteristics of the device, which imitate the excitatory and inhibitory behavior of biological synapses. For a positive pulse (set pulse), the post-synaptic current (ID) increases, similar to the facilitation of synaptic strength in biological synapses, and for a negative pulse (reset pulse), the current decreases, similar to the depression of synaptic strength in biological synapses. Here, we applied set pulses of +20 V amplitude with an on-off time of 1 ms and reset pulses of −20 V amplitude with an on-off time of 1 ms, and measured the current after the pulses at a DC bias VD (read voltage) of 0.1 V. Our pristine MoS2 synaptic devices exhibited non-linearity and asymmetry in weight updates. The processes of weight increase (LTP) and weight decrease (LTD) progress differently, causing asymmetry. Initially, conductance undergoes gradual changes, but saturation takes place as the number of applied pulses increases, imposing a limit on the dynamic range and the number of attainable synaptic weight states. Dynamic range refers to the ratio of the highest to the lowest conductance in LTP-LTD characteristics. Synaptic devices with limited dynamic ranges often suffer from diminished learning accuracy in neural network applications80,81. Most neuro-inspired algorithms utilize analog synaptic weight updates to learn patterns and extract features. A higher number of multilevel states (e.g., exceeding hundreds of levels) contributes to improved learning capability and enhances network robustness82. Linearity in weight updates signifies the linear alteration of conductance with the number of applied pulses83. 
The non-linearity of potentiation (or depression) makes it challenging to fine-tune the conductance to the target value, resulting in poor convergence rates during training and diminished learning accuracy80,84. A significant aspect of using memristors as artificial synapses in the latest AI technology, such as spiking neural networks, is their ability to mimic STDP, which represents a temporally asymmetric manifestation of Hebbian learning, triggered by the timing correlation between spikes from pre-synaptic and post-synaptic neurons. If the pre-synaptic spike occurs before the post-synaptic spike, the synaptic weight (or strength) increases (potentiation). Conversely, if the pre-synaptic spike occurs after the post-synaptic spike, the synaptic weight decreases (depression). The effect of correlated spiking on synaptic strength diminishes rapidly as the time interval between the spikes increases. Here, we mimicked indirect STDP (Fig. 3d) in our device by applying a pair of electrical spikes separated by a time interval Δt45,85. Specifically, a positive pair of pulses induces a positive change (LTP), while a negative pair induces a negative change (LTD) in synaptic weight (conductance). We have quantified these changes by plotting the percentage change in synaptic weight, denoted as ΔW%, as a function of the time difference Δt. ΔW% is calculated as follows:

$$\Delta W\% = \frac{W_{\mathrm{final}} - W_{\mathrm{initial}}}{W_{\mathrm{initial}}} \times 100$$
(1)
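
Eq. (1) can be sketched in a few lines of Python, taking the initial and final weights as the channel conductance before and after the pulse pair. The conductance values in the example are hypothetical and chosen only to show the arithmetic; they are not measured data.

```python
def weight_change_percent(w_initial, w_final):
    """Percentage change in synaptic weight, following Eq. (1)."""
    if w_initial == 0:
        raise ValueError("initial conductance must be non-zero")
    return (w_final - w_initial) / w_initial * 100.0


# Hypothetical conductances in siemens, for illustration only.
g_before = 1.0e-9
g_after = 1.3e-9
delta_w = weight_change_percent(g_before, g_after)
print(f"Delta W% = {delta_w:.1f}")
```

A conductance that rises from 1.0 nS to 1.3 nS gives a change of about +30%, and a conductance that falls gives a negative value, matching the sign convention used for potentiation and depression.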
Fig. 3: Synaptic functionality of pristine MoS2 memristor.

Switching characteristics and emulation of synaptic function in pristine MoS2 memristor. a ID versus VD characteristics of MoS2 memristor at VG = 0 V and scan rate 0.5 Vs−1. b Endurance of ILRS and IHRS state (top) and ILRS/IHRS (bottom) at VD = 0.5 V in the HRS (sweep 1) and the LRS (sweep 2) for 180 sweep cycles. c LTP and LTD characteristics, post-synaptic current (ID) versus pulse number. Here, pulses of 20 V amplitude with an on-off time of 1 ms for potentiation and pulses of −20 V amplitude with an on-off time of 1 ms for depression have been applied, then the source-drain current (ID) was measured at read voltage (VD) = 0.1 V. d The measured change in synaptic weight as a function of the time interval (Δt) between paired pulses of 20 V and −20 V. The on time of both pulses was 1 ms, and the current was measured at each time interval after the pulse pair by applying a source-drain bias. The solid lines are exponential fits. VG is set to 0 V for all the measurements.

Where Winitial and Wfinal are the channel conductance before applying pulses and after applying pulses, respectively. To determine the time constants for potentiation and depression, we fitted the STDP data from Fig. 3d (represented by black solid lines) using the following mathematical model86.

$$\Delta W \propto \begin{cases} \exp\left(-\frac{\Delta t}{\tau^{+}}\right), & \text{if } \Delta t > 0 \\ -\exp\left(\frac{\Delta t}{\tau^{-}}\right), & \text{if } \Delta t < 0 \end{cases}$$
(2)
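
The piecewise model of Eq. (2) can be sketched in Python as follows. Eq. (2) is only a proportionality, so the amplitudes `a_plus` and `a_minus` are introduced here as assumed scale factors. The time constants in the example are the 17 ms and 20 ms reported in the text for the pristine device, and the amplitudes of 30 and 100 are illustrative placeholders rather than fitted parameters.

```python
import math


def stdp_weight_change(dt_ms, a_plus, a_minus, tau_plus_ms, tau_minus_ms):
    """Weight change for spike interval dt_ms, following the form of Eq. (2).

    a_plus and a_minus are assumed amplitudes; Eq. (2) only fixes the shape.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_plus_ms)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_minus_ms)
    return 0.0  # Eq. (2) leaves dt = 0 undefined; zero is a placeholder choice


# Time constants from the text; amplitudes are illustrative assumptions.
for dt in (-40.0, -20.0, -5.0, 5.0, 17.0, 40.0):
    dw = stdp_weight_change(dt, 30.0, 100.0, 17.0, 20.0)
    print(f"dt = {dt:6.1f} ms -> dW = {dw:8.2f} %")
```

At an interval equal to the time constant, the modeled weight change has decayed to 1/e of its amplitude, which is what makes the time constant a convenient measure of the STDP window.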

Here, τ+ and τ− are the characteristic times between spikes for which there is a significant change in synaptic weight. We found these values to be 17 ms and 20 ms for potentiation and depression, respectively. Positive pulse pairs show a ≈30% change in synaptic weight, while negative pulse pairs show a change of around 100%. Typically, in the emulation of brain-like STDP, the synaptic weight change is more pronounced when the time interval between spikes is shorter. As the interval increases, this change should eventually saturate. However, in pristine MoS2, the weight change for positive spike pairs does not adhere to this pattern. Further, to address the limitations observed in the pristine memristor, such as low switching ratio, nonlinearity, and asymmetry of synaptic weight updates, we explored a promising avenue for enhancement: on-chip optimal-power Ar plasma treatment. Determining the power of the Ar plasma is crucial for creating sulfur vacancies in the MoS2. We observed that low-power Ar plasma does not generate sulfur vacancies, while high-power plasma can potentially damage the sample. More details are given in the Supplementary information. S atoms and Ar+ ions have comparable masses, and Ar+ ions in the moderate-power plasma generally possess enough energy to remove sulfur atoms but not enough to knock off Mo atoms59. Consequently, this process selectively sputters S atoms and creates sulfur vacancies, leaving Mo atoms unaffected. Figure 4a illustrates the schematic of the Ar plasma treatment process. Raman, PL, and XPS spectroscopy were performed to examine the effect of Ar plasma at different power levels. Following low-power Ar plasma treatment, no obvious change was observed in Raman (Supplementary Fig. S5a), PL (Supplementary Fig. S5b), and XPS spectra (Supplementary Fig. S6), even after 60 s of plasma treatment. These observations confirm that low-power plasma does not induce sulfur vacancies in the channel. 
However, significant changes were observed in Raman, PL, and XPS spectroscopy after optimal-power plasma treatment. Figure 4b, c show the XPS spectra of the Mo 3d core level before and after optimal-power plasma treatment, respectively. The dual peaks arising from Mo4+ 3d3/2 and 3d5/2 were each deconvoluted into two peaks to improve the fit, employing Gaussian-Lorentzian (GL) mixing and Shirley backgrounds. Before plasma treatment (Fig. 4b), the higher binding energy peaks at 229.9 and 232.8 eV signify stoichiometric intrinsic MoS2 (i-MoS2), and the lower binding energy peaks at 229.2 and 232.3 eV correspond to defective or sub-stoichiometric MoS2 (d-MoS2) featuring sulfur vacancies87. Additionally, the third peak, situated below the Mo4+ 3d3/2 doublet at ~233 eV, and the broader peak at ~236.0 eV represent the Mo6+ 3d5/2 and 3d3/2 doublets of MoO3 or sub-oxides of MoOx, respectively. After optimal-power plasma treatment, observable modifications are apparent in the XPS spectra (Fig. 4c). The doublet peaks of Mo4+ 3d5/2 and 3d3/2 shift towards lower binding energies (Fig. 4c and Supplementary Fig. S7). The observed shift to lower binding energies in the Mo 3d peaks signifies a change in the electrostatic environment surrounding the Mo atoms. This change is a consequence of the reduction of neighboring S atoms, to which the Mo atoms were initially bonded88. This behavior aligns with previous findings where sulfur was selectively sputtered from MoS259,88. Also, there is a reduction in the contribution of i-MoS2, accompanied by an increase in the contribution of defective components (d-MoS2), confirming the introduction of more sulfur vacancies than in the pristine sample87. Additionally, the diminished integrated area of the S 2s peak relative to the Mo 3d peaks suggests a decrease in the overall sulfur content compared to the pristine sample. 
Figure 4d, e represent the XPS spectra of S 2s core level before and after optimal-power plasma treatment, respectively. Moreover, the expanded FWHM of all peaks after plasma treatment indicates the introduction of disorder in the system87. We also found that the contribution of the MoO3/MoOx peak becomes larger in the plasma-treated sample, reflecting the oxidation of MoS2 flakes87, which is inevitable during the microfabrication process. However, oxygen atoms cannot migrate into the MoS2 channel under an external field60. Figure 5a, b display the PL spectra before and after optimal-power plasma treatment, respectively. Both the A-exciton and B-exciton peaks were fitted with a Voigt function. The ratio of the A and B-emission intensities can qualitatively indicate non-radiative recombination; a lower B/A ratio reflects fewer defects and better sample quality64. After optimal-power plasma treatment, the B/A ratio increases compared to the pristine sample, suggesting the introduction of defects in the plasma-treated sample. Additionally, Raman spectra following optimal-power plasma treatment reveal a red shift in the \(\mathrm{E}^{1}_{2\mathrm{g}}\) peak and a smaller blue shift in the A1g peak, along with a broader full width at half maximum (FWHM) for both peaks (Fig. 5c), indicating the presence of sulfur vacancies60,89. Also, Supplementary Fig. S8 shows the fitted Raman spectra after optimal-power plasma treatment, showing two defect-induced peaks at ~377 cm−1 and ~411 cm−162. Supplementary Figs. S9 and S10 present the optical and HAADF-STEM images of MoS2 following optimal-power plasma treatment. After high-power plasma treatment, no Raman and PL signals were observed, as depicted in Supplementary Fig. S11a, b. The optical image (Supplementary Fig. S12) after the high-power plasma treatment reveals a noticeable etching of MoS2.
Supplementary Fig. S13 compares the transfer characteristics before and after optimal-power plasma treatment, revealing that optimal-power plasma does not damage the sample. Therefore, optimal-power Ar plasma was utilized to introduce sulfur vacancies deliberately. The defect concentration in both pristine and optimal-power plasma-treated MoS2 is obtained by performing stoichiometry calculations using the core level spectra of S 2s and Mo 3d. The stoichiometry of MoS2 is determined from the following calculations90.

$$\frac{\mathrm{S}\,(\text{at.}\%)}{\mathrm{Mo}\,(\text{at.}\%)} = \frac{\left(\frac{I_{\mathrm{S}2s}}{\sigma_{\mathrm{S}2s}}\right)}{\left(\frac{I_{\mathrm{Mo}3d5/2}}{\sigma_{\mathrm{Mo}3d5/2}}\right)}$$
(3)

Here, S (at.%) and Mo (at.%) represent the atomic per cent (at.%) of sulfur (S) and molybdenum (Mo), respectively. IS2s and IMo3d5/2 denote the integrated intensities of the S 2s peak and the Mo 3d5/2 peak, respectively. Additionally, \(\sigma_{\mathrm{S}2s}\) and \(\sigma_{\mathrm{Mo}3d5/2}\) represent the photoionization cross-sections at a photon energy of 1.5 keV, modeled by Scofield, where \(\sigma_{\mathrm{S}2s}=1.9066\) and \(\sigma_{\mathrm{Mo}3d5/2}=7.4630\)91. In the case of pristine MoS2, the calculated stoichiometry is MoS1.98, corresponding to a defect concentration of 1%. In the ideal superstructure of 1H MoS2, the sulfur-sulfur distance is 3.162 Å92, leading to a density of sulfur atoms in monolayer MoS2 of ~2.3 × 10¹⁵ cm−2. With a defect concentration of 1%, the defect density is estimated to be ~2.3 × 10¹³ cm−2, which aligns with values reported for pristine samples93. For the optimal-power plasma-treated MoS2, the stoichiometry is MoS1.7 with a defect concentration of ~15%, corresponding to a sulfur vacancy density of 3.4 × 10¹⁴ cm−2.
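
The arithmetic behind these numbers can be checked with a short Python sketch. It evaluates Eq. (3) and converts a stoichiometry MoSx into a sulfur vacancy density. The geometric step is an assumption stated here because the text does not spell it out: the sulfur areal density is taken as two S atoms per hexagonal unit cell whose lattice constant equals the quoted S–S distance of 3.162 Å. The intensities passed to `s_to_mo_ratio` are hypothetical.

```python
import math

SIGMA_S2S = 1.9066       # Scofield cross-section for S 2s, as quoted in the text
SIGMA_MO3D52 = 7.4630    # Scofield cross-section for Mo 3d5/2, as quoted in the text
A_LATTICE_CM = 3.162e-8  # S-S distance (3.162 Angstrom) expressed in cm


def s_to_mo_ratio(i_s2s, i_mo3d52):
    """S/Mo atomic ratio from integrated XPS intensities, following Eq. (3)."""
    return (i_s2s / SIGMA_S2S) / (i_mo3d52 / SIGMA_MO3D52)


def sulfur_areal_density(a_cm=A_LATTICE_CM):
    """Assumed geometry: 2 S atoms per hexagonal cell of area (sqrt(3)/2) a^2."""
    cell_area = math.sqrt(3) / 2 * a_cm ** 2
    return 2 / cell_area


def vacancy_fraction(x):
    """Fraction of missing sulfur for a stoichiometry MoS_x (ideal x = 2)."""
    return (2 - x) / 2


def vacancy_density(x):
    """Sulfur vacancy density in cm^-2 for MoS_x."""
    return vacancy_fraction(x) * sulfur_areal_density()


n_s = sulfur_areal_density()
n_pristine = vacancy_density(1.98)
n_treated = vacancy_density(1.7)
print(f"S density: {n_s:.2e} cm^-2")
print(f"MoS1.98:   {n_pristine:.2e} cm^-2")
print(f"MoS1.7:    {n_treated:.2e} cm^-2")
```

Under these assumptions the sketch gives a sulfur density of about 2.3 × 10¹⁵ cm−2, about 2.3 × 10¹³ cm−2 for MoS1.98, and about 3.5 × 10¹⁴ cm−2 for MoS1.7, in line with the values quoted above up to rounding.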

Fig. 4: Defect-engineering of CVD-grown MoS2.

a A visual representation of Ar plasma treatment. b XPS of Mo 3d core level of pristine MoS2. c XPS of Mo 3d core level of optimal-power plasma-treated MoS2. d XPS of S 2p core level of pristine MoS2. e XPS of S 2p core level of optimal-power plasma-treated MoS2.

Fig. 5: Optical and electrical characterization after plasma treatment.

a PL spectra of pristine MoS2. b PL spectra of optimal-power plasma-treated MoS2. c Raman spectra of MoS2 before and after optimal-power plasma treatment. d ID-VD loop comparison before and after optimal-power plasma treatment at VG = 0 V and a scan rate of 0.5 V s−1.

Next, we explored the switching characteristics and synaptic plasticity of the device after optimal-power plasma treatment. We compared the memristive loops of the pristine device and the plasma-treated device under identical conditions, including the VD range and sweep rate. Importantly, the comparison of switching characteristics and synaptic functionality was conducted on the same device before and after plasma treatment. A significant opening of the hysteresis in the ID-VD curve is observed after the plasma treatment (Fig. 5d). As discussed above, optimal-power Ar plasma treatment introduces more sulfur vacancies in the MoS2 channel, and the pronounced opening of the hysteresis in the ID-VD curve after the plasma treatment can be attributed to these vacancies. The memristive loop shown in Fig. 6a for a VD sweep of ±40 V exhibits broad hysteresis, and the corresponding logarithmic-scale graph is given in Supplementary Fig. S14. Notably, Fig. 6b demonstrates a substantial enhancement in the endurance of the HRS (sweep 1) and the LRS (sweep 2) at 0.5 V. Following plasma treatment, a significant reduction in IHRS is observed. At a read voltage of 0.5 V, IHRS is around 1 pA after plasma treatment (Fig. 6b), whereas before plasma treatment it was nearly 1 nA (Fig. 3b). If the memristor is used as a selector device, the standby power consumption is given by the following equation94:

$${{\mbox{P}}}_{{{{\rm{standby}}}}}={I}_{{{{\rm{HRS}}}}}\times {V}_{{{{\rm{read}}}}}$$
(4)

After plasma treatment, the standby power consumption decreased significantly from ~0.5 nW to ~0.5 pW. This reduction in off-state current contributes to enhanced energy efficiency. Additionally, a low off-current minimizes parasitic power consumption in large crossbar arrays31. Moreover, the switching ratio (Fig. 6c) increased significantly from \({10}^{2}\) to \({10}^{4}\), with minimal cycle-to-cycle variability. Figure 6d shows the LTP and LTD curves for 100 identical pulses. The pulse parameters and read voltage were kept the same as for the pristine devices. The observed variation in current with the number of pulses indicates a larger number of available synaptic weight states and an increased dynamic range compared to the pristine device. The LTP and LTD curves follow the same trajectory. Also, linearity and symmetry, the two most crucial attributes for evaluating synaptic devices, have been significantly improved. Floating-gate-type synaptic devices have already been explored to enhance linearity and symmetry95,96,97,98. However, these devices require the incorporation of a charge-storage layer and a blocking oxide, adding complexity to the fabrication process, especially when aiming for large-scale array applications. While strategies such as encoding the input pulses have been proposed to address non-linearity and asymmetry, these methods often lead to added intricacy in on-chip implementation, time delay, and additional energy consumption99,100,101. Our approach, on the other hand, achieves high linearity and symmetry using a straightforward two-terminal device and identical pulses. This implies that the device can transition to a specific state by employing a certain number of identical pulses. Subsequent application of an equal number of similar pulses with opposite polarity can restore the original state. This feature proves advantageous for neural network training. In Fig. 6e, we present the cycle-to-cycle variability of potentiation and depression for 20 cycles. Each of these cycles consists of a sequence of ten identical positive pulses (+20 V, 1 ms) followed by ten negative pulses (−20 V, 1 ms). The response remains consistently linear and symmetric throughout each cycle. We compared the ID-VD characteristics and LTP-LTD characteristics of 10 devices before and after optimal-power plasma treatment (Supplementary Figs. S15–S24). Additionally, Supplementary Figs. S25–S37 show the ID-VD characteristics for another 11 devices. We observed a consistent opening of the hysteresis in the ID-VD curve, as well as improved linearity and symmetry in the LTP-LTD characteristics, with reduced cycle-to-cycle variability compared to the pristine devices. Supplementary Fig. S30a presents the ID-VD characteristics over 80 cycles at ±10 V, and Supplementary Fig. S29b shows the same for 520 cycles after optimal-power plasma treatment for the same device. Supplementary Fig. S31a, b illustrate the HRS and LRS with sweep cycles at VD = 1 V. It is evident that after optimal-power plasma treatment, the switching ratio increased by two orders of magnitude with less cycle-to-cycle variability, and the current at the HRS decreased. However, we have observed device-to-device variability in features such as the hysteresis area and the asymmetry of the hysteresis. This variability can be attributed to various factors, such as the inhomogeneity of CVD-grown flakes45 and defects introduced during the fabrication process. Even slight variations in the Schottky barriers (>0.04 eV) can significantly alter the I-V characteristics, affecting both the area and symmetry of the hysteresis77. To assess the energy efficiency of our synaptic transistor, it is important to consider the energy dissipation for each pulse. This energy, denoted as E, is the product of three factors: the average current during the pulse (ID), the duration of the pulse (tpulse), and the source-drain bias voltage (VD) of the pulse. This relationship is expressed by the following equation53:

$$E={{\mbox{I}}}_{{{{\rm{D}}}}}\times {t}_{{{{\rm{pulse}}}}}\times {{\mbox{V}}}_{{{{\rm{D}}}}}$$
(5)
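Equations 4 and 5 are simple products, so they can be checked numerically with a few lines. The sketch below is illustrative only. The HRS currents (about 1 nA before and 1 pA after plasma treatment) and the 0.5 V read voltage come from the text, as do the 20 V, 1 ms pulse parameters. The 1 pA average pulse current is a hypothetical placeholder, not a measured value, and the function names are introduced here for illustration.

```python
def standby_power(i_hrs, v_read):
    """Eq. 4: P_standby = I_HRS * V_read, in watts."""
    return i_hrs * v_read


def pulse_energy(i_avg, t_pulse, v_d):
    """Eq. 5: E = I_D * t_pulse * V_D, in joules."""
    return i_avg * t_pulse * v_d


V_READ = 0.5  # V, read voltage quoted in the text

p_before = standby_power(1e-9, V_READ)   # I_HRS of about 1 nA before plasma
p_after = standby_power(1e-12, V_READ)   # I_HRS of about 1 pA after plasma
print(f"standby power before: {p_before:.1e} W, after: {p_after:.1e} W")

# Hypothetical average current during a 20 V, 1 ms pulse (placeholder value).
I_AVG_ASSUMED = 1e-12  # A
e_pulse = pulse_energy(I_AVG_ASSUMED, 1e-3, 20.0)
print(f"energy per pulse for the assumed current: {e_pulse:.1e} J")
```

With these inputs the standby power drops by three orders of magnitude, matching the change from ~0.5 nW to ~0.5 pW. The per-pulse energy scales linearly with each of the three factors in Eq. 5.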

We observed a minimum energy dissipation of ≈30 fJ per pulse, a value of the same order as the energy consumption per synaptic event in biological systems, which is ≈10 fJ (ref. 102). The STDP, shown in Fig. 6f, closely resembles that of biological synapses: the weight change is largest for short time intervals and decays exponentially as the time interval increases, with a time constant of 20 ms for the positive pair and 18 ms for the negative pair. Furthermore, the window for weight updates has been extended compared to the pristine sample. The change in synaptic weight is ≈100%, signifying an improvement in spike-timing-dependent learning. Enhanced conductance variations enable well-defined states and enhance the temporal sensitivity of the STDP behavior53. Thus, the plasma-treated device shows better STDP emulation than the pristine device. Table 1 provides an overview of recent advancements in MoS2-based memristors, comparing our work with others in terms of switching ratio and switching voltage range. While our results are competitive, future improvements could explore scaling down the channel length to achieve lower operating voltages and leveraging gate tunability for enhanced performance.
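The time constants quoted above come from exponential fits to the STDP data. The sketch below shows one way such a time constant could be extracted; it is not the authors' fitting procedure, which is not specified. It assumes the form Δw = A·exp(−Δt/τ) and fits a straight line to ln(Δw) by ordinary least squares. The data points are synthetic and noise-free, generated from a hypothetical amplitude of 100% and τ = 20 ms.

```python
import math


def fit_exponential(dts, dws):
    """Fit dw = A * exp(-dt / tau) by a least-squares line through ln(dw).

    Requires strictly positive dw values; for the negative pulse pair,
    pass the magnitudes of the weight changes.
    """
    n = len(dts)
    ys = [math.log(w) for w in dws]
    mean_x = sum(dts) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in dts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dts, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -1.0 / slope  # (A, tau)


# Synthetic, noise-free data (placeholders, not measured values).
A_TRUE = 100.0    # weight change in percent at dt = 0
TAU_TRUE = 20.0   # ms
dts = [5.0, 10.0, 20.0, 40.0, 60.0, 80.0]   # ms
dws = [A_TRUE * math.exp(-dt / TAU_TRUE) for dt in dts]

amplitude, tau = fit_exponential(dts, dws)
print(f"fitted amplitude: {amplitude:.1f} %, fitted tau: {tau:.1f} ms")
```

A log-linear fit weights small Δw values more heavily than a direct nonlinear fit would, so on noisy measurements the two approaches can give slightly different time constants.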

Fig. 6: Synaptic functionality of plasma-treated MoS2 memristor.

Switching characteristics and emulation of synaptic function in optimal-power plasma-treated MoS2 memristor. a ID versus VD characteristics of the optimal-power plasma-treated MoS2 memristor at VG = 0 V; arrows show the sweep direction, and the scan rate was 2 V s−1. b Endurance of the HRS (sweep 1) and LRS (sweep 2) at VD = 0.5 V over 100 cycles. c Endurance of the switching ratio (ILRS/IHRS) at VD = 0.5 V over 100 cycles. d Post-synaptic current (ID) versus pulse number, showing long-term potentiation and depression; the set pulse amplitude was +20 V with an on-off time of 1 ms, the reset pulse amplitude was −20 V with an on-off time of 1 ms, and the current was measured at a source-drain bias VD of 0.1 V. e Post-synaptic current versus pulse number; the set pulse amplitude was +20 V with an on-off time of 1 ms, the reset pulse amplitude was −20 V with an on-off time of 1 ms, and the current was measured at a source-drain bias VD of 0.01 V for 20 switching cycles. f The measured change in synaptic weight as a function of the time interval (Δt) between paired pulses of 20 V and −20 V with an on time of 1 ms; blue data points are for the positive pulse pair, red data points are for the negative pulse pair, and the solid lines are exponential fits. VG = 0 V for all measurements.

Table 1 Comparison of different memristive devices based on MoS2

The plasma treatment offers an advantage in realizing linear analog memory states in MoS2-based memristors, which can further be utilized to recognize images or patterns accurately. To this end, we finally demonstrate an ANN for hand-written digit classification on the Modified National Institute of Standards and Technology (MNIST) dataset103, simulated using the open-source PyTorch package104. As shown in Fig. 7a, a three-layer fully connected ANN with 784 input, 256 hidden, and 10 output nodes is conceived in an off-chip training procedure. The input signals from the 60,000 MNIST images of 28 × 28 pixels are fed to the 784 input neurons, whereas the 10 output neurons correspond to the output classes of MNIST digits ranging from 0 to 9. The normalized conductance values extracted from the LTP/LTD curves are treated as the synaptic weights in the ANN module. The non-linearity parameters of the LTP curves are extracted using the following memristive switching model105, as shown in Fig. 7b:

$$G=\left\{\begin{array}{ll}{\left(\left({G}_{{\rm{Max}}}^{m}-{G}_{{\rm{Min}}}^{m}\right)\times \omega +{G}_{{\rm{Min}}}^{m}\right)}^{\frac{1}{m}}, \hfill &\,{\mbox{if}}\,\,m\ne 0\\ {G}_{{\rm{Min}}}\times {\left(\frac{{G}_{{\rm{Max}}}}{{G}_{{\rm{Min}}}}\right)}^{\omega }, \hfill &\,{\mbox{if}}\,\,m=0 \end{array}\right.$$
(6)

Here, GMax and GMin are the maximum and minimum conductance in our LTP characteristics, ω is a variable, and m is the non-linearity coefficient of the potentiation curve, whose value is 1 for an ideal device. The value of m extracted from the fit for 100 pulses is 21.79 before plasma treatment and improves to 2.81 after plasma treatment. The non-linearity factor (NLF) for the LTD characteristics is 2.79 (Supplementary Fig. S38a), indicating that the LTP-LTD characteristics are highly symmetric. These coefficients, along with other parameters such as asymmetric ratio and dynamic range, play a crucial role in the high-accuracy training of such devices, as recently reported105,106,107. To simulate the network, we use a backpropagation algorithm with cross-entropy loss as the cost function and a rectified linear unit activation function for the propagation of information. Figure 7c shows the recognition accuracy of the plasma-treated MoS2 devices over 50 epochs. Interestingly, a high training accuracy of ≈97% is realized in our devices, which is comparable to the software-based calculations. Additionally, simulations incorporating the LTD curve (Supplementary Fig. S38b) also show a high training accuracy of ~97%. This demonstrates the efficacy of the proposed devices for ANN implementations using future crossbar architectures. Furthermore, we repeated the above simulations for numbers of hidden nodes ranging from 4 to 256 to study the dependence of the classification accuracy of our ANN module based on MoS2 memristors. The variation of accuracy with the increasing number of hidden nodes is shown in Fig. 7d and is consistent with previous reports105. The confusion matrix is calculated as an additional validation of our supervised learning process and to assess our ANN model's effectiveness in distinguishing output classes, as depicted in Fig. 7e. The values along the diagonal represent the normalized predictions made by the ANN model that correspond to the true labels in the test data. The high values along the diagonal and the correspondingly low values off the diagonal demonstrate the strong class separation capability of our device-based ANN implementation.
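To make the role of the non-linearity coefficient m concrete, the switching model of Eq. 6 can be evaluated directly. The sketch below is illustrative and rests on stated assumptions: ω is taken to run from 0 to 1 (for example, a normalized pulse number), the normalized conductance limits GMin = 0.1 and GMax = 1.0 are hypothetical, and for m = 0 the code uses the limit of the general branch as m → 0, which is GMin × (GMax/GMin)^ω. The helper `max_deviation_from_linear` is a metric introduced here for illustration and is not a quantity defined in the paper.

```python
def conductance(omega, g_min, g_max, m):
    """Conductance of the switching model in Eq. 6 for 0 <= omega <= 1."""
    if m == 0:
        # Limit of the general expression as m -> 0 (geometric interpolation).
        return g_min * (g_max / g_min) ** omega
    return ((g_max ** m - g_min ** m) * omega + g_min ** m) ** (1.0 / m)


def max_deviation_from_linear(m, g_min=0.1, g_max=1.0, steps=100):
    """Largest gap between the model and a straight line, as a fraction of
    the conductance range (illustrative metric, not from the paper)."""
    worst = 0.0
    for k in range(steps + 1):
        omega = k / steps
        linear = g_min + (g_max - g_min) * omega
        worst = max(worst, abs(conductance(omega, g_min, g_max, m) - linear))
    return worst / (g_max - g_min)


# m = 1 is the ideal case; 21.79 and 2.81 are the fitted values quoted in the
# text for the pristine and plasma-treated device, respectively.
for m in (1.0, 2.81, 21.79):
    print(f"m = {m}: max deviation = {max_deviation_from_linear(m):.3f}")
```

For m = 1 the model reduces to a straight line between GMin and GMax, and the deviation grows as m moves away from 1, which is why the drop in m after plasma treatment corresponds to a more linear weight update.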

Fig. 7: Demonstration of a three-layer ANN model and performance metrics.

a Demonstration of a three-layer ANN model with 784 input, 256 hidden, and 10 output neurons using 28 × 28 pixel MNIST datasets as input signals. b The extracted non-linearity factors from the fitted potentiation curves post-plasma treatment. c The digit recognition training accuracy of our device compared with software values. d Variation of training accuracy with the increasing number of hidden neurons. e Confusion matrix of handwritten digits from 0 to 9 where the diagonal represents high-accuracy classification.

Conclusion

In summary, we have introduced an experimental approach to enhance the performance metrics of CVD-grown MoS2 memristors for neuromorphic applications. The main advantages of plasma treatment include its high material selectivity for sputtering, its suitability for large-scale modification, its compatibility with post-metallization processing, and its practical feasibility. Plasma-treated devices show substantially improved resistive switching, endurance, and energy efficiency. These enhancements include an expanded switching ratio (from \({10}^{2}\) to \({10}^{4}\)), significantly reduced cycle-to-cycle variability, improved STDP behavior, and linear and symmetric weight updates. The NLF has been improved from 21.8 to 2.81. ANN simulations based on plasma-treated devices reveal a learning accuracy of ~97% on the MNIST handwritten digits dataset. Furthermore, our observations of STDP behavior suggest a promising increase in learning potential. This research significantly advances the utilization of 2D semiconductors in future neuromorphic devices.

Methods

Growth of polycrystalline monolayer MoS2

We synthesized polycrystalline MoS2 monolayers on 285 nm SiO2/Si substrates using the atmospheric pressure chemical vapor deposition (APCVD) technique in a single-zone CVD setup. Sulfur and molybdenum trioxide (99.98% trace metal) powders, both purchased from Sigma-Aldrich, were used as precursors. The SiO2/Si substrates underwent a 10-min bath sonication in acetone and another 10-min sonication in isopropyl alcohol. Subsequently, they were cleaned under O2 plasma at ~400 mTorr for 5 min with 40 W power applied to the radio frequency coil. For the synthesis process, in a 3.5-cm-diameter quartz tube furnace, 400 mg of sulfur powder was placed in an alumina boat ~18 cm upstream from the MoO3 boat (outside the furnace). The sulfur powder was heated independently using a heating belt. The substrates were placed upstream on a different alumina boat positioned on the other side of the MoO3 boat ~5 cm away. The tube furnace was purged with ultrahigh-purity Ar gas at 400 standard cubic centimetres per minute (sccm) for 15 min, after which the flow rate was adjusted to 20 sccm. The furnace temperature was then raised to 750 °C (heating rate 5 °C min−1), and the reaction continued for 5 min. When the furnace temperature reached ~725 °C, the temperature of the heating belt around the sulfur boat was raised to 150 °C and maintained at that value for 25 min. After the reaction, all the heaters were turned off, and the furnace was allowed to cool to room temperature naturally.

Material characterization

Raman and photoluminescence spectroscopy

The coverage and growth quality of polycrystalline MoS2 were characterized using an optical microscope (Nikon LV150N). Raman and PL spectra were collected using a LabRAM HR (Horiba Jobin Yvon) spectrometer with a 532 nm laser.

X-ray photoelectron spectroscopy (XPS)

XPS was performed to verify the chemical composition of the CVD-grown MoS2 films using a Thermo Scientific XPS instrument equipped with a monochromatic Al Kα X-ray source. The X-ray beam had an approximate diameter of 400 μm and an energy of 1486.6 eV. To correct for sample charging, all spectra were referenced to the adventitious carbon C 1s peak at 284.8 eV.

High-angle annular dark-field scanning transmission electron microscopy (HAADF-STEM)

Atomic-resolution HAADF-STEM images were acquired with a Hitachi HD2700C dedicated STEM equipped with a Cs probe corrector. Other microscopes were employed with an acceleration voltage of 200 kV.

Device fabrication and electrical measurement

MoS2 memristors were fabricated using a standard photolithography method followed by thermal evaporation of metal (5 nm Cr and 50 nm Au) and lift-off in acetone. All electrical, endurance, and synaptic plasticity measurements were carried out at ambient conditions in a home-built probe station using a Keithley 2636B dual-channel source meter unit controlled by LabVIEW programs.

Ar plasma treatment

The Ar plasma treatment was conducted using a Diener Zepto plasma cleaner. For low-power plasma treatment, the conditions were a pressure of 0.3 mbar, a power of 5 W, and a treatment time of 60 s. Optimal-power plasma treatment was carried out at a pressure of 0.3 mbar, a power of 10 W, and a treatment time of 5 s. High-power plasma treatment was performed at a pressure of 0.3 mbar, a power of 20 W, and a treatment time of 5 s.