Figure 1
From: Scalable photonic reinforcement learning by time-division multiplexing of laser chaos

Architecture for scalable reinforcement learning using laser chaos. (a) Solving the multi-armed bandit problem with N = 2^M arms using a pipelined arrangement of comparisons between thresholds and a series of chaotic signal sequences. (b) Chaotic time series with the definitions of the inter-decision sampling interval (ΔS) and the inter-bit sampling interval (ΔL) used to arrive at a single decision. The 2Z + 1 threshold levels are also depicted, where Z is a natural number. (c) Schematic diagram of the decision-making system architecture based on laser chaos and pipelined threshold processing.
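
The pipelined comparison in panel (a) can be illustrated with a minimal sketch: M successive signal samples are each compared with a threshold, each comparison yields one bit, and the M bits together address one arm. This is not the authors' implementation. The function name `select_arm`, the dictionary of thresholds keyed by the bits decided so far, the default threshold of 0.0, and the convention that a sample above the threshold gives bit 1 are all assumptions made for illustration. The samples here are plain numbers standing in for the chaotic signal.

```python
from typing import Dict, Sequence


def select_arm(samples: Sequence[float], thresholds: Dict[str, float]) -> int:
    """Map M samples to an arm index by M sequential threshold comparisons.

    `thresholds` is a hypothetical lookup keyed by the bit string decided
    so far ("" for the first comparison). Missing keys default to 0.0.
    """
    prefix = ""
    for sample in samples:
        threshold = thresholds.get(prefix, 0.0)
        # Assumed convention: sample above threshold -> bit 1, otherwise bit 0.
        prefix += "1" if sample > threshold else "0"
    # The M decided bits, most significant first, form the arm index.
    return int(prefix, 2)


# Three samples give three bits, so one of 2**3 = 8 arms is addressed.
arm = select_arm([0.3, -0.2, 0.5], {"": 0.0, "1": 0.0, "10": 0.6})
print(arm)  # bits "100" -> arm 4
```

Keying each threshold by the preceding bits reflects the pipelined structure, in which the comparison at each stage depends on the decisions made at earlier stages. How the thresholds are updated from rewards is not covered by this sketch.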