Fig. 2: Schematic for the multilayer structure and DQN framework.
From: General deep learning framework for emissivity engineering

a Five-layer multilayer structure composed of two alternating materials. b Schematic of the DQN framework. The state consists of the two materials and the five layer thicknesses of the multilayer. The state parameters are fed into the DQN, which generates an action, and the action is applied to update the state. The transfer matrix method (TMM) is used to simulate the new state, and the resulting reward is fed back to the neural network (the agent). The new state is then fed into the DQN for the next iteration. Each state, action, and reward tuple is recorded in a dataset used to train the neural network, so that it learns to take actions that increase the accumulated reward and ultimately yields the state with the maximum reward.
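
The loop described in panel b can be sketched in a few lines of Python. This is only an illustrative sketch under several assumptions, not the authors' implementation: a linear Q-function stands in for the deep network, `toy_merit` is a made-up placeholder for the TMM simulation, the reward is assumed to be the change in that figure of merit, and the material library, thickness bounds, step size, and action set are all hypothetical.

```python
import random

MATERIALS = ["mat_0", "mat_1", "mat_2"]   # hypothetical material library
N_LAYERS = 5
T_MIN, T_MAX, STEP = 10.0, 200.0, 10.0    # nm; assumed bounds and step size
N_ACTIONS = 2 * N_LAYERS + 2              # +/- step per layer, plus one change per material slot


def apply_action(state, action):
    """Return the new state (m1, m2, thicknesses) after taking one action."""
    m1, m2, t = state
    t = list(t)
    if action < 2 * N_LAYERS:
        layer, sign = divmod(action, 2)
        delta = STEP if sign == 0 else -STEP
        t[layer] = min(T_MAX, max(T_MIN, t[layer] + delta))  # clamp to bounds
    elif action == 2 * N_LAYERS:
        m1 = (m1 + 1) % len(MATERIALS)    # cycle the first material
    else:
        m2 = (m2 + 1) % len(MATERIALS)    # cycle the second material
    return (m1, m2, tuple(t))


def toy_merit(state):
    """Placeholder for the TMM simulation. NOT a physical model."""
    m1, m2, t = state
    target = (50.0, 120.0, 80.0, 150.0, 30.0)   # arbitrary "ideal" thicknesses
    err = sum(((a - b) / (T_MAX - T_MIN)) ** 2 for a, b in zip(t, target))
    bonus = 0.5 if (m1, m2) == (0, 2) else 0.0  # arbitrary preferred material pair
    return bonus - err


def features(state):
    """Bias, normalized thicknesses, and one-hot material encodings."""
    m1, m2, t = state
    phi = [1.0] + [(x - T_MIN) / (T_MAX - T_MIN) for x in t]
    phi += [1.0 if m1 == k else 0.0 for k in range(len(MATERIALS))]
    phi += [1.0 if m2 == k else 0.0 for k in range(len(MATERIALS))]
    return phi


def q_values(weights, phi):
    """Linear Q-function: one weight row per action (stand-in for the DQN)."""
    return [sum(w * x for w, x in zip(row, phi)) for row in weights]


def train(episodes=30, steps=40, gamma=0.9, lr=0.05, eps=0.2, seed=0):
    rng = random.Random(seed)
    n_feat = 1 + N_LAYERS + 2 * len(MATERIALS)
    weights = [[0.0] * n_feat for _ in range(N_ACTIONS)]
    replay = []                                   # recorded transitions
    best_state, best_merit = None, float("-inf")
    for _ in range(episodes):
        state = (0, 1, tuple(rng.uniform(T_MIN, T_MAX) for _ in range(N_LAYERS)))
        for _ in range(steps):
            # Epsilon-greedy action selection from the current Q estimates.
            if rng.random() < eps:
                action = rng.randrange(N_ACTIONS)
            else:
                q = q_values(weights, features(state))
                action = q.index(max(q))
            new_state = apply_action(state, action)
            reward = toy_merit(new_state) - toy_merit(state)  # assumed reward definition
            replay.append((state, action, reward, new_state))
            # One Q-learning update on a transition sampled from the replay buffer.
            s, a, r, s2 = rng.choice(replay)
            phi_s = features(s)
            target = r + gamma * max(q_values(weights, features(s2)))
            td_error = target - q_values(weights, phi_s)[a]
            weights[a] = [w + lr * td_error * x for w, x in zip(weights[a], phi_s)]
            state = new_state
            merit = toy_merit(state)
            if merit > best_merit:
                best_state, best_merit = state, merit
    return best_state, best_merit, replay


if __name__ == "__main__":
    state, merit, _ = train()
    print("best state:", state, "merit:", round(merit, 3))
```

In a real setting, `toy_merit` would be replaced by a TMM calculation of the emissivity spectrum and the linear Q-function by a neural network. The structure of the loop (act, simulate, record, update) is the part this sketch is meant to illustrate.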