Fig. 4: The reward trajectory during the training process in the lower level. | Communications Engineering