Table 4 Average Cumulative Reward per Episode.

From: A hybrid reinforcement learning and knowledge graph framework for financial risk optimization in healthcare systems

Model                         Avg. Reward
Random Billing Policy         14.83
DQN (flat state)              38.12
PPO + Latent State Only       44.21
PPO + Graph Embedding Only    48.74
Proposed Hybrid Model         57.19