Fig. 2
From: Measuring and mitigating debugging effectiveness decay in code language models

Normalised debugging effectiveness trajectories compared to baseline continuous debugging (\(A_0\)) with fresh start strategies implemented at \(\theta \in \{50, 80\}\). The distinctive spikes in \(A_{50} \text { and } A_{80}\) demonstrate successful intervention effects, where fresh starts reset the debugging process and break the monotonic decay pattern observed in baseline approaches. These spikes represent moments where strategic reinitialisation successfully shifts models from failed solution exploitation back to productive exploration, enabling recovery from debugging decay within the same computational budget.