Nature

Fig. 1: Accuracy and output length of DeepSeek-R1-Zero throughout the training process. | Nature

From: DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning
