Extended Data Table 1: Statistics of continual pretraining dataset

From: Advancing biomolecular understanding and design following human instructions