Fig. 4: Test set performance vs. number of training samples for the doping extraction task using GPT-3 with the Doping-English schema.
From: Structured information extraction from scientific text with large language models

This schema requires the model to learn a new, specific sentence structure for its output. We separate scores by (a) host-dopant links (relations), (b) host entities alone, and (c) dopant entities alone. Below approximately 10 training samples, the scores are zero because the model has not yet learned the structure of the desired output sentences. Source data are provided as a Source Data file.
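
To make the three score categories concrete, the following is a minimal sketch of how host-dopant links, host entities, and dopant entities could be scored separately once the output sentences have been parsed into (host, dopant) pairs. It assumes exact-match set comparison, which may differ from the scoring procedure actually used for the figure; the names `prf` and `score_doping` and the example materials are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Link = Tuple[str, str]  # (host, dopant), assumed already parsed from model output


def prf(predicted: Set, gold: Set) -> Dict[str, float]:
    """Exact-match precision, recall, and F1 between two sets."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def score_doping(
    predicted_links: Iterable[Link], gold_links: Iterable[Link]
) -> Dict[str, Dict[str, float]]:
    """Score links, hosts alone, and dopants alone (assumed exact matching)."""
    pred = set(predicted_links)
    gold = set(gold_links)
    return {
        # (a) a link counts only if both host and dopant match
        "links": prf(pred, gold),
        # (b) hosts alone, ignoring which dopant they were paired with
        "hosts": prf({h for h, _ in pred}, {h for h, _ in gold}),
        # (c) dopants alone, ignoring which host they were paired with
        "dopants": prf({d for _, d in pred}, {d for _, d in gold}),
    }


if __name__ == "__main__":
    gold = [("ZnO", "Al"), ("ZnO", "Ga")]
    pred = [("ZnO", "Al"), ("TiO2", "Ga")]
    for category, scores in score_doping(pred, gold).items():
        print(category, scores)
```

Under this sketch, an output that cannot be parsed into any pairs yields an empty prediction set, so all three categories score zero, which is consistent with the behavior described for very small training sets.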