Fig. 2

BERT subtoken lengths of the concatenated gold and system summaries (test1, Text-davinci-003 system) for the doctor-patient dialogue to clinical note generation task. Because embedding-based models must encode the concatenated reference and hypothesis, it would be difficult to evaluate the corpus fairly with current pretrained BERT models, which are limited to 512 input subtokens.
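
The length check behind this figure can be sketched as follows. This is a minimal illustration, not the procedure used to produce the figure: it assumes the standard BERT sentence-pair layout `[CLS] reference [SEP] hypothesis [SEP]`, which adds three special tokens, and it uses a whitespace splitter as a stand-in for a real WordPiece tokenizer. The names `whitespace_tokenize`, `pair_length`, and `fraction_over_limit` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

MAX_LEN = 512      # input limit of standard pretrained BERT checkpoints
NUM_SPECIAL = 3    # [CLS] reference [SEP] hypothesis [SEP]


def whitespace_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer (assumption). A real WordPiece tokenizer
    typically produces more pieces than whitespace splitting."""
    return text.split()


def pair_length(
    reference: str,
    hypothesis: str,
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
) -> int:
    """Length of the concatenated pair, including special tokens."""
    return len(tokenize(reference)) + len(tokenize(hypothesis)) + NUM_SPECIAL


def fraction_over_limit(
    pairs: Sequence[Tuple[str, str]],
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
    max_len: int = MAX_LEN,
) -> float:
    """Share of (reference, hypothesis) pairs that exceed max_len."""
    if not pairs:
        return 0.0
    over = sum(pair_length(r, h, tokenize) > max_len for r, h in pairs)
    return over / len(pairs)


# Toy data only: one short pair and one pair of 300-word texts.
pairs = [("short note", "brief note"), ("word " * 300, "word " * 300)]
print(fraction_over_limit(pairs))
```

To count actual BERT subtokens, the `tokenize` argument could be replaced with the `tokenize` method of a pretrained tokenizer, for example one loaded through the Hugging Face `transformers` library. Any pair over the limit would have to be truncated before encoding, which is why a fair comparison on long clinical notes is hard.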