The Protein Set Transformer, a genome language model trained on viral datasets, uncovers evolutionary rules of protein content and organization, enabling precise virus identification, host prediction, and protein annotation for viral genomics and ecology.
- Cody Martin
- Anthony Gitter
- Karthik Anantharaman