Extended Data Fig. 3: Plasmid feature importance for unknown sequences.
From: Improving lab-of-origin prediction of genetically engineered plasmids via deep metric learning

a NTI for an unknown sequence, which can help in investigating sequence-specific patterns. b NTI for Bernard Moss’s lab, which our model assigned as the author of the sequence. c Comparison between the tokens highlighted for the sequence and the important tokens of the predicted lab. The red line represents the sequence. The first bar under the graph represents the laboratory, and the second represents the sequence. d Difference between the NTI of the sequence and the NTI of the predicted laboratory. Few tokens stand out in this sequence beyond those usually highlighted for the laboratory, so the bar is almost entirely black.