Fig. 5: Lab Jaccard distance calculation. | Nature Communications

From: PlasmidHawk improves lab of origin prediction of engineered plasmids using sequence alignment

To calculate the lab Jaccard distance between two labs, such as lab A and lab C, we first build a fragment set for each lab, F_A and F_C. A fragment set contains all the pan-genome fragments annotated by the corresponding lab. The lab Jaccard distance between lab A and lab C is JD(A, C) = 1 − J(F_A, F_C), where the Jaccard index J(F_A, F_C) = |F_A ∩ F_C| / |F_A ∪ F_C| is the fraction of shared pan-genome fragments out of all the distinct fragments that labs A and C have. A large lab Jaccard distance indicates that the two labs share few sequences.
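
The calculation described in this caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the PlasmidHawk implementation: the function name `lab_jaccard_distance`, the fragment identifiers, and the handling of two empty fragment sets are all assumptions made for the example.

```python
def lab_jaccard_distance(fragments_a, fragments_c):
    """Return JD = 1 - J, where J = |A ∩ C| / |A ∪ C| for two fragment sets."""
    a, c = set(fragments_a), set(fragments_c)
    union = a | c
    if not union:
        # Assumption: the caption does not define this case; two empty
        # fragment sets are treated here as having distance 0.0.
        return 0.0
    shared = a & c
    return 1.0 - len(shared) / len(union)


# Hypothetical pan-genome fragment identifiers for two labs.
F_A = {"frag1", "frag2", "frag3"}
F_C = {"frag2", "frag3", "frag4", "frag5"}

# Two shared fragments out of five distinct ones: J = 2/5, so JD = 3/5.
jd = lab_jaccard_distance(F_A, F_C)
print(f"JD(A, C) = {jd:.2f}")
```

With these made-up sets, the two labs share two of five distinct fragments, so the distance is 3/5. Identical fragment sets give a distance of 0, and sets with no fragment in common give a distance of 1.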
