Extended Data Fig. 1: PanKmer Jaccard similarity matrix of 193 Cannabis genomes.
From: Domesticated cannabinoid synthases amid a wild mosaic cannabis pangenome

PanKmer (PK) was used to estimate the relationships among the genomes in the cannabis pangenome. A large portion of the pangenome comprised elite cultivars, breeding trios and foundational marijuana (MJ) lines originating from breeding programs spanning the 1970s to the present (Supplementary Fig. 1; Supplementary Table 1). These samples represented chemotypes showing high expression of pentyl or propyl (varin) homologs of CBDA or THCA, as well as cannabinoid-free (type V) plants. Flowering-time variation was also captured through the inclusion of both short-day (SD) and day-neutral (DN) phenotypes. The remaining cultivars came from the United States Department of Agriculture (USDA) Germplasm Resources Information Network (GRIN) and German federal genebank (IPK Gatersleben) repositories, ensuring that researchers have access to the plants for experimentation. These samples included European and Asian fiber and seed hemp, feral populations, North American marijuana (type I), high-cannabinoid-yielding (CBDA or CBGA) hemp (types III and IV), male plants (XY; Fig. 1b) and monoecious plants (XX; Supplementary Table 1). Together, this comprehensive dataset provides a foundation for exploring cannabis genomic diversity, hybridization and trait evolution. See Figshare for the full-resolution version.
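For readers unfamiliar with the metric in the figure title, the following is a minimal sketch of how a pairwise k-mer Jaccard similarity matrix can be computed. It is an illustration only and does not reproduce PanKmer's implementation or interface. The function names (`kmers`, `jaccard`, `jaccard_matrix`), the toy sequences and the choice of k = 3 are all assumptions made for this example, and the sketch ignores reverse-complement (canonical) k-mers, which real genome-scale tools typically account for.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k substrings of seq (hypothetical helper)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; taken as 1.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def jaccard_matrix(genomes, k):
    """Build a symmetric name-by-name similarity matrix as nested dicts."""
    names = list(genomes)
    sets = {name: kmers(genomes[name], k) for name in names}
    matrix = {name: {name: 1.0} for name in names}  # diagonal is 1 by definition
    for x, y in combinations(names, 2):
        score = jaccard(sets[x], sets[y])
        matrix[x][y] = matrix[y][x] = score
    return matrix


# Toy input: short made-up sequences, not real Cannabis data.
toy_genomes = {"a": "ACGTAC", "b": "ACGTTT", "c": "GGGGGG"}
toy_matrix = jaccard_matrix(toy_genomes, k=3)
```

In this toy input, genomes `a` and `b` share two of six distinct 3-mers, so their similarity is 1/3, while `c` shares none with either and scores 0. Clustering the rows and columns of such a matrix is one common way to arrive at a heatmap of the kind shown in the figure.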