Extended Data Fig. 4: Phylogenetic and retention profile clustering analyses of Hox syntenic regions.

a–d, Bayesian inference phylogenetic trees of amino acid sequences of four non-Hox genes syntenic to Hox clusters, Gbx (a), Cbx (b), Hnrnpa (c) and Agap (d), from the inshore hagfish (in red), the sea lamprey (in light blue), the Arctic lamprey (in dark blue) and selected gnathostomes. Orthologs from the European amphioxus Branchiostoma lanceolatum were used as the outgroup to root the trees. Posterior probabilities are indicated at each node. Scale bars indicate the number of substitutions per site. Phylogenetic analyses of Hox genes generally fail to determine orthology due to their high conservation and short alignments. The phylogenetic trees of these non-Hox linked genes clearly support the orthology of the Hox-α (Gbx, Cbx and Hnrnpa), Hox-δ (Gbx, Hnrnpa and Agap) and Hox-ζ (Hnrnpa) clusters, while β and ε genes always group together, as previously observed for the lamprey (ref. 17). The alignments used to build the trees, together with the MrBayes parameters and the number of generations used for each tree, are provided as Supplementary Files 16–19. e, Clustering analysis of retention profiles (see main text) resolved the orthology relationships of the Hox-β, Hox-ε, Hox-γ and Hox-ζ clusters. Supported orthologies in each analysis are marked with color-coded rectangles. The location of each cluster is indicated in parentheses in e (ssc, super scaffold; HiC cl, Hi-C contact cluster, or chromosome). For the clustering analysis of AC1-derived chromosomes in the lamprey and hagfish, we split Hi-C cluster 3 into two halves, each containing one Hox cluster: 3L (coordinates 0–107.78 Mb) and 3R (coordinates 107.78–194 Mb).
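
The split of Hi-C cluster 3 can be illustrated with a minimal sketch that assigns a position to 3L or 3R using the 107.78 Mb boundary stated in the legend. This is not the authors' code: the function name `assign_half` and the example positions are hypothetical, and assigning a position exactly at the boundary to 3R is an assumption, since the legend does not say which half owns it.

```python
# Boundary between the two halves of Hi-C cluster 3, in Mb (from the legend).
SPLIT_MB = 107.78


def assign_half(position_mb: float) -> str:
    """Return '3L' or '3R' for a position (in Mb) on Hi-C cluster 3.

    Assumption: a position exactly at the boundary is assigned to 3R;
    the legend does not specify which half includes it.
    """
    if position_mb < 0:
        raise ValueError("position must be non-negative")
    return "3L" if position_mb < SPLIT_MB else "3R"


# Hypothetical positions, for illustration only (not real gene coordinates).
example_positions_mb = {"gene_x": 42.5, "gene_y": 150.0}
halves = {name: assign_half(pos) for name, pos in example_positions_mb.items()}
print(halves)
```

Genes labelled this way could then be grouped per half before their retention profiles are computed, so that each half is treated as a separate unit in the clustering.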