Figure 1: Number of proteins in different classes of orthologous groups.

The statistics for M. cinxia are very similar to those for other Lepidoptera: 5,977 conserved core proteins, 7,177 taxonomic order-, family- or species-specific proteins and 3,513 proteins without detectable sequence similarity to other proteins. The large proportion of order- and family-specific groups suggests rapid turnover of gene content in these genomes; such groups represent either rapidly evolving genes or dispensable ancient paralogues that have been deleted from most other lineages. Species codes are as used in SwissProt: Melci=Melitaea cinxia, Helme=Heliconius melpomene, Danpl=Danaus plexippus, Bommo=Bombyx mori, Pluxy=Plutella xylostella, Dromo=Drosophila mojavensis, Drosi=Drosophila simulans, Drome=Drosophila melanogaster, Culqu=Culex quinquefasciatus, Anoga=Anopheles gambiae, Aedae=Aedes aegypti, Apime=Apis mellifera, Harsa=Harpegnathos saltator, Solin=Solenopsis invicta, Nasvi=Nasonia vitripennis, Trica=Tribolium castaneum, Acypi=Acyrthosiphon pisum, Pedhu=Pediculus humanus, Dappu=Daphnia pulex, Ixosc=Ixodes scapularis, Felca=Felis catus and Ratno=Rattus norvegicus. The lepidopteran species are highlighted with a box. See Supplementary Note 8 for the definition of classes.