Fig. 3 | Scientific Data

Fig. 3

From: Homologous Pairs of Low and High Temperature Originating Proteins Spanning the Known Prokaryotic Universe

Workflow for labelling homologous protein pairs across temperature. Raw data include RefSeq 16S rRNA sequences, optimal growth temperature (OGT) labels from Engqvist, and UniProtKB proteome metadata and proteins. Proteome metadata are parsed to select a single proteome for highly represented organisms while retaining data for weakly studied taxa. Proteins are kept only if they belong to a chosen proteome and their source organism has an OGT label. The protein pair search space is narrowed by first identifying pairs of related organisms via 16S rRNA alignment. Protein pairs are then identified by sequence alignment between proteins of the paired organisms. The final database tables are taxa, pairs of meso/thermo taxa, proteins, and pairs of meso/thermo proteins.
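The filtering logic in this workflow can be illustrated with a minimal Python sketch. It is not the authors' pipeline: all identifiers, the toy records, and the 50 °C meso/thermo cutoff are hypothetical, and the 16S rRNA alignment step is replaced by a hand-written set of related taxa. The final protein-level sequence alignment is omitted; the sketch only enumerates the candidate pairs that such an alignment would score.

```python
from itertools import product

# Hypothetical toy inputs; none of these values come from the dataset.
ogt_by_taxon = {"taxA": 30.0, "taxB": 75.0, "taxC": 37.0}  # OGT in deg C
chosen_proteome = {"taxA": "UP_A", "taxB": "UP_B", "taxC": "UP_C", "taxD": "UP_D"}
proteins = [
    {"id": "p1", "taxon": "taxA", "proteome": "UP_A"},
    {"id": "p2", "taxon": "taxB", "proteome": "UP_B"},
    {"id": "p3", "taxon": "taxC", "proteome": "UP_X"},  # not the chosen proteome
    {"id": "p4", "taxon": "taxD", "proteome": "UP_D"},  # taxon has no OGT label
]
related_taxa = {("taxA", "taxB")}  # stand-in for 16S rRNA alignment hits
THERMO_THRESHOLD_C = 50.0          # assumed cutoff, for illustration only


def filter_proteins(proteins, chosen_proteome, ogt_by_taxon):
    """Keep proteins from the chosen proteome of a taxon that has an OGT label."""
    return [
        p for p in proteins
        if p["taxon"] in ogt_by_taxon
        and chosen_proteome.get(p["taxon"]) == p["proteome"]
    ]


def meso_thermo_taxa_pairs(related_taxa, ogt_by_taxon, threshold):
    """Order each related pair as (meso, thermo); drop pairs not spanning the cutoff."""
    pairs = []
    for a, b in related_taxa:
        if a not in ogt_by_taxon or b not in ogt_by_taxon:
            continue
        meso, thermo = sorted((a, b), key=ogt_by_taxon.__getitem__)
        if ogt_by_taxon[meso] < threshold <= ogt_by_taxon[thermo]:
            pairs.append((meso, thermo))
    return pairs


def candidate_protein_pairs(kept, taxa_pairs):
    """Enumerate cross-taxon protein pairs to be scored by sequence alignment."""
    by_taxon = {}
    for p in kept:
        by_taxon.setdefault(p["taxon"], []).append(p["id"])
    return [
        (m, t)
        for meso, thermo in taxa_pairs
        for m, t in product(by_taxon.get(meso, []), by_taxon.get(thermo, []))
    ]


kept = filter_proteins(proteins, chosen_proteome, ogt_by_taxon)
taxa_pairs = meso_thermo_taxa_pairs(related_taxa, ogt_by_taxon, THERMO_THRESHOLD_C)
candidates = candidate_protein_pairs(kept, taxa_pairs)
```

Restricting the protein comparison to taxa pairs found first keeps the number of candidate alignments far below an all-against-all search, which is the purpose of the taxa-level step described in the caption.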
