PID estimation
Average PID values were calculated for 19220 Hsp60 sequences in 19 phyla
(Table 1), where they ranged from 28±4% (Viruses) to 84±6% (Chordata).
The highest average PID value was found for the Hsp60 sequences
belonging to Chordata, as expected from our previous work [14].
The lowest average PID value was found for Viruses, which may be
explained by their high mutation
rate. For groups of higher taxonomic rank, the average PID values
varied within a narrow range from 51±9% to 58±4% for Bacteria
(Proteobacteria, Cyanobacteria, Actinobacteria, Firmicutes, and
Bacteroidetes, with the exception of Chlamydiae), from 63±3% to 64±7%
for Fungi (Basidiomycota and Ascomycota), and from 41±18% to 44±17%
for Plantae (Chlorophyta and Streptophyta). Apicomplexa and Euglenozoa
can be combined into Protozoa with the average PID values ranging from
42±14% to 50±14%. For Hsp60 sequences belonging to Metazoa
(Arthropoda, Mollusca, Platyhelminthes, Nematoda, and Chordata), the
range of average PID values was wider: from 50±10% for Arthropoda to
84±6% for Chordata.
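
As an illustration of how an average PID ± SD value of the kind reported above could be obtained, the following minimal Python sketch computes the pairwise percent identity over a set of pre-aligned sequences and summarizes it as mean and sample standard deviation. The function names, the gap symbol, and the choice of denominator (columns in which neither sequence has a gap) are assumptions made for this example; the text does not specify which PID definition or software was used.

```python
from itertools import combinations
from statistics import mean, stdev


def percent_identity(a, b, gap="-"):
    """Percent identity of two aligned sequences of equal length.

    Assumption: columns where either sequence has a gap are skipped,
    so the denominator is the number of gap-free aligned columns.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = identical = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def average_pid(aligned):
    """Mean and sample SD of the PID over all pairs in one group."""
    values = [percent_identity(a, b) for a, b in combinations(aligned, 2)]
    sd = stdev(values) if len(values) > 1 else 0.0
    return mean(values), sd


# Toy example with three short, already aligned sequences.
avg, sd = average_pid(["AAAA", "AAAT", "AATT"])
print(f"{avg:.0f}±{sd:.0f}%")
```

Other PID definitions (for example, dividing by the length of the shorter sequence or by the full alignment length) would give different values, so the denominator should be matched to the method actually used.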
Clustering was performed within each of the 19 phyla, and the average
PID value was calculated for each sub-cluster (Supplementary, PID, and
SD). The number of sub-clusters per phylum ranged from two to seven
(Table 1). Average PID values were then calculated between the 60
sub-clusters and used to cluster them (Figure 1).
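
The between-sub-cluster comparison can be sketched as follows. This is a minimal Python example, assuming that the PID between two sub-clusters is the mean PID over all cross-group sequence pairs and that gapped columns are ignored; the sub-cluster labels and sequences are hypothetical, and the clustering method applied to the resulting values is not specified in the text.

```python
from itertools import combinations, product
from statistics import mean


def pid(a, b, gap="-"):
    """Percent identity over gap-free columns of two aligned sequences."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs) if pairs else 0.0


def between_group_pid(groups):
    """Average PID for every pair of sub-clusters.

    `groups` maps a sub-cluster label to its aligned sequences; the
    result maps each (label1, label2) pair to the mean cross-group PID.
    """
    result = {}
    for g1, g2 in combinations(sorted(groups), 2):
        result[(g1, g2)] = mean(
            pid(a, b) for a, b in product(groups[g1], groups[g2])
        )
    return result


# Hypothetical sub-clusters with toy aligned sequences.
toy = {"A1": ["AAAA", "AAAT"], "A2": ["AATT"], "B1": ["TTTT"]}
for pair, value in between_group_pid(toy).items():
    print(pair, f"{value:.1f}%")
```

A distance such as 100 minus the average PID could then be passed to a hierarchical clustering routine to group the sub-clusters, although this particular choice of distance is an assumption of the sketch.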