Figure 4. The average nucleotide composition of the Hsp60 genes
from 17 phyla: a – the average total GC content (over all codon
positions) of Hsp60 sequences and the corresponding genomes; b – the
average GC1, GC2, and GC3 contents of the Hsp60 genes. Phyla were
sorted by average total
GC content of Hsp60 sequences. Student’s t-test was used to compare the
average GC content of the Hsp60 sequences and the average GC content of
the corresponding genomes. The difference between two independent
samples of GC values is considered statistically significant if the
p-value is less than 0.05. Statistically indistinguishable average GC
values are marked with “ns” (non-significant).
The average GC content in the Hsp60 genes ranges from 0.41±0.13
(Apicomplexa) to 0.67±0.04 (Actinobacteria) (Figure 4a). In almost all
phyla, the average GC content of the Hsp60 genes is comparable to or
exceeds the genomic average; Euryarchaeota is the exception, showing
the opposite pattern. The elevated GC content of the Hsp60 genes may be
associated with recombination (GC-biased gene conversion)46,47,
repair48, and environmental changes49,50, in which the frequency of
AT→GC substitutions increases. Thus, it can be assumed that the Hsp60
gene is tightly controlled by DNA repair systems that protect the
genetic material from mutations.
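The comparison described in the caption of Figure 4 (Student's t-test on two independent samples of GC values, significance threshold 0.05) can be illustrated with a minimal sketch. It assumes the classical pooled-variance form of the test and obtains the two-sided p-value by numerically integrating the t density, so that only the Python standard library is needed. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value, using Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3  # P(0 <= T <= |t|)
    return max(0.0, 1 - 2 * central_mass)


def student_t_test(a, b):
    """Pooled-variance t-test for two independent samples: (t, df, p)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_sided_p(t, df)


# Hypothetical GC values for one phylum (not data from the study)
hsp60_gc = [0.60, 0.62, 0.64, 0.66, 0.68]
genome_gc = [0.50, 0.52, 0.54, 0.56, 0.58]

t, df, p = student_t_test(hsp60_gc, genome_gc)
label = "ns" if p >= 0.05 else "significant"
print(f"t = {t:.2f}, df = {df}, p = {p:.4f} ({label})")
```

If the two samples have clearly unequal variances, Welch's variant of the test would be the more cautious choice; the sketch above uses the pooled form only because the caption names Student's t-test.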
To determine the contribution of each of three codon positions to the
total GC content of Hsp60 genes from 17 phyla, the GC1,
GC2, and GC3 contents were calculated.
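As an illustration of how GC1, GC2, and GC3 can be obtained from a coding sequence, the following minimal sketch counts G and C at each codon position. It assumes an in-frame sequence, ignores any trailing partial codon, and counts ambiguous bases as non-GC. Whether start and stop codons were included in the original calculation is not stated in the text, so the sketch simply keeps every complete codon. The function name and the example sequence are hypothetical.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3, total GC) as fractions for an in-frame CDS."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence is shorter than one codon")
    counts = [0, 0, 0]
    for i in range(usable):
        if seq[i] in "GC":
            counts[i % 3] += 1  # i % 3 is the codon position (0, 1, 2)
    gc1, gc2, gc3 = (c / n_codons for c in counts)
    # With equal numbers of sites per position, the total GC content is
    # the mean of the three position-specific values.
    return gc1, gc2, gc3, (gc1 + gc2 + gc3) / 3


# Hypothetical toy sequence: ATG GCC AAA TGA
gc1, gc2, gc3, gc_total = gc_by_codon_position("ATGGCCAAATGA")
print(round(gc1, 2), round(gc2, 2), round(gc3, 2), round(gc_total, 3))
```

For the toy sequence, one of the four first positions, two of the four second positions, and two of the four third positions are G or C, so the total GC content is 5/12.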
The average GC1 values vary from 0.49±0.06 (Apicomplexa)
to 0.69±0.02 (Actinobacteria), and their contribution to the total GC
content of Hsp60 genes is moderate (Figure 4b). The GC2 values, which
range from 0.36±0.03 (Apicomplexa) to 0.44±0.02 (Basidiomycota), are
the least variable and have practically no effect on the GC composition
of the Hsp60 genes. This result is expected, since the second codon
position is the most conserved.
GC3 values vary from 0.25±0.15 (Firmicutes) to 0.90±0.11
(Actinobacteria), which indicates a mutational bias51.
It should be noted that, starting from Euryarchaeota in the GC-sorted
order of phyla, the average GC3 values of the Hsp60 genes increase
sharply (Figure 4b), and the average GC content exceeds 0.5 (Figure 4a).
The substitution of nucleotides at the third position of the codon,
caused by point mutations or repair processes, usually does not change
the amino acid and therefore mainly reflects the mutational pressure on
codon usage.
According to the theory35, mutational pressure tends
to push the GC content in a gene/genome towards equilibrium (neutrality
of codon usage), reducing the heterogeneity caused by natural
selection34. Equilibrium of the nucleotide composition
of the gene/genome, in which selective constraints (factors that reduce
the evolutionary divergence of the functional sequence) do not affect
the GC content, is achieved when the frequencies of the AT→GC and GC→AT
mutations are equal34. These mutations can be fixed or
removed from the population by natural selection and random genetic
drift35. The frequencies of the AT→GC and GC→AT
mutations at the third position of the codons also reflect the direction
of the mutational pressure. In general, the GC3 value is
less than 0.5 when the gene is under the influence of AT pressure, and
vice versa52. Thus, we can initially identify
two groups of Hsp60 genes that differ in the direction of mutational
pressure (Figure 4b). The AT-group includes Apicomplexa, Chlamydiae,
Firmicutes, Streptophyta, Nematoda, Bacteroidetes, Mollusca,
Cyanobacteria, and Chordata, which have an average total GC content of
less than 0.5 in the Hsp60 genes. Conversely, the phyla Euryarchaeota,
Arthropoda, Ascomycota, Proteobacteria, Euglenozoa, Basidiomycota,
Chlorophyta, and Actinobacteria form a GC-group with an average total GC
content of more than 0.5. However, the threshold of 0.5 is only
nominal, because the rates of mutation and repair processes are not
balanced53. Therefore, a neutrality analysis was carried out to clarify
the direction of the mutational pressure, to reveal the degree of its
influence on codon usage, and to determine the equilibrium point
(Figure 5).
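The grouping by GC3 and the idea behind a neutrality analysis can be sketched as follows. This is a generic neutrality-plot calculation, not necessarily the procedure used for Figure 5. It assumes that GC12 (the mean of GC1 and GC2) is regressed on GC3 by ordinary least squares, and that the equilibrium point is taken as the value at which the fitted line crosses the diagonal GC12 = GC3. Both assumptions, the function names, and the regression data are hypothetical. Only the two GC3 values in the usage lines (0.25 and 0.90) come from the text.

```python
def pressure_direction(gc3, threshold=0.5):
    """Nominal direction of mutational pressure from the GC3 value."""
    return "AT" if gc3 < threshold else "GC"


def neutrality_regression(gc3_values, gc12_values):
    """Least-squares fit GC12 = slope * GC3 + intercept."""
    n = len(gc3_values)
    mean_x = sum(gc3_values) / n
    mean_y = sum(gc12_values) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(gc3_values, gc12_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def equilibrium_point(slope, intercept):
    """GC value where the fitted line meets the diagonal GC12 = GC3.

    Assumption of this sketch: solving x = slope * x + intercept.
    """
    if slope == 1:
        raise ValueError("fitted line is parallel to the diagonal")
    return intercept / (1 - slope)


# GC3 values quoted in the text for Firmicutes and Actinobacteria
print(pressure_direction(0.25), pressure_direction(0.90))

# Hypothetical per-gene values, for illustration only
gc3 = [0.2, 0.4, 0.6, 0.8]
gc12 = [0.44, 0.48, 0.52, 0.56]
slope, intercept = neutrality_regression(gc3, gc12)
eq = equilibrium_point(slope, intercept)
print(f"slope = {slope:.2f}, equilibrium point = {eq:.2f}")
```

In neutrality plots of this kind, a slope close to 1 is usually read as a dominant role of mutational pressure, and a slope close to 0 as a dominant role of selective constraints on the first two codon positions.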