a: Introns in PCGs and rRNA; b: Exons in PCGs; c: rnl and rns including exons.
3.2. Protein‑coding genes and codon usage among the four mitogenomes
Most PCGs and rps3 are initiated with the ATG codon, whereascox1 and nad6 in the O. xuefengensis mitogenome are initiated with the TTG codon. Further, cox3 in the O. sinensis mitogenome and atp9 in both O. xuefengensis andO. sinensis mitogenomes are initiated with the GTG codon. In addition, most PCGs and rps3 are terminated with the TAA codon, but cox1 , rps3, and cox3 in the O. sinensismitogenome, and cox1 and cob of the O. xuefengensismitogenome are terminated with the TAG codon (Table S3). Comparative analysis indicated that all the PCGs and rps3 encode 20 amino acids, with similar amino acid compositions among the cordyceps mitogenomes. The most overused amino acid is Leu (ranging in prevalence from 628 to 652), whereas the least used amino acid is Cys (32 – 48 residues). The usage of Asn, Lys, Asp, and Glu in the O. xuefengensismitogenome is much higher than in those of the other three cordyceps (Fig. 3, Table S4). The relative synonymous codon usage (RSCU) of the 14 PCGs and rps3 in the four cordyceps mitogenomes was also analyzed. A total of 4,616, 4,429, 4,326 and 4,366 codons were observed among the PCGs and rps3 of the O. xuefengensis, O. sinensis, C. militaris, and C. brongniartii mitogenomes, respectively. The most used codon among all four mitogenomes is UUA, but the most frequently used codon is AGA in the O. xuefengensis andO. sinensis mitogenomes (Fig. 4, Table S5).
3.3. Variation, genetic distance, and evolutionary rates of common genes
The lengths and/or GC contents of the 14 core PCGs and rps3 are not all consistent across the four mitogenomes, with the exception of atp8, atp9, and nad4L PCGs. In particular, the length of nad5 in O. xuefengensis is about 1,000 bp longer than in the other three cordyceps mitogenomes. Further, the GC content of all 14 PCGs and rps3 differ, indicating that the core PCGs and rps3 have been modified within different cordyceps. Across all PCGs in the mitogenomes, GC content is highest in atp9 within all four cordyceps mitogenomes, and lowest in atp8 within the O. xuefengnensis and O. sinensis mitogenomes, but lowest in rps3 within the C. militaris and C. brongniartii mitogenomes. AT skew is negative in most PCGs except rps3, while GC skew is present in most PCGs except atp8. In particular, the AT skew of atp6 in the C. militaris mitogenome and nad6 in the O. xuefengnensis mitogenome are negative, while the GC skew of only rps3 in the C. brongniartii mitogenome is negative (Fig. 5).
Across all 14 PCGs and rps3 genes that were evaluated, rps3 has the greatest K2P genetic distance among the mitogenomes, followed by cox1. atp8 has the least genetic distance among the genes investigated, indicating a high degree of conservation. Ka analysis indicated that rps3 and cox1 have relatively high nonsynonymous substitution rates. Further, the Ks of nad1 is the highest across all four mitogenomes, while that of atp8 is the lowest. The Ka/Ks values for all 14 PCGs and rps3 are < 1, suggesting that these genes are subject to purifying selection (Fig. 6, Table S6).
3.4. Repetitive elements
A total of 18 repeat sequences were identified in the mitogenome of O. xuefengensis, 250 in that of O. sinensis, 6 in that of C. militaris and 23 in that of C. brongniartii (Table S7). The length of repeat sequences ranges from 38 bp to 372 bp, with pairwise nuleotide similarities ranging from 74.73% to 100%. The largest repeat region was observed in the O. sinensis mitogenome, within ORF322 and ORF308. The largest repeat region in the O. xuefengensis mitogenome is 220 bp long and located in the intronic region of cox2 and ORF301, while the largest repeat region in the C. militaris mitogenome is 259 bp long and located in the intronic region of rnl. Lastly, the largest repeat region in the C. brongniartii mitogenome is 191 bp long and located between cox1 and trnR.
Seven tandem repeats were detected in both the O. xuefengensis and C. brongniartii mitogenomes, 43 in the O. sinensis mitogenome, and 5 in the C. militaris mitogenome (Table S8). The longest tandem sequence is found in the O. sinensis mitogenome, comprises 123 bp, and is located in the in the ORF138 coding region. Of all tandem repeats in the four mitogenomes, most exist in one or two copies. REPuter identified forward (F) and palindromic (P) repeats in the mitogenome of each cordyceps (Table S9), including 69 F and 34 P repeates in the mitogenome of O. xuefengensis that account for 10.16% of the total mitogenome, along with 16 F and 1 P repeats in the C. militaris mitogenome that account for 4.95% of its total mitogenome. In addition, 38 F and 19 P repeats were identified in the C. brongniartii mitogenome that account for 14.54% of its total mitogenome, and a total of 754 F, 538 P, 31 C, and 60 R repeats were identified in the O. sinensis mitogenome that account for 65.55% of its total mitogenome.
3.5. Gene rearrangements and phylogenetic analysis
The relative positions of the mitochondrial genes (including rnl, rns, rps3, the 14 core PCGs, and the tRNAs) are highly conserved in the four cordycep mitogenomes. The O. xuefengensis mitogenome only lacks trnI, consistent with that of O. sinensis (Fig. 7). Genome synteny analysis also identified several instances of gene rearrangement in the mitogenomes of the four cordyceps. All four cordyceps mitogenomes could be divided into two homologous regions, with the sizes and relative positions of these homologous regions substantially differing among the four species. Mitogenomic rearrangements were not observed among the four species (Fig. 8).
A complete mitogenome was produced for O. xuefengensis in this study and used with 19 other complete mitogenome sequences from six families for phylogenetic reconstruction to further investigate the phylogenetic position of O. xuefengensis within the Hypocreales order. Species and NCBI accession numbers used in these phylogenetic analyses are listed in Supplementary Table S10. Bayesian inference (BI) based on mitochondrial gene datasets using the GTR + I + G nucleotide substitution model yielded an identical and well-supported topology in which all major clades are well supported. The 20 fungal species comprised six major clusters, corresponding to the major groups of the orders Hypocreales and Agaricomycetes. The four cordyceps species were divided into two groups, wherein the close relationship between O. xuefengensis and O. sinensis was highly supported (Fig. 9).
4. Discussion
Cordyceps are fungal species used in traditional medicine that are represented by the well-known fungal resource, O. sinensis.However, natural populations of O. sinensis have been over harvested to the extent that it is an endangered species. It is consequently necessary to explore alternative cordyceps resources, because artificial cultivation of O. sinensis is difficult to achieve [15]. O. xuefengensis is a new potential cordyceps resource that was found on Xuefeng mountain in Hunan. Investigation of its artificial cultivation and biometabolite biosynthesis yielded some positive results [14]. Several previous studies have investigated the phylogenetic relationships of cordyceps, although detailed phylogenetic relationships for the species remain poorly understood, especially since increasing numbers of related species have been reported [16-18]. It is difficult to accurately classify fungal species based on limited morphological characters and overlapping morphological features [19]. Consequently, mitogenomes have been widely used in the phylogenetic analysis of eukaryotes due to many advantages, including uniparental inheritance, rapid evolutionary rates, and the presence of several molecular markers [20]. Here, we report a new mitogenome sequence for O. xuefengensis that could provide new insights into the evolutionary histories of cordyceps fungal mitogenomes and the taxonomic relationships among cordyceps.
Mitogenome sizes significantly differ among fungi, ranging from approximately 11.09 kbp (Hanseniaspora uvarum ) [21] to 272.2 kbp (Morchella importuna ) [22] in length. The variation in size of mitogenomes among different fungal species primarily results from the presence and extent of intergenic regions and introns [23]. The Ascomycota cordyceps fungus O. sinensis contained a mitogenome with introns accounting for 68.09% of the whole mitogenome, while introns in the mitogenomes ofO. xuefengensis,C. militaris, and C. brongniartii only accounted for 26.85%, 29.49% and 24.91% of their mitogenomes, respectively. Further, intergenic regions within theO. xuefengensis mitogenome accounted for 43.93% of the whole mitogenome, but only 11.41%-17.66% of the mitogenomes of the other three cordyceps species. Thus, considerable variation in the size of introns and intergenic regions are the major reasons underlying the significant differences in the mitogenome sizes ofO. sinensis and O. xuefengensis . In contrast, C. militaris and C. brongniartii exhibited similarly sized mitogenomes, due to similar numbers and sizes of intergenic regions and introns.
However, the large difference in introns did not affect the taxonomically close relationship between O. sinensis and O. xuefengensis. Maximum Likelihood and Bayesian phylogenetic analyses were constructed using 14 PCGs and rps3 from 20 fungi to reveal the taxonomic relationships of these fungi. The phylogenetic analysis highly supported the close relationship between O. xuefengensisand O. sinensis . This apparent domestication of ancestral introns may indicate an adaptation to their host genome that may be due to their decreased mobility [24], or alternatively, to their potential important role in stabilizing the gene that hosts the intron [25, 26]. Other introns may be acquired late in the divergence of taxa, either through HGT events or through active transposition [27, 28]. The transposition of some introns to other genomic regions with less sequence similarity can occur more frequently under stress-induced conditions [29, 30]. A recent study demonstrated that the number and Pcl of introns are highly variable between two Rhizopogon species [31]. Further, the introns in nuclear genes of S. cerevisiaeplay crucial roles in the survival of the organism under starvation conditions [32]. Consequently, the abundance of introns in the mitochondrial genome of O. sinensis relative to other cordyceps species may be a consequence of the particular environments they inhabit.
The accumulation of repeated evolutionary events in fungal mitogenomes could lead to over-dispersal of repeat sequences and the introduction of new genes through HGT, thereby contributing to dynamic changes in genome structure and gene order [33]. Moreover, the accumulation of repeat sequences could also be highly related to gene recombination and gene loss [34]. A variety of models have been proposed to investigate these mitochondrial gene rearrangements [35]. Here, the mitogenome of O. sinensis harbored a larger proportion of repeat sequences than the mitogenomes of O. xuefengensis , C. militaris orC. brongniartii . Interestingly, collinearity analysis revealed that the mitogenomic gene order in these four fungi was not highly variable. In addition, positive selection of core genes was not observed by Ka/Ks analysis. This differs from observations in plants, wherein mitochondrial gene order is highly variable due to high rates of recombination [33]. These observations collectively indicate that selection upon the genomes by environmental factors is relatively weak among the four cordyceps fungi evaluated here, resulting in highly conserved phylogenetic relationships. Thus, we could also infer that differentiation of the cordyceps genomes remain in early stages, consistent with the molecular phylogenetic analyses. One reasonable explanation for these contradictory results is that the larger proportion of repeat sequences in the genome of O. sinensisexists in intronic regions, but not in intergenic regions, leading to little influence on their evolution.
Mitochondrial genomes are obtained from a common ancestor and have been widely used in population genetics and evolutionary studies [36, 37]. Different barcode genes, including ITS, LSU, RPB2, and EF1α, have been used for phylogenetic analysis [9, 38]. However, the large number of available molecular markers in mitogenomes and their independent evolutionary histories make them attractive tools for reconstructing phylogenetic relationships [39]. Here, we reconstructed a well-resolved phylogeny based on the combined alignment of 14 PCGs and the rps3 gene that separated 20 fungal species into major clades. In addition, K2P and Ka/Ks analysis indicated thatcox1 and rps3 divergence play important roles in population differentiation. Given that reliable genetic molecular markers are critical for further understanding the phylogenetics and classification of species, these two genes could be further evaluated as barcodes for fungal species identification.
5. Conclusions
This study extends the known mitogenomes of cordyceps fungi and identified variation in PCGs, tRNA genes, and rRNA genes among medicinal cordyceps species, including O. xuefengensis . Comparative mitochondrial genome analysis of four cordyceps fungi were conducted, revealing that their mitochondrial genomes are conserved and that differentiation of their mitogenomes is still in the early stages. Nevertheless, results indicated that the cox1 and rps3genes likely play important roles in population differentiation among the taxa. Mitogenome analyses, like those reported here, will allow further study of the population genetics, taxonomic relationships, and evolutionary biology of medicinally important cordyceps species.
Supplementary Materials: The following materials are available online. Table S1: General tRNA features in the four cordyceps mitogenomes. Table S2: The number of introns and their host genes. Table S3: Complete mitochondrial genome characteristics for the four cordyceps analyzed here. Table S4: Amino acid composition of mitogenomes among the four cordyceps species. Table S5: The RSCU of 14 PCGs and rps3 in the four cordyceps mitochondrial genomes. Table S6: Genetic analysis of 14 core PCGs and rps3 among the four cordyceps genomes. Table S7: Local BLAST analysis of the four cordyceps mitogenomes against themselves. Table S8: Tandem repeats detected in the mitogenomes of cordyceps using the Tandem Repeats Finder program. Table S9: Distribution of repeat loci in the four cordyceps mitogenomes identified by REPuter. Table S10: The GenBank accessions for genomes used in the phylogenetic analysis.
Author Contributions: Conceptualization: C.Z., S.X., and. S.Z. Methodology: J.J., C.Z., R.Z., and H.L. Software analysis: C.Z., J.J., and J.X. Investigation: R.Z. Resources: C.Z. and H.L. Data curation: C.Z., J.J., and J.X. Writing (original draft preparation): C.Z. Writing (review and editing): S.Z., J.J., and S.X. Visualization: J.X. Project administration: D.W. Funding acquisition: S.Z. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the National Natural Science Foundation of China (No. 81673585), the Natural Science Foundation of Hunan Province (2018JJ3309, 2020JJ5330), the Program of Survey and Monitoring of Chinese Medicines for National Drugs ([2017] 66) and the Key Project at the Central Government Level for the Ability Establishment of Sustainable Use for Valuable Chinese Medicine Resources (2060302).
Conflicts of Interest: The authors declare no conflicts of interest.