I. Introduction
Environmental DNA (eDNA) obtained from ancient samples such as sediments, ice or water is a valuable data source for a wide range of disciplines concerned with past and present biodiversity and biogeography [1-4]. Within the field of ancient metagenomics, the number of published genetic datasets has risen dramatically in recent years, and these datasets have become an increasingly powerful tool for investigating wide-ranging topics [5]. However, ancient environmental metagenomics still faces many issues relating to ancient DNA (aDNA) that need to be addressed, such as its degraded nature, incomplete reference databases, and sensitivity to contamination by modern DNA [6-8]. This review aims to provide an overview of the use of ancient metagenomics in large-scale ecological and evolutionary studies of individual taxa and communities of both microbes and eukaryotes, and to illustrate the limitations, risks, and potential of ancient eDNA research based on high-throughput sequencing (HTS) technologies. Further, paleogenetics and paleogenomics will provide diverse insights into evolution and how the present world came to be.
II. Ancient eDNA and ancient environmental metagenomics
In general, eDNA extracted from ancient samples is extremely fragmented and chemically modified, to a degree that depends on the sample type [6]. Typically, ancient eDNA fragments are between 70 base pairs (bp) and just under 100 bp long [9], with ends affected by cytosine deamination [10]. Only in a few cases of extraordinary preservation, such as under Antarctic conditions, have longer fragments been recovered; for example, aDNA fragments of 500 bp were recovered from lake sediment [11]. Such preservation is generally associated with anoxic, cold and dry conditions [6]. Among environmental sources of aDNA, the term sedimentary ancient DNA (sedaDNA) is widely used and applies to DNA isolated from sedimentary deposits in lake cores [12-14] and in marine [15, 16], cave [17-19], ancient forest [20], permafrost [13, 21-23], peat [24] and tropical swamp [25] settings. Many other materials also have the potential to provide information about the past through aDNA analysis, such as basal ice [20], glacial soil [26] and silt-soaked material [27]. aDNA datasets and traditional proxy results appear to complement each other: combined, they reveal a greater diversity of species than either methodology used independently [15, 28, 29]. Therefore, aDNA should be considered a complementary, rather than alternative, approach to more traditional, established methods [3, 30].
The metagenomics of ancient environmental DNA can be broadly defined as the study of the total genetic content of samples that have degraded over periods ranging from several hundred to hundreds of thousands of years [5, 31]. Despite a broad range of applications, including genome reconstruction of specific microbial taxa [12, 32], studies of host-associated microbial communities [33, 34], and environmental reconstructions using sedaDNA [5, 25, 35], the main use of ancient eDNA has been almost entirely limited to inventorying taxa through time using a DNA metabarcoding approach [15, 16, 36, 37]. Recent advances in next-generation sequencing (NGS), also called massively parallel or deep sequencing, have the potential to radically change this situation, moving from the sequencing of millions of short DNA fragments to the generation of genome-scale datasets from extant and extinct species through bioinformatic analyses [12, 13, 32, 37].
III. The problem of environmental ancient DNA
Despite recent methodological advances in aDNA extraction, polymerase chain reaction (PCR) and sequencing, the applicability and outcome of aDNA studies can be negatively affected by several inherent technical issues. Part of the challenge is that ancient samples are often rare and precious materials, and they present problems such as low DNA quantity, DNA damage, high fragmentation, and contamination from modern sources [6]. In general, ancient eDNA samples should be processed and analyzed following practical recommendations for ancient DNA research to prevent contamination, as reviewed by Capo et al., 2021 [35] for lake sediment cores and Armbrecht et al., 2019 [8] for marine sediment cores.
Current aDNA extraction protocols are not very different from the protocols used to obtain DNA from environmental settings, which include silica-based, alcohol-based, and phenol-chloroform protocols [22, 38, 39]. For molecular analyses, the yield and integrity of the recovered aDNA influence the reliability of subsequent results. Therefore, aDNA extraction protocols should be carefully chosen and adapted to the physical and chemical properties of the sediments, DNA-substrate interactions, and the target organisms [8, 15, 40, 41]. Further, quick, simple and direct DNA extraction procedures are needed for routine analysis of aDNA.
DNA damage alters the base-pairing properties of individual bases and is vastly over-represented in aDNA sequences. This increases the rate of polymerase misincorporation errors, and therefore of sequencing errors, because wrong nucleotides are incorporated opposite modified bases [42, 43]. During PCR, DNA damage can block primer binding or DNA polymerase progression, preventing amplification of the template, while hydrolysis of the phosphodiester bond results in single-strand breaks [44-46]. For instance, the majority of errors are caused by deamination of cytosine to uracil, which pairs with adenine instead of guanine and leads to apparent cytosine-to-thymine (C-to-T) transitions [45-47]. However, the well-characterized degradation features of aDNA, i.e., damage patterns and high fragmentation, allow us to authenticate 'true' aDNA sequences.
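The deamination signal described above can be summarized as a C-to-T misincorporation frequency per read position, which is expected to be elevated near fragment ends in damaged DNA. The following is a minimal Python sketch of that calculation, not a published pipeline. It assumes that reads have already been aligned to a reference without gaps and are given in the same orientation as the reference; the function name `ct_frequency_by_position`, its `max_pos` parameter and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter


def ct_frequency_by_position(pairs, max_pos=10):
    """Return the C-to-T mismatch frequency for the first `max_pos`
    positions counted from the 5' end of each read.

    `pairs` is an iterable of (reference, read) strings of equal length
    (assumption: gap-free alignments, same orientation).
    Entry i of the result is
        (# reference C read as T at position i) / (# reference C at position i)
    or None when no reference C was observed at that position.
    """
    ref_c = Counter()   # reference cytosines seen at each position
    c_to_t = Counter()  # of those, how many were read as thymine
    for ref, read in pairs:
        if len(ref) != len(read):
            raise ValueError("reference and read must have equal length")
        for i, (r, q) in enumerate(zip(ref.upper(), read.upper())):
            if i >= max_pos:
                break
            if r == "C":
                ref_c[i] += 1
                if q == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / ref_c[i] if ref_c[i] else None for i in range(max_pos)]


# Toy data (invented for illustration): two of the three reads that start
# on a reference C show a T at the first position.
toy_pairs = [
    ("CCAGT", "TCAGT"),
    ("CAGCT", "TAGCT"),
    ("CGCAT", "CGCAT"),
    ("ACCGT", "ACCGT"),
]
profile = ct_frequency_by_position(toy_pairs, max_pos=3)
print(profile)
```

In this toy input the frequency is highest at the first position and zero at the next two, which is the shape of profile used to support authenticity. Real data would require parsing alignments, handling indels and the reverse strand (where the same damage appears as G-to-A), none of which this sketch attempts.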
IV. How to study ancient metagenomics
The application of several technologies, from PCR and earlier methods such as Sanger sequencing to HTS, also known as next-generation sequencing (NGS) [48], for short-read (shotgun) sequencing [49] or long-read sequencing, has started a revolution in ancient DNA research (Figure 1). While traditional PCR methods could only amplify a small number of specific target sequences, HTS combines amplification and sequencing of up to several billion individual DNA library templates at a time. The DNA/RNA metabarcoding approach is an extension of DNA barcoding that relies on HTS technologies [36, 50-53]. Furthermore, HTS can sequence shorter DNA fragments through shotgun sequencing [37] and even recover whole-genome sequences for paleogenomic studies [12, 54, 55]. These technologies generate large quantities of highly accurate DNA sequences at lower cost than was possible with first-generation sequencing technologies.