Taxonomic composition
Because the data sets in this study comprised ESVs, OTUs or morphospecies, direct taxonomic comparisons among them were performed at the genus level. Metabarcoding results from the 10 g and 0.5 g DNA extracts showed highly concordant taxonomic composition, with only a few mismatched taxa, all of which were represented by a low relative abundance of reads. In accordance with the community analyses, this indicates that the detection of diatoms via metabarcoding from lake sediments is independent of sample size. However, comparisons between microscopy and metabarcoding data resulted in a higher number of mismatched taxa (Fig. 4). Incompletely matching identifications from microscopy vs. metabarcoding have been reported in several previous diatom studies (e.g. Visco et al. 2015; Rivera et al. 2018; Tapolczai et al. 2019), which also discuss the possible reasons. One of the main reasons for such mismatches is the incompleteness of the reference sequence databases, which contain a limited number of annotated taxa. For example, Sichuaniella lacustris, detected only by morphological analyses, is the sole representative of the genus Sichuaniella, which was originally described from Sichuan Province on the southeastern Tibetan Plateau (Li, Lange-Bertalot & Metzeltin 2013) and has no genetic information in the public databases. Therefore, the identity of this species in the metabarcoding data set cannot be confirmed. Additionally, there are no reference sequences for genera such as Platessa, Odontidium and Gomphosinica in the public databases. Gomphosinica was separated from Gomphonema and described as a new genus based only on morphological differences (Kociolek, You, Wang & Liu 2015). Thus, Gomphosinica in the microscopy data set could potentially be represented as Gomphonema in the metabarcoding data set.
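The genus-level comparison described above can be illustrated with a minimal Python sketch. It assumes that each data set is available as a plain list of taxon names whose first word is the genus. The function names (`genus_of`, `compare_genera`) and the example lists are hypothetical; they are not the study's data or its actual workflow.

```python
def genus_of(taxon_name: str) -> str:
    """Return the genus, taken as the first word of a taxon name."""
    return taxon_name.strip().split()[0].capitalize()


def compare_genera(microscopy_taxa, metabarcoding_taxa):
    """Collapse both taxon lists to genus level and report matches and mismatches."""
    micro = {genus_of(t) for t in microscopy_taxa if t.strip()}
    meta = {genus_of(t) for t in metabarcoding_taxa if t.strip()}
    return {
        "shared": sorted(micro & meta),
        "microscopy_only": sorted(micro - meta),
        "metabarcoding_only": sorted(meta - micro),
    }


# Illustrative input only (assumed names, not results from this study).
microscopy = [
    "Sichuaniella lacustris",
    "Pseudostaurosira brevistriata",
    "Gomphonema sp.",
]
metabarcoding = [
    "Staurosira construens",
    "Gomphonema parvulum",
    "Entomoneis sp.",
]

result = compare_genera(microscopy, metabarcoding)
for category, genera in result.items():
    print(category, genera)
```

Collapsing to genus before comparing means that ESVs, OTUs and morphospecies with different species-level resolution can still be matched. Genera without reference sequences then appear in the microscopy-only set.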
Inter-investigator variation, arising from changes in diatom taxonomy and the use of synonymous names, could be a further source of mismatches between microscopy and metabarcoding data. In this study, it was difficult to consistently separate Staurosirella and Pseudostaurosira (missing from metabarcoding data) from Staurosira (present in metabarcoding data) under the light microscope, even with support from SEM images. Although Pseudostaurosira was one of the most abundant genera in the microscopy data, it was missing from the metabarcoding inventories, whereas the relative abundance of Staurosira was high in the latter data sets (Table S3). Medlin, Yang and Sato (2012) pointed out that the molecular separation of Pseudostaurosira and Staurosirella from Staurosira is arguable. On the other hand, in the few studies that have attempted to merge morphological and molecular phylogenies of the Fragilariaceae, the morphological characterization is often poorly done (Morales et al. 2019). We speculate that morphologically identified Pseudostaurosira (especially Pseudostaurosira brevistriata) corresponds to Staurosira in the metabarcoding data, as their presence-absence patterns in our sediment samples correlate well (Table S3). Furthermore, identification of Pseudostaurosira in the metabarcoding data sets was also limited by the fact that almost all sequences originally named Pseudostaurosira were re-assigned to Staurosira in the curated R-Syst diatom database (Rimet et al. 2016).
Most of the other genera missing from the metabarcoding data sets were represented at very low abundances in the microscopy data. Similarly, Kermarrec et al. (2013) reported that morphologically identified low-abundance taxa (< 1% from 450 valve counts) were often not detected in the DNA metabarcoding data set. These low-abundance taxa may indicate the transport of diatom valves from other locations, carrying DNA that is highly degraded and thus not detectable with the primers used here. On the other hand, environmental DNA can be carried over large distances (Deiner & Altermatt 2014), which could also contribute to the observed ‘extra’ diatom taxa in the metabarcoding data sets that were not detected via microscopy. Moreover, some diatom taxa with fragile and weakly silicified valves, such as Cylindrotheca, Entomoneis, Fistulifera, Reimeria and Seminavis, which were detected only in the metabarcoding data sets, might be sensitive to the chemical treatment (e.g. HCl and H2O2) applied during sample preparation for microscopy. Based on personal observations of water samples from Nam Co, we have confirmed the presence of several Entomoneis and Fistulifera species (data not shown), which further supports the assumption that valves of fragile diatom species may be more prone to dissolution and are therefore undetectable in sediment samples. Thus, the incompleteness of the reference databases, together with the continuously changing diatom classification system and the characteristics of DNA transport, contributes at least to some extent to the issue of non-matching taxa between microscopy and metabarcoding results.