Richness and composition
The numbers of ESVs/OTUs/morphospecies per sample for the metabarcoding and microscopy data are summarized in Table 1. The numbers of ESVs/OTUs per sample in the HTS 10 and HTS 0.5 treatments were strongly positively correlated (Spearman R > 0.863; Fig. 2a-c; Fig. S1), and ESV/OTU richness did not differ significantly between the two treatments (Fig. 2d). Accordingly, the intra-pipeline comparisons of HTS 10 and HTS 0.5 data (20 comparable samples; Table 1) showed high proportions of shared ESVs/OTUs (63.9-87%; Fig. 3). For the genus-level comparisons across metabarcoding data sets, ESVs and OTUs were annotated to genus level only when the similarity and coverage of the representative read of the ESV/OTU were ≥ 95% against a reference sequence. Similarly, a large proportion of genera was shared between HTS treatments (87.7-92.2%; Fig. 3). Interestingly, inter-pipeline comparisons revealed that a higher proportion of genera was shared between HTS 0.5 treatments than between HTS 10 treatments (87.7% vs. 76.4%; Fig. S2). Compared with the data generated with the OTUs pipelines, the data from the ESVs pipeline harbored a higher number of different genera (67 vs. 62 for HTS 10 and 69 vs. 65 for HTS 0.5; Fig. S2). For the HTS 10 data, the unique genera of the ESVs data set (8 genera; i.e. genera that were identified only in the corresponding data set) represented a total of only 2.67% of sequences (range of < 0.001% to 1.44%; Table S3). For the HTS 0.5 data, the unique genera of the ESVs data set (4 genera) represented a total of less than 0.1% of sequences (range of < 0.001% to 0.016%; Table S3). The 97% OTUs data set did not contain any unique genera, and the 95% OTUs data set contained only one unique genus (Sternimirus, sequence abundance < 1%; Fig. S2; Table S3).
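The genus-level annotation rule and the shared-genera proportions reported above can be illustrated with a minimal Python sketch. The function names (`annotate_genus`, `shared_proportion`), the dictionary layout of a reference hit, and all toy values are hypothetical and are not taken from the study. The sketch also assumes that the shared proportion is the intersection of two genus lists divided by their union; the text does not state which denominator was used.

```python
def annotate_genus(hit, min_similarity=95.0, min_coverage=95.0):
    """Return the genus of a reference hit, or None if either threshold fails.

    `hit` is a hypothetical dict with keys "genus", "similarity", "coverage"
    (both percentages); the 95% thresholds follow the rule in the text.
    """
    if hit["similarity"] >= min_similarity and hit["coverage"] >= min_coverage:
        return hit["genus"]
    return None


def shared_proportion(genera_a, genera_b):
    """Percentage of genera present in both sets, relative to their union.

    Assumption: the union is used as the denominator.
    """
    union = genera_a | genera_b
    if not union:
        return 0.0
    return 100.0 * len(genera_a & genera_b) / len(union)


# Toy hits for illustration only (not data from the study).
hits = [
    {"genus": "Aulacoseira", "similarity": 99.1, "coverage": 100.0},
    {"genus": "Staurosira", "similarity": 96.0, "coverage": 97.5},
    {"genus": "Achnanthidium", "similarity": 93.2, "coverage": 100.0},  # similarity too low
    {"genus": "Navicula", "similarity": 98.0, "coverage": 90.0},  # coverage too low
]

# Genera retained after applying the annotation thresholds.
genera_a = {g for g in map(annotate_genus, hits) if g is not None}
# A second, hypothetical genus list to compare against.
genera_b = {"Aulacoseira", "Staurosira", "Navicula"}

print(sorted(genera_a))
print(round(shared_proportion(genera_a, genera_b), 1))
```

Using sets keeps the comparison independent of sequence abundance, so a genus counts once per data set no matter how many ESVs/OTUs were assigned to it.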
Morphological examination of the sediment samples recovered a total of 189 diatom taxa from 11 surface sediment samples (Table 1), representing 59 genera (Fig. 4; Table S3). Unlike the per-sample richness correlations observed between the metabarcoding treatments (Fig. 2a-c), no significant correlations were found between richness values from the microscopy and metabarcoding data (P > 0.398 in all cases; Fig. S3). Across treatments, the species richness detected by microscopy differed significantly only from that of the ESVs data (Fig. 2d). The genus composition detected by microscopy was compared with that of the metabarcoding data, which harbored 54 different genera for the ESVs data, and 49 and 50 genera for the 97% and 95% OTU data, respectively. The genus-level comparisons (among 11 corresponding samples) revealed that 50.7-54.3% of genera were shared between microscopy and metabarcoding treatments (Fig. 4). Compared with the metabarcoding inventories, the microscopy data set harbored a larger proportion of unique genera (Fig. 4). The majority of these were represented at low abundance in the microscopy data set (< 9 counted valves per sample). However, Pseudostaurosira, one of the most abundant of the genera that were completely missing from the metabarcoding data, accounted for 519 counted valves (11.77%) across the microscopy data set.
Comparing the relative abundances of valves and sequences of the matching genera between microscopy and metabarcoding data revealed overall significant positive correlations (Spearman R > 0.317 and P < 0.023; except for 97% OTUs HTS 10 vs. microscopy data, where P = 0.067; Fig. S4). The notable exceptions were Pantocsekiella and Achnanthidium, which had high relative abundance in the microscopy data but low relative abundance in the metabarcoding data (Fig. S4). Conversely, Staurosira and Aulacoseira had high relative abundance in the metabarcoding data but low relative abundance in the microscopy data (Fig. S4).
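As a reminder of what the Spearman coefficients above measure, the following minimal Python sketch computes Spearman's R as the Pearson correlation of average ranks (ties receive the mean of their rank positions). It uses only the standard library and does not compute P values. The helper names and the toy abundance vectors are hypothetical, and this is not necessarily the software or procedure used in the study.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy relative abundances per genus (illustrative values, not study data).
microscopy = [0.40, 0.25, 0.20, 0.10, 0.05]
metabarcoding = [0.05, 0.30, 0.35, 0.20, 0.10]

rho = spearman(microscopy, metabarcoding)
print(round(rho, 3))
```

Because the coefficient depends only on ranks, a genus that is dominant in one data set and rare in the other lowers R regardless of the absolute size of the abundance difference.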