Results

eDNA detected taxa list and comparison with conventional sampling

The total numbers of sequence reads obtained before and after quality control (metabarcoding bioinformatic process) were 45.999x106 and 36.820 x106, respectively. A total of 53.589x103 reads assigned to UNKTaxa were discarded. A total of 474.323x103 reads were assigned to WASTaxa. All the remaining reads were assigned to KNWTaxa. The final mean sequencing depth was 34,161 per positive PCR replicate (range 90x103 to 179.209x103).
Out of a total of 86 taxa detected after the bioinformatic process and reassignment procedure (see Supplementary Tab 1 for taxa abbreviation list), five were classified as UNKTaxa (Bar_med, Eso_cis, Onc_cla, Onc_mas, Ric_bal) and 21 as WASTaxa. Among WASTaxa (Fig. 2), the two most abundant (Onc_myk and Sal_spp) were detected at 15 and 7 of the 47 sites, respectively, in the Upper Danube and its tributaries. Most of the other taxa were marine species detected mainly (71% of the total number of occurrences) downstream from Vienna (7 taxa) and on two tributaries, the Arges River (8 taxa) and the Russenki Lom River (5 taxa). Of the 60 remaining taxa classified as KNWTaxa (Supplementary Tab 1), 48 were identified at the species level, eight at the genus level, and four at a higher taxonomic level (taxa groups).
When only considering the 18 sites sampled with both TEF and eDNA, 40 of the 62 species caught by using TEF were also detected at the species level by using eDNA. Eighteen of the remaining species were detected by using eDNA at a higher taxonomic level. Eight species-specific taxa (”Aci_gue”, ”Aci_rut”, ”Aci_ste”, ”Bar_car”, ”Ben_sp”, ”Pun_pla”, ”Rom_ura”, ”Umb_kra”) and one taxa group (Cor. sp.) were detected only by using eDNA. Four species (”Clu_cul”, ”Eud_dan”, ”Eud_mar”, ”Neo_eur”) were caught only by using TEF. Despite the lack of discrimination between certain species by using eDNA, the taxon richness per site obtained by using eDNA was higher than that obtained by using TEF (Fig. 3), with mean richness values of 29.7 and 21.6, respectively (Student’s t test, t = 5.2147, df = 17,p < 0.001). The difference was slightly greater when species caught by using TEF were grouped together following the taxonomic assignment used for the eDNA taxa (mean TEF species richness of 20.17, t = 6.1429, df = 17, p< 0.001).

Comparison between absolute eDNA copy concentration and TEF abundance

The average amount of teleo-DNA per sample was 4,130,634 DNA copies (range 50,676 to 23,684,000), corresponding to an average concentration of 1,223,819 DNA copies per litre (range 7,219 to 9,046,465). The concentration of teleo-DNA per site decreased along the first 500 km of the Danube and remained stable downstream (Fig. 4). The teleo-DNA concentrations in the tributaries were significantly higher than those in the Danube (Student’s t test, t = -5.231, df = 44.987, p< 0.001). For all 47 sites, the teleo-DNA concentrations were negatively correlated with the mean water flow (Fig. 4, Pearson’s R coefficient = -0.740, n = 47, p < 0.001).
The total fish density and biomass estimated by TEF at the 18 common sites were strongly correlated with the teleo-eDNA concentrations (Fig. 5): Pearson’s R coefficients of 0.821 (n = 18, P = 0.00002) and 0.760 (n = 18, p < 0.001), respectively. When all the common sites were combined, the correlation between the taxa-specific eDNA concentration per litre and the species-specific abundance/biomass per ha was of comparable intensity: Pearson’s R coefficients of 0.763 (n = 40, p < 0.001) and 0.673 (n= 40, p < 0.001), respectively. When the sites and species were differentiated, the concentration of taxa-specific eDNA per litre at each site remained significantly correlated with the specific density and biomass per ha estimated from TEF samples but with a lower intensity (Fig. 5): Pearson’s R coefficients of 0.527 (n = 224, p < 0.001) and 0.397 (n = 224, p < 0.001), respectively.
The co-inertia analysis showed a high level of similarity between the structure of the fish assemblages revealed by using eDNA (taxa-specific number of DNA copies per litre) and TEF (specific number of fish caught per ha) at the 18 common sites with RV co-inertia criteria of 0.797 (p < 0.001) and 0.984 (p < 0.001) when the TEF data were expressed in density and biomass, respectively. The coordinates of the eDNA and TEF samples were highly correlated on the first and second co-inertia factors for TEF expressed as density (Pearson’s R coefficient = 0.982, p < 0.001) or as biomass (Pearson’s R coefficient = 0.993, p < 0.001). A direct comparison of the longitudinal distributions of species/taxa obtained with the two methods confirmed their similarity (Supplementary Fig. 1).
The change in the concentrations of the specific taxa DNA copies per litre from the source to the mouth of the Danube River provides evidence of a succession of species (Fig. 6). Bar_bar, Cot_gob, Huc_huc, Lam_pla, Pho_pho and Thy_thy were restricted to the upper Danube, while Aci_rut, Neo_flu, Sab_bal, and Sca_ery were detected from Vienna to the mouth of the Danube. Abr_bra, Alb_alb, Cyp_car, Sil_gla and Zin_str were detected along the entire course of the river. Syn_aba and Alo_spp. were only present in the lower Danube, but Alo_spp. was also detected 12 km upstream from Iron Gate I (Fig. 6). Aci_ste and Umb_kra were limited to the Danube delta.

Relationship between the quantity of teleo-DNA extracted and taxonomic richness

The relationship between the number of teleo-DNA copies extracted from a water sample and the number of taxa detected was tested using non-linear mixed-effects (NLME) models with sites as a random factor and two alternative fixed effects: teleo-DNA and water volume (V).
The non-linear mixed-effects (NLME) models with teleo-eDNA as a fixed effect had a lower Akaike information criterion (AIC) value than the NLME model with only the random effect (site identity): AIC values of 566.63 and 600.67, respectively. The Wald chi-square test showed a significant effect for the fixed-effect teleo-eDNA (Wald Chi-squared test = 29.973, df = 1, p < 0.001). The NLME model with the water volume sampled (V) as a fixed effect had a higher AIC value than the NLME model with only the random effect (835.28 versus 600.67), and V was not significant (Wald chi-square test = 1.004, df = 1,p > 0.05). For the best model including teleo-eDNA (Fig. 7), the Pearson’s R coefficient between the observed and predicted values at the individual level was 0. 959 (n = 94, p < 0.001), and the residuals were normally distributed (Shapiro test, W = 0.994, p > 0.05). Asymptotic richness per site and relative growth coefficient estimates and their associated standard errors at the population level were 27.29 (± 0.75) and 4.55 (± 0.83), respectively, for fixed effects and 17.96 (± 5.06) and 4.49 (± 0.3.06), respectively, for random effects. At the individual level, asymptotic richness and relative growth coefficients varied from 19.34 to 34.42 and from 0.1793 to 6.2658, respectively, with only one relative growth coefficient value less than 1 (Enns River).
The predicted value of teleo-eDNA needed to detect 95% of the taxa richness was 0.651x106 DNA copies when considering the model parameters defined at the population level. At the individual (site) level, this amount varied from 0.252x106 to 2.520x106 DNA copies after excluding the Enns River site (value of 15.4040x106 DNA copies).