Amplicon library preparation and bioinformatic analyses
Illumina outputs were demultiplexed using Illumina MiSeq control
software (v.2.6.2.1). The raw DNA libraries were deposited at the
National Center for Biotechnology Information (NCBI) under the
BioProject ID PRJNA630297. Primers were removed using cutadapt
v.1.15 (Martin, 2011). Residual traces of both primers and Illumina
NexteraPE-PE adaptors were then removed using Trimmomatic v.0.38 (Bolger,
Lohse, & Usadel, 2014), and reads were trimmed wherever the average
quality score within a sliding window of 4 bases fell below 20. Trimmed
reads were then
imported and analyzed using R (R Core Team, 2014) and Quantitative
Insights Into Microbial Ecology v. 2020.6 (QIIME2; Bolyen et al., 2019).
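As an illustration of the sliding-window quality trimming described above, the following minimal Python sketch cuts a read at the start of the first 4-base window whose mean Phred score falls below 20. It is a simplified approximation written for clarity, not Trimmomatic's actual implementation; the function name `sliding_window_trim` and its arguments are hypothetical.

```python
def sliding_window_trim(seq, quals, window=4, min_avg=20):
    """Trim a read at the first window whose mean quality is below min_avg.

    seq     : read sequence (string)
    quals   : per-base Phred scores (list of ints, same length as seq)
    window  : window width in bases (4 in the text)
    min_avg : minimum acceptable mean quality (20 in the text)
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")

    # Scan from the 5' end; stop at the first window that fails.
    for start in range(0, len(seq) - window + 1):
        window_quals = quals[start:start + window]
        if sum(window_quals) / window < min_avg:
            # Keep only the bases that precede the failing window.
            return seq[:start], quals[:start]

    # No window failed (or the read is shorter than one window).
    return seq, quals


trimmed_seq, trimmed_quals = sliding_window_trim(
    "ACGTACGTAC", [30, 30, 30, 30, 30, 30, 10, 10, 10, 10]
)
```

In this sketch, reads shorter than one window are returned unchanged; a production trimmer may handle that case differently.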
Given that the DNA libraries were sequenced during different runs, reads
for both loci were initially filtered, denoised, and merged with DADA2
(Callahan et al., 2016) using the DADA2 workflow for big data (hosted at
https://benjjneb.github.io/dada2/bigdata_1_2.html). The
same program was also used to detect and remove chimeras. Sequences were
then grouped into amplicon sequence variants (ASVs; 100% identity), and
singletons were discarded.
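The singleton filter can be sketched as follows. The text does not define "singleton" explicitly, so this example assumes it means an ASV represented by a single read across the whole dataset; the table layout (ASV to per-sample counts) and the name `drop_singletons` are hypothetical.

```python
def drop_singletons(table):
    """Return a copy of the feature table without singleton ASVs.

    table maps each ASV identifier to a dict of per-sample read counts.
    Assumption: a singleton is an ASV whose total count over all
    samples equals 1.
    """
    kept = {}
    for asv, counts in table.items():
        if sum(counts.values()) > 1:
            kept[asv] = dict(counts)  # copy so the input is not modified
    return kept


# Toy feature table (made-up counts, for illustration only).
toy_table = {
    "ASV_1": {"S1": 40, "S2": 12},
    "ASV_2": {"S1": 1, "S2": 0},   # one read in total: singleton
    "ASV_3": {"S1": 1, "S2": 1},   # two reads in total: kept
}
filtered = drop_singletons(toy_table)
```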
Feature tables were rarefied to the minimum sample read depth using the
QIIME2 diversity alpha-rarefaction plugin and used in
downstream analyses. Amplicon sequence variants were then aligned with
MAFFT v.7 (Katoh & Standley, 2013), and a phylogenetic tree was built
and rooted with FastTree v.2.1 (Price, Dehal, & Arkin, 2010). Taxonomic
assignments were conducted using the
naïve Bayesian RDP classifier (Wang, Garrity, Tiedje, &
Cole, 2007). The classifier was trained on the BOLD (Ratnasingham &
Hebert, 2007) and SILVA v.132 (Quast et al., 2013) databases for COI and
18S rRNA, respectively.
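The rarefaction to the minimum sample read depth mentioned earlier in this section can be illustrated with a short sketch. It subsamples reads without replacement from each sample down to the depth of the shallowest sample. This is a generic illustration, not the QIIME2 implementation; the function name, table layout (sample to per-ASV counts), and seed are assumptions.

```python
import random
from collections import Counter


def rarefy_to_min_depth(table, seed=0):
    """Subsample every sample to the smallest total read count.

    table maps each sample identifier to a dict of per-ASV read counts.
    Returns the depth used and the rarefied table.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    depth = min(sum(counts.values()) for counts in table.values())

    rarefied = {}
    for sample, counts in table.items():
        # Expand counts into one label per read, then draw `depth`
        # reads without replacement.
        pool = [asv for asv, n in counts.items() for _ in range(n)]
        drawn = rng.sample(pool, depth)
        rarefied[sample] = dict(Counter(drawn))
    return depth, rarefied


# Toy table (made-up counts, for illustration only).
toy = {"S1": {"a": 5, "b": 5}, "S2": {"a": 3, "b": 1}, "S3": {"b": 7}}
depth, rarefied = rarefy_to_min_depth(toy)
```

Expanding counts into a per-read list keeps the logic explicit but is memory-hungry for deep samples; dedicated tools use more efficient sampling.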
We limited our analyses to known zooplankton taxa and discarded reads
that were not identified to the phylum level with > 80%
confidence. Accordingly, ASVs with no taxonomic match (NA) or those not
assigned to the kingdom Animalia were removed. Moreover, taxonomic
lineages from the World Register of Marine Species (WoRMS Editorial
Board, 2017) were retrieved, with synonyms resolved for each taxon,
using the taxize package (Chamberlain & Szöcs, 2013) in order to
retain only marine zooplankton species and to harmonize the resulting
nomenclature with that of the SILVA and BOLD databases. Finally, in
order to minimize the chance of spurious ASVs being included in the
final datasets, the number of reads for each ASV was log-transformed
and the ASVs were ranked by read abundance (Supplementary Information Fig.
S1). Notably, we observed that ASVs composed of < 12 and
< 21 reads for 18S and COI, respectively, were generally
detected in only one sample. Thus, we considered those ASVs to be PCR
amplification and/or sequencing noise and removed them from the
final datasets.
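The taxonomic and abundance criteria described above can be summarized in a short sketch. The thresholds (phylum confidence > 80%, kingdom Animalia, and minimum read counts of 12 for 18S and 21 for COI) come from the text; the record layout, key names, and the function `keep_asv` are hypothetical, and the WoRMS-based marine filter is not reproduced here.

```python
# Thresholds taken from the text: ASVs with < 12 (18S) or < 21 (COI)
# reads were treated as noise and removed.
MIN_READS = {"18S": 12, "COI": 21}
MIN_PHYLUM_CONFIDENCE = 0.80


def keep_asv(asv, marker):
    """Return True if an ASV record passes all filters.

    asv is a dict with the hypothetical keys 'kingdom', 'phylum',
    'phylum_confidence' (0 to 1) and 'total_reads'.
    marker is '18S' or 'COI'.
    """
    # No taxonomic match (NA) at the phylum level.
    if asv.get("phylum") is None:
        return False
    # The phylum assignment must exceed 80% confidence.
    if asv.get("phylum_confidence", 0.0) <= MIN_PHYLUM_CONFIDENCE:
        return False
    # Only ASVs assigned to the kingdom Animalia are retained.
    if asv.get("kingdom") != "Animalia":
        return False
    # Low-abundance ASVs are treated as PCR or sequencing noise.
    return asv["total_reads"] >= MIN_READS[marker]


# Made-up record, for illustration only.
example = {
    "kingdom": "Animalia",
    "phylum": "Arthropoda",
    "phylum_confidence": 0.95,
    "total_reads": 15,
}
kept_18s = keep_asv(example, "18S")
kept_coi = keep_asv(example, "COI")
```

The same record can pass for one marker and fail for the other, because the read-count cutoff is marker-specific.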