Bioinformatics
Demultiplexed sequences were batch processed using Cutadapt to remove
primer sequences and perform preliminary quality filtering (Martin
2011). The denoising algorithm DADA2 was run in R to produce amplicon
sequence variants (ASVs) and remove chimeras (Callahan et al. 2016).
Parameterization and length trimming depended on locus. Widespread
contaminants were identified with the R package decontam, based on
sequences detected in controls (Davis et al. 2018); the
threshold argument was set to 0.5, which removes sequences more
prevalent in controls than in true samples. Once contaminants were
removed, ASVs were further curated using the LULU algorithm, which
identifies NuMTs and remaining artefactual sequences (Frøslev et
al. 2017). Finally, sequences were filtered at the ASV and sample level
by read counts. ASVs represented by fewer than 10 reads were removed. At
the sample level, read counts for an ASV were removed if they represented
less than 0.01% of the total ASV reads or less than 1% of the total reads
for the sample. Additionally, one sample with missing data was removed
because it could not be confidently assigned to either ginger sites or
native forest sites.
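The read-count filters described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' script: the function name filter_read_counts, the nested-dictionary count table, and the parameter names are hypothetical. It also assumes that "total ASV reads" means an ASV's reads summed across all samples, and that totals are computed after the ASV-level filter; neither detail is stated in the text.

```python
def filter_read_counts(counts, min_asv_reads=10,
                       min_asv_frac=0.0001, min_sample_frac=0.01):
    """Apply ASV-level and sample-level read-count filters.

    counts: dict mapping ASV id -> dict mapping sample id -> read count
            (hypothetical layout chosen for this sketch).
    Returns a new nested dict; the input is not modified.
    """
    # ASV-level filter: drop ASVs with fewer than min_asv_reads reads overall.
    kept = {asv: dict(per_sample) for asv, per_sample in counts.items()
            if sum(per_sample.values()) >= min_asv_reads}

    # Totals per ASV and per sample, computed after the ASV-level filter
    # (assumption: the order of operations is not given in the text).
    asv_totals = {asv: sum(ps.values()) for asv, ps in kept.items()}
    sample_totals = {}
    for per_sample in kept.values():
        for sample, n in per_sample.items():
            sample_totals[sample] = sample_totals.get(sample, 0) + n

    # Sample-level filter: zero out any ASV-by-sample count that falls below
    # 0.01% of the ASV's total reads or below 1% of the sample's total reads.
    for asv, per_sample in kept.items():
        for sample, n in per_sample.items():
            if n == 0:
                continue
            too_rare_for_asv = n < min_asv_frac * asv_totals[asv]
            too_rare_for_sample = n < min_sample_frac * sample_totals[sample]
            if too_rare_for_asv or too_rare_for_sample:
                per_sample[sample] = 0
    return kept
```

Setting low counts to zero rather than dropping whole rows keeps the table rectangular, so an ASV can be retained in samples where it is well supported while being discarded in samples where it is not.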
Sequences were then written to FASTA files and, using Geneious, assigned
taxonomic identities through megablast. Full lineage information was
assigned using a custom R script based on package rentrez (Winter
2017). The level of assigned taxonomic identity depended on percent
identity match: order-level identity required ≥85%, family ≥92%, genus
≥97%, and species ≥99%. Sequences that were not identifiable to order
level were removed. To construct the prey data set, arthropod
reads were filtered to remove Philodromidae reads, which represented the
spider itself, and sequences in order Hymenoptera, which belonged to
parasitoid wasps rather than prey. Because the Hymenoptera with
confident BLAST identifications were all parasitoids, Hymenoptera ASVs
identifiable only to order were also removed as likely parasitoids.
Prey reads were assigned native or non-native status based on publicly
available literature. Sequences that likely belonged to entomopathogenic
fungi and parasitic wasps were similarly identified using publicly
available literature.
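The identity thresholds and prey filters described in this paragraph can be illustrated with a short Python sketch. It is not the custom R script used in the study: the names MIN_IDENTITY, truncate_lineage, and is_prey are hypothetical, the lineage is assumed to arrive as a dictionary keyed by rank, and the input is assumed to contain arthropod hits only.

```python
RANKS = ["order", "family", "genus", "species"]

# Minimum percent identity required to accept a name at each rank,
# taken from the thresholds stated in the text.
MIN_IDENTITY = {"order": 85.0, "family": 92.0, "genus": 97.0, "species": 99.0}


def truncate_lineage(lineage, percent_identity):
    """Keep only the ranks supported by the percent identity of the match.

    lineage: dict with optional 'order', 'family', 'genus', 'species' keys
             (hypothetical layout chosen for this sketch).
    Returns None when the match is too weak for an order-level identity,
    which corresponds to the sequence being removed.
    """
    if percent_identity < MIN_IDENTITY["order"]:
        return None
    return {
        rank: lineage.get(rank) if percent_identity >= MIN_IDENTITY[rank] else None
        for rank in RANKS
    }


def is_prey(lineage):
    """Exclude the predator's own family and all Hymenoptera.

    Expects a lineage already passed through truncate_lineage, so a family
    name is present only when the match reached the family threshold.
    """
    return lineage["order"] != "Hymenoptera" and lineage["family"] != "Philodromidae"
```

For example, a match at 98% identity to a Drosophila sequence would keep its order, family, and genus but not its species name, and would pass the prey filter; a Hymenoptera match at 86% would keep only its order and be excluded.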