Bioinformatics
Demultiplexed sequences were batch processed using CutAdapt to remove primer sequences and perform preliminary quality filtering (Martin 2011). The denoising algorithm DADA2 was run in R to infer amplicon sequence variants (ASVs) and remove chimeras (Callahan et al. 2016). Parameterization and length trimming were locus dependent. Widespread contaminants were identified with the R package decontam, using sequences detected in controls (Davis et al. 2018); the threshold argument was set to 0.5, which removes sequences more prevalent in controls than in true samples. Once contaminants were removed, ASVs were further curated using the LULU algorithm, which identifies NuMTs and remaining artefactual sequences (Frøslev et al. 2017). Finally, sequences were filtered by read count at the ASV and sample levels. ASVs represented by fewer than 10 reads were removed. At the sample level, the reads of an ASV in a given sample were removed if they represented less than 0.01% of the total ASV reads or less than 1% of the total reads for the sample. Additionally, one sample had missing data and was removed because it could not be confidently assigned to either ginger sites or native forest sites.
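The read-count filters described above can be illustrated with a minimal Python sketch. It is not the authors' code (the pipeline was run in R), and it rests on several assumptions: the function name, the data layout (a dictionary of samples, each mapping ASV identifiers to read counts), and the parameter names are hypothetical. "Total ASV reads" is read here as the total reads of that ASV across all samples, and sample totals are computed after the ASV-level filter; the text does not state either point.

```python
def filter_read_counts(counts, min_asv_reads=10,
                       min_asv_frac=0.0001, min_sample_frac=0.01):
    """Apply ASV-level and sample-level read-count filters.

    counts: dict mapping sample id -> dict mapping ASV id -> read count
    (hypothetical layout chosen for this sketch).
    """
    # Total reads per ASV across all samples.
    asv_totals = {}
    for asv_counts in counts.values():
        for asv, n in asv_counts.items():
            asv_totals[asv] = asv_totals.get(asv, 0) + n

    # ASV level: drop ASVs with fewer than min_asv_reads reads overall.
    kept = {asv for asv, total in asv_totals.items() if total >= min_asv_reads}

    filtered = {}
    for sample, asv_counts in counts.items():
        # Assumption: sample totals are taken after the ASV-level filter.
        sample_total = sum(n for asv, n in asv_counts.items() if asv in kept)
        filtered[sample] = {}
        for asv, n in asv_counts.items():
            if asv not in kept:
                continue
            # Sample level: drop a count below 0.01% of that ASV's total reads
            # (assumed meaning of "total ASV reads") or below 1% of the
            # sample's total reads.
            if n < min_asv_frac * asv_totals[asv]:
                continue
            if n < min_sample_frac * sample_total:
                continue
            filtered[sample][asv] = n
    return filtered
```

Under these assumptions, a count survives only if it passes both proportional thresholds, so rare detections in deeply sequenced samples are removed even when the ASV itself is abundant elsewhere.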
Sequences were then written to FASTA files and assigned taxonomic identities through megablast in Geneious. Full lineage information was assigned using a custom R script based on the package rentrez (Winter 2017). The taxonomic level assigned depended on the percent identity of the match: order-level identity required >= 85%, family >= 92%, genus >= 97% and species >= 99%. Sequences that could not be identified to at least order level were removed. To construct the prey data set, arthropod reads were filtered to remove Philodromidae reads, which represented the spider itself, and sequences in the order Hymenoptera, which belonged to parasitoid wasps rather than prey. Because the Hymenoptera with confident BLAST identifications were all parasitoids, Hymenoptera ASVs identifiable only to order were also removed as likely parasitoids. Prey reads were assigned native or non-native status based on publicly available literature. Sequences likely belonging to entomopathogenic fungi and parasitic wasps were similarly identified using publicly available literature.
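The identity thresholds and prey filter described above can be sketched in Python as follows. This is an illustration only, not the authors' custom R script: the function names, the lineage dictionary layout, and the example taxon are hypothetical, and the sketch assumes each sequence carries a single best-hit percent identity.

```python
# Minimum percent identity for each rank, checked from finest to coarsest.
RANK_THRESHOLDS = [
    ("species", 99.0),
    ("genus", 97.0),
    ("family", 92.0),
    ("order", 85.0),
]
RANK_ORDER = ["order", "family", "genus", "species"]


def assign_rank(percent_identity):
    """Return the finest rank supported by the match, or None if below 85%."""
    for rank, minimum in RANK_THRESHOLDS:
        if percent_identity >= minimum:
            return rank
    return None


def truncate_lineage(lineage, percent_identity):
    """Keep lineage fields only down to the supported rank.

    lineage: dict with keys 'order', 'family', 'genus', 'species'
    (hypothetical layout). Returns None for sequences that cannot be
    identified to order, which the text says were removed.
    """
    rank = assign_rank(percent_identity)
    if rank is None:
        return None
    keep = RANK_ORDER[: RANK_ORDER.index(rank) + 1]
    return {r: lineage.get(r) for r in keep}


def is_prey(lineage):
    """Exclude the predator's own family and all Hymenoptera."""
    if lineage.get("family") == "Philodromidae":
        return False
    if lineage.get("order") == "Hymenoptera":
        return False
    return True
```

For example, a hit at 95% identity would keep only order and family, and a Hymenoptera sequence identified only to order would still be excluded by `is_prey`, which mirrors the removal of order-level Hymenoptera ASVs.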