Sequence data processing and symbiont genus determination
Detailed descriptions of the data processing pipeline are available on GitHub (https://github.com/evelynabbott/codominant_symbiosis.git). The FASTQ files from both studies were downloaded using the SRA Toolkit. Adapters were trimmed in paired-end mode using cutadapt, with a minimum read length of 20 bp and a PHRED quality cutoff of 20. FastQC (Andrews, 2010) was used to assess the quality of a subset of 10,000 reads before and after trimming. Reads were then mapped with bowtie2 to a combined reference comprising a Cladocopium transcriptome, a Durusdinium transcriptome (Ladner et al., 2012), and the Acropora millepora genome (Fuller et al., 2018). The alignments in each resulting SAM file were then split into three separate SAM files, one per organism. PCR duplicates were removed after alignment using MarkDuplicates from the Picard Toolkit (Broad Institute, 2019). Samtools (Li et al., 2009) was used to sort the alignments and convert the SAM files to BAM format. Reads mapping within annotated gene boundaries were counted using featureCounts (Liao, Smyth, & Shi, 2014).
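The per-organism split of the alignments can be illustrated with a minimal Python sketch. It is not the pipeline's actual script (see the repository for that). It assumes that contigs in the combined reference carry organism-specific name prefixes; the prefixes `clado_`, `duru_`, and `amil_`, and the function names, are hypothetical and chosen only for illustration. Each alignment record is routed by its reference name (SAM column 3), `@SQ` header lines follow their contig, and other header lines are copied to every output.

```python
# Hypothetical contig-name prefixes (an assumption for this sketch); the real
# names depend on how the combined reference was built.
PREFIXES = {
    "clado_": "cladocopium",
    "duru_": "durusdinium",
    "amil_": "acropora",
}


def organism_for(contig, prefixes=PREFIXES):
    """Return the organism label for a contig name, or None if no prefix matches."""
    for prefix, organism in prefixes.items():
        if contig.startswith(prefix):
            return organism
    return None


def split_sam_lines(lines, prefixes=PREFIXES):
    """Route SAM lines into one list per organism, keyed by organism label."""
    out = {organism: [] for organism in prefixes.values()}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            fields = line.split("\t")
            if fields[0] == "@SQ":
                # @SQ header lines name their contig in the SN: tag.
                name = next((f[3:] for f in fields[1:] if f.startswith("SN:")), "")
                organism = organism_for(name, prefixes)
                if organism is not None:
                    out[organism].append(line)
            else:
                # Other header lines (@HD, @PG, ...) are copied to every output.
                for records in out.values():
                    records.append(line)
            continue
        # Alignment record: column 3 (index 2) is the reference name (RNAME).
        rname = line.split("\t")[2]
        organism = organism_for(rname, prefixes)
        if organism is not None:  # unmapped reads (RNAME "*") are skipped
            out[organism].append(line)
    return out
```

For real files, the lines of a SAM file can be passed in directly (for example, an open file handle) and each returned list written to its own output file. This sketch routes each record independently, so read pairs whose mates map to different organisms would be separated; how such pairs should be handled is a design decision the sketch leaves open.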