Bioinformatics
We trimmed the sequence data to remove potential PCR artifacts using TrimGalore version 0.6.5 (https://github.com/FelixKrueger/TrimGalore), a wrapper for Cutadapt. We used the Burrows-Wheeler Aligner version 0.7.17 to map reads to a reference genome from the closely related Yellow Warbler (Setophaga petechia; Bay et al. 2018). After mapping, we sorted the resulting SAM files, converted them to BAM files, and indexed them using Samtools version 1.9. We marked duplicate reads with MarkDuplicates from GATK version 4.1.4.0 and clipped overlapping reads with the clipOverlap function from bamUtil (https://genome.sph.umich.edu/wiki/BamUtil:_clipOverlap). We calculated sequencing depth for each individual using Samtools. Initial population genetic analyses revealed that high variation in sequencing depth among individuals strongly affected the results. To reduce this variation, we followed previously published recommendations and used the DownsampleSam function from GATK to randomly downsample reads in BAM files with greater than 2X coverage to 2X coverage.
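As an illustration of the downsampling step, the fraction of reads to retain for each individual can be derived from its measured depth. The sketch below is hypothetical: the function name, the sample names, and the depths are invented for the example, and the text does not state how the retention probability passed to DownsampleSam was computed. It assumes the simplest choice, target depth divided by observed depth.

```python
def downsample_probability(observed_depth: float, target_depth: float = 2.0) -> float:
    """Return the fraction of reads to keep so expected depth is about target_depth.

    Assumption (not stated in the text): the retention probability is simply
    target_depth / observed_depth for samples above the target; samples at or
    below the target are left untouched (probability 1.0).
    """
    if observed_depth <= 0:
        raise ValueError("observed_depth must be positive")
    if observed_depth <= target_depth:
        return 1.0
    return target_depth / observed_depth


# Hypothetical per-individual depths, for illustration only.
example_depths = {"sample_A": 4.0, "sample_B": 1.5, "sample_C": 8.0}
probabilities = {name: downsample_probability(d) for name, d in example_depths.items()}
```

Under this assumption, an individual sequenced at 4X would retain half of its reads, while an individual at 1.5X would not be downsampled.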
To identify genetic markers from low-coverage WGS data, we used stringent filtering options in ANGSD version 0.9.40. We retained reads with a mapping quality of at least 30 and bases with a base quality of at least 33. We identified SNPs using a p-value threshold of less than 1e-6. We retained SNPs that had read data in at least 50% of individuals (n = 165), a minor allele frequency greater than 0.05, and minimum and maximum total depths of 231 and 924, respectively. The minimum total depth threshold was calculated as the minimum number of individuals required to call a variant (n = 165) multiplied by the mean sequencing depth across all individuals (1.4X), giving 165 × 1.4 = 231. The maximum total depth threshold was calculated as twice the product of the total number of individuals and the mean sequencing depth. The filtered variants were output as genotype likelihoods and used in subsequent analyses.
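The depth thresholds follow directly from the two formulas above, and the arithmetic can be checked with a short sketch. The function name is hypothetical, and the total of 330 individuals is not stated in the text: it is inferred here from the 50% cutoff (n = 165) and is consistent with the reported maximum of 924.

```python
import math


def depth_thresholds(n_individuals: int, mean_depth: float, min_fraction: float = 0.5):
    """Return (min_individuals, min_total_depth, max_total_depth).

    min_total_depth = min_individuals * mean_depth
    max_total_depth = 2 * n_individuals * mean_depth
    Results are rounded to whole reads to avoid floating-point residue.
    """
    min_individuals = math.ceil(min_fraction * n_individuals)
    min_total_depth = round(min_individuals * mean_depth)
    max_total_depth = round(2 * n_individuals * mean_depth)
    return min_individuals, min_total_depth, max_total_depth


# n_individuals = 330 is an inferred value (165 / 0.5), not given in the text.
min_ind, min_depth, max_depth = depth_thresholds(330, 1.4)
```

With these inputs the sketch reproduces the reported values of 165 individuals, a minimum total depth of 231, and a maximum total depth of 924.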