Read processing, SNP calling, and filtering
Adapter sequences and low-quality bases were removed from the raw reads
using Trimmomatic (Bolger, Lohse, & Usadel, 2014). The clean reads were
then mapped to the reference genome of S. baicalensis using the
mem algorithm of BWA (Li, 2013). Aligned reads with a mapping quality
score below 20 were discarded. Duplicates were removed with the Picard
Toolkit (http://broadinstitute.github.io/picard/, accessed in 2019).
We used SAMtools (Li et al., 2009) to generate
consensus sequences around target sequences. All genes were aligned with
MAFFT (Katoh & Standley, 2013), and alignments were trimmed with trimAl
(Capella-Gutiérrez, Silla-Martínez, & Gabaldón, 2009). One LCN gene
that did not map to any of the nine reference chromosomes was
excluded from subsequent analyses, leaving 51 LCN genes.
Finally, single nucleotide polymorphisms (SNPs) were called with
BCFtools (Li, 2011) using default parameters. VCFtools (Danecek et al.,
2011) was used to retain biallelic SNPs with a missing rate < 0.5, a
minor allele frequency > 0.01, a site quality score > 10, and a mean
depth > 3.
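
The site-level filtering criteria listed above can be illustrated with a minimal Python sketch. This is not the VCFtools implementation used in the study; it only restates the thresholds from the text as strict inequalities. The `Site` record, the function names, and the choice to average depth over all samples (including those with missing genotypes) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Site:
    """Hypothetical, simplified stand-in for one VCF record."""
    ref: str
    alts: List[str]
    qual: float
    # One entry per sample: a tuple of allele indices (0 = ref), or None if missing.
    genotypes: List[Optional[Tuple[int, int]]]
    # One entry per sample: read depth at this site.
    depths: List[int]


def is_biallelic_snp(site: Site) -> bool:
    """True if the site has exactly one ALT allele and both alleles are single bases."""
    return len(site.alts) == 1 and len(site.ref) == 1 and len(site.alts[0]) == 1


def missing_rate(site: Site) -> float:
    """Fraction of samples without a genotype call (1.0 if there are no samples)."""
    if not site.genotypes:
        return 1.0
    return sum(g is None for g in site.genotypes) / len(site.genotypes)


def minor_allele_frequency(site: Site) -> float:
    """Frequency of the rarer allele among called alleles (0.0 if nothing is called)."""
    alleles = [a for g in site.genotypes if g is not None for a in g]
    if not alleles:
        return 0.0
    alt_freq = sum(1 for a in alleles if a != 0) / len(alleles)
    return min(alt_freq, 1.0 - alt_freq)


def mean_depth(site: Site) -> float:
    """Mean read depth across all samples (assumption: missing samples count as given)."""
    if not site.depths:
        return 0.0
    return sum(site.depths) / len(site.depths)


def passes_filters(site: Site,
                   max_missing: float = 0.5,
                   min_maf: float = 0.01,
                   min_qual: float = 10.0,
                   min_mean_depth: float = 3.0) -> bool:
    """Apply the thresholds stated in the text, all as strict inequalities."""
    return (is_biallelic_snp(site)
            and missing_rate(site) < max_missing
            and minor_allele_frequency(site) > min_maf
            and site.qual > min_qual
            and mean_depth(site) > min_mean_depth)


# Toy example: four samples, one of them with a missing genotype.
example = Site(ref="A", alts=["G"], qual=50.0,
               genotypes=[(0, 1), (0, 0), (1, 1), None],
               depths=[5, 6, 4, 0])
kept = passes_filters(example)
```

In the toy example, one of four genotypes is missing, half of the called alleles are the alternate allele, and the mean depth is 3.75, so the site satisfies every criterion. Note that the sketch uses strict comparisons as written in the text; the corresponding VCFtools options may treat their thresholds inclusively, so boundary cases could differ.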