Sequencing data analysis and SNP calling
Raw sequence data were processed using the Universal Network-Enabled
Analysis Kit (UNEAK) pipeline implemented in the iPlant Collaborative
platform. This pipeline produced a hapmap file for downstream analysis.
This file was used as input for SNP identification using the GBS
pipeline implemented in TASSEL (version 3.0.166). Raw SNPs were
filtered following the dDocent guidelines (Puritz, Hollenbeck and Gold
2014). In short, using vcftools (Danecek et al. 2011), variants were
first filtered to retain those with depth > 5, quality > Q30, and no
more than 50% missing data. This intermediate file was used to screen
samples for high levels of missingness (all were < 30%). The final SNP
set was filtered to allow a maximum of 5% missing values per SNP and to
exclude SNPs with a minor allele frequency < 0.05.
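
The final filtering thresholds can be illustrated with a minimal Python
sketch. It is not the vcftools implementation used in the study; the
function names, the genotype coding (alternate-allele counts 0, 1, 2,
with None for a missing call), and the toy data are assumptions made
purely for illustration.

```python
# Illustrative sketch only: applies the final per-SNP thresholds
# (at most 5% missing calls, minor allele frequency of at least 0.05)
# to a toy genotype table. Names and data are hypothetical.

def missing_fraction(calls):
    """Fraction of samples with no genotype call at this SNP."""
    return sum(g is None for g in calls) / len(calls)


def minor_allele_frequency(calls):
    """Frequency of the rarer allele among the called diploid genotypes."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1 - alt_freq)


def filter_sites(sites, max_missing=0.05, min_maf=0.05):
    """Keep SNPs that pass both the missingness and the MAF threshold."""
    kept = {}
    for site_id, calls in sites.items():
        if missing_fraction(calls) > max_missing:
            continue  # too many missing calls
        if minor_allele_frequency(calls) < min_maf:
            continue  # minor allele too rare
        kept[site_id] = calls
    return kept


# Toy data: 20 samples per SNP.
sites = {
    # Complete data, common alleles: passes both filters.
    "snp_a": [0, 1, 2, 1, 0, 1, 2, 0, 1, 1, 0, 2, 1, 0, 1, 2, 0, 1, 1, 0],
    # Two of 20 calls missing (10%): fails the missingness filter.
    "snp_b": [None, None] + [0, 1] * 9,
    # One heterozygote among 20 samples (MAF 0.025): fails the MAF filter.
    "snp_c": [1] + [0] * 19,
}

passing = filter_sites(sites)
print(sorted(passing))
```

In this toy example only snp_a is retained; snp_b is removed for
missingness and snp_c for low minor allele frequency.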