Best Practices:
Museum collections are increasingly used for molecular sequencing, yet comparative studies on the retrieval and reliability of microsatellite genotypes from these sources are not readily available. Here we show that, while museum specimens can yield reliable and important genotypes for rare, endangered and elusive species, additional precautions must be taken before genotypes are accepted.
From our data, we recommend a minimum of three successful amplifications for each marker of interest. The samples that routinely amplified (HQMS) recovered genotypes with very similar rates of genotype confirmation between replicates and when compared to the tissue sample. Poorly performing samples may require more replication than better performing samples. In the LQMS, we noticed that the longer the microsatellite locus, the worse the marker amplified; this was most apparent in that none of the LQMS recovered genotypes for the ~250 bp microsatellite locus GLSA-52. All of the HQMS samples recovered reliable genotypes across the replicates tested here, for all marker lengths and types of repeat motif.
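The replicate rule above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in this study: the function name, the status labels, and the rule that a genotype counts as "confirmed" only when every successful replicate agrees are all assumptions. Genotypes are represented as tuples of allele lengths in bp, with `None` marking a failed amplification.

```python
from collections import Counter

MIN_SUCCESSFUL_REPLICATES = 3  # minimum recommended in the text


def consensus_genotype(replicates, min_success=MIN_SUCCESSFUL_REPLICATES):
    """Summarize replicate genotypes for one sample at one locus.

    replicates: list of genotypes, each a tuple of allele lengths (bp),
    or None where the PCR failed (hypothetical data layout).
    Returns (genotype, status).
    """
    # Sort alleles so (150, 154) and (154, 150) count as the same genotype.
    successful = [tuple(sorted(g)) for g in replicates if g is not None]
    if len(successful) < min_success:
        return None, "needs_more_replicates"
    genotype, n_agree = Counter(successful).most_common(1)[0]
    # Assumption: "confirmed" means all successful replicates agree.
    status = "confirmed" if n_agree == len(successful) else "unconfirmed"
    return genotype, status
```

A sample with two successful replicates and one failure would be returned as `needs_more_replicates`, which matches the suggestion that poorly performing samples need further PCR rounds before a genotype is accepted.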
Our data separated sample types into three categories: tissue, HQMS and LQMS. The latter two designations were applied only after many rounds of PCR and agarose gel visualization. During project design, samples should be evaluated so that enough PCR replicates can be performed to ensure accurate genotypes. Additionally, genotypes generated by GBS can be calibrated or confirmed via CE or other fragment-visualizing instruments (Fragment Analyzer, Advanced Analytical, Ankeny, IA). Notably, genotypes may be predictably shifted between GBS and CE methods, as detailed in Barbian et al. (2018).
In order to reduce inaccurate genotypes, PCRs should be optimized prior to GBS. The addition of various reagents has been shown to increase specificity and reduce non-specific amplification, as has been widely published over the past 30 years (Boleda, Briones, Farres, Tyfield, & Pi, 1996; Robertson & Walsh-Weller, 1998; J. F. Williams, 1989). The PCRs performed here incorporated a touchdown protocol, which starts at a high annealing temperature (60°C) for 2 cycles of PCR before reducing to the lowest annealing temperature of 50°C for 35 cycles. Touchdown PCR was used on the museum specimens and across loci because it effectively amplified all microsatellite markers. Bovine serum albumin (BSA) was added for two microsatellites (GS-2 and GS-4) because it improved amplification success during initial PCR testing. GS-4, however, recovered numerous unconfirmed genotypes, which could be related to input DNA quality or to poor PCR performance. This locus in particular would benefit from additional optimization to determine whether non-specific amplification could be reduced. GS-2 may also have benefited from additional optimization, as that locus recovered numerous flags from the CHIIMP pipeline, including PCR artifacts, PCR stutter and more than two prominent sequences.
The CHIIMP pipeline worked well on our samples after modification of published protocols (Barbian et al., 2018). We found it useful to combine the CHIIMP genotypes with quality data, measured as the proportion of reads passing prinseq filtering, to evaluate which samples may be more prone to false or inaccurate genotypes. The combination of multiple rounds of PCR, prinseq quality filtering and manual evaluation of CHIIMP results increased confidence in the genotypes recovered from museum specimens in this study. This process is illustrated in Figure 2 and summarized here. First, assign samples as high or low quality following multiple PCR attempts with agarose gel visualization. Second, based on our findings, recover a minimum of three successful PCR replicates prior to genotyping. If PCRs continue to fail, optimization of each locus may be helpful, as may evaluation of DNA extracts for the presence of PCR inhibitors, which have been shown to affect recovery of ancient DNA and environmental DNA samples (Matheson et al., 2009; McKee, Spear, & Pierson, 2015; Pontiroli et al., 2011). Once amplification has succeeded across all samples and markers, perform library preparation on the successfully amplified PCR products and sequence on an Illumina platform with adequate insert length for the included microsatellites. Sequence to a minimum depth of 1000 reads per sample per microsatellite marker. For our data, that entails 5000 sequences per sample.
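The depth target above is simple arithmetic: 1000 reads per marker and 5000 reads per sample corresponds to five markers per sample. The sketch below makes that calculation reusable and flags sample-by-locus combinations that fall short. The function names and the dictionary layout for read counts are hypothetical and are not part of CHIIMP or any other tool named here.

```python
MIN_READS_PER_MARKER = 1000  # minimum depth per sample per marker, from the text


def reads_needed_per_sample(n_markers, min_reads=MIN_READS_PER_MARKER):
    """Total reads to target for one sample across all of its markers."""
    return n_markers * min_reads


def under_sequenced(read_counts, min_reads=MIN_READS_PER_MARKER):
    """Return the (sample, locus) pairs whose read count is below the minimum.

    read_counts: dict mapping (sample, locus) -> number of reads
    (a hypothetical layout chosen for this example).
    """
    return sorted(key for key, n in read_counts.items() if n < min_reads)


print(reads_needed_per_sample(5))  # 5 markers x 1000 reads = 5000
```

Pairs returned by `under_sequenced` are candidates for re-sequencing or for exclusion before genotypes are called.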
Demultiplexed data should be processed with CutAdapt and FastQC before running CHIIMP v0.3.1. In parallel, run prinseq to determine the overall quality of the samples. Samples with higher proportions of low-quality reads should be noted, as they may be more prone to erroneous genotypes. When alleles are recovered only in low-quality samples, it is imperative to examine the CHIIMP output and determine whether the differences are associated with primer sequences or repeat elements. If the primer sequence varies, manually correct the allele length to what it would be with the entire primer sequence included, and ignore primer-site size mismatches in allele calls, as these are likely artifacts of sequencing or amplification errors. Traditionally, fragment size analysis via CE would ignore peaks outside of the expected size range via programs like GeneMapper™ (Applied Biosystems). If an allele does not have a priming-site error, it is important to evaluate whether the size shift follows microsatellite evolutionary patterns; for example, a shift of two base pairs in a dinucleotide repeat makes evolutionary sense. However, stutter sequences are also often shifted by the size of the repeat motif. While CHIIMP evaluates for stutter and allows its filters to be adjusted, we found it was possibly including false alleles due to the nature of our markers. Dinucleotide microsatellites are more challenging to evaluate for stutter because of the small size difference between true alleles and stutter peaks (Barbian et al., 2018; O’Reilly, Canino, Bailey, & Bentzen, 2000). To further scrutinize stutter sequences, we calculated the proportion of reads associated with each allele. If one call makes up a very small percentage (30% of the number of reads of the other allele), it is likely a stutter peak. Further visualization via electropherograms could clarify this process if stutter is rampant in a locus.
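The two manual checks described above (does the size shift fit the repeat motif, and is the minor allele a probable stutter product?) can be sketched as follows. This is an illustrative sketch, not CHIIMP's own stutter filter. The function names are hypothetical, and two assumptions are made: the 30% figure is treated as an upper bound on the minor-to-major read ratio, and stutter is taken to mean a sequence exactly one repeat unit away from the major allele.

```python
STUTTER_MAX_RATIO = 0.30  # assumption: the 30% figure in the text is an upper bound


def is_motif_consistent(len_a, len_b, motif_len):
    """True if two allele lengths differ by a whole number of repeat units."""
    diff = abs(len_a - len_b)
    return diff > 0 and diff % motif_len == 0


def likely_stutter(minor_reads, major_reads, minor_len, major_len,
                   motif_len, max_ratio=STUTTER_MAX_RATIO):
    """Flag a minor allele as probable stutter of the major allele.

    Assumption: stutter sits exactly one repeat unit from the major allele
    and carries at most max_ratio times the major allele's read count.
    """
    if major_reads <= 0:
        return False
    one_repeat_apart = abs(major_len - minor_len) == motif_len
    return one_repeat_apart and (minor_reads / major_reads) <= max_ratio
```

For a dinucleotide locus, a 148 bp call with 200 reads next to a 150 bp call with 1000 reads would be flagged under these assumptions, whereas the same 148 bp call with 600 reads would be kept as a candidate true allele and reviewed manually.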
Additionally, the number of reads supporting an allele can provide insight into whether the polymorphism is due to sequencing error. By following these practices, we reduced allelic dropout by about 3% across loci. We also had to remove many of the LQMS genotypes because we could not be certain they were authentic. The CHIIMP pipeline also allows commands to be customized to recover stricter or more lenient genotypes. Here we modified the published parameters based on the depth of coverage of our samples.