4.1 Data analysis methods and model selection for landscape genetics with whole-genome resequencing
The common markers in landscape genetics are microsatellites, mitochondrial DNA, amplified fragment length polymorphisms, and Y-chromosome markers (Manel, Schwartz, Luikart, & Taberlet, 2003). In recent years, single nucleotide polymorphisms (SNPs) have become another major and widely used marker type. The greatest advantage of SNPs is that they provide far more polymorphic sites than other molecular markers. However, the larger number of markers also increases the difficulty of data analysis and the time it requires. Selecting an appropriate analysis tool therefore allows the genetic model to be obtained accurately from whole-genome resequencing data while substantially shortening the analysis time. Pairwise FST is an important parameter in population genetic analysis that conveniently summarizes population structure (Weir & Cockerham, 1984). It is generally calculated using GENEPOP (Rousset, 2008), the R package adegenet, or GenAlEx 6.5 (Peakall & Smouse, 2012). In this study, 4,152,751 SNPs were retained after filtering in Shunchang and 3,298,993 in Xiapu. Pairwise FST was first calculated with the R package adegenet, but the calculation took approximately one month. The same data were then used to calculate FST in 5,000-bp windows in vcftools (Auton & Marcketta, 2015), which required only 7 h, showing that the windowed approach can effectively shorten the calculation time for pairwise FST.
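The windowing idea can be illustrated with a minimal Python sketch. This is not the vcftools implementation: for brevity it swaps in Hudson's FST estimator (a ratio of summed numerators and denominators per window) in place of the Weir and Cockerham estimator, and the function names, the input layout (position, allele frequency, and number of sampled chromosomes for each of two populations), and the window-labelling convention are all assumptions made for illustration.

```python
from collections import defaultdict


def hudson_terms(p1, n1, p2, n2):
    """Numerator and denominator of Hudson's FST for one biallelic SNP.

    p1, p2: allele frequencies in the two populations.
    n1, n2: number of sampled chromosomes in each population (must be > 1).
    """
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def windowed_fst(sites, window=5000):
    """Pairwise FST per non-overlapping window (hypothetical helper).

    sites: iterable of (pos, p1, n1, p2, n2) tuples with 1-based positions
           on a single chromosome.
    Returns a dict mapping window start position -> FST, ordered by position.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    for pos, p1, n1, p2, n2 in sites:
        num, den = hudson_terms(p1, n1, p2, n2)
        start = (pos - 1) // window * window + 1  # windows start at 1, 5001, ...
        sums[start][0] += num
        sums[start][1] += den
    # Ratio of sums within each window; undefined if the denominator is zero.
    return {
        start: (num / den if den > 0 else float("nan"))
        for start, (num, den) in sorted(sums.items())
    }
```

Because each SNP contributes only to a running sum for its own window, the work grows linearly with the number of sites, which is one plausible reason a windowed calculation scales well to millions of SNPs.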
Observed (HO) and expected (HE) heterozygosity, the fixation index (FIS), allele diversity (A), and the mean number of alleles per locus (K) are also important parameters in population genetic analysis. They can be calculated with software such as Arlequin 3.11 (Excoffier, Laval, & Schneider, 2005), GENETIX 4.05 (Belkhir, Borsa, Chikhi, Raufaste, & Bonhomme, 2004), or ADZE 1.0 (Szpiech, Jakobsson, & Rosenberg, 2008). For restriction-site associated DNA sequencing (RAD-seq) or genotyping-by-sequencing (GBS) data, these parameters are commonly calculated with GenAlEx 6.5. However, because whole-genome resequencing usually yields about a hundred times as many SNPs as RAD-seq or GBS, GenAlEx 6.5 did not appear able to handle such a large data set. In this case, Metapop2 (López-Cortegano et al., 2019) was therefore selected; its latest optimized version can not only calculate the complete set of genetic diversity parameters but also efficiently analyze a large number of SNPs (e.g., >100,000 SNPs). The software calculated the population genetic parameters in a total of 60 h, which indicated that Metapop2 can effectively analyze millions or even tens of millions of SNPs in a short time.
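For reference, the relationship among HO, HE, and FIS can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Metapop2 procedure: genotypes are assumed to be biallelic and coded as alternate-allele counts (0, 1, 2, or None for missing), HE is taken as the uncorrected 2p(1 - p), and FIS is computed from the across-locus means as 1 - HO/HE. The function name is hypothetical.

```python
def diversity_stats(genotypes):
    """Mean HO, mean HE, and FIS for one population (hypothetical helper).

    genotypes: list of loci; each locus is a list with one entry per
               individual giving the alternate-allele count (0, 1, or 2),
               or None for a missing call.
    Returns (ho_mean, he_mean, fis).
    """
    ho_sum = 0.0
    he_sum = 0.0
    n_loci = 0
    for locus in genotypes:
        calls = [g for g in locus if g is not None]
        n = len(calls)
        if n == 0:
            continue  # skip loci with no called genotypes
        p = sum(calls) / (2 * n)                      # alternate-allele frequency
        ho_sum += sum(1 for g in calls if g == 1) / n  # observed heterozygotes
        he_sum += 2 * p * (1 - p)                      # Hardy-Weinberg expectation
        n_loci += 1
    if n_loci == 0:
        return float("nan"), float("nan"), float("nan")
    ho_mean = ho_sum / n_loci
    he_mean = he_sum / n_loci
    fis = 1 - ho_mean / he_mean if he_mean > 0 else float("nan")
    return ho_mean, he_mean, fis
```

A positive FIS under this definition indicates fewer heterozygotes than expected, and a negative value indicates an excess. Dedicated software typically applies sample-size corrections that this sketch omits.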
To quantify landscape structure and determine its effect on population genetics, one can use least-cost paths based on a resistance surface or straight-line transects (Spear et al., 2010). To overcome the shortcomings of these two methods, Strien et al. (2012) combined them into the least-cost transect analysis (LCTA). While remaining objective, this analysis also uses buffers to give the transects a width, within which the proportion of each landscape type is quantified. In this study, LCTA was performed at a fine scale (<10 km) based on whole-genome resequencing and clearly revealed how different tree types, urban areas, roads, and farmland relate to the dispersal and gene flow of M. alternatus. Thus, the analysis was effective with SNP markers and could identify the effects of landscape types at a fine scale. Cleary et al. (2017) used this analysis to describe the landscape genetics of two frugivorous bats under agricultural intensification and likewise obtained a clearer interpretation of landscape effects. The db-RDA model also effectively described landscape genetics at a fine scale. This model overcomes the limitation that conventional RDA can only be performed with Euclidean distance and allows the use of Bray–Curtis or other ecologically meaningful distance measures (Legendre & Anderson, 1999). LCTA and db-RDA both explained the effects of different landscape types on M. alternatus well at a fine scale, and because the results of the two models were relatively consistent, both are applicable to fine-scale landscape genetics based on whole-genome resequencing.
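The two ingredients that LCTA combines, a least-cost path and a buffered transect, can be sketched roughly as follows. This is a simplified illustration rather than the published LCTA procedure: the landscape is assumed to be a small raster of class labels, the resistance values are hypothetical, the path is found with a 4-neighbour Dijkstra search, and the buffer is a square neighbourhood measured in cells. All function names are made up for this example.

```python
import heapq
from collections import Counter


def least_cost_path(resistance, start, goal):
    """4-neighbour Dijkstra path over a grid of per-cell resistance values.

    resistance: list of rows of positive numbers.
    start, goal: (row, col) tuples. Returns the list of cells on the path.
    """
    rows, cols = len(resistance), len(resistance[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist[cell]:
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                # Step cost: mean resistance of the two cells involved.
                nd = d + (resistance[r][c] + resistance[nr][nc]) / 2
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def transect_proportions(landcover, path, buffer_cells=1):
    """Proportion of each landscape class within a square buffer of the path."""
    rows, cols = len(landcover), len(landcover[0])
    cells = set()
    for r, c in path:
        for dr in range(-buffer_cells, buffer_cells + 1):
            for dc in range(-buffer_cells, buffer_cells + 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    cells.add((nr, nc))
    counts = Counter(landcover[r][c] for r, c in cells)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}
```

In use, one would build the resistance grid from the land-cover grid with an assumed mapping (for example, a low value for forest and a high value for urban cells), compute the path between two sampling sites, and pass the resulting proportions to a model as predictors of pairwise genetic distance.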