Data Analysis of DNA Sequences
All the DNA sequences were edited and assembled with the program SeqMan
software package (DNA Star), and multiple alignments were generated with
Bioedit, version 7.0.4.1 (Hall, 1999). The three cpDNAs were combined by
PAUP, version 4.0b10 (Swofford, 2003). Nuclear genes had heterozygous
sites in some individuals, identified by overlapping peaks in
chromatograms. DnaSP, version 5.0 (Librado & Rozas, 2009) was used to
resolve the nuclear DNA sequences by applying the algorithms of PHASE
(Stephens & Donnelly, 2003; Stephens, Smith, & Donnelly, 2001). The
phased nDNAs were used in the following analyses.
Sequences were collapsed into haplotypes with the DnaSP version 5.0
(Librado & Rozas, 2009). Haplotype geographic distribution was drawn in
ArcGIS, version 10.5 (http://desktop.arcgis.com). For cpDNA, genetic
variation indices, such as the number of haplotypes (h), haplotype
diversity (H d), variable sites (S) and nucleotide diversity per
site (π) were calculated by DnaSP, version 5.0 (Librado & Rozas, 2009).
For each nuclear gene, besides the above indices, we also used DnaSP
(Librado & Rozas, 2009) to calculate Watterson’s parameter (θ) and the
minimum number of recombination events (R m).
Genetic differentiations between the two Cycas species were
estimated based on the combined cpDNA and each low copy nuclear gene.
The Wright’s fixation index (F ST) was calculated
for each locus using AMOVA implemented in Arlequin, version 3.11
(Excoffier, Laval, & Schneider, 2005) with significance tested using
1000 permutations as described in Excoffier et al. (1992) (Excoffier,
Smouse, & Quattro, 1992). We used Network, version 4.6.1.2 (Forster,
Forster, & Watson, 2007) to construct haplotype relationships following
the media-joining calculation at each locus, and all variation character
states were specified as unordered and equally weighted.
Effective population size, migration rate and divergence time were
estimated based on the isolation with migration (IM) model using the
combined cpDNA and five low copy nuclear genes in the IMa program (Hey
& Nielsen, 2007). The HKY model was chosen for all loci. We used the
calculate mutation rate 2.48 × 10-10 per site per year
for the cpDNA data (Appendix S1). For nuclear genes, the average
mutation rate 6.7 × 10-10 per site per year which was
derived from the average rates of Cycadales (De La Torre, Li, Van de
Peer, & Ingvarsson, 2017) to estimate divergence time between C.
bifida and C. micholitizii .
Ecological niche modeling (ENM) was performed to evaluate present
suitable habitats for C. bifida and C. micholitzii in
MaxEnt, version 3.4.1 (Phillips & Dudik, 2008). The current
distribution information for both species was obtained from the sampling
sites (Table 1), a total of five presence records of C.bifida and
nine records of C. micholitzii . Nineteen bioclimatic variables
under current conditions (1970-2000) in 30 arc-second resolution were
downloaded from the WorldClim 2.1 (www.worldclim.org). Ten replicates of
MaxEnt were run, and model performance was assessed using the area under
the curve (AUC) of the receiver operating characteristic (ROC) plot with
25% of the localities randomly selected to test the model. Niche
similarity between C. bifida and C. micholitzii was
measured through two niche overlap indices, Hellinger-derived I and
Schoener’s D. These metrics are implemented in the software
ENMTools,
version 1.4.3 (Warren, Glor, & Turelli, 2010). Firstly, the actual
niche overlap (the empirical I and D) between the two species was
calculated. Then, a niche identity test was conducted to compare the
overlap of the species pair’s actual niches to a distribution of niche
overlap obtained from 100 pseudoreplicates niches datasets that were
obtained by pooling all the occurrences of the two species.
Demographic history of C. bifida and C. micholitzii was
inferred in three ways, respectively. First, the signatures of
demographic change were examined with neutrality tests, including
Tajima’s D , Fu and Li’s D * andF * and Fu’s F S, using
DnaSP, version 5.0 (Librado & Rozas, 2009). The sum-of-squared
deviations (SSD) and raggedness index as well as their P-values were
calculated with the software Arlequin, version 3.11 (Excoffier et al.,
2005). Second, we used a pairwise mismatch distribution to test for
population expansion in DnaSP, version 5.0 (Librado & Rozas, 2009).
Third, an Extended Bayesian Skyline Plot was created by the program
BEAST, version 1.8.2 (Drummond & Rambaut, 2007), to further investigate
the demography history of the two species. A strict molecular clock
models with 6.7 × 10-10 per site per year mutation
rate (De La Torre et al., 2017) for each nuclear gene and 2.48 ×
10-10 per site per year mutation rate for cpDNA as
stated above, and the Coalescent: Extend Bayesian Skyline Plot (EBSP)
was specified as tree prior. The analysis was run for
109 generations sampling every 100,000 generations.
The ESS parameter of each process was checked by TRACER, version 1.5
(Rambaut & Drummond, 2009), until a stable value exceeding 200 was
reached, suggesting that there was acceptable mixing and sufficient
sampling.