Group Dynamic History Inference
To determine the cause of the multiple ancestral compositions shown in
the ADMIXTURE results, gene flow among the four groups was tested
using the ABBA-BABA test. Tests involving NE and EL, NE and CN, and EL
and CN showed a significant deviation of the D statistic from zero
(absolute Z-score greater than 3), indicating gene flow between these
groups. Interestingly, no gene flow was detected between NW and the
other groups (Figure 3A and Table S2). Combined with the results of the
ABBA-BABA test, the gene flow pattern and divergence time between
different groups and effective population size (Ne) of each group were
all inferred by G-PhoCS. Gene flow was greatest between NE and EL and
between NE and CN, with less gene flow from NE to the other two groups
than in the opposite direction, whereas gene flow between EL and CN was
weaker. The effective population size of the common ancestor was
23,946,300. The ancestral population diverged into eastern and western
lineages at approximately 6.45 Mya, and their effective population
sizes were 7,654,150 and 13,578,175, respectively. Next, CN and NW
separated at
approximately 6.04 Mya, and NE and EL separated at 5.82 Mya. All the
divergence times were in the late Miocene. The current effective
population sizes of NE, EL, CN and NW were 6,566,575, 1,650,775,
1,534,025 and 2,618,550, respectively (Figure 3B and Table S3). The Ne
values of all four extant groups were lower than that of their common
ancestor.
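To illustrate the significance criterion used for the ABBA-BABA test, the sketch below computes Patterson's D from ABBA and BABA site counts and derives a Z-score with a delete-one block jackknife. It is a minimal Python sketch under stated assumptions, not the pipeline used in this study: the function names, the assumption of equal-sized genomic blocks, and the example counts are all hypothetical.

```python
import math


def d_statistic(abba, baba):
    """Patterson's D from per-block ABBA and BABA site counts."""
    total_abba = sum(abba)
    total_baba = sum(baba)
    return (total_abba - total_baba) / (total_abba + total_baba)


def jackknife_z(abba, baba):
    """Delete-one block jackknife; returns (D, standard error, Z).

    Assumes the blocks are of roughly equal size (unweighted jackknife).
    """
    n = len(abba)
    d_full = d_statistic(abba, baba)
    # Recompute D with each block left out in turn.
    leave_one_out = [
        d_statistic(abba[:i] + abba[i + 1:], baba[:i] + baba[i + 1:])
        for i in range(n)
    ]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    se = math.sqrt(variance)
    z = d_full / se if se > 0 else float("nan")
    return d_full, se, z


if __name__ == "__main__":
    # Hypothetical per-block counts, for illustration only.
    abba_counts = [10, 12, 11, 9]
    baba_counts = [5, 6, 4, 5]
    d, se, z = jackknife_z(abba_counts, baba_counts)
    # |Z| > 3 is the threshold applied in the text above.
    print(d, se, z, abs(z) > 3)
```

A real analysis would use many more blocks, typically weighted by the number of informative sites per block.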
The nucleotide diversity (π) of the NE, EL, CN and NW populations was
calculated throughout the genome. Among the four groups, NW showed the
highest nucleotide diversity and EL the lowest (Figure S3A). FST was
calculated for the four groups to infer population genetic
differentiation. At the whole-genome level, the FST values between NE
and EL, NE and CN, NE and NW, EL and CN, EL and NW, and CN and NW were
0.160, 0.151, 0.152, 0.159, 0.193 and 0.161, respectively. These results
indicated that the four groups of A. viridiflora had a moderate
level of differentiation, among which EL and NW were the most
differentiated. The FST values calculated for the four groups in 10 kb
windows across the genome were consistent with those at the
whole-genome level (Figure S3B). Linkage disequilibrium
(LD) analysis showed that CN and EL had stronger LD, whereas NE and NW
showed weaker LD (measured by r²). At r² = 0.1, the decay distances of
NE, EL, CN and NW were 10 kb, 37 kb, 44 kb and 6.9 kb, respectively
(Figure S3C). The rapid LD decay in NE and NW may be due to their
higher genetic diversity relative to EL and CN.
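The decay distance at a fixed r² threshold can be read off a binned LD decay curve as sketched below. This is a minimal Python illustration, assuming the curve is given as mean r² per distance bin and that linear interpolation between bins is acceptable; the function name and the example values are hypothetical and do not reproduce the software or data used in this study.

```python
def decay_distance(distances_kb, mean_r2, threshold=0.1):
    """Smallest distance (kb) at which the binned mean r^2 falls to the threshold.

    Linearly interpolates between the two bins that bracket the threshold.
    Returns None if the curve never drops to the threshold.
    """
    points = sorted(zip(distances_kb, mean_r2))
    prev_d, prev_r = points[0]
    if prev_r <= threshold:
        return prev_d
    for d, r in points[1:]:
        if r <= threshold:
            # Here prev_r > threshold >= r, so the denominator is positive.
            frac = (prev_r - threshold) / (prev_r - r)
            return prev_d + frac * (d - prev_d)
        prev_d, prev_r = d, r
    return None


if __name__ == "__main__":
    # Hypothetical decay curve, for illustration only.
    distances = [1, 5, 10, 20]
    r2_values = [0.40, 0.20, 0.12, 0.08]
    print(decay_distance(distances, r2_values, threshold=0.1))
```

A shorter decay distance, as reported for NE and NW, corresponds to a curve that crosses the threshold in an earlier distance bin.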