Group Dynamic History Inference
To determine the cause of the multiple ancestral components observed in the ADMIXTURE results, gene flow among the four groups was assessed using the ABBA-BABA test. Tests involving NE and EL, NE and CN, and EL and CN showed a significant deviation of the D-statistic from zero (absolute Z-score greater than 3), indicating gene flow between these groups. Interestingly, no gene flow was detected between NW and the other groups (Figure 3A and Table S2). Building on the results of the ABBA-BABA test, the gene flow pattern, the divergence times between groups and the effective population size (Ne) of each group were inferred with G-PhoCS. Gene flow was greatest between NE and EL and between NE and CN, with less gene flow from NE to the other two groups than in the opposite direction; gene flow between EL and CN was smaller. The effective population size of the common ancestor was 23,946,300. The ancestral population split into eastern and western lineages approximately 6.45 Mya, with effective population sizes of 7,654,150 and 13,578,175, respectively. Subsequently, CN and NW separated approximately 6.04 Mya, and NE and EL separated 5.82 Mya. All divergence times fell in the late Miocene. The current effective population sizes of NE, EL, CN and NW were 6,566,575, 1,650,775, 1,534,025 and 2,618,550, respectively (Figure 3B and Table S3). The Ne values of all four current groups were lower than that of their common ancestor.
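For readers unfamiliar with the ABBA-BABA criterion used above (absolute Z-score greater than 3), the following is a minimal sketch of how a D-statistic and its Z-score can be computed from site-pattern counts. It is not the pipeline used in this study: the function name `d_statistic`, the per-block count inputs and the unweighted delete-one block jackknife (which assumes blocks of roughly equal size) are all illustrative assumptions, and the counts in the usage note are made up.

```python
import math


def d_statistic(abba_blocks, baba_blocks):
    """Return (D, Z) from per-block ABBA and BABA site-pattern counts.

    Hypothetical helper for illustration. D = (ABBA - BABA) / (ABBA + BABA),
    and the standard error comes from an unweighted delete-one block
    jackknife, which assumes genomic blocks of similar size.
    """
    n = len(abba_blocks)
    total_abba = sum(abba_blocks)
    total_baba = sum(baba_blocks)
    d = (total_abba - total_baba) / (total_abba + total_baba)

    # Recompute D with each block left out in turn.
    leave_one_out = []
    for a, b in zip(abba_blocks, baba_blocks):
        num = (total_abba - a) - (total_baba - b)
        den = (total_abba - a) + (total_baba - b)
        leave_one_out.append(num / den)

    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    se = math.sqrt(variance)

    # Z is undefined if the jackknife estimates do not vary at all.
    z = d / se if se > 0 else math.nan
    return d, z
```

With made-up counts such as `d_statistic([30, 28, 35, 31, 29], [10, 12, 9, 11, 10])`, the excess of ABBA over BABA patterns gives a positive D with |Z| well above 3, which under the criterion above would be read as evidence of gene flow.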
The nucleotide diversity (π) of the NE, EL, CN and NW populations was calculated across the genome. Among the four groups, NW showed the highest nucleotide diversity and EL the lowest (Figure S3A). FST was calculated for the four groups to infer population genetic differentiation. At the genome-wide level, the FST values between NE and EL, NE and CN, NE and NW, EL and CN, EL and NW, and CN and NW were 0.160, 0.151, 0.152, 0.159, 0.193 and 0.161, respectively. These results indicated that the four groups of A. viridiflora were moderately differentiated, with EL and NW being the most differentiated pair. The FST values calculated in 10 kb windows across the genome were consistent with the genome-wide values (Figure S3B). Linkage disequilibrium (LD) analysis, with LD measured by r², showed that CN and EL had a greater degree of LD, while NE and NW had less. At r² = 0.1, the LD decay distances of NE, EL, CN and NW were 10 kb, 37 kb, 44 kb and 6.9 kb, respectively (Figure S3C). The rapid LD decay in NE and NW may have been due to their higher genetic diversity relative to that of EL and CN.
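To make the notion of a decay distance at a fixed r2 threshold concrete, the sketch below shows one simple way such a value could be read off a binned LD decay curve by linear interpolation. It is an illustration only and not the procedure used for Figure S3C: the function name `ld_decay_distance`, the input format (bin distances in kb paired with mean r2 per bin) and the interpolation scheme are assumptions, and the numbers in the usage note are invented.

```python
def ld_decay_distance(distances_kb, mean_r2, threshold=0.1):
    """Return the distance (kb) at which mean r2 first falls to `threshold`.

    Hypothetical helper for illustration. `distances_kb` must be sorted in
    increasing order and paired with the mean r2 observed at each distance.
    Returns None if the curve never reaches the threshold.
    """
    points = list(zip(distances_kb, mean_r2))
    if not points:
        return None

    # The curve already starts at or below the threshold.
    if points[0][1] <= threshold:
        return points[0][0]

    # Find the first pair of neighbouring bins that brackets the threshold
    # and interpolate linearly between them.
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 > threshold >= r1:
            return d0 + (r0 - threshold) * (d1 - d0) / (r0 - r1)

    return None
```

For example, with the invented curve `ld_decay_distance([1, 5, 10, 20], [0.4, 0.2, 0.12, 0.08])`, the threshold of 0.1 is crossed halfway between the 10 kb and 20 kb bins. A shorter decay distance, as reported for NE and NW, corresponds to a curve that crosses the threshold earlier.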