FIGURE LEGENDS
Fig.1 Sampling localities and geographical distribution of the three distinct populations of Parascaris spp. The image in the upper left corner shows a zebra-derived roundworm.
Fig.2 Population structure and relationships of roundworms from horse, zebra and donkey. (a) Principal component analysis (PCA) plots of the first three components. The fraction of the variance explained was 12.75% for PC1, 7.09% for PC2 and 0.06% for PC3; (b) Phylogenetic tree [maximum-likelihood (ML) tree with 1000 bootstraps] of all samples inferred from whole-genome tag SNPs, with B. schroederi as an outgroup; (c) Population structure plots with K=2-4. The y axis shows the proportion of each individual’s genome derived from the inferred ancestral populations, and the x axis shows the different populations.
Fig.3 Chronogram of Parascaris spp. based on Bayesian coalescent analysis of SNP data using SNAPP. Nodes with high support (posterior probability = 1.00) are filled in red. Error bars represent the 95% highest posterior density (HPD) intervals. The colored circles represent different populations.
Fig.4 Demographic history of Parascaris spp. reconstructed from the reference and population resequencing genomes. (a) The colored lines represent the estimated effective population size of each population. The 100 curves of each color represent the PSMC estimates for 100 sequences randomly resampled from the original sequence. The generation time (g) and the neutral mutation rate per generation (µ) of Parascaris spp. were 0.17 years and 0.9×10⁻⁸, respectively; (b) Coalescent-based inference of demographic history using MSMC2. The upper panel shows the effective population sizes (Ne) of the three populations, while the lower panel shows the split times among the three populations; (c) Effective population size and split time based on the SMC++ method.
Fig.5 Demographic inferences and early gene flow of Parascaris spp. populations. (a) Results of the population genetic model comparison using the three-dimensional site frequency spectrum (3D-SFS) among the PEc, PEz and PEa populations. A simplified graph of the best-fit model is depicted, along with the comparison of the 3D-SFS for data, model and residuals. (b) Results of the population genetic model comparison using the two-dimensional site frequency spectrum (2D-SFS) between the PEc and PEz&PEa populations, along with the 2D-SFS for data, model and residuals.
Fig.6 Schematic diagram of glycolysis, the tricarboxylic acid cycle and lipid metabolism. Red arrows indicate enzymes under significant positive selection in the PEz&PEa clade (P<0.01), and blue arrows indicate enzymes under significant positive selection in the PEc clade (P<0.01). The Manhattan plots show the XP-EHH scores for the 50-kb regions around the related genes.