FIGURE LEGENDS
Fig.1 Sampling localities and geographical distribution of the
three distinct populations of Parascaris spp. The image in the upper left
corner shows a zebra-derived roundworm.
Fig.2 Population structure and relationships of roundworms from
horse, zebra and donkey. (a) Principal component analysis (PCA) plots
of the first three components. The fraction of the variance explained
was 12.75% for PC1, 7.09% for PC2 and 0.06% for PC3; (b) Phylogenetic
tree [maximum-likelihood (ML) tree with 1000 bootstraps] of all
samples inferred from whole-genome tag SNPs, with B. schroederi as the
outgroup; (c) Population structure plots with K=2-4. The y axis
quantifies the proportion of each individual’s genome derived from inferred
ancestral populations, and the x axis shows the different populations.
Fig.3 Chronogram of Parascaris spp. based on Bayesian
coalescent analysis of SNP data using SNAPP. Nodes with high support
(posterior probability = 1.00) are filled in red. Error bars
represent the 95% highest posterior densities (HPD). The colored
circles represent different populations.
Fig.4 Demographic history of Parascaris spp.
reconstructed from the reference and population resequencing genomes.
(a) The colored lines represent the estimated effective population size
of each population. The 100 curves of each color represent the PSMC
estimates for 100 sequences randomly resampled from the original
sequence. The generation time (g) and the neutral mutation rate
per generation (µ) of Parascaris spp. were 0.17 years and
0.9×10⁻⁸, respectively. (b) Coalescent-based inference
of demographic history using MSMC2. The upper panel shows the effective
population sizes (Ne) of the three populations, while the lower panel
shows the split times among the three populations; (c) Effective population
sizes and split times inferred with the SMC++ method.
Fig.5 Demographic inferences and early gene flow of
Parascaris spp. populations. (a) Results of the population
genetic model comparison using the three-dimensional site frequency
spectrum (3D-SFS) between the PEc, PEz and PEa populations. A simplified
graph of the best-fit model is depicted, along with the comparison of
the 3D-SFS for data, model and residuals. (b) Results of the population
genetic model comparison using the two-dimensional site frequency
spectrum (2D-SFS) between the PEc and PEz & PEa populations, along with the
2D-SFS for data, model and residuals.
Fig.6 Schematic diagram of glycolysis, the tricarboxylic acid cycle
and lipid metabolism. Red arrows indicate enzymes under significant positive
selection in the PEz & PEa clade (P < 0.01), and blue
arrows indicate enzymes under significant positive selection in the PEc
clade (P < 0.01). The Manhattan plots show the XP-EHH scores
across the 50-kb regions surrounding the related genes.