Within-lineage population genetics
Within Lineage B, genetic patterns remained highly similar for all
filters (but see supplemental figures and tables for differences). As
conclusions remained the same, all further reported analyses were
performed filtering on 3X coverage and max. 30% missing data, as this
retained the most SNPs.
Population genetic diversity varied among lakes (Table 1, Supplemental
Table 4). The highest genetic diversity was consistently found for the
lagoon populations Bay and DAR, as seen for nucleotide diversity (π)
(0.0101 and 0.0095, respectively), and for the expected heterozygosity
(He) (0.157 and 0.117, respectively). The lowest genetic
diversity was observed in populations P.1 (π = 0.0036, He = 0.054) and
P.27 (π = 0.0037, He = 0.038). Population B.3 also showed low
heterozygosity (He = 0.034) but relatively high nucleotide diversity
(π = 0.0074); this discrepancy may be an artefact of low sample size. When
estimating heterozygosity from genotype likelihoods via ANGSD, we found
the lowest heterozygosity for the populations P.5 (0.019) and P.27
(0.021).
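For readers less familiar with the two diversity statistics reported above, the following is a minimal sketch of how they are commonly defined, not the pipeline used in this study. It assumes biallelic SNPs, omits any sample-size correction, and uses hypothetical function names and toy input values.

```python
from itertools import combinations


def expected_heterozygosity(allele_freqs):
    """Mean expected heterozygosity over biallelic SNPs, He = 2p(1 - p).

    allele_freqs: one allele frequency p per SNP (hypothetical input).
    No small-sample correction is applied in this sketch.
    """
    return sum(2 * p * (1 - p) for p in allele_freqs) / len(allele_freqs)


def nucleotide_diversity(haplotypes, n_sites):
    """Average number of pairwise differences per site (pi).

    haplotypes: equal-length sequences, one per sampled chromosome.
    n_sites: total number of sites compared, including invariant ones.
    """
    pairs = list(combinations(haplotypes, 2))
    diffs = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return diffs / (len(pairs) * n_sites)


# Toy data, not taken from the study.
he = expected_heterozygosity([0.5, 0.1, 0.0])
pi = nucleotide_diversity(["AAGT", "AAGC", "ATGC"], n_sites=4)
print(round(he, 4), round(pi, 4))
```

Because He is averaged over SNPs while π is averaged over all sites, the two statistics need not rank populations identically.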
The samples clustered per lake and lagoon location (Fig. 3, Supplemental
Fig. 1, Supplemental Fig. 2). The first four Principal Components (PCs)
in the Principal Component Analysis (PCA) explained 80.5% of total
variation (Fig. 3A). PC1, explaining 45.6% of the variation, separated
populations by geographic region, with the Raja Ampat lakes being
distinct from the lakes in Berau. PC2, explaining 24.4% of variation,
separated lake MIS01 from the other lakes. PC3 and PC4 (explaining
10.5% in total) further separated lagoon DAR and lake P.5, and to a
lesser extent P.1 and P.30. In the PC1 versus PC2 plot, the lagoon
populations (Bay and DAR) clustered towards the center of the graph,
indicating that they are ancestral. For Bay, this central position
persisted in the PC3 versus PC4 plot, but not for DAR. Lakes P.27 and
P.32 remained closely associated.
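The percentages of variation attributed to each PC are conventionally obtained by dividing each eigenvalue by the sum of all eigenvalues. The sketch below illustrates that arithmetic and checks that the reported per-component values add up to the stated cumulative total. The function name and the eigenvalues are hypothetical; only the three percentages in the final check come from the text.

```python
def explained_variance(eigenvalues):
    """Percentage of total variance explained by each principal component."""
    total = sum(eigenvalues)
    return [100 * ev / total for ev in eigenvalues]


# Hypothetical eigenvalues, for illustration only.
print(explained_variance([4.0, 2.0, 1.0, 1.0]))

# Consistency check on the values reported in the text:
# PC1 (45.6%) + PC2 (24.4%) + PC3 and PC4 combined (10.5%).
reported = [45.6, 24.4, 10.5]
cumulative = round(sum(reported), 1)
print(cumulative)
```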
The Admixture analysis further supported the pattern of clustering per
lake (Fig. 3B). Convergence of likelihood values indicated the number
of ancestral populations to be K = 9 (Supplemental Fig. 3, 4). When
the putative number of populations was set to 9, all populations were
separated except B.2, which consisted of a mix of Bay and B.1
genetic lineages. Some admixture of B.1 genetic diversity into the Bay
and DAR populations was observed, indicating some genetic connection
between these populations. Setting K to 8 indicated some admixture
between P.30 and P.5, whereas setting K to 7 indicated admixture between
P.27 and P.32, with Bay being a mixture of other populations. Setting K
to 10 separated all populations.
Findings from the phylogenetic network were consistent with patterns
found for PCA and Admixture plots (Fig. 3C, Supplemental Fig. 5). The
network showed a high fit (fit = 99.2) and a small degree of reticulation
(d = 0.153), indicating a tree-like structure. The lagoon
populations Bay and DAR showed higher reticulation than the marine lake
populations, indicating higher intra-population diversity.
Pairwise fixation indices (F’ST) showed high levels of
genetic structuring (0.629 ± 0.133) (Fig. 3D). F’ST ranged from 0.182
between Bay and B.2 to 0.778 between P.30 and P.32
(Supplemental Fig. 6, Supplemental Table 5). All pairwise comparisons
were significant, except for the comparison between P.32 and B.2,
potentially due to sample size (n = 4 and 2, respectively). The migration
network among lakes indicated strongest relative bidirectional migration
among lakes in Berau (Fig. 1D). Lagoon population Bay was linked to some
degree to all other populations (relative fraction 0.4-1). Within Raja
Ampat, bidirectional migration above the threshold of 0.4 was observed
between P.5 and four other lakes (P.30, P.32, P.1, and P.4). There was
low connectivity among lakes P.27, P.30, P.32 and P.1 in Raja Ampat
(<0.4).
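The thresholding applied to the relative migration network can be sketched as follows. This is an illustration under stated assumptions, not the software used in the study: the function name is hypothetical, the migration values are invented, and a link is treated as bidirectional only when the relative migration in both directions reaches the 0.4 threshold.

```python
def bidirectional_links(migration, threshold=0.4):
    """Return population pairs linked above the threshold in both directions.

    migration: dict mapping (source, destination) to relative migration
    in the range 0-1. Missing directions are treated as 0.
    """
    links = []
    for (src, dst), rate in migration.items():
        if src < dst:  # visit each unordered pair once
            back = migration.get((dst, src), 0.0)
            if rate >= threshold and back >= threshold:
                links.append((src, dst))
    return sorted(links)


# Invented values for illustration; not results from the study.
example = {
    ("Bay", "B.1"): 1.0,
    ("B.1", "Bay"): 0.8,
    ("P.27", "P.30"): 0.2,
    ("P.30", "P.27"): 0.5,
}
print(bidirectional_links(example))
```

In this toy input only the Bay and B.1 pair is retained, because the P.27 to P.30 direction falls below the threshold.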