Searching for sex-linked loci in reduced-representation DArT
genotypes for Macquarie perch and golden perch
For 171 Macquarie perch and 66 golden perch of known sex we obtained
genome-wide SNP markers using DArTseq, a reduced representation
sequencing method similar to double-digest restriction-site-associated
DNA sequencing (list of samples in Supplementary Material S1). We genotyped
93 female and 78 male Macquarie perch from the Dartmouth and Yarra
populations, and 41 female and 25 male golden perch from Macquarie,
Murray and Murrumbidgee populations.
Sequencing libraries were prepared at Diversity Arrays Technology Pty
Ltd (Canberra, Australia) following Kilian et al. (2012). DNA samples
were digested using a combination of the restriction enzymes PstI and
SphI, which target low-copy genomic regions (details in Appendix
A). For quality control, each lane included ~25%
technical replicates from independent libraries. SNP discovery and
genotyping were performed using DArT P/L’s proprietary analytical
pipeline (Jaccoud, Peng, Feinstein, & Kilian, 2001; Kilian et al.,
2012). The method assembles short 69-bp DArT loci de novo. The
DArT pipeline removed poor-quality sequences, applying more stringent
criteria to the barcode region than the rest of the sequence, corrected
low-quality bases from singleton tags using collapsed tags with multiple
members as a template, and used a secondary pipeline (DArTsoft14) for
SNP calling. Clusters, comprising tags differing by no more than 3
bases, were parsed into separate SNP loci while ensuring the balance of
read counts for the allelic pairs: loci with a 5-fold or greater
difference in read counts between the two alleles were rejected. Reproducibility
of SNP calls was then assessed based on score consistency between
technical replicates, and SNPs with a reproducibility <95%
were removed. No other filtering was performed at this stage, to
maximize the chance of finding sex-linked loci. DArT loci were aligned
to their respective newly-assembled reference genomes (NCBI WGS Project
accessions: SEMN01 for Macquarie perch and VMKM01 for golden perch)
using BLAST, with e-value ≤5e-5 and sequence identity ≥90%.
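The two locus-level filters described above (allelic read-count balance and replicate reproducibility) can be illustrated with a minimal Python sketch. This is not the proprietary DArT pipeline: the `SnpLocus` record, its field names and the treatment of an allele with zero reads are assumptions made only for illustration; the thresholds are those stated in the text.

```python
from dataclasses import dataclass

# Thresholds taken from the text; everything else here is hypothetical.
MAX_READ_RATIO = 5.0        # reject loci with a 5-fold or greater imbalance
MIN_REPRODUCIBILITY = 0.95  # remove SNPs with reproducibility < 95%


@dataclass
class SnpLocus:
    """Hypothetical per-locus summary (not a DArT data structure)."""
    ref_reads: int          # total reads supporting the reference allele
    alt_reads: int          # total reads supporting the alternative allele
    reproducibility: float  # fraction of consistent calls between replicates


def passes_qc(locus: SnpLocus) -> bool:
    """Return True if the locus survives both filters."""
    low, high = sorted((locus.ref_reads, locus.alt_reads))
    if low == 0:
        # Assumption: an allele with no reads counts as unbalanced.
        return False
    if high / low >= MAX_READ_RATIO:
        return False
    return locus.reproducibility >= MIN_REPRODUCIBILITY


loci = [SnpLocus(100, 60, 0.98), SnpLocus(100, 20, 0.99), SnpLocus(80, 70, 0.94)]
kept = [locus for locus in loci if passes_qc(locus)]
```

In this toy input only the first locus is retained: the second has exactly a 5-fold read-count difference and the third falls below the reproducibility threshold.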
We tested each SNP locus for belonging to one of four types of strata
consistent with XY sex-determination (Shams et al., 2019):
(i) Y-only sequences on old
strata (hereafter, “Y-linked”): always apparently homozygous
(actually hemizygous) in males, and absent in females.
(ii) Different alleles for non-recombining X- and Y-gametologs
on moderately old strata (hereafter “XY-gametologs”):
homozygous in all females and heterozygous in all males.
(iii) Y-limited variation on young strata (hereafter
“loci bearing recent-Y-specific polymorphism”):
homozygous in all females and either homozygous or heterozygous in
males. Variation on Y-gametologs can accumulate faster than on
X-gametologs due to lower X-Y than X-X recombination.
(iv) X-limited variation on old strata for loci missing on
Y-gametologs (hereafter “old-X-linked”): homozygous or
heterozygous in females but always hemizygous in males (males have half
the read depth of females).
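The four expected genotype patterns can be written as a strict (zero-tolerance) classifier. The sketch below is an illustration only, under stated assumptions: genotypes are coded 0 or 2 for the two homozygotes, 1 for heterozygotes and `None` for an unscored individual; the function name `classify_locus` is hypothetical; and the read-depth criterion mentioned for (iv) is not modelled, so hemizygous males are simply seen as homozygous.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # 0 or 2 = homozygous, 1 = heterozygous, None = not scored


def classify_locus(females: Sequence[Genotype],
                   males: Sequence[Genotype]) -> Optional[str]:
    """Assign a locus to one of the four XY-consistent patterns, or None."""
    f_scored = [g for g in females if g is not None]
    m_scored = [g for g in males if g is not None]
    if not m_scored:
        return None
    m_het = sum(g == 1 for g in m_scored)

    if not f_scored:
        # (i) absent in all females, scored and apparently homozygous in all males
        if len(m_scored) == len(males) and m_het == 0:
            return "Y-linked"
        return None

    f_het = sum(g == 1 for g in f_scored)
    if f_het == 0:
        if m_het == len(m_scored):
            return "XY-gametologs"                   # (ii) all males heterozygous
        if m_het > 0:
            return "recent-Y-specific polymorphism"  # (iii) some males heterozygous
        return None                                  # no heterozygotes in either sex
    if m_het == 0:
        return "old-X-linked"                        # (iv) heterozygotes in females only
    return None                                      # heterozygotes in both sexes
```

For example, `classify_locus([0, 0], [1, 1])` returns `"XY-gametologs"`, whereas a locus with heterozygotes in both sexes returns `None`.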
For these tests we used the gl.sexlinkage function of the dartR
package (Gruber, Unmack, Berry, & Georges, 2018) in R (R Core Team,
2020). Under default settings (t.het = 0, t.hom = 0, system = 'xy'), the
function looks for patterns consistent with (i) and (ii);
here, the tolerance parameter t.het is the proportion of individuals of
the sex expected to be homozygous (XX females) allowed to be
heterozygous, and t.hom is the proportion of individuals of the sex
expected to be heterozygous (XY males) allowed to be homozygous. To find
(iii) and (iv), defined here as loci heterozygous in >10% of males and
of females, respectively, we set t.het = 0 and t.hom = 0.9 under the
'xy' system for (iii) and the 'zw' system for (iv); these settings
allowed up to 90% of individuals of the sex expected to be heterozygous
to be homozygous. No value of t.hom < 0.8 in Macquarie perch or
t.hom < 0.7 in golden perch returned sex-linked loci,
indicating very low variability at sex-linked loci. To reduce the
number of false positives due to small sample size, only loci
successfully scored in >75% (>58) of male
Macquarie perch and >95% (>23) of male golden
perch were considered, male sample sizes being smaller than female
sample sizes in both species.
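The tolerance logic and the call-rate threshold can be sketched as follows. This reimplements only the behaviour described in the text and is not the code of `gl.sexlinkage`; the function names, the genotype coding (0 or 2 homozygous, 1 heterozygous, `None` unscored) and the handling of the 'zw' system by swapping the sexes are assumptions for illustration.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # 0 or 2 = homozygous, 1 = heterozygous, None = not scored


def fraction(genotypes: Sequence[Genotype], heterozygous: bool) -> Optional[float]:
    """Fraction of scored individuals that are heterozygous (or homozygous)."""
    scored = [g for g in genotypes if g is not None]
    if not scored:
        return None
    return sum((g == 1) == heterozygous for g in scored) / len(scored)


def matches_system(females, males, t_het=0.0, t_hom=0.0, system="xy") -> bool:
    """True if at most t_het of the homogametic sex is heterozygous and
    at most t_hom of the heterogametic sex is homozygous."""
    # Assumption: under 'zw' the roles of the sexes are simply swapped.
    homogametic, heterogametic = (females, males) if system == "xy" else (males, females)
    het_in_homogametic = fraction(homogametic, heterozygous=True)
    hom_in_heterogametic = fraction(heterogametic, heterozygous=False)
    if het_in_homogametic is None or hom_in_heterogametic is None:
        return False
    return het_in_homogametic <= t_het and hom_in_heterogametic <= t_hom


def enough_males_scored(males: Sequence[Genotype], min_fraction: float) -> bool:
    """True if strictly more than min_fraction of males have a genotype call."""
    n_scored = sum(g is not None for g in males)
    return n_scored > min_fraction * len(males)
```

With 78 male Macquarie perch and a threshold of 0.75, at least 59 males must be scored (75% of 78 is 58.5); with 25 male golden perch and a threshold of 0.95, at least 24 must be scored (95% of 25 is 23.75), consistent with the >58 and >23 cut-offs given above.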