Searching for sex-linked loci in reduced-representation DArT genotypes for Macquarie perch and golden perch
For 171 Macquarie perch and 66 golden perch of known sex, we obtained genome-wide SNP markers using DArTseq, a reduced-representation sequencing method similar to double-digest restriction-site-associated DNA sequencing (list of samples in Supplementary Material S1). We genotyped 93 female and 78 male Macquarie perch from the Dartmouth and Yarra populations, and 41 female and 25 male golden perch from the Macquarie, Murray and Murrumbidgee populations.
Sequencing libraries were prepared at Diversity Arrays Technology Pty Ltd (Canberra, Australia) following Kilian et al. (2012). DNA samples were digested with a combination of the restriction enzymes PstI and SphI, which target low-copy genomic regions (details in Appendix A). For quality control, each lane included ~25% technical replicates from independent libraries. SNP discovery and genotyping were performed using DArT P/L’s proprietary analytical pipeline (Jaccoud, Peng, Feinstein, & Kilian, 2001; Kilian et al., 2012). The method assembles short 69-bp DArT loci de novo. The DArT pipeline removed poor-quality sequences, applying more stringent criteria to the barcode region than to the rest of the sequence, corrected low-quality bases in singleton tags using collapsed tags with multiple members as a template, and used a secondary pipeline (DArTsoft14) for SNP calling. Clusters, comprising tags differing by no more than 3 bases, were parsed into separate SNP loci while ensuring balanced read counts for the allelic pairs: loci with a 5-fold or higher difference in read counts between the two alleles were rejected. Reproducibility of SNP calls was then assessed based on score consistency between technical replicates, and SNPs with a reproducibility <95% were removed. No other filtering was performed at this stage, to maximize the chance of finding sex-linked loci. DArT loci were aligned to their respective newly assembled reference genomes (NCBI WGS Project accessions: SEMN01 for Macquarie perch and VMKM01 for golden perch) using BLAST, with e-value ≤5e-5 and sequence identity ≥90%.
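The numeric thresholds above can be restated as a minimal Python sketch. It is not the proprietary DArT pipeline: the record layout, the function names and the handling of an allele with zero reads are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class SnpLocus:
    """Hypothetical per-locus summary; field names are illustrative."""
    locus_id: str
    ref_reads: int          # total reads supporting the reference allele
    alt_reads: int          # total reads supporting the alternate allele
    reproducibility: float  # fraction of concordant calls between technical replicates


def passes_filters(locus: SnpLocus, max_fold: float = 5.0,
                   min_repro: float = 0.95) -> bool:
    """Keep a locus only if allele read counts are balanced and calls are reproducible."""
    low, high = sorted((locus.ref_reads, locus.alt_reads))
    if low == 0:
        # Assumption: an allele with no reads counts as maximally unbalanced.
        return False
    if high / low >= max_fold:
        # A 5-fold or higher difference in read counts is rejected.
        return False
    # SNPs with reproducibility below 95% are removed.
    return locus.reproducibility >= min_repro


def keep_blast_hit(evalue: float, percent_identity: float) -> bool:
    """Alignment thresholds from the text: e-value <= 5e-5 and identity >= 90%."""
    return evalue <= 5e-5 and percent_identity >= 90.0
```

Note that the boundary cases follow the wording of the text: a read-count ratio of exactly 5 is rejected, whereas a reproducibility of exactly 95% is retained.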
We tested whether each SNP locus belonged to one of four types of strata consistent with XY sex determination (Shams et al., 2019):
(i) Y-only sequences on old strata (hereafter “Y-linked”): always apparently homozygous (actually hemizygous) in males, and absent in females.
(ii) Different alleles for non-recombining X- and Y-gametologs on moderately old strata (hereafter “XY-gametologs”): homozygous in all females and heterozygous in all males.
(iii) Y-limited variation on young strata (hereafter “loci bearing recent-Y-specific polymorphism”): homozygous in all females and either homozygous or heterozygous in males. Variation on Y-gametologs can accumulate faster than on X-gametologs due to lower X-Y than X-X recombination.
(iv) X-limited variation on old strata for loci missing on Y-gametologs (hereafter “old-X-linked”): homozygous or heterozygous in females but always hemizygous in males (males have half the read depth of females).
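The four strict patterns can be sketched as a small decision rule over per-sex genotype calls. This is an illustrative Python sketch, not the code used in the study. It assumes genotypes coded as 0 or 2 (homozygous), 1 (heterozygous) and None (not scored), it requires every male to be scored for type (i), and it ignores read depth, so the hemizygosity implied by (iv) is not checked. The function name and return labels are hypothetical.

```python
from typing import Optional, Sequence

# Assumed coding: 0 or 2 = homozygous, 1 = heterozygous, None = not scored.
Genotype = Optional[int]


def classify_locus(females: Sequence[Genotype], males: Sequence[Genotype]) -> str:
    """Assign a locus to one of the four strict XY patterns, or 'unclassified'."""
    f = [g for g in females if g is not None]
    m = [g for g in males if g is not None]
    if not m:
        return "unclassified"

    males_all_hom = all(g != 1 for g in m)
    males_all_het = all(g == 1 for g in m)

    if not f:
        # (i) absent in every female, scored and apparently homozygous in every male
        if males_all_hom and len(m) == len(males):
            return "Y-linked"
        return "unclassified"

    females_all_hom = all(g != 1 for g in f)
    if females_all_hom and males_all_het:
        return "XY-gametolog"                    # (ii)
    if females_all_hom and not males_all_hom:
        return "recent-Y-specific polymorphism"  # (iii): males both hom and het
    if not females_all_hom and males_all_hom:
        return "old-X-linked"                    # (iv): read depth not checked here
    return "unclassified"
```

Because these rules allow no exceptions, a single genotyping error is enough to leave a truly sex-linked locus unclassified, which is why tolerance thresholds are useful in practice.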
For these tests we used the gl.sexlinkage function of the dartR package (Gruber, Unmack, Berry, & Georges, 2018) in R (R Core Team, 2020). Under default settings (t.het = 0, t.hom = 0, system = 'xy'), the function looks for patterns consistent with (i) and (ii); here, the tolerance parameter t.het is the proportion of individuals of the sex expected to be homozygous (XX females) that are allowed to be heterozygous, and t.hom is the proportion of individuals of the sex expected to be heterozygous (XY males) that are allowed to be homozygous. To find (iii) and (iv), defined here as loci heterozygous in >10% of males or females, t.het = 0 and t.hom = 0.9 were used under the 'xy' and 'zw' systems, respectively; these settings allowed up to 90% of individuals of the sex expected to be heterozygous to be homozygous. No value of t.hom < 0.8 in Macquarie perch, or t.hom < 0.7 in golden perch, returned sex-linked loci, indicating very low variability at sex-linked loci. To reduce the number of false positives due to small sample size, only loci successfully scored in >75% (>58) of male Macquarie perch and >95% (>23) of male golden perch were considered, male sample sizes being smaller than female sample sizes in both species.
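The tolerance logic and the call-rate thresholds described above can be illustrated with a short Python sketch. It restates the definitions of t.het and t.hom given in the text and is not the dartR implementation of gl.sexlinkage. The function names, the genotype coding (0 or 2 homozygous, 1 heterozygous, None missing) and the choice to compute proportions over scored individuals only are assumptions. The presence/absence pattern of type (i) is not covered.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # assumed coding: 0 or 2 homozygous, 1 heterozygous, None missing


def het_fraction(calls: Sequence[Genotype]) -> Optional[float]:
    """Fraction of scored individuals that are heterozygous (None if nobody scored)."""
    scored = [g for g in calls if g is not None]
    if not scored:
        return None
    return sum(g == 1 for g in scored) / len(scored)


def fits_pattern(females: Sequence[Genotype], males: Sequence[Genotype],
                 t_het: float = 0.0, t_hom: float = 0.0, system: str = "xy") -> bool:
    """Test a locus against the heterozygosity pattern expected under XY or ZW."""
    if system == "xy":
        homogametic, heterogametic = females, males
    elif system == "zw":
        homogametic, heterogametic = males, females
    else:
        raise ValueError("system must be 'xy' or 'zw'")

    het_homogametic = het_fraction(homogametic)
    het_heterogametic = het_fraction(heterogametic)
    if het_homogametic is None or het_heterogametic is None:
        return False

    hom_heterogametic = 1.0 - het_heterogametic
    # t_het caps heterozygotes in the sex expected to be homozygous;
    # t_hom caps homozygotes in the sex expected to be heterozygous.
    return het_homogametic <= t_het and hom_heterogametic <= t_hom


def min_scored(n_individuals: int, percent: int) -> int:
    """Smallest count strictly greater than `percent`% of `n_individuals`."""
    return n_individuals * percent // 100 + 1
```

Under these assumptions, min_scored(78, 75) gives 59 and min_scored(25, 95) gives 24, matching the ">58" and ">23" male thresholds stated in the text (75% of 78 is 58.5; 95% of 25 is 23.75).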