ABSTRACT
Here we present an annotated, chromosome-anchored genome assembly for
Lake Trout (Salvelinus namaycush), a highly diverse salmonid
species of notable conservation concern and an excellent model for
research on adaptation and speciation. We leveraged Pacific Biosciences
long-read sequencing, paired-end Illumina sequencing, proximity ligation
(Hi-C), and a previously published linkage map to produce a highly
contiguous assembly composed of 7,378 contigs (contig N50 = 1.8 Mb)
assigned to 4,120 scaffolds (scaffold N50 = 44.975 Mb). In total, 84.7% of the
genome was assigned to 42 chromosome-sized scaffolds, and 93.2% of
Benchmarking Universal Single-Copy Orthologs were recovered, putting
this assembly on par with the best currently available salmonid genomes.
Estimates of genome size based on k-mer frequency analysis were highly
similar to the total size of the finished genome, suggesting that the
entirety of the genome was recovered. A mitochondrial genome (mitome)
assembly was also produced. Self-vs-self synteny analysis allowed us to identify homeologs
resulting from the salmonid-specific autotetraploidization event (Ss4R), and
alignment with three other salmonid species allowed us to identify
homologous chromosomes in those species. We also generated multiple
resources useful for future genomic research on Lake Trout, including a
repeat library and a sex-averaged recombination map. A novel RNA
sequencing dataset was also used to produce a publicly available set of
gene annotations with the National Center for Biotechnology Information
Eukaryotic Genome Annotation Pipeline. Potential applications of these
resources to population genetics and the conservation of native
populations are discussed.