3.1 | Genome assembly, annotation and evaluation
A total of 29 Gb (110 X) of PacBio long reads were generated (Table S1).
The genome size was estimated to be 266.8 Mb based on k-mer depth
distribution analysis (Table S2 and Fig. S1), and the size of the
assembled B. schroederi genome reached 253.61 Mb, accounting for
95.05% of the estimated genome. This genome contained 75 scaffolds, 21
of which were superscaffolds ligated using 105 Gb (~386 X) of Hi-C
sequencing data (Fig. 1a; Fig. S2). The total length of these 21
superscaffolds reached ~250.70 Mb, accounting for
98.86% of the whole genome (Table S3). The final scaffold N50 was 12.32
Mb (Table 1), a substantial improvement over the previously published
genome (Y. Hu et al., 2020). The GC-depth distribution (Fig. S3a and S3b)
further showed that most genomic regions have a GC content narrowly
centered around 37%, similar to that of three other roundworms
(Ascaris suum, Parascaris univalens and Toxocara canis; Table 1)
(Jex et al., 2011; Zhu et al., 2015). BUSCO scores against
nematode and eukaryote databases were 91.7% and 92.8%, respectively
(Fig. S4), reflecting the highest genome completeness among published
roundworm genomes (Table 1). A total of 19,262 protein-coding genes were
predicted using ab initio, homology-based and RNA sequencing-aided methods
(see Methods) and were functionally annotated against the KEGG, COG,
TrEMBL, GO, Swiss-Prot and InterPro databases (Fig. S6 and Fig. S7).
The average length of coding sequences (CDS) was 1,052 bp
with an average of 6.87 exons per gene, which is similar to that of
other related roundworms (Table S4). To evaluate the completeness of the
predicted protein-coding genes, we compared the length distributions of
mRNA, CDS, exons and introns in B. schroederi with those in five
other nematodes (Fig. S9 and Fig. S10).
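
The summary statistics reported above (scaffold N50, assembled fraction of the estimated genome, and superscaffold fraction of the assembly) follow from simple arithmetic on scaffold lengths. The sketch below is illustrative only and is not the pipeline used in this study: the function names `n50` and `percent_of` and the toy scaffold lengths are hypothetical, and only the sizes 253.61 Mb, 266.8 Mb and 250.70 Mb are taken from the text.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length of the shortest scaffold in the smallest set of
    longest scaffolds that together cover at least half of the assembly.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def percent_of(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Hypothetical scaffold lengths (Mb), for illustration only.
toy_scaffolds = [10, 8, 6, 4, 2]
toy_n50 = n50(toy_scaffolds)

# Sizes reported in the text (Mb).
assembled_pct = percent_of(253.61, 266.8)       # assembly vs. k-mer estimate
superscaffold_pct = percent_of(250.70, 253.61)  # superscaffolds vs. assembly
```

For the toy lengths, the two longest scaffolds (10 + 8 = 18) already exceed half of the total (15), so the N50 is 8. The two percentages computed from the rounded sizes agree with the reported 95.05% and 98.86% to within rounding of the published inputs.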
Total repeats (DNA transposons and retrotransposons) accounted for
12.66% of the genome (Tables S6-S8 and Fig. S11). Repeat content varies
widely among published nematode genomes (from 1% to 48%)
(A. Coghlan, Tyagi, Cotton, Holroyd, & Berriman,
2018; Schiffer, Kroiher, Kraus, Koutsovoulos, & Schierenberg, 2013).
Transposable elements (TEs) account for 10.24% of the B. schroederi
genome (Table S8), while TEs constitute 4.4% and 13.5% of the genomes
of A. suum (Jex et al., 2011) and T. canis (Zhu et al., 2015),
respectively.
We identified a significant expansion of DNA transposons in
B. schroederi compared to T. canis (Zhu et al., 2015) and A. suum
(Jex et al., 2011) (Supplementary Data 1). There
are at least 64 DNA transposon families, of which CMC-EnSpm, DNA and
MULE-MuDR dominate the genome. We identified 17 long terminal repeat
(LTR) retrotransposon families and 41 non-LTR retrotransposon families
(25 LINE and 16 SINE). Pao and Gypsy are the predominant LTR families,
and CR1, RTE-RTE and L2 are the predominant non-LTR families. The number
and size of the retrotransposon families are similar to those of other
related roundworms (Ghedin et al., 2007; Jex et al., 2011; Zhu et al., 2015).
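
Repeat fractions such as those above are generally obtained by counting annotated bases per repeat class and dividing by the assembly length, taking care not to count overlapping annotations twice. The following is a minimal sketch of that calculation and not the method used in this study. It assumes annotations are available as half-open `(start, end, repeat_class)` intervals on a single sequence; the function names and the toy annotations are hypothetical.

```python
from collections import defaultdict


def covered_bases(intervals):
    """Count bases covered by half-open (start, end) intervals, merging overlaps."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Close the previous merged block, if any, and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def repeat_percentages(annotations, genome_length):
    """Return percent of the genome covered per repeat class and in total."""
    by_class = defaultdict(list)
    for start, end, repeat_class in annotations:
        by_class[repeat_class].append((start, end))
    result = {
        repeat_class: 100.0 * covered_bases(intervals) / genome_length
        for repeat_class, intervals in by_class.items()
    }
    # The total merges across classes, so it can be less than the sum of classes.
    all_intervals = [(start, end) for start, end, _ in annotations]
    result["total"] = 100.0 * covered_bases(all_intervals) / genome_length
    return result


# Hypothetical annotations on a 200 bp toy sequence, for illustration only.
toy_annotations = [
    (0, 10, "DNA"),
    (5, 15, "DNA"),
    (20, 30, "LTR"),
    (25, 40, "LINE"),
]
toy_result = repeat_percentages(toy_annotations, genome_length=200)
```

In the toy data the LTR and LINE annotations overlap, so the per-class percentages sum to more than the merged total. The same effect is one reason a total repeat fraction need not equal the sum of its class-level fractions.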