3.1 | Genome assembly, annotation and evaluation
A total of 29 Gb (110X) of PacBio long reads were generated (Table S1). The genome size was estimated to be 266.8 Mb based on k-mer depth distribution analysis (Table S2 and Fig. S1), and the assembled B. schroederi genome reached 253.61 Mb, accounting for 95.05% of the estimated genome size. The assembly contained 75 scaffolds, 21 of which were superscaffolds joined using 105 Gb (~386X) of Hi-C sequencing data (Fig. 1a; Fig. S2). The total length of these 21 superscaffolds was ~250.70 Mb, accounting for 98.86% of the assembly (Table S3). The final scaffold N50 was 12.32 Mb (Table 1), substantially higher than that of the previously published genome (Y. Hu et al., 2020). The GC-depth distribution (Fig. S3a and S3b) further showed that the GC content of most genomic regions is narrowly centered around 37%, similar to that of three other roundworms (Ascaris suum, Parascaris univalens and Toxocara canis; Table 1) (Jex et al., 2011; Zhu et al., 2015). BUSCO scores against the nematode and eukaryote databases were 91.7% and 92.8%, respectively (Fig. S4), reflecting the highest genome completeness among published roundworm genomes (Table 1). A total of 19,262 protein-coding genes were predicted using ab initio, homology-based and RNA sequencing-aided methods (see Methods). The predicted genes were functionally annotated against the KEGG, COG, TrEMBL, GO, Swissprot and InterPro databases (Fig. S6 and Fig. S7). The average coding sequence (CDS) length was 1,052 bp, with an average of 6.87 exons per gene, similar to the values for related roundworms (Table S4). To evaluate the completeness of the predicted protein-coding genes, we compared the length distributions of mRNA, CDS, exons and introns in B. schroederi with those in five other nematodes (Fig. S9 and Fig. S10).
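For readers unfamiliar with the contiguity and coverage metrics quoted above, the following is a minimal sketch of how a scaffold N50 and a percentage of a reference size can be computed. It is not the pipeline used in this study. The function names `n50` and `percent_of` and the toy scaffold lengths are hypothetical, and the only values taken from the text are the rounded sizes in Mb, so recomputed percentages may differ from the reported ones in the last decimal place.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def percent_of(part, whole):
    """Express `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Hypothetical toy scaffold lengths (not data from this study).
toy_lengths = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_lengths)

# Rounded sizes in Mb as quoted in the text.
assembled_vs_estimate = percent_of(253.61, 266.8)
superscaffolds_vs_assembly = percent_of(250.70, 253.61)
```

On the toy input, the two longest scaffolds (80 and 70) already hold more than half of the 290 total bases, so the N50 is 70.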
Total repeats (DNA transposons and RNA transposons) accounted for 12.66% of the genome (Tables S6-S8 and Fig. S11). Repeat content varies widely among published nematode genomes, from 1% to 48% (A. Coghlan, Tyagi, Cotton, Holroyd, & Berriman, 2018; Schiffer, Kroiher, Kraus, Koutsovoulos, & Schierenberg, 2013). Transposable elements (TEs) account for 10.24% of the B. schroederi genome (Table S8), compared with 4.4% and 13.5% of the A. suum (Jex et al., 2011) and T. canis (Zhu et al., 2015) genomes, respectively. We identified a significant expansion of DNA transposons in B. schroederi compared with T. canis (Zhu et al., 2015) and A. suum (Jex et al., 2011) (Supplementary Data 1). There are at least 64 DNA transposon families, of which CMC-EnSpm, DNA and MULE-MuDR are the most abundant. We identified 17 long terminal repeat (LTR) retrotransposon families and 41 non-LTR retrotransposon families (25 LINE and 16 SINE). Pao and Gypsy are the predominant LTR families, and CR1, RTE-RTE and L2 are the predominant non-LTR families. The number and size of the retrotransposon families are similar to those of related roundworms (Ghedin et al., 2007; Jex et al., 2011; Zhu et al., 2015).
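Repeat proportions such as those above are generally expressed as the share of assembled bases covered by annotated repeats. The sketch below illustrates that calculation under stated assumptions: annotations are given as half-open (start, end) intervals on a single sequence, overlapping annotations are merged so that no base is counted twice, and the names `covered_bases` and `repeat_percent` as well as the toy intervals are hypothetical. It is an illustration only and does not reproduce the annotation workflow described in the Methods.

```python
def covered_bases(intervals):
    """Count bases covered by at least one half-open (start, end) interval,
    merging overlapping or adjacent intervals so no base is counted twice."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Previous merged block is finished; add it and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Interval overlaps or touches the current block; extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def repeat_percent(intervals, sequence_length):
    """Percentage of a sequence covered by the given repeat intervals."""
    return 100.0 * covered_bases(intervals) / sequence_length


# Hypothetical toy annotations on a 100-bp sequence (not data from this study).
toy_repeats = [(0, 10), (5, 15), (20, 30)]
toy_percent = repeat_percent(toy_repeats, 100)
```

In the toy case the first two intervals merge into one 15-bp block, which together with the separate 10-bp block covers 25 of 100 bases.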