Repetitive sequences found in the assembled genome
RepeatMasker (Tarailo-Graovac & Chen, 2009) estimated that repeat elements occupy 43.5% (196,045,652 bp) of the assembled genome (Table 1). Apart from ‘unclassified’ repeats, LINEs form the largest superfamily of repetitive sequences in S. ricini (Figs. 2A, B). Interestingly, although the total length of LINEs and their proportion of all repetitive sequences in the genome were similar between S. ricini and B. mori (Figs. 2A, B), the family composition of the LINEs differed. Table S8 shows the copy number of each LINE family in the S. ricini and B. mori genomes. For example, whereas CR1-Zenon was the largest LINE family in S. ricini, the largest family in B. mori was Jockey. Thus, although both S. ricini and B. mori have larger amounts of repetitive sequences in their genomes than other lepidopteran species do (Fig. 2A), the expansion of repetitive sequences seems to have occurred in parallel and independently on their respective phylogenetic branches.
Another noteworthy feature was that the S. ricini genome contains a remarkably small amount of SINEs (Fig. 2A). Whereas the B. mori genome showed a large proportion of SINEs (19.4% of all repetitive sequences), SINEs in the S. ricini genome occupied only 0.0588%. This finding also supports the hypothesis of parallel and independent expansion of repetitive sequences.
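Per-class totals and per-family copy numbers of the kind compared above can be tallied from RepeatMasker's tabular .out file. The following is a minimal sketch, not the procedure used in this study: the function name tally_repeats and the example rows (scaffold names, repeat names, coordinates) are hypothetical, and the column positions assume the standard .out layout (query begin and end in columns 6 and 7, class/family in column 11). It counts every annotated fragment as one copy and does not resolve overlapping annotations, so its totals may differ from RepeatMasker's own summary table.

```python
from collections import defaultdict


def tally_repeats(out_lines):
    """Sum annotated length per repeat class and count hits per class/family."""
    bp_by_class = defaultdict(int)
    copies_by_family = defaultdict(int)
    for line in out_lines:
        fields = line.split()
        # Skip blank lines and header lines (their first field is not an integer score).
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        begin, end = int(fields[5]), int(fields[6])   # position in query, 1-based inclusive
        class_family = fields[10]                     # e.g. "LINE/CR1-Zenon"
        repeat_class = class_family.split("/")[0]     # e.g. "LINE"
        bp_by_class[repeat_class] += end - begin + 1
        copies_by_family[class_family] += 1           # each fragment counted as one copy
    return dict(bp_by_class), dict(copies_by_family)


# Made-up rows in .out layout, for illustration only (not data from this study).
example = [
    "   SW   perc perc perc  query     position in query    matching  repeat        position in repeat",
    "score   div. del. ins.  sequence  begin end   (left)   repeat    class/family  begin end (left)  ID",
    "",
    " 1200 10.0 0.5 0.0 scaf1 101 600 (9400) + CR1-1_X LINE/CR1-Zenon 1 500 (0) 1",
    "  800 12.0 0.0 0.0 scaf1 1001 1300 (8700) C Jockey-1_X LINE/Jockey (0) 300 1 2",
    "  300 15.0 0.0 0.0 scaf2 51 150 (900) + SINE-1_X SINE/tRNA 1 100 (0) 3",
]

bp_by_class, copies_by_family = tally_repeats(example)
total_bp = sum(bp_by_class.values())
for repeat_class, bp in sorted(bp_by_class.items()):
    print(f"{repeat_class}: {bp} bp ({100 * bp / total_bp:.1f}% of annotated repeats)")
```

Dividing each class total by the sum over all classes gives a proportion of all repetitive sequences, whereas dividing by the assembly length gives a proportion of the genome; the two denominators should be kept distinct when comparing species.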