3.4 | Analysis of spacer sequences and homology to foreign DNA
Spacers are derived from exogenous genetic elements and record the host's past encounters with invading DNA. A total of 536 type I-E, 168 type I-E*, and 128 type IV-A spacers were present in 111 CRISPR/Cas systems. The homology search revealed that more than one-third of the spacers (39.30%, 327/832) were homologous to plasmids or phages. Specifically, 173 type I-E spacers (32.28%, 173/536), 59 type I-E* spacers (35.12%, 59/168), and 95 type IV-A spacers (74.22%, 95/128) were homologous to plasmids or phages (Figure 5A). In some cases, a single spacer sequence targeted both a plasmid and a phage. In addition, type I-E and IV-A systems displayed a target bias towards plasmids, whereas type I-E* systems preferentially targeted phages (Figure 5A).
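The proportions reported above can be rechecked directly from the spacer counts. The following is a minimal sketch in Python, not part of the original analysis pipeline; the variable and function names (`counts`, `percent`, `per_type`) are illustrative assumptions, and only the counts themselves are taken from the text.

```python
# Spacer counts reported in the text, as (homologous to plasmids/phages, total).
# The names used here are illustrative and do not come from the original study.
counts = {
    "I-E": (173, 536),
    "I-E*": (59, 168),
    "IV-A": (95, 128),
}


def percent(hits, total):
    """Return hits/total as a percentage rounded to two decimal places."""
    return round(100 * hits / total, 2)


# Per-type proportion of spacers with homology to plasmids or phages.
per_type = {name: percent(hits, total) for name, (hits, total) in counts.items()}

# Pooled proportion across all three system types.
total_hits = sum(hits for hits, _ in counts.values())
total_spacers = sum(total for _, total in counts.values())
overall = percent(total_hits, total_spacers)

for name, pct in per_type.items():
    print(f"type {name}: {pct:.2f}%")
print(f"overall: {total_hits}/{total_spacers} = {overall:.2f}%")
```

The per-type counts sum to the pooled figures (327 of 832 spacers), so the overall percentage follows from the three type-specific values.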
The functions of the proteins targeted by spacers were further investigated through BLASTp analyses against the NCBI database. When hypothetical proteins and unknown and non-coding regions were excluded, conjugative transfer proteins, such as TrbI, TrbD, and TrbH, were the most commonly targeted (Figure 5B). Notably, spacers from K. variicola targeted MGEs from multiple species, including K. pneumoniae, Escherichia coli, Salmonella enterica, and other species (Figure 6 and Tables S6-S8). Moreover, 103 strains had at least one spacer targeting MGEs from K. pneumoniae.