Figure Legends
Figure 1 CRISPR/Cas systems in K. variicola. (A) The pie chart shows the proportion of the 105 strains with and without CRISPR/Cas systems. The stacked plot shows the number of each CRISPR/Cas subtype. (B) Schematic diagram of each CRISPR/Cas subtype. Different colors represent each CRISPR/Cas system and its neighboring genes. The K. variicola type I-E system is located between the cysH and iap genes, and the cas genes are shown in blue. The type I-E* system is located downstream of the ABC transporter system, and the cas genes are shown in green. The type IV-A system is adjacent to umuD and umuC, and the cas genes are shown in orange. The repeats and spacers of the CRISPR arrays are represented by white diamonds and black rectangles, respectively. All genes are drawn to scale.
Figure 2 (A) Visualization and nucleotide alignment of the repeat sequences in each CRISPR/Cas subtype. (B) PAM prediction for each CRISPR/Cas subtype. (C) Secondary structure prediction and MFE value for each CRISPR/Cas subtype. Nucleotide bases are colored by base-pair probability. (D) Repeat length in the three types of CRISPR/Cas systems. (E) Spacer number in the three types of CRISPR/Cas systems. The box plot represents the number of spacers detected in each subtype. *: p < 0.05, **: p < 0.01, ***: p < 0.001, ns: not significant.
Figure 3 Phylogenetic tree based on MLST. From inside to outside, the colored rings indicate the type I-E, type I-E*, and type IV-A CRISPR/Cas systems, respectively. A short band indicates that the cas genes in the system are incomplete, and a long band indicates that the CRISPR/Cas system is complete.
Figure 4 Hierarchical clustering analysis of spacer arrangements in type I-E (A), type I-E* (B), and type IV-A (C) systems. The colors correspond to different STs, and the STs are marked on the left. The blue and gray squares denote the presence and absence of spacers, respectively.
Figure 5 (A) Comparison of detected spacer-protospacer matches for type I-E, type I-E*, and type IV-A systems. (B) Spacers target specific exogenous genes. Nearly one-quarter of K. variicola spacers match genes with informative annotations, and the proportion of each matched gene is shown.
Figure 6 The network of K. variicola and MGEs from other species based on protospacer-spacer matches in type I-E (A), type I-E* (B), and type IV-A (C) systems. Nodes indicate individual spacers, and edges represent CRISPR spacer targeting based on spacer-protospacer matches.
Figure S1 Comparison of cas genes between K. variicola strain WCHKP19 and 9 K. variicola strains. The genes (cas3, cse1, cse2, cas7, and cas5) responsible for interference are shown in blue. The gene (cas6) involved in expression is highlighted in red. The adaptation genes (cas1 and cas2) are depicted in orange. The repeats and spacers are represented by white diamonds and black rectangles, respectively. All genes are drawn to scale.