3.3 | Relationship between CRISPR distribution and
MLST
Based on MLST results, 243 different sequence types (STs) were
identified in 427 K. variicola strains, but 60 strains were not
assigned to a defined ST because of limited information in the PubMLST
database. The most prevalent ST was ST20 (18/427, 4.22%), followed by
ST60 (14/427, 3.28%) and ST10 (11/427, 2.58%). Further analysis found
that the distributions of the type I-E and type I-E* systems were strongly
associated with MLST, whereas the type IV-A system was scattered across
all genetic lineages (Figure 3). For example, when one strain within an ST
harbored a type I-E or I-E* system, all strains within the same ST
were type I-E-positive (e.g., ST20, ST92, ST108, ST137) or I-E*-positive
(e.g., ST188). This pattern was not observed in strains
containing the type IV-A system.
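
The ST-level pattern described above can be checked with a simple tally. The following is a minimal sketch, not the analysis pipeline used in this study: the strain records, the `strains` list, and the `st_consistency` helper are hypothetical and serve only to illustrate how an ST can be classified as uniformly positive, uniformly negative, or mixed for a given CRISPR-Cas type.

```python
from collections import defaultdict

# Hypothetical records: (strain id, sequence type, CRISPR-Cas types detected).
# Illustrative values only; these are not data from the study.
strains = [
    ("s1", "ST20", {"I-E"}),
    ("s2", "ST20", {"I-E", "IV-A"}),
    ("s3", "ST20", {"I-E"}),
    ("s4", "ST188", {"I-E*"}),
    ("s5", "ST188", {"I-E*"}),
    ("s6", "ST60", {"IV-A"}),
    ("s7", "ST60", set()),
]


def st_consistency(records, system):
    """Label each ST as 'all', 'none', or 'mixed' for one system type."""
    counts = defaultdict(lambda: [0, 0])  # ST -> [positive strains, total strains]
    for _, st, systems in records:
        counts[st][1] += 1
        if system in systems:
            counts[st][0] += 1

    labels = {}
    for st, (positive, total) in counts.items():
        if positive == 0:
            labels[st] = "none"
        elif positive == total:
            labels[st] = "all"
        else:
            labels[st] = "mixed"
    return labels


for system in ("I-E", "I-E*", "IV-A"):
    print(system, st_consistency(strains, system))
```

Under this scheme, a system whose distribution follows MLST yields only "all" or "none" labels, whereas a system scattered across lineages yields "mixed" STs.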
To further clarify the relationship between CRISPR evolution and MLST, a
hierarchical clustering analysis was performed based on the presence of
spacers. As with the system distribution, the spacer contents of the
type I systems were strongly associated with MLST (Figure 4A and 4B). For
example, all ST20 strains harbored relatively conserved type I-E spacer
contents, and type I-E* spacers also clustered by ST (e.g.,
ST188). In contrast, type IV-A spacer contents did not follow MLST.
As shown in Figure 4C, the type IV-A spacer compositions of ST271, ST115,
and ST148 were highly similar even though these STs were phylogenetically
unrelated.
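
Hierarchical clustering on spacer presence can be illustrated with a small example. This is a minimal sketch under stated assumptions: the text does not specify the distance metric or linkage method, so Jaccard distance and average linkage are assumed here, and the strain names and spacer identifiers in `spacers` are hypothetical.

```python
from itertools import combinations

# Hypothetical spacer sets per strain (presence only); not data from the study.
spacers = {
    "A_ST20": {"sp1", "sp2", "sp3"},
    "B_ST20": {"sp1", "sp2", "sp3", "sp4"},
    "C_ST188": {"sp7", "sp8"},
    "D_ST188": {"sp7", "sp8", "sp9"},
}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(profiles):
    """Agglomerative clustering with average linkage (assumed, see lead-in).

    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(name,) for name in sorted(profiles)]
    merges = []

    def dist(c1, c2):
        # Mean pairwise Jaccard distance between members of the two clusters.
        pairs = [(x, y) for x in c1 for y in c2]
        total = sum(jaccard_distance(profiles[x], profiles[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Merge the closest pair of clusters at each step.
        a, b = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges


merges = average_linkage(spacers)
for a, b, d in merges:
    print(a, b, round(d, 3))
```

In this toy input, strains of the same ST share most spacers and therefore merge first at small distances, which is the pattern expected when spacer content follows MLST. Strains from unrelated STs merging at small distances would instead resemble the type IV-A case.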