Database of curated H8-L-H9 motif sequences from bacterial GluRS
Is the unique D-helix-interacting H8-L-H9 motif in GluRS a “protein” signature of tRNAGlx discrimination that complements the GluRS-discriminatory D-helix signature on “tRNAGln”? To address this question, we first classified the H8-L-H9 motifs of bacterial GluRSs. We then sought a correlation between the different classes of H8-L-H9 motif and the intrinsic tRNAGln-discriminatory character of the GluRSs to which they belong. Analyzing GluRS sequences with a focus on the H8-L-H9 motif requires a comprehensive, curated database of bacterial GluRS sequences. We had earlier curated such a database (4), based on the presence or absence of a second copy of GluRS, the presence of GlnRS, and the presence of gatCAB in each bacterial genome.
The presence of GlnRS in a genome signifies that the corresponding GluRS in that genome is tRNAGln-discriminatory (D-GluRS). The GluRS is further designated as D(-) if the genome lacks gatCAB (in which case the GluRS must be strictly tRNAGln-discriminatory, since misacylated Glu-tRNAGln cannot be converted to Gln-tRNAGln) or as D(+) if the genome contains gatCAB (in which case the GluRS need not be strictly tRNAGln-discriminatory, since misacylated Glu-tRNAGln can still be converted to Gln-tRNAGln by gatCAB). The absence of GlnRS from the genome (in which case the genome always contains gatCAB), together with the presence of a single copy of GluRS, signifies that the genomic GluRS is tRNAGln-non-discriminatory (ND-GluRS). When the genome lacks GlnRS but contains twin copies of GluRS, the GluRSs are designated as T1-GluRS and T2-GluRS. To summarize, D(-)-GluRS glutamylates only tRNAGlu and is strictly discriminatory against tRNAGln, D(+)-GluRS glutamylates tRNAGlu and possibly discriminates against tRNAGln, and ND-GluRS glutamylates both tRNAGlu and tRNAGln. Experiments performed on a few twin GluRSs (10,11) suggest that T1-GluRS glutamylates tRNAGlu and discriminates against tRNAGln, while T2-GluRS possibly glutamylates tRNAGln and not tRNAGlu.
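The nomenclature rules above can be written as a small decision procedure. The following Python sketch is illustrative only and is not the annotation pipeline used to build the database; the function name `classify_glurs` and its three inputs (GluRS copy number, presence of GlnRS, presence of gatCAB) are hypothetical names chosen for this example. Combinations that the text does not describe, such as a genome lacking both GlnRS and gatCAB, are rejected rather than guessed.

```python
from typing import Tuple


def classify_glurs(n_glurs: int, has_glnrs: bool, has_gatcab: bool) -> Tuple[str, ...]:
    """Return the GluRS class label(s) implied by a genome's gene content.

    Hypothetical helper: the inputs are assumed to come from a prior
    whole-genome search for GluRS, GlnRS and gatCAB.
    """
    if n_glurs < 1:
        raise ValueError("genome must encode at least one GluRS")

    if has_glnrs:
        # GlnRS present: the GluRS is discriminatory (D-GluRS).
        # Without gatCAB it must be strictly discriminatory, D(-);
        # with gatCAB it need not be, D(+).
        return ("D(+)-GluRS",) if has_gatcab else ("D(-)-GluRS",)

    # GlnRS absent: the text states that gatCAB is always present.
    if not has_gatcab:
        raise ValueError("no GlnRS and no gatCAB: case not covered by the scheme")

    if n_glurs == 1:
        # Single GluRS, no GlnRS: non-discriminatory.
        return ("ND-GluRS",)
    if n_glurs == 2:
        # Twin GluRSs, no GlnRS. Gene content alone does not say which
        # copy is T1 and which is T2; both labels are returned.
        return ("T1-GluRS", "T2-GluRS")

    raise ValueError("more than two GluRS copies: case not covered by the scheme")
```

For example, `classify_glurs(1, True, False)` yields `("D(-)-GluRS",)` and `classify_glurs(2, False, True)` yields `("T1-GluRS", "T2-GluRS")`. Deciding which twin copy is T1 and which is T2 requires sequence-level or experimental evidence and is outside the scope of this sketch.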
Following this nomenclature scheme, the complete genomic sequences of 433 bacterial species from the KEGG database (www.genome.jp/kegg) were analyzed, and their GluRSs were annotated as D(-)-GluRS, D(+)-GluRS, ND-GluRS, T1-GluRS or T2-GluRS. Table S1 shows the sequence alignment of the GluRS H8-L-H9 motifs for all bacterial GluRS sequences used in this work, annotated with the organism name (the three-letter code used in the KEGG database) and the tRNAGlx-discriminatory status, as inferred from whole-genome analysis.