3 A SET OF HIGHLY EXPRESSED UNKNOWN PROTEINS IN B. SUBTILIS
A recent proteomic study with B. subtilis revealed that essential
proteins are highly overrepresented in the proteome (Reuß et al., 2017).
Although they account for only 6% of all B. subtilis genes,
the essential genes use 57% of the translation capacity. This reflects
the importance of these proteins for the cell. On the other hand,
unknown and poorly studied proteins, which correspond to 25% of the
genes, only use 3% of the translation capacity under standard
conditions. It is very likely that many of the unknown proteins are only
needed under very special conditions. This special-purpose demand
combined with the low expression also explains why the function of such
proteins has never been identified. However, there is a set of
41 proteins of unknown function that are highly expressed under most
conditions (Table 1). Among these are the two essential proteins YlaN
and YneF. It is tempting to speculate that these highly expressed
unknown proteins are important for the physiology of B. subtilis even
under standard conditions. Interestingly, there are three pairs of
paralogous proteins on the list (YqhY/YloU, YabR/YugI, YtxH/YhaH).
Moreover, several of the proteins have been detected in a recent global
in vivo interaction study with B. subtilis (see Table 1). The
complete set of highly expressed unknown B. subtilis proteins can be
easily accessed in the database SubtiWiki
(http://www.subtiwiki.uni-goettingen.de/v4/category?id=SW.6.7;
Pedreira et al., 2022b; see Fig. 1B).
Of the 41 proteins, several have or may have RNA-binding activity, among
them the most strongly expressed and highly conserved unknown protein
YqeY. This protein contains a domain also present in the GatB subunit of
the glutamyl-tRNA amidotransferase, suggesting that YqeY might
also have tRNA amino acid amidase activity. The YtpR protein possesses
a tRNA-binding domain that is also present in the beta subunit of the
phenylalanine tRNA synthetase. YtpR physically interacts with GatB
(O’Reilly et al., 2023) suggesting that it might facilitate the
interaction between the Glu-loaded tRNA and the glutamyl-tRNA
amidotransferase GatABC. The paralogous YabR and YugI proteins possess
an RNA-binding S1 domain. Both proteins interact with the small subunit
of the ribosome (O’Reilly et al., 2023) indicating their involvement in
translation. This may also be the case for YrzB, which physically
interacts with multiple ribosomal proteins (O’Reilly et al., 2023). The
YlxR and KhpA proteins were found to bind RNA in Clostridioides
difficile (Lamm-Schmidt et al., 2021). Finally, the YlbN protein is
conserved in most bacteria and plant chloroplasts. The orthologous
chloroplast and E. coli proteins are required for the
accumulation of 23S rRNA (Yang et al., 2016) even though the molecular
activity of the protein remains unknown.
Several of the highly expressed unknown proteins are likely involved in
the control of metabolism. The YjlC protein is encoded in an operon with
the NADH dehydrogenase, and the two proteins interact physically
(O’Reilly et al., 2023) (see Fig. 1A). It is tempting to speculate that
YjlC somehow controls the activity of NADH dehydrogenase. Indeed, both
proteins are required for genetic transformation (Koo et al., 2017)
indicating that they perform a joint function. The YlaN protein
interacts with Fur, the key regulator of iron homeostasis (O’Reilly et al.,
2023), and the normally essential gene becomes dispensable at high iron
concentrations (Peters et al., 2016). Thus, YlaN may control iron
homeostasis via Fur. The paralogous YqhY and YloU proteins are encoded
in operons with genes required for complementary aspects of fatty acid
acquisition, either biosynthesis or fatty acid phosphorylation. The
yqhY gene is quasi-essential and the cells respond to its
inactivation with the accumulation of suppressor mutations in the
subunits of acetyl-CoA carboxylase (Tödter et al., 2017). Thus, these
two proteins may control different aspects of lipid biosynthesis. The
strength of initial protein-protein interaction information is
demonstrated by the YneR protein which interacts with the PdhA and PdhB
subunits of pyruvate dehydrogenase. Targeted experimental studies
revealed that YneR acts as an inhibitor of pyruvate dehydrogenase
activity. The prediction of the YneR-PdhA-PdhB complex structure using
the power of artificial intelligence indicated that YneR protrudes into
the substrate binding site of pyruvate dehydrogenase, thus suggesting a
mechanism for inhibition. Indeed, site-directed mutagenesis based on the
predicted complex structure verified this mechanism (O’Reilly et al.,
2023). This example shows the power of association analyses.
Another interesting group of highly expressed unknown proteins consists
of rather small proteins in the range of 47 to 54 amino acids that are
encoded in the 5’ regions of highly expressed genes and that are
associated with an RNA element that is transcribed in the same
orientation. Moreover, the occurrence of these proteins is limited to
B. subtilis and very close relatives in the genus Bacillus (see Table 1).
All these features are reminiscent of regulatory elements
that are involved in mechanisms similar to attenuation. Indeed, the
BmrB leader peptide of the bmrCD operon shares all of these properties with
respect to protein size, linkage to an RNA element, and occurrence only
in B. subtilis (Reilman et al., 2014).