Intradomain insertions
Loops are present within domains as well, usually on the surface between
two secondary structure elements. To guide protein engineers seeking
to insert protease cut sites, purification tags, or domains, loops
known to tolerate significant changes to their composition and/or
length were catalogued; thus, repetitive elements that alter secondary
structures or undefined structures were not considered (Figure 4,
Table II).
AT region [including FSD, n=949]: Most of the observed insertions
(19/34) were before the first β-strand or after the last β-strand of AT,
in the FSD [especially α1-α2 (8) and α3-β3 (10), but also β3-β4 (5)
and the loop following η4 (4)]. Within the AT domain, the loops that
tolerated the most insertions are α10-β7 (3) and α15-β12 (3).
DH [n=599]: The longest loop, β9-β10, is also the most frequently
inserted (16 instances, including a GT dipeptide repeated 20 times in
the cyclizidine PKS). The next most frequently inserted loop is η5-β14
(6), adjacent to the active site.
KR region of β-modules [including the dimerization element (DE),
n=279]: None of the 3-helix DEs, present in 77% of the β-modules,
contains insertions [27]. Although KRs and KRc are similarly sized,
KRs contains significantly more insertions (19 vs. 2), with β3-α5 (5)
and η1-β6 (5) being the most frequently inserted loops.
KR from γ- and δ-modules [n=599]: Even though the region upstream of
α2 in KRs was not analyzed owing to low sequence conservation, KRs
contains more insertions than KRc (18 vs. 4). The most frequently
inserted loop of KRc is α6-β6 (2).
ER [n=158]: All 4 of the observed insertions are located in the
N-terminal portion of the substrate-binding subdomain.
ACP [n=949]: No insertions were observed within this 100-residue,
helical domain. This analysis includes α1 (often referred to as “helix
0”), which is rarely absent [26].
DDs [n=388]: Most of the docking domain motifs, CDD and NDD, could be
grouped into Class 1a (n=226), Class 1b (n=63), or Class 2 (n=70)
(Supplementary Figures 9-11) [7]. No insertions were observed in the
Class 2 docking motifs. The majority of the insertions in Class 1a and
Class 1b CDDs were immediately upstream of the terminal helix (8/12
and 4/6, respectively). The only NDDs that possess insertions at their
upstream end belong to Class 1a (3).
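
The proportions behind "most" and "the majority" in the docking domain paragraph can be worked out from the quoted counts. The following is a minimal sketch, not part of the original analysis; it assumes that n=388 is the denominator for the three class counts, and the variable and function names are hypothetical.

```python
# Counts quoted in the docking domain (DD) paragraph.
class_counts = {"Class 1a": 226, "Class 1b": 63, "Class 2": 70}

# Assumption: the n=388 given for DDs is the total number of motifs
# against which the three class counts should be compared.
total_dds = 388


def fraction(part, whole):
    """Return part/whole as a float."""
    return part / whole


# Share of docking motifs that fall into one of the three classes.
classified = sum(class_counts.values())
classified_share = fraction(classified, total_dds)

# CDD insertions immediately upstream of the terminal helix,
# given as (insertions at that position, all insertions in the class).
upstream_of_terminal_helix = {"Class 1a": (8, 12), "Class 1b": (4, 6)}
upstream_shares = {
    name: fraction(k, n) for name, (k, n) in upstream_of_terminal_helix.items()
}

print(f"classified motifs: {classified}/{total_dds} = {classified_share:.1%}")
for name, share in upstream_shares.items():
    k, n = upstream_of_terminal_helix[name]
    print(f"{name} CDD insertions upstream of terminal helix: {k}/{n} = {share:.1%}")
```

Under that assumption, 359 of the 388 motifs are classified, and the two CDD classes show the same two-thirds proportion of insertions upstream of the terminal helix.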
KS [n=949]: Two insertions were observed, both in β13-β14, the most
downstream loop.