2.4 | Ensembles of alternative conformations.
Following the success of deep-learning methods for single structures, it
is increasingly important to assess methods for predicting ensembles of
alternative conformations. While deep learning and other methods have
the potential to generate ensembles in some circumstances, these
abilities have never been rigorously tested. In CASP15, we made a first
attempt to include this category. For CASP purposes, we categorize
ensembles 8 as: (1) cases where a macromolecule populates multiple
conformations under the same environmental and chemical conditions
(including intrinsically disordered proteins or parts of proteins;
vibrational motion; local alternative conformations; ‘ghost’
conformations, which are present at a low level but are dominant in
other conditions; and folding intermediates); and (2) cases where a
macromolecule adopts different
conformations in response to environment or chemical change (ligand
binding; macromolecular complex formation; post-translational
modification; mutations; and crystal, pH and other environmental
changes). A third category of ensembles we consider is the set of
conformations consistent with the experimental data. This third category
is increasingly important, both because computed structures are now
commonly highly accurate and because lower-resolution data are included
in CASP.
Targets for alternative conformers do not require separate prediction
formats as they are 3D structures routinely processed in CASP, but they
do require a mechanism for submitting multiple models. In CASP15, this
need was handled in two different ways. In some cases, different
alternative conformations were treated as separate targets. In
particular,
- two targets were assigned for modeling an isocyanide hydratase,
represented by a wild-type structure (target T1110) and a single-point
mutant (target T1109) in which residue D183 was changed to alanine
(D183A),
- two targets (R1107 and R1108) were assigned for modeling human and
chimpanzee CPEB3 ribozymes, which differ by a single nucleotide, A30
(human) → G30 (chimpanzee),
- two pairs of targets (TR1189 and TR1190) were assigned for modeling
complexes of the metabolite repressor protein (RsmA) and a non-coding
RNA (RsmZ). Both complexes contain one RNA molecule but different
numbers of protein molecules (6 in TR1189 and 4 in TR1190),
- five targets (T1158v0-v4) were assigned for modeling a type IV ABC
transporter, where five different conformations have been observed,
depending on environmental conditions (ligand binding).
In other cases, participants were encouraged to submit multiple
conformers using the standard CASP format of five models per target.
This approach was used for
- three kinases (CASP targets T1195-T1197), each of which has two to
three sets of experimental coordinates representing different
conformations,
- the Holliday junction complexes (targets T1170, H1171, H1172), some
subunits of which are deformed due to contact with DNA and other
protein molecules in the complex,
- the RNA origami target R1138, which was solved in both a kinetically
trapped young state and a mature state,
- the SL5 domain of the RNA betacoronavirus structure BtCoV-HKU (CASP
target R1156), where one of the helices adopted multiple conformations
relative to the remainder of the structure.
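
The two submission mechanisms described above can be summarized as a small lookup table. The following is only an illustrative sketch, not part of the CASP infrastructure: the class `EnsembleCase`, the mechanism labels `separate_targets` and `five_model_slots`, and the helper `targets_by_mechanism` are hypothetical names introduced here, and the ranges T1158v0-v4 and T1195-T1197 are assumed to expand to consecutive target identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnsembleCase:
    """One ensemble system and how its conformers were collected (hypothetical type)."""
    system: str
    targets: tuple[str, ...]
    mechanism: str  # "separate_targets" or "five_model_slots" (labels assumed here)


# Target identifiers as listed in the text; ranges are expanded under the
# assumption that they denote consecutive identifiers.
CASES = [
    EnsembleCase("isocyanide hydratase, wild type and D183A mutant",
                 ("T1110", "T1109"), "separate_targets"),
    EnsembleCase("CPEB3 ribozymes, human and chimpanzee",
                 ("R1107", "R1108"), "separate_targets"),
    EnsembleCase("RsmA-RsmZ complexes",
                 ("TR1189", "TR1190"), "separate_targets"),
    EnsembleCase("type IV ABC transporter",
                 tuple(f"T1158v{i}" for i in range(5)), "separate_targets"),
    EnsembleCase("kinases",
                 ("T1195", "T1196", "T1197"), "five_model_slots"),
    EnsembleCase("Holliday junction complexes",
                 ("T1170", "H1171", "H1172"), "five_model_slots"),
    EnsembleCase("RNA origami",
                 ("R1138",), "five_model_slots"),
    EnsembleCase("SL5 domain, BtCoV-HKU",
                 ("R1156",), "five_model_slots"),
]


def targets_by_mechanism(cases):
    """Group target identifiers by the mechanism used to collect conformers."""
    grouped = {}
    for case in cases:
        grouped.setdefault(case.mechanism, []).extend(case.targets)
    return grouped


if __name__ == "__main__":
    for mechanism, targets in targets_by_mechanism(CASES).items():
        print(f"{mechanism}: {', '.join(targets)}")
```

Grouping by mechanism in this way makes the practical difference explicit: in the first group each conformation has its own target identifier, whereas in the second group the alternative conformations have to be distributed across the model slots of a single target.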