Introduction
In recent years, progress in three-dimensional (3D) protein structure
prediction has been impressive1. The application of deep
learning-based methods now allows modeling of structures for most of the
individual proteins2–4. However, the majority of
proteins do not function in isolation. They usually perform their
functions by interacting with other proteins and assembling into stable
or transient protein complexes. Therefore, if we wish to have a detailed
understanding of how proteins function, the knowledge of the structures
of individual proteins is not sufficient. We need to know the structures
of corresponding protein complexes.
The number of possible binary protein-protein interactions is much
higher than the number of proteins encoded in genomes, and only a small
fraction of these interactions has been discovered
experimentally5, 6. Similarly, the number of different
structural types of protein complexes is predicted to be much higher
than the number of protein folds7, 8. Therefore, the
structural modeling of protein-protein interactions represents a more
complex problem than the prediction of structures for individual
proteins.
Currently, two main methods are available to model the structures of
protein complexes. Template-based modeling is based on the observation
that homologous proteins often interact in the same
way9. Thus a known structure of a protein complex can
serve as a template for modeling homologous protein complexes. If there
are no templates, protein-protein docking methods can be
used5. Docking methods aim to find how proteins
interact with each other starting from known structures of individual
subunits that can be either solved experimentally or obtained by
computational modeling.
The field of protein structure prediction is monitored in the Critical
Assessment of Structure Prediction (CASP) experiments that explore every
aspect of protein structure modeling1. The Critical
Assessment of PRedicted Interactions (CAPRI) experiments are devoted to
the prediction of the structures for diverse protein
complexes10. Both CASP and CAPRI are based on blind
testing. The participants are given the sequences of proteins whose
structures have been solved experimentally but not yet published (termed
“targets”), and then they are asked to provide structural models.
Subsequent comparison of the models with the experimental structures
establishes the current state of the art in the field and allows an
objective comparison of different methods. In recent years, the CASP and
CAPRI experiments have been collaborating in the area of structural modeling
of protein complexes11, 12, and a category dedicated
to assessment of multimeric proteins has been established in CASP as
well13, 14.
We participated in recent CASP and CAPRI experiments, aiming to test our
abilities to predict structures of protein complexes using
template-based modeling and free docking15–17. Our
results demonstrated that there is room for improvement in both
methods. In template-based modeling, the identification of templates can
be enhanced. In docking, the assessment and selection of correct
interfaces from thousands of diverse docking models is probably the most
important problem. It is also interesting to see how the progress in
protein structure prediction influences modeling of protein complexes.
At present, it is often possible to generate sufficiently accurate
models of individual proteins, but does this help to predict the
protein-protein interaction interfaces?
To explore these questions in detail, we participated in the CASP14
experiment, where our group (“Venclovas”) performed relatively well,
particularly in interaction interface prediction. In this article, we
describe our modeling methods and analyze our results in detail, aiming
to understand what went right, what went wrong, and why.