Introduction

In recent years, progress in three-dimensional (3D) protein structure prediction has been impressive1. Deep learning-based methods now make it possible to model the structures of most individual proteins2–4. However, the majority of proteins do not function in isolation. They usually perform their functions by interacting with other proteins and assembling into stable or transient protein complexes. Therefore, a detailed understanding of how proteins function requires more than the structures of individual proteins: we also need to know the structures of the corresponding protein complexes.
The number of possible binary protein-protein interactions is much higher than the number of proteins encoded in genomes, and only a small fraction of these interactions has been discovered experimentally5, 6. Similarly, the number of different structural types of protein complexes is predicted to be much higher than the number of protein folds7, 8. Therefore, the structural modeling of protein-protein interactions is a more complex problem than the prediction of structures for individual proteins.
Currently, two main methods are available to model the structures of protein complexes. Template-based modeling relies on the observation that homologous proteins often interact in the same way9. Thus, a known structure of a protein complex can serve as a template for modeling homologous protein complexes. If no templates are available, protein-protein docking methods can be used5. Docking methods aim to determine how proteins interact with each other, starting from known structures of the individual subunits, which can be either solved experimentally or obtained by computational modeling.
The field of protein structure prediction is monitored by the Critical Assessment of Structure Prediction (CASP) experiments, which explore every aspect of protein structure modeling1. The Critical Assessment of PRedicted Interactions (CAPRI) experiments are devoted to the prediction of the structures of diverse protein complexes10. Both CASP and CAPRI are based on blind testing. The participants are given the sequences of proteins whose structures have been solved experimentally but not yet published (termed “targets”) and are asked to provide structural models. Subsequent comparison of the models with the experimental structures makes it possible to establish the current state of the art in the field and to compare different methods objectively. In recent years, the CASP and CAPRI experiments have been collaborating in the area of structural modeling of protein complexes11, 12, and a category dedicated to the assessment of multimeric proteins has been established in CASP as well13, 14.
We participated in recent CASP and CAPRI experiments, aiming to test our ability to predict the structures of protein complexes using template-based modeling and free docking15–17. Our results demonstrated that there is room for improvement in both methods. In template-based modeling, the identification of templates can be enhanced. In docking, the assessment and selection of correct interfaces from thousands of diverse docking models is probably the most important problem. It is also interesting to see how progress in protein structure prediction influences the modeling of protein complexes. At present, it is often possible to generate sufficiently accurate models of individual proteins, but does this help to predict protein-protein interaction interfaces?
To explore these questions, we participated in the CASP14 experiment, where our group (“Venclovas”) performed relatively well, particularly in interaction interface prediction. In this article, we describe our modeling methods and analyze our results in detail, aiming to understand what went right, what went wrong, and why.