2.3 | Estimation of model accuracy (EMA) for oligomeric targets. 
The EMA category has been an integral part of every CASP experiment starting with CASP7 (refs. 41-48). It has attracted the attention of many developers, with over 70 methods tested in the previous CASP experiment48. The emphasis on this category has led to very positive developments in protein structure prediction, as modelers now routinely integrate quality estimates into their modeling pipelines. In particular, the CASP14-winning AlphaFold2 method provides reliable estimates of the global and local accuracy of its models10,11.
In CASP15, the focus of the EMA category shifted from predicting the accuracy of models of single-chain proteins to that of models of multi-molecular complexes.
2.3.1 | Model accuracy prediction format (https://predictioncenter.org/casp15/index.cgi?page=format#QA). For global (whole-model) accuracy prediction (QMODE1), participants are asked to submit a fold similarity score (SCORE, in the 0-1 range), which estimates the similarity of the model's overall fold to that of the target, and an interface similarity score (QSCORE, also in the 0-1 range), which evaluates the reliability of quaternary structure interfaces. Submitting the QSCORE is optional, and predictors can skip it by putting an 'X' symbol in the corresponding place of a QA prediction (see the link above). In QMODE2 (local accuracy), in addition to the QMODE1 scores, predictors are asked to assign confidence scores to the interface residues of the model, indicating their likelihood of being present in the native structure's interface. Interface residues are defined as those in contact with at least one residue from a different chain, where contact means a Cβ-Cβ distance not exceeding 8 Å (Cα for glycine).
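The interface-residue definition above can be sketched in a few lines of Python. This is a simplified illustration, not the official CASP verification code: residues are represented by a hypothetical tuple of (chain, index, name, coordinate), where the coordinate is assumed to already be the Cβ position (Cα for glycine).

```python
from math import dist

# Illustrative sketch of the CASP15 interface-residue definition.
# Each residue is a hypothetical tuple (chain_id, residue_index,
# residue_name, coord), where coord is the Cb position (Ca for glycine).
CONTACT_CUTOFF = 8.0  # Angstroms, Cb-Cb distance (Ca for glycine)

def interface_residues(residues):
    """Return the set of (chain_id, residue_index) pairs that are in
    contact with at least one residue from a different chain."""
    interface = set()
    for i, (ch_a, idx_a, _, xyz_a) in enumerate(residues):
        for ch_b, idx_b, _, xyz_b in residues[i + 1:]:
            if ch_a == ch_b:
                continue  # only inter-chain contacts count
            if dist(xyz_a, xyz_b) <= CONTACT_CUTOFF:
                interface.add((ch_a, idx_a))
                interface.add((ch_b, idx_b))
    return interface

# Toy two-chain example: one contacting pair, one distant residue.
toy = [
    ("A", 1, "ALA", (0.0, 0.0, 0.0)),
    ("A", 2, "GLY", (30.0, 0.0, 0.0)),  # far from chain B
    ("B", 1, "LEU", (5.0, 0.0, 0.0)),   # within 8 A of residue A/1
]
print(sorted(interface_residues(toy)))  # [('A', 1), ('B', 1)]
```

In a real setting the coordinates would be extracted from the model's PDB/mmCIF file with a structure parser; only the distance rule is taken from the text above.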
Examples of EMA predictions in QMODE1 and QMODE2 are provided in Example 5 on the CASP15 format page.
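The QMODE1 constraints described above (SCORE required and in [0, 1]; QSCORE optional, with 'X' as the skip marker) can be illustrated with a small checker. This is a hypothetical sketch of those constraints only, not the official CASP format-verification script, which covers the full record layout shown on the format page.

```python
def parse_qmode1(score_field, qscore_field):
    """Hypothetical checker for the two QMODE1 global scores.
    SCORE must be a float in [0, 1]; QSCORE is optional and may be
    replaced by the symbol 'X'. Returns (score, qscore_or_None)."""
    score = float(score_field)
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"SCORE out of range: {score}")
    if qscore_field == "X":  # predictor skipped the interface score
        return score, None
    qscore = float(qscore_field)
    if not 0.0 <= qscore <= 1.0:
        raise ValueError(f"QSCORE out of range: {qscore}")
    return score, qscore

print(parse_qmode1("0.87", "0.45"))  # (0.87, 0.45)
print(parse_qmode1("0.87", "X"))     # (0.87, None)
```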
2.3.2 | Submission collection process. EMA predictions in CASP15 are requested for all (and only) multimeric targets. In contrast to previous CASPs, EMA targets are released after all models (not only server models) are collected for the corresponding structure prediction target. A tarball with assembly predictions from all CASP groups is created the day after the TS target closes, and a link to the tarball is pushed to the EMA servers and posted on the CASP15 website. All EMA groups, regardless of type (i.e., 'server' or 'human'), have 2 days to return accuracy estimates for the TS models included in the tarball. The predictions are checked with verification scripts, and successful predictions are saved for subsequent evaluation.
2.3.3 | EMA evaluation measures. Global predictions were compared with established evaluation metrics possessing the desired attributes: the oligomeric Template Modelling score (TM-score)49 for overall topology (SCORE) and the contact-based, interface-centric QS-score50 (QSCORE). To ensure a comprehensive evaluation, these metrics were supplemented with additional measures: an oligomeric GDT-like score, referred to as oligo-GDTTS, for overall topology analysis, and a variant of the interface-centric DockQ score51 for interfaces. Notably, DockQ evaluates pairwise interfaces, necessitating the introduction of a weighted-average metric, termed DockQ-wave, to effectively score higher-order complexes. Local predictions were compared against the per-residue lDDT17 and CAD (AA-variant)52 scores, which assess the accuracy of relative atom positions in a residue's neighborhood, including neighboring chains. Conceptually, these scores are contact-based but do not penalize added contacts, which is relevant in the case of incorrect interfaces. To address this limitation, two novel local variants of the QS-score and DockQ have been introduced: PatchQS and PatchDockQ. All evaluation metrics are described in detail in the CASP15 EMA assessment paper6.
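The idea behind DockQ-wave, averaging pairwise interface scores over all interfaces of a complex, can be sketched as follows. The weighting scheme used here (weights proportional to an interface's contact count) is an assumption made for illustration; the exact definition of DockQ-wave is given in the CASP15 EMA assessment paper6.

```python
def dockq_wave(pairwise):
    """Illustrative weighted average of pairwise DockQ scores over all
    interfaces of a complex. `pairwise` maps an interface, e.g.
    ("A", "B"), to a tuple (dockq_score, n_contacts). Weighting by
    contact count is an assumption, not the official definition."""
    total_weight = sum(w for _, w in pairwise.values())
    if total_weight == 0:
        return 0.0
    return sum(s * w for s, w in pairwise.values()) / total_weight

# Toy three-chain complex: the large, well-modeled A-B interface
# dominates the average; the small, poor A-C interface contributes little.
scores = {
    ("A", "B"): (0.80, 50),
    ("A", "C"): (0.20, 10),
    ("B", "C"): (0.50, 40),
}
print(round(dockq_wave(scores), 3))  # 0.62
```

The sketch shows why a weighted average is preferable to a plain mean for higher-order complexes: interfaces of very different sizes would otherwise contribute equally.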