This study is primarily intended to establish the reliability of the statistical criteria for evaluating the performance and transferability of functionals. While the reported rankings can be used to establish some trends, the list of functionals is not comprehensive enough to provide reliable final suggestions on which functional to pick among the more than 300 available in the literature. Some conclusions on the performance and transferability of the considered functionals are nonetheless worth reporting, and are as follows:
These results are strengthened by the fact that the majority of the highlighted functionals overlap with the top performers suggested in recent reviews by the groups of Head-Gordon \cite{mardirossian_thirty_2017}, Goerigk \cite{goerigk_trip_2019}, and Grimme \cite{goerigk_look_2017}, which used larger databases and considered a broader spectrum of functionals. Finally, returning to the issue of parameter counting presented in Section \ref{816226}, the summary of the results plotted in Fig. \ref{898028} shows a clear lack of correlation between the average ranking of each functional and its number of degrees of freedom. This lack of correlation supports the main message of this work: the number of fitted parameters is not an effective measure of the transferability of a functional. More reliable statistical criteria, such as those developed in this work or, alternatively, the probabilistic performance estimator recently introduced by Pernot and Savin \cite{pernot_probabilistic_2018,pernot_probabilistic_2020}, should be used to evaluate the reliability of new and existing xc functionals.