I think there is a fundamental flaw of logic here that ultimately hurts the value of this paper. In practical research settings where conformer sampling is used, there is no access to 3D geometries obtained with high-level QM methods. Therefore, I think the meaningful comparison would be conformer energies with geometries obtained by the respective approximate methods.
As noted in the introduction, we performed this exact analysis in our previous paper in IJQC \cite{Kanal_2017}. Reviewers (across several journals) and many readers found the analysis of differing optimized geometries confusing. Consequently, we focused this work on a narrower question: "how well do single points correlate?" This narrower question was, in fact, suggested by multiple reviewers of our previous paper.
We agree with the reviewer that analyzing geometries optimized with different methods is a valid concern, but we believe it is beyond the scope of this paper, which already considers ~6500 single points and over 30 methods.
In the revised manuscript, we now discuss the consequences for conformer sampling in the conclusions: we suggest that readers use methods such as the ANI models, GFN methods, or even faster density functional methods such as B97-3c to optimize geometries and rank conformers.