Figure 2. A framework for model benchmarking for different purposes. Light shading indicates the need for decisions about how temporal and spatial observations and their uncertainties are used to define limits of acceptability. Learning from model rejections indicates an area of research that is largely unexplored (though intrinsic to most model development).