3.2.2 Multivariable regression analysis
A multivariable regression analysis was undertaken to identify any significant relationship between the same 350-year water level variation and different sets of candidate predictors, including their possible interaction terms. To conduct the analysis systematically and explore all possible combinations of predictors, up to ten input variables were considered: four basic variables (lake area, watershed area, outflow channel width and peak discharge with a return period of 350 years) and their six pairwise interaction terms (e.g. lake area multiplied by watershed area, or lake area multiplied by outflow channel width). Most of the regression models with p-values below the 0.05 threshold are associated with a high RMSE or a very low adjusted R squared. Moreover, none of the regression models proved to be robust: simply removing a few stations from the sample and re-running the fit yields models that use completely different variables as predictors.
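As an illustration of the size of this search space, the following minimal Python sketch enumerates the candidate predictor sets. It is not the procedure used in the study: the identifier names are hypothetical, and it assumes that every non-empty subset of the ten terms counts as a candidate model, whether or not an interaction term appears together with its parent variables. The helper `adjusted_r2` applies the standard adjusted R squared formula for a model with an intercept.

```python
from itertools import combinations

# The four basic predictors named in the text (identifier names are illustrative).
BASE = ["lake_area", "watershed_area", "outflow_width", "peak_discharge_350"]

# Pairwise interaction terms: one product per unordered pair of basic predictors.
interactions = [f"{a}*{b}" for a, b in combinations(BASE, 2)]
terms = BASE + interactions

# Assumption: every non-empty subset of the ten terms is a candidate model.
candidate_models = [
    subset
    for k in range(1, len(terms) + 1)
    for subset in combinations(terms, k)
]


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R squared for a model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


print(len(interactions), len(terms), len(candidate_models))  # 6 10 1023
```

Under these assumptions there are 1023 candidate models. With that many fits, some p-values below 0.05 can be expected by chance alone, which is one reason to also check the RMSE, the adjusted R squared and the stability of the selected predictors when stations are removed.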