3.2.2 Multivariable regression analysis
A multivariable regression analysis was undertaken to identify any
significant relationship between the same 350-year water level variation
and various sets of explanatory variables. To find a plausible
regression model, different predictors were taken into consideration, as
well as their possible interaction terms. To conduct the analysis in a
systematic way and explore all possible combinations of the
predictors, a maximum of ten input variables were considered: lake
area, watershed area, outflow channel width, peak discharge (return
period of 350 years) and their six pairwise interaction terms (e.g.
lake area multiplied by watershed area, lake area multiplied by outflow
channel width). Most of the regression models showing p-values
lower than the threshold value of 0.05 are associated with a high RMSE or
a very low adjusted R-squared. Moreover, none of the regression
models proved to be robust: simply removing a few stations from the
sample and refitting yields models that use completely different
variables as predictors.
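
The procedure described above can be illustrated with a minimal sketch. It is not the code used in this study: the function names, the selection criterion (highest adjusted R-squared), the number of stations dropped per trial and the synthetic data are all assumptions made for illustration. The sketch builds the four predictors and their six pairwise products, fits every subset by ordinary least squares, and then repeats the selection after removing a few stations at random to see whether the same variables are chosen.

```python
from collections import Counter
from itertools import combinations

import numpy as np


def build_candidates(base):
    """Return the base predictors plus all pairwise products (4 base -> 10 columns)."""
    cols = dict(base)
    for a, b in combinations(list(base), 2):
        cols[f"{a}*{b}"] = base[a] * base[b]
    return cols


def fit_ols(X, y):
    """Ordinary least squares with intercept; returns (RMSE, adjusted R-squared)."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    rmse = float(np.sqrt(ss_res / n))
    adj_r2 = 1.0 - (ss_res / (n - k - 1)) / (ss_tot / (n - 1))
    return rmse, adj_r2


def best_subset(cols, y, max_size=None):
    """Fit every subset of candidate columns; keep the one with highest adjusted R-squared."""
    n_obs = len(y)
    limit = len(cols) if max_size is None else max_size
    limit = min(limit, n_obs - 2)  # keep residual degrees of freedom positive
    best = None
    for size in range(1, limit + 1):
        for names in combinations(list(cols), size):
            X = np.column_stack([cols[c] for c in names])
            rmse, adj_r2 = fit_ols(X, y)
            if best is None or adj_r2 > best[2]:
                best = (names, rmse, adj_r2)
    return best


def selection_stability(base, y, n_drop=3, n_trials=20, seed=0, max_size=None):
    """Repeat the selection after dropping n_drop random stations; return the chosen subsets."""
    rng = np.random.default_rng(seed)
    n_obs = len(y)
    chosen = []
    for _ in range(n_trials):
        keep = np.sort(rng.choice(n_obs, size=n_obs - n_drop, replace=False))
        sub = {name: values[keep] for name, values in base.items()}
        names, _, _ = best_subset(build_candidates(sub), y[keep], max_size)
        chosen.append(names)
    return chosen


if __name__ == "__main__":
    # Synthetic stand-in data (NOT the study's stations): 30 stations, and a
    # response that is pure noise, i.e. unrelated to any predictor.
    rng = np.random.default_rng(1)
    names = ("lake_area", "watershed_area", "channel_width", "peak_discharge")
    base = {name: rng.lognormal(size=30) for name in names}
    y = rng.normal(size=30)

    print(best_subset(build_candidates(base), y, max_size=2))
    counts = Counter(selection_stability(base, y, max_size=2))
    for subset, count in counts.most_common():
        print(count, subset)
```

If the selected subset changes from one trial to the next, as reported above for the real sample, the fitted relationship is unlikely to be meaningful. The sketch ranks models by adjusted R-squared only; coefficient and overall p-values would have to be taken from a statistics package.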