Statistical Analysis
Data was analyzed using four key seasonal phases during our study period, defined by snowpack thickness and vegetation phenology:Snow Melt (7 to 23 June; day of year (DoY) 158 to DoY 174);Green-Up (24 June to 20 July; DoY 175 to DoY 201); Peak Growing Season (21 July to 23 August; DoY 202 to DoY 235); andLate Growing Season (24 August to 10 September; DoY 236 to DoY 253). More details on how the “seasonal phases” were chosen and defined can be found in Hrach et al. (2021). Furthermore, the study period data was divided based on Stable Shade with consistent average daily shade of around 2 hours (7 June to 30 July) andDynamic Shade with increasing average daily shade (31 July to 10 September). Horizon shade results from Hrach et al. (2021) were used in a linear regression model to understand the general influence of shade (independent variable) on the water and carbon fluxes (dependent variables) during the course of the entire study period.
In order to better understand the effect of shade and other environmental controls on explaining the temporal variability in ET, GPP, and WUE at our site, attention was focused on the Peak Growing Season . This avoided the confounding effects of changing phenology (i.e. increasing leaf area during Green up or senescence during Late Growing Season ). It was assumed that maximum leaf area occurred during Peak Growing Season , which was confirmed by GPP data as the time of peak productivity. This was also the only season which had both Dynamic shade andSteady shade days during our study. Since we were also interested in WUE differences between Dynamic and Steady shade, the dataset was further filtered to have only daytime values (Rg>50 µmol/mol), since during the night GPP=0 and WUE should be zero by definition. Finally, only observed data was used in this analysis, such that there were no gap-filled observations.
With the filtered data set (n=247, with 167 data points fromDynamic shade period and 80 from Steady shade days), general additive modelling (GAM) was used to investigate which of the following environmental variables best described the temporal variability in ET and GPP: incoming photosynthetically active radiation (used as a surrogate for incoming shortwave radiation, Rg),soil moisture – an average of 2-10cm depths (SM), soil temperature at 2 cm depth (Tsoil.2cm), vapour pressure deficit(VPD), and shade as a factor (fShade, which was 0 for Steady shade data and 1 for Dynamic shade data). Day of year (DoY) and time of day (Hour) were also used as explanatory variables to account for any temporal variability. Adding time into the autocorrelation structure did not improve the quality of model as much as adding time directly as an input variable. Additional variables, such as air temperature, relative humidity, soil temperature at other depths (5 and 10cm, as well as a mean of 2-10cm), net radiation and ground heat flux, were also considered at the beginning, but all of these variables were highly correlated with the final chosen set, so they were not included in the model selection runs. Only variables that had the highest correlations with the response variables (i.e. ET and GPP) were tested in the models. The “mgcv” R-package (Wood, 2004, 2011, 2017) was used for this analysis. Model selection was based on Akaike’s Information Criterion (AIC), where the model with the largest change in AIC (ΔAIC) was deemed best fitting. Models with delta-AIC less than 2 were deemed similar. Once the best fitted model was determined, we reran the model, excluding one of the key explanatory variables at a time and plotted the observed vs predicted values. The adjusted R2 of the 1:1 relationship between the observed and predicted values from such models was used to assess the relative contribution of each of the explanatory variables in the final best-fitted model to explaining the temporal variability in the response variable modelled.
All other statistical analyses were performed and summarized with packages dplyr , reshape2 (Wickham, 2007), tidyr (Wickham & Henry, 2019), and illustrated with ggplot2 in RStudio. Prior to any analysis, data was assessed for normality through a Shapiro-Wilks normality test, which concluded that all daily data used in this analysis was normaly distributed (p>0.05), other than the hill shade model output. Since the hill shade results are the independent variable and the remainder of the data was accepted by the Shapiro-Wilks test, parametric testing was acceptable to use in the statistical analysis.