Statistical Analysis
Data was analyzed using four key seasonal phases during our study
period, defined by snowpack thickness and vegetation phenology:Snow Melt (7 to 23 June; day of year (DoY) 158 to DoY 174);Green-Up (24 June to 20 July; DoY 175 to DoY 201); Peak
Growing Season (21 July to 23 August; DoY 202 to DoY 235); andLate Growing Season (24 August to 10 September; DoY 236 to DoY
253). More details on how the “seasonal phases” were chosen and
defined can be found in Hrach et al. (2021). Furthermore, the study
period data was divided based on Stable Shade with consistent
average daily shade of around 2 hours (7 June to 30 July) andDynamic Shade with increasing average daily shade (31 July to 10
September). Horizon shade results from Hrach et al. (2021) were used in
a linear regression model to understand the general influence of shade
(independent variable) on the water and carbon fluxes (dependent
variables) during the course of the entire study period.
In order to better understand the effect of shade and other
environmental controls on explaining the temporal variability in ET,
GPP, and WUE at our site, attention was focused on the Peak
Growing Season . This avoided the confounding effects of changing
phenology (i.e. increasing leaf area during Green up or
senescence during Late Growing Season ). It was assumed
that maximum leaf area occurred during Peak Growing Season , which
was confirmed by GPP data as the time of peak productivity. This was
also the only season which had both Dynamic shade andSteady shade days during our study. Since we were also interested
in WUE differences between Dynamic and Steady shade, the
dataset was further filtered to have only daytime values
(Rg>50 µmol/mol), since during the night GPP=0 and WUE
should be zero by definition. Finally, only observed data was used in
this analysis, such that there were no gap-filled observations.
With the filtered data set (n=247, with 167 data points fromDynamic shade period and 80 from Steady shade days),
general additive modelling (GAM) was used to investigate which of the
following environmental variables best described the temporal
variability in ET and GPP: incoming photosynthetically active
radiation (used as a surrogate for incoming shortwave radiation, Rg),soil moisture – an average of 2-10cm depths (SM), soil
temperature at 2 cm depth (Tsoil.2cm), vapour pressure deficit(VPD), and shade as a factor (fShade, which was 0 for Steady
shade data and 1 for Dynamic shade data). Day of year (DoY) and time of
day (Hour) were also used as explanatory variables to account for any
temporal variability. Adding time into the autocorrelation structure did
not improve the quality of model as much as adding time directly as an
input variable. Additional variables, such as air temperature, relative
humidity, soil temperature at other depths (5 and 10cm, as well as a
mean of 2-10cm), net radiation and ground heat flux, were also
considered at the beginning, but all of these variables were highly
correlated with the final chosen set, so they were not included in the
model selection runs. Only variables that had the highest correlations
with the response variables (i.e. ET and GPP) were tested in the models.
The “mgcv” R-package (Wood, 2004, 2011, 2017) was used for this
analysis. Model selection was based on Akaike’s Information Criterion
(AIC), where the model with the largest change in AIC (ΔAIC) was deemed
best fitting. Models with delta-AIC less than 2 were deemed similar.
Once the best fitted model was determined, we reran the model, excluding
one of the key explanatory variables at a time and plotted the observed
vs predicted values. The adjusted R2 of the 1:1
relationship between the observed and predicted values from such models
was used to assess the relative contribution of each of the explanatory
variables in the final best-fitted model to explaining the temporal
variability in the response variable modelled.
All other statistical analyses were performed and summarized with
packages dplyr , reshape2 (Wickham, 2007), tidyr (Wickham & Henry,
2019), and illustrated with ggplot2 in RStudio. Prior to any analysis,
data was assessed for normality through a Shapiro-Wilks normality test,
which concluded that all daily data used in this analysis was normaly
distributed (p>0.05), other than the hill shade model
output. Since the hill shade results are the independent variable and
the remainder of the data was accepted by the Shapiro-Wilks test,
parametric testing was acceptable to use in the statistical analysis.