Statistical analyses
The three primary response variables recorded from the experiment were diapause induction rate, development rate, and body weight. Variation in diapause induction was tested using a generalized linear model with a logit link function and diapause/nondiapause as the binary response variable. Population, sex and treatment (six-level factor; see Fig. 1) were used as explanatory variables.
Development rate was defined as 1/d , where d is the time needed to complete a given stage of development. Three intervals were separately analyzed: the time from hatching to the second molt (instars 1+2; this data only available for Öland and Stockholm), the time from the second molt to the third molt (instar 3), and the time from the third molt to pupation (instar 4). For each of these analyses a three-way Anova was used, with treatment, sex and population as explanatory variables. In one of the six treatments (larvae that had been switched from long to short days in the third instar), development rate in the fourth instar in particular was strongly bimodal according to diapause decision (Fig. S1). For this reason, this treatment was split by diapause decision, giving seven treatment levels instead of six, when analyzing fourth-instar development rate.
Finally, weight was analyzed as a repeated measurement, using a mixed linear model with individual treated as a random effect. Developmental stage (third instar/fourth instar/pupa), treatment (six levels), sex and population were used as fixed effects, hence testing for differences in weights between treatments at different points in development. Because larvae grow more or less exponentially in size, weights were log-transformed in order to scale values across the time axis.
All analyses were carried out in R version 3.6.1 (R Development Core Team, 2019). For each analysis, all fittable two- and three-way interactions between the explanatory variables were tested, and nonsignificant interactions were removed stepwise (in order of highest p-value) so as not to sacrifice statistical power (Engqvist, 2005). The significance of model terms (α=0.05) was evaluated using analysis of variance (for continuous responses, i.e. weight and development rate) or analysis of deviance (for binomial responses, i.e. diapause) with the Anova function from the car package (Fox & Weisberg, 2019). The final models are shown in Tables S1 and S2. Because larvae were shifted between photoperiod regimes as they developed, the actual number of unique conditions experienced was two, then four, then six, depending on the stage of the experiment (Fig. 1). To address this, planned contrasts were applied to the final models for development rate and weight, in order to pool and compare larvae that had experienced the same conditions up until a given point. At the start of the third instar, the only contrast was long days vs short days. At the start of the fourth instar, long vs short days were contrasted, and larvae that had switched photoperiods in the previous instar were additionally contrasted with their respective photoperiod of origin. At pupation all six treatments were distinct, so all pairwise comparisons were made, using Tukey’s HSD method to compensate for multiple testing. All contrasts were applied using the emmeans package (Lenth, 2020), and were calculated without controlling for population except where stated otherwise. All treatment contrasts are summarized in Tables S3-S6.