Data analysis
All analyses were performed in R (version 3.6.1) and RStudio (version
1.2.1335). All graphs were created with the ggplot2 package (Wickham,
2016). As the data were not normally distributed, generalized linear
mixed models (GLMMs) with the appropriate error structure were used to
analyze the fecundity data and sperm viability. In the first case, as
the data contained an excess of zero counts, we used zero-inflated
models as advised by Zuur et al. (2009). We modelled the likelihood of
sterile replicates and the number of offspring jointly in a single
model that combines a binomial part with a count part (negative
binomial distribution). This required the packages pscl (Jackman,
2017), lmtest (Zeileis and Hothorn, 2002) and glmmTMB (Brooks et al.,
2017). The model contained the following
factors: temperature, day and day², as well as the
interactions between temperature and day and between
temperature and day². For the sperm viability analysis, we
accounted for pseudoreplication, as the sperm from the same male was
measured at three time points. In this case, time point and temperature
were included in the model as fixed factors.
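As a minimal sketch of the structure of such a zero-inflated negative binomial model, the following Python code combines a binomial (structural zero) part with a negative binomial count part and builds a linear predictor from the terms listed above. It is an illustration only, not the fitted model: the coefficient values, the dispersion value, the coding of temperature as a 0/1 indicator, and all function names are assumptions made for this example.

```python
import math

def nb_pmf(k, mu, theta):
    """Negative binomial (NB2) probability of count k, mean mu, dispersion theta."""
    log_p = (math.lgamma(k + theta) - math.lgamma(theta) - math.lgamma(k + 1)
             + theta * math.log(theta / (theta + mu))
             + k * math.log(mu / (theta + mu)))
    return math.exp(log_p)

def zinb_pmf(k, pi_zero, mu, theta):
    """Zero-inflated NB: structural zero with probability pi_zero, else an NB count."""
    p = (1.0 - pi_zero) * nb_pmf(k, mu, theta)
    if k == 0:
        p += pi_zero  # sterile replicates add extra mass at zero
    return p

def linear_predictor(beta, temperature, day):
    """Intercept, temperature, day, day^2, temperature:day, temperature:day^2."""
    x = [1.0, temperature, day, day ** 2, temperature * day, temperature * day ** 2]
    return sum(b * xi for b, xi in zip(beta, x))

# Hypothetical coefficients (not estimates from the study).
beta_count = [2.0, -0.5, 0.3, -0.02, -0.1, 0.005]   # count part, log link
beta_zero = [-1.0, 1.5, -0.2, 0.01, 0.05, 0.0]      # zero part, logit link
theta = 1.5                                         # hypothetical dispersion

mu = math.exp(linear_predictor(beta_count, temperature=1, day=5))
pi_zero = 1.0 / (1.0 + math.exp(-linear_predictor(beta_zero, temperature=1, day=5)))

print("expected count in the count part:", mu)
print("probability of a structural zero:", pi_zero)
print("total probability of zero offspring:", zinb_pmf(0, pi_zero, mu, theta))
```

The zero part and the count part each have their own linear predictor, which is what allows the same factors to affect sterility and offspring number differently.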
Fertility in the fecundity assay, male organ size (wing length, AG and
SV size), the egg-to-adult survival as well as the behavior in the sperm
competition experiment were analyzed with generalized linear models
(glm) with the appropriate error structure and correction for
overdispersion using the quasi-extension where necessary. The significance
of each factor was tested through an analysis of deviance, by removing
the factor from the full model and testing the change in deviance
against an F or Chi-square distribution as appropriate for the error
structure (Crawley, 2007). We present models with only the retained
significant factors. Most of the statistical analyses were done in two
different ways: in the first case,
all five treatments (developmental temperature and opportunity to
recover or not) were considered separately by coding them as five
different treatments. In the second approach, we instead included larval
temperature and recovery as separate factors, but this precluded us
from using data from control males, allowing comparisons only among
heat-challenged males. As control males both remained at their
developmental temperature and were ‘allowed’ to recover, it was not
possible to assign them to either level of the factor recovery, which
prevented us from coding these as two independent treatment factors in
a full-factorial design. We always report the first approach unless
otherwise specified.
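The analysis of deviance described above can be sketched as follows for the Chi-square case: the deviance of the model without a factor is compared with that of the full model, and the difference is referred to a Chi-square distribution with degrees of freedom equal to the number of parameters removed. This Python sketch uses only the standard library; the deviance values and function names are hypothetical, and the F test used for quasi-models is not covered.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a Chi-square distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2, then the recurrence
    # Q(a + 1, h) = Q(a, h) + h^a * exp(-h) / Gamma(a + 1), with a = df / 2.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))
    while a < df / 2.0:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q

def deviance_test(dev_reduced, dev_full, df_diff):
    """Change in deviance when a factor is dropped, with its Chi-square p-value."""
    delta = dev_reduced - dev_full
    return delta, chi2_sf(delta, df_diff)

# Hypothetical deviances: dropping a five-level treatment factor removes 4 parameters.
delta, p_value = deviance_test(dev_reduced=152.4, dev_full=139.8, df_diff=4)
print("change in deviance:", delta)
print("p-value:", p_value)
```

A factor would be retained when removing it increases the deviance by more than expected by chance at the chosen significance level.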
A Chi-square test was applied to analyze sperm presence in the SVs and
the mating and remating rates in the sperm competition experiment
(Dytham, 2011). Allometry between AG size and wing length was tested
using a regression. For this, both variables were converted into the
same units (µm²) and log-transformed for the analysis (Shingleton et
al., 2007). Day was included as a fixed factor in the model to account
for ontogenetic allometry. The multcomp package (Hothorn et al., 2008)
was used for the post-hoc comparison of wing length. Pairwise
comparisons using t tests were used to analyze differences in AG size
between temperature treatments.
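The allometric regression can be illustrated with a short Python sketch that fits an ordinary least-squares line to log-transformed organ size against log-transformed body size. With both traits expressed in the same units, a slope of 1 on the log-log scale corresponds to isometry. The sketch is simplified: it omits the day factor included in the analysis, and the data values and function name are invented for illustration.

```python
import math

def allometric_fit(body_size, organ_size):
    """Least-squares slope and intercept of log(organ_size) on log(body_size)."""
    lx = [math.log(v) for v in body_size]
    ly = [math.log(v) for v in organ_size]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx                      # allometric coefficient
    intercept = mean_y - slope * mean_x    # log of the scaling constant
    return slope, intercept

# Invented measurements, both in the same (squared) units.
wing = [2.10e6, 2.25e6, 2.40e6, 2.55e6, 2.70e6]
gland = [3.9e4, 4.1e4, 4.4e4, 4.5e4, 4.8e4]

slope, intercept = allometric_fit(wing, gland)
print("allometric slope:", slope)
print("intercept (log scale):", intercept)
```

A slope below 1 would indicate that the organ grows proportionally less than body size (hypoallometry), and a slope above 1 the opposite.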