Statistical analyses
Statistical analyses were conducted using R (3.5.2; R Development Core Team, R Foundation for Statistical Computing, Vienna, Austria). In our first analysis, we used a generalized linear mixed model (lme4 package; Bates et al. 2015) to test whether the number of discrete songs per hour varied among the six breeding stages. We included the discrete song rate from a given recording session as the response variable, the breeding stage (i.e., no nesting duties, nest building, egg stage, nestling care, fledgling care, non-breeding) observed that same day as a fixed factor, and subject identity (1–12) as a random effect to account for possible dependencies among multiple recording sessions of the same male. The response was modeled with a negative binomial distribution and log link. The overall statistical significance of breeding stage was tested using the Anova function of the car package (Fox and Weisberg 2019). Post-hoc linear contrasts of estimated marginal means (emmeans package; Lenth 2021) were then used to compare discrete song rate between the breeding (i.e., mean of no nesting duties, nest building, incubation, nestling care, and fledgling care) and non-breeding seasons, between the no nest duty and nest duty stages (i.e., mean of nest building, incubation, nestling care, and fledgling care) of the breeding season, and between the nestling care stage and the other nesting stages (i.e., mean of nest building, incubation, and fledgling care). We could not repeat this analysis on rambling song because preliminary inspection of the data revealed that only 5% of all songs were rambling song, which precluded reliable estimates of rambling song rates from our short recording sessions. For example, only 11 rambling songs were detected during the entire nestling care period.
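The three post-hoc contrasts described above can be written as weight vectors over the six stage means. The following is a minimal illustrative sketch in Python, not the R code used for the analyses; the function names and the numerical estimates are hypothetical placeholders. It assumes the contrasts are taken on the scale of the linear predictor (the log scale for a log link) and that "incubation" in the contrast definitions corresponds to the "egg stage" level.

```python
# Stage order follows the six breeding stages named in the text.
# "egg stage" is assumed to be the level referred to as "incubation".
STAGES = [
    "no nesting duties", "nest building", "egg stage",
    "nestling care", "fledgling care", "non-breeding",
]


def mean_vs_mean(group_a, group_b, stages=STAGES):
    """Weights for (mean of stages in group_a) minus (mean of stages in group_b)."""
    weights = []
    for stage in stages:
        if stage in group_a:
            weights.append(1.0 / len(group_a))
        elif stage in group_b:
            weights.append(-1.0 / len(group_b))
        else:
            weights.append(0.0)  # stage not involved in this contrast
    return weights


def apply_contrast(weights, estimates):
    """Weighted sum of the estimated marginal means."""
    return sum(w * e for w, e in zip(weights, estimates))


# Breeding season (five stages) vs. non-breeding season.
breeding_vs_non = mean_vs_mean(STAGES[:5], ["non-breeding"])
# No nest duty vs. the four nest duty stages.
no_duty_vs_duty = mean_vs_mean(["no nesting duties"], STAGES[1:5])
# Nestling care vs. the other three nesting stages.
nestling_vs_other = mean_vs_mean(
    ["nestling care"], ["nest building", "egg stage", "fledgling care"]
)

# Hypothetical log-scale marginal means, for illustration only (not study results).
example_estimates = [2.0, 1.0, 1.0, 1.0, 1.0, 0.5]
for name, w in [
    ("breeding vs non-breeding", breeding_vs_non),
    ("no nest duty vs nest duty", no_duty_vs_duty),
    ("nestling care vs other nesting stages", nestling_vs_other),
]:
    print(name, w, apply_contrast(w, example_estimates))
```

Each weight vector sums to zero, so a contrast estimate of zero corresponds to no difference between the two groups of stages being compared.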
In our second analysis, we used a generalized linear mixed model to test whether song perch height was associated with breeding stage or song type. The song perch height (m) of each song was included as the dependent variable, with breeding stage and song type as fixed factors and recording session (1–32) nested within subject identity (1–12) as a random effect to account for possible dependencies among multiple perch heights estimated from the same recording session of the same male. The response was modeled using a Poisson distribution with log link. After testing the overall significance of breeding stage and song type, post-hoc linear contrasts of estimated marginal means were used to compare song perch height between the breeding and non-breeding seasons and between the no nest duty and nest duty stages of the breeding season.
Results were considered statistically significant where P < 0.05. We used the DHARMa package (Hartig 2020) to validate the two statistical models. Its diagnostic tests, combined with visual inspection of scaled residual plots, indicated adequate model fit. We also simulated the responses of each model and compared the simulated data to the original data by overlaying semi-transparent histograms of each; in all cases, we found strong agreement between the simulated data and the original data.
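The idea behind the simulation check can be sketched as follows for a negative binomial response. This is an illustrative Python sketch under stated assumptions, not the DHARMa workflow or the authors' code: the observed counts, fitted means, dispersion value, bin edges, and function names are all hypothetical. Counts are drawn as a gamma-Poisson mixture, and the observed and simulated values are binned on shared edges so that their distributions can be compared bin by bin, much as overlaid histograms would be compared by eye.

```python
import math
import random


def simulate_neg_binomial(mu, theta, rng):
    """Draw one negative binomial count with mean mu and dispersion theta,
    using a gamma-Poisson mixture."""
    lam = rng.gammavariate(theta, mu / theta)  # shape = theta, scale = mu / theta
    # Knuth's Poisson sampler; adequate for the small means used here.
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def binned_proportions(values, edges):
    """Proportion of values in each half-open bin [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return [c / len(values) for c in counts]


rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical observed counts and fitted means (placeholders, not study data).
observed = [0, 3, 7, 12, 4, 9, 21, 2, 6, 15]
fitted_mu = [2.0, 4.0, 6.0, 10.0, 5.0, 8.0, 18.0, 3.0, 6.0, 12.0]
theta = 3.0    # hypothetical dispersion parameter
n_sims = 200   # simulated data sets per observation

simulated = [
    simulate_neg_binomial(mu, theta, rng)
    for mu in fitted_mu
    for _ in range(n_sims)
]

# Shared bin edges; the last bin is open-ended so that no value is dropped.
edges = [0, 5, 10, 20, 40, math.inf]
obs_props = binned_proportions(observed, edges)
sim_props = binned_proportions(simulated, edges)

for lo, hi, o, s in zip(edges[:-1], edges[1:], obs_props, sim_props):
    print(f"[{lo}, {hi}): observed {o:.2f}, simulated {s:.2f}")
```

Close agreement between the observed and simulated proportions across bins plays the same role as visual agreement between the overlaid histograms.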