Main Body:
The aggregation of large datasets from museum records and community
science provides new power for macroscale ecological analyses. However,
“with great power must come great responsibility” (Lee 1962) and while
valuable, these datasets must be treated carefully. Prior work with
aggregate occurrence datasets has demonstrated spatiotemporal and
taxonomic biases that must be addressed (Troudet et al. 2017;
Ries et al. 2019), yet researchers still utilize these data “off
the shelf” without proper curation or appropriate modelling strategies.
Fric et al. 2020 used occurrence data for 100 species aggregated in the
Global Biodiversity Information Facility (GBIF) in temperate regions of
North America and Europe to track phenology across latitudes. Estimating
phenology metrics and trends from large occurrence datasets is possible,
but requires sufficient data density and appropriate statistical methods
(Taylor and Guralnick 2017). These data were too sparse to estimate
phenology across the axes of interest, and were insufficiently curated.
Further, an improper analytical model resulted in spurious patterns
which don’t make ecological sense. For instance, after “correcting”
for altitude and year, their results indicate that flight period onsets
are the same at low and high latitudes for most species, contrary to
expected patterns (Karlsson 2014). However, when we applied appropriate
data curation and methods, our results found later onset and shorter
flight periods at higher latitudes.
The data used in Fric et al. (2020) were insufficient for broad
phenological analysis. Datasets, defined by species and continent, were
analyzed with as few as 15 occurrence records across >20
degrees latitude and >100 years, a threshold far below any
reasonable standard. Phenological “onset” and “termination” of
flight periods were extracted simply as the first and last day-of-year
(DOY) values within latitudinal bands, pooled across all years and
altitudes. Even with sufficient data this would be problematic; here it
resulted in only one observation date being used as both “onset” and
“termination” of flight periods in an average of 20% of latitudinal
bands per species (Figure 1), so most analyses included latitudes with
flight periods of just one day.
The data in Fric et al. (2020) were also improperly curated. Spatial
precision and outlier detection were incomplete. Altitudes were
extracted using imprecise GIS coordinates, sometimes representing sea
floor or vague place names (eg, ”Mt Shasta”); other altitudes varied
significantly from GBIF ”verbatim elevation”. Across datasets, altitudes
were skewed (95.6% of observations in species’ lowest quartiles),
giving high altitude observations outsized leverage in regressions.
Temporal outliers were problematic; one species’ onset at 68° N was in
January, when the next occurrence across all latitudes was in June. No
sources were cited for species traits, and we found evidence documenting
additional generations in portions of their range for 22 of these
“obligate univoltine” species (Supplemental Table 1).
Finally, the regressions used in Fric et al. (2020) were not appropriate
for their questions. The phenometrics were regressed against latitude,
altitude, and year separately, without regard for spatiotemporal bias.
Regressions were considered “corrections” and residuals were fed into
new regressions, which failed to consider collinearity in the
explanatory variables. In single regressions with latitude, 12 species’
onset appeared earlier at higher latitudes and only 26 species showed
later onset at higher latitudes. Meanwhile “corrected” regressions
found onset for most species unaffected by latitude (Figure 2).
In our reanalysis, we applied stricter data standards and curation to
extract robust phenometrics. For 72 species (76 datasets) we confirmed
as univoltine, we filtered data for altitude (0-500m) and timing
(March-November). We calculated phenometrics for year-latitude
combinations with at least 10 observations. Fifty-four datasets did not
include at least three latitudes with data fulfilling these criteria,
making them insufficient for analysis (Supplemental Table 1). For the
remaining 22 datasets, all European, onset and termination were
estimated from a weibull distribution using R package phest (Pearse
2017) and bounded by days (60,330). We modeled each species phenometric
in a linear model (DOY ~ latitude + year) using R
version 3.6.2 (R Core Team 2019).
Species-specific coefficients and significance for onset & termination
dates in our reanalysis varied substantially from the original results
(Figure 2; Supplemental Table 2). In our model, higher latitude
generally correlated with later onset (15/22 species), while
correlations were mixed for termination date (Figure 2). These results
indicated later and/or shorter flight periods at higher latitudes. Year
was usually nonsignificant, with most significant patterns consistent
with longer flight periods in recent years.
Research using large
occurrence datasets with variable effort must be explicit about how data
have been obtained and processed prior to analysis. Despite our critique
of the study in question, we recognize that occurrence data have great
potential to address many ecological questions. We enthusiastically
support continued digitization and use of collection data in ecological
analysis, but urge researchers to exercise caution using these data,
especially when estimating effort-sensitive metrics, such as phenology.