University of Texas, Austin, 305 E. 23rd St.,
Austin, TX 78712, USA
Jennifer A. Miller, University of Texas, Austin, 305 E. 23rd St.,
Austin, TX 78712, USA (current affiliation: Department of
Geography and the Environment, University of Texas, Austin, 305 E.
23rd St., Austin, TX 78712, USA)
ABSTRACT Home range estimates have been a key geographic unit
for understanding the link between animals and their habitat/resource
choices since the term was first described by Burt (1943) and formally
quantified by Mohr (1947)—who introduced minimum convex polygons (MCP)
as a method to delineate individual home ranges. Numerous methods have
subsequently been developed to estimate home ranges. However, depending
on the method used, widely different estimations can be found with the
same animal location dataset. In a heterogeneous landscape, different
home range delineations can lead to different inferences about animal
resource and habitat preferences, which in turn can affect wildlife
management.
In this research, time-based home range methods that account for
autocorrelation in animal movement were evaluated for accuracy in terms
of area, shape, and location in response to sample size and common
wildlife GPS-point patterns. These characteristics of home range
estimation are important for inferring animal habitat and resource use.
Despite the improved accuracy of time-based methods compared to
traditional point-based methods like MCP, location was often inaccurate
for all GPS-point patterns, as were shape and area for GPS-point
patterns with perforations (common for areas with large physical
barriers like mountains or lakes). These findings are important to
wildlife managers using time-based home range methods for analysis.
KEY WORDS autocorrelation, GPS-point pattern, home range,
movement ecology, and animal-space use.
The home range is the area where an animal confines most of its movement
(Burt, 1943) and is a key spatial unit for understanding the
relationship between animals and their environment. Home ranges are
environmentally heterogeneous, and animals use them unevenly. The
familiarity that animals have with their home range allows them to find
resources, routes to avoid predators, and shared areas to find potential
mates (Powell and Mitchell 2012). Wildlife ecologists use home ranges to
analyze animal space use, such as habitat choice, territorial overlap,
and resource use (Long and Nelson, 2015).
The delineation of home ranges falls broadly into two methodological
approaches: either direct delineation of the home range (e.g., via
minimum convex polygons (MCP)) or use of the 95% contour of a
utilization distribution (UD), which measures the intensity of animal
space use (e.g., via kernel density estimation (KDE)). Though these two
approaches calculate animal space use in fundamentally different ways,
home range delineators and 95% UD-contours are put to functionally
similar analytical uses, such as assessing habitat preferences (Hickman et
al. 2016), response to natural disaster (Jordan et al. 2019), or impacts
of anthropogenic disturbance (Knüsel et al. 2019).
Historically, MCP and KDE have been the primary methods used for these
respective approaches (Downs 2010). Both MCP and KDE estimate animal
space use from a point-based perspective and do not consider that
movement is a temporally autocorrelated process (Noonan et al. 2019).
The lack of consideration for temporal autocorrelation in point-based
home range methods (Cushman 2010) leads to underestimated home range
size (Swihart and Slade 1985, Downs and Horner 2008), biased predictions
of habitat selection (Litvaitis 1994, Van Moorter et al. 2016), and
miscategorized habitat and resource use (Harris et al. 1990).
To reduce issues related to autocorrelation, a strategy that is often
applied is to filter points so the data are statistically independent
(Welch et al. 2016). However, filtering autocorrelated data does not
necessarily ensure independence; it may only reduce the ability to
detect it (Fortin and Dale 2005). Filtering data can also lead to the
potential loss of information (i.e., reducing sample size), which itself
can produce poor home range area-estimates (Rooney et al. 1998, Downs
and Horner 2008). Also, the autocorrelation of tracking data is not
necessarily a negative property, as it can indicate an animal’s
preferred habitat or resource choices (Boyce et al. 2010).
Autocorrelation can also reveal the temporal limitations of animal
movements (Long and Nelson, 2015), and be used to examine the relative
frequency with which animals use areas within their home range (Benhamou
2011).
To exploit autocorrelation as a data property rather than a statistical
problem, several home range estimation methods were recently developed
that utilize the temporal aspects of tracking data. Walter et al. (2015)
found that the shape produced by time-based home range estimators more
accurately fit the GPS-point patterns of tracking data than shapes
produced from methods that do not include a time component. While shape
is a major aspect of accurately estimating home ranges, so are area and
location. Accurately estimating each of those three components (area,
shape, and location) is vital if home range estimates are to be used to
infer animal habitat and resource choices.
The research presented here offers an analysis of point-based and
time-based home range estimators to understand how the accuracy of area,
shape, and location of different methods is preserved in response to
different GPS-point pattern shapes and sample sizes. The shape of
GPS-point patterns has a tremendous impact on the accuracy of home range
estimation (Downs and Horner 2008), and telemetry data will exhibit very
different patterns depending on individual preferences, seasonality, or
animal motion capacity (e.g., terrestrial or avian animals). Similarly,
the sample size of the data has a well-established influence on home
range estimation accuracy (Boulanger and White 1990, Seaman and Powell
1996, Börger et al. 2006, Downs and Horner 2008).
By comparing home range estimation methods at different sample sizes and
for different GPS point-pattern shapes, this research clarifies which
methods are most appropriate for different data and for different
ecological questions (e.g., whether some methods are more applicable for
species with patchy home ranges, like birds) (Table 1). The research
presented here evaluated the point-based home range techniques MCP and
KDE, along with path-based home range methods that utilize the temporal
aspect of telemetry data. These time-based techniques included biased
random bridge (BRB) (Benhamou 2011), potential path area (PPA) (Long and
Nelson 2015), the time-geographic density estimation (TGDE) for moving
point objects (Downs 2010), and Time Local Convex Hull (T-LoCoH) (Lyons
et al. 2013).
Burt (1943, p. 351) differentiated an animal’s home range from its
territory, describing the home range as the area near a “home site”
where an animal will mate, search for food, and care for offspring.
Territory, on the other hand, is the area within the home range that an
animal defends. More precisely, the home range has been defined as the
area where an animal spends 95% of its time and utilizes 50% of its
space (Garrott and White 1990).
The most common classes of home range estimators are variations of MCP
and KDE (Signer et al. 2015). MCP home ranges are determined by
enclosing a set of geographic points with the smallest possible convex
polygon or hull, which is a polygon where all interior angles are less
than 180 degrees. While still widely used, MCPs are sensitive to point
geometry, have biases in range estimation, and cannot differentiate
internal space (Downs and Horner 2008).
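The sensitivity of MCP to point geometry can be illustrated with a short
sketch. The Python code below is not the software used in this research;
it is a minimal, standard-library illustration that builds a convex hull
with Andrew's monotone chain algorithm and measures its area with the
shoelace formula. The L-shaped set of fixes and all function names are
hypothetical.

```python
def convex_hull(points):
    """Return the convex hull vertices (counter-clockwise) of 2D points."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon from ordered vertices (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical fixes outlining an L-shaped (concave) area of use
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
hull = convex_hull(l_shape)
mcp_area = polygon_area(hull)      # 11.5
used_area = polygon_area(l_shape)  # 7.0
```

In this toy case the hull drops the inner corner at (1, 1) and encloses
11.5 area units although the outlined L-shape covers only 7, which is
the kind of overestimation that a single convex polygon produces for
concave point patterns.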
KDE was developed as a method that could be used to differentiate
internal spaces by calculating an animal’s utilization distribution (UD)
(Lyons et al. 2013). UDs are two-dimensional (x,y) frequency
distributions of animal occurrences that identify how much an animal
uses different locations within its home range (Winkle 1975). KDE uses a
non-parametric probability function called a kernel to estimate the
likelihood that an individual uses neighboring cells. A kernel
function is centered on each data point \(x_{i}\) and weights the
contribution of that point to the density at location \(x\) based on the
distance between the two. The shape of
the kernel function and the bandwidth determine the extent of that
contribution (Silverman 1986). Depending on bandwidth size, KDE can
estimate dramatically different home range shapes and areas within the
same dataset (Gitzen and Millspaugh 2003).
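The effect of bandwidth on the resulting home range can be sketched in a
few lines. The code below is a minimal illustration in standard-library
Python, not the KDE implementation used in this research. It assumes a
fixed, isotropic Gaussian kernel evaluated on a regular grid, and
approximates the 95% isopleth as the smallest set of grid cells holding
95% of the gridded density. The simulated fixes, the grid extent, and the
two bandwidths are arbitrary choices made for the example.

```python
import math
import random


def kde_on_grid(points, h, lo, hi, step):
    """Gaussian-kernel density at every node of a square grid (bandwidth h)."""
    n_nodes = int(round((hi - lo) / step)) + 1
    norm = 1.0 / (2.0 * math.pi * h * h * len(points))
    density = []
    for i in range(n_nodes):
        gx = lo + i * step
        for j in range(n_nodes):
            gy = lo + j * step
            total = sum(
                math.exp(-((gx - px) ** 2 + (gy - py) ** 2) / (2.0 * h * h))
                for px, py in points
            )
            density.append(norm * total)
    return density


def isopleth_area(density, step, level=0.95):
    """Area of the smallest set of cells holding `level` of the density."""
    target = level * sum(density)
    mass, n_cells = 0.0, 0
    for d in sorted(density, reverse=True):
        if mass >= target:
            break
        mass += d
        n_cells += 1
    return n_cells * step * step


rng = random.Random(42)
# Hypothetical fixes scattered around a single activity center
fixes = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]

area_small_h = isopleth_area(kde_on_grid(fixes, 0.3, -8, 8, 0.5), 0.5)
area_large_h = isopleth_area(kde_on_grid(fixes, 1.5, -8, 8, 0.5), 0.5)
```

With the same fixes, the wider bandwidth spreads density farther from
each point, so `area_large_h` exceeds `area_small_h`; only the bandwidth
differs between the two estimates.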
It is typical to use the 95%-isopleth of a utilization distribution, such
as one produced by KDE, to delineate an animal’s home range (Downs and
Horner, 2008), an approach that has been considered a more reliable means
of estimating animal
home ranges than MCP (Kernohan et al. 2001). However, comparisons of MCP
and KDE home range delineations were often done with simulated data that
had known statistical distributions, such as bivariate normal mixtures,
that do not represent the full range of space-use patterns of wildlife
species (Worton 1989, Boulanger and White 1990, Seaman and Powell 1996,
Seaman et al. 1999, Gitzen and Millspaugh 2003). A more recent
comparison of MCP and KDE, with GPS-point pattern shapes that more
closely resemble wildlife movement patterns with unknown statistical
distributions, found that KDE also tends to overestimate home range
area, when subsampled from a point-sample with a greater number of
points (Downs and Horner, 2008).
Cushman (2010) hypothesized that much of the variability of MCP and KDE
occurs because they treat animal locations as independent (i.e., they
ignore autocorrelation). Recently, several researchers have exploited
the autocorrelation of tracking data to parameterize time in home range
estimations. For example, in place of the traditional kernel used by
KDE, the time-geography density estimation (TGDE) uses an animal’s
maximum velocity to produce a geo-ellipse to estimate the uncertainty
between consecutive sample locations (Downs 2010). One advantage of TGDE
is that it replaces a subjective parameter (kernel bandwidth size) with
an objective factor based on an animal’s maximum travel velocity. In
addition, overestimation of home ranges is limited with TGDE because
velocity constrains the space-time path. Similarly, the potential path
area (PPA) (Long and Nelson, 2015) outlines where an individual could
have been present by utilizing 2D-space-time prisms (Hägerstraand 2005).
PPA estimates are based on all trajectory points and an animal’s maximum
velocity, thereby considering the confines of an animal’s movement,
which potentially limits home range over-estimation.
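The velocity constraint shared by TGDE and PPA can be sketched as
follows. Between two consecutive fixes, a location is reachable only if
the distance from the first fix to the location plus the distance from
the location to the second fix does not exceed the maximum velocity
multiplied by the elapsed time; in two dimensions this region is an
ellipse with the two fixes as foci. The Python code below is a minimal
illustration of that geometric rule, not the TGDE or PPA implementations
themselves, and the function names and example values are hypothetical.

```python
import math


def reachable(p, fix_a, fix_b, t_a, t_b, v_max):
    """True if location p could have been visited between the two fixes."""
    budget = v_max * (t_b - t_a)  # maximum distance travelable in the interval
    return math.dist(fix_a, p) + math.dist(p, fix_b) <= budget


def prism_area(fix_a, fix_b, t_a, t_b, v_max):
    """Area of the ellipse of reachable locations (0.0 if the pair is infeasible)."""
    budget = v_max * (t_b - t_a)
    separation = math.dist(fix_a, fix_b)
    if budget < separation:
        return 0.0  # the animal could not have covered the gap at v_max
    semi_major = budget / 2.0
    semi_minor = math.sqrt(semi_major ** 2 - (separation / 2.0) ** 2)
    return math.pi * semi_major * semi_minor


# Hypothetical fixes 6 units apart, 1 time unit apart, v_max = 10 units per time unit
a, b = (0.0, 0.0), (6.0, 0.0)
inside = reachable((3.0, 3.0), a, b, 0.0, 1.0, 10.0)   # True
outside = reachable((3.0, 5.0), a, b, 0.0, 1.0, 10.0)  # False
area = prism_area(a, b, 0.0, 1.0, 10.0)                # 20 * pi
```

A home range built from such regions would combine the ellipses of all
consecutive fix pairs; how they are combined or weighted differs between
the methods and is not shown here.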
Benhamou (2011) developed BRB to estimate animal space use with a
combination of a movement kernel density estimator (MKDE) (Benhamou and
Cornélis 2010) and a biased-Brownian bridge. The MKDE measures
recurrence intensity for different areas of a home range based on the
time spent in an area and the number of times the animal returns to that
same area (Benhamou and Riotte-Lambert 2012). A Brownian bridge is
conceptually comparable to PPA, as it finds uncertainty bounds between
successive telemetry points. Instead of using velocity, though, a
Brownian bridge uses Brownian variance (fairly comparable to variance
in a regression model) and a diffusion coefficient, or random movement,
to estimate uncertainty. To calculate the variance, every other point in a
telemetry path is removed and a line fitting the remaining points is
created. The variance is then a function of how far the removed
observed points are from the new line (Horne et al. 2007).
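The leave-out idea described above can be illustrated with a simplified
sketch. The Python code below is not the likelihood-based estimator of
Horne et al. (2007); it only computes the mean squared distance between
each withheld fix and the position expected from a straight,
constant-speed path between its two neighbors. That quantity grows with
the irregularity of the path, which is the intuition behind the Brownian
variance. The function name and the example tracks are hypothetical.

```python
def leave_out_mean_sq_dev(track):
    """Mean squared deviation of every other fix from its interpolated position.

    `track` is a time-ordered list of (t, x, y) tuples with at least 3 fixes.
    """
    if len(track) < 3:
        raise ValueError("need at least three fixes")
    sq_devs = []
    for k in range(1, len(track) - 1, 2):
        t0, x0, y0 = track[k - 1]
        t1, x1, y1 = track[k]      # withheld fix
        t2, x2, y2 = track[k + 1]
        frac = (t1 - t0) / (t2 - t0)   # share of the interval elapsed at t1
        exp_x = x0 + frac * (x2 - x0)  # expected position on the straight line
        exp_y = y0 + frac * (y2 - y0)
        sq_devs.append((x1 - exp_x) ** 2 + (y1 - exp_y) ** 2)
    return sum(sq_devs) / len(sq_devs)


# Hypothetical tracks: one perfectly straight, one zig-zagging
straight = [(0, 0, 0), (1, 1, 0), (2, 2, 0), (3, 3, 0), (4, 4, 0)]
zigzag = [(0, 0, 0), (1, 1, 1), (2, 2, 0), (3, 3, 1), (4, 4, 0)]

dev_straight = leave_out_mean_sq_dev(straight)  # 0.0
dev_zigzag = leave_out_mean_sq_dev(zigzag)      # 1.0
```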
The biased aspect of BRB includes an advective or drift coefficient to
simulate the general inclination for an animal to travel in the
direction of its next relocation. The advection coefficient allows for
reorientations, which potentially creates more realistic home range
estimates as animals reorient themselves toward preferred areas. In the
BRB calculations, both the diffusive and advective coefficient change
based on the properties of consecutive relocations, so it does not rely
on a global property and is adaptable over time.
Lyons et al. (2013) developed a home range estimator, termed the time
local convex hull (T-LoCoH), which is a temporally parameterized version
of the point-based local convex hull (LoCoH) developed by Getz and
Wilmers (2004). LoCoH calculates convex hulls beginning with each point
and its n-nearest points. The hulls are then unioned iteratively to form
a utilization distribution, where the smallest hulls are indicative of
frequently used areas (Getz et al. 2007). The 95%-isopleth can then be
used as a home range delineator. T-LoCoH incorporates time into LoCoH by
finding neighbors based upon a time-scaled distance (TSD) parameter that
is a combination of spatial proximity and time difference between
points. In contrast to the other time-based home range methods mentioned
above, T-LoCoH does not discretize trajectory segments based upon
temporal differences between sequential steps, nor model movement on
that segmentation. Instead, T-LoCoH uses the TSD metric to characterize
the spatiotemporal relationship between all point pairs (Lyons et al.
2013).
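How a time-scaled distance changes neighbor selection can be sketched
briefly. The Python code below assumes the commonly cited form of TSD, in
which the time difference between two points is converted to a distance
with a scaling factor \(s\) and a maximum velocity \(v_{max}\) and then
added as a third term under the Euclidean square root. The exact form and
the choice of \(s\) should be checked against Lyons et al. (2013); the
fixes and parameter values here are hypothetical.

```python
import math


def tsd(p, q, s, v_max):
    """Time-scaled distance between two fixes given as (t, x, y) tuples.

    Assumed form: sqrt(dx^2 + dy^2 + (s * v_max * dt)^2).
    With s = 0 this reduces to ordinary Euclidean distance.
    """
    (t1, x1, y1), (t2, x2, y2) = p, q
    return math.sqrt(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (s * v_max * (t1 - t2)) ** 2
    )


def nearest_neighbor(i, fixes, s, v_max):
    """Index of the fix closest to fixes[i] under the time-scaled distance."""
    others = [j for j in range(len(fixes)) if j != i]
    return min(others, key=lambda j: tsd(fixes[i], fixes[j], s, v_max))


# Hypothetical fixes: index 1 is close in time, index 2 is close in space
fixes = [(0, 0.0, 0.0), (1, 3.0, 0.0), (100, 1.0, 0.0)]

space_only = nearest_neighbor(0, fixes, s=0.0, v_max=1.0)  # 2
with_time = nearest_neighbor(0, fixes, s=0.1, v_max=1.0)   # 1
```

With \(s = 0\) the spatially closest fix is chosen regardless of when it
was recorded; with \(s > 0\) a fix recorded much later is pushed away,
so hulls are built from points that are close in both space and time.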
Methods
For the home-range-comparison analysis, simulated data were created that
are analogous to GPS-point patterns commonly found in animal telemetry
data. Those patterns include concave, convex, disjoint, linear, and
perforated (Figure 1). Home ranges calculated with 100% of the
simulated GPS-points (n = 2,500), hereafter referred to as the
true-home range, were used to compare the home range accuracy of each
method in response to different sample sizes and GPS-point patterns. A
total of 2,500 points was chosen because that is approximately the
minimum number that could effectively simulate all of the GPS-point
pattern types while still allowing for a subsample at 5% (n = 125), a
reasonable number for a
small GPS telemetry dataset (Downs and Horner 2008, Walter et al. 2015).
In addition, previous comparisons of home range methods used point
patterns with ~2,500 points (Downs
and Horner 2008, Walter et al. 2015). Note that this analysis
represents the impacts of sampling and GPS-point pattern shape on the
tested home range methods, and does not indicate how well the “real”
home range is calculated (Börger et al. 2006).
Each simulation was run 100 times for each of the pattern types for a
total of 500 simulations. From each of the 500 simulations, 5-, 10-,
25-, and 50-percent of the GPS points were subsampled to calculate home
ranges with BRB, KDE, MCP, PPA, TGDE, and T-LoCoH. The subsampled home
ranges were measured for accuracy in comparison to the true-home range
in terms of:
Area: measured as a ratio of the home range area calculated
with the subsampled GPS-points over the true-home range.
Shape: measured with Area Under the Curve (AUC) to assess how
well the subsampled home ranges fit the five GPS-point patterns. Edge
density (ED) was also used to measure how well edge complexity was
maintained. Patches were also counted to measure how accurately each
home range estimator calculated patches in comparison to the true-home
range.
Location: accuracy was measured using Earth Mover Distance
(EMD), also known as the Wasserstein metric, to assess how well home
range location was maintained.
Simulated Data
Correlated random walks
To simulate the different GPS-point patterns, this research used the
correlated random walk (simm.crw) function available in the adehabitatLT
package in R (Calenge, 2016). The simm.crw function calculates the
trajectory of an animal by simulating successive relocations, with the
step direction and step length drawn from two distributions. The direction
of each move is drawn from a wrapped normal distribution (with a
concentration parameter of r), and the length of the step is drawn from a
chi distribution and multiplied by h * sqrt(dt), where h is a scaling
parameter for the step length and dt is the time interval between
successive relocations (Kareiva and Shigesada 1983).
Specifically, a correlated random walk (CRW) takes the form (Turchin,
1998):
\(x_{t+1}=d_{t}\cos\left(\theta_{t}\right)+x_{t}\) (2.1)
\(y_{t+1}=d_{t}\sin\left(\theta_{t}\right)+y_{t}\) (2.2)
Here \(x_{t}\) and \(y_{t}\) are the
coordinates of the CRW at time \(t\), \(d_{t}\) is
the step length at time \(t\), and \(\theta_{t}\) is the
step direction at time \(t\) (Long et al., 2015).
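Equations 2.1 and 2.2 can be turned into a short simulation. The Python
sketch below is not the simm.crw function from adehabitatLT; it is a
minimal, standard-library illustration of the same update rule. Several
details are assumptions made for the example: the turning angle is drawn
from a normal distribution whose standard deviation is derived from the
concentration parameter as \(\sigma=\sqrt{-2\ln r}\) (wrapping is implicit
because the heading only enters through sine and cosine), the chi
distribution is taken to have two degrees of freedom, and the walk starts
at the origin with a random initial heading.

```python
import math
import random


def simulate_crw(n_steps, r=0.8, h=1.0, dt=1.0, seed=None):
    """Simulate a correlated random walk; returns n_steps + 1 (x, y) tuples."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    rng = random.Random(seed)
    sd_turn = math.sqrt(-2.0 * math.log(r))  # assumed mapping from r to sigma
    x, y = 0.0, 0.0
    theta = rng.uniform(-math.pi, math.pi)   # initial heading
    path = [(x, y)]
    for _ in range(n_steps):
        # Correlation: the new heading is the old heading plus a turning angle
        theta += rng.gauss(0.0, sd_turn)
        # Chi-distributed step (2 d.f. assumed), scaled by h * sqrt(dt)
        d = h * math.sqrt(dt) * math.hypot(rng.gauss(0, 1), rng.gauss(0, 1))
        x = d * math.cos(theta) + x          # equation 2.1
        y = d * math.sin(theta) + y          # equation 2.2
        path.append((x, y))
    return path


# 2,499 steps give 2,500 locations, matching the sample size used here
path = simulate_crw(2499, r=0.8, seed=1)
```

Values of `r` close to 1 give small turning angles and therefore nearly
straight paths, while values close to 0 approach an uncorrelated random
walk. Boundary handling, which shapes the convex, disjoint, linear, and
perforated patterns, is not included in this sketch.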
Figure 1: Examples of GPS-point pattern shapes: (a) concave, (b) convex,
(c) disjoint, (d) linear, and (e) perforated.
Using the simm.crw function, this research simulated GPS-point patterns
to represent those commonly seen in wildlife tracking data including
concave, convex, disjoint, linear, and perforated (Downs and Horner
2008) (Figure 1). Concave GPS-point patterns represent telemetry
datasets commonly seen for terrestrial animals moving through open
environments with few boundaries (Figure 1a). To simulate the concave
GPS-pattern, the correlated random walk function was run with no
boundary limitations. For all other patterns, boundaries were created
that limited movement.
To create boundaries for each simulation, the rPoissonCluster function
from R’s spatstat library was used to create random distributions of
geographic points (Gabriel 2017). From those patterns, a procedure
similar to that of Downs and Horner (2008), who compared the effects of
point patterns on MCP and KDE, was used.
For convex patterns, five to seven points were subsampled from the
random Poisson distribution. These points were used to create an MCP
bounding box where the correlated random walk was allowed to run (Figure
1b). Convex patterns exemplify species that are territorial with
well-defined and well-defended boundaries. For disjoint patterns, two or
more bounding boxes were created and simm.crw was allowed to run within
those bounds. To create a disjoint pattern, when simm.crw would try to
step outside of one of the bounding boxes, instead of being directed
toward the inside of the box, as was the case in the other simulations,
the CRW would “fly” to one of the other bounding boxes. This “fly”
simulation created patterns that are typical of avian species, which
have multiple habitat patches with unvisited areas between patches
(Figure 1c). The linear home range pattern epitomizes the space use of
animals that occupy corridors or coastlines (Downs and Horner, 2008). To
create the bounding box for linear patterns, five to seven points were
again drawn from a random Poisson point pattern, but were instead used
to generate a line, which was then buffered to allow simm.crw to create
patterns similar to Figure 1d. To generate perforated patterns, MCPs were
created in the same manner as for the convex bounding, but simm.crw was
only allowed to run outside of the bounding box (Figure 1e). Perforated
GPS-point patterns depict species whose home ranges have unused or
unusable areas, often found when there are large water bodies, steep
terrain, or other physical barriers.
Each of the five GPS-point patterns was simulated 100 times (n = 2,500)
for a total of 500 patterns. For each simulation the true-home range was
calculated with each home range method using 100% of the points. The
true-home range was then used to measure accuracy at lower sampling
levels (5, 10, 25, and 50 percent of the data) (Börger et al.
2006, Downs and Horner 2008, Zurell et al. 2010). From each subsampled
simulation, accuracy metrics were computed against the true-home range
for area (Schuler et al., 2014), shape (Walter et al. 2015), and
location (Kranstauber et al. 2017).
Statistical analysis of the simulated data
Area accuracy
Since home ranges are often used as a geographic unit to infer animal
habitat and resource preferences, appropriately estimating home range
area is ecologically vital as either overestimating or underestimating
the size of a home range can lead to erroneous inferences (Börger et al.
2006). To assess area accuracy of each of the home range methods used in
this research, a ratio of the subsampled home range area over the
true-home range area was used for each simulation:
\(f\left(x;P,p\right)=\frac{p_{i,j}}{P_{i}}\) (2.3)
In equation 2.3, P refers to the true-home range area and p to the
subsampled home range area, i is the home range method (BRB, KDE, MCP,
PPA, TGDE, or T-LoCoH), and j the subsample size (5-, 10-, 25-, or
50-percent). The function f(x;P,p) is the distribution of the ratios,
where x is the ratio of one subsampled home range area over the
true-home range area. For each simulation, a value of 1.0 indicated 100%
accuracy. For the entire distribution, a mean ratio close to 1.0 was
indicative of greater accuracy (Börger et al. 2006). The precision of
the area estimate was represented by the dispersion of the data (Seaman
and Powell, 1996) (Figure 2).
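The area metric in equation 2.3 reduces to a few lines of arithmetic. The
Python sketch below uses made-up areas for three simulations of one
method at one subsample level; the mean of the ratios summarizes accuracy
and their standard deviation is used here as one possible measure of
dispersion (the specific dispersion statistic is an assumption of this
example).

```python
from statistics import mean, stdev


def area_ratios(subsampled_areas, true_areas):
    """Ratio p / P for each simulation (equation 2.3)."""
    return [p / P for p, P in zip(subsampled_areas, true_areas)]


# Hypothetical areas for three simulations (same method, same subsample level)
true_areas = [10.0, 10.0, 10.0]
subsampled_areas = [8.0, 10.0, 12.0]

ratios = area_ratios(subsampled_areas, true_areas)  # [0.8, 1.0, 1.2]
accuracy = mean(ratios)    # close to 1.0 means little bias in area
precision = stdev(ratios)  # smaller values mean more consistent estimates
```

In this toy case the mean ratio is 1.0 even though individual estimates
are off by 20%, which is why accuracy and precision are reported
separately.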
Shape accuracy
Accurately calculating the shape of a home range is also ecologically
important, as shape, like area, can impact the inferences about habitat
and resources made with a home range. For example, even if the home
range area is correct, if the shape of the home range causes the
boundary to stretch into an area that is not used by an animal, the
inferences could be wrong (Figure 2).
Figure 2: Three different shapes with an area of 4 square meters.
To assess shape accuracy, area under the curve (AUC) was used, which is
a metric that has been used to assess home range accuracy in response to
GPS-point pattern shape (Walter et al. 2015). It is worth noting that
AUC can be artificially inflated by increasing the extent to include
unsampled regions with no species occurrences (Cumming and Cornélis
2012). Therefore, GPS-point patterns were simulated on reference grids
with identical grain and extent (Cumming and Cornélis 2012). AUC is a
single-value summary of a receiver operating characteristic (ROC) curve,
which plots the true positive rate on the y-axis against the false
positive rate on the x-axis; AUC is the proportion of the plot area that
lies under the ROC curve (Fielding and Bell 1997). In this
research, a false positive was a grid cell that contained no GPS-points
in the true-home range but did contain GPS-points in a home range
calculated at a lower sampling level. A true positive was a grid cell
that contained GPS-points in both the true-home range and the home range
calculated at a lower sampling level. The AUC outputs an index value
between 0.5 and 1.0, with 0.5 equivalent to chance and 1.0 representing
a perfect agreement between the points and the home range contours
(Stark et al. 2017). AUC was calculated for each of the simulated
GPS-point patterns at 5-, 10-, 25-, and 50-percent levels of the data
using the caTools package in R.
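For readers unfamiliar with the metric, AUC can be computed without
drawing the ROC curve: it equals the probability that a randomly chosen
positive cell receives a higher score than a randomly chosen negative
cell, with ties counted as one half. The Python sketch below implements
that pairwise definition. It is not the caTools routine used in this
research, and the labels and scores are hypothetical; here a label of 1
marks a grid cell used in the true-home range and the score stands for
the value a subsampled estimate assigns to that cell.

```python
def auc_from_scores(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative cell")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical grid cells: 1 = used in the true-home range, 0 = not used
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

auc = auc_from_scores(labels, scores)                              # 8 / 9
perfect = auc_from_scores(labels, [0.9, 0.8, 0.7, 0.3, 0.2, 0.1])  # 1.0
```

The pairwise loop is quadratic in the number of cells, so it is suitable
only for small illustrations; rank-based implementations give the same
value more efficiently.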
In addition to AUC, the accuracy of boundary complexity was also
measured. Edge density (ED) is a measure of boundary complexity,
calculated as the ratio of home range perimeter to area (Hargis et al.
1998). High ratios are characteristic of complex boundaries (Stark
et al. 2017). ED values revealed how well home range estimators maintain
boundary complexity at different sample levels and for different
GPS-point pattern shapes.
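Edge density as defined above, perimeter divided by area, can be computed
directly from polygon vertices. The Python sketch below is a minimal
illustration rather than the software used in this research, and the
three rectangles are hypothetical shapes chosen so that each has an area
of 4 square units.

```python
import math


def perimeter(vertices):
    """Perimeter of a polygon given its ordered vertices."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))


def area(vertices):
    """Area of a simple polygon (shoelace formula)."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def edge_density(vertices):
    """Perimeter-to-area ratio of a polygon."""
    return perimeter(vertices) / area(vertices)


# Three hypothetical rectangles, each with an area of 4 square units
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
rectangle = [(0, 0), (4, 0), (4, 1), (0, 1)]
strip = [(0, 0), (8, 0), (8, 0.5), (0, 0.5)]

ed_square = edge_density(square)        # 8 / 4 = 2.0
ed_rectangle = edge_density(rectangle)  # 10 / 4 = 2.5
ed_strip = edge_density(strip)          # 17 / 4 = 4.25
```

Although the three shapes share the same area, the ratio rises as the
shape becomes more elongated, which is why ED complements the area
ratio.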
The number of patches produced by each estimator was also counted and
compared to the true-home range; patch accuracy was calculated as the
percentage of sub-sampled estimations that produced the same number of
patches as the true-home range. Correctly identifying
patches is ecologically consequential, since different classes of
animals will have varying patch counts based on motion capacity. For
example, most mammals will typically have contiguous home ranges, whereas
many bird species will have non-contiguous home ranges.
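Patch counting and the percentage-correct summary can be sketched on a
rasterized home range. The Python code below is an illustration under
stated assumptions rather than the procedure used in this research: the
home range is represented as a binary grid, cells are considered part of
the same patch when they share an edge (4-connectivity), and the example
grid and patch counts are hypothetical.

```python
def count_patches(grid):
    """Number of 4-connected groups of non-zero cells in a binary grid."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and (r, c) not in seen:
                patches += 1
                # Flood fill from this cell to mark the whole patch
                stack = [(r, c)]
                seen.add((r, c))
                while stack:
                    cr, cc = stack.pop()
                    for nr, nc in ((cr + 1, cc), (cr - 1, cc),
                                   (cr, cc + 1), (cr, cc - 1)):
                        if (0 <= nr < rows and 0 <= nc < cols
                                and grid[nr][nc] and (nr, nc) not in seen):
                            seen.add((nr, nc))
                            stack.append((nr, nc))
    return patches


def percent_correct(subsample_counts, true_count):
    """Percentage of subsampled estimates with the true number of patches."""
    hits = sum(1 for n in subsample_counts if n == true_count)
    return 100.0 * hits / len(subsample_counts)


# Hypothetical rasterized home range with two separate patches
true_grid = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
true_count = count_patches(true_grid)                # 2
accuracy = percent_correct([2, 1, 2, 3], true_count)  # 50.0
```

Using 8-connectivity instead would merge patches that touch only at a
corner, so the connectivity rule should be stated alongside any reported
patch counts.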
Location accuracy
Like shape and area, locational accuracy is paramount if a home range
is going to be used to infer animal habitat and resource choices. Even
if the shape and area were correct, incorrect locations will make such
inferences erroneous. To quantify the accuracy of home range location in
this research, Earth Mover Distance (EMD), also referred to as the
Wasserstein metric (Wasserstein 1969), was used. EMD is a tool commonly
used in image comparison, but was introduced by Kranstauber et al.
(2017) to quantify the accuracy of home range locations. EMD is
advantageous in comparing location because it is spatially explicit and
provides an intuitive quantification of location similarity between two
home range polygons (Kranstauber et al. 2017).
For two exactly matching home ranges, EMD will equal zero. Higher EMD
values indicate higher locational difference between two polygons. The
transport package in R was used to calculate EMD values for the
subsampled home range compared to the true-home range (Schuhmacher
et al. 2017). In Figure 3, an example of different shape locations is
shown. The orange polygons represent the true-home range, and the green
polygons have lower EMD values than the purple polygons when transformed
to the orange polygon. In Figure 3a, each polygon is the same shape, but the
purple polygon needs to be flipped and relocated to match the orange
polygon, whereas the green polygon only needs to be moved. In Figure 3b,
the purple polygon would have to be moved farther than the green polygon
to match the orange polygon.
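The behavior of EMD can be illustrated in one dimension, where it has a
simple closed form: for two distributions on the same evenly spaced bins,
EMD is the sum of the absolute differences between their cumulative
distributions multiplied by the bin width. The Python sketch below uses
this one-dimensional simplification only to show how the metric responds
to displacement; it is not the two-dimensional calculation performed with
the transport package, and the three histograms are hypothetical.

```python
from itertools import accumulate


def emd_1d(a, b, bin_width=1.0):
    """Earth Mover Distance between two histograms on the same bins.

    Both histograms are normalized to unit mass before comparison.
    """
    if len(a) != len(b):
        raise ValueError("histograms must share the same bins")
    total_a, total_b = sum(a), sum(b)
    pa = [v / total_a for v in a]
    pb = [v / total_b for v in b]
    cdf_gap = sum(abs(ca - cb)
                  for ca, cb in zip(accumulate(pa), accumulate(pb)))
    return bin_width * cdf_gap


# Hypothetical space-use profiles along a transect
true_use = [0, 2, 2, 0, 0, 0]
near_use = [0, 0, 2, 2, 0, 0]  # same shape, shifted by one bin
far_use = [0, 0, 0, 0, 2, 2]   # same shape, shifted by three bins

emd_same = emd_1d(true_use, true_use)  # 0.0
emd_near = emd_1d(true_use, near_use)  # 1.0
emd_far = emd_1d(true_use, far_use)    # 3.0
```

The three profiles have identical shape and total mass, so the metric
responds only to how far the mass must be moved, which mirrors the
comparison between the green and purple polygons described above.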
Figure 3: Example of transforming
polygon locations as measured by EMD. The orange polygons are the
true-home range. The green polygons have a lower EMD value and the purple
polygons have a higher value. (a) The green polygon only has to be moved
to match the true-home range, whereas the purple polygon has to be
flipped and moved. (b) The
purple polygon would have to move farther than the green polygon to match
the location of the orange polygon.