University of Texas, Austin, 305 E. 23rd St., Austin, TX 78712, USA
Jennifer A. Miller, University of Texas, Austin, 305 E. 23rd St., Austin, TX 78712, USA (current affiliation: Department of Geography and the Environment, University of Texas, Austin, 305 E. 23rd St., Austin, TX 78712, USA)
ABSTRACT Home range estimates have been a key geographic unit for understanding the link between animals and their habitat and resource choices since the term was first described by Burt (1943) and formally quantified by Mohr (1947), who introduced minimum convex polygons (MCP) as a method to delineate individual home ranges. Numerous methods have subsequently been developed to estimate home ranges. However, depending on the method used, widely different estimates can be obtained from the same animal location dataset. In a heterogeneous landscape, different home range delineations lead to different inferences about animal resource and habitat preferences, which can in turn affect wildlife management. In this research, time-based home range methods that account for autocorrelation in animal movement were evaluated for accuracy in terms of area, shape, and location in response to sample size and common wildlife GPS-point patterns. These characteristics of home range estimation are important for inferring animal habitat and resource use. Despite the improved accuracy of time-based methods compared to traditional point-based methods like MCP, location was often inaccurate for all GPS-point patterns, as were shape and area for GPS-point patterns with perforations (common for areas with large physical barriers like mountains or lakes). These findings are important to wildlife managers using time-based home range methods for analysis.
KEY WORDS autocorrelation, GPS-point pattern, home range, movement ecology, and animal-space use.
The home range is the area where an animal confines most of its movement (Burt, 1943) and is a key spatial unit for understanding the relationship between animals and their environment. Home ranges are environmentally heterogeneous, and animals use them unevenly. The familiarity that animals have with their home range allows them to find resources, routes to avoid predators, and shared areas to find potential mates (Powell and Mitchell 2012). Wildlife ecologists use home ranges to analyze animal space use, such as habitat choice, territorial overlap, and resource use (Long and Nelson, 2015).
The delineation of home ranges falls broadly into two methodological approaches: either direct delineation of the home range (e.g., via minimum convex polygons (MCP)) or use of the 95% contour of a utilization distribution (UD), which measures the intensity of animal space use (e.g., via kernel density estimation (KDE)). Though these two approaches calculate animal space use in fundamentally different ways, home range delineators and 95% UD contours are put to functionally similar analytical uses, such as assessing habitat preferences (Hickman et al. 2016), responses to natural disaster (Jordan et al. 2019), or impacts of anthropogenic disturbance (Knüsel et al. 2019).
Historically, MCP and KDE have been the primary methods used for these respective approaches (Downs 2010). Both MCP and KDE estimate animal space use from a point-based perspective and do not consider that movement is a temporally autocorrelated process (Noonan et al. 2019). The lack of consideration for temporal autocorrelation in point-based home range methods (Cushman 2010) leads to underestimated home range size (Swihart and Slade 1985, Downs and Horner 2008), biased predictions of habitat selection (Litvaitis 1994, Van Moorter et al. 2016), and miscategorized habitat and resource use (Harris et al. 1990).
To reduce issues related to autocorrelation, a strategy that is often applied is to filter points so the data are statistically independent (Welch et al. 2016). However, filtering autocorrelated data does not necessarily ensure independence; it may only reduce the ability to detect it (Fortin and Dale 2005). Filtering data can also lead to the potential loss of information (i.e., reducing sample size), which itself can produce poor home range area-estimates (Rooney et al. 1998, Downs and Horner 2008). Also, the autocorrelation of tracking data is not necessarily a negative property, as it can indicate an animal’s preferred habitat or resource choices (Boyce et al. 2010). Autocorrelation can also reveal the temporal limitations of animal movements (Long and Nelson, 2015), and be used to examine the relative frequency with which animals use areas within their home range (Benhamou 2011).
To exploit autocorrelation as a data property rather than a statistical problem, several home range estimation methods were recently developed that utilize the temporal aspects of tracking data. Walter et al. (2015) found that the shape produced by time-based home range estimators more accurately fit the GPS-point patterns of tracking data than shapes produced from methods that do not include a time component. While shape is a major aspect of accurately estimating home ranges, so are area and location. Accurately estimating each of those three components (area, shape, and location) is vital, if home range estimates are to be used to infer animal habitat and resource choices.
The research presented here offers an analysis of point-based and time-based home range estimators to understand how the accuracy of area, shape, and location of different methods is preserved in response to different GPS-point pattern shapes and sample sizes. The shape of GPS-point patterns has a tremendous impact on the accuracy of home range estimation (Downs and Horner 2008), and telemetry data will exhibit very different patterns depending on individual preferences, seasonality, or animal motion capacity (e.g., terrestrial or avian animals). Similarly, the sample size of the data has a well-established influence on home range estimation accuracy (Boulanger and White 1990, Seaman and Powell 1996, Börger et al. 2006, Downs and Horner 2008).
By comparing home range estimation methods at different sample sizes and for different GPS point-pattern shapes, this research clarifies which methods are most appropriate for different data and for different ecological questions (e.g., whether some methods are more applicable for species with patchy home ranges, like birds) (Table 1). The research presented here evaluated the point-based home range techniques MCP and KDE, along with path-based home range methods that utilize the temporal aspect of telemetry data. These time-based techniques included biased random bridge (BRB) (Benhamou 2011), potential path area (PPA) (Long and Nelson 2015), the time-geographic density estimation (TGDE) for moving point objects (Downs 2010), and Time Local Convex Hull (T-LoCoH) (Lyons et al. 2013).
Burt (1943, p. 351) differentiated an animal’s home range from its territory, describing the home range as the area near a “home site” where an animal will mate, search for food, and care for offspring. Territory, on the other hand, is the area within the home range that an animal defends. More specifically, the home range has been defined as the area where an animal spends 95% of its time and utilizes 50% of its space (Garrott and White 1990).
The most common classes of home range estimators are variations of MCP and KDE (Signer et al. 2015). MCP home ranges are determined by enclosing a set of geographic points with the smallest possible convex polygon or hull, which is a polygon where all interior angles are less than 180 degrees. While still widely used, MCP are sensitive to point geometry, have biases in range estimation, and cannot differentiate internal space (Downs and Horner 2008).
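The MCP construction described above can be sketched in a few lines of Python. This is only an illustration, not the implementation used in this research: it builds the convex hull with Andrew's monotone chain algorithm and computes its area with the shoelace formula. The function names and the sample relocations are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would create a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


# Hypothetical relocations: four corner fixes plus three interior fixes.
relocations = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2), (3, 2)]
mcp = convex_hull(relocations)
mcp_area = polygon_area(mcp)
```

Because the hull depends only on the outermost fixes, the three interior relocations do not affect the result, which illustrates why MCP cannot differentiate internal space.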
KDE was developed as a method that could be used to differentiate internal spaces by calculating an animal’s utilization distribution (UD) (Lyons et al. 2013). UD are two dimensional (x,y) frequency distributions of animal occurrences that identify how much an animal uses different locations within its home range (Winkle 1975). KDE uses a non-parametric probability function called a kernel to estimate the likelihood that an individual uses neighboring cells. The kernel function, centered at each data point (x), weighs the contribution of data point xi to x based on the distance between the two. The shape of the kernel function and the bandwidth determine the extent of that contribution (Silverman 1986). Depending on bandwidth size, KDE can estimate dramatically different home range shapes and areas within the same dataset (Gitzen and Millspaugh 2003).
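To make the role of the bandwidth concrete, the following Python sketch evaluates a fixed-bandwidth kernel density estimate at a single location. It assumes a bivariate Gaussian kernel, which is only one of several possible kernel shapes; the function name, the fixes, and the bandwidth values are hypothetical.

```python
import math


def gaussian_kde_2d(location, points, bandwidth):
    """Kernel density estimate at `location` from 2D `points`.

    Assumes a bivariate standard normal kernel and a single fixed
    bandwidth h: f(x) = 1 / (n h^2) * sum K((x - x_i) / h).
    """
    x, y = location
    total = 0.0
    for px, py in points:
        # Squared distance between x and x_i, scaled by the bandwidth.
        u2 = ((x - px) ** 2 + (y - py) ** 2) / bandwidth ** 2
        total += math.exp(-0.5 * u2) / (2.0 * math.pi)
    return total / (len(points) * bandwidth ** 2)


# Hypothetical fixes clustered near the origin, evaluated far from them.
fixes = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0)]
far_location = (4.0, 4.0)
narrow = gaussian_kde_2d(far_location, fixes, bandwidth=0.5)
wide = gaussian_kde_2d(far_location, fixes, bandwidth=2.0)
```

With the wider bandwidth, the distant location receives noticeably more density, so an isopleth drawn from that surface would extend farther from the fixes. This is the bandwidth sensitivity noted above.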
It is typical to use the 95%-isopleth of utilization distribution, such as KDE, to delineate an animal’s home range (Downs and Horner, 2008), which has been considered a more reliable means of estimating animal home ranges than MCP (Kernohan et al. 2001). However, comparisons of MCP and KDE home range delineations were often done with simulated data that had known statistical distributions, such as bivariate normal mixtures, that do not represent the full range of space-use patterns of wildlife species (Worton 1989, Boulanger and White 1990, Seaman and Powell 1996, Seaman et al. 1999, Gitzen and Millspaugh 2003). A more recent comparison of MCP and KDE, with GPS-point pattern shapes that more closely resemble wildlife movement patterns with unknown statistical distributions, found that KDE also tends to overestimate home range area, when subsampled from a point-sample with a greater number of points (Downs and Horner, 2008).
Cushman (2010) hypothesized that much of the variability of MCP and KDE occurs because they treat animal locations as independent (i.e., they ignore autocorrelation). Recently, several researchers have exploited the autocorrelation of tracking data to parameterize time in home range estimations. For example, in place of the traditional kernel used by KDE, the time-geography density estimation (TGDE) uses an animal’s maximum velocity to produce a geo-ellipse to estimate the uncertainty between consecutive sample locations (Downs 2010). One advantage of TGDE is that it replaces a subjective parameter (kernel bandwidth size) with an objective factor based on an animal’s maximum travel velocity. In addition, overestimation of home ranges is limited with TGDE because velocity constrains the space-time path. Similarly, the potential path area (PPA) (Long and Nelson, 2015) outlines where an individual could have been present by utilizing 2D space-time prisms (Hägerstrand 2005). PPA estimates are based on all trajectory points and an animal’s maximum velocity, thereby considering the confines of an animal’s movement, which potentially limits home range overestimation.
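The velocity constraint shared by TGDE and PPA can be illustrated with a short Python sketch. A location is reachable between two consecutive fixes only if the distance from the first fix to the location plus the distance from the location to the second fix does not exceed the maximum velocity multiplied by the elapsed time; the set of such locations is an ellipse with the two fixes as foci. The function name and the example values are hypothetical, and the sketch does not reproduce either published method in full.

```python
import math


def in_potential_path_area(location, fix_a, fix_b, v_max):
    """True if `location` is reachable between two consecutive fixes.

    Each fix is (x, y, t). The travel budget is v_max * (t_b - t_a).
    """
    (xa, ya, ta), (xb, yb, tb) = fix_a, fix_b
    budget = v_max * (tb - ta)
    travel = (math.hypot(location[0] - xa, location[1] - ya)
              + math.hypot(xb - location[0], yb - location[1]))
    return travel <= budget


# Hypothetical fixes 6 distance units apart and 10 time units apart.
fix_a = (0.0, 0.0, 0.0)
fix_b = (6.0, 0.0, 10.0)
reachable = in_potential_path_area((3.0, 3.0), fix_a, fix_b, v_max=1.0)
unreachable = in_potential_path_area((3.0, 5.0), fix_a, fix_b, v_max=1.0)
```

A PPA-style home range would then be the union of these ellipses over all consecutive pairs of fixes.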
Benhamou (2011) developed BRB to estimate animal space use with a combination of a movement kernel density estimator (MKDE) (Benhamou and Cornélis 2010) and a biased-Brownian bridge. The MKDE measures recurrence intensity for different areas of a home range based on the time spent in an area and the number of times the animal returns to that same area (Benhamou and Riotte-Lambert 2012). A Brownian bridge is conceptually comparable to PPA as it finds uncertainty bounds between successive telemetry points. Instead of using velocity, though, a Brownian bridge uses Brownian variance (fairly comparable to variance in a regression model) and a diffusion coefficient, or random movement, to estimate uncertainty. To calculate variance, every other point in a telemetry path is removed and a line fitting the remaining points is created. The variance is then a function of how far the removed observed points are from the new line (Horne et al. 2007).
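The leave-one-out idea behind the Brownian variance can be sketched as follows. This Python example is a simplification: it assumes zero telemetry location error, in which case the maximum-likelihood estimate has a closed form, and it is not the estimator as implemented in any particular package. The function name and the tracks are hypothetical.

```python
def brownian_motion_variance(track):
    """Estimate Brownian motion variance from a track of (x, y, t) fixes.

    Every other fix is withheld and compared with its expected position
    on the straight line between its two neighbours. Assumes zero
    location error and strictly increasing times.
    """
    terms = []
    for i in range(1, len(track) - 1, 2):
        (x0, y0, t0), (x1, y1, t1), (x2, y2, t2) = track[i - 1:i + 2]
        total_time = t2 - t0
        alpha = (t1 - t0) / total_time  # fraction of the interval elapsed
        expected_x = x0 + alpha * (x2 - x0)
        expected_y = y0 + alpha * (y2 - y0)
        sq_dev = (x1 - expected_x) ** 2 + (y1 - expected_y) ** 2
        # Bridge variance at the withheld fix is T * alpha * (1 - alpha) * s.
        terms.append(sq_dev / (total_time * alpha * (1.0 - alpha)))
    if not terms:
        raise ValueError("at least three fixes are required")
    # Factor of 2 because deviations are measured in two dimensions.
    return sum(terms) / (2.0 * len(terms))


straight_track = [(0, 0, 0), (1, 0, 1), (2, 0, 2), (3, 0, 3), (4, 0, 4)]
zigzag_track = [(0, 0, 0), (1, 1, 1), (2, 0, 2)]
straight_variance = brownian_motion_variance(straight_track)
zigzag_variance = brownian_motion_variance(zigzag_track)
```

A perfectly straight, constant-speed track yields a variance of zero, while any deviation of the withheld fixes from the connecting line increases the estimate.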
The biased aspect of BRB includes an advective or drift coefficient to simulate the general inclination for an animal to travel in the direction of its next relocation. The advection coefficient allows for reorientations, which potentially creates more realistic home range estimates as animals reorient themselves toward preferred areas. In the BRB calculations, both the diffusive and advective coefficient change based on the properties of consecutive relocations, so it does not rely on a global property and is adaptable over time.
Lyons et al. (2013) developed a home range estimator, termed the time local convex hull (T-LoCoH) that is a temporally parameterized version of the point-based local convex hull (LoCoH) developed by Getz and Wilmers (2004). LoCoH calculates convex hulls beginning with each point and its n-nearest points. The hulls are then unioned iteratively to form a utilization distribution, where the smallest hulls are indicative of frequently used areas (Getz et al. 2007). The 95%-isopleth can then be used as a home range delineator. T-LoCoH incorporates time into LoCoH by finding neighbors based upon a time-scaled distance (TSD) parameter that is a combination of spatial proximity and time difference between points. In contrast to the other time-based home range methods mentioned above, T-LoCoH does not discretize trajectory segments based upon temporal differences between sequential steps, nor model movement on that segmentation. Instead, T-LoCoH uses the TSD metric to characterize the spatiotemporal relationship between all point pairs (Lyons et al. 2013).
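A time-scaled distance of this kind can be sketched in Python as below. The text above does not give the formula, so the form used here (a Euclidean distance with a third term that converts the time difference into a distance using a maximum velocity and a scaling factor) should be read as an assumption, as should the parameter names `s` and `v_max`, the function names, and the example fixes.

```python
import math


def time_scaled_distance(p, q, s, v_max):
    """Distance between fixes p and q, each given as (x, y, t).

    Assumed form: sqrt(dx^2 + dy^2 + (s * v_max * dt)^2). With s = 0
    this reduces to ordinary Euclidean distance.
    """
    (xp, yp, tp), (xq, yq, tq) = p, q
    return math.sqrt((xp - xq) ** 2 + (yp - yq) ** 2
                     + (s * v_max * (tp - tq)) ** 2)


def nearest_neighbors(index, fixes, k, s, v_max):
    """Indices of the k fixes closest to fixes[index] by time-scaled distance."""
    others = [j for j in range(len(fixes)) if j != index]
    others.sort(key=lambda j: time_scaled_distance(fixes[index], fixes[j], s, v_max))
    return others[:k]


# Hypothetical fixes: index 1 is close in space but far in time,
# index 2 is farther in space but close in time.
fixes = [(0.0, 0.0, 0.0), (1.0, 0.0, 100.0), (5.0, 0.0, 1.0)]
space_only = nearest_neighbors(0, fixes, k=1, s=0.0, v_max=1.0)
space_time = nearest_neighbors(0, fixes, k=1, s=1.0, v_max=1.0)
```

Increasing `s` shifts neighbor selection from purely spatial proximity toward fixes that are also close in time, which changes which local hulls are built.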
Methods
For the home-range-comparison analysis, simulated data were created that are analogous to GPS-point patterns commonly found in animal telemetry data. Those patterns include concave, convex, disjoint, linear, and perforated (Figure 1). Home ranges calculated with 100% of the simulated GPS-points (n = 2,500), hereafter referred to as the true-home range, were used to compare the home range accuracy of each method in response to different sample sizes and GPS-point patterns. A total of 2,500 points was chosen because that is the approximate minimum that could effectively simulate all of the GPS-point pattern types while allowing a subsample at 5% (n = 125), a reasonable number for a small GPS telemetry dataset (Downs and Horner 2008, Walter et al. 2015). In addition, previous comparisons of home range methods used point patterns with ~2,500 points (Downs and Horner, 2008; Walter et al., 2015). Note that this analysis represents the impacts of sampling and GPS-point pattern shape on the tested home range methods, and does not indicate how well the “real” home range is calculated (Börger et al. 2006).
Each simulation was run 100 times for each of the pattern types for a total of 500 simulations. From each of the 500 simulations, 5, 10, 25, and 50 percent of the GPS points were subsampled to calculate home ranges with BRB, KDE, MCP, PPA, TGDE, and T-LoCoH. The subsampled home ranges were measured for accuracy in comparison to the true-home range in terms of:
Area: measured as a ratio of the home range area calculated with the subsampled GPS-points over the true-home range.
Shape: measured with Area Under the Curve (AUC) to assess how well the subsampled home ranges fit the five GPS-point patterns. Edge density (ED) was also used to measure how well edge complexity was maintained. Patches were also counted to measure how accurately each home range estimator calculated patches in comparison to the true-home range.
Location: accuracy was measured using Earth Mover Distance (EMD), also known as the Wasserstein metric, to assess how well home range location was maintained.
Simulated Data
Correlated random walks
To simulate the different GPS-point patterns, this research used the correlated random walk function (simm.crw) available in the adehabitatLT package in R (Calenge, 2016). The simm.crw function calculates the trajectory of an animal by simulating successive relocations, with step direction and step length drawn from two distributions. The direction of each move is drawn from a wrapped normal distribution (with a concentration parameter of r), and the length of the step is drawn from a chi distribution and multiplied by h * sqrt(dt), where h is a scaling parameter for the step length and dt is the time interval between successive relocations (Kareiva and Shigesada 1983).
Specifically, a correlated random walk (CRW) takes the form (Turchin, 1998):
\(x_{t+1}=d_{t}\cos\left(\theta_{t}\right)+x_{t}\) (2.1)
\(y_{t+1}=d_{t}\sin\left(\theta_{t}\right)+y_{t}\) (2.2)
Here \(x_{t}\) and \(y_{t}\) are the coordinates of the CRW at time \(t\), \(d_{t}\) is the step length at time \(t\), and \(\theta_{t}\) is the step direction at time \(t\) (Long et al., 2015).
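Equations 2.1 and 2.2 can be illustrated with a short Python sketch. This is an independent approximation, not the adehabitatLT code: it assumes that each turning angle is drawn from a wrapped normal distribution whose concentration `r` is the mean resultant length (so the underlying normal has standard deviation sqrt(-2 ln r)), and that the step length follows a chi distribution with two degrees of freedom scaled by h * sqrt(dt). The function name, defaults, and seed are hypothetical.

```python
import math
import random


def simulate_crw(n_steps, r=0.8, h=1.0, dt=1.0, start=(0.0, 0.0), seed=None):
    """Simulate a correlated random walk; returns a list of (x, y) fixes.

    r must lie in [0, 1]: r = 0 gives uniformly random turns and
    r = 1 gives a straight path.
    """
    rng = random.Random(seed)
    x, y = start
    heading = rng.uniform(-math.pi, math.pi)
    path = [(x, y)]
    for _ in range(n_steps):
        if r <= 0.0:
            turn = rng.uniform(-math.pi, math.pi)
        else:
            sigma = math.sqrt(-2.0 * math.log(r))
            turn = rng.gauss(0.0, sigma)
        # Wrap the new heading onto (-pi, pi].
        heading = math.atan2(math.sin(heading + turn), math.cos(heading + turn))
        # Chi-distributed length with 2 degrees of freedom, then scaled.
        step = h * math.sqrt(dt) * math.hypot(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        x += step * math.cos(heading)  # equation 2.1
        y += step * math.sin(heading)  # equation 2.2
        path.append((x, y))
    return path


walk = simulate_crw(2500, r=0.8, seed=42)
```

The unbounded walk corresponds to the simplest case; confining the walk to a region would additionally require rejecting or redirecting steps that leave it.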
Figure 1: Examples of GPS-point pattern shapes: (a) concave, (b) convex, (c) disjoint, (d) linear, and (e) perforated.
Using the simm.crw function, this research simulated GPS-point patterns to represent those commonly seen in wildlife tracking data including concave, convex, disjoint, linear, and perforated (Downs and Horner 2008) (Figure 1). Concave GPS-point patterns represent telemetry datasets commonly seen for terrestrial animals moving through open environments with few boundaries (Figure 1a). To simulate the concave GPS-pattern, the correlated random walk function was run with no boundary limitations. For all other patterns, boundaries were created that limited movement.
To create boundaries for each simulation, the rPoissonCluster function from R’s spatstat library was used to create random distributions of geographic points (Gabriel 2017). From those patterns, a procedure similar to that of Downs and Horner (2008), who compared the effects of point patterns on MCP and KDE, was used.
For convex patterns, five to seven points were subsampled from the random Poisson distribution. These points were used to create an MCP bounding box within which the correlated random walk was allowed to run (Figure 1b). Convex patterns exemplify species that are territorial, with well-defined and well-defended boundaries.
For disjoint patterns, two or more bounding boxes were created and simm.crw was allowed to run within those bounds. When simm.crw tried to step outside one of the bounding boxes, instead of being directed back toward the inside of the box, as in the other simulations, the CRW would “fly” to one of the other bounding boxes. This “fly” simulation created patterns typical of avian species, which have multiple habitat patches with unvisited areas between them (Figure 1c).
The linear home range pattern epitomizes the space use of animals that occupy corridors or coastlines (Downs and Horner, 2008). To create the bounding box for linear patterns, five to seven points were again drawn from a random Poisson point pattern, but were instead used to generate a line, which was then buffered to allow simm.crw to create patterns similar to Figure 1d.
To generate perforated patterns, MCP were created in the same manner as for the convex bounding box, but simm.crw was only allowed to run outside of the bounding box (Figure 1e). Perforated GPS-point patterns depict species whose home ranges have unused or unusable areas, often found where there are large water bodies, steep terrain, or other physical barriers.
Each of the five GPS-point patterns was simulated 100 times (n = 2,500) for a total of 500 patterns. For each simulation the true-home range was calculated with each home range method using 100% of the points. The true-home range was then used to measure accuracy at lower sampling levels of 5, 10, 25, and 50 percent of the data (Börger et al. 2006, Downs and Horner 2008, Zurell et al. 2010). From each subsampled simulation, accuracy metrics were computed against the true-home range for area (Schuler et al., 2014), shape (Walter et al. 2015), and location (Kranstauber et al. 2017).
Statistical analysis of the simulated data
Area accuracy
Since home ranges are often used as a geographic unit to infer animal habitat and resource preferences, appropriately estimating home range area is ecologically vital as either overestimating or underestimating the size of a home range can lead to erroneous inferences (Börger et al. 2006). To assess area accuracy of each of the home range methods used in this research, a ratio of the subsampled home range area over the true-home range area was used for each simulation:
\(f\left(x;P,p\right)=\frac{p_{i,j}}{P_{i}}\) (2.3)
In equation 2.3, P refers to the true-home range area and p to the subsampled home range area, i is the home range method (BRB, KDE, MCP, PPA, TGDE, or T-LoCoH), and j the subsample size (5, 10, 25, or 50 percent). The function f(x;P,p) is the distribution of the ratios, where x is the ratio of one subsampled home range area over the true-home range area. For each simulation, a value of 1.0 indicated 100% accuracy. For the entire distribution, a mean ratio close to 1.0 was indicative of greater accuracy (Börger et al. 2006). The precision of the area estimate was represented by the dispersion of the data (Seaman and Powell, 1996) (Figure 2).
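The area ratio of equation 2.3 and its summary can be sketched in Python as follows. The areas below are made-up numbers for illustration only, not results from this research, and the use of the sample standard deviation as the measure of dispersion is an assumption. The function names are hypothetical.

```python
from statistics import mean, stdev


def area_ratios(subsample_areas, true_areas):
    """Ratio p / P for each simulation (equation 2.3)."""
    return [p / P for p, P in zip(subsample_areas, true_areas)]


def summarize_ratios(ratios):
    """Accuracy as the mean ratio, precision as the dispersion (here, SD)."""
    return {"mean_ratio": mean(ratios), "dispersion": stdev(ratios)}


# Made-up areas for one method at one subsample level.
true_areas = [10.0, 10.0, 20.0, 20.0]
subsample_areas = [8.0, 12.0, 18.0, 22.0]
ratios = area_ratios(subsample_areas, true_areas)
summary = summarize_ratios(ratios)
```

In this made-up case the mean ratio is 1.0, yet individual simulations under- and overestimate the area, which is why dispersion is reported alongside the mean.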
Shape accuracy
Accurately calculating the shape of a home range is also ecologically important, as shape, like area, can impact the inferences about habitat and resources made with a home range. For example, even if the home range area is correct, if the shape of the home range causes the boundary to stretch into an area that is not used by an animal, the inferences could be wrong (Figure 2).
Figure 2: Three different shapes, each with an area of 4 square meters.
To assess shape accuracy, area under the curve (AUC) was used, a metric that has been used to assess home range accuracy in response to GPS-point pattern shape (Walter et al., 2015). It is worth noting that AUC can be artificially inflated by increasing the extent to include unsampled regions with no species occurrences (Cumming and Cornélis 2012). Therefore, GPS-point patterns were simulated on reference grids with identical grain and extent (Cumming and Cornélis, 2012). AUC summarizes in a single value the receiver operating characteristic (ROC) curve, which plots true positives on the y-axis against false positives on the x-axis; AUC is the proportion of area under the ROC curve (Fielding and Bell 1997). In this research, a false positive was a grid cell that contained no GPS-points in the true-home range but did contain GPS-points in a home range at a lower sampling level. A true positive was a grid cell that contained GPS-points both in the true-home range and in the home range calculated at a lower sampling level. AUC yields an index value between 0.5 and 1.0, with 0.5 equivalent to chance and 1.0 representing perfect agreement between the points and the home range contours (Stark et al. 2017). AUC was calculated for each of the simulated GPS-point patterns at 5, 10, 25, and 50 percent levels of the data using the caTools package in R.
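As an illustration of how AUC relates to grid cells, the Python sketch below uses the rank-based definition: the probability that a randomly chosen cell inside the reference range receives a higher score than a randomly chosen cell outside it, with ties counted as one half. It is not the caTools implementation, and the function name, scores, and labels are hypothetical.

```python
def auc(scores, labels):
    """Rank-based AUC for per-cell scores against binary reference labels.

    scores: value of the subsampled estimate in each grid cell.
    labels: 1 if the cell is occupied in the reference range, else 0.
    """
    positives = [s for s, lab in zip(scores, labels) if lab]
    negatives = [s for s, lab in zip(scores, labels) if not lab]
    if not positives or not negatives:
        raise ValueError("both occupied and unoccupied cells are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical four-cell grid: two occupied cells, two unoccupied cells.
reference = [1, 1, 0, 0]
perfect = auc([0.9, 0.8, 0.3, 0.1], reference)
partial = auc([0.9, 0.2, 0.8, 0.1], reference)
uninformative = auc([0.5, 0.5, 0.5, 0.5], reference)
```

Because every cell on the reference grid enters the calculation, adding empty cells by enlarging the extent changes the set of negatives, which is why identical grain and extent matter.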
In addition to AUC, the accuracy of boundary complexity was also measured. Edge density (ED) is a measure of boundary complexity, calculated as the ratio of home range perimeter to area (Hargis et al. 1998). High ratios are characteristic of complex boundaries (Stark et al. 2017). ED values revealed how well home range estimators maintain boundary complexity at different sample levels and for different GPS-point pattern shapes.
The number of patches produced by each estimator was also compared with that of the true-home range, calculated as the percentage of sub-sampled estimations that produced the same number of patches as the true-home range. Correctly identifying patches is ecologically consequential, since different classes of animals will have varying patch counts based on motion capacity. For example, most mammals will typically have contiguous home ranges, whereas many bird species will have non-contiguous home ranges.
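Edge density as described here reduces to a perimeter-to-area ratio, which the following Python sketch computes for simple polygons. The function names and the two example shapes are hypothetical; both shapes have the same area, so only their boundaries differ.

```python
import math


def polygon_perimeter(vertices):
    """Sum of edge lengths of a closed polygon."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def edge_density(vertices):
    """Perimeter divided by area; larger values mean more edge per unit area."""
    return polygon_perimeter(vertices) / polygon_area(vertices)


square = [(0, 0), (2, 0), (2, 2), (0, 2)]     # area 4, perimeter 8
rectangle = [(0, 0), (4, 0), (4, 1), (0, 1)]  # area 4, perimeter 10
ed_square = edge_density(square)
ed_rectangle = edge_density(rectangle)
```

The elongated rectangle has the higher edge density even though the two areas are equal, which shows how ED captures an aspect of shape that the area ratio cannot.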
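Patch counting and the resulting percentage can be sketched in Python as below. The sketch assumes the home range has been rasterized to a binary grid and that cells sharing an edge (4-neighbor rule) belong to the same patch; both are assumptions for illustration, as are the function names and example values.

```python
def count_patches(grid):
    """Count connected groups of occupied cells (4-neighbor rule)."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and (r, c) not in seen:
                patches += 1
                seen.add((r, c))
                stack = [(r, c)]
                # Flood fill to mark every cell of this patch as seen.
                while stack:
                    cr, cc = stack.pop()
                    for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                        if (0 <= nr < rows and 0 <= nc < cols
                                and grid[nr][nc] and (nr, nc) not in seen):
                            seen.add((nr, nc))
                            stack.append((nr, nc))
    return patches


def patch_accuracy(true_count, subsample_counts):
    """Percentage of subsampled estimates matching the true patch count."""
    matches = sum(1 for count in subsample_counts if count == true_count)
    return 100.0 * matches / len(subsample_counts)


# Hypothetical rasterized home range with two separate patches.
home_range_grid = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
n_patches = count_patches(home_range_grid)
accuracy = patch_accuracy(n_patches, [2, 1, 2, 3])
```

Using an 8-neighbor rule instead would merge diagonally touching cells into one patch, so the neighborhood definition should be fixed before comparing estimators.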
Location accuracy
Like shape and area, locational accuracy is paramount if a home range is going to be used to infer animal habitat and resource choices. Even if the shape and area were correct, incorrect locations will make such inferences erroneous. To quantify the accuracy of home range location in this research, Earth Mover Distance (EMD), also referred to as the Wasserstein metric (Wasserstein 1969), was used. EMD is a tool commonly used in image comparison, but was introduced by Kranstauber et al. (2017) to quantify the accuracy of home range locations. EMD is advantageous in comparing location because it is spatially explicit and provides an intuitive quantification of location similarity between two home range polygons (Kranstauber et al. 2017).
For two exactly matching home ranges, EMD will equal zero. Higher EMD values indicate greater locational difference between two polygons. The transport package in R was used to calculate EMD values for the subsampled home range compared to the true-home range (Schuhmacher et al. 2017). Figure 3 shows an example of different shape locations. The orange polygons represent the true-home range; the green polygons have lower EMD values than the purple polygons when transformed to the orange polygon. In Figure 3a, each polygon is the same shape, but the purple polygon needs to be flipped and relocated to match the orange polygon, whereas the green polygon only needs to be moved. In Figure 3b, the purple polygon would have to be moved farther than the green polygon to match the orange polygon.
Figure 3: Example of transforming polygon locations as measured by EMD. The orange polygons are the true-home range. The green polygons have a lower EMD value and the purple polygons have a higher value. (a) The green polygon only has to be moved to match the true-home range; the purple polygon has to be flipped and moved. (b) The purple polygon would have to move farther than the green polygon to match the location of the orange polygon.
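The behavior of EMD can be illustrated with a one-dimensional simplification in Python. Home ranges are two-dimensional, and the transport package solves the corresponding two-dimensional problem; the sketch below only shows the principle for two histograms on the same bins, where EMD equals the accumulated absolute difference of the cumulative distributions. The function name and the histograms are hypothetical.

```python
def emd_1d(mass_a, mass_b, bin_width=1.0):
    """Earth Mover Distance between two histograms on identical bins.

    Both histograms are normalized to unit mass, so the result is the
    average distance each unit of mass must be moved.
    """
    if len(mass_a) != len(mass_b):
        raise ValueError("histograms must share the same bins")
    total_a, total_b = sum(mass_a), sum(mass_b)
    carried = 0.0  # surplus mass carried into the next bin
    work = 0.0
    for a, b in zip(mass_a, mass_b):
        carried += a / total_a - b / total_b
        work += abs(carried) * bin_width
    return work


# Hypothetical use intensity along a transect of three cells.
true_range = [1.0, 0.0, 0.0]
identical = emd_1d(true_range, [1.0, 0.0, 0.0])
shifted_one = emd_1d(true_range, [0.0, 1.0, 0.0])
shifted_two = emd_1d(true_range, [0.0, 0.0, 1.0])
```

Identical distributions give zero, and the value grows with the distance the mass must be moved, mirroring the comparison of the green and purple polygons.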