2.2 Distribution pattern calculation
We compiled a database of rodent species distribution in China. Species
distribution data were obtained mainly from the following sources: 1)
the research results of Zhou et al. (2002) and Xing et al. (2008); 2)
the National Zoological Museum of China (NZMC); 3) the Global Biodiversity
Information Facility (GBIF); and 4) distribution and collection records
available in books or literature (Jiang et al., 2015; Ge et al., 2018;
Liu et al., 2019; Li et al., 2019; Cheng et al., 2021; Jackson et al.,
2022). After removing null values, offset values, and redundant data
from the distribution records, 237 species of rodents in two orders were
included in the analysis of this study. There were 67 endemic and 170
non-endemic species (Jiang et al., 2015; Wei et al., 2021) (Table S1).
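The record-cleaning step can be illustrated with a minimal sketch. It covers only the removal of null values and redundant records; the treatment of offset values is not specified here and is therefore not shown. The `Record` type, the `clean_records` function, and the rounding of coordinates to four decimal places when detecting duplicates are all assumptions made for illustration, not the procedure actually used.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    """One occurrence record (hypothetical structure for illustration)."""
    species: str
    lon: Optional[float]
    lat: Optional[float]


def clean_records(records, decimals=4):
    """Drop records with missing coordinates and duplicated records.

    Two records count as duplicates when species and rounded coordinates
    match; the rounding precision is an assumption.
    """
    seen = set()
    cleaned = []
    for rec in records:
        if rec.lon is None or rec.lat is None:
            continue  # null value: no usable coordinate
        key = (rec.species, round(rec.lon, decimals), round(rec.lat, decimals))
        if key in seen:
            continue  # redundant record: same species at the same location
        seen.add(key)
        cleaned.append(rec)
    return cleaned


raw = [
    Record("species_a", 100.0, 30.0),
    Record("species_a", 100.0, 30.0),   # duplicate
    Record("species_a", None, 30.0),    # missing longitude
    Record("species_b", 100.0, 30.0),
]
cleaned = clean_records(raw)
```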
MaxEnt (v3.4.1) was used for ecological niche modeling (ENM) of
potential rodent habitat areas in China. Because the MaxEnt model
requires at least five distinct coordinate records per species to
produce more accurate results, six points were used as the minimum
criterion for modeling a species' distribution in this study. The
potential habitats of 210 rodent species with six or more distribution
points were simulated using ENM to determine the potential species
richness of rodents in China. Based on the characteristics of
distribution data and rodent habits, 26 environmental variables were
selected. The five categories of predictors were climate, topography,
vegetation, soil, and human activity intensity (Table S2). Chinese
administrative vector boundaries were obtained from the Data Center for
Resources and Environmental Sciences at the Chinese Academy of Sciences
(RESDC) (http://www.resdc.cn) and converted to 1 km² resolution.
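The six-point rule can be sketched as a simple partition of species by their number of distinct coordinate pairs. The function name `partition_species` and the record layout (species, longitude, latitude) are hypothetical; only the threshold of six points comes from the text.

```python
from collections import defaultdict

MIN_POINTS = 6  # minimum number of distinct points, as stated in the text


def partition_species(records, min_points=MIN_POINTS):
    """Split species into those modeled with ENM and those that are not.

    records: iterable of (species, lon, lat) tuples.
    Returns two sorted lists: (modeled, defaulted).
    """
    points = defaultdict(set)
    for species, lon, lat in records:
        points[species].add((lon, lat))  # a set keeps distinct coordinates only
    modeled = sorted(s for s, p in points.items() if len(p) >= min_points)
    defaulted = sorted(s for s, p in points.items() if len(p) < min_points)
    return modeled, defaulted


# species_a has six distinct points; species_b has six records but only five
# distinct coordinate pairs, so it falls below the criterion.
records = [("species_a", 100.0 + i, 30.0) for i in range(6)]
records += [("species_b", 100.0 + i, 30.0) for i in range(5)]
records += [("species_b", 100.0, 30.0)]
modeled, defaulted = partition_species(records)
```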
Correlations among the environmental variables were assessed using the
ENMTools package (Warren et al., 2021) in R 4.0.5
(http://www.r-project.org). Only variables that were not highly
correlated (|r| < 0.7) were retained for model prediction
to reduce model complexity (Table S2). For each species, 25% of the
records were randomly set aside as test data, and 10 sub-models were
generated using the bootstrap function of MaxEnt. The average output of
the 10 sub-models in each grid cell was taken as the final
prediction for the species. Because each species has a different
degree of tolerance to the environment, the suitable habitat threshold
was determined separately for each species from the suitability values
at its available distribution records. The predicted habitat suitability
at each distribution point was extracted from the averaged suitability
map. Assuming that these values are normally distributed, their mean μ
and standard deviation σ were calculated, and μ − σ was selected as the
threshold for transforming the species distribution probability map into
a 0/1 binary distribution map. Model accuracy was evaluated using
receiver operating characteristic (ROC) curves. The area enclosed by the
ROC curve and horizontal axis is the AUC value (Hanley & McNeil, 1982),
which can be used to measure the overall performance of the model.
For species with predicted AUC values less than 0.8, the MaxEnt model
was optimized using the ENMeval package in R (Muscarella et al., 2014),
and the model was run again.
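The averaging and thresholding steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: grids are plain nested lists, the function names are hypothetical, σ is computed as the sample standard deviation, and a cell is treated as suitable when its value is greater than or equal to the threshold. The text does not specify the last two choices.

```python
from statistics import mean, stdev


def average_runs(runs):
    """Cell-wise mean of several suitability grids of identical shape."""
    n = len(runs)
    rows, cols = len(runs[0]), len(runs[0][0])
    return [
        [sum(run[r][c] for run in runs) / n for c in range(cols)]
        for r in range(rows)
    ]


def mu_minus_sigma_threshold(grid, occurrence_cells):
    """Threshold = mean minus standard deviation of suitability at records.

    occurrence_cells: list of (row, col) indices of distribution points.
    Assumption: sample standard deviation (statistics.stdev).
    """
    values = [grid[r][c] for r, c in occurrence_cells]
    return mean(values) - stdev(values)


def binarize(grid, threshold):
    """Convert a suitability grid to 0/1 (assumption: >= counts as suitable)."""
    return [[1 if v >= threshold else 0 for v in row] for row in grid]


# Two toy sub-model outputs on a 2 x 2 grid (the study used 10 sub-models).
runs = [
    [[0.2, 0.8], [0.6, 0.4]],
    [[0.4, 0.6], [0.8, 0.2]],
]
avg = average_runs(runs)
threshold = mu_minus_sigma_threshold(avg, [(0, 1), (1, 0), (0, 0)])
binary = binarize(avg, threshold)
```

Because the threshold is derived from each species' own records, a species with a wide range of suitability values at its known localities receives a lower threshold than one whose records cluster in highly suitable cells.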
The distribution ranges of the 27 species with fewer than six recorded
distribution points were set by default to the grid cells in which the
distribution points were located, and each range layer was converted
into a 0/1 binary distribution map. Finally, the binary distribution
maps of all 237 species were superimposed on the grid map, and the
number of species occurring in each grid cell was counted to obtain the
species richness distribution map.
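The stacking step amounts to a cell-wise sum of the binary maps. The sketch below assumes that all maps share the same grid shape and are stored as nested lists; the function names `cells_to_binary` and `species_richness` and the species labels are hypothetical.

```python
def cells_to_binary(shape, cells):
    """Build a 0/1 grid from the cells containing distribution points.

    Used here for species with too few records to model: their range
    is taken to be the occupied grid cells themselves.
    """
    rows, cols = shape
    grid = [[0] * cols for _ in range(rows)]
    for r, c in cells:
        grid[r][c] = 1  # repeated points in one cell still count once
    return grid


def species_richness(binary_maps):
    """Sum 0/1 grids cell by cell to count the species present in each cell.

    binary_maps: dict mapping species name to a 0/1 grid; all grids are
    assumed to have the same shape.
    """
    grids = list(binary_maps.values())
    rows, cols = len(grids[0]), len(grids[0][0])
    richness = [[0] * cols for _ in range(rows)]
    for grid in grids:
        for r in range(rows):
            for c in range(cols):
                richness[r][c] += grid[r][c]
    return richness


maps = {
    "species_a": [[1, 0], [1, 1]],                           # modeled species
    "species_b": [[0, 0], [1, 1]],                           # modeled species
    "species_c": cells_to_binary((2, 2), [(0, 0), (0, 0)]),  # few records
}
richness = species_richness(maps)
```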