2.3 Construction of species distribution models
Ten modeling algorithms (GLM, GBM, CTA, RF, GAM, ANN, SRE, FDA, MARS and
MAXENT) provided by the "Biomod2" package were used to predict the
potential distribution of T. chinense. All models were run with default
parameters except the MAXENT model.
The prediction accuracy of the MAXENT model is affected by its parameter
settings. We tested the complexity and performance of the MAXENT model
under different settings of the regularization multiplier (RM) and
feature classes (FC) using the kuenm package in R 3.6.3 (Cobos et al.,
2019). Candidate models were created by combining 17 RM values with all
31 possible combinations of five FCs (L: linear feature, Q: quadratic
feature, H: hinge feature, P: product feature and T: threshold feature).
The optimal model was chosen by the delta of the Akaike information
criterion corrected for small sample sizes (AICc): the candidate with the
minimum AICc value (deltaAICc = 0) was considered the optimal model
(Cobos et al., 2019). The optimized MAXENT parameters were RM = 3 and
FC = LQPT.
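
The size of the candidate set and the deltaAICc rule can be checked with
a short sketch. It is written in Python for illustration only and does
not call kuenm; the function names, candidate labels and AICc values are
hypothetical, and the 17 RM values themselves are not listed in the text,
so only their count is used.

```python
from itertools import combinations

# The five feature classes named in the text.
FEATURES = ["L", "Q", "P", "T", "H"]


def feature_class_combinations(features):
    """Return every non-empty subset of the feature classes as a string."""
    combos = []
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            combos.append("".join(subset))
    return combos


def delta_aicc(aicc_by_model):
    """Map each candidate to its AICc minus the smallest AICc in the set."""
    best = min(aicc_by_model.values())
    return {name: value - best for name, value in aicc_by_model.items()}


fc_combos = feature_class_combinations(FEATURES)  # 2**5 - 1 subsets
n_rm_values = 17  # count taken from the text; the values are not given
n_candidates = n_rm_values * len(fc_combos)

# Hypothetical AICc values, for illustration only (not results of the study).
toy_aicc = {"candidate_a": 2510.4, "candidate_b": 2480.0, "candidate_c": 2491.5}
deltas = delta_aicc(toy_aicc)
best_candidate = min(deltas, key=deltas.get)  # the one with deltaAICc = 0

print(len(fc_combos), n_candidates, best_candidate)
```

Under these assumptions, 17 RM values combined with 31 FC combinations
give 527 candidate models, and the selected candidate is the one whose
deltaAICc equals zero.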
In species distribution modeling with the Biomod2 package, 70% of the
occurrence data were selected as training data and the remaining 30%
were used as testing data. This split was repeated five times. To reduce
spatial bias and better simulate the actual distribution of the species,
we generated 5,000 pseudo-absence points and repeated this step three
times. In total, 150 layers were generated (10 algorithms x 3
pseudo-absence sets x 5 split runs). We evaluated each model using the
true skill statistic (TSS) and the area under the receiver operating
characteristic curve (AUC) (Bucklin et al., 2015; X. Zhang et al., 2020).
The closer the TSS and AUC values are to 1, the more reliable the
prediction (Zhao et al., 2021; Freitas et al., 2019). Models with a mean
TSS ≥ 0.8 and a mean AUC ≥ 0.9 were used to calculate the final species
distribution layer.
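
The evaluation and selection step can be sketched as follows. This is a
minimal Python illustration, not the Biomod2 implementation: TSS is
computed as sensitivity + specificity - 1, AUC as the probability that a
presence receives a higher score than an absence, and the function names,
toy labels, toy scores and model names are all hypothetical.

```python
def true_skill_statistic(observed, predicted):
    """TSS = sensitivity + specificity - 1 for binary 0/1 labels."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def auc(observed, scores):
    """AUC as the share of presence/absence pairs ranked correctly (ties = 0.5)."""
    pos = [s for o, s in zip(observed, scores) if o == 1]
    neg = [s for o, s in zip(observed, scores) if o == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_models(mean_scores, tss_min=0.8, auc_min=0.9):
    """Keep models whose mean TSS and mean AUC both reach the thresholds."""
    return [name for name, (tss, auc_value) in mean_scores.items()
            if tss >= tss_min and auc_value >= auc_min]


# Number of single-model runs implied by the design in the text.
n_runs = 10 * 3 * 5  # algorithms x pseudo-absence sets x split repetitions

# Toy test data (1 = presence, 0 = pseudo-absence); not data from the study.
observed = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cutoff

toy_tss = true_skill_statistic(observed, predicted)
toy_auc = auc(observed, scores)

# Hypothetical mean (TSS, AUC) pairs per model, for illustration only.
mean_scores = {"model_a": (0.85, 0.95), "model_b": (0.75, 0.93),
               "model_c": (0.82, 0.88)}
kept = select_models(mean_scores)

print(n_runs, toy_tss, toy_auc, kept)
```

In this toy case only the model that meets both thresholds is retained;
how Biomod2 chooses the binarization cutoff for TSS is not described in
the text, so the fixed 0.5 cutoff here is an assumption.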