NTF distribution: watersheds vs grids
Of the 19 rare species that occurred at more than one site, 63.16% were distributed in the same or adjacent grids, and 84.21% were distributed in the same or adjacent watersheds. Of the 9 rare species that appeared at only 2 or 3 sites, 8 were distributed in the same or adjacent watersheds; the exception was Drechslerella dactyloides (Figure 7).
At the genetic level, the phylogenetic tree revealed that the widespread species A. oligospora was divided into 5 large clades. When grid was used as the unit of analysis, no clear distribution pattern of these clades was detected. When watershed was used as the unit of analysis, the distribution of these clades was consistent with the natural watershed divisions in Yunnan Province, except for the Irrawaddy River, which had no corresponding clade. The strains within each clade were distributed mainly within the corresponding watershed: 64.70%, 60.71%, 80.00%, 63.89% and 61.11% of the strains from clades 1 to 5 were distributed in the Yangtze River, Red River, Pearl River, Mekong, and Salween-Irrawaddy watersheds, respectively (Figure 8).
Based on our phylogenetic tree, the spatial distribution of A. oligospora was modeled by machine learning using randomly generated Voronoi polygons and the watersheds. All 45 maps generated using polygons had lower accuracy than the maps generated using watersheds (Table S4). On average, the accuracy of the polygons was low (mean: 36%, median: 38%), with all but one prediction falling below 50%. When clades were assigned according to watershed, 68.8% of clades were classified correctly (Figure S2, panel a). Watershed thus explained nearly 70% of the A. oligospora distribution in Yunnan. None of the additional climate, topographic, soil and vegetation variables significantly improved the model (Figure S3, Table S5).
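The classification model itself is not specified in this section. As a minimal sketch of the comparison, assume that each spatial unit (a watershed or a Voronoi cell) is assigned its most frequent clade and that accuracy is the fraction of strains matching that assignment. The function names, the toy coordinates and the toy clade labels below are all hypothetical and are not the study's data; Voronoi cell membership is approximated by nearest-seed assignment with planar Euclidean distance.

```python
import math
import random
from collections import Counter, defaultdict


def majority_assignment_accuracy(records):
    """Fraction of strains whose clade equals the majority clade of their unit.

    records: iterable of (unit, clade) pairs, one pair per strain.
    This is an in-sample accuracy; it is an assumed stand-in for the
    paper's unspecified model, not a reproduction of it.
    """
    by_unit = defaultdict(Counter)
    for unit, clade in records:
        by_unit[unit][clade] += 1
    total = sum(sum(counts.values()) for counts in by_unit.values())
    correct = sum(counts.most_common(1)[0][1] for counts in by_unit.values())
    return correct / total


def voronoi_cell_labels(points, seeds):
    """Index of the nearest seed for each point, i.e. its Voronoi cell.

    Planar Euclidean distance is used, which is only an approximation
    for longitude/latitude coordinates.
    """
    return [
        min(range(len(seeds)), key=lambda i: math.dist(p, seeds[i]))
        for p in points
    ]


# Toy strains, invented for illustration: (x, y, watershed, clade).
toy_strains = [
    (0.0, 0.0, "W1", 1), (0.5, 0.2, "W1", 1), (0.9, 0.1, "W1", 2),
    (3.0, 0.0, "W2", 2), (3.4, 0.3, "W2", 2), (3.8, 0.2, "W2", 3),
    (6.0, 0.0, "W3", 3), (6.5, 0.1, "W3", 3),
]

# Accuracy when the watershed is the spatial unit.
watershed_accuracy = majority_assignment_accuracy(
    (s[2], s[3]) for s in toy_strains
)

# Accuracy when randomly seeded Voronoi cells are the spatial unit.
rng = random.Random(0)
seeds = [(rng.uniform(0, 7), rng.uniform(0, 1)) for _ in range(3)]
cells = voronoi_cell_labels([(s[0], s[1]) for s in toy_strains], seeds)
polygon_accuracy = majority_assignment_accuracy(
    zip(cells, (s[3] for s in toy_strains))
)
```

Repeating the random seeding many times and comparing each `polygon_accuracy` with `watershed_accuracy` gives the kind of polygon-versus-watershed comparison described above, under the stated assumptions.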
Using watershed as the unit of analysis captured the distribution pattern of NTF better than using grid (Figure 9). Only 17 intersections were found when we constructed upset plots using watershed units. We found 19 species distributed in only one watershed unit, and 76.2% of all species were distributed in adjacent watersheds. The dataset varied widely (variance: 7.89; mean: 2.53). Although 22 combinations were found when the integrative diagrams were constructed by grid, only 11 species were distributed in only one grid, and 55.0% of all species were distributed in adjacent grids. The dataset fluctuated less (variance: 5.04; mean: 1.77). Ultimately, compared with grid units, using watershed units revealed more endemic species, more species distributed within adjacent or similar units, and a more varied species composition.
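The quantities behind an upset plot can be tallied directly from a species-by-unit occurrence table. The sketch below is illustrative only: the species names and units are invented, and it assumes that the reported mean and variance refer to the number of species per intersection, which this section does not state explicitly. Population variance is used here; whether the authors used population or sample variance is likewise unknown.

```python
from collections import Counter
from statistics import mean, pvariance


def summarize_units(occurrences):
    """Summarize an occurrence table for an upset-style comparison.

    occurrences: dict mapping species name -> set of units (watersheds
    or grids) in which the species was found. Each species falls into
    exactly one exclusive combination of units (one "intersection").
    """
    combos = Counter(frozenset(units) for units in occurrences.values())
    sizes = list(combos.values())  # number of species per intersection
    return {
        "n_intersections": len(combos),
        "n_single_unit_species": sum(
            1 for units in occurrences.values() if len(units) == 1
        ),
        "mean_size": mean(sizes),
        "variance_size": pvariance(sizes),
    }


# Hypothetical occurrence table, not the study's data.
toy_occurrences = {
    "sp_a": {"W1"},
    "sp_b": {"W1"},
    "sp_c": {"W1", "W2"},
    "sp_d": {"W2", "W3"},
    "sp_e": {"W1", "W2"},
    "sp_f": {"W3"},
}

summary = summarize_units(toy_occurrences)
```

Running the same function once on a table keyed by watershed and once on a table keyed by grid yields the two sets of counts that are compared in the paragraph above.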