Figure 4. Box plots comparing NDXI and slope among wetland (Mari), forest, and grassland. The lower end of each box denotes the 25th percentile, the center line denotes the 50th percentile (exclusive median), and the upper end denotes the 75th percentile. The cross mark in each box denotes the mean. The bars denote the largest and smallest values within 1.5 times the interquartile range. Outside values denote values more than 1.5 times the interquartile range from the box. n = 90 for each box plot. The underlying data for this figure are available in Table S1.
A graph of the decision tree analysis is shown in Figure 5. The accuracy rate of this model for the test data was 92.6% (Figure S4). The accuracy rate is generally used to evaluate the precision of predictions for unknown data and is calculated by dividing the number of correct classifications for the test data by the total number of test data (30% of 270 samples: 81). As shown in Figure 4, NDSI was selected as the criterion at the first stage of the decision tree analysis. Based on NDSI ≤ –0.31, most samples of forest and grassland were classified into the true direction, while almost all samples of wetland (Mari) were classified into the false direction. This result indicates that wetland can be distinguished from forest and grassland by NDSI alone. On the other hand, many samples of both forest and grassland remained together in the true-direction box after the first stage. Then, NDWI ≤ –0.674 was used as the criterion to distinguish forest from grassland: most samples of grassland were classified into the true direction, while most samples of forest were classified into the false direction. Finally, at the third stage, a small number of samples were classified by slope, NDSI, and NDVI. These results show that three-stage classification by decision tree analysis based on NDXI and slope was sufficient to distinguish the landcover types wetland (Mari), forest, and grassland.
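The first two splits and the accuracy calculation described above can be illustrated with a minimal sketch. It is not the fitted model: the third-stage splits (slope, NDSI, NDVI) are omitted because their thresholds are not stated in the text, and the function and variable names are hypothetical. The count of 75 correct classifications is inferred here from the reported 92.6% and 81 test samples; it is not reported directly.

```python
# Thresholds taken from the text; everything else is an illustrative assumption.
NDSI_THRESHOLD = -0.31   # first-stage split
NDWI_THRESHOLD = -0.674  # second-stage split


def classify_two_stage(ndsi: float, ndwi: float) -> str:
    """Apply only the first two splits of the tree (third stage omitted)."""
    if ndsi <= NDSI_THRESHOLD:
        # "True" direction: mostly forest and grassland samples.
        if ndwi <= NDWI_THRESHOLD:
            return "grassland"  # "true" direction at the second stage
        return "forest"         # "false" direction at the second stage
    # "False" direction at the first stage: almost all wetland samples.
    return "wetland"


def accuracy_rate(n_correct: int, n_total: int) -> float:
    """Correct classifications divided by the total number of test samples."""
    return n_correct / n_total


n_test = round(0.30 * 270)  # 30% of 270 samples held out as test data
n_correct_inferred = 75     # assumption: the only integer consistent with 92.6%

print(n_test)
print(round(100 * accuracy_rate(n_correct_inferred, n_test), 1))
```

Because the third stage is left out, this sketch would misclassify the small number of samples that the full tree resolves using slope, NDSI, and NDVI.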