Figure 4. Box plots comparing NDXI and slope among wetland (Mari),
forest, and grassland. The lower end of each box denotes the 25th
percentile, the center line denotes the 50th percentile (exclusive
median), and the upper end denotes the 75th percentile. The cross mark in
the box denotes the mean. Bars denote the largest and smallest values
within 1.5 times the interquartile range. Outside values are those lying
beyond 1.5 times the interquartile range. n = 90 for each box
plot. The underlying data for this figure are available in Table S1.
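The summary statistics named in this caption can be sketched as follows. This is a minimal illustration, not the plotting procedure used for the figure. It assumes that "exclusive median" means the quartiles are the medians of the lower and upper halves of the sorted data, with the overall median left out of both halves when n is odd. The function name `box_stats` and the sample values are hypothetical and are not taken from Table S1.

```python
from statistics import mean, median


def box_stats(values):
    """Return the box plot summary described in the caption (assumed method)."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    # Assumption: when n is odd, the median itself is excluded from both halves.
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 else xs[half:]
    q1, q2, q3 = median(lower), median(xs), median(upper)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in xs if low_fence <= x <= high_fence]
    return {
        "q1": q1,                     # lower end of the box
        "median": q2,                 # center line
        "q3": q3,                     # upper end of the box
        "mean": mean(xs),             # cross mark
        "whisker_low": min(inside),   # smallest value within 1.5 x IQR
        "whisker_high": max(inside),  # largest value within 1.5 x IQR
        "outside": [x for x in xs if x < low_fence or x > high_fence],
    }


# Hypothetical values, invented only to exercise the function.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

With these invented values, the single value 30 lies beyond the upper fence and is reported as an outside value, while the bars span 1 to 9.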
A graph of the decision tree analysis is shown in Figure 5. The accuracy
rate of this model on the test data was 92.6% (Figure S4). The accuracy
rate is generally used to evaluate prediction performance on unseen data
and is calculated by dividing the number of correct classifications of
the test data by the total number of test data (30% of 270 samples: 81). As
shown in Figure 4, NDSI was selected as a criterion at the first stage
of the decision tree analysis. With the criterion NDSI ≦ –0.31, most
forest and grassland samples were classified into the true branch, while
almost all wetland (Mari) samples were classified into the
false branch. This result indicates that wetland can be distinguished
from forest and grassland by NDSI alone. On the other hand, many
forest and grassland samples remained together in the true branch
after the first stage. Then, NDWI ≦ –0.674 was used as a criterion to
distinguish forest from grassland: most grassland samples were
classified into the true branch, while most forest samples were
classified into the false branch. Finally, at the third stage, a
small number of samples were classified by slope, NDSI, and NDVI. These
results indicate that the three-stage classification by the
decision tree analysis based on NDXI and slope was sufficient to distinguish
land cover as wetland (Mari), forest, and grassland.
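The first two splits and the accuracy rate described above can be written out as a short sketch. This is a simplified illustration under stated assumptions, not the fitted model in Figure 5: the third-stage splits on slope, NDSI, and NDVI are omitted because their thresholds are not given in the text, and the function names `classify` and `accuracy_rate` and all sample values are hypothetical. The count of 75 correct classifications is likewise inferred from the reported figures (it is the only count out of 81 that rounds to 92.6%) and is not stated in the source.

```python
def classify(ndsi, ndwi):
    """Simplified two-stage version of the tree described in the text."""
    # Stage 1: NDSI <= -0.31 is the true branch (mostly forest and grassland);
    # the false branch holds almost all wetland samples.
    if ndsi > -0.31:
        return "wetland"
    # Stage 2: NDWI <= -0.674 is the true branch (mostly grassland);
    # the false branch holds mostly forest.
    if ndwi <= -0.674:
        return "grassland"
    return "forest"


def accuracy_rate(true_labels, predicted_labels):
    """Correct classifications divided by the total number of test samples."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical (NDSI, NDWI, label) samples, invented only to exercise the code.
samples = [
    (-0.10, -0.50, "wetland"),
    (-0.45, -0.80, "grassland"),
    (-0.50, -0.60, "forest"),
    (-0.35, -0.70, "forest"),  # falls on the grassland side of stage 2
]
predicted = [classify(ndsi, ndwi) for ndsi, ndwi, _ in samples]
truth = [label for _, _, label in samples]
print(predicted, accuracy_rate(truth, predicted))

# Arithmetic behind the reported rate: 30% of 270 samples gives 81 test samples.
n_test = round(0.30 * 270)
inferred_correct = 75  # inferred from 92.6%, not reported in the source
print(n_test, round(100 * inferred_correct / n_test, 1))
```

The last invented sample shows why a third stage is useful: a forest sample with NDWI just below the threshold is sent to the grassland side by the first two splits alone.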