Figure 3. Schematic flowchart of several processes for
landcover classification. The detailed explanations of the processes in
Google Earth Engine, QGIS, and Python are described in subsections 2.2,
2.3, and 2.4, respectively.
2.3 Investigation of NDXI and slope across different vegetation types
NDXI in different vegetation types was investigated using the GIS
software Quantum GIS (QGIS version 3.20.0). In this study, we attempted
to classify landcover in the Tyrma region into three vegetation types:
wetland, forest, and grassland (Figure 2). First, 30 random point clouds
were generated in an area where each of these vegetation types is dominant. The
random point cloud is a tool to sample a certain number of data in a
given area, and we can specify the minimum distance between points
(Figure S3). Here, we specified 30 m as a minimum distance to avoid
generating points on the same grid of Landsat-8 data with 30 m
resolution. Most importantly, the random point clouds were generated in
areas where we had confirmed the actual vegetation type (ground truth)
at the study site. The random point clouds were
generated in three ground truth areas for each vegetation type, that is,
a total of 270 points (3 vegetation types × 3 ground truth areas × 30
point clouds). Then, the values of NDXI calculated using JJA-median
Landsat-8 data were investigated for all 270 points. In addition to
NDXI, the degree of slope at all 270 points was also investigated. The
slope data were created using a 30 m resolution digital elevation model
(DEM) provided by the Japan Aerospace Exploration Agency (JAXA). The
coordinates, the values of NDXI, and the degree of slope for all 270
points are summarized in Table S1.
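As an illustration only, the sampling logic described above can be sketched in a few lines of Python. This is not the QGIS tool itself: it draws random points by rejection sampling and keeps a candidate only if it lies at least 30 m from every point already accepted. The rectangular 500 m × 500 m extent, the function name `sample_points`, and the random seed are assumptions made for the example; the actual ground truth areas are polygons.

```python
import math
import random

def sample_points(xmin, ymin, xmax, ymax, n, min_dist, seed=0, max_tries=100_000):
    """Draw n random points in a rectangle, each at least min_dist from the others."""
    rng = random.Random(seed)
    points = []
    tries = 0
    while len(points) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("area too small for the requested spacing")
        candidate = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        # Reject the candidate if it is closer than min_dist to an accepted point.
        if all(math.dist(candidate, p) >= min_dist for p in points):
            points.append(candidate)
    return points

# Hypothetical ground truth area in projected (metre) coordinates.
points = sample_points(0, 0, 500, 500, n=30, min_dist=30)

# 3 vegetation types x 3 ground truth areas x 30 points per area
total_points = 3 * 3 * 30
print(len(points), total_points)
```

Rejection sampling is adequate here because 30 points with 30 m spacing occupy only a small fraction of the assumed area; a much denser request would need a different strategy.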
2.4 Determination of classification criteria by decision tree algorithm
To classify landcover based on NDXI and slope, the classification
criteria were determined by supervised machine learning. Here, we used
decision tree analysis in Python (version 3.8.5). A decision tree is an
algorithm that classifies data step by step based on generated rules and
outputs a tree-like graph. Of the data for all 270 points, 30% were used
as test data and the other 70% as training data. The classification
was performed in three stages. The criteria obtained from the decision
tree analysis were extrapolated to the whole Tyrma region, and landcover
was classified into wetland (Mari), forest, and grassland. The code for
the decision tree analysis in Python is available in Figure S4.
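For readers unfamiliar with the technique, the following is a minimal from-scratch sketch of a decision tree with a 70/30 split. It is not the code in Figure S4 and does not use any particular library: it grows a small tree by choosing, at each node, the threshold that minimises the weighted Gini impurity. Several things are assumptions for illustration: the input data are synthetic (the wetland ranges follow the values reported in Section 3.1, whereas the forest and grassland ranges are invented), only NDSI and slope are used as features, and `max_depth=3` is our reading of "three stages".

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature index, threshold) minimising weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, thr), score
    return best

def build(rows, labels, depth=0, max_depth=3):
    """Grow a tree; a leaf is a class label, a node is (feature, threshold, left, right)."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, thr = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= thr]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > thr]
    return (f, thr,
            build([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Follow the rules from the root until a leaf (class label) is reached."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if row[f] <= thr else right
    return tree

rng = random.Random(0)
# Synthetic stand-in for Table S1: (NDSI, slope in degrees) per point.
# Wetland ranges follow Section 3.1; forest and grassland ranges are invented.
ranges = {
    "wetland":   ((-0.32, -0.14), (0.43, 4.31)),
    "forest":    ((-0.50, -0.36), (8.0, 25.0)),
    "grassland": ((-0.50, -0.36), (0.5, 4.5)),
}
data = [((rng.uniform(*ndsi), rng.uniform(*slope)), name)
        for name, (ndsi, slope) in ranges.items() for _ in range(90)]
rng.shuffle(data)

n_train = round(0.7 * len(data))  # 70% training data, 30% test data
train_set, test_set = data[:n_train], data[n_train:]
tree = build([x for x, _ in train_set], [y for _, y in train_set], max_depth=3)
accuracy = sum(predict(tree, x) == y for x, y in test_set) / len(test_set)
print(n_train, len(test_set), accuracy)
```

Because the synthetic classes do not overlap, the sketch separates them with simple threshold rules; real NDXI and slope values overlap between classes, so the accuracy on field data would be lower.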
2.5 Sampling of river waters
Water samples were collected in July 2019. Sampling of the Tyrma Main
River was conducted just before the confluence with the Gujik River, and
the sampling of other large rivers (the Yaurin River, the Gujik River,
the Gujal River, and the Sutyri River) was conducted just before the
confluence with the Tyrma River (Figure 1b). In addition, water samples
were collected from 19 small rivers: 8 rivers in the Gujal River
system and 11 rivers in the Tyrma River system (Figure 1c). Two hundred
milliliters of water was sampled using a disposable syringe (TERUMO,
SS-50ESZ) and immediately filtered through 0.45 µm disposable filters
made of cellulose acetate (ADVANTEC, DISMIC 25CS045AS). One hundred
milliliters of the filtered water was preserved in an acid-washed
propylene bottle for dFe measurement and the other 100 mL was preserved
in a propylene bottle for DOC measurement. Both samples were kept in a
refrigerator until analysis. Also, electrical conductivity (EC) was
measured using a portable EC meter (ES-71, HORIBA) at the time of water
sampling.
2.6 Chemical analyses and statistics
The dFe concentration was determined by the 1,10-phenanthroline method
(Russian international technical standards 52.24.358-2006:
https://files.stroyinf.ru/Index2/1/4293837/4293837319.htm). Here, we
describe this method briefly. First, 1 mL of 10% hydroxylammonium
chloride was added to 50 mL of the sample. Second, the mixture was boiled
for 15–20 min until the volume reached 25 mL to break down organic iron
complexes into organic compounds and Fe(II). Third, after cooling,
ammonium hydroxide was added until the pH reached ~4. Fourth, 3 mL of
ammonium acetate buffer and 1 mL of 1,10-phenanthroline were added, and
ultrapure water was added until the volume reached 50 mL. Finally, 20
min after the color development, the absorbance at a wavelength of 510
nm was measured with an ultraviolet–visible spectrophotometer (SHIMADZU
UV mini-1240). In this paper, we define dFe as Fe that was determined by
this process. The detection limit for dFe by the 1,10-phenanthroline
method was 0.02 mg L⁻¹. The DOC concentration was determined with a
total organic carbon (TOC) analyzer (SHIMADZU TOC-LCSH) using the
catalytic combustion oxidation method. The detection limit for TOC by
the TOC analyzer was 0.1 mg L⁻¹, and standard solutions for DOC analysis
were prepared using potassium hydrogen phthalate (C6H4(COOK)(COOH))
(Nacalai Tesque).
Based on the produced landcover map (subsections 2.2–2.4), the coverage
of wetland (Mari) was calculated for each catchment area of the 5
large rivers and 19 small rivers. The correlation between water chemistry
(dFe, DOC, and EC) and the coverage of wetland (Mari) was
assessed using linear and non-linear regression analysis. For the
non-linear regression analysis, three common functions (power,
exponential, and logarithmic) were tested to create an
approximation curve. Both linear and non-linear regression calculations
were performed by the least-squares method with Microsoft Excel Solver
(version 2021). The approximation line or curve with the highest
coefficient of determination (r²) was selected
as the most suitable regression equation to represent the relationship
between water chemistry and the wetland coverage. Note that the coefficient
of determination (r²) and Pearson’s correlation
coefficient (r) for each approximation curve were calculated by
fitting a linear regression model to the log-transformed data.
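The model comparison can be illustrated with a short Python sketch. It is a simplified stand-in for the Excel Solver workflow, not a reproduction of it: each candidate function is linearised by transforming x and/or y, fitted by ordinary least squares, and scored by r² on the transformed data. The function names, the dictionary of transforms, and the input values are assumptions for illustration; the example data are generated from a power law and are not measured values. All inputs must be positive because logarithms are taken.

```python
import math

def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept; also returns Pearson's r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)

# Each model becomes a straight line after transforming x and/or y.
MODELS = {
    "linear":      (lambda v: v, lambda v: v),   # y = a*x + b
    "power":       (math.log, math.log),         # ln y = ln a + b*ln x
    "exponential": (lambda v: v, math.log),      # ln y = ln a + b*x
    "logarithmic": (math.log, lambda v: v),      # y = a*ln x + b
}

def compare_models(x, y):
    """Return {model name: r^2} computed on the transformed data."""
    results = {}
    for name, (fx, fy) in MODELS.items():
        _, _, r = linear_fit([fx(v) for v in x], [fy(v) for v in y])
        results[name] = r ** 2
    return results

# Hypothetical data: wetland coverage (%) and dFe (mg/L); not measured values.
coverage = [2.0, 5.0, 10.0, 20.0, 40.0]
dfe = [0.05 * c ** 0.8 for c in coverage]  # generated from a power law

scores = compare_models(coverage, dfe)
best = max(scores, key=scores.get)
print(best, scores)
```

Note that r² values computed on differently transformed axes are not strictly comparable with one another, which is one reason to report the transformation used alongside each coefficient.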
3 Results
3.1 Landcover classification by decision tree analysis based on NDXI and
slope
Ranges of NDXI and slope at the point clouds in wetland (Mari),
forest, and grassland are shown in Figure 4. Here we focus on
the differences in NDXI and slope between the wetland and
the forest and grassland. In the wetland, NDVI was in the range
of 0.50–0.75, NDSI was –0.32 to –0.14, and NDWI was –0.65 to –0.47.
NDVI and NDWI in the wetland largely overlapped with those of the
forest, indicating that NDVI and NDWI were not useful in distinguishing
between wetland and forest. On the other hand, NDSI in the wetland was
clearly higher than that in the forest and the grassland. The range of
slope in the wetland was quite low at 0.43–4.31 degrees. Compared with
this, the forest clearly showed a higher range, but the grassland showed
almost the same range as the wetland; accordingly, wetland cannot be
distinguished from grassland just by the slope. From these findings,
NDSI seems to be the most effective index for identifying the
distribution of wetland (Mari).