Global t-SNE Map
A global t-SNE map was constructed from all ADHs, with labelled ADHs
highlighted in color according to their flow regime (Fig. 5). ADHs from
the same flow regime remained clustered on the map, although the
absolute locations of labelled ADHs changed relative to the previous map
(Fig. 3). The topological relations among the flow-regime clusters were
nonetheless largely preserved. For instance, the clusters for Classes 1
and 2, which have similar ADH shapes, remained close on the global t-SNE
map, while those that were distinct (e.g. Classes 1 and 6) were widely
spaced across the map.
Figure 5: t-SNE map with all ADH samples. ADHs with flow regime labels
are highlighted in color.
A large number of compact clusters were observed on the t-SNE map for
Classes 1 and 2 because the ADHs in these clusters were highly
correlated (Fig. 6). Approximately 50 of those clusters were scrutinized,
and they resulted from same-year ADHs from a number of geographically
proximate watersheds located near the Pacific coast (i.e. southern BC,
Washington, Oregon, and northern California). This suggests that the
climate in a given year drives highly similar hydrological responses
across these watersheds.
There are 24 streams whose ADHs consistently occur in these clusters.
Figure 6: Small, compact clusters of ADHs are mostly observed in the
part of the map occupied by Classes 1 and 2 (a). ADHs of the three
selected clusters exhibit strong correlation within clusters (b, c, d).
Streams whose ADHs are frequently included in the compact clusters are
all located in the coastal PNW region (e).
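The statement that ADHs within a compact cluster are highly correlated
can be checked numerically. The following is a minimal sketch, not the
procedure used in this study: it assumes each ADH is stored as one row
of daily values, and the function name and the synthetic data are
hypothetical.

```python
import numpy as np

def mean_pairwise_correlation(adhs):
    """Mean Pearson correlation over all distinct pairs of ADHs (rows)."""
    adhs = np.asarray(adhs, dtype=float)
    corr = np.corrcoef(adhs)                   # (n, n) matrix, one row per ADH
    iu = np.triu_indices(corr.shape[0], k=1)   # upper triangle, diagonal excluded
    return float(corr[iu].mean())

# Synthetic stand-in for one compact cluster (assumption, not study data):
# five noisy copies of a single seasonal curve sampled on 365 days.
rng = np.random.default_rng(0)
days = np.arange(365)
base = np.sin(2 * np.pi * days / 365.0)
cluster = base + 0.05 * rng.standard_normal((5, 365))

print(mean_pairwise_correlation(cluster))
```

A value close to 1 indicates that the ADHs in the cluster share nearly
the same shape, which is the property illustrated in panels (b) to (d)
of Fig. 6.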
Encoder
A number of encoders were tested. The encoder with the lowest MAE on the
testing dataset employed a nine-layer architecture with LeakyReLU
activation functions (alpha=0.07) (see Encoder 50 in Table 2). The
Dropout layer slightly improved the loss and markedly reduced the gap
between the training and validation sets. This trained encoder achieved
an MAE of 1.07 on the training set and 3.81 on the testing set. The
average displacement between the projected ADH points and their encoded
counterparts on the t-SNE map was 5.93 for the testing set, which was
small compared to the extent of the map (Fig. 7). The points formed a
near-circular distribution on the t-SNE map (see Fig. 5), so the map
extent was measured as a radius from the map centroid. The radius that
covered 98% of the data points on the t-SNE map was 115.9. The ratio of
average displacement to map radius was only 0.05, suggesting that the
errors of the encoder projection are limited. After eliminating outliers
(i.e. displacements above the 95th percentile), the average displacement
for the testing set dropped to 4.36.
Figure 7: Lines on the map indicate the displacement between the
original t-SNE points and their counterparts projected by the encoder.
The displacements for the training set are very small, while those for
the testing set are relatively large.
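The displacement statistics reported above can be reproduced in a few
lines. The sketch below is illustrative only and rests on assumptions
not stated in the text: displacement is taken as the Euclidean distance
between each t-SNE point and its encoder projection, the map radius is
the 98th percentile of distances to the map centroid, and outliers are
displacements above the 95th percentile. The function name, argument
names, and synthetic data are hypothetical.

```python
import numpy as np

def displacement_summary(map_xy, tsne_xy, encoded_xy,
                         coverage=98.0, outlier_pct=95.0):
    """Summarize encoder projection error relative to the map extent.

    map_xy     : (n, 2) all points on the t-SNE map (defines the extent)
    tsne_xy    : (m, 2) original t-SNE coordinates of the evaluated set
    encoded_xy : (m, 2) encoder projections of the same ADHs
    """
    map_xy = np.asarray(map_xy, dtype=float)
    tsne_xy = np.asarray(tsne_xy, dtype=float)
    encoded_xy = np.asarray(encoded_xy, dtype=float)

    # Euclidean displacement of each projected point (assumed metric)
    disp = np.linalg.norm(encoded_xy - tsne_xy, axis=1)

    # Radius from the map centroid that covers `coverage` percent of points
    centroid = map_xy.mean(axis=0)
    radius = np.percentile(np.linalg.norm(map_xy - centroid, axis=1), coverage)

    # Mean displacement after dropping values above the outlier percentile
    cutoff = np.percentile(disp, outlier_pct)
    trimmed = disp[disp <= cutoff]

    return {
        "mean_displacement": float(disp.mean()),
        "map_radius": float(radius),
        "ratio": float(disp.mean() / radius),
        "trimmed_mean_displacement": float(trimmed.mean()),
    }

# Synthetic example (assumption, not study data)
rng = np.random.default_rng(1)
all_points = rng.normal(scale=50.0, size=(1000, 2))
test_points = all_points[:200]
projected = test_points + rng.normal(scale=4.0, size=test_points.shape)
print(displacement_summary(all_points, test_points, projected))
```

With the values reported in the text, the ratio is 5.93 / 115.9, or
about 0.051, which rounds to the stated 0.05.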
To illustrate the potential use of an encoder and the t-SNE map for
classification, an ADH with an unknown flow regime was randomly selected
and projected onto the t-SNE map using the optimal encoder (Fig. 8). Its
ten nearest neighbours were identified and their ADHs were plotted
together with the unknown ADH (Fig. 8). Based on this procedure, the ADH
is presumed to belong to Class 6, a flow regime that resembles the
snow-dominated regime of the Canadian Rockies. The implications of this
procedure are discussed below.
Figure 8: A randomly selected ADH projected onto the t-SNE map by the
encoder (a). Original (b) and normalized (c) ADHs of its ten nearest
neighbors.
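The nearest-neighbour lookup described above can be sketched as follows.
This is an illustration under stated assumptions, not the procedure of
the study: the text describes inspecting the ten neighbouring ADHs
visually, whereas the sketch substitutes a simple majority vote over
labelled neighbours and uses Euclidean distance on the map. The function
name and the toy coordinates are hypothetical.

```python
import numpy as np
from collections import Counter

def nearest_neighbour_vote(query_xy, labelled_xy, labels, k=10):
    """Return indices of the k nearest labelled points and their majority label."""
    labelled_xy = np.asarray(labelled_xy, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)

    # Euclidean distance on the t-SNE map (assumed metric)
    dist = np.linalg.norm(labelled_xy - query_xy, axis=1)

    # Stable sort keeps the original order among equidistant points
    idx = np.argsort(dist, kind="stable")[:k]

    # Majority vote is an assumption; the source inspects neighbours visually
    votes = Counter(np.asarray(labels)[idx].tolist())
    return idx, votes.most_common(1)[0][0]

# Toy map (assumption, not study data): two labelled groups
rng = np.random.default_rng(2)
xy = np.vstack([rng.normal(loc=(-40.0, 0.0), scale=3.0, size=(30, 2)),
                rng.normal(loc=(40.0, 0.0), scale=3.0, size=(30, 2))])
classes = [1] * 30 + [6] * 30

neighbours, predicted = nearest_neighbour_vote((38.0, 1.0), xy, classes, k=10)
print(neighbours, predicted)
```

The returned indices can be used to retrieve and plot the neighbouring
ADHs alongside the unknown one, as in Fig. 8.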