Global t-SNE Map

A global t-SNE map was constructed with all ADHs, with labelled ADHs highlighted in color based on their flow regime (Fig. 5). ADHs from the same flow regime remained clustered on the map, although the absolute locations of labelled ADHs changed relative to the previous map (Fig. 3). The topological relations among the flow-regime clusters were nevertheless largely preserved. For instance, the clusters for Classes 1 and 2, which have similar ADH shapes, remained close on the global t-SNE map, while those that were distinct (e.g. Classes 1 and 6) were widely spaced across the map.
Figure 5: t-SNE map with all ADH samples. ADHs with flow regime labels are highlighted in color.
A large number of compact clusters were observed on the t-SNE map for Classes 1 and 2, as the ADHs in these clusters were highly correlated (Fig. 6). Approximately 50 of those clusters were scrutinized; they resulted from same-year ADHs from a number of geographically proximate watersheds located near the Pacific coast (i.e. southern BC, Washington, Oregon, and northern California). This suggests that the climate of a given year drives highly similar hydrological responses across these watersheds.
There are 24 streams whose ADHs consistently occur in these clusters.
Figure 6: Small, compact clusters of ADHs are mostly observed in the region of the map occupied by Classes 1 and 2 (a). ADHs in three selected clusters exhibit strong within-cluster correlation (b, c, d). Streams whose ADHs are frequently included in the compact clusters are all located in the coastal PNW region (e).
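The within-cluster correlation shown in Fig. 6 can be summarized by a single number per cluster. The following is a minimal sketch, assuming each ADH is stored as one row of a NumPy array and that Pearson correlation is the measure of interest. The text does not state which correlation measure was used, and the function name `mean_pairwise_correlation` is hypothetical.

```python
import numpy as np


def mean_pairwise_correlation(adhs):
    """Mean Pearson correlation over all distinct pairs of ADHs.

    adhs: array-like of shape (n_adhs, n_days), one ADH per row (assumed layout).
    """
    adhs = np.asarray(adhs, dtype=float)
    # np.corrcoef treats each row as one variable, so corr[i, j] is the
    # Pearson correlation between ADH i and ADH j.
    corr = np.corrcoef(adhs)
    # Keep only the strict upper triangle so each pair is counted once
    # and the diagonal (self-correlation of 1) is excluded.
    upper = corr[np.triu_indices_from(corr, k=1)]
    return float(upper.mean())
```

A value close to 1 for a cluster would indicate that its member ADHs rise and fall together, which is the pattern described for the compact clusters.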

Encoder

A number of encoders were tested, and the encoder with the lowest MAE on the testing dataset employed a nine-layer architecture with LeakyReLU activation functions (alpha=0.07) (see Encoder 50 in Table 2). The Dropout layer slightly improved the loss and markedly reduced the gap between the training and validation sets. This trained encoder achieved an MAE of 1.07 on the training set and 3.81 on the testing set. The average displacement between the ADH points on the t-SNE map and their encoder-projected counterparts was 5.93 for the testing set, which was small compared to the extent of the map (Fig. 7). The points formed a near-circular distribution on the t-SNE map (see Fig. 5), so the map extent was measured as a radius from the map centroid. The radius that covers 98% of the data points on the t-SNE map was 115.9. The ratio of average displacement to map radius was only 0.05, suggesting that the errors of the encoder projection are limited. After eliminating outliers (i.e. displacements above the 95th percentile), the average displacement for the testing set dropped to 4.36.
Figure 7: Lines on the map indicate the displacement between the original t-SNE points and their counterparts projected by the encoder. The displacements for the training set are very small, while those for the testing set are relatively large.
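The displacement statistics above can be reproduced from two sets of map coordinates. The sketch below is illustrative only: it assumes Euclidean distance on the two-dimensional map, takes the map centroid as the mean of all t-SNE points, and uses hypothetical names (`displacement_summary`, `tsne_xy`, `encoded_xy`, `map_xy`) that do not come from the original work.

```python
import numpy as np


def displacement_summary(tsne_xy, encoded_xy, map_xy, coverage=98.0, trim=95.0):
    """Summarize encoder projection error relative to the map extent.

    tsne_xy:    (n, 2) original t-SNE coordinates of the evaluated ADHs
    encoded_xy: (n, 2) coordinates of the same ADHs projected by the encoder
    map_xy:     (m, 2) coordinates of all points on the t-SNE map
    """
    tsne_xy = np.asarray(tsne_xy, dtype=float)
    encoded_xy = np.asarray(encoded_xy, dtype=float)
    map_xy = np.asarray(map_xy, dtype=float)

    # Per-point Euclidean displacement between the two projections.
    disp = np.linalg.norm(encoded_xy - tsne_xy, axis=1)

    # Map radius: distance from the centroid that covers `coverage` percent of points.
    centroid = map_xy.mean(axis=0)
    radius = np.percentile(np.linalg.norm(map_xy - centroid, axis=1), coverage)

    # Mean displacement after dropping values above the `trim` percentile.
    trimmed = disp[disp <= np.percentile(disp, trim)]

    return {
        "mean_displacement": float(disp.mean()),
        "map_radius": float(radius),
        "ratio": float(disp.mean() / radius),
        "trimmed_mean_displacement": float(trimmed.mean()),
    }
```

With the reported values, the ratio is 5.93 / 115.9 ≈ 0.051, consistent with the 0.05 quoted in the text.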
To illustrate the potential use of an encoder and the t-SNE map for classification, an ADH with an unknown flow regime was randomly selected and projected onto the t-SNE map using the optimal encoder (Fig. 8). Its ten nearest neighbours were identified, and their ADHs were plotted together with the unknown ADH (Fig. 8). Based on this procedure, the ADH is presumed to belong to Class 6, a flow regime that resembles the snow-dominated regime of the Canadian Rockies. The implications of this procedure are discussed below.
Figure 8: A randomly selected ADH projected onto the t-SNE map by the encoder (a). Original (b) and normalized (c) ADHs of its ten nearest neighbours.
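The neighbour-based procedure can be sketched in a few lines. This is a hedged illustration rather than the method used in the study: it assumes that neighbours are searched among labelled ADHs only, that distance is Euclidean on the map, and that the class is assigned by majority vote, whereas the text describes a visual comparison of the neighbouring ADHs. The function name `classify_by_neighbours` and its arguments are hypothetical.

```python
from collections import Counter

import numpy as np


def classify_by_neighbours(query_xy, labelled_xy, labels, k=10):
    """Assign a flow-regime class from the k nearest labelled map points.

    query_xy:    (2,) map coordinates of the encoded, unlabelled ADH
    labelled_xy: (n, 2) map coordinates of labelled ADHs
    labels:      sequence of n class labels, aligned with labelled_xy
    Returns the majority label and the indices of the neighbours used.
    """
    query_xy = np.asarray(query_xy, dtype=float)
    labelled_xy = np.asarray(labelled_xy, dtype=float)

    # Euclidean distance from the query point to every labelled point.
    dist = np.linalg.norm(labelled_xy - query_xy, axis=1)

    # Indices of the k closest points (fewer if less than k are available).
    nearest = np.argsort(dist)[: min(k, len(dist))]

    # Majority vote among the neighbours' labels (assumed decision rule).
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0], nearest
```

Returning the neighbour indices alongside the label allows the corresponding ADHs to be plotted next to the unknown ADH, as in Fig. 8.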