Introduction

Hydrologists typically acquire process knowledge from detailed place-based studies and from representative experimental catchments, where hydrometric and biophysical attributes can be intensively measured over time. There are a large number of catchment observation networks globally, yet in many parts of the world they are in decline due to the expense of establishing, operating and maintaining their infrastructure (Laudon et al., 2017). Consequently, extrapolating process knowledge to watersheds that are hydrologically similar, yet not necessarily measured, has been a major focus of the hydrological community for the past several decades, with initiatives such as the Prediction in Ungauged Basins (PUB) program (Sivapalan et al., 2003), whose goal was to predict flow quantiles at ungauged or poorly gauged basins from the historical flow data collected at hydrologically similar basins.
Catchment classification has a long history as a means to generalize the functional behaviour that exists within watersheds, quantify their similarity, and transfer information among them (Wagener et al., 2007). While there is no universal hydrological classification, the degree of similarity is often defined from intrinsic and response characteristics of watersheds such as climate (e.g. temperature, precipitation), watershed biophysical characteristics (e.g. geological conditions, soil type, relief, and vegetation), and the flow regime (e.g. annual hydrograph). Climate indices for classification (e.g. Köppen, Thornthwaite) are widely applied at varying time scales and have a long history of identifying the intrinsic seasonality and the thermal and moisture regimes of a region. Physiographic and biophysical attributes such as soils, topography and geology strongly influence catchment behaviour (Buttle, 2006; Bormann, 2010), yet are not always ideal for defining process controls on catchment behaviour across scales and regions (Merz and Blöschl, 2005). Often, catchments with similar climate and physical conditions are not hydrologically similar (Oudin et al., 2010; Ali et al., 2012).
Evaluating catchment similarity solely in terms of streamflow characteristics is popular, particularly in aquatic ecology, where habitats are sensitive to flow regimes (Poff et al., 1997). However, as Sawicz et al. (2011) note, ecological studies are not typically aimed at understanding the behaviour of the catchment, including the causes of a particular regime. Over time, the flow regime of a catchment describes the seasonal behaviour of streamflow (Haines et al., 1988) and by its nature integrates a variety of hydrological processes produced by the interaction between climate and catchment physical characteristics. After decades of development, there are hundreds of indices available that quantitatively characterize five major components of the flow regime: magnitude, timing, duration, frequency, and rate of change (Poff et al., 1997). Flow statistics (e.g. mean, maximum, quantiles, and standard deviation) at varying temporal scales are widely used indices that reveal first-order information regarding the magnitude, distribution, and variation of streamflow over a period of interest (Hall and Minns, 1999; Carey et al., 2010; Ali et al., 2012; Toth, 2013). More sophisticated indices, often explicitly reflecting specific hydrological processes, are preferred in catchment classification with respect to hydrological functions and system complexity (Sawicz et al., 2011). However, it remains a challenge to design a combination of hydrological indices that fully describes the dominant hydrological characteristics of flow regimes, maximizes distinctiveness among different flow regimes, and avoids information redundancy.
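As an illustration of the first-order flow statistics mentioned above, the following minimal sketch computes a handful of them for a single year of daily flows. The function name `flow_statistics`, the particular set of indices, and the synthetic snowmelt-like hydrograph are assumptions made for illustration only; they are not the indices or data used in the studies cited.

```python
import math
import statistics


def flow_statistics(daily_flow):
    """Return a few first-order statistics for one year of daily flows.

    The choice of indices is illustrative, not a recommended set.
    """
    ordered = sorted(daily_flow)
    n = len(ordered)

    def quantile(p):
        # Linear interpolation between neighbouring order statistics.
        pos = p * (n - 1)
        lo = math.floor(pos)
        hi = min(lo + 1, n - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

    return {
        "mean": statistics.fmean(daily_flow),   # magnitude
        "max": ordered[-1],                     # magnitude of the annual peak
        "std": statistics.pstdev(daily_flow),   # variation
        "q10": quantile(0.10),                  # low-flow quantile
        "q50": quantile(0.50),                  # median flow
        "q90": quantile(0.90),                  # high-flow quantile
        # Timing: day of year (1-based) on which the peak first occurs.
        "day_of_peak": daily_flow.index(ordered[-1]) + 1,
    }


# Hypothetical hydrograph: constant baseflow plus a single melt peak on day 150.
synthetic_adh = [
    5.0 + 45.0 * math.exp(-((day - 150) / 25.0) ** 2) for day in range(1, 366)
]
stats = flow_statistics(synthetic_adh)
```

Each entry touches only one or two of the five regime components (here magnitude, variation and timing), which is one reason a small set of summary statistics may not separate regimes that differ mainly in duration, frequency or rate of change.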
Classification based on flow statistics using clustering algorithms such as C-means and artificial neural networks (ANN) (Hall and Minns, 1999), hierarchical models (Snelder et al., 2005), and Bayesian clustering algorithms (Kennard et al., 2010; Sawicz et al., 2014) has been successfully applied for catchment classification and regionalization. The premise is to identify groups (or regions) such that similarity within a region is maximized whereas similarity between regions is minimized. Self-organized mapping (SOM), an unsupervised ANN machine learning technique, has become increasingly appealing as it produces a low-dimensional (typically two-dimensional) representation of higher-dimensional data that is simple to visualize. SOM preserves the topological structure of data as it transforms information from a high-dimensional feature space, and clusters information visually on maps where neighbouring points are more similar than distal points. When hydrological indices are transformed, catchments with homogeneous features are close on the 2-D map, and distance on the map can be used to visually infer similarity (Di Prinzio et al., 2011; Ley et al., 2011; Razavi and Coulibaly, 2013; Toth, 2013). Previously, SOM has been applied for catchment grouping with a moderate (∼50) number of samples (Ley et al., 2011; Toth, 2013), yet computational time increases with sample size, which challenges the utility of SOM for extremely large data sets with thousands or millions of samples.
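The way a SOM places similar samples near each other on a 2-D grid can be sketched with a small, self-contained example. This is a generic online SOM with a Gaussian neighbourhood and linearly decaying learning rate, written under assumed settings (grid size, epochs, learning rate, neighbourhood radius); it is not the configuration used in any of the studies cited above, and the two groups of two-dimensional "index" vectors are synthetic.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid node whose weight vector is closest to sample x."""
    return min(
        weights,
        key=lambda node: sum((w - xi) ** 2 for w, xi in zip(weights[node], x)),
    )


def train_som(samples, rows=4, cols=4, epochs=50, lr0=0.5, seed=0):
    """Train a small SOM; all hyperparameters are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(samples[0])
    sigma0 = max(rows, cols) / 2.0
    # One weight vector per grid node, initialised from randomly chosen samples.
    weights = {
        (r, c): list(rng.choice(samples)) for r in range(rows) for c in range(cols)
    }
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = samples[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                 # learning rate decays to zero
            sigma = sigma0 * (1.0 - frac) + 0.5     # radius shrinks towards 0.5
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Pull the node towards x, more strongly when close to the BMU.
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


# Two hypothetical groups of catchments described by two normalised indices.
data_rng = random.Random(1)
low = [[data_rng.gauss(0.2, 0.03), data_rng.gauss(0.3, 0.03)] for _ in range(20)]
high = [[data_rng.gauss(0.8, 0.03), data_rng.gauss(0.7, 0.03)] for _ in range(20)]
samples = low + high

som = train_som(samples)
positions = [best_matching_unit(som, x) for x in samples]  # (row, col) per sample
```

Every training step finds the best matching unit by scanning all nodes and then updates all nodes, so the cost of one pass grows with both the number of samples and the size of the map. This is consistent with the scaling concern raised above for data sets with many thousands of samples.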
The objective of this research is to design and implement a novel method to visualize and classify streamflow regimes for a large streamflow data set focused on undisturbed rivers in western North America. The classification is based on annual daily hydrographs (ADHs) from 304 sites over multiple years, providing 17110 ADHs for classification. The size of this data set renders traditional SOM impractical, and we therefore utilize t-distributed Stochastic Neighbor Embedding (t-SNE), an alternative machine learning algorithm proposed by van der Maaten (2009), to map ADHs to a 2-D feature space to assess flow similarity, and we compare this to traditional Principal Component Analysis. Furthermore, we develop an encoder neural network that allows additional data to be projected onto the t-SNE map, overcoming previous challenges with the non-parametric t-SNE technique. While this methodology focuses only on a limited region and does not attempt a universal classification, we attempt to show the novelty, flexibility and potential of this approach for future classification activities.