SWEEP analysis
To characterize genome-wide patterns of genetic variation and population differentiation, the nucleotide diversity (π) and population-differentiation statistic (F ST) were calculated using a sliding window approach (a 40-kb window sliding in 20-kb steps). Population pairwise F ST based on the mitochondrial cytb and nuclear snx33 gene were calculated using Arlequin software version 3.5.2.2 (Excofer et al. 2005). π was used to estimate the genomic diversity of populations andF ST was used to evaluate the strength of genomic differentiation between two populations.
The selection signatures of population-specific genomic regions were measured between S. schlegeli populations (YS-ECS and SCS). We applied a sliding window approach (a 40kb window sliding with a step size of 20kb) to identify selected regions. Genomic regions with a significantly high F ST statistic (corresponding to the top 5% level) and θπ ratio (the θπ ratio was calculated as log2(π1) - log2(π2), and the top 5% level of θπ ratio was selected) values were identified as selective sweep regions. Genes were submitted to the Gene Ontology databases for annotation (Ashburner et al. 2000; Ogata et al.2000). Gene ontology analyses were performed to annotate the genes to functional ontology terms. Using these candidate genes, we performed a targeted population genetic analysis. We used a higher resolution sliding window analysis in a 10-kb window (compared with a 100-kb window in selection analysis) to calculate F ST and the θπ between populations.
Comparative morphometric analysis of populations
We examined the cranial morphology of 163 individuals (included the 69 individuals involved in the RADseq) of S. schlegeli using landmark-based geometric morphometry analysis. Digital images of the right lateral side of each specimen were taken using a Canon camera (EOS Rebel T4i; 18 mega pixels). All the morphometric analyses were performed using the comprehensive programs of Thin-plate spline (tps) (available at http://life.bio.sunysb.edu/morph/). Digital images were sorted according to geographic population and converted to tps files using tpsUtil. On the head images, we digitized 26 landmarks using tpsDig2, and the tps files with landmarks were then processed in tpsRelw to enable relative warp analysis and to obtained X and Y coordinates for further analysis. ANOVAs were performed on the RW scores of all S. schlegeli populations.

Ecological niche analyses

WeWe collected 149 presence records of S. schlegeli from the Global Biodiversity Information Facility (GBIF, 2019), field investigations, and published literature (Table S1). To avoid clustered records, we used only one presence record per 5 arcminute grid cell, approx. 10 by 10 km, to match the spatial resolution of marine environmental predictors. As a result, we retained 131 records for subsequent analyses (Figure S2). Presence records ofS. schlegeli along the coastal waters of mainland China were assigned to three distinct clades (YS, ECS, and SCS) based on results of genetic clustering and previous phylogeographic evidence from related taxa in the same region (e.g. Dong et al. 2012; Liu et al. 2007; Ni et al. 2012; Xu et al. 2009).
Marine environmental predictors at a spatial resolution of 5 arcmin were retrieved from the MARSPEC database (http://marspec.weebly.com) (Sbrocco & Barber 2013). We initially considered bathymetry (i.e. depth of the seafloor), and 10 marine environmental predictors, including annual mean, range, variance, and extreme values of sea surface temperature (SST) and sea surface salinity (SSS) (Figure S3). Bathymetry was considered since seaweed pipefish prefers shallow waters (Figure S1) (Chen et al. 2017; Zhang et al. 2017). We calculated the pairwise Pearson’s correlation coefficients (r ) among predictors and selected only one among highly correlated predictors (i.e. |r | > 0.7) (Dormann et al. 2013). Accordingly, we retained five predictors for model development: bathymetry, mean annual SSS (bio8), annual range in SSS (bio11), mean annual SST (bio13), and annual range in SST (bio16) (Figure S3).
We used n -dimensional hypervolumes (Blonder et al. 2014) to explore niche differentiation among the three clades. We delineated hypervolumes using the five predictors selected for SDM and a gaussian kernel density estimator (Blonder et al. 2018). We studied niche overlap using both a similarity and a distance metric (Mammola 2019), namely the hypervolumes’ distance between centroids and pairwise overall differentiation (β diversity) (Carvalho & Cardoso 2018; Mammola & Cardoso 2020). The latter approach decomposes overall niche differentiation (βtotal) in two process: niche shifts (βreplacement) and niche expansion/contraction (βrichness) (Carvalho & Cardoso 2018). βtotal values range from 0 (when niches are identical) and 1 (fully dissimilar niches), holding true the equality βtotal = βreplacement + βrichness.
Furthermore, we explored habitat suitability of S. schlegeliduring present-day and the Last Glacial Maximum (LGM, about 21 kya) using species distribution model (SDM). SDMs represent a widely used tool to quantify a species’ habitat suitability by exploring the relationships between distribution data and environmental predictors (Araujo et al., 2019; Thuiller et al.2019). The MARSPEC database contains paleo marine environmental layers (Sbrocco 2014). For simplicity, Sbrocco (2014) assumed that sea level at the LGM was 120 meters lower than today. Marine environmental predictors at the LGM can be derived from general circulation models (GCMs). We considered paleoclimate simulations from three GCMs (i.e. CCSM3, ECBilt-Clio, and HadCM3M2) to reduce uncertainties derived from choices of GCMs (Thuilleret al. 2019).
We developed SDM using all distribution data (i.e. 131 records) and the five non–collinear predictors. For the purpose of reducing uncertainties associated with choices of SDM algorithms, we used an ensemble modeling approach (Thuiller et al. 2019). We fitted and tested the performance of ten algorithms available in biomod2 R package (Figure S4), using default settings (Thuiller et al.2020). In addition to species presence data, we randomly generated 10,000 background points within the study area as pseudo-absences. We evaluated the predictive abilities of the ten algorithms by a five-fold cross-validation approach. True skill statistic (TSS) and the area under the receiver operating characteristic curve (AUC) were used to measure SDM predictive performance; algorithms with TSS over 0.75 and AUC over 0.9 were considered to exhibit excellent predictive capacity (Landis & Koch 1977; Omri Allouche et al. 2006; Swets 1988). All distribution data and selected algorithms were used to predict habitat suitability of S. schlegeli under present-day and LGM climate conditions. Predictions of single algorithms were ensembled by a TSS weighted average strategy (Thuiller et al. 2020; Thuilleret al. 2019). The continuous habitat suitability results (ranging continuously from 0 to 1) were transformed into binary suitable/unsuitable maps by using the 10th percentile presence probability threshold.