(a) Data collection
We conducted a literature search in Web of Science to build a
comprehensive database of published genetic diversity observations in
marine fishes. The following keyword search terms were used: “fish*
microsatellite* (marine OR ocean OR sea)” and “fish* mtDNA* (marine
OR ocean OR sea)”. Only studies published prior to 5 January 2020 were
included in the dataset. This was a “Class II” study in the sense of
Leigh et al. (2021). Such studies more easily compile nuclear diversity
data, account for allele frequencies in diversity estimates, and use
expert-defined populations, but they cover fewer species than “Class III”
studies, which use existing online databases such as NCBI or BOLD to
download, grid, and analyze unique mitochondrial sequences. During the literature
search, we excluded anadromous, catadromous, and estuarine species from
the database, as well as data from populations that were captive,
farmed, or stocked. We also excluded data from studies that either did
not report the corresponding latitudinal and longitudinal coordinates, or
only vaguely identified the sampling location (precision less than 3°).
For a more detailed explanation of further exclusion criteria, see the
Supplemental Methods.
We recorded expected heterozygosity (He) for
microsatellite studies, and nucleotide diversity (π) or haplotype
diversity (Hd) for mtDNA studies, as reported. The
standard errors of He, Hd, or π were also recorded (or calculated from
the standard deviations) when provided. For mtDNA, marker length (in
base pairs) was recorded.
For microsatellite studies, we recorded whether or not the primers were
originally developed in a different species, as cross-species
amplification can negatively influence diversity estimates (Barbará et
al. 2007). When possible, we recorded He on a
per-marker basis, though some studies reported only average
heterozygosity across markers. For these studies, we listed each locus
separately and extrapolated per-marker diversity by adding a normally
distributed error to the average diversity estimate (Pinsky & Palumbi
2014). This error distribution had a standard deviation equal to that
reported within the study. If a within-study standard deviation was not
available, we used the average standard deviation (0.24) reported across
all studies.
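The extrapolation step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name, the seed argument, and the clamping of draws to the interval [0, 1] are assumptions made for illustration. Only the normally distributed error and the fallback standard deviation of 0.24 come from the text.

```python
import random

# Average within-study standard deviation reported in the text.
FALLBACK_SD = 0.24


def extrapolate_per_locus_he(mean_he, n_loci, sd=None, seed=None):
    """Draw one He value per locus around a study-level mean (hypothetical helper)."""
    rng = random.Random(seed)
    # Use the within-study SD when available, otherwise the cross-study average.
    spread = FALLBACK_SD if sd is None else sd
    values = []
    for _ in range(n_loci):
        draw = rng.gauss(mean_he, spread)
        # Assumption (not stated in the text): keep values inside [0, 1],
        # the valid range for heterozygosity.
        values.append(min(1.0, max(0.0, draw)))
    return values
```

For example, `extrapolate_per_locus_he(0.70, n_loci=8, sd=0.10, seed=1)` would return eight per-locus values scattered around the reported mean of 0.70. Omitting `sd` falls back to the cross-study value of 0.24.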
In addition to following global patterns, genetic diversity often
declines towards a species’ range margin, as populations at the edge
tend to be smaller in size relative to those at the range center (Clark
et al. 2021; Eckert et al. 2008). To help account for these cross-range
effects, we used the R package rfishbase v.3.1.6 (Boettiger et al. 2011)
to download species range data from Aquamaps (Kaschner et al. 2019). We
then calculated the latitudinal range position of each sampled
population in our database. This value ranged from 0 to 1, with 0
indicating the population was located at the center of its species range
and 1 indicating the population was located at the very northern or
southern edge of its species range. Finally, we also recorded the family
and genus for each species.
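One way to compute a range position on the 0 to 1 scale described above is sketched below in Python. This is an illustration under stated assumptions, not the authors' implementation: it assumes the range center is the midpoint between the southern and northern latitudinal limits, that those limits have already been extracted from the range data, and that populations sampled outside the mapped range are capped at 1. The function and parameter names are hypothetical.

```python
def latitudinal_range_position(pop_lat, south_limit, north_limit):
    """Return 0 at the range center and 1 at either latitudinal edge (hypothetical helper)."""
    if north_limit <= south_limit:
        raise ValueError("north_limit must be greater than south_limit")
    # Assumption: the range center is the midpoint of the two limits.
    center = (north_limit + south_limit) / 2.0
    half_width = (north_limit - south_limit) / 2.0
    # Distance from the center, scaled by the distance from center to edge.
    position = abs(pop_lat - center) / half_width
    # Assumption: populations outside the mapped range are capped at 1.
    return min(position, 1.0)
```

With a range spanning 20°S to 20°N, a population sampled at 10°N would receive a position of 0.5, and one sampled at either 20°S or 20°N would receive 1.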