Discussion
Identifying global patterns in genetic diversity is a fundamental goal in ecology and evolution. Since genetic diversity is a proxy for adaptive potential and the raw material for speciation events, determining its spatial distribution can help us better understand which species are most vulnerable to anthropogenic change and help explain global patterns in species diversity. Here, we outlined and tested three distinct macroecological drivers of intraspecific genetic diversity, identified global patterns, and assessed the congruence of these relationships across the genome. Overall, we found that nuclear genetic diversity was most strongly correlated with chlorophyll-a concentration, a proxy for primary productivity and resource availability, while mitochondrial diversity was tightly associated with chlorophyll-a concentration, sea surface temperature, latitude, and longitude. Taken together, these results support our original hypotheses to varying degrees. The quadratic relationship between chlorophyll-a concentration and genetic diversity across the genome provides compelling evidence for the Productivity-Diversity Hypothesis and suggests that regions of higher productivity facilitate larger population sizes, and in turn, higher levels of genetic variation. However, our results suggest a tipping point may exist in this relationship, after which larger carrying capacities may result in reduced population sizes and declining genetic diversity (Lawrence & Fraser 2020). Furthermore, temperature was positively correlated with mitochondrial genetic diversity, lending support to the Kinetic Energy Hypothesis and the relationship between temperature, metabolism, and mutation rates. The lack of a significant correlation with nuclear diversity further affirmed this theory, as oxidative damage is not expected to impact nuclear DNA and increase nuclear mutation rates in the same manner (Hoffman et al. 2004).
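To make the idea of a tipping point concrete, the sketch below shows how the peak of a fitted quadratic can be located. It is only an illustration: the function name and the coefficients are hypothetical and are not estimates from our models, and it assumes a simple curve of the form y = b0 + b1*x + b2*x^2, where x stands for (possibly transformed) chlorophyll-a concentration and y for genetic diversity.

```python
def turning_point(b1, b2):
    """Return the x value at which y = b0 + b1*x + b2*x**2 reaches its peak.

    A peak only exists when the curve is concave (b2 < 0); the intercept b0
    does not affect where the peak falls.
    """
    if b2 >= 0:
        raise ValueError("curve is not concave, so there is no interior peak")
    return -b1 / (2.0 * b2)


def predict(b0, b1, b2, x):
    """Predicted diversity at predictor value x under the quadratic curve."""
    return b0 + b1 * x + b2 * x ** 2


# Hypothetical coefficients chosen for illustration only.
b0, b1, b2 = 0.25, 0.5, -0.125
x_peak = turning_point(b1, b2)

# Predicted diversity rises up to x_peak and declines beyond it.
below = predict(b0, b1, b2, x_peak - 1.0)
at_peak = predict(b0, b1, b2, x_peak)
above = predict(b0, b1, b2, x_peak + 1.0)
```

Under this form, the tipping point is simply the vertex of the parabola, so its location depends only on the ratio of the linear and quadratic coefficients.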
Interestingly, the Founder Effect Hypothesis was the only one of our three initial hypotheses for which we did not find full support, although the observed decline in mitochondrial genetic diversity towards the poles is in line with the hypothesis’ predictions. This decline was particularly pronounced near the Arctic, congruent with the outsized impact of glacial expansion on Northern Hemisphere species relative to their Southern Ocean counterparts (Fraser et al. 2012). Furthermore, the smaller Ne of mitochondrial DNA makes it more sensitive to LGM-induced bottlenecks (Birkey et al. 1989), strengthening any LGM signal in mitochondrial genetic diversity. Alternatively, the high levels of dispersal and admixture often observed in marine systems, along with large Ne values, may explain why a poleward decline was not observed in nuclear diversity, as elevated dispersal across the species range may help transport genetic diversity from the center to the poleward edge and replenish depleted gene pools. In fact, many temperate marine species harbor consistent levels of genetic diversity across their species range (Almada et al. 2012; Francisco et al. 2014; Martínez et al. 2015). Furthermore, microrefugia during the LGM that are uncoupled from historical climatic gradients may provide “re-seeding” opportunities for formerly glaciated regions and help buffer northern populations from extirpation, similar to previously documented patterns in the Antarctic (Suggitt et al. 2018). Given that some of these past refugia are close to modern northern range limits, expansion waves out of these locations would have been less susceptible to diversity loss from bottlenecks or serial founder events (Bringloe et al. 2020; Maggs et al. 2008).
Previous studies have also found latitudinal gradients in mitochondrial genetic diversity, including Manel et al. (2020), another prominent macrogenetic study that analyzed global patterns in marine fish genetic diversity. However, the methods and statistical analyses frequently employed by macrogenetic studies have come under recent criticism (Gratton et al. 2017; Paz-Vinas et al. 2021). Most earlier macrogenetic studies fall into the category of Class III (Leigh et al. 2021): they pool samples and sequences into predefined grid cells or latitudinal bands, calculate diversity at the species level, then average all species estimates together (Manel et al. 2020; Miraldo et al. 2016; Theodoridis et al. 2020). While informative, studies of this design often fail to account for genetic variation within species (such as from the range center to edge), for the relative frequency of individual haplotypes within each population, for study-specific methodological choices, or for the unbalanced sampling of species across grid cells (Gratton et al. 2017; Paz-Vinas et al. 2021). As population size is the mediating factor in many hypotheses aimed at explaining global patterns of genetic diversity, including those assessed here, such distinctions are important. Genetic diversity may follow different spatial patterns at different scales, given that environmental gradients, ecosystem processes, and biogeography collectively influence how population-level genetic diversity is shaped into community-wide patterns (De Kort et al. 2021). Here, we conducted a Class II macrogenetic study, which enabled us to incorporate metadata from the original populations, including sample sizes and the demarcation of local populations (Leigh et al. 2021). This approach enabled us to better account for issues of within-species geographic variation and relative haplotype abundance.
Despite these differing techniques, our findings also show that mitochondrial diversity follows clear latitudinal and longitudinal gradients - peaking at lower latitudes and in the Indo-Pacific - and reaffirm patterns previously established in Manel et al. (2020). Interestingly, the Coral Triangle has been designated as the center of species biodiversity, and our models suggest it could play a similar role for genetic diversity, especially within the mitochondrial genome. These results are unsurprising, as several of the predictors we found to be strongly associated with mitochondrial diversity (e.g. sea surface temperature) have also been linked with higher species richness (Tittensor et al. 2010). Furthermore, heightened habitat availability and coastline length have been suggested as specific drivers of species richness in the Coral Triangle and could also increase genetic diversity through their positive influence on population size (Sanciangco et al. 2013). However, our models suggest that other regions in the Indo-Pacific show elevated mitochondrial genetic diversity as well, including the coastline of the Indian subcontinent and Sri Lanka, suggesting other macroecological factors may also play an important role in creating and maintaining genetic diversity.
Importantly, in contrast to mitochondrial diversity, nuclear genetic diversity did not follow clear geographic gradients across either latitude or longitude. These results are consistent with previous studies that found no strong latitudinal patterns in the nuclear diversity of mammals (Schmidt et al. 2022), freshwater fish (Lawrence et al. 2023), plants (De Kort et al. 2021), or habitat-forming species (Figuerola-Ferrando et al. 2023). As nuclear diversity is more tightly coupled with population size than is mitochondrial diversity (Bazin et al. 2006), recent demographic processes or changes in population size may disrupt any pre-existing geographic patterns and result in no clear latitudinal gradients in diversity. When compared to the spatial gradients in mitochondrial genetic diversity, the inconsistency in global patterns across the genome reinforces the narrative that mitochondrial and nuclear DNA are distinct entities that are separately impacted by divergent evolutionary forces, like drift (via population size) and mutation rates (via kinetic energy). While useful in many circumstances, mitochondrial DNA should be employed with care, and not as a broad and convenient proxy for nuclear markers. This distinction is important because fish mitochondrial genomes are approximately 16 to 17 kb, while nuclear genomes range in size from 300 Mb to 4.5 Gb (Fan et al. 2020; Satoh et al. 2016), which means that more than 99.99% of the genome is nuclear. Thus, the nuclear genome contains the majority of standing genomic variation important for adaptation to changing conditions and for the speciation process.
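The 99.99% figure follows directly from the genome sizes quoted above, as the short check below illustrates. It assumes one copy of each genome is counted (that is, it ignores the many mitochondrial genome copies per cell) and uses the upper end of the mitochondrial size range to be conservative; the constant and function names are illustrative only.

```python
MITO_BP = 17_000                 # upper end of the 16 to 17 kb range
NUCLEAR_BP_MIN = 300_000_000     # 300 Mb, smallest nuclear genome quoted
NUCLEAR_BP_MAX = 4_500_000_000   # 4.5 Gb, largest nuclear genome quoted


def nuclear_fraction(nuclear_bp, mito_bp=MITO_BP):
    """Fraction of total sequence length (nuclear + mitochondrial) that is nuclear."""
    return nuclear_bp / (nuclear_bp + mito_bp)


# The smallest nuclear genome gives the lowest (most conservative) fraction.
smallest_case = nuclear_fraction(NUCLEAR_BP_MIN)
largest_case = nuclear_fraction(NUCLEAR_BP_MAX)

print(f"nuclear share, 300 Mb genome: {smallest_case:.4%}")
print(f"nuclear share, 4.5 Gb genome: {largest_case:.4%}")
```

Even in the most conservative case (a 300 Mb nuclear genome paired with a 17 kb mitochondrial genome), the nuclear share of sequence length remains above 99.99%.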
Additionally, species-level variation often reduces our power to detect general macro-scale relationships, and almost certainly contributed to the lower R² values reported here. Unsurprisingly, we found substantial variation in family-specific global gradients of genetic diversity for 10 families that represented a wide swath of life history traits. While most of the families followed the general patterns (at least for mitochondrial diversity) established in the main models, several instead showed increasing genetic diversity at higher latitudes and lower SST. Notably, most of these families (including Gadidae and Sebastidae) are traditionally found in colder, more temperate environments that also often have higher levels of primary productivity. If species at these latitudes can support consistently large populations due to higher resource availability, the relationship with other important ecological variables, like temperature, might be muted. This may be the case in our models, as all 10 families displayed either a positive or quadratic relationship with chlorophyll-a concentration. Nevertheless, this variation across families is an important reminder that global patterns are frequently complex, multifaceted, and often the result of many ecological and species-specific factors.
Generally speaking, macroecological drivers are likely to act in concert, not isolation, to shape global patterns. Variation in population size, and subsequently the strength of genetic drift, likely creates a baseline distribution of genetic diversity, upon which other evolutionary forces interact to create more complex patterns. Both mitochondrial and nuclear genetic diversity peaked in communities and ecosystems with higher resource availability, as represented by primary productivity. In addition, most models suggested genetic diversity was elevated closer to the range core, consistent with the central-marginal hypothesis that suggests population abundance, and subsequently genetic diversity, is highest towards the range core where environmental conditions tend to be optimal (Eckert et al. 2008). Layered upon these findings, we found evidence that higher mitochondrial substitution rates at lower latitudes may serve to replenish and accumulate diversity there, manifesting in a traditional latitudinal gradient for mitochondrial diversity that is highest near the tropics. As nuclear substitution rates are not as clearly elevated at higher temperatures (Hoffman et al. 2004), similar latitudinal patterns in nuclear genetic diversity were not apparent. Life history traits, anthropogenic change, phylogenetic relationships, and demographic history are also well-known determinants of genetic diversity, and it is likely these processes influenced our results. For instance, historically, tropical environments tend to be more stable, which can enable diversity at both the species and genetic level to accumulate over time and contribute to the latitudinal diversity gradients observed here (Rosenzweig 1995).
Range size is also commonly invoked as a driver of latitudinal patterns of genetic diversity (French et al. 2022; Lawrence & Fraser 2020), especially when genetic diversity increases towards higher latitudes. According to Rapoport’s rule, range size grows with latitude (Rapoport 1982), and may be coupled with a rise in genetic diversity because larger range sizes can support more and larger populations, and even low levels of gene flow among these demes can increase local genetic diversity (Waples 2010). However, as access to this range-wide genetic diversity is mediated by dispersal, there is no guarantee that a particular population will acquire novel alleles from elsewhere in the range. While most oceanic taxa likely have high enough rates of gene flow to facilitate this level of genetic exchange (Palumbi 1992), studies have found that marine ranges can be much more structured than previously thought (Pringle & Wares 2011; Selkoe et al. 2016). Future work explicitly testing the roles of range size and gene flow in determining general patterns of genetic diversity would help provide further clarity.
Investigating other DNA markers may also help disentangle the relative importance of various environmental drivers. In addition to the issues with mitochondrial DNA that we previously discussed, the high mutation rate of microsatellites, as well as ascertainment bias for highly polymorphic loci during marker generation, can create extraneous statistical noise and may be one reason why it was difficult to identify clear spatial patterns in nuclear diversity. Furthermore, the bounded range of heterozygosity (0 to 1) can also impose inferential challenges and restrict the scope of observable patterns. These issues aside, microsatellites remain one of the most widely available measures of neutral nuclear genetic diversity and are positively correlated with genome-wide diversity (Mittel et al. 2015). Moreover, expected heterozygosity is a robust diversity metric unlikely to be biased by either sampling effort (Toro et al. 2009) or inbreeding because it is calculated from allele frequencies (Ritland 1996). While nuclear DNA sequence diversity (e.g. SNPs, haplotypes) provides a promising next step for future macrogenetic analyses, standardizing such data across studies remains a substantial bioinformatic challenge.
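The point that expected heterozygosity depends only on allele frequencies can be illustrated with a minimal sketch. It uses the standard single-locus definition, He = 1 - sum(p_i^2), without any small-sample correction, and two hypothetical diploid samples that share the same allele frequencies but differ in how alleles are paired into genotypes. The function names and data are illustrative and are not drawn from our dataset.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """He = 1 - sum(p_i^2) at one locus, from diploid genotypes given as allele pairs."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(alleles)
    total = len(alleles)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def observed_heterozygosity(genotypes):
    """Ho = proportion of sampled individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


# Hypothetical samples: both have allele frequencies p(A) = p(B) = 0.5.
outbred = [("A", "A"), ("A", "B"), ("A", "B"), ("B", "B")]
inbred = [("A", "A"), ("A", "A"), ("B", "B"), ("B", "B")]

he_outbred = expected_heterozygosity(outbred)
he_inbred = expected_heterozygosity(inbred)
ho_outbred = observed_heterozygosity(outbred)
ho_inbred = observed_heterozygosity(inbred)
```

In this toy case the two samples have identical expected heterozygosity, while observed heterozygosity drops to zero in the sample made up entirely of homozygotes, which is why the expected value is the less inbreeding-sensitive of the two metrics.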
Overall, our results reveal disparate global gradients in mitochondrial and nuclear genetic diversity. While mitochondrial diversity peaks along the Equator and is positively associated with temperature, mirroring complementary patterns in marine species richness, nuclear genetic diversity shows no strong geographic patterns. Such a lack of clear gradients in nuclear diversity may be due in part to evolutionary forces (e.g. contemporary demographic processes disrupting historical patterns, gene flow more evenly distributing alleles across species ranges, or latitudinally consistent mutation rates), analytical ones (e.g. the reduced statistical power of microsatellites), or a combination of the two. However, despite these differences, diversity across the genome was strongly correlated with chlorophyll-a concentrations and was elevated in regions of higher primary productivity and resource availability that are able to support larger population densities. Taken together, these findings enable a better understanding of the degree to which mutation rates (via elevated temperatures) and drift (via population size) work collectively to establish large-scale gradients of genetic diversity, providing a more comprehensive view of how forces interacting across the genome scale up to provide the starting material for species and ultimately community diversity.