Detecting clonality under realistic conditions
Based on our results, clonal richness (R) and clonal evenness (Pareto \(\beta\)) are highly sensitive to sampling. Even relatively large sample sizes (from 100 to 500 individuals) lead to deeply biased estimates of the true R and \(\beta\), and thus of c values. R is always greatly overestimated, by some orders of magnitude more than previously demonstrated with empirical datasets for which the rates of clonality remained unknown (Arnaud-Haond et al., 2007; Gorospe et al., 2015), and, except in nearly strictly sexual populations, \(\beta\) was also greatly overestimated (for \(c\geq 0.1\)). Genotypic descriptors computed from realistic sample sizes may be informative only in the rare cases of small population sizes (N ≤ 1000 individuals in the case of our simulations). In most situations, where population sizes are large, genotypic descriptors computed with realistic sample sizes result in extreme underestimation of the rates of clonality (see below) or even in overlooking the occurrence of PC (i.e., considering the species as strictly sexual). These results raise questions regarding the conclusions derived in the literature from studies assessing the occurrence, or sometimes even the extent, of clonality based only on genotypic indices.
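The direction of this bias can be illustrated with a toy calculation. The sketch below is not the simulation framework used in this study. It assumes the usual definition of clonal richness, R = (G - 1)/(N - 1), with G distinct genotypes among N sampled units, and a hypothetical population of 100,000 individuals made up of 1,000 equally frequent clones, from which 200 individuals are drawn at random. The function names and all parameter values are illustrative assumptions.

```python
import random


def clonal_richness(genotypes):
    """R = (G - 1) / (N - 1): G distinct genotypes among N sampled units."""
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two sampled units")
    return (len(set(genotypes)) - 1) / (n - 1)


def make_population(n_individuals, n_clones):
    """Hypothetical population in which clones are (nearly) equally frequent."""
    return [i % n_clones for i in range(n_individuals)]


rng = random.Random(42)            # fixed seed so the sketch is reproducible
population = make_population(100_000, 1_000)
sample = rng.sample(population, 200)   # random sampling without replacement

true_r = clonal_richness(population)   # R from genotyping every individual
sample_r = clonal_richness(sample)     # R from a realistic sample size

print(f"population R = {true_r:.4f}")
print(f"sample R     = {sample_r:.4f}")
```

Under these assumptions most sampled individuals carry distinct genotypes, so the sample R lies far above the population R even though every clone is replicated 100 times in the population.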
In contrast, the distribution moments of \(F_{IS}\) and mean LD for common sample sizes (more than 20 individuals) produced values consistent with those obtained from genotyping the whole population, yet they could previously be interpreted only for extreme rates of clonality (c ≥ 0.95). Consequently, when analysing samples from populations with more than 1000 individuals, most genetic descriptors should remain informative and sometimes, together with any R values lower than 1, should be interpreted as a likely signature of a high prevalence of clonality (c ≥ 0.95).
This worrying limitation recalls, for example, the results recently reported by Dia et al. (2014) for a unicellular phytoplankton species involved in harmful algal blooms (HABs), Alexandrium minutum. This species, which causes paralytic shellfish poisoning (PSP), shows an alternation between clonal (during the bloom) and sexual phases. Dia et al. (2014) sampled populations throughout the bloom (clonal) events, during which they grew from being nearly undetectable to exhibiting a concentration of \(10^{4}\) to \(10^{5}\) cells per litre. Of the more than 1000 strains cultivated, 265 were fully genotyped, among which no replicated genotypes were found, driving the estimate of clonal diversity to R = 1. Without extensive knowledge of the biology of this species, clonality would not have been diagnosed on the basis of this sampling, which raises questions regarding how reliably the occurrence of clonality can be diagnosed. Unfortunately, no \(F_{IS}\) values could be reported in this study because only the haploid phase could be sampled, and the LD detected suggested the occurrence of recombination. However, according to the present results, genetic descriptors allow the detection or estimation of clonality only when its prevalence is extreme: the results of Dia et al. (2014) thus mainly suggest that the clonal rate during the bloom event did not exceed 0.95 in the few previous generations, still leaving great uncertainty as to the prevalence of sexual or clonal reproduction in this species.
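A back-of-the-envelope calculation shows why the absence of replicated genotypes among 265 strains is only weakly informative. The sketch below assumes, purely for illustration, that the bloom consists of a number of equally frequent clonal lineages and that the population is large enough for the genotyped strains to be treated as independent draws. Neither assumption comes from Dia et al. (2014), and the lineage numbers tried are hypothetical.

```python
import math


def prob_no_replicates(n_sampled, n_lineages):
    """Probability that n_sampled independent draws from n_lineages equally
    frequent clonal lineages are all distinct (birthday-problem product)."""
    if n_sampled > n_lineages:
        return 0.0  # more draws than lineages: a replicate is unavoidable
    # Sum of logs of (1 - i / n_lineages) avoids underflow of the product.
    log_p = sum(math.log1p(-i / n_lineages) for i in range(n_sampled))
    return math.exp(log_p)


N_GENOTYPED = 265  # number of fully genotyped strains cited in the text

# Hypothetical numbers of clonal lineages in the bloom (assumed values).
for n_lineages in (10**4, 10**5, 10**6):
    p = prob_no_replicates(N_GENOTYPED, n_lineages)
    print(f"{n_lineages:>8} lineages: P(no replicate) = {p:.3f}")
```

With a million lineages, a sample of 265 strains would contain no replicate more than nine times out of ten under these assumptions, even if every lineage were represented by many cells.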
Most target species in the literature, including clonal plants and invasive and pathogenic species, exhibit extremely large population sizes, thus raising serious questions regarding our ability to detect clonality based on realistic sample sizes, let alone infer its importance. The importance of sample size is reflected in the guidelines provided by the pioneering work of Tibayrenc et al. (1991), who listed eight criteria to detect clonality, among which fixed heterozygosity, deviation from HWE and LD were expected to be important for diagnosing clonality. Nevertheless, these criteria would apply only to diploid species with extreme rates of clonality, excluding haploid lineages and diploid species with c < 0.95.
One may consider the clonal mechanisms and the way clonal replicates disperse in space to better estimate the joint effect of sampling density and the scale of clonal dispersal (which drives the scale of spatial autocorrelation of genotypes relative to the grain size of sampling) on the ability of a given strategy to detect clonal replicates, and therefore on the conclusions derived from population genetics data as to the incidence of sexual versus clonal reproduction. Along a continuum of dispersal from microorganisms such as unicellular algae and flying aphids to clonal plants with strong rhizomatic connections and ramets more often clumped than dispersed, the spatial autocorrelation of clones increases, as does the ability of a given sampling strategy to reveal clonal replicates at equal sampling densities. As a consequence, at the first end of this continuum, where spatial dispersal is not limited (as is the case for A. minutum), genotypic parameters alone may not be informative on the existence or extent of clonality except for nearly strictly clonal organisms such as the human pathogen Trypanosoma cruzi. Such power would be gained as the spatial distance of clonal dispersal becomes lower than the sampling mesh size (for an example of the influence of sampling strategy in corals, see Gorospe et al., 2015; see Riginos, 2015, for a comment), and clonal replicates would become decreasingly randomly diluted at large population sizes and across vast spatial scales.
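The two ends of this continuum can be contrasted with a one-dimensional toy example. The sketch below is an illustration under assumed values, not a model of any species discussed here: a hypothetical transect of 100,000 sites holds 2,000 clones of 50 ramets each, either clumped in contiguous patches or fully mixed along the transect, and 200 sites are sampled on a regular grid with the same mesh in both cases. Clonal richness is taken as R = (G - 1)/(N - 1), and all function names are illustrative.

```python
import random


def clonal_richness(genotypes):
    """R = (G - 1) / (N - 1): G distinct genotypes among N sampled units."""
    n = len(genotypes)
    return (len(set(genotypes)) - 1) / (n - 1)


def make_transect(n_sites, patch_len, dispersed, rng):
    """Clone identity at each site of a hypothetical 1-D transect.

    Clumped case: each clone occupies patch_len contiguous sites.
    Dispersed case: the same clones are shuffled over the whole transect.
    """
    clones = [site // patch_len for site in range(n_sites)]
    if dispersed:
        rng.shuffle(clones)
    return clones


def sample_grid(clones, mesh, n_samples):
    """Take n_samples sites on a regular grid with spacing mesh."""
    return clones[: mesh * n_samples : mesh]


rng = random.Random(1)  # fixed seed so the sketch is reproducible
N_SITES, PATCH_LEN, MESH, N_SAMPLES = 100_000, 50, 10, 200  # assumed values

clumped = make_transect(N_SITES, PATCH_LEN, dispersed=False, rng=rng)
mixed = make_transect(N_SITES, PATCH_LEN, dispersed=True, rng=rng)

r_clumped = clonal_richness(sample_grid(clumped, MESH, N_SAMPLES))
r_mixed = clonal_richness(sample_grid(mixed, MESH, N_SAMPLES))

print(f"clumped clones:   sample R = {r_clumped:.3f}")
print(f"dispersed clones: sample R = {r_mixed:.3f}")
```

Both transects contain exactly the same clones at the same frequencies, yet the same sampling effort reveals many replicates when clones are clumped and very few when they are freely dispersed.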