Sequencing data
We selected species (or species complexes) that are widely distributed across the Top End and Kimberley regions, that have geographically extensive genetic sampling with precise GPS coordinates, and that represent different genera and families of lizards. These include geckos (Gehyra and Heteronotia; Gekkoninae), skinks (Carlia; Scincidae) and dragons (Diporiphora; Agamidae). All but Heteronotia (represented by the generalist H. binoei) include species with different habitat requirements and, based on prior multilocus sequencing, varying scales of phylogeographic structure (Figure 1C; Supplementary Material S1), making them well suited to our study. Based on prior phylogeographic analyses, and using mtDNA for lineage identification (except in rare cases of known mtDNA introgression), we selected a subset of 579 samples (135 Carlia, 147 Diporiphora, 214 Gehyra and 83 Heteronotia) for the SNP screening, focusing on spatially unique individuals to maximize the number and geographic spread of sampled localities across the known range of each taxon (see Battey et al., 2020). We treat closely related and parapatrically distributed lineages as phylogeographic units, whether or not they have been recently revised taxonomically (Supplementary Material S1).
Our SNP detection method, Diversity Array Technology (DArT™), uses restriction-enzyme-based complexity reduction and sequencing on next-generation sequencing platforms to identify SNPs within randomly distributed 75 bp contigs (Jaccoud, 2001), and has proven valuable for detecting admixture between populations (Melville et al., 2017; Unmack et al., 2017) and for landscape genetic studies (De Fraga, Lima, Magnusson, Ferrão, & Stow, 2017; Rossetto et al., 2019). Details of the SNP genotyping can be found in Georges et al. (2018) and Wells and Dale (2018). The sequences were processed with proprietary DArT analytical pipelines to map reads and call SNPs, separately for each genus and, within the older Gehyra radiation, separately for the australis, koira and nana clades (Figure S3). First, sequences are quality filtered using stringent selection criteria that compare the barcode region to the rest of the sequence. Next, sequences are aggregated into clusters using the DArT fast clustering algorithm with a Hamming distance. Then, SNP markers are identified within each cluster, and an index of reproducibility is calculated for each locus. The resulting data contain the presence/absence of restriction fragments (SilicoDArT) and the final SNP calls, with the position of each variant base given relative to the restriction fragment. To ensure the quality of our data, we filtered SNPs by repeatability across technical replicates (>98%) and call rate (<10% missing data), and removed singletons, using the dartR package (Gruber, Unmack, Berry, & Georges, 2018) in RStudio (RStudio Core Team, 2015).
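The three quality filters described above can be illustrated with a minimal Python sketch. It is not the dartR implementation used in this study; it only restates the filtering logic under stated assumptions. The function name `filter_snps`, the genotype coding (0, 1 or 2 copies of the alternate allele, `None` for a missing call) and the definition of a singleton as a locus whose minor allele is observed exactly once across all individuals are assumptions introduced here for illustration.

```python
from typing import List, Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def filter_snps(
    genotypes: Sequence[Sequence[Genotype]],
    repeatability: Sequence[float],
    min_repeatability: float = 0.98,
    max_missing: float = 0.10,
) -> List[int]:
    """Return the indices of loci that pass all three filters.

    genotypes     -- rows are individuals, columns are loci (non-empty)
    repeatability -- one value per locus, as a proportion between 0 and 1
    """
    n_individuals = len(genotypes)
    kept: List[int] = []
    for locus, rep in enumerate(repeatability):
        calls = [row[locus] for row in genotypes]
        observed = [g for g in calls if g is not None]

        # Filter 1: repeatability must exceed the threshold (>98%).
        if rep <= min_repeatability:
            continue

        # Filter 2: the fraction of missing calls must be below 10%.
        missing_fraction = (len(calls) - len(observed)) / n_individuals
        if missing_fraction >= max_missing:
            continue

        # Filter 3: drop singletons, taken here to mean loci whose
        # minor allele occurs exactly once among the observed alleles.
        alt_count = sum(observed)
        ref_count = 2 * len(observed) - alt_count
        if min(alt_count, ref_count) == 1:
            continue

        kept.append(locus)
    return kept
```

Calling `filter_snps(genotypes, repeatability)` with an individuals-by-loci matrix returns the column indices of the retained SNPs. The thresholds are exposed as parameters so that the stringency can be varied without changing the logic.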