SNP calling and statistical genomic analyses
Fastp v. 0.19.7 (Chen, Zhou, Chen, & Gu, 2018) was used to trim the RAD-seq reads, keeping reads with a minimum quality of 15 before mapping individual sequences against the reference genome of the Great tit (Laine et al., 2016; GenBank assembly accession: GCA_001522545.3) with BWA v0.7.17 (H. Li & Durbin, 2010). Genotyping was conducted with stacks v2.41 (Rochette, Rivera-Colón, & Catchen, 2019) “gstacks ” and “population ” functions, using “snp ” model, filtering for mapping quality >  10, alpha  =  0.05, minor allele frequency >  0.05 and observed heterozygosity <  0.65. Loci were retained if present in at least 90% of individuals in each population. Loci with extreme low or high coverage were removed (5% extremes filtered out using vcftools v0.1.15 Danecek et al., 2011). After filtering, 74,137 SNPs were retained for subsequent population genomic analyses.
To document genomic variation among urban and rural great tits from the three locations we used a redundancy analyses (RDA), with location (Barcelona, Montpellier and Warsaw), environment (urban or rural) and sex as explanatory variables. Partial RDA was also produced to test for each variable effect (environment, location or sex) alone after controlling for all other variables. The effect of a given factor was considered significant with a p-value <  0.05.
To estimate genome-wide differentiation between populations we used Weir and Cockerham’s FST (Weir & Cockerham, 1984) computed using the StAMPP R package (Pembleton, Cogan, & Forster, 2013). Average FST was estimated using all SNPs, and confidence intervals were assessed using 1000 bootstrap replicates.
We used two methods to investigate outlier SNPs potentially under divergent selection between forest and urban populations: an FST-outlier based method (using Bayescan v2.1, Foll & Gaggiotti, 2008), and a multivariate method (using a RDA, Forester, Jones, Joost, Landguth, & Lasky, 2016) aiming respectively at identifying strong outliers indicating footprints of differentiation for each population pair and weaker outliers that represents footprints of divergent selection typically expected in polygenic adaptations in response to complex environmental heterogeneity across several population pairs studied at once.
First, we ran Bayescan (with default parameter options) for each pair of populations (Barcelona, Montpellier & Warsaw) separately to detect SNPs with outlier values of FST. As recommended by (Foll & Gaggiotti, 2008) we considered SNPs as outliers when they displayed a q-value above the 0.1 threshold. Second, following a similar procedure as described above for the RDA analysis, we used a constrained RDA to investigate the effect of habitat (forest vs. city) and to identify outliers SNPs that displayed more than 3 times SD from the mean score on the constrained axis (Forester et al., 2016).