2.3 Phylogenetic Analysis
A total of 35 species (3 newly sequenced in this study and 32 retrieved from GenBank) representing eight subfamilies of Hesperiidae were used to reconstruct the phylogenetic relationships. The ingroup comprised 5 species of Coeliadinae, 1 of Euschemoninae, 2 of Pyrginae, 4 of Tagiadinae, 2 of Eudaminae, 3 of Heteropterinae, 2 of Barcinae, and 16 of Hesperiinae. Four Papilionidae species (P. machaon, P. helenus, G. timur, and P. apollo) were selected as outgroups (Table 1).
Mitochondrial genes were extracted from the complete mitogenomes using PhyloSuite v1.2.2, and the 13 protein-coding genes (PCGs) of the 39 species were aligned in batches with MAFFT as integrated in PhyloSuite. Nucleotide sequences of the PCGs were aligned using the G-INS-i (accurate) strategy in codon alignment mode, and the rRNA genes were aligned with the Q-INS-i strategy (Katoh & Standley, 2013). Poorly aligned sites were removed using Gblocks v0.91b (Castresana, 2000), and the individual gene alignments were then concatenated in PhyloSuite v1.2.2.
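As an illustration of the concatenation step, the following minimal Python sketch joins per-gene alignments into a single supermatrix and records the partition boundaries that a partitioning tool would need. It is not PhyloSuite's implementation: the function name `concatenate_alignments`, the toy taxa (`sp_A`, `sp_B`) and the toy sequences are hypothetical, and padding a taxon that lacks a gene with gaps is an assumption of this sketch.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]  # taxon name -> aligned sequence


def concatenate_alignments(
    genes: Dict[str, Alignment],
) -> Tuple[Alignment, List[Tuple[str, int, int]]]:
    """Concatenate per-gene alignments into one supermatrix.

    Returns the supermatrix and a list of (gene, start, end) partitions
    with 1-based inclusive coordinates. A taxon that is absent from a
    gene is padded with gaps ('-') for that gene (assumption of this sketch).
    """
    taxa = sorted({taxon for aln in genes.values() for taxon in aln})
    chunks: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    position = 0
    for gene, aln in genes.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        length = lengths.pop()
        for taxon in taxa:
            chunks[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((gene, position + 1, position + length))
        position += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in chunks.items()}
    return supermatrix, partitions


# Hypothetical toy input: two genes, one of them missing for sp_B.
toy_genes = {
    "cox1": {"sp_A": "ATGGCA", "sp_B": "ATGGCT"},
    "rrnL": {"sp_A": "TTAA"},
}
toy_matrix, toy_partitions = concatenate_alignments(toy_genes)
print(toy_matrix)
print(toy_partitions)
```

Keeping the partition coordinates alongside the matrix means that the same gene boundaries can be passed on to model selection without being recomputed.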
Three datasets were used to reconstruct the phylogenetic relationships: (1) the PCG matrix, containing all codon positions of the 13 protein-coding genes; (2) the PRT matrix, containing all codon positions of the 13 protein-coding genes plus the 22 tRNAs and 2 rRNAs; and (3) the 12PRT matrix, containing the first and second codon positions of the 13 protein-coding genes plus the 22 tRNAs and 2 rRNAs. Each dataset was analyzed with both maximum likelihood (ML) and Bayesian inference (BI). The optimal partitioning schemes and nucleotide substitution models for the ML and BI analyses were selected using PartitionFinder 2.1.1 (Lanfear, Frandsen, Wright, Senfeld, & Calcott, 2017) with the greedy algorithm and the Bayesian information criterion (BIC) (Tables S3 and S4).
ML trees were inferred using IQ-TREE (Nguyen, Schmidt, von Haeseler, & Minh, 2015). Node support was assessed with the ultrafast bootstrap (UFB) approximation (Minh, Nguyen, & von Haeseler, 2013) and the Shimodaira-Hasegawa-like approximate likelihood-ratio test (Guindon et al., 2010), and the bootstrap value (BS) of each node of the ML tree was calculated from 10,000 replicates.
BI was carried out using MrBayes 3.2.6 (Ronquist et al., 2012) with the following settings: two independent runs of 1 × 10⁷ generations, each with four Markov chain Monte Carlo (MCMC) chains (three heated and one cold), sampled every 1,000 generations. Convergence was assumed once the average standard deviation of split frequencies fell below 0.01. After the first 25% of the sampled trees from each run were discarded as burn-in, a consensus tree was built from the remaining trees. Support for each node of the BI tree is given as the Bayesian posterior probability (BP).
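The 12PRT matrix differs from the PRT matrix only in that third codon positions are removed from the protein-coding genes. A minimal Python sketch of that filtering step is given below. It assumes that the protein-coding alignment is still in reading frame (length divisible by three) after trimming; the function names `first_and_second_positions` and `build_12prt`, and the toy taxa and sequences, are hypothetical and are not part of PhyloSuite or any other tool named above.

```python
from typing import Dict


def first_and_second_positions(codon_alignment: str) -> str:
    """Drop every third site from an in-frame aligned coding sequence."""
    if len(codon_alignment) % 3 != 0:
        raise ValueError("codon alignment length must be a multiple of 3")
    return "".join(
        codon_alignment[i:i + 2] for i in range(0, len(codon_alignment), 3)
    )


def build_12prt(pcg: Dict[str, str], rna: Dict[str, str]) -> Dict[str, str]:
    """Join codon positions 1 and 2 of the PCGs with the RNA alignment.

    `pcg` and `rna` map taxon names to concatenated, aligned sequences
    and must contain the same taxa (assumption of this sketch).
    """
    if set(pcg) != set(rna):
        raise ValueError("PCG and RNA alignments must contain the same taxa")
    return {
        taxon: first_and_second_positions(seq) + rna[taxon]
        for taxon, seq in pcg.items()
    }


# Hypothetical toy input: three codons of PCG sequence and four RNA sites.
toy_pcg = {"sp_A": "ATGGCATTT", "sp_B": "ATG---TTC"}
toy_rna = {"sp_A": "GGCC", "sp_B": "GG-C"}
print(build_12prt(toy_pcg, toy_rna))
```

Whole-codon gaps are handled like any other codon, so the filtered sequences remain aligned across taxa.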
3. Results and Discussion