Gene annotation
Repeat families were identified and classified de novo using RepeatModeler, after which the genome was masked using RepeatMasker. Protein-coding genes were then annotated using the MAKER2 annotation pipeline (Cantarel et al., 2008). Functional annotation was performed by aligning protein sequences against the protein database using BLAT (Kent, 2002) (identity > 30% and E-value < 1e-5). A detailed description is provided in File S1.
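The identity and E-value cutoffs used for functional annotation can be illustrated with a minimal sketch. It assumes the alignments were exported in a 12-column, tab-separated BLAST-style table (percent identity in column 3, E-value in column 11); the input format, the `Hit` type, the function names, and the example rows are all hypothetical and are not taken from the pipeline described in File S1.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent identity
    evalue: float


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse 12-column BLAST-style tabular lines (assumed input format)."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        yield Hit(fields[0], fields[1], float(fields[2]), float(fields[10]))


def filter_hits(hits: Iterable[Hit],
                min_identity: float = 30.0,
                max_evalue: float = 1e-5) -> List[Hit]:
    """Keep hits with identity > min_identity and E-value < max_evalue."""
    return [h for h in hits
            if h.identity > min_identity and h.evalue < max_evalue]


# Made-up rows for illustration only.
example = [
    "geneA\tprotX\t45.2\t200\t100\t2\t1\t200\t5\t204\t1e-30\t150",
    "geneB\tprotY\t28.0\t180\t120\t3\t1\t180\t1\t180\t1e-20\t90",  # identity too low
    "geneC\tprotZ\t60.0\t50\t20\t0\t1\t50\t1\t50\t0.01\t30",       # E-value too high
]
kept = filter_hits(parse_tabular(example))
```

Both thresholds are applied as strict inequalities, matching the wording "identity > 30% and E-value < 1e-5".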
Chromosome grouping and subgenome recombination test
We first collected genome data for four poplars and Salix suchowensis (Dai et al., 2014), together with de novo transcriptome assemblies of other white poplars (Table S2). We then performed gene family clustering on the protein sequences using OrthoMCL and conducted collinearity analysis using MCScanX (Y. Wang et al., 2012). We chose 1,052 single-copy, collinear orthologous genes to construct gene trees. Subsequently, the 38 chromosomes of P. tomentosa were partitioned into two subgenomes (2 × 19 chromosomes) based on phylogenetic distance. To investigate potential recombination between homologous gene pairs of the subgenomes, we compared the synonymous substitution rates of parent and progeny alleles, assuming that recombination would lead to higher substitution rates than would be observed in its absence. A detailed description is provided in File S1.
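The selection of single-copy orthologous genes from the clustering results can be sketched as follows. This is an illustration under stated assumptions, not the procedure in File S1: group members are assumed to be labeled `species|gene_id`, the expected copy number per species is passed in explicitly (the text does not state how the two P. tomentosa subgenome copies were counted), and the collinearity filter is omitted. The species codes, gene names, and function name are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def select_orthogroups(groups: Dict[str, List[str]],
                       expected: Dict[str, int]) -> Dict[str, List[str]]:
    """Return groups whose per-species gene counts match `expected` exactly.

    `groups` maps a group ID to members labeled 'species|gene_id'
    (assumed labeling). `expected` maps each species code to the
    required number of copies (an assumption of this sketch).
    """
    selected = {}
    for group_id, members in groups.items():
        # Count genes per species from the prefix before '|'.
        counts = Counter(member.split("|", 1)[0] for member in members)
        if dict(counts) == expected:
            selected[group_id] = sorted(members)
    return selected


# Made-up groups for illustration only.
groups = {
    "OG1": ["spA|g1", "spB|g7", "spC|g3"],            # one copy per species
    "OG2": ["spA|g2", "spA|g9", "spB|g4", "spC|g5"],  # duplicated in spA
    "OG3": ["spA|g6", "spB|g8"],                      # missing spC
}
single_copy = select_orthogroups(groups, {"spA": 1, "spB": 1, "spC": 1})
```

Passing the expected counts as a parameter keeps the sketch neutral about the copy-number criterion, which would have to be set to match the actual analysis.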