Methods
Worker RNA-seq data. S. invicta fire ant colonies were collected in Florida, United States and reared for one month under standard laboratory conditions (Jouvenaz, Allen, Banks, & Wojcik, 1977) before any experimental manipulations. Prior to and during experiments, ants were fed a diet of crickets, mealworms and a mixture of sugar, fruits, and vegetables. To study the indirect genetic effects (on gene expression in adults) of the social environment (i) from egg to the pupal stage, (ii) during the first 14 days as adults (i.e., after eclosion from the pupae), and (iii) for the full time course from egg through adulthood, we transferred 25-40 dark worker pupae (i.e., 0-48h prior to emergence) between recipient colonies that were formed from 10 stock monogyne and 9 stock polygyne colonies: SB/SB worker pupae from monogyne donor colonies were transferred to both monogyne and polygyne recipient colonies and SB/SB and SB/Sb worker pupae from donor polygyne colonies to polygyne recipient colonies. Because the Sb haplotype recessively induces early mortality in U.S. populations (Hallar, Krieger, & Ross, 2007; Ross, 1997),Sb/Sb adults are rare and were excluded from our study. A given stock colony was never used more than once as a donor or recipient colony for a given treatment and pupae were never transferred between donor and recipient colonies originating from the same stock colony. All recipient colonies contained 250 workers which were marked by cutting the middle leg at the level of the mid-femur, allowing the easy identification of cross-fostered workers. Monogyne recipient colonies contained one monogyne reproductive SB/SB queen while polygyne recipient colonies contained two egg-laying dealate SB/Sb queens (mating status unknown). Sixteen days after the cross-fostering, we sampled all 0.6-0.9 mg minor workers with six intact legs for the RNA-seq analyses (these individuals were all 14-16 day-old). The decision to sample single focal individuals was made for two reasons: first, we planned to correlate gene expression levels with DNA methylation data from the same individuals (but the DNA methylation data were not of sufficient quality to pursue this aim), and, second, this sampling scheme is comparable to the gyne data we reanalyzed (Arsenault et al., 2020).
Every sampled worker was genotyped using an allelic discrimination assay (Shoemaker & Ascunce, 2010). We dissected the brains of workers on ice in Phosphate-buffered saline (PBS) solution, removing neighboring glands and neural elements attaching the brain to the cuticular periphery. Gasters were removed by a single cut. Dissected samples were stored in liquid nitrogen. Total nucleic acids from brains were extracted using an Agencourt FormaPure kit (Beckman Coulter Life Sciences, Indianapolis, Indiana) with volumes scaled down by a factor 4. The first two washing steps (wash buffer and 70% ethanol) were carried out twice. Half of the extracted nucleic acids were eluted with 20 µl of RNase free water and concentrated to 5 µl. The sample was digested by adding 20 µl of DNase I mix (60 µl contains 1.5 µl 1M Tris-HCl [pH 7.5], 3.75 µl 0.1M MgCl2, 1.5 µl 2.5 M KCl, 1.5 µl 0.1M DTT, 3.75 µl RNasin Plus [40u/µl; Promega, Madison Wisconsin], 5 units of DNase I [RNase-free; cat no. 04 716 728 001; Roche, Basel, Switzerland] and 47.5 µl RNase-free water) and incubating at 37°C for 15 min. Then 25 µl of T.E. buffer (pH 7.5) and SDS (0.2% final concentration) were added and RNA was extracted with chloropane. 8 µl of linear acrylamide (5 mg/ml; Ambion; Austin, Texas) was added as carrier to the aqueous phase and nucleic acids were precipitated by adding sodium acetate to 0.3 M and 2.5 volumes of ethanol followed by a wash in 70% ethanol. The pellet was air-dried for 5 min. and re-suspended in 9 µl of RNase-free water. Sequencing libraries for brains were prepared using SMARTer Ultra Low Input RNA for Sequencing-V3 kit (Takara Bio, Kusatsu, Japan) with a modified protocol provided by manufacturer for brains. After the fragmentation step (volume 75 µl), 8 µl of acrylamide carrier was added and double-strand cDNA was precipitated with ethanol followed by a wash in 70% ethanol. The pellet was air-dried for 5 min., re-suspended in 10 µl 10 mM Tris-HCL (pH 8.5) and processed further with the sequencing kit. Sequencing libraries for gasters were prepared using the KAPA stranded RNA-seq kit (Kapa Biosystems, Wilmington, Massachusetts) following manufacturer’s instructions. Paired-end 100-bp reads were generated for all samples using an Illumina (San Diego, California) HiSeq2500.
Gyne RNA-seq data. S. invicta gyne RNA-seq data were generated for a prior study, with detailed sampling and library preparation protocols described therein (Arsenault et al., 2020). Briefly, gynes embarking on mating flights were collected in the field in Georgia, United States. Each Illumina sequencing library was prepared from a genotyped individual’s brain or ovary tissues following the Smart-seq2 protocol (Picelli et al., 2014), and 75-bp single end reads were sequenced. Eight SB/SB gynes were sampled from 8 monogyne colonies, and 8 pairs of SB/SB and SB/Sb gynes were sampled from 8 polygyne colonies. By sampling gynes (alate pre-reproductive queen) immediately prior to their mating flights, we were able to compare the indirect genetic effects of the supergene onSB/SB individuals raised in monogyne and polygyne colonies while avoiding the large differences in reproductive status among individuals that are associated with age differences.
Bioinformatic analyses. RNA-seq reads were trimmed using TrimGalore (https://github.com/FelixKrueger/TrimGalore) and aligned to the “SINVBB1” genome assembly (Yan et al., 2020) using STAR (Dobin et al., 2013) with the 2-pass alignment procedure. In the gyne data, between 10.7 and 28.9 million reads were aligned in each sample while the worker data had higher coverage with all samples having between 33.3 and 158.2 million reads (Supplementary Data F). Gene counts exported from STAR were used for differential expression analyses with edgeR (Robinson, McCarthy, & Smyth, 2009). Differential expression was computed for each caste and tissue separately to minimize batch effects from the separate library preparation and sequencing procedures. Within each caste, differential expression was first computed using a pairwise approach. Low coverage genes with filtered using the default edgeR procedure (“filterByExpr”), where 5 samples (based on 70% of the minimum sample size of a group) need to have at least 10 reads aligned and the total count across all samples need to be greater than 15. Samples from gyne brains retained 10,150 genes after filtration, gyne ovaries retained 10,260, worker brains retained 12,006, and worker gasters retained 11,574 of the 16,314 total genes in the genome annotation. Library sizes were normalized using edgeR’s “calcNormFactors” which computes scale factors using a trimmed mean of M-values approach. Common and tagwise dispersions were computed and Quasi-likelihood F-tests were performed to compute differential expression (Robinson et al., 2009). Generalized linear model comparisons were performed using models with variables designated for Caste, Tissue, Natal Colony Type, and Supergene Genotype (cross-fostered samples were not included in this analysis). Gene Ontology functional annotations for genes were assigned with the eggnog-mapper (v 1.0.3) (Huerta-Cepas et al., 2018) and enrichment of functional terms was assessed statistically using the default “weight01” Fisher’s Exact Test in topGO (Alexa & Rahnenfuhrer, 2010). In all differential expression comparisons, a series of FDR-corrected p-value thresholds were utilized (FDR < 0.01, 0.05, and 0.1) to give a more complete view of the structure of the results. While FDR < 0.01 minimizes false-positives, it increases the likelihood of false-negatives. These patterns reverse as the FDR threshold is lowered. Consequently, we are able to identify genes with highly significant FDR values while also using less stringent FDR cutoffs for global analyses (transcriptome intersections and GO term enrichment) that are more robust to false positives.