Epigenetic processes have taken center stage in the
investigation of many biological processes, and epigenetic modifications
have been shown to influence phenotype, morphology and behavioral traits such
as stress resistance by affecting gene regulation and expression without
altering the underlying genomic sequence. The multiple molecular layers
of epigenetics synergistically construct the cell type-specific gene
regulatory networks, characterized by a high degree of plasticity and
redundancy to create cell-type specific morphology and function. DNA
methylation occurring on the fifth carbon (C5) of cytosines in different genomic
sequence contexts is the most studied epigenetic modification. DNA
methylation has been shown to provide a molecular record of a large
variety of environmental factors, which might persist through the
entire lifetime of an organism and even be passed on to the offspring.
Animals might display altered phenotypes mediated by epigenetic
modifications depending on the developmental stage or the environmental
conditions as well as during evolution. Therefore, the analysis of DNA
methylation patterns might allow deciphering previous exposures,
explaining ecologically relevant phenotypic diversity and predicting
evolutionary trajectories enabling accelerated adaptation to changing
environmental conditions. Despite the explanatory potential of DNA
methylation, which integrates genetic and environmental factors to shape
phenotypic variation and contributes significantly to evolutionary
dynamics, studies of DNA methylation are still scarce in the field of
ecology. This might be at least partly due to the complexity of DNA
methylation analysis and the interpretation of the acquired data. In the
current issue of Molecular Ecology Resources, Laine and colleagues
(2023) provide a detailed summary of guidelines and valuable
recommendations for researchers in the field of ecology to avoid common
pitfalls and perform interpretable genome-wide DNA methylation
analyses.
A large variety of techniques to study DNA methylation at
high resolution, either genome-wide or at specific loci, has been
devised, and this field is evolving rapidly, with new methods being
developed due to technological advances as well as our increasing
knowledge of the gene regulatory landscape created by epigenetic
modifications (Tost, 2022). Epigenome-wide association studies
analyzing genome-wide DNA methylation patterns have been performed in
many human cohorts, demonstrating altered DNA methylation associated
with most phenotypes that have so far been investigated as well as with
exposure to environmental conditions ranging from temperature to
chemical pollutants. These findings provide a solid scientific basis
for similar studies in non-model animals.
Laine and colleagues focus on bisulfite sequencing, which is the most
commonly used technology, detailing the most frequently employed
variations of whole genome bisulfite sequencing (WGBS) such as classical
MethylC-seq and post bisulfite adaptor tagging (PBAT) as well as
approaches to focus on parts of the genome through Reduced
Representation Bisulfite Sequencing (RRBS). WGBS comes at a
significantly higher cost, requires much more extensive computational
resources and, due to the evolutionary depletion of CpGs in mammals,
yields many sequencing reads that do not contain a single methylation
position, but it allows all CpGs in a genome to be assessed. On the
other hand,
RRBS is much more cost-effective but requires knowledge about the
expected genomic context and CpG density to identify the best suited
pair of restriction enzymes for selection of these regions. It should be
noted that data from human studies point to enhancers as being
significantly enriched for inter-individually variable and
exposure-sensitive DNA methylation patterns rather than CpG
island-containing promoters, which remain in most cases unmethylated
(Gu et al., 2016; Lea et al., 2018). This class of regulatory elements
is, however, difficult to enrich by RRBS.
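As a minimal sketch of how the choice of restriction enzymes determines
which CpGs an RRBS library can reach, the following Python example
performs an in silico digest followed by size selection. The function
name, the default enzyme (MspI, which cuts C^CGG), the 40-220 bp size
window and the simplification that every base of a retained fragment is
sequenced are assumptions made for illustration only; they are not
taken from Laine and colleagues.

```python
import re


def in_silico_rrbs(sequence, sites=None, min_len=40, max_len=220):
    """Digest one strand of `sequence` and size-select the fragments.

    `sites` maps each recognition site to the cut offset within it.
    Returns (selected, covered_fraction): half-open (start, end)
    coordinates of the retained fragments and the fraction of all CpGs
    that fall inside them.
    """
    if sites is None:
        sites = {"CCGG": 1}  # assumption: MspI alone, cutting C^CGG
    sequence = sequence.upper()

    # A lookahead pattern also reports overlapping recognition sites.
    cuts = set()
    for site, offset in sites.items():
        for match in re.finditer(f"(?={re.escape(site)})", sequence):
            cuts.add(match.start() + offset)

    bounds = [0] + sorted(cuts) + [len(sequence)]
    fragments = [(s, e) for s, e in zip(bounds, bounds[1:]) if e > s]
    selected = [(s, e) for s, e in fragments if min_len <= e - s <= max_len]

    # A CpG is indexed by its cytosine and counted as covered when that
    # cytosine lies inside a size-selected fragment.
    cpgs = [m.start() for m in re.finditer("(?=CG)", sequence)]
    covered = sum(any(s <= p < e for s, e in selected) for p in cpgs)
    return selected, (covered / len(cpgs) if cpgs else 0.0)


# Toy sequence: a CpG-rich stretch flanked by two MspI sites, followed
# by a CpG-free tail.
toy = "A" * 10 + "CCGG" + "ACGT" * 15 + "CCGG" + "A" * 300
selected, fraction = in_silico_rrbs(toy)
print(selected, round(fraction, 3))
```

Running the same function on a draft genome or on the reference of a
related species, with different entries in `sites` (for example adding
TaqI as `"TCGA": 1`), gives a rough impression of how many CpGs, and
which genomic regions, a given enzyme combination would capture before
any library is prepared.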
There are a number of challenges associated with DNA methylation
analysis in general and especially in the context of ecology and
evolution, which the authors elaborate on well and for which they
provide practical solutions. These include notably the sample size
required to obtain statistically meaningful results as well as
important considerations on key parameters such as the balance between
sequencing depth and the number of replicates. One of the most critical
parameters, as pointed out by the authors, is the type of sample that
will be analyzed, a point which cannot be stressed enough. As DNA
methylation patterns are cell-type specific, it is important to
critically question the relevance of the analyzed sample / tissue type
for the scientific question to be addressed in a study. For example,
despite their ease of access, blood samples might not be an ideal
target for analyzing behavioral or cognitive traits. A large portion of DNA
methylation patterns are similar between different cell-types and
tissues, with unmethylated promoters and highly methylated gene bodies
and intergenic regions leading to highly significant correlation
coefficients between tissues. However, these regions will in most cases
also not vary depending on other factors such as environmental
conditions, while CpGs displaying variability between tissues are in
many cases also those that will be more sensitive to modifications by
exposure and correlate with certain phenotypes. Furthermore, if a tissue
of relevance can be obtained, its composition will strongly determine
the overall DNA methylation patterns. If a specific cell population is
of interest, a simple way to avoid confounding is to couple the DNA
methylation analysis with cell sorting. Most WGBS protocols are easily
compatible with the amount of DNA that can be obtained after cell
sorting. If no specific cell population can be preselected, the
potential heterogeneity in cellular composition between samples can be
identified and corrected for using either cell-type specific reference
DNA methylomes or reference-free deconvolution algorithms, which
estimate the number of cell types present (Teschendorff and Zheng,
2017). While reference-based deconvolution algorithms show improved
performance in terms of sensitivity and accuracy compared to
reference-free methods, reference epigenomes will rarely be available
for wild vertebrate species, and reference-free methods will nonetheless
be appropriate to detect large heterogeneity in cell-type composition
between samples and potentially control for it. Of note, one might
not always want to control for this heterogeneity, as exposure-induced
differential cell composition might be of interest in itself, as we
recently demonstrated for in utero exposure to synthetic phenols
(Jedynak et al., 2021).
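The idea behind reference-based correction for cellular composition can
be sketched in a few lines of Python. The example assumes only two cell
types with known reference methylomes and models the bulk methylation
level at each CpG as a weighted mixture of the two; the function name
and all values are hypothetical. Published deconvolution methods handle
many cell types and use more robust estimators, so this is an
illustration of the principle and not a substitute for them.

```python
def estimate_fraction(bulk, ref_a, ref_b):
    """Least-squares estimate of the proportion of cell type A.

    Model per CpG: bulk = f * ref_a + (1 - f) * ref_b, with 0 <= f <= 1.
    All inputs are equally long lists of methylation levels (0 to 1).
    """
    if not (len(bulk) == len(ref_a) == len(ref_b)):
        raise ValueError("all profiles must cover the same CpGs")
    num = sum((m - b) * (a - b) for m, a, b in zip(bulk, ref_a, ref_b))
    den = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    if den == 0:
        raise ValueError("reference profiles do not differ at any CpG")
    return min(1.0, max(0.0, num / den))  # clip to a valid proportion


# Hypothetical reference methylomes at four CpGs.
ref_a = [0.9, 0.1, 0.8, 0.2]
ref_b = [0.1, 0.9, 0.3, 0.2]

# Simulated bulk sample containing 30% of cell type A.
bulk = [0.3 * a + 0.7 * b for a, b in zip(ref_a, ref_b)]
estimated = estimate_fraction(bulk, ref_a, ref_b)
print(round(estimated, 3))
```

The fourth CpG has the same level in both references and therefore
contributes nothing to the estimate, which mirrors the point made above:
only CpGs that differ between cell types are informative about
composition. The estimated proportion can then be included as a
covariate in the statistical model, or analyzed as an outcome in its own
right.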
Comprehensive DNA methylation studies using whole genome bisulfite
sequencing have so far been scarce in larger sample cohorts, mainly due
to the cost of sequencing. It should be noted that the number of usable
sequences after trimming and mapping for bisulfite converted libraries
is substantially lower than for conventional genome sequencing. The use of
reference genome sequences from related species due to the absence of a
proper reference for the exact species under investigation will further
decrease the quantity of uniquely mapped reads. While cost has been the
main limiting factor, this year has seen a number of novel
high-throughput sequencers being announced and released onto the market
that will lead to a significant decrease in costs associated with
genome-wide bisulfite sequencing, with DNA methylation analysis already
reported for some of them (Lee et al., 2022). Due to the revived
competition on the sequencing market, further evolutions in terms of
cost and throughput can be expected in the near future enabling the
analysis of larger sample sizes. While sequencing protocols based on
bisulfite conversion have proven robust and reproducible, with over- or
incomplete conversion being a problem in only a minority of cases,
alternative protocols making use of enzymatic conversion such as EM-seq
(Vaisvila et al., 2021) have been devised and have shown performance
similar to or better than bisulfite-based methods, particularly when
very limited amounts of input material are used. Recent developments
such as combined genetic and epigenetic sequencing, which provides
simultaneous information on both classic and epigenetically modified
nucleotides (Fullgrabe et al., 2023), could simplify DNA methylation
analysis in ecology research in the future, especially if combined with
long-read sequencing, which would make it possible to create a genetic
reference genome and a genome-wide map of DNA methylation at the same
time and to phase genetic and epigenetic variation.
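For readers new to the readout of conversion-based protocols, the
following Python sketch shows, under simplifying assumptions, how a
converted read relates to the original sequence and how per-site
methylation levels and conversion efficiency are commonly derived from
C and T counts. The function names are hypothetical, only the top
strand is modeled, and the conversion estimate assumes a control
sequence known to be unmethylated (for example a spike-in), which is
not something the text above prescribes.

```python
def bisulfite_convert(sequence, methylated):
    """Return the top strand as read after conversion and amplification.

    Cytosines whose 0-based index is in `methylated` are protected and
    stay C; every other cytosine is read as T.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(sequence.upper())
    )


def methylation_level(c_reads, t_reads):
    """Fraction of reads reporting C at one cytosine, or None if uncovered."""
    total = c_reads + t_reads
    return c_reads / total if total else None


def conversion_rate(control_c, control_t):
    """Conversion efficiency from a control assumed to be unmethylated.

    Any C still read in such a control marks a conversion failure.
    """
    total = control_c + control_t
    if total == 0:
        raise ValueError("no reads cover the control")
    return control_t / total


print(bisulfite_convert("ACGTCCGA", {1}))
print(methylation_level(8, 2))
print(conversion_rate(5, 995))
```

Because most cytosines are read as thymine, converted reads consist
largely of three bases, which is one reason why fewer of them can be
placed uniquely on a reference genome. Over-conversion, in which
methylated cytosines are also converted, cannot be detected with an
unmethylated control and would require a methylated one.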
While the analysis of DNA methylation is a first and important step to
decipher the mechanisms for altered phenotypes and traits, it will in
the future be necessary to analyze the multiple facets of epigenetics to
gain a full understanding of the functional impact of epigenetic
changes. This includes, of course, histone modifications as well as
non-coding RNAs, but also the oxidative derivatives of cytosine
methylation such as 5-hydroxymethylation, which is highly prevalent in
gene-regulatory elements in some cell-types such as neurons and has an
opposite effect on gene expression compared to cytosine DNA methylation.
Of note, bisulfite treatment is unable to differentiate between DNA
methylation and hydroxymethylation and alternative technologies are
required (Tost, 2022). Other methylation marks, such as DNA adenine
methylation and the methylation of RNAs (epitranscriptomics), are
currently being intensively investigated for their gene regulatory
potential, but also for their ability to be modified by environmental
exposures and their capacity to memorize these exposures (Wu et al.,
2023).
The field of ecology and evolution will strongly benefit from these
technological advances and new discoveries on the complex gene
regulatory landscape defined by epigenetic modifications in humans and
model organisms. Although epigenetic analyses might have unique
challenges, the guidelines by Laine et al. (2022) provide very valuable
guidance and will help scientists who would like to add an epigenetic
dimension to their ecology research to avoid frustrating and costly
mistakes and to fully apprehend the molecular basis of phenotypic
variability, rapid environmental adaptation and evolutionary dynamics,
in which epigenetic modifications undoubtedly play a major role.