Methods
Ethics
The study was approved by the Danish
National Committee on Health Research Ethics (VEK journal ID 74188).
Data management and privacy policies were approved by the Danish Data
Protection Agency (journal ID REG-131-2020). Informed consent was
provided by all participants. An honorarium of 500 Danish kroner per
EEG session was paid to all participants.
Participants
Patients (N = 50) of both sexes, aged 18 to 59, were recruited from
three tertiary, free-of-charge public Mental Health Service outpatient
clinics in Denmark, as described in detail in Reinholt et al. (2021).
All patients had a primary diagnosis of agoraphobia, depression,
generalized anxiety disorder (GAD), obsessive-compulsive disorder (OCD),
social anxiety disorder (SAD) or panic disorder (PD), with or without
comorbidities (including another emotional disorder, attention-deficit
hyperactivity disorder and personality disorder), and were about to
start UP transdiagnostic group cognitive behavior therapy. Patients were
referred to the clinic after two previous failed treatment attempts in
primary care.
Exclusion followed the exclusion criteria for receiving treatment in the
participating outpatient clinics: an ICD-10 F20 diagnosis, bipolar
disorder or autism, alcohol or substance use disorder, increased risk
of suicide, recent (<4 weeks) onset or alteration of psychotropic
medication, and previous traumatic brain injury or organic brain
disorder as assessed by medical history. Patients were also required to
have normal mental capabilities, as estimated by having completed Danish
primary school.
Healthy comparison subjects (HC,
N = 37), matched with the patient group in age and sex, were recruited
from the local community through posters and online advertisements.
Exclusion criteria were the same as for patients, with the addition of
any prior or present psychiatric diagnosis or use of psychotropic
medication. All participants had normal or corrected-to-normal vision.
Clinical measures
Current medication status, including type of medication, dosage,
treatment duration and any changes thereof, was extracted from the
electronic health record. Information on handedness and hearing status
(normal/impaired) was obtained by interview. All participants were
assessed with the Mini-International Neuropsychiatric Interview version
7 (M.I.N.I.) diagnostic interview (Sheehan, 1998) by a psychiatry
trainee (MR). For patients, a primary
diagnosis within the emotional disorder spectrum was confirmed, and up
to three concurrent secondary diagnoses were noted. Healthy comparison
subjects were screened for the absence of symptoms fulfilling criteria
for any psychiatric diagnosis.
Psychometrics
The battery of psychopathology measures consisted of several validated
self-report questionnaires. While rating scales derived directly from
the HiTOP are under development and require local translation and
validation, the items selected in this study were deemed to be
sufficiently consistent with the HiTOP (Wendt et al., 2021).
The Multidimensional Emotional Disorder Inventory (MEDI; 49 items, each
rated 0 to 8) assesses nine empirically supported transdiagnostic
symptom dimensions within the Internalizing spectrum: autonomic arousal,
avoidance, depression, intrusive cognitions, neurotic temperament,
positive temperament, somatic anxiety, social anxiety and traumatic
re-experiencing (Rosellini & Brown, 2019). Note that when calculating
the MEDI total score, the positive temperament score is subtracted
rather than added.
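The scoring rule can be sketched as follows. This is a minimal illustration only: the function name, the subscale keys and the example scores are hypothetical and are not study data.

```python
def medi_total(subscale_scores):
    """Sum MEDI subscale scores, subtracting positive temperament."""
    total = 0
    for name, score in subscale_scores.items():
        if name == "positive_temperament":
            total -= score  # reverse contribution, as described in the text
        else:
            total += score
    return total


# Hypothetical subscale scores for one participant (illustration only).
example_scores = {
    "autonomic_arousal": 10,
    "avoidance": 12,
    "depression": 15,
    "intrusive_cognitions": 8,
    "neurotic_temperament": 20,
    "positive_temperament": 14,
    "somatic_anxiety": 9,
    "social_anxiety": 11,
    "traumatic_reexperiencing": 5,
}
print(medi_total(example_scores))  # 76
```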
The Modified Personality Inventory for DSM-5 and ICD-11 – Brief Form
Plus (PID5BF+M, here referred to as PID36; 36 items, each rated 0 to 3)
assesses six personality trait domains in the Internalizing and Thought
disorder spectra: anankastia, antagonism, detachment, disinhibition,
negative affectivity and psychoticism (Bach et al., 2020). Note that
this version is developed in accordance with the coming ICD-11, which is
moving toward a dimensional understanding of the personality disorders
(Bach & Mulder, 2022).
In addition, two shorter self-report questionnaires were administered to
assess the severity of the transdiagnostic dimensions personality
pathology and psychological distress, respectively: the Level of
Personality Functioning Scale-Brief Form (LPFS; 12 items, each rated 1
to 4) (Hutsebaut et al., 2016) and the K10 distress scale (K10; 10
items, each rated 1 to 5) (Kessler et al., 2003).
For both HC and patients, this battery was administered in conjunction
with the respective EEG recordings. Participants were instructed to
complete the questionnaires within one week before or after the EEG
recording. All psychopathology measures were obtained using the
online survey and database management web application REDCap licensed to
Region Zealand, Denmark (Harris et al., 2019).
Procedures
EEG laboratory setup
EEG was recorded at two psychiatric hospitals in Region Zealand,
Denmark. Each session took place either in the morning or early
afternoon. The first session, baseline, lasted approximately three hours
including information, electrode and cap application, EEG recording
(~1 hour), M.I.N.I. diagnostic interview and breaks.
Sessions two and three, for patients after 10 and 14 weeks,
respectively, lasted at most two hours and consisted only of the EEG
recording. In order to account for normal variation in the statistical
models, some HCs were invited to a second recording after at least 8
weeks. Participants were instructed to show up rested and to avoid
coffee and nicotine intake for 2 hours before the recording. Patients
were also instructed
to avoid, if possible, medication prescribed “as needed” on the night
before and day of recording.
During EEG recording,
participants were seated in a comfortable armchair in a secluded room
and instructed to sit as still as possible. Visual stimuli were
presented on a 17” LCD monitor situated 1.5 meters from the
participant. Audio stimuli were presented with airtube stereo insert
earphones (C and H Distributors Inc., Milwaukee, WI, USA, 2021). Similar
room luminosity at the two sites was ensured with blackout curtains but
was not objectively measured.
EEG recording
EEG at the two sites was
recorded with identical BioSemi ActiveTwo Mark 2 systems with 64 Ag/AgCl
pin-type active electrodes attached to a cap according to the extended
10/20 system (BioSemi, Amsterdam, 2021). The signal was recorded
reference-free with common mode sense (CMS) and driven right leg (DRL)
electrodes as “ground” placed centrally close to POz. The signal was
digitized with a sampling rate of 2048 Hz. Electrode offset was kept
below 40 mV.
EEG paradigms
All paradigms were presented using Presentation® software version 23.0
(Neurobehavioral Systems, 2021). The paradigms, presented in this order
for all participants, were:
Attended oddball (AO)
Auditory stimuli (N = 1500 in 5 blocks) were delivered monaurally in a
pseudorandom order: 10% target tones (1100 Hz, 50 ms duration), 6%
distractor tones (a 50 ms bell sound) and 84% standard tones (1000 Hz,
50 ms). All stimuli had a sound intensity of 50 dB and a 10 ms rise/fall
time. Participants were instructed to fixate on a white cross on a black
background on the monitor and to press the left mouse button with their
index finger when hearing the target tone while ignoring distractor
stimuli. Participants started with a practice round of 30 stimuli.
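The stimulus proportions above can be illustrated with a short sketch that builds one possible sequence. It assumes equally sized blocks and an unconstrained shuffle within each block; the actual pseudorandomization constraints used in Presentation are not described here, and the function name and seed are hypothetical.

```python
import random

# Stimulus percentages as stated in the text.
PERCENTAGES = {"target": 10, "distractor": 6, "standard": 84}


def oddball_blocks(n_stimuli=1500, n_blocks=5, seed=0):
    """Return a list of blocks, each a shuffled list of stimulus labels."""
    per_block = n_stimuli // n_blocks  # assumption: equally sized blocks
    rng = random.Random(seed)          # fixed seed for reproducibility
    blocks = []
    for _ in range(n_blocks):
        block = []
        for label, percent in PERCENTAGES.items():
            block.extend([label] * (per_block * percent // 100))
        rng.shuffle(block)             # assumption: no ordering constraints
        blocks.append(block)
    return blocks


blocks = oddball_blocks()
flat = [label for block in blocks for label in block]
print({label: flat.count(label) for label in PERCENTAGES})
# {'target': 150, 'distractor': 90, 'standard': 1260}
```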
Flanker
The flanker task was a modified version of the Eriksen Flanker (Eriksen
& Eriksen, 1974) commonly used in research (e.g., Riesel et al., 2022;
Seow et al., 2020). Five horizontal arrows were presented in white on a
black background on the monitor. Trials (N = 480 in 10 blocks) could be
either congruent (<<<<< or >>>>>) or incongruent (<<><< or >><>>) and
were presented for 200 ms. Trials were 50% congruent and 50%
incongruent, presented in random order. Participants were instructed to
respond as quickly and accurately as possible by pressing either the
left or right mouse button indicating the direction of the central
arrow. Participants had 1050 ms to respond. Feedback was delivered on
the monitor at the end of each block: “Try to respond faster!” if there
were >90% correct responses or <25% missed trials, “Try to respond more
accurately” if accuracy was <75%, and “Good job!” otherwise. If
participants had fewer than 17 errors in total, up to two extra blocks
were administered in order to ensure internal consistency of the ERN
(Clayson, 2020). Participants first completed a practice round of 12
trials to ensure that the instructions were understood.
Unattended oddball (UO)
Participants watched a muted nature documentary while auditory stimuli
(N = 1800 in 6 blocks) were delivered monaurally at 50 dB in a
pseudorandom order: 6% frequency deviant tones (1100 Hz, 50 ms
duration), 6% duration deviant tones (1000 Hz, 100 ms duration), 6%
combined frequency and duration deviant tones (1100 Hz, 100 ms duration)
and 82% standard tones (1000 Hz, 50 ms). All stimuli had a 10 ms
rise/fall time. Participants were instructed to ignore all auditory
stimuli and focus on the monitor.
EEG preprocessing
EEG data were processed offline in EEGLAB 2023.1 on MATLAB R2021b
(Delorme & Makeig, 2004; Mathworks, 2022). Artifacts and noise were
cleaned with the Reduction of Electroencephalographic Artifacts (RELAX)
preprocessing pipeline, a novel pipeline based on an empirical
assessment of established cleaning methods (Bailey et al., 2022, 2023).
We applied the default RELAX pipeline, RELAX_MWF_wICA, which utilizes
methods from the following published toolboxes: FieldTrip, the MWF
toolbox, wICA (Castellanos & Makarov, 2006), ICLabel (Pion-Tonachini et
al., 2019), PREP (Bigdely-Shamlo et al., 2015) and Zapline-plus (Klug &
Kloosterman, 2021). Given that single-trial analysis handles noisy data
better, and in order to remove as little brain activity and retain as
many trials as possible, especially in the Flanker paradigm, we applied
RELAX with less stringent settings than default for our main analysis
(specified below).
Prior to processing in RELAX, the raw BioSemi EEG data were imported
into EEGLAB reference-free and down-sampled to 250 Hz. In initial
preprocessing steps, RELAX removed line noise at 50 Hz with the
Zapline-plus toolbox and referenced data to common average with the PREP
toolbox after the automatic removal of extremely noisy or flat channels.
Data were high-pass filtered at 0.25 Hz and low-pass filtered at 80 Hz
using the default RELAX Butterworth filter, which is suggested to
perform better than EEGLAB’s pop_eegfiltnew (Bailey et al., 2023). Note
that RELAX applies a 0.25 Hz high-pass filter by default instead of the
commonly used 1 Hz, a trade-off which somewhat decreases the quality of
the subsequent independent component analysis (ICA) decomposition but
does not distort the ERP time course (Bailey et al., 2023; Luck, 2014;
Tanner et al., 2016; Winkler et al., 2015).
Next, artifact reduction based on multi-channel Wiener filtering (MWF)
with a delay period of 30 and wavelet-enhanced ICA (wICA) with the
extended infomax ICA algorithm was performed with cleaning parameters
less strict than the RELAX defaults. Specifically, the muscle slope
threshold was -0.31 (default -0.59), no channels were deleted due to
muscle artifacts (default: channels with 5% or more muscle artifacts are
deleted) and only channels with 15% or more extreme artifacts were
deleted (default 5%). Other settings remained at their defaults,
including removal of at most 20% of channels. Across all sessions, on
average 59.7 channels remained for the AO, 59.8 for the Flanker and 60.2
for the UO paradigm. There was no difference between groups in the
number of removed channels at baseline.
After interpolating removed channels, the preprocessed data were epoched
and baseline-corrected according to parameters predetermined for each of
the three paradigms (see Table 1). Note that RELAX applies a
regression-based baseline correction method instead of the traditional
subtraction, which has been shown to distort the ERP waveform (Alday,
2019). For response-locked ERPs in the Flanker paradigm, the
regression-based baseline correction used one factor with two levels:
correct and error response. For stimulus-locked ERPs from the Flanker
paradigm and ERPs from the other two paradigms, the regression used zero
factors. Next, epochs with an absolute voltage amplitude exceeding 100
µV (default 60 µV) or a kurtosis/improbable data limit exceeding 3
standard deviations (SD)/median absolute deviation (MAD) overall or 5
SD/MAD at any channel were rejected.
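The amplitude criterion alone can be sketched as below. This is not the RELAX implementation: it covers only the absolute-voltage rule, omits the kurtosis and improbable-data criteria, and uses a hypothetical nested-list data layout (epochs, then channels, then samples in µV).

```python
def reject_by_amplitude(epochs, threshold_uv=100.0):
    """Keep epochs whose absolute amplitude never exceeds the threshold.

    epochs: list of epochs; each epoch is a list of channels; each channel
    is a list of samples in microvolts (hypothetical layout).
    """
    kept = []
    for epoch in epochs:
        # Largest absolute sample across all channels of this epoch.
        peak = max(abs(sample) for channel in epoch for sample in channel)
        if peak <= threshold_uv:
            kept.append(epoch)
    return kept
```

With the less strict 100 µV threshold, an epoch peaking at 80 µV is retained, whereas it would be rejected at the 60 µV default.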
Across all sessions, on average 1070 epochs of all trial types remained
for the AO, 417 for the Flanker paradigm and 972 for the UO paradigm. At
baseline, in the AO paradigm, there was a significant difference between
groups in the number of remaining epochs (HC: 1093, Patients: 1056;
t(85) = 3.32, p = 0.001). This difference was driven by differences in
the number of remaining Standard stimuli epochs (t(85) = 1.90, p =
0.061) and Target stimuli epochs (t(85) = 1.77, p = 0.08), neither of
which reached significance on its own. This small difference was deemed
to be of no consequence for the main analysis. There were no further
differences between groups in the number of remaining epochs for any of
the other paradigms. Finally, the
preprocessed data were converted to BIDS format to facilitate the
sharing of data with the community (C. R. Pernet et al., 2019).
Table 1 shows an overview of paradigm and ERP variables.
>> Table 1 here <<
ERP analysis and statistics
All demographic and behavioral statistical analyses were conducted in R
(R Core Team, 2023). ERP statistical models were designed, evaluated and
visualized using LIMO EEG in EEGLAB and MATLAB functions (Delorme &
Makeig, 2004; C. R. Pernet et al., 2011).
After preprocessing, ERP single-trial data were processed in LIMO EEG.
For a given subject, session and channel, this first level of the GLM
has the general form

$$Y = X\beta + \varepsilon$$

where $Y$ denotes the single-trial ERP data in the form of a trials ×
time frames matrix, $X$ is a design matrix coding for the
paradigm-specific stimulus types, $\beta$ are the first-level beta
coefficients to be estimated and $\varepsilon$ is the residual term
representing what is left when the effects of the beta coefficients are
accounted for.
The constant term $\beta_0$, in LIMO EEG referred to as the adjusted
mean, warrants special attention, as effects on the other beta
parameters are modulations around this constant term. For example, the
response-locked Flanker model is
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$, where $\beta_1$
is the beta coefficient corresponding to correct responses and $\beta_2$
corresponds to error responses. Accordingly,
$\mathrm{CRN} = \beta_0 + \beta_1$ and $\mathrm{ERN} = \beta_0 + \beta_2$.
Given the near-identical triphasic waveform of the CRN and the ERN, if
the ERN is more negative-going than the CRN, $\beta_0$ necessarily lies
in-between. Therefore, $\beta_1$ will be positive-going even though the
CRN is a negative-going wave. As such, if a result indicates that
$\beta_1$ is modulated negatively by a psychopathology measure, the
interpretation is that a greater, or more negative, CRN correlates with
higher scores.
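A small numeric sketch may make this sign logic concrete. It assumes, purely for illustration, that the constant term sits exactly midway between the two condition amplitudes; LIMO EEG's actual parameterization may place it elsewhere between them. The amplitudes are hypothetical values in µV, not study data.

```python
def betas_around_constant(crn_uv, ern_uv):
    """Split two condition amplitudes into a constant and two deviations.

    Assumption for illustration: the constant is the midpoint of the two
    amplitudes, so each condition equals constant + its own coefficient.
    """
    constant = (crn_uv + ern_uv) / 2.0
    beta_correct = crn_uv - constant  # coefficient for correct responses
    beta_error = ern_uv - constant    # coefficient for error responses
    return constant, beta_correct, beta_error


constant, beta_correct, beta_error = betas_around_constant(-2.0, -8.0)
print(constant, beta_correct, beta_error)  # -5.0 3.0 -3.0

# A negative modulation of the correct-response coefficient (3.0 -> 2.0)
# yields a CRN of -5.0 + 2.0 = -3.0, i.e. more negative than -2.0.
print(constant + (beta_correct - 1.0))  # -3.0
```

Although the CRN amplitude is negative (-2 µV), the correct-response coefficient is positive (+3), and lowering that coefficient makes the CRN more negative.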
First-level model parameter estimation used weighted least squares
(WLS), a robust extension of ordinary least squares (OLS) which uses
principal component projection to down-weight outlier trials (C. Pernet
et al., 2022). In all, ten ERP beta models were evaluated, each
containing one or more classic ERP components (see Table 1 for
stimulus types and associated ERP models).
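The effect of down-weighting can be illustrated for the simplest case, a model with only a constant term, where the WLS estimate reduces to a weighted mean. The trial amplitudes and weights below are hypothetical; in LIMO EEG the weights come from the principal component projection, which is not reproduced here.

```python
def weighted_mean(trial_values, weights):
    """WLS estimate for an intercept-only model: the weighted mean."""
    weighted_sum = sum(w * y for w, y in zip(weights, trial_values))
    return weighted_sum / sum(weights)


# Hypothetical amplitudes (µV) at one time frame; the last trial is an outlier.
trials = [-4.0, -5.0, -6.0, 40.0]
ols_estimate = weighted_mean(trials, [1.0, 1.0, 1.0, 1.0])  # plain mean
wls_estimate = weighted_mean(trials, [1.0, 1.0, 1.0, 0.1])  # outlier down-weighted
print(ols_estimate)  # 6.25
print(wls_estimate)
```

The unweighted mean (6.25 µV) is pulled far toward the outlier, whereas the weighted estimate stays close to the bulk of the trials.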
At the second level, for each of these ERP beta models, we applied mass
univariate robust linear regression as implemented in LIMO EEG. Age,
gender, group, medication status, session number and the psychopathology
measure corresponding to each session were explanatory variables. The
general form of the model was:

$$\hat{\beta} = Z\gamma + \eta$$

where $\gamma$ are the second-level beta coefficients to be estimated,
$Z$ is the explanatory variables data matrix, $\hat{\beta}$ is the
first-level ERP beta model defined above and $\eta$ is the residual
term.
In this linear regression model, gender was coded as female = 1, male =
-1. Group was coded as HC = 1, Patient = -1. Due to the large variety in
dosage and type, medication status was coded with two dummy variables
denoting no prescription (-1, -1), one prescription (1, -1) and more
than one prescription (1, 1). Medication prescribed “as needed” was
not considered since patients were instructed to avoid intake from the
afternoon before the day of recording. Session was coded with three
columns indicating with 1 or 0 whether the particular entry of the data
matrix belonged to baseline, week 10 or week 14. The psychopathology
measure was likewise coded in three columns containing the score for the
associated session. Accordingly, the explanatory variables data matrix
had nine columns and as many rows as there were data sets (N = 172). For
example, the row corresponding to data set 125 encoded the subject’s age
= 25, gender = -1 (male), group = -1 (Patient), medication = [1 1]
(more than one prescription), session = [… 0 1 0 …] (week 10) and
psychopathology measure = [… 0 39 0] (week 10).
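The coding scheme can be sketched as a small function that encodes one data set into labelled blocks. The function name, argument names and session labels are hypothetical; only the codes themselves follow the description above.

```python
SESSIONS = ("baseline", "week10", "week14")  # hypothetical session labels


def encode_dataset(age, gender, group, n_prescriptions, session, score):
    """Encode one data set using the coding scheme described in the text."""
    if n_prescriptions == 0:
        medication = (-1, -1)  # no prescription
    elif n_prescriptions == 1:
        medication = (1, -1)   # one prescription
    else:
        medication = (1, 1)    # more than one prescription
    return {
        "age": age,
        "gender": {"female": 1, "male": -1}[gender],
        "group": {"HC": 1, "Patient": -1}[group],
        "medication": medication,
        # One indicator column per session.
        "session": tuple(1 if s == session else 0 for s in SESSIONS),
        # The score is placed in the column of its own session.
        "score": tuple(score if s == session else 0 for s in SESSIONS),
    }


row = encode_dataset(25, "male", "Patient", 2, "week10", 39)
print(row)
```

For the example above this gives gender = -1, group = -1, medication = (1, 1), session = (0, 1, 0) and score = (0, 39, 0).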
Maximum likelihood estimates of the second-level beta coefficients
$\gamma$ were computed at each time frame, channel by channel, using
iterative re-weighted least squares (IRLS). IRLS is a robust extension
of OLS that assigns lower weights to outlier subjects, and has been
shown to increase sensitivity in the analysis of neuroimaging data
(Wager et al., 2005). The estimates $\hat{\gamma}$ represent the effects
of each of the explanatory variables (plus a constant term) on the ERP
model at each data point.
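The principle of IRLS can be sketched for the simplest possible model, a single constant term. The Huber weight function, the tuning constant and the fixed MAD-based scale are assumptions chosen for illustration; LIMO EEG's weighting function and stopping rules may differ, and the data are hypothetical.

```python
import statistics


def irls_location(values, tuning=1.345, max_iter=100, tol=1e-10):
    """Robust estimate of a constant via IRLS with Huber weights."""
    center = statistics.median(values)
    mad = statistics.median([abs(v - center) for v in values])
    scale = mad / 0.6745  # MAD rescaled to be comparable to an SD
    estimate = statistics.mean(values)  # start from the OLS solution
    weights = [1.0] * len(values)
    if scale == 0:
        return estimate, weights
    for _ in range(max_iter):
        weights = []
        for v in values:
            residual = abs(v - estimate) / scale
            # Full weight for small residuals, reduced weight for large ones.
            weights.append(1.0 if residual <= tuning else tuning / residual)
        updated = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        converged = abs(updated - estimate) < tol
        estimate = updated
        if converged:
            break
    return estimate, weights


# Hypothetical subject-level values; the last subject is an outlier.
data = [1.0, 1.2, 0.9, 1.1, 1.0, 10.0]
robust_estimate, final_weights = irls_location(data)
print(statistics.mean(data), robust_estimate)
```

The ordinary mean is drawn toward the outlying subject, while the IRLS estimate stays near the other five values because the outlier receives a weight close to zero.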
Next, a linear combination of these second-level beta coefficients was
used to test for significant effects of the psychopathology measures
(Kiebel & Friston, 2004). Specifically, we defined a reduced model by
applying the contrast vector $c$ and tested, channel by channel, at each
time frame the null hypothesis $H_0: c^{T}\gamma = 0$, where $\gamma$
are the second-level beta coefficients and $c^{T}$ is the transpose of
the contrast vector. In other words, we tested the null hypothesis of no
effect on the first-level ERP beta model $\hat{\beta}$ of the
psychopathology measures while accounting for the other explanatory
variables. Note that this contrast model did not assess whether
psychotherapy treatment changes ERP features or modulates the
association with psychopathology measures, or whether associations are
present only at a given session, e.g. at baseline. On the upside, the
model allowed us to state that detected associations were present across
groups and sessions irrespective of effects of psychotherapy.
The associated one-sided t-test was:

$$t = \frac{c^{T}\hat{\gamma}}{\sqrt{\hat{\sigma}^{2}\, c^{T}\left(Z^{T} W Z\right)^{-1} c}}$$

where $\hat{\sigma}^{2}$ is the variance of the full model and $W$ the
weights estimated with IRLS applied to the explanatory variables data
matrix $Z$.
The result of these many one-sided t-tests was an uncorrected
statistical parametric map (SPM) of size channels × time frames for each
ERP model. Correction for multiple comparisons (MC) was conducted using
threshold-free cluster enhancement (TFCE) as implemented in LIMO EEG
using 1000 bootstrap iterations (Mensen & Khatami, 2013; C. R. Pernet,
2015; Smith & Nichols, 2009). TFCE builds on traditional bootstrap or
permutation-based cluster MC correction methods commonly used in
neuroimaging research, e.g. spatiotemporal clustering (Maris &
Oostenveld, 2007; Sassenhagen & Draschkow, 2019). However, instead of
pre-specifying a cluster-forming threshold and assigning to a cluster
all connected data points whose corresponding t-value is above this
threshold, the method considers clusters formed at all possible
thresholds. The more clusters a given data point belongs to within the
range of thresholds, the higher the assigned TFCE score. As a result,
whereas in traditional spatiotemporal clustering methods the threshold
would influence what type of clusters are detected, in TFCE narrow
clusters with high t-values are equalized with broad clusters with lower
t-values (Smith & Nichols, 2009). At each data point $p$, the TFCE score
is given by:

$$\mathrm{TFCE}(p) = \int_{h_{0}}^{h_{\max}} e(h)^{E}\, h^{H}\, dh$$

where $h_{0}$ and $h_{\max}$ are the minimum and maximum t-values in the
data, respectively, $e$ is the cluster extent, $h$ is the cluster height
and $E$ and $H$ are scaling constants, which in LIMO EEG are fixed to
0.5 and 2, respectively (C. R. Pernet, 2015). To arrive at the final
corrected SPM of significant t-values, the method proceeds with
estimating the empirical TFCE distribution through bootstrapping.
Importantly, subjects are sampled with replacement, so that all data
sets belonging to a subject are resampled together. Then, the maximum
TFCE values from each bootstrap are sorted and the value at position
$(1 - \alpha) N_{b}$ is the estimated TFCE threshold, where $\alpha$ is
a pre-determined significance threshold and $N_{b}$ is the number of
bootstrap iterations. Data points whose TFCE score exceeds this
threshold are deemed significant at the $\alpha$ level and the
corresponding t-values are included in the SPM. Note that a trade-off
for the increased cluster-detection capabilities of TFCE is that one
cannot state which of the included data points make a cluster
significant (Smith & Nichols, 2009).
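The TFCE score and the bootstrap threshold can be sketched in simplified form. This is not the LIMO EEG implementation: the sketch handles a single one-dimensional series of non-negative t-values (time only, no channel neighbourhoods), approximates the integral with a fixed step `dh`, and reads the threshold rule as the sorted value at position (1 - alpha) times the number of bootstraps. The step size, the rounding of that position and the function names are assumptions.

```python
def tfce_1d(t_values, dh=0.1, E=0.5, H=2.0):
    """TFCE scores for a 1-D series of non-negative t-values."""
    scores = [0.0] * len(t_values)
    peak = max(t_values)
    if peak <= 0:
        return scores
    n_steps = int(peak / dh)
    padded = list(t_values) + [float("-inf")]  # sentinel closes a trailing cluster
    for step in range(1, n_steps + 1):
        h = step * dh  # current cluster-forming height
        start = None
        for i, t in enumerate(padded):
            if t >= h:
                if start is None:
                    start = i  # a new cluster begins here
            elif start is not None:
                extent = i - start  # number of connected points above h
                increment = (extent ** E) * (h ** H) * dh
                for j in range(start, i):
                    scores[j] += increment
                start = None
    return scores


def bootstrap_threshold(max_tfce_per_bootstrap, alpha=0.05):
    """Sorted bootstrap maximum at position (1 - alpha) * N (1-based)."""
    ordered = sorted(max_tfce_per_bootstrap)
    position = int(round((1 - alpha) * len(ordered)))
    index = min(max(position - 1, 0), len(ordered) - 1)
    return ordered[index]


narrow = tfce_1d([0, 2, 0, 0, 0])
broad = tfce_1d([0, 2, 2, 2, 0])
print(narrow[1], broad[1])
```

At the same height, a point inside the three-point cluster receives a score larger by a factor of the square root of 3 than the isolated point, which shows how cluster extent enters through the exponent E = 0.5.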
For our main analysis, we tested the associations between the 10 ERP
beta models defined in Table 1 and the four transdiagnostic
psychopathology measures (K10, LPFS, MEDI and PID36) at an $\alpha$
level of 0.001, or 0.1%. In case of significant results for MEDI and
PID36 total scores, we also show results for the respective subscales.
These results are presented in full in Supplementary Materials and
commented upon in Discussion.
Results are presented as heat maps in which red and blue of varying
intensity indicate positive and negative t-values, respectively.
Clusters of these t-values denote spatiotemporal regions where the
effects of psychopathology measures on the ERP model were significant.
As such, interpretation of results is in terms of direction of effects
at relevant regions of interest. To this end we also display shaded
regions indicating traditional ERP analysis time windows.
ERP grand averages are displayed for each stimulus type and group (HC
and Patient) as the 20% trimmed mean of subject-level weighted
single-trial ERP data. The trimmed mean is a robust estimate of the
central tendency of the raw single-trial data and corresponds to a
traditional grand average ERP waveform (C. R. Pernet et al., 2011;
Wilcox & Rousselet, 2018). Instead of traditional frequentist
confidence intervals (CI), which only give the long-run probability of
covering the true mean, LIMO EEG by default displays the 95% Bayesian
Highest Density Interval (HDI), which contains the 20% trimmed mean
with 95% probability (Morey et al., 2016; C. R. Pernet et al., 2011).
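As a brief illustration of the estimator, a 20% trimmed mean discards the lowest 20% and the highest 20% of the sorted values and averages the remainder. The sketch below is generic rather than the LIMO EEG code, and the example values are hypothetical.

```python
import math


def trimmed_mean(values, proportion=0.2):
    """Mean after removing the given proportion of values from each tail."""
    ordered = sorted(values)
    n_trim = math.floor(proportion * len(ordered))  # values cut per tail
    kept = ordered[n_trim:len(ordered) - n_trim]
    return sum(kept) / len(kept)


print(trimmed_mean([1, 2, 3, 4, 100]))  # 3.0
```

With five values, one value is removed from each tail, so the extreme value 100 does not influence the estimate, whereas the ordinary mean would be 22.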
Finally, we also show results from statistical analyses of demographic
and psychopathology measures. To test the internal consistency of MEDI
and PID36 we estimated McDonald’s Omega using functions from the R
package semTools applied to the baseline dataset (N = 87)
(Jorgensen et al., 2022). McDonald’s Omega has been suggested to be a
more reliable estimate than the commonly used Cronbach’s Alpha (Flora,
2020). To test for group differences in demographics and psychopathology
measures we applied Welch’s two-sample t-tests for continuous variables
and Fisher’s exact test for the categorical variable Gender. To test for
change in psychopathology measures across sessions, from baseline to
week 10 and 14, respectively, we applied linear mixed-effects regression
models using the R package lme4 (Bates et al., 2015). Confidence
intervals and p-values were computed with a Wald t-distribution
approximation (Luke, 2017).
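For readers unfamiliar with Welch's test, the arithmetic behind the statistic and its degrees of freedom can be sketched as follows. The analyses themselves were run in R; this standalone sketch only shows the formulas, and the two samples are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each sample mean (unequal variances allowed).
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


t_stat, dof = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
print(round(t_stat, 3), round(dof, 3))  # -1.897 5.882
```

Because the two samples have different variances, the degrees of freedom (about 5.9) fall below the 8 that a pooled-variance t-test would use.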