ERPs and alpha oscillations track the encoding and maintenance
of object-based representations in visual working memory
Siyi Chen, Thomas Töllner, Hermann J. Müller, & Markus Conci
Ludwig-Maximilians-Universität München, Munich, Germany
Short title: Object-based representations in visual working memory
Word count: 8897 (main text) + 250 (abstract)
Correspondence:
Siyi Chen
Allgemeine und Experimentelle Psychologie
Department Psychologie
Ludwig-Maximilians-Universität München
Leopoldstr. 13
D-80802 München
Germany
Email: Siyi.Chen@psy.lmu.de
Abstract
When memorizing an integrated object such as a Kanizsa figure, the
completion of parts into a coherent whole is attained by grouping
processes which render a whole-object representation in visual working
memory (VWM). The present study measured event-related potentials (ERPs)
and oscillatory amplitudes to track these processes of encoding and
representing multiple features of an object in VWM. To this end, a
change detection task was performed, which required observers to
memorize both the orientations and colors of six ‘pacman’ items while
inducing configurations of the pacmen that systematically varied in
terms of their grouping strength. The results revealed an effect of
object configuration in VWM despite physically constant visual input:
change detection for both orientation and color features was more
accurate with increased grouping strength. At the electrophysiological
level, the lateralized ERPs and alpha activity mirrored this behavioral
pattern. Perception of the orientation features gave rise to the
encoding of a grouped object as reflected by the amplitudes of the PPC.
The grouped object structure, in turn, modulated attention to both
orientation and color features as indicated by the enhanced N1pc and
N2pc. Finally, during item retention, the representation of individual
objects and the concurrent allocation of attention to these memorized
objects were modulated by grouping, as reflected by variations in the
CDA amplitude and a concurrent lateralized alpha suppression,
respectively. These results indicate that memorizing multiple features
of grouped, to-be-integrated objects involves multiple, sequential
stages of processing, providing support for a hierarchical model of
object representations in VWM.
Keywords: visual working memory, object-based representation, grouping,
lateralized ERPs, lateralized alpha suppression
Introduction
When perceiving meaningful visual objects in our cluttered environment,
the visual system has to integrate disparate component parts into
coherent wholes, as demonstrated, for example, by Kanizsa-type illusory
figures (Kanizsa, 1955). For instance, as depicted in Figure 1A (left panel), a configuration of six “pacman” elements generates the
perception of a star-shaped illusory object (a so-called ‘Kanizsa’
figure) with sharp boundaries that are perceived as lying above the
inducing circular elements. The perception of such an illusory object is
usually referred to as “modal completion” (see Michotte, Thines, &
Crabbe, 1964/1991). Recent neuroimaging studies showed activations in
the lateral occipital complex (LOC) to be linked to the processing of
Kanizsa figures, with closed shapes being represented via feedback
signals from mid-level visual areas to lower-level striate and
extrastriate areas (Chen et al., 2020, 2021b; Altschuler et al., 2012;
Murray et al., 2002; Lee & Nguyen, 2001; Stanley & Rubin, 2003).
The operation of binding smaller units into integrated whole objects not
only supports the structuring of perceptual input for more efficient
orienting and action in the environment, but also reduces capacity
limitations in visual working memory (VWM; Delvenne & Bruyer, 2006;
Morey, 2019; Morey et al., 2015; Nie et al., 2017; Peterson &
Berryhill, 2013; Woodman et al., 2003; Vogel et al., 2001). For
instance, when remembering the orientation of a gap in various disks,
memory performance improves when neighboring disks are grouped to form
an illusory rectangle, thereby effectively doubling the maximum number
of reportable items in VWM (Diaz et al., 2021; Gao et al., 2016). It has
also been suggested that individual, nonspatial features (such as color
and orientation) might be represented as bound objects in VWM (e.g.,
Luck & Vogel, 1997; Luria & Vogel, 2011; but see Gao et al., 2011; Ma
et al., 2014). For instance, Luck and Vogel (1997) showed that VWM
performance was essentially independent of the number of to-be-memorized
features that constituted a given object; instead, memory capacity
depended primarily on the number of individuated objects that had to be
retained (see also Delvenne & Bruyer, 2004; Vogel et al., 2001; but see
Wheeler & Treisman, 2002). Recently, Chen et al. (2021a) combined
manipulations of spatial grouping with a concurrent manipulation of
feature binding (see also Luck & Vogel, 1997; Luria & Vogel, 2011;
Fougnie et al., 2013; Olson & Jiang, 2002; Xu, 2002; Ecker et al.,
2013). In their study, a change detection task was used, which required
participants to memorize six pacman elements, each depicting a unique
color and orientation as presented in an initial memory display. The
oriented pacmen could be grouped to form a complete illusory star,
render a partially grouped triangle, or, respectively, an ungrouped
configuration – thus gradually manipulating the strength of the
complete-object representation (see examples in Figure 1A).
Following a brief delay after the memory display offset, a single pacman
probe item appeared at one of the locations that had been occupied by an
item in the memory display. The task was to decide whether the probe
item was the same as or different from the pacman presented previously
at the same location in the memory display. Importantly, the change
could occur for grouping-relevant features (orientation) or for
grouping-irrelevant features (color). Thus, by varying the pacman
orientations so as to systematically manipulate the amount of closure in
the Kanizsa-type configuration (from a complete grouping through a
partial grouping to an ungrouped configuration), memory performance for
individual features (orientation and color) could be assessed as a
function of the displayed grouping. The
results showed that the grouped object enhanced both the
(grouping-relevant) orientation and (grouping-irrelevant) color
representations when both features were task-relevant (for the
same/different judgment), demonstrating that memory for various features
can be improved by encountering them in a spatial grouping.
While grouping benefited the storage of both grouping-relevant and
-irrelevant features in VWM, it remains unclear which processes
contribute to this benefit, as a facilitatory effect could emerge at
various stages of processing. For instance, current models that link
object perception, attention and memory (for reviews see e.g., Bundesen
et al., 2011; Walther & Koch, 2007) posit a
hierarchy of sequential processing stages with differentiable
computational mechanisms and neuronal sources, which
encompasses the initial, early perceptual stimulus analysis, the
subsequent allocation of attention to selected objects, and finally
their maintenance in memory. The present study was designed to
investigate these component processes by taking advantage of previously
established event-related potential (ERP) and oscillatory markers
associated with the encoding and maintenance of working memory contents,
the aim being to identify critical processes that are influenced by
object grouping. That is, we tracked the temporal dynamics of illusory
figure processing in order to investigate how object integration impacts
early perceptual, attentional, and memory-related processing stages.
The first series of lateralized ERP components of interest includes the
early positivity posterior contralateral (PPC), the subsequent posterior
N1pc, as well as the attention-related N2pc (also referred to as PCN).
PPC-like activations have been suggested to reflect selective visual
processing under conditions with relative saliency differences between
target and distractor stimuli (Akyürek & Schubö, 2011; Corriveau et al.,
2012; Fortier-Gauthier et al., 2012; Jannati et al., 2013; Gokce et al.,
2014; Barras & Kerzel, 2017), with a positive-going deflection emerging
contralateral to the target when the distractor is more salient than the
target in the opposite hemifield (Fukuda & Vogel, 2009; Wascher &
Beste, 2010; but see Töllner et al., 2012). For instance, the PPC was
found to be enhanced when the target was a non-salient “ungrouped”
Kanizsa-type configuration and the distractor a grouped, salient Kanizsa
figure (presented in the hemifield opposite to the target), relative to
a condition that reversed the roles of target and distractor and required
observers to search for a salient (grouped) target paired with a
non-salient, ungrouped distractor (Wiegand et al., 2015). Thus, in visual
search experiments, in which search items are usually distributed across
both visual hemifields, the PPC modulation appears to reflect, in
particular, the difficulty of ignoring salient distractors when
searching for a less salient target. By contrast, in working memory
tasks, the to-be-memorized array is typically presented in only one
hemifield, which is indicated by an arrow cue. In this case, the PPC would
be interpreted as reflecting the initial (perceptual) processing of
task-relevant, attended stimuli (Fortier-Gauthier et al., 2012). A number
of studies also found ERPs in response to illusory
figures, as compared to ungrouped baseline configurations, to reveal
differential processing in the posterior N1 (e.g., Herrmann & Bosch,
2001; Murray et al., 2004; Proverbio & Zani, 2002; Senkowski et al.,
2005; see also Murray et al., 2002, for even earlier effects), where
this early signal might reflect the initial biasing of attentional
priority towards illusory figures in the competition for selection
(Senkowski et al., 2005). In the subsequent time window, the actual
spatial-attentional selection of grouped vs. ungrouped configurations is
indexed by the N2pc (Conci et al., 2006; 2011; Töllner et al., 2015).
Previous work showed search to be more efficient for grouped, as
compared to ungrouped, targets (Conci et al., 2007; see also Nie et al.,
2016), and this is associated with larger N2pc amplitudes – which is
indicative of enhanced engagement of focal attention by the grouped
target (Conci et al., 2011) as opposed to a broader tuning of attention
by grouped, task-irrelevant distractors (Conci et al., 2006). Thus,
previous evidence suggests that the processing of an illusory figure
might be reflected in early perceptual ERPs (PPC), in the subsequent
biasing of initial attentional priorities (N1pc) and in the N2pc, which
is typically associated with the allocation of (focal) attentional
processing resources to a given (target) item (e.g., Eimer, 1996).
An additional component of interest is the contralateral delay activity
(CDA), a sustained negativity during the delay period between the memory
and test displays. The CDA has been found to monotonically scale with
the number of items held in VWM up to the measured storage limit (of
approximately 3–4 items; Fukuda et al., 2015; Luria et al., 2016;
Vogel & Machizawa, 2004). The CDA amplitude has also been reported to
decrease in some studies when to-be-remembered objects are bound or
grouped into higher-order units (Luria & Vogel, 2011; Luria et al.,
2016; Peterson et al., 2015), suggesting that it actually reflects the
number of “integrated units” represented in VWM. For example, the CDA
amplitude was comparable when memorizing only orientation features as
opposed to both color and orientation features, which were presented on
the same physical objects, whereas the CDA increased when the same
orientation and color features were presented as separate objects (Luria
& Vogel, 2011; Woodman & Vogel, 2008). The difference in the CDA
amplitude thus appears to reflect the number of separable objects.
Moreover, it has also been reported that similar colors may be
compressed in VWM such that the CDA amplitude for these colors is
essentially comparable to the amplitude for just one to-be-memorized
color (Gao et al., 2011; Peterson et al., 2015). Finally, the CDA has
also been shown to provide a characteristic, task-dependent signature of
the active maintenance process, where a larger CDA amplitude is observed
for identical stimuli when the task requires the encoding of objects
with high (as opposed to low) precision (Machizawa et al., 2012). In
agreement with this finding, Chen et al. (2018b) investigated “amodal”
completion (of occluded objects) in VWM and reported a sustained
increase in the CDA amplitude for globally completed objects (as
compared to uncompleted objects). For instance, when observers were
required to memorize occluded parts of an object, persistent mnemonic
activity (as indexed by an increased CDA amplitude) was required to
generate complete-object representations from physically specified
fragments and in order to maintain the resulting complete-object
representations in a readily accessible form (see also Ewerdwalbesloh et
al., 2016; Pun et al., 2012; Emrich et al., 2008). This suggests that the
representation of a globally completed object may, in some cases, also
require more (rather than less) mnemonic resources. Previous studies not
only reported comparable behavioral dynamics (e.g., Chen et al., 2018a)
but also partly overlapping neural mechanisms for amodal and modal
completions (Murray et al., 2004). It might therefore be conceivable
that modally completed, grouped vs. ungrouped variants of a Kanizsa
figure reveal similar VWM storage properties and generate similar CDA
patterns to shapes that are completed on the basis of amodal completion.
In sum, the role of the CDA concerning object binding and grouping
reveals a rather complex and seemingly flexible mechanism, which does not
necessarily reflect bottom-up objecthood cues on the basis of their
salience alone (for a review, see Luria et al., 2016). Rather, the CDA
appears to depend on specific stimulus characteristics in combination
with the related task demands.
Apart from ERPs, the maintenance process can also be tracked with
oscillatory markers. Several studies have demonstrated that posterior
(putatively visual) alpha oscillations (8–12 Hz) in the retention
interval are reduced in amplitude contralateral vs. ipsilateral to the
retinotopic location of the to-be-retained items (e.g., Grimault et al.,
2009; Lozano-Soldevilla et al., 2014), evidencing a relative amplitude
difference between mnemonically relevant and irrelevant information.
Accordingly, lateralized alpha-band activity has been taken to play a
role in mnemonic retention (for a review, see van Ede, 2018; Medendorp et
al., 2007; Fukuda et al., 2015; Erickson et al., 2017). Several studies
have further demonstrated a link between alpha oscillations during
retention and the concurrent location and orientation of
to-be-remembered items (Foster et al., 2016; Fukuda et al., 2016),
suggesting that alpha oscillations during VWM maintenance also track
feature-specific identity information of the to-be-memorized items
(Fukuda et al., 2016). Note that posterior-occipital alpha has also
been widely suggested to reflect an online index of top-down adjustments
of attentional control (e.g., Thut et al., 2006; Murphy et al., 2020;
Wang et al., 2019; 2021; Woodman et al., 2022), which is a critical
factor contributing to effective VWM maintenance (Unsworth et al., 2014;
Engle & Kane, 2004). Moreover, posterior-occipital alpha suppression
has been shown to vary with changes in attentional engagement
(Boudewyn & Carter, 2017), with larger alpha suppression being evident
when the attentional demands increase. Recall that VWM is usually
considered to reflect a system that provides both short-term stores of
representational formats and concurrent attentional, “executive”
control structures that keep task-relevant information active and
accessible during maintenance (Engle & Kane, 2004). The CDA and
lateralized alpha may thus be mapped onto two separable cognitive
mechanisms, relating to (i) the representation of individual objects and
(ii) associated internal attentional control processes, respectively.
That is, an increase in the lateralized alpha suppression for the
to-be-remembered items might be directly associated with an increase in
attentional control, in particular when the number of items in the
display exceeds the individual’s capacity to select a manageable subset
of items for efficient VWM storage (see also Fukuda et al., 2015).
In summary, the present study was designed to examine neural processing
stages potentially implicated in the grouping benefits when memorizing
individual features. Participants’ (lateralized) electrophysiological
brain activity was recorded while they performed a change detection task
that presented a to-be-memorized configuration comprising six pacman
items on one side of the display and a to-be-ignored placeholder
configuration of six gray circles on the other side. Participants had to
memorize the color and orientation of pacman items that were presented
either as a fully grouped, a partially grouped, or an ungrouped
configuration. Note that the various pacman arrangements produced
configurations differing in grouping strength without altering
the low-level properties of the image (see Figure 1A). That is,
the number of items and the overall physical stimulation were identical
for the grouped, partially grouped, and ungrouped stimulus configurations
(and for the task-irrelevant placeholders), such that the three
to-be-memorized types of configuration differed from each other only in
terms of grouping strength. Subsequent to a retention
interval, the test display was presented, which would reveal a probe
item on the cued side (and a placeholder circle on the uncued side). The
probe would either depict a color change, an orientation change, or no
change (see Figure 1B). In this way, we were able to track at
the neural level how the VWM representation of individual features is
aided by grouping. We assessed behavioral performance measures (change
detection accuracy) and lateralized ERP components, as well as
oscillatory signals.
Based on our previous, related study (Chen et al., 2021a), we expected a
grouping benefit in change detection performance that could, in
principle, be mirrored in several lateralized ERP components and/or in
corresponding oscillatory signals. We predicted that PPC amplitudes,
which reflect the initial perceptual processing of the stimuli, might be
modulated by the grouping of the to-be-memorized configurations because
of their inherent differences in the attentional requirements of initial
visual processing. For instance, the less a given configuration is
grouped, the greater the attentional requirements to process this
stimulus, which should be reflected in the PPC amplitudes. Variations in
attentional selection should also be evident in the subsequent N1pc and
N2pc components, revealing a more focused (and more strongly
lateralized) shift of attention to the to-be-memorized configuration
alongside an increase in grouping strength. For the memory stage,
orientation-based grouping might reduce the load by maintaining
integrated, coherent shape representations, thus enhancing the VWM
capacity for both color and orientation features, resulting in increased
CDA amplitudes. At the same time, the generation of a global shape
representation in the grouped Kanizsa figure might also be expected to
require more mnemonic resources, or storage capacity, than less grouped
items in order to achieve a higher representational precision, and this
should also impact the CDA. Finally, lateralized alpha suppression
contralateral to the to-be-remembered configurations was expected to
reveal variations of cognitive control devoted to the memorized items in
order to keep them active and accessible during the execution of complex
cognitive tasks. There might be a larger alpha suppression for ungrouped
relative to more grouped configurations, reflecting the greater
executive attention (and increased difficulty) involved in holding the
individual features of ungrouped configurations during maintenance.
Method
Participants. Twenty-four volunteers (12 females; mean age = 26.13 years,
SD = 2.67 years; all right-handed) participated in the experiment
for payment of €9.00 per hour. All participants had normal or
corrected-to-normal visual acuity and normal color vision. No subject
reported mental or neurological diseases. All observers provided written
informed consent, and the experimental procedure was approved by the
ethics committee of the Department of Psychology at
Ludwig-Maximilians-University, Munich. The sample size was larger than in
previous, similar studies (Chen et al., 2021a; Gao et al., 2016). A
power analysis conducted with G*Power (Erdfelder et al., 1996) revealed
that to detect a relatively large effect, f(U) = 0.5, of object
configuration with a power of 95% and an alpha of .05, a sample of only
12 participants would be required. We further increased our sample to
N = 24 observers to ensure sufficient statistical power in our
analyses.
Apparatus and Stimuli. The experiment was programmed in Matlab
using Psychophysics Toolbox functions (Brainard, 1997). Stimuli were
presented on a 19-inch computer monitor (1,024 × 768 pixels screen
resolution, 85-Hz refresh rate) against a black screen background (0.25
cd/m²). Participants were seated at a distance of
approximately 65 cm from the screen inside a shielded Faraday cage
(Industrial Acoustics Company GmbH, Germany).
A bilateral version of the change detection task was adapted from
previous studies, so as to be able to measure lateralized EEG components
(e.g., Vogel & Machizawa, 2004). The to-be-memorized stimulus
configuration (which was either presented on the left or right side of
the screen) consisted of six items, presented on an imaginary circle
(radius: 4° of visual angle), with all items arranged equidistantly to
one another. Each item was a filled circle with a radius of 2.4° of
visual angle and a 60° opening (1/6 of the overall area of the circle),
thus forming a “pacman”-like figure. Each pacman was presented in a
different color (all 5.0 cd/m²; blue, RGB: 49,64,249;
red, RGB: 172,11,2; green, RGB: 15,102,11; purple, RGB: 138,35,160;
orange, RGB: 140,70,0, and mint, RGB: 50,99,109) and with a different
orientation of its “mouth” (i.e., for a given pacman, the cut-out
section could be rotated at an angle of 0°, 60°, 120°, 180°, 240°, or
300°, respectively). The distribution of the six colors among the six
items was randomized on every trial. The distribution of the “mouth”
orientations was determined by the three experimental conditions that
were presented with equal probability throughout the experiment. In the
“ungrouped” condition, the six possible mouth orientations were
randomly assigned to the six display locations (Figure 1A,
Ungrouped). In the “partial-grouping” condition, the openings of
three items were oriented towards the center of the display, thus
forming either an upward- or downward-pointing (illusory) triangle
(Figure 1A, Partially grouped). The mouth orientations of the
remaining three items were drawn randomly, without replacement, from the
three orientations not yet assigned. Finally, in the “grouped”
condition, the openings of all
six items were oriented towards the center of the screen such that they
formed an illusory star (Figure 1A, Grouped). In this way, a
given memory display would always consist of six distinct colors and six
distinct mouth orientations, irrespective of the grouping condition.
Thus, for all three types of configuration, each display presented an
equal number of (six) colors and orientations, such that the basic
physical stimulation was identical across conditions. Of note, the
ungrouped configuration served as a baseline: the pacman elements were
randomly oriented (as well as randomly colored), making them unlikely to
render any kind of grouped object, allowing us to assess whether change
detection performance would be enhanced by any type of grouped
structure. Finally, in the hemifield opposite to the memory array, a
to-be-ignored placeholder configuration was presented, which consisted
of six gray (RGB: 92,92,92) circles with a central hole (Figure
1A, Placeholder). These placeholders were similar in luminance to the
memory items, and the size of the removed central circle corresponded to
the size of the cut-out segment in the pacman items. This ensured that
both display halves presented stimulus arrays with an identical physical
stimulation, yet only the memory configuration provided task-relevant
color and orientation information, while the placeholders remained
constant throughout the entire experiment.
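The assignment of mouth orientations in the three conditions can be illustrated with a minimal Python sketch. This is not the original Matlab code. The function names, the position indexing, and the convention that an item at polar angle k × 60° faces the display center when rotated by that angle plus 180° are assumptions made for illustration only. The sketch also does not exclude random draws that happen to turn additional openings towards the center, since the text does not state whether such displays were filtered.

```python
import random

# Mouth rotations listed in the text (degrees).
ORIENTATIONS = [0, 60, 120, 180, 240, 300]
N_ITEMS = len(ORIENTATIONS)


def inward_orientation(position):
    """Rotation at which the opening faces the display center.

    ASSUMPTION: item k sits at polar angle k * 60 deg, and a rotation of
    0 deg opens along the 0-deg direction, so the inward-facing rotation
    is the position angle plus 180 deg.
    """
    return (position * 60 + 180) % 360


def assign_orientations(condition, rng):
    """Return one mouth orientation per location for the given condition."""
    inward = [inward_orientation(k) for k in range(N_ITEMS)]
    if condition == "grouped":
        # All six openings face the center (illusory star).
        return inward
    if condition == "ungrouped":
        # Random permutation of the six orientations.
        return rng.sample(ORIENTATIONS, N_ITEMS)
    if condition == "partial":
        # Every second item faces the center (one of two possible triangles).
        first = rng.choice([0, 1])
        triangle = set(range(first, N_ITEMS, 2))
        others = [k for k in range(N_ITEMS) if k not in triangle]
        # The three orientations not used by the triangle, in random order.
        leftover = [inward[k] for k in others]
        rng.shuffle(leftover)
        result = list(inward)
        for k, orientation in zip(others, leftover):
            result[k] = orientation
        return result
    raise ValueError(f"unknown condition: {condition!r}")


rng = random.Random(1)
displays = {c: assign_orientations(c, rng)
            for c in ("ungrouped", "partial", "grouped")}
```

In every condition the returned list is a permutation of the same six orientations, which mirrors the point made above that the displays differ only in how the orientations are distributed across locations.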
Procedure and Design. Figure 1B illustrates an example
trial sequence. Each trial started with the presentation of a central
white fixation circle (0.6° × 0.6°), which remained on the screen for
the entire trial. After 300 ms, two white arrows (1.1° × 1.1°) appeared
above and below the fixation circle for 300 ms, with both arrows
pointing either to the left or to the right (with equal probability).
After a short delay period (that lasted for a random interval between
300 and 500 ms), the memory display appeared for 300 ms, presenting an
ungrouped, partially grouped, or grouped configuration on the cued side
(i.e., as indicated by the initially presented arrows) together with a
gray placeholder configuration on the uncued side. This was followed by
a 1000-ms retention interval during which a blank screen was presented.
Next, a test display appeared, consisting of a single pacman item on the
cued side and a single gray circle on the uncued side, each positioned
randomly at one of the six possible item locations (that had been
occupied in the memory array) on its respective side. The probe display
was presented
until the participant issued a response: pressing the left or,
respectively, the right mouse key to indicate whether the probe item was
the same as or different from the pacman at the same location in the
preceding memory display. Participants were instructed to respond as
accurately as possible. In half of the trials, the probe on the cued
side was identical (in terms of both color and gap orientation) to the
item presented at that particular location in the previous memory
display (no-change condition). In the other half of trials, the probe
item was changed in either color or orientation (with equal probability)
relative to the probed item in the memory array. The change was realized
by presenting the probed item in either the color or the orientation of
one of the other five items (randomly selected) in the memory display,
thus encouraging observers to memorize individual items as conjunctions
of color and orientation (rather than just independent sets of
orientations and colors).
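The construction of the probe item described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the original Matlab implementation: the function name `make_probe`, the representation of an item as a (color, orientation) pair, and the pairing of colors with orientations in the example display are all hypothetical.

```python
import random


def make_probe(memory_items, probed_index, change, rng):
    """Build the probe for one trial.

    memory_items: list of (color, orientation) pairs, one per location.
    change: "none", "color", or "orientation".
    """
    color, orientation = memory_items[probed_index]
    if change == "none":
        return color, orientation
    # The changed feature is borrowed from a randomly chosen other item.
    other_indices = [i for i in range(len(memory_items)) if i != probed_index]
    donor_color, donor_orientation = memory_items[rng.choice(other_indices)]
    if change == "color":
        return donor_color, orientation
    if change == "orientation":
        return color, donor_orientation
    raise ValueError(f"unknown change type: {change!r}")


# Hypothetical example display: six distinct colors and orientations.
colors = ["blue", "red", "green", "purple", "orange", "mint"]
orientations = [0, 60, 120, 180, 240, 300]
memory = list(zip(colors, orientations))

rng = random.Random(7)
probe = make_probe(memory, 2, "color", rng)
```

Because every color and every orientation occurs exactly once per display, a changed probe always recombines two feature values that were both present in the memory array, so a change cannot be detected from a novel feature value alone.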