### Participants

Five individuals (4 female; M = 24.3 years, SD = 5.1 years) participated in this study. All participants were naïve to the purpose of the study and had not previously participated in a perceptual learning or an rt-NFB study. All had previous experience with psychophysical tasks. An essential inclusion criterion was that subjects were able to maintain stable fixation throughout an entire trial during the experimental task. This was evaluated by the experimenter during a preliminary session where subjects were administered 20 example trials before the start of the experiment. All participants were right-handed, had no known neurological or psychiatric disorders, had normal or corrected-to-normal vision, and were not red-green color blind. They gave written informed consent in accordance with the Declaration of Helsinki and the procedures and protocols approved by the local Ethics Committees on Human Research at Boston University and Massachusetts Institute of Technology (MIT). Task examples were administered at Boston University. The neurofeedback training in MEG was conducted at the MIT McGovern Institute for Brain Research.

### Experimental protocol

The purpose of the NFB training was to decrease the time that subjects required to switch spatial attention from one visual field to the other over multiple consecutive days of training in MEG. The stimulus, shown schematically in Fig. 4, is described in detail in the Appendix. Before the beginning of the experiment, participants were instructed to pay attention to either the red dots or the green dots. They were also asked to maintain fixation on the white cross-hair fixation mark presented at the center of the display for the duration of each trial. Each trial began with the simultaneous display of two random-dot kinematograms (RDKs), each consisting of red and green dots, in two circular apertures, one in the left and one in the right visual field. The center of each aperture was presented at 8 degrees eccentricity from the central fixation mark. The red and green dot patterns were superimposed on each other and moved diagonally, in directions orthogonal to each other. Because the patterns were superimposed, participants had to attend to the aperture in order to perceive a change in the direction of the specified dot pattern. Between 500 and 1000 ms after the beginning of the trial, a white arrow, pointing to the left or right, was superimposed onto the fixation mark, indicating which RDK (in the left or right visual field) to attend to. As soon as subjects perceived the direction change of the attended dots, they had to switch attention to the aperture displayed in the non-attended (opposite) visual field (during the “switch” window). As soon as the arrow was displayed, a small disk of either a faint red or green color was superimposed on the RDK in the location to be switched to. Throughout the attend and switch windows, the disk briefly changed color (e.g., from red to green) at random times and was displayed at random locations within that aperture.
The participants’ task was to respond as quickly as possible, via a button-box press with the right hand, to the color of the disk after they had shifted attention to that visual field. Of the 80 trials in a block, 16 were catch trials in which the direction of the attended dots’ motion did not change; in the absence of this cue to switch attention, participants were instructed to continue attending to the motion until the end of the trial, when the screen went blank.

### The feedback cue

After either the participant responded or the trial ended, feedback was presented as a vertical red thermometer situated 2° above the fixation mark. The thermometer had a maximum height of 3°, a width of 1°, and a luminance of 40 cd/m². It was surrounded by a grey border of 0.1° thickness and a luminance of 35 cd/m². The height of the thermometer was based on \( z_{t} = \left( \rho - \mu_{\rho} \right) / \sigma_{\rho} \), the \( z \)-scored switch time, where \( \mu_{\rho} \) is the mean switch time and \( \sigma_{\rho} \) is the standard deviation of switch times from the previous block of trials. The computation of \( \rho \) is explained in the “The training component” and “Sampling from the training period” sections. A 0.5° deflection of the thermometer corresponded to \( \Delta z_{t} = 1 \), with \( z_{t} = 0 \) at the middle, so that the 3° height of the thermometer spanned the range \( z_{t} = -3 \ldots 3 \).
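The mapping from \( z_{t} \) to thermometer height can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `thermometer_height_deg`, the clipping of values outside \( \pm 3 \), and the choice that larger \( z_{t} \) fills the bar higher are all assumptions not stated in the text.

```python
def thermometer_height_deg(z_t, deg_per_z=0.5, max_height_deg=3.0):
    """Map a z-scored switch time to a thermometer fill height in degrees.

    Assumptions (not specified in the text): z_t = 0 sits at mid-height,
    larger z_t fills the bar higher, and out-of-range values are clipped.
    """
    mid_height = max_height_deg / 2.0           # z_t = 0 at the middle of the bar
    height = mid_height + deg_per_z * z_t       # 0.5 deg of deflection per unit z
    return min(max(height, 0.0), max_height_deg)


# Heights for a few illustrative z-scores.
example_heights = [thermometer_height_deg(z) for z in (-3.0, 0.0, 1.0, 3.0)]
```

With these defaults the bar covers exactly \( z_{t} = -3 \ldots 3 \) over its 3° height, which matches the range given above.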

### MEG data acquisition

The magnetoencephalography (MEG) study was conducted at the Athinoula A. Martinos Imaging Center at MIT’s McGovern Institute for Brain Research. Participants were seated in a chair under the MEG sensor array, facing a projection screen placed at a distance of 138 cm. The task stimuli (see the “Experimental protocol” and “The feedback cue” sections) were projected onto a 44” back-projection screen through an aperture in the MEG chamber using a Panasonic DLP projector (Model #PT-D7500U). During the experiment, the room lighting was dimmed.

The MEG data were acquired with a 306-channel Neuromag Triux whole-head MEG system (Elekta-Neuromag, Finland), comprising 102 pairs of planar gradiometers and 102 magnetometers. The system was housed in a three-layer magnetically shielded and sound-proof room (Ak3b, Vacuumschmelze GmbH, Hanau, Germany). In the rt-NFB task discussed here, the MEG sensor data were segmented into trials spanning −1000 to 3000 ms relative to the onset of the motion direction change in the attended aperture. The data from the catch trials, in which the direction of motion did not change, were discarded from the analysis.

The data were transmitted from the acquisition workstation to the stimulus workstation using the FieldTrip real-time buffer [28]. Data were transmitted in 100 ms chunks and assembled on the stimulus workstation. Subsequently, to attenuate environmental noise, a signal-space projection (SSP) operator constructed from the singular value decomposition (SVD) of empty-room MEG data was applied to the data on the stimulus workstation [29].
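As a rough illustration of this step, the sketch below builds an SSP operator from the SVD of empty-room data and applies it to assembled chunks. It uses NumPy on synthetic data; the helper names, the number of removed components (`n_components`), the 8-channel toy array, and the chunk length in samples are hypothetical choices, since the text does not specify them.

```python
import numpy as np


def ssp_projector(empty_room, n_components):
    """Build an SSP operator from empty-room data of shape (channels, samples).

    n_components (how many noise directions to remove) is an assumption;
    the text does not state the value used.
    """
    # Left singular vectors span the dominant spatial noise patterns.
    U, _, _ = np.linalg.svd(empty_room, full_matrices=False)
    U_noise = U[:, :n_components]
    # Project onto the orthogonal complement of the noise subspace.
    return np.eye(empty_room.shape[0]) - U_noise @ U_noise.T


def assemble_chunks(chunks):
    """Concatenate (channels, samples) chunks along the time axis."""
    return np.concatenate(chunks, axis=1)


# Synthetic demonstration with a single strong environmental noise pattern.
rng = np.random.default_rng(0)
n_channels = 8
noise_topography = rng.standard_normal((n_channels, 1))
empty_room = (noise_topography @ rng.standard_normal((1, 500))
              + 0.01 * rng.standard_normal((n_channels, 500)))
P_ssp = ssp_projector(empty_room, n_components=1)

# Three incoming chunks (100 samples each here, purely illustrative).
chunks = [rng.standard_normal((n_channels, 100)) for _ in range(3)]
cleaned = P_ssp @ assemble_chunks(chunks)
```

Because the operator is an orthogonal projection, applying it twice has the same effect as applying it once, so it can safely be applied to each assembled segment as it arrives.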

### Real-time calculation of time-varying brain state for feedback

The aim of the sb-NFB method we propose is to train the temporal dynamics of the brain state, corresponding to the timing when a targeted cognitive behavior begins, ends, or changes. In this section, we describe the process of defining and detecting the brain states reflecting cognitive behaviors targeted by the training.

To decode a brain state, the decoding algorithm is trained on a set of MEG sensor data. Therefore, sb-NFB requires a time period in which the brain state is known (referred to as the training window in Fig. 5). Following conventions of the machine-learning literature, we refer to the time period in which the brain state change is measured, such as a shift of attention from one visual field to the other, as the development window. Due to the computational complexity (1530 dimensions) involved in training the decoder, the training is conducted between the testing blocks. This is the training component of the sb-NFB method. It results in a signal transformation that best separates the features of interest. The outcome of the signal transformation is used in the state decoding component to transform raw MEG signal data into a state signal that represents the brain states targeted by the sb-NFB.

### The training component

Using data from the training period, the signal transformation was trained as follows: (i) the dimensionality of the data was reduced by an unsupervised dimensionality reduction step, and (ii) the features that optimally separate the reduced dimensionality dataset were extracted.

### Sampling from the training period

In the algorithm, a fixed set of samples was selected from the training period and labeled with the corresponding brain state. In our implementation, we randomly selected 3 samples from the training period of each trial. Samples were drawn only after a 200 ms buffer following cue presentation, to ensure that the subject had sufficient time after seeing the cue to attend to the target side of the visual field.
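The sampling rule can be sketched as below. The function name, the sampling rate of 1000 Hz, the example sample indices, drawing without replacement, and the ±1 label coding are assumptions made for illustration only.

```python
import numpy as np


def sample_training_indices(cue_idx, window_end_idx, sfreq_hz, rng,
                            n_samples=3, buffer_ms=200.0):
    """Draw sample indices from a trial's training period.

    Samples are drawn after a post-cue buffer and before window_end_idx.
    Drawing without replacement is an assumption, not stated in the text.
    """
    buffer_n = int(round(buffer_ms * sfreq_hz / 1000.0))
    start = cue_idx + buffer_n
    if window_end_idx - start < n_samples:
        raise ValueError("training window too short for the requested samples")
    picks = rng.choice(np.arange(start, window_end_idx),
                       size=n_samples, replace=False)
    return np.sort(picks)


rng = np.random.default_rng(1)
sfreq_hz = 1000.0          # hypothetical sampling rate
cue_idx = 700              # hypothetical sample index of the cue
window_end_idx = 1500      # hypothetical end of the training period
indices = sample_training_indices(cue_idx, window_end_idx, sfreq_hz, rng)

# Hypothetical label coding: +1 for attend-right, -1 for attend-left.
labels = np.full(indices.shape, +1)
```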

### Dimensionality reduction

The dimensionality reduction is necessary due to the high dimensionality of the dataset (306 sensors). The amount of training data required for the feature extraction is proportional to the dimensionality of the data. Reducing dimensionality implies that fewer trials are needed to train the signal transformation. We applied principal component analysis (PCA) to the raw MEG sensor measurements. PCA transforms multivariate data into a set of orthogonalized components (“PCA space”) ordered by their contribution to the signal power.

To reduce the dimensionality of the \( m = 306 \) dimensional MEG data with \( T \) time samples,

$$ X = \left[ x_{1}, \ldots, x_{T} \right] \in \mathbb{R}^{m \times T}, $$

we arrange the first \( k \) orthogonal PCA components as columns of a matrix \( P_{k} \in \mathbb{R}^{m \times k} \). We then form the \( k \times T \) dimensional reduced data matrix

$$ Y = P_{k}^{T} X \in \mathbb{R}^{k \times T}, $$

where, due to the properties of PCA, the matrix \( P_{k} \) minimizes the \( \ell_{2} \)-norm reconstruction error between the projected data \( P_{k} P_{k}^{T} X \) and \( X \), \( \left\| X - P_{k} P_{k}^{T} X \right\|^{2} \), among all projections \( P_{k} P_{k}^{T} \) onto \( k \)-dimensional spaces. In Fig. 6, we show the explained variance as a function of the number of components. We chose \( k = 30 \), which resulted in \( P_{k} Y \) explaining over 80% of the variance of \( X \).
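A minimal NumPy sketch of this reduction is given below, computing the components via SVD on synthetic data. Mean-centering before the decomposition is a common PCA convention and an assumption here, as are the function name `fit_pca` and the synthetic low-rank data; none of these are taken from the text.

```python
import numpy as np


def fit_pca(X, k):
    """Fit PCA on X of shape (m, T).

    Returns the channel means, the first k components as columns of P_k
    (shape (m, k)), and the cumulative explained-variance ratio.
    Mean-centering is an assumption; the text does not mention it.
    """
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean
    # Columns of U are the principal directions, ordered by signal power.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    return mean, U[:, :k], explained


# Synthetic stand-in for sensor data: 20 latent sources plus sensor noise.
rng = np.random.default_rng(0)
m, T, k = 306, 1000, 30
X = (rng.standard_normal((m, 20)) @ rng.standard_normal((20, T))
     + 0.1 * rng.standard_normal((m, T)))

mean, P_k, explained = fit_pca(X, k)
Y = P_k.T @ (X - mean)          # reduced data, shape (k, T)
X_hat = P_k @ Y + mean          # rank-k reconstruction of X
```

The value `explained[k - 1]` is the fraction of variance captured by the first \( k \) components, which is the quantity one would inspect when choosing \( k \).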

### Frequency bands as features of interest

Oscillations of brain activity in different frequency bands have been linked to specific functions, such as attention and control mechanisms. The alpha-band (8–13 Hz) has been implicated in the inhibition of cortical regions and has been shown to be sensitive to attentional load [30,31,32,33]. The beta-band (13–30 Hz) is linked to top-down attention control and maintenance of function [34,35,36]. The gamma-band (> 30 Hz) is involved in a wide variety of conscious cognitive functions, including sensory processing, attention, and executive control functions [35, 37, 38]. To extract the signals in these frequency bands, we convolved \( Y\left( t \right) \), the dimensionality-reduced MEG measurements in the training samples \( T \), with a complex Morlet wavelet \( \psi \left( t, f_{c} \right) \).

- i.
$$ Z = \left| Y\left( t \right) * \psi\left( t, f_{c} \right) \right| \in \mathbb{R}^{30 \times 4 \times T}, \quad t \in T, \; f_{c} \in \mathbb{R} $$

where \( \psi \left( {t, f_{c} } \right) \) is defined as:

- ii.
$$ \psi\left( t, f_{c} \right) = \frac{1}{\sqrt{2\pi\sigma^{2}}} e^{-\frac{t^{2}}{2\sigma^{2}}} e^{-2i\pi f_{c} t}. $$

In the above formula, \( t \) is time, \( f_{c} \) is the center frequency, and \( \sigma \) is the bandwidth of the wavelet kernel. The four center frequencies are in the middle of the four frequency bands of interest (8–13 Hz, 13–30 Hz, 30–60 Hz, and 60–90 Hz). Finally, \( Z \) is collapsed into a two-dimensional matrix \( Z \in \mathbb{R}^{120 \times T} \).
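The wavelet filtering and the collapse to 120 rows can be sketched as follows. The band midpoints follow from the bands listed above; the sampling rate, the rule tying \( \sigma \) to a fixed number of cycles (`n_cycles`), the kernel truncation at three standard deviations, and the function names are assumptions for illustration.

```python
import numpy as np


def morlet_kernel(f_c, sigma, sfreq_hz, n_sigma=3.0):
    """Complex Morlet kernel sampled at sfreq_hz, truncated at +/- n_sigma * sigma."""
    half = int(np.ceil(n_sigma * sigma * sfreq_hz))
    t = np.arange(-half, half + 1) / sfreq_hz
    gauss = np.exp(-t ** 2 / (2.0 * sigma ** 2)) / np.sqrt(2.0 * np.pi * sigma ** 2)
    return gauss * np.exp(-2j * np.pi * f_c * t)


def band_amplitudes(Y, center_freqs, sfreq_hz, n_cycles=5.0):
    """Return wavelet amplitudes of Y (k, T), collapsed to (k * n_bands, T)."""
    k, T = Y.shape
    Z = np.empty((k, len(center_freqs), T))
    for j, f_c in enumerate(center_freqs):
        # Assumption: bandwidth chosen so the kernel spans a fixed number of cycles.
        sigma = n_cycles / (2.0 * np.pi * f_c)
        kernel = morlet_kernel(f_c, sigma, sfreq_hz)
        for i in range(k):
            # Dividing by sfreq_hz approximates the continuous convolution integral.
            Z[i, j] = np.abs(np.convolve(Y[i], kernel, mode="same")) / sfreq_hz
    # Row index after the collapse is component * n_bands + band.
    return Z.reshape(k * len(center_freqs), T)


sfreq_hz = 1000.0                         # hypothetical sampling rate
center_freqs = [10.5, 21.5, 45.0, 75.0]   # midpoints of the four bands
rng = np.random.default_rng(0)
times = np.arange(2000) / sfreq_hz
Y = 0.01 * rng.standard_normal((30, times.size))
Y[0] += np.sin(2.0 * np.pi * 10.5 * times)   # alpha-band test oscillation
Z = band_amplitudes(Y, center_freqs, sfreq_hz)
```

In this toy example the first component carries a 10.5 Hz oscillation, so its alpha-band row should show a much larger amplitude than its beta-band row.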

### Partial least squares regression (PLS)

To identify the components that best separate the attentional states, we applied partial least squares (PLS) regression to the wavelet-filtered data (\( Z \)). PLS regression [39] is a method similar to PCA, except that it operates on labeled data (\( y \)), which, in the case of sb-NFB, are the targeted cognitive states for NFB training. The PLS algorithm is trained on labeled data from the training period of the previous block of trials. The PLS algorithm can be represented as

The resulting matrix \( W \) is used to project \( Z \) onto a 10-dimensional space \( Z_{\text{PLS}} \), which is then used to train the support vector machine (SVM):

- iii.
$$ Z_{\text{PLS}} = W^{T} Z \in {\mathbb{R}}^{10 \times T} $$

### The state signals

A linear support vector machine (SVM) can be described as a separating hyperplane with binary solutions on either side (i.e., solutions equal to +1 or −1); its main objective is to find the hyperplane that minimizes the margin error. Linear SVMs have been used in fMRI neurofeedback studies [40], where they are generally implemented to decode a behavioral state from the activity in a collection of voxels. Using this method, we decoded the targeted brain state from the collection of low-dimensional PLS components. The linear SVM was trained on the PLS components from the “training” window. We used LibSVM [40] with a linear kernel and cost equal to one (\( C = 1 \)) to compute the hyperplane \( f_{w, b} : Z_{\text{PLS}} \to \left\{ \pm 1 \right\} \), \( Z_{\text{PLS}} \in \mathbb{R}^{10} \), parametrized by \( w, b \), that best separated attention to the left and right visual hemifields. The hyperplane \( f_{w, b} \) is learned during the “training” window and then applied in the development window to evaluate performance. Finally, Platt calibration, which fits a logistic regression model to the SVM scores, was used to transform the outputs of the SVM model into a probabilistic quantity, which we call the state signal: \( \rho \left( f_{w, b} = 1 \mid Z_{\text{PLS}} \right) = \frac{1}{1 + \exp \left( A f_{w, b} \left( Z_{\text{PLS}} \right) + B \right)} \), where the parameters \( A \) and \( B \) are optimized using gradient descent to minimize the cross-entropy error [41].
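The calibration step can be illustrated with a small NumPy sketch that fits \( A \) and \( B \) by plain gradient descent on the cross-entropy. The SVM scores here are synthetic stand-ins rather than LibSVM outputs, and the learning rate, iteration count, unsmoothed 0/1 targets, and function names are assumptions.

```python
import numpy as np


def platt_probability(scores, A, B):
    """Probability of the +1 class under the sigmoid 1 / (1 + exp(A * f + B))."""
    return 1.0 / (1.0 + np.exp(A * scores + B))


def fit_platt(scores, labels, lr=0.1, n_iter=2000):
    """Fit A and B by gradient descent on the mean cross-entropy.

    labels are +1 / -1. Plain 0/1 targets are used here; smoothed targets
    are a common refinement and are omitted in this sketch.
    """
    targets = (labels + 1) / 2.0
    A, B = 0.0, 0.0
    for _ in range(n_iter):
        p = platt_probability(scores, A, B)
        # With u = A * f + B, the cross-entropy gradient w.r.t. u is (target - p).
        A -= lr * np.mean((targets - p) * scores)
        B -= lr * np.mean(targets - p)
    return A, B


# Synthetic decision values standing in for SVM outputs on the two classes.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(1.5, 1.0, 200), rng.normal(-1.5, 1.0, 200)])
labels = np.concatenate([np.ones(200), -np.ones(200)])
A, B = fit_platt(scores, labels)
rho = platt_probability(scores, A, B)   # state signal in [0, 1]
```

When positive scores correspond to the +1 class, the fitted \( A \) is negative, so larger decision values map to probabilities closer to one.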

### The state decoding component

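During the testing phase, the state signal is obtained by applying, in order, the PCA projection \( P \), the wavelet kernels \( \psi \left( t, f_{c} \right) \), the PLS projection \( W \), and the calibrated SVM \( f_{w, b} \) to the incoming MEG data. The data-dependent \( P \), \( W \), and \( f_{w, b} \) were learned during the training phase, and the data-independent wavelet kernels \( \psi \left( t, f_{c} \right) \) remain the same as above.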

The signal transformation algorithm trained in the training component is run across the time course of the trial, resulting in a time course that encodes the brain state, referred to as the state signal \( \left( \rho \right) \). The state signal is normalized to a \( z \)-score \( z_{t} = \left( \rho - \mu_{\rho} \right) / \sigma_{\rho} \), where \( \mu_{\rho} \) is the mean of the corresponding state signal and \( \sigma_{\rho} \) is its standard deviation. We used \( z_{t} > 2 \) to indicate when the subject is in the targeted brain state.
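The normalization and thresholding can be sketched in plain Python as below. Treating the first time point with \( z_{t} > 2 \) as the moment the targeted state is reached is an interpretation made for this example, as are the use of the population standard deviation, the 10 ms time step, and the toy state signal.

```python
import statistics


def z_score(rho, mu_rho, sigma_rho):
    """Normalize a state-signal time course to z-scores."""
    return [(r - mu_rho) / sigma_rho for r in rho]


def first_crossing_ms(z, times_ms, threshold=2.0):
    """Return the first time at which z exceeds the threshold, or None.

    Using the first crossing as the detection time is an assumption of
    this sketch, not a rule stated in the text.
    """
    for t, value in zip(times_ms, z):
        if value > threshold:
            return t
    return None


# Toy state signal: low probability for 30 steps, then high for 5 steps.
rho = [0.1] * 30 + [0.9] * 5
times_ms = list(range(0, 350, 10))        # hypothetical 10 ms time step
mu_rho = statistics.mean(rho)
sigma_rho = statistics.pstdev(rho)        # population SD, an assumption
z_t = z_score(rho, mu_rho, sigma_rho)
detected_ms = first_crossing_ms(z_t, times_ms)
```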

For the feedback cue in switching spatial attention (SAST), we used a red thermometer, described in the “Experimental protocol” and “The feedback cue” sections. The first block of trials was used to compute the initial set of switch times, and thus the feedback thermometer was not presented during this block. In all subsequent blocks of trials, the distribution of switch times from the previous block was used to normalize the switch time in the current block (see the “Experimental protocol” and “The feedback cue” sections). The feedback bar was not updated for catch trials.