Self-Prompted Discrimination and Operant Control of EEG Alpha

EEG state discrimination studies may contribute to understanding the role of awareness in physiological self-regulation, but many individuals learn the existing paradigm very slowly. In this study, a self-prompted discrimination paradigm, in which subjects decide when to respond based upon their subjective state, was examined for the rate of learning and its effects on the ability to control EEG alpha. Twenty-nine participants received up to three 40-min sessions in which discrimination training was alternated with training to control alpha in four 10-min sets, compared to 22 participants who received control training only. Discrimination training appeared to facilitate the ability to control alpha amplitude, but only in the first session. The rate of learning of the discrimination paradigm was markedly greater than seen in previous studies. Comparing the time series of postresponse alpha amplitudes suggested that the lowest scoring sessions involved a behavioral inertia, or difficulty switching states, particularly when a higher alpha state was required. However, extreme amplitudes were discriminated better than moderate ones and discrimination task performances significantly exceeded the percent time that alpha amplitude was in the correct state. These two observations suggest that EEG discrimination involves awareness of, and not just manipulation of, one’s EEG state.


Introduction
In biofeedback, self-regulation of physiological function is learned by displaying or "feeding back" a physiological signal in real time to the individual producing it. Rewards are provided when the signal exceeds a threshold indicating a desired response, and over time individuals learn to produce the response without feedback (Sherlin et al., 2011). It is sometimes argued that attention to the feedback display increases awareness of otherwise unconscious internal sensations, and this awareness enables or facilitates voluntary control of the process (Brener, 1974; Congedo & Joffe, 2007; Frederick, 2016; Frederick, Heim, Dunn, Powers, & Klein, 2016; Olson, 1987; Plotkin, 1981). While voluntary action is possible without awareness of one's current state (Black, Cott, & Pavloski, 1977; Taub & Berman, 1963), performance can be substantially impaired (e.g., Taub, Bacon, & Berman, 1965). Operant conditioning is also possible without awareness (Becker, Kleinböhl, & Hölzl, 2012), but conscious perception is argued to involve access to more global processing in the brain (Dehaene, Charles, King, & Marti, 2014), allowing for explicit rehearsal processes and the kind of internal reinforcement seen in observational learning (Bandura, 1977). differences in a physiological signal, where individuals report their perception of whether a variable is high or low. For instance, human subjects have been trained to discriminate EEG alpha (Frederick, 2012; Kamiya, 1968, 2011), the sensorimotor rhythm (Cinciripini, 1984), P300 amplitude (Sommer & Matt, 1990), and slow cortical potentials (Kotchoubey, Kübler, Strehl, Flor, & Birbaumer, 2002).

Generalization of Skills Between Discrimination and Control of EEG Alpha
Our laboratory found that seven sessions of control training (standard neurofeedback) of EEG alpha dramatically increased discrimination performance in three subsequent sessions. Among the participants who successfully learned to control EEG alpha, the average discrimination task performance was 81% correct (50% is a random performance). However, the reverse was not true. Seven sessions of EEG alpha discrimination training had no effect on three subsequent sessions of the standard neurofeedback task. While these results were consistent with arguments that awareness is not necessary or sufficient to learn physiological control (Black et al., 1977; Lacroix, 1981), our results suggested another possible interpretation. Learning of the discrimination task was relatively weak, the group average never exceeding 55% correct across seven sessions. This rate of learning was consistent with that seen in Frederick (2012), where the successful participants averaged 56% in the 10th session.

Self-prompted Versus EEG-prompted Discrimination
One possible explanation for the lack of robust learning of the discrimination task was that only a small proportion of excursions in alpha amplitude are related to discriminable changes in subjective states. Since the paradigm provided only about three prompts per minute, informative learning trials (that included discernible subjective correlates of the EEG state) might have occurred less than once per minute. Possibly, a higher proportion of informative learning trials might be provided (and more robust learning achieved) if subjects could decide when to respond based on their subjective states rather than the computer prompting based on alpha amplitude differences. For instance, Frederick (2005) reported a case study using this self-prompted discrimination paradigm, where one subject scored 68% in the first session and reached 81% in the 11th session. Figure 1 illustrates the theoretical suggestion, where only a small proportion of alpha amplitude differences involve subjective state differences, but a larger proportion of subjective state differences are associated with alpha amplitude differences.

Figure 1. Hypothesized relationship between subjective state differences and EEG alpha state differences, where a discrimination paradigm prompted by subjective state differences might result in faster learning than a paradigm prompted by EEG alpha state differences.
If more rapid learning and a higher level of discrimination task performance could be achieved, then it would be possible to more specifically test whether discrimination training can facilitate learning to control the EEG through standard neurofeedback. Therefore, this study evaluated the effect of self-prompted discrimination training on standard neurofeedback training. It was hypothesized that dividing session time equally between standard neurofeedback control training and self-prompted discrimination training would result in greater control of EEG alpha than control training alone.

Discrimination Versus Manipulation of the EEG
The self-prompted responding paradigm involves substantial efforts by the subject to manipulate their EEG. Participants are instructed in the subjective phenomenology of high and low alpha states and asked to press a button when they believe they have reached a high or low alpha state, in alternating order. Black et al. (1977) and Lacroix (1981) theorized that successful discrimination performance probably only involved successful reporting of a subject's voluntary effort to manipulate their state. However, the self-prompted discrimination paradigm allows for a direct test of this theory. During each (high or low) type of trial, it is possible to measure the percentage of time the subject spends in the "correct" EEG amplitude state before responding. A successfully manipulated EEG amplitude would then be correct more than 50% of the time in the self-prompted discrimination paradigm. However, discrimination performances significantly greater than the percent time correct would suggest that subjects are aware of more than just their effort to manipulate the EEG signal.

Psychophysics of Self-Prompted Discrimination
It was previously found that performance in EEG-prompted alpha discrimination was strongest for very high (91-100th percentile) and very low (1-10th percentile) amplitudes compared to moderately high (71-80th percentile) and moderately low (21-30th percentile) amplitudes, consistent with an interpretation of alpha discrimination as a kind of sensory or perceptual process (Frederick, 2012). It was of interest to see whether a similar pattern would be seen for self-prompted discrimination. Would participants' correct responses tend to cluster closer to the first percentile for low trials and the 100th percentile for high trials? Or, would they cluster just on the correct side of the 50th percentile, when perhaps they perceived some contrast with the previous correct trial, or perceived movement in the right direction?

Response Timing
Previous studies found that it was possible to use intertrial time intervals to "cheat" in the standard Kamiya paradigm, although subjects did not make significant use of this information (Frederick, Dunn, & Collura, 2015; Frederick et al., 2016). It is possible that some of the significantly correct performance in the self-prompted discrimination paradigm could be explained by attention to time cues rather than genuine discrimination. For instance, it might be more time-consuming to "clear the mind" and switch to high alpha than to "activate the mind" and switch to low alpha. Or, if there is significant postreinforcement synchronization after correct trials (Hallschmid, Mölle, Fischer, & Born, 2002; Sherlin et al., 2011), one might expect transitions from low to high trials to go more quickly.

Method

Participants
With the approval of the institutional review boards at Middle Tennessee State University and Saint Cloud State University, 51 participants were recruited from students, faculty, staff, and the local community. To improve motivation, compensation was based partly on performance (Sherlin et al., 2011), where participants received $12 if their scores reached a criterion (67% in the discrimination task or 14% difference between increasing and decreasing alpha in the control task), or $9 otherwise. These criteria were determined by pilot data to make the average payment $10 per session.

Measurements and Apparatus
EEG was recorded at the parietal midline (Pz) using tin electrodes. Reference and ground were randomly assigned to left or right earlobes each session. Impedances were lowered to below 10 kΩ, with no site greater than twice the others. Considering modern amplifier input impedances (Ferree, Luu, Russell, & Tucker, 2001), impedances of up to 15 kΩ were occasionally accepted if repeated preparations would not bring them lower.
EEG was recorded with a BrainMaster Atlantis amplifier and BrainMaster 3.7i software using the default settings as described . The alpha band was defined as a 5-Hz band centered at each subject's peak alpha frequency (PAF). For example, if the PAF were 11 Hz, the alpha band was then defined as 9-13 Hz.
For the alpha amplitude control (standard neurofeedback) task, the experimenter maintained a percent reward between 15% and 30% while viewing a 60-s filtered alpha amplitude window and a 60-s running average of the percent time in reward. Adjustments to the reward threshold were made about every 20 seconds. To avoid triggering reward onset/offset, adjustments were only made when the alpha amplitude was not close to the threshold. Spectral amplitudes were saved in 1-s epochs for delta (1-3 Hz) and for each participant's custom alpha band. Epochs were assumed to include artifact and excluded when the delta amplitude exceeded 30 µV.
During the discrimination task, EEG amplitudes for each 1-Hz band from 1 to 32 Hz were sampled 10 times per second by custom software (Introspect, written in C++), which recorded both EEG and task responses. The task and recording were suspended (and an artifact warning tone was played) whenever lodelta (0.5-2.0 Hz) or hibeta (23-32 Hz) amplitude exceeded a threshold. Alpha amplitude was defined as the sum of amplitudes in the five 1-Hz bands centered at the PAF, smoothed over the most recent 2 s, delayed 500 ms. Following Libet's (1985) observation that the readiness potential (the brain's process underlying a decision to act) begins about 500 milliseconds before the action, the delay was introduced both to remove any effect of the readiness potential and to reflect the likelihood that responses indicate conscious contents with at least a 500-ms delay. A sliding baseline consisting of the most recent 600 alpha amplitude samples (60 s) was rank ordered for comparison to the alpha amplitude at the time of each participant response.
The baseline was updated every 15 s, with each response, or whenever the experimenter pressed the pause button.
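The percentile comparison described above can be sketched in a few lines. This is a minimal illustration and not the Introspect implementation, which is not shown in the text. The function names, the use of a simple moving average for the 2-s smoothing, and the "at or below" definition of percentile rank are all assumptions made for the example.

```python
from bisect import bisect_right

SAMPLE_RATE_HZ = 10      # 10 samples per second, as stated in the text
BASELINE_SAMPLES = 600   # 60-s sliding baseline
DELAY_SAMPLES = 5        # 500-ms delay at 10 Hz
SMOOTH_SAMPLES = 20      # 2-s smoothing window at 10 Hz


def smoothed_delayed(raw, index):
    """Mean of up to 2 s of raw alpha samples ending 500 ms before `index`.

    Assumption: smoothing is a plain moving average; the text only says
    the amplitude was "smoothed over the most recent 2 s".
    """
    end = index - DELAY_SAMPLES + 1  # exclusive end of the window
    if end <= 0:
        return None  # not enough history yet
    start = max(0, end - SMOOTH_SAMPLES)
    window = raw[start:end]
    return sum(window) / len(window)


def percentile_rank(value, baseline):
    """Percent of baseline samples at or below `value` (0 to 100)."""
    ordered = sorted(baseline[-BASELINE_SAMPLES:])
    return 100.0 * bisect_right(ordered, value) / len(ordered)
```

A response would then be scored by passing the smoothed, delayed amplitude at the time of the button press to `percentile_rank` together with the current baseline.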

Procedure
After obtaining informed consent, participants were given a set of instructions describing strategies to relax and reduce muscle artifact, and the phenomenology of alpha and nonalpha states (Frederick, 2012; Frederick et al., 2015). Participants sat in a cushioned chair with eyes closed in a dimly lit, sound-attenuated room. The PAF was determined from a 60-s eyes-closed baseline recording.
Participants were randomly assigned to two groups who each received up to three 40-min sessions. In the first group ("control task group"), sessions consisted of alternately rewarding increasing and decreasing alpha amplitude in 5-min runs. In the second group ("discrimination task group"), minutes 1-10 and 21-30 consisted of the same 5-min runs of increasing and decreasing alpha amplitude. However, during minutes 11-20 and 31-40 they received a discrimination task, in which they were given a trial type (high or low alpha) and asked to press a button when they believed they were in that state. The trial type would alternate after each correct response but would stay the same after each incorrect response. A response below the 50th percentile was correct for a low trial, and a response above the 50th percentile was correct for a high trial. Correct responses triggered a reward (Microsoft "tada") sound, followed by a voice announcing the next trial ("high trial" or "low trial"). Incorrect trials resulted only in the repetition of the trial type. Responses within 2 seconds of the previous response or an artifact were not allowed and would trigger a verbal reminder of this rule.
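The trial rules in the preceding paragraph can be summarized as a small state machine. The sketch below is illustrative only; the function names are hypothetical, and treating a percentile of exactly 50 as correct on a low trial is an assumption taken from the later description of the percent time correct measure.

```python
MIN_INTERVAL_S = 2.0  # responses within 2 s of the last response or artifact are refused


def response_allowed(now_s, last_event_s):
    """True if enough time has passed since the previous response or artifact."""
    return (now_s - last_event_s) >= MIN_INTERVAL_S


def is_correct(trial_type, percentile):
    """High trials need a percentile above 50; low trials need 50 or below."""
    if trial_type == "high":
        return percentile > 50
    return percentile <= 50


def next_trial_type(trial_type, correct):
    """Alternate after a correct response; repeat the trial type after an error."""
    if not correct:
        return trial_type
    return "low" if trial_type == "high" else "high"


# Example: a short hypothetical run of responses (percentile at each button press).
trial = "high"
for pct in (73, 61, 20, 35):
    ok = is_correct(trial, pct)
    print(trial, pct, "correct" if ok else "incorrect")
    trial = next_trial_type(trial, ok)
```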
The control task group included 22 participants (age 18-54, median 24, 10 female) while the discrimination task group included 29 participants (age 18-63, median 25, 15 female). Although the original intent was for the two participant groups to be equal in size, the need for the "percent time correct" measure (which applies only to the discrimination task) was discovered late in the progress of the study (see Frederick & Guetter, 2017). The discrimination group included extra subjects in order to get a larger number (n = 13) with the percent time correct measure.

Improvement Across Sessions
A total of 74 discrimination task sessions were completed among 29 participants. Among these, 23 completed two sessions and 22 completed three sessions.

Performances Significantly Above and Below 50%
Thirty-three of 74 sessions among 16 subjects showed performance significantly above 50% with binomial p < .05 (by chance alone, five percent or about four out of 74 sessions would be expected to have p < .05). However, 22 sessions among 12 subjects showed performance significantly below 50% at p < .05, about 5.9 times the amount expected by chance (Table 1). For instance, one participant's three session scores were 13/42 (31.0%), 7/38 (18.4%), and 15/43 (34.9%). Earlier session scores tended to predict later session scores. Only three of the 12 participants who scored significantly below 50% later scored significantly above 50%.

Table 1
Ratio of observed to expected below: 5.9, 7.1, 0.6
Note. *In the third column, individual session percent times correct were used as the binomial null hypothesis for each discrimination task score, where the mean percent time correct was 44.0%.
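The chance expectation and the per-session binomial test can be reproduced with a short calculation. This is a sketch under stated assumptions: the text does not say whether the binomial tests were one- or two-tailed, so a one-tailed exact test is assumed here, and the function name is hypothetical.

```python
from math import comb


def lower_tail(k, n, p0=0.5):
    """Exact one-tailed probability P(X <= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k + 1))


# Sessions expected to reach p < .05 by chance alone, and the observed ratio.
expected_by_chance = 0.05 * 74            # about 3.7 of 74 sessions
ratio_below = 22 / expected_by_chance     # about 5.9, as reported in the text

# Two of the example participant's sessions, tested against a 50% null.
p_session_1 = lower_tail(13, 42)
p_session_2 = lower_tail(7, 38)
print(round(ratio_below, 1), p_session_1 < 0.05, p_session_2 < 0.05)
```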

Percent Time Correct Adjustment
The high level of "below-chance" performances was unexpected and prompted a revision of the task software to record EEG values between trials every 0.5 s for the final 13 participants. The task software informs the participant that either a "high" or "low" response is required for each trial and then waits for their response. Then, the alpha amplitude for each sample is assigned a percentile ranking from the sliding 60-s baseline, where the participant would be correct on a high trial if the percentile amplitude exceeds 50, incorrect otherwise. On a low trial, the participant would be correct for each sample if the percentile amplitude is 50 or below, incorrect otherwise. Thus, across all samples, it is possible to compute a "percent time correct," or the expected score if the participant responded continuously or randomly across the session. The mean percent time correct, not including the 2 s after each correct response when new responses were not allowed, was 44.0% (SD = 5.3) and appeared to change very little between sessions (Figure 2). The percent time correct during high trials (44.5%, SD = 4.9) was about the same as during low trials (43.4%, SD = 6.3).
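As an illustration of the measure just described, the percent time correct can be computed from the between-trial samples as follows. The data layout (a list of trial type and percentile pairs, one per 0.5-s sample) and the function name are assumptions made for this sketch, and the samples are assumed to already exclude the 2 s after each correct response.

```python
def percent_time_correct(samples):
    """Percent of samples in which alpha was on the correct side of the 50th percentile.

    `samples` is a sequence of (trial_type, percentile) pairs, where
    trial_type is "high" or "low" and percentile is the rank (0 to 100)
    of the alpha amplitude within the sliding 60-s baseline.
    """
    if not samples:
        raise ValueError("no samples recorded")
    correct = 0
    for trial_type, percentile in samples:
        if trial_type == "high" and percentile > 50:
            correct += 1
        elif trial_type == "low" and percentile <= 50:
            correct += 1
    return 100.0 * correct / len(samples)


# Hypothetical samples: two during a high trial, two during a low trial.
example = [("high", 80), ("high", 40), ("low", 30), ("low", 60)]
print(percent_time_correct(example))
```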
A total of 31 sessions were completed by the 13 participants for whom EEG percent time correct was recorded, where nine subjects completed all three sessions. Nine of these 31 sessions were significantly above 50% at binomial p < .05, and 11 sessions were significantly below 50% at p < .05 (Table 1).
Among the 11 sessions significantly below 50% at p < .05, the average score was 39.5% (SD = 3.6, range 35.4-46.5), and the mean percent time correct was 38.0% (SD = 4.7, range 27.8-46.5). Only four percent time correct values among the 31 recorded were above 50 (range 51.1-54.9), and all four of these were among the nine sessions significantly above 50%.

Performances Significantly Above Percent Time Correct
The observation that the average percent time correct was 44.0% suggested that, unlike in the EEG-prompted discrimination paradigm (Kamiya, 1968), where high or low alpha amplitude events trigger a prompt to respond, 50% is not the appropriate null hypothesis, or expected value for a random performance.
When each individual session's percent time correct was used as the null hypothesis for the 31 sessions where it was measured, the number of sessions significantly above chance levels increased from 9 to 19 (Table 1). Only one of 13 participants failed to achieve at least one session performance significantly above chance. The number of sessions significantly below chance levels decreased from 11 to 1, a number more consistent with chance levels at p < .05.
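The effect of changing the null hypothesis can be seen in a small worked example. The session below (62 correct out of 115 trials) is hypothetical and chosen only for illustration, the one-tailed exact test is an assumption, and 44% stands in for a session's measured percent time correct.

```python
from math import comb


def upper_tail(k, n, p0):
    """Exact one-tailed probability P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


correct, trials = 62, 115                     # hypothetical session, about 53.9% correct
p_vs_50 = upper_tail(correct, trials, 0.50)   # null: 50% expected by chance
p_vs_ptc = upper_tail(correct, trials, 0.44)  # null: percent time correct of 44%

print(p_vs_50 < 0.05, p_vs_ptc < 0.05)
```

Under these assumptions the same score is not significant against a 50% null but is significant against the lower percent time correct, which is the pattern reported for many sessions in Table 1.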

Response Timing
The average session had 115.3 trials (SD = 42.2) during the two 10-min sets of trials, or an average of 5.8 trials per minute (or one trial every 10.3 s). The correlation between discrimination performance and the number or frequency of trials was nonsignificant in the first (r = .134, df = 27), second (r = .288, df = 21), or third (r = .223, df = 20) sessions. All criterion sessions (significantly above chance at p < .05) among 24 subjects were examined for the effect of response timing on performance. These included a total of 5991 trials (not including the first trial in each session for which the intertrial interval was undefined).
Among these, 3580 (59.8%) followed a correct trial and were therefore different from the previous trial (because the trial type, high or low, switches after each correct response, or else it stays the same). The number of different trials for each criterion session was counted and summed for each of the following intertrial intervals: 2.1-5. Repeated measures analysis of variance found that there was no effect of intertrial interval on performance among all the trials, among the high trials or low trials alone, or among the differences between high and low trials.

Postresponse "Behavioral Inertia" in Alpha Amplitude
Since each correct response occurs when alpha amplitude is relatively high or low, it was of interest to see how long it took for alpha amplitude to recover from this deviation. The mean percentile alpha amplitude was computed every 0.5 s after each correct response (n = 13) and is shown in Figure 4. On average, it took about 3.5 seconds to reach the 50th percentile for both high and low alpha trials. Note that individual trials (e.g., Figure 5) are more variable than the grand averages shown in Figure 4.
To determine whether differences in recovery time could explain differences in discrimination task performance, the mean postresponse amplitudes were compared from sessions scoring significantly above 50% (n = 6 participants) to those from sessions scoring significantly below 50% (n = 6 participants). Among these two groups of six, there were a total of 11 participants, where one participant had both types of sessions. These results are summarized in Figure 6 and Figure 7. Figure 6 shows that for low alpha trials, subjects took about 2.5 seconds to reach the 50th percentile during high-performing sessions, but about 4 seconds during low-performing sessions. Figure 7 shows that during high alpha trials, subjects took about 3 seconds to reach the 50th percentile during high-performing sessions, whereas during low-performing sessions, the average amplitude did not cross the 50th percentile during the first 10 seconds. Note: there are fewer data and measurements become less reliable after 10.3 seconds, the average response time (data not shown).
Figure 6. Differences in behavioral inertia or recovery times during low alpha trials after a correct high trial for sessions scoring significantly below 50% (n = 6 subjects) and above 50% (n = 6 subjects). Error bars indicate standard errors.
Figure 7. Differences in behavioral inertia or recovery times during high alpha trials after a correct low trial for sessions scoring significantly below 50% (n = 6 subjects) and above 50% (n = 6 subjects). Error bars indicate standard errors.
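The recovery times read from these figures correspond to the first postresponse sample at which the mean percentile is on the correct side of the 50th percentile. A minimal sketch of that calculation is given below; the function name, the assumption that the first sample falls 0.5 s after the response, and the example series are all hypothetical.

```python
def time_to_cross(percentiles, target_high, step_s=0.5, threshold=50.0):
    """Time in seconds until the series first reaches the correct side of the threshold.

    `percentiles` holds mean percentile alpha amplitudes sampled every
    `step_s` seconds, starting one step after the response. Returns None
    if the threshold is never crossed within the series.
    """
    for i, value in enumerate(percentiles):
        on_correct_side = value > threshold if target_high else value <= threshold
        if on_correct_side:
            return (i + 1) * step_s
    return None


# Hypothetical series after a correct high response: the next trial is a low trial.
after_high = [90, 80, 70, 60, 52, 48, 40]
print(time_to_cross(after_high, target_high=False))
```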
For purposes of comparison, Figures 8 and 9 show the postresponse alpha amplitudes for the same sessions after incorrect responses. In these situations, participants are recovering from unintentionally low or high alpha. Figure 8 shows that after an incorrect "low alpha" response, high-performing subjects appear to reach the 50th percentile in 1.5 seconds, or about 1 second sooner than after a correct trial (Figure 6). However, in low-performing subjects, Figure 8 shows that (despite recovering from correct high trials in about 4 seconds on average, Figure 6) the first error in identifying low alpha seems to indicate an ongoing difficulty reaching the low alpha state, where the 50th percentile is not reached until about 8.5 seconds. Figure 9 shows a brief opportunity for a correct high alpha response starting around 3.5 seconds in both groups followed by varying difficulty reaching high alpha.
Figure 8. Time series of mean alpha amplitudes in low alpha trials after an incorrect low alpha trial ("same" trial type) for sessions scoring significantly below 50% (n = 6 subjects) and above 50% (n = 6 subjects). Error bars indicate standard errors.
Figure 9. Time series of mean alpha amplitudes after an incorrect high alpha trial ("same" trial type) for sessions scoring significantly below 50% (n = 6 subjects) and above 50% (n = 6 subjects). Error bars indicate standard errors.

Interactions Between Discrimination and Operant Control
Percentage differences between alpha amplitude during the increase and decrease conditions (100 times the difference between increase and decrease, divided by the average of increase and decrease) were computed for each 10-min segment, which consisted of 5 min of increasing and 5 min of decreasing alpha. The discrimination task group only received the operant control task during minutes 1-10 and 21-30 of the session, so only these segments were used for comparison between groups. Figure 10 shows averages for each 10-min session time interval across three sessions. The first 10 min of the first session was effectively a baseline for the operant control task because the two groups received identical treatments until the 11th minute. Performances in the control and discrimination tasks correlated significantly in the first session, Pearson r = .338, n = 28, one-tailed p < .05, and third session, Spearman r = .510, n = 20, one-tailed p = .010, but not the second session, Pearson r = .258, n = 20, one-tailed p = .116 (nonparametric statistics were used whenever variable distributions failed to meet parametric assumptions).
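The control task score defined at the start of this paragraph is a symmetric percent difference. A minimal sketch is shown below; the function name and the example amplitudes are hypothetical.

```python
def percent_difference(increase_mean, decrease_mean):
    """100 * (increase - decrease) / mean(increase, decrease)."""
    average = (increase_mean + decrease_mean) / 2.0
    return 100.0 * (increase_mean - decrease_mean) / average


# Hypothetical mean alpha amplitudes (in microvolts) for one 10-min segment.
print(percent_difference(11.0, 9.0))
```

Dividing by the average of the two conditions, rather than by either one alone, keeps the score symmetric: swapping the increase and decrease values only changes its sign.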
During the first 10 min of the first session (before the treatments were different), the control task-only group achieved an average of 11.1% greater in the increase condition than the decrease condition (SD = 15.5), compared to 6.7% (SD = 15.3) in the discrimination task group, a nonsignificant difference. However, during minutes 21-30, the discrimination task group increased to 10.8% (SD = 14.0) while the control task-only group decreased to 0.7% (SD = 16.4; Figure 10). This group difference was significant, Mann-Whitney W = 185, one-tailed p = .008, n1 = 22, n2 = 28, rank-biserial correlation 0.399. However, performances were not significantly different during any of the remaining session time intervals. An alternative analysis was performed in which the session 1 baselines were subtracted from each segment, using a pretest-posttest design. While the differences from baseline were greater in the discrimination task group during the second and third sessions, the effect was not significant.
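The reported effect size can be checked from the test statistic and the group sizes. The sketch below assumes that the reported W is the Mann-Whitney U statistic for the lower-ranked group and uses the common formula r = 1 - 2U / (n1 * n2); the function name is hypothetical, and sign conventions for the rank-biserial correlation vary between software packages.

```python
def rank_biserial(u_statistic, n1, n2):
    """Rank-biserial correlation from a Mann-Whitney U statistic."""
    return 1.0 - (2.0 * u_statistic) / (n1 * n2)


# Values reported in the text for minutes 21-30 of the first session.
r = rank_biserial(185, 22, 28)
print(round(r, 3))
```

Under these assumptions the result rounds to 0.399, in agreement with the value given in the text.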

Discussion
Learning of the self-prompted discrimination task was more robust than the learning of EEG-prompted discrimination seen in previous studies. Participants averaged 56.4% by the third session, 12.6% higher than a chance-level (44.0%) performance. By contrast, the mean score for the top 40 of 106 participants in Frederick (2012) was below 52% in the third session (where a chance-level performance was 50%) and just under 57% by the 10th session. Similarly, in Frederick et al. (2016), 17 participants averaged about 53% in the third session and did not exceed an average of 55% by the seventh session. The greater discrimination performance in this study could be explained by several factors, including the larger number of trials per minute (5.8 compared to 3.0 in Frederick et al., 2016), or generalization of skills from the standard neurofeedback training. It may also indicate that using subjective states to prompt responses more reliably indicates EEG state differences than the other way around, providing more informative opportunities for learning (Frederick, 2006).
Vol. 6(2):81-92 2019 doi:10.15540/nr.6.2.81
The finding that most subjects scored significantly higher than the percent time that the EEG was in the correct state in most sessions supports the interpretation that physiological state discrimination involves some genuine awareness of internal feedback about the physiological state (as suggested by Brener, 1974). Participants in this study were reporting more than just their awareness of their effort to manipulate their state (as suggested by Black, Cott, & Pavloski, 1977; Lacroix, 1981).

Psychophysics of Self-Prompted Discrimination
The percentage of trials was significantly higher in the lowest (0-10th) percentile amplitude bin for correct low trials and significantly higher in the highest (91-100th) percentile amplitude bin for correct high trials (Figure 3). This observation is consistent with the view of EEG alpha discrimination being a sensory or perceptual process involving some transduction of energy from the objective signal. Although subjects do not report perceiving EEG amplitude directly, the perception may be indirect, in the way that the amount of visual phosphenes is related to the amount of pressure applied over the eyelid.
There appeared to be few or no significant differences with respect to alpha amplitude among the incorrect low trials, or among the incorrect high trials. For instance, below the 51st percentile on a high trial or above the 50th percentile on a low trial, subjects were equally likely to make moderately wrong and very wrong responses. This finding is mysterious because in the same percentile bins they did demonstrate an awareness of the differences between moderately correct and very correct responses. For instance, on an incorrect high trial, they may have no longer recognized a very low state that they had just correctly identified on a low trial. Possibly, this difference indicates top-down processing where subjects are deploying a kind of search-image for pattern-matching in each type of trial. This finding suggests that high and low alpha states are phenomenologically not just opposites, or one the absence of the other.

Percent Time Correct and Behavioral Inertia in Alpha Amplitude
This study began with the incorrect assumption that a random performance in the discrimination task would be 50% correct. The initial result was that the number of "significantly below chance" scores was 5.9 to 7.1 times the number expected at p < .05 (Table 1). However, the "percent time correct" (the percent that would be scored if subjects responded continuously or randomly; on average, 44.0%) was lower than 50%. This finding could have been predicted from the fact that alpha amplitude is a physiological process that is not distributed randomly but varies with a finite rate of change. Figures 4-9 show how there is a behavioral inertia in alpha amplitude where, after every trial, it takes time for the subject to recover from the voluntary or spontaneous processes that resulted in the previous (currently incorrect) state. When the percent time correct was used as the null hypothesis, the number of sessions significantly below the percent time correct was much closer to the 5% expected at p < .05 (Table 1).
Figures 6 and 7 suggest that performances significantly below 50% were explained more by a difficulty in achieving high alpha than in achieving low alpha. This difference could correspond to a general difference in achieving high and low states of arousal. For instance, it generally takes at least 5 minutes to fall asleep (Carskadon et al., 1980), but only a few seconds to wake someone up. It would be of interest to see how this greater relative difficulty in returning to high alpha in some subjects relates to measures of mood or arousal regulation.
Future studies should redefine a correct response to take account of the behavioral inertia when switching between trial types. Scores below 50%, resulting from the use of the 50th percentile as the threshold for a correct response, can be demoralizing for participants. A lower threshold for a correct response would allow for shaping, or the reinforcement of successive approximations to the correct response (Sherlin et al., 2011). One method would be to define a correct trial as above the percent time correct (updated each trial based on the 50th percentile) on high trials and below 100 minus the percent time correct on low trials. Another possibility that might produce equivalent results would be to only use the most recent 60 s of the same trial type (instead of just the most recent 60 s).
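The first proposal above can be sketched as a modified scoring rule. This is only one possible reading of the suggestion, not an implemented or tested method: the function name is hypothetical, and the percent time correct is passed in as a plain number rather than being updated on each trial.

```python
def is_correct_shaped(trial_type, percentile, percent_time_correct):
    """Score a response with thresholds shifted by the percent time correct.

    High trials are correct above the percent time correct; low trials
    are correct below 100 minus the percent time correct.
    """
    if trial_type == "high":
        return percentile > percent_time_correct
    return percentile < 100.0 - percent_time_correct


# With a percent time correct of 44, the thresholds become 44 and 56.
print(is_correct_shaped("high", 47, 44.0))  # correct under the shifted rule
print(is_correct_shaped("low", 53, 44.0))   # correct under the shifted rule
```

Both example responses would be scored as incorrect under the original 50th percentile rule, which illustrates how the shifted thresholds would reinforce closer approximations to the target state.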

Response Timing
The lack of relationship between response timing and response performance suggests that self-prompted discrimination may not require complex controlling for the possible use of response timing to "cheat" in the discrimination task (compared to EEG-prompted discrimination; Frederick et al., 2015; Frederick et al., 2016).
No evidence of a postreinforcement synchronization (Hallschmid et al., 2002; Sherlin et al., 2011) was seen in this paradigm. That is, there was no advantage to having a high alpha trial rather than a low alpha trial after a correct response (Figure 4). Alpha amplitude also did not increase more in the few seconds after a correct response (Figures 6 and 7) than after an incorrect response (Figures 8 and 9). Thus, it did not seem generally possible to use a postreinforcement synchronization to cheat on high trials following correct low trials.
Figures 4-9 represent the 500-ms delayed amplitudes used in the task. While it is assumed based on Libet (1985, 1993) that the phenomenal correlates of alpha amplitude represent their corresponding brain states with a 0.5-s delay, this assumption has not been tested. If true, a study comparing discrimination performance with varying 0 to 2-s delays might contrast with Sherlin et al.'s (2011) suggestion that latency between a correct EEG response and the reinforcement should not exceed 250 to 350 milliseconds.

Interactions Between Discrimination and Operant Control
A significant effect of discrimination training on standard neurofeedback performance was observed in the first session but not in the second and third sessions (Figure 10). This observation is consistent with awareness playing a greater role in the early stages of learning (Frederick, 2016; Fitts & Posner, 1967). However, the lack of effect beyond the first session suggests that further refinement of the paradigm is needed. It is possible that the limited facilitation of operant control performance by discrimination training seen in this study reflects the limited opportunities for generalization of skills between the two tasks. That is, each 40-min session consisted of two 10-min runs of each task, alternating between tasks only three times. Future studies should alternate more frequently between the tasks. For instance, the training paradigm could require a subject to alternate immediately and repeatedly: first achieve a high alpha state, then achieve a low alpha state, then discriminate a high alpha state, then discriminate a low alpha state, and repeat. Such an arrangement would maximize the number of opportunities for generalization between the two types of skills. However, it is also possible that the effect of discrimination on control task performance was some idiosyncratic effect of the first session. For instance, subjects may habituate to the novelty of the task(s) and the lab environment after the first session, which may interact with how boredom or fatigue with the control task is interrupted by the discrimination task during minutes 11-20.
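The alternating arrangement proposed above can be laid out as a simple repeating schedule. The sketch below is only an illustration: the names PHASES, interleaved_schedule, and count_task_switches are hypothetical, and step durations are deliberately omitted because the text leaves them open.

```python
from itertools import cycle, islice

# One full cycle of the proposed arrangement: (task, target alpha state).
PHASES = [
    ("control", "high"),
    ("control", "low"),
    ("discriminate", "high"),
    ("discriminate", "low"),
]


def interleaved_schedule(n_steps):
    """Return the first n_steps of the repeating four-phase cycle."""
    return list(islice(cycle(PHASES), n_steps))


def count_task_switches(schedule):
    """Count transitions between the control and discrimination tasks."""
    return sum(1 for a, b in zip(schedule, schedule[1:]) if a[0] != b[0])
```

Under this sketch, the number of transitions between the control and discrimination tasks grows with the number of steps in a session rather than being fixed at three.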
It is worth noting that while the discrimination-trained group did not do better during sessions 2 and 3, they did not do worse. This finding suggests that dividing session time equally between standard neurofeedback and discrimination training is at least as useful a way to conduct the training. Discrimination training may have benefits other than facilitation of voluntary control, such as increasing client motivation and engagement in the session. While it is possible to sit passively through a standard neurofeedback session without much attention or effort, attention and participation are intrinsic to every trial in the discrimination task. When integrated into a standard neurofeedback session, self-prompted discrimination training may function as "transfer trials" and facilitate generalization of self-regulation skills beyond the clinical setting (Sherlin et al., 2011). Discrimination training measures and trains awareness of the subjective correlates of physiological states. Regardless of how it interacts with voluntary control, the ability to discriminate physiological states may play a role in the clinical efficacy of biofeedback, just as the ability to discriminate emotional states is important in the efficacy of psychotherapy (Lau & McMain, 2005). The explicit training of contrasts between opposing states in discrimination training may improve flexibility, or the ability to make transitions between states, as opposed to merely maintaining a desired state. By analogy, the standard neurofeedback approach is like lifting a weight once and holding it up for the entire session (with some exceptions, e.g., Strehl, 2009). Finally, the discrimination task score may provide an alternative and more reliable measure of the success of neurofeedback training.

Conclusion
The self-prompted discrimination paradigm in this study was much more readily learned than the EEG-prompted discrimination described in previous studies. The postresponse time series of alpha amplitudes suggested that recovering from correct low alpha trials was a particular challenge for some participants, contributing to session scores significantly below 50%. However, discrimination task scores frequently and significantly exceeded the percent time the EEG was in the correct state, providing evidence that the discrimination paradigm measures more than just the ability to manipulate EEG amplitude.
Observations that extreme amplitude events were discriminated better than moderate ones supported the interpretation that EEG alpha discrimination is more like a sensory than a motor performance. Discrimination training appeared to facilitate performance of the control task in the first session, consistent with awareness being important for early stages of learning. The lack of effect on control task performance in subsequent sessions suggests the need for further development of the paradigm. However, discrimination training may have other benefits, including client motivation and engagement, generalization beyond the clinical setting, and flexibility in making state transitions.