Performance of blink reflex in patients during anesthesia induction with propofol and remifentanil: prediction probabilities and multinomial logistic analysis

Background The amount of propofol needed to induce loss of responsiveness varied widely among patients, and they usually required less than the initial dose recommended by the drug package inserts. Identifying precisely the moment of loss of responsiveness will determine the amount of propofol each patient needs. Currently, methods to decide the exact moment of loss of responsiveness are based on subjective analysis, and the monitors that use objective methods fail in precision. Based on previous studies, we believe that the blink reflex can be useful to characterize, more objectively, the transition from responsiveness to unresponsiveness. The purpose of this study is to investigate the relation between the electrically evoked blink reflex and the level of sedation/anesthesia measured with an adapted version of the Richmond Agitation–Sedation Scale, during the induction phase of general anesthesia with propofol and remifentanil. Adding the blink reflex to other variables may allow a more objective assessment of the exact moment of loss of responsiveness and a more personalized approach to anesthesia induction. Results The electromyographic-derived features proved to be good predictors to estimate the different levels of sedation/anesthesia. The results of the multinomial analysis showed a reasonable performance of the model, explaining almost 70% of the adapted Richmond Agitation–Sedation Scale variance. The overall predictive accuracy for the model was 73.6%, suggesting that it is useful to predict loss of responsiveness. Conclusions Our developed model was based on the information of the electromyographic-derived features from the blink reflex responses. It was able to predict the drug effect in patients undergoing general anesthesia, which can be helpful for the anesthesiologists to reduce the overwhelming variability observed between patients and avoid many cases of overdosing and associated risks. Despite this, future research is needed to account for variabilities in the clinical response of the patients and with the interactions between propofol and remifentanil. Nevertheless, a method that could allow for an automatic prediction/detection of loss of responsiveness is a step forward for personalized medicine.


Background
In clinical practice, anesthesiologists use a variety of anesthetic drugs during surgery to render the patient unconscious/unresponsive, including the most widely used, intravenous drug propofol. In a previously published work [1,2], we found that the amount of propofol needed to induce loss of responsiveness (LORP) varied widely among patients (~ 300%) and that more than two-thirds of the patients required less than the initial dose recommended by the drug package inserts.
Identifying precisely the moment of LORP during the induction phase of general anesthesia is of extreme importance for the determination of the amount of propofol each patient needs. Using that information will help to guide the drug infusion rate required to maintain an adequate level of anesthesia throughout the surgery [3][4][5].
Currently, methods to decide the exact moment of LORP are based on subjective analysis [6]. Objectively, there are depth of anesthesia monitors, such as the Bispectral Index (BIS) (Aspect Medical System, Newton, MA, USA), which allow the maintenance of a steady-state during surgery, but do not enable the determination of the instant at LORP, which remains an open issue. BIS evaluation is characterized by a delay following the acquisition of a new dataset, which may even exceed one minute [7]. To prevent complications, such as awareness [8,9] or excessive anesthesia, the anesthesiologists should be aware of the conditions that cause incorrect BIS readings.
In a previous study, Mourisse et al. [10] found that the components of the blink reflex are attenuated and abolished with increasing concentrations of propofol. Mourisse et al. [11] showed, in another study that the blink reflex was more sensitive than BIS. Their results suggested that the differential sensitivity of the blink reflex components could be useful to monitor the depth of sedation/anesthesia, and thus, to detect when LORP occurs. However, their method used a 10-min stepwise increment in propofol, which is not compatible with anesthesia induction in a surgical setup.
We hypothesize that the blink reflex can be useful to characterize, more objectively, the transition from responsiveness to unresponsiveness during the induction of anesthesia with propofol and remifentanil. The administration of propofol may be stopped at this transition, personalizing the amount of propofol each patient requires and reducing the events of over and underdosing. Then, this can be used to titrate the infusion rate of propofol and to maintain an adequate level of sedation/anesthesia. In the current study, and based on a constant infusion of propofol at a slow rate, we intend to investigate the relationship between the electrically evoked blink reflex and the level of sedation/anesthesia. For this purpose, we extracted different electromyogram (EMG) features, and we compared the ability of these features to distinguish between different levels of sedation/anesthesia. The comparison was carried out using prediction probability analysis and multinomial logistic modeling. Adding the blink reflex to other variables already recorded during general anesthesia may allow a more objective assessment of the exact moment of LORP and a more personalized approach to anesthesia induction.

Results
Twenty-five patients (16 female and 9 male), aged 61 ± 13, weighing 72 ± 11 kg, heighten 162 ± 9 cm, 1 ASA I, 18 ASA II, 6 ASA III were enrolled. The supraorbital nerve was stimulated with 25.7 ± 8 mA. All patients reached the end of the depth of sedation/anesthesia scale. No patient had hemodynamic or respiratory problems.
Visual inspection of the raw electromyogram and analysis of the signal quality was performed before extracting the features to eventually discard poor-quality electromyogram. No signal was discarded.
The raw data from the study are presented in Fig. 1; each patient's observed adapted Richmond Agitation-Sedation Scale (aRASS) scores are plotted against the corresponding predicted propofol effect-site concentrations.
Concerning the raw electromyographic findings, after the propofol infusion was started, R 1 and R 2 decreased gradually. Ce concentrations, aRASS values, and times at which LOR 2 , LOR 1 and LORP occurred are presented in Table 1. R 2 was the first component to be abolished, followed by the R 1 component. aRASS median [minimum, maximum] for LOR 2 was 0 [0, − 2] and for LOR 1 was − 2 [0, − 4]. All patients were responsive (aRASS < − 5) when the R 2 and R 1 responses were last seen. There was a statistically significant difference between the propofol Ce concentration at LORP and LOR 1 (p < 0.05). The times for endpoints LOR 2 , LOR 1 and LORP are all statistically different from each other (p < 0.001).  The time between LORP and LOR 1 was 35.68 ± 23.41 s. The amount of propofol given between LOR 1 and LORP was 1.23 ± 0.82 μg/mL.
Results of the features extracted from the EMG signals in the time and frequency domain are presented in Table 2.

Prediction probability and Spearman correlation coefficient
The capacity of all extracted features to adequately assess the clinical sedation/anesthetic depth was evaluated by P k analyses. The user interface is shown in Fig. 2. The calculated P k values and correlation coefficients of the electromyography (EMG) extracted features are shown in Table 3. P k values were higher (P k > 0.700) in the period T 1 for V mean , V diff , f mean , f median , P bandwidth , and spectral entropy; and in the period T 2 for f mean , and spectral entropy. Results indicated that during the period T 1 there was a significant negative association, defined by a correlation coefficient R higher than − 0.500 and a p < 0.05, between aRASS and f mean , f median , and spectral entropy. During the period T 1 there was a significant negative association between propofol Ce concentration and the following features: f mean , f median , and spectral entropy. There was no significant correlation between aRASS and the extracted features, and no significant association between the Ce concentration of propofol and the extracted features.
The correlation between aRASS and the propofol Ce concentration was given by P k = 0.886, SE = 0.007 and by R = 0.751, p < 0.01. The clinical scale of the depth of sedation/anesthesia increased monotonically and positively with increasing propofol Ce concentration until LORP, revealing an increasing deepening of sedation/anesthesia.

Multinomial logistic analysis
Regarding MLR, we choose the features which were better correlated with the Ce concentration of propofol and with aRASS, and features which we believe were useful (with a P k > 0.700). The predictor features corresponding to these criteria were propofol Ce, and V mean , V diff , f mean , f median , P bandwidth and spectral entropy, during the period T 1 , and f mean and spectral entropy during the period T 2.
We tried to explore the effects of these variables by building the MLR model and then examined the results. To achieve this goal, we used SPSS software version 26, and calculated the MLR model with response variable and all explanatory variables to make the primary model. The overall effectiveness of the model was assessed using Chi-squared statistics. The Chisquare value of 696.430 and its respective p value < 0.001 indicated a significant relationship between the depth of sedation/anesthesia scale and the set of features in the final model. Model performance (Nagelkerke R 2 ) of the MLR for the combined features predicting aRASS level of sedation/anesthesia groups was 0.697. The overall predictive accuracy for the present model was 73.6%.
The likelihood ratio test shows the contribution of each feature to the model (Table 4).
Propofol Ce , f mean (T 1 ), spectral entropy (T 1 ), and spectral entropy (T 2 ) had a significant contribution (p < 0.05) to the model that predicts the aRASS's level of sedation/anesthesia, but not V mean (T 1 ) , f median (T 1 ) , P bandwidth (T 1 ) or f mean (T 2 ). Figure 3 illustrates the relation between the aRASS score and the mean of the significant aforementioned features. Table 2 Values from the extracted features during two different time windows: P 1 -from 10 to 25 ms, in which it is expected R 1 to be analyzed; P 2 -from 25 to 200 ms, in which R 2 is expected to be analyzed V mean mean amplitude, V diff difference between maximum and mean amplitude, P mean mean power, P max maximum power, f mean mean frequency, P meanfrequency power at mean frequency, f median median frequency, P band power band, P total total integrated of the spectrum, P total /f median ratio between total power and median frequency, SNR signal-to-noise ratio, P bandwidth power bandwidth The fitted logistic model was where P is the estimated probability of unresponsiveness (i.e., aRASS = − 5).

Discussion
In the present study, a standard electrical stimulus to evoke a blink reflex was used during the induction phase of general anesthesia with propofol and remifentanil to assess the relation between the recorded electromyogram and the level of sedation/anesthesia. The level of sedation was assessed every 6 s using an adapted version of the Richmond Agitation-Sedation Scale, entitled aRASS. The electromyographic-derived features were extracted during 2 specific subsets of samples: T 1, from 10 to 25 ms, in which it was expected the first response of the blink reflex, R 1 , to be analyzed, and T 2 , from 25 to 200 ms, in which the second response of the blink reflex, R 2, was expected to be analyzed. The electromyographic-derived features in the time domain included the mean amplitude (V mean ), and the difference between the maximum and the mean amplitude (V diff ). Because R 2 and R 1 responses were abolished before LORP and, consequently, there was an insufficient number of data points, a frequency-domain analysis was also performed. The electromyographic-derived features in the frequency domain included the mean power (P mean ), maximum power (P max ), mean frequency (f mean ), power at mean frequency (P meanfreq ), median frequency (f median), band power (P band ), total power (P total ), ratio between P total and f median (P total /f median ), signalto-noise (SNR), power bandwidth (P bandwidth ) and spectral entropy. These variables were

Table 3 Prediction probability (P k ) values calculated between the adapted Richmond Agitation-Sedation Scale (aRASS) and the electromyography (EMG) extracted feature. Correlation coefficient R with aRASS and with propofol effect-site (Ce) concentration. T 1 and T 2 are time windows from 10 to 25 ms, and from 25 to 200 ms, respectively
Prediction probability (P k ) values calculated with pooled data from all patients (n = 25). The standard error (SE) is also shown. Rank correlation coefficient from pooled data of all patients (n = 25) is shown. *Significant at the 0.05 level. **Significant at the 0.001 level. V mean mean amplitude, V diff difference between maximum and mean amplitude, P mean mean power, P max maximum power, f mean mean frequency, P meanfrequency power at mean frequency, f median median frequency, P band power band, P total total integrated of the spectrum, P total /f median ratio between total power and median frequency, SNR signal-to-noise ratio, P bandwidth power bandwidth  selected as they are the most useful and popular frequency-domain features for electromyography analysis both in clinical and engineering applications. The ability of the electromyographic-derived features to distinguish different levels of aRASS was assessed using prediction probability analysis. Spectral entropy, f mean , f median and P bandwidth during the T 1 time window period showed the best performance in detecting different levels of aRASS, as reflected by its higher P k value, followed by spectral entropy and f mean during the T 2 time window period. A statistically significant correlation (R > 0.500) between aRASS (or propofol Ce concentration) and f mean (T 1 ), f median (T 1 ) and spectral entropy (T 1 ) period was also found. At low anesthetic concentration, the EMG frequency was high, and it slowed down as the drugs concentrations increased. Spectral entropy considers both the overall signal variability characteristics, which are naturally related to the spectral content, and the signal's complexity or irregularity [12]. For this reason, spectral entropy is known to be an excellent index to distinguish between consciousness and unconsciousness states during propofol anesthesia, even in the presence of burst suppression [13]. f mean and f median are frequently used as the gold standard tool to detect force in the target muscles using EMG signals [14,15]. P bandwidth is supposed to be a good indicator of changes in the EMG signal when certain frequencies are lost, as is the case of the R 1 and R 2 component of the blink reflex, which have a particular signature. The effectiveness of spectral entropy, f mean, f median and P bandwidth to distinguish the different levels of aRASS, resulted by the inhibition of EMG activity by muscle relaxation, is presented and confirmed in this study.
Another finding was that aRASS scale was strongly correlated with estimated propofol Ce concentrations, indicating that the clinical scale aRASS increased monotonically and positively with increasing estimated propofol Ce concentrations until LORP. This revealed an increasing deepening of anesthesia. This finding is in line with those published by Mourisse and colleagues [10,11] who have done a similar study using a different sedation scale and a different anesthetic protocol. In the work of Mourisse and colleagues, the group of patients received propofol in a stepwise deepening of anesthesia with different targets and only 2 min after reaching target effect-site concentrations, the blink reflexes, and depth of anesthesia scores were recorded. In our study, remifentanil infusion started with a Ce concentration target 2.5 μg/mL, and then patients received propofol at an infusion rate of 3.3 mL/kg/h, slowly and continuously, until LORP. Starting before the propofol infusion, the stimulation of the supraorbital nerve was recorded every 6 s in our study and, for this reason, our method had more data to precisely identify the amount of propofol in the endpoints of interest.
Only the features that were both useful for predicting the aRASS scale (P k > 0.700) and correlated significantly (R > 0.500) with the propofol Ce concentration were used for the multinomial logistic regression model to predict LORP (defined as − 5 in the aRASS scale). The results of the multinomial analysis showed a reasonable performance of the model, explaining almost 70% of the aRASS variance. The effects and contributions of each feature were not the same: propofol Ce concentration, f median (T 1 ), spectral entropy during T 1 , and spectral entropy during T 2 had a significant overall effect (p < 0.05) on the aRASS score, while V mean (T 1 ), f median (T 1 ) and f mean (T 2 ) did not. The overall predictive accuracy for the model was 73.6%, suggesting that it is useful to predict LORP. The availability of accurate models for predicting the drug effect in patients undergoing general anesthesia is an important factor in producing a personalized drug infusion [16]. Our developed model can be employed in model predictive control strategies for closed-loop anesthesia. This will help the anesthesiologists with the optimization of drug titration without overshoot and controlling the physiological functions. This automated system could also result in a reduction of the workload of the anesthesiologists.
A major drawback of this our blink reflex method is that it is dependent on a normal neuromuscular transmission. The degree of relaxation can be estimated by stimulating the facial nerve and assessing the evoked response of that part of the orbicularis oculi muscle. The effect of muscle relaxants on the inferior part of the orbicularis oculi is still not known. Also, the blink reflex method, while promising, may imply that the anesthesiologists are stimulating the patient while simultaneously attempting to induce unconsciousness, even though it is a much smaller stimulation than taping (the standard in clinical practice). The multinomial regression model was applied to a small sample size of the unconsciousness states in this study. In particular, the small number of unconsciousness states could be the cause of a not so high performance in detecting LORP. Even so, the performance of our model is relatively good and therefore, we intend to implement our model for online estimation. However, preliminary studies should be conducted specially because of the computerization times the model can result.

Conclusions
By analyzing the electrically evoked blink reflex during the induction of general anesthesia with propofol and remifentanil, we determined that there is enough relevant information to predict the state of unresponsiveness during the transition from consciousness to unconsciousness with a multinomial logistic model. A method that could allow for an automatic prediction/detection of LORP is a step forward for personalized medicine.
With an accuracy of 73.6%, this model can help to greatly reduce the overwhelming variability observed between patients and avoid many cases of overdosing and associated risks. To our best knowledge, no studies have been conducted on LORP prediction using our approach. Despite our results, it would be accurate if we had a larger sample size for applying the multinomial regression model or any other prediction model to analyze the relation between the aRASS and the EMG effect reflected in extracted features. Further research should investigate the impact of remifentanil on such a technique.
Nevertheless, our method of electromyographic recording of the electrically evoked blink reflex in patients submitted to general anesthesia and in the continuum to unconsciousness showed to be a possible method to continuously monitor the EMG, in awake, sedated or unconsciousness patients, during the onset to unconsciousness, and in a real scenario in which clinical anesthesia takes place every day. Patients also found the stimuli to be easily tolerable. Devices using this technique could turn this method into a clinical routine way of monitoring the transition to unconsciousness.
In the future, we intend to add other variables to feed the model, such as heart rate, blood pressure, and electroencephalogram. We plan to use those variables to build an adaptive model that deals simultaneously with the variabilities in the clinical response of the patients and with the drug interactions. Additionally, we plan to use different EMG equipment for more robustness.

Patients
Twenty-five patients, aged over 18 years (ASA I, II or III), scheduled for neurosurgical procedures participated in this study. They had no hemodynamic, respiratory or ophthalmic problems, and did not use analgesics, psychotropic or excessive alcohol consumptions. The Hospital Ethical Committee approved the study and all subjects gave informed written consent. No premedication was given. The study took place in a quiet, warm anesthetic induction room. Before the start of the study, the patients were prepared as usual for anesthesia (intravenous access, continuous electrocardiogram, pulse oximetry, and non-invasive blood pressure).

Anesthetic protocol
Our standard practice for neurosurgical procedures consists of opioid-propofol anesthesia using a Target Controlled Infusion (TCI) system. In the operating room, after placement of standard monitor and an intravenous line in the dorsum of the hand, an infusion of a balanced electrolytic solution was started at 6 mL.kg −1 .h −1 . The anesthesiologist would then use a Fresenius Base Primea docking station (Fresenius-Kabi, Bad Homburg, Germany) to start a TCI of remifentanil (Minto PKPD model) [12,13], at an effect-site concentration (Ce) target of 2.5 ng.mL −1 . A bolus of 10 mg of lidocaine was administered locally to reduce the pain associated with propofol administration. One minute after the remifentanil pseudo-equilibration was achieved, baseline blinks were Page 11 of 16 Ferreira et al. BioMed Eng OnLine (2020) 19:84 recorded and, then, an infusion of 1% propofol (Schnider PKPD model) [14] was started using a TCI enabled infusion system, in the TCI-View mode, at 3.3 mL.Kg −1 .h −1 until LORP, determined by the anesthesiologist. This slow velocity of infusion during induction enabled a careful titration of the minimum amount of propofol required for loss of responsiveness. Once LORP was reached, the propofol infusion was stopped, the estimated propofol Ce concentration was noted and the TCI system was switched to effectsite TCI mode with a Ce of 75% of that at LORP. At this point, no additional analgesic/ opioid medication was given during induction.
Propofol infusion was subsequently titrated to maintain BIS (BIS Vista ™ monitor-Medtronic, Ireland) between 40 and 60. The study was terminated just before tracheal intubation.

Data acquisition
Using the VikingQuest ™ neurophysiological system (VikingQuest, Nicolet, WI, USA) the electromyographic stimulations and recordings were performed at a total sweep time of 200 ms with a sample rate of 10 kHz. A high-pass filter was applied with a cutoff frequency of 20 Hz. Before the induction of anesthesia and prior electrode application, all the patient's head skin surfaces were cleaned with an exfoliant agent. Surface electrodes (1.4 cm 2 ) coated with alcohol and conductive paste (electrode impedance < 8 kΩ) were applied to stimulate and record the electromyogram from the right orbicularis oculi muscle. The right supraorbital nerve was transcutaneously stimulated using a bipolar electrode with the cathode placed beneath the eyebrow over the supraorbital notch and the anode placed above the eyebrow (interelectrode distance 2 cm). The supraorbital nerve was electrically stimulated with a duration of 0.1 ms at 0.16 Hz. With regard to the electrode, the recording electrode was placed in the middle of the inferior rim of the orbit; the reference halfway on the eye-ear line and the ground electrode was placed on the cheek or on the shoulder of the patient (Fig. 4). The signals were stored in the Viking-Quest ™ software provided by the manufacturer. The raw data was exported to a personal computer to be treated and analyzed in MATLAB ® 2019b (MathWorks, USA).

Assessment of levels of responsiveness
The level of sedation was assessed every six seconds using an adapted version of the Richmond Agitation-Sedation Scale (RASS) score [17], entitled aRASS. This scale is a modification of the RASS scale [17], with the addition of the corneal reflex endpoint, yielding an adapted Richmond Agitation-Sedation Scale (aRASS). A score of 0 corresponds to a fully awake and alert behavior, − 1 with a not fully alert behavior, but a sustained awakening to voice (eye-opening and eye contact), − 2 with a brief awakening to voice (eye-opening and eye contact, − 3 no response to voice, but movement or eyeopening to shaking and shouting, − 4 is no response after shaking and shouting, but still have the corneal reflex and − 5 is no corneal reflex. Loss of responsiveness (LORP), defined as − 5 in aRASS scale was evaluated using a drop of sterile water to the cornea, after aRASS reached a score of − 4. Hereafter, this evaluation was intercalated with electrical stimulations at 6 s intervals. If the eyes blinked concomitantly, the reflex was intact. If only one eye blinked, the reflex was impaired, and if neither eye blinked, the reflex was absent. Baseline blink reflexes and evaluation of the aRASS scale were recorded several times before propofol was administered. Patients then received propofol, and at the same time four successive blink reflexes were elicited and recorded. A further four successive blink reflex stimuli were elicited after a 6-s interval and recorded continuously until LORP was reached.

Blink reflex parameters
The blink reflex neurophysiology and anatomy are reasonably well known [18]. The electromyography records of an electrically evoked blink reflex showed at least two components (R 1 and R 2 components). The first or early response (R 1 ) is brief and occurs after a latency of approximately 10 ms on the side of stimulation [19]. The second response (R 2 ) has a latency of approximately 30 ms, is bilateral, and more prolonged in time [19]. The R 2 response causes the actual contraction of the orbicularis oculi muscle [19]. The optimal stimulus intensity was sought by gradually increasing the current until visual observation of the EMG showed that, in the presence of a visible R 1 component, the R 2 component reach its maximum amplitude [20]. Figure 5, uppermost panels, illustrates the two components of a normal blink reflex. The averaged electromyographic records of the blink reflex were obtained from the four consecutive electrical stimulation of the supraorbital nerve (interstimulus interval 5 ms). The first rows showed baselines. Vertical lines marked the beginning and end of the individual components (R 1 and R 2 ), marked by an expert neurophysiologist at the end of the session, using the marker tool of the VikingQuest ™ neurophysiological device. The right margins showed estimated propofol concentrations (µg/mL), clinical endpoints (loss of R 1 , loss of R 2 and LORP), depth of sedation/anesthesia level (aRASS score), and the time since propofol started.

Data analysis
Before proceeding with data analysis, we removed the DC component by subtracting the mean amplitude from the EMG signals. We then removed the best straight-fit line (in the least-squares sense) from the EMG signals with the detrend function of MATLAB ® . For each pre-processed EMG signal, we selected two specific subsets of samples: T 1, from 10 to 25 ms, in which it is expected R 1 to be analyzed, and T 2 , from 25 to 200 ms, in which R 2 is expected to be analyzed.
Data from each subset was analyzed in both time and frequency domain. In the time domain, after rectifying and normalizing each subset of data, the following features were extracted: mean amplitude (V mean ), and the difference between the maximum (equal to 1) and the mean amplitudes (V diff ).
Because R 2 and R 1 responses were abolished before LORP and, consequently, there was an insufficient number of data points, a frequency-domain analysis was also performed. A non-parametric Fast Fourier Transform (FFT) using Welch's modified periodogram was used to extract functions from the frequency domain. EMG was windowed with a Hamming window and the modified periodogram was calculated for each time period. The Power Spectrum Density (PSD) was estimated by averaging over all resulting periodograms. For this purpose, MATLAB ® was used. The following features from the frequency domain were extracted: