Broadband beamforming compensation algorithm in CI front-end acquisition

Background To increase the signal-to-noise ratio (SNR) and to suppress directional noise in front-end signal acquisition, microphone array technologies are being applied in the cochlear implant (CI). Due to size constraints, the dual-microphone system is most suitable for actual application. However, direct application of the array technology results in the low frequency roll-off problem, which can noticeably distort the desired signal. Methods In this paper, we theoretically analyze the roll-off characteristic on the basis of CI parameters and present a new low-complexity compensation algorithm. We obtain the linearized frequency response of the two-microphone array from modeling and analysis for further algorithm realization. Realization and results A linear method was used to approximate the theoretical response with adjustable delay and weight parameters. A CI dual-channel hardware platform was constructed for experimental research. Experimental results show that our algorithm performs well in compensation and realization. Discussions We discuss the effect of environmental noise. Actual daily noise with more low-frequency energy will weaken the algorithm performance. A balance between low-frequency distortion and the corresponding low-frequency noise needs to be considered. Conclusions Our novel compensation algorithm uses a linear function to obtain the desired system response, which is a method of low computational complexity for CI real-time processing. Algorithm performance was tested with CI CIS modulation, and the influence of experimental distance and environmental noise was further analyzed to evaluate the algorithm's constraints.

noise-suppression methods from HA applications. Many theoretical and experimental studies of array speech enhancement have been performed to improve CI speech recognition [9][10][11]. A dual-microphone CI was recently designed by Cochlear Ltd. to improve the front-end SNR. However, the clinical CI device is still equipped with a single omnidirectional microphone, and no microphone array-based method is applied in daily usage for the CI device.
Microphone array methods [12,13] add spatial information to the algorithms and are especially effective in suppressing directional noise. These array-beamforming methods include fixed [14,15] and adaptive beamformers [16][17][18][19][20]. Present complicated algorithms combine beamforming technology with single-channel signal-filtering and noise-suppression methods [21][22][23]. The delay and sum beamforming method, especially with two microphones, is more suitable for application in the CI device, given its appropriate size constraints and low complexity in real-time processing [24].
Microphone array technologies are very effective for narrowband signals, whereas speech is a broadband signal. The direct application of narrowband technologies to acoustic and phonic fields may result in low frequency roll-off. Studies have reported the practical implications of low frequency roll-off for HA devices [25], comparing sensory performance results between situations with and without compensation for low frequency roll-off. Theoretically, the slopes for the first- and second-order low frequency roll-off are 6 and 12 dB/octave, respectively [1].
A solution for the low frequency roll-off is gain adjustment in each sub-band. The corresponding gains need to compensate for the energy loss, and the sub-band signal is readjusted to be the same as that of the omnidirectional microphone [26]. A first-order differential low-pass filter is conventionally applied [17,19] for compensation of the low frequency roll-off. However, these methods cannot accurately provide the gains required to compensate for the low frequency roll-off. Increased focus has been given to broadband beamforming [27,28]. Post-filter technology uses a maximum likelihood filter for spectral shaping and noise suppression [29]. However, these methods are very complicated, have high computational complexity, and exceed the capabilities of current CI speech processors for portable application.
We use the array method for CI speech enhancement and further need to compensate for the signal distortion introduced from the low frequency roll-off. For a simple dual-microphone device with a delay parameter, we previously proposed the normalized beamforming algorithm [30] based on the Taylor approximation. However, this method is not applicable for a general array with delay and weight parameters.
In this paper, we theoretically analyze the low frequency roll-off feature under different parameters. A dual-microphone hardware platform, based on actual CI parameters, was constructed to obtain practical experimental results. From theoretical analysis and experimental observation of the roll-off, we propose a novel compensation algorithm for the low frequency roll-off with adjustable delay and weight parameters. The proposed algorithm can accurately compensate the signal distortion and requires very few calculations. Further testing of this adjustment method in the CI speech strategy revealed the efficiency and applicability of the proposed compensation algorithm.

Dual-microphone system in CI front-end
Given the size constraint of CI devices and the limited calculation capability in the speech processor, the dual-microphone array is most suitable for use in front-end signal acquisition. Figure 1 presents the system sketch for a dual-microphone array with delay and weight parameters based on the "delay-and-subtract" method. According to practical requirements, the inter-microphone distance is about 1 cm. Dual-channel signals recorded by these 2 microphones are delayed and subtracted to yield the desired directional output signal.
The delay-and-subtract method is one of the most effective technologies for size-constrained devices such as hearing aids and cochlear implants. In Figure 1, the two microphones each record the signal, d is the inter-microphone distance, and c is the speed of sound in air. If we assume that the signal recorded by Microphone 1 is x(t), then the signal recorded by Microphone 2 is asynchronous due to the spatial difference, with an additional time delay (d/c)cosθ. This signal is delayed and weighted with the corresponding parameters τ and β to yield a new signal, which is combined with the signal from Microphone 1 to obtain the directional output y(t). The system magnitude response is given by Eq. (1).
where the parameter f corresponds to the narrowband frequency. For a different orientation θ, the system response is also different. The algorithm delay τ determines the beam pattern (dipolar pattern: τ = 0; supercardioid pattern: τ = 0.342d/c; cardioid pattern: τ = d/c) [31].
A fixed value of the delay τ can yield a set of similar beam patterns. The magnitude response in Eq. (1) is based on a specified narrowband frequency to obtain the directional beams. Therefore, the system response is a function of the signal frequency f, and the use of different frequencies results in different magnitude responses.
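Eq. (1) itself is not reproduced in this text. As a minimal sketch, the code below assumes the standard delay-and-subtract form |H| = |1 − β·exp(−j2πf(τ + (d/c)cosθ))|, which follows from the signal path described above (Microphone 2 delayed by τ, weighted by β, and subtracted from Microphone 1). The function name `magnitude_response` and the dictionary `PATTERN_DELAYS` are illustrative names, not taken from the paper.

```python
import cmath
import math

D = 0.01   # inter-microphone distance in metres (value quoted in the paper)
C = 340.0  # speed of sound in m/s (value quoted in the paper)

def magnitude_response(f, theta_deg, tau, beta=1.0, d=D, c=C):
    """Magnitude of 1 - beta*exp(-j*2*pi*f*(tau + (d/c)*cos(theta))).

    This is an assumed form of Eq. (1), not a copy of the paper's equation.
    """
    total_delay = tau + (d / c) * math.cos(math.radians(theta_deg))
    return abs(1.0 - beta * cmath.exp(-2j * math.pi * f * total_delay))

# Delay settings quoted in the text for the three beam patterns.
PATTERN_DELAYS = {
    "dipole": 0.0,
    "supercardioid": 0.342 * D / C,
    "cardioid": D / C,
}

if __name__ == "__main__":
    # Response on-axis (theta = 0) at the lowest and highest CI centre frequencies.
    for name, tau in PATTERN_DELAYS.items():
        low = magnitude_response(274.0, 0.0, tau)
        high = magnitude_response(6276.0, 0.0, tau)
        print(f"{name}: |H|(274 Hz) = {low:.4f}, |H|(6276 Hz) = {high:.4f}")
```

Under this assumed form, the cardioid setting places its null at 180° and the dipole setting at ±90°, in line with the beam patterns named in the text.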

Low frequency roll-off characteristic
Figure 1 Sketch of the delay-and-subtract method in the dual-microphone system.
We use the actual parameters of a CI device, d = 0.01 m and c = 340 m/s, in the theoretical analysis. We initially analyze a simplified situation, in which the weight parameter β is chosen to be 1, so that the 2 microphones are equally weighted. Figure 2 shows the dipolar, supercardioid, and cardioid beam patterns in this situation for different frequencies, which correspond to the center frequencies of the 8-channel CI filter bank shown in Table 1.
The system magnitude response in Eq. (1) is a function of signal frequency. In each beam pattern (dipolar, supercardioid, and cardioid), the system response increases as the corresponding frequency increases. Because speech is a broadband signal including various frequencies, the corresponding beams are not consistent with each other. In addition, the system response for low-frequency beams is smaller than that for high-frequency ones. As these panels show, the main-lobe and side-lobe amplitudes both increase at higher frequencies. Therefore, we observe low frequency roll-off in the broadband application, which will introduce distortion to the desired speech.
We can theoretically analyze the cause of the low frequency roll-off. To simplify the analysis, let β = 1; the corresponding response simplifies to Eq. (2).
When signal frequency is low, the system response is approximately proportional to f. Therefore, low frequency corresponds to small response.
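The proportionality to f can be checked numerically. The sketch below assumes that, for β = 1, the response reduces to 2|sin(πf(τ + (d/c)cosθ))| (the usual simplification of the delay-and-subtract response; Eq. (2) is not reproduced in this text). For small arguments the sine is close to its argument, so doubling the frequency roughly doubles the response, which is the 6 dB/octave first-order slope quoted in the introduction. The function names are illustrative.

```python
import math

D = 0.01   # inter-microphone distance in metres
C = 340.0  # speed of sound in m/s

def response_beta1(f, theta_deg=0.0, tau=0.0):
    """Assumed beta = 1 response: 2*|sin(pi*f*(tau + (d/c)*cos(theta)))|."""
    total_delay = tau + (D / C) * math.cos(math.radians(theta_deg))
    return 2.0 * abs(math.sin(math.pi * f * total_delay))

def octave_slope_db(f, theta_deg=0.0, tau=0.0):
    """Gain change in dB when the frequency is doubled from f to 2f."""
    ratio = response_beta1(2.0 * f, theta_deg, tau) / response_beta1(f, theta_deg, tau)
    return 20.0 * math.log10(ratio)

if __name__ == "__main__":
    # Slope starting at the lowest CI centre frequency (dipole setting, on-axis).
    print(f"slope from 274 Hz to 548 Hz: {octave_slope_db(274.0):.2f} dB/octave")
```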
As an example, this roll-off feature can easily be observed when white noise is used as the test signal, as shown in Figure 3. The spectrum of white noise contains energy at all frequencies. Correspondingly, the first-order output noticeably weakens the low-frequency signal in panel (b-2).
Figure 3 Waveforms of the original white noise signal (panel a-1) and the first-order differential output signal (panel a-2); the corresponding spectra are presented in panels (b-1) and (b-2), respectively.
Based on Eq. (1), we can deduce the following inequality.
It indicates that, for β = 1, the system response approaches zero as the frequency approaches f = 0. Therefore, the amplitude response of the low-frequency signal is smaller and tends to zero.
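The inequality is not reproduced in this text. One bound that is consistent with the surrounding discussion, and with the minimum |1 − β| quoted later for Eq. (3), can be derived as follows, assuming the standard delay-and-subtract response; the symbol φ is introduced here only for brevity.

```latex
\[
|H(f,\theta)| = \left|1 - \beta e^{-j\varphi}\right|
  = \sqrt{1 + \beta^{2} - 2\beta\cos\varphi},
\qquad
\varphi = 2\pi f\left(\tau + \frac{d}{c}\cos\theta\right)
\]
\[
-1 \le \cos\varphi \le 1
\;\Longrightarrow\;
|1-\beta| \;\le\; |H(f,\theta)| \;\le\; 1+\beta
\]
```

The lower bound is reached when φ = 0, which includes f = 0 for every orientation; with β = 1 it equals zero, matching the statement that the low-frequency response tends to zero.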

Parameter analysis
The most common situation uses a weight parameter of β = 1. Use of different weights in the algorithm will change the corresponding beam patterns. We analyze beam features with delay times of τ = 0, 0.342d/c, and d/c (Figure 4). Figure 4 describes the influence of different weights on the system response (Channels 1, 2, 3, 4, 5, 6, 7, and 8 correspond to 274, 517.5, 761, 1005, 1462, 2376, 3869, and 6276 Hz, respectively). For β = 1, the aforementioned analysis indicates that the low-frequency response will approach zero. For β ≠ 1, the system response in the low-frequency band does not approach zero, but tends to a specific amplitude. For β = 0.2, the low-frequency band approaches approximately 0.8. For β = 0.5, 2, and 4, the corresponding low-frequency responses are approximately 0.5, 1, and 3, respectively, consistent with the theoretical minimum |H(e^{jω})|_min = |1 − β| in Eq. (3).
Although the low-frequency response does not approach zero for β ≠ 1, low frequency roll-off and response distortion are still observed. In each panel, the system response increases with increasing signal frequency. In Figure 4, a-1, a-2, a-3, and a-4 refer to derivative patterns based on the standard dipolar beam (τ = 0). In the range of ±180°, we observe the same amplitude-changing feature, such that the high-frequency band increases. The dipolar pattern in Figure 2(a) shows zero-response values (nulls) at the ±90° orientations. In contrast, the derived dipolar beam (β ≠ 1) in the 4 panels of Figure 4(a) shows minimal values (mins) rather than nulls at ±90°. For τ = 0.342d/c and d/c, the corresponding mins are obtained at ±110° and ±180°, respectively.
In Figure 4, the system response is located at different amplitude ranges for different weights. Because the overall amplitude levels are different, low frequency roll-off cannot be quantitatively observed directly from the figure. To compare each roll-off characteristic on a unified standard, we normalize system responses at 0 Hz to be the same. For a fixed weight, the amplitude at each orientation differs from the others. Therefore, we need to normalize the average amplitude from −180° to 180° to be 1. The 0 Hz frequency is the limiting case of low frequency, at which the corresponding frequency response tends to |1 − β| (shown in Eq. 3). Figure 5 shows the normalized system responses, that is, the beam patterns after response normalization at 0 Hz. In each panel, the system response increases with increasing frequency, but the magnitude of the increase differs for different parameters. After normalization, for β = 0.5 (a-2, b-2, and c-2) and β = 2 (a-3, b-3, and c-3), the system responses can be more easily distinguished and the high-frequency response increases more. In contrast, for β = 0.2 (a-1, b-1, and c-1) and β = 4 (a-4, b-4, and c-4), the system response changes more smoothly. The situation with β > 1 corresponds to a larger gain added to Microphone 2, and β < 1 corresponds to a larger gain added to Microphone 1. Similar to the previous comparison, when the weight approaches 1, the high-frequency beams grow more rapidly, which results in more low frequency roll-off. When the weight parameter is far from 1 (≪ 1 or ≫ 1), the roll-off and corresponding distortion are smaller.
The system response for β < 1 can be given with the following equation transform.
The equivalent equation in Eq. (4) indicates that the system response for β_1 < 1 corresponds to the case of β_2 = 1/β_1 > 1, with only an additional gain parameter, β_1. Therefore, the normalized system response presents identical roll-off features for both situations. Equation (4) indicates that the weight value has reciprocal and symmetrical features; thus, we only need to consider the case of β < 1.
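The reciprocal symmetry can be checked numerically. The sketch below assumes the standard delay-and-subtract response |1 − β·exp(−jφ)| and verifies that the response for a weight β_1 < 1 equals β_1 times the response for β_2 = 1/β_1. Eq. (4) is not reproduced in this text, so this is a check of the stated property rather than of the paper's exact expression; `symmetry_gap` is an illustrative helper name.

```python
import cmath
import math

D = 0.01   # inter-microphone distance in metres
C = 340.0  # speed of sound in m/s

def magnitude_response(f, theta_deg, tau, beta):
    """Assumed delay-and-subtract magnitude response."""
    total_delay = tau + (D / C) * math.cos(math.radians(theta_deg))
    return abs(1.0 - beta * cmath.exp(-2j * math.pi * f * total_delay))

def symmetry_gap(beta1, f, theta_deg, tau):
    """Difference between |H(beta1)| and beta1 * |H(1/beta1)|; ideally zero."""
    direct = magnitude_response(f, theta_deg, tau, beta1)
    scaled = beta1 * magnitude_response(f, theta_deg, tau, 1.0 / beta1)
    return abs(direct - scaled)

if __name__ == "__main__":
    # beta1 = 0.5 pairs with beta2 = 2, as in the Figure 5 comparison.
    print(symmetry_gap(0.5, 1005.0, 60.0, 0.342 * D / C))
```

Because the two responses differ only by the constant factor β_1, they coincide after normalization at 0 Hz, which is why only β ≤ 1 needs to be examined.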

Realization and results
Low frequency roll-off in dual-channel CI platform
In this paper, we develop a dual-channel hardware platform, based on actual CI size constraints and design requirements, for signal acquisition and algorithm analysis. The hardware system includes microphone modules, a signal acquisition circuit, a signal transmitting device, a computer, and accessory devices (holder frame, etc.), as shown in Figure 6.
The inter-microphone distance is adjusted to be 1 cm, and the loudspeaker is 1.5 m apart from the hardware. The experiment is conducted in an ordinary room (10 m × 8 m × 3.5 m) with a reverberation time of about 400 ms.
As the experiment begins, the loudspeaker plays a voice recording of a speaker reading a paragraph of English material (native American English). The two microphones record this signal from the loudspeaker, and the collected signal is amplified and filtered by the hardware circuit. The USB sound card converts the analog signal to a digital signal, which is transmitted to the computer. The data are stored on a PC hard disk for further analysis. To simplify the analysis of low frequency roll-off with a unified standard, an additional gain is applied to the output signal to maintain a consistent input-output energy. On the basis of the previous theoretical analysis, the weight only needs to be ≤1. We choose the weight parameter to be 0.2, 0.4, 0.6, 0.8, or 1 and the delay parameter τ to be 0, 0.342d/c, or d/c, to analyze the practical spectra of the signal recorded by the hardware (Figure 7). Figure 7 shows signal spectra for different delay times and weights. To simplify the analysis, the spectrum in each panel is normalized to have the same energy as the original signal. The spectrum of the original signal is shown for a detailed comparison of the low frequency roll-off. For a fixed delay time, as the weight increases, the high-frequency response strengthens and the low-frequency response weakens. For a weight parameter of 1, the low frequency roll-off is most noticeable. For this speech signal, after energy normalization, signals in the band between 0 and 1800 Hz are relatively weak, whereas the corresponding signals are relatively strong in the high-frequency band (>1800 Hz). The delay parameter also influences the spectrum distribution, although this effect is not as obvious as the influence of the weight parameter.

Compensation algorithm
We next attempt to compensate for the signal distortion based on the low frequency roll-off feature. For different weight and frequency parameters, the system response is a function of β and f, H(e^{jω}) = H(β, f). Using a weight range of 0 ≤ β ≤ 1 (as in the previous section), the system responses for a signal between 0 and 6000 Hz are shown in Figure 8 (dipolar-based parameter of τ = 0). Figure 8 presents the system responses for different weights and frequencies, in which the weight increases in steps of 0.02. The response curves are equally spaced, with good linearity (especially in the high-frequency band). We use (f, H(β, f)) to describe points in the 2-dimensional plane for the f-H function. For points on the left vertical axis (corresponding to f = 0), the system responses are equally spaced. For a fixed weight β, the response function satisfies H(β, f)|_{f = 0} = 1 − β, such that the coordinate of the left endpoint in the f-H function is (0, 1 − β). For f = 6000 Hz, the right endpoints of the response curves are also equally spaced, but with a turning point at H_2 corresponding to β = 0.45.
From  from H 2 to H 3 . Consequently, we obtain the following system response at the frequency f = 6000 Hz.
The previous analysis indicates the approximate linearity of the system response. To reduce the computational complexity in CI devices, we linearize the f-H function. We use the curve endpoints (0, 1 − β) and (6000, H(β, f)|_{f = 6000}) to yield the approximate system response H_eva(β, f), as shown in Eq. (6).
For a fixed beam pattern, we use Eq. (7) to obtain the linear system response. The corresponding response curves are shown in Figure 9 (a). Panels (a) and (b) present the linear f-H function and the relative errors calculated by Eq. (8), respectively.
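The endpoint-based linearization can be sketched as below. This is a minimal illustration under several assumptions: the exact response is taken to be the standard delay-and-subtract form on-axis (θ = 0), the right endpoint uses the exact value at 6000 Hz rather than the paper's approximation of it (Eq. (5) is not reproduced here), and the relative error is assumed to be (linear − exact)/exact, which may differ in detail from Eq. (8). All function names are illustrative.

```python
import cmath
import math

D = 0.01        # inter-microphone distance in metres
C = 340.0       # speed of sound in m/s
F_MAX = 6000.0  # upper end of the analysed band in Hz

def exact_response(beta, f, tau=0.0, theta_deg=0.0):
    """Assumed exact delay-and-subtract magnitude response."""
    total_delay = tau + (D / C) * math.cos(math.radians(theta_deg))
    return abs(1.0 - beta * cmath.exp(-2j * math.pi * f * total_delay))

def linear_response(beta, f, tau=0.0, theta_deg=0.0):
    """Straight line through (0, 1 - beta) and (F_MAX, exact response at F_MAX)."""
    left = abs(1.0 - beta)
    right = exact_response(beta, F_MAX, tau, theta_deg)
    return left + (right - left) * f / F_MAX

def relative_error(beta, f, tau=0.0, theta_deg=0.0):
    """(linear - exact) / exact; assumed definition, undefined where exact is 0."""
    exact = exact_response(beta, f, tau, theta_deg)
    return (linear_response(beta, f, tau, theta_deg) - exact) / exact

if __name__ == "__main__":
    for beta in (0.2, 0.4, 0.6, 0.8):
        err = relative_error(beta, 3000.0)
        print(f"beta = {beta}: relative error at 3000 Hz = {100 * err:.1f}%")
```

Evaluating one line per weight needs only a multiplication and an addition per frequency, which is the source of the low computational cost claimed for the method.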
As seen in Figure 9(b), the relative error ranges from -10% to 40%. A comparison of Figures 8 and 9(a) indicates that the theoretical f-H function (ideal system response) is primarily a set of concave curves, such that most of the response amplitudes on the linearized curves are greater than the ideal ones. Consequently, most of the relative errors are positive and negative relative errors are rare, so the overall response amplitudes are larger than the ideal ones. This finding implies that the total energy of the output signal is strengthened. We only need to analyze the relative amplitude difference between signal bands, because a uniform overall gain in the compensation will not introduce distortion to the desired signal. To evaluate the compensation error in a detailed and accurate manner, we normalize the relative error to balance the input and output signals. The normalized error is defined by Eq. (9).
where the normalized coefficient G (β, ƒ) is given in Eq. (10) to maintain the energy equilibrium for a signal between 0 and 6000 Hz. When N is large, the frequency band is fractionized enough, and the calculated normalized coefficient can approximate the ideal value.
The corresponding normalized error is presented in Figure 10. According to Figure 10, the normalized error ranges between -10% and 40%. However, most of these errors are concentrated in the ±10% range (corresponding to the error distribution in the high-density dark part of the figure), with only a few large errors. Therefore, most of the errors are very small. For a fixed weight parameter, Figure 10 also indicates that the error changes with frequency.
To evaluate the total error further, we use equation (11) to calculate the average error.
For different weight parameters, the corresponding average error (in red) is shown in Figure 11 (N is 50 in this paper).
For different weight values, the average error (in red) ranges between 0% and 7%, corresponding to the dipolar-based parameter τ = 0. For the other delay values, that is, the supercardioid-based (blue) and cardioid-based (green) parameters, analogous results are obtained, with average errors ranging from 0% to 7%. These errors are acceptable and sufficiently small for the CI application. For a set of fixed parameters, the system beam pattern is fixed. The proposed linear algorithm for the compensation of low frequency roll-off can approximate the ideal system response with small error. Additionally, this algorithm requires little computation, which suits real-time processing in the CI speech processor.
The broadband signal is divided into several sub-bands during signal processing of the CI speech strategy. Each sub-band includes the signal in its corresponding frequency band, in which the signal envelope information is extracted and transferred to the next process. Because the CI filter bank divides the signal into many bands, the divided signal in each band is approximately narrowband. The CI speech strategy can modulate the information in each band and send it to the electrode array that, with a specific stimulating rate, stimulates the auditory nerve to generate acoustical sensation. The stimulating rates in the electrode array correspond to the center frequencies of the filter bank. The proposed algorithm uses the center frequency f cen-i in each sub-band for the compensation. One of the center frequencies is applied as the reference frequency, f cen-ref . Because most speech signals are concentrated in the low-frequency band peaking around 1000 Hz, the reference frequency is chosen in the band near 1000 Hz. For compensation of low frequency roll-off in the i-channel, the corresponding center frequency f cen-i and reference frequency f cen-ref are used in the gain adjustment, as shown in Eq. (12).
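Eq. (12) is not reproduced in this text. As a hedged sketch of the idea, the code below assumes that the gain for channel i is the ratio of the linearized response at the reference frequency to the linearized response at that channel's center frequency, so that the reference channel is left unchanged. The center frequencies are those listed for the 8-channel filter bank; the choice of the 1005 Hz channel as reference follows the statement that the reference lies near 1000 Hz. The function names and the on-axis, τ = 0 defaults are illustrative assumptions.

```python
import cmath
import math

D = 0.01        # inter-microphone distance in metres
C = 340.0       # speed of sound in m/s
F_MAX = 6000.0  # right endpoint used for the linearization, in Hz
CENTER_FREQS_HZ = [274.0, 517.5, 761.0, 1005.0, 1462.0, 2376.0, 3869.0, 6276.0]

def exact_response(beta, f, tau=0.0, theta_deg=0.0):
    """Assumed exact delay-and-subtract magnitude response."""
    total_delay = tau + (D / C) * math.cos(math.radians(theta_deg))
    return abs(1.0 - beta * cmath.exp(-2j * math.pi * f * total_delay))

def linear_response(beta, f, tau=0.0, theta_deg=0.0):
    """Line through (0, 1 - beta) and (F_MAX, exact response at F_MAX)."""
    left = abs(1.0 - beta)
    right = exact_response(beta, F_MAX, tau, theta_deg)
    return left + (right - left) * f / F_MAX

def compensation_gains(beta, ref_index=3, tau=0.0, theta_deg=0.0):
    """Assumed per-channel gain: linear response at reference / at channel centre."""
    h_ref = linear_response(beta, CENTER_FREQS_HZ[ref_index], tau, theta_deg)
    return [h_ref / linear_response(beta, f, tau, theta_deg) for f in CENTER_FREQS_HZ]

if __name__ == "__main__":
    for f, g in zip(CENTER_FREQS_HZ, compensation_gains(0.5)):
        print(f"{f:7.1f} Hz: gain = {g:.3f}")
```

Under these assumptions, channels below the reference receive gains above 1 and channels above it receive gains below 1, which counteracts the roll-off. The highest center frequency (6276 Hz) lies slightly beyond the 6000 Hz endpoint, so the line is extrapolated there.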
The reference and i-channel center frequencies in this gain compensation equation are based on the linearization response function; therefore, the gain adjustment needs only a few additional calculations.
Compensation results in the dual-channel CI platform
The low frequency roll-off experiments are conducted on our dual-channel hardware platform. We add compensation gains to the filter bank of the CI speech strategy to adjust the amplitude for each channel. Recorded signals are modulated by the Continuous Interleaved Sampling (CIS) strategy [32]. We use sinusoidal modulation to realize this speech strategy. A set of sinusoidal signals is modulated by the corresponding envelopes of the band-pass signals after frame division in the filter bank. The original continuous spectrum then becomes several discrete frequency components (a line spectrum), each of which corresponds to a sub-filter and a CI electrode. The electrode array, based on the corresponding stimulating rate, stimulates the nerve to yield auditory perception. In this paper, the frequency components after CIS modulation match the center frequencies of the CI filter bank. The Welch method [33] is used to calculate the Power Spectral Density (PSD), to compare results with and without algorithm compensation, as shown in Figure 12.
The experiment is conducted in a quiet environment (the SNR is about 15 dB; the performance of the compensated beamformer at different noise levels is evaluated in the discussion section). For different weight parameters, Figure 12 describes the signal PSD (8 spectral lines, corresponding to the 8-channel filter bank) after CIS modulation of the CI device. The top panel presents the original CIS spectrum obtained by an omnidirectional microphone without signal distortion. Figure 12 shows the corresponding CIS spectra with (b) and without (a) algorithm compensation when the signal is recorded by the dual-channel array. Both CIS spectra are normalized to balance the overall energy with the original signal. Without compensation of the low frequency roll-off, the signal PSD distribution changes noticeably. The low-frequency amplitude is weakened, whereas the high-frequency amplitude is relatively strengthened for both weight parameters. A greater weight results in more obvious low frequency roll-off. Panel (b) shows that, after algorithm compensation, amplitudes in these 8 channels match the original CIS spectrum well. Therefore, the proposed adjustment algorithm can accurately compensate for the signal distortion in array beamforming.
For further comparison, Figure 12(c) shows the compensation results based on the ideal compensation coefficients from the theoretical system response. The ideal compensation uses the i-channel center frequency f cen-i and reference frequency f cen-ref to obtain the gain adjustment, given in Eq. (13).
Equation (13) uses the exact response amplitude, based on a set of fixed frequency and weight parameters, to calculate the ideal adjustment coefficients for the compensation of low frequency roll-off.
A comparison of panels (b) and (c) shows that the compensation results of our algorithm are very consistent with the ideal response-based gain adjustment. For detailed comparison, we calculate the corresponding 8-channel spectrum amplitudes for situations (a) without compensation, (b) with compensation by the proposed algorithm, and (c) with ideal compensation. The spectrum amplitudes are compared to the original signal spectrum amplitude (in dB) to obtain the relative enhancement or attenuation results (Figure 13). Figure 13 shows that low frequency roll-off is obvious when the array output signal is not adjusted (a). The low-frequency signal is weakened, ranging from 0 to -20 dB, and the high-frequency signal is relatively enhanced, ranging from 0 to +20 dB. Over the whole frequency band, the overall signal distortion is extremely large, between -20 and +20 dB. Panel (b) presents the compensation results obtained with our proposed algorithm. The signal distortion is very small, with amplitude differences between only -2 and 3 dB. When the signal is compensated by the ideal coefficients (c), the error ranges from -1 to 2.5 dB. Therefore, the compensation accuracy of our algorithm approaches that of the ideal response-based adjustment. The adjusted signal matches the desired CIS spectrum well, with little distortion.

Discussions
In the previous hardware experiments, the loudspeaker is 1.5 m from the hardware. A common rule of thumb is that the transition from spherical waves to plane waves for a monopole source occurs at a range of at least two wavelengths, so the corresponding minimal distances for the 8 channels of the CI filter bank are given in Table 2.
As Table 2 shows, signal acquisition in Channel 1 will be influenced, because the minimal loudspeaker-to-microphone distance for that channel is 2.48 m. However, the typical distance in practical usage and daily face-to-face communication for CI users is 1.5 m, so we placed the loudspeaker 1.5 m from the hardware platform. A distance of 1.5 m corresponds to 453 Hz; therefore, signals with frequencies higher than 453 Hz will not be influenced.
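The two-wavelength rule can be worked through directly. The short sketch below uses c = 340 m/s and the 8 center frequencies listed earlier, and reproduces the 2.48 m and 453 Hz figures quoted above; the function names are illustrative, and the values for the other channels are computed from the rule rather than copied from Table 2.

```python
C = 340.0  # speed of sound in m/s
CENTER_FREQS_HZ = [274.0, 517.5, 761.0, 1005.0, 1462.0, 2376.0, 3869.0, 6276.0]

def min_far_field_distance(f, c=C):
    """Two wavelengths at frequency f: 2 * c / f, in metres."""
    return 2.0 * c / f

def lowest_far_field_frequency(distance, c=C):
    """Lowest frequency for which `distance` is at least two wavelengths, in Hz."""
    return 2.0 * c / distance

if __name__ == "__main__":
    for channel, f in enumerate(CENTER_FREQS_HZ, start=1):
        print(f"Channel {channel} ({f} Hz): {min_far_field_distance(f):.2f} m")
    print(f"1.5 m corresponds to {lowest_far_field_frequency(1.5):.0f} Hz")
```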
Environmental noise differs from white noise in that it has more energy in the low-frequency band. In daily usage, car and fan noise (low-frequency signals) is picked up by the microphones. We use the hardware platform to record the environmental noise, shown in Figure 14.
As panel (b-1) shows, most environmental noise has its energy concentrated at low frequencies. Panel (b-2) presents the signal spectrum after first-order differential processing in the dual-microphone array. The corresponding low-frequency energy is sharply weakened; however, the amplitude is still very large. Therefore, low frequency roll-off is not always harmful, and boosting the low-frequency content might also increase the internal microphone noise. A balance between low-frequency distortion and the corresponding low-frequency noise needs to be considered. Finding this compromise requires practical testing of a set of suitable attenuation coefficients based on the omnidirectional microphone; these coefficients are then applied during array beamforming to obtain low distortion and less low-frequency noise. For different noise levels (SNRs of 10, 5, and 0 dB), the performance of the compensated beamformer is given in Figure 15. Figure 15 describes the signal PSD for different noise levels. As the noise increases and the SNR decreases, more low-frequency energy is added. In particular, in the case of SNR = 0 dB, the low-frequency signals in Channels 1 to 3 are noticeably enhanced. As these panels show, a lower SNR introduces more distortion in CI devices.

Conclusion
The microphone array noise-suppression method can separate the desired speech and ambient noise on the basis of their spatial differences. Use of a dual-channel array with appropriate size constraints is more suitable for CI devices. However, direct application of the narrowband method to broadband speech will yield low frequency roll-off and noticeable signal distortion. Low-frequency loss from the speech signal can be observed in first- and second-order differential systems [34]. To compensate for low frequency roll-off, conventional methods use only a simple low-pass filter to enhance the low-frequency signal and weaken the high-frequency signal. These methods are not sufficiently accurate to match the original signal. The broadband beamformer was recently introduced as a method to obtain precise compensation. However, these algorithms require extensive calculation, preventing their actual application in CI devices.
In our previous work, we constructed a microphone-array-based platform for signal acquisition. To suppress environmental noise, we used the delay-and-subtract method and proposed methods for selecting the optimal delay and beamforming parameters for CI speech enhancement [24]. In later work, we aimed to compensate for the low frequency roll-off in speech applications and proposed the normalized beamforming algorithm using a continuous interleaved sampling strategy [30]. However, that work included only a delay parameter, without a weight parameter. In this paper, we propose a novel CI filter bank-based algorithm for the compensation of low frequency roll-off. This method, with adjustable delay and weight parameters, uses a linear function to approximate the desired system response, with very low computational complexity. Theoretical and experimental results indicate that our algorithm can accurately compensate for the signal distortion and is easy to embed in the CI speech strategy, supporting its practical application in the CI device.
Abbreviations
CI: Cochlear implant; SNR: Signal-to-noise ratio; CIS: Continuous interleaved sampling strategy.

Competing interests
The authors declare that they have no competing interests.