Biomedical Engineering Online Algorithm for Multi-curve-fitting with Shared Parameters and a Possible Application in Evoked Compound Action Potential Measurements

Background: Experimental results are commonly fitted by determining parameter values of suitable mathematical expressions. In case a relation exists between different data sets, the accuracy of the parameters obtained can be increased by incorporating this relationship in the fitting process instead of fitting the recordings separately.


Background
It's very common to analyse a system by making measurements and trying to fit a mathematical equation (i.e. the model of the system) to the results. That way the system is described by the fitted parameters. If the system is complex and the equation has many (M) parameters a 1 to a M (described by the vector a), receiving usable values from only one measurement/fit is difficult. In this case, one possibility is to make more than one (N) measurement and alter some of the test-conditions which should appear in one or more parameters of the fittings. If the results of these measurements were fitted separately, the situation would not improve much. Here an algorithm is introduced that fits those N measurements simultaneously to N equations which may be (but need not) different and may share some of the parameters a m .

Example
We have N = 3 curves described by the equations y(x) = g i e -kx + c (i = 1 ... 3) which are degraded by added normally distributed noise to simulate the measurement process (left part of figure 1(a)). The task is to retrieve the parameters used to create the curves by only using the degraded values for 0 ≤ x < 100. When the three curves are fitted separately, one obtains three values of each parameter g, k and c ( figure 1(b)).
The fit method we are introducing here is able to fit these curves simultaneously and takes into account that the parameters k and c are shared. Therefore it returns only one value for k and one value for c ( figure 1(d)). The following equations were used for fitting: y 3 (x) = g 3 e -kx + c (1) y 2 (x) = g 2 e -kx + c (2) y 1 (x) = g 1 e -kx + c (3) In this example, the vector of the fit-parameters a was assigned as follows: Example task Figure 1 Example task. Three curves of the form y = g i e kx + c were created (a) with k = -0.01 and c = 50. Then the range between x = 0 and x = 99 was taken and noise was added. The task for the fit methods was to extract the original parameters using only this noisy part. It was done by (b) fitting each curve separately with y = + c i , (c) taking the introduced fit method with shared k (y = g i e kx + c i where k was determined to be -0.0137) and (d) with shared k and c (y = g i e kx + c where k and c resulted in -0.0107 and 55.2 respectively).
g e i k x i a 1 = k a 2 = c a 3 = g 1 a 4 = g 2 a 5 = g 3 Figure 1(c) shows the result when using this algorithm with shared k only.

Principle of the algorithm
The basic idea is the following: When fitting one curve to 1. The equation is independent of a m . Then the (in the following needed) partial derivation of the equation according to this parameter a m is zero.
2. a m is a "normal" fit-parameter, that should be optimised.
3. The equation depends on a m , but should not partake in the optimisation. This is done by setting the partial derivation for this parameter to zero.
The total sum of the goodness of the fit parameter χ 2 is defined as sum of the goodness of the fit parameters of each curve. The bigger χ 2 , the worse the fit:

Method for linear functions
If all equations y n (x, a) are linear in the parameters a, we can use the least square fit method [1] to minimise the total χ 2 . In this case, the equations y n (x, a) have to be of the following form: The advantages of the least square method over all nonlinear methods are very fast processing, final results in one step and no need to specify starting parameters.

Method for nonlinear functions
If the equations y n (x, a) depend on some of the parameters a in a non-linear way, the requirement to use the least square fit method is not met. In this case, one of the fastest methods to minimise χ 2 is the Marquardt-Method [2], an iterative numerical process. Here it is adapted to fit N curves simultaneously. We use the Marquardt method here, because some of the ideas that are described later require non-linear fitting. The following description can be considered a recipe. For the mathematical background of the Marquardt-Fit-Method, see [1,2].  2. Set λ to 10 -3 .λ is a constant factor that controls whether the Marquardt-Fit-Method should behave more as a gradient search fit method (λ <<> 1) or an expansion fit method (λ Ŭ 1).
3. Calculate the vector δa, which has the same size as a (namely M elements) and describes the suggested correction to a as follows: (a) The M element vector β represents the first partial derivations of χ 2 to the fit parameters described by a:  (e) Calculate δa as follows: 4. Derive new trial-fit-parameters a' from the "old" ones a: a' = a + δa (14) 5. Determine χ' 2 with the trial-fit-parameters a' using equations (4) and (5 and (5)).
This iterative calculation may be stopped when one of the three following conditions is met: • χ 2 falls below a certain predefined threshold.
• The difference between χ 2 and χ' 2 decreases to less than a specified value.
• The count of the calculation loops exceeds a maximum number.
Note: Using this method, one has to treat "constant" parameters (parameters that do not participate in the fitting process) in a special way (and not within a having derivations of zero), because otherwise their value may be biassed to minimise χ 2 . This could be done by an extra parameter for y n (x, a) (e.g. called b) or by incorporating the constant parameters within the functions y n (x, a).

Comparsion to single fits
The benefit of parameter-sharing is that all retrieved values (not only the shared ones) are more precise compared to fitting them separately. To verify this, 1000 fits of the simulated "measurement" from the example above had been made with the following three methods: single fits, simultaneous fits with shared k and simultaneous fits with shared k and c. For every fit, a "new" noise was used. The distributions of the parameters obtained with the three different methods are shown in figure 2: The smallest deviation from the real parameters were obtained by the fits, where k and c were shared.

Use with cochlear implant ECAP measurements
Cochlear implants are medical devices that enable deaf people to perceive hearing impressions by electrically stimulating the auditory nerve [3]. The stimulations occur through a multichannel electrode that is placed directly into the cochlea and has contacts on different positions. This way, up to 100% speech recognition can be achieved [4].
Modern cochlear implants can record the evoked compound action potential (ECAP) of the auditory nerve. This is done by measuring the voltage at one of the contacts of the multichannel electrode after a stimulation pulse that evokes the action potentials. The recordings obtained show the sum of the ECAP and exponential decays resulting from residual charges (arising from capacitive components on the implant itself or from the electrodeelectrolyte transition). There are several well established recording methods to eliminate or reduce this so called artefact to obtain the pure ECAP: Alternating stimulation [5], masker probe methods [6,7], triphasic stimulation [8] and scaled template methods [9]. However, each method has its limitations: Some of them rely on the linearity of the system (alternating stimulation), others need up to between nine to sixteen times more measurements for one result (masker probe methods).
Another issue is the influence of the used measurement system itself, which may be seen on the measured curves as offsets, drifts, additional exponential decays, ... (called system effects here).
The introduced method can be used as a tool to help separate the ECAP signal from these undesired components. One way to do so is to fit just the artefact (without the ECAP) along with the system effects and subtract this fit from the measurements. This can be done with subthreshold measurements, recovery measurements, or in regions, without an ECAP. Another possibility is to combine this method with an artefact cancellation method mentioned above, for example to eliminate system effects.
The algorithm needs sequences with different conditions to gain advantage over single-curve-fits. In practical use this is no disadvantage, because usually two special sequences are measured: The amplitude growth sequence and the recovery sequence. The amplitude growth sequence raises the stimulation pulse amplitude from zero to the maximum comfortable loudness (MCL) level. The recovery sequence sends two MCL level pulses before the measurement -if the second pulse lies within the auditory nerve recovery time of the first pulse (about < 1 ms), the ECAP vanishes. Example ECAP measurement Figure 3 shows the results of an ECAP measurement using the amplitude growth sequence with ten amplitudes between 0 and 800 cu (current units, 1 cu corresponds nominally to 1 μA). It was recorded with the MedEl PULSARCI 100 cochlear implant and the software ArtResearch. Biphasic pulses beginning with the cathodic phase were used. For each of the ten curves, 50 recordings were taken and averaged. We intentionally chose an ECAP measurement with a large artefact and with different offset voltages for each curve to demonstrate the power of the algorithm.

Amplitude growth measurement result
The following equation was used as "model" of the artefact plus offset for each curve. We assumed that the artefact can be described by two exponentially decreasing terms and one offset parameter.
The final offset c and the amplitudes f and g are different for each curve, but the time constants τ and T are assumed to be global, so that the following a was used. It contains 32 parameters. a = (τ, T, f 1 , g 1 , c 1 , f 2 , g 2 , c 2 , ..., f 10 , g 10 , c 10 ) (16) As mentioned above, several ideas exist to take advantage of the algorithm. Three of them will be described in the following subsections:

Fit of the regions without an ECAP signal
The first possibility is to exclude the regions of the curves where an ECAP signal is expected from the fit. In this example, we included the whole range of the five measurements from 0 to 355 cu stimulation amplitude that are sub-threshold (figure 3), and the right part of the other five measurements, where the ECAP signal (duration of about 1 ms [10]) has vanished.
The run time of the algorithm was about one minute (interpreted Matlab™ code with no attempts to speed it up), calculating the fitted values for the 32 parameters. As expected, straight lines were obtained for the fitted regions when subtracting the fit from the original data, indicating a tight fit ( figure 4). The returned values for the time constants were: τ = 322 μs and T = 138 μs. What we could not expect was that the fit was good for the regions that did not partake in the fitting process, because those regions of the fit where g 6 to g 10 (the factors for the fast exponential function) could be determined properly were excluded from the fit as described before.
These patient specific time constants describe mainly the electrical properties of the electrode-electrolyte interface [11] and the geometry, and could be taken as diagnostic parameters along with the ECAP amplitude or the residual voltage measured in telemetry measurements.
If the ECAP signal is small compared to the amplitude of the exponential decay, a good guess of f 6 to f 10  A remark on the starting parameters: The method we used to get the starting parameters was to fit the curves with a simplified model having only one exponential term (with shared time constant) and taking into account only the right halves of the curves. In this way we obtained values for the long time constant τ, f i and c i . We then subtracted the fits of this simplified model from the curves and fitted again, using only one exponential term with a shared time constant T and taking into account the whole curve. After that we got starting values for g i and T. These additional fits needed only a few seconds calculation time.

Using σ 2 to fit over all regions
Instead of first excluding the parts of the curves with an ECAP signal and then fitting over all regions where some of the previously determined parameters are kept constant, we could initially fit over all regions and use the uncertainty parameter σ 2 (see section "Principle of the algorithm") to characterise regions where the ECAP signal is expected: These regions should receive a high σ 2 value compared with the σ 2 value from the other regions, so that the fitting algorithm doesn't punish the ECAP caused deviations with a high contribution to the χ 2 value.
The more of the following conditions are satisfied, the better the results from this approach: • The ECAP amplitude is small compared to the amplitude of the artefact.
• There is a region before and after the ECAP-part of the curve that is described by the artefact model used.
• The ECAP signal is dc-free, i.e. the integral over the ECAP signal (without any artefact) is very small.
• The artefact time constants are larger than the ECAP time constants. Figure 6 shows the results of this variant when using σ 2 = 10 instead of 1 within the selected areas where the ECAP signal is expected. The retrieved values for τ and T were 328 μs and 176 μs.

Using an ECAP model to fit over all regions
A further approach is to model not only the artefact, but the ECAP signal too. [12] would be an example for a model that could be used. This model takes into account the double-peak shape of some ECAPs as well. Doing so is part of current investigation and may be published in the future.

Conclusion
The introduced fit-algorithm uses the additional information of the relation between measured curves to retrieve more accurate parameters compared to the parameters extracted from single fits. In the case of ECAP measurements the algorithm could be used as (additional) artefact cancellation method with the following benefits: • There are no additional measurements necessary to apply this method.
• The method can be used in combination with other artefact cancellation methods.
• It can take into account implant specific system effects.
• Physiological or implant specific parameters like time constants or the artefact amplitude are gained as values that can be used for diagnostic purposes.
• No additional noise is added.
• It is very flexible because of the possibilities to judge different regions of a curve and different curves in special ways with the σ 2 parameter (e.g. curves with zero pulse amplitude fit some of the parameters very accurately), leave out data which should not be included in the fit (e.g. Influence of σ 2 holes where the ECAP is expected), use different equations for different curves (e.g. combining results of amplitude growth and recovery sequences), and exclude parameters of some curves from the fit process (e.g. time constants in curves with ECAP).
• Constant parameters (time constant, offset, ...) can be reused (at least as starting parameters) to save calculation time.
The drawbacks are that the calculation can be time-consuming (especially with many fit parameters) and that there have to be suitable starting parameters. The first issue can be improved by reducing the resolution of the curves to receive a rough result, and optimising the stop condition for the calculation loop to fit the problem. The second issue can be dealt with by reusing parameters from earlier runs, or by determining them by doing rough prefits with less complicated equations.
Further investigation is necessary to develop a final method based on the introduced ideas. ECAP signals obtained with this method should be compared in size and shape with results of traditional artefact cancelation methods.

Authors' contributions
PS developed the basic method and wrote the draft of the publication. It was discussed, criticised, and partly rewritten to clarify the presentation together with EH and CZ. They had the idea of the comparison with the straight forward approach of fitting the curves separately by doing simulated measurements.