A finite element model for protein transport in vivo

Background: Biological mass transport processes determine the behavior and function of cells, regulate interactions between synthetic agents and recipient targets, and are key elements in the design and use of biosensors. Accurately predicting the outcomes of such processes is crucial to enhancing our understanding of how these systems function, to designing effective strategies to control their function, and to verifying that engineered solutions perform according to plan.

Methods: A Galerkin-based finite element model was developed and implemented to solve a system of two coupled partial differential equations governing biomolecule transport and reaction in live cells. The simulator was coupled, in the framework of an inverse modeling strategy, with an optimization algorithm and an experimental time series, obtained by the Fluorescence Recovery after Photobleaching (FRAP) technique, to estimate biomolecule mass transport and reaction rate parameters. In the inverse algorithm, an adaptive method was implemented to calculate the sensitivity matrix, and a multi-criteria termination rule was developed to stop the inverse code at the solution. The applicability of the model is illustrated by simulating the mobility and binding of GFP-tagged glucocorticoid receptor in the nucleoplasm of mouse adenocarcinoma cells.

Results: The numerical simulator shows excellent agreement with the analytic solutions and experimental FRAP data. Detailed residual analysis indicates that the residuals have zero mean and constant variance and are normally distributed and uncorrelated; therefore, the necessary and sufficient criteria for the least squares parameter optimization used in this study were met.

Conclusion: The developed strategy is an efficient approach to extract as much physicochemical information from the FRAP protocol as possible.
Well-posedness analysis of the inverse problem, however, indicates that the FRAP protocol provides insufficient information for unique simultaneous estimation of the diffusion coefficient and binding rate parameters. Care should be exercised in drawing inferences from FRAP data regarding concentrations of free and bound proteins, average binding and diffusion times, and protein mobility unless they are confirmed by long-range Markov chain Monte Carlo (MCMC) methods and experimental observations.


Background
Transport of mass, energy, and momentum has a crucial role in many branches of science and engineering. In biological systems, transport phenomena are central to the biological processes that take place in different parts of organisms. They determine the behavior and function of cells, tissues, and organs, and regulate interactions between synthetic agents (e.g. drugs) and recipient targets. These phenomena are crucial elements in the design and use of biosensors, high-density cell culture, filtration units for kidney dialysis, heart-lung bypass machines, and membrane oxygenators in human medical care, and of ion selective electrodes, pH-meters, electrical conductivity meters, and time domain reflectometry used in biosystem analysis. Transport processes are critical in the removal of toxins from the blood, remediation of impaired water bodies (sources of waterborne diseases), and bioremediation of contaminated landscapes [1,2].
Fluorescence Recovery after Photobleaching (FRAP) is one of the most widely used experimental protocols to study biological transport processes such as diffusion and reaction. FRAP is a straightforward technique used to monitor the movement of fluorescent molecules. These molecules can absorb light of one wavelength (blue, for instance) and emit light of another (e.g. green). However, if exposed to repeated cycles of excitation-emission, they lose their ability to emit fluorescence. This phenomenon is called "photobleaching" or "photochemical bleaching" [24]. In this technique a small region of a living cell containing Green Fluorescent Protein (GFP)-tagged protein is exposed to a brief but intense laser beam, produced by a laser scanning confocal microscope, to irreversibly inactivate fluorescence emission in that region. Before exposure to the light, the living cell is in equilibrium with a uniform population of fluorescent molecules [25]. Photobleaching creates two different populations of fluorescent molecules, which are spatially separated at the beginning of the experiment. Unbleached molecules from the undisturbed area move toward the bleached region and the rate of fluorescence recovery is measured as a function of time. The result is a noisy graph known as a recovery curve. Because of the noise, the original graph by itself is not suitable for quantitative study of the dynamics of living cells; the FRAP community therefore generally uses a processed, normalized average fluorescence recovery curve that has less noise. By analyzing the recovery curve, one can quantify how much fluorescence returns to the bleached area in comparison to the amount that was there before photobleaching. This is known as percent recovery. The other question that can be addressed is how fast the fluorophores move toward the bleached area, which is a measure of the free molecular diffusion coefficient of the bio-macromolecule under study.
One of the first attempts to estimate bio-macromolecule mass transport and binding rate parameters using in vivo information was carried out by Kaufmann and Jain [13]. Sprague et al. [25] developed a diffusion-reaction model to simulate the FRAP experiment, but the solution is in Laplace space and requires numerical inversion to return to real time. The model presented recently by Lele et al. [26] properly respects cell boundaries but is in the form of a Fourier-Bessel series and can suffer from the Gibbs phenomenon. Carrero et al. (2003, 2004) presented an excellent review of the effects of boundary conditions, the influence of the membrane, and the location of the photobleaching on the estimation of diffusion coefficients for diffusing biomolecules in a bounded domain [24,28]. They showed that overestimation or underestimation can result from ignoring this influence [24]. Beaudouin et al. [29] used a diffusion-reaction model to study the mobility of five chromatin-interacting proteins inside living cells. They found that transient interactions are common for chromatin proteins: individual proteins locally sample chromatin for binding sites rather than diffusing globally followed by binding at random nuclear positions. They concluded that complementary procedures are needed to measure transient biochemical interactions in living cells. Although experimental methods are representative of the biological system, they are expensive, tedious, and time-consuming. An alternative approach is to use mathematical modeling, and several sophisticated mathematical models have been developed to predict and simulate the fate and transport of drugs and bio-macromolecules in biological systems. However, the use of these models is not an easy task, since they contain numerous parameters that need to be determined before the model(s) can be used for the situation under consideration.
The success of model predictions depends largely on the proper representation of relevant processes, uncertainty in model parameters [31,32], and parameter identification which is a critical step in the modeling process. Difficulties in model calibration and parameter identification are quite common in modeling mass transport problems in biological systems [1,30].
The main objective of this paper is to develop a mass-lumped Galerkin-based finite element model (FEM) to solve a system of two partial differential equations governing protein transport and binding in living cells and couple it with the Osborne-Moré [33,34] extended version of the Levenberg-Marquardt [35,36] algorithm and an experimental data set, obtained by the Fluorescence Recovery after Photobleaching (FRAP) technique, to quantify the bio-macromolecule diffusion coefficient and binding rate parameters. The applicability of the developed FEM-based inverse modeling strategy is illustrated by simulating the mobility and binding of GFP-tagged glucocorticoid receptor in the nucleoplasm of mouse adenocarcinoma.

Direct problem
A one-site-mobile-immobile model was selected as the direct (forward) problem to describe bio-macromolecular diffusion inside a living cell [1,30]. Since a circular bleach spot was used to photobleach the cell and to track the fluorescently tagged biomolecules inside the bleach spot during the time course of a FRAP experiment, the system was written in a cylindrical coordinate system, and the binding reaction was modeled by primary rate kinetics or a single-binding-site model [1,13,25,26,30]:

$$\frac{\partial F}{\partial t} = D_F \frac{1}{\rho}\frac{\partial}{\partial \rho}\left(\rho \frac{\partial F}{\partial \rho}\right) - K_a^* F + K_d C, \qquad \frac{\partial C}{\partial t} = K_a^* F - K_d C \qquad (1)$$

subject to the following initial and Neumann boundary conditions:

$$F(\rho,0) = 0,\; C(\rho,0) = 0 \quad \text{for } 0 \le \rho \le w$$
$$F(\rho,0) = F_{eq},\; C(\rho,0) = C_{eq} \quad \text{for } w < \rho \le R$$
$$\left.\frac{\partial F}{\partial \rho}\right|_{\rho=0} = \left.\frac{\partial F}{\partial \rho}\right|_{\rho=R} = 0$$

where F is the concentration of free biomolecule, S is the concentration of vacant binding sites, C is the concentration of the bound complex (C = FS), D_F is the molecular diffusion coefficient (L² T⁻¹) of the free biomolecule, K_a is the free biomolecule-vacant binding site association rate coefficient (T⁻¹), K_d is the dissociation rate coefficient (T⁻¹), K_a* = K_a S is the pseudo-association rate coefficient, ρ is the radial coordinate (L) in the cylindrical coordinate system, w is the radius of the bleached area, R is the length of the spatial domain, and F_eq and C_eq are the equilibrium concentrations of F and C [25,30]:

$$F_{eq} = \frac{K_d}{K_a^* + K_d}, \qquad C_{eq} = \frac{K_a^*}{K_a^* + K_d} \qquad (2)$$

The initial condition implies that photochemical bleaching inactivates the fluorescence tag on the biomolecules in the bleached area but does not change the concentrations of free and bound biomolecules or vacant binding sites.
The boundary conditions imply that the diffusive biomolecule flux is zero at the center of the bleach spot and at the cell or nucleus membrane during the time course of a FRAP experiment [30].
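To illustrate how system (1) behaves, the coupled diffusion-reaction equations can be integrated with a simple explicit finite-volume scheme on a radial mesh. This is a minimal sketch, not the paper's solver; all parameter values are illustrative assumptions, not the estimates reported later.

```python
import numpy as np

# Explicit finite-volume sketch of the one-site-mobile-immobile model
# (system 1) in radial coordinates; parameter values are illustrative.
D_F, Ka_star, Kd = 10.0, 0.5, 1.0      # diffusion, pseudo-association, dissociation
R, w, Ncell = 10.0, 1.0, 200           # domain radius, bleach radius, cells
drho = R / Ncell
rho = (np.arange(Ncell) + 0.5) * drho  # cell centres (avoids rho = 0)

F_eq = Kd / (Ka_star + Kd)             # equilibrium free fraction
C_eq = Ka_star / (Ka_star + Kd)        # equilibrium bound fraction
F = np.where(rho <= w, 0.0, F_eq)      # bleach spot: fluorescence inactivated
C = np.where(rho <= w, 0.0, C_eq)

spot = rho <= w
total0 = np.sum((F + C) * rho)         # total fluorescence (conserved)
dt = 0.2 * drho**2 / D_F               # explicit stability limit
face = np.arange(1, Ncell) * drho      # interior face radii
recovery = []
for _ in range(4000):
    flux = D_F * face * (F[1:] - F[:-1]) / drho   # rho * D * dF/drho at faces
    div = np.zeros(Ncell)
    div[:-1] += flux / (rho[:-1] * drho)          # conservative divergence;
    div[1:] -= flux / (rho[1:] * drho)            # zero flux at rho = 0 and R
    react = Ka_star * F - Kd * C
    F = F + dt * (div - react)
    C = C + dt * react
    # normalized average fluorescence inside the bleach spot
    recovery.append(np.sum((F + C)[spot] * rho[spot]) / np.sum(rho[spot]))
```

The `recovery` list plays the role of a simulated FRAP curve: free fluorescence returns on the diffusion time scale w²/D_F, bound fluorescence on the reaction time scale 1/K_d, and the conservative flux form preserves total fluorescence under the zero-flux boundary conditions.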
The forward problem was solved by the finite element method. In the weak formulation, the dependent variables F and C were approximated using piecewise linear approximating functions [1,31,37-39]:

$$F(\rho,t) \approx \sum_{j=1}^{N} F_j(t)\,\varphi_j(\rho), \qquad C(\rho,t) \approx \sum_{j=1}^{N} C_j(t)\,\varphi_j(\rho) \qquad (3)$$

in which N is the total number of nodes in the spatial domain, φ_j(ρ) are the selected linear basis functions, and F_j(t) and C_j(t) are the associated time-dependent unknown coefficients that represent the solution of Equation (1) at nodes within the domain. The approximations (3) will not satisfy the partial differential equation exactly and hence will produce a residual. The goal of the finite element approximation is to minimize this error. This can be accomplished by introducing the weight function, φ_i(ρ), and setting the integral of the weighted residuals to zero. In the Galerkin method, which was used in this study, the weighting functions are chosen to be identical to the basis functions [1,31].

Substitution of expressions (3) into Equation (1) and application of the weighted-residual statement yields:

$$\int_\Omega \varphi_i \left[\frac{\partial F}{\partial t} - D_F \frac{1}{\rho}\frac{\partial}{\partial \rho}\left(\rho\frac{\partial F}{\partial \rho}\right) + K_a^* F - K_d C\right] \rho\, d\rho = 0 \qquad (4)$$

where Ω is the study domain. Applying Green's first identity [37,39] to equation (4) yields:

$$\sum_e \int_{\Omega_e}\left[\varphi_i \frac{\partial F}{\partial t} + D_F \frac{\partial \varphi_i}{\partial \rho}\frac{\partial F}{\partial \rho} + \varphi_i\left(K_a^* F - K_d C\right)\right]\rho\, d\rho - \sum_e q_{fn} = 0$$
$$\sum_e \int_{\Omega_e} \varphi_i\left[\frac{\partial C}{\partial t} - K_a^* F + K_d C\right]\rho\, d\rho - \sum_e q_{cn} = 0 \qquad (5)$$

in which Ω_e is the domain of an element and q_fn and q_cn are the fluxes of the free and bound bio-macromolecules across the boundary out of the element, respectively.
The time derivatives in equation (5) were approximated using an Euler time-marching algorithm (backward finite difference scheme). Inserting equation (3) into equation (5) and integrating over the elements produces a system of time-dependent ordinary differential equations, which can be formulated in matrix form:

$$[M]\left\{\frac{dF}{dt}\right\} + \left(D_F[K] + K_a^*[M]\right)\{F\} - K_d[M]\{C\} = \{Q_f\}, \qquad [M]\left\{\frac{dC}{dt}\right\} - K_a^*[M]\{F\} + K_d[M]\{C\} = \{Q_c\} \qquad (6)$$

where [M] is the (lumped) mass matrix, [K] is the stiffness matrix, and {Q_f} and {Q_c} are boundary flux vectors. The element integrals were evaluated with the radial weight ρ approximated by ρ_e, the radial position of the centroid of element e.
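The element-level assembly with linear basis functions, the centroid approximation ρ ≈ ρ_e, and row-sum mass lumping can be sketched as follows; the mesh and coefficient values are illustrative, not the paper's discretization.

```python
import numpy as np

# Assembly of the lumped mass matrix and diffusion stiffness matrix for
# linear (hat) basis functions on a radial mesh, with the radial weight
# rho approximated by the element centroid rho_e; illustrative sketch.
def assemble(rho_nodes, D_F):
    n = len(rho_nodes)
    M = np.zeros(n)                 # lumped mass matrix, stored as a diagonal
    K = np.zeros((n, n))            # stiffness (diffusion) matrix
    for e in range(n - 1):
        h = rho_nodes[e + 1] - rho_nodes[e]
        rho_e = 0.5 * (rho_nodes[e] + rho_nodes[e + 1])   # element centroid
        # consistent element mass rho_e*h*[[1/3,1/6],[1/6,1/3]],
        # lumped by row sums to rho_e*h/2 per node
        M[e] += rho_e * h / 2
        M[e + 1] += rho_e * h / 2
        # element stiffness for linear basis: D_F * rho_e / h * [[1,-1],[-1,1]]
        ke = D_F * rho_e / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
        K[e:e + 2, e:e + 2] += ke
    return M, K

nodes = np.linspace(0.0, 10.0, 101)
M, K = assemble(nodes, D_F=10.0)
```

Two sanity checks follow directly from the construction: each row of K sums to zero, so a spatially constant field produces no diffusive flux, and the lumped masses sum to the midpoint-rule value of ∫₀^R ρ dρ = R²/2.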

Parameter optimization
The inverse problem was formulated as a nonlinear optimization problem in which the model parameters p = [D_F, K_a*, K_d] were optimized by minimizing a penalty function representing the discrepancy between the observed and predicted average fluorescence intensity recovery time series inside the bleach spot [1]. If the measurement errors asymptotically follow a multivariate normal distribution with zero mean and covariance matrix V, the likelihood function can be formulated as [40]:

$$L(p) = (2\pi)^{-N/2}\,|V|^{-1/2} \exp\left[-\tfrac{1}{2}\left(U^* - U\right)^T V^{-1}\left(U^* - U\right)\right] \qquad (7)$$

where L(p) is the likelihood function, N is the number of observations, p is the vector of the parameters being optimized, U* is a vector and/or matrix of observations, and U is a corresponding vector and/or matrix of model predictions as a function of the parameters being optimized. This vector is obtained by solving the forward problem. The maximum likelihood estimator consists of those values of the unknown parameters that maximize the likelihood function [40]. Since the logarithm is a monotonically increasing function of its argument (the value of p that maximizes L(p) also maximizes ln L(p)), and because ln L(p) is simpler and much easier to use than L(p) itself, ln L(p) is usually used in optimization:

$$\ln L(p) = -\frac{N}{2}\ln(2\pi) - \frac{1}{2}\ln|V| - \frac{1}{2}\left(U^* - U\right)^T V^{-1}\left(U^* - U\right) \qquad (8)$$

The maximum of the likelihood function must satisfy the set of equations ∂ ln L(p)/∂p = 0. When the error covariance matrix is known, maximization of equation (8) is equivalent to the minimization of the following weighted least squares problem [1]:

$$\phi(p) = \left(U^* - U\right)^T V^{-1}\left(U^* - U\right) \qquad (9)$$

where φ(p) is the objective or penalty function. If there is prior information about the values and distributions of the parameters, it can be incorporated in the objective function as well [1]:

$$\phi(p) = \left(U^* - U\right)^T V^{-1}\left(U^* - U\right) + \left(p^* - p\right)^T V_p^{-1}\left(p^* - p\right) \qquad (10)$$

in which p* is the parameter vector containing a priori information, p is the corresponding predicted parameter vector, and V_p is the covariance matrix for the parameter vector. This kind of optimization is known as Bayesian estimation.
The second term in equation (10), which is sometimes called the plausibility criterion [1] insures that the optimized values of the parameters remain in some feasible region around p*. Matrices V and V p , which are sometimes called weighting matrices, provide information about the measurement accuracy as well as any possible correlation between measurement errors and between parameters.
BioMedical Engineering OnLine 2007, 6:24 http://www.biomedical-engineering-online.com/content/6/1/24

A limitation of equation (10) is that the error covariance matrix is generally not known. A common approach to overcoming this problem is to make a priori assumptions about the structure of the error covariance matrix. In the absence of any additional information regarding the accuracy of the input data, the simplest and most recommended approach is to assume that the observation errors are uncorrelated, which implies setting V equal to the identity matrix and V_p to zero. In this case the optimization problem collapses to the well known ordinary least squares formulation [41,42]:

$$\phi(p) = r^T r \qquad (11)$$

where r is the residual (differences between the observed and predicted state variable) column vector.
Minimization of equation (11) was carried out iteratively, starting with an initial guess of the parameter vector, p^(0), and updating it at each iteration until the termination criteria were met [1,31,32]:

$$p^{(k+1)} = p^{(k)} + \alpha^{(k)}\,\Delta p^{(k)} \qquad (12)$$

where α^(k) is a scalar step length and Δp^(k) is the direction of the search or step direction [41].
Using QR decomposition [43], the linear least squares problem below, which is the Osborne-Moré extended version of the Levenberg-Marquardt algorithm, was solved to obtain the search direction in each iteration:

$$\left(J^T J + \lambda D^T D\right)\Delta p^{(k)} = -J^T r \qquad (13)$$

where λ is a positive scalar known as Marquardt's parameter or the Lagrange multiplier [36], J is the Jacobian or sensitivity matrix, and D is a positive definite scaling matrix that ensures the descent property of the algorithm even if the initial guess is not "smart". For non-zero values of λ, the Hessian approximation is always a positive definite matrix, which ensures the descent property of the algorithm [41].
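A minimal sketch of solving the Levenberg-Marquardt step by QR factorization of an augmented matrix, rather than forming JᵀJ explicitly; the Jacobian, residual, and scaling matrix below are illustrative stand-ins, not the paper's values.

```python
import numpy as np

# Levenberg-Marquardt step via QR on the augmented least squares system
#   minimize || [J; sqrt(lam)*D] dp + [r; 0] ||,
# algebraically equivalent to (J'J + lam*D'D) dp = -J'r.
rng = np.random.default_rng(0)
J = rng.standard_normal((43, 3))          # sensitivity matrix: 43 obs x 3 params
r = rng.standard_normal(43)               # residual vector
lam = 1.0                                 # Marquardt parameter
D = np.diag(np.linalg.norm(J, axis=0))    # positive definite column scaling

A = np.vstack([J, np.sqrt(lam) * D])      # augmented matrix
b = np.concatenate([-r, np.zeros(3)])
Q, R = np.linalg.qr(A)                    # reduced QR factorization
dp = np.linalg.solve(R, Q.T @ b)          # search direction
```

Working with the augmented matrix keeps the conditioning of the solve at κ(A) rather than the κ(A)² of the explicit normal equations, and for any λ > 0 the resulting step is a descent direction for ½‖r‖².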
A combination of "one-sided" and "two-sided" finite difference methods [30-32] was used to calculate the partial derivatives of the state variable with respect to the model parameters and to construct the Jacobian matrix. The one-sided finite difference method estimates the partial derivatives by solving the forward problem (Eq. 1) p+1 times (p is the number of parameters to be estimated), while the two-sided method requires 2p+1 forward solves. At the early stages of the optimization, where the search is far from the solution, the one-sided finite difference scheme, which is computationally cheap but not as accurate as the two-sided approach, was used. As the optimization proceeds in the descent direction, the algorithm switches to the more accurate but computationally expensive two-sided scheme. The switch was made when ϕ(p) ≤ 1 × 10⁻².
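The adaptive scheme can be sketched as follows, with a generic residual function standing in for the forward solver; the function names and default step sizes are illustrative, though the switch threshold mirrors the one stated above.

```python
import numpy as np

# Adaptive finite-difference Jacobian: one-sided differences (p+1 forward
# solves) while the objective phi is large, two-sided differences (2p+1
# solves) once phi drops below the switch threshold.
def jacobian(residual, p, phi, switch=1e-2, eps=1e-6):
    r0 = residual(p)
    J = np.empty((r0.size, p.size))
    for j in range(p.size):
        h = eps * max(abs(p[j]), 1.0)          # relative perturbation
        p_plus = p.copy(); p_plus[j] += h
        if phi > switch:                        # cheap, O(h) accurate
            J[:, j] = (residual(p_plus) - r0) / h
        else:                                   # expensive, O(h^2) accurate
            p_minus = p.copy(); p_minus[j] -= h
            J[:, j] = (residual(p_plus) - residual(p_minus)) / (2 * h)
    return J
```

With a smooth residual, both branches converge to the analytic Jacobian; the two-sided branch is exact (up to roundoff) for residuals that are quadratic in the parameters.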
A detailed description of the procedure to update the Jacobian matrix is presented in [1,30].
In order to update λ at each iteration, the optimization starts with an initial parameter vector and a large λ (λ = 1).
As long as the objective function decreases in each iteration, the value of λ is reduced. Otherwise, it is increased. The approach avoids calculation of λ and step length in each iteration and is therefore computationally cheap. A detailed description of the code for updating λ is given in [31].
Finally, a multi-criteria stopping rule, based on the change in the objective function, the change in the parameter vector, and the norm of the gradient between successive iterations, was used to end the search [1,31,32]. The accuracy of the optimization was assessed by goodness-of-fit analysis. The Root Mean Squared Error (RMSE) and Coefficient of Determination (R²) were calculated for every set of optimized parameters [44,45]:

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(I_i - \hat{I}_i\right)^2} \qquad (14)$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(I_i - \hat{I}_i\right)^2}{\sum_{i=1}^{N}\left(I_i - \bar{I}\right)^2} \qquad (15)$$

where I_i and Î_i are the observed and predicted total normalized average fluorescence intensity, F + C, inside the bleached area during the time course of the FRAP experiment, respectively, and N is the number of observations in the FRAP time series. The developed inverse modeling strategy was then used to quantify mass transport and binding rate parameters of GFP-tagged glucocorticoid receptor. The numerical solution was first validated against the analytic solution of [25]; the comparison is shown in Figure 1c, which presents excellent agreement between the analytic and numerical solutions. The same time range was used to perform the comparison. As Figure 1d indicates, there is excellent agreement between the two solutions in simulating the average normalized fluorescence intensity within the bleach spot during the time course of the FRAP experiment.
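Equations (14) and (15) amount to a few lines of code; the observed and predicted recovery values below are illustrative, not the paper's data.

```python
import numpy as np

# Goodness-of-fit indices of equations (14)-(15).
def rmse(obs, pred):
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def r_squared(obs, pred):
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - np.mean(obs)) ** 2)
    return float(1.0 - ss_res / ss_tot)

obs = np.array([0.0, 0.35, 0.55, 0.68, 0.75, 0.79])    # illustrative recovery data
pred = np.array([0.02, 0.33, 0.56, 0.66, 0.76, 0.78])  # illustrative model fit
```

A perfect fit gives RMSE = 0 and R² = 1, while predicting the observation mean everywhere gives R² = 0, which is why R² alone cannot distinguish between the near-identical fits reported in Table 1.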

Model calibration
The developed methodology was then used to estimate the diffusion coefficient and binding rate parameters of GFP-tagged glucocorticoid receptor (GFP-GR). The results are given in Table 1 and Figure 2 (the experimental FRAP time series data are from McNally, personal communication). The Root Mean Squared Error (RMSE) and Coefficient of Determination (R²) were calculated, using equations (14) and (15), for every set of optimized parameters and are presented in the last two columns of Table 1. The values for the diffusion coefficient, binding rate parameters, and corresponding indices estimated by [25] are given as the first run in Table 1 and Figure 2 for the sake of comparison. Table 1 and Figure 2 indicate that many combinations of the three parameters can produce essentially the same error level (RMSE) and yield equally excellent fits. The values obtained by [25] represent only one of the possible solutions. In other words, the inverse problem is not well-posed and has no unique solution. Therefore, one may conclude that the Fluorescence Recovery after Photobleaching technique, though useful in studying the dynamics of biological systems, provides insufficient information to uniquely estimate the free diffusion coefficient and binding rate parameters of a biomolecule simultaneously.

Figure 1: Validation of the numerical model (dots) with the analytic solution (solid lines) of [25] is depicted in (d). The graph presents the average normalized fluorescent intensity, obtained by equation (16), inside the bleach spot.
To illustrate the non-uniqueness of the inversion, we plotted the optimized parameter values in three-dimensional parameter space (Fig. 3). The plot clearly shows that different combinations of the model parameters can lead to the same penalty (objective) function value or error level.
The plot also suggests a potential linear correlation between the bio-macromolecular diffusion coefficient and the pseudo-association rate parameter. The first set of optimized parameters, obtained by [25], shows that 86% of the GFP-GR is bound to DNA, nuclear matrix, or unknown binding sites and only 14% is free. Our analysis, however, suggests that using FRAP, one cannot conclude how much of the bio-macromolecule under study is free and how much is bound. As Table 1 shows, the concentration of free GFP-GR ranges from zero to 100 per cent. The same is true for the concentration of the bound complex. For example, referring to the results obtained in run 11, one may conclude that 100 per cent of the GFP-GR is free, while the results of run 6 show that all of it is bound. Note that both parameter sets produce excellent fits with the same RMSE and coefficient of determination (see Figure 2).

In this study, the choice of a numerical approach rather than an analytic solution, and of a finite element approach rather than a finite difference scheme, was made so that the parameter estimation could be readily extended to arbitrary initial and boundary conditions and complex domain geometry, and especially so that it could be extended to a ternary system of coupled nonlinear partial differential equations governing transport of free bio-macromolecule, bound complex, and vacant binding sites, where all three entities are mobile species and it is impossible to obtain an exact solution for the system of equations.
So far most FRAP studies have assumed an infinite domain to specify boundary conditions and to solve the system of partial differential equations governing biomolecule transport in vivo [13,25,29,30, among many others]. This assumption is unrealistic in the context of living cells. In this study, we address this shortcoming by specifying a finite domain and by formulating Neumann boundary conditions on the cell membrane.

Residual analysis
The inverse methodology used in this study is based upon the following assumptions: 1) the residuals have a mean of zero, 2) the residuals have constant variance, 3) the residuals are uncorrelated, and 4) the residuals are normally distributed. When these assumptions are met, the parameter estimates possess optimal statistical properties [40-42]; when they are not met, the optimization may no longer produce optimal parameter estimates. Residuals, or errors, in parameter optimization are defined as the differences between the observed and simulated state variable(s). An analysis of the residuals is a key technique for studying possible trends, oscillations, and correlation of errors, and for validating the assumptions on which the inverse modeling strategy rests.
To analyze the residuals, they were plotted against the average normalized fluorescence intensity, I(t), within the bleach spot during the time course of the FRAP experiment. Since the residuals form a time and/or space series, their possible correlation was thoroughly analyzed. Different statistical measures, such as error frequency analysis, normal probability plots, and hypothesis tests, were explored to draw conclusions about the residuals. The Student's t-test was used to test whether the residuals have a mean of zero. Bartlett's test [47] was applied to determine whether the residuals have constant variance. To test the normality of the residuals, the chi-square and Kolmogorov-Smirnov one-sample tests were employed. Finally, the t-statistic [48] was used to test whether the residuals are correlated.
The basic assumption in the hypothesis test on the residuals' mean is that the data come from a normally distributed population with unknown variance. In this study, the following null and alternative hypotheses were formulated:

$$H_0: \mu = 0, \qquad H_1: \mu \ne 0$$

To perform the test, the following t-statistic was used:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{N}}$$

in which x̄, s, and N are the mean, standard deviation, and size of the sample (errors), respectively, and μ₀ is the mean of the population, which is zero.
For -t α/2 <t <t α/2 the null hypothesis (mean is zero) cannot be rejected at the significance level of α. The rejection regions t < -t α/2 or t α/2 <t indicate that the null hypothesis can be rejected at the level of significance α.
The mean and standard deviation of the residuals were -0.0029 and 0.0234 with sample size n = 43. The t-statistic was calculated as:

$$t = \frac{-0.0029 - 0}{0.0234/\sqrt{43}} \approx -0.81$$

For 42 degrees of freedom, the tabled t-values for different levels of significance are given in Table 2. The calculated t-statistic was then compared with the tabled t-values at different levels of significance and the results are summarized in Table 2. As the Table indicates, the null hypothesis (the mean of the residuals is zero) cannot be rejected even at the 20 per cent level of significance. The possibility of committing a type I error is extremely slim.
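The arithmetic, using the sample statistics just quoted:

```python
import math

# One-sample t-statistic for the residual mean, using the sample
# statistics reported in the text.
mean, s, n = -0.0029, 0.0234, 43
t = (mean - 0.0) / (s / math.sqrt(n))
```

|t| ≈ 0.81 is well inside the acceptance region for 42 degrees of freedom at any conventional significance level, consistent with the conclusion drawn from Table 2.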
Bartlett's test statistic was used to verify the equality of variances across sub-groups of a sample against the alternative that the variances are not constant. Equal variance across samples is called homogeneity of variances and is assumed in several statistical procedures, such as analysis of variance and nonlinear least squares optimization, which assume that the errors have constant variance [47]:

$$T = \frac{(N-k)\ln s_p^2 - \sum_{i=1}^{k}(N_i - 1)\ln s_i^2}{1 + \frac{1}{3(k-1)}\left(\sum_{i=1}^{k}\frac{1}{N_i - 1} - \frac{1}{N-k}\right)}$$

where s_i² is the variance of the i-th subgroup, N_i is the sample size of the i-th subgroup, N is the total sample size, k is the number of subgroups, and s_p² is the pooled variance, a weighted average of the subgroup variances:

$$s_p^2 = \frac{\sum_{i=1}^{k}(N_i - 1)\,s_i^2}{N - k}$$

The rejection region is T > χ²(α, k-1), the upper critical value of the chi-square distribution with k - 1 degrees of freedom at the level of significance α.
The following null and alternative hypotheses were formulated:

$$H_0: \sigma_1^2 = \sigma_2^2 = \sigma_3^2, \qquad H_1: \text{the variances are not all equal}$$

To verify whether the residuals have constant variance, they were divided into sections. One of the possible solutions in Table 1 was chosen and the residual plot (Fig. 4) was divided into three regions. The variance in each region was calculated and the variances were compared using Bartlett's test. The residuals were divided into three groups as:

s₁² = S²(r(1:5)) = 9.2651 × 10⁻⁴
s₂² = S²(r(6:24)) = 9.3655 × 10⁻⁴
s₃² = S²(r(25:33)) = 9.3126 × 10⁻⁵

The pooled weighted variance was then calculated.

The following null and alternative hypotheses were used to test possible correlation among the residuals:

$$H_0: \rho = 0, \qquad H_1: \rho \ne 0$$

where ρ is the correlation coefficient in the population.
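A generic implementation of the statistic defined above, exercised here on synthetic subgroups (not the paper's residuals) whose variances are identical by construction, for which T must vanish:

```python
import math

# Bartlett's test statistic for homogeneity of variances across k
# subgroups; the groups below are synthetic, with equal variances,
# so the statistic must be (numerically) zero.
def bartlett_T(groups):
    k = len(groups)
    sizes = [len(g) for g in groups]
    N = sum(sizes)
    means = [sum(g) / len(g) for g in groups]
    var = [sum((x - m) ** 2 for x in g) / (len(g) - 1)
           for g, m in zip(groups, means)]          # subgroup variances s_i^2
    sp2 = sum((n_i - 1) * v for n_i, v in zip(sizes, var)) / (N - k)  # pooled
    num = (N - k) * math.log(sp2) - sum(
        (n_i - 1) * math.log(v) for n_i, v in zip(sizes, var))
    corr = 1.0 + (sum(1.0 / (n_i - 1) for n_i in sizes) - 1.0 / (N - k)) / (3.0 * (k - 1))
    return num / corr

groups = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [10, 11, 12, 13, 14]]
T = bartlett_T(groups)
```

By Jensen's inequality T is always non-negative, and it grows as the subgroup variances spread apart, which is what makes the one-sided chi-square rejection region appropriate.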
For n > 2 these hypotheses can be tested using the following t-statistic [48]:

$$t = \frac{r_s\sqrt{n-2}}{\sqrt{1 - r_s^2}}$$

in which r_s is the correlation coefficient in the sample.
The null hypothesis (correlation coefficient is zero) is rejected when the absolute value of the t-statistic is greater than the critical t-value (t < -t α/2 or t α/2 <t) at the level of significance α.
The residuals were first divided into two sub-groups. The sample correlation coefficient (r_s = 0.1931) was then calculated and used to obtain the t-statistic (with sample size n = 42), giving t ≈ 1.24. This t-statistic was then compared with the tabled t-values at different levels of significance and the results are presented in Table 3. As the Table indicates, the null hypothesis (the residuals are uncorrelated) cannot be rejected even at the 20 per cent level of significance; therefore, we did not perform an autocorrelation/serial correlation analysis.
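The arithmetic, with the sample values just quoted:

```python
import math

# t-statistic for testing zero correlation among the residuals,
# using the sample values reported in the text.
r_s, n = 0.1931, 42
t = r_s * math.sqrt(n - 2) / math.sqrt(1.0 - r_s ** 2)
```

t ≈ 1.24 with 40 degrees of freedom falls short of the two-sided critical value even at the 20 per cent level, so the no-correlation hypothesis stands.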
One of the assumptions of least squares theory, which was applied in this study, is the normality of the residuals; that is, it is assumed that the errors are normally distributed. To analyze the normality of the errors, two qualitative and two quantitative methods were used: 1) error frequency analysis and normal probability plots, and 2) hypothesis tests on the normality of the residuals, including the chi-square goodness-of-fit test, which is based on the differences between the observed (o_i) and expected (e_i) error frequencies [49]:

$$\chi^2 = \sum_{i=1}^{k}\frac{(o_i - e_i)^2}{e_i}$$

where k is the number of intervals or cells.
Error frequency analysis was first performed by constructing a histogram of the residuals, presented in Figure 5. The figure shows that the errors are approximately normally distributed. This was confirmed by analysis of the normal probability plot (Fig. 6) and by the chi-square hypothesis test on the normality of the random variable.
Residual frequency analysis and normal probability plots, though useful in revealing the underlying probability distribution, are only qualitative means of assessing the normality of a random variable. To verify the normality of the errors, the chi-square test was used. To perform the test, the residuals were first grouped into cells (a histogram), and the number of residuals in each cell was counted to give the observed frequencies o_i. Then, using the upper limit of each cell, the mean, the standard deviation, and the cumulative normal distribution, the expected frequencies e_i were calculated. Cells were merged when the expected error frequencies were less than 5. The χ² index was then calculated and compared with the critical chi-square value [50]. This information was used in the hypothesis test. The null and alternative hypotheses were formulated as:

$$H_0: \text{the residuals follow a normal distribution with mean } \mu \text{ and standard deviation } \sigma, \qquad H_1: \text{they do not}$$

where μ and σ are the mean and the standard deviation of the residuals in the population.
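The χ² index itself is a one-liner once the observed and expected cell frequencies are in hand; the frequencies below are illustrative, not the paper's.

```python
# Chi-square goodness-of-fit index from observed (o_i) and expected
# (e_i) cell frequencies; illustrative counts, not the paper's data.
obs = [6, 10, 14, 9, 6]               # observed counts per cell
exp = [5.8, 10.5, 13.2, 9.7, 5.8]     # expected counts from the normal CDF
chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
```

The computed index is then compared with the upper chi-square critical value, with degrees of freedom reduced by one for each distribution parameter estimated from the sample.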
Since the calculated χ² (3.6190) is less than the tabled value (χ² = 4.61), the null hypothesis (the residuals are normally distributed) cannot be rejected even at the 20 per cent level of significance.
The chi-square test is a powerful test when the sample size is large. However, combining cells when the expected error frequencies are less than five loses information and hence decreases the power of the test; furthermore, for very small samples this test is not applicable [49]. To overcome these limitations, the Kolmogorov-Smirnov one-sample test is usually used, since it treats each observation separately and does not lose information through the merging of categories. This test is more powerful than the chi-square test when the sample size is not large.
The Kolmogorov-Smirnov one-sample test was used to verify that the residuals are normally distributed. The results (not shown here) indicate that the null hypothesis (the errors are normally distributed) cannot be rejected even at the 20 per cent level of significance.
In conclusion, detailed residual analysis indicates that: 1) the residuals have zero mean, 2) the residuals have constant variance, 3) the residuals are normally distributed, and 4) the residuals are uncorrelated. Therefore, the necessary and sufficient criteria for the least squares parameter optimization used in this study were met.

Figure 4: Residuals versus normalized average fluorescent intensity in FRAP experiment using one-site-mobile-immobile model.
Figure 5: Histograms of residuals for normalized average fluorescent intensity using one-site-mobile-immobile model.
Figure 6: Normal probability plot for FRAP experiment using one-site-mobile-immobile model.

Conclusion
A Galerkin-based finite element model was developed and applied to solve a system of two coupled partial differential equations governing GFP-GR transport and reaction in living cells. A finite domain was used to formulate boundary conditions on the cell membrane. The simulator was coupled with the Levenberg-Marquardt algorithm and a FRAP time series to estimate protein mass transport and reaction rate parameters. The developed strategy shows excellent agreement with the experimental data and is an efficient approach to extracting as much physicochemical information from the FRAP protocol as possible. Uniqueness analysis of the inverse problem, however, indicates that the FRAP protocol provides insufficient information for unique quantification of the diffusion coefficient and binding rate parameters.