Problem formulation
Generally, the classical TV-regularized model for CS-MRI reconstruction problems can be written as:
$$\begin{aligned} \min _{x} \quad \text {TV}(x)\triangleq |\nabla x| \quad s.t. \quad \Vert Ax-b\Vert _{2} \le \delta , \end{aligned}$$
(1)
where \(x \in R^{n}\) is the image to be reconstructed, \(b \in R^{m}\) denotes the undersampled k-space data from the MR scanner, and \(A\in R^{m \times n} (m < n)\) is the measurement matrix. The expression \(\delta > 0\) represents the noise level in the measurement data and \((\nabla x)_{i,j} = (x_{i+1,j}-x_{i,j},x_{i,j+1}-x_{i,j})\), where \(\nabla\) is the discrete gradient operator and \(|\nabla x|\) denotes the TV regularization of x. A variant of (1) is the following TV-regularization problem:
$$\begin{aligned} \min _{x} \frac{1}{2} \Vert Ax-b\Vert ^2_{2}+\tau |\nabla x|, \end{aligned}$$
(2)
where \(\tau > 0\) is a positive regularization parameter to balance between the two objectives. To enforce image smoothness, we add a quadratic term \(\frac{\gamma }{2}\Vert \nabla x\Vert ^2_{2}\) in the objective function to give the new TV-regularized model:
$$\begin{aligned} \min _{x} \frac{1}{2} \Vert Ax-b\Vert ^2_{2}+\tau |\nabla x|+ \frac{\gamma }{2}\Vert \nabla x\Vert ^2_{2}, \end{aligned}$$
(3)
where \(\gamma\) is a smoothing parameter. The augmented term \(|\nabla x| + \frac{\gamma }{2}\Vert \nabla x\Vert ^2_{2}\) can yield accurate solutions by using a proper value of \(\gamma\). In addition, the dual problem is continuously differentiable and facilitates effective use of gradient information. To split the variable x, an auxiliary variable z is introduced by \(\nabla x - z = 0\), and the unconstrained optimization problem (3) is transformed into:
$$\begin{aligned} \min _{x} \frac{1}{2} \Vert Ax-b\Vert ^2_{2}+ \tau |z| + \frac{\gamma }{2}\Vert z\Vert ^2_{2} ,\quad s.t. \quad \nabla x - z = 0. \end{aligned}$$
(4)
The augmented Lagrangian function for problem (4) is given by:
$$\begin{aligned} \mathcal {L} (x,z,\uplambda) = \frac{1}{2} \Vert Ax-b\Vert ^2_{2}+ \tau |z|+\frac{\gamma }{2}\Vert z\Vert ^2_{2} - \langle\uplambda , \nabla x- z\rangle + \frac{\mu }{2}\Vert \nabla x- z\Vert ^2_{2}, \end{aligned}$$
(5)
where \(\uplambda \in R^{m}\) is the Lagrangian multiplier, \(\mu\) is a positive penalty parameter, and \(\langle\uplambda , \nabla x- z\rangle\) denotes the inner product of the vectors \(\uplambda\) and \(\nabla x-z\). The classical ADMM minimizes the convex optimization problem (5) with respect to x and z, using the non-linear block–Gauss–Seidel technique. After minimizing x and z alternatively, \(\uplambda\) can be updated by:
$$\begin{aligned} \uplambda _{k+1}=\uplambda _{k} - \mu (\nabla x_{k+1}-z_{k+1}). \end{aligned}$$
(6)
The non-linear block–Gauss–Seidel iteration of ADMM can be written as:
$$\begin{aligned} \left\{ \begin{array}{ll} x_{k+1} &{}\leftarrow \text {arg}\min \limits _{x}\mathcal {L} (x,z_{k},\uplambda _{k}),\\ z_{k+1} &{}\leftarrow \text {arg}\min \limits _{z}\mathcal {L} (x_{k+1},z,\uplambda _{k}),\\ \uplambda _{k+1} &{}\leftarrow \uplambda _{k} - \mu (\nabla x_{k+1}-z_{k+1}). \end{array} \right. \end{aligned}$$
(7)
Suppose \(z_{k}\) and \(\uplambda _{k}\) are given, then \(x_{k+1}\) can be obtained by:
$$\begin{aligned} x_{k+1}=\text {arg}\min \limits _{x} \frac{1}{2} \Vert Ax-b\Vert ^2_{2} - \langle\uplambda _{k}, \nabla x- z_{k}\rangle + \frac{\mu }{2}\Vert \nabla x- z_{k}\Vert ^2_{2}. \end{aligned}$$
(8)
When \(x_{k+1}\) and \(\uplambda _{k}\) remain fixed, \(z_{k+1}\) can be minimized by:
$$\begin{aligned} z_{k+1}=\text {arg}\min \limits _{z} \tau |z|+\frac{\gamma }{2}\Vert z\Vert ^2_{2} - \langle\uplambda _{k}, \nabla x_{k+1} - z\rangle + \frac{\mu }{2}\Vert \nabla x_{k+1} - z \Vert ^2_{2}. \end{aligned}$$
(9)
The ADMM algorithm above, used to solve (4), is expressed in Algorithm 1.
The ADMM has been previously studied and analyzed [7, 8, 20]. Generally, if subproblems in (7) are not in closed-form, many solutions could exist within those subproblems. Moreover, when the objective functions are poor or difficult to handle at high precision, the conventional ADMM algorithms might also perform poorly in image reconstruction.
Proposed algorithm
Based on the above analysis, firstly, the minimization of (8) is given by:
$$\begin{aligned} x_{k + 1}&= \text {arg}\min _{x} \frac{1}{2} \Vert Ax-b\Vert ^2_{2} - \langle\uplambda _{k}, \nabla x - z_{k}\rangle + \frac{\mu }{2}\Vert \nabla x - z_{k}\Vert ^2_{2},\nonumber \\&= \text {arg}\min _{x} \frac{1}{2} \Vert Ax-b\Vert ^2_{2} + \frac{\mu }{2} \Vert \nabla x - \left(z_{k}+\frac{\uplambda _{k}}{\mu }\right)\Vert ^2_{2}. \end{aligned}$$
(10)
Hence, (10) is transformed into:
$$\begin{aligned} ( A^{T}A + \mu \nabla ^{T} \nabla )x_{k+1} = A^{T} b + \nabla ^{T}(\mu z_{k}+ \uplambda _{k}). \end{aligned}$$
(11)
Because the measurement matrix A is neither the identity matrix, nor is it typically fully dense, it is impossible to derive the exact solution with respect to the x vectors [21]. In addition, the computational cost to handle (11) is extremely heavy. In order to reduce the computational burden and get closed-form solutions, some variants could be considered. The linearization of the quadratic term \(\frac{1}{2} \Vert Ax-b\Vert ^2_{2}\) is used to update \(x_{k+1}\) as follows:
$$\begin{aligned} \frac{1}{2} \Vert Ax-b\Vert ^2_{2} \approx \frac{1}{2} \Vert Ax_{k}-b\Vert ^2_{2} + \langle Grad(x_{k}), x - x_{k}\rangle + \frac{\eta }{2}\Vert x - x_{k}\Vert ^2_{2}, \end{aligned}$$
(12)
where \(Grad(x_{k}) = A^T(Ax_{k} -b)\) is the gradient of \(\frac{1}{2} \Vert Ax-b\Vert ^2_{2}\) at the current point \(x_{k}\), and \(\eta\) is a positive proximal parameter. Then, the x subproblem in (10) can be iterated by:
$$\begin{aligned} x_{k + 1}&= \text {arg}\min _{x}\langle Grad(x_{k}), x - x_{k}\rangle + \frac{\eta }{2}\Vert x - x_{k}\Vert ^2_{2}\nonumber \\&- \langle \uplambda _{k}, \nabla x - z_{k} \rangle + \frac{\mu }{2}\Vert \nabla x - z_{k}\Vert ^2_{2}. \end{aligned}$$
(13)
Considering the quadratic term \(\frac{\mu }{2}\Vert \nabla x - z_{k}\Vert ^2_{2}\) can also be linearized, we also linearize \(\frac{\mu }{2}\Vert \nabla x - z_{k}\Vert ^2_{2}\) at \(x_{k}\). This variant is a fast linearized preconditioned ADMM(FLPADMM) algorithm that generates the iterates \(x_{k+1}\) by:
$$\begin{aligned} x_{k + 1}&= \text {arg}\min _{x} \mathcal {L} (x,z_{k},\uplambda _{k}),\nonumber \\&= \text {arg}\min _{x} \frac{1}{2} \Vert Ax-b\Vert ^2_{2} -\langle \uplambda _{k}, \nabla x - z_{k}\rangle + \frac{\mu }{2}\Vert \nabla x - z_{k}\Vert ^2_{2},\nonumber \\&= \text {arg}\min _{x} \langle Grad(x_{k}), x - x_{k}\rangle + \frac{\eta }{2}\Vert x - x_{k}\Vert ^2_{2} - \langle \uplambda _{k}, \nabla x \rangle \nonumber \\& \quad +\langle \mu (\nabla x_{k} - z_{k}), \nabla x \rangle,\nonumber \\&= \text {arg}\min _{x}\langle Grad(x_{k}), x - x_{k} \rangle + \frac{\eta }{2}\Vert x - x_{k}\Vert ^2_{2} \nonumber \\ &\quad + \langle \mu (\nabla x_{k}- z_{k}) - \uplambda _{k},\nabla x \rangle. \end{aligned}$$
(14)
The negative divergence operator \(-div\) can be used to solve (14) as follows:
$$\begin{aligned} x_{k+1} = x_{k} - \frac{1}{\eta }(-div(\mu (\nabla x_{k} - z_{k})-\uplambda _{k})+ Grad(x_{k})). \end{aligned}$$
(15)
Secondly, for a given \(x_{k + 1}\) and \(\uplambda _{k}\), \(z_{k+1}\) is computed by solving:
$$\begin{aligned} z_{k+1}&= \text {arg}\min _{z} \mathcal {L} (x_{k + 1},z,\uplambda _{k}), \nonumber \\&= \text {arg}\min _{z} \tau |z|+\frac{\gamma }{2}\Vert z\Vert ^2_{2} - \langle \uplambda _{k}, \nabla x_{k+1} - z \rangle + \frac{\mu }{2}\Vert \nabla x_{k+1} - z \Vert ^2_{2},\nonumber \\&= \text {arg}\min _{z} \tau |z| +\frac{\gamma }{2}\Vert z\Vert ^2_{2} + \frac{\mu }{2}\Vert z - \left(\nabla x_{k+1} - \frac{\uplambda _{k}}{\mu }\right)\Vert ^2_{2},\nonumber \\&= \text {arg}\min _{z} \tau |z| + \frac{\gamma + \mu }{2}\Vert z - \frac{\mu }{\gamma + \mu }\left(\nabla x_{k+1} -\frac{\uplambda _{k}}{\mu } \right)\Vert ^2_{2},\nonumber \\&= \text {arg}\min _{z} \frac{\tau }{\gamma + \mu }|z| + \frac{1}{2}\Vert z - \frac{\mu }{\gamma + \mu }\left(\nabla x_{k+1}-\frac{\uplambda _{k}}{\mu }\right)\Vert ^2_{2}. \end{aligned}$$
(16)
Hence, the solution to (16) obeys:
$$\begin{aligned} z_{k+1} = soft\left( \frac{\mu }{\gamma + \mu }\left( \nabla x_{k+1} - \frac{\uplambda _{k}}{\mu }\right),\frac{\tau }{\gamma + \mu }\right), \end{aligned}$$
(17)
where \(soft( \cdot , T)\) is the soft thresholding function that is defined as:
$$\begin{aligned} soft(v, T) = \left\{ \begin{array}{ll} v + T, \quad& v < -T,\\ 0, \quad& |v| \le |T|,\\ v - T, \quad& v > T. \end{array} \right. \end{aligned}$$
(18)
Finally, the Lagrangian multiplier is updated by \(\uplambda _{k+1}=\uplambda _{k} - \mu (\nabla x_{k+1}-z_{k+1})\).
The FLPADMM algorithm utilizes the gradient descent method and one soft thresholding operator to update variables at each iteration. In addition, this method is a variant of the classical ADMM algorithm framework. The proposed FLPADMM algorithm is presented in Algorithm 2.
The stopping criterion in all of the algorithms above is the relative change of x between two successive iterations, which is small enough, i.e.:
$$\begin{aligned} \frac{\Vert x_{k+1}-x_{k}\Vert _{2}}{\Vert x_{k}\Vert _{2}} \le tol \end{aligned}$$
(19)
where tol is usually a range chosen from \(10^{-5}\) to \(10^{-3}\). In the FLPADMM algorithm, \(\alpha _{k}\) represents a weighted parameter and \(Grad(x^{m}_{k})\) is the gradient of \(\frac{1}{2} \Vert Ax-b\Vert ^2_{2}\) at the point \(x^{m}_{k}\). Furthermore, \(x^{m}_{k}\) is the first weighted value that is used to update \(x_{k+1}\).
It should be noted that the appropriate parameter \(\alpha _{k}\) can improve the rate of convergence that has been proven in a previous study [10]. Moreover, the superscript m stands for “middle,” and w stands for “weight.” Before the gradient descent method in line 5 is applied, x is updated by the weighted sums of all previous iterates. Furthermore, after the gradient method is applied, x is updated again by the same weighting technique, that is, x is weighted twice at each iteration. Specifically, when the weighted parameter \(\alpha _{k}\) is set to 1, the x subproblem is simply the current point \(x_{k+1}\). At this point, FLPADMM becomes another variant of the ADMM algorithm [10]. The accelerated strategy of FLPADMM incorporates a multistep acceleration scheme with middle point \(x^{m}\) and weighted point \(x^{w}\), which was first applied in a previous study [10] and derived from the accelerated gradient method [22, 23]. Moreover, the optimal rate of convergence of FLPADMM is \(O\left(\frac{1}{k^{2}} + \frac{1}{k}\right)\).