Motivated in part by image pyramids and GNDPCA, we propose a novel ultrasound image denoising method, called multiresolution Generalized N-Dimensional PCA (MRGNDPCA). Figure 1 shows the three stages of the proposed MRGNDPCA algorithm.

(1)
In the multiresolution decomposition stage, a logarithmic transformation is first used to convert the multiplicative speckle noise into additive noise. Then, a Gaussian pyramid is built for the noisy image using the multiresolution scheme, and a multiscale image stack is constructed on each pyramid level.

(2)
In the noise reduction stage, similar patches (two-way arrays) are extracted from each stack and grouped into several categories (three-way arrays). GNDPCA is applied directly to the three-way arrays to suppress the noise. The denoised patches are put back in their original places to obtain clean multiscale image stacks.

(3)
In the multiresolution combination stage, the representative image on each level is obtained by image averaging. Combining the representative images, followed by an exponential transformation and a sharpening procedure, yields the denoised ultrasound image.
The above stages are formulated in detail in the following subsections.
Multiresolution decomposition
A reliable model of speckle noise is presented in [21, 22]:
{g}^{\mathit{or}}\left(x,y\right)={S}^{\mathit{or}}\left(x,y\right){\eta}_{\mathit{mu}}^{\mathit{or}}\left(x,y\right)+{\eta}_{\mathit{ad}}^{\mathit{or}}\left(x,y\right)
(1)
where S^{or}(x, y) is the noise-free original gray-valued image, g^{or}(x, y) is the observed noisy gray-valued image, {\eta}_{\mathit{mu}}^{\mathit{or}}\left(x,y\right) is multiplicative noise (coherent interference), {\eta}_{\mathit{ad}}^{\mathit{or}}\left(x,y\right) is additive noise (such as sensor noise), and x and y are spatial location variables, (x, y) ∈ R^{2}. Since the effect of additive noise is much smaller than that of multiplicative noise, Eq. 1 can be approximated as Eq. 2.
{g}^{\mathit{or}}\left(x,y\right)\approx {S}^{\mathit{or}}\left(x,y\right){\eta}_{\mathit{mu}}^{\mathit{or}}\left(x,y\right)
(2)
A logarithmic transformation is applied to both sides of Eq. 2 to separate the noise from the original image.
log\left({g}^{\mathit{or}}\left(x,y\right)\right)\approx log\left({S}^{\mathit{or}}\left(x,y\right)\right)+log\left({\eta}_{\mathit{mu}}^{\mathit{or}}\left(x,y\right)\right)
(3)
We simplify Eq. 3 as Eq. 4 without modifying the definition:
g\left(x,y\right)\approx S\left(x,y\right)+\eta \left(x,y\right)
(4)
Therefore, the multiplicative speckle noise is converted into additive noise by the logarithmic transformation. This conversion makes the additive-noise assumption of the proposed MRGNDPCA valid for ultrasound image denoising.
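The log/exp round trip of Eqs. 2-4 can be sketched with NumPy. The function names and the small `eps` offset (to guard against log of zero) are our assumptions, not part of the paper's formulation:

```python
import numpy as np

def to_log_domain(g, eps=1e-6):
    """Convert multiplicative speckle (Eq. 2) to additive noise (Eq. 4).

    eps guards against log(0) for black pixels; this offset is an
    implementation convenience, not part of the paper's model.
    """
    return np.log(g.astype(np.float64) + eps)

def to_spatial_domain(g_log, eps=1e-6):
    """Inverse exponential transformation used in the combination stage."""
    return np.exp(g_log) - eps

# Multiplicative noise becomes additive in the log domain:
s = np.full((4, 4), 100.0)                                # noise-free image S
eta = np.random.default_rng(0).lognormal(0.0, 0.1, (4, 4))  # multiplicative noise
g = s * eta                                               # observed image, Eq. 2
g_log = to_log_domain(g)                                  # ~ log S + log eta, Eq. 4
```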
The Gaussian pyramid of the noisy image in the logarithmic space and the multiscale image stacks on each pyramid level are built successively, because the Gaussian function is the only possible scale-space kernel [23]. The scale space of the image g is produced by convolving variable-scale Gaussian filters f(σ, (x, y)) with the input image:
T\left(g\right)=f\left(\sigma ,\left(x,y\right)\right)\ast g\left(x,y\right)
(5)
where “*” is the convolution operation in x and y, and σ denotes the scale parameter, and
f\left(\sigma ,\left(x,y\right)\right)=\frac{1}{2\pi {\sigma}^{2}}{e}^{-\left({x}^{2}+{y}^{2}\right)/2{\sigma}^{2}}
(6)
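As a concrete check, the kernel of Eq. 6 (with its negative exponent) can be sampled on a discrete grid; the 5 × 5 window matches the paper's choice for f_{1} and f_{2}, while the normalization to unit sum is our assumption for discrete filtering:

```python
import numpy as np

def gaussian_kernel(sigma, size=5):
    """Sampled 2-D Gaussian f(sigma, (x, y)) from Eq. 6, normalized so the
    discrete weights sum to one (a standard assumption for filtering)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    f = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    return f / f.sum()

k = gaussian_kernel(sigma=1.0)   # 5 x 5 kernel, peak at the center
```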
In our paper, an associated Gaussian pyramid G_{a,0}, 0 ≤ a ≤ A is composed of a set of different resolutions of g. Here, A + 1 is the total number of levels and G_{0,0} = g. The G_{a,0} are created by applying the Gaussian filter and successive downsampling. Both operations are combined in the "Reduce" operation, shown on the first line in Figure 2,
\mathrm{Reduce}\left({G}_{a,0}\right)=\left(2\downarrow \right)\left({f}_{1}\ast {G}_{a,0}\right)
(7)
where f_{1} denotes a linear low-pass Gaussian filter, and (2 ↓) denotes downsampling, which discards every data element with an odd index and is applied to each dimension of the image.
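A minimal sketch of the "Reduce" step (Eq. 7), assuming `scipy.ndimage.gaussian_filter` as the low-pass filter f_{1}; the paper does not specify the filter taps beyond the 5 × 5 window, so the `sigma` value here is illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def reduce_level(G, sigma=1.0):
    """Eq. 7: low-pass filter, then discard every odd-indexed sample
    in each dimension (the (2 down-arrow) operator)."""
    blurred = gaussian_filter(G.astype(np.float64), sigma)
    return blurred[::2, ::2]

def gaussian_pyramid(g, A):
    """Build G_{a,0}, 0 <= a <= A, with G_{0,0} = g."""
    pyramid = [g.astype(np.float64)]
    for _ in range(A):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid

pyr = gaussian_pyramid(np.zeros((64, 64)), A=3)   # 4 levels: 64, 32, 16, 8
```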
Multiscale image stacks on each pyramid level are subsequently constructed. Take G_{a,0} as the basic image of the a^{th} stack, i.e., the stack at level a of the Gaussian pyramid. G_{a,b}, 0 ≤ b ≤ B are a series of corresponding scales of G_{a,0}, where B + 1 is the number of scales on each level. The multiscale image stacks G_{a,b}, 0 ≤ b ≤ B are created by repeatedly applying the low-pass Gaussian filter f_{2}. Thus, we have
{G}_{a,b}=\left\{\begin{array}{l}{G}_{a,0},\mathrm{Num}=0\\ \left(\mathrm{Num}\right){f}_{2}\ast {G}_{a,0},1\le \mathrm{Num}\le B\end{array}\right.
(8)
where (Num) denotes the number of f_{2} convolutions with G_{a,0} (i.e., Num = b). Both Gaussian filters f_{1} and f_{2} are defined in 5 × 5 sliding windows.
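The stack construction of Eq. 8 amounts to repeated convolution of G_{a,0} with f_{2}; a sketch, again with a Gaussian filter standing in for f_{2} (the `sigma` value is our assumption):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_stack(G_a0, B, sigma=1.0):
    """Multiscale stack G_{a,b}, 0 <= b <= B (Eq. 8): G_{a,0} unchanged,
    then b successive applications of the low-pass filter f2."""
    stack = [G_a0.astype(np.float64)]
    for _ in range(B):
        stack.append(gaussian_filter(stack[-1], sigma))
    return np.stack(stack)            # shape (B + 1, H, W)

stack = build_stack(np.random.default_rng(1).random((32, 32)), B=3)
```

Each successive slice is a smoother version of the basic image, so the stack trades resolution in scale rather than in space.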
Noise reduction
GNDPCA, as a multilinear subspace method, is utilized to denoise the images of each stack on each pyramid level and thereby reduce noise in the entire image. Take the first stack, G_{0,b}, 0 ≤ b ≤ B, for example. Q overlapping dense patches, {\mathbf{X}}_{q}^{\left(0\right)}\in {R}^{{I}_{1}\times {I}_{2}},q=1,2,\dots ,Q, are extracted from all images in this stack. Here, I_{1} × I_{2} is the size of each patch, and the superscript "(0)" denotes that the extracted patches belong to the first stack. Organizing the patches into sensible groupings is a fundamental step in GNDPCA denoising. Cluster analysis, which comprises hard and soft clustering, implements groupings according to measured intrinsic characteristics or similarities. Loosely speaking, hard clustering assigns each data point to one and only one cluster, whereas soft clustering assigns each data point to all clusters with different weights whose sum equals one. Clustering is unsupervised learning owing to the absence of category information. Although thousands of clustering algorithms have been published, we adopt the most popular and simplest, K-means [24], for our research. As a discussion of the diverse clustering algorithms is beyond our scope, please refer to [25] for more details on data clustering.
Pixel intensity values, rather than patch features, are employed for clustering because they retain the original measured quantities and are easy to implement. The overlapping dense patches in the first stack are unfolded into a set of Q (I_{1} × I_{2})-dimensional points \left[{\widehat{\mathbf{x}}}_{1}^{\left(0\right)},{\widehat{\mathbf{x}}}_{2}^{\left(0\right)},\dots ,{\widehat{\mathbf{x}}}_{Q}^{\left(0\right)}\right] that will be clustered into K clusters. Nevertheless, the heavy noise present in the patches decreases the reliability of the characteristic structure and degrades the performance of the clustering algorithm. For computational effectiveness, the (I_{1} × I_{2})-dimensional points are projected into a lower-dimensional space by PCA to characterize the intrinsic variation of the data. After projection, the (I_{1} × I_{2})-dimensional points \left[{\widehat{\mathbf{x}}}_{1}^{\left(0\right)},{\widehat{\mathbf{x}}}_{2}^{\left(0\right)},\dots ,{\widehat{\mathbf{x}}}_{Q}^{\left(0\right)}\right] are converted into J-dimensional points \left[{\mathbf{x}}_{1}^{\left(0\right)},{\mathbf{x}}_{2}^{\left(0\right)},\dots ,{\mathbf{x}}_{Q}^{\left(0\right)}\right].
Assume {\mathbf{C}}^{\left(0\right)}=\left[{\mathbf{c}}_{1}^{\left(0\right)},{\mathbf{c}}_{2}^{\left(0\right)},\dots ,{\mathbf{c}}_{k}^{\left(0\right)},\dots ,{\mathbf{c}}_{K}^{\left(0\right)}\right],k=1,2,\dots ,K denotes the K clusters and {\mu}_{k}^{\left(0\right)} denotes the mean of cluster {\mathbf{c}}_{k}^{\left(0\right)}. A partition of the patches can be found by minimizing the squared error between the cluster means and the points:
\mathit{cluster}\left({\mathbf{C}}^{\left(0\right)}\right)={\displaystyle \underset{{\mathbf{\mu}}_{k}^{\left(0\right)}}{argmin}}{\displaystyle \sum _{k=1}^{K}{\displaystyle \sum _{{\mathbf{x}}_{q}^{\left(0\right)}\in {\mathbf{c}}_{k}^{\left(0\right)}}{\Vert {\mathbf{x}}_{q}^{\left(0\right)}-{\mathbf{\mu}}_{k}^{\left(0\right)}\Vert}^{2}}}
(9)
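The patch extraction, PCA projection, and K-means grouping described above can be sketched in a few lines of NumPy. The patch size, stride, J, and K below are illustrative assumptions, and the K-means here is the plain Lloyd iteration rather than any particular library implementation:

```python
import numpy as np

def extract_patches(img, size, step):
    """Overlapping dense patches of shape (size, size), unfolded to rows."""
    H, W = img.shape
    patches = [img[i:i + size, j:j + size].ravel()
               for i in range(0, H - size + 1, step)
               for j in range(0, W - size + 1, step)]
    return np.array(patches)                       # (Q, I1 * I2)

def pca_project(X, J):
    """Project the Q unfolded patches onto the first J principal axes."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:J].T                           # (Q, J)

def kmeans(X, K, iters=20, seed=0):
    """Plain K-means (Eq. 9): alternate assignment and mean update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), K, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for k in range(K):
            if np.any(labels == k):                # skip empty clusters
                centers[k] = X[labels == k].mean(axis=0)
    return labels

img = np.random.default_rng(2).random((32, 32))
X = extract_patches(img, size=8, step=4)           # Q = 49 patches of 8 x 8
labels = kmeans(pca_project(X, J=4), K=3)          # cluster index per patch
```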
After clustering in one multiscale image stack, the corresponding patches that exhibit high correlation are grouped into K three-way arrays {\mathcal{X}}_{i}^{\left(0\right)}\in {R}^{{I}_{1}\times {I}_{2}\times {I}_{3}},i=1,2,\dots ,K, with I_{3} denoting the number of similar patches in each group. The three-way arrays are then processed by GNDPCA to suppress noise without destroying the internal structure of the patch similarities.
A three-way array {\mathcal{X}}_{i}^{\left(0\right)}\in {R}^{{I}_{1}\times {I}_{2}\times {I}_{3}},i=1,2,\dots ,K can be unfolded into three matrices along each mode space, represented by {\mathbf{X}}_{i\left(1\right)}^{\left(0\right)}\in {R}^{{I}_{1}\times \left({I}_{2}\cdot {I}_{3}\right)}, {\mathbf{X}}_{i\left(2\right)}^{\left(0\right)}\in {R}^{{I}_{2}\times \left({I}_{3}\cdot {I}_{1}\right)}, and {\mathbf{X}}_{i\left(3\right)}^{\left(0\right)}\in {R}^{{I}_{3}\times \left({I}_{1}\cdot {I}_{2}\right)}. The unfolding procedure, called mode matricization of a third-order tensor, is visualized in Figure 3. For each unfolded matrix, the eigenvector matrix {\mathbf{U}}_{i\left(n\right)}^{\left(0\right)}=\left[{\mathbf{u}}_{i\left(n\right)1}^{\left(0\right)},{\mathbf{u}}_{i\left(n\right)2}^{\left(0\right)},\cdots ,{\mathbf{u}}_{i\left(n\right){D}_{i\left(n\right)}}^{\left(0\right)}\right] associated with the {D}_{i\left(n\right)}^{\left(0\right)} largest eigenvalues (the number of mode-subspace bases) can be calculated from the covariance matrix {\mathbf{C}}_{{\mathbf{X}}_{i\left(n\right)}^{\left(0\right)}}=E\left\{{\mathbf{X}}_{i\left(n\right)}^{\left(0\right)}{{\mathbf{X}}_{i\left(n\right)}^{\left(0\right)}}^{T}\right\}, where n = 1, 2, 3. In general, the values of D_{i(n)} are decided by experience and the application; the simple determination procedure proposed in [26], which in each mode chooses the dimension by the ratio of variance retained by the eigenvalues, can also be used. The eigenvectors {\mathbf{u}}_{i\left(n\right)1}^{\left(0\right)},{\mathbf{u}}_{i\left(n\right)2}^{\left(0\right)},\cdots ,{\mathbf{u}}_{i\left(n\right){D}_{i\left(n\right)}}^{\left(0\right)} are ordered so that the corresponding eigenvalues satisfy {\lambda}_{i\left(n\right)1}^{\left(0\right)}\ge {\lambda}_{i\left(n\right)2}^{\left(0\right)}\ge \dots \ge {\lambda}_{i\left(n\right){D}_{i\left(n\right)}}^{\left(0\right)}. Because dimension reduction is performed in each mode separately, the calculated {\mathbf{U}}_{i\left(n\right)}^{\left(0\right)} are not the optimal bases for the entire three-way array. The optimal bases and the denoised patches are obtained by using the above {\mathbf{U}}_{i\left(n\right)}^{\left(0\right)} as the initialization of an iterative algorithm that updates one matrix while the others are fixed, determining all optimal matrices that satisfy the following energy function:
e={\displaystyle \underset{{\mathbf{U}}_{i\left(n\right)}^{\left(0\right)},n=1,2,3}{argmin}}{\Vert {\mathcal{X}}_{i}^{\left(0\right)}-\left({\mathcal{Y}}_{i}^{\left(0\right)}{\times}_{1}{\mathbf{U}}_{i\left(1\right)}^{\left(0\right)}{\times}_{2}{\mathbf{U}}_{i\left(2\right)}^{\left(0\right)}{\times}_{3}{\mathbf{U}}_{i\left(3\right)}^{\left(0\right)}\right)\Vert}^{2}
(10)
where {\mathcal{Y}}_{i}^{\left(0\right)}\in {R}^{{D}_{i\left(1\right)}^{\left(0\right)}\times {D}_{i\left(2\right)}^{\left(0\right)}\times {I}_{3}} are the core tensors. Note that the dimension of the third mode is retained without reduction. The reconstructed three-way array shown in Eq. 11 is the denoised grouping. Figure 4 illustrates the dimension reduction of GNDPCA for a three-way array.
{\widehat{\mathcal{X}}}_{i}^{\left(0\right)}={\widehat{\mathcal{Y}}}_{i}^{\left(0\right)}{\times}_{1}{\widehat{\mathbf{U}}}_{i\left(1\right)}^{\left(0\right)}{\times}_{2}{\widehat{\mathbf{U}}}_{i\left(2\right)}^{\left(0\right)}{\times}_{3}{\widehat{\mathbf{U}}}_{i\left(3\right)}^{\left(0\right)}
(11)
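The alternating optimization of Eq. 10 and the reconstruction of Eq. 11 can be sketched with NumPy n-mode products. This is a generic higher-order orthogonal iteration consistent with the equations above; the exact GNDPCA update rules may differ in detail, and the array sizes and iteration count are illustrative. As in the text, the third mode keeps its full dimension:

```python
import numpy as np

def unfold(T, n):
    """Mode-n matricization of a three-way array (cf. Figure 3)."""
    return np.moveaxis(T, n, 0).reshape(T.shape[n], -1)

def mode_product(T, U, n):
    """n-mode product T x_n U: multiply U into mode n of the tensor."""
    return np.moveaxis(np.tensordot(U, T, axes=(1, n)), 0, n)

def gndpca_denoise(X, D1, D2, iters=5):
    """Minimize Eq. 10 by alternating over the mode bases, then
    reconstruct via Eq. 11. The third-mode dimension I3 is kept whole."""
    dims = [D1, D2, X.shape[2]]
    # SVDs of the unfoldings initialize the mode bases U_(n)
    U = [np.linalg.svd(unfold(X, n))[0][:, :dims[n]] for n in range(3)]
    for _ in range(iters):
        for n in range(3):
            # Project onto the other two mode subspaces, then update U_(n)
            Y = X
            for m in range(3):
                if m != n:
                    Y = mode_product(Y, U[m].T, m)
            U[n] = np.linalg.svd(unfold(Y, n))[0][:, :dims[n]]
    core = X
    for n in range(3):
        core = mode_product(core, U[n].T, n)       # core tensor, Eq. 10
    Xhat = core
    for n in range(3):
        Xhat = mode_product(Xhat, U[n], n)         # reconstruction, Eq. 11
    return Xhat

X = np.random.default_rng(3).random((8, 8, 5))     # one grouped three-way array
Xhat = gndpca_denoise(X, D1=4, D2=4)               # reduced-rank estimate
```

Because the reconstruction is an orthogonal projection in each mode, keeping the full dimensions recovers the input exactly, while truncating D1 and D2 discards the directions where noise dominates.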
After all the three-way arrays on the same level are denoised with GNDPCA, we place the patches back into their original positions. Because the extracted patches overlap, more than one patch estimate in the denoised three-way arrays can be located at exactly the same coordinate, so an accumulation is performed by averaging over the overlapping pixel positions. Take, for example, G_{0,0}, the first-level image with no Gaussian filtering. The denoised estimate {\widehat{G}}_{0,0} of G_{0,0} is computed as an average of the patch estimates {\widehat{\mathbf{Y}}}_{p}^{\left(0\right)}\in {R}^{{I}_{1}\times {I}_{2}},p=1,2,\dots ,P in the first stack. Here, {\widehat{\mathbf{Y}}}_{p}^{\left(0\right)}\in {\widehat{\mathbf{Y}}}_{q}^{\left(0\right)},P\le Q. We have
{\widehat{G}}_{0,0}=\frac{{\displaystyle \sum _{p\in \left[1,P\right]}{\displaystyle \sum _{{x}_{R}}{\widehat{\mathbf{Y}}}_{p}^{\left(0\right),{x}_{R}}}}}{{\displaystyle \sum _{p\in \left[1,P\right]}{\displaystyle \sum _{{x}_{R}}{\mathbf{E}}^{{x}_{R}}}}}
(12)
where E has the same size as {\widehat{\mathbf{Y}}}_{p}^{\left(0\right)} and all its elements are one, and x_{R} is the coordinate iterated over within each patch. The denoised estimates for all stacks are denoted by {\widehat{G}}_{a,b},0\le a\le A,0\le b\le B.
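Eq. 12's averaging of overlapping patch estimates can be implemented with two accumulators, one for pixel sums and one (the all-ones E of Eq. 12) for counts. The patch values and positions below are illustrative:

```python
import numpy as np

def aggregate_patches(shape, patches, positions):
    """Average overlapping denoised patches back into an image (Eq. 12)."""
    num = np.zeros(shape)            # running sum of patch estimates
    den = np.zeros(shape)            # running sum of the all-ones matrix E
    ph, pw = patches[0].shape
    for patch, (i, j) in zip(patches, positions):
        num[i:i + ph, j:j + pw] += patch
        den[i:i + ph, j:j + pw] += 1.0
    den[den == 0] = 1.0              # leave uncovered pixels at zero
    return num / den

# Two 4 x 4 patches overlapping on a 2 x 2 region of an 8 x 8 image:
patches = [np.full((4, 4), 2.0), np.full((4, 4), 4.0)]
img = aggregate_patches((8, 8), patches, [(0, 0), (2, 2)])
```

In the overlap region each pixel receives the mean of both estimates, which is exactly the ratio of the two sums in Eq. 12.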
Multiresolution combination
All denoised images in the same stack are averaged to obtain one representative image, as shown in Eq. 13.
{G}_{a}=\frac{1}{B+1}{\displaystyle \sum _{b=0}^{B}{\widehat{G}}_{a,b}}
(13)
The denoised image is obtained by combining the representative images G_{a}, a = 0, 1, 2, …, A at each level. The corresponding pyramid levels L_{a} contain the differences of high-frequency components between G_{a} and G_{a+1}. The differences G_{a} − G_{a+1}, a = 0, 1, …, A − 1 cannot be calculated directly because G_{a+1} contains fewer samples than G_{a}. For this reason, the number of samples in G_{a+1} is increased to match G_{a} by interpolating the missing samples with an interpolation filter. Both operations are combined in the "Expand" operation,
\mathrm{Expand}\left({G}_{a}\right)={f}_{3}\ast \left[\left(2\uparrow \right){G}_{a}\right]
(14)
where (2 ↑) denotes upsampling by inserting a zero between adjacent data elements, and f_{3} denotes the linear interpolation filter. The output image \widehat{g} can be reconstructed from the L_{a} by
{\widehat{G}}_{a}=\left\{\begin{array}{l}{\mathrm{L}}_{a}:a=A\\ {\mathrm{L}}_{a}+\mathrm{Expand}\left({G}_{a+1}\right):0\le a\le A-1\end{array}\right.
(15)
and then,
\widehat{g}={\widehat{G}}_{0}
(16)
In the reconstruction process, the low-frequency information of G_{a} is discarded and replaced by the expanded G_{a+1}. Finally, the despeckled image in the log domain is converted back into the spatial domain through the exponential transformation. The complete MRGNDPCA method is shown as a diagrammatic sketch in Figure 2.