Optical flow method
The optical flow method is a classic tracking method [20]. Its primary assumption is an intensity consistency constraint that can be written:
$$I\left( {x,y,t} \right) = I\left( {x + \delta x,y + \delta y,t + \delta t} \right),$$
(1)
where I(x, y, t) is the intensity at position (x, y) in the tth frame, and δx and δy are the displacements over the time interval δt. Using a Taylor series expansion, \(I\left( {x + \delta x,y + \delta y,t + \delta t} \right)\) can be written:
$$I\left( {x + \delta x,y + \delta y,t + \delta t} \right) = I\left( {x,y,t} \right) + \frac{\partial I}{\partial x}\delta x + \frac{\partial I}{\partial y}\delta y + \frac{\partial I}{\partial t}\delta t + {\text{H.O.T.}},$$
(2)
where H.O.T. denotes the higher order terms.
Substituting Eq. (2) into Eq. (1), neglecting the higher order terms, and dividing by δt gives:
$$I_{x} V_{x} + I_{y} V_{y} = - I_{t} ,$$
(3)
where \(V_{x}\) and \(V_{y}\) are the x and y components of the velocity (or displacement) at position (x, y) in the tth frame, and \(I_{x}\), \(I_{y}\), and \(I_{t}\) are the partial derivatives of the intensity at (x, y, t) with respect to x, y, and t. Lucas and Kanade [21] presented a differential method for estimating optical flow. They assumed that the velocity flows within a small region are similar. Thus, Eq. (3) can be solved by writing it in matrix form over the pixels of a small region:
$$\left[ {\begin{array}{*{20}c} {I_{x} \left( {p_{1} } \right)} & {I_{y} \left( {p_{1} } \right)} \\ {I_{x} \left( {p_{2} } \right)} & {I_{y} \left( {p_{2} } \right)} \\ \vdots & \vdots \\ {I_{x} \left( {p_{n} } \right)} & {I_{y} \left( {p_{n} } \right)} \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {V_{x} } \\ {V_{y} } \\ \end{array} } \right] = - \left[ {\begin{array}{*{20}c} {I_{t} \left( {p_{1} } \right)} \\ {I_{t} \left( {p_{2} } \right)} \\ \vdots \\ {I_{t} \left( {p_{n} } \right)} \\ \end{array} } \right],$$
(4)
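where \(p_{n}\) is the nth neighbor point in the computing region; \(I_{x}\), \(I_{y}\), and \(I_{t}\) are the partial derivatives of the intensity at (x, y, t) with respect to x, y, and t; and \(V_{x}\) and \(V_{y}\) are the x and y components of the velocity at position (x, y). With more than two neighbor points the system is overdetermined and is solved in the least-squares sense.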
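As an illustration of how Eq. (4) can be solved, the following minimal sketch forms the 2 × 2 normal equations of the least-squares problem and solves them by Cramer's rule. The function name `lucas_kanade_velocity` and the flat-list inputs are assumptions made for this example; they are not taken from the original implementation.

```python
def lucas_kanade_velocity(ix, iy, it):
    """Least-squares solution of Eq. (4) for one region.

    ix, iy, it: equal-length sequences holding I_x, I_y, I_t at the
    neighbor points p_1..p_n. Returns (V_x, V_y).
    Hypothetical helper for illustration only.
    """
    if not (len(ix) == len(iy) == len(it)) or len(ix) < 2:
        raise ValueError("need at least two points with matching lengths")

    # Entries of the normal equations (A^T A) v = -(A^T b).
    sxx = sum(gx * gx for gx in ix)
    sxy = sum(gx * gy for gx, gy in zip(ix, iy))
    syy = sum(gy * gy for gy in iy)
    sxt = sum(gx * gt for gx, gt in zip(ix, it))
    syt = sum(gy * gt for gy, gt in zip(iy, it))

    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # All gradients are (nearly) parallel: no unique solution.
        raise ValueError("region has no unique flow solution")

    # Solve [sxx sxy; sxy syy] [vx; vy] = [-sxt; -syt] by Cramer's rule.
    vx = (-sxt * syy + syt * sxy) / det
    vy = (-syt * sxx + sxt * sxy) / det
    return vx, vy
```

When the gradients satisfy Eq. (3) exactly for a single velocity, the function recovers that velocity; with noisy gradients it returns the least-squares compromise over the region.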
Multi-kernel block matching
Block matching is a detection and tracking method in image processing that computes the similarity between a reference block and candidate target blocks. However, block matching is sensitive to speckle noise, that is, the small-scale brightness variation of the speckle pattern, which degrades the tracking results when the variation is significant.
Korstanje et al. [13] proposed a multi-kernel block matching (MKBM) scheme to solve this problem. MKBM separates the reference block into several sub-blocks. Each sub-block is first matched independently using block matching to find its closest match. The matching results of all sub-blocks are then combined: MKBM takes the average of the sub-block results weighted by their normalized cross-correlation (NCC) as the tracking result, which is less affected by speckle variations. However, MKBM still does not perform well if the motion of the tracking target is too small.
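The NCC-weighted combination can be sketched as follows. This is a simplified illustration rather than the implementation of Korstanje et al. [13]: the function names are hypothetical, blocks are passed as flat lists of intensities, the zero-mean form of NCC is used, and negative NCC scores are given zero weight as an assumption of this example.

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-size blocks."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0  # flat block: correlation undefined, treat as no match
    return sum(x * y for x, y in zip(da, db)) / denom


def ncc_weighted_displacement(displacements, scores):
    """Average the sub-block displacements (dx, dy), weighted by NCC."""
    weights = [max(s, 0.0) for s in scores]  # assumption: drop negative NCC
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no sub-block has a positive NCC score")
    dx = sum(w * d[0] for w, d in zip(weights, displacements)) / total
    dy = sum(w * d[1] for w, d in zip(weights, displacements)) / total
    return dx, dy
```

A sub-block whose best match correlates poorly, for example because of local speckle decorrelation, therefore contributes little to the combined displacement.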
Optical-flow-trend-based multi-kernel block matching
Both the optical flow and MKBM methods have advantages and disadvantages. The optical flow method can track and evaluate the target displacement when there is a small amount of motion, but it fails if the target’s motion is too large. The MKBM method can track the target motion across large time intervals; however, it does not perform well if the motion of the tracking target is too small. Thus, we propose a tracking structure that combines the advantages of both: an optical-flow-trend-based multi-kernel block matching method. The optical flow method is first used to compute the displacement in adjacent frames (Fig. 3). MKBM is then applied to two selected frames: the starting frame and the frame at which the accumulated displacement becomes larger than a given constant (λ). Finally, the frame-by-frame displacements between the selected frames are adjusted based on the results of the optical flow method. The process is repeated until all of the input images have been processed.
In the optical flow method, the size of the region used to compute the velocity flow affects the result. Because a tendon is non-rigid tissue, it usually deforms as it moves; if the estimation region is too large, the differing motions inside the region conflict with the assumption of the optical flow method and lead to a wrong result. However, if the region is too small, the tracking result is severely affected by noise. A procedure to resolve these problems has been developed (Fig. 4).
We computed several velocity flows inside the rectangular region to obtain the region displacement. The velocities of all the flow points inside the region were calculated using the optical flow method. Throughout the experiment, the window size for computing the optical flow of each flow point was 17 × 17 pixels. Inside a 101 × 41-pixel region, 43 × 13 flow points (with 2-pixel increments in both the x and y directions) were calculated and used to determine the region displacement. Because the major tendon motion is horizontal (x direction), only the horizontal motions of the region were used when determining the MKBM step size. To exclude outliers, the flow points were classified based on their motion direction in the horizontal axis by using the following equation:
$$Dir\left( p \right) = \left\{ {\begin{array}{ll} {{\text{left}},} & {{\text{if }}V\left( p \right) <0;} \\ {\text{right,}} & {{\text{if }}V\left( p \right)>0;} \\ {\text{ignored,}} & {otherwise,} \\ \end{array} } \right.$$
(5)
where V(p) is the horizontal displacement of the flow point p. The flow points in the major motion direction, that is, the direction containing the most flow points, are retained and used to calculate the displacement of the region. However, not all of the retained flow points have precise displacements; thus, only a subset of the retained flow points should be used to calculate the displacement. To determine the region displacement, we conducted an experiment to find the statistical relationship between the actual region displacement and the displacement of the top 5% of the flow points. Three hundred adjacent frame pairs were used to determine this relationship. For each adjacent frame pair, the traditional optical flow method was used to calculate the flow points in the target region, and the average displacement of the top 5% of the flow points inside the region, \(V_{a}\), was calculated as an index. We also manually tracked the tendon motion for each adjacent frame pair. For each target region, by referring to the obtained \(V_{a}\), we found the percentage N for which the average displacement of the top N% of the flow points inside the region equaled the manually tracked displacement. The relationship between \(V_{a}\) and N obtained from the experiment is plotted in Fig. 5.
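The flow-point grid and the direction classification of Eq. (5) can be illustrated with a short sketch. The function names are hypothetical. The grid count assumes that each 17 × 17 window lies fully inside the region, which is an assumption of this example that happens to reproduce the 43 × 13 points reported above; the tie-breaking rule when both directions have the same number of points is likewise an assumption.

```python
def flow_point_count(region_len, window, step):
    """Number of window positions that fit along one axis of the region."""
    return (region_len - window) // step + 1


def direction(v):
    """Eq. (5): classify a horizontal displacement."""
    if v < 0:
        return "left"
    if v > 0:
        return "right"
    return "ignored"


def retain_major_direction(velocities):
    """Keep only the flow points moving in the most common direction."""
    left = [v for v in velocities if direction(v) == "left"]
    right = [v for v in velocities if direction(v) == "right"]
    # Assumption: a tie is resolved in favor of "right".
    return right if len(right) >= len(left) else left
```

For the 101 × 41-pixel region with a 17-pixel window and a 2-pixel increment, `flow_point_count` gives 43 points horizontally and 13 vertically.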
In the implementation, we chose the percentage of flow points to average (N) based on \(V_{a}\). For example, if the average displacement of the top 5% of the flow points is 1.5 pixels (\(V_{a}\) = 1.5), the displacement should be computed using the top 20% of the flow points (N = 20). The region displacement is then obtained by averaging the displacements of the selected flow points.
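A minimal sketch of this two-step averaging is given below. The names are hypothetical, and the mapping from \(V_{a}\) to N is passed in as a callable because the curve of Fig. 5 is not reproduced here; the only pair stated in the text is \(V_{a}\) = 1.5 and N = 20. Ranking the flow points by displacement magnitude is an assumption of this example.

```python
import math


def top_percent_mean(velocities, percent):
    """Mean of the largest-magnitude `percent` % of the displacements."""
    if not velocities:
        raise ValueError("no flow points")
    ranked = sorted(velocities, key=abs, reverse=True)
    # Always use at least one flow point.
    count = max(1, math.ceil(len(ranked) * percent / 100.0))
    return sum(ranked[:count]) / count


def region_displacement(velocities, n_from_va):
    """Region displacement from the retained flow points.

    n_from_va: callable mapping V_a to N, a stand-in for the
    experimentally determined relationship (hypothetical parameter).
    """
    v_a = top_percent_mean(velocities, 5.0)   # index V_a
    n = n_from_va(abs(v_a))                   # percentage N
    return top_percent_mean(velocities, n)
```

For instance, `region_displacement(retained, lambda v_a: 20.0)` averages the top 20% of the retained flow points regardless of \(V_{a}\); a real lookup would follow the measured curve instead of this constant placeholder.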
The result of the optical flow method was used to compute the accumulated displacement (\(D_{a}\)). If the magnitude of \(D_{a}\) was less than a predefined threshold value λ, the optical flow method was repeated with the subsequent frame. If the magnitude of \(D_{a}\) was larger than λ, the optical flow computation was terminated, and the frames processed so far formed a flow period. Within the flow period, the MKBM method was then applied to the starting frame (t) and the ending frame (t + n). For the MKBM procedure, an algorithm suited to tendon tracking is proposed (Fig. 6).
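The formation of flow periods can be sketched as a simple accumulation loop. The function name and the list-based interface are hypothetical, and the handling of frames left over at the end of the sequence is an assumption of this example rather than a documented part of the method.

```python
def split_flow_periods(frame_displacements, lam):
    """Group frames into flow periods.

    frame_displacements[k] is the optical-flow region displacement from
    frame k to frame k + 1. Returns (start, end) frame index pairs; MKBM
    would then be applied to frames `start` and `end` of each pair.
    """
    periods = []
    start = 0
    accumulated = 0.0  # D_a of the current period
    for k, d in enumerate(frame_displacements):
        accumulated += d
        if abs(accumulated) > lam:
            # Threshold exceeded: close the period at frame k + 1.
            periods.append((start, k + 1))
            start = k + 1
            accumulated = 0.0
    # Assumption: leftover frames at the end form a final, shorter period.
    if start < len(frame_displacements):
        periods.append((start, len(frame_displacements)))
    return periods
```

With per-frame displacements of 0.4 pixels and λ = 1, for example, the first period closes after three frame steps, when the accumulated displacement first exceeds the threshold.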
As in the method described in Korstanje et al. [13], we first divided the reference block into four sub-blocks with ten overlapping pixels. For computational speed, the sum of absolute differences (SAD), rather than the normalized correlation coefficient, is used as the similarity measure for each sub-block:
$$SAD = \frac{1}{MN}\sum\limits_{j = 1}^{N} {\sum\limits_{i = 1}^{M} {\left| {T_{i,j} - R_{i,j} } \right|} } ,$$
(6)
where M and N are the width and height of the sub-block, and \(T_{i,j}\) and \(R_{i,j}\) are the intensity values of pixel (i, j) in the target block and reference block, respectively; a lower SAD indicates a closer match. Because the soft tissue adjacent to the tendon might move passively with a smaller displacement, we computed the block displacement as the maximum of the four sub-block displacements:
$$D_{t} = \max \left( {D_{t,1} ,D_{t,2} ,D_{t,3} ,D_{t,4} } \right),$$
(7)
where \(D_{t,n}\) is the displacement of the nth sub-block at the tth frame. Although the displacement between the starting and ending frames was thus obtained, the displacements of the individual frames in between were unknown. Because the optical flow method can track small displacements with little underestimation, the displacement of each frame between t and t + n can be interpolated using the results of the optical flow and MKBM methods:
$$d_{OFTB\_MKBM} (t + i) = d_{MKBM} (t) + (d_{OF} (t + i)-d_{OF} (t))\times\frac{{d_{MKBM} (t + n)-d_{MKBM} (t)}}{{d_{OF} (t + n)-d_{OF} (t)}},\quad 0 \le i \le n,$$
(8)
where \(d_{MKBM}(t)\) and \(d_{OF}(t)\) are the displacements at the tth frame computed using the MKBM and optical flow methods, respectively.
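The SAD measure of Eq. (6), the sub-block selection of Eq. (7), and the interpolation over one flow period can be sketched together as follows. All function names are hypothetical. Two assumptions are made in this example: the maximal sub-block displacement is selected by magnitude so that leftward (negative) motion is handled, and the interpolation starts from the MKBM displacement of the starting frame so that the result coincides with the MKBM values at both ends of the period.

```python
def sad(target, reference):
    """Eq. (6): mean absolute difference of two equal-size blocks.

    Blocks are lists of rows; M is the width and N the height.
    """
    n_rows = len(reference)
    m_cols = len(reference[0])
    total = sum(
        abs(t - r)
        for t_row, r_row in zip(target, reference)
        for t, r in zip(t_row, r_row)
    )
    return total / (m_cols * n_rows)


def block_displacement(sub_block_displacements):
    """Eq. (7): pick the sub-block displacement of largest magnitude."""
    return max(sub_block_displacements, key=abs)


def interpolate_period(d_of, d_mkbm_start, d_mkbm_end):
    """Interpolate per-frame displacements over one flow period.

    d_of: optical-flow displacements at frames t .. t+n.
    d_mkbm_start, d_mkbm_end: MKBM displacements at frames t and t+n.
    """
    of_span = d_of[-1] - d_of[0]
    if of_span == 0:
        raise ValueError("optical flow reports no motion over the period")
    # Rescale the optical-flow trend to the MKBM displacement.
    scale = (d_mkbm_end - d_mkbm_start) / of_span
    return [d_mkbm_start + (d - d_of[0]) * scale for d in d_of]
```

In this form the optical flow result supplies only the shape of the motion within the period, while MKBM fixes its overall amplitude.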