### Hardware Setup and Software Flow Chart

For acquiring 2D ultrasound images, we use an Acuson Sequoia C256 scanner (Siemens, USA) with a 7 MHz array transducer probe 7V3C (see Figure 1). A six-degree-of-freedom electromagnetic position and orientation measurement (EPOM) device, the Flock of Birds (FOB) (Ascension, Burlington, VT, USA), is used to record the 3D location of each 2D image. For accurate 3D reconstruction, a calibration that determines the spatial relation between the sensor and the 2D images is performed [4, 39].

Figure 2 shows the system software flow chart. We start image acquisition with breath-holding and ECG gating at the standard acoustic window. This avoids cardiac deformation due to respiration and cyclic cardiac motion, so that the only significant source of misregistration is the rigid movement of the patient. The position and orientation of the transducer associated with each acquired 2D image (640 × 480) are saved in the computer. The region of interest (ROI) is quickly outlined to reduce the computational and memory requirements. 2D+T image sequences are segmented automatically to identify the endocardial boundaries. The 3D surface of the LV is reconstructed with automated registration using the segmented boundary walls. The software is developed in the C programming language and MATLAB (MathWorks).

### Hybrid Gradient Vector Flow (GVF) Geometric Active Contour (GAC) Model

Level-set segmentation allows the development of image segmentation methods that can adapt to complex object boundaries. We developed a new hybrid model that delivers accurate segmentation results from relatively simple initializations.

The level-set method was first introduced by Osher and Sethian [40]. The authors model a propagating front Γ(*t*) as the zero level set of a higher-dimensional function *φ*(*x*, *y*, *t*). Initially, we have:

\phi \left(x,y,t=0\right)=\pm d

where *d* is the signed distance from point (*x*, *y*) to the boundary of a region Ω (Γ(*t*) bounds Ω) at *t* = 0. If *d* = 0, the point (*x*, *y*) is on the boundary. For points inside the initial boundary, *φ*(*x*, *y*, *t* = 0) takes negative values; for points outside, it takes positive values.
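As an illustration, this signed-distance initialization can be computed directly from a binary mask of the region Ω. The sketch below is ours, in Python with NumPy/SciPy; the square mask and grid size are hypothetical.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def init_level_set(mask):
    """Signed distance phi(x, y, t=0): negative inside Omega,
    positive outside, smallest in magnitude near the boundary."""
    outside = distance_transform_edt(~mask)  # distance to Omega, for outside points
    inside = distance_transform_edt(mask)    # distance to background, for inside points
    return outside - inside

# Hypothetical square region on a 64 x 64 grid
mask = np.zeros((64, 64), dtype=bool)
mask[20:40, 20:40] = True
phi0 = init_level_set(mask)   # phi0 < 0 inside, > 0 outside
```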

Osher and Sethian [40] showed that *φ*(*x*, *y*, *t*) evolves according to the following dynamic equation:

\frac{\partial \phi}{\partial t}+\overrightarrow{F}\cdot \nabla \phi =0

with the initial condition *φ*(*x*, *y*, 0) = *φ*_{0}(*x*, *y*). \overrightarrow{F} is the propagation velocity function on the interface; only its normal component is needed. The outward unit normal vector to the level set curve is given by:

\overrightarrow{n}=\frac{\nabla \phi}{\left|\nabla \phi \right|}.

The evolution equation becomes

\frac{\partial \phi}{\partial t}+F\cdot \left|\nabla \phi \right|=0

where *F* is the normal component of \overrightarrow{F}, given by

F=\overrightarrow{F}\cdot \frac{\nabla \phi}{\left|\nabla \phi \right|}.
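A single explicit time step of this evolution can be sketched as follows. This is our illustrative Python/NumPy sketch using central differences for brevity; a production implementation would use an upwind (Godunov) scheme for numerical stability, and the circular initialization and speed *F* are hypothetical.

```python
import numpy as np

def evolve_step(phi, F, dt=0.1):
    """One explicit step of  d(phi)/dt = -F * |grad(phi)|.
    Central differences for brevity; upwind schemes are the
    stable choice in practice."""
    gy, gx = np.gradient(phi)
    return phi - dt * F * np.hypot(gx, gy)

# Signed distance to a circle of radius 10 centered at (32, 32)
phi = np.fromfunction(lambda i, j: np.hypot(i - 32.0, j - 32.0) - 10.0,
                      (64, 64))
phi = evolve_step(phi, F=1.0)   # F > 0 moves the front outward
```

Since |∇φ| = 1 for a signed distance function, each step with F = 1 lowers φ by dt, expanding the zero level set.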

For image segmentation applications, the goal is to design the speed function *F* so as to propagate the evolving curve to the edges in the image. There are three main types of motion in curve evolution [41]. We write:

F={F}_{norm}+{F}_{curv}+{F}_{adv}

where *F*_{norm} denotes motion in the normal direction, *F*_{curv} denotes motion in the curvature direction, and *F*_{adv} denotes motion due to an externally generated velocity field, independent of the front itself.

The new hybrid deformable model is given by:

\frac{\partial \phi}{\partial t}=g\epsilon \kappa \left|\nabla \phi \right|-\left\{(1-s\left(x,y\right))\left[{\beta}_{1}\left(\left(u\left(x,y\right),v\left(x,y\right)\right)\cdot \nabla \phi \right)\right]+s\left(x,y\right){\beta}_{2}\nabla g\cdot \nabla \phi \right\}

The first term on the right-hand side describes motion in the curvature direction, while motion due to the external velocity field is captured by the second and third terms. Here, *ε* is a constant, *κ* denotes the curvature, and *g* is an edge function, defined as an enhanced edge indicator applied to a Gaussian-smoothed image:

g\left(x,y\right)=\frac{1}{1+{\left(\frac{\left|\nabla \left({G}_{\sigma}\left(x,y\right)\ast I\left(x,y\right)\right)\right|}{\alpha}\right)}^{2}}

where *α* is a constant strength coefficient. *g* is close to zero in regions where the gradient is high, and is approximately one in homogeneous regions.
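For concreteness, *g* can be computed as below. This is our Python/SciPy sketch; the values of σ and α here are illustrative, not the values used in this work.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def edge_indicator(image, sigma=1.5, alpha=10.0):
    """g = 1 / (1 + (|grad(G_sigma * I)| / alpha)^2):
    close to 1 in homogeneous regions, close to 0 at strong edges."""
    smoothed = gaussian_filter(image.astype(float), sigma)
    gy, gx = np.gradient(smoothed)
    return 1.0 / (1.0 + (np.hypot(gx, gy) / alpha) ** 2)

img = np.zeros((32, 32))
img[:, 16:] = 100.0               # a vertical step edge
g = edge_indicator(img)           # small near column 16, ~1 elsewhere
```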

To minimize edge leakage, the expansion term in the normal direction is excluded from this model.

To allow the deformable model to be initialized away from the object boundary, the GVF vector field (*u*(*x*, *y*), *v*(*x*, *y*)) is used as the external driving force at the beginning. It diffuses the image gradient toward homogeneous regions, allowing curve evolution in edge-free regions. It also allows for bi-directional flow that propagates the curve toward the object boundary from either inside or outside. The edge indicator function *g* is also used to control the strength of the advection term.
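The GVF field itself is obtained by iteratively diffusing the gradient of an edge map (Xu and Prince's formulation). A minimal sketch, with illustrative values of μ, the step size, and the iteration count:

```python
import numpy as np
from scipy.ndimage import laplace

def gvf(f, mu=0.2, dt=0.5, n_iter=100):
    """Gradient vector flow of an edge map f: iteratively solve
    u_t = mu * Lap(u) - |grad f|^2 * (u - f_x)  (likewise for v),
    which spreads grad(f) into homogeneous regions."""
    fy, fx = np.gradient(f)
    mag2 = fx ** 2 + fy ** 2
    u, v = fx.copy(), fy.copy()
    for _ in range(n_iter):
        u += dt * (mu * laplace(u) - mag2 * (u - fx))
        v += dt * (mu * laplace(v) - mag2 * (v - fy))
    return u, v

f = np.zeros((16, 16))
f[8, 8] = 1.0                 # a single edge "spot"
u, v = gvf(f)                 # far-away vectors point toward the spot
```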

Unfortunately, the GVF field can push the curve through poor edges, causing edge leakage. A step function *s*(*x*, *y*) is used to control the external force. Initially, it is zero, and the advection force (the GVF field) drives the evolving curve rapidly toward the object boundary, even in a homogeneous field. As the segmentation curve approaches the true boundary, the edge map assumes higher values. To detect when the evolving curve is approaching the target boundary, we evaluate the average of the edge map over the current zero level set at each iteration. When this average exceeds a threshold, we turn on the step function. The advection term is then dominated by the vector ∇*g*, which prevents the evolving front from passing through weak boundaries or small boundary gaps. We define *s*(*x*, *y*) as

s\left(x,y\right)=\{\begin{array}{cc}0,& Av{e}_{\phi \left(x,y,t\right)=0}\left\{f\left(x,y\right)\right\}<{T}_{res}\\ 1,& Av{e}_{\phi \left(x,y,t\right)=0}\left\{f\left(x,y\right)\right\}\ge {T}_{res},\end{array}

where *f*(*x*, *y*) is the edge map function, defined by:

f(x,y)={\left(\frac{\left|\nabla ({G}_{\sigma}(x,y)*I(x,y))\right|}{\alpha}\right)}^{2}.
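The switching test can be sketched as follows. In this Python sketch of ours, the zero level set is approximated by a narrow band around φ = 0, and the threshold value is illustrative.

```python
import numpy as np

def step_on(phi, f, T_res=50.0, band=1.0):
    """Average the edge map f over a narrow band around {phi = 0};
    s(x, y) switches from 0 to 1 once the average reaches T_res."""
    near_zero = np.abs(phi) < band
    if not near_zero.any():
        return False
    return bool(f[near_zero].mean() >= T_res)

phi = np.fromfunction(lambda i, j: np.hypot(i - 16.0, j - 16.0) - 8.0,
                      (32, 32))
weak = step_on(phi, np.zeros((32, 32)))          # far from edges: s stays 0
strong = step_on(phi, np.full((32, 32), 100.0))  # near edges: s -> 1
```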

### Segmentation Parameter Optimization

Corsi *et al.* [33] set the parameters empirically: *α* = 0.1, *β* = 6, *ε* = 0.5. For the new model, a segmentation parameter optimization is also implemented.

The proposed hybrid model requires pre-setting a single threshold parameter. Optimizing this parameter is important because it can affect the performance of the segmentation method. For example, if the threshold value is too low, the hybrid method may not reach the true boundary because of the relaxed initialization; for very high values, the evolving curve may pass the true boundary, resulting in leakage at the edges. For that reason, it is necessary to consider a wide range of threshold values in the optimization. To cover that range, a logarithmic sampling of the threshold values is chosen: *T*_{res} = [5, 20, 50, 125.6, 315.5, 792.4, 1990.5, 5000].

The same logarithmic scale is also considered for the values of the *ε*, *β*_{1}, *β*_{2} parameters. A total of 10 different values per parameter is set in the following way: *ε* = [0.1, 0.21, 0.46, 1, 2.15, 4.64, 10, 21.54, 46.42, 100] and *β*_{1} = *β*_{2} = [0.6, 1.29, 2.79, 6, 12.92, 27.85, 60, 129.27, 278.50, 600].

To evaluate segmentation performance for each parameter combination, representative images are selected. Then, a simple curve, placed inside the ROI but far from the true boundary, is provided to the algorithm as an initialization.

Two metrics are used to determine the optimal parameters: Hausdorff distance and mean absolute difference (MAD) between the manually and automatically segmented boundaries. The Hausdorff distance measures the distance between the points on two curves that differ the most, while MAD provides the mean difference between the two curves. Finally, the minimum values of both metrics determine the optimal parameters.
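The two metrics can be computed from sampled boundary points as follows. This is our Python/SciPy sketch; the symmetric form of MAD below is our assumption, since the exact formula is not spelled out here.

```python
import numpy as np
from scipy.spatial.distance import cdist

def boundary_metrics(A, B):
    """Hausdorff distance and mean absolute difference (MAD) between
    two boundaries given as (N, 2) arrays of points."""
    D = cdist(A, B)              # all pairwise Euclidean distances
    d_ab = D.min(axis=1)         # each point of A to its nearest in B
    d_ba = D.min(axis=0)
    hausdorff = max(d_ab.max(), d_ba.max())
    mad = 0.5 * (d_ab.mean() + d_ba.mean())
    return hausdorff, mad

A = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
B = A + [0.0, 1.0]               # same curve shifted by one pixel
h, m = boundary_metrics(A, B)    # both metrics equal 1.0 here
```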

### Multi-view Reconstruction with Automatic Registration

Automatic registration is a required and important step for combining acquisitions from different views. Misregistration is a major problem in freehand 3D ultrasound that affects the accuracy of the reconstruction and volume estimation. In general, there are three sources of misregistration in freehand 3D ultrasound: (i) spatial location error of the 2D images, (ii) target movement during intra-view (deformation by cardiac motion and respiration) and inter-view scans (rigid movement of the subject), and (iii) unstable probe pressure applied to the scanning surface. The first error is largely reduced by the electromagnetic interference detector in the system [10]. Misregistration due to unstable probe pressure is reduced by the short acquisition time (about 15 seconds) for each view. Intra-view deformation is addressed by breath-holding and ECG gating. Rigid movement of the subject between views causes the majority of the registration errors.

Our basic assumption for achieving automatic registration is that there is partial overlap between the image acquisitions from different views, and that images from different views share common features or anatomical structures, such as chamber wall surfaces. As pointed out earlier in this paper, purely voxel-intensity-based registration can lead to significant errors due to ultrasound gain variation, speckle noise, and viewing artefacts. Instead of intensity-based registration, we use a feature-based geometric approach.

The basic idea is to reconstruct each view in 3D and then register the views together. The reconstructed 3D surfaces are obtained by 3D reconstruction of the 2D planes, where each plane is generated as the difference between two binary images. First, we generate a binary image of the segmented region that contains all pixels falling inside the object of interest. The second binary image is generated by eroding the first with a circular structuring element of radius 4 to 8 pixels (based on target size). The difference image captures the boundary wall. A 3D reconstruction of these 2D planes yields a 3D binary surface model.
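The wall-extraction step can be sketched as below (our Python/SciPy sketch; the mask and the radius value are illustrative).

```python
import numpy as np
from scipy.ndimage import binary_erosion

def wall_mask(region, radius=4):
    """Boundary wall = segmented region minus its erosion by a
    circular structuring element of the given radius (4-8 pixels)."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    disk = x ** 2 + y ** 2 <= radius ** 2
    return region & ~binary_erosion(region, structure=disk)

region = np.zeros((64, 64), dtype=bool)
region[16:48, 16:48] = True            # hypothetical segmented region
wall = wall_mask(region)               # a thin band along the border
```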

We note that registration is possible as long as the reconstructed wall surfaces exhibit some overlap. To satisfy the partial-overlap criterion, we require that at least one of the views is a full sweep covering the entire object of interest. We expect that the distinctive appearance of chamber wall surfaces will guarantee the existence of a unique global minimum for the registration parameters.

To reach the global optimum, we first apply a global registration method using the 3D Hotelling transform to construct an object-based reference volume. This avoids local minima and ensures significant overlap between the different view acquisitions. We then perform a higher-accuracy registration using a robust, non-linear least-squares algorithm to achieve the optimal parameters.

An overview of the registration algorithm is given in Figure 3. Reconstructions from different views are rigidly registered to the same object-based reference volume using the 3D Hotelling transform. The resulting parameters then initialize a more accurate registration procedure: a non-linear least-squares method, the Levenberg-Marquardt algorithm, finely estimates the optimal registration parameters.
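The 3D Hotelling (principal-axes) initialization amounts to translating each reconstruction to its centroid and rotating it onto the eigenvectors of its covariance matrix. A minimal sketch of ours in Python/NumPy; in practice the sign ambiguity of eigenvectors requires reflection checks when matching two views.

```python
import numpy as np

def hotelling_align(points):
    """Map an (N, 3) point cloud into an object-based reference frame:
    subtract the centroid, then rotate onto the principal axes
    (eigenvectors of the covariance matrix), major axis first."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    _, eigvecs = np.linalg.eigh(cov)       # eigenvalues ascending
    return centered @ eigvecs[:, ::-1]     # major axis first

rng = np.random.default_rng(0)
cloud = rng.normal(size=(500, 3)) * [10.0, 3.0, 1.0]  # elongated cloud
aligned = hotelling_align(cloud)
# Centroid is now ~0 and the coordinate axes are decorrelated.
```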

We also provide a more formal description of the procedure. Suppose we want to register two sets of image sequences acquired from two different views *V*_{1} and *V*_{2}. Let *N*_{p}, *N*_{q} be the number of pixels inside the wall surfaces in sets *P*, *Q* respectively. We write:

\begin{array}{cc}P=\left\{{\text{p}}_{i}\right\},& i=1,2,\cdots ,{N}_{p}\end{array}

\begin{array}{cc}Q=\left\{{\text{q}}_{j}\right\},& j=1,2,\cdots ,{N}_{q}\end{array}

where **p**_{i}, **q**_{j} are 3D voxel coordinates from the two views:

{\text{p}}_{i}={\left[{x}_{i},{y}_{i},{z}_{i}\right]}^{T}

{\text{q}}_{j}={\left[{x}_{j},{y}_{j},{z}_{j}\right]}^{T}.

We then reconstruct the 3D volume with the largest number of 2D image planes (for example, view *V*_{1}) over a regular Cartesian grid, and register the 2D image slices from the remaining views (for example, view *V*_{2}) to it.

The data points from the second view *V*_{2} are transformed using the initial transformation obtained from the 3D Hotelling transform, denoted *T*(**q**_{j}). We interpolate the intensity values at *T*(**q**_{j}) using the image points of *I*(**p**_{i}) in the 3D Cartesian grid volume. The optimal registration transformation is the one that minimizes the mean square error of the objective function:

f\left(P,T(Q)\right)=\frac{1}{n(T)}{{\displaystyle \sum _{{p}_{i},{q}_{j}\in O(T)}\left[{I}_{R}\left({p}_{i}\right)-{I}_{N}\left(T\left({q}_{j}\right)\right)\right]}}^{2}

where *O* is the overlapping region between the two volumes, *I*_{R} refers to the reference 3D reconstruction, *I*_{N} refers to the "new" 3D reconstruction to register, and *n*(*T*) is the number of voxels within *O*. Once the images are registered, the 3D reconstructed volume is obtained by averaging the intensities from the different view volumes to attenuate artefacts and reduce noise.
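Putting the pieces together, the fine registration can be sketched as a least-squares fit of a rigid transform, with trilinear interpolation into the reference volume. This is our illustrative Python/SciPy sketch: the toy Gaussian-blob volume, the voxel subsampling, and the Euler-angle parameterization about the volume center are our own choices, not the implementation described here.

```python
import numpy as np
from scipy.ndimage import map_coordinates
from scipy.optimize import least_squares

CENTER = np.array([16.0, 16.0, 16.0])   # illustrative volume center

def rigid_transform(q, params):
    """Rotate (Euler angles, radians, about CENTER) then translate."""
    ax, ay, az, t0, t1, t2 = params
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return (q - CENTER) @ (Rz @ Ry @ Rx).T + CENTER + np.array([t0, t1, t2])

def residuals(params, I_ref, q, I_new_vals):
    """I_R sampled at T(q_j) versus the intensities I_N carried by the
    new view's voxels; least_squares squares and sums these."""
    tq = rigid_transform(q, params)
    ref_vals = map_coordinates(I_ref, tq.T, order=1, mode='nearest')
    return ref_vals - I_new_vals

# Toy reference volume: a Gaussian blob; the "new" view's intensities
# correspond to reference positions shifted by +2 along axis 0.
zz, yy, xx = np.mgrid[0:32, 0:32, 0:32]
I_ref = np.exp(-((zz - 16.0) ** 2 + (yy - 16.0) ** 2
                 + (xx - 16.0) ** 2) / 50.0)
q = np.stack([zz, yy, xx], axis=-1).reshape(-1, 3)[::37].astype(float)
I_new_vals = map_coordinates(I_ref, (q + [2.0, 0.0, 0.0]).T, order=1)
fit = least_squares(residuals, x0=np.zeros(6), method='lm',
                    args=(I_ref, q, I_new_vals))
# fit.x should recover a translation of ~2 along the first axis.
```

The 3D Hotelling result would supply `x0` in practice; here the identity suffices because the simulated offset is small.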