Segmentation of pulmonary nodules using adaptive local region energy with probability density function-based similarity distance and multi-features clustering

Background Pulmonary nodules in computerized tomography (CT) images are potential manifestations of lung cancer. Segmentation of potential nodule objects is the first necessary and crucial step in computer-aided detection system of pulmonary nodules. The segmentation of various types of nodules, especially for ground-glass opacity (GGO) nodules and juxta-vascular nodules, present various challenges. The nodule with GGO characteristic possesses typical intensity inhomogeneity and weak edges, which is difficult to define the boundary; the juxta-vascular nodule is connected to a vessel, and they have very similar intensities. Traditional segmentation methods may result in the problems of boundary leakage and a small volume over-segmentation. This paper deals with the above mentioned problems. Methods A novel segmentation method for pulmonary nodules is proposed, which uses an adaptive local region energy model with probability density function (PDF)-based similarity distance and multi-features dynamic clustering refinement method. Our approach has several novel aspects: (1) in the proposed adaptive local region energy model, the local domain for local energy model is selected adaptively based on k-nearest-neighbour (KNN) estimate method, and measurable distances between probability density functions of multi-dimension features with high class separability are used to build the cost function. (2) A multi-features dynamic clustering method is used for the segmentation refinement of juxta-vascular nodules, which is based on the nodule segmentation using active contour model (ACM) with adaptive local region energy and vessel segmentation using flow direction feature (FDF)-based region growing method. (3) it handles various types of nodules under a united framework. Results The proposed method has been validated on a clinical dataset of 113 chest CT scans that contain 157 nodules determined by a ground truth reading process, and evaluating the algorithm on the provided data leads to an average Tanimoto/Jaccard error of 0.17, 0.20 and 0.24 for GGO, juxta-vascular and GGO juxta-vascular nodules, respectively. Conclusions Experimental results show desirable performances of the proposed method. The proposed segmentation method outperforms the traditional methods.


Background
Pulmonary nodules in high resolution computerized tomography (CT) images are potential manifestations of lung cancer. As pointed by literatures [1,2,3,4], despite much effort being devoted to the nodule segmentation problem, segmentation for various types of pulmonary nodules remains an ongoing research topic [4]. Some classical pulmonary nodules are shown in Fig. 1. One of the major difficults is the task of segmentation of non-solid and part-solid ground-glass opacity (GGO) nodules with faint contrast and fuzzy margins (shown as Fig. 1d). In particular, non-solid nodules are extremely subtle with fuzzy boundaries, and part-solid nodules exhibit highly irregular intensity variations (called intensity inhomogeneity) and boundary shapes. Studies have shown that nodules of non-solid and part-solid nature are frequent and have higher risks of being malignant than solid ones [1]. Additionally, there is difficulty associated with the segmentation of nodules that are adjacent to vessels when they have very similar intensities (shown as Fig. 1c Juxta-vascular nodule); and are nonspherical in shape. Juxta-vascular nodules account for the largest typology of lung nodules [2]. Thus, handling them under a united framework poses a great challenge to the task of segmentation of pulmonary nodules. Although various algorithms have been reported in literatures [1,2,3,4,5] for tackling these problems, the technical issues of segmentation still remain. Traditional segmentation methods, such as purely intensity thresholding or model-based segmentation methods, may fail to segment various types of nodules, especially GGO and juxta-vascular nodules, leading to boundary leakage or over-segmentation. All these factors lead to the belief that the field is relatively new and requires further investigation. This paper deals with the above mentioned problems. In this paper, a novel segmentation method is proposed for various types of pulmonary nodules in CT images, especially for GGO nodules (part-solid and nonsolid) and juxta-vascular nodules.

Previous work on segmentation of pulmonary nodules
The segmentation under a united framework of kinds of pulmonary nodules, especially for GGO nodules and juxta-vascular nodules, is a very difficult task. Active contour models (ACMs) have been one of the most successful methods for image segmentation [6,7,8,9,10], even in the segmentation of pulmonary nodules [11]. However, GGO nodules possess weak edges, intensity inhomogeneity and irregular shape, so when they is segmented by using traditional edge-based ACMs [12], region-based ACMs [6,7], or even some complex integrated ACMs combine edge and region energy [13,14], they  Li et al. BioMed Eng OnLine (2016) 15:49 often occur the problem of boundary leakage. Generally, the model using local information commonly can obtain better performance than that of global statistic information in solving the segmentation problems of intensity inhomogeneity and zigzag edge [7,15,16,17]. The idea of defined local neighbor is more reasonable, especially for segmenting the zigzag and inhomogeneous edge. Li et al. [7] presented a region-based active contour model (ACM) model, in which a data fitting energy is introduced to solve the problem of intensity inhomogeneity. Lankton et al. [16] summarized and proposed different segmentation models based on local information. But typically, the segmentation curve cannot obtain exact edge or deviates from the objects if the local domain is too large or too small [17]. Further, an active contour model based on nonparametric independent and identically distributed statistics of the image may segment an image according to the particular global to local strategy. The local histogram method using Wasserstein distance to measure distribution distance has a good performance in segmenting cluttered scenes [15]. However,the estimation effect in image segmentation is severely influenced by the small cabin volume and the sample distribution, when the pixel density functions are estimated by histogram method. It is difficult to acquire an ideal segmentation effect only by relying on general intensity-data-driven segmentation methods. In order to overcome the limitation, Krinidis et al. [18] used fuzzy energy to solve the problem of "weak" local minima. Also Assen et al. [19] presented a 3D ACM drived by fuzzy inference for cardiac CT and MR images. Zhang et al. [20] added the Bayesian error of edge direction and region statistical information into the ACM model, to improve the convergence speed.
As mentioned above, juxta-vascular nodules account for the largest typology of lung nodules [2]. So besides handling GGO nodules with intensity inhomogeneity and weak edges, it is also important for a segmentation algorithm to be able to treat juxta-vascular nodules. In clinic application, even some nodules are not only GGO but also juxta-vascular nodules. How to handle various types of pulmonary nodules, including GGO nodules and juxta-vascular nodules, under a united framework presents a great challenge [2,3]. Since vessels can be characterized by the tubular models and a pulmonary nodule is a small round or oval-shaped growth in the lung, so many approaches based on morphological operators [2,21,22] have been proposed to segment the juxta-vascular nodules. However the sizes and shapes of vessels as well as those of nodules are irregular, it may lead to the problem of a small volume overestimation if only morphological correction is relied upon. Hence a better segmentation refinement method should be taken into consideration furtherly. Besides intensity feature, the analysis of the shape of pulmonary structures has often been adopted to recognize small lung nodules from the background anatomy [2]. However, approaches utilizing simple criteria like shape rule or gray value evidence are typically not suitable to differentiate between different tubular tree structures and nodules. Lung nodules are embodied in a complex and structured background. Their identification and segmentation is usually affected by surrounding anatomical objects [2]. So, in a broad sense, the feature space for the recognition of nodules should be embed more prior information, including the target structures [23,24].
To our knowledge, there are few literatures aimed at handing GGO and juxta-vascular nodules under a united framework and multi-features classification space. In our previous work, we have built a very preliminary fuzzy integrated ACM incorporated multi-features analysis to realize segmentation of GGO and juxta-vascular nodules [14,25,26]. This paper deals with the above mentioned problems further. In our present study, the segmentation problem is converted into the optimization problem of measurable distance between probability density functions of multi-features. A multi-features dynamic clustering method is used for the segmentation refinement of juxta-vascular nodules.

Our approach
In this paper, a novel segmentation method for pulmonary nodules, especially for GGO nodules and juxta-vascular nodules in CT images is proposed, which uses an adaptive local region energy model with probability density function (PDF)-based similarity distance and multi-features dynamic clustering refinement method. The flowchart of the proposed segmentation algorithm for pulmonary nodules under a united framework is shown as Fig. 2.
Compared with existing traditional methods, our approach has several novel aspects: (1) in the proposed adaptive local region energy model, the local domain for local energy model is selected adaptively based on k-nearest-neighbour (KNN) estimate method, and measurable distances between probability density functions of multi-dimension features Fig. 2 Overview of the proposed segmentation method for pulmonary nodules Page 5 of 28 Li et al. BioMed Eng OnLine (2016) 15:49 with high class separability are used to build the cost function. (2) A multi-features dynamic clustering method is used for the segmentation refinement of juxta-vascular nodules, which is based on the nodule segmentation using active contour model with adaptive local region energy and vessel segmentation using flow direction feature (FDF)based region growing method. (3) It handles various types of nodules under a united framework.
The remainder of this paper is organized as follows. In "The proposed integrated ACM with adaptive local region energy and PDF-based similarity distance" section and "Segmentation refinement of juxta-vascular nodules based on multi-features dynamic clustering" section, the proposed segmentation methods of pulmonary nodules are introduced. The experimental results of our method are given in "Experimental results" section, followed by some discussions in "Discussion" section. This paper is summarized in "Conclusions" section.

The proposed integrated ACM with adaptive local region energy and PDF-based similarity distance
We will approach the segmentation problem as a typical optimization task. Thus we need to adopt an appropriate model of energy function as the cost function. The proposed integrated ACM with adaptive local region energy and PDF-based similarity distance differs from the model used in literatures [7,13,14,18,19], which has the following characteristics.
(1) The KNN-based adaptive local energy function model is proposed. The nodule with GGO characteristic is either part-solid or nonsolid, in which case it possesses typical weak edges and intensity inhomogeneity. The model using local information commonly can obtain better performance than that of global statistic information in solving the segmentation problem [7,15,16]. The local domain for local energy model is selected adaptively, which is approached as nonsupervised recognition problem and realized by a KNN estimate method based on medical prior knowledge. It will be explained in detail in "The proposed KNN-based adaptive local energy function" section. (2) The segmentation problem is converted into the optimization problem of measurable distance between PDF of multi-features. The Bhattacharyya distance function is applied to measure the distance of PDF between the foreground and background, and the PDF in the local region are measured by Wasserstein distance. It will be explained in detail in "The similarity distance based on adaptive local region probability density" section and "The PDF-based Bhattacharyya similarity distance for global energy" section. (3) Multi-features information with high class separability is reflected and used in the proposed integrated ACM model. It will be explained in detail in "Generation of multi-dimension feature with high class separability" section.
Taking both the edge, local and global region information into consideration, our proposed energy function of integrated active contour model E is given as Eq. (1).
(1) E = E edge + E local + E global Page 6 of 28 Li et al. BioMed Eng OnLine (2016) 15:49 where E is the proposed energy function model; E edge it the edge-driven energy term, which is used for curved surface to improve the evolved ability in concave regions; E local is the local-region-driven energy term; E global is the global-region-driven energy term. E local and E global are used to control the image force based on statistical multi-features information in the region and move the curved surface in the decrescence direction of feature variance.
Let ⊂ R 3 be the image domain, and D : → R be the given medical CT image sequence or 3D data set. The segmentation result of the images or data set (for 3D data set) D is achieved by finding a surface φ, which separates Ω into disjoint regions. Ω 1 and Ω 2 represent the inside regions and outside regions of φ, respectively. Besides intensity, more features are used in our active contour model. Our proposed detailed energy function model is given as Eq. (2) based on Eq. (1). Why and how to build the energy term model as Eq. (2) will be explained in detail in the following sections (from "The proposed KNN-based adaptive local energy function", "The similarity distance based on adaptive local region probability density", "Generation of multi-dimension feature with high class separability", "The PDF-based Bhattacharyya similarity distance for global energy" section).
where E(φ) is the proposed energy function model; v x, y, z ∈ is a given pixel/ voxel. In term E local of Eq. (2), λ 1 and λ 2 are the weights for region-driven energy term. The membership function u(v) ∈ [0, 1] is the degree of membership of D, and m is a weighting exponent on each fuzzy membership. The degree of membership is decided by multi-features value L. L is the value domain of feature space. H φ is the smoothed Heaviside function. P(v) is the probability density of multi-features vector O of v; P 1 (τ) and P 2 (τ) are the means of probability density in local region Ω τ selected adaptively. In term E global of Eq. (2), P − (τ) and P + (τ) are the kernel-based estimates of the image features observed over the sub-domains Ω − and Ω + . E edge is similar to our previous work [14], which is used for curved surface to improve the evolved ability in concave regions.
In term E edge of Eq. (2), μ is the weight of edge-driven energy term; δφ(v) is the smoothed version of the Dirac delta; g is the stop function.

The proposed KNN-based adaptive local energy function
The model using local information commonly can obtain better performance. The proposed local energy function model is inspired initially by literature [7,15]. But in our proposed local energy function model, the local domain is not fixed, but flexible and adaptive, which is realized by a KNN estimate method here. In order to segment elaborately for the image with zigzag edge and noise interference, we construct k-nearest neighbors and estimate the corresponding probability density functions with Parzen window method in each pixel/voxel. As we have known, lung nodules, especially malignant nodules, sometimes show spiculation and lobulation signs in CT images. Typically, the segmentation curve deviates from the objects if the local neighbor radius is too large or too small. Consider the synthetic image in Fig. 3. The segmentation results are very different with various radii with the implementation of local histogram-based segmentation method using the Wasserstein Distance [15]. For instance, conventional local segmentation algorithms often suffer from disturbances induced by the mass of irrelevant information when the radius of the neighbor increases. To improve the ability to segment precisely the object where is intensity inhomogeneity, zigzag and difficult to define the boundary, the local region Ω τ where y ∈ N k , z ∈ N c ; d O(x), O y is the similarity distance, as Eq. (4).

where J(O) is the normalized multi-features vector O.
In the KNN-based adaptive local energy function model, the parameters r and k are determined by medical prior knowledge and the physical resolution of the pixel in CT imaging. As shown in Fig. 4, the local region Ω τ in 2-dimensional space is determined by the parameters r and k. Rules for determining the parameters r and k, as well as their reasons, are as follows. Reasons are below: (1) The best choice of k depends upon the data. Generally, larger values of k reduce the effect of noise on the classification, but make boundaries between classes less distinct. (2) A pulmonary nodule is a small round or oval-shaped growth in the lung. It is sometimes also called a spot on the lung or a coin lesion. So the local region Ω τ is selected by a circle (a sphere for 3-dimensional space).
So r is selected as large as possible. (4) Pulmonary nodules are generally smaller than 3 cm (about ≤3 cm) in diameter. If the growth is larger than that, it is known as a pulmonary mass. (5) The nodule may have first been identified by a CT scan. CT scans can give information about the specific features of the nodule, including its shape, size, location and internal density. A CT scan can find very small nodules, as small as 1-2 mm in diameter. Most benign (not cancerous) nodules are small (less than 5 mm) in size. Most nodules between 5 and 10 mm will need additional imaging unless they are unlikely to be cancer based on the way they look. Larger nodules require more careful evaluation and examination including additional imaging tests and possibly a biopsy. So we can determine the parameters r and k according to the above medical prior knowledge. (6) According to medical knowledge, nodules less than 8 mm are usually too small for a biopsy. In other word, the nodule more than 8 mm should be evaluated carefully. In other words, we should make sure of the distinct in the local region of the nodule which is more than 8 mm.
Rule 3: k ≤ π(8 mm/2(pixel spacing)) 2 /K 0 , K 0 = 4 Reasons are below: (1) The local region Ω τ is selected by a circle (for 2-dimensional space). (2) N k ⊂ � τ . (3) Value of k should make boundaries between classes more distinct in the local region. (4) The following experiment of a dataset (the phantom nodules) was performed to determine and test the parameters.  15:49 In the experiment, some nodule models (5 mm in diameter, from −406 to −477 HU, shown as Fig. 5c) in a phantom (shown as Fig. 5a) are used to determine and test the parameters. A quantitative analysis was performed to determine the parameters. The well known Tanimoto/Jaccard error A(C m , C o ) is used as the validation merics, which refers to volume overlaps between the gold standard and the proposed segmentation method with different K 0 . In the experiment, the gold standard is the boundary of the phantom nodule, which is shown as Fig. 5d.
where C m and C o are the extracted and the desired contours, respectively.
Here, k = πr 2 K 0 · r is selected as 5 mm/(2 pixel spacing), which is corresponded to the size of the phantom nodule (5 mm in diameter). Segmentation results for different k 0 are shown in Fig. 6. Figure 6 shows that the segmentation result with K 0 = 4 is the best.

The similarity distance based on adaptive local region probability density
The segmentation problem is converted into the optimization problem of measurable distance between probability density functions. In this paper the probability density P(v) where is a kernel function, and generally as Gaussian density kernel function, and ∫ K (τ )dτ = 1.
The similarity distance W(P, P 1 ) between P(v) and P i is computed by Wasserstein distance function, as Eq. (7).
where L τ is the value domain of feature space, P(v) is the probability density of multi-features vector O of v; P i (τ) are the means of probability density in local region Ω τ selected adaptively, which is as Eqs. (8) and (9).

Generation of multi-dimension feature with high class separability
For one of the major difficults is the task of segmentation of non-solid and part-solid GGO nodules with faint contrast and fuzzy margins. In particular, non-solid nodules are extremely subtle with fuzzy boundaries, and part-solid nodules exhibit highly irregular intensity variations (intensity inhomogeneity) and boundary shapes. GGO pulmonary nodules are hard to distinguish if merely the intensity feature is utilized. Multi-features information with high class separability is reflected and used in the proposed integrated ACM model.
Here, in the feature space, a sample is O i = (x i , y i , z i , I i , level of ASM i ); x i , y i and z i is the position feature; I i is the intensity feature and ASM i is the angular second moment (texture feature) which is as Eq. (10).
(P(m, n)) 2 where P(m, n) is the matrix element of the normalized gray-level co-occurrence matrix. N g is the number of possible gray levels. A gray level co-occurrence matrix is a matrix where the number of rows and columns is equal to the number of gray levels. The matrix element P(m, n | Δx, Δy) is the relative frequency with which two pixel, separated by a pixel distance (Δx, Δy), occur within a given neighborhood, one with intensity m and the other with intensity n. Given an M × N neighborhood, let f (m t , n t ) be the intensity at sample s, line l of the neighborhood. P(m, n | Δx, Δy) is defined as Eq. (11).
where and This feature ASM i is a measure of the smoothness of the image. Indeed, if all pixels are of the same gray-level I = k I , then P(k I , k I ) = 1 and P(m, n) = 0, m ≠ k I or n ≠ k I , and ASM = 1. At the other extreme, if we could have all possible pairs of gray levels with equal probability 1 R , then ASM = 1 R . The less smooth the region is, the more uniformly distributed P(i, j) and the lower the ASM.
An example is illustrated in Fig. 7 and Table 1. It implies that the ASM value is an important feature for segmentation of GGO nodules, solid nodules or other regions.
The segmentation problem is converted into the optimization problem of measurable distance between PDF P(v) of multi-features O, shown in Eq. (2). In proposed integrated ACM model, P(v) is the probability density of multi-features vector In "The similarity distance based on adaptive local region probability density" section, P(v) is estimated by the Parzen window estimation method. In order to reduce the computation cost and save the memory, ASM i is sampled as level 0 (from 0.0000 to 0.1500), level 1 (from 0.1501 to 0.2300), level 2 (from 0.2301 to 0.3500), level 3 (from 0.3501 to 0.5000), level 4 (from 0.5001 to 0.8000) and level 5 (from 0.8001 to 1.0000).

The PDF-based Bhattacharyya similarity distance for global energy
In order to obtain more stable segmentation and achieve the global optimal segmenting, the global energy term E global shown as Eq. (2), is incorporated into the proposed integrated ACM model. E global is global energy term which maximizes the distance of probability density function in different regions. Bhattacharyya distance is applied to measure the distance of probability density distributions between the foreground and background. Bhattacharyya coefficient [27,28] is defined as Eq. (12). where E global can be thought of as a direction cosine between two points on the sphere, so its value domain is [0,1]. This measure varies between 0 and 1, where 0 indicates complete mismatch, and 1 indicates a complete match.
where τ ∈ R N , K − (τ) and K + (τ) are two Gaussian density functions that are normalized to have unit integrals with respect to the feature vector τ, viz.

The proposed model and its numerical implementation
The proposed integrated ACM model is shown as Eq. (2). The zero level set that makes the energy function Eq. (2) minimum is the optimal segmentation result. In the simple case, it is obvious that the boundary of the object φ 0 is the minimizer of the energy functional. The energy function model as Eq. (2) is solved by using variational level set approach. The numerical implementations of energy function as Eq. (14).

Implementation of potential pulmonary nodule segmentation based on the proposed integrated ACM model
The implementation algorithm for the proposed integrated ACM model is as follows.
(1) The pulmonary parenchyma is segmented by an overall segmentation method combining thresholding and morphology, which is shown in Fig. 8. Here P(v), P 1 (τ) and P 2 (τ) in the local region Ω τ are computed according to "The similarity distance based on adaptive local region probability density" section and "Generation of multi-dimension feature with high class separability" section. In each iteration, only P(v) in the local region Ω τ near to the curve/surface φ need to be computed, so pixels/voxels need to be computed are a very small proportion of (14) the total data set. An example is illustrated in Fig. 9. Only P(v) in the local region Ω τ near to the green curve φ need to be computed. The numbers are 1591, which is 0.6 % of the whole data set (512 × 512).

Segmentation refinement of juxta-vascular nodules based on multi-features dynamic clustering
In order to overcome the problem of a small volume over-segmentation in the adhesion region between the juxta-vascular nodule and its attached vessel, and obtain a better segmentation result, a multi-features dynamic clustering method is used for the segmentation refinement of juxta-vascular nodules, which is based on the nodule segmentation using the proposed integrated ACM model and vessel segmentation using FDF-based region growing method. Various types of pulmonary nodules, including GGO nodules (part solid and nonsolid) and juxta-vascular nodules, are segmented under a united segmentation framework. The refinement procession is just used for the pixels/voxels in some boundary regions between the juxta-vascular nodules and blood vessels, without modifying the nodule boundary elsewhere.

3D potential vessel segmentation by FDF-based region growing method
In this paper, the 3D vessel segmentation is used as the rough segmentation process of segmentation refinement of juxta-vascular nodules. 3D vessel segmentation can be accomplished by some traditional methods, such as region growing [23] and vessel enhancement filter [29,30]. However shapes and appearances of vessels are irregular, a satisfactory result often cannot be achieved if only these traditional methods are relied upon. According to the medical prior knowledge, vessels are characterized by a tubular model, the 3D gradient vectors in a vessel can be used to extract a vector in the direction of the vessel by identifying a vector that is approximately orthogonal to the gradients in a local neighborhood [21,22,31]. Moreover, an ideal vascular structure not only keeps a certain elongation, but also has no holes. Simply connected domain is a particular connected domain, in which every loop can be continuously pulled to a point without leaving the space [24]. So we can use these shapes and appearances as the constraints of the segmentation method, which are as follows. (1) Since vascular structures should have no holes, the segmented objects must be removed if they are not simply connected. (2) the flow direction can be used as the growing direction constraint condition, which is one of the constraint conditions of region growing method for segmentation of blood vessels and attached nodules.
Here a FDF-based region growing method is proposed for 3D vessel segmentation. In the FDF-based region growing method, the growing constraint conditions include the intensity threshold and flow direction defined by flow direction vector l d . Assume λ 1 , λ 2 and λ 3 (λ 1 ≤ λ 2 ≤ λ 3 ) are eigenvalues of the structure tensor defined by the statistical information of arithmetic average in local region. The structure tensors are computed by directly using the gradient information of each voxel [14,29]. Let e 1 be the unit length eigenvector belonging to the eigenvalue λ 1 . The flow direction vector l d of the vessel is set as Eq. (15).

Refinement region selection
In this paper, the intersection between potential vessel region and potential nodules is selected as the refinement region of juxta-vascular nodules based on multi-features dynamic clustering. The processes are as follows.
(1) The rough 3D potential vessel segmentation result S c1 is gotten by using the flow direction feature-based region growing method which is described in "3D potential vessel segmentation by FDF-based region growing method" section. (2) The segmentation result S c2 is gotten by using the segmentation method based on the proposed integrated ACM model described in "The proposed integrated ACM with adaptive local region energy and PDF-based similarity distance" section. (3) The potential object S r for segmentation refinement is S r = S c 1 ∩ S c 2 .

Generation and construction of multi-features vector in clustering space
Our proposed multi-features vector takes both the appearance and geometric information into consideration. Because CT values of pulmonary nodules and that of blood vessels are almost uniform, it is difficult to segment exactly the blood vessels and the attached nodules if merely the intensity feature is utilized. So the constructed feature space takes both the appearance and geometric multi-features information into consideration. We build a multi-features space for dynamic clustering in segmentation (15) refinement region. An extended observation vector X i in the clustering space is defined as Eq. (16).
where x i , y i and z i are the position features; I i is the intensity feature; SI i is the volumetric shape index, reflecting the geometric shape, which is a measure of local shape characteristics [5].

Implementation of segmentation refinement based on multi-features dynamic clustering
The algorithm for segmentation refinement based on multi-features dynamic clustering is carried out as the following steps.
(1) The segmentation refinement region S r is selected according to "The refinement region selection using the proposed nodule and vessel segmentation method" section. (2) The feature vector X i = (x i , y i , z i , I i , SI i ) is constructed and computed for the points in the region S r , according to "Generation and construction of multi-features vector in clustering space" section. (3) The refinement region S r is segmented by dynamic clustering method (K-Means clustering method [32]), then the juxta-vascular nodules is segmented exactly.

Experimental results
In this paper, a clinical dataset of 113 chest CT scans was used to evaluate the proposed method, which consisted of 60 thoracic CT scans obtained from LIDC databases [33] and 53 thoracic CT scans obtained from several big hospitals in Guangzhou City and Shenzhen City, Guangdong province, China. The list of these 60 CT scans from LIDC-IDRI databases is provided in Table 2. The used medical CT slices were data sets with an intensity value of 16bits and a resolution of 512 × 512. Slice thickness varied from 0.5 to 2.5 mm and the total slice number for each scan varied from 49 to 397 with an average of 136/scan. The X-ray tube current ranged from 30 to 280 mA, and the pixel size ranged from 0.5-0.75 mm/pixel. Pulmonary nodules in CT images are solid or GGO (part solid or nonsolid), whose sizes are from 3 to 30 mm. Locations of nodules are uncertain, some are isolated (Solitary Pulmonary Nodule, SPN), others are adhered to blood vessels or lung wall. Each scan was read individually by members of a qualified panel and then a consensual gold standard was defined by the panel. The panel members assigned each nodule to be either GGO, juxta-vascular, GGO juxta-vascular, or others. These nodules are manually segmented. This process defined ground truth of 157 nodules, and the number of different kinds of nodules is shown as Table 3. Moreover, The different nodule size groups for the GGO nodules, juxta-vascular nodules, GGO juxta-vascular nodules and others is illustrated in Table 4. In our experiment, there are four types of nodules, and they are registered respectively. The performance of our segmentation fusion algorithm was evaluated using both quantitative and qualitative methods. Experimental results in each research step as well as some discussions are presented below. Li et al. BioMed Eng OnLine (2016) 15:49 Qualitative validation

Validation of the proposed integrated ACM model
In order to validate the effect of the proposed segmentation method, the synthetic image is used to segmented. As shown in Fig. 10, the problem of boundary leakage at boundaries of objects with intensity inhomogeneity and typical weak edges, whose intensity feature is similar to that of GGO nodules, is solved by using the proposed segmentation method.

Validation of 3D potential vessel segmentation by FDF-based region growing method
In order to validate the effect of the proposed flow direction feature -based region growing method for 3D potential vessel segmentation, No. LIDC-IDRI-0010 (from LIDC database [33]) is used to segmented. As shown in Fig. 11, 3D potential vessels are segmented successfully by the proposed flow direction feature-based region growing method. It is validated that the method can be used as the segmentation of 3D potential vessels and rough segmentation of juxta-vascular pulmonary nodules.

Validation of multi-features dynamic clustering method
In order to validate the effect of the proposed multi-features dynamic clustering segmentation method, the synthetic image is used to segment. As shown in Fig. 12, the tubular Table 3 The number of GGO, juxta-vascular, GGO juxta-vascular and other nodules "GGO nodule" and "Juxta-vascular nodule" do not include GGO Juxta-vascular pulmonary nodule in the table     15:49 model and round or oval-shaped model can be segmented successfully. This implies the multi-features dynamic clustering method can be used for segmentation of juxta-vascular nodules, since vessels can be characterized by the tubular models and a pulmonary nodule is a small round or oval-shaped growth in the lung.

Validation of the proposed segmentation method of nodules
In order to validate the effect of the proposed segmentation method, the clinical data with GGO nodules and juxta-vascular nodules should be segmented and explored.
A comparison for segmentation of GGO nodule between the proposed segmentation method and the traditional approach [34], e.g. region-based active contour model [6,7] and integrated active contour model [13], is shown in Fig. 13. As shown in Fig. 13d, e, the problem of boundary leakage at boundaries of a GGO pulmonary nodule occurs, while the problem is solved in Fig. 13f.
In this paper, we handle various types of nodules under a united framework. Firstly the image sequences are segmented by using the proposed segmentation method. Then  the juxta-vascular nodules are segmented finely by using segmentation refinement based multi-features dynamic clustering method. Some experimental results for segmentation of juxta-vascular pulmonary nodule are shown in Figs. 14 and 15. From Fig. 15, we can see that the refinement procession is only applied to voxels in some regions S r containing the blood vessels and juxta-vascular nodules. The refinement method will not modify the nodule boundary elsewhere, so the problem of over-segmentation does not occur. It is validated that the method can be used as the segmentation of juxta-vascular pulmonary nodules. Figure 16 shows the segmentation results of a GGO juxta-vascular pulmonary nodule using the proposed method, region-based ACM [6,7] and the integrated ACM [13,34], respectively. As shown in Fig. 16d, e, the problem of boundary leakage and over segmentation occurs in the adhesion place between the juxta-vascular nodule and its attached vessel; while as shown in Fig. 16f, the problem of boundary leakage and over-segmentation is solved by the proposed segmentation method.
Other classical experimental results for segmentation of juxta-vascular pulmonary nodule are is shown in Figs. 17 and 18. It is validated that the method can be used as the segmentation of various types of pulmonary nodules.

Quantitative validation
Beyond the visual inspection, a quantitative analysis is necessary to ascertain the accuracy of the proposed segmentation method. Here, the well known Tanimoto/Jaccard error A(C m , C o ) as Eq. (5) is used as the validation merics, which refers to distances between segmentation results or to volume overlaps between the gold standard and the proposed segmentation method. The gold standard typically is a high-quality reference segmentation carried out by experts.
Segmentation measure results are shown in Table 5. Table 5 shows that the errors of the proposed method are less than the other three traditional methods, the edge-based active contour model, the traditional region-based active contour model and the integrated active contour model [6,7,13,34].
In our experiment, there are 3 nodules missed in the experimental test set of 157 nodules in total. Two of them are small GGO nodules close to 3 mm, and one of missing nodules is a small juxta-pleural nodule with very low contrast and close to 3 mm.
The proposed segmentation algorithm was implemented and tested on the computer with 3.46 × 2 GHz CPU, 192 GB Memory and Graphic Card (GPU memory 12 GB GDDRS, 317 GB/s). On average, it takes about 3.14 min/scan (about 1.37 s/each image), which does not include the cost for data pre-processing.

Discussion
Experimental results of segmentation for pulmonary nodules show desirable performances of the proposed segmentation method using the test dataset. The segmentation performance for GGO, juxta-vascular and GGO juxta-vascular nodules was an average Tanimoto/Jaccard error of 0.17, 0.20 and 0.24, respectively.
We attempt a comparison with the results reported by other research groups. Kubota et al. [1] proposed a segmentation method based on morphological approaches and the 3D segmentation result without using the proposed refinement method; c the 3D segmentation result using the proposed segmentation method with multi-features dynamic clustering refinement; d local zooming of Fig. 15b; e local zooming of Fig. 15d; f local zooming of Fig. 15c; g local zooming of Fig. 15f convexity models for segmenting the pulmonary nodules of various densities. Results on 21 LIDC cases were reported with segmentation overlap measures of mean 0.69 and standard deviation 0.18. Diciotti et al. [2]. presented a refinement method for the segmentation of juxta-vascular nodules, which was based on a local shape analysis of the initial segmentation making use of geodesic distance map representations. They observed a percentage of successful segmentations of 84.8 % in fully automated mode and of 91.0 % by using an additional interactive mode for improving the segmentation quality of juxta-vascular nodules. However, GGO juxta-vascular nodules were not reported in their work. Kostis et al. [21] collected 21 juxta-vascular nodules, and observed an 80 % successful rate. Okadaetal et al. [35] reported an 81.2 % estimation rate on a dataset of 1310 various types of nodules (3-30 mm in diameter). Unfortunately, though most of algorithms have been developed for lung nodules, most authors did not report quantitative results for various types of nodules. Ye et al. [5] proposed a shapebased SVM method for detecting nodules. The 3-D local geometric and statistical intensity features were used to detect potential solid and GGO nodule. But the segmentation results were not been reported. Murphy [32] used the local image features of shape index and curvedness to detect candidate structures in the lung volume, but the segmentation results were not been reported yet.
Comparing with different segmentation methods covered in literature [3] and other reported literatures above, it seems that the proposed method's relatively precise segmentation performances. The reasons why the proposed method has a better performance for segmenting all types of GGO and juxta-vascular nodules are as follows.
In the proposed integrated ACM model, the local domain for local energy model is selected adaptively, and measurable distances between probability density functions of multi-dimension features with high class separability are used. The model using local information commonly can obtain better performance than that of global statistic information in solving the segmentation problems of intensity inhomogeneity, such as part solid and nonsolid GGO nodules. Multi-dimension features are also important and helpful for the segmentation of GGO nodules. In many research papers, juxta-vascular nodules observed in CT images are outlined applying a global refinement procedure (i.e., throughout the initial segmentation boundary) after an initial rough segmentation. Differently, as is mentioned before, our solution for efficiently segmenting the potential nodule objects involves two steps: (1) a segmentation method is proposed for a whole segmentation, which is based on the proposed integrated ACM model. The method uses an adaptive local region energy model with PDF-based similarity distance, which is especially used for low-contrast nodules such as part solid and nonsolid GGO nodules, to overcome the problems of boundary leakage, intensity inhomogeneity; (2) a segmentation refinement method based on multi-features dynamic clustering method, which is referred to as a fine segmentation, is used to segment potential juxta-vascular nodules. So the correction method has the advantage that it locally refines the nodule segmentation along recognized vessel attachments only, without modifying the nodule boundary elsewhere. However, some nodules are missed by the proposed segmentation method. Typically, these nodules are too small (almost 3 mm), or small juxta-pleural nodules with very low contrast, which makes it difficult to segment. The small GGO juxta-pleural nodule with pleural tail is very near to the edge of lung wall.
To further improve the segmentation performance, some improvements need to be further investigated as follows: in order to recognize small and juxta-pleural pulmonary nodules in noisy image more effectively, an adaptive smoothing method needs to be further investigated, and the juxta-pleural nodules should be further researched; This requires further investigated in more detail.

Conclusions
When the traditional segmentation method is used to segment the GGO and juxtavascular nodules with weak edges and intensity inhomogeneity characteristic, the problems of boundary leakage and small volume over-segmentation often appear. To solve these problems, a novel segmentation method is proposed for pulmonary nodules in CT images, which is based on the proposed integrated ACM model and multi-features dynamic clustering method, especially for GGO nodules (part solid and nonsolid) and juxta-vascular nodules. This study demonstrates the superiority of the proposed method. The described segmentation method outperforms the traditional methods, and evaluating the algorithm on the provided test data leads to an average Tanimoto/Jaccard error of 0.17, 0.20 and 0.24 for GGO, juxta-vascular and GGO juxta-vascular nodules, respectively.