Lesion segmentation in breast ultrasound images using the optimized marked watershed method

Background: Breast cancer is one of the most serious diseases threatening women's health. Early ultrasound-based screening can help detect and treat tumours at an early stage. However, due to the lack of radiologists with professional skills, ultrasound-based breast cancer screening has not been widely used in rural areas. Computer-aided diagnosis (CAD) technology can effectively alleviate this problem. Since breast ultrasound (BUS) images have low resolution and speckle noise, lesion segmentation, an important step in CAD systems, is challenging.

Results: Two datasets were used for evaluation. Dataset A comprises 500 BUS images from local hospitals, while dataset B comprises 205 open-source BUS images. The experimental results show that the proposed method outperformed related classic segmentation methods and the state-of-the-art deep learning model RDAU-NET. Its accuracy (Acc), Dice similarity coefficient (DSC) and Jaccard index (JI) reached 96.25%, 78.4% and 65.34% on dataset A, and its Acc, DSC and sensitivity reached 97.96%, 86.25% and 88.79% on dataset B, respectively.

Conclusions: We proposed an adaptive morphological snake based on marked watershed (AMSMW) algorithm for BUS image segmentation. It was proven to be robust, efficient and effective. In addition, it was found to be more sensitive to malignant lesions than to benign lesions.

Methods: The proposed method consists of two steps. In the first step, contrast-limited adaptive histogram equalization (CLAHE) and a side window filter (SWF) are used to preprocess BUS images, effectively highlighting lesion contours and largely eliminating the influence of noise. In the second step, we propose the adaptive morphological snake (AMS), which adjusts its working parameters adaptively according to the size of the lesion. Its segmentation results are combined with those of the morphological method.
Then, we determine the marked area and obtain candidate contours with a marked watershed (MW). Finally, the best lesion contour is chosen by the maximum average radial derivative (ARD).

Shen et al. BioMed Eng OnLine (2021) 20:57, Page 2 of 23

Background

According to the 2020 global cancer data report, breast cancer ranks first among the three most common cancers in women, indicating that it has become a serious threat to the health of women worldwide [1]. Studies show that early detection and diagnosis of breast cancer can effectively increase the cure rate [2]. At present, digital mammography (DM) and breast ultrasound are the two main tools used in breast screening in China. Unlike DM, ultrasound involves no ionizing radiation and can show the anatomy and pathology of dense breast tissue. Therefore, ultrasound is more suitable than DM for detecting breast lesions in Asian women, who tend to have dense breasts, and is becoming a popular screening tool for breast cancer [3]. Geisel et al. also demonstrated the effectiveness, practicability and feasibility of breast ultrasound as a screening tool for the early detection of occult breast cancer [4]. However, in breast ultrasound imaging, the speckle noise generated by coherent waves greatly reduces image quality, so interpreting the images demands a high degree of professionalism from radiologists. Due to the lack of radiologists in remote areas, ultrasound-based breast cancer screening cannot truly be popularized.
With the development of artificial intelligence technology, computer-aided diagnosis (CAD) systems based on medical images have made great achievements in cancer detection. In particular, the development of ultrasound-based breast cancer CAD systems has been impressive: they can realize intelligent screening and diagnosis, automatically performing lesion detection, segmentation and diagnosis as real-time images are received. Their application will greatly alleviate the shortage of radiologists. However, due to the inherent problems of breast ultrasound (BUS) images, such as speckle noise and low contrast, the accuracy of lesion segmentation has not been effectively improved, which greatly affects the reliability of diagnosis results. Thus, finding a stable and effective BUS image segmentation method is of great significance for promoting the application and popularization of ultrasound-based breast cancer CAD systems.
Therefore, we conducted this research. Focusing on the inherent quality problems of BUS images, we are committed to designing a stable and efficient image segmentation method. In recent years, many excellent image segmentation algorithms have emerged. Level set, first introduced in 1994 [5] and improved in 1995 [6], 2005 [7], 2012 [8] and beyond, has proven very effective in image segmentation. However, it needs a great deal of time to solve partial differential equations (PDEs), which limits its practicality. To solve this problem, the morphological snake (MS) was proposed. It uses morphological operations on a binary level set to approximate the differential operators of a standard PDE [9]; since it needs only simple numerical calculations, MS is simple and fast. In BUS image segmentation, many scholars have used parametric deformable models and geometric deformable models [8, 10, 11]. However, to achieve ideal segmentation results, an appropriate initial tumour boundary or a precise edge-based stopping function must be set in advance. Other researchers have used and improved graph-based segmentation methods, such as [12] and [13]. Boukerroui attempted to overcome the biggest drawbacks of the MRF model, i.e., low optimization speed and local optimization [14]. In 2013, Zhao proposed a generalized fuzzy clustering method (FCM) with spatial information, which performs well in segmentation and converges rapidly [15]. FCM and improved FCM algorithms have been applied to lesion detection in BUS images [16, 17]. Since 2013, an increasing number of segmentation methods based on supervised and semi-supervised learning have appeared, especially with the increasing popularity of deep learning, which has made great progress in solving the problem of BUS image segmentation.
Supervised learning methods include support vector machines, artificial neural networks (ANNs) and convolutional neural networks (CNNs), which have been applied to BUS image segmentation with great progress [18][19][20][21]. Zhuang proposed the RDAU-NET model [22], which performs best on BUS image segmentation compared with other models. To date, deep learning models have proven to be the best way to perform image segmentation. However, they face some major problems, which are also the main bottleneck for their further development. For example, the prediction results are not sufficiently robust, and robustness is the basic performance metric determining whether a model can be widely used [23][24][25][26]. In addition, the models are not explainable, and training data are insufficient. To address these problems, a recent approach integrated visual saliency into a deep learning model for BUS image segmentation [27]: attention blocks introduced into a U-Net architecture learn feature representations that prioritize spatial regions with high saliency levels, achieving a Dice similarity coefficient (DSC) of 90.5% on a dataset of 510 images. However, this method relies heavily on the quality of the saliency maps and is thus also not sufficiently robust.
Therefore, how can we obtain an efficient and robust BUS image segmentation method? We again turned to some excellent classic segmentation methods. It has been reported that watershed methods have a certain effect in solving complex segmentation problems and are more stable than many existing methods, but they are sensitive to noise and can cause over-segmentation. In view of this, many scholars have improved the watershed. Huang and Chen [10] combined a watershed with the active contour model to obtain a relatively accurate tumour boundary. In 2009, Gomez used a marked watershed (MW) algorithm incorporating morphological techniques and an average radial derivative function [28]. The method was later improved by using an anisotropic diffusion filter guided by texture descriptors derived from a set of Gabor filters and by creating segmentation functions generated by Newton filters to facilitate more precise segmentation [29]. However, the anisotropic diffusion filter requires many iterations to obtain a good preprocessing result, which takes a long time. In addition, the acquisition of the marker function is somewhat complex, which reduces the efficiency of the algorithm. In view of these two problems, we made improvements in a previous study [30]: we combined contrast-limited adaptive histogram equalization (CLAHE) and curvature filtering to preprocess the images and used a morphological method to obtain the marker function, which is simple and efficient. However, although this method improves the segmentation accuracy and DSC, it also brings a higher false positive rate (FPR), meaning that much false positive tissue is also segmented. Therefore, to further improve performance, this paper makes the following technical contributions: 1. We use CLAHE and a side window filter (SWF) to enhance the lesion contour and eliminate the influence of noise. Compared with several other preprocessing methods, this combination is the most beneficial for BUS image segmentation.
2. We propose an embedded segmentation method, the adaptive morphological snake (AMS). It is more robust and stable than MS when processing complex datasets with lesions of different sizes collected from different types of ultrasound equipment. 3. We propose an optimized marked watershed segmentation method, the adaptive morphological snake based on marked watershed (AMSMW), whose marker region is corrected by AMS. Taking full advantage of classical segmentation algorithms such as the level set method [5], the morphological snake (MS) [9] and MW [31], AMSMW achieves higher segmentation precision and is 3-4 times faster than other existing methods.

Evaluation metrics
We used both area and contour error metrics, including the accuracy (Acc), true positive ratio (TPR), false positive ratio (FPR), Jaccard index (JI), Dice similarity coefficient (DSC), area error ratio (AER), Hausdorff error (HE) and mean absolute error (MAE), to evaluate dataset A. Following the standard definitions of these indicators [12], they can be written as

$$\mathrm{TPR} = \frac{|A_G \cap A_S|}{|A_G|}, \qquad \mathrm{FPR} = \frac{|A_S - A_G|}{|A_G|},$$

$$\mathrm{JI} = \frac{|A_G \cap A_S|}{|A_G \cup A_S|}, \qquad \mathrm{DSC} = \frac{2\,|A_G \cap A_S|}{|A_G| + |A_S|},$$

$$\mathrm{Acc} = \frac{A - |A_G \cup A_S| + |A_G \cap A_S|}{A}, \qquad \mathrm{AER} = \frac{|A_G \cup A_S| - |A_G \cap A_S|}{|A_G|},$$

$$\mathrm{HE} = \max\!\left\{ \max_{z \in C_G} \min_{k \in C_S} \|z - k\|,\; \max_{k \in C_S} \min_{z \in C_G} \|z - k\| \right\},$$

$$\mathrm{MAE} = \frac{1}{2}\!\left( \frac{1}{|C_G|} \sum_{z \in C_G} \min_{k \in C_S} \|z - k\| + \frac{1}{|C_S|} \sum_{k \in C_S} \min_{z \in C_G} \|z - k\| \right),$$

where A_* is the number of pixels in *, A is the number of pixels contained in the image, and the subscripts G and S represent the ground truth and segmentation result, respectively. C represents the contour of the ROI, and z and k are points on the contours. In addition, we used the Dice coefficient (DC), area under the curve (AUC), precision (PC), sensitivity (Sen), specificity (Sp), F1-score (F1) and mean intersection over union (M-IOU) to evaluate the proposed method on dataset B; the calculation formulas of these indicators can be found in Zhuang's paper [22]. Xian has noted how important these metrics are [12]. Large JI values and small AER, HE and MAE values indicate good performance. When JI is small and AER, HE and MAE are large, large TPR and FPR together indicate that the lesion was overestimated, while small TPR and FPR together indicate that it was underestimated.
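For reference, the area metrics above can be computed in a few lines of NumPy. This is our own illustrative sketch (the function and variable names are not from the paper); the FPR is normalized by the ground-truth area, following the convention noted in [12]:

```python
import numpy as np

def region_metrics(seg: np.ndarray, gt: np.ndarray) -> dict:
    """Area-based metrics for binary masks (1 = lesion, 0 = background)."""
    seg, gt = seg.astype(bool), gt.astype(bool)
    inter = np.logical_and(seg, gt).sum()
    union = np.logical_or(seg, gt).sum()
    fp = np.logical_and(seg, ~gt).sum()   # pixels segmented outside the GT
    tn = np.logical_and(~seg, ~gt).sum()
    return {
        "Acc": (inter + tn) / seg.size,
        "TPR": inter / gt.sum(),
        "FPR": fp / gt.sum(),             # normalized by the GT area, as in [12]
        "JI":  inter / union,
        "DSC": 2 * inter / (seg.sum() + gt.sum()),
        "AER": (union - inter) / gt.sum(),
    }
```

The contour metrics HE and MAE additionally require pairwise distances between contour points, which can be obtained, for example, with scipy.spatial.distance.cdist.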

Experiment details
First, we took 100 BUS images from dataset A and 50 BUS images from dataset B as the study objects for the experiment on preprocessing methods. The remaining 500 images in dataset A and the remaining 205 images in dataset B were used as the test sets for the comparison experiments. We implemented the algorithm and computed the evaluation metrics in Python. The parameter settings of the AMSMW method are shown in Table 1. Next, we introduce the three experiments in detail.

Finding the most suitable BUS image preprocessing method
We explored the effect of preprocessing methods on the segmentation results by using four preprocessing schemes: SWF, CLAHE&SWF, CLAHE&CF&SWF, and CLAHE&CF to preprocess the 100 images from dataset A and the 50 images from dataset B. In addition, we set up a control group without preprocessing operations to determine the effectiveness of the preprocessing method on the segmentation result.

Comparing AMSMW with other classical image segmentation methods and deep learning methods
First, we used the best preprocessing method identified in the previous step to preprocess the test sets. Then, we compared the proposed method with several related segmentation methods and RDAU-NET on the test set of dataset A. Finally, we compared the proposed method with typical deep learning segmentation methods on the test set of dataset B.
For traditional segmentation methods, we implemented several related classical methods: level set [5], MS [9], MW [31] and FSMW [30]. For the MS method, we set the initialization position to the centre of the RROI and the radius to 70% of the smaller of the RROI's length and width. Moreover, the same preprocessing method was used in all comparison experiments except FSMW. For deep learning methods, many excellent image segmentation models have been borrowed, improved and used. In [22], several typical deep learning segmentation models were compared on dataset B, and the results showed that RDAU-NET performs best. To make an objective comparison between AMSMW and RDAU-NET, we performed the following two experiments. In the first, we used five-fold cross-validation: we re-divided the test set of dataset A into training, validation and test sets at a ratio of 6:2:2, expanded each training set fourfold with angle (rotation) transformations, and used them to train and test RDAU-NET, obtaining five segmentation results. For a scientific and fair comparison, we also tested AMSMW on the same five test sets to obtain five results. In the second experiment, we used AMSMW to segment the test set of dataset B and compared the quantitative results with those published in [22].

Studying the sensitivity of AMSMW to benign and malignant lesions
Benign and malignant tumours are very different in size, morphology, margins, and internal state, which may greatly affect the algorithm's performance. If a relationship can be found, it will be of great significance to design a more adaptive BUS image segmentation algorithm. Therefore, we conducted an exploratory experiment on the algorithm's sensitivity in segmenting benign and malignant lesions. The specific operation was to first group dataset A into 250 benign and 250 malignant groups. Then, a quantitative segmentation experiment was performed.

Combined with CLAHE, SWF can enhance the edge of the lesion and contribute to better BUS segmentation results
Quantitative results and examples of qualitative results are shown in Table 2 and Fig. 1, respectively. In Table 2, "√" means "used" and "×" means "not used". Separate (A) and Separate (B) represent the 100 images from dataset A and the 50 images from dataset B used for the preprocessing experiment, respectively. The "Overall" column lists the average Dice values. From Table 2, we can draw three conclusions. First, segmentation results with preprocessing are much better than those without, and different preprocessing methods improve performance to different degrees, showing that choosing a suitable preprocessing method is essential. Second, the Dice of CLAHE&SWF is similar to that of CLAHE&CF&SWF, but its average value is slightly higher, indicating that CLAHE&SWF is more stable across BUS image datasets from different sources. Third, CLAHE improves contrast while SWF smooths noise and preserves the lesion boundary well, which can also be observed directly in Fig. 1. Compared with the images in the last two columns, the contrast of the images in the first three columns, which were preprocessed with CLAHE, is clearly stronger and more balanced. The images in the third and fourth columns were treated with SWF, and their lesion boundaries are clearly highlighted. However, although the images in the first column were also preprocessed with SWF, their lesion boundaries became blurred after CF was applied.

AMSMW performs best on both quantitative results and qualitative results
First, some relevant and excellent traditional segmentation methods were tested on dataset A; the quantitative and qualitative results are shown later. It can be observed from Table 3 that even without preprocessing, MW still performs best on TPR, indicating that MW can segment the entire lesion area more sensitively and comprehensively; however, its error rate is also the highest, causing the FPR to be too high. This means that a large region that does not belong to the lesion is also segmented, which can be seen intuitively in the qualitative results in Fig. 2: taking the images in the third and fourth columns as examples, much normal tissue was segmented by MW. Level set has a lower FPR than MW, indicating that it can identify lesion contours more effectively; however, its values on the other indicators are relatively low, indicating that it cannot find the entire lesion area. In addition, the standard deviations listed in Table 3 show that the dispersions are generally small, indicating that the data distribution is appropriate and that the experimental results are credible. Compared with level set, the greatest advantage of MS is that it uses morphological methods to replace the solution of numerical differential equations, greatly improving efficiency: experiments on an ordinary laptop show that MS takes 6 s and level set takes 17 s to segment an image. At the same time, the quantitative results show that MS is better than level set on most metrics, indicating that it can segment tumours more precisely. However, as shown in Fig. 2 (taking the third and fourth columns as examples), some lesions contain many calcification points and cannot be segmented completely by the MS method.
Compared with MS, although AMS scores slightly lower on Acc, FPR, AER, HE and MAE, it has obvious advantages on TPR, DSC and JI, showing that AMS is more stable and obtains a more complete lesion region. As shown in Fig. 2, the images in the fourth row are the segmentation results of AMS. Although parts of the surrounding normal tissue were mistaken for tumour by AMS, all parts of the tumour were completely included, which is of great significance for AMSMW to later obtain a complete marked area. By comparing MS+MW with AMSMW, we find that AMS greatly improves performance. The level set+MW algorithm uses level set instead of AMS as the embedded segmentation method. Comparing the quantitative results of level set+MW with those of the other methods shows that AMSMW has obvious advantages on all indicators other than TPR. In addition, as observed in Fig. 2, the qualitative results of AMSMW are much closer to the GT. Taking the image in the sixth row and fourth column as an example, AMSMW not only resists the interference of calcification points and speckle noise in the lesion area but also resists the interference of the echo behind the lesion, accurately identifying tumour boundaries while reducing the FPR as much as possible. Moreover, the AMSMW method runs fastest.
In summary, we believe that AMSMW has the highest efficiency and effectiveness. Second, we performed segmentation on the 205 images from dataset B using AMSMW. Furthermore, we performed five-fold cross-validation experiments on the 500 images from dataset A using RDAU-NET and AMSMW. The quantitative and qualitative results are shown in Table 4 and Fig. 3 and in Table 5 and Fig. 4, respectively. As observed from Table 4, AMSMW is slightly lower than RDAU-NET on Sp, PC and M-IOU but clearly higher on the other five metrics. This indicates that AMSMW has good adaptability and can segment lesion areas more precisely than RDAU-NET, which can also be seen more intuitively in Fig. 3. Take the images in the first, eighth and tenth columns as examples: they have calcification points inside the tumour, severe speckle noise or strong echoes behind the tumour. Faced with so much interference, AMSMW can still accurately identify tumour boundaries without excessive segmentation. RDAU-NET can also find the lesion area, but it segments more normal tissue, increasing the FPR. It can be clearly observed from Fig. 4 that the segmentation results of RDAU-NET suffer from relatively more serious over-segmentation and false positive problems. In addition, as observed from Table 5, RDAU-NET does not show strong generalization ability, because it outperforms AMSMW only on Sp and PC and is far worse on the other seven metrics. Overall, it is obvious that without enough training data and excellent hardware resources, even the best deep learning model may fail on new or more complex data. Therefore, deep learning still has a long way to go to improve its generalization performance.
In other words, although traditional algorithms cannot be fully automated, the semi-automated capability is sufficient to alleviate the burden on doctors, and excellent traditional segmentation algorithms still have good performance when processing complex data.

AMSMW is more sensitive to malignant tumours than benign tumours
As shown in Table 6, the performance of AMSMW in segmenting benign and malignant tumours is comparable. From four indicators, TPR, DSC, JI and AER, we can conclude that AMSMW is more sensitive in delineating malignant tumours. However, due to the strong echo behind the lesion and possible strong inner calcification points, AMSMW has a high FPR when segmenting malignant lesions, causing the algorithm to perform poorly on the other four indicators. Therefore, AMSMW is more stable when segmenting benign lesions.

Discussion
The proposed method consists of two parts: an image preprocessing scheme and an image segmentation algorithm. These two parts are complementary and inseparable. The goal of the preprocessing scheme is to highlight the boundary of the lesion to obtain more accurate segmentation results. Some preprocessing methods can suppress noise and improve contrast but cannot highlight the boundary, while others do the opposite. Therefore, it is necessary to find a suitable image preprocessing scheme. By exploring the effect of preprocessing methods on the segmentation results, we found that CLAHE & SWF is the better preprocessing scheme for BUS images. Theoretically, SWF preserves boundaries well, while CLAHE suppresses noise and improves contrast; their combination is especially suitable for breast ultrasound images with low contrast and speckle noise. In image segmentation, considering the complexity and particularity of BUS images, our goal is to find a robust and efficient lesion segmentation method. MW is very robust in solving complex image segmentation problems. However, its accuracy largely depends on the accuracy of the "marked area". Theoretically, the "marked area" is the known lesion area, and the more accurate it is, the higher the accuracy of the algorithm will be. Thus, we mainly optimized MW by improving the method of obtaining the "marked area". Our idea was to find an excellent algorithm to acquire the "marked area". AMS is an improved MS method that can adaptively change its working parameters without the tedious solution of PDEs, and it proved very suitable for acquiring the "marked area" for MW. The experimental results show that the proposed algorithm achieves better segmentation results. Moreover, by comparing it with several other classic traditional segmentation methods, we find that the proposed method is the most efficient and effective.
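To illustrate the marked-watershed step discussed above, the sketch below runs SciPy's watershed_ift on a toy "elevation" image. The image and marker positions are hypothetical, and the hand-placed internal marker merely stands in for the AMS-derived marked area; the real AMSMW pipeline also selects among candidate contours by maximum ARD, which is omitted here:

```python
import numpy as np
from scipy import ndimage as ndi

# Hypothetical elevation image: a dark lesion basin surrounded by a bright rim.
img = np.full((9, 9), 200, dtype=np.uint8)
img[2:7, 2:7] = 50        # lesion interior (low grey level)
img[4, 4] = 10            # deepest point of the basin

# Marker image: positive integers seed the flooding; 0 means "to be decided".
markers = np.zeros((9, 9), dtype=np.int16)
markers[4, 4] = 1         # internal marker (in AMSMW, derived from the AMS contour)
markers[0, 0] = 2         # external (background) marker

labels = ndi.watershed_ift(img, markers)   # every pixel receives a marker label
lesion_mask = labels == 1                  # candidate lesion region
```

Flooding from the internal marker fills the low-intensity basin before it can spill over the bright rim, so the lesion region is separated from the background seeded by the external marker.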
In addition, we compared the proposed method with the state-of-the-art deep learning models RDAU-NET and U-NET. It still performed well on most of the metrics on both dataset A and dataset B. Because it does not need a training set, it depends far less on the data distribution of a particular dataset; therefore, its generalization performance should theoretically be better than that of deep learning algorithms. Although deep learning currently attracts much attention and is widely sought after, most deep learning models do not have ideal generalization ability, which creates a bottleneck for their continued development. The traditional segmentation method, by contrast, is stable and efficient, which provides a way to alleviate this bottleneck to a certain extent. Therefore, in future work, we should not neglect classical segmentation methods. It may well be a good solution to integrate an efficient and stable traditional segmentation method with a deep learning model for image segmentation, and the results could be greatly improved.
By evaluating the sensitivity of the algorithm in segmenting benign and malignant tumours, we find that the proposed method has high sensitivity in delineating malignant tumour boundaries and is relatively stable for benign tumours. Therefore, in future work, we could take the strong posterior echo and other characteristics of malignant tumours into account and design an adaptive segmentation method. Moreover, the current study has great potential for further research and could lead to better, faster and more precise diagnosis and treatment of oncological diseases. There are many other medical specialities and diseases [32][33][34] with known and applicable diagnostic imaging methods but still few predictive modelling bases. Noninvasive diagnostic methods can be used not only in oncology but also in other medical specialities. Therefore, this study could also be applied to computer-aided diagnosis of these diseases.

Conclusions
In this paper, an efficient semi-automatic BUS image segmentation method was proposed and evaluated quantitatively. It was proven to be the most robust and effective BUS image segmentation method compared with classic traditional segmentation methods and a state-of-the-art deep learning model. In addition, by studying the sensitivity of AMSMW in segmenting benign and malignant lesions, we found that it is more sensitive to malignant lesions and more stable to benign lesions, which is of great significance for algorithm research in precision medicine in the future. Moreover, since the RROI in the proposed method is drawn manually, we are considering adding a deep learning network to automatically identify RROIs and completely liberate radiologists from this task in our future work.

Methods
The flowchart of the proposed method is shown in Fig. 5. It consists of five main parts: data acquisition, rectangular region of interest (RROI) acquisition, image preprocessing, marked area acquisition and final contour acquisition. The first three parts cover image preparation and preprocessing, and the last two parts constitute the image segmentation process.

Data acquisition and difference analysis of the two datasets
Dataset A was collected by us. It contains 600 BUS images, including 300 benign and 300 malignant solid cysts. They were captured with different devices, such as the GE LOGIQ E9 and PHILIPS EPIQ 5, in a local hospital, and the patient information in all images was hidden. An experienced radiologist sketched the lesion boundary in each image as the ground truth (GT). Dataset B is open source and contains a total of 255 images: 213 images are from [22], and 42 images are from [35]. To study the generalization ability of the algorithm across datasets, we analysed the differences between dataset A and dataset B. We used a grey-level co-occurrence matrix (GLCM) to extract the following statistics for each image: difference entropy, sum entropy, correlation, angular second moment, sum average, contrast, difference variance, entropy, homogeneity, sum variance, variance and the information measure of correlation. Then, we used the Mann-Whitney U test to compare datasets A and B and obtain the p-value of each statistic, as shown in Table 7. The p-values of all statistics are less than 0.05; therefore, datasets A and B differ significantly.
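The difference analysis can be reproduced in outline with SciPy. The values below are synthetic stand-ins for one per-image GLCM statistic (not the real measurements); only the testing procedure is illustrated:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
# Synthetic stand-ins for one GLCM statistic (e.g. contrast) per image.
contrast_a = rng.normal(loc=0.8, scale=0.1, size=600)   # dataset A (600 images)
contrast_b = rng.normal(loc=0.6, scale=0.1, size=255)   # dataset B (255 images)

# Non-parametric test: are the two samples drawn from the same distribution?
stat, p = mannwhitneyu(contrast_a, contrast_b, alternative="two-sided")
significant = p < 0.05      # the paper's criterion for a significant difference
```

In practice, this test would be run once per extracted GLCM statistic, yielding the per-statistic p-values reported in Table 7.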

RROI acquisition
The RROI was obtained manually as follows: a starting point was selected by left-clicking the mouse, and the left button was held down while moving diagonally until the end position was reached. We define the RROI's four boundary coordinates as w_1, w_2, h_1 and h_2, where w and h represent positions along the tumour's width and height, respectively, and the subscripts 1 and 2 represent the lower and upper limits of the tumour's width and height, respectively. The geometric centre of the RROI, which will be used later, is defined as

$$\mu = (\mu_w, \mu_h) = \left( w_1 + \frac{w_2 - w_1}{2},\; h_1 + \frac{h_2 - h_1}{2} \right). \tag{10}$$

Image preprocessing

Contrast enhancement
BUS images are characterized by low contrast and considerable noise, which can be improved by applying CLAHE, an optimization method based on adaptive histogram equalization (AHE) that limits the increase in contrast. It effectively overcomes the problem of over-amplifying noise in the AHE algorithm.
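To make the clip-and-redistribute idea behind CLAHE concrete, here is a minimal single-tile sketch. This is our own simplification: real CLAHE applies the mapping per tile and bilinearly interpolates between tile mappings, and the function name and clip limit are assumptions:

```python
import numpy as np

def clipped_hist_equalize(img: np.ndarray, clip_limit: int = 40) -> np.ndarray:
    """Contrast-limited histogram equalization on a single tile (8-bit image).

    The histogram is clipped at `clip_limit` and the clipped excess is spread
    evenly over all bins, which bounds the slope of the mapping and thereby
    limits noise amplification (the key difference from plain AHE).
    """
    hist = np.bincount(img.ravel(), minlength=256)
    excess = np.maximum(hist - clip_limit, 0).sum()
    hist = np.minimum(hist, clip_limit) + excess // 256   # redistribute excess
    cdf = np.cumsum(hist).astype(np.float64)
    span = cdf.max() - cdf.min()
    lut = np.round((cdf - cdf.min()) / max(span, 1) * 255).astype(np.uint8)
    return lut[img]
```

Applied to a low-contrast tile, the lookup table stretches the occupied grey-level range while the clip limit keeps nearly empty bins from being over-amplified.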

Edge highlighting
Local windows, whose centres align with the pixels being processed, usually cause blurred edges. To avoid this, [36] proposed SWF, which preserves edges significantly better. Thus, we introduced SWF to highlight the edges of lesions in BUS images. We give a brief introduction to SWF here; more information can be found in [36]. As shown in Fig. 6, eight side windows are defined in the discrete case, where (x, y) are the coordinates of the target pixel i, r and θ are the radius and angle of the window, respectively, ρ ∈ {0, r}, θ = k × π/2, and k ∈ [0, 3]. Setting ρ = r gives four side windows, W_i^D, W_i^R, W_i^U and W_i^L, which align i with their sides; setting ρ = 0 gives W_i^SW, W_i^SE, W_i^NE and W_i^NW, which align i with their corners. For each pixel, filtering can be regarded as computing the side-window outputs

$$I'^{\,\theta,\rho,r}_{i} = F(q_i, \theta, \rho, r) = \frac{\sum_{j \in W_i} \omega_{ij}\, q_j}{\sum_{j \in W_i} \omega_{ij}}, \tag{11}$$

where ω_ij is the weight of pixel j, which neighbours pixel i, based on the filtering kernel F; q_j is the intensity of image q at location j; and S = {L, R, U, D, NW, NE, SW, SE} is the set of side window indices. The result of filtering by SWF is then defined as

$$I^{\mathrm{SWF}}_{i} = \arg\min_{I'^{\,\theta,\rho,r}_{i}} \left\| q_i - I'^{\,\theta,\rho,r}_{i} \right\|_2^2, \tag{12}$$

where I'^{θ,ρ,r}_i is the result for the eight side windows with ρ ∈ {0, r}, θ = k × π/2, and k ∈ [0, 3].
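The selection principle behind SWF can be sketched with a plain box (mean) kernel: average over each of the eight side windows and keep the mean closest to the original intensity. This is a simplified stand-in for the filter in [36], not the exact implementation used in the paper:

```python
import numpy as np

def side_window_box_filter(img: np.ndarray, r: int = 1) -> np.ndarray:
    """Simplified side window (box) filter.

    For each pixel, the mean over each of the eight side windows is computed,
    and the mean closest to the original intensity is kept, so the window
    never straddles an edge it could otherwise blur.
    """
    pad = np.pad(img.astype(np.float64), r, mode="edge")
    h, w = img.shape
    out = np.empty((h, w), dtype=np.float64)
    # (y0, y1, x0, x1) extents of the 8 side windows, relative to the pixel
    windows = [
        (-r, 0, -r, r), (0, r, -r, r),   # up, down halves
        (-r, r, -r, 0), (-r, r, 0, r),   # left, right halves
        (-r, 0, -r, 0), (-r, 0, 0, r),   # NW, NE quadrants
        (0, r, -r, 0), (0, r, 0, r),     # SW, SE quadrants
    ]
    for y in range(h):
        for x in range(w):
            cy, cx = y + r, x + r        # position in the padded image
            best = None
            for y0, y1, x0, x1 in windows:
                m = pad[cy + y0: cy + y1 + 1, cx + x0: cx + x1 + 1].mean()
                if best is None or abs(m - img[y, x]) < abs(best - img[y, x]):
                    best = m
            out[y, x] = best
    return out
```

On a step edge, every pixel finds a side window lying entirely on its own side, so the edge survives filtering unchanged, whereas a centred box filter would blur it.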

Constrained Gaussian kernel set
Similar to the method proposed in [30], we multiply Gaussian functions with the filtered image I_CF to obtain the region of interest (ROI). However, the difference is that