Mass type-specific sparse representation for mass classification in computer-aided detection on mammograms

Background Breast cancer is the leading cancer among women in terms of both incidence and mortality. For this reason, much research effort has been devoted to developing Computer-Aided Detection (CAD) systems for the early detection of breast cancer on mammograms. In this paper, we propose a novel dictionary configuration underpinning sparse representation based classification (SRC). The key idea of the proposed algorithm is to improve sparsity with respect to mass margins in order to improve classification performance in CAD systems.
Methods The proposed SRC framework constructs separate dictionaries according to the types of mass margins. The underlying idea is that the separated dictionaries can enhance the sparsity of the mass class (true positives), leading to improved performance in differentiating mammographic masses from normal tissues (false positives). When a mass sample is given for classification, the sparse solutions based on the corresponding dictionaries are solved separately and combined at score level. Experiments were performed on both the public Digital Database for Screening Mammography (DDSM) and a clinical Full Field Digital Mammogram (FFDM) database (DB). In our experiments, sparsity concentration in the true class (SCTC) and the area under the receiver operating characteristic (ROC) curve (AUC) were measured to compare the proposed method with a conventional single-dictionary approach. In addition, a support vector machine (SVM) was used to compare our method with a state-of-the-art classifier extensively used for mass classification.
Results Compared with the conventional single dictionary configuration, the proposed approach improves SCTC by up to 13.9% and 23.6% on the DDSM and FFDM DBs, respectively. Moreover, the proposed method improves AUC by 8.2% and 22.1% on the DDSM and FFDM DBs, respectively. Compared with the SVM classifier, the proposed method improves AUC by 2.9% and 11.6% on the DDSM and FFDM DBs, respectively.
Conclusions The proposed dictionary configuration is found to improve the sparsity of the dictionaries, resulting in enhanced classification performance. Moreover, the results show that the proposed method outperforms the conventional SVM classifier in distinguishing breast masses with various margins from normal tissues.


Background
According to the World Health Organization, breast cancer is the leading cancer among women in terms of both incidence and mortality [1]. Screening mammography is generally regarded as the most cost-effective approach for the early detection of breast cancer [2]. For this reason, considerable research effort has been devoted to developing Computer-Aided Detection (CAD) systems that help detect breast lesions.
In practical CAD systems, it is generally difficult to achieve high sensitivity at a low false positive (FP) detection rate [3]. Due to the variability of mass margins and the inherent superposition of normal tissues in mammography, mammographic mass detection can be much more challenging than micro-calcification detection [4]. In particular, a high number of FP detections could lead to unnecessary breast biopsies, causing patient anxiety and unnecessary costs. Thus, reducing the number of FP detections is of great importance in practical mammography-based breast cancer screening.
In recent years, Sparse Representation based Classification (SRC) [5] has become increasingly important in the field of signal processing. The objective of sparse representation is to represent a signal pattern compactly, using only a small number of atoms [5]. According to [5], a high degree of sparsity is desirable for improving classification performance. Generally, higher sparsity is achieved when fewer atoms suffice to represent the signal patterns. A sparse representation can retain the discriminating and crucial information of a signal pattern. In light of this, SRC may be appropriate for capturing the distinctive patterns present in breast masses. Thus, it is reasonable to expect that applying SRC to a mammographic CAD system can improve classification performance.
The use of SRC for classification has been well documented in the research area of face recognition. Wright et al. [6] demonstrated that SRC is robust to face occlusion and showed that SRC outperformed other face recognition algorithms when classifying corrupted face images. However, only a few studies have proposed SRC-based classification algorithms for CAD systems. Liu et al. [7] designed a CAD system utilizing SRC with learned dictionaries to classify lesions of the colon and lung. Herrndsvela [8] used sparse representation for pixel-wise classification to determine whether each pixel is located in a mass region. However, that study dealt with only one type of mass margin (i.e., circumscribed masses). In addition, the feature used for classifying pixels was limited to the image intensities in the n by n neighbourhood of each pixel. Because pixel values are used directly, such image-level information is likely to be more affected by breast density and surrounding tissue structures.
The margin of a mass (i.e., its border) should be carefully examined because it is one of the most important criteria for determining whether the mass is benign or malignant [4]. Radiologists classify mass margins into the following five types [4]: circumscribed, obscured, micro-lobulated, ill-defined, and spiculated. In most studies on SRC-based CAD, breast masses are treated as a single class. However, this approach increases the diversity within the positive class and consequently degrades sparsity in the sparse representation.
To cope with the above-mentioned problem, we propose a dictionary configuration framework designed to improve sparsity with respect to mass margins. The proposed dictionary configuration is incorporated into SRC for mammographic mass classification in CAD systems. To this end, we adopt a divide-and-conquer strategy [9] for classifying masses with various margins. In the proposed dictionary configuration, we construct separate dictionaries, each corresponding to a particular type of mass margin commonly encountered in the clinical screening process. Thus, the number of dictionaries equals the number of predefined mass margin types. The sparse solutions, each of which is solved using the corresponding dictionary, are combined using score-level fusion to make the final decision. In addition, the proposed method adopts dictionary learning in order to overcome the problem of insufficient samples. Further, the classification is performed at feature level rather than at image level in order to make better use of the relevant information about mass margins and to reduce the data dimension and computational cost [6].
Experiments were conducted using the public DDSM database [10] and a clinical mammography dataset provided by a hospital in order to test the effectiveness of the proposed framework on mammograms. Experimental results show that the proposed method achieves high sensitivity at a low FP rate compared with a support vector machine (SVM) classifier, which is well established and widely used in mammographic CAD systems.
The rest of this paper is organized as follows. In Section "Methods", we briefly introduce the region-of-interest (ROI) segmentation and feature extraction methods used in this paper. Subsequently, the proposed dictionary configuration and the sparse representation based classification (SRC) are described in detail. In Section "Results and discussion", experimental results and discussion are presented. The conclusions are drawn in Section "Conclusions".

ROI segmentation and feature extraction
According to [11], mammographic CAD systems generally consist of the following four stages: image preprocessing (enhancement), ROI segmentation, feature extraction, and classification, as described in Figure 1. The focus of this paper is to develop an effective classification method so as to increase mass classification performance. Since ROI segmentation and feature extraction are prerequisite steps for classifying ROIs, we briefly describe the segmentation and feature extraction techniques used in this paper.
For image preprocessing, the mass enhancement technique [12] (developed by our group) is applied to the original mammogram images for the purpose of increasing mass detection sensitivity. In addition, the multi-level thresholding based mass segmentation algorithm proposed in [13] is used to detect and segment mass candidates (ROIs) from the enhanced mammogram. Figure 2 shows an example of an enhanced mammogram with the segmented ROIs generated by the preprocessing and ROI segmentation. As shown in Figure 2, the preprocessing increases the contrast of the mammogram, and the ROI segmentation detects and segments the mass ROIs. The segmented ROIs were used as input for feature extraction. Herein, we used four different feature subspaces: texture, shape, intensity, and spiculation features. The features used in our study are summarized in Table 1. The features listed in Table 1 were used as the feature representation during the generation of dictionaries in the proposed SRC framework.
Classification of breast masses using the proposed method

Sparse representation based classification
In this section, we first briefly review the SRC algorithm and describe how it is applied to the classification of segmented ROIs. Note that all of the features described in Table 1 are used to find the corresponding sparse representations of the segmented ROIs and to perform the classification task.
Kim et al. BioMedical Engineering OnLine 2013, 12(Suppl 1):S3 http://www.biomedical-engineering-online.com/content/12/S1/S3
To formulate the classification problem based on sparse representation, mammographic mass features are used as atoms of dictionaries. The $n_i$ training feature vectors from the $i$th class are put together into a dictionary of the $i$th class as
$$A_i = [v_{i,1}, v_{i,2}, \ldots, v_{i,n_i}] \in \mathbb{R}^{d \times n_i},$$
where $v_{i,j}$ is the $j$th training feature vector of the $i$th class, $d$ is the feature dimension, and $n_i$ is the number of samples in the $i$th class. Note that, in the present work, we are performing a binary classification task; thus, $i \in \{\text{mass}, \text{normal}\}$, representing breast masses (positive class) and normal tissues (negative class). By concatenating feature vectors from the mass and normal tissue training samples, a dictionary is generated as $A = [A_{\text{mass}}, A_{\text{normal}}]$. When classifying a test sample, the test feature vector $y \in \mathbb{R}^d$ can be approximated as a linear combination of the training feature vectors from the corresponding class $i$. Since the membership of the test feature vector to the $i$th class is initially unknown, the linear combination of $y$ can be rewritten as follows using the dictionary $A$:
$$y = A x_0, \tag{1}$$
where $x_0 = [0, \cdots, 0, \alpha_{i,1}, \alpha_{i,2}, \cdots, \alpha_{i,n_i}, 0, \cdots, 0]^T \in \mathbb{R}^n$ is a coefficient vector whose entries are zero except those belonging to the $i$th class.
Since a valid test sample $y$ is likely to be sufficiently represented using only the training samples from the same class, it is possible to find a sparse solution of Eq. (1) by solving the following $\ell_0$-minimization problem [6]:
$$\hat{x}_0 = \arg\min_x \|x\|_0 \quad \text{subject to} \quad Ax = y, \tag{2}$$
where $\|\cdot\|_0$ denotes the $\ell_0$-norm, which counts the number of nonzero entries in an input vector. However, the $\ell_0$-minimization problem is NP-hard (Non-deterministic Polynomial-time hard). Donoho [14] showed that, when the solution is sufficiently sparse, the solution of the $\ell_0$-minimization can be obtained by $\ell_1$-minimization. Therefore, Eq. (2) can be rewritten as [6]
$$\hat{x} = \arg\min_x \|x\|_1 \quad \text{subject to} \quad Ax = y. \tag{3}$$
Then, we compute the residual for each class as follows:
$$r_i(y) = \|y - A\,\delta_i(\hat{x})\|_2, \tag{4}$$
where $\delta_i$ is the characteristic function which selects the coefficients associated with the $i$th class.
Note that a small residual means that the test feature vector is sufficiently approximated as a linear combination of the training feature vectors from the corresponding class. Therefore, the test feature vector $y$ can be classified to the class that minimizes the residual:
$$\text{identity}(y) = \arg\min_i r_i(y). \tag{5}$$
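The residual-based decision rule described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: the equality-constrained $\ell_1$ problem is replaced here by its common penalized (lasso) relaxation, solved with a plain iterative soft-thresholding loop, and the names `ista_l1`, `src_classify`, and the parameter `lam` are hypothetical.

```python
import numpy as np


def ista_l1(A, y, lam=0.01, n_iter=500):
    """Approximate sparse code: min_x 0.5*||A x - y||_2^2 + lam*||x||_1.

    ASSUMPTION: a penalized relaxation of the l1 problem, solved by ISTA.
    """
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))  # gradient step on the data term
        x = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return x


def src_classify(A, labels, y, lam=0.01):
    """Return (predicted_label, residuals) using class-wise residuals."""
    labels = np.asarray(labels)
    x_hat = ista_l1(A, y, lam)
    residuals = {}
    for c in np.unique(labels).tolist():
        x_c = np.where(labels == c, x_hat, 0.0)  # keep only class-c coefficients
        residuals[c] = float(np.linalg.norm(y - A @ x_c))
    # The predicted class is the one with the smallest residual.
    return min(residuals, key=residuals.get), residuals
```

Here the columns of `A` are the training feature vectors and `labels` holds one class name per column; a test vector built mostly from mass atoms should yield a smaller residual for the mass class.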

The proposed dictionary configuration
In this section, we explain the proposed dictionary configuration method. For this purpose, we first describe the dictionary learning method adopted in this paper. Generally, dictionary generation can be categorized into two approaches: the analytic approach (e.g., wavelets) and the learning-based approach (e.g., K-SVD, FDDL). The advantages of the learning-based approach are the much finer-tuned (i.e., more sophisticated) dictionaries it produces compared to the analytic approach, and its significantly better performance in applications [15]. It should be pointed out that mammographic mass classification is generally quite difficult due to the large variability in the appearance of mass patterns [4], such as irregular sizes, obscured borders, and complex mixtures of margin types. Therefore, learning-based dictionary generation is more appropriate for constructing a dictionary that aims at maximizing mass classification performance, thanks to its capability of characterizing a wide variety of mammographic mass patterns. In typical mammographic CAD design, the number of positive training samples is often insufficient because the training samples must be divided into small subsets according to margin type.
However, to correctly classify the large variety of mass types found in clinical practice, the dictionaries should contain a sufficient number of mass samples for each mass type so that SRC achieves better classification performance [16]. Also note that the goal of using sparse representation in our method is to express a given mass example as a linear combination of a small number of atoms taken from a dictionary. Hence, large dictionaries may lead to a better sparse solution than small dictionaries [17,18]. In order to effectively represent mass examples with the given atoms, Fisher discrimination dictionary learning (FDDL) [18] is incorporated into the proposed dictionary configuration method. FDDL aims to learn a structured dictionary whose sub-dictionaries have specific class labels. Each sub-dictionary of the learned dictionary has good representation power for samples from the corresponding class but poor representation power for samples from other classes [18]. FDDL iteratively updates the dictionary so that the learned dictionary has a smaller within-class scatter while maintaining a larger between-class scatter, resulting in improved SRC performance.
Table 1 Features used in this study (partial)
[23][24][25] | Uniform LBP histograms are computed from the segmented object; an LBP operator with a circularly symmetric neighbourhood of P members on a circle of radius R is employed; the three-resolution combination is used by setting the LBP parameters (P,R) to (8,1), (8,2), and (8,3) | 354
Spatial gray level dependence (SGLD) [26] | 13 features, namely, "correlation", "energy", "entropy", "inertia", "inverse difference moment", "sum average", "sum variance", "sum entropy", "difference energy", "difference variance", "difference entropy", "information measure of correlation 1", "information measure of correlation 2" are extracted from each SGLD matrix at six different inter-pixel distances (d = 1 …
In the general mass classification task, suspicious regions are classified as masses or normal tissues, i.e., a binary classification problem. Herein, true masses are assigned to the positive class and normal tissues to the negative class. However, treating all masses as one class increases the diversity within the positive class and degrades sparsity in the sparse representation. To cope with this problem, we propose a dictionary configuration framework that improves sparsity with respect to mass margins within the conventional SRC framework for CAD systems. Note that, as shown in Figure 3, the proposed dictionary configuration is used at the classification stage. The key property of the proposed dictionary configuration is that it increases the sparsity of each dictionary, because each dictionary contains positive samples that have similar margin characteristics. In the proposed SRC framework, T dictionaries are learned separately, where T is the number of types of mass margins. It should be noted that each dictionary contains features from mass samples of a single margin type together with features from normal tissues.
Figure 3 Proposed dictionary configuration method description. Note that the proposed dictionary configuration has been performed at the classification stage shown in Figure 1.
In addition, one major problem of typical CAD systems is the large number of false positives. Hence, a dictionary built from all samples is likely to be unbalanced because of the difference between the numbers of true-positive and false-positive samples. This would bias the sparse solution based on that dictionary toward the class that contains the larger number of samples. Consequently, the biased sparse solution could cause a low true positive rate and a high true negative rate within an SRC framework. To address this issue, the proposed SRC framework performs random sampling on the negative samples in order to obtain a balanced dictionary.
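The random sampling of negative samples can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the target class ratio, so a 1:1 ratio is assumed, and the function name `balance_dictionary` and its arguments are hypothetical.

```python
import numpy as np


def balance_dictionary(mass_feats, normal_feats, seed=0):
    """Stack all mass atoms with a random subset of normal atoms.

    mass_feats: (d, n_mass) array; normal_feats: (d, n_normal) array.
    ASSUMPTION: normals are undersampled to the number of mass atoms (1:1).
    """
    rng = np.random.default_rng(seed)
    n_mass, n_normal = mass_feats.shape[1], normal_feats.shape[1]
    k = min(n_mass, n_normal)  # never request more normals than exist
    keep = rng.choice(n_normal, size=k, replace=False)  # sample without replacement
    D = np.hstack([mass_feats, normal_feats[:, keep]])
    labels = np.array(["mass"] * n_mass + ["normal"] * k)
    return D, labels
```

With one such call per margin type, each margin-specific dictionary would hold the mass atoms of that type and an equally sized random subset of the normal atoms.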
After constructing the dictionaries, the FDDL algorithm [18] is applied separately to each dictionary; this can improve the sparsity of each dictionary. We now explain how to obtain the sparse solution for each type of mass margin. Let us denote the learned dictionaries by $D_t$, where $t = 1, \ldots, T$ is the index of the mass margin type and $T$ is the total number of margin types. Given $D_t$ and a test feature vector $y$, the sparse solution of $y$ for each $D_t$ can be obtained using Eq. (3). Without loss of generality, the sparse solution for each type of mass margin can be defined as follows:
$$\hat{x}_t = \arg\min_x \|x\|_1 \quad \text{subject to} \quad D_t x = y. \tag{6}$$
The residuals of the sparse solutions derived from each dictionary, computed as in Eq. (4), are fused at score level into one residual for each of the mass and normal classes (Eq. (7)). Note that in Eq. (7), the fused residual represents the reconstruction error with respect to the given class $i$. Therefore, the fused residual is used for the final decision. A smaller residual indicates that the test sample is sufficiently approximated by the training samples from the corresponding class. Thus, the test sample is classified to the class that minimizes the fused residual. In detail, in the case of a normal ROI, the residual of the normal class should be smaller than that of the mass class for all dictionaries. Therefore, the fused residual is also smaller for the normal class. In the case of a mass ROI, the residual of the mass class should be small compared to that of the normal class in the dictionary of the corresponding margin type. Therefore, the fused residual of the mass class should be smaller than that of the normal class.
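The score-level fusion over margin-specific dictionaries can be sketched as below. Two points are assumptions rather than the method of this paper: the sparse solutions are computed with a penalized lasso relaxation (a simple ISTA loop), and the per-dictionary residuals are fused by taking their mean, which is only one possible fusion rule. The names `sparse_code`, `class_residuals`, and `fused_classify` are hypothetical.

```python
import numpy as np

CLASSES = ("mass", "normal")


def sparse_code(D, y, lam=0.01, n_iter=500):
    """ISTA for min_x 0.5*||D x - y||_2^2 + lam*||x||_1 (assumed relaxation)."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - step * (D.T @ (D @ x - y))
        x = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)
    return x


def class_residuals(D, labels, y, lam=0.01):
    """Reconstruction error of y using only the atoms of each class."""
    x_hat = sparse_code(D, y, lam)
    return {c: float(np.linalg.norm(y - D @ np.where(labels == c, x_hat, 0.0)))
            for c in CLASSES}


def fused_classify(dictionaries, y, lam=0.01):
    """dictionaries: list of (D_t, labels_t), one entry per margin type.

    ASSUMPTION: fusion = mean of the class residuals over the T dictionaries.
    """
    per_dict = [class_residuals(D, lab, y, lam) for D, lab in dictionaries]
    fused = {c: float(np.mean([r[c] for r in per_dict])) for c in CLASSES}
    return min(fused, key=fused.get), fused
```

Under this assumed rule, a mass ROI that is well reconstructed by the mass atoms of one margin-specific dictionary lowers the fused mass residual, while a normal ROI lowers the normal residual in every dictionary.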

Experimental setup
The proposed dictionary configuration based classification method was tested on both a public dataset, the Digital Database for Screening Mammography (DDSM) [10], and a clinical dataset provided by Samsung Medical Center (SMC). From the DDSM DB, we collected 303 mammograms (each with one mass) containing benign or malignant masses; this set is referred to as "Dataset 1". The second dataset consists of a total of 165 clinical mammograms (containing benign or malignant masses) and is referred to as "Dataset 2". Figure 4 shows the characteristics of Dataset 1 and Dataset 2 in terms of mass margin and breast density. It can be seen from Figure 4 that the datasets contain a variety of the mass margins and breast densities commonly encountered in clinical practice. In addition, it is known that masses in dense breasts are hard to detect and classify because they are concealed by the surrounding parenchyma [19]. As shown in the statistics, we included mass ROIs with dense tissue in order to cover samples that are hard to classify.
Using the segmentation method described in the "Methods" section, a total of 2,725 ROIs (234 masses and 2,491 normal tissues) and 691 ROIs (151 masses and 540 normal tissues) were automatically generated from Dataset 1 and Dataset 2, respectively. The DDSM provides annotations of the true masses present in each image [10], while for each clinical mammogram (from SMC), the region of interest containing the mass was annotated by a Mammography Quality Standards Act-approved radiologist. These annotations were considered as the ground truth in our experiments. Using the ground truth information, a generated ROI was considered a true mass only if it met the following two criteria [20]: (1) the centroid of the segmented region lies within the annotated area, and (2) the segmented region overlaps the true mass region by more than 25%.
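The two ground-truth criteria can be expressed as a short check on binary masks. This is a sketch under an explicit assumption: the text does not state the denominator of the 25% overlap, so the overlap is taken here as the intersection area divided by the annotated mass area. The function name `is_true_mass` and its arguments are hypothetical.

```python
import numpy as np


def is_true_mass(roi_mask, gt_mask, min_overlap=0.25):
    """Apply the centroid and overlap criteria to boolean masks of equal shape.

    ASSUMPTION: overlap = intersection area / annotated (ground-truth) area.
    """
    roi_mask = np.asarray(roi_mask, dtype=bool)
    gt_mask = np.asarray(gt_mask, dtype=bool)
    if not roi_mask.any() or not gt_mask.any():
        return False
    rows, cols = np.nonzero(roi_mask)
    cy, cx = int(round(rows.mean())), int(round(cols.mean()))  # ROI centroid
    centroid_inside = bool(gt_mask[cy, cx])  # criterion (1)
    overlap = np.logical_and(roi_mask, gt_mask).sum() / gt_mask.sum()
    return bool(centroid_inside and overlap > min_overlap)  # criterion (2)
```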
The evaluation protocol used in this paper was based on a 10-fold cross validation scheme, i.e., 90% of the mass and normal tissue ROIs were used as training samples to construct the dictionaries, while the remaining 10% were used as testing samples. To obtain stable classification results, 30 independent runs of 10-fold cross validation were executed. Thus, all reported results were averaged over 30 runs of 10-fold cross validation.
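The repeated cross validation protocol can be sketched as an index generator. The text does not say whether the folds were stratified by class; the sketch below assumes they were, so that each fold holds roughly 10% of the masses and 10% of the normal tissues. The name `repeated_stratified_folds` is hypothetical.

```python
import numpy as np


def repeated_stratified_folds(labels, n_folds=10, n_runs=30, seed=0):
    """Yield (run, fold, train_idx, test_idx); each class is spread over all folds.

    ASSUMPTION: stratified splitting, reshuffled independently in every run.
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    for run in range(n_runs):
        fold_of = np.empty(len(labels), dtype=int)
        for c in np.unique(labels):
            idx = rng.permutation(np.flatnonzero(labels == c))
            fold_of[idx] = np.arange(len(idx)) % n_folds  # deal samples round-robin
        for fold in range(n_folds):
            test_idx = np.flatnonzero(fold_of == fold)
            train_idx = np.flatnonzero(fold_of != fold)
            yield run, fold, train_idx, test_idx
```

A performance measure would then be computed on each of the 300 train/test splits and averaged.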
To objectively quantify the improvement in sparsity over the conventional single dictionary configuration, which places the various mass margins into a single dictionary, the sparsity concentration in the true class (SCTC) is used (Eq. (8)). In its definition, $\delta_{\text{true}}$ is the characteristic function that selects the sparse coefficients associated with the true class of a test sample $y$, and $\hat{x}_t$ represents the sparse solution for each type of mass margin (see Eq. (6) for the definition).
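One plausible reading of such a concentration measure, assumed here for illustration rather than taken from Eq. (8), is the fraction of the $\ell_1$ norm of a sparse solution that falls on the atoms of the true class. The function name `sctc` and its arguments are hypothetical.

```python
import numpy as np


def sctc(x_hat, atom_labels, true_label):
    """Share of the l1 norm of x_hat carried by atoms of the true class.

    ASSUMPTION: this ratio form is illustrative, not the paper's exact definition.
    """
    x_hat = np.asarray(x_hat, dtype=float)
    total = np.abs(x_hat).sum()
    if total == 0.0:
        return 0.0  # an all-zero solution concentrates nowhere
    on_true = np.abs(x_hat[np.asarray(atom_labels) == true_label]).sum()
    return float(on_true / total)
```

Under this reading, a value close to 1 means that almost all nonzero coefficients select atoms of the correct class.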
In order to evaluate the classification performance of the proposed SRC framework, the area under the receiver operating characteristic (ROC) curve [21] (denoted by AUC) was used, because AUC is a commonly used performance index for evaluating classification algorithms developed for mammographic CAD applications [11]. To compute the ROC curve for the proposed method, the difference between $\mathrm{Res}_{\text{mass}}$ and $\mathrm{Res}_{\text{normal}}$ is used as a confidence value: if a test sample has a lower residual for the mass class than for the normal class, it is reasonable to assume that the sample is more similar to the mass class. For comparison, a state-of-the-art support vector machine (SVM) classifier [22] with a radial basis function kernel was employed.
Table 2 shows the SCTC values (defined in Eq. (8)) for each mass margin using the conventional single dictionary configuration and the proposed dictionary configuration. Note that the SCTC value of each mass margin was computed when the corresponding mass margin was used as a test sample. Also note that the SCTC values in Table 2 have been averaged over 30 runs. The experimental results indicate that the proposed dictionary configuration works well in terms of improving the sparsity of the dictionary. In particular, the proposed method improved SCTC by up to 13.9% and 23.6% on Dataset 1 and Dataset 2, respectively. Table 3 shows the AUC values for both the single dictionary configuration and the proposed dictionary configuration. As shown in Table 3, the proposed dictionary configuration attains considerably better AUC than the single dictionary configuration. This result indicates that the improved sparsity has a positive influence on the classification performance.
Also, it can be seen in Table 4 that the proposed method considerably outperforms the SVM classifier: the proposed method increases classification performance by 8.2% and 22.1% (in terms of AUC) on Dataset 1 and Dataset 2, respectively, compared to the SVM classifier. These results indicate that the proposed method has high potential for reducing false-positive detections in mammographic CAD systems. Figure 5 shows examples of correctly and incorrectly classified mass ROIs. As shown in Figure 5, correctly classified mass ROIs have clearer hyper-dense core regions and are more distinguishable from the surrounding tissues than incorrectly classified mass ROIs. This result indicates a weakness of the proposed method: mass ROIs need to have characteristics that stand out from the surrounding tissues. Moreover, it should be noted that correctly classified mass ROIs have many similar samples in the training set. This suggests that, to correctly classify the currently misclassified mass ROIs, the training set should contain more samples with characteristics similar to theirs.
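The AUC evaluation based on the residual difference can be sketched as follows. The sign convention (normal residual minus mass residual, so that larger values mean more mass-like) is an assumption, and the AUC is computed with the standard pairwise rank formulation rather than any specific tool used in this paper. The names `confidence` and `auc_from_scores` are hypothetical.

```python
import numpy as np


def confidence(res_mass, res_normal):
    """Higher value = more mass-like. ASSUMPTION: Res_normal - Res_mass."""
    return np.asarray(res_normal, dtype=float) - np.asarray(res_mass, dtype=float)


def auc_from_scores(scores, is_mass):
    """AUC as the probability that a mass outscores a normal ROI (ties count 0.5)."""
    scores = np.asarray(scores, dtype=float)
    is_mass = np.asarray(is_mass, dtype=bool)
    pos, neg = scores[is_mass], scores[~is_mass]
    diff = pos[:, None] - neg[None, :]  # all mass-vs-normal score pairs
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())
```

For example, scores of 0.9 and 0.4 for two masses and 0.5 and 0.1 for two normal ROIs rank three of the four mass-normal pairs correctly, giving an AUC of 0.75.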

Conclusions
In this paper, we proposed a new sparse representation based classification (SRC) algorithm based on a mass type-specific dictionary configuration for mammographic CAD systems. It has been found that the proposed method is beneficial for improving mass type-dependent sparsity. In addition, experimental results validate that the proposed dictionary configuration algorithm can improve the sparsity of the dictionary, thus leading to increased classification performance. Furthermore, experimental results show that the proposed method is considerably better than the conventional SVM classifier (extensively used for classifying breast masses in mammographic CAD systems) at differentiating mammographic masses with various margins from normal tissues.
Table 4 Comparisons of AUC between the SVM and proposed dictionary configuration
For future work, information fusion (e.g., a complementary design) across different levels (i.e., image level and feature level) should be investigated to obtain better classification performance.