Microaneurysms detection in color fundus images using machine learning based on directional local contrast

Background As one of the major complications of diabetes, diabetic retinopathy (DR) is a leading cause of visual impairment and blindness due to delayed diagnosis and intervention. Microaneurysms appear as the earliest symptom of DR. Accurate and reliable detection of microaneurysms in color fundus images has great importance for DR screening. Methods A microaneurysms' detection method using machine learning based on directional local contrast (DLC) is proposed for the early diagnosis of DR. First, blood vessels were enhanced and segmented using improved enhancement function based on analyzing eigenvalues of Hessian matrix. Next, with blood vessels excluded, microaneurysm candidate regions were obtained using shape characteristics and connected components analysis. After image segmented to patches, the features of each microaneurysm candidate patch were extracted, and each candidate patch was classified into microaneurysm or non-microaneurysm. The main contributions of our study are (1) making use of directional local contrast in microaneurysms' detection for the first time, which does make sense for better microaneurysms' classification. (2) Applying three different machine learning techniques for classification and comparing their performance for microaneurysms' detection. The proposed algorithm was trained and tested on e-ophtha MA database, and further tested on another independent DIARETDB1 database. Results of microaneurysms' detection on the two databases were evaluated on lesion level and compared with existing algorithms. Results The proposed method has achieved better performance compared with existing algorithms on accuracy and computation time. On e-ophtha MA and DIARETDB1 databases, the area under curve (AUC) of receiver operating characteristic (ROC) curve was 0.87 and 0.86, respectively. The free-response ROC (FROC) score on the two databases was 0.374 and 0.210, respectively. The computation time per image with resolution of 2544×1969, 1400×960 and 1500×1152 is 29 s, 3 s and 2.6 s, respectively. Conclusions The proposed method using machine learning based on directional local contrast of image patches can effectively detect microaneurysms in color fundus images and provide an effective scientific basis for early clinical DR diagnosis.


Fig. 1 Color fundus image of DR
automatic and reliable detection of MA in color fundus images using computer-aided techniques is significant for the early screening and diagnosis of DR, which also contributes to the identification of those eyes at risk of developing CSME [10]. Computer-aided automatic analysis of color fundus images is highly efficient and has been widely applied in the diagnosis of DR [12][13][14][15], which could reduce the workload of ophthalmologists and improve the efficiency of DR screening. With the development of computer-aided diagnose (CAD) of DR, due to the importance of MA for DR diagnose, there are more and more studies on automatic detection of MA recently. Saha et al. [16] detected red lesions and bright lesions using Naive Bayesian (NB) classifier and support vector machine (SVM), but they only used one database including 100 images for both training and testing. Seoud et al. [17] proposed a method based on random forest (RF) for the detection of both MA and HM using dynamic shape features, which did not need any prior segmentation of lesions. But the lesions linked to blood vessels were missed leading to false-negatives (FN) [15]. Srivastava et al. [18] proposed filters on different grid sizes and combined with multiple-kernel learning (MKL) and SVM to detect MA and HM dealing with the false-positives (FP) due to small blood vessel (BV) segments. Using MKL was found to improve the performance as compared to using a single grid size, but with a disadvantage of high computations for higher grid size. Zhou et al. [19] proposed an unsupervised classification method based on sparse principal component analysis (PCA) for MA detection, which can avoid the class imbalance problem. But there are some FPs during feature extraction. Ren et al. [20] detected MA by adaptive over-sampling boosting (ADOBoosting), while Dai et al. [21] and Dashtbozorg et al. [22] applied the random under-sampling boosting (RUSBoosting) classifier to detect MA, and all of them performed well with respect to class-imbalance problem. Wu et al. [23] used peak detection and region growing to get MA candidates, then detected MA based on K-nearest neighbor (KNN), with a relatively low FROC score of 0.273 on e-ophtha MA database. Wang et al. [24] proposed a method using singular spectrum analysis and KNN classifier for MA detection, which obtained some FNs due to missing of subtle or low contrast or blurry-outlined MA. They also missed few MA during candidate extraction. Adal et al. [25] presented a multi-stage approach for automated detection of longitudinal retinal changes due to MA and dot HM (small red lesions). Derwin et al. [26] detected MA by applying local binary pattern (LBP) for texture features' extraction and followed by SVM for classification. Javidi et al. [27] and Wei et al. [28] detected MA by discriminative dictionary learning (DDL) and multi-feature fusion dictionary learning (MFFDL), respectively. The former depended heavily on original grayscale feature dictionary, and since there is a large variability in color, luminosity, and contrast both within and between retinal images, using single grayscale feature will affect the performance.
As deep learning is an emerging computer vision application in medical image processing and proving to be of great help to mankind in machine learning [15], several MA detection methods based on convolutional neural networks (CNN) [8,[29][30][31][32] were proposed. The main limitation of CNN is the requirement of larger training time [15].
Moreover, some studies mentioned above for MA detection only used one database for training and testing [19,[25][26][27][28]32]. Large-scale data on heterogeneous patient cohorts are needed for full validation. Considering the limitations aforementioned, mainly improvement of MA detection results, lower computation time and enough data for validation are mainly required. We aim to propose a new method using machine learning algorithms based on directional local contrast (DLC) feature for MA detection, and validate it on two different databases. The first step is preprocessing to improve the image quality, including segmentation and removal of blood vessel (BV). An improved enhancement method based on Hessian matrix eigenvalue analysis was used for BV segmentation. With main structure of BV eliminated, MA candidates were extracted based on shape characteristics and connected components' analysis. Next, features were extracted from each candidate to differentiate the MA and non-MA. Last, machine learning methods were applied for classification based on the extracted features and comparing their performances on MA detection.
The main contributions of our study are (1) using DLC feature which has not been used in existing methods for MA detection. DLC indicates the local contrast characteristics in a neighborhood which was applied as feature to distinguish MA, and it was proven that the method performs better using DLC than not. (2) Applying three different machine learning methods for MA classification and comparing their performance, especially, Naive Bayesian classification which is simple with low computational complexity and efficient on accuracy and computation time. The overview of the proposed method is depicted in Fig. 2.
The remaining paper is organized as follows:"Results" section presents the results of our proposed method. In "Discussion" and "Conclusions" sections, discussion and conclusions are presented. The proposed method is described in "Method" section including experiment and evaluation measures.

Results
Results of the proposed method were evaluated on lesion level, including ROC curves of the three classifiers, computation time, and comparison with existing methods on FROC curve and time.

ROC curves
The ROC curve and corresponding AUC of MA detection results on e-ophtha MA and DIARETDB1 databases achieved by three classifiers (Naive Bayesian, KNN and SVM) are shown in Fig. 3. In this work, the directional local contrast of the center point p of each MA candidate patch was calculated, including totally 12 values of different directions of DLC. Without using the DLC feature, the results of three classifiers are shown in Fig. 4. As can be seen from Figs. 3 and 4, the proposed method performs better using DLC feature compared with not.
It can be seen that Naive Bayesian classifier performs better than SVM and KNN, with the AUC of 0.871 and 0.859 on e-ophtha MA and DIARETDB1 databases, respectively. Therefore, Naive Bayesian was selected for evaluation of MA detection performance in this study.

Computation time
To run the proposed method for MA detection, MATLAB 2016a (The MathWorks, Inc., Natick, Massachusetts, USA) was used in the environment of 64-bit Windows 10 operating system with 2.9 GHz Intel Core i5 CPU and 16GB memory. The computation time mainly depends on image resolution and the number of MAs labeled per image. The average computation time per image in the test set of e-ophtha MA database and DIARETDB1 database is shown is Table 1. This means the larger the resolution is, the more the computation time is. The more the MAs labeled per image, the greater the number of extracted MA candidates, resulting in more computation time for making candidate patches, feature extraction and classification. The average time per image in each processing step is shown in Table 2 and compared with that of Dashtbozorg et al. [22] Comparison with different methods Figure 5 shows FROC curves of the proposed MA detection method on e-ophtha MA and DIARETDB1 databases compared with existing algorithms. Tables 3 and 4 show comparison of Sensitivity between the proposed MA detection method and existing algorithms at different FPI values on e-ophtha MA and DIARETDB1 databases, respectively. Table 5 shows comparison of average computation time for different methods. It can be seen that the proposed method has better performance on MA detection compared to existing algorithms. Especially, as to those using deep learning techniques, the algorithms proposed by Eftekhari et al. [8] and Chudzik et al. [31] performed very well on FROC score, using two-step CNN and FCN, respectively. It is known that deep learning with CNN for classification has a better performance on accuracy but more computation time for training, requirement of a large number    In DIARETDB1 database, MA detection results corresponding to different confidences are shown in different colored circles in Fig. 8a. The evaluation shown in Fig. 8b is based on the confidence ≥ 75% . As can be seen from Fig. 8, with a standard of ground truth confidence ≥ 75% , a part of FPs belong to those labeled with a confidence of less than 75% . For example, the two FPs shown in Fig. 7b have a confidence of 50% (shown in Fig. 8a).

Discussion
In this study, a method using machine learning based on directional local contrast is proposed for microaneurysms' detection in color fundus images. The effectiveness of the method is proved through experiments on different databases.

Advantages
The accurate segmentation of BV main structure could improve the accuracy of MA detection. It has been proven that using Jerman's enhanced function can achieve the accuracy of 95.4% in BV segmentation [33], with better performance compared to the widely used Hessian-based enhanced function proposed by Frangi [34]. For the features'  Fig. 4). Through the analysis of characteristics of MA, it is found that the DLC of MA and non-MA patch is significantly different (Fig. 13). The accuracy of MA detection could be improved using the DLC feature. In addition, compared with existing methods, the proposed method performs better with relatively high accuracy and less computation time. As compared with methods using deep learning with CNN in high accuracy, whose main limitations are widely known that long training time and high requirement of computational resources (best with GPU), the proposed method is simpler and time saving.  Fig. 6b, the MA has an irregular shape and a relatively large area. It is a typical example that leads to misjudgment of the algorithm based on the shape features. As the shape features used to distinguish between MA and HM (shown in Table 6) that missed MA might be considered as HM by the classifier. Second, as shown by the red circle in Fig. 7c, this MA was missed in the process of MA candidate regions' extraction. Although the applied method for extracting MA candidate regions was efficient, it might miss a small part of MA in candidate regions, which caused the increase of FNs in the final MA detection results and affected the overall accuracy of MA detection. The main reasons are as follows: in the first case, MA was attached to BV and removed with it. This case is relatively rare, because MA usually appears at the end of BV which was isolated from the main structure of BV and preserved as an independent slender structure during BV removing, as shown in Fig. 9d, e. In the second case, MA was discarded together with the attached slender segment of BV end in the step of removing slender structures. Thus, the threshold for removing slender structures is important. If it was set too small, some MAs would be removed together. If it was too large, some slender and small BV might still be left. In another case, when MA characteristics are not obvious, as shown by the red circle in Fig. 7c, this missed MA is too small and has low contrast to background, which looks more like image noise. It is difficult to judge whether it is MA.

Typical misclassification analysis
MA detection is a difficult task that there is disagreement between different experts in the diagnosis of MA, as shown in Fig. 8. As the DIARETDB1 database shows, from the observation of the image in Fig. 7, it can be seen that the two FPs in Fig. 7b are very similar to actual MA, while only two experts label them as MA (confidence of 50% ). Because the MA structure is too small, unaided eye observation is also susceptible to image quality. From the viewpoint of clinical application, the algorithmic prediction and clinical diagnosis could be mutually complementary. If there is MA detected in an image, the ophthalmologist can further diagnose whether the image corresponds to DR by MA detection result. Therefore, some FN on lesion level will not affect the MA detection performance on image level, and has little impact on clinical diagnosis of ophthalmologist. Or a lower confidence level can be taken as ground truth for ophthalmologists to screen. Thus, the application of computer-aided MA detection in assisting clinical DR screening deserves further investigation.

Future works
For the current limitations above, in the subsequent research, the image denoising operation in preprocessing progress should be improved. In addition, to improve the generalization of the method, more databases that cover more heterogeneous samples are needed for as fully validation as possible, including real samples from hospitals for clinical validation.

Conclusions
Making use of directional local contrast feature can improve the performance of machine learning method for MA detection. Naive Bayesian classification is more accurate than SVM and KNN in this study. The proposed MA detection algorithm showed good performance on the two databases with AUC of 0.87 and 0.86, FROC score of 0.374 and 0.210, respectively. In addition, compared with deep learning-based methods, the proposed method is simple and takes less computation time, of 29 s for image with 2544 × 1969 resolution and 3 s for image with 1400 × 960 resolution and 2.6 s with 1500 × 1152 resolution. It is known that MA is important in DR progression, which is associated with DR severity and macular edema. Therefore, the proposed method with accurate detection of MA and less computation time is promising for clinical application of DR diagnosis and identification of eyes at risk of developing macular edema.

Methods
This study uses machine learning based on directional local contrast to detect MA on preprocessed and segmented image patches. First, preprocessing was used to enhance the image quality. Then blood vessels (BVs) were segmented using an improved enhanced function based on Jerman's work [33]. With BVs eliminated, candidate regions of MA were then extracted. Next, processed images were divided into patches according to the location of the MA candidate regions. Finally, based on the features extracted, candidate patches were classified to differentiate the MA and non-MA. The process is illustrated in Fig. 2. To comprehensively evaluate the algorithm on different patient cohorts, the proposed method was validated on two independent databases (e-ophtha MA and DIARETDB1) on lesion level.

Shade correction
BVs, HMs and MAs all appear dark red in color fundus images (as shown in Fig. 1). To make the dark red regions more obvious, preprocessing progress was performed in the Field of View (FOV). Compared to the other two channels, the green channel has higher contrast between target area and background, and therefore contains the most abundant information. Thus, the first step of preprocessing was to extract the green channel image I g . Then, median filter on the green channel image I g was applied to obtain the background I bg . The filter size was larger than the maximal BV width in the fundus image. In this study, the filter size of 50 × 50 was selected. Finally, the result derived by the median filter I bg was subtracted from the green channel image I g to eliminate the background and achieve the image with the effect of shade correction, named I eq . The main structure of BVs in I eq was clear. Other dark regions that include HM and MA in the original image were also highlighted in I eq . But the overall image was dark with low contrast. To make the image clearer, the mean gray value of the green channel image I g in FOV was added on image I eq to maintain in the same gray range as the green channel image. As follows: The result of shade correction ( I sc ) is shown as Fig. 9b where BV can be obviously observed.

Segmentation of blood vessels
For accurate MA detection, it is important to separate BV and MA which are similar in color and brightness. Thus, BVs were segmented and eliminated to minimize the influence on MA detection. The most widely used BV segmentation methods are machine learning [35] and deep learning [36,37], which have high accuracy but are complicated and timeconsuming. Inspired by Jerman et al. 's work [33], which was improved from the widely used method for BV enhancement of Frangi et al. 's work [34], this study constructed a response function to enhance BV by analyzing the eigenvalues of Hessian matrix, to achieve simple, fast, and accurate BV segmentation.
For a shade-corrected image I sc (x, y) , first, Gaussian filter was applied to reduce the noise. Different σ values were selected for Gaussian filter to obtain different enhancement functions, and the largest response function was selected to derive BV enhanced image. The value range of σ was 3 ∼ 6 and the step was 1 in this study.
The second-order partial derivatives of x and y on the filtered image were computed respectively, then convolved with I sc (x, y) , getting L xx , L xy , L yy , respectively. The Hessian matrix H was composed as follows: (1) I sc = I eq + mean(I g ) Then, two eigenvalues 1 and 2 could be obtained from Hessian matrix H as follows, where tmp represented an intermediate variable.
If |µ 1 | ≤ |µ 2 | , then 1 = µ 1 , 2 = µ 2 . Otherwise, 1 = µ 2 , 2 = µ 1 (that is, to ensure | 2 | ≥ | 1 | ). 2 was normalized as follows: Enhancement function was calculated as follows: The BV enhancement result is shown in Fig. 9c. Because the BV enhancement process might also enhance some other dark red regions in the original image, MA might be enhanced. The green circles in Fig. 9c indicate the MAs contained in the BV enhancement image. First, the Ostu's adaptive threshold method was used to binarize the BV enhanced image in order to obtain the preliminary BV segmentation result, as shown in Fig. 9d. MAs might be contained in the result, shown as the green circles in Fig. 9d. To remove MAs from BV segmentation result, small regions were excluded to obtain the final BV main structure segmentation result, as shown in Fig. 9e. Compared with Fig. 9d, MAs have been removed.

Contrast enhancement and noise reduction
Due to the tiny structure of MA, the accuracy of MA detection is sensitive to image quality. Therefore, further preprocessing was performed to improve the image quality.
To enhance the contrast between background and highlighted dark regions (including BVs, HMs, and MAs) in the shade-corrected image I sc , the image was processed with contrast-limited adaptive histogram equalization (CLAHE) method. A 7 × 7 Gaussian filter was then applied to reduce the influence of noise. The result ( I gauss ) is shown in Fig. 10b.

Blood vessels removal
To eliminate the effects of BV on MA detection, on the filtered image ( I gauss ), the gray value of the corresponding position of the segmented BV was directly set to zero, obtaining I gsBV0 as shown in Fig. 10c.

Extraction of MA candidate regions
In the previous step of BV removal, the gray value of BV has been directly set to zero. To further make all the other parts black except for the MA candidates, first, the mean gray value of original green channel image in FOV (see Eq. 1) was subtracted from I gsBV0 . Then contrast stretch was used, obtaining I CS as shown in Fig. 10d. Finally, the Ostu's adaptive threshold method was used to binarize to obtain the preliminary MA candidate regions I can1 , as shown in Fig. 10e. In BV segmentation, the small regions were excluded to ensure that candidate regions of MA were not included in the segmented BV. As a result, the preliminary MA candidate regions contained some small BV segments which appear as slender structures in Fig. 10e.
Then, the connected component analysis and shape characteristics were used to refine the results of MA candidates. Since MAs appear as dots, the remaining slender structures could be excluded by the ratio (named R) of the length of the major axis and the minor axis of each candidate region (or connected component). If the ratio R of a candidate region exceeds the set threshold, the region would be discarded.
The threshold was set based on the mean value of the ratio R of all connected components in the preliminary MA candidate regions. Considering the proportion of the number of the remaining slender structure, through several experiments, 1.2 was finally selected to multiply the mean value, formulated as follows: R(i), i = 1, 2, . . . , n, where n is the total number of connected components in the preliminary MA candidate regions. Finally, in remaining candidate regions, those with area larger than 500 pixels were discarded, since MAs are often much smaller (maximum MA with 100 ∼ 300 pixels in existing works [22,23,28,38], and 361 pixels labeled in e-ophtha MA database). The final result of MA candidate regions I can is shown in Fig. 10f, where green circles indicate ground truth of MAs.

Image patches
MA is difficult to observe and detect at the whole image level due to its tiny structure. Therefore, the detection of MA was based on image patches.
All images in training set and test set were divided into 25 × 25 patches according to the location of candidate regions, and a suitable classification method was selected to determine whether each test patch contains MA, so as to realize MA detection. MA patches and non-MA patches are shown in Fig. 11.

Features extraction and classification
Although the interference of BV on MA detection was eliminated, in addition to image noises in the obtained candidate regions, there might also be HMs. To further separate the true MA from the candidate regions, it is necessary to analyze the difference between MA and HM. As shown in Figs. 1 and 12, MAs appear as isolated dark red dots with clear borders, while HMs appear as dark red regions of different sizes and shapes  with blurred borders. Thus, using shape features or combined with gradient features could distinguish MA and HM. Moreover, for each MA candidate patch, the directional local contrast (DLC) of its center point p was calculated to distinguish MA and other structures. Assuming that the gray level of the center point p is I p , the DLC of the point p along the direction angle θ is defined as where I p(θ) is the mean gray value of neighboring pixels of point p along the direction angle θ , r the radius of neighboring region, and N p(θ ,r) the set of neighboring pixels of point p along the direction angle θ within the radius r: Finally, the obtained DLC vector of point p is where n is the number of angles, 12 was selected, that is, 360 • was divided into 12 angles with a step of 30 • . And the selected value of r was 12, which was half the patch size. DLC indicates the local contrast characteristics in a neighborhood. A negative DLC value indicates that the gray value of the current pixel is lower than that of the neighborhood. Therefore, DLC could be used to differentiate the structures with different local contrast. Figure 13 shows DLC distribution of center point of each region of MA, HM, BV and background (BG) shown in Fig. 12.
As shown in Fig. 13, the difference between MA and other regions (HM, BV, BG) in DLC distribution is obvious. The analysis of variance (ANOVA) results showed that the DLC of MA is significantly different from those of HM, BV, and BG (p< 0.05 for all).
With the aim of distinguishing between true MA and non-MA, totally seven types of features including color, grayscale, DLC, shape, texture, Gaussian filter-based, and gradient were selected for each candidate (or patch). The details are shown in Table 6, where I CLAHE indicates the result of CLAHE on the green channel.
Area is the total number of pixels in candidate region, that is the number of white pixels of each connected component in Fig. 10f.
Perimeter is the total number of boundary pixels that surround candidate region. Circularity of each candidate region is defined as Circularity = Perimeter 2

* π * Area
Aspect ratio is the ratio of the length of major axis to minor axis of candidate region.
Eccentricity is the ratio of the distance between the two focal points of the circumscribed ellipse of candidate region (focal length 2c) to its major axis length (2a), and it is equal to 0 for a circular region, as Eccentricity = c a Solidity is the ratio of area of candidate region to its convex hull area (ConA), as N p(θ ,r) = (x q , y q )|x q = x p + k cos θ, y q = y p + k sin θ , k = 1, 2, . . . , r .
(13) DLC p = (DLC p(θ 1 ) , DLC p(θ 2 ) , . . . , DLC p(θ n ) ), Entropy indicates image randomness or texture complexity, representing the clustering characteristics of gray distribution. With n different gray values in the image and proportion p(i) of each gray value i, it is defined as Entropy = − n i=1 (p(i) * log 2 p(i)), i = 1, . . . , n . When gray level of image is uniform, entropy can reach the maximum.
Energy reflects the uniformity of image gray distribution and texture thickness, and it is the sum of square of element values of gray-level co-occurrence matrix (GLCM). With P indicating GLCM, it is defined as Energy = i,j P(i, j) 2 Homogeneity reflects the tightness of distribution of elements to the diagonal in GLCM, defined as Homogeneity = i,j P(i,j) 1+|i−j| Skewness reflects the asymmetry of gray value distribution of image. For input gray values x of pixels, with µ and σ representing the mean and standard deviation of the input x, respectively, and E representing expectation, it is defined as As for gradient features, dx and dy indicate the gradient in the horizontal and vertical direction, respectively. Gradient at MA boundary has an abrupt change. As shown in Fig. 12a, gradient vector on the boundary of MA is divergent, where the absolute value of gradient is large. Whereas the absolute value of gradient at the center of MA is small. As shown in Fig. 12b, the distribution of gradient vector in the patch of HM is relatively messy. It can be seen from Fig. 12 that the difference in gradient vector distribution is obvious between MA and other structures. Therefore, features based on gradient can be used as reference to distinguish MA from image patches of candidate regions.

Machine learning algorithms for classification
After all the 44 features extracted from each MA candidate patch, machine learning technique would be used to classify each patch to MA or non-MA. As aforementioned, a variety of classification methods were used in automatic DR detection, including MA detection with Naive Bayesian [16], random forest [17], support vector machine [18,26,39] and K-nearest neighbor [23,24]. Three classifiers were selected for MA detection and compared the results in this work: NB, KNN and SVM.

Naive Bayesian
NB classifier is supervised learning technique based on Bayesian theory. It assumes that features are independent (naive) of each other. NB calculates the prior probability of each class and the conditional probability of each class for each feature from the training samples. For a test sample to be classified, Bayesian theory is used to calculate the posterior probability of test sample belonging to each class, and the class with the maximum posterior probability will be chosen. NB is simple and powerful therefore has been widely used in machine learning algorithms of spam processing, document classification, disease diagnosis and other aspects.

K-nearest neighbor
KNN is a very special kind of machine learning algorithm because it does not have a learning process in a general sense, which named lazy learning. Lazy learning is also called mechanical learning due to its high dependence on training samples. Mechanical learning does not build a model, so KNN is a non-parametric supervised learning method. For a test sample to be classified, KNN calculates the distance between the test sample and all training samples, selects the K samples closest to the test sample from the training dataset, and according to the majority-voting rule, the test sample is classified into the class that the more samples of K nearest neighbors belong to.

Support vector machine
SVM is a classic supervised learning algorithm in the field of machine learning, mainly used to solve the problem of data classification. SVM algorithm aims to find a maximum-margin hyperplane to correctly divide the training samples to different categories. For linearly inseparable problems, kernel functions are used to map the input features into the higher-dimensional feature space where a linear separation is possible. Kernel functions used in SVM classifier mainly include linear, sigmoid, polynomial and radial basis function (RBF).

Experiment materials
The fundus images of two available public retinal image databases (e-ophtha MA [40] and DIARETDB1 [41]) were used for experiments.
In e-ophtha MA database, there are four image resolutions ranging from 1440 × 960 to 2544 × 1696 pixels with 45 • FOV. The e-ophtha MA database contains 233 images without MAs, and 148 images MAs, which are manually annotated by an ophthalmologist and confirmed by another. The 148 images with totally 1306 MAs were used in this study for experiments. DIARETDB1 database contains 89 color fundus images with resolution of 1500 × 1152 pixels and 50 • FOV. Four medical experts marked MA in DIARETDB1 database independently. Considering the disagreement between the four experts annotations, we use the annotations with confidence ≥ 75% (at least three of the four experts agree that there is an MA) as standard for MA labeling, and there are 182 MAs with confidence ≥ 75%.
74 images were selected from e-ophtha MA database and segmented to derive the training set. The segmented patches on MA locations and non-MA candidate patches were positive and negative samples, respectively, making up the training set.
The remaining 74 images in e-ophtha MA database and the whole DIARETDB1 database were used as test set to verify the performance of the proposed MA detection method.

Experiment procedure
From each patch of the training set, the aforementioned 44 features were extracted to train the classifier. Then, the trained classifier was used on each test patch to determine whether the candidate was MA based on the 44 features extracted, so as to achieve the purpose of MA detection.
NB, KNN and SVM with RBF kernel function were used to perform the experiment and compared to choose the best performed classifier, and then the results of MA detection were evaluated on lesion level.

Evaluation
For the problems of class-imbalance, the receiver operating characteristic (ROC) curve can better evaluate the performance of the classifier. In this study, the number of MAs in candidate regions is much smaller than that of Non-MAs. Therefore, ROC curve and its corresponding Area Under Curve (AUC ) as well as free-response ROC (FROC) curve were used to evaluate the proposed MA detection method. The closer ROC curve is to the upper left corner, the more convex the ROC curve is, the larger the value of AUC is, the better the algorithm performs.
The abscissa of ROC curve is false-positive rate (FPR), which is the proportion of negative samples with positive classification results to all negative samples. The ordinate is Sensitivity, also named true-positive rate (TPR), which is the proportion of positive samples with positive classification results to all positive samples. The two variables corresponding were calculated as follows: TP indicates that MA is correctly predicted. TN indicates that non-MA is correctly predicted. FP indicates that non-MA is erroneously predicted as MA. And FN indicates that MA is erroneously predicted as non-MA. Compared with ROC curve, FROC curve has the same Sensitivity on the ordinate and the false-positive per image (FPI) on the abscissa.
As both databases provide the annotation of MA on lesion level, evaluation of the results was performed on lesion level. As aforementioned, the ground truth of DIARETDB1 database with confidence ≥ 75% was used as the standard for evaluation of MA detection results.
First of all, ROC curve and AUC were used to evaluate the performance of three classifiers on MA detection, so as to select the best performed classifier, and further analyze its results of MA detection.
Next, FROC curve and FROC score were used to evaluate the performance of the proposed MA detection method and be compared with existing algorithms.
Then, average computation time per image of the whole experiment and of each processing step was recorded and compared with existing algorithms. Last