Multi-tissue and multi-scale approach for nuclei segmentation in H&E stained images
BioMedical Engineering OnLine volume 17, Article number: 89 (2018)
Accurate nuclei detection and segmentation in histological images is essential for many clinical purposes. While manual annotations are time-consuming and operator-dependent, fully automated segmentation remains a challenging task due to the high variability of cell intensity, size and morphology. Most of the algorithms proposed for the automated segmentation of nuclei were designed for specific organs or tissues.
The aim of this study was to develop and validate a fully automated multiscale method, named MANA (Multiscale Adaptive Nuclei Analysis), for nuclei segmentation in different tissues and at different magnifications. MANA was tested on a dataset of H&E stained tissue images with more than 59,000 annotated nuclei, taken from six organs (colon, liver, bone, prostate, adrenal gland and thyroid) at three magnifications (10×, 20×, 40×). Automatic results were compared with manual segmentations and with three open-source software packages designed for nuclei detection. For each organ, MANA always obtained an F1-score higher than 0.91, with an average F1 of 0.9305 ± 0.0161. The average computational time was about 20 s, independent of the number of nuclei to be detected (which always exceeded 1000), indicating the efficiency of the proposed technique.
To the best of our knowledge, MANA is the first fully automated multi-scale and multi-tissue algorithm for nuclei detection. Overall, the robustness and versatility of MANA allowed it to achieve, on different organs and magnifications, performance in line with or better than that of state-of-the-art algorithms optimized for single tissues.
The evaluation of cell nuclei plays a crucial role in histopathological image analysis. Parameters such as cell size, shape and spatial distribution are generally used by pathologists for cancer detection and reporting. In routine histology, the most widely used staining method for visualizing tissues is hematoxylin and eosin (H&E), which distinguishes cell nuclei (bluish, stained by hematoxylin) from cytoplasm (pinkish, stained by eosin). Counting cell nuclei is time-consuming and prone to inter- and intra-observer variability, which limits its reliability. Manual delineation of nuclei is even more cumbersome: it is never performed in routine practice, yet it would be required to assess nuclear size and morphology precisely. The architectural arrangement of nuclear structures on histology is highly relevant in the context of disease (e.g., cancer grading). Cancer grade is a key feature used to predict patient prognosis and to prescribe treatment. Since most current pathology diagnosis processes rely on the subjective opinion of pathologists, solutions for the quantitative assessment of histological images would have a wide scope of application.
With recent advances in digital scanning techniques, tissue histopathology slides can be stored as digital images. In recent years, many efforts have been devoted to developing automatic nuclear segmentation techniques with the aim of improving the efficiency and accuracy of histopathological image analysis.
Most current nuclei detection approaches for H&E stained images are based on color information [5, 6]. Using these techniques, a detection accuracy above 85% can be achieved. Because these approaches depend on color- and intensity-related attributes, none of them has been tested on multi-tissue data or in pathological conditions, where nuclei may exhibit irregular shapes and different intensities.
Several methods have been proposed to perform cell segmentation using gradients and morphological operations. Nevertheless, methods that rely on prior knowledge of nuclear shape are prone to fail because of variations in tissue preparation procedures (sectioning and staining). Furthermore, touching nuclei are hard for automated segmentation methods to separate.
In the last few years, deep neural networks have driven advances in image recognition and achieved state-of-the-art performance in many medical image segmentation tasks [10, 11]. Above all, convolutional neural networks (CNNs) have shown promising results in nuclei segmentation for different tissues. These techniques estimate a probability map of the nuclear regions based on learned nuclear appearances. In this way, CNNs can generalize across nuclear color variations. Recently, a detection accuracy of 80% was obtained for seven organs. However, CNNs need a large annotated training set to reach adequate performance, and the network architecture must be changed when the magnification varies. This is because CNNs fail to generalize if the nuclei change in size as well as in color. For this reason, deep neural networks are not suitable for multiscale approaches.
To the best of our knowledge, no multi-tissue and multi-scale solution has been proposed so far. In this paper, we present the MANA (Multiscale Adaptive Nuclei Analysis) algorithm, a multi-tissue and multi-scale method for cell detection in histological images. The proposed technique takes an H&E stained image as input and returns the nuclear boundaries found within the image.
The MANA algorithm was designed to automatically detect nuclei in H&E stained images. The algorithm was developed in the MATLAB (MathWorks, Natick, MA, USA) environment. The processing consists of three main steps: object-based thresholding, area-based correction and nuclei separation. The following sections describe the algorithm in detail.
This step represents our technical innovation: a first object-based detection for nuclei segmentation. The RGB image of a histological specimen is first converted into grayscale by eliminating the hue and saturation information while retaining the luminance. Then, its histogram is calculated and the progressive weighted mean (PWM_CURVE) of the grayscale histogram is computed.
Consider a grayscale image with pixel intensities expressed as integers between 0 and N. The histogram is then a distribution with N + 1 classes that displays how many times each gray level occurs. For a generic class P of the histogram (0 ≤ P ≤ N), the value of PWM_CURVE for that class is defined as follows:

$$\mathrm{PWM}_{\mathrm{CURVE}}(P) = \frac{\sum_{i=0}^{P} w_i \, x_i}{\sum_{i=0}^{P} w_i}$$

where w_i is the histogram count for the ith class and x_i is the respective bin location. The PWM_CURVE is evaluated for each class of the histogram as the weighted mean of all the grayscale histogram values up to that class. The trend of PWM_CURVE depends on the histogram shape, so relevant characteristics of the color distribution of the image can be extracted using this function. In particular, if the histogram shows significant color variations from a certain point onward with respect to the distribution that precedes it, a change of concavity in the PWM_CURVE can be expected at that point. Inflection points of PWM_CURVE are therefore potential threshold values for nuclei segmentation, as they represent local stability points of the grayscale histogram.
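As an illustration of the definition above, the progressive weighted mean can be computed with a single pass over the histogram. The following Python sketch is not the authors' MATLAB implementation; the function names are hypothetical, bin locations are assumed to coincide with the gray levels (x_i = i), and the value is reported as NaN for classes that precede the first occupied bin, where the weighted mean is undefined.

```python
def histogram(pixels, levels=256):
    """Count how many times each gray level (0 .. levels - 1) occurs."""
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    return counts


def pwm_curve(counts):
    """Progressive weighted mean of a grayscale histogram.

    counts[i] is the histogram count w_i of gray level i, and the bin
    location is assumed to be x_i = i. Entry P of the result is
    sum(w_i * x_i for i <= P) / sum(w_i for i <= P).
    """
    curve = []
    weighted_sum = 0.0  # running sum of w_i * x_i
    total = 0           # running sum of w_i
    for level, count in enumerate(counts):
        weighted_sum += count * level
        total += count
        # Assumption of this sketch: NaN marks classes with no pixels so far.
        curve.append(weighted_sum / total if total else float("nan"))
    return curve
```

For a flat histogram such as `[1, 1, 1, 1]`, the curve grows linearly, since each new class shifts the weighted mean by the same amount.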
Conceptually, PWM_CURVE is therefore an alternative representation of the color distribution that makes it easier to apply object-based thresholds. For this reason, PWM_CURVE can be used to automatically spot nuclei inside the image. Nuclei are defined as objects with an intensity lower than a threshold.
First, the PWM_CURVE is fitted with a 15th-order polynomial function in order to estimate its inflection points (candidate thresholds). Then, the grayscale image is segmented using each candidate threshold, and the median area of the objects found is evaluated for each one. Among all the candidate thresholds, the algorithm selects as the initial threshold the one that yields the objects with the highest median area.
The procedure for obtaining the initial threshold is illustrated in Fig. 1, where three sub-images from different tissues are used as examples. Figure 1 also shows the robustness of the proposed method: the optimal threshold was chosen regardless of the histogram shape or the appearance of the cells.
In summary, because this method is an object-based thresholding, it is robust to different tissue types, image magnifications and staining.
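The selection of the initial threshold among the candidates can be sketched as follows. This Python example is a minimal illustration under stated assumptions, not the authors' code: the candidate thresholds are taken as given (the polynomial fit is not reproduced), objects are found with a 4-connected flood fill (the connectivity used by MANA is not specified in the text), and the function names are hypothetical.

```python
from collections import deque
from statistics import median


def object_areas(image, threshold):
    """Areas (pixel counts) of 4-connected objects darker than threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] >= threshold:
                continue
            # Breadth-first flood fill of one dark object.
            seen[r][c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] < threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


def pick_initial_threshold(image, candidates):
    """Return the candidate whose objects have the highest median area."""
    best, best_median = None, -1.0
    for threshold in candidates:
        areas = object_areas(image, threshold)
        if areas and median(areas) > best_median:
            best, best_median = threshold, median(areas)
    return best


# Toy image: one 2x2 dark object and two isolated mid-gray pixels.
toy_image = [
    [200, 200, 200, 200, 200, 200],
    [200,  50,  50, 200, 120, 200],
    [200,  50,  50, 200, 200, 200],
    [200, 200, 200, 200, 200, 120],
]
chosen = pick_initial_threshold(toy_image, candidates=[100, 150])
```

In the toy image, the higher candidate also picks up the two isolated mid-gray pixels, which lowers the median object area, so the lower candidate is preferred.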
This step corrects the segmentation obtained in the previous step, because the object-based detection may produce structures that are too small or too large. Structures that are too small may be oversegmented or spurious objects, whereas areas that are too large may consist of several fused nuclei. To lessen the oversegmentation and to optimize the nuclei detection, the mean area of the segmented objects (mean total) is first evaluated. Then, objects are labelled as ‘small’, ‘normal’ or ‘big’. ‘Small’ objects are structures smaller than 25% of the mean area, whereas ‘big’ objects are structures larger than 5 times the mean area. The remaining objects are considered ‘normal’.
‘Small’ objects are deleted because they are too small to be considered nuclei. ‘Big’ objects should be split, since they may be agglomerates of nuclei. Separation is obtained by iteratively decreasing the initial threshold for these structures until they are classified as ‘normal’ (area less than 5 times the mean total). Figure 2a sketches the effect of this procedure. Using these criteria, the initial threshold found in the previous section is locally modified in order to identify the highest number of nuclei within the histological image.
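The area-based labelling rule can be written compactly. The Python sketch below uses the two cut-offs stated in the text (25% of the mean area and 5 times the mean area); the function name and parameter names are hypothetical, and the iterative local lowering of the threshold for ‘big’ objects is not reproduced.

```python
def classify_by_area(areas, small_fraction=0.25, big_factor=5.0):
    """Label each object as 'small', 'normal' or 'big' relative to the mean area.

    Returns the mean area and the list of labels, in the same order as areas.
    """
    if not areas:
        raise ValueError("at least one segmented object is required")
    mean_area = sum(areas) / len(areas)
    labels = []
    for area in areas:
        if area < small_fraction * mean_area:
            labels.append("small")   # too small to be a nucleus: deleted
        elif area > big_factor * mean_area:
            labels.append("big")     # possible agglomerate: to be split
        else:
            labels.append("normal")
    return mean_area, labels


# Nine typical objects, one fragment and one large agglomerate.
example_areas = [100] * 9 + [10, 1500]
example_mean, example_labels = classify_by_area(example_areas)
```

With these example areas the mean is about 219 pixels, so the 10-pixel fragment falls below the 25% cut-off and the 1500-pixel object exceeds 5 times the mean.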
The goal of this step is to further separate the remaining fused nuclei. In the literature, the watershed transform has been successfully used to isolate merged nuclei. The MANA algorithm implements a variant of the classical watershed transform called marker-based watershed. In this technique, seeds close to the nuclear centers (markers) are used as starting points for the watershed transform. To identify nuclear seeds, MANA computes the distance transform of the binary nuclei mask and finds its local maxima using the extended-maxima transform. This transform estimates the regional maxima by searching in N-connected neighborhoods. The neighborhood size determines the sensitivity of the extended-maxima transform in the detection of nuclear seeds.
Additionally, the solidity of each object is evaluated. The solidity of a region is defined as the ratio between its actual area and its convex area. Since nuclei are expected to be convex objects, a segmented region containing an actual nucleus should have a solidity approximately equal to 1. Hence, solidity can be used as a discriminant feature for varying the neighborhood size of the extended-maxima transform and therefore the sensitivity of the marker-based watershed. The MANA algorithm applies a low-sensitivity watershed to high-solidity shapes, while sensitivity is increased for low-solidity objects. Figure 2b shows the application of a marker-based watershed that is sensitive to shape solidity.
Finally, the mean area of the objects obtained after the watershed is evaluated, and objects smaller than 25% of the mean area are removed by the algorithm.
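To make the solidity measure concrete, the following Python sketch computes it for a region described by its polygonal outline. This is a simplifying assumption for illustration: a pixel-mask implementation would count pixels in the region and in its convex hull instead, and the mapping from solidity to watershed sensitivity used by MANA is not given in the text and is not reproduced. All function names are hypothetical.

```python
def polygon_area(points):
    """Shoelace formula; points are (x, y) vertices in order along the outline."""
    twice_area = 0.0
    count = len(points)
    for i in range(count):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % count]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def solidity(outline):
    """Ratio between the area of a region and the area of its convex hull."""
    return polygon_area(outline) / polygon_area(convex_hull(outline))


square = [(0, 0), (4, 0), (4, 4), (0, 4)]                     # convex outline
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]    # concave outline
```

The convex square has a solidity of 1, whereas the concave L-shaped outline, which resembles two fused objects, has a solidity of 12/14, or about 0.86.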
Automatic results provided by MANA were compared with manual segmentations. True positives (TP) are the manually marked nuclei identified by the algorithm, false negatives (FN) are the nuclei not found by the automatic method, and false positives (FP) are the objects detected by MANA without a corresponding manual nucleus. The performance of nuclear detection was evaluated by calculating the recall, precision and F1-score, which are defined as follows:

$$\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{F1} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
Recall assesses the missed detection of ground truth objects (manual nuclei), while precision evaluates the false detection of ghost objects. The F1-score is defined as the harmonic mean of recall and precision. It is a commonly used object detection metric, but it penalizes only object-level errors: it does not take pixel-level errors into account (e.g., under-segmentation of correctly detected objects). Let N_CS, N_US and N_SE represent the numbers of correct-segmentation (CS), under-segmentation (US) and segmentation-error (SE) cases. The pixel-level performance is evaluated using the CS, US and SE rates, which are defined as follows:

$$\mathrm{CS\ rate} = \frac{N_{CS}}{N_{GT}}, \qquad \mathrm{US\ rate} = \frac{N_{US}}{N_{GT}}, \qquad \mathrm{SE\ rate} = \frac{N_{SE}}{N_{GT}}$$

where N_GT (ground truth) represents the number of nuclei manually identified. The US rate indicates the failure to split nuclear regions into the correct number of nuclei, while the SE rate reveals the missed detection of cells. An example of CS, US and SE cells is provided in Fig. 3.
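The validation metrics described in this section can be computed as in the following Python sketch. The object-level scores use the standard definitions of recall, precision and F1-score; for the pixel-level rates, the sketch assumes that each rate is the corresponding count divided by N_GT and expressed as a percentage. Function names are hypothetical.

```python
def detection_scores(tp, fp, fn):
    """Object-level recall, precision and F1-score from TP, FP and FN counts."""
    if tp == 0:
        raise ValueError("scores are undefined without true positives")
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return recall, precision, f1


def segmentation_rates(n_cs, n_us, n_se, n_gt):
    """CS, US and SE rates in percent, relative to N_GT manually marked nuclei.

    Assumption of this sketch: each rate is count / N_GT * 100.
    """
    if n_gt <= 0:
        raise ValueError("n_gt must be positive")
    return tuple(100.0 * count / n_gt for count in (n_cs, n_us, n_se))


# Hypothetical counts, for illustration only (not taken from the paper).
recall, precision, f1 = detection_scores(tp=90, fp=10, fn=10)
cs_rate, us_rate, se_rate = segmentation_rates(n_cs=80, n_us=15, n_se=5, n_gt=100)
```

When precision and recall are equal, as in this example, the F1-score takes the same value, because the harmonic mean of two equal numbers is that number.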
Our dataset consisted of H&E stained images taken from six different organs: colon, liver, bone, prostate, adrenal gland and thyroid. In addition, images were acquired at three magnifications (10×, 20×, 40×) to test the multiscale approach of the MANA algorithm. One expert pathologist (with more than 10 years of experience) manually marked the nuclei centers in each image, for a total of 59,123 cells. The images were collected and digitized at the Molinette Città della Salute University hospital (Torino, Italy), and all patients signed an informed consent. The overall dataset composition is shown in Table 1.
For each of the six organs analyzed, an example of the validation process is shown in Fig. 4.
Comparison with manual operator
The object-level (recall, precision and F1-score) and pixel-level (CS, US and SE rates) performances of the MANA algorithm are summarized in Table 2. The processing was performed on a workstation with a 2.6 GHz quad-core CPU and 16 GB of RAM.
The algorithm performed very well in object detection, with an average F1-score of 0.9305 on 30 images. For all tissues, precision and recall had similar values, which supports the accuracy of the proposed method (Table 2).
A CS rate of 81.97% coupled with an SE rate of 5.53% was also obtained. Moreover, the US rate was small where nuclei had crisp contours (5.81%), while it increased in organs with a high percentage of touching nuclei (21.35%).
Finally, the computational time depended slightly on image resolution, ranging between 11.3 and 23.7 s (average ± SD: 16.89 ± 5.72 s).
Benchmarking with open-source software
The results obtained by the proposed algorithm were also compared with those of three open-source software packages (CellProfiler, QuPath and Fiji) used in the analysis of histological images. CellProfiler allows the user to create pipelines for processing biomedical images. The software is composed of a series of image-processing modules that together perform an automatic analysis of histological images. QuPath is a bioimage analysis software package designed to provide an open-source solution for digital pathology and whole slide image analysis. It can perform several automatic analyses of histological images, including nuclei detection. Fiji is a Java-based software package for which a watershed transform-based nuclear segmentation plugin is available. For Fiji, a semi-automatic pipeline was implemented, consisting of: (i) conversion of the H&E image into grayscale, (ii) manual intensity thresholding and (iii) automatic cell separation. The nuclei detection of CellProfiler, QuPath, Fiji and MANA is compared in Fig. 5. The performances of the three open-source software packages are also reported in Table 3.
As can be seen from Table 3, the CellProfiler segmentation is characterized by a low recall. Several nuclei are not identified by the software, which generates a high number of false negatives. The average F1-score (0.7154) was more than 0.20 lower than that of the proposed method (0.9305). This software also had a poor pixel-level performance, with a low correct-segmentation rate (65.45%) and a high segmentation-error rate (29.73%).
QuPath proved to be an efficient tool for nuclei detection, with a fast mean computational time (11.37 s) and an average recall of about 0.93. On the other hand, this method produced many false-positive nuclei, resulting in a low precision (0.7120), which in turn lowered the average F1-score (0.8004).
The average F1-score obtained with Fiji was slightly lower than that achieved with MANA (0.9030 vs 0.9305). Fiji processing is based on a single threshold, whereas the proposed method can locally modify the threshold within the same image in order to identify the highest number of nuclei. Moreover, the average computational time in Fiji was 252.73 s, about 15 times that of the MANA algorithm.
In the present study, we proposed a fully automatic method for nuclei identification in histological images. Cell nuclei segmentation is crucially important and has a wide range of applications, such as cancer diagnosis, cancer grading and quantification of molecular markers in healthy and pathological specimens. The proposed method is able to recognize nuclear boundaries in H&E images. Cell detection in histological images is a challenging task because of the variability of nuclei in shape, intensity and dimension. Our technique did not require any user interaction and was capable of automatically detecting nuclei in different tissues and at different magnifications. We chose to analyze six of the organs most studied in the development of automatic nuclei segmentation [24, 25]. Nuclei centers were manually marked by one expert pathologist, for a total of 59,123 cells. It was not necessary to manually segment nuclear boundaries since the proposed algorithm, unlike deep learning-based methods, does not require a training set. Because marking centers is faster than delineating boundaries, a larger number of nuclei could be annotated, creating a dataset that contains more than twice the number of marked nuclei compared to previous works [12, 22, 26].
The automatic method was validated using metrics that penalize both detection and segmentation errors. The comparison between manual and automatic segmentation showed the high performance of the proposed technique. For each organ, the MANA algorithm always obtained an F1-score higher than 0.91, with an average F1 of 0.9305 ± 0.0161. In the literature, the only multi-tissue nuclei segmentation system had an average F1-score of 0.8267. Compared with this state-of-the-art method, our approach improved the average F1-score by 0.1038 (10.38 percentage points).
Object-level and pixel-level performances were also comparable to those of previous works on nuclei detection [24, 27, 28]. Overall, the robustness and versatility of MANA allowed it to achieve, on several organs and magnifications, performance in line with or better than that of state-of-the-art algorithms designed for single tissues [19, 20].
The proposed algorithm also obtained the highest average F1-score among the open-source software packages designed for nuclei detection that were tested. MANA had one of the lowest computational times and, compared with the other automatic methods, the best pixel-level performance.
Thanks to the reliable nuclei detection provided by MANA, automated systems for tumor pattern recognition, histological lesion evaluation and marker quantification can be developed in a straightforward manner. In the future, a novel cell separation step will be implemented to further increase the pixel-level performance of the proposed algorithm. Future studies are also required to test the accuracy of the MANA algorithm for nuclei detection in other tissues.
In this paper, an adaptive method for nuclei segmentation in H&E stained images is presented. To the best of our knowledge, MANA is the first fully automated multi-scale and multi-tissue algorithm for nuclei detection.
The algorithm was tested on different organs, in which nuclei had different intensities, shapes and dimensions. High segmentation performance was obtained for each image of the dataset. The robustness in nuclei detection observed with MANA was mainly due to the use of adaptive thresholding and an optimized nuclei separation. The algorithm took around 20 s to perform segmentation in images with 1500 nuclei, indicating the efficiency of the proposed technique.
Being totally automated, this algorithm could be used in future studies as a starting point for building reliable systems for morphological tissue characterization and diagnosis. Our research group is currently working on a MANA-based algorithm for the automatic detection and quantification of tumor areas in different histological tissues.
- MANA: Multiscale Adaptive Nuclei Analysis
- H&E: hematoxylin and eosin
- CNNs: convolutional neural networks
- PWM_CURVE: progressive weighted mean
- N_GT: number of manually identified (ground truth) nuclei
Gurcan MN, Boucheron LE, Can A, Madabhushi A, Rajpoot NM, Yener B. Histopathological image analysis: a review. IEEE Rev Biomed Eng. 2009;2:147–71. https://doi.org/10.1109/RBME.2009.2034865.
Fischer AH, Jacobson KA, Rose J, Zeller R. Hematoxylin and eosin staining of tissue and cell sections. CSH Protoc. 2008;2008:pdb.prot4986.
Doyle S, Feldman MD, Shih N, Tomaszewski J, Madabhushi A. Cascaded discrimination of normal, abnormal, and confounder classes in histopathology: gleason grading of prostate cancer. BMC Bioinf. 2012;13:282. https://doi.org/10.1186/1471-2105-13-282.
Chen H, Qi X, Yu L, Dou Q, Qin J, Heng PA. DCAN: deep contour-aware networks for object instance segmentation from histology images. Med Image Anal. 2017;36:135–46. https://doi.org/10.1016/j.media.2016.11.004.
Ruifrok AC, Johnston DA. Quantification of histochemical staining by color deconvolution. Anal Quant Cytol Histol. 2001;23:291–9.
Macenko M, Niethammer M, Marron JS, Borland D, Woosley JT, Guan X, et al. A method for normalizing histology slides for quantitative analysis. In: 2009 Proceedings IEEE international symposium biomedical imaging from nano to macro; 2009. p. 1107–10.
Al-Kofahi Y, Lassoued W, Lee W, Roysam B. Improved automatic detection and segmentation of cell nuclei in histopathology images. IEEE Trans Biomed Eng. 2010;57:841–52.
Ali S, Madabhushi A. An integrated region-, boundary-, shape-based active contour for multiple object overlap resolution in histological imagery. IEEE Trans Med Imaging. 2012;31:1448–60.
Ram S, Rodriguez JJ. Size-invariant detection of cell nuclei in microscopy images. IEEE Trans Med Imaging. 2016;35:1753–64. https://doi.org/10.1109/TMI.2016.2527740.
Zheng Y, Liu D, Georgescu B, Nguyen H, Comaniciu D. 3D deep learning for efficient and robust landmark detection in volumetric data. In: Navab N, Hornegger J, Wells WM, Frangi A, editors. Medical image computing and computer-assisted intervention—MICCAI 2015: 18th proceedings international conference, Part I, Munich, Germany, October 5–9, 2015. Cham: Springer; 2015. p. 565–72. https://doi.org/10.1007/978-3-319-24553-9_69.
Chen H, Shen C, Qin J, Ni D, Shi L, Cheng JCY, et al. Automatic localization and identification of vertebrae in spine CT via a joint learning model with deep neural networks. In: Navab N, Hornegger J, Wells WM, Frangi A, editors. Medical image computing and computer-assisted intervention—MICCAI 2015: 18th proceedings international conference, Part I, Munich, Germany, October 5–9, 2015. Cham: Springer; 2015. p. 515–22. https://doi.org/10.1007/978-3-319-24553-9_63.
Kumar N, Verma R, Sharma S, Bhargava S, Vahadane A, Sethi A. A dataset and a technique for generalized nuclear segmentation for computational pathology. IEEE Trans Med Imaging. 2017;36:1550–60.
Cheng J, Rajapakse JC. Segmentation of clustered nuclei with shape markers and marking function. IEEE Trans Biomed Eng. 2009;56:741–8.
Xu H, Lu C, Mandal M. An efficient technique for nuclei segmentation based on ellipse descriptor analysis and improved seed detection algorithm. IEEE J Biomed Heal Inform. 2014;18:1729–41. https://doi.org/10.1109/JBHI.2013.2297030.
Soille P. Morphological image analysis: principles and applications. 2nd ed. New York: Springer-Verlag; 2003.
Malon CD, Cosatto E. Classification of mitotic figures with convolutional neural networks and seeded blob features. J Pathol Inform. 2013;4:9. https://doi.org/10.4103/2153-3539.112694.
Kong H, Gurcan M, Belkacem-Boussaid K. Partitioning histopathological images: an integrated framework for supervised color-texture segmentation and cell splitting. IEEE Trans Med Imaging. 2011;30:1661–77. https://doi.org/10.1109/TMI.2011.2141674.
Wiesmann V, Franz D, Held C, Münzenmayer C, Palmisano R, Wittenberg T. Review of free software tools for image analysis of fluorescence cell micrographs. J Microsc. 2015;257:39–53. https://doi.org/10.1111/jmi.12184.
Carpenter AE, Jones TR, Lamprecht MR, Clarke C, Kang I, Friman O, et al. CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol. 2006;7:R100. https://doi.org/10.1186/gb-2006-7-10-r100.
Bankhead P, Loughrey MB, Fernández JA, Dombrowski Y, McArt DG, Dunne PD, et al. QuPath: open source software for digital pathology image analysis. Sci Rep. 2017;7:16878. https://doi.org/10.1038/s41598-017-17204-5.
Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, et al. Fiji: an open-source platform for biological-image analysis. Nat Methods. 2012;9:676–82. https://doi.org/10.1038/nmeth.2019.
Sirinukunwattana K, Raza SEA, Tsang Y-W, Snead DRJ, Cree IA, Rajpoot NM. Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images. IEEE Trans Med Imaging. 2016;35:1196–206. https://doi.org/10.1109/TMI.2016.2525803.
Irshad H, Veillard A, Roux L, Racoceanu D. Methods for nuclei detection, segmentation, and classification in digital histopathology: a review—current status and future potential. IEEE Rev Biomed Eng. 2014;7:97–114. https://doi.org/10.1109/RBME.2013.2295804.
Wienert S, Heim D, Saeger K, Stenzinger A, Beil M, Hufnagl P, et al. Detection and segmentation of cell nuclei in virtual microscopy images: a minimum-model approach. Sci Rep. 2012;2:503. https://doi.org/10.1038/srep00503.
Wang W, Ozolek JA, Rohde GK. Detection and classification of thyroid follicular lesions based on nuclear structure from histopathology images. Cytom Part A. 2010;77:485–94.
Janowczyk A, Madabhushi A. Deep learning for digital pathology image analysis: a comprehensive tutorial with selected use cases. J Pathol Inform. 2016;7:29. https://doi.org/10.4103/2153-3539.186902.
Xu J, Xiang L, Liu Q, Gilmore H, Wu J, Tang J, et al. Stacked sparse autoencoder (SSAE) for nuclei detection on breast cancer histopathology images. IEEE Trans Med Imaging. 2016;35:119–30.
Lu C, Mahmood M, Jha N, Mandal M. A robust automatic nuclei segmentation technique for quantitative histopathological image analysis. Anal Quant Cytol Histol. 2012;34:296–308.
Nguyen K, Sabata B, Jain A. Prostate cancer detection: fusion of cytological and textural features. J Pathol Inform. 2011;2:3. https://doi.org/10.4103/2153-3539.92030.
FM conceived and supervised the study. MS designed the method and performed data analysis. Both authors read and approved the final manuscript.
We thank Zhen Pan for the technical support in the lab.
The authors declare that they have no competing interests.
Availability of data and materials
The dataset used and/or analyzed during the current study is available from the corresponding author on reasonable request.
Consent for publication
Ethics approval and consent to participate
This work was partially supported by the Cassa di Risparmio di Cuneo (CRC, Italy), Grant No. CRC_2016-0707 and the Proof of Concept of Politecnico di Torino (POC, Italy), Grant No. POC_16499.
Salvi, M., Molinari, F. Multi-tissue and multi-scale approach for nuclei segmentation in H&E stained images. BioMed Eng OnLine 17, 89 (2018). https://doi.org/10.1186/s12938-018-0518-0
- Nuclei segmentation
- Adaptive thresholding
- Cellular imaging
- Computer-aided image analysis