Method | Tissue | Dataset | Challenge | Highlight | DSC / IoU / F1 (%) |
---|---|---|---|---|---|
Swin-MIL [83] | Intestine | Custom | Costly image annotation and lack of relational information between instances | Transformer-based weakly supervised approach | –/–/99.90 |
MCTrans [84] | Cell | PanNuke | Inability of CNN-based methods to model long-range dependencies | Multi-compound transformer combined with CNN | 68.90/–/– |
TSHVNet [85] | Cell | CoNSeP and PanNuke | Difficulty differentiating nuclei classes and separating highly clustered nuclear instances | Integration of multi-attention modules (transformer and SimAM) | 85.60/–/82.00 |
Diao et al. [86] | Colon | NPC2020 | Insufficient global context encoding | Transformer-based network using TransUNet | 83.30/73.00/– |
DS-TransUNet [15] | Colon | GlaS | Ignoring the pixel-level intrinsic structural features inside each patch | Dual Swin transformer U-Net with standard U-shaped architecture | 87.19/78.45/– |
TransAttUnet [87] | Colon | GlaS | Difficulty modeling long-range contextual dependencies and high computational cost | Transformer with multi-level attention-guided U-Net | 89.11/81.13/– |
ATTransUNet [2] | Colon | GlaS and MoNuSeg | Heavy computational burden of paired attention modeling between redundant visual tokens | Transformer-enhanced hybrid architecture based on adaptive tokens | 89.63/82.55/– |
HiTrans [88] | Liver | PAIP 2019 | Inherent heterogeneity of hepatocellular carcinoma | Hierarchical transformer encoder-based network | –/–/75.13 |
TransWS [40] | Colon and breast | GlaS and Camelyon16 | Rough highlighting of target regions, sub-optimal solutions, and low efficiency | Transformer-based weakly supervised learning | –/–/85.20 |
TransNuSS [50] | Colon and breast | TNBC and MoNuSeg | Difficulty of pre-training nuclei segmentation models on ImageNet due to morphological and textural differences | Self-supervised learning incorporated with a vision transformer model | 83.07/68.72/– |
NST [89] | Liver, breast, colon, etc. | GCNS and MoNuSAC 2020 | Non-uniform staining of WSI sections and nuclei of varying sizes and shapes | Gastrointestinal transformer-based network | 79.60/66.30/– |
MedT [90] | Colon and cell | GlaS and MoNuSeg | Inherent inductive biases of CNNs and insufficiently annotated datasets | Gated axial-attention transformer-based model | –/69.61/81.02 |
SwinCup [48] | Colon and colorectal | GlaS | Inability of CNNs to model global context | Cascaded Swin transformer-based network | –/–/92.00 |
DHUnet [49] | Breast, liver, and lung | BCSS, WSSS4LUAD, etc. | Inability of the transformer model to capture fine-grained details in pathological images | Dual-branch hierarchical global–local fusion network | 93.07/87.04/– |