Table 2 Transformer applications in histopathological image segmentation

From: A survey of Transformer applications for histopathological image analysis: New developments and future directions

| Method | Tissue | Dataset | Challenge | Highlight | DSC / IoU / F1 (%) |
|---|---|---|---|---|---|
| Swin-MIL [83] | Intestine | Custom | Image annotation cost and lack of relational information between instances | Transformer-based weakly supervised approach | – / – / 99.90 |
| MCTrans [84] | Cell | PanNuke | Inability of CNN-based methods to model long-range dependencies | Multi-compound transformer with CNN | 68.90 / – / – |
| TSHVNet [85] | Cell | CoNSeP and PanNuke | Difficulty in differentiating nuclei classes and separating highly clustered nuclear instances | Integration of multi-attention modules (Transformer and SimAM) | 85.60 / – / 82.00 |
| Diao et al. [86] | Colon | NPC2020 | Insufficient global context encoding | Transformer-based network using TransUNet | 83.30 / 73.00 / – |
| DS-TransUNet [15] | Colon | GlaS | Ignoring pixel-level intrinsic structural features inside each patch | Dual Swin Transformer U-Net with a standard U-shaped architecture | 87.19 / 78.45 / – |
| TransAttUnet [87] | Colon | GlaS | Modeling long-range contextual dependencies and computational costs | Transformer with multi-level attention-guided U-Net | 89.11 / 81.13 / – |
| ATTransUNet [2] | Colon | GlaS and MoNuSeg | Heavy computational burden of paired attention modeling between redundant visual tokens | Transformer-enhanced hybrid architecture based on adaptive tokens | 89.63 / 82.55 / – |
| HiTrans [88] | Liver | PAIP 2019 | Inherent heterogeneity of hepatocellular carcinoma | Hierarchical transformer encoder-based network | – / – / 75.13 |
| TransWS [40] | Colon and breast | GlaS and Camelyon16 | Rough highlighting of target regions, sub-optimal solutions, and low efficiency | Transformer-based weakly supervised learning | – / – / 85.20 |
| TransNuSS [50] | Colon and breast | TNBC and MoNuSeg | Morphological and textural differences that hinder ImageNet pre-training of nuclei segmentation models | Self-supervised learning incorporated with a vision transformer model | 83.07 / 68.72 / – |
| NST [89] | Liver, breast, colon, etc. | GCNS and MoNuSAC 2020 | Non-uniform staining of WSI sections and nuclei of varying sizes and shapes | Gastrointestinal transformer-based network | 79.60 / 66.30 / – |
| MedT [90] | Colon and cell | GlaS and MoNuSeg | Inherent inductive biases in CNNs and insufficiently annotated datasets | Gated axial-attention transformer-based model | – / 69.61 / 81.02 |
| SwinCup [48] | Colon and colorectal | GlaS | Inability of CNNs to model global context | Cascaded Swin transformer-based network | – / – / 92.00 |
| DHUnet [49] | Breast, liver, and lung | BCSS, WSSS4LUAD, etc. | Inability of transformer models to capture fine-grained details in pathological images | Dual-branch hierarchical global–local fusion network | 93.07 / 87.04 / – |
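
The last column reports the Dice similarity coefficient (DSC), intersection over union (IoU), and F1 score. The sketch below shows how these metrics are computed at the pixel level from binary masks; the function name and NumPy implementation are illustrative assumptions, not code from any of the surveyed papers, and note that some nuclei-segmentation works report instance-level F1 rather than the pixel-level F1 computed here (pixel-level F1 is mathematically identical to DSC).

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7):
    """Compute pixel-level DSC, IoU, and F1 (%) for binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    tp = np.logical_and(pred, target).sum()    # true positives
    fp = np.logical_and(pred, ~target).sum()   # false positives
    fn = np.logical_and(~pred, target).sum()   # false negatives

    dsc = 2 * tp / (2 * tp + fp + fn + eps)    # Dice similarity coefficient
    iou = tp / (tp + fp + fn + eps)            # intersection over union (Jaccard)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)  # equals DSC at pixel level

    return 100 * dsc, 100 * iou, 100 * f1      # percentages, as reported in the table

# Usage example with random 256x256 masks
rng = np.random.default_rng(0)
pred = rng.integers(0, 2, (256, 256))
target = rng.integers(0, 2, (256, 256))
print(segmentation_metrics(pred, target))
```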