Method | Tissue | Dataset | Challenge | Highlight | ACC / F1 / AUC (%) |
---|---|---|---|---|---|
ScoreNet [16] | Breast | BRACS, BACH and CAMELYON16 | The huge size of WSIs and the cost of exhaustive localized annotations | Efficient transformer-based architecture with local and global attention mechanisms | –/ 81.10 /– |
BreaST-Net [51] | Breast | BreakHis | Differentiating subtypes of benign and malignant cancers | Ensemble of Swin transformers | 99.60 / 99.50 / 99.40 |
HATNet [52] | Breast | Custom | Diagnostic variability and misdiagnosis of breast cancer | End-to-end ViTs with self-attention mechanism | 71.00 / 70.00 /– |
dMIL-Transformer [53] | Breast (LNM) | CAMELYON16/17 and SLN-Breast | Accounting for the morphology and spatial distribution of cancerous regions | Two-stage double max–min MIL transformer architecture | 89.23 / 84.83 / 91.67 |
ASI-DBNet [54] | Brain | UHP | Lack of precision and accuracy in brain tumor grading | Adaptive sparse interactive ResNet–ViT dual network | 95.24 / 95.23 / 96.83 |
Ding et al. [55] | Brain | NCT-CRC-HE, BreaKHis and LDCH | Aliasing caused by downsampling operations and discontinuity smoothing | ViT-based network with wavelet position embedding | 99.01 /–/– |
DT-DSMIL [56] | Colorectal | Custom | Data annotations | Weakly supervised ViT-based MIL | 93.50 / 94.37 / 97.69 |
IMGL-VTNet [57] | Gastric | – | The problem of identifying IM glands | Multi-scale deformable transformer | –/ 94.00 /– |
tRNAsformer [58] | Kidney | TCGA | Gathering the information needed to learn WSI representations | Transformer-based learning to predict RNA-sequence expression | 96.25 / 96.25 /– |
i-ViT [59] | Kidney | TCGA-KIRP | Capturing cellular and cell-layer level patterns | Instance-based Vision Transformer network | 93.01 / 93.60 /– |
GTP [46] | Lung | CPTAC, TCGA and NLST | Label noise | Graph-transformer with vision transformer | 91.20 /–/ 97.70 |
FDTrans [60] | Lung | TCGA-NSCLC | Large intra-class differences and a lack of annotated datasets | Frequency domain transformer-based architecture | 92.33 / 94.64 / 93.16 |
Yacob et al. [45] | Skin | Custom | Time-consuming diagnosis and inter-pathologist variability | Weakly supervised approach using graph-transformer | 93.50 /–/– |
KAT [61] | Stomach | Gastric-2K, Endometrial-2K | Over-smoothing and high computational complexity | Kernel attention transformer | 94.90 /–/ 98.30 |
DT-MIL [62] | Lung and breast | CPTAC-LUAD and BREAST-LNM | The problem of learning an effective WSI representation | Deformable transformer model for MIL | –/ 96.92 / 99.06 |
TCNN [1] | Breast, Lung, etc. | MDD and RWD | Artifacts in WSIs | Transformer with CNN | 96.90 / 97.40 / 98.50 |
CWC-Transformer [63] | Breast and Lung | CAMELYON16, TCGA-LUNG and MSK | Loss of spatial information and difficulties of feature extraction in WSIs | Combination of transformer and CNN | 92.59 /–/ 94.88 |
TransPath [64] | Breast, Lung, etc. | TCGA, PAIP, PatchCam, etc. | Data annotation | Self-supervised learning transformer-based network | 95.85 / 95.82 / 97.79 |
TransMIL [65] | Breast, Lung and Kidney | CAMELYON16, TCGA (NSCLC and RCC) | Correlation among instances, the huge size of WSIs, and the lack of pixel-level annotations | Transformer-based multiple-instance learning (MIL) | 94.66 /–/ 98.82 |
DecT [66] | Breast, Endometrium | BreakHis, BACH, and UC | Not taking into account the staining properties of histopathological images | Color deconvolution with transformer architecture | 93.02 / 93.89 /– |
LA-MIL [44] | Colorectal and stomach | TCGA-CRC and TCGA-STAD | Quadratic complexity of transformer architectures with respect to sequence length | MIL local attention graph-based transformer model | –/–/– |
Prompt-MIL [67] | Breast and colorectal | TCGA (BRCA and CRC) and BRIGHT | Overfitting and a lack of annotated data | Prompt-tuning MIL transformer | 93.47 /–/– |
HAG-MIL [68] | Breast, Gastric, Lung, etc. | CAMELYON16, IMGC, TCGA-RCC and NSCLC | Difficulty of locating the most discriminative patches | Hierarchical attention-guided MIL transformer framework | 91.40 / 89.40 / 98.20 |
MI-Zero [69] | Breast, cell, and lung | TCGA (BRCA, NSCLC and RCC), etc. | Computational issues and a scarcity of large-scale publicly available datasets | Transformer-based visual-language pre-training for MI zero-shot transfer | 70.20 /–/– |
MEGT [47] | Kidney and breast | TCGA-RCC and CAMELYON16 | The problem of learning multi-scale image representations from gigapixel WSIs | Multi-scale efficient graph transformer-based network | 96.91 / 96.26 / 97.30 |
MSPT [70] | Breast and lung | TCGA-NSCLC and CAMELYON16 | Uneven representation of negative and positive instances within bags | Multi-scale prototypical transformer-based network | 95.36 /–/ 98.69 |
GLAMIL [71] | Breast, lung, and kidney | TCGA (RCC and NSCLC) and CAMELYON16 | Overfitting, WSI-level feature aggregation, and imbalanced data | Local-to-global spatial learning | 95.01 /–/ 99.26 |
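Several entries above (e.g., TransMIL [65], DT-MIL [62], HAG-MIL [68], GLAMIL [71]) share one core recipe: treat a WSI as a bag of patch embeddings, mix instance information with self-attention, pool to a single bag vector, and score the slide with a linear head. The NumPy sketch below illustrates only this generic pattern; all dimensions, weight matrices, and function names are illustrative assumptions, not the implementation of any cited method.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a bag of patch embeddings."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (n_patches, n_patches)
    return softmax(scores, axis=-1) @ V       # (n_patches, d)

def mil_slide_logits(patches, Wq, Wk, Wv, w_cls):
    """Aggregate patch embeddings into slide-level logits:
    attention correlates instances, mean-pooling forms the bag
    representation, and a linear head produces class scores."""
    H = self_attention(patches, Wq, Wk, Wv)   # (n_patches, d)
    bag = H.mean(axis=0)                      # (d,)
    return bag @ w_cls                        # (n_classes,)

rng = np.random.default_rng(0)
d, n, c = 16, 32, 2                           # embed dim, patches, classes
patches = rng.normal(size=(n, d))             # stand-in for patch features
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
logits = mil_slide_logits(patches, Wq, Wk, Wv, rng.normal(size=(d, c)))
print(logits.shape)  # (2,)
```

The weakly supervised appeal noted in the Challenge column follows from this structure: only the slide-level label supervises `w_cls`, so no pixel- or patch-level annotation is required.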