Table 4 Transformer applications in histopathological image survival analysis and prediction

From: A survey of Transformer applications for histopathological image analysis: New developments and future directions

| Method | Tissue | Dataset | Challenge | Highlight | C-index (%) |
| --- | --- | --- | --- | --- | --- |
| TransSurv [97] | Colorectal | TCGA-CRC and NCT-CRC-HE | Inability of previous models to extract useful predictive features from multi-modal data | Transformer-based multi-modal feature fusion network | 82.20 |
| PG-TFNet [98] | Colorectal | TCGA-CRC | Inability to make use of the powerful representation learning capabilities of neural networks | Transformer-based multi-modal feature fusion network | 81.60 |
| ESAT [99] | Lung | NLST and CHCAMS | Using a pre-selected subset of main patches or patch clusters as input instead of the entire WSIs | ViT backbone combined with convolution operations | 73.00 |
| MCAT [26] | Bladder, Breast, Lung, Uterine | BLCA, UCEC, BRCA, GBMLGG, LUAD | Computational complexity and large data heterogeneity gap between genomics and WSIs | Multimodal co-attention Transformer for survival prediction | 65.30 |
| HiMT [100] | Bladder, Breast, Lung, Brain, etc. | BLCA, BRCA, UCEC, LUAD, LGG, etc. | High computational cost of extracting patches from WSIs, which results in a large bag size | Hierarchical multi-modal Transformer framework | 67.30 |
| MaskHIT [82] | Breast, Lung, etc. | TCGA | Huge number of network parameters and insufficient labeled data | Masked pre-training of Transformers | 61.20 |
| SURVPATH [101] | Breast, Bladder, Stomach, etc. | TCGA | Capturing dense interactions between different modalities | Memory-efficient multimodal Transformer | 62.90 |
| Surformer [102] | Bladder, Breast, Lung, etc. | TCGA (BLCA, BRCA, LUAD, etc.) | Weak interpretability of previous computational pathology models | Pattern-perceptive survival Transformer-based network | 68.70 |
| HVTSurv [103] | Bladder, Breast, Lung, etc. | TCGA (BLCA, BRCA, LUAD, etc.) | Difficulty of exploring contextual, spatial, and hierarchical interactions in the patient-level bag | Hierarchical ViT-based architecture | 63.40 |
| HMCAT [104] | Low-grade glioma | TCGA-GBMLGG | Significant disparity between the spatial scales of radiology images and WSIs | Hierarchical multimodal co-attention Transformer-based network | 79.60 |
| AMIGO [3] | Ovarian, Bladder | InUIT and MIBC | Ignoring specific details of the individual cells in a tile image | Sparse multi-modal graph Transformer-based network | 61.00 |
| SeTranSurv [25] | Breast, Lung, Ovarian | OV, LUSC, and BRCA | Ignoring the important role of spatial information in patches and the correlation between patches and WSIs | Integration of patch features through self-supervised learning and Transformer | 70.50 |
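
The C-index (%) values above are concordance indices: the share of comparable patient pairs in which the patient with the shorter survival time was assigned the higher predicted risk. The table does not say how each cited work computed it, so the following is only a minimal sketch of Harrell's concordance index under simplifying assumptions. The function name `concordance_index`, the argument names, and the toy data are all hypothetical; pairs with tied survival times are simply skipped, and ties in predicted risk count as half-concordant.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Simplified Harrell's C-index (illustrative sketch, not any paper's code).

    times  -- observed follow-up time per patient
    events -- 1 if death was observed, 0 if the patient was censored
    risks  -- predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: tied times are ignored
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): four patients, the third is censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_index_percent = 100 * concordance_index(times, events, risks)
print(f"C-index: {c_index_percent:.2f}%")
```

A value of 50% corresponds to random ordering and 100% to perfect ordering. Scores in the table were obtained on different datasets and tissue types, so they indicate each method's reported performance rather than a direct ranking.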