Application of visual transformer in renal image analysis

Table 2 Comparison of kidney image classification algorithm performance

Algorithms	Datasets	Evaluation metrics/results	Main ideas and contributions	Limitations
TransMIL [16]	CAMELYON16/TCGA-NSCLC/TCGA-RCC	AUC: (CAMELYON16: 93.09%, TCGA-NSCLC: 96.03%, TCGA-RCC: 98.82%)	Using multiple instance learning (MIL) to explore morphological and spatial information in images	Mainly addresses weakly supervised classification in whole-slide image (WSI)-based pathology diagnosis
CTransPath [84]	TCGA-RCC	AUC: 99.1%	Computing self-attention within local windows using Swin Transformer as the backbone model	Large amounts of unlabeled data are required
UGBC [85]	Private dataset	ACC (glomerulus: 96.30%, kidney: 96.60%)	Assigning image labels based on kidney-level classification, using a high-throughput batch labeling scheme that exploits the label-noise immunity of deep neural networks (DNNs)	Dependence on the accuracy of label annotations
DenseNet201–Random Forest [86]	CT KIDNEY DATASET: Normal-Cyst-Tumor and Stone	ACC: 99.44% (cyst: 99.60%, kidney: 98.90%, tumor: 100%)	Feature extraction using the deep transfer learning model DenseNet-201-Random Forest	More resources are needed to train and use both models simultaneously
RCCGNet [89]	KMC-kidney dataset/BreakHis dataset	KMC-kidney (ACC: 90.14%, F1: 89.06%)/BreakHis (ACC: 90.09%, F1: 88.90%)	RCCGNet contains a shared channel residual (SCR) block that shares information between two different layers so that they complement each other	The model integration is complex
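
The metrics reported in Table 2 (ACC, F1, and AUC) can be illustrated with a minimal sketch. It assumes a binary task with toy labels and scores invented for illustration; the function names are hypothetical, and the cited papers may use multi-class or averaged variants of these metrics.

```python
# Minimal sketch of ACC, F1, and AUC for a binary classifier.
# All data below are toy values, not results from any cited paper.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def auc(y_true, scores, positive=1):
    """Probability that a random positive sample scores higher than a
    random negative one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: ground-truth labels and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print(f"ACC: {accuracy(y_true, y_pred):.4f}")
print(f"F1:  {f1_score(y_true, y_pred):.4f}")
print(f"AUC: {auc(y_true, scores):.4f}")
```

ACC and F1 depend on the decision threshold (0.5 here), whereas AUC is computed from the raw scores, which is one reason the values in different rows of the table are not directly comparable.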

ISSN: 1475-925X