Fig. 4
From: Multimodal diagnosis model of Alzheimer’s disease based on improved Transformer
The architecture of the deep 3D CNNs, annotated with the sizes of each layer’s input, convolution, max pooling, and output, and with the numbers and sizes of the generated feature maps. C denotes a convolutional layer, P denotes a max pooling layer, and @ separates the number of filters from their size: for example, 15@3 × 3 × 3 means 15 filters of size 3 × 3 × 3, and P 2 × 2 × 2 means a pooling layer with a window of size 2 × 2 × 2. The number below each layer gives the shape of the feature map.
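To make the caption's notation concrete, the following minimal sketch traces how a feature-map shape changes through one C layer and one P layer. It is an illustration only, not the authors' implementation: the 32 × 32 × 32 single-channel input is a hypothetical size, and the convolution is assumed to use stride 1 with no padding while the pooling stride is assumed to equal the window size. None of these values are stated in the caption.

```python
from typing import Tuple

# (channels, depth, height, width); the layout is an assumption for illustration
Shape = Tuple[int, int, int, int]
Size3 = Tuple[int, int, int]


def conv3d_shape(shape: Shape, filters: int, kernel: Size3) -> Shape:
    """Shape after a 'filters@kernel' layer, assuming stride 1 and no padding."""
    _, d, h, w = shape
    kd, kh, kw = kernel
    # Each spatial dimension shrinks by (kernel - 1); channels become `filters`.
    return (filters, d - kd + 1, h - kh + 1, w - kw + 1)


def pool3d_shape(shape: Shape, window: Size3) -> Shape:
    """Shape after a 'P window' layer, assuming stride equal to the window size."""
    c, d, h, w = shape
    pd, ph, pw = window
    # Max pooling keeps the channel count and divides each spatial dimension.
    return (c, d // pd, h // ph, w // pw)


# Hypothetical input volume: 1 channel, 32 x 32 x 32 voxels (not from the figure).
x: Shape = (1, 32, 32, 32)
c1 = conv3d_shape(x, filters=15, kernel=(3, 3, 3))  # the "15@3 x 3 x 3" layer
p1 = pool3d_shape(c1, window=(2, 2, 2))             # the "P 2 x 2 x 2" layer

print(c1)  # (15, 30, 30, 30)
print(p1)  # (15, 15, 15, 15)
```

Under these assumptions, the number before @ becomes the channel count of the output, while the kernel and pooling sizes determine how the three spatial dimensions shrink. With different stride or padding settings the spatial sizes would differ.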