Table 1 A summarized view of relevant studies focused on maize disease classification.
From: Enhanced residual-attention deep neural network for disease classification in maize leaf images
| Research Papers | Model | Pre-processing | Augmentation | Remark |
|---|---|---|---|---|
[15] | Attention-CNN | Resizing to 168 × 168 | No | The model has three residual modules, three convolutions, and a GAP layer, followed by a softmax layer. |
[17] | Attention-based DenseNet | Filtering, resizing to 224 × 224, edge filling, sharpening | Yes | Uses depthwise-separable convolutions in the dense blocks along with an attention mechanism. |
[22] | VGG + Inception | Filtering, resizing to 224 × 224, sharpening | Yes | The final convolutional layers of VGG were replaced with a convolution layer, batch normalization, and Swish activation. |
[23] | VGG16 | - | No | Otsu threshold segmentation was used to separate image pixels into two categories: bright intensity and darker intensity. |
[24] | Modified DenseNet | - | Yes | Each dense block contains batch normalization, ReLU activation, a 3 × 3 convolution, and dropout; the transition layer between two dense blocks performs downsampling. |
[25] | Modified Inception-v3 | Resizing to 256 × 256 | Yes | Three different Inception-v3-based models are developed. |
[20] | EfficientNetB0 + DenseNet121 | Resizing to 244 × 244 | Yes | Merged features from multiple pre-trained CNNs using a concatenation technique. |
[26] | VGG16, InceptionV3, ResNet50, Xception | Resizing to 224 × 224, 299 × 299, or 96 × 96, depending on the model | Yes | Hyperparameters were tuned using Bayesian optimization. |
[27] | TCI-AlexN | - | Yes | The model improves AlexNet by adding a 3 × 3 × 256 convolution after the last pooling layer. |
[29] | CNN trained from scratch | Cropping, expanding, mirroring | Yes | The potential areas for recognition are extracted by sharing the features of the transmission module layer by layer. |
[33] | CNN trained from scratch | - | No | Used the image features of Neuroph Studio for model training. |
[34] | CNN trained from scratch | Resizing to 224 × 224, rescaling | Yes | The best-performing model used a 224 × 224 input size, a batch size of 32, a 3 × 3 kernel, and an 80:20 train-test split. |
[35] | CNN trained from scratch | Resizing to 227 × 227, rescaling | Yes | With three times fewer trainable parameters, the model achieved a 3.2% higher prediction accuracy than the best-performing pre-trained network. |
[36] | CNN trained from scratch | Resizing to 224 × 224 | Yes | The model has three convolutional layers, each linked to a pooling layer, followed by the corresponding dense layers. |
[37] | Plant-Xvit | - | No | The principal components are the Conv2D blocks of the VGG and Inception models together with the ViT elements, MLP and MHA with linear projections. |
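Among the pre-processing steps listed above, the Otsu threshold segmentation used in [23] is the one classical algorithm simple enough to sketch inline. The pure-Python sketch below (the toy pixel list is an illustrative assumption, not data from any of the surveyed papers) picks the grey level that maximizes the between-class variance, splitting pixels into the bright- and dark-intensity categories the table describes:

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level maximizing between-class variance (Otsu's method)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))  # sum of all grey values
    sum_bg = 0.0      # running sum of grey values in the "dark" class
    weight_bg = 0     # running pixel count of the "dark" class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight_bg += hist[t]
        if weight_bg == 0:
            continue                      # no dark pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break                         # no bright pixels left
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance (up to a constant factor of 1/total**2).
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Toy bimodal "image": dark leaf pixels near 40, bright background near 200.
pixels = [38, 40, 42, 45, 41, 198, 200, 203, 199, 201]
t = otsu_threshold(pixels)
dark = [p for p in pixels if p <= t]    # darker-intensity category
bright = [p for p in pixels if p > t]   # bright-intensity category
```

Because the toy histogram is strongly bimodal, the chosen threshold falls between the two clusters, which is exactly the behavior [23] relies on to separate leaf regions from background before feeding crops to VGG16.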