Table 1 A summarized view of relevant studies focused on maize disease classification.
From: Enhanced residual-attention deep neural network for disease classification in maize leaf images
| Research Papers | Model | Pre-processing | Augmentation | Remark |
|---|---|---|---|---|
[15] | Attention-CNN | Resizing to 168 × 168 | No | The model has three residual modules, three convolutions, and a GAP layer, followed by a softmax layer. |
[17] | Attention-based DenseNet | Filtering, resizing to 224 × 224, edge filling, sharpening | Yes | Uses depthwise-separable convolutions in the dense blocks along with an attention mechanism. |
[22] | VGG + Inception | Filtering, resizing to 224 × 224, sharpening | Yes | The final convolutional layers of VGG were replaced with a convolution layer, batch normalization, and Swish activation. |
[23] | VGG16 | - | No | Otsu threshold segmentation was used to separate image pixels into two categories: bright intensity and darker intensity. |
[24] | Modified DenseNet | - | Yes | Each dense block contains batch normalization, ReLU activation, a 3 × 3 convolution, and dropout; the transition layer between two dense blocks performs downsampling. |
[25] | Modified Inception-v3 | Resizing to 256 × 256 | Yes | Three different Inception-v3-based models are developed. |
[20] | EfficientNetB0 + DenseNet121 | Resizing to 244 × 244 | Yes | Merged features from multiple pre-trained CNNs using a concatenation technique. |
[26] | VGG16, InceptionV3, ResNet50, Xception | Resizing to 224 × 224, 299 × 299, or 96 × 96, depending on the model | Yes | Hyperparameters were tuned using Bayesian optimization. |
[27] | TCI-AlexN | - | Yes | The model improves AlexNet by adding a 3 × 3 × 256 convolution after the last pooling layer. |
[29] | CNN trained from scratch | Cropping, expanding, mirroring | Yes | The potential areas for recognition are extracted by sharing the features of the transmission module layer by layer. |
[33] | CNN trained from scratch | - | No | Used the image features of Neuroph Studio for model training. |
[34] | CNN trained from scratch | Resizing to 224 × 224, rescaling | Yes | The best-performing model used a 224 × 224 input size, a batch size of 32, a 3 × 3 kernel, and an 80:20 train-test split. |
[35] | CNN trained from scratch | Resizing to 227 × 227, rescaling | Yes | With three times fewer trainable parameters, the model achieved a 3.2% higher prediction accuracy than the best-performing pre-trained network. |
[36] | CNN trained from scratch | Resizing to 224 × 224 | Yes | The model has three convolutional layers, each linked to a pooling layer, followed by the corresponding dense layers. |
[37] | Plant-Xvit | - | No | The principal components are the Conv2D blocks of the VGG and Inception models together with the ViT elements, MLP and MHA with linear projections. |
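Among the pre-processing steps listed above, the Otsu threshold segmentation used in [23] is the one classical algorithm simple enough to sketch inline. The pure-Python sketch below (the toy pixel list is an illustrative assumption, not data from any of the surveyed papers) picks the grey level that maximizes the between-class variance, splitting pixels into the bright- and dark-intensity categories the table describes:

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level maximizing between-class variance (Otsu's method)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))  # sum of all grey values
    sum_bg = 0.0      # running sum of grey values in the "dark" class
    weight_bg = 0     # running pixel count of the "dark" class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight_bg += hist[t]
        if weight_bg == 0:
            continue                      # no dark pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break                         # no bright pixels left
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance (up to a constant factor of 1/total**2).
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Toy bimodal "image": dark leaf pixels near 40, bright background near 200.
pixels = [38, 40, 42, 45, 41, 198, 200, 203, 199, 201]
t = otsu_threshold(pixels)
dark = [p for p in pixels if p <= t]    # darker-intensity category
bright = [p for p in pixels if p > t]   # bright-intensity category
```

Because the toy histogram is strongly bimodal, the chosen threshold falls between the two clusters, which is exactly the behavior [23] relies on to separate leaf regions from background before feeding crops to VGG16.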