Abstract
Wheat is one of the world’s most widely cultivated cereal crops and is a primary food source for a significant portion of the population. Wheat goes through several distinct developmental phases, and accurately identifying these stages is essential for precision farming. Determining wheat growth stages accurately is crucial for increasing the efficiency of agricultural yield in wheat farming. Preliminary research identified obstacles in distinguishing between these stages, negatively impacting crop yields. To address this, this study introduces an innovative approach, MobDenNet, based on data collection and real-time wheat crop stage recognition. The data collection utilized a diverse image dataset covering seven growth phases ‘Crown Root’, ‘Tillering’, ‘Mid Vegetative’, ‘Booting’, ‘Heading’, ‘Anthesis’, and ‘Milking’, comprising 4496 images. The collected image dataset underwent rigorous preprocessing and advanced data augmentation to refine and minimize biases. This study employed deep and transfer learning models, including MobileNetV2, DenseNet-121, NASNet-Large, InceptionV3, and a convolutional neural network (CNN) for performance comparison. Experimental evaluations demonstrated that the transfer model MobileNetV2 achieved 95% accuracy, DenseNet-121 achieved 94% accuracy, NASNet-Large achieved 76% accuracy, InceptionV3 achieved 74% accuracy, and the CNN achieved 68% accuracy. The proposed novel hybrid approach, MobDenNet, that synergistically merges the architectures of MobileNetV2 and DenseNet-121 neural networks, yields highly accurate results with precision, recall, and an F1 score of 99%. We validated the robustness of the proposed approach using the k-fold cross-validation. The proposed research ensures the detection of growth stages with great promise for boosting agricultural productivity and management practices, empowering farmers to optimize resource distribution and make informed decisions.
Similar content being viewed by others
Introduction
Smart agriculture, known as precision farming, utilizes the Internet of Things (IoT) techniques to enhance agricultural activity by monitoring critical parameters such as weather, soil moisture, and crop health. IoT technology, such as wireless sensors and data collection systems to continuously monitor these crucial parameters, allows farmers to carry out their activities more efficiently and effectively1,2. It allows farmers to use collected data from the field via sensors to increase their effectiveness and efficiency in cultivation3. Precision agriculture is applied to maximize crop growth and quality. As a result of this, resource management is optimized. It has numerous uses and offers many applications, ranging from pest detection to asset monitoring4. Nevertheless, efficiency in growth stage identification has been a substantial concern with smart agriculture due to various challenges. The traditional approaches are unable to offer precise solutions for crop growth stage identification, hence resource wastage5.
Wheat, scientifically known as Triticum aestivum L., is one of the most commonly cultivated basic cereal crops. Its global consumption has served as a staple for nearly 40% of the global population over 10,000 years. Its widespread cultivation and consumption around the world have tremendous implications for the food security issue today6. It contains calories, proteins, minerals, and fibers that are consumed, and wheat is also relevant in several sectors, including brewing, essential oils, and animal feeds around the globe7. Wheat remains a significant agricultural product owing to its high nutritional value, adaptability, and worldwide importance8. Since it is a globally significant crop, it is critical and necessary to monitor its health and growth process.
Wheat growth mainly includes plant height, leaf numbers, and tiller and dry matter accumulation. However, growth process factors such as wheat varieties, sowing dates, tillage, irrigation frequency, planting density, timing, and nitrogen are critical for optimizing yield in wheat cultivation. Nitrogen at 125 kg/ha increased plant height, protein content and days to anthesis significantly while decreasing thousand seed-weight, thereby affecting wheat development overall9,10,11. Genetic, physiological, and agronomic interventions, resource preservation measures, and precision breeding approaches are required for optimal wheat growth12. Increasing the efficiency of inputs, adopting sustainable practices, and applying modern technologies are all critical to maximizing yield capacity along with environmental sustainability13.
Technological advances have made it possible to develop models that use image data and hierarchical class structures to properly classify crop types and growth stages14. However, minimal research has been conducted on identifying growth stages in wheat crops because most studies have concentrated on yield prediction and weed identification. Several studies have encountered challenges, such as low accuracy rates, undefined image datasets, and training time complexity. Notably, the study15 used convolutional neural networks (CNN) to successfully distinguish between barley and wheat growth phases with moderate results for major stages, respectively. In16, the MRF-SSM methodology achieved 89% results in accurately determining the growth stages of winter wheat using VV polarization from Sentinel-1A time-series images. Similarly, several studies17,18,19 focused on yield estimation, disease, and weed detection.
The domain of wheat crop stage detection is not very well studied in the existing literature. This study intends to fill the research gap concerning crop sowing timing on crop development, phenology, and yield based on introducing a new framework that combines multiple data sources and predictive modeling. In the current study, seven wheat growth stages were covered, including ’Crown Root’, ’Tillering’, ’Mid Vegetative’, ’Booting’, ’Heading’, ’Anthesis’, and ’Milking’. The analysis concentrates on assessing the influence of the different sowing dates on crop yield by frequently capturing the entire crop field during the growth process using imagery, reinforced by a set of surveys in the field that record the agronomic parameters to understand the dynamics of crop growth. Following this, a wide variety of prediction models are utilized on the dataset that builds up finally and through all the growth stages to identify the optimal period to capture crop progression. The suggested framework makes the following contributions.
-
This study proposed a novel hybrid transfer learning approach MobDenNet, combining the MobileNetV2 and DenseNet-121 models to procure outstanding results for wheat crop growth stage prediction. The collection of real image data from wheat fields for crop stage prediction. A dataset of 4,110 images was captured that illustrate seven growth stages of wheat crops. The collected data is later used for experiments for improved prediction accuracy.
-
Preprocessing of collected image data is carried out. The selection of models and data preprocessing, including balancing of data and the development of image data augmentation, is a strategy that ensures more accurate results.
-
Several deep and transfer learning models are optimized and implemented for improved predictive performance. CNN, MobileNetV2, DenseNet-121, InceptionV3, and NASNet-Large models are implemented. Model performance was further enhanced using hyperparameter adjustment
-
K-fold cross-validation is used for performance validation. In addition, models from existing literature are selected for performance analysis. The models’ computational cost is also evaluated.
The subsequent sections of the paper are structured in the following manner: section “Literature review” presents a thorough analysis of the available research on forecasting wheat development stages using several stage images. A comprehensive explanation of the innovative approach can be found in section “Proposed methodology”. Section “Experiments and observations” outlines the experimental evaluations conducted in this study. The study’s findings and consequences are detailed in section “Conclusion”.
Literature review
Examining existing studies is essential as it enables the identification of areas that need further comprehension and improves the accuracy of computer models used in agriculture. Examining the growth stages of wheat, predicting crop yields, and identifying crop diseases is crucial for improving agricultural practices and ensuring food security. There has been a lack of extensive research on identifying growth stages. Therefore, we also investigate attempts to forecast crop yields and identify diseases in wheat crops. The majority of existing literature on the application of machine and deep learning in wheat cultivation emphasizes disease and weed detection using images19, determining optimal parameters for yield maximization20, and concentrates on water resources21. Several studies15,22,23,24,25,26,27,28 can also be found on crop growth stage prediction.
The authors aimed to develop a learning model to identify different stages of wheat growth, focusing on being efficient in terms of computation and energy usage22. The approach involved using a model to detect wheat growth phases and a dynamic migration algorithm based on reinforcement learning. The innovative dynamic migration algorithm showed a decrease in energy consumption by 128.4% and an increase in efficiency by 121.2% when compared to other methods. Further studies could focus on enhancing the model’s precision by adjusting hyperparameters and exploring improvements specifically designed for mobile edge computing applications in agriculture.
In23, the authors introduced the WE3DS dataset back in 2023, creating a collection of RGB-D images specifically designed for identifying types of plants in agricultural environments. The dataset consists of 2568 images that combine color images with distance maps along, with labeled masks to show the ground truth. These images were taken in light using an RGB-D sensor with configured RGB cameras. To train models, they used techniques like random forest (RF) and CNN such as U-net and FCN, achieving an intersection over union (mIoU) accuracy of up to 70.7%. The main goal of the study was to segment plant species under natural lighting conditions using the WE3DS dataset as a reference point for evaluating models trained on RGB, RGB-D, and D-data. With a total of 17 plant species included, this dataset marked a step, in simulating real-world crop farming scenarios under natural lighting conditions. The researchers also plan to enhance the dataset by annotating a set of 3656 RGB-D images in research endeavors.
The main goal24 was to distinguish between the growth stages of wheat and barley by using close-up images with convolutional neural networks (ConvNets) and compare their effectiveness with traditional machine learning methods. It explored three different approaches: feature extraction combined with a support vector machine, training ConvNets from scratch, and training ConvNets with transfer learning. It is worth noting that the ConvNet utilizing transfer learning showed the best performance, achieving accuracy rates of 93.5% and 92.5% for wheat and barley principal growth stage classification tasks. Respectively, future research endeavors may center on addressing the identified study limitations and investigating additional machine learning (ML) techniques to enhance performance further.
The study25 came up with a way to track wheat lodging that tackles two challenges: calculating lodging area and analyzing lodging at different growth stages. The technique suggests using deep learning methods for segmentation, quick prediction, and reliable generalization. Specifically, they utilize the SegFormer B1 model to determine lodging areas achieving an accuracy rate of 96.56% and showcasing strong generalization skills. Notably, the model, trained on a dataset of growth stages, outperforms models trained on single-stage datasets with a mIoU of 89.64%. This exceptional performance allows for its use throughout the wheat growth cycle. By using drones to capture images of lodged wheat, their approach enables invasive monitoring and precise calculation of lodging areas based on image data. This method is well suited for situations requiring real-time efficiency and accuracy in disaster monitoring.
The authors explored the identification of winter wheat at growth stages using sensing data analysis in29. They utilized Sentinel 2 sensing images to extract features for classification purposes. Two models were created: a forest classification model and a deep U-Net semantic segmentation model both leveraging bands from the Sentinel 2 images. The best accuracy in detecting winter wheat was achieved during the jointing heading phase, with the random forest classification model achieving an accuracy rate of 96.90%. Furthermore, the accuracy with two deep learning models for winter wheat extraction based on municipal statistical data reached 96% and 88%.
The study21 delved into the water requirements, efficiency of water use, crop coefficient, and depletion of soil water availability across growth stages of wheat in the New Delhi area. Measurements of evapotranspiration were taken using a lysimeter placed within the crop region, while meteorological data was collected from an observatory near the experimental farm. The crop coefficient (Kc) was calculated using established formulas and relationships considering factors like wilting point, soil bulk density, and depth of the root zone, and it attained high values of (1.1–1.2). The highest water usage was noted during the elongation phase, resulting in effects on yield when soil water availability dropped by 50% during crucial growth stages. The crop’s productivity was negatively impacted, reducing it by as 18% from its maximum yield.
The authors employed machine vision and deep learning techniques in19 for the real-time detection of weeds. The data-gathering process involved the acquisition of 6000 images depicting various weed and wheat crop scenarios under different weather conditions at the research farm of PMAS-Arid Agriculture University. Through the utilization of PyTorch, deep learning models exhibited superior performance compared to TensorFlow, yielding higher precision rates for both weed (0.89) and wheat plant (0.91) identification, with inference times recorded at 9.43 ms and 12.38 ms per image, respectively, utilizing an NVIDIA RTX2070 GPU. However, it is important to note that these inference times were specific to the GPU model and image dimensions used in this study and may vary in alternative settings. Moving forward, future research endeavors should explore the incorporation of supplementary sensors or data sources further to enhance the accuracy and efficiency of weed detection.
The evaluation of how well winter wheat grew using Sentinel-2 data is carried out in26 and compared with established standards. Across 75 fields in Ireland and the UK, researchers observed five characteristics of winter wheat at crucial growth stages. They developed models to predict crop growth, finding that models tailored for growth stages performed better than those covering the season. The results showed promising performance overall, with stage models achieving R2 values between 0.72 and 0.87. Future studies could focus on improving models for precise monitoring purposes.
The study15 conducted a review on advanced image processing techniques for tracking the growth of cereal crops. Those parameters include but are not limited to canopy cover, biomass, leaf area index, chlorophyll levels, and growth stages. Image processing methods conforming to the combination of those factors have been discussed in the study to find an optimal solution to their extraction from high-resolution images. Moreover, an analysis of the current approaches to image processing in the context of cereal crop monitoring has been made as it activates multiple hindrances, among which is the lack of proper lighting, camera position, and possible obstructions. A comparative analysis of those factors has been implemented in the research to identify the fields for better performance and elevated accuracy. The relevance of research activities before is supported by the research on driving factors important for both isotopic measurements and other fields.
The study27 proposed an approach to use agricultural knowledge to feed a network design meant for careful monitoring of the crop’s growth during the season. The researchers used a domain-guided neural network (DgNN) that has a long short-term memory (LSTM) architecture and an attention mechanism that varies the importance with which multiple factors feed the crop, and in this approach was fed using the sensing data from Iowa, U.S.A., during the years 2003–2019, accompanied by USDA-collated crop progress reports as a reference, and the DgNN model followed specifically corn-growth. In comparison to full dense-only neural network structures and the Hidden Markov Models, the DgNN model is outperforming other models. In total, across all the growth stages, it achieves a 4.0% better Nash-Sutcliffe efficiency and gets an additional 39% well-similarity weeks during the test years. Therefore, the approach can be used on other crops to allow for the near real-time examination of the growth stage and provide a basis for future investigations.
In20, the authors resolved to determine productivity with the assistance of synthetic aperture radar (SAR) data. For the winter wheat fields at the growth and maturity stages, the RAM time series data is taken for the investigation of SAR Sentinel-1 satellites. Their methodology included curves to the SAR time series, studying derivatives to pinpoint key stages in crop growth, and examining correlation matrices for predicting yields. They found that the day of the year with the VH/VV value correlated with yield (\(\hbox {r} = -0.56\)), while a longer duration of “full” vegetation showed a positive correlation with yield (r = 0.61). In the period of peak vegetation, the essential seasonal variation is (p = 0.042), the midway of growth (p = 0.037), the growing season duration is (p = 0.039), and yield (p = 0.016) was perceived. The study observed variations in various growth parameters and yields which aligned with existing knowledge of crop phenology. Further research is needed to explore uncertainties and the practicality of this approach in agroecosystems.
The study28 worked to predict the maturity dates of winter wheat in the study by combining MODIS LAI data with the WOFOST model using a data assimilation approach. They integrated sensing information into the WOFOST model to forecast when winter wheat would reach maturity. The WOFOST model was run with reinitialized parameters and TIGGE data for weather input. The regional predictions for maturity dates showed a high determination coefficient \(R^2\) of 0.94 and a low root mean square error (RMSE) of 1.86 days. Future studies could focus on improving the data assimilation framework and enhancing the accuracy of maturity date forecasts. Similarly,30 introduced a method to detect wheat crops by analyzing MODIS-TERRA MOD13Q1 data and applying a noise clustering soft classification technique. They optimized date combinations and vegetation index parameters to improve the accuracy of wheat crop classification. Separability analysis was used to refine date combinations, followed by noise clustering classification. The resolution of the noise clustering classifier parameter was 1.6 \(\times\) \(10^4\) used for the wheat crops identification. Based on the growth stage of the soil-adjusted vegetation stages index assessment, the SMA’s result achieved the highest area under the ROC curve for the detection of wheat crops.
Table 1 provides a brief overview of the discussed works. It can be seen that previous analyses of wheat crop growth stage identification have many limitations in terms of various techniques and systems. Traditional ways of observing wheat growth stages are inefficient methods that require a lot of labor and time and are exposed to human error. Moreover, most previous research used elementary machine learning models, which perform worse than more complex methods in most cases. Furthermore, there has been limited research on advanced deep-learning models that incorporate different architectures of transfer learning. Comprehensive details of the identified research gaps are provided here.
-
The traditional approaches and use of simple remote sensing with basic machine learning models cannot offer the level of accuracy and generalization required by practical applications.
-
For the most part, no large-scale and formal dataset is dedicated to wheat growth stage classification. Additionally, available datasets are generally limited in size and scope.
-
A major gap identified is the challenge in result performance analysis, especially regarding time complexity and accuracy.
-
The practical application of these models in agricultural settings is impeded by the absence of farmer-friendly real-time tools.
This study aims to overcome these issues by incorporating advanced techniques and modern models, which will lead to more accurate and reliable identification of development stages in wheat crops.
Proposed methodology
The research concentrated on identifying the wheat crop growth stage using a comprehensive approach. The primary materials used to test the experimental research in this investigation include wheat crop images from the region of Southern Punjab, Pakistan. The image detector software based on camera sensors is used to identify the growth stages in wheat crops. A total of 4,110 images are obtained from various crops at different stages of their growth. These high-resolution images depict the complete journey from the emergence of initial crown roots to the final milking stage.
Figure 1 presents the proposed methodological framework based on a standardized dataset of the wheat growth phases. The dataset generation process began with gathering an extensive set of images that capture the seven stages of wheat crop development. Ensuring that all classes were adequately represented, we explored the collected dataset, utilized data augmentation techniques, and addressed class imbalance. Subsequently, the dataset is divided into an 80% training and 20% validation portion. Deep and transfer learning methodologies are employed to create predictive models that use the training data to identify the wheat crop growth stages, but the results are not favorable. Following this training process, a new transfer learning approach, MobDenNet, is introduced, and the model is fine-tuned to produce highly accurate detection of the wheat crop stages. The steps for the proposed methodology for the identification of wheat crop growth stages are as follows.
-
Step 1: The initial phase involves data collection and preprocessing of the image data, which includes balancing the data and implementing image data augmentation strategies. This ensures more accurate alignment and improves the quality of the dataset for further analysis.
-
Step 2: The dataset is then divided into two portions: a training set and a testing set (80:20) ratio. The training set is used to train applied deep and transfer learning models, while the testing set is employed to evaluate the models’ performance and generalization capabilities.
-
Step 3: The performance of five models, CNN, MobileNetV2, DenseNet-121, InceptionV3, and NASNet-Large, is evaluated for predicting the wheat growth stages. This step helps identify the most effective models for the task.
-
Step 4: This research proposed a hybrid transfer learning method called MobDenNet, which combines the frameworks of MobileNetV2 and DenseNet-121. This novel approach is developed to achieve outstanding results in growth stage prediction.
-
Step 5: The evaluation of each model is conducted using appropriate metrics such as accuracy, precision, recall, and F1 score. This step ensures the effectiveness and reliability of the models in accurately identifying the growth stages of wheat crops.
-
Step 6: Model performance is further enhanced using hyper-parameter tuning and k-fold cross-validation methods while being mindful of computational costs. These techniques improved model precision and robustness.
Phase 1: Wheat crop growth stage image collection
This research created a comprehensive dataset by capturing the seven stages of wheat growth in the time frame between November 2023 and April 2024. The wheat field under consideration is situated in Khanpur, Southern Punjab, Pakistan, and spans a land of 10 acres. A total of 4,110 images were taken during morning and evening sessions. These photos provide a detailed view of each stage’s features, as the camera positions were maintained at heights of 2–4 feet. The photos were captured using an iPhone 14 Pro. It comes with a 48-megapixel main camera with an f/1.78 aperture and a whopping 48MP 1/1.28 sensor with Quad-Bayer color filters, giving the images high resolution.
This research leverages a corpus of images symbolizing the seven distinct stages of development for experimental usages, as depicted in Fig. 2. The dataset encompasses 4,110 files, organized into seven stages, ’Crown root initiation’, ’Tillering’, ’vegetative growth’, ’Booting’, ’Heading’,’ Anthesis’, and ’Milking’. A thorough scrutinization uncovered a variance in the number of images across these stages. This imbalance, accentuated in Fig. 2, underscores the necessity for techniques to address this issue to maximize accuracy and overall performance.
The original dataset is composed of 4,110 images. The image counts for each class are as follows: Crown root 478, Tillering 1306, Mid vegetative phase 645, Booting 272, Heading 536, Anthesis 417, and Milking 456, the dataset is highly imbalance as shown in Fig. 3.
Phase 2: Image preprocessing and data augmentation
To mitigate the data imbalance issue, data augmentation31 is used in this study. Data augmentation, which involves modifying the existing data in a variety of ways to increase the diversity and size of a dataset employing various parameters, as explained in Table 2.
The data augmentation technique is applied to the growth stages we have developed and gathered, which were, in particular, we did not apply the data augmentation to the Tillering and Mid Vegetative stages since they already have a sizeable number of images. As a result of the augmentation process, the total number of images with augmentation data became 4496 to accommodate the balance of five classes as illustrated in Fig. 4. The tillering stage has 1,306 images. Hence, for the selected study, the strategy used approximately 650 images from the Tillering stage. After augmentation, image counts for each five classes are as follows: Crown root 640, Booting 646, Heading 639, Anthesis 639, and Milking 640. Therefore, this extensive data augmentation process improves the balance of the dataset and its representative nature, which in turn improves the robustness and generalization capabilities of the models.
Phase 3: Image data splitting
Splitting datasets into subsets is fundamental in supervised machine learning as it facilitates model training and performance assessment. The 80:20 splitting ratio is used as the study split image data into 80% for training and 20% for testing. First, the training subset is used to fit parameters and improve the model learning potential. Second, the test subset is statistically unbiased and assesses how the machine will perform using new data unseen before. Therefore, the test subset offers an objective metric of the machine generalization while maintaining overfitting.
Phase 4: Applied transfer learning and artificial intelligence techniques
In this section, we present deep and transfer learning techniques utilized in the study. Transfer learning32 comprises pre-trained models applied to make predictions. In other words, an existing pre-learned feature helps to make an accurate prediction. The methods of transfer learning-based fine-tuning use a pre-existing trained neural network by training most of the layers of the network with the new dataset. Also describes the details of the employed methods in growth stage classification, such as the configuration parameters and the structural parameters of the neural networks.
Artificial intelligence in agriculture has already become a new word in precision farming33. Thus, an example of predicting the growth stages of wheat crops is a distribution of the vital aspects of crop management and cultivation. In the future, the combination of the forenamed practices with AI can become a key development34,35,36,37,38. Therefore, farmers would be able to apply innovations already developed to the hard-to-overcome cycle of wheat cultivation.
Convolutional neural network
The application of CNNs39 in the field of agriculture, specifically for predicting the growth stages, has proven to be an exceedingly potent approach. This methodology encompasses analyzing images that document the differing stages of crop maturation. The architecture of the CNN leveraged in this investigation is depicted in detail in Fig. 5. The model architecture begins with a Conv2D layer of 16 filters and ReLU activation, resulting in an output shape of (None, 222,222,16). This is further processed by a flattened layer that reshapes the output into (None, 790272). The model ends with a final output layer of 7 units with softmax activation for classification. The total parameters of the model are 5,531,911.
MobileNetV2
MobileNetV2 stands as an effective and lightweight convolutional neural network structure tailored for image classification tasks40, notably suitable for mobile and embedded systems facing constraints in computational resources. It has inverted residuals and linear bottlenecks, which increase efficiency while decreasing the number of parameters for performance; making it ideal for resource-constrained environments. Using fewer channels in bottleneck layers helps inverted residuals to cut down on computation. Meanwhile, model efficiency is preserved by linear bottlenecks through the application of linear activation functions. It serves as a cornerstone in the development of lightweight and efficient models for computer vision applications on mobile platforms.
As shown in Fig. 5, the MobileNetV2 model parameter configuration and layer architecture have also been analyzed. The output shape of the input layer is (None, 224, 224, 3) which is followed by a standard convolutional layer of CNN. Several inverted residual blocks are there in it that use a linear bottleneck, shortcut connections, and the expansion layer introduced to make the information flow easier and increase the input channels before applying to the following depthwise convolutions. The network ends with an output layer where the number of parameters is 1799, and a fully connected layer with softmax activation is used here for a classification task.
DenseNet-121
The DenseNet-121 architecture is a CNN41 that features dense connections between its layers, enabling access to the feature maps of all the preceding layers. This structure ensures optimal feature propagation and reuse and contributes to solutions to the vanishing gradient problem. The DenseNet-121 architecture unequivocally integrates 121 layers, rendering it a blend of complexity and computational efficiency, which is why it is optimal for many computer vision tasks, including image classification, and demonstrates competitive performance compared to other popular architectures.
The DesnseNet-121 model’s parameter configuration and layer architecture analysis are displayed in Fig. 5. The output shape of the input layer is( None,224,224,3 ) followed by the DenseNet base layer final output shape of (None, 7, 7, 1024). Then, the ReLU activation function leads into our final output layer with 7 units and softmax activation for classification tasks.
NASNet-Large
NASNet-Large is a deep CNN5 model primarily used for more complicated tasks and bigger datasets in computer vision. Generally, NASNet Large achieves comparatively greater performance on more intricate tasks and larger datasets, such as ImageNet, than the base NASNet model, but at the cost of more computing time and larger model complexities.
The NASNet-Large model’s parameter configuration and layer architecture analysis are displayed in Fig. 5. The initial layer accepts input arrays of shape (None, 224, 224, 3) and passes them to a convolutional layer, producing feature maps of shape (None, 111, 111, 96). The bulk of the network consists of the complex NASNet Large base architecture containing many layers transforming the representations into myriad higher-level feature maps. Ultimately, a GlobalAveragePooling2D layer reduces these final feature maps into a single vector fed into a dense layer of 32 units using a rectified linear unit (ReLU). A concluding dense output layer of 7 units with softmax activation then performs the classification task.
InceptionV3
An inception V3 is a more powerful deep CNN crafted to perform image classification tasks with maximum efficiency. It is designed to produce accurate performance with a significantly reduced computational demand as compared to many other network architectures.
The InceptionV3 model’s parameter configuration and layer architecture have also been analyzed in Fig. 5. The first layer gives an output shape of (None, 224, 224, 3). This is followed by a 32-filter Conv2D layer with ReLU activation with an output shape of (None, 111,111,32). Then, a batch normalization layer is applied, followed by a GlobalAveragePooling2D layer. Then, a dense layer with 1024 units and ReLU activation is incorporated. Lastly, a final output layer comprising 7 units uses the softmax activation function for the classification task.
Phase 5: Novel hybrid transfer learning-based proposed approach
This study introduces a novel approach, MobDenNet, which combines the architectural layers of both MobileNetV2 and DenseNet-121 transfer learning-based models. This marks the first instance of utilizing a hybrid transfer learning network design for the detection of wheat crop growth stages. By merging MobileNetV2 and DenseNet-121 layers, the model architecture aims to effectively capture patterns from historical data and generalize well to unseen data instances. We meticulously analyze the architecture and configuration parameters of the proposed model.
The integration of MobileNetV2 and DenseNet-121 in the hybrid MobDenNet model is a deliberate design choice aimed at harnessing the unique strengths of both architectures.
-
MobileNetV2 contributes its lightweight, efficient operations, making it ideal for initial feature extraction in resource-constrained environments.
-
Meanwhile, DenseNet-121 complements this by focusing on deep feature reuse and enhanced gradient flow, ensuring the model captures complex and nuanced patterns effectively.
Rather than introducing inefficiencies, this hybrid approach balances efficiency and performance by strategically assigning computational resources to different layers based on their functional requirements. MobileNetV2’s simplicity enables quick and efficient preliminary processing, while DenseNet-121 ensures robust feature extraction without compromising the model’s ability to generalize.
Empirical results demonstrate that MobDenNet achieves a superior trade-off between computational efficiency and predictive accuracy compared to standalone architectures. This synergy validates the compatibility of the two design philosophies and underscores the hybrid model’s adaptability to diverse tasks, offering both resource efficiency and high performance. We appreciate your feedback and believe this perspective strengthens the justification of our approach.
Table 3 provides a detailed breakdown of the configuration parameters for the proposed model, outlining the units and settings employed during model construction. Additionally, Fig. 6 visually illustrates the architecture analysis, showcasing the flow of wheat crop growth stage detection image data from input to prediction layers using the proposed approach. The model architecture leverages a combination of pooling, dropout, flatten, and fully-connected layers to realize this novel hybrid design. Algorithm 1 shows the flow of the proposed MobDenNet model.
In the proposed network, initially, the input image is processed through the MobileNetV2 architecture, which is a series of depthwise separable convolutions. The output of MobileNetV2 is a feature map with a shape of (None, 4, 4, 1280). Then, the input image is passed through the DenseNet-121 architecture, which is a densely connected convolutional network. The output of DenseNet-121 is a feature map with a shape of (None, 4, 4, 1024). As shown in Fig. 6, these two feature maps are concatenated along the channel dimension, resulting in a combined feature map with a shape of (None, 4, 4, 2304). The concatenated feature map is flattened into a vector of size (None, 36864). A dropout layer with a rate of 0.2 is used in this vector to reduce overfitting. A dense layer with 7 units and a softmax activation function is utilized as the last output layer for classification tasks.
The proposed MobDenNet model provides benefits over current cutting-edge models. Its structure, MobDenNet, has been carefully fine-tuned leading to decreased intricacy in comparison to models. Upon examination, it is clear that the suggested model is effective in handling data processing tasks. The incorporation of two sets of transfer learning network layers in the suggested model has produced scores for identifying wheat crop growth phases.
Phase 6: Hyperparameter tuning across neural network models
Hyperparameter optimization is an essential step of any neural network technique, and in the wheat growth stage prediction, it is more important. The hyperparameter tuning process involves a recursive cycle of training and testing to identify the optimal hyperparameters. A k-fold cross-validation technique is employed to ensure the selection of the best-performing parameters. A systematic approach is adopted to meticulously explore and adjust key hyperparameters, including the learning rate, batch size, and the number of hidden layers, prior to model training. Based on the implementations, the hyperparameter tuning analysis is presented in Table 4. The goal is to pinpoint the setup that could deliver more accurate outcomes. Following the implementation, we used k-fold cross-validation and iterative training and testing processes to identify the hyperparameters that suit our proposed approach.
Experiments and observations
In this section, we delve into a comprehensive examination of the research outcomes, accompanied by in-depth discussions. We also provide a comparative depiction of the experimental setup and performance results based on image data of wheat crops at varying growth stages. The results indicate the potential utility of these models proved as a valuable asset for agricultural practitioners, facilitating the identification of stages and the implementation of targeted interventions to enhance outcomes.
Experimental setup
The experimental environment is constructed using a cloud-based Notebook, namely Google Colab, for this study. In measuring the efficacy of neural network approaches, we used accuracy, F1, precision, and recall as the performance indicators, which served as the benchmarks for gauging the effectiveness of the models for recognizing wheat crop growth stages. Table 5 highlights the aspects of the environment that were used in the study.
Results with augmentation after data splitting
To ensure the reliability of model evaluation, the dataset was first split into training and testing subsets using an 80:20 ratio. Following the split, data augmentation was applied exclusively to the training set, with no augmentation performed on the testing set. Results in Table 6 indicate that the proposed hybrid model, MobDenNet, employed after augmentation, achieved an average accuracy of 94% in identifying wheat crop growth stages. The evaluation demonstrates strong performance across all growth stages, with particularly high precision and recall for classes such as “Booting” and “Milking.”
The confusion matrix results for the MobDenNet model demonstrate its effectiveness in accurately classifying wheat growth stages, even with data augmentation applied only to the training subset, as depicted in Fig. 7. The model excelled in identifying stages like “Crown Root” and “Milking,” achieving high accuracy with minimal misclassifications. However, slight errors were noted in stages such as “Tillering” and “Mid Vegetative Phase,” though these did not significantly impact overall performance. Compared to baseline models like CNN, NASNet-Large, and MobileNetV2, MobDenNet showed substantially reduced false positives and misclassifications, particularly for challenging stages like “Crown Root” and “Milking.” With an overall accuracy of 94%, the model’s performance, aided by exclusive data augmentation post-split, reflects its robustness and suitability for real-world agricultural applications requiring precise and scalable predictions.
The training and validation curves in Fig. 8 demonstrate the MobDenNet model’s significant improvement over 40 epochs. Starting with moderate performance, the model’s accuracy increased from 17.03 to 89.98% for training and from 34.91 to 94.18% for validation. The decrease in training and validation loss, from 6.5 to 0.3 and from 1.0 to 0.16, respectively, reflects the model’s ability to generalize well to unseen data. While there were some fluctuations in the middle epochs, indicating fine-tuning, the model stabilized by the final epochs. With a final validation accuracy of 94.18% and a low validation loss, the results confirm the model’s effectiveness and robustness for precise and scalable agricultural applications.
The comparative analysis of the neural network architectures, as illustrated in Fig. 9, reveals the superior performance of the proposed MobDenNet model, which combines MobileNetV2 and DenseNet-121. This hybrid model outperformed other networks such as MobileNetV2, DenseNet-121, Inception V3, and NASNet-Large, all of which achieved moderate accuracies ranging from 72 to 95%, MobDenNet yielded a high accuracy of 99%. These initial results were obtained from a dataset where data augmentation was applied prior to splitting the training and testing sets. For clearer and more reliable outcomes, further augmentation was applied exclusively to the training data, which was then injected into the proposed MobDenNet model. This refined approach resulted in an impressive 94% accuracy on unseen test data.
The significant improvement in accuracy with the MobDenNet model highlights its superior generalization capability across wheat growth stages. By applying augmentation exclusively to the training set, we not only preserved the integrity of the testing data but also enhanced the model’s ability to handle data variations effectively. This method mitigates overfitting, leading to better performance on unseen data compared to models that only relied on initial augmentation techniques. The 94% accuracy thus represents a notable advancement, offering a more precise, reliable, and scalable solution for agricultural applications than other existing deep learning models.
Analysis of runtime computational complexity
The proposed MobDenNet model demonstrates computational efficiency and consistency over 40 epochs as shown in Table 7. Training times progressively decreased from 2.71 s in the first 10 epochs to 2.02 s by epochs 11–20, stabilizing between 2.03 and 2.25 s for later epochs. This highlights the model’s resource efficiency, adaptability, and robust performance during extended training.
K-fold-based cross-validation results
For validation purposes, we utilized the k-fold cross-validation methodology with 5 folds. The efficacy of the proposed approach has been evaluated through k-fold cross-validation, and the results are provided in Table 8. The hybrid model is validated five-fold. Among them, the current research had got 97% average accuracy with a 0.12 loss score in our proposed model. Based on this cross-validation, one can infer that the proposed approach can successfully classify the growth stages of the wheat crops in a generalized way.
State-of-the-art studies comparisons
Table 9 presents the performance comparison of recent state-of-the-art studies and the newly proposed method. To make a fair comparison, only studies conducted between 2020 and 2024 are included. The most recent studies were based on deep learning, where the highest performance by previous studies is 93%, which is quite moderate. The proposed method is based on the transfer learning method so that the performance is enhanced. In comparison, all the studies have done their research on wheat maturity dates or stages except these23,29, which focused on the segmentation of plant spices and tracking wheat lodging. The proposed MobDenNet method uses architectural layers of two Transfer learning methods MobileNetV2 and DenseNet-121, which classified the seven growth stages of wheat with 94% accuracy. The current study has the highest performance compared to state-of-the-art techniques.
Study limitations and discussions
The dataset used in this research was indeed collected from a single geographic location, as our focus was on conducting initial experimental research to validate the methodology. We recognize the importance of geographic diversity to enhance the robustness and applicability of the model. Therefore, in future work, we plan to expand our dataset collection to include diverse geographic locations across the globe. This will allow us to evaluate the model’s performance in varying environmental and agronomic conditions, ensuring broader applicability and reliability. Regarding the model’s generalizability, we have taken measures to mitigate this limitation by ensuring that the dataset captures images with diverse features and details that are representative of the crop at various growth stages. As the same crop exhibits similar growth patterns across different geographic regions, our dataset includes comprehensive visual features essential for accurate crop stage prediction.
We addressed the potential risk of overfitting and inflated performance metrics due to image repetition by employing the k-fold cross-validation technique during model evaluation. This approach ensured that the dataset was divided into multiple folds, with the model being trained on different subsets and tested on the remaining ones. The validation results demonstrated that the model maintained consistent performance across all folds, indicating that it successfully learned generalizable features rather than merely memorizing repeated patterns specific to the dataset. These results confirm the model’s generalization capability for diverse climatic and agricultural conditions.
Future work
In future work, we intend to increase the diversity of the dataset by adding images from different geographical zones and environments to improve the generalization capability of the MobDenNet model. We also like to examine how incorporating more sophisticated machine learning methods, specifically ensemble learning and attention mechanisms, enhances performance results and time complexities. Through collaboration with agriculture experts, the goal is to develop applications that are readily available and smart enough to identify wheat growth stages in real-time, helping farmers by responding on a timely basis and leading to better decisions and resource management.
Conclusion
This study proposed and validated the performance of a novel hybrid deep learning framework MobDenNet with an exceptional accuracy of 99% to distinctly recognize seven wheat crop growth stages. The newly curated image dataset contains 4110 images that represent seven growth stages ’Crown Root’, ’Tillering’, ’Mid Vegetative’, ’Booting’,’ Heading’, ’Anthesis’, and ’Milking’. This study implemented data preprocessing, including advanced data augmentation techniques, to address the data imbalance challenge. In addition, advanced classification methods MobileNetV2, DenseNet-121, NASNet-Large, InceptionV3, and convolutional neural networks are employed. However, MobDenNet outperforms deep and transfer learning models using MobileNetV2 and DenseNet-121 architecture synergistically combined. Furthermore, to validate that the approach is robust and works correctly, k-fold cross-validation is performed. The proposed approach exhibited exceptional performance in comparison to existing state-of-the-art studies. Additionally, the computational complexity is determined. The functioning of MobDenNet has the potential to influence novel farming methods which can lead to decision-making and resource utilization more optimally.
Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
References
Prof, S. et al. Smart agriculture with IoT. Int. J. Innov. Res. Inf. Secur. 9, 225–228 (2023).
Purohit, S. V. et al. Smart agriculture. Int. J. Innov. Res. Adv. Eng. 10, 149–156 (2023).
Garg, D. & Alam, M. Smart agriculture: A literature review. J. Manag. Anal. 10, 359–415 (2023).
Siddegowda, C. J. & Devi, A. J. A study on the role of precision agriculture in agro-industry. Int. J. Appl. Eng. Manag. Lett. 66, 57–67 (2021).
Zhang, R. & Li, X. Edge computing driven data sensing strategy in the entire crop lifecycle for smart agriculture. Sensors 21, 7502 (2021).
Shewaye, Y. & Solomon, T. Performance of bread wheat (Triticum aestivum L.) line originating from various sources. Open Access J. Agric. Res. 3(4), 2474–8846. https://doi.org/10.23880/oajar-16000166 (2018).
Singh, G. & Dhillon, B. S. Enhancing the performance of wheat (Triticum aestivum L.) through fly ash and nitrogen management. Int. J. Curr. Microbiol. Appl. Sci. 8, 333–340 (2019).
Chauhan, N., Sankhyan, N. K., Sharma, R. P. & Rav, G. Productivity and quality of wheat as affected by long-term addition of fertilizers and amendments: A review. Int. J. Curr. Microbiol. Appl. Sci. 9, 2369–2377 (2020).
Gangwar, H. K. & Lodhi, M. D. Effect of nitrogen levels and number of irrigation on growth and yield of wheat. Int. J. Curr. Microbiol. Appl. Sci. 7, 3663–3673 (2018).
Chauhan, S. S., Singh, A. K., Yadav, S., Verma, S. K. & Kumar, R. Effect of different varieties and sowing dates on growth, productivity and economics of wheat (Triticum aestivum L.). Int. J. Curr. Microbiol. Appl. Sci. 9, 2630–2639. https://doi.org/10.20546/ijcmas.2020.902.300 (2020).
Mashiqa, P., Pule-Meulenberg, F. & Ngwako, S. Wheat growth as affected by planting density. Plant. Time Nitrog. Appl. 6, 66. https://doi.org/10.21203/rs.3.rs-1276650/v1 (2022).
Johnson, V. A. World wheat production. Genetic improvement in yield of wheat. CSSA Spec. Publ. 66, 1–5. https://doi.org/10.2135/cssaspecpub13.c1 (2015).
Yang, W. & Shen, Y. Quality assessment of feed wheat in ruminant diets. Glob. Wheat Prod. 6, 66. https://doi.org/10.5772/intechopen.75588 (2018).
Jeong, S. W., & Lee, K. M. Estimation crop types and growth stages with hierarchical classification models. In 2022 Joint 12th International Conference on Soft Computing and Intelligent Systems and 23rd International Symposium on Advanced Intelligent Systems (SCIS &ISIS). https://doi.org/10.1109/scisisis55246.2022.10002044 (2022).
Rasti, S. et al. A survey of high resolution image processing techniques for cereal crop growth monitoring. Inf. Process. Agric. 9, 300–315. https://doi.org/10.1016/j.inpa.2021.02.005 (2022).
Li, N., Li, H., Zhao, J., Guo, Z. & Yang, H. Mapping winter wheat in Kaifeng, China using Sentinel-1A time-series images. Remote Sens. Lett. 13, 503–510. https://doi.org/10.1080/2150704x.2022.2046888 (2022).
Vashisth, A., Goyal, A. & Krishanan, P. Effect of weather variability on growth and yield of wheat crop under semi-arid region of India. J. Agrometeorol. 22, 124–131. https://doi.org/10.54386/jam.v22i2.152 (2021).
Zheng, Q. et al. Identification of wheat yellow rust using optimal three-band spectral indices in different growth stages. Sensors 19, 35. https://doi.org/10.3390/s19010035 (2018).
Haq, S. I. U., & Tahir, M. N. Weed Detection in Wheat Crop Using Image Analysis and Artificial Intelligence (AI). https://doi.org/10.20944/preprints202210.0284.v1 (2022).
Vavlas, N. C. et al. Deriving wheat crop productivity indicators using Sentinel-1 time series. Remote Sens. 12, 2385. https://doi.org/10.3390/rs12152385 (2020).
Verma, I. J. & Das, H. P. A study on available soil water during the growth of wheat (Triticum aestivum L.) at New Delhi. MAUSAM 55, 469–474. https://doi.org/10.54302/mausam.v55i3.1193 (2004).
Li, Y. et al. Research on winter wheat growth stages recognition based on mobile edge computing. Agriculture 13, 534. https://doi.org/10.3390/agriculture13030534 (2023).
Kitzler, F., Barta, N., Neugschwandtner, R. W., Gronauer, A. & Motsch, V. WE3DS: An RGB-D image dataset for semantic segmentation in agriculture. Sensors 23, 2713. https://doi.org/10.3390/s23052713 (2023).
Rasti, S., Bleakley, C. J., Silvestre, G. C. M., O’Hare, G. M. P. & Langton, D. Assessment of deep learning methods for classification of cereal crop growth stage pre and post canopy closure. J. Electron. Imaging 32, 66. https://doi.org/10.1117/1.jei.32.3.033014 (2023).
Jiang, S. et al. Monitoring wheat lodging at various growth stages. Sensors 22, 6967. https://doi.org/10.3390/s22186967 (2022).
Goh, B. B., King, P., Whetton, R. L., Sattari, S. Z. & Holden, N. M. Monitoring winter wheat growth performance at sub-field scale using multitemporal Sentinel-2 imagery. Int. J. Appl. Earth Observ. Geoinfor. 115, 103–124. https://doi.org/10.1016/j.jag.2022.103124 (2022).
Worrall, G., Rangarajan, A. & Judge, J. Domain-guided machine learning for remotely sensed in-season crop growth estimation. Remote Sens. 13, 4605. https://doi.org/10.3390/rs13224605 (2021).
Zhuo, W. et al. Prediction of winter wheat maturity dates through assimilating remotely sensed leaf area index into crop growth model. Remote Sens. 12, 2896. https://doi.org/10.3390/rs12182896 (2020).
Liu, A. C. C., Law, O. M. K. & Law, I. Understanding Artificial Intelligence. https://doi.org/10.1002/9781119858393 (2022).
Upadhyay, P., Ghosh, S. K. & Kumar, A. Temporal MODIS data for identification of wheat crop using noise clustering soft classification approach. Geocarto Int. 31, 278–295. https://doi.org/10.1080/10106049.2015.1047415 (2015).
Goceri, E. Medical image data augmentation: Techniques, comparisons and interpretations. Artif. Intell. Rev. 56, 12561–12605. https://doi.org/10.1007/s10462-023-10453-z (2023).
Al-Otaibi, S., Rehman, A., Raza, A., Alyami, J. & Saba, T. CVG-Net: Novel transfer learning based deep features for diagnosis of brain tumors using MRI scans. PeerJ Comput. Sci. 10, e2008. https://doi.org/10.7717/peerj-cs.2008 (2024).
Talaviya, T., Shah, D., Patel, N., Yagnik, H. & Shah, M. Implementation of artificial intelligence in agriculture for optimisation of irrigation and application of pesticides and herbicides. Artif. Intell. Agric. 4, 58–73. https://doi.org/10.1016/j.aiia.2020.04.002 (2020).
Raza, A., Munir, K., Almutairi, M. S. & Sehar, R. Novel transfer learning based deep features for diagnosis of down syndrome in children using facial images. IEEE Access 6, 66 (2024).
Madni, H. A., Raza, A., Sehar, R., Thalji, N. & Abualigah, L. Novel transfer learning approach for driver drowsiness detection using eye movement behavior. IEEE Access 6, 66 (2024).
Khalid, M. et al. Novel sentiment majority voting classifier and transfer learning-based feature engineering for sentiment analysis of deepfake tweets. IEEE Access 6, 66 (2025).
Thalji, N. et al. Segmented X-ray image data for diagnosing dental periapical diseases using deep learning. Data Brief 66, 110–539 (2024).
Younas, F. et al. An efficient artificial intelligence approach for early detection of cross-site scripting attacks. Decis. Anal. J. 11, 100466 (2024).
Naseer, A. et al. A novel transfer learning approach for detection of pomegranates growth stages. IEEE Access 12, 27073–27087. https://doi.org/10.1109/access.2024.3365356 (2024).
Dong, K., Zhou, C., Ruan, Y., & Li, Y. MobileNetV2 model for image classification. In 2020 2nd International Conference on Information Technology and Computer Application (ITCA). https://doi.org/10.1109/itca52113.2020.00106 (2020).
Arathi, B., & Dulhare, U. N. Classification of cotton leaf diseases using transfer learning-DenseNet-121. In Lecture Notes in Networks and Systems, Proceedings of Third International Conference on Advances in Computer Engineering and Communication Systems 393–405 (2023).
Funding
This research is funded by the European University of Atlantic.
Author information
Authors and Affiliations
Contributions
AN conceived the idea, performed data analysis and wrote the original draft. MA conceived the idea, performed data curation and wrote the original draft. AR performed data curation, formal analysis, and designed methodology. KM dealt with software, performed visualization and carried out project administration. AS performed visualization, deal with software and designed methodology. HFG acquired the funding for research, and performed visualization and initial investigation. CEUR performed initial investigation, provided resources and performed validation. IA supervised the study, performed validation and review and edit the manuscript. All authors read and approved the final manuscript.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Naseer, A., Amjad, M., Raza, A. et al. Novel hybrid transfer neural network for wheat crop growth stages recognition using field images. Sci Rep 15, 11822 (2025). https://doi.org/10.1038/s41598-025-96332-9
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-025-96332-9