Introduction

Agriculture is one of the largest sectors and the backbone of an agrarian country’s economy. Pakistan is an agrarian country where 24% of the gross domestic product (GDP) comes from the agricultural sector. Most of the population depends on agriculture, either directly or indirectly, for its livelihood. Punjab Province has the largest cultivated area in Pakistan and the highest production of wheat and cotton. Basic crops such as wheat, sugarcane, cotton, and rice account for 75% of the total crop yield. Wheat (Triticum aestivum) is the largest cereal food crop in Pakistan because it is consumed on a daily basis. According to the FAO, Pakistan produced 21,591,400 metric tons of wheat in 2005, which made it one of the largest producers of wheat1, and production rose to between 25 and 30 million metric tons in 2012. With the growing world population and shrinking water resources, it is of great importance to be able to produce the maximum amount of food crops with the limited resources available2,3.

Crop models have potential use in crop management decisions, policy planning, and adapting to current and future climate change4,5. Crop simulation models can be used to forecast future yield based on anticipated climatic conditions, soil characteristics, and crop management practices. Different crop models have been developed around the world to simulate crop productivity in response to different crop management practices. Crop models were first used in the agriculture sector in the 1970s. Nowadays, different types of crop growth models are available, each with its own level of expertise and exposure6,7. A crop simulation model represents crop phenology, yield, and biomass using mathematical equations for crop and soil management8,9. The spatial distribution of soil properties, leaf area index, biomass, grain yield, nitrogen content, humidity, and meteorological data is often unresolved in crop models that estimate grain yield globally10,11. In the last two decades, crop management, yield prediction, and regional crop monitoring have become crucial components of the agriculture sector in order to mitigate the impacts of climate change, strengthen the economy, and boost global trade12,13,14,15.

The most commonly used crop growth models include the Decision Support System for Agrotechnology Transfer (DSSAT) model16,17, the Environmental Policy Integrated Climate (EPIC) model, and the Crop Growth Model for Legumes (CROPGRO)18. For this study, the DSSAT model was selected because of its extensive capacity to represent the growth and development of a range of crops under different ecological conditions. DSSAT is well known for its accuracy in forecasting phenological stages, crop yield, and water use (Jones et al., 2003; Hoogenboom et al., 2019). To evaluate crop phenology in a variety of agroecosystems, it incorporates a number of modules that enable the modeling of multiple management strategies and climatic conditions (Hoogenboom et al., 2019). The EPIC model is useful for policy and environmental analysis but is less effective for crop growth simulation (Williams et al., 1989), which is why DSSAT was used in this paper.

The DSSAT model was developed by the International Benchmark Sites Network for Agrotechnology Transfer (IBSNAT). It serves as a computational platform designed to simulate crop growth, development, and yield. It has been in use in over 100 countries for at least 20 years17. The model divides crop development into nine stages from pre-sowing to harvesting, aligned with thermal time dynamics19. It computes biomass accumulation as the product of intercepted photosynthetically active radiation and radiation use efficiency20.
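
The two ideas in this paragraph, thermal time and radiation-driven biomass gain, can be illustrated with a minimal sketch. It is not the DSSAT implementation: the base temperature, the extinction coefficient `k`, the radiation use efficiency `rue`, the Beer's-law interception term, and the three-day input record are all assumptions chosen for illustration only.

```python
import math


def daily_thermal_time(t_max, t_min, t_base=0.0):
    """Growing degree-days for one day (deg C day), floored at zero.

    t_base is an assumed base temperature, not a value from the paper.
    """
    return max(0.0, (t_max + t_min) / 2.0 - t_base)


def intercepted_par(par, lai, k=0.65):
    """PAR intercepted by the canopy (MJ/m2/day).

    Uses Beer's law with an assumed extinction coefficient k.
    """
    return par * (1.0 - math.exp(-k * lai))


def daily_biomass_gain(par, lai, rue=3.0, k=0.65):
    """Biomass gain (g/m2/day) = RUE (g/MJ) x intercepted PAR (MJ/m2/day).

    rue is an assumed radiation use efficiency.
    """
    return rue * intercepted_par(par, lai, k)


# Hypothetical three-day weather and canopy record.
days = [
    {"t_max": 22.0, "t_min": 8.0, "par": 7.0, "lai": 1.0},
    {"t_max": 24.0, "t_min": 10.0, "par": 8.0, "lai": 1.2},
    {"t_max": 20.0, "t_min": 6.0, "par": 6.5, "lai": 1.4},
]

thermal_time = sum(daily_thermal_time(d["t_max"], d["t_min"]) for d in days)
biomass = sum(daily_biomass_gain(d["par"], d["lai"]) for d in days)
print(f"Accumulated thermal time: {thermal_time:.1f} deg C day")
print(f"Accumulated biomass: {biomass:.1f} g/m2")
```

In a phenology model of this kind, a stage change would be triggered once the accumulated thermal time passes a stage-specific threshold; those thresholds are cultivar dependent and are not shown here.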

In the last few years, satellite remote sensing (RS) technology and crop models have been evolving at a rapid pace21,22. RS data provide spatial and temporal information that allows crop growth to be monitored over extended periods of time23,24. MODIS is a sensor onboard the Aqua and Terra satellites. MODIS images the entire globe every two days. Its spatial resolution is 250 m, 500 m, or 1000 m, depending on the band. Its spectral resolution is very high, with a total of 36 spectral bands: two bands at 250 m, five bands at 500 m, and 29 bands at 1000 m. The spectral range extends from 0.4 μm to 14.4 μm, with a swath width of 2,330 km. Among the many MODIS products, the MOD17 products (MOD17A2H and MOD17A3H) have been used to assess crop biomass. They provide gross primary production (GPP) and net primary production (NPP). GPP is the quantity of carbon captured by plants, while NPP is the amount of carbon left for plant tissues after autotrophic respiration25,26. NPP and GPP cannot be observed directly, so models are required to estimate these biophysical parameters26,27. For the estimation of leaf area index (LAI), the MODIS product MCD15A3H was used26,28,29.

Recent research focuses mainly on both the quality and quantity of wheat, aiming to produce the best-quality wheat with maximum yield to meet the growing needs of the increasing global population30. The emergence of RS has resolved many issues regarding large-area monitoring and spatial information20,31,32,33. The DSSAT-CERES model and proximal RS can be linked through a calibration approach. Satellite RS is an effective technology for monitoring crop phenology and estimating yield. Modern agricultural research uses RS technology for crop monitoring because it acquires spatial data about crop condition, such as health and growth, which is crucial for management and for forecasting crop production in the research area34.

Currently, RS technologies allow us to monitor crops both temporally and spatially24. Data collected by RS at finer spatial resolution are more accurate and contain more spatial detail35,36. On the other hand, finer spatial resolution essentially means a smaller sensor footprint and hence a longer revisit period, a problem that may be further exacerbated by the presence of clouds37. Scientists therefore still prefer coarse-resolution satellite imagery with a high temporal resolution. The accuracy of RS data depends on the scale of resolution relative to the agricultural zone37,38,39. Crop yields nowadays are highly vulnerable to climatic changes and unfavorable weather conditions. However, studies on the impact of climate change on crops in Pakistan are very few, if any. Past studies are limited to the analysis of extreme temperature and precipitation only, and none has specifically looked at their impact on crops. Therefore, our knowledge of changes in crop phenology under a wide range of possible future climatic scenarios is incomplete. This gap in our current understanding can potentially be filled by combining crop models and remote sensing data. Working crop models can then be used to predict future yield on the basis of projected climate conditions, soil, and crop management practices. The objectives of this research were (i) to forecast crop phenology and crop yield using DSSAT; (ii) to calibrate and validate the DSSAT Crop Environment Resource Synthesis (CERES) Wheat model with ground observations; (iii) to estimate LAI and biomass from satellite remote sensing and to compare and validate them against the DSSAT model; and (iv) to develop an empirical relationship between RS and crop model data in order to predict crop yield directly from RS data at the initial stages of the crop.
The study is beneficial for farmers and decision makers by providing them with an easy-to-use method to evaluate crop yields well in advance.

The research was conducted in the Sheikhupura district of Punjab province. The district is located between 31.32° and 32.07°N latitude and 73.63° and 74.67°E longitude, as shown in Fig. 1. According to the 2017 census, the population of the district is 3,460,426. The total area of the district is 3,030 sq. km. The major language of the district is Punjabi, and the district lies 209 m above sea level (Fig. 2). Two sites were selected for this research, Shahbaz farm and Rattaber farm, which are located 15 km apart. The size of each field is 2,760 sq. m. During the monsoon season, which occurs in July and August, the region receives over 70% of its yearly rainfall in the form of torrents, while the remainder falls as gentle showers during January and February from western disturbances. Summer maximum temperatures typically range from 36 to 42 °C and can occasionally reach 48 °C. The soil at Rattaber farm is a clay-silty loam, whereas the soil at Shahbaz farm is a clay loam.

Fig. 1

Source: The shapefile of Pakistan used to create this map was downloaded from https://pakistangis.org/vector-datasets/, and the Digital Elevation Model (DEM) data were downloaded from https://earthexplorer.usgs.gov/. The data were processed using QGIS version 3.4, which is open-source software downloaded from https://qgis.org/.

Geographic map of the Sheikhupura District of Punjab, Pakistan.

Fig. 2

Source: The shapefile used to create this map was downloaded from https://pakistangis.org/vector-datasets/. Values were extracted in GEE. The final map layout was processed in QGIS version 3.4, which is open-source software downloaded from https://qgis.org/.

Geographic map of the Sheikhupura District of Punjab, Pakistan, showing the mixed-pixel distribution of the area.

Dataset and methodology

Ground dataset

The temperature and precipitation data were provided by the Pakistan Meteorological Department (PMD). Of all the PMD meteorological stations, two stations in Lahore are the closest to the Sheikhupura district. The PMD data were used to check the accuracy of the satellite-based precipitation estimates and of the observed gridded dataset from the Climate Prediction Center (CPC), using the correlation coefficient and the mean absolute error. The crop data were provided by the National Agricultural Research Center (NARC). The Faisalabad 2009 cultivar was used at both farms. Other datasets include biomass, yield, amount and timing of irrigation, date of sowing, date of harvesting, and amount and timing of fertilizer applications. The soil data for both fields were provided by NARC Islamabad. They contain soil texture information (the percentages of sand, silt, and clay) and the soil’s geochemical properties, such as pH in water, the amounts of potassium, phosphorus, and nitrogen in the field, bulk density, and soil depth layer information.

Satellite data

CPC’s observed gridded data were used for temperature values at the study sites. The data contain daily maximum and minimum temperature values and represent a global, uniform gauge-based analysis of daily temperature, overseen by the National Oceanic and Atmospheric Administration (NOAA) Climate Prediction Center. The data are available from 01/01/1979 to the present; for this study, the range from 1 January 1979 to 21 April 2019 was used to generate climatic data for the study area. The spatial resolution of CPC is 0.50 × 0.50 degrees. Daily precipitation data were acquired from the Global Satellite Mapping of Precipitation (GSMaP) dataset. GSMaP is a component of the Global Precipitation Measurement (GPM) mission and offers global precipitation data. Its spatial resolution is 0.1 × 0.1 degrees, while its temporal resolution is 1 h. The hourly data were converted to daily precipitation over each 24-hour period. Two GSMaP products are available on Google Earth Engine (GEE): the reanalysis and the operational datasets. The GSMaP reanalysis data range from 1 March 2000 to 12 March 2014, while the GSMaP operational data are available from 1 March 2014 to 5 August 2019. In this study, the operational dataset was used. Solar radiation, dew point, and relative humidity data were acquired from the National Aeronautics and Space Administration (NASA) Prediction of Worldwide Energy Resources (POWER) database. These data are available from 07/01/1983 to the present, with a spatial resolution of 0.50 × 0.50 degrees. The details of the datasets used in the research are presented in Table 1.
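
The hourly-to-daily conversion mentioned above can be sketched as follows. This is an illustrative sketch rather than the script used in the study: it assumes each record is an hourly precipitation rate in mm/h stamped with its hour, and the function name `hourly_to_daily` and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def hourly_to_daily(records):
    """Sum hourly precipitation rates (mm/h) into daily totals (mm/day).

    records: iterable of (datetime, rate_mm_per_hour), one entry per hour.
    Each hourly rate is assumed to hold for the full hour it is stamped with.
    """
    totals = defaultdict(float)
    for timestamp, rate in records:
        totals[timestamp.date()] += rate * 1.0  # rate (mm/h) x 1 h
    return dict(sorted(totals.items()))


# Hypothetical sample: a wet day followed by two wet hours on the next day.
sample = [(datetime(2014, 3, 1, hour), 0.5) for hour in range(24)]
sample += [(datetime(2014, 3, 2, hour), 1.0) for hour in range(2)]

for day, total in hourly_to_daily(sample).items():
    print(day.isoformat(), f"{total:.1f} mm")
```

Days with missing hours would be under-counted by a plain sum, so a real workflow would need to decide how to treat gaps before aggregating.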

Table 1 Description of the datasets used in the research.

Methodology

In this study, the following datasets were used: weather data, soil data, and crop data. The weather data comprise daily precipitation, relative humidity, solar radiation, daily maximum and minimum temperature, and dew point temperature (Fig. 3).

CPC data were used for temperature; the data contain daily maximum and minimum temperature values. The Pearson correlation between the CPC and PMD data is 0.89. The temperature values were extracted using RStudio. The Global Satellite Mapping of Precipitation (GSMaP) dataset was used to acquire daily precipitation data, and GEE was used to extract the precipitation values. The correlation between the GSMaP and PMD data is 0.81. Relative humidity, solar radiation, and dew point data were acquired from NASA Prediction of Worldwide Energy Resources (POWER), and RStudio was used to extract their values. Daily values were extracted for each parameter, with a separate program run for each parameter to extract accurate values. The crop data were provided by the National Agricultural Research Center (NARC). The Faisalabad 2009 cultivar was used in both fields. The data contain biomass, yield, irrigation, date of sowing, date of harvesting, and fertilizer data.
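
The comparison of gridded or satellite series against station data can be illustrated with a short sketch of the Pearson correlation and the mean absolute error. The function names and the two short temperature series are hypothetical; they are not the CPC or PMD records used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(estimated, observed):
    """Mean of the absolute differences between estimates and observations."""
    if len(estimated) != len(observed) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(e - o) for e, o in zip(estimated, observed)) / len(observed)


# Hypothetical daily maximum temperatures (deg C): gridded vs. station.
gridded = [18.2, 19.5, 21.0, 22.4, 20.1]
station = [17.8, 19.9, 21.6, 23.0, 19.5]

print(f"r   = {pearson_r(gridded, station):.2f}")
print(f"MAE = {mean_absolute_error(gridded, station):.2f} deg C")
```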

The soil data include sub-surface information (depth), texture data (sand, clay, and silt fractions), bulk density, and some chemical and physical characteristics, i.e., pH, cation exchange capacity, and organic carbon. The soil data for both fields were provided by NARC Islamabad. They contain soil texture information, the percentages of sand, silt, and clay, the pH in water, the amounts of potassium, phosphorus, and nitrogen in the field, bulk density, and soil depth layer information.

Fig. 3

Methodological flow of current research.

Calibration and validation of the wheat CERES DSSAT

The DSSAT model consists of four main modules, organized as a database management system for soil, weather, genetic coefficient, and management inputs. To generate the weather data file in DSSAT, the minimum and maximum temperature, dew point temperature, precipitation, relative humidity, and solar radiation are provided to the weather data tool, which creates the weather file for the model. For the soil data file, soil depth, bulk density, pH in water, hydraulic conductivity, soil organic content, soil horizon, and soil texture data are provided to the soil data module to generate a soil file for the desired site. For the crop management data file, cultivar data, planting date, emergence date, row spacing, irrigation data, soil type, genotype data, date of maturity, harvest date, and previous crop data were provided to generate a file that was used in the DSSAT model. These crop management files were generated for each site and each year, and the DSSAT model was run for each farm.

The cultivar coefficients were adjusted to reflect historical crop management practice40. Crop simulations were carried out for the above-mentioned farms, taking some standard management practices into consideration for the simulation of wheat yield. After the simulation, the simulated and observed values were compared. The CERES wheat model uses several cultivar coefficients: G1, G2, G3, P1V, P1D, P5, and PHINT. P1V, P1D, and P5 are the parameters that control plant growth, G1, G2, and G3 are the grain-filling parameters, and PHINT is the phyllochron, the time interval between successive leaf appearances. These coefficients carry crop growth and phenology information. The DSSAT model was first calibrated and then validated for the FSD-2009 wheat variety. For calibration, the cultivar coefficients were adjusted to ensure precise simulation. At the start of calibration, the crop phenological parameters that control the crop growth stages, i.e., flowering (P1V), maturity (P1D), and other developmental stages (P5, PHINT), were adjusted using observed field data for the FSD-2009 variety. After the phenological parameters were calibrated, the growth parameters, such as the number of kernels per plant (G1, G2, and G3), were adjusted by comparing simulated biomass and crop yield with the observed data. The model was run multiple times to compare the simulated results (e.g., crop growth stages, grain yield, and biomass) with the observed data. This process was iterative: the cultivar coefficients were adjusted until the model’s predictions closely aligned with the ground data, ensuring accurate simulations.
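
The two-stage, iterative character of this calibration (phenology first, then growth) can be sketched with a simple grid search. The sketch does not call DSSAT: `toy_simulate` is a hypothetical stand-in with an invented linear response, and the coefficient grids and observed values are illustrative assumptions only.

```python
def toy_simulate(phenology_coef, growth_coef):
    """Hypothetical stand-in for a crop model run (NOT DSSAT).

    Returns (anthesis_day, grain_yield_kg_ha) from invented linear relations.
    """
    anthesis_day = 100.0 + 1.5 * phenology_coef
    grain_yield = 120.0 * growth_coef + 2.0 * anthesis_day
    return anthesis_day, grain_yield


def calibrate(observed_anthesis, observed_yield, phenology_grid, growth_grid):
    """Two-stage grid search: fit phenology first, then growth."""
    # Stage 1: choose the phenology coefficient that best matches anthesis.
    # Any growth value can be used here because the toy phenology ignores it.
    best_phenology = min(
        phenology_grid,
        key=lambda p: abs(toy_simulate(p, growth_grid[0])[0] - observed_anthesis),
    )
    # Stage 2: with phenology fixed, choose the growth coefficient that
    # best matches the observed yield.
    best_growth = min(
        growth_grid,
        key=lambda g: abs(toy_simulate(best_phenology, g)[1] - observed_yield),
    )
    return best_phenology, best_growth


# Hypothetical observations and candidate coefficient values.
phenology_grid = list(range(0, 61, 5))
growth_grid = list(range(10, 51, 5))
best = calibrate(130.0, 3860.0, phenology_grid, growth_grid)
print("Selected (phenology, growth) coefficients:", best)
```

With a real crop model each candidate requires a full model run, so the grids would typically be kept coarse at first and refined around the best values.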

The cultivar coefficient parameters of the DSSAT CERES wheat model are explained in Table 2.

Table 2 The genotype coefficients of DSSAT CERES wheat model.

Leaf area index estimation

LAI represents the one-sided leaf area per unit horizontal ground area. For LAI estimation, the MODIS level 4 product MCD15A3H.006 was used. The analysis was performed in Google Earth Engine (GEE). The data were loaded into GEE and the LAI product was selected. The range was defined for each field. The code was executed and the results were then exported as a table to Google Drive.

Biomass estimation

Biomass estimation was a crucial task. Two products were used: MOD17A2H.006 (Terra gross primary productivity) and MOD17A3H.006 (Terra net primary productivity). Both products were loaded into the GEE platform. In the algorithm, the yearly NPP product was divided by the eight-day GPP product so that NPP was also available at an eight-day time step. The results were then divided by 2.5, following the standard convention for converting vegetation biomass to dry matter, which accounts for the average moisture and carbon content of crop biomass (Wang et al., 2003). Biomass is the dry matter of vegetation, which contains carbohydrates, and the factor of 2.5 reflects the composition of crops like wheat, where about 50% of all biomass by dry weight is carbon. The values for each field were extracted in GEE and then imported as a table into MS Excel.

Model evaluation

The following measures were used to evaluate the performance of the model: coefficient of residual mass (CRM), root mean square error (RMSE), normalized root mean square error (nRMSE), coefficient of determination (R2), and index of agreement (D-index).

Coefficient of determination

The coefficient of determination (R2) is a statistical measure that quantifies the strength of the relationship between predicted and actual values. It ranges between 0 and 1, where a value of 1 indicates a strong correlation between the predicted and actual values, while a value of 0 signifies no relationship between the two datasets. It is obtained by squaring the correlation coefficient r given in Eq. 1.

$$\:r=\frac{n\left(\sum\:xy\right)-\left(\sum\:x\right)\left(\sum\:y\right)}{\sqrt{\left[n\sum\:{x}^{2}-{\left(\sum\:x\right)}^{2}\right]\:\left[n\sum\:{y}^{2}-{\left(\sum\:y\right)}^{2}\right]}}$$
(1)

Where x represents the predicted values, which are the MODIS data, and y represents the actual values. The DSSAT model values were considered as the actual values.

Root mean square error

The root mean square error measures the discrepancy between actual and predicted values. To minimize the RMSE between measured and simulated yield, adjustments were made to the cultivar coefficient parameters.

$$\text{RMSE}=\sqrt{\frac{\sum_{i=1}^{n}{\left({E}_{i}-{M}_{i}\right)}^{2}}{n}}$$
(2)

Where, “E” denotes the MODIS values, “M” shows DSSAT model values, and “n” is the number of observations.

Normalized root mean square error

The normalized root mean square error (nRMSE) quantifies the relative disparity between the simulated model output and the observed data. An nRMSE of less than 10% suggests excellent model performance, a value between 10 and 20% good performance, and a value between 20 and 30% fair performance. If the value is more than 30%, the performance is poor41,42,43.

$$nRMSE=\sqrt{\frac{\sum_{i=1}^{n}{\left({E}_{i}-{M}_{i}\right)}^{2}}{n}}\times\frac{100}{\bar{M}}$$
(3)

Where “E” denotes the MODIS values, “M” denotes the DSSAT model values, “M̄” is the mean of the DSSAT model values, and “n” is the number of observations.

Coefficient of residual mass

The coefficient of residual mass (CRM) indicates whether the model tends to overestimate or underestimate the observed data. A negative CRM indicates that the model tends to overestimate the observed data, while a positive value indicates that the model tends to underestimate the experimental data42,43,44.

$$\:CRM=1-\frac{{\sum\:}_{i=1}^{n}{E}_{i}}{{\sum\:}_{i=1}^{n}{M}_{i}}$$
(4)

Where, “E” and “M” are the same as in the previous equations.

Index of agreement (D-index)

The index of agreement describes how well the simulated values capture the variation in the observed values. The D-index ranges from 0.0 to 1.0. A value near 0 indicates that there is no agreement between the observed and simulated data, whereas a value close to 1 means that the model fully captures the dispersion in the data. Generally, a D-index of less than 0.50 indicates high inconsistency between the observed and simulated values43,44,45.

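$$D\text{-}index=1-\frac{\sum_{i=1}^{n}{\left({M}_{i}-{E}_{i}\right)}^{2}}{\sum_{i=1}^{n}{\left(\left|{E}_{i}-\bar{M}\right|+\left|{M}_{i}-\bar{M}\right|\right)}^{2}}$$
(5)

Where “E” and “M” are the same as in the previous equations, and “M̄” is the mean of the DSSAT model values.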
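
The error measures in this section can be computed with a few lines of code. The sketch below follows the conventional definitions (squared differences inside the RMSE, normalisation by the mean of the reference values, and Willmott's form of the index of agreement), with the DSSAT values treated as the reference. The function names and the sample series are hypothetical.

```python
import math


def rmse(estimated, reference):
    """Root mean square error between estimates (E) and reference values (M)."""
    n = len(reference)
    return math.sqrt(sum((e - m) ** 2 for e, m in zip(estimated, reference)) / n)


def nrmse(estimated, reference):
    """RMSE expressed as a percentage of the mean reference value."""
    mean_ref = sum(reference) / len(reference)
    return rmse(estimated, reference) * 100.0 / mean_ref


def crm(estimated, reference):
    """Coefficient of residual mass; positive means E underestimates M."""
    return 1.0 - sum(estimated) / sum(reference)


def d_index(estimated, reference):
    """Index of agreement, using the mean of the reference values."""
    mean_ref = sum(reference) / len(reference)
    numerator = sum((m - e) ** 2 for e, m in zip(estimated, reference))
    denominator = sum(
        (abs(e - mean_ref) + abs(m - mean_ref)) ** 2
        for e, m in zip(estimated, reference)
    )
    return 1.0 - numerator / denominator


# Hypothetical LAI values: satellite estimates (E) and model values (M).
satellite = [1.0, 2.0, 3.0]
model = [2.0, 2.0, 5.0]

print(f"RMSE    = {rmse(satellite, model):.3f}")
print(f"nRMSE   = {nrmse(satellite, model):.1f} %")
print(f"CRM     = {crm(satellite, model):.3f}")
print(f"D-index = {d_index(satellite, model):.3f}")
```

In this sample the satellite values sum to less than the model values, so the CRM is positive, which corresponds to underestimation by the satellite series.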

Results and discussions

In this study, the DSSAT model was first calibrated using data from one farm and then validated at the other farm. The DSSAT model outputs were statistically compared with the ground data to check their accuracy. The comparison of growing seasons shows a very high correlation between emergence and anthesis at both farms. In 2012-13, the crop emerged in 7 days at Rattaber farm and in 4 days at Shahbaz farm, while in 2013-14 emergence took 4 days at both farms. Anthesis took 158 days in 2012-13 and extended to 165 days in 2013-14 at both farms. In 2012-13, the harvest index was 0.57 at Shahbaz farm and 0.59 at Rattaber farm, so it was higher at Rattaber farm. The same pattern was observed in 2013-14, when Rattaber farm’s harvest index was again higher than that of Shahbaz farm, although the harvest index values at both farms were much higher than in the previous year (Table 3).

Table 3 Crop stages and different parameters as reported by the DSSAT model.

In this study, the maximum LAI at Rattaber farm was 9.4 m2/m2 during the 2013-14 growing season (Fig. 4), while in 2012-13 the maximum LAI was 6.9 m2/m2. At both farms, the peak LAI values were observed between 140 and 160 days. Winter crops are known to have a higher chlorophyll content in March and April because crop-growing conditions are near ideal at that time. After plant growth reaches its peak, the LAI values start decreasing as the plant dies down. After harvesting, the LAI returns to zero.

Fig. 4

LAI estimated by the DSSAT model.

Fig. 5 shows the correlation between the DSSAT and MODIS LAI estimates at Shahbaz farm. The relationship between the LAI obtained from the DSSAT model and from MODIS for the corresponding dates at Shahbaz farm in 2012-13 was plotted to visually check the co-variability and to fit a regression line (Figs. 5 and 6). Because passive sensors reach their saturation point when LAI values reach about 4 (ref. 45), only the days on which the DSSAT LAI was below 4.0 were used for this plot, where a strong correlation is observed between the satellite and model data, with an R2 of 0.90. The D-index is 0.96, a value very close to 1, indicating strong agreement between DSSAT and the satellite data. The CRM is 0.18; this positive but low value indicates a slight underestimation by the satellite data. The nRMSE is 0.37, a value much lower than 10, indicating excellent agreement between the satellite and model data and confirming minimal prediction errors.
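
The filtering and regression step described above can be sketched as follows. The sketch assumes the satellite LAI is regressed on the model LAI, which the text does not state; the function names and the paired values are hypothetical, and the threshold of 4.0 is taken from the paragraph above.

```python
def unsaturated_pairs(model_lai, satellite_lai, threshold=4.0):
    """Keep only the dates on which the model LAI is below the threshold."""
    return [(m, s) for m, s in zip(model_lai, satellite_lai) if m < threshold]


def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired LAI values on matching dates.
model_lai = [0.5, 1.0, 2.0, 3.0, 4.5, 5.0]
satellite_lai = [0.4, 0.8, 1.6, 2.4, 3.0, 3.0]

pairs = unsaturated_pairs(model_lai, satellite_lai)
slope, intercept = fit_line([m for m, _ in pairs], [s for _, s in pairs])
print(f"{len(pairs)} unsaturated dates, slope={slope:.2f}, intercept={intercept:.2f}")
```

Dropping the saturated dates keeps the plateau in the satellite signal from flattening the fitted slope.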

Similarly, the relationship was plotted for Shahbaz farm for the 2013-14 growing season (Fig. 5). The maximum LAI obtained from the satellite is 3.9 m2/m2, while for the same period the LAI estimated by DSSAT is 4.9 m2/m2. The MODIS LAI product appears to reach its saturation point much earlier, at a DSSAT LAI value of about 2.2, after which the MODIS-derived values plateau even as the DSSAT-derived LAI keeps increasing gradually to 4. This, however, could be an anomaly and may need further research, as saturation at this stage was not observed in any of the other three cases. Regardless, a strong correlation is still observed between the satellite and model data. The R2 is 0.94 and the D-index is 0.94, showing strong agreement between DSSAT and the satellite data. The CRM is 0.036; this value is positive and close to 0.0, indicating that the satellite data only slightly underestimate the model. The nRMSE is 0.43, and as it is less than 10, there is excellent agreement between the satellite and model data. The statistical values of each parameter are given in Table 4.

Fig. 5

Shows the correlation between DSSAT and MODIS LAI estimates at Shahbaz farm.

Fig. 6

Correlation between DSSAT and MODIS LAI estimates at Shahbaz farm.

At Rattaber farm for the 2013–2014 growing season, the LAI obtained from the DSSAT model and from MODIS shows a high correlation, although the spread is slightly higher than at Shahbaz farm in the same year (Fig. 7). The maximum LAI obtained from MODIS is 3.7 m2/m2, while for the same period the LAI estimated by DSSAT is 4.5 m2/m2. As at Shahbaz farm, a strong correlation is observed between the satellite and model data, with an R2 of 0.85. The D-index is 0.94; as this value is close to 1, there is strong agreement between the DSSAT and MODIS data. The CRM is 0.17; its positive value indicates that the satellite data underestimate the model. The nRMSE is 0.55, and as it is less than 10, there is excellent agreement between the satellite and model data.

The relationship between the LAI obtained from the DSSAT model and from satellite data was also plotted for Rattaber farm in 2013-14 (Fig. 8). The maximum LAI obtained from the satellite is 3.6 m2/m2, and that from DSSAT is 4.9 m2/m2. A strong correlation is observed between the satellite and model data, with an R2 of 0.83. The D-index is 0.94; as this value is close to 1, there is strong agreement between DSSAT and the satellite data. The CRM is 0.48; its positive value indicates that the satellite data underestimate the model. The nRMSE is 1.34, and as it is less than 10, there is excellent agreement between the satellite and model data.

Fig. 7

Shows the correlation between DSSAT and MODIS LAI estimates at Rattaber farm.

Fig. 8

Shows the correlation between DSSAT and MODIS LAI estimates at Rattaber farm.

Table 4 Statistical parameters for the LAI Estimation by comparison of DSSAT vs. MODIS simulation.

The biomass was obtained by running the DSSAT model for both farms (Fig. 9). The biomass estimated at Rattaber farm in 2013-14 is 11,456 kg/ha, the highest of all cases. This high biomass corresponds to the high LAI values observed at Rattaber farm during the same growing period. In 2012-13, the biomass at Rattaber farm was 8,293 kg/ha, which is lower than that of Shahbaz farm. The model shows that maximum biomass was obtained in 2013-14 at both farms.

Fig. 9

DSSAT model-based biomass from sowing to maturity.

At both Shahbaz and Rattaber farms, the biomass was also estimated from satellite data (Fig. 10) using the technique reported in the methodology section. The biomass estimated at Shahbaz farm was 2820 kg/ha and 3325 kg/ha in 2012-13 and 2013-14, respectively, while the biomass estimated at Rattaber farm was 3527 kg/ha and 4210 kg/ha in 2012-13 and 2013-14, respectively. Compared to the DSSAT-derived biomass, the satellite-derived estimates are much lower, yet the overall relationship between the two is still linear, which shows the potential for refining the method to obtain better estimates. Optical RS cannot capture changes once the LAI exceeds 4; the same applies here, as the signal saturates and biomass cannot be estimated accurately. The statistical values of each parameter are given in Table 5.

Fig. 10

MODIS-derived biomass over time.

Table 5 Statistical parameters derived by comparison of DSSAT vs. MODIS simulation.

The biomass estimated from the DSSAT model and from MODIS data is compared in Fig. 11. Crops take time to grow after sowing, and during the first 40 days the plants were too small for satellite-based estimation. Therefore, the satellite-based biomass estimation started 40 days after sowing, and only data after the end of December were used. To compute the correlation, normalized root mean square error, coefficient of residual mass, and index of agreement, DSSAT model data were selected for the same period as MODIS. The biomass estimated from the satellite was very low compared to DSSAT because the satellite products used represent accumulated carbohydrates. Accurate estimation of biomass is very difficult, which is why researchers use different types of statistical analysis for biomass estimation. Comparing the biomass obtained from DSSAT and from the satellite, the observed nRMSE was 2.3. According to42,43, performance is excellent if the nRMSE is less than 10 and poor if the nRMSE is greater than 30, so at Shahbaz farm in 2012-13 the performance is excellent. The CRM between the model data and the satellite data is 0.46. According to42,43, a positive CRM value indicates underestimation, while a negative value indicates overestimation. In this study the CRM value is positive, which means that the satellite-derived biomass underestimates the DSSAT biomass. According to43,46, a D-index value of 1.0 indicates strong agreement, while a value of 0.0 indicates no agreement. In this study, the D-index obtained by comparing the biomass estimated from satellite data and model data is 0.40, which indicates a notable divergence and inconsistency between the model predictions and the satellite-derived values. A high correlation of 0.99 was observed at Shahbaz farm in 2012-13 between the biomass estimated from DSSAT and from satellite data. The RMSE was 4072 kg/ha and the MAE was 3664 kg/ha at Shahbaz farm in 2012-13 when comparing the biomass estimated from the satellite and the crop model.

The scatterplot of biomass estimated from the DSSAT model and from RS data is shown in Fig. 12. Because plants take time to grow after sowing, biomass was estimated from the satellite from the first week of January up to the harvesting stage. MODIS data were used for the estimation of biomass, so the period selected begins forty days after sowing. During the first month the plants are very small and neither reflect nor absorb enough electromagnetic radiation to be captured by passive satellite sensors. To compute the coefficient of determination (R2), normalized root mean square error (nRMSE), coefficient of residual mass (CRM), and index of agreement (D-index), the DSSAT biomass data were selected for the same period as the MODIS data. The biomass estimated from the satellite was very low compared with DSSAT. Biomass is the dry weight of crop matter, which consists of carbohydrates (CH2O). Total biomass production is calculated by including the molecular masses of carbon, hydrogen, and oxygen in the algorithm. The biomass is divided by 2.5 in the Google Earth Engine algorithm, which corresponds to approximately 50% carbon storage. Comparing the biomass obtained from DSSAT and from the satellite, the nRMSE was 0.73. According to refs. 42,43, performance is considered excellent if the nRMSE is less than 10% and poor if it is greater than 30%. So, at Shahbaz farm in 2013-14 the model performance is excellent. The CRM between model-simulated data and remotely sensed data is 0.60. According to refs. 42,43, a positive CRM indicates that the model overestimates, while a negative value indicates that it underestimates. The positive CRM in this study therefore means that DSSAT overestimates biomass relative to the satellite data. According to refs. 43,46, a D-index of 1.0 indicates strong agreement, while a value of 0.0 indicates no agreement.
In this study the D-index between satellite-estimated and model-estimated biomass is 0.46, which indicates considerable divergence and inconsistency between the model-predicted and satellite-derived values. Compared with Shahbaz farm in 2012-13, the CRM value is higher, and stronger agreement is therefore reported at Shahbaz farm in 2013-14. The correlation between DSSAT and MODIS biomass data (R2) is 0.87, which shows a strong relationship between the two data sets. Comparing biomass estimated from the satellite and from the crop model at Shahbaz farm in 2013-14, the RMSE was 3888 kg/ha and the MAE was 3142 kg/ha.
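One way to read the factor 2.5 is as the mass ratio of carbohydrate (CH2O) to the carbon it contains. The short sketch below works through that arithmetic with standard atomic masses. The function names are hypothetical, and the direction in which the study's Earth Engine script applies the factor is not specified in the text, so the conversion helper is only an assumption about how a carbon mass would map to CH2O dry matter.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def carbon_fraction_ch2o():
    """Mass fraction of carbon in one CH2O unit."""
    molar_mass = ATOMIC_MASS["C"] + 2 * ATOMIC_MASS["H"] + ATOMIC_MASS["O"]
    return ATOMIC_MASS["C"] / molar_mass


def carbon_to_dry_matter(carbon_mass):
    """Assumed conversion: CH2O dry matter that contains the given carbon mass."""
    return carbon_mass / carbon_fraction_ch2o()


fraction = carbon_fraction_ch2o()
print(f"carbon fraction of CH2O: {fraction:.3f}")
print(f"CH2O mass per unit carbon: {1 / fraction:.2f}")
```

The carbon fraction of pure CH2O is about 0.40, so one unit of carbon corresponds to about 2.5 units of carbohydrate; the figure of roughly 50% quoted for carbon storage is a separate, rounder rule of thumb.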

Fig. 11

Correlation between DSSAT and MODIS biomass estimates at Shahbaz farm, 2012-13.

Fig. 12

Correlation between DSSAT and MODIS biomass estimates at Shahbaz farm, 2013-14.

Fig. 13

Correlation between DSSAT and MODIS biomass estimates at Rattaber farm, 2012-13.

Fig. 14

Correlation between DSSAT and MODIS biomass estimates at Rattaber farm, 2013-14.

Biomass at Rattaber farm for the 2012-13 growing season was estimated by running the DSSAT model and from RS satellite data through GEE (Fig. 13). Biomass is the amount of living matter, measured as dry weight. Carbon storage, which is a component of the ecosystem, is approximately 50% of biomass. Comparing the biomass obtained from DSSAT and from the satellite, the nRMSE was 1.62. According to refs. 42,43, performance is considered excellent if the nRMSE is less than 10%, and fair if it is greater than 20% but less than 30%. So, at Rattaber farm in 2012-13 the model performance was excellent. The CRM between model data and satellite data is 0.60. According to refs. 42,43, a positive CRM indicates that the model overestimates, while a negative value indicates that it underestimates. The positive CRM in this study therefore indicates that the model overestimates. According to refs. 43,46, a D-index of 1.0 indicates strong agreement, while a value of 0.0 indicates no agreement. In this study the D-index between satellite-estimated and model-estimated biomass is 0.55, which indicates considerable divergence and inconsistency between the model-predicted and satellite-derived values. Comparing the 2012-13 D-index values of the two farms (0.55 at Rattaber, 0.40 at Shahbaz), higher consistency is observed at Rattaber farm. A high correlation was observed at Rattaber farm in 2012-13 between biomass estimated from DSSAT and from the satellite: the R2 between the satellite estimates and DSSAT is 0.99, which shows that the RS data are strongly related to the model data. Comparing biomass estimated from the satellite and from the crop model at Rattaber farm in 2012-13, the RMSE was 3384 kg/ha and the MAE was 3019 kg/ha.
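The nRMSE thresholds quoted from refs. 42,43 can be written as a small lookup. This is a sketch only: the function name `classify_nrmse` is hypothetical, the "good" band between 10% and 20% is an assumption that fills the range the text does not name, and the handling of values exactly on a boundary is also an assumption.

```python
def classify_nrmse(nrmse_percent):
    """Map an nRMSE value in percent to a performance class."""
    if nrmse_percent < 0:
        raise ValueError("nRMSE cannot be negative")
    if nrmse_percent < 10:
        return "excellent"  # stated in the text: less than 10%
    if nrmse_percent <= 20:
        return "good"       # assumed label for the unnamed 10-20% band
    if nrmse_percent <= 30:
        return "fair"       # stated in the text: greater than 20%, less than 30%
    return "poor"           # stated in the text: greater than 30%


# nRMSE values reported for biomass at the two farms.
for value in (2.3, 0.73, 1.62, 1.76):
    print(value, classify_nrmse(value))
```

All four reported biomass nRMSE values fall in the "excellent" class under these thresholds.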

The scatterplot of biomass estimated from the DSSAT model and from RS data is shown in Fig. 14. Because plants take time to grow after sowing, biomass was estimated from the satellite from the first week of the second month after sowing up to the harvesting stage. Passive sensor data were used for the estimation of biomass, so the period selected begins forty days after sowing. During the first month the plants are very small and neither reflect nor absorb enough electromagnetic radiation to be captured by passive satellite sensors. To compute the coefficient of determination (R2), normalized root mean square error (nRMSE), coefficient of residual mass (CRM), and index of agreement (D-index), the DSSAT biomass data were selected for the same period as the MODIS data. The biomass estimated from the satellite was very low compared with DSSAT because the satellite product (NPP) represents stored biomass, that is, a pool for the storage of carbon. Biomass is the dry weight of crop matter, which consists of carbohydrates (CH2O). Total biomass production is calculated by including the molecular masses of carbon, hydrogen, and oxygen in the algorithm. The biomass is divided by 2.5 in the Google Earth Engine algorithm, which corresponds to approximately 50% carbon storage. Accurate estimation of biomass is very difficult, so researchers use different types of statistical analysis for biomass estimation. Comparing the biomass obtained from DSSAT and from the satellite, the nRMSE was 1.76. According to refs. 42,43, performance is considered excellent if the nRMSE is less than 10% and poor if it is greater than 30%. So, at Rattaber farm in 2013-14 the model performance is excellent. The CRM between model data and satellite data is 0.61. According to refs. 42,43, a positive CRM indicates that the model overestimates, while a negative value indicates that it underestimates.
The positive CRM in this study therefore means that DSSAT overestimates biomass relative to the satellite data. According to refs. 43,46, a D-index of 1.0 indicates strong agreement, while a value of 0.0 indicates no agreement. In this study the D-index between satellite-estimated and model-estimated biomass is 0.61, which indicates moderate divergence and inconsistency between the model-predicted and satellite-derived values. Compared with Rattaber farm in 2012-13, the CRM value is higher, and stronger agreement is therefore reported at Rattaber farm in 2013-14. A high correlation (R2 = 0.94) was observed at Rattaber farm in 2013-14 between biomass estimated from DSSAT and from satellite data. Comparing biomass estimated from the satellite and from the crop model at Rattaber farm in 2013-14, the RMSE was 4281 kg/ha and the MAE was 3763 kg/ha.
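The matching step described above, keeping only satellite dates at least forty days after sowing and pairing them with DSSAT output for the same dates, can be sketched as follows. The function name, the sowing date, and all values are hypothetical placeholders rather than data from the study, and treating day 40 itself as usable is an assumption.

```python
from datetime import date, timedelta


def pair_after_establishment(sowing_date, model_daily, satellite_obs, min_days=40):
    """Pair model and satellite values on shared dates from day `min_days` onward.

    `model_daily` and `satellite_obs` map datetime.date -> biomass value.
    """
    first_usable = sowing_date + timedelta(days=min_days)
    shared = sorted(
        d for d in satellite_obs if d >= first_usable and d in model_daily
    )
    model = [model_daily[d] for d in shared]
    satellite = [satellite_obs[d] for d in shared]
    return shared, model, satellite


# Placeholder inputs (hypothetical, for illustration only).
sowing = date(2012, 11, 15)
model_daily = {sowing + timedelta(days=k): 50.0 * k for k in range(150)}
satellite_obs = {
    date(2012, 12, 10): 300.0,   # 25 days after sowing: dropped
    date(2013, 1, 9): 900.0,     # 55 days after sowing: kept
    date(2013, 2, 10): 2100.0,   # 87 days after sowing: kept
}

dates, model, satellite = pair_after_establishment(sowing, model_daily, satellite_obs)
print(dates, model, satellite)
```

Pairing on shared dates keeps the two series the same length, which the agreement indices require.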

The yield simulated by the DSSAT model and the yield observed on the ground are compared in Fig. 15. At Shahbaz farm in 2012-13 the observed yield was lower than the model-simulated yield: the observed yield is 3,513 kg/ha, while the model estimates 4,251 kg/ha. The same cultivar and the same genotype coefficients were used at the same farm for the next year. In 2013-14 the yield observed at Shahbaz farm was 3,795 kg/ha, while the model simulated 3,586 kg/ha, which is less than the observed yield. The model underestimates the observed yield because heavy rainfall occurred at the end of March, during the grain-filling period; the resulting poor crop condition is why the model shows a lower yield. The DSSAT CERES-Wheat model was also run for Rattaber farm. In 2012-13 the yield simulated by the model is 4,219 kg/ha, while the yield observed on the ground is 4,134 kg/ha, so at Rattaber farm the simulated and observed yields are very close to each other. In 2013-14 at Rattaber farm the model simulated 4,000 kg/ha, while the observed yield is 4,427 kg/ha; the yield observed on the ground is thus higher than the model-predicted yield. Overall, both fields have the same size and the same irrigation timing, and the same cultivar, fertilizer and genotype coefficients were used, but the yield observed at Rattaber farm is higher than at Shahbaz farm. The soil properties of the two fields were different: the soil texture at Shahbaz farm was clay loam, while at Rattaber farm it was silty clay, which is why production at Rattaber farm exceeds that at Shahbaz farm. The nRMSE between simulated and ground yield is 0.5. According to ref. 42, model performance is considered excellent if the nRMSE is less than 10% and poor if it is greater than 30%. For this study model performance is excellent. The MAE describes the difference between actual and predicted yield.
The MAE for this study is 364.75 kg/ha, which means that the model-simulated yield differs from the actual yield by 364.75 kg/ha on average. According to refs. 42,43, a positive CRM shows a tendency of the model to overestimate, while a negative value indicates that the model underestimates. The CRM between the model and the actual yield is −0.19; overall, the model overestimates the actual data. The RMSE describes the difference between the values predicted by the model and the actual values observed on the ground. The RMSE between the DSSAT model and the actual ground data is 440.98 kg/ha.
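As a worked check, the MAE and RMSE can be recomputed from the four farm-season yields quoted in the text. The sketch below assumes the usual definitions of both indices and takes the observed Rattaber 2012-13 yield as 4,134 kg/ha, the value given in the Conclusion.

```python
import math

# Yields in kg per hectare, in the order:
# Shahbaz 2012-13, Shahbaz 2013-14, Rattaber 2012-13, Rattaber 2013-14.
observed = [3513, 3795, 4134, 4427]
simulated = [4251, 3586, 4219, 4000]

# Simulated minus observed for each farm-season.
errors = [s - o for s, o in zip(simulated, observed)]

mae = sum(abs(e) for e in errors) / len(errors)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"MAE  = {mae:.2f}")
print(f"RMSE = {rmse:.2f}")
```

This reproduces the reported MAE of 364.75 and gives an RMSE of about 440.99, which agrees with the reported 440.98 up to rounding.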

Fig. 15

Relationship between yield simulated by the DSSAT CERES-Wheat model and observed yield.

Conclusion

Leaf area index (LAI) and biomass are the most important biophysical parameters for any type of crop model. The two parameters can be assessed through satellite RS data or through crop model simulations. Satellite RS has several benefits over crop models, such as economical daily coverage of vast areas, while crop models provide more detailed information about crop growth from sowing to harvest. The two fields developed largely independently in the past, but more recently the assimilation of RS data into crop models to improve model performance has been gaining attention. RS data require ground-based observations for validation. In this study, the DSSAT crop model has been calibrated and validated at two farms in the Sheikhupura district for two growing seasons: 2012-13 and 2013-14. The model requires many data sets to simulate crop growth and calculate yield. The following inputs were used in DSSAT: daily meteorological data, soil type data, previous crop data, cultivar data and genotype data. The model was run to provide daily estimates of crop growth parameters, e.g., LAI, biomass, days to anthesis, days to maturity, harvest index and yield. To compare the performance of the crop model with satellite products, MODIS products were used for the estimation of LAI and biomass. The LAI was calculated from 20 days after sowing up to 100 days to avoid the saturation problem of the RS instruments: the LAI reaches its peak at that time, and the optical sensors cannot record LAI values beyond 4.0 m2/m2. The biomass was calculated from the net primary productivity and gross primary productivity products of the MODIS sensor. The LAI estimated by DSSAT and by satellite RS at Shahbaz farm in 2012-13 shows a very high correlation, with an R2 value of 0.90. The coefficient of residual mass (CRM) was 0.18, the nRMSE was 0.37 and the D-index was 0.96.
In 2013-14 at the same farm, the values observed are R2 = 0.43, CRM = 0.03, nRMSE = 0.43, and D-index = 0.94. The model was then validated at Rattaber farm, and similar estimates were made at that site. The performance was still good, but not on par with the performance at the calibration site, i.e. Shahbaz farm. In the next step of the research, LAI and biomass were estimated from MODIS products and compared with the model-derived values to assess the relationship between the two methods. The coefficient of determination observed in 2012-13 at Rattaber farm is R2 = 0.85, which shows good agreement between model and satellite data. The CRM was 0.17, the nRMSE was 0.55 and the index of agreement (D-index) was 0.94, indicating excellent agreement between the two data sets. In 2013-14 the values observed at Rattaber farm are R2 = 0.83, CRM = 0.03, nRMSE = 1.34, and D-index = 0.79. The maximum LAI simulated by the DSSAT CERES-Wheat model at Shahbaz farm between sowing and harvest was 8.0 m2/m2 in 2012-13 and 9.4 m2/m2 in 2013-14, which is much higher than in 2012-13. When the DSSAT CERES-Wheat model was validated for Rattaber farm, the maximum LAI for the season was 6.9 m2/m2 in 2012-13 and 7.9 m2/m2 in 2013-14. Rattaber farm therefore also shows its maximum LAI in 2013-14.

Biomass was the other parameter that was estimated from both the DSSAT model and satellite RS, and the values were compared. Excellent agreement in terms of correlation was found between them. In 2012-13 at Shahbaz farm the coefficient of determination (R2) was 0.99, the index of agreement (D-index) was 0.40, and the coefficient of residual mass (CRM) between model and satellite biomass was 0.69. In 2013-14 at the same farm, the values were R2 = 0.87, CRM = 0.60, nRMSE = 0.73, and D-index = 0.46. The same statistical indices were used for Rattaber farm to determine the relationship between satellite and crop model data. The results for 2013-14 are R2 = 0.94, CRM = 0.61, nRMSE = 1.76, and D-index = 0.53, while for 2012-13 they are R2 = 0.99, CRM = 0.60, nRMSE = 1.62, and D-index = 0.55.

From the calibration of the model, it is also noted that in 2012-13 plants at Shahbaz farm emerged four days after sowing, while at Rattaber farm, where the same cultivar was used, emergence took seven days. In 2013-14 wheat took four days to emerge at both farms. Both Shahbaz and Rattaber farms took 158 days to anthesis in 2012-13, while in 2013-14 the same fields and cultivars took 165 days to anthesis. The variation in timing is largely determined by the meteorological and soil conditions prevailing at the time.

From this study it is concluded that MODIS data products performed well in estimating LAI and biomass. At Shahbaz farm in 2012-13 the simulated yield was 4,251 kg/ha while the actual yield was 3,513 kg/ha; in 2013-14 the actual yield was 3,795 kg/ha while the model yield was 3,586 kg/ha. At Rattaber farm in 2012-13 the observed yield was 4,134 kg/ha and the simulated yield was 4,219 kg/ha, which exceeds the actual yield; in 2013-14 the actual yield was 4,427 kg/ha while the model yield was 4,000 kg/ha. This study also concludes that passive sensing saturates beyond a certain limit; for example, LAI values greater than 4 cannot be measured. It is further concluded that the DSSAT model can be used to estimate crop yield under different crop management practices. This will be of great help in optimizing irrigation practices in the field.