Variance estimation using memory type estimators based on EWMA statistic for time scaled surveys in stratified sampling

Tariq, Muhammad Umair; Qureshi, Muhammad Nouman; Alamri, Osama Abdulaziz; Iftikhar, Soofia; Alsaedi, Basim S.O.; Hanif, Muhammad

doi:10.1038/s41598-024-76953-2

Article
Open access
Published: 04 November 2024

Scientific Reports volume 14, Article number: 26700 (2024)

Abstract

In this article, we propose memory-type exponential and non-exponential estimators of the population variance based on the exponentially weighted moving average (EWMA) statistic in stratified sampling. We derive mathematical expressions for the mean square errors (MSEs) of the proposed estimators using Taylor and exponential expansions. Our analysis demonstrates that the proposed memory-type estimators outperform the conventional estimators under stratification, particularly when information from previous samples is utilized. The performance of the proposed estimators is evaluated mathematically by deriving the conditions under which the memory-type estimators perform better than their corresponding conventional estimators under stratification. Through an extensive simulation, we evaluate the performance of the proposed estimators across various population parameters, revealing their enhanced efficiency in time-scaled surveys. Additionally, a real data application supports the mathematical findings, confirming the practical utility of the proposed estimators. The results of the numerical study underscore the importance of using previously sampled information, which significantly improves the accuracy and reliability of the proposed estimators of variance for time-scaled surveys.

Introduction

Time-scaled surveys are crucial across various fields, including economics, health sciences, agriculture and environmental studies, as they provide timely data essential for tracking trends and informing decision-making. Many countries have conducted, or are currently conducting, surveys at regular time intervals. Such surveys are usually designed so that the collected information can be compared, at a broad level, with that of other countries on indicators such as gross domestic product (GDP), the rate of inflation, the real exchange rate, and fertility and mortality rates. In economics, regular collection of indicators like GDP and inflation rates permits responsive strategy adjustments. Similarly, environmental studies rely on time-scaled data to monitor ecological variation and assess impacts relevant to sustainability. In the health sciences, longitudinal data on disease prevalence support effective public health interventions. In agriculture, the yearly statistical book is a good example of time-scaled data. Many organizations, such as the U.S. Bureau of Labor Statistics, the Australian Bureau of Statistics and the National Bureau of Statistics of China, conduct surveys on agriculture, ecology, the environment, economics and health sciences on an annual, quarterly or monthly basis. For instance, the Punjab Bureau of Statistics has conducted the Multiple Indicator Cluster Survey (MICS) regularly since 2004. The Institute of Basic Medical Sciences (IBMS) conducted the China National Health Survey (CNHS), using multi-ethnic cross-sectional interviews and health examinations, from 2012 to 2017. The Pakistan Bureau of Statistics conducts the Labor Force Survey (LFS) annually. The development of efficient estimators for such surveys is vital, as accurate and timely estimates can significantly improve the efficacy of interventions and resource allocation in these critical areas.

In survey sampling, statisticians often seek to enhance the performance of estimators by using auxiliary information. Ratio-type estimators are applicable when a positive linear relationship is observed between the study variable and the auxiliary variable, whereas product-type estimators are useful for populations in which the two variables are negatively correlated. Although conventional estimators that use auxiliary information perform well for surveys based on time-scaled data, their efficiency may be improved by using memory-type estimators based on the exponentially weighted moving average (EWMA) statistic. Roberts¹ first introduced the EWMA statistic, which utilizes present and past sample information simultaneously. The EWMA statistic was initially defined for the mean as

$$Z_{j}={\lambda}Y_{j}+(1-{\lambda})Z_{j-1},\quad for\,j= 1,2,3,...,n.$$

where Y_j is the observation at time j, and Z_j and Z_j−1 denote the present and previous values of the statistic, respectively. Here $\lambda$ ($0 \leqslant \lambda \leqslant 1$) is a constant that determines the depth of the memory. The initial value Z₀ is taken as the average value of the prior sample, and n is the number of observations to be monitored, including Z₀.
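The recursion above can be illustrated with a minimal Python sketch. The function name `ewma` and the example numbers are hypothetical and chosen only for illustration; they do not come from the paper.

```python
def ewma(observations, lam, z0):
    """Return [Z_1, ..., Z_n] with Z_j = lam * Y_j + (1 - lam) * Z_{j-1}.

    z0 is the initial value Z_0 (for example, the average of a prior sample).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    z_prev = z0
    path = []
    for y in observations:
        z_prev = lam * y + (1.0 - lam) * z_prev
        path.append(z_prev)
    return path


# Hypothetical observations; lam = 0.5 weights present and past equally.
print(ewma([10.0, 12.0, 11.0], lam=0.5, z0=10.0))
```

With `lam=1.0` the function simply returns the observations, which corresponds to an estimator with no memory.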

Parameter estimation is of great significance in planning surveys. Stratified sampling provides more precise estimates than simple random sampling for populations whose units differ in their characteristics. In stratified sampling, the population is first divided into mutually exclusive groups of units with homogeneous characteristics, and random samples are then drawn from each group, resulting in a stratified sample. Recently, Rana et al.² and Bhushan et al.³,⁴ proposed efficient classes of estimators under stratified random sampling. For time-scaled surveys, Noor-ul-Amin⁵, Aslam et al.⁶, Qureshi et al.⁷, Qureshi et al.⁸ and Bhushan et al.⁹,¹⁰ proposed memory-type estimators for mean and variance estimation using the EWMA statistic.

In this study, we propose memory-type ratio, exponential ratio, product and exponential product estimators using the EWMA statistic for variance estimation under stratification. The rest of the paper is organized as follows. The sections "Sampling procedure and conventional estimators" and "Proposed memory-type estimators" describe the methodology, basic notation and sampling errors, and derive the MSEs of the proposed memory-type estimators of the population variance. The section "Mathematical comparison" compares the proposed memory-type estimators mathematically with their corresponding conventional estimators. An extensive simulation study is conducted in the section "Simulation study" to assess the performance of the proposed estimators numerically against the conventional estimators at different levels of correlation and sample sizes. Real data from the agricultural sector are considered in the section "Real data application" to confirm the findings of the simulation. Final remarks are given in the section "Conclusion".

Sampling procedure and conventional estimators

Let us consider a heterogeneous population $U=\left\{ {{U_1},{U_2},{U_3}, \ldots ,{U_N}} \right\}$ of N distinct units, partitioned into L homogeneous non-overlapping strata of sizes N_h, where $\sum\nolimits_{{h=1}}^{L} {{N_h}} =N$ for $h=1,2,\ldots,L$. We assume that Y is the study variable and X is the auxiliary variable, both defined on identifiable and distinct units of the finite population. Let a sample of $n_h$ pairs of observations on both variables be selected from the hth stratum.

It is well known that the unbiased mean estimator in stratified sampling is given as ${\bar {y}_{st}}=\sum\nolimits_{{h=1}}^{L} {{M_h}{{\bar {y}}_h}}$, where ${M_h}={N_h}/N$ is the stratum weight and ${\bar {y}_h}=\sum\nolimits_{{j=1}}^{{{n_h}}} {{y_{hj}}} /{n_h}$ is the hth stratum sample mean, while ${\overline {Y} _h}=\sum\nolimits_{{j=1}}^{{{N_h}}} {{y_{hj}}} /{N_h}$ is the hth stratum population mean.

The variance of the mean under stratified sampling is given as

$$S_{{yst}}^{2}=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)S_{{yh}}^{2}} ,$$

where $S_{{yh}}^{2}=\sum\nolimits_{{j=1}}^{{{N_h}}} {{{\left( {{y_{hj}} - {{\overline {Y} }_h}} \right)}^2}} /{N_h}$ is the hth stratum population variance.

The unbiased variance estimator of $S_{{yst}}^{2}$ is given as

$$s_{{yst}}^{2}=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)s_{{yh}}^{2}} ,$$

where $s_{{yh}}^{2}=\sum\nolimits_{{j=1}}^{{{n_h}}} {{{\left( {{y_{hj}} - {{\bar {y}}_h}} \right)}^2}} /\left( {{n_h} - 1} \right)$ is the hth stratum sample variance.

The classical ratio estimator under stratified sampling is defined as

$${t_{rst}}=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {\frac{{S_{{xh}}^{2}}}{{s_{{xh}}^{2}}}} \right)} s_{{yh}}^{2},$$

where $s_{{xh}}^{2}=\sum\nolimits_{{j=1}}^{{{n_h}}} {{{\left( {{x_{hj}} - {{\overline {x} }_h}} \right)}^2}} /\left( {{n_h} - 1} \right)$ is the hth stratum sample variance of the auxiliary variable and $S_{{xh}}^{2}$ is the corresponding hth stratum population variance, assumed known.
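As a rough illustration of the two conventional estimators, the sketch below computes the stratified variance estimator and a stratum-wise ratio estimator in plain Python. It assumes that $M_h$ is the stratum weight $N_h/N$ and that the ratio estimator has the form $\sum_h (M_h^2/n_h)(S_{xh}^2/s_{xh}^2)s_{yh}^2$, by analogy with the memory-type ratio estimator defined later in the paper. The function names and toy data are hypothetical.

```python
from statistics import variance  # sample variance with divisor n - 1


def stratified_variance(y_strata, weights):
    """s2_yst = sum_h (M_h^2 / n_h) * s2_yh."""
    return sum(m * m / len(y) * variance(y) for y, m in zip(y_strata, weights))


def ratio_estimator(y_strata, x_strata, weights, Sx2):
    """Assumed form: sum_h (M_h^2 / n_h) * (S2_xh / s2_xh) * s2_yh.

    Sx2[h] is the known population variance of X in stratum h.
    """
    total = 0.0
    for y, x, m, sx2_pop in zip(y_strata, x_strata, weights, Sx2):
        total += (m * m / len(y)) * (sx2_pop / variance(x)) * variance(y)
    return total


# Toy data for two equally weighted strata (illustrative only).
y = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
x = [[2.0, 3.0, 5.0], [1.0, 4.0, 5.0]]
print(stratified_variance(y, [0.5, 0.5]))
print(ratio_estimator(y, x, [0.5, 0.5], Sx2=[2.0, 4.0]))
```

When the sample variance of X in every stratum equals its known population value, the ratio adjustment is 1 and the two estimators coincide.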

The approximate MSE for the conventional ratio estimator under stratified sampling is

$$MSE\left( {{t_{rst}}} \right) \approx \sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}{\theta _h}\left( {{\Omega _{40 h}}+{\Omega _{04 h}} - 2{\Omega _{22 h}}} \right)S_{{yh}}^{4}} ,$$

where ${\Omega _{40 h}}={\beta _{2yh}} - 1$, ${\Omega _{04 h}}={\beta _{2xh}} - 1$, ${\Omega _{22 h}}={\mu _{22 h\left( {X,Y} \right)}} - 1$, ${\beta _{2zh}}={\mu _{4zh}}/\mu _{{2zh}}^{2}$ and ${\mu _{rzh}}=E{\left( {{z_{ih}} - {\mu _{zh}}} \right)^r}$ for z = Y and X, and ${\theta _h}=\left( {\frac{1}{{{n_h}}} - \frac{1}{{{N_h}}}} \right)$.
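The building blocks of these expressions can be computed directly from stratum data. The following sketch is illustrative only: the function names are hypothetical, the moments are taken over whatever values are passed in (for example, a full stratum), and the standardized mixed moment in `lambda22` is an assumed reading of $\mu_{22h(X,Y)}$, which the text does not spell out.

```python
def central_moment(values, r):
    """r-th central moment, mu_r = mean((z - mean(z))**r)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** r for v in values) / len(values)


def beta2(values):
    """Kurtosis coefficient beta_2 = mu_4 / mu_2**2."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


def lambda22(y, x):
    """Assumed standardized mixed moment mu_22 / (mu_20 * mu_02)."""
    my, mx = sum(y) / len(y), sum(x) / len(x)
    mu22 = sum((a - my) ** 2 * (b - mx) ** 2 for a, b in zip(y, x)) / len(y)
    return mu22 / (central_moment(y, 2) * central_moment(x, 2))


def theta(n_h, N_h):
    """theta_h = 1/n_h - 1/N_h."""
    return 1.0 / n_h - 1.0 / N_h


values = [1.0, 2.0, 3.0, 4.0, 5.0]
print(beta2(values), theta(50, 5000))
```

A useful sanity check is that `lambda22(z, z)` equals `beta2(z)`, since the mixed moment of a variable with itself is its fourth central moment.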

The EWMA statistics based on the variances of the study and auxiliary variables for the hth stratum under stratified sampling are

$${V_{yht}}=\lambda s_{{yht}}^{2}+\left( {1 - \lambda } \right){V_{yh\left( {t - 1} \right)}}.$$

and

$${W_{xht}}=\lambda s_{{xht}}^{2}+\left( {1 - \lambda } \right){W_{xh\left( {t - 1} \right)}}.$$

The mean and variance of the EWMA statistics of the study variable and the auxiliary variable for the hth stratum are, respectively,

$$E\left( {{V_{yht}}} \right)=S_{{yh}}^{2}{\text{, }}Var\left( {{V_{yht}}} \right)={\theta _h}S_{{yh}}^{4}{\Omega _{40 h}}\left( {\frac{\lambda }{{2 - \lambda }}} \right)$$

and

$$E\left( {{W_{xht}}} \right)=S_{{xh}}^{2}{\text{, }}Var\left( {{W_{xht}}} \right)={\theta _h}S_{{xh}}^{4}{\Omega _{04 h}}\left( {\frac{\lambda }{{2 - \lambda }}} \right).$$

Every survey should be preceded by a pilot study that yields reliable results, so the initial values of ${V_{yht}}$ and ${W_{xht}}$ are taken as the average values estimated from the pilot study.
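The factor $\lambda/(2-\lambda)$ in these variances can be motivated by a short sketch. It rests on two simplifying assumptions that are ours rather than the paper's: the sample variances $s_{yht}^{2}$ from successive occasions are independent with a common variance $\sigma_{h}^{2}$, and the initial value $V_{yh0}$ is treated as a fixed constant. Unrolling the recursion gives

$${V_{yht}}=\lambda \sum\nolimits_{{k=0}}^{{t - 1}} {{{\left( {1 - \lambda } \right)}^k}s_{{yh\left( {t - k} \right)}}^{2}} +{\left( {1 - \lambda } \right)^t}{V_{yh0}},$$

so that, using $1 - {\left( {1 - \lambda } \right)^2}=\lambda \left( {2 - \lambda } \right)$,

$$Var\left( {{V_{yht}}} \right)={\lambda ^2}\sigma _{h}^{2}\sum\nolimits_{{k=0}}^{{t - 1}} {{{\left( {1 - \lambda } \right)}^{2k}}} =\sigma _{h}^{2}\left( {\frac{\lambda }{{2 - \lambda }}} \right)\left[ {1 - {{\left( {1 - \lambda } \right)}^{2t}}} \right].$$

For large t the bracketed term tends to one, which leaves the multiplier $\lambda/(2-\lambda)$ applied to the variance of a single sample variance. The same argument applies to ${W_{xht}}$.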

Proposed memory-type estimators

The proposed memory-type estimators offer a significant advance over the conventional estimators by incorporating both past and current data, allowing a more comprehensive analysis of time-scaled surveys. While conventional estimators rely solely on the latest sample information, the memory-type strategy uses the EWMA framework to account for trends and patterns over time. This integration enhances the robustness of parameter estimation, particularly in dynamic environments where the data vary over time.

In this section, we have proposed memory-type ratio, product, exponential ratio and exponential product estimators based on exponentially weighted moving average statistic under stratification. The estimators are given by.

(i) Memory-type ratio estimator.

The memory-type ratio estimator based on exponentially weighted moving average statistic under stratification is suggested as.

$$t_{{rst}}^{M}=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {\frac{{S_{{xh}}^{2}}}{{{W_{xht}}}}} \right)} {V_{yht}}.$$

where $S_{{xh}}^{2}=\sum\nolimits_{{j=1}}^{{{N_h}}} {{{\left( {{x_{hj}} - {{\bar {X}}_h}} \right)}^2}} /{N_h}$ is assumed to be known population variance of hth stratum of the auxiliary variable.

To derive the expression of MSE of the proposed ratio estimator, we define the following relative sampling errors as.

${\xi _{0ht}}=\frac{{{V_{yht}}}}{{S_{{yh}}^{2}}} - 1$ or ${V_{yht}}=S_{{yh}}^{2}\left( {1+{\xi _{0ht}}} \right)$, and ${\xi _{1ht}}=\frac{{{W_{xht}}}}{{S_{{xh}}^{2}}} - 1$ or ${W_{xht}}=S_{{xh}}^{2}\left( {1+{\xi _{1ht}}} \right)$,

where

$$\left. \begin{gathered} E\left( {{\xi _{0ht}}} \right)=E\left( {{\xi _{1ht}}} \right)=0 \hfill \\ E\left( {\xi _{{0ht}}^{2}} \right)=\frac{{Var\left( {{V_{yht}}} \right)}}{{S_{{yh}}^{4}}}={\theta _h}{\Omega _{40 h}}\left( {\frac{\lambda }{{2 - \lambda }}} \right),\,\,\,E\left( {\xi _{{1ht}}^{2}} \right)=\frac{{Var\left( {{W_{xht}}} \right)}}{{S_{{xh}}^{4}}}={\theta _h}{\Omega _{04 h}}\left( {\frac{\lambda }{{2 - \lambda }}} \right) \hfill \\ \,\,E\left( {{\xi _{0ht}}{\xi _{1ht}}} \right)=\frac{{\operatorname{cov} \left( {{V_{yht}},{W_{xht}}} \right)}}{{S_{{yh}}^{2}S_{{xh}}^{2}}}={\theta _h}{\Omega _{22 h}}\left( {\frac{\lambda }{{2 - \lambda }}} \right) \hfill \\ \end{gathered} \right\rangle$$

(1)

We re-write the proposed memory-type ratio estimator in terms of $\xi$’s as

$$t_{{rst}}^{M}=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {1+{\xi _{0ht}}} \right){{\left( {1+{\xi _{1ht}}} \right)}^{ - 1}}} S_{{yh}}^{2}.$$

(2)

Simplifying and applying a Taylor series expansion up to second order, we have

$$t_{{rst}}^{M} \approx \sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {1+{\xi _{0ht}}} \right)\left( {1 - {\xi _{1ht}}+\xi _{{1ht}}^{2}} \right)} S_{{yh}}^{2}.$$

(3)

On simplification of Eq. (3), we have

$$\left( {t_{{rst}}^{M} - S_{{yst}}^{2}} \right) \approx \sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {{\xi _{0ht}} - {\xi _{1ht}} - {\xi _{0ht}}{\xi _{1ht}}+\xi _{{1ht}}^{2}} \right)S_{{yh}}^{2}} .$$

(4)

The expression of approximate MSE of estimator $t_{{rst}}^{M}$ is

$$MSE\left( {t_{{rst}}^{M}} \right) \approx \sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}\left[ {\frac{{Var\left( {{V_{yht}}} \right)}}{{S_{{yh}}^{4}}}+\frac{{Var\left( {{W_{xht}}} \right)}}{{S_{{xh}}^{4}}} - 2\frac{{\operatorname{cov} \left( {{V_{yht}},{W_{xht}}} \right)}}{{S_{{yh}}^{2}S_{{xh}}^{2}}}} \right]} S_{{yh}}^{4}.$$

(5)

Or

$$MSE\left( {t_{{rst}}^{M}} \right) \approx \sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}\left( {\frac{\lambda }{{2 - \lambda }}} \right){\theta _h}\left[ {{\Omega _{40 h}}+{\Omega _{04 h}} - 2{\Omega _{22 h}}} \right]S_{{yh}}^{4}} .$$

(6)
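The step from Eq. (4) to Eq. (5) can be spelled out as follows. This is a sketch that assumes, as the derivation implicitly does, that sampling is independent across strata, so that products of errors from different strata have zero expectation to this order. Squaring Eq. (4) and retaining terms up to second order in the $\xi$'s leaves, within the hth stratum,

$$E{\left( {{\xi _{0ht}} - {\xi _{1ht}}} \right)^2}=E\left( {\xi _{{0ht}}^{2}} \right)+E\left( {\xi _{{1ht}}^{2}} \right) - 2E\left( {{\xi _{0ht}}{\xi _{1ht}}} \right),$$

because the terms ${\xi _{0ht}}{\xi _{1ht}}$ and $\xi _{{1ht}}^{2}$ in Eq. (4) contribute only third- and higher-order terms once squared. Substituting the expectations from Eq. (1) gives

$$E{\left( {{\xi _{0ht}} - {\xi _{1ht}}} \right)^2}={\theta _h}\left( {\frac{\lambda }{{2 - \lambda }}} \right)\left[ {{\Omega _{40 h}}+{\Omega _{04 h}} - 2{\Omega _{22 h}}} \right],$$

which, multiplied by ${\left( {M_{h}^{2}/{n_h}} \right)^2}S_{{yh}}^{4}$ and summed over strata, is Eq. (6).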

(ii) Memory-type product estimator.

The memory-type product estimator based on exponentially weighted moving average statistic under stratification is suggested as.

$$t_{{pst}}^{M}=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {\frac{{{W_{xht}}}}{{S_{{xh}}^{2}}}} \right)} {V_{yht}}.$$

To derive the expression of MSE, we re-write the proposed memory-type product estimator in terms of $\xi$’s as

$$t_{{pst}}^{M}=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {1+{\xi _{0ht}}} \right)\left( {1+{\xi _{1ht}}} \right)} S_{{yh}}^{2}.$$

(7)

Expanding the product in Eq. (7), which is exact and requires no series truncation, we have

$$t_{{pst}}^{M}=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {1+{\xi _{0ht}}+{\xi _{1ht}}+{\xi _{0ht}}{\xi _{1ht}}} \right)} S_{{yh}}^{2}.$$

(8)

On simplification of Eq. (8), we have

$$\left( {t_{{pst}}^{M} - S_{{yst}}^{2}} \right)=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {{\xi _{0ht}}+{\xi _{1ht}}+{\xi _{0ht}}{\xi _{1ht}}} \right)S_{{yh}}^{2}} .$$

(9)

The expression of approximate MSE of estimator $t_{{pst}}^{M}$ is

$$MSE\left( {t_{{pst}}^{M}} \right) \approx \sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}\left[ {\frac{{Var\left( {{V_{yht}}} \right)}}{{S_{{yh}}^{4}}}+\frac{{Var\left( {{W_{xht}}} \right)}}{{S_{{xh}}^{4}}}+2\frac{{Cov\left( {{V_{yht}},{W_{xht}}} \right)}}{{S_{{yh}}^{2}S_{{xh}}^{2}}}} \right]} S_{{yh}}^{4}.$$

(10)

Or

$$MSE\left( {t_{{pst}}^{M}} \right) \approx \sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}\left( {\frac{\lambda }{{2 - \lambda }}} \right){\theta _h}\left[ {{\Omega _{40 h}}+{\Omega _{04 h}}+2{\Omega _{22 h}}} \right]S_{{yh}}^{4}} .$$

(11)

(iii) Memory-type exponential ratio estimator.

Memory-type exponential ratio estimator based on exponentially weighted moving average statistic under stratification is suggested as.

$$t_{{erst}}^{M}=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\exp \left( {\frac{{S_{{xh}}^{2} - {W_{xht}}}}{{{W_{xht}}+S_{{xh}}^{2}}}} \right)} {V_{yht}}.$$

To derive the expression of MSE, we re-write the proposed memory-type exponential ratio estimator in terms of $\xi$’s, with simplification, as

$$t_{{erst}}^{M}=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {1+{\xi _{0ht}}} \right)\exp \left( { - \frac{{{\xi _{1ht}}}}{2}{{\left\{ {1+\frac{{{\xi _{1ht}}}}{2}} \right\}}^{ - 1}}} \right)} S_{{yh}}^{2}.$$

(12)

Applying Taylor and exponential series and simplifying Eq. (12), we have

$$t_{{erst}}^{M} - S_{{yst}}^{2} \approx \sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {{\xi _{0ht}} - \frac{{{\xi _{1ht}}}}{2} - \frac{{{\xi _{0ht}}{\xi _{1ht}}}}{2}+\frac{{3\xi _{{1ht}}^{2}}}{8}} \right)} S_{{yh}}^{2}.$$

(13)
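The passage from Eq. (12) to Eq. (13) can be sketched in three steps, keeping terms up to second order throughout. First, the exponent in Eq. (12) arises from $\frac{{S_{{xh}}^{2} - {W_{xht}}}}{{{W_{xht}}+S_{{xh}}^{2}}}=\frac{{ - {\xi _{1ht}}}}{{2+{\xi _{1ht}}}}$, and expanding the inverse gives

$$ - \frac{{{\xi _{1ht}}}}{2}{\left( {1+\frac{{{\xi _{1ht}}}}{2}} \right)^{ - 1}} \approx - \frac{{{\xi _{1ht}}}}{2}+\frac{{\xi _{{1ht}}^{2}}}{4}.$$

Second, applying $\exp \left( u \right) \approx 1+u+{u^2}/2$ with $u= - {\xi _{1ht}}/2+\xi _{{1ht}}^{2}/4$ gives

$$\exp \left( u \right) \approx 1 - \frac{{{\xi _{1ht}}}}{2}+\frac{{\xi _{{1ht}}^{2}}}{4}+\frac{{\xi _{{1ht}}^{2}}}{8}=1 - \frac{{{\xi _{1ht}}}}{2}+\frac{{3\xi _{{1ht}}^{2}}}{8}.$$

Third, multiplying by $\left( {1+{\xi _{0ht}}} \right)$ and dropping third-order terms yields

$$\left( {1+{\xi _{0ht}}} \right)\exp \left( u \right) \approx 1+{\xi _{0ht}} - \frac{{{\xi _{1ht}}}}{2} - \frac{{{\xi _{0ht}}{\xi _{1ht}}}}{2}+\frac{{3\xi _{{1ht}}^{2}}}{8},$$

and subtracting 1 leaves the bracket in Eq. (13).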

Retaining only the first-order terms in Eq. (13), then squaring both sides and taking expectations, we have

$$E{\left( {t_{{erst}}^{M} - S_{{yst}}^{2}} \right)^2} \approx \sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}E\left( {\xi _{{0ht}}^{2}+\frac{{\xi _{{1ht}}^{2}}}{4} - {\xi _{0ht}}{\xi _{1ht}}} \right)} S_{{yh}}^{4}.$$

(14)

Evaluating the expectations in Eq. (14), the approximate MSE of the estimator $t_{{erst}}^{M}$ is

$$MSE\left( {t_{{erst}}^{M}} \right) \approx \sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}\left[ {\frac{{Var\left( {{V_{yht}}} \right)}}{{S_{{yh}}^{4}}}+\frac{{Var\left( {{W_{xht}}} \right)}}{{4S_{{xh}}^{4}}} - \frac{{cov\left( {{V_{yht}},{W_{xht}}} \right)}}{{S_{{yh}}^{2}S_{{xh}}^{2}}}} \right]} S_{{yh}}^{4}.$$

(15)

Or

$$MSE\left( {t_{{erst}}^{M}} \right) \approx \sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}\left( {\frac{\lambda }{{2 - \lambda }}} \right){\theta _h}\left[ {{\Omega _{40 h}}+0.25\,{\Omega _{04 h}} - {\Omega _{22 h}}} \right]S_{{yh}}^{4}} .$$

(16)

(iv) Memory-type exponential product estimator.

Memory-type exponential product estimator based on exponentially weighted moving average statistic under stratification is suggested by.

$$t_{{epst}}^{M}=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)exp\left( {\frac{{{W_{xht}} - S_{{xh}}^{2}}}{{{W_{xht}}+S_{{xh}}^{2}}}} \right)} {V_{yht}}.$$

To derive the expression of MSE, we re-write the proposed memory-type exponential product estimator in terms of $\xi$’s as

$$t_{{epst}}^{M}=\sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {1+{\xi _{0ht}}} \right)\exp \left( {\frac{{{\xi _{1ht}}}}{2}{{\left\{ {1+\frac{{{\xi _{1ht}}}}{2}} \right\}}^{ - 1}}} \right)} S_{{yh}}^{2}.$$

(17)

Applying Taylor and exponential series and simplifying Eq. (17), we have

$$t_{{epst}}^{M} - S_{{yst}}^{2} \approx \sum\nolimits_{{h=1}}^{L} {\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)\left( {{\xi _{0ht}}+\frac{{{\xi _{1ht}}}}{2}+\frac{{{\xi _{0ht}}{\xi _{1ht}}}}{2} - \frac{{\xi _{{1ht}}^{2}}}{8}} \right)} S_{{yh}}^{2}.$$

(18)

Retaining only the first-order terms in Eq. (18), then squaring both sides and taking expectations, we have

$$E{\left( {t_{{epst}}^{M} - S_{{yst}}^{2}} \right)^2} \approx \sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}E\left( {\xi _{{0ht}}^{2}+\frac{{\xi _{{1ht}}^{2}}}{4}+{\xi _{0ht}}{\xi _{1ht}}} \right)} S_{{yh}}^{4}.$$

(19)

Evaluating the expectations in Eq. (19), the approximate MSE of the estimator $t_{{epst}}^{M}$ is

$$MSE\left( {t_{{epst}}^{M}} \right) \approx \sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}\left[ {\frac{{Var\left( {{V_{yht}}} \right)}}{{S_{{yh}}^{4}}}+\frac{{Var\left( {{W_{xht}}} \right)}}{{4S_{{xh}}^{4}}}+\frac{{cov\left( {{V_{yht}},{W_{xht}}} \right)}}{{S_{{yh}}^{2}S_{{xh}}^{2}}}} \right]} S_{{yh}}^{4}.$$

(20)

Or

$$MSE\left( {t_{{epst}}^{M}} \right) \approx \sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}\left( {\frac{\lambda }{{2 - \lambda }}} \right){\theta _h}\left[ {{\Omega _{40 h}}+0.25\,{\Omega _{04 h}}+{\Omega _{22 h}}} \right]S_{{yh}}^{4}} .$$

(21)
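The four estimators defined in this section can be computed together once the EWMA statistics are available. The sketch below is illustrative: the function name, argument names and dictionary keys are hypothetical, and the inputs are assumed to be per-stratum lists of $V_{yht}$, $W_{xht}$, the known $S_{xh}^{2}$, the weights $M_h$ and the sample sizes $n_h$.

```python
import math


def memory_type_estimators(V, W, Sx2, weights, sizes):
    """Return the four memory-type estimators as a dict.

    V[h], W[h]  : EWMA statistics of the y- and x-variances in stratum h
    Sx2[h]      : known population variance of X in stratum h
    weights[h]  : M_h; sizes[h]: n_h
    """
    est = {"ratio": 0.0, "product": 0.0, "exp_ratio": 0.0, "exp_product": 0.0}
    for v, w, sx2, m, n in zip(V, W, Sx2, weights, sizes):
        coef = m * m / n                    # M_h^2 / n_h
        e = (sx2 - w) / (sx2 + w)           # exponent of the exponential ratio form
        est["ratio"] += coef * (sx2 / w) * v
        est["product"] += coef * (w / sx2) * v
        est["exp_ratio"] += coef * math.exp(e) * v
        est["exp_product"] += coef * math.exp(-e) * v  # (W - S2x) / (W + S2x)
    return est


# Hypothetical inputs for two strata.
print(memory_type_estimators(V=[98.0, 104.0], W=[120.0, 131.0],
                             Sx2=[125.0, 125.0], weights=[0.5, 0.5],
                             sizes=[25, 25]))
```

When $W_{xht}$ equals $S_{xh}^{2}$ in every stratum, all four adjustments reduce to 1 and the estimators coincide with the weighted sum of the $V_{yht}$.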

Mathematical comparison

In this section, we obtain the condition under which the proposed memory-type estimators perform better than the conventional estimators. We consider the memory-type and conventional ratio estimators for the comparison:

$$MSE\left( {t_{{rst}}^{M}} \right)<MSE\left( {{t_{rst}}} \right)$$

which implies

$$\sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}\left( {\frac{\lambda }{{2 - \lambda }}} \right){\theta _h}S_{{yh}}^{4}\left[ {{\Omega _{40 h}}+{\Omega _{04 h}} - 2{\Omega _{22 h}}} \right]} <\sum\nolimits_{{h=1}}^{L} {{{\left( {\frac{{M_{h}^{2}}}{{{n_h}}}} \right)}^2}{\theta _h}S_{{yh}}^{4}\left( {{\Omega _{40 h}}+{\Omega _{04 h}} - 2{\Omega _{22 h}}} \right)}$$

Or

$$\left( {\frac{\lambda }{{2 - \lambda }}} \right)<1$$

(22)

For $0<\lambda <1$, condition (22) holds, so the proposed memory-type estimators are more efficient than the conventional estimators under stratification. For $\lambda =1$, the memory-type and conventional estimators perform identically. Condition (22) also holds for the suggested memory-type sample variance, product, exponential ratio and exponential product estimators compared with the conventional sample variance, product, exponential ratio and exponential product estimators, respectively, for time-scaled surveys. The theoretical evaluations demonstrate that our estimators achieve lower MSE than classical methods, providing more reliable and stable estimates. The proposed estimators are particularly suited to improving estimation accuracy in time-scaled data studies by effectively leveraging past observations.
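Condition (22) is easy to inspect numerically. The short sketch below, with a hypothetical helper name, evaluates the multiplier $\lambda/(2-\lambda)$, which is the ratio of the approximate memory-type MSE to the approximate conventional MSE, for a few illustrative values of $\lambda$.

```python
def memory_factor(lam):
    """Multiplier lam / (2 - lam) appearing in the memory-type MSE expressions."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam / (2.0 - lam)


# Illustrative weights; the factor stays below 1 until lam reaches 1.
for lam in (0.1, 0.25, 0.5, 0.75, 1.0):
    print(lam, round(memory_factor(lam), 4))
```

For example, $\lambda = 0.5$ gives a factor of one third, so under the first-order approximation the memory-type MSE is one third of the conventional MSE. The factor increases with $\lambda$ and reaches 1 at $\lambda = 1$.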

Simulation study

To supplement the theoretical approximations of the proposed estimators' properties, we conducted a comprehensive simulation study using artificial populations generated with the “mvtnorm” package in R¹¹. The relative efficiency (RE) and MSE of the proposed and conventional estimators are computed using the following formulas:

$$RE\left( {{t_i},s_{{yst}}^{2}} \right)=\frac{{Var\left( {s_{{yst}}^{2}} \right)}}{{MSE\left( {{t_i}} \right)}},$$

where $MSE\left( {{t_i}} \right)=\frac{1}{{50000}}\sum\limits_{1}^{{50000}} {\left( {{t_i} - S_{{yst}}^{2}} \right)^2}$ and ${t_i}=s_{{yst}}^{2},\,t_{{rst}}^{M},\,t_{{erst}}^{M},\,t_{{pst}}^{M},\,t_{{epst}}^{M},\,{t_{rst}},\,{t_{erst}},\,{t_{pst}}$ and ${t_{epst}}$.

The MSE of each t_i is computed over 50,000 replications for samples of sizes $n=50, 100, 200, 300$ and $500$. The MSE and RE are calculated for different levels of correlation, $\rho_{YX}=0.75, 0.80, 0.85, 0.90$ and $0.95$, along with different weight constants.
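The two performance measures can be sketched as small helper functions. The names are hypothetical, and the sketch makes one labeled assumption: the numerator $Var(s_{yst}^{2})$ is estimated by the empirical mean squared deviation of the baseline estimates from the target, which matches the variance when the baseline estimator is unbiased.

```python
def mc_mse(estimates, target):
    """Monte Carlo MSE: average squared deviation of the estimates from the target."""
    return sum((t - target) ** 2 for t in estimates) / len(estimates)


def relative_efficiency(baseline_estimates, estimates, target):
    """RE of an estimator relative to a baseline (values above 1 favor the estimator).

    Assumption: the baseline's variance is approximated by its Monte Carlo MSE.
    """
    return mc_mse(baseline_estimates, target) / mc_mse(estimates, target)


# Toy replicated estimates around a target of 2.0 (illustrative only).
print(mc_mse([1.0, 3.0], 2.0))
print(relative_efficiency([0.0, 4.0], [1.0, 3.0], 2.0))
```

In a full study, each list would hold one value per replication, for example 50,000 values per estimator and setting.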

In this simulation, we also varied the sampling ratio by adjusting the sample size relative to the population size, which allowed us to assess the impact of different sampling ratios on the performance of the proposed variance estimators.

The MSE and RE of the proposed memory-type estimators are computed using the following algorithm:

1. A bivariate population of size N (= 10,000) with two strata is generated from normal distributions. The first stratum, of size N₁ (= 5000), is generated with $\mu_{YX}=\left[\begin{array}{cc}5&6\end{array}\right]$, $S_{y}^{2}=100$, $S_{x}^{2}=125$ and $\rho_{YX}$, and the second stratum, of size N₂ (= 5000), is generated with $\mu_{YX}=\left[\begin{array}{cc}50&60\end{array}\right]$, $S_{y}^{2}=100$, $S_{x}^{2}=125$ and $\rho_{YX}$.
2. Select the values of the weight constant $\lambda$ from each stratum using SRS.
3. Fifty thousand replicates are generated using SRS without replacement from each stratum, and the proposed memory-type and conventional estimators are computed for each sample by using the EWMA statistic under stratification.
4. Finally, the MSE and RE of each estimator are calculated and presented in Table-1 to Table-6.
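The steps above can be sketched in Python as a scaled-down stand-in for the R simulation. Several choices here are assumptions made for illustration and are not taken from the paper: the population is reduced to 2,000 units per stratum, only 300 replications are run, each replication uses five survey occasions, the EWMA statistics are initialized at the true stratum variances as a stand-in for pilot-study values, and only the ratio estimators are compared. All function and parameter names are hypothetical.

```python
import math
import random
from statistics import variance


def make_stratum(rng, size, mu_y, mu_x, var_y, var_x, rho):
    """Generate (y, x) pairs from a bivariate normal using two independent normals."""
    sy, sx = math.sqrt(var_y), math.sqrt(var_x)
    units = []
    for _ in range(size):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mu_x + sx * z1
        y = mu_y + sy * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        units.append((y, x))
    return units


def pop_var(values):
    """Population variance with divisor N."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def simulate(lam, rho=0.9, n_h=25, occasions=5, reps=300, seed=1):
    """Return (MSE of conventional ratio, MSE of memory-type ratio)."""
    rng = random.Random(seed)
    strata = [make_stratum(rng, 2000, 5.0, 6.0, 100.0, 125.0, rho),
              make_stratum(rng, 2000, 50.0, 60.0, 100.0, 125.0, rho)]
    N = sum(len(s) for s in strata)
    coef = [(len(s) / N) ** 2 / n_h for s in strata]      # M_h^2 / n_h
    Sy2 = [pop_var([u[0] for u in s]) for s in strata]
    Sx2 = [pop_var([u[1] for u in s]) for s in strata]
    target = sum(c * s for c, s in zip(coef, Sy2))        # S2_yst
    se_conv = se_mem = 0.0
    for _rep in range(reps):
        t_conv = t_mem = 0.0
        for h, units in enumerate(strata):
            V, W = Sy2[h], Sx2[h]   # assumed initial values (pilot-study stand-in)
            for _t in range(occasions):
                sample = rng.sample(units, n_h)           # SRS without replacement
                s2y = variance([u[0] for u in sample])
                s2x = variance([u[1] for u in sample])
                V = lam * s2y + (1.0 - lam) * V
                W = lam * s2x + (1.0 - lam) * W
            t_conv += coef[h] * (Sx2[h] / s2x) * s2y      # uses the last occasion only
            t_mem += coef[h] * (Sx2[h] / W) * V
        se_conv += (t_conv - target) ** 2
        se_mem += (t_mem - target) ** 2
    return se_conv / reps, se_mem / reps


mse_conv, mse_mem = simulate(lam=0.2)
print("conventional:", mse_conv, "memory-type:", mse_mem)
```

Setting `lam=1.0` makes the memory-type estimator identical to the conventional one in this sketch, which mirrors the remark that the two coincide when no weight is given to past information.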

The second bivariate population is generated using negative covariances in each stratum, with the same parameters, to assess the proposed memory-type estimators at different levels of negative correlation, $\rho_{YX}=-0.75, -0.80, -0.85, -0.90$ and $-0.95$, along with different weight constants.

Table 1 MSE of memory-type ratio and conventional ratio estimators.

Table 2 RE of memory-type ratio and conventional ratio estimators.

Table 3 MSE of memory-type exponential ratio and conventional exponential ratio estimators.

Table 4 RE of memory-type exponential ratio and conventional exponential ratio estimators.

Table 5 MSE of memory-type product, exponential product and conventional product and exponential product estimators.

Table 6 RE of memory-type product, exponential product and conventional product and exponential product estimators.

In this simulation, we considered populations suited to assessing the performance of the proposed memory-type estimators against their corresponding conventional estimators under stratification. Two types of population are considered, one for the ratio-type and one for the product-type estimators. From the results summarized in Table-1 and Table-3, the MSE of the proposed memory-type ratio and exponential ratio estimators is smaller than that of the conventional ratio and exponential ratio estimators when the study and auxiliary variables are positively correlated. Correspondingly, the proposed memory-type product and exponential product estimators have smaller MSE than the conventional product and exponential product estimators for negatively correlated populations. The relative efficiencies of the memory-type ratio and exponential ratio estimators are higher than those of their corresponding conventional estimators, t_rst and t_erst, as shown in Table-2 and Table-4. Similarly, the relative efficiencies of the proposed memory-type product and exponential product estimators are greater than those of their corresponding conventional product and exponential product estimators. As far as ρ_yx is concerned, the MSE of the proposed ratio and exponential ratio estimators decreases as ρ_yx approaches one, which shows that the use of the auxiliary variable enhances the efficiency of the proposed memory-type ratio and exponential ratio estimators. Similarly, the proposed product and exponential product estimators perform well as ρ_yx approaches -1. The efficiency of the proposed estimators also increases as the sample size increases from 50 to 500. The weight λ assigned to current and previous values likewise has a great impact on the efficiency of the proposed estimators, as shown in Table-2, Table-4 and Table-6: assigning a larger weight to prior values (that is, a smaller λ) improves the efficiency of the proposed estimators. For λ = 1, the efficiency of the proposed estimators based on the EWMA statistic is the same as that of the conventional estimators.

The simulation results demonstrate that the empirical bias and MSE closely align with the theoretical approximations across different population settings and sample sizes. This consistency reinforces the reliability of our theoretical derivations. The results also indicate that the proposed estimators perform consistently across a range of sampling ratios, validating their robustness in different finite population scenarios.

Real data application

To compare the efficiency of the proposed and conventional estimators, we consider real data from a time-scaled survey, taken from the Agricultural Census Wing, Pakistan Bureau of Statistics¹¹. Twenty-eight years (1981-82 to 2008-09) of wheat production (in tons) are taken as the study variable (Y), and the area of cultivation (in hectares) is used as the auxiliary variable (X). The data are divided into four strata, the provinces of Pakistan: Punjab, Sindh, Khyber Pakhtunkhwa (KPK) and Baluchistan, as given in Table-7.

Table 7 Production and cultivated area of wheat in Pakistan.

One hundred samples of size 25 were drawn randomly from the above population, and the relative efficiencies of the proposed and conventional ratio and exponential ratio estimators were computed and are summarized in Table-8. The overall average relative efficiency of the proposed and conventional estimators considered in this paper is given in Table-9.

Table 8 RE of classical and memory-type ratio and exponential ratio estimators for Real Data.

Table 9 The overall average RE of classical and memory-type ratio and exponential ratio estimators for Real Data.

On comparing the estimates, the proposed memory-type estimators are found to be more efficient than the corresponding conventional estimators for every sample. Moreover, the memory-type ratio estimator performs better than the memory-type exponential ratio estimator, since the correlation coefficient between the study and auxiliary variables is high (${\rho _{YX}}$ = 0.97). We have also computed the average estimates over the 100 samples, which show the noteworthy performance of the proposed estimators. The superior performance of the proposed memory-type estimators over the commonly used conventional estimators under stratified sampling is evident from the fact that the conventional estimators show more variation than the proposed memory-type estimators for time-scaled data.

Conclusion

Survey research aims to provide methods for estimating different population parameters under different designs with the least MSE, and many authors have developed such methods. These methods are useful for estimating population parameters using auxiliary variable(s), but they rely only on the current sample information. In this study, we have suggested memory-type ratio, product, exponential ratio and exponential product variance estimators, which utilize current as well as previous sample information under stratified sampling. We derived the MSE expressions of the proposed memory-type estimators using Taylor and exponential series. The condition under which the proposed estimators perform better than the competing estimators is also obtained. The results summarized in the tables show that the proposed estimators are more efficient than their corresponding conventional estimators. Consequently, on the basis of the numerical findings from the simulation and the real data application, the proposed memory-type estimators based on the EWMA statistic are more efficient and useful in practice for estimating the population variance under stratification in time-scaled surveys.

In this research, we have considered only a single auxiliary variable for variance estimation in time-scaled surveys under stratification. In future work, we will incorporate multiple auxiliary variables in the same setting to improve the efficiency of the estimators. We will also account for non-response and measurement errors simultaneously.

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

References

Roberts, S. W. Control chart tests based on geometric moving averages. Technometrics. 1 (3), 239–250. https://doi.org/10.1080/00401706.1959.10489860 (1959).
Rana, Q., Qureshi, M. N. & Hanif, M. Generalized estimator for population mean using auxiliary attribute in stratified two-phase sampling. J. Stat. Theory Appl. 21, 44–57. https://doi.org/10.1007/s44199-022-00040-6 (2022).
Bhushan, S., Kumar, A. & Singh, S. Some efficient classes of estimators under stratified sampling. Commun. Stat. - Theory Methods. 52 (6), 1767–1796 (2023a).
Bhushan, S., Kumar, A., Lone, S. A., Anwar, S. & Gunaime, N. M. An efficient class of estimators in stratified random sampling with an application to real data. Axioms. 12, 576 (2023b).
Noor-Ul-Amin, M. Memory type estimators of population mean using exponentially weighted moving averages for time scaled surveys. Commun. Stat. - Theory Methods. 50 (12), 2747–2758. https://doi.org/10.1080/03610926.2019.1670850 (2019).
Aslam, I., Amin, M. N., Yasmeen, U. & Hanif, M. Memory type ratio and product estimators in stratified sampling. J. Reliab. Stat. Stud. 13 (1), 1–20. https://doi.org/10.13052/jrss0974-8024.1311 (2020).
Qureshi, M. N., Tariq, M. U. & Hanif, M. Memory-type ratio and product estimators for population variance using exponentially weighted moving averages for time-scaled surveys. Commun. Stat. - Simul. Comput. 53 (3), 1484–1493. https://doi.org/10.1080/03610918.2022.2050390 (2022).
Qureshi, M. N. et al. Memory-type variance estimators using exponentially weighted moving average statistic in presence of measurement error for time-scaled surveys. PLoS ONE. 18 (11), e0277697. https://doi.org/10.1371/journal.pone.0277697 (2023).
Bhushan, S., Kumar, A., Al-Omari, A. I. & Alomani, G. A. Mean estimation for time-based surveys using memory-type logarithmic estimators. Mathematics. 11, 2125 (2022a).
Bhushan, S., Kumar, A., Alrumayh, A., Khogeer, H. A. & Onyango, R. Evaluating the performance of memory type logarithmic estimators using simple random sampling. PLoS ONE. 17 (12), e0278264 (2022b).
Genz, A. & Bretz, F. mvtnorm: Multivariate normal and t distributions (Version 1.1-4). https://CRAN.R-project.org/package=mvtnorm (2023).
Pakistan Bureau of Statistics. Agricultural Census 2010 - Pakistan Report (Islamabad, 2012).

Acknowledgements

The authors are grateful to the Editor-in-Chief, Associate Editor, the anonymous reviewers, and the Editorial Support team at Scientific Reports for their thorough review, critical comments, and assistance, which have significantly improved this article.

Author information

Authors and Affiliations

Department of Statistics, National College of Business Administration and Economics, Lahore, Pakistan
Muhammad Umair Tariq & Muhammad Hanif
School of Statistics, University of Minnesota, Minneapolis, MN, USA
Muhammad Nouman Qureshi
Department of Statistics, University of Tabuk, Tabuk, Saudi Arabia
Osama Abdulaziz Alamri & Basim S.O. Alsaedi
Department of Statistics, Shaheed Benazir Bhutto Women University, Peshawar, Pakistan
Soofia Iftikhar

Contributions

All authors contributed equally.

Corresponding author

Correspondence to Muhammad Nouman Qureshi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Tariq, M.U., Qureshi, M.N., Alamri, O.A. et al. Variance estimation using memory type estimators based on EWMA statistic for time scaled surveys in stratified sampling. Sci Rep 14, 26700 (2024). https://doi.org/10.1038/s41598-024-76953-2

Received: 23 May 2024
Accepted: 17 October 2024
Published: 04 November 2024
Version of record: 04 November 2024
DOI: https://doi.org/10.1038/s41598-024-76953-2
