Improved exponential type ratio estimator in double sampling for stratification

Gupta, Anurag; Tailor, Rajesh; Barod, Nitu

doi:10.1038/s41598-023-49772-0


Article
Open access
Published: 18 December 2023


Scientific Reports volume 13, Article number: 22520 (2023)


Abstract

The objective of this research is to create a chain-ratio-type exponential estimator in order to estimate the finite population mean in double sampling for stratification. An estimator for population mean has been constructed based on the concept of chain-ratio estimators. The constructed estimator is compared to the standard unbiased estimator, as well as the other relevant existing estimators and conditions are shown to yield better results in terms of efficiency. To support the theoretical results the study has been done on both natural as well as simulated populations.


Introduction

Stratified random sampling is a commonly used approach in sampling. In recent years, stratified sampling estimators have advanced considerably, with particular attention to techniques such as L-moments. L-moments, an extension of conventional moments, offer a robust way to characterize the shape and scale of probability distributions. Applied to stratified sampling, they describe the data distribution within strata and so support more accurate and efficient estimation of population parameters, especially when the data are non-normal or follow complex distribution patterns (see, for instance, Hosking¹ and Shahzad et al.²).

Stratified random sampling is particularly used when there is prior knowledge about the sampling frame and strata weights. However, in many situations, obtaining up-to-date information on strata weights can be challenging due to the addition or deletion of units to the population. For instance, studying the socio-economic status of people in a particular region becomes difficult and expensive due to factors like immigration, emigration, and other demographic changes that affect the strata sizes and, consequently, the strata weights.

To address this issue, double sampling for stratification is often employed as an alternative to stratified random sampling. In double sampling for stratification, a large sample is initially selected, which is then divided into homogeneous strata to estimate the strata weights. From each stratum, a sample is selected using simple random sampling without replacement, and both study and auxiliary variables are observed.

Double sampling for stratification is a widely used sampling design in forest and resource inventory, particularly in forest ecosystems. For instance, Lam et al.³ applied double sampling for stratification in monitoring sparse tree populations in Chinese forests. This approach is cost-effective and robust.

The concept of double sampling traces its origins to Neyman⁴, who first developed it to gather data on strata weights in stratified sampling. Rao⁵ extended its application to address non-response issues and analytical comparisons. Ige and Tripathi⁶ proposed alternative sampling strategies based on double sampling for stratification (DSS), utilizing auxiliary information from the first-phase sample in both survey design and estimation.

This work led the way for Singh and Vishwakarma⁷ to introduce a general procedure for estimating population means using double sampling for stratification and auxiliary information. Tailor et al.⁸ built upon the foundation laid by Ige and Tripathi⁶ by exploring ratio-type and product-type exponential estimators.

For further research in this field, readers are encouraged to explore the papers by Tailor and Lone⁹, Singh and Nigam¹⁰, Gupta and Tailor¹¹, Lone et al.¹², and Verma et al.¹³.

Previous research has focused on classical ratio and product estimators for population mean in double sampling for stratification. Motivated by the aforementioned studies, this research introduces a novel approach by developing a chain ratio-type exponential estimator for estimating the population mean in double sampling for stratification. By exploring this new estimator, we aim to contribute to the existing literature and provide an alternative method for population mean estimation in double sampling for stratification.

Procedure for double sampling for stratification and notations

Suppose $U = \left( {U_{1} ,U_{2} , \ldots ,U_{N} } \right)$ is a finite population of N units divided into L strata, where the hth stratum contains $N_{h}$ units and has weight $W_{h} = \frac{{N_{h} }}{N},\,\left( {h = 1,2,3, \ldots ,L} \right)$. The strata weights of the population U are unknown, so double sampling for stratification is used.

Procedure for double sampling for stratification:

a. In the first phase, a sample S of size $n^{\prime }$ is drawn using simple random sampling without replacement, and the auxiliary variables x and z are recorded.
b. The sample is then divided into L strata based on the observed variables x and z. Let $n_{h}^{\prime }$ denote the number of first-phase units falling in stratum h $\left( {h = 1,2,3, \ldots ,L} \right)$, such that $n^{\prime } = \sum\nolimits_{h = 1}^{L} {n_{h}^{\prime } }$.
c. From the $n_{h}^{\prime }$ units of stratum h, a subsample of size $n_{h} = v_{h} n_{h}^{\prime }$ is drawn, where $0 < v_{h} < 1,\left( {h = 1,2,3, \ldots ,L} \right)$. The predetermined sampling fractions $v_{h}$ fix the subsample size $n_{h}$ in each stratum. The combined second-phase sample S′ has total size $n = \sum\limits_{h = 1}^{L} {n_{h} }$. In S′, both the study variable y and the auxiliary variables x and z are observed.

Here y is the study variable, x and z are the first and second auxiliary variables respectively, and $\overline{Y},\,\,\overline{X}\,$ and $\,\overline{Z}\,$ are the population means of y, x, and z respectively

where $\overline{Y} = \frac{1}{N}\sum\nolimits_{h = 1}^{L} {\sum\nolimits_{i = 1}^{{N_{h} }} {y_{hi} } }$, $\overline{X} = \frac{1}{N}\sum\nolimits_{h = 1}^{L} {\sum\nolimits_{i = 1}^{{N_{h} }} {x_{hi} } }$ and $\overline{Z} = \frac{1}{N}\sum\nolimits_{h = 1}^{L} {\sum\nolimits_{i = 1}^{{N_{h} }} {z_{hi} } }$; $R_{1} = \frac{{\overline{Y}}}{{\overline{X}}}$ and $R_{2} = \frac{{\overline{Y}}}{{\overline{Z}}}$ are ratios of population means; and $\overline{Y}_{h}$, $\overline{X}_{h}$ and $\overline{Z}_{h}$ are the hth stratum means of y, x, and z respectively

where $\overline{Y}_{h} \,\, = \frac{1}{{N_{h} }}\,\sum\nolimits_{i = 1}^{{N_{h} }} {y_{hi} }$, $\overline{X}_{h} \,\, = \frac{1}{{N_{h} }}\,\,\sum\nolimits_{i = 1}^{{N_{h} }} {x_{hi} }$ and $\overline{Z}_{h} \,\, = \frac{1}{{N_{h} }}\,\sum\nolimits_{i = 1}^{{N_{h} }} {z_{hi} }$,

$S_{yh}^{2}$, $S_{xh}^{2}$ and $S_{zh}^{2}$ are the hth stratum population variances of y, x, and z respectively, where

$S_{yh}^{2} = \frac{1}{{N_{h} - 1}}\sum\nolimits_{i = 1}^{{N_{h} }} {\left( {y_{hi} - \overline{Y}_{h} } \right)^{2} }$, $S_{xh}^{2} = \frac{1}{{N_{h} - 1}}\sum\nolimits_{i = 1}^{{N_{h} }} {\left( {x_{hi} - \overline{X}_{h} } \right)^{2} }$ and $S_{zh}^{2} = \frac{1}{{N_{h} - 1}}\sum\nolimits_{i = 1}^{{N_{h} }} {\left( {z_{hi} - \overline{Z}_{h} } \right)^{2} }$,

$S_{yxh}$, $S_{yzh}$ and $S_{xzh}$ are the hth stratum covariances between y and x, y and z, and x and z respectively, where

$S_{yxh} = \frac{1}{{N_{h} - 1}}\sum\nolimits_{i = 1}^{{N_{h} }} {\left( {y_{hi} - \overline{Y}_{h} } \right)\left( {x_{hi} - \overline{X}_{h} } \right)}$, $S_{yzh} = \frac{1}{{N_{h} - 1}}\sum\nolimits_{i = 1}^{{N_{h} }} {\left( {y_{hi} - \overline{Y}_{h} } \right)\left( {z_{hi} - \overline{Z}_{h} } \right)}$, and $S_{xzh} = \frac{1}{{N_{h} - 1}}\sum\nolimits_{i = 1}^{{N_{h} }} {\left( {x_{hi} - \overline{X}_{h} } \right)\left( {z_{hi} - \overline{Z}_{h} } \right)}$,

$\overline{x}^{\prime } = \sum\nolimits_{h = 1}^{L} {w_{h} \overline{x}_{h}^{\prime } }$: first-phase sample mean of the auxiliary variable x, which is an unbiased estimator of $\overline{X}$,

$\overline{z}^{\prime } = \sum\nolimits_{h = 1}^{L} {w_{h} \overline{z}_{h}^{\prime } }$: first-phase sample mean of the auxiliary variable z, which is an unbiased estimator of $\overline{Z}$,

$f = \frac{{n^{\prime } }}{N}$: sampling fraction at the first phase,

$n = \sum\nolimits_{h = 1}^{L} {n_{h} }$: second-phase sample size,

$w_{h} = \frac{{n_{h}^{\prime } }}{{n^{\prime } }}$: first-phase sample weight of the hth stratum,

$\nu_{h} = \frac{{n_{h} }}{{n^{\prime}_{h} }}$: sampling fraction for the units selected in the second-phase sample from the hth stratum,

$S_{y}^{2} \,\, = \,\,\frac{1}{N - 1}\,\sum\nolimits_{h = 1}^{L} {\sum\nolimits_{i = 1}^{{N_{h} }} {\left( {y_{hi} - \overline{Y}_{h} } \right)^{2} } } \,$: Population variance of the study variable $y$,

$S_{z}^{2} \,\, = \,\,\frac{1}{N - 1}\,\sum\nolimits_{h = 1}^{L} {\sum\nolimits_{i = 1}^{{N_{h} }} {\left( {z_{hi} - \overline{Z}_{h} } \right)^{2} } } \,$: Population variance of the auxiliary variable z,

$S_{yz}^{{}} \,\, = \,\,\frac{1}{N - 1}\,\sum\nolimits_{h = 1}^{L} {\sum\nolimits_{i = 1}^{{N_{h} }} {\left( {y_{hi} - \overline{Y}_{h} } \right)\left( {z_{hi} - \overline{Z}_{h} } \right)} } \,$: Population covariance between the variable $y$ and z,
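The two-phase procedure and the sample quantities defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy population, the stratification rule `stratum_of`, the rounding of $n_{h}$ to an integer, and all function names are hypothetical and are not taken from the paper.

```python
import random

def double_sample_for_stratification(population, n_first, stratum_of, fractions, seed=0):
    """Draw a two-phase sample from a list of (y, x, z) tuples.

    stratum_of : hypothetical rule mapping the observed (x, z) to a stratum label
    fractions  : dict, stratum label -> v_h (second-phase sampling fraction)
    """
    rng = random.Random(seed)
    # Phase 1: SRSWOR of size n'; only x and z are treated as observed here.
    first_phase = rng.sample(population, n_first)

    # Stratify the first-phase sample using the auxiliary information.
    strata = {}
    for unit in first_phase:
        _, x, z = unit
        strata.setdefault(stratum_of(x, z), []).append(unit)

    result = {}
    for h, units in strata.items():
        n_h_prime = len(units)
        w_h = n_h_prime / n_first                      # w_h = n'_h / n'
        n_h = max(1, round(fractions[h] * n_h_prime))  # n_h = v_h * n'_h, rounded
        # Phase 2: SRSWOR of size n_h within stratum h; y is observed only here.
        second_phase = rng.sample(units, n_h)
        result[h] = {"w": w_h, "first": units, "second": second_phase}
    return result

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Toy population (synthetic, for illustration only).
rng = random.Random(1)
population = []
for _ in range(200):
    x = rng.uniform(10, 100)
    z = rng.uniform(5, 50)
    y = 2.0 * x + 0.5 * z + rng.gauss(0, 5)
    population.append((y, x, z))

sample = double_sample_for_stratification(
    population, n_first=80,
    stratum_of=lambda x, z: 1 if x < 55 else 2,
    fractions={1: 0.5, 2: 0.5},
)

# First-phase mean of x and second-phase (DSS) means of y and x.
x_bar_prime = sum(s["w"] * mean(u[1] for u in s["first"]) for s in sample.values())
y_bar_ds = sum(s["w"] * mean(u[0] for u in s["second"]) for s in sample.values())
x_bar_ds = sum(s["w"] * mean(u[1] for u in s["second"]) for s in sample.values())
print(x_bar_prime, y_bar_ds, x_bar_ds)
```

Because the first-phase weights $w_{h}$ are computed from the realized $n_{h}^{\prime }$, they always sum to one, and $\overline{x}^{\prime }$ coincides with the plain mean of x over the first-phase sample.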

Some relevant existing estimators

The unbiased estimator for $\overline{Y}$ is defined as

$$\overline{y}_{ds} = \sum\nolimits_{h = 1}^{L} {w_{h} } \overline{y}_{h} ,$$

(3.1)

Ige and Tripathi⁶ studied the ratio estimator of Cochran¹⁴ in double sampling for stratification and suggested the ratio estimator

$$\hat{\overline{Y}}_{R}^{ds} = \overline{y}_{ds} \frac{{\overline{x}^{\prime } }}{{\overline{x}_{ds} }},$$

(3.2)

Using the exponential function, Bahl and Tuteja¹⁵ proposed a ratio-type exponential estimator for $\overline{Y}$ in simple random sampling as

$$\hat{\overline{Y}}_{{\text{Re}}} = \overline{y}\,\exp \,\frac{{\left( {\overline{X} - \overline{x}} \right)}}{{\left( {\overline{X} + \overline{x}} \right)}},$$

(3.3)

Tailor et al.⁸ studied the Bahl and Tuteja¹⁵ estimator $\hat{\overline{Y}}_{{\text{Re}}}$ in double sampling for stratification as

$$\hat{\overline{Y}}_{{\text{Re}}}^{ds} = \overline{y}_{ds} \exp \,\frac{{\left( {\overline{x}^{\prime } - \overline{x}_{ds} } \right)}}{{\left( {\overline{x}^{\prime } + \overline{x}_{ds} } \right)}},$$

(3.4)

Lakhre¹⁶ developed a dual to ratio-type exponential estimator in double sampling for stratification as

$$\hat{\overline{Y}}_{{\text{Re}}}^{*} = \overline{y}_{ds} \exp \;\left( {\frac{{\overline{x}_{ds}^{*} - \overline{x}^{\prime } }}{{\overline{x}_{ds}^{*} + \overline{x}^{\prime } }}} \right),$$

(3.5)

where $\overline{x}_{ds}^{*} = \frac{{N\;\overline{x}^{\prime } - n\;\overline{x}_{ds} }}{N - n}$.

Lone et al.¹⁷ proposed an alternative to the Ige and Tripathi⁶ estimator using the dual approach introduced by Srivenkataramana¹⁸ and Bandyopadhyay¹⁹ as

$$\hat{\overline{Y}}_{Rd}^{*} = \overline{y}_{ds} \left( {\frac{{\overline{x}_{ds}^{*} }}{{\overline{x}^{\prime } }}} \right),$$

(3.6)

Motivated by Singh²⁰ and Lone et al.¹⁷, Lone et al.¹² worked out a dual to ratio-cum-product type estimator in double sampling for stratification as

$$\hat{\overline{Y}}_{Rpd}^{*} = \;\overline{y}_{ds} \;\left( {\frac{{\overline{x}_{ds}^{*} }}{{\overline{x}^{\prime } }}} \right)\left( {\frac{{\overline{z}^{\prime } }}{{\overline{z}_{ds}^{*} }}} \right),$$

(3.7)
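For readers who want to evaluate estimators (3.1), (3.2) and (3.4) to (3.7) from sample means, a minimal Python sketch follows. The function names and the numerical inputs are hypothetical. The text does not define $\overline{z}_{ds}^{*}$ explicitly, so the sketch assumes it is formed from $\overline{z}^{\prime }$ and a second-phase mean $\overline{z}_{ds}$ in the same way as $\overline{x}_{ds}^{*}$.

```python
import math

def dual_mean(first_phase_mean, ds_mean, N, n):
    """Dual transform used in (3.5)-(3.7): (N * first-phase mean - n * DSS mean) / (N - n)."""
    return (N * first_phase_mean - n * ds_mean) / (N - n)

def existing_estimators(y_ds, x_ds, x_p, z_ds, z_p, N, n):
    """Return the estimators of Section 3 from the DSS means (x_p, z_p are first-phase means)."""
    x_star = dual_mean(x_p, x_ds, N, n)
    z_star = dual_mean(z_p, z_ds, N, n)  # assumption: defined analogously to x*_ds
    return {
        "unbiased (3.1)": y_ds,
        "ratio (3.2)": y_ds * x_p / x_ds,
        "ratio-exp (3.4)": y_ds * math.exp((x_p - x_ds) / (x_p + x_ds)),
        "dual ratio-exp (3.5)": y_ds * math.exp((x_star - x_p) / (x_star + x_p)),
        "dual ratio (3.6)": y_ds * x_star / x_p,
        "dual ratio-cum-product (3.7)": y_ds * (x_star / x_p) * (z_p / z_star),
    }

# Hypothetical sample means, chosen only to exercise the formulas.
est = existing_estimators(y_ds=240.0, x_ds=1000.0, x_p=1030.0,
                          z_ds=230.0, z_p=233.0, N=20, n=8)
for name, value in est.items():
    print(f"{name}: {value:.3f}")
```

When the second-phase means equal the first-phase means, every adjustment factor is one and all six estimators reduce to $\overline{y}_{ds}$, which is a quick sanity check on an implementation.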

Proposed estimator

Motivated by Ige and Tripathi⁶ and Tailor et al.⁸, we develop an improved exponential-type ratio estimator by assuming that $\overline{Z}$ is known and replacing $\overline{x}^{\prime }$ in the ratio estimator (3.2) with the exponential ratio-type estimator $\overline{x}^{\prime } \exp \left( {\frac{{\overline{Z} - \overline{z}^{\prime } }}{{\overline{Z} + \overline{z}^{\prime } }}} \right)$. The developed estimator of the population mean $\overline{Y}$ is defined as

$$\hat{\overline{Y}}_{{C{\text{Re}} }}^{ds} = \left( {\frac{{\overline{y}_{ds} }}{{\overline{x}_{ds} }}} \right)\,\overline{x}^{\prime } \exp \left( {\frac{{\overline{Z} - \overline{z}^{\prime } }}{{\overline{Z} + \overline{z}^{\prime } }}} \right)$$

(4.1)

where $\overline{x}_{ds} = \sum\nolimits_{h = 1}^{L} {w_{h} \overline{x}_{h} }$ and $\overline{y}_{ds} = \sum\nolimits_{h = 1}^{L} {w_{h} \overline{y}_{h} }$ are the second-phase unbiased estimators of $\overline{X}$ and $\overline{Y}$ respectively.
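Estimator (4.1) is a one-line computation once the four sample means and $\overline{Z}$ are available. The sketch below uses a hypothetical function name and made-up inputs purely to show the arithmetic.

```python
import math

def chain_ratio_exp(y_ds, x_ds, x_p, z_p, Z_bar):
    """Estimator (4.1): (y_ds / x_ds) * x' * exp((Z - z') / (Z + z'))."""
    return (y_ds / x_ds) * x_p * math.exp((Z_bar - z_p) / (Z_bar + z_p))

# Hypothetical inputs: second-phase means, first-phase means, known population mean of z.
estimate = chain_ratio_exp(y_ds=240.0, x_ds=1000.0, x_p=1030.0, z_p=230.0, Z_bar=233.05)
print(round(estimate, 3))
```

If $\overline{z}^{\prime } = \overline{Z}$ the exponential factor equals one and (4.1) reduces to the ratio estimator (3.2); a first-phase mean below $\overline{Z}$ scales the ratio estimate upward, and one above $\overline{Z}$ scales it downward.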

The bias and MSE of $\hat{\overline{Y}}_{{C{\text{Re}} }}^{ds}$ can be found by writing the sample means in terms of the error terms $e_{0}$, $e_{1}$, $e_{1}^{\prime }$ and $e_{2}^{\prime }$ such that

$$\overline{y}_{ds} = \overline{Y}\left( {1 + e_{0} } \right),$$

$$\overline{x}_{ds} = \overline{X}\left( {1 + e_{1} } \right),$$

$$\overline{x}^{\prime } = \overline{X}\left( {1 + e_{1}^{\prime } } \right)\,{\text{and}}$$

$$\overline{z}^{\prime } = \overline{Z}\left( {1 + e_{2}^{\prime } } \right)$$

with $E\left( {e_{0} } \right) = E\left( {e_{1} } \right) = E\left( {e_{1}^{\prime } } \right) = E\left( {e_{2}^{\prime } } \right) = 0$ and

$$E\left( {e_{0}^{2} } \right) = \frac{1}{{\overline{Y}^{2} }}\left[ {S_{y\,}^{2} \left( {\frac{1 - f}{{n^{\prime } }}} \right) + \frac{1}{{n^{\prime } }}\sum\limits_{h = 1}^{L} {W_{h} \,S_{yh}^{2} \left( {\frac{1}{{v_{h} }} - 1} \right)} } \right]\,,$$

$$E\left( {e_{1}^{\prime 2} } \right) = \frac{1}{{\overline{X}^{2} }}S_{x\,}^{2} \left( {\frac{1 - f}{{n^{\prime}}}} \right),$$

$$E\left( {e_{1}^{2} } \right) = \frac{1}{{\overline{X}^{2} }}\left[ {S_{x\,}^{2} \left( {\frac{1 - f}{{n^{\prime } }}} \right) + \frac{1}{{n^{\prime } }}\sum\limits_{h = 1}^{L} {W_{h} \,S_{xh}^{2} \left( {\frac{1}{{v_{h} }} - 1} \right)} } \right]\,,$$

$$E\left( {e_{2}^{\prime 2} } \right) = \frac{1}{{\overline{Z}^{2} }}S_{z\,}^{2} \left( {\frac{1 - f}{{n^{\prime}}}} \right),$$

$$E\left( {e_{0}^{{}} e_{1} } \right) = \frac{1}{{\overline{Y}\,\overline{X}}}\left[ {\left( {\frac{1 - f}{{n^{\prime}}}} \right)S_{yx}^{{}} + \frac{1}{{n^{\prime}}}\sum\limits_{h = 1}^{L} {W_{h} \,S_{yxh}^{{}} \left( {\frac{1}{{v_{h} }} - 1} \right)} } \right]\,,$$

$$E\left( {e_{1}^{{}} e_{1}^{\prime } } \right) = \frac{1}{{\,\overline{X}^{2} }}S_{x\,}^{2} \left( {\frac{1 - f}{{n^{\prime } }}} \right),$$

$$E\left( {e_{0}^{{}} e_{1}^{\prime } } \right) = \frac{1}{{\overline{Y}\,\overline{X}}}\left( {\frac{1 - f}{{n^{\prime } }}} \right)S_{yx\,}^{{}} ,$$

$$E\left( {e_{0}^{{}} e_{2}^{\prime } } \right) = \frac{1}{{\overline{Y}\,\overline{Z}}}\left( {\frac{1 - f}{{n^{\prime}}}} \right)S_{yz\,}^{{}} ,$$

$$E\left( {e_{1} e_{2}^{\prime } } \right) = \frac{1}{{\overline{X}\,\overline{Z}}}\left( {\frac{1 - f}{{n^{\prime}}}} \right)S_{xz\,}^{{}} ,$$

$$E\left( {e_{1}^{\prime } e_{2}^{\prime } } \right) = \frac{1}{{\,\overline{X}\,\overline{Z}}}\left( {\frac{1 - f}{{n^{\prime}}}} \right)S_{xz\,}^{{}} .$$

where e_i’s are error terms.

Substituting these values in (4.1), the developed estimator $\hat{\overline{Y}}_{{C{\text{Re}} }}^{ds}$ becomes

$$\hat{\overline{Y}}_{{C{\text{Re}} }}^{ds} = \overline{Y}\left( {1 + e_{0} } \right)\left( {1 + e_{1} } \right)^{ - 1} \left( {1 + e_{1}^{\prime } } \right)\exp \left\{ {\left( {\frac{{ - e_{2}^{\prime } }}{2}} \right)\left( {1 + \frac{{e_{2}^{\prime } }}{2}} \right)^{ - 1} } \right\}$$

$$\hat{\overline{Y}}_{{C{\text{Re}} }}^{ds} - \overline{Y} = \overline{Y}\left( {e_{0} - e_{1} + e_{1}^{\prime } - \frac{{e_{2}^{\prime } }}{2} - e_{0} e_{1} - e_{1} e_{1}^{\prime } + e_{0} e_{1}^{\prime } + e_{1}^{2} - \frac{{e_{0} e_{2}^{\prime } }}{2} + \frac{{e_{1} e_{2}^{\prime } }}{2} - \frac{{e_{1}^{\prime } e_{2}^{\prime } }}{2} + \frac{{3e_{2}^{\prime 2} }}{8}} \right)$$

(4.2)
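The second-order expansion (4.2) can be checked numerically: for small error terms, the exact relative error of the estimator and the right-hand side of (4.2) divided by $\overline{Y}$ should differ only by third-order terms. The following Python sketch does this for arbitrary, hypothetical error values; the function names are illustrative.

```python
import math

def exact_relative_error(e0, e1, e1p, e2p):
    """(estimator - Ybar) / Ybar from the exact expression preceding (4.2)."""
    exponent = (-e2p / 2) / (1 + e2p / 2)
    return (1 + e0) / (1 + e1) * (1 + e1p) * math.exp(exponent) - 1

def second_order_relative_error(e0, e1, e1p, e2p):
    """Right-hand side of (4.2) divided by Ybar."""
    return (e0 - e1 + e1p - e2p / 2
            - e0 * e1 - e1 * e1p + e0 * e1p + e1 ** 2
            - e0 * e2p / 2 + e1 * e2p / 2 - e1p * e2p / 2
            + 3 * e2p ** 2 / 8)

gaps = {}
for scale in (0.1, 0.01, 0.001):
    # Arbitrary error pattern, shrunk by `scale`.
    e = (1.0 * scale, -0.7 * scale, 0.4 * scale, 0.9 * scale)
    gaps[scale] = abs(exact_relative_error(*e) - second_order_relative_error(*e))
    print(scale, gaps[scale])
```

Shrinking the errors by a factor of ten should shrink the gap by roughly a factor of a thousand, which is the behaviour expected of a remainder that starts at third order.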

Taking expectations of both sides of (4.2) gives the bias of $\hat{\overline{Y}}_{{C{\text{Re}} }}^{ds}$ up to the first degree of approximation (fda) as

$$B\left( {\hat{\overline{Y}}_{{C{\text{Re}} }}^{ds} } \right) = \frac{1}{{n^{\prime}\overline{X}}}\sum\limits_{h = 1}^{L} {W_{h} \left( {\frac{1}{{\nu_{h} }} - 1} \right)\left( {R_{1} S_{xh}^{2} - S_{yxh} } \right)} + \frac{1}{{8\overline{Z}}}\left( {\frac{1 - f}{{n^{\prime}}}} \right)\left( {3R_{2} S_{z}^{2} - 4S_{yz} } \right)$$

(4.3)

To find MSE of the developed estimator, we square and take expectation of (4.2)

$$MSE\left( {\hat{\overline{Y}}_{{C{\text{Re}} }}^{ds} } \right) = \overline{Y}^{2} E\left[ {e_{0} - e_{1} + e_{1}^{\prime } - \frac{{e_{2}^{\prime } }}{2}} \right]^{2}$$

$$= \overline{Y}^{2} E\left( {e_{0}^{2} + e_{1}^{2} + e_{1}^{\prime 2} + \frac{{e_{2}^{\prime 2} }}{4} - 2e_{0} e_{1} - 2e_{1} e_{1}^{\prime } + 2e_{0} e_{1}^{\prime } - e_{0} e_{2}^{\prime } + e_{1} e_{2}^{\prime } - e_{1}^{\prime } e_{2}^{\prime } } \right)$$

Using expected values of e’s, the MSE of the developed estimator $\hat{\overline{Y}}_{{C{\text{Re}} }}^{ds}$ is obtained up to fda as

$$MSE\left( {\hat{\overline{Y}}_{{C{\text{Re}} }}^{ds} } \right) = \left( {\frac{1 - f}{{n^{\prime}}}} \right)S_{y}^{2} + \frac{1}{n^{\prime}}\sum\limits_{h = 1}^{L} {W_{h} \left( {\frac{1}{{\nu_{h} }} - 1} \right)} \left( {S_{yh}^{2} + R_{1}^{2} S_{xh}^{2} - 2R_{1} S_{yxh} } \right) + \frac{1}{4}\left( {\frac{1 - f}{{n^{\prime}}}} \right)\left( {R_{2}^{2} S_{z}^{2} - 4R_{2} S_{yz} } \right)$$

(4.4)
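Expression (4.4) translates directly into code. The sketch below is a minimal Python version with hypothetical argument names; the numerical inputs are made up so that the result can be verified by hand.

```python
def mse_chain_ratio_exp(f, n_prime, S_y2, S_z2, S_yz, R1, R2, W, nu, S_yh2, S_xh2, S_yxh):
    """First-order MSE (4.4) of the chain ratio-type exponential estimator.

    W, nu, S_yh2, S_xh2, S_yxh are per-stratum lists of equal length L.
    """
    first = (1.0 - f) / n_prime
    within = sum(
        Wh * (1.0 / vh - 1.0) * (sy + R1 ** 2 * sx - 2.0 * R1 * syx)
        for Wh, vh, sy, sx, syx in zip(W, nu, S_yh2, S_xh2, S_yxh)
    ) / n_prime
    z_part = 0.25 * first * (R2 ** 2 * S_z2 - 4.0 * R2 * S_yz)
    return first * S_y2 + within + z_part

# Hypothetical inputs with L = 2 strata.
mse = mse_chain_ratio_exp(
    f=0.5, n_prime=10, S_y2=4.0, S_z2=4.0, S_yz=2.0, R1=1.0, R2=1.0,
    W=[0.5, 0.5], nu=[0.5, 0.5],
    S_yh2=[2.0, 2.0], S_xh2=[1.0, 1.0], S_yxh=[1.0, 1.0],
)
print(mse)
```

With these inputs the three parts of (4.4) are 0.2, 0.1 and −0.05, so the negative z-term is what lowers the MSE below that of the ratio estimator.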

Comparison with relevant estimators

From an efficiency perspective, the proposed estimator is compared with the estimators discussed in Section “Some relevant existing estimators”. In DSS, the variance of the unbiased estimator and the MSEs of the Ige and Tripathi⁶, Tailor et al.⁸, Lakhre¹⁶, Lone et al.¹⁷ and Lone et al.¹² estimators are, respectively,

$$V\left( {\overline{y}_{ds} } \right)= S_{y\,}^{2} \left( {\frac{1 - f}{{n^{\prime}}}} \right) + \frac{1}{{n^{\prime}}}\sum\limits_{h = 1}^{L} {W_{h} \,S_{yh}^{2} \left( {\frac{1}{{v_{h} }} - 1} \right)} ,$$

(5.1)

$$MSE\left( {\hat{\overline{Y}}_{R}^{ds} } \right) = S_{y}^{2} \left( {\frac{1 - f}{{n^{\prime } }}} \right) + \frac{1}{{n^{\prime } }}\sum\limits_{h = 1}^{L} {W_{h} \left( {\frac{1}{{\nu_{h} }} - 1} \right)} \left( {S_{yh}^{2} + R_{1}^{2} S_{xh}^{2} - 2R_{1} S_{yxh} } \right),$$

(5.2)

$$MSE\left( {\hat{\overline{Y}}_{{\text{Re}}}^{ds} } \right) = S_{y}^{2} \left( {\frac{1 - f}{{n^{\prime} }}} \right) + \frac{1}{{n^{\prime} }}\sum\limits_{h = 1}^{L} {W_{h} \left( {\frac{1}{{v_{h} }} - 1} \right)\left[ {S_{yh}^{2} + \frac{{R_{1}^{2} }}{4}S_{xh}^{2} \left( {1 - \frac{{\beta_{yxh} }}{{R_{1} }}} \right)} \right]} ,$$

(5.3)

$$MSE(\hat{\overline{Y}}_{{\text{Re}}}^{*} ) = S_{y}^{2} \left( {\frac{1 - f}{{n^{\prime } }}} \right)\; + \frac{1}{{n^{\prime } }}\sum\limits_{h = 1}^{L} {W_{h} \left( {\frac{1}{{\nu_{h} }} - 1} \right)} \;\left[ {S_{yh}^{2} + \;\frac{1}{4}R_{1}^{2} g^{2} S_{xh}^{2} \; - gR_{1} S_{yxh} } \right],$$

(5.4)

$$MSE(\hat{\overline{Y}}_{Rd}^{*} )\; = \;S_{y}^{2} \left( {\frac{1 - f}{{n^{\prime } }}} \right)\; + \;\frac{1}{{n^{\prime } }}\sum\limits_{h = 1}^{L} {W_{h} } \left( {\frac{1}{{\nu_{h} }} - 1} \right)\;\left[ {S_{yh}^{2} \; + g^{2} R_{1}^{2} S_{xh}^{2} \; - 2gR_{1} S_{yxh} } \right],$$

(5.5)

$$MSE(\hat{\overline{Y}}_{Rpd}^{*} )\; = \;S_{y}^{2} \left( {\frac{1 - f}{{n^{\prime } }}} \right)\; + \frac{1}{{n^{\prime } }}\sum\limits_{h = 1}^{L} {W_{h} } \left( {\frac{1}{{\nu_{h} }} - 1} \right)\;\left[ \begin{gathered} S_{yh}^{2} \; + g^{2} R_{1}^{2} S_{xh}^{2} \; + g^{2} R_{2}^{2} S_{zh}^{2} \hfill \\ - 2gR_{1} S_{yxh} + 2gR_{2} S_{yzh} - 2g^{2} R_{1} R_{2} S_{xzh} \hfill \\ \end{gathered} \right],$$

(5.6)

where $g = \frac{n}{N - n}$.
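Expressions (5.1), (5.2), (5.4), (5.5) and (5.6) share one structure: a first-phase term $S_{y}^{2} (1 - f)/n^{\prime }$ plus a weighted sum of a per-stratum bracket. The Python sketch below exploits that structure. Names and inputs are hypothetical, and (5.3) is left out because $\beta_{yxh}$ is not defined in the text.

```python
def stratum_term(kind, s, R1, R2, g):
    """Bracketed per-stratum term; `s` holds the stratum moments
    yy = S_yh^2, xx = S_xh^2, zz = S_zh^2, yx = S_yxh, yz = S_yzh, xz = S_xzh."""
    if kind == "unbiased":    # (5.1)
        return s["yy"]
    if kind == "ratio":       # (5.2)
        return s["yy"] + R1 ** 2 * s["xx"] - 2 * R1 * s["yx"]
    if kind == "dual_exp":    # (5.4)
        return s["yy"] + 0.25 * R1 ** 2 * g ** 2 * s["xx"] - g * R1 * s["yx"]
    if kind == "dual_ratio":  # (5.5)
        return s["yy"] + g ** 2 * R1 ** 2 * s["xx"] - 2 * g * R1 * s["yx"]
    if kind == "dual_rp":     # (5.6)
        return (s["yy"] + g ** 2 * R1 ** 2 * s["xx"] + g ** 2 * R2 ** 2 * s["zz"]
                - 2 * g * R1 * s["yx"] + 2 * g * R2 * s["yz"]
                - 2 * g ** 2 * R1 * R2 * s["xz"])
    raise ValueError(f"unknown estimator kind: {kind}")

def dss_mse(kind, f, n_prime, S_y2, W, nu, strata, R1, R2, g):
    """S_y^2 (1 - f)/n' + (1/n') * sum_h W_h (1/nu_h - 1) * stratum_term."""
    within = sum(Wh * (1.0 / vh - 1.0) * stratum_term(kind, s, R1, R2, g)
                 for Wh, vh, s in zip(W, nu, strata))
    return S_y2 * (1.0 - f) / n_prime + within / n_prime

# Hypothetical inputs with L = 2 strata, N = 20 and n = 8, so g = n / (N - n).
strata = [
    {"yy": 2.0, "xx": 1.0, "zz": 1.5, "yx": 1.0, "yz": 0.8, "xz": 0.6},
    {"yy": 3.0, "xx": 2.0, "zz": 1.0, "yx": 1.5, "yz": 0.9, "xz": 0.7},
]
common = dict(f=0.5, n_prime=10, S_y2=4.0, W=[0.5, 0.5], nu=[0.5, 0.5],
              strata=strata, R1=1.2, R2=0.9, g=8 / (20 - 8))
for kind in ("unbiased", "ratio", "dual_exp", "dual_ratio", "dual_rp"):
    print(kind, round(dss_mse(kind, **common), 4))
```

Two special cases make useful checks: at g = 1 the bracket of (5.5) equals that of (5.2), and at g = 2 the bracket of (5.4) equals that of (5.2).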

When (4.4) is compared with (5.1), (5.2), (5.3), (5.4), (5.5) and (5.6), the developed chain ratio-type exponential estimator is found to be more efficient than

(i)
$\overline{y}_{ds}$ if
$$\left( {1 - f} \right)\;\left( {R_{2}^{2} S_{z}^{2} - 4R_{2} S_{yz} } \right)\; \le \;\sum\limits_{h = 1}^{L} {W_{h} \left( {\frac{1}{{\nu_{h} }} - 1} \right)\;\left( {8R_{1} S_{yxh} - 4R_{1}^{2} S_{xh}^{2} } \right)} \;,$$
(5.7)
(ii)
Ige and Tripathi⁶ estimator $\hat{\overline{Y}}_{R}^{ds}$ if
$$S_{z}^{2} \; \le \;\frac{{4S_{yz} }}{{R_{2} }}\;\;,$$
(5.8)
(iii)
Tailor et al.⁸ estimator $\hat{\overline{Y}}_{Re}^{ds}$ if
$$\left( {1 - f} \right)\;\left( {R_{2}^{2} S_{z}^{2} - 4R_{2} S_{yz} } \right)\; \le \;\;\sum\limits_{h = 1}^{L} {W_{h} \left( {\frac{1}{{\nu_{h} }} - 1} \right)\;\left( {8R_{1} S_{yxh} - 8R_{1} S_{xh}^{2} - R_{1} S_{xh}^{2} \beta_{yxh} } \right)} \;,$$
(5.9)
(iv)
Lakhre¹⁶ estimator $\hat{\overline{Y}}_{{\text{Re}}}^{*}$ if
$$\left( {1 - f} \right)\;\left( {R_{2}^{2} S_{z}^{2} - 4R_{2} S_{yz} } \right)\; \le \;4\;\left[ {\sum\limits_{h = 1}^{L} {W_{h} \left( {\frac{1}{{\nu_{h} }} - 1} \right)\;\left( {R_{1}^{2} S_{xh}^{2} \left( {\frac{{g^{2} }}{4} - 1} \right) + R_{1} S_{yxh} \left( {2 - g} \right)} \right)} } \right],$$
(5.10)
(v)
Lone et al.¹⁷ estimator $\hat{\overline{Y}}_{Rd}^{*}$ if
$$\left( {1 - f} \right)\;\left( {R_{2}^{2} S_{z}^{2} - 4R_{2} S_{yz} } \right)\; \le \;4\;\left[ {\sum\limits_{h = 1}^{L} {W_{h} \left( {\frac{1}{{\nu_{h} }} - 1} \right)\;\left( {R_{1}^{2} S_{xh}^{2} \left( {g^{2} - 1} \right) + 2R_{1} S_{yxh} \left( {1 - g} \right)} \right)} } \right],$$
(5.11)
(vi)
Lone et al.¹² estimator $\hat{\overline{Y}}_{Rpd}^{*}$ if
$$\left( {1 - f} \right)\;\left( {R_{2}^{2} S_{z}^{2} - 4R_{2} S_{yz} } \right)\; \le \;4\;\left[ {\sum\limits_{h = 1}^{L} {W_{h} \left( {\frac{1}{{\nu_{h} }} - 1} \right)\;\left( \begin{gathered} R_{1}^{2} S_{xh}^{2} \left( {g^{2} - 1} \right) + 2R_{1} S_{yxh} \left( {1 - g} \right) + g^{2} R_{2}^{2} S_{zh}^{2} \hfill \\ + 2gR_{2} S_{yzh} - 2g^{2} R_{1} R_{2} S_{xzh} \hfill \\ \end{gathered} \right)} } \right].$$
(5.12)
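As a worked instance of how these conditions arise, consider (5.8). Subtracting (5.2) from (4.4) cancels every term except the one involving z:

$$MSE\left( {\hat{\overline{Y}}_{{C{\text{Re}} }}^{ds} } \right) - MSE\left( {\hat{\overline{Y}}_{R}^{ds} } \right) = \frac{1}{4}\left( {\frac{1 - f}{{n^{\prime } }}} \right)\left( {R_{2}^{2} S_{z}^{2} - 4R_{2} S_{yz} } \right).$$

Since $\frac{1 - f}{{n^{\prime } }} > 0$, this difference is non-positive exactly when $R_{2}^{2} S_{z}^{2} \le 4R_{2} S_{yz}$. Dividing by $R_{2}^{2}$ gives $S_{z}^{2} \le \frac{{4S_{yz} }}{{R_{2} }}$, which is (5.8); the division assumes $R_{2} > 0$, a condition that is not stated explicitly in the text but holds whenever $\overline{Y}$ and $\overline{Z}$ have the same sign. The other conditions follow in the same way, by subtracting the corresponding MSE from (4.4), multiplying through by $4n^{\prime }$ and collecting the stratum terms on the right-hand side.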

Empirical study

In Section “Comparison with relevant estimators”, the developed chain ratio-type exponential estimator was compared with the other estimators theoretically. This section gives a numerical illustration of the practical performance of the considered estimators and the proposed estimator; the percent relative efficiency (PRE) of the proposed estimator relative to the other considered estimators is shown in Table 2. Two data sets are considered for this purpose. They are described below:

Variables	Population I- [Source: Singh and Chaudhary²², P. 177]	Population II- [Source: Murthy²¹, p. 228]
$y$	Productivity	Output
$x$	Production	Fixed capital
$z$	Area	Number of workers

	Population I		Population II
Parameters	Stratum I	Stratum II	Stratum I	Stratum II
$N_{h}$	10	10	5	5
$n_{h}$	4	4	2	2
$n^{\prime}_{h}$	6	6	4	4
$\overline{Y}_{h}$	264.00	214.70	1925.80	3115.60
$\overline{X}_{h}$	939.00	1121.50	214.40	333.80
$\overline{Z}_{h}$	263.20	202.90	51.80	60.60
$S_{yh}$	149.53	192.02	615.92	340.38
$S_{xh}$	389.67	1165.20	74.87	66.35
$S_{zh}$	162.85	178.54	0.75	4.84
$S_{yxh}$	53,277.00	68,650.00	39,360.68	22,356.50
$S_{yzh}$	23,798.00	33,841.00	411.16	1536.24
$S_{xzh}$	58,729.00	60,376.00	38.08	287.92
$S_{y}^{2}$	31,814.87		668,351.00
$S_{z}^{2}$	31,692.05		34.84
$S_{yz}$	29,562.58		1668.23
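To illustrate how the tabulated parameters enter the formulas, the following Python sketch evaluates (5.1), (5.2) and (4.4) for Population I and checks condition (5.8). It rests on several assumptions that the text does not spell out: $W_{h} = N_{h} /N$, $\nu_{h} = n_{h} /n_{h}^{\prime }$, $n^{\prime } = \sum n_{h}^{\prime }$, $\overline{Y} = \sum W_{h} \overline{Y}_{h}$ (and likewise for $\overline{X}$, $\overline{Z}$), and the tabulated $S_{yh}$ and $S_{xh}$ read as standard deviations. The variable names are illustrative, and the output is not claimed to reproduce the entries of Table 2.

```python
# Population I parameters from the table above.
N_h = [10, 10]
n_h = [4, 4]
n_h_prime = [6, 6]
Y_h = [264.00, 214.70]
X_h = [939.00, 1121.50]
Z_h = [263.20, 202.90]
S_yh = [149.53, 192.02]        # assumed to be standard deviations
S_xh = [389.67, 1165.20]       # assumed to be standard deviations
S_yxh = [53277.00, 68650.00]
S_y2, S_z2, S_yz = 31814.87, 31692.05, 29562.58

# Derived quantities (see the assumptions in the lead-in).
N = sum(N_h)
n_prime = sum(n_h_prime)
f = n_prime / N
W = [Nh / N for Nh in N_h]
nu = [a / b for a, b in zip(n_h, n_h_prime)]
Y_bar = sum(w * m for w, m in zip(W, Y_h))
X_bar = sum(w * m for w, m in zip(W, X_h))
Z_bar = sum(w * m for w, m in zip(W, Z_h))
R1, R2 = Y_bar / X_bar, Y_bar / Z_bar

def dss_mse(stratum_terms):
    """S_y^2 (1 - f)/n' + (1/n') * sum_h W_h (1/nu_h - 1) * term_h."""
    within = sum(w * (1 / v - 1) * t for w, v, t in zip(W, nu, stratum_terms))
    return S_y2 * (1 - f) / n_prime + within / n_prime

var_ds = dss_mse([sy ** 2 for sy in S_yh])                       # (5.1)
mse_ratio = dss_mse([sy ** 2 + R1 ** 2 * sx ** 2 - 2 * R1 * syx  # (5.2)
                     for sy, sx, syx in zip(S_yh, S_xh, S_yxh)])
mse_cre = mse_ratio + 0.25 * (1 - f) / n_prime * (R2 ** 2 * S_z2 - 4 * R2 * S_yz)  # (4.4)

condition_5_8 = S_z2 <= 4 * S_yz / R2
pre = {name: 100 * var_ds / m
       for name, m in [("y_ds", var_ds), ("ratio", mse_ratio), ("chain_ratio_exp", mse_cre)]}
print(condition_5_8, pre)
```

PRE is computed here as $100 \times V(\overline{y}_{ds})/MSE(\cdot)$, so a value above 100 means the estimator has a smaller first-order MSE than $\overline{y}_{ds}$ under these assumptions.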

Simulation study

In this section, a simulation study is carried out in R to compare the performance of the developed estimator with that of the other considered estimators. Six pseudo-populations of size $N$, each with two strata of sizes $N_{1}$ and $N_{2}$, were generated: three with equal strata sizes and three with unequal strata sizes. All the populations are simulated from the normal distribution. The PRE and MSE values for the populations with equal strata sizes are given in Tables 3 and 4 respectively, and Tables 5 and 6 give the PRE and MSE values for the populations with unequal strata sizes. The results for the simulated data sets are also shown as line graphs, in which the developed estimator has the highest PRE and the lowest MSE in each population.

Populations having equal strata size:

Population 1: $N$ = 800, $N_{1}$ = 400, $N_{2}$ = 400, $n^{\prime }$ = 600, $n_{1}^{\prime }$ = 300, $n_{2}^{\prime }$ = 300, $n$ = 300.

Population 2: $N$ = 500, $N_{1}$ = 250, $N_{2}$ = 250, $n^{\prime }$ = 350, $n_{1}^{\prime }$ = 175, $n_{2}^{\prime}$ = 175, $n$ = 174.

Population 3: $N$ = 1000, $N_{1}$ = 500, $N_{2}$ = 500, $n^{\prime}$ = 600, $n_{1}^{\prime}$ = 300, $n_{2}^{\prime}$ = 300, $n$ = 300.

Populations having unequal strata size:

Population 1: $N$ = 1500, $N_{1}$ = 650, $N_{2}$ = 850, $n^{\prime}$ = 1000, $n_{1}^{\prime}$ = 400, $n_{2}^{\prime}$ = 600, $n$ = 500.

Population 2: $N$ = 400, $N_{1}$ = 150, $N_{2}$ = 250, $n^{\prime}$ = 300, $n_{1}^{\prime}$ = 100, $n_{2}^{\prime}$ = 200, $n$ = 150.

Population 3: $N$ = 2000, $N_{1}$ = 1200, $N_{2}$ = 800, $n^{\prime}$ = 1200, $n_{1}^{\prime}$ = 800, $n_{2}^{\prime}$ = 400, $n$ = 500.
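A Monte Carlo study of this kind can be sketched in Python as follows (the paper itself used R). The sketch takes the sizes of the first equal-strata population ($N$ = 800, $n^{\prime }$ = 600, $\nu_{h}$ = 0.5) but everything else is an assumption: the normal means and standard deviations, the linear relations between z, x and y, the use of the generating stratum label as the stratification variable, and the number of replications. The realized $n_{h}^{\prime }$ are random here rather than fixed at 300, and only three of the estimators are included.

```python
import math
import random

def make_population(sizes, rng):
    """Synthetic normal population; all distribution parameters are hypothetical."""
    pop = []
    for h, (N_h, mu) in enumerate(zip(sizes, (50.0, 80.0))):
        for _ in range(N_h):
            z = rng.gauss(mu, 5.0)
            x = 2.0 * z + rng.gauss(0.0, 4.0)
            y = 1.5 * x + rng.gauss(0.0, 6.0)
            pop.append((h, y, x, z))
    return pop

def mean(vals):
    vals = list(vals)
    return sum(vals) / len(vals)

def one_draw(pop, n_prime, nu, Z_bar, rng):
    """One DSS replicate: first-phase SRSWOR, then SRSWOR within each stratum."""
    first = rng.sample(pop, n_prime)
    y_ds = x_ds = 0.0
    for h in {u[0] for u in first}:
        units = [u for u in first if u[0] == h]
        w_h = len(units) / n_prime
        second = rng.sample(units, max(1, round(nu * len(units))))
        y_ds += w_h * mean(u[1] for u in second)
        x_ds += w_h * mean(u[2] for u in second)
    x_p = mean(u[2] for u in first)   # first-phase mean of x
    z_p = mean(u[3] for u in first)   # first-phase mean of z
    ratio = y_ds * x_p / x_ds
    chain = ratio * math.exp((Z_bar - z_p) / (Z_bar + z_p))
    return {"y_ds": y_ds, "ratio": ratio, "chain_ratio_exp": chain}

def simulate(reps=1000, seed=42):
    rng = random.Random(seed)
    pop = make_population((400, 400), rng)
    Y_bar = mean(u[1] for u in pop)
    Z_bar = mean(u[3] for u in pop)
    sq_err = {"y_ds": 0.0, "ratio": 0.0, "chain_ratio_exp": 0.0}
    for _ in range(reps):
        for name, value in one_draw(pop, 600, 0.5, Z_bar, rng).items():
            sq_err[name] += (value - Y_bar) ** 2
    mse = {name: total / reps for name, total in sq_err.items()}
    pre = {name: 100.0 * mse["y_ds"] / m for name, m in mse.items()}
    return mse, pre

mse, pre = simulate()
print(mse)
print(pre)
```

The empirical MSE is the average squared deviation from the true $\overline{Y}$ over the replicates, and PRE is taken relative to $\overline{y}_{ds}$. How the estimators rank in such a run depends on the assumed population, so the output should not be read as a reproduction of Tables 3 to 6.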

Results and discussions

(i)
Table 1 demonstrates that, of the two real data sets, population 1 meets all the conditions outlined in Section “Comparison with relevant estimators” under which the proposed estimator outperforms the other considered estimators. In contrast, population 2 fails to fulfill the conditions specified in Eqs. (5.11) and (5.12). Table 2 presents the percent relative efficiency (PRE) of all the considered estimators discussed in Section “Empirical study”, as well as of the proposed estimator, for the two real data sets. The symbol ‘*’ indicates instances where the PRE is not applicable because the conditions in Eqs. (5.11) and (5.12) are not satisfied, a fact also highlighted in Table 1 (rows 5 and 6).
(ii)
The PRE and MSE values of the proposed and other considered estimators for the first three simulated normal populations having equal strata sizes are given in Tables 3 and 4 respectively. Similarly, Tables 5 and 6 show the PRE and MSE values for the next three simulated normal populations having unequal strata sizes.
(iii)
Figures 1 and 2 show the PRE and MSE of the first three simulated normal populations having equal strata sizes and the next three simulated normal populations having unequal strata sizes respectively.
(iv)
It is observed from all the tables and graphs that the proposed estimator has the least MSE and the highest PRE among the considered estimators. This indicates that, for practical purposes, the proposed estimator will perform better than the other considered estimators $\overline{y}_{ds}$, $\hat{\overline{Y}}_{R}^{ds}$, $\hat{\overline{Y}}_{{\text{Re}}}^{ds}$, $\hat{\overline{Y}}_{{\text{Re}}}^{*}$, $\hat{\overline{Y}}_{Rd}^{*}$ and $\hat{\overline{Y}}_{Rpd}^{*}$ under the conditions given in Section “Comparison with relevant estimators”.

Table 1 Empirical exhibition of theoretical conditions given in Sect. “Comparison with relevant estimators”


Table 2 Percent relative efficiencies of $\overline{y}_{ds}$, $\hat{\overline{Y}}_{R}^{ds}$, $\hat{\overline{Y}}_{{\text{Re}}}^{ds}$, $\hat{\overline{Y}}_{{\text{Re}}}^{*}$, $\hat{\overline{Y}}_{Rd}^{*}$, $\hat{\overline{Y}}_{Rpd}^{*}$ and $\hat{\overline{Y}}_{CRe}^{ds}$ with respect to $\overline{y}_{ds}$.


Table 3 Percent relative efficiencies of $\overline{y}_{ds}$, $\hat{\overline{Y}}_{R}^{ds}$, $\hat{\overline{Y}}_{{\text{Re}}}^{ds}$, $\hat{\overline{Y}}_{{\text{Re}}}^{*}$, $\hat{\overline{Y}}_{Rd}^{*}$, $\hat{\overline{Y}}_{Rpd}^{*}$ and $\hat{\overline{Y}}_{CRe}^{ds}$ with respect to $\overline{y}_{ds}$ for equal strata size.


Table 4 Mean squared error of $\overline{y}_{ds}$, $\hat{\overline{Y}}_{R}^{ds}$, $\hat{\overline{Y}}_{{\text{Re}}}^{ds}$, $\hat{\overline{Y}}_{{\text{Re}}}^{*}$, $\hat{\overline{Y}}_{Rd}^{*}$, $\hat{\overline{Y}}_{Rpd}^{*}$ and $\hat{\overline{Y}}_{CRe}^{ds}$ with respect to $\overline{y}_{ds}$ for equal strata size.


Table 5 Percent relative efficiencies of $\overline{y}_{ds}$, $\hat{\overline{Y}}_{R}^{ds}$, $\hat{\overline{Y}}_{{\text{Re}}}^{ds}$, $\hat{\overline{Y}}_{{\text{Re}}}^{*}$, $\hat{\overline{Y}}_{Rd}^{*}$, $\hat{\overline{Y}}_{Rpd}^{*}$ and $\hat{\overline{Y}}_{CRe}^{ds}$ with respect to $\overline{y}_{ds}$ for unequal strata size.


Table 6 Mean squared error of $\overline{y}_{ds}$, $\hat{\overline{Y}}_{R}^{ds}$, $\hat{\overline{Y}}_{{\text{Re}}}^{ds}$, $\hat{\overline{Y}}_{{\text{Re}}}^{*}$, $\hat{\overline{Y}}_{Rd}^{*}$, $\hat{\overline{Y}}_{Rpd}^{*}$ and $\hat{\overline{Y}}_{CRe}^{ds}$ with respect to $\overline{y}_{ds}$ for unequal strata size.


Conclusion

In this study, we have investigated the problem of estimating the population mean of the study variable in double sampling for stratification. We have proposed a chain ratio-type exponential estimator and examined its bias and mean squared error up to the first degree of approximation. Our analysis in Section “Comparison with relevant estimators” has established the conditions under which the proposed estimator outperforms the estimators shown in Section “Some relevant existing estimators”. Empirical and simulation studies have been conducted to support the theoretical findings. Under the conditions given in Section “Comparison with relevant estimators”, the proposed estimator is found to be more efficient than the other considered estimators, as it has the least MSE and the highest PRE among them. The results are shown in Tables 1, 2, 3, 4, 5 and 6 and in the graphs of Figs. 1 and 2.

Overall, our research contributes significantly to the theory of estimating the population mean in the context of double sampling for stratification. Therefore, we recommend the application of our proposed estimator for the estimation of population mean in real-life situations.

Data availability

All the necessary data generated and/or analyzed during the current study are included in this published article.

References

Hosking, J. R. L-moments: Analysis and estimation of distributions using linear combinations of order statistics. J. R. Stat. Soc. Ser. B 52(1), 105–124 (1990).
MathSciNet Google Scholar
Shahzad, U., Ahmad, I., Almanjahie, I. M. & Al–Noor, N. H. L-moments based calibrated variance estimators using double stratified sampling. Comput. Mater. Contin. 68(3), 3411–3430 (2021).
Google Scholar
Lam, T. Y., Kleinn, C. & Coenradie, B. Double sampling for stratification for the monitoring of sparse tree populations: the example of Populus euphratica Oliv. Forests at the lower reaches of Tarim River, Southern Xinjiang, China. Environ. Monit. Assess. 175, 45–61 (2011).
Article PubMed Google Scholar
Neyman, J. Contribution to the theory of sampling human population. J. Am. Stat. Assoc. 33, 101–116 (1938).
Article Google Scholar
Rao, J. N. K. On double sampling for stratification and analytical surveys. Biometrika 60, 125–133 (1973).
Article MathSciNet Google Scholar
Ige, A. F. & Tripathi, T. P. On doubling for stratification and use of auxiliary information. J. Indian Soc. Agricult. Stat. 39, 191–201 (1987).
Google Scholar
Singh, H. P. & Vishwakarma, G. K. A general procedure for estimating the mean using double sampling for stratification. Model Assist. Stat. Appl. 2(4), 225–237 (2007).
MathSciNet Google Scholar
Tailor, R., Chouhan, S. & Kim, J. M. Ratio and product type exponential estimators of population mean in double sampling for stratification. Commun. Stat. Appl. Methods 21(1), 1–9 (2014).
Tailor, R. & Lone, H. A. Ratio-cum-product estimator of finite population mean in double sampling for stratification. J. Reliab. Stat. Stud. 7(1), 93–101 (2014).
Singh, H. P. & Nigam, P. Ratio-ratio-type exponential estimator of finite population mean in double sampling for stratification. Int. J. Agricult. Stat. Sci. 16(1), 251–257 (2020).
Gupta, A. & Tailor, R. Ratio in ratio type exponential strategy for the estimation of population mean. J. Reliab. Stat. Stud. 14(2), 551–564 (2021).
Lone, H. A., Tailor, R. & Verma, M. R. A note on the estimation of population mean in double sampling for stratification. J. Indian Soc. Agricult. Stat. 76(3), 115–120 (2022).
Verma, M. R., Lone, H. A. & Tailor, R. Generalized dual to ratio-cum-product type estimators in double sampling for stratification. J. Indian Soc. Agricult. Stat. 77(1), 125–132 (2023).
Cochran, W. G. The estimation of the yields of the cereal experiments by sampling for the ratio of grain to total produce. J. Agricult. Sci. 30, 262–275 (1940).
Bahl, S. & Tuteja, R. K. Ratio and product type exponential estimators. J. Inf. Optim. Sci. 12(1), 159–164 (1991).
Lakhre, A. Dual to ratio and product type exponential estimators of finite population mean in double sampling for stratification. Int. J. Sci. Res. Math. Stat. Sci. 4(5), 1–8 (2017).
Lone, H. A., Tailor, R. & Verma, M. R. An alternative to ratio and product type estimators of finite population mean in double sampling for stratification. J. Indian Soc. Agricult. Stat. 74(1), 63–68 (2020).
Srivenkataramana, T. A dual of ratio estimator in sample surveys. Biometrika 67(1), 199–204 (1980).
Bandyopadhyay, S. Improved ratio and product estimators. Sankhya Series C 42(2), 45–49 (1980).
Singh, M. P. Ratio cum product method of estimation. Metrika 12, 34–42 (1967).
Murthy, M. N. Sampling Theory and Methods 228 (Statistical Publishing Society, 1967).
Singh, D. & Chaudhary, F. S. Theory and Analysis of Sample Survey Designs (New Age Intern. Pvt. Lmt., 1971).

Acknowledgements

The authors are grateful to the editor and reviewers for their valuable suggestions for improving this paper.

Author information

Authors and Affiliations

School of Studies in Statistics, Vikram University, Ujjain, M.P., 456010, India
Anurag Gupta, Rajesh Tailor & Nitu Barod

Contributions

A.G. and R.T. conceived the estimator and prepared the main body of the article. N.B. carried out the simulation study of the estimator. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Anurag Gupta.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Gupta, A., Tailor, R. & Barod, N. Improved exponential type ratio estimator in double sampling for stratification. Sci Rep 13, 22520 (2023). https://doi.org/10.1038/s41598-023-49772-0

Received: 29 August 2023
Accepted: 12 December 2023
Published: 18 December 2023
Version of record: 18 December 2023
DOI: https://doi.org/10.1038/s41598-023-49772-0