Efficient class of estimators for finite population mean using auxiliary attribute in stratified random sampling

Singh, Housila P.; Gupta, Anurag; Tailor, Rajesh

doi:10.1038/s41598-023-34603-z

Published: 24 June 2023

Scientific Reports volume 13, Article number: 10253 (2023)

Abstract

The aim of this paper is to develop more effective methods for estimating population means in sample surveys using auxiliary attributes. To achieve this goal, we introduce a modified version of the estimators proposed by Koyuncu (2013b) and Shahzad et al. (2019), as well as a new class of estimators. We derive expressions for the bias and mean squared error of these new estimators up to the first degree of approximation. Our results show that the suggested classes of estimators perform better than other existing methods, with the lowest mean squared error under optimal conditions. We also conduct an empirical investigation to support our findings.

Introduction

The use of an auxiliary attribute is a well-known method for improving the efficiency of an estimator of population parameters. Auxiliary attributes (say $\phi$) that are highly correlated with the study variable (y) are commonly encountered in practice. Examples include a person's height (y), the amount of milk produced by a cow (y), and the yield of a particular variety of wheat (y), which may depend on gender, breed of the cow, and type of wheat, respectively. For more examples, see Kendall and Stuart¹, Shabbir and Gupta², and Sharma and Singh³, among others. The estimation of the population mean of the study variable (y) using an auxiliary attribute $\left( \phi \right)$ under simple random sampling without replacement has been extensively studied. See, for instance, Naik and Gupta⁴, Jhajj et al.⁵, Solanki and Singh⁶, Singh et al.⁷, Gupta and Tailor⁸, and other relevant literature.

The simple random sampling scheme is commonly used when the population units are homogeneous. However, in many practical situations, the population units are heterogeneous, and to obtain a better estimate of the population parameters, we use stratified random sampling. Therefore, our objective is to estimate the population mean of the study variable (y) using information on an auxiliary attribute $\left( \phi \right)$ under stratified random sampling. Various authors, including Sharma and Singh³, Koyuncu^9,10, Shahzad et al.¹¹, Zaman^12,13, Hussain et al.^14,15, Zaman et al.^16,17, and Ahmad et al.^18,19, have discussed the problem of estimating the population mean of the study variable using an auxiliary attribute under stratified random sampling.

Shahzad et al.^20,21 used a calibration approach in stratified random sampling. Shahzad et al.²² also proposed an estimator of the coefficient of variation using calibrated estimators in stratified random sampling.

Notations

Consider a population of N units divided into L strata, with the hth stratum containing N_h units, where h = 1, 2, …, L, such that ${\sum }_{h=1}^{L}{N}_{h}=N$. A simple random sample of size n_h is drawn without replacement from the hth stratum such that ${\sum }_{h=1}^{L}{n}_{h}=n$. Let $\left({y}_{hi},{\phi }_{hi}\right)$ be the observed values of the study variable y and the auxiliary attribute $\phi$ on the ith unit of the hth stratum, where $i=1,2,\ldots,{N}_{h}$ and $h=1,2,\ldots,L$.

Further, let ${\overline{y}}_{h}={\sum }_{i=1}^{{n}_{h}}\frac{{y}_{hi}}{{n}_{h}}$ and ${\overline{y}}_{st}={\sum }_{h=1}^{L}{W}_{h}{\overline{y}}_{h}$ be unbiased estimators of the population means ${\overline{Y}}_{h}={\sum }_{i=1}^{{N}_{h}}\frac{{y}_{hi}}{{N}_{h}}$ and $\overline{Y}={\sum }_{h=1}^{L}{W}_{h}{\overline{Y}}_{h}$, respectively, where ${W}_{h}=\frac{{N}_{h}}{N}$ is the stratum weight.
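The stratified estimator defined above can be illustrated with a minimal sketch. The function name `stratified_mean` and the toy data are assumptions made for illustration only; they are not taken from the paper.

```python
from typing import Sequence


def stratified_mean(samples: Sequence[Sequence[float]],
                    stratum_sizes: Sequence[int]) -> float:
    """Return y_bar_st = sum_h W_h * y_bar_h with W_h = N_h / N."""
    if len(samples) != len(stratum_sizes):
        raise ValueError("one sample per stratum is required")
    N = sum(stratum_sizes)
    total = 0.0
    for y_h, N_h in zip(samples, stratum_sizes):
        y_bar_h = sum(y_h) / len(y_h)   # sample mean within stratum h
        total += (N_h / N) * y_bar_h    # weight by W_h = N_h / N
    return total


# Hypothetical toy data: two strata with N_1 = 60 and N_2 = 40.
y_st = stratified_mean([[2.0, 4.0], [10.0, 20.0, 30.0]], [60, 40])

# The same function gives p_st when the samples hold 0/1 attribute indicators.
p_st = stratified_mean([[1, 0], [1, 1, 0, 1]], [50, 50])
```

Applying the same routine to the 0/1 indicators $\phi_{hi}$ yields $p_{st}$, since each stratum proportion is simply the stratum mean of the indicator.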

We also assume that

$$\phi_{hi} = \left\{ \begin{array}{ll} 1, & \text{if the } i\text{th unit of the } h\text{th stratum possesses the attribute } \phi, \\ 0, & \text{otherwise.} \end{array} \right.$$

Let ${s}_{yh}^{2}={\sum }_{i=1}^{{n}_{h}}\frac{{\left({y}_{hi}-{\overline{y}}_{h}\right)}^{2}}{\left({n}_{h}-1\right)}$ and ${s}_{\phi h}^{2}={\sum }_{i=1}^{{n}_{h}}\frac{{\left({\phi }_{hi}-{p}_{h}\right)}^{2}}{\left({n}_{h}-1\right)}$ be the hth sample variances and ${S}_{yh}^{2}={\sum }_{i=1}^{{N}_{h}}\frac{{\left({y}_{hi}-{\overline{Y}}_{h}\right)}^{2}}{\left({N}_{h}-1\right)}$ and ${S}_{\phi h}^{2}={\sum }_{i=1}^{{N}_{h}}\frac{{\left({\phi }_{hi}-{P}_{h}\right)}^{2}}{\left({N}_{h}-1\right)}$ be the hth population variances of the study variable y and the auxiliary attribute $\phi$, respectively. Here ${p}_{h}={\sum }_{i=1}^{{n}_{h}}\frac{{\phi }_{hi}}{{n}_{h}}$ and ${P}_{h}={\sum }_{i=1}^{{N}_{h}}\frac{{\phi }_{hi}}{{N}_{h}}$ denote the sample and population proportions of units possessing the attribute in the hth stratum.

Further, ${s}_{y\phi h}={\sum }_{i=1}^{{n}_{h}}\frac{\left({y}_{hi}-{\overline{y}}_{h}\right)\left({\phi }_{hi}-{p}_{h}\right)}{\left({n}_{h}-1\right)}$ and ${\widehat{\rho }}_{y\phi h}=\frac{{s}_{y\phi h}}{{s}_{yh}{s}_{\phi h}}$ are the hth sample covariance and point bi-serial correlation, and ${S}_{y\phi h}={\sum }_{i=1}^{{N}_{h}}\frac{\left({y}_{hi}-{\overline{Y}}_{h}\right)\left({\phi }_{hi}-{P}_{h}\right)}{\left({N}_{h}-1\right)}$ and ${\rho }_{y\phi h}=\frac{{S}_{y\phi h}}{{S}_{yh}{S}_{\phi h}}$ are the hth population covariance and point bi-serial correlation between the study variable y and the auxiliary attribute $\phi$, respectively.

To derive the bias and mean squared error (MSE) of the estimators, we write

$$\overline{y}_{st} = \overline{Y}\left( {1 + e_{0} } \right),\;p_{st} = P\left( {1 + e_{1} } \right)$$

such that $E\left( {e_{0} } \right) = E\left( {e_{1} } \right) = 0$ and

$$E\left({e}_{0}^{2}\right)=\frac{1}{{\overline{Y}}^{2}}\sum \limits_{h=1}^{L}{W}_{h}^{2}{\gamma }_{h}{S}_{yh}^{2}={A}_{y},$$

$$E\left({e}_{1}^{2}\right)=\frac{1}{{P}^{2}}\sum \limits_{h=1}^{L}{W}_{h}^{2}{\gamma }_{h}{S}_{\phi h}^{2}={A}_{\phi },$$

$$E\left({e}_{0}{e}_{1}\right)=\frac{1}{\overline{Y}P}\sum \limits_{h=1}^{L}{W}_{h}^{2}{\gamma }_{h}{S}_{y\phi h}={A}_{y\phi },$$

where ${p}_{st}={\sum }_{h=1}^{L}{W}_{h}{p}_{h}$ such that $E\left({p}_{st}\right)=P={\sum }_{h=1}^{L}{W}_{h}{P}_{h}$ and ${\gamma }_{h}=\frac{{N}_{h}-{n}_{h}}{{n}_{h}{N}_{h}}$.
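The quantities $A_y$, $A_\phi$ and $A_{y\phi}$ drive every MSE expression that follows, so a short sketch of how they might be computed from stratum-level summaries may help. The `Stratum` container, the function name and the numbers are hypothetical; only the formulas come from the definitions above.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Stratum:
    N_h: int        # stratum population size
    n_h: int        # stratum sample size
    Ybar_h: float   # stratum population mean of y
    P_h: float      # stratum population proportion with the attribute
    S2_y: float     # S_yh^2
    S2_phi: float   # S_phi_h^2
    S_yphi: float   # S_y_phi_h


def second_order_terms(strata: Sequence[Stratum]) -> Tuple[float, float, float]:
    """Return (A_y, A_phi, A_yphi) = (E[e0^2], E[e1^2], E[e0*e1])."""
    N = sum(s.N_h for s in strata)
    W = [s.N_h / N for s in strata]
    Ybar = sum(w * s.Ybar_h for w, s in zip(W, strata))
    P = sum(w * s.P_h for w, s in zip(W, strata))
    A_y = A_phi = A_yphi = 0.0
    for w, s in zip(W, strata):
        gamma_h = (s.N_h - s.n_h) / (s.n_h * s.N_h)  # gamma_h as defined above
        A_y += w**2 * gamma_h * s.S2_y
        A_phi += w**2 * gamma_h * s.S2_phi
        A_yphi += w**2 * gamma_h * s.S_yphi
    return A_y / Ybar**2, A_phi / P**2, A_yphi / (Ybar * P)


# Hypothetical single-stratum check: N = 100, n = 10, so gamma = 0.09.
terms = second_order_terms([Stratum(100, 10, 10.0, 0.5, 4.0, 0.25, 0.5)])
```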

Reviewing some existing estimators

The conventional unbiased estimator for the population mean $\overline{Y}$ of the study variable y under stratified random sampling is given by

$${t}_{0}={\overline{y}}_{st}=\sum \limits_{h=1}^{L}{W}_{h}{\overline{y}}_{h}$$

(1)

The variance/MSE of the estimator $t_{0}$ is given by

$$MSE\left({t}_{0}\right)=\sum \limits_{h=1}^{L}{W}_{h}^{2}{\gamma }_{h}{S}_{yh}^{2}={\overline{Y}}^{2}{A}_{y} .$$

(2)

The ratio estimator for population mean $\overline{Y}$ using auxiliary attribute $\phi$ in stratified random sampling due to Naik and Gupta⁴ is given by

$${t}_{1}={\overline{y}}_{st}\left(\frac{P}{{p}_{st}}\right).$$

(3)

The MSE of the estimator $t_{1}$ up to the first degree of approximation (fda), is given by

$$MSE\left({t}_{1}\right)={\overline{Y}}^{2}\left[{A}_{y}+{A}_{\phi }\left(1-2C\right)\right],$$

(4)

where $C=\frac{{A}_{y\phi }}{{A}_{\phi }}$.

The stratified version of the ordinary product estimator for the population mean $\overline{Y}$ is defined by

$${t}_{2}={\overline{y}}_{st}\left(\frac{{p}_{st}}{P}\right).$$

(5)

The MSE of $t_{2}$ to the fda, is given by

$$MSE\left({t}_{2}\right)={\overline{Y}}^{2}\left[{A}_{y}+{A}_{\phi }\left(1+2C\right)\right].$$

(6)

The usual regression estimator for $\overline{Y}$ is given by

$${t}_{3}={\overline{y}}_{st}+{b}_{st}\left(P-{p}_{st}\right),$$

(7)

where $b_{st}$ is the sample regression coefficient of y on $\phi$.

To the fda the MSE of $t_{3}$ is given by

$$MSE\left({t}_{3}\right)={\overline{Y}}^{2}{A}_{y}\left(1-{\rho }_{y\phi }^{2}\right),$$

(8)

where ${\rho }_{y\phi }=\frac{{A}_{y\phi }}{\sqrt{{A}_{y}{A}_{\phi }}}$.
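As a hedged illustration of Eqs. (2), (4), (6) and (8), the sketch below evaluates the relative MSEs (MSE divided by $\overline{Y}^2$) of $t_0$ to $t_3$. The function name and the input values are assumptions chosen for illustration.

```python
def relative_mses(A_y: float, A_phi: float, A_yphi: float) -> dict:
    """Relative MSEs (MSE / Ybar^2) of t0..t3 to the first degree of approximation."""
    C = A_yphi / A_phi                      # C = A_yphi / A_phi
    rho2 = A_yphi**2 / (A_y * A_phi)        # squared rho_yphi
    return {
        "t0": A_y,                          # Eq. (2)
        "t1": A_y + A_phi * (1 - 2 * C),    # Eq. (4), ratio estimator
        "t2": A_y + A_phi * (1 + 2 * C),    # Eq. (6), product estimator
        "t3": A_y * (1 - rho2),             # Eq. (8), regression estimator
    }


# Hypothetical inputs, not taken from the paper's data.
mses = relative_mses(A_y=0.0036, A_phi=0.09, A_yphi=0.009)
```

With these inputs $C = 0.1 < 1/2$, so the ratio estimator is less efficient than $\overline{y}_{st}$; from (2) and (4), $t_1$ improves on $t_0$ only when $C > 1/2$.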

Koyuncu⁹ suggested a class of estimators for $\overline{Y}$ given by

$${t}_{4}=\left[{w}_{1}{\overline{y}}_{st}+{w}_{2}\left(P-{p}_{st}\right)\right]\left\{\frac{{a}_{st}P+{b}_{st}}{{a}_{st}{p}_{st}+{b}_{st}}\right\},$$

(9)

where $\left({w}_{1},{w}_{2}\right)$ are suitable constants, and ${a}_{st}\left(\ne 0\right)$ and ${b}_{st}$ are either real numbers or functions of known stratum-level parameters of the auxiliary attribute $\phi$, such as the standard deviation ${S}_{\phi \left(st\right)}={\sum }_{h=1}^{L}{W}_{h}{\sigma }_{\phi h}$, the coefficient of variation ${C}_{\phi \left(st\right)}={\sum }_{h=1}^{L}{W}_{h}{C}_{\phi h}$ with ${C}_{\phi h}=\frac{{S}_{\phi h}}{{P}_{h}}$, the skewness ${\beta }_{1\left(\phi \right)st}={\sum }_{h=1}^{L}{W}_{h}{\beta }_{1h}\left(\phi \right)$, the kurtosis ${\beta }_{2\left(\phi \right)st}={\sum }_{h=1}^{L}{W}_{h}{\beta }_{2h}\left(\phi \right)$ and the correlation coefficient ${\rho }_{\left(y\phi \right)st}={\sum }_{h=1}^{L}{W}_{h}{\rho }_{y\phi h}$, where ${\beta }_{1h}\left(\phi \right)=\frac{{\mu }_{3h}^{2}\left(\phi \right)}{{\mu }_{2h}^{3}\left(\phi \right)}$, ${\beta }_{2h}\left(\phi \right)=\frac{{\mu }_{4h}\left(\phi \right)}{{\mu }_{2h}^{2}\left(\phi \right)}$, ${\sigma }_{\phi h}^{2}={\mu }_{2h}\left(\phi \right)=\frac{1}{{N}_{h}}{\sum }_{i=1}^{{N}_{h}}{\left({\phi }_{hi}-{P}_{h}\right)}^{2}$, ${\mu }_{3h}\left(\phi \right)=\frac{1}{{N}_{h}}{\sum }_{i=1}^{{N}_{h}}{\left({\phi }_{hi}-{P}_{h}\right)}^{3}$ and ${\mu }_{4h}\left(\phi \right)=\frac{1}{{N}_{h}}{\sum }_{i=1}^{{N}_{h}}{\left({\phi }_{hi}-{P}_{h}\right)}^{4}$.

The MSE of the estimator t₄ to the fda is given by

$$MSE\left({t}_{4}\right)={\overline{Y}}^{2}\left[1+{w}_{1}^{2}{B}_{1}+{w}_{2}^{2}{B}_{2}+2{w}_{1}{w}_{2}{B}_{3}-2{w}_{1}{B}_{4}-2{w}_{2}{B}_{5}\right],$$

(10)

where ${B}_{1}=\left[1+{A}_{y}+{\upsilon }_{st}{A}_{\phi }\left(3{\upsilon }_{st}-4C\right)\right]$, ${B}_{2}=\frac{{A}_{\phi }}{{R}^{2}},$ ${B}_{3}=\left(\frac{{A}_{\phi }}{R}\right)\left(2{\upsilon }_{st}-C\right),$ ${B}_{4}=\left[1+{\upsilon }_{st}{A}_{\phi }\left({\upsilon }_{st}-C\right)\right],$ ${B}_{5}=\left(\frac{{A}_{\phi }}{R}\right){\upsilon }_{st}$,$R=\frac{\overline{Y}}{P}$ and ${\upsilon }_{st}=\frac{{a}_{st}P}{\left({a}_{st}P+{b}_{st}\right)}$.

The MSE of t₄ at (10) is minimized for

$$\left.\begin{array}{c}{w}_{1}=\frac{\left({B}_{2}{B}_{4}-{B}_{3}{B}_{5}\right)}{\left({B}_{1}{B}_{2}-{B}_{3}^{2}\right)}\\ {w}_{2}=\frac{\left({B}_{1}{B}_{5}-{B}_{3}{B}_{4}\right)}{\left({B}_{1}{B}_{2}-{B}_{3}^{2}\right)}\end{array}\right\}.$$

(11)

Therefore, the resulting minimum MSE of t₄ is given by

$$\begin{aligned} MSE_{\min } \left( {t_{4} } \right) & = \overline{Y}^{2} \left[ {1 - \frac{{B_{2} B_{4}^{2} - 2B_{3} B_{4} B_{5} + B_{1} B_{5}^{2} }}{{B_{1} B_{2} - B_{3}^{2} }}} \right] \\ & = \overline{Y}^{2} \left[ {1 - \frac{{A_{\phi } - \upsilon_{st}^{2} A_{\phi } A_{y\phi }^{2} - \upsilon_{st}^{2} A_{\phi }^{2} + \upsilon_{st}^{2} A_{\phi }^{2} A_{y} }}{{A_{\phi } + A_{y} A_{\phi } - \upsilon_{st}^{2} A_{\phi }^{2} - A_{y\phi }^{2} }}} \right]. \end{aligned}$$

(12)
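Equations (10) to (12) share a pattern that recurs for every class in this paper: a quadratic in $(w_1, w_2)$ with five constants. The sketch below, with hypothetical function names and inputs, computes the optimum weights of (11) and checks numerically that both lines of (12) agree.

```python
def koyuncu_constants(A_y, A_phi, A_yphi, R, v):
    """B_1..B_5 of Eq. (10); v stands for upsilon_st."""
    C = A_yphi / A_phi
    B1 = 1 + A_y + v * A_phi * (3 * v - 4 * C)
    B2 = A_phi / R**2
    B3 = (A_phi / R) * (2 * v - C)
    B4 = 1 + v * A_phi * (v - C)
    B5 = (A_phi / R) * v
    return B1, B2, B3, B4, B5


def relative_mse(w1, w2, Q):
    """Bracketed term of Eq. (10), i.e. MSE / Ybar^2, for given weights."""
    Q1, Q2, Q3, Q4, Q5 = Q
    return 1 + w1**2 * Q1 + w2**2 * Q2 + 2 * w1 * w2 * Q3 - 2 * w1 * Q4 - 2 * w2 * Q5


def optimum(Q):
    """Optimum weights, Eq. (11), and minimum relative MSE, first line of Eq. (12)."""
    Q1, Q2, Q3, Q4, Q5 = Q
    det = Q1 * Q2 - Q3**2
    if det <= 0:
        raise ValueError("quadratic form is not positive definite")
    w1 = (Q2 * Q4 - Q3 * Q5) / det
    w2 = (Q1 * Q5 - Q3 * Q4) / det
    m_min = 1 - (Q2 * Q4**2 - 2 * Q3 * Q4 * Q5 + Q1 * Q5**2) / det
    return w1, w2, m_min


# Hypothetical inputs (not the paper's data): R = Ybar / P = 10 / 0.5.
A_y, A_phi, A_yphi, R, v = 0.0036, 0.09, 0.009, 20.0, 1.0
Q = koyuncu_constants(A_y, A_phi, A_yphi, R, v)
w1, w2, m_min = optimum(Q)

# Second line of Eq. (12), written directly in terms of A_y, A_phi, A_yphi.
closed_form = 1 - (
    A_phi - v**2 * A_phi * A_yphi**2 - v**2 * A_phi**2 + v**2 * A_phi**2 * A_y
) / (A_phi + A_y * A_phi - v**2 * A_phi**2 - A_yphi**2)
```

The same `optimum` helper applies unchanged to the constants of the other classes, since (24), (31), (36) and (41) have the same quadratic form.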

Using information on auxiliary attribute $\phi$, Sharma and Singh³ proposed the following exponential type estimators for $\overline{Y}$ as

$${t}_{1e}={\overline{y}}_{st}\mathit{exp}\left(\frac{P-{p}_{st}}{P+{p}_{st}}\right),\; (\mathrm{ratio \; type \; exponential \; estimator})$$

(13)

$${t}_{2e}={\overline{y}}_{st}\mathit{exp}\left(\frac{{p}_{st}-P}{{p}_{st}+P}\right), \; (\text{product-type exponential estimator})$$

(14)

$${t}_{\alpha e}={\overline{y}}_{st}\mathit{exp}\left\{\frac{\alpha \left(P-{p}_{st}\right)}{\left(P+{p}_{st}\right)}\right\},$$

(15)

where $\alpha$ is a suitably chosen constant.

To the fda, the MSEs of the estimators $t_{1e}$, $t_{2e}$ and $t_{\alpha e}$ are respectively given by

$$MSE\left( {t_{1e} } \right) = \overline{Y}^{2} \left[ {A_{y} + \frac{{A_{\phi } }}{4}\left( {1 - 4C} \right)} \right],$$

(16)

$$MSE\left( {t_{2e} } \right) = \overline{Y}^{2} \left[ {A_{y} + \frac{{A_{\phi } }}{4}\left( {1 + 4C} \right)} \right],$$

(17)

$$MSE\left( {t_{\alpha e} } \right) = \overline{Y}^{2} \left[ {A_{y} + \frac{{\alpha A_{\phi } }}{4}\left( {\alpha - 4C} \right)} \right].$$

(18)

The $MSE\left({t}_{\alpha e}\right)$ is minimum when

$$\alpha =2C.$$

(19)

This yields the minimum MSE of ${t}_{\alpha e}$ as

$$MSE_{\min } \left( {t_{\alpha e} } \right) = \overline{Y}^{2} A_{y} \left( {1 - \rho_{y\phi }^{2} } \right).$$

(20)
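A small numerical sketch can confirm the steps from (18) to (20): $\alpha = 1$ and $\alpha = -1$ recover (16) and (17), and $\alpha = 2C$ reproduces the regression-type minimum. Function names and inputs are illustrative assumptions.

```python
def rel_mse_exp(alpha: float, A_y: float, A_phi: float, A_yphi: float) -> float:
    """Relative MSE (MSE / Ybar^2) of t_alpha_e, Eq. (18)."""
    C = A_yphi / A_phi
    return A_y + (alpha * A_phi / 4) * (alpha - 4 * C)


def optimal_alpha(A_phi: float, A_yphi: float) -> float:
    """Minimizer of Eq. (18), i.e. alpha = 2C as in Eq. (19)."""
    return 2 * A_yphi / A_phi


# Hypothetical inputs, not taken from the paper's data.
A_y, A_phi, A_yphi = 0.0036, 0.09, 0.009
C = A_yphi / A_phi
rho2 = A_yphi**2 / (A_y * A_phi)

alpha_opt = optimal_alpha(A_phi, A_yphi)
mse_opt = rel_mse_exp(alpha_opt, A_y, A_phi, A_yphi)
mse_t1e = rel_mse_exp(1.0, A_y, A_phi, A_yphi)    # ratio-type, Eq. (16)
mse_t2e = rel_mse_exp(-1.0, A_y, A_phi, A_yphi)   # product-type, Eq. (17)
```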

Sharma and Singh³ proposed the following class of estimators for population mean $\overline{Y}$ as

$${t}_{5}=\left[{w}_{1}{\overline{y}}_{st}+{w}_{2}\left(P-{p}_{st}\right)\right]\mathit{exp}\left\{\frac{{a}_{st}\left(P-{p}_{st}\right)}{{a}_{st}\left(P+{p}_{st}\right)+2{b}_{st}}\right\},$$

(21)

where $\left({a}_{st},{b}_{st}\right)$ are the same as defined for the class of estimators ${t}_{4}$ at (9) and $\left({w}_{1},{w}_{2}\right)$ are suitably chosen constants to be determined such that the MSE of ${t}_{5}$ is minimum.

If we set ${a}_{st}=1$ and ${b}_{st}=NP$ in (21), then the class of estimators ${t}_{5}$ reduces to the Shahzad et al.¹¹ class of estimators for $\overline{Y}$ as

$${t}_{6}=\left[{w}_{1}{\overline{y}}_{st}+{w}_{2}\left(P-{p}_{st}\right)\right]\mathit{exp}\left\{\frac{\left(P-{p}_{st}\right)}{P+{p}_{st}+2NP}\right\}.$$

(22)

We note that the expressions of bias and MSE of the class of estimators ${t}_{5}$ derived by Sharma and Singh³ [Eqs. (4.6) and (4.7), p. 1789] are not correct. The correct expressions of bias and MSE of the estimator ${t}_{5}$ to the fda are respectively given by

$$B\left({t}_{5}\right)=\overline{Y}\left({w}_{1}{A}_{4}+{w}_{2}{A}_{5}-1\right),$$

(23)

and

$$MSE\left({t}_{5}\right)={\overline{Y}}^{2}\left[1+{w}_{1}^{2}{A}_{1}+{w}_{2}^{2}{A}_{2}+2{w}_{1}{w}_{2}{A}_{3}-2{w}_{1}{A}_{4}-2{w}_{2}{A}_{5}\right],$$

(24)

where ${A}_{1}=\left[1+{A}_{y}+{\upsilon }_{st}{A}_{\phi }\left({\upsilon }_{st}-2C\right)\right]$, ${A}_{2}=\frac{{A}_{\phi }}{{R}^{2}}$, ${A}_{3}=\left(\frac{1}{R}\right){A}_{\phi }\left({\upsilon }_{st}-C\right)$, ${A}_{4}=\left[1+\frac{{\upsilon }_{st}{A}_{\phi }}{8}\left(3{\upsilon }_{st}-4C\right)\right]$, ${A}_{5}=\frac{{\upsilon }_{st}{A}_{\phi }}{2R}$.

The MSE of t₅ at (24) is minimized for

$$\left.\begin{array}{c}{w}_{1}=\frac{{A}_{2}{A}_{4}-{A}_{3}{A}_{5}}{{A}_{1}{A}_{2}-{A}_{3}^{2}}\\ {w}_{2}=\frac{{A}_{1}{A}_{5}-{A}_{3}{A}_{4}}{{A}_{1}{A}_{2}-{A}_{3}^{2}}\end{array}\right\}.$$

(25)

Therefore, the minimum MSE of t₅ is given by

$$MSE_{\min } \left( {t_{5} } \right) = \overline{Y}^{2} \left[ {1 - \frac{{A_{2} A_{4}^{2} - 2A_{3} A_{4} A_{5} + A_{1} A_{5}^{2} }}{{A_{1} A_{2} - A_{3}^{2} }}} \right]$$

(26)

Putting ${a}_{st}=1$ and ${b}_{st}=NP$ in (24), we get the MSE of $t_{6}$ to the fda as

$$MSE\left({t}_{6}\right)={\overline{Y}}^{2}\left[1+{w}_{1}^{2}{A}_{1\left(1\right)}+{w}_{2}^{2}{A}_{2\left(1\right)}+2{w}_{1}{w}_{2}{A}_{3\left(1\right)}-2{w}_{1}{A}_{4\left(1\right)}-2{w}_{2}{A}_{5\left(1\right)}\right],$$

(27)

where ${A}_{1\left(1\right)}=\left[1+{A}_{y}+{\upsilon }_{st\left(1\right)}{A}_{\phi }\left({\upsilon }_{st\left(1\right)}-2C\right)\right]$, ${A}_{2\left(1\right)}=\frac{{A}_{\phi }}{{R}^{2}}$, ${A}_{3\left(1\right)}=\left(\frac{1}{R}\right){A}_{\phi }\left({\upsilon }_{st\left(1\right)}-C\right)$, ${A}_{4\left(1\right)}=\left[1+\frac{{\upsilon }_{st\left(1\right)}{A}_{\phi }}{8}\left(3{\upsilon }_{st\left(1\right)}-4C\right)\right]$, ${A}_{5\left(1\right)}=\frac{{\upsilon }_{st\left(1\right)}{A}_{\phi }}{2R}$,${\upsilon }_{st\left(1\right)}=\frac{1}{\left(N+1\right)}$.

The MSE of t₆ at (27) is minimum when

$$\left.\begin{array}{c}{w}_{1}=\frac{{A}_{2\left(1\right)}{A}_{4\left(1\right)}-{A}_{3\left(1\right)}{A}_{5\left(1\right)}}{{A}_{1\left(1\right)}{A}_{2\left(1\right)}-{A}_{3\left(1\right)}^{2}}\\ {w}_{2}=\frac{{A}_{1\left(1\right)}{A}_{5\left(1\right)}-{A}_{3\left(1\right)}{A}_{4\left(1\right)}}{{A}_{1\left(1\right)}{A}_{2\left(1\right)}-{A}_{3\left(1\right)}^{2}}\end{array}\right\}.$$

(28)

Therefore, the minimum MSE of t₆ is given by

$$MSE_{\min } \left( {t_{6} } \right) = \overline{Y}^{2} \left[ {1 - \frac{{A_{2\left( 1 \right)} A_{4\left( 1 \right)}^{2} - 2A_{3\left( 1 \right)} A_{4\left( 1 \right)} A_{5\left( 1 \right)} + A_{1\left( 1 \right)} A_{5\left( 1 \right)}^{2} }}{{A_{1\left( 1 \right)} A_{2\left( 1 \right)} - A_{3\left( 1 \right)}^{2} }}} \right].$$

(29)
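To make the reduction from $t_5$ to $t_6$ concrete, the following sketch evaluates $\upsilon_{st}$ for $a_{st}=1$, $b_{st}=NP$ and plugs it into the constants of (24) and the minimum of (26)/(29). All names and input values are hypothetical.

```python
def upsilon(a_st: float, b_st: float, P: float) -> float:
    """upsilon_st = a_st * P / (a_st * P + b_st)."""
    return a_st * P / (a_st * P + b_st)


def t5_constants(A_y, A_phi, A_yphi, R, v):
    """A_1..A_5 of Eq. (24); v stands for upsilon_st."""
    C = A_yphi / A_phi
    A1 = 1 + A_y + v * A_phi * (v - 2 * C)
    A2 = A_phi / R**2
    A3 = (A_phi / R) * (v - C)
    A4 = 1 + (v * A_phi / 8) * (3 * v - 4 * C)
    A5 = v * A_phi / (2 * R)
    return A1, A2, A3, A4, A5


def min_relative_mse(Q):
    """Bracketed term of Eq. (26) for constants Q = (Q1, ..., Q5)."""
    Q1, Q2, Q3, Q4, Q5 = Q
    det = Q1 * Q2 - Q3**2
    return 1 - (Q2 * Q4**2 - 2 * Q3 * Q4 * Q5 + Q1 * Q5**2) / det


# Hypothetical inputs: N units in total, P = 0.5, Ybar = 10.
N, P, Ybar = 923, 0.5, 10.0
A_y, A_phi, A_yphi = 0.0036, 0.09, 0.009
rho2 = A_yphi**2 / (A_y * A_phi)

v6 = upsilon(1.0, N * P, P)   # a_st = 1, b_st = N * P gives 1 / (N + 1)
m6 = min_relative_mse(t5_constants(A_y, A_phi, A_yphi, Ybar / P, v6))
```

For these illustrative inputs the minimum relative MSE of $t_6$ falls slightly below the regression value $A_y(1-\rho_{y\phi}^2)$.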

Koyuncu¹⁰ and Shahzad et al.¹¹ proposed the following class of estimators for $\overline{Y}$ as

$${t}_{7}=\left[{w}_{1}{\overline{y}}_{st}+{w}_{2}{\left(\frac{{p}_{st}}{P}\right)}^{\gamma }\right]\mathit{exp}\left\{\frac{{a}_{st}\left(P-{p}_{st}\right)}{{a}_{st}\left(P+{p}_{st}\right)+2{b}_{st}}\right\},$$

(30)

where $\left({w}_{1},{w}_{2},\gamma \right)$ are suitably chosen constants and $\left({a}_{st},{b}_{st}\right)$ are the same as defined earlier.

To the fda, the MSE of t₇ is given by

$$MSE\left({t}_{7}\right)={\overline{Y}}^{2}\left[1+{w}_{1}^{2}{C}_{1}+{w}_{2}^{2}{C}_{2}+2{w}_{1}{w}_{2}{C}_{3}-2{w}_{1}{C}_{4}-2{w}_{2}{C}_{5}\right],$$

(31)

where ${C}_{1}=\left[1+{A}_{y}+{\upsilon }_{st}{A}_{\phi }\left({\upsilon }_{st}-2C\right)\right]$, ${C}_{2}=\frac{1}{{P}^{2}{R}^{2}}\left[1+{A}_{\phi }\left\{{\gamma }^{2}+{\upsilon }_{st}^{2}-2\gamma {\upsilon }_{st}+\gamma \left(\gamma -1\right)\right\}\right]$, ${C}_{3}=\left(\frac{1}{PR}\right)\left[1+{A}_{\phi }\left\{\left({\upsilon }_{st}^{2}+\frac{\gamma \left(\gamma -1\right)}{2}-{\upsilon }_{st}\gamma \right)+\left(\gamma -{\upsilon }_{st}\right)C\right\}\right]$, ${C}_{4}=\left[1+\frac{{\upsilon }_{st}{A}_{\phi }}{8}\left(3{\upsilon }_{st}-4C\right)\right]$, ${C}_{5}=\frac{1}{PR}\left[1+\left\{\frac{\gamma \left(\gamma -1\right)}{2}-\frac{\gamma {\upsilon }_{st}}{2}+\frac{3}{8}{\upsilon }_{st}^{2}\right\}{A}_{\phi }\right]$.

$MSE\left({t}_{7}\right)$ at (31) is minimized for

$$\left.\begin{array}{c}{w}_{1}=\frac{{C}_{2}{C}_{4}-{C}_{3}{C}_{5}}{{C}_{1}{C}_{2}-{C}_{3}^{2}}\\ {w}_{2}=\frac{{C}_{1}{C}_{5}-{C}_{3}{C}_{4}}{{C}_{1}{C}_{2}-{C}_{3}^{2}}\end{array}\right\}.$$

(32)

Therefore, the minimum MSE of t₇ is given by

$$MSE_{\min } \left( {t_{7} } \right) = \overline{Y}^{2} \left[ {1 - \frac{{C_{2} C_{4}^{2} - 2C_{3} C_{4} C_{5} + C_{1} C_{5}^{2} }}{{C_{1} C_{2} - C_{3}^{2} }}} \right]$$

(33)

In this paper we suggest a class of estimators for the population mean $\overline{Y}$ of the study variable y using the auxiliary attribute $\phi$. Expressions for the bias and MSE of the proposed class of estimators are obtained up to terms of order $O\left(n^{-1}\right)$.

We have obtained the optimum condition under which the MSE of the proposed class of estimators is minimum. We have derived the conditions under which the suggested class of estimators is more efficient than the conventional estimator and the estimators due to Naik and Gupta⁴, Koyuncu⁹, Sharma and Singh³ and Shahzad et al.¹¹. Numerical illustration is given in support of the proposed study.

Suggested class of estimators

We note that the exponential part of (30) uses the transformation $\left({a}_{st}{p}_{st}+{b}_{st}\right)$, for which $E\left\{{a}_{st}{p}_{st}+{b}_{st}\right\}=\left({a}_{st}P+{b}_{st}\right)$, whereas the coefficient of ${w}_{2}$ in the first bracket, ${\left(\frac{{p}_{st}}{P}\right)}^{\gamma }$, does not use this transformation. The authors are therefore of the opinion that the coefficient of ${w}_{2}$ should be ${\left(\frac{{a}_{st}{p}_{st}+{b}_{st}}{{a}_{st}P+{b}_{st}}\right)}^{\gamma }$. Hence the modified class of estimators for $\overline{Y}$ is given by

$$t_{7\left( m \right)} = \left[ {w_{1} \overline{y}_{st} + w_{2} \left( {\frac{{a_{st} p_{st} + b_{st} }}{{a_{st} P + b_{st} }}} \right)^{\gamma } } \right]\exp \left\{ {\frac{{a_{st} \left( {P - p_{st} } \right)}}{{a_{st} \left( {P + p_{st} } \right) + 2b_{st} }}} \right\}.$$

(34)

where $\left( {w_{1} ,w_{2} } \right)$ are suitably chosen constants to be determined such that the MSE of $t_{7\left( m \right)}$ is minimum, and $\left( {a_{st} ,b_{st} ,\gamma } \right)$ are the same as defined earlier.

To the fda, the bias and MSE of $t_{7\left( m \right)}$ are respectively given by

$$B\left( {t_{7\left( m \right)} } \right) = \overline{Y}\left( {w_{1} D_{4} + w_{2} D_{5} - 1} \right),$$

(35)

and

$$MSE\left( {t_{7\left( m \right)} } \right) = \overline{Y}^{2} \left[ {1 + w_{1}^{2} D_{1} + w_{2}^{2} D_{2} + 2w_{1} w_{2} D_{3} - 2w_{1} D_{4} - 2w_{2} D_{5} } \right],$$

(36)

where $D_{1} = \left[ {1 + A_{y} + \upsilon_{st} A_{\phi } \left( {\upsilon_{st} - 2C} \right)} \right]$, $D_{2} = \frac{1}{{R^{2} P^{2} }}\left[ {1 + \upsilon_{st}^{2} \theta \left( {2\theta - 1} \right)A_{\phi } } \right]$, $D_{3} = \left( \frac{1}{RP} \right)\left[ {1 + \frac{{\upsilon_{st} \left( {2\theta - 1} \right)}}{8}A_{\phi } \left( {2\theta + 4C - 3} \right)} \right]$, $D_{4} = \left[ {1 + \frac{{\upsilon_{st} }}{8}A_{\phi } \left( {3\upsilon_{st} - 4C} \right)} \right]$, $D_{5} = \frac{1}{RP}\left[ {1 + \frac{{\upsilon_{st}^{2} \theta \left( {\theta - 1} \right)}}{2}A_{\phi } } \right]$, $\theta = \frac{{\left( {2\gamma - 1} \right)}}{2}$.

The $MSE\left( {t_{7\left( m \right)} } \right)$ at (36) is minimized for

$$\left. \begin{gathered} w_{1} = \frac{{\left( {D_{2} D_{4} - D_{3} D_{5} } \right)}}{{\left( {D_{1} D_{2} - D_{3}^{2} } \right)}} \hfill \\ w_{2} = \frac{{\left( {D_{1} D_{5} - D_{3} D_{4} } \right)}}{{\left( {D_{1} D_{2} - D_{3}^{2} } \right)}} \hfill \\ \end{gathered} \right\}\,.$$

(37)

Therefore, the minimum MSE of $t_{7\left( m \right)}$ is given by

$$MSE_{\min } \left( {t_{7\left( m \right)} } \right) = \overline{Y}^{2} \left[ {1 - \frac{{\left( {D_{2} D_{4}^{2} - 2D_{3} D_{4} D_{5} + D_{1} D_{5}^{2} } \right)}}{{\left( {D_{1} D_{2} - D_{3}^{2} } \right)}}} \right]$$

(38)

We state this result as the following theorem.

Theorem 2.1

The MSE of $t_{7\left( m \right)}$ is greater than or equal to the minimum MSE of $t_{7\left( m \right)}$.

$$MSE\left( {t_{7\left( m \right)} } \right) \ge MSE_{\min } \left( {t_{7\left( m \right)} } \right)$$
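The theorem rests on the fact that (36) is quadratic in $(w_1, w_2)$: completing the square gives $MSE - MSE_{\min} = \overline{Y}^{2}\left[D_1\Delta_1^2 + 2D_3\Delta_1\Delta_2 + D_2\Delta_2^2\right]$ with $\Delta_j = w_j - w_j^{opt}$, which is non-negative whenever $D_1 > 0$ and $D_1D_2 - D_3^2 > 0$. The sketch below checks this identity numerically; the constants are arbitrary hypothetical values, not derived from any data set.

```python
import random


def relative_mse(w1, w2, D):
    """Bracketed term of Eq. (36) for weights (w1, w2)."""
    D1, D2, D3, D4, D5 = D
    return 1 + w1**2 * D1 + w2**2 * D2 + 2 * w1 * w2 * D3 - 2 * w1 * D4 - 2 * w2 * D5


def optimum(D):
    """Optimum weights, Eq. (37), and minimum relative MSE, Eq. (38)."""
    D1, D2, D3, D4, D5 = D
    det = D1 * D2 - D3**2
    w1 = (D2 * D4 - D3 * D5) / det
    w2 = (D1 * D5 - D3 * D4) / det
    m_min = 1 - (D2 * D4**2 - 2 * D3 * D4 * D5 + D1 * D5**2) / det
    return w1, w2, m_min


random.seed(0)
D = (1.2, 0.9, 0.8, 1.05, 0.7)   # hypothetical constants with D1*D2 - D3^2 > 0
w1_opt, w2_opt, m_min = optimum(D)

checks = []
for _ in range(1000):
    w1, w2 = random.uniform(-3, 3), random.uniform(-3, 3)
    gap = relative_mse(w1, w2, D) - m_min
    d1, d2 = w1 - w1_opt, w2 - w2_opt
    quad = D[0] * d1**2 + 2 * D[2] * d1 * d2 + D[1] * d2**2  # completed square
    checks.append((gap, quad))
```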

An alternative class of estimators

We propose another class of estimators for the population mean $\overline{Y}$ as

$$t_{8} = \left[ {w_{1} \overline{y}_{st} + w_{2} \exp \left\{ {\frac{{\delta a_{st} \left( {P - p_{st} } \right)}}{{a_{st} \left( {P + p_{st} } \right) + 2b_{st} }}} \right\}} \right]\left( {\frac{{a_{st} P + b_{st} }}{{a_{st} p_{st} + b_{st} }}} \right)^{\eta } .$$

(39)

where $\left( {w_{1} ,w_{2} ,a_{st} ,b_{st} } \right)$ are the same as defined earlier and $\left( {\delta ,\eta } \right)$ are constants that take real values such as −1, 0 and 1.

To the fda, the bias and MSE of $t_{8}$ are respectively given by

$$B\left( {t_{8} } \right) = \overline{Y}\left( {w_{1} E_{4} + w_{2} E_{5} - 1} \right),$$

(40)

and

$$MSE\left( {t_{8} } \right) = \overline{Y}^{2} \left[ {1 + w_{1}^{2} E_{1} + w_{2}^{2} E_{2} + 2w_{1} w_{2} E_{3} - 2w_{1} E_{4} - 2w_{2} E_{5} } \right],$$

(41)

where $E_{1} = \left[ {1 + A_{y} - 4\eta \upsilon_{st} A_{y\phi } + \eta \left( {2\eta + 1} \right)\upsilon_{st}^{2} A_{\phi } } \right]$, $E_{2} = \frac{1}{{R^{2} P^{2} }}\left[ {1 + \theta \left( {2\theta + 1} \right)\upsilon_{st}^{2} A_{\phi } } \right]$, $E_{3} = \left( \frac{1}{RP} \right)\left[ {1 + \frac{{\left( {\eta + \theta } \right)\left( {\eta + \theta + 1} \right)}}{2}\upsilon_{st}^{2} A_{\phi } - \left( {\eta + \theta } \right)\upsilon_{st} A_{y\phi } } \right]$, $E_{4} = \left[ {1 + \frac{{\eta \upsilon_{st} }}{2}\left\{ {\frac{{\left( {\eta + 1} \right)}}{2}\upsilon_{st} A_{\phi } - 2A_{y\phi } } \right\}} \right]$, $E_{5} = \frac{1}{RP}\left[ {1 + \frac{{\theta \left( {\theta + 1} \right)}}{2}\upsilon_{st}^{2} A_{\phi } } \right]$.

The $MSE\left( {t_{8} } \right)$ at (41) is minimized for

$$\left. \begin{gathered} w_{1} = \frac{{\left( {E_{2} E_{4} - E_{3} E_{5} } \right)}}{{\left( {E_{1} E_{2} - E_{3}^{2} } \right)}} \hfill \\ w_{2} = \frac{{\left( {E_{1} E_{5} - E_{3} E_{4} } \right)}}{{\left( {E_{1} E_{2} - E_{3}^{2} } \right)}} \hfill \\ \end{gathered} \right\}\,.$$

(42)

Substituting (42) in (41) gives the minimum MSE of $t_{8}$ as

$$MSE_{\min } \left( {t_{8} } \right) = \overline{Y}^{2} \left[ {1 - \frac{{\left( {E_{2} E_{4}^{2} - 2E_{3} E_{4} E_{5} + E_{1} E_{5}^{2} } \right)}}{{\left( {E_{1} E_{2} - E_{3}^{2} } \right)}}} \right]$$

(43)

Now we have the following theorem.

Theorem 3.1

The MSE of $t_{8}$ is greater than or equal to the minimum MSE of $t_{8}$.

$$MSE\left( {t_{8} } \right) \ge MSE_{\min } \left( {t_{8} } \right)$$

Efficiency comparison

From (2), (4), (6), (8), (16) and (17) we have

$$MSE\left( {t_{0} = \overline{y}_{st} } \right) - MSE\left( {t_{3} } \right) = \overline{Y}^{2} A_{y} \rho_{y\phi }^{2} \ge 0,$$

(44)

$$MSE\left( {t_{1} } \right) - MSE\left( {t_{3} } \right) = \overline{Y}^{2} A_{\phi } \left( {1 - C} \right)^{2} \ge 0,$$

(45)

$$MSE\left( {t_{2} } \right) - MSE\left( {t_{3} } \right) = \overline{Y}^{2} A_{\phi } \left( {1 + C} \right)^{2} \ge 0,$$

(46)

$$MSE\left( {t_{1e} } \right) - MSE\left( {t_{3} } \right) = \overline{Y}^{2} \frac{{A_{\phi } }}{4}\left( {1 - 2C} \right)^{2} \ge 0,$$

(47)

$$MSE\left( {t_{2e} } \right) - MSE\left( {t_{3} } \right) = \overline{Y}^{2} \frac{{A_{\phi } }}{4}\left( {1 + 2C} \right)^{2} \ge 0,$$

(48)

It follows from (44) to (48) that the regression estimator $t_{3}$ is more efficient than $\overline{y}_{st}$, $t_{1}$, $t_{2}$, $t_{1e}$ and $t_{2e}$.
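The differences in (44) to (48) can be verified numerically by subtracting the MSE expressions directly and comparing with the stated closed forms. The sketch below does this in relative terms (dividing by $\overline{Y}^2$); the function name and inputs are illustrative assumptions.

```python
def gaps_to_regression(A_y: float, A_phi: float, A_yphi: float):
    """Return (direct, closed): relative MSE differences against t3, Eqs. (44)-(48)."""
    C = A_yphi / A_phi
    rho2 = A_yphi**2 / (A_y * A_phi)
    t3 = A_y * (1 - rho2)
    direct = {   # MSE(t) - MSE(t3), from Eqs. (2), (4), (6), (16), (17) and (8)
        "t0": A_y - t3,
        "t1": A_y + A_phi * (1 - 2 * C) - t3,
        "t2": A_y + A_phi * (1 + 2 * C) - t3,
        "t1e": A_y + (A_phi / 4) * (1 - 4 * C) - t3,
        "t2e": A_y + (A_phi / 4) * (1 + 4 * C) - t3,
    }
    closed = {   # right-hand sides of Eqs. (44)-(48)
        "t0": A_y * rho2,
        "t1": A_phi * (1 - C) ** 2,
        "t2": A_phi * (1 + C) ** 2,
        "t1e": (A_phi / 4) * (1 - 2 * C) ** 2,
        "t2e": (A_phi / 4) * (1 + 2 * C) ** 2,
    }
    return direct, closed


# Hypothetical inputs, not taken from the paper's data.
direct, closed = gaps_to_regression(A_y=0.0036, A_phi=0.09, A_yphi=0.009)
```

Each closed form is a square times a non-negative factor, which is why the differences cannot be negative.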

From (8), (12), (26), (29), (33), (38) and (43) we have

$$MSE\left( {t_{3} } \right) - MSE_{\min } \left( {t_{4} } \right) = \overline{Y}^{2} \left[ {A_{y} \left( {1 - \rho_{y\phi }^{2} } \right) + \frac{{\left( {B_{2} B_{4}^{2} - 2B_{3} B_{4} B_{5} + B_{1} B_{5}^{2} } \right)}}{{\left( {B_{1} B_{2} - B_{3}^{2} } \right)}} - 1} \right] \ge 0$$

(49)

$$MSE\left( {t_{3} } \right) - MSE_{\min } \left( {t_{5} } \right) = \overline{Y}^{2} \left[ {A_{y} \left( {1 - \rho_{y\phi }^{2} } \right) + \frac{{\left( {A_{2} A_{4}^{2} - 2A_{3} A_{4} A_{5} + A_{1} A_{5}^{2} } \right)}}{{\left( {A_{1} A_{2} - A_{3}^{2} } \right)}} - 1} \right] \ge 0$$

(50)

$$MSE\left( {t_{3} } \right) - MSE_{\min } \left( {t_{6} } \right) = \overline{Y}^{2} \left[ {A_{y} \left( {1 - \rho_{y\phi }^{2} } \right) + \frac{{\left( {A_{2\left( 1 \right)} A_{4\left( 1 \right)}^{2} - 2A_{3\left( 1 \right)} A_{4\left( 1 \right)} A_{5\left( 1 \right)} + A_{1\left( 1 \right)} A_{5\left( 1 \right)}^{2} } \right)}}{{\left( {A_{1\left( 1 \right)} A_{2\left( 1 \right)} - A_{3\left( 1 \right)}^{2} } \right)}} - 1} \right] \ge 0$$

(51)

$$MSE\left( {t_{3} } \right) - MSE_{\min } \left( {t_{7} } \right) = \overline{Y}^{2} \left[ {A_{y} \left( {1 - \rho_{y\phi }^{2} } \right) + \frac{{\left( {C_{2} C_{4}^{2} - 2C_{3} C_{4} C_{5} + C_{1} C_{5}^{2} } \right)}}{{\left( {C_{1} C_{2} - C_{3}^{2} } \right)}} - 1} \right] \ge 0$$

(52)

$$MSE\left( {t_{3} } \right) - MSE_{\min } \left( {t_{7\left( m \right)} } \right) = \overline{Y}^{2} \left[ {A_{y} \left( {1 - \rho_{y\phi }^{2} } \right) + \frac{{\left( {D_{2} D_{4}^{2} - 2D_{3} D_{4} D_{5} + D_{1} D_{5}^{2} } \right)}}{{\left( {D_{1} D_{2} - D_{3}^{2} } \right)}} - 1} \right] \ge 0$$

(53)

$$MSE\left( {t_{3} } \right) - MSE_{\min } \left( {t_{8} } \right) = \overline{Y}^{2} \left[ {A_{y} \left( {1 - \rho_{y\phi }^{2} } \right) + \frac{{\left( {E_{2} E_{4}^{2} - 2E_{3} E_{4} E_{5} + E_{1} E_{5}^{2} } \right)}}{{\left( {E_{1} E_{2} - E_{3}^{2} } \right)}} - 1} \right] \ge 0$$

(54)

It follows from (49) to (54) that the classes of estimators $t_{4}$, $t_{5}$, $t_{6}$, $t_{7}$, $t_{7\left( m \right)}$ and $t_{8}$ are more efficient than the regression estimator $t_{3}$.

Further from (12), (26), (29), (33), (38) and (43) we have

$$MSE_{\min } \left( {t_{8} } \right) < MSE_{\min } \left( {t_{4} } \right)\;{\text{if}}\;M_{2} < M_{1} ,$$

(55)

$$MSE_{\min } \left( {t_{8} } \right) < MSE_{\min } \left( {t_{5} } \right)\;\;{\text{if}}\;M_{3} < M_{1} ,$$

(56)

$$MSE_{\min } \left( {t_{8} } \right) < MSE_{\min } \left( {t_{6} } \right)\;{\text{if}}\;M_{4} < M_{1} ,$$

(57)

$$MSE_{\min } \left( {t_{8} } \right) < MSE_{\min } \left( {t_{7} } \right)\;{\text{if}}\;M_{5} < M_{1} ,$$

(58)

$$MSE_{\min } \left( {t_{8} } \right) < MSE_{\min } \left( {t_{7\left( m \right)} } \right)\;{\text{if}}\;M_{6} < M_{1} ,$$

(59)

where $M_{1} = \frac{{E_{2} E_{4}^{2} - 2E_{3} E_{4} E_{5} + E_{1} E_{5}^{2} }}{{E_{1} E_{2} - E_{3}^{2} }}$, $M_{2} = \frac{{A_{\phi } - \upsilon_{st}^{2} A_{\phi } A_{y\phi }^{2} - \upsilon_{st}^{2} A_{\phi }^{2} + \upsilon_{st}^{2} A_{\phi }^{2} A_{y} }}{{A_{\phi } + A_{y} A_{\phi } - \upsilon_{st}^{2} A_{\phi }^{2} - A_{y\phi }^{2} }}$, $M_{3} = \frac{{A_{2} A_{4}^{2} - 2A_{3} A_{4} A_{5} + A_{1} A_{5}^{2} }}{{A_{1} A_{2} - A_{3}^{2} }}$, $M_{4} = \frac{{A_{2\left( 1 \right)} A_{4\left( 1 \right)}^{2} - 2A_{3\left( 1 \right)} A_{4\left( 1 \right)} A_{5\left( 1 \right)} + A_{1\left( 1 \right)} A_{5\left( 1 \right)}^{2} }}{{A_{1\left( 1 \right)} A_{2\left( 1 \right)} - A_{3\left( 1 \right)}^{2} }}$, $M_{5} = \frac{{C_{2} C_{4}^{2} - 2C_{3} C_{4} C_{5} + C_{1} C_{5}^{2} }}{{C_{1} C_{2} - C_{3}^{2} }}$, $M_{6} = \frac{{D_{2} D_{4}^{2} - 2D_{3} D_{4} D_{5} + D_{1} D_{5}^{2} }}{{D_{1} D_{2} - D_{3}^{2} }}$.

Therefore, the proposed class of estimators $t_{8}$ is more efficient than the estimators $t_{4}$, $t_{5}$, $t_{6}$, $t_{7}$ and $t_{7\left( m \right)}$ as long as the conditions (55), (56), (57), (58) and (59), respectively, are satisfied.

Numerical illustration

To judge the merits of the suggested class of estimators $t_{8}$ over other existing estimators, we have computed the percent relative efficiency (PRE) of different estimators with respect to the usual unbiased estimator $\overline{y}_{st}$ by using the following formulae:

$$PRE\left( {t_{1} ,\overline{y}_{st} } \right) = \frac{{A_{y} }}{{\left[ {A_{y} + A_{\phi } \left( {1 - 2C} \right)} \right]}} \times 100,$$

(60)

$$PRE\left( {t_{2} ,\overline{y}_{st} } \right) = \frac{{A_{y} }}{{\left[ {A_{y} + A_{\phi } \left( {1 + 2C} \right)} \right]}} \times 100,$$

(61)

$$PRE\left( {t_{3} \,{\text{or}}\,t_{\alpha e} ,\overline{y}_{st} } \right) = \frac{1}{{\left[ {1 - \rho_{y\phi }^{2}} \right]}} \times 100,$$

(62)

$$PRE\left( {t_{4} ,\overline{y}_{st} } \right) = \frac{{A_{y} }}{{\left[ {1 - M_{2} } \right]}} \times 100,$$

(63)

$$PRE\left( {t_{1e} ,\overline{y}_{st} } \right) = \frac{{A_{y} }}{{\left[ {A_{y} + \frac{{A_{\phi } }}{4}\left( {1 - 4C} \right)} \right]}} \times 100,$$

(64)

$$PRE\left( {t_{2e} ,\overline{y}_{st} } \right) = \frac{{A_{y} }}{{\left[ {A_{y} + \frac{{A_{\phi } }}{4}\left( {1 + 4C} \right)} \right]}} \times 100,$$

(65)

$$PRE\left( {t_{5} ,\overline{y}_{st} } \right) = \frac{{A_{y} }}{{\left[ {1 - M_{3} } \right]}} \times 100,$$

(66)

$$PRE\left( {t_{6} ,\overline{y}_{st} } \right) = \frac{{A_{y} }}{{\left[ {1 - M_{4} } \right]}} \times 100,$$

(67)

$$PRE\left( {t_{7} ,\overline{y}_{st} } \right) = \frac{{A_{y} }}{{\left[ {1 - M_{5} } \right]}} \times 100,$$

(68)

$$PRE\left( {t_{7\left( m \right)} ,\overline{y}_{st} } \right) = \frac{{A_{y} }}{{\left[ {1 - M_{6} } \right]}} \times 100,$$

(69)

$$PRE\left( {t_{8} ,\overline{y}_{st} } \right) = \frac{{A_{y} }}{{\left[ {1 - M_{1} } \right]}} \times 100,$$

(70)
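The PRE formulae (60) to (70) can be evaluated with a few one-line functions. The following Python sketch mirrors those expressions; the function names are hypothetical, and the inputs `a_y`, `a_phi`, `c`, `rho` and `m` stand for $A_{y}$, $A_{\phi}$, $C$, $\rho_{y\phi}$ and the relevant $M_{k}$ as defined earlier in the paper. The numbers used at the end are arbitrary illustrative values, not the Table 1 statistics, and are not required to be mutually consistent.

```python
def pre_ratio(a_y, a_phi, c):
    """Eq. (60): PRE of the ratio estimator t1."""
    return a_y / (a_y + a_phi * (1 - 2 * c)) * 100


def pre_product(a_y, a_phi, c):
    """Eq. (61): PRE of the product estimator t2."""
    return a_y / (a_y + a_phi * (1 + 2 * c)) * 100


def pre_difference(rho):
    """Eq. (62): PRE of the difference estimator t3."""
    return 100 / (1 - rho ** 2)


def pre_exp_ratio(a_y, a_phi, c):
    """Eq. (64): PRE of the ratio-type exponential estimator t1e."""
    return a_y / (a_y + (a_phi / 4) * (1 - 4 * c)) * 100


def pre_exp_product(a_y, a_phi, c):
    """Eq. (65): PRE of the product-type exponential estimator t2e."""
    return a_y / (a_y + (a_phi / 4) * (1 + 4 * c)) * 100


def pre_from_m(a_y, m):
    """Eqs. (63), (66)-(70): PRE of an estimator whose term is M_k = m."""
    return a_y / (1 - m) * 100


# Arbitrary illustrative inputs (assumptions, not data from the paper).
a_y, a_phi, c, rho = 0.04, 0.02, 1.0, 0.8
results = {
    "t1": pre_ratio(a_y, a_phi, c),
    "t2": pre_product(a_y, a_phi, c),
    "t3": pre_difference(rho),
    "t1e": pre_exp_ratio(a_y, a_phi, c),
    "t2e": pre_exp_product(a_y, a_phi, c),
}
```

A PRE above 100 indicates a gain over ${\overline{\text{y}}}_{{{\text{st}}}}$, and a value below 100 indicates a loss.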

To demonstrate the effectiveness of the proposed class of estimators $t_{8}$, we use data on the number of teachers as the study variable (y), and whether the number of students in both primary and secondary schools is more or less than 750 as the auxiliary attribute $\phi$, for 923 districts across six regions of Turkey (1: Marmara, 2: Aegean, 3: Mediterranean, 4: Central Anatolia, 5: Black Sea, 6: East and Southeast Anatolia) in 2007 (Source: The Turkish Republic Ministry of Education).

The summary statistics of the data are given in Table 1. We applied Neyman²³ allocation for allocating the samples to various strata²⁴. Source: Koyuncu⁹.
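For reference, Neyman allocation assigns the stratum sample sizes in proportion to $N_{h}S_{h}$, that is, $n_{h} = n\,N_{h}S_{h} / \sum_{h} N_{h}S_{h}$, where $N_{h}$ is the stratum size and $S_{h}$ the stratum standard deviation of the study variable. The short Python sketch below applies this textbook formula; the paper does not show its own computation, and the function name and the stratum figures are hypothetical placeholders rather than the values in Table 1.

```python
def neyman_allocation(n, stratum_sizes, stratum_sds):
    """Return real-valued n_h = n * N_h * S_h / sum(N_h * S_h)."""
    if len(stratum_sizes) != len(stratum_sds):
        raise ValueError("sizes and standard deviations must align")
    weights = [N_h * S_h for N_h, S_h in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one stratum must have N_h * S_h > 0")
    return [n * w / total for w in weights]


# Hypothetical strata: sizes N_h and standard deviations S_h.
sizes = [100, 200, 300]
sds = [2.0, 1.0, 1.0]
allocation = neyman_allocation(70, sizes, sds)
```

The function returns unrounded values; in practice they would be rounded to integers while keeping the total equal to $n$.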

Table 1 Data statistics.


It is observed from Table 2 that the estimators ${t}_{1}\text{ and }{t}_{1e}$ are more efficient than the usual unbiased estimator $\overline{y}$ (which does not utilize the auxiliary attribute). The product estimator ${t}_{2}$ and the product-type exponential estimator ${t}_{2e}$ perform worse than $\overline{y}$ (owing to the positive correlation between y and φ). The difference estimator ${t}_{3}$ is more efficient than ${t}_{1},{t}_{1e},{t}_{2}\text{ and }{t}_{2e}$. The estimators $\left({t}_{4},{t}_{5},{t}_{6}\right)$ perform almost identically, and marginally better than the estimators ${t}_{1},{t}_{1e},{t}_{2},{t}_{2e} \; \text{and } \; {t}_{3}$.

Table 2 PRE values of different estimators of $\overline{Y}$ with respect to $\overline{y}$.


Table 2 also demonstrates that the proposed estimator $t_{8}$ with $\left(\delta =-1,\eta =1,{a}_{st}={C}_{\phi \left(st\right)},{b}_{st}=NP\right)$ has the largest PRE (= 1.49E+11), followed by ${t}_{7\left(m\right)}$ with $\left(\lambda =-1,{a}_{st}={C}_{\phi \left(st\right)},{b}_{st}=NP\right)$. It is further observed that the proposed classes of estimators $t_{7\left( m \right)}$ and $t_{8}$ are better than the estimators $t_{1}$⁴, ${t}_{1e},{t}_{2}\text{ and }{t}_{2e}$, ${t}_{3}$ (difference estimator), $t_{4}$⁹, ${t}_{5}$³, ${t}_{6}$¹¹ and ${t}_{7}$¹⁰ for all choices of $\left({a}_{st},{b}_{st}\right)$ considered. The proposed class of estimators $t_{8}$ is the best among the estimators listed in Table 2.

Thus, our recommendation is to use the suggested classes of estimators ${t}_{7\left(m\right)}$ and ${t}_{8}$ in practice.

Conclusion

In this article, we propose two classes of estimators for estimating the population mean $\overline{Y}$ of the study variable y using information on an auxiliary attribute $\left( \phi \right)$. The suggested classes of estimators are wide-ranging. The biases and mean squared errors of the proposed classes of estimators $t_{7\left( m \right)}$ and $t_{8}$ are derived up to the first degree of approximation. The optimum estimators in the classes $t_{7\left( m \right)}$ and $t_{8}$ are investigated using the minimum mean squared error formulae. An empirical study is conducted to evaluate the efficiency of the proposed classes of estimators $t_{7\left( m \right)}$ and $t_{8}$, and the findings are presented in Table 2. The results of Table 2 demonstrate that the suggested classes of estimators $t_{7\left( m \right)}$ and $t_{8}$ are more efficient than the recently developed classes of estimators $t_{4}, t_{5}, t_{6}, t_{7}$ and $t_{3}$ due to Koyuncu⁹, Sharma and Singh³, Shahzad et al.¹¹, Koyuncu¹⁰, and the difference estimator, respectively, with a considerable gain in efficiency. Therefore, we conclude that the proposed classes of estimators $t_{7\left( m \right)}$ and $t_{8}$ are justified and can be used in practice.

One potential direction for future research is the application of advanced statistical techniques, such as machine learning and artificial intelligence, to improve the accuracy and efficiency of the estimators. These techniques can also help in identifying relevant auxiliary variables for improving the estimation process. The impact of various sampling designs on the estimation process can also be investigated. For example, the effect of unequal sample sizes in different strata, non-response rates, and measurement errors on the accuracy and efficiency of the estimators can be studied. Finally, the current research can be extended to other population parameters, such as the variance and quantiles. This can lead to the development of new classes of estimators and further improve the accuracy and efficiency of the estimation process.

Data availability

All the necessary data generated and/or analysed during the current study are included in this published article.

References

1. Kendall, M. G. & Stuart, A. The Advanced Theory of Statistics 2nd edn. (Charles Griffin and Company Limited, 1967).
2. Shabbir, J. & Gupta, S. On estimating the finite population mean with known population proportion of an auxiliary variable. Pak. J. Stat. 23(1), 1–9 (2007).
3. Sharma, P. & Singh, R. Efficient estimator of population mean in stratified random sampling using auxiliary attribute. World Appl. Sci. J. 27(12), 1786–1791 (2013).
4. Naik, V. D. & Gupta, P. C. A note on estimation of mean with known population proportion of an auxiliary character. J. Indian Soc. Agric. Stat. 48(2), 151–158 (1996).
5. Jhajj, H. S., Sharma, M. K. & Grover, L. K. A family of estimators of population mean using information on auxiliary attribute. Pak. J. Stat. 22(1), 43–50 (2006).
6. Solanki, R. S. & Singh, H. P. Improved estimation of population mean using population proportion of an auxiliary character. Chilean J. Stat. 4(1), 3–17 (2013).
7. Singh, H. P., Gupta, A. & Tailor, R. Estimation of population mean using a difference-type exponential imputation method. J. Stat. Theory Pract. 15, 1–43 (2021).
8. Gupta, A. & Tailor, R. Ratio in ratio type exponential strategy for the estimation of population mean. J. Reliability Stat. Stud. 551–564 (2021).
9. Koyuncu, N. Improved estimation of population mean in stratified random sampling using information on auxiliary attribute. In Proceeding of 59th ISI World Statistics Congress, Hong Kong, China, 25–30 (2013).
10. Koyuncu, N. Efficient combined estimators of population mean using auxiliary attribute under stratified random sampling. In Proceeding of 11th International Conference of Numerical Analysis and Applied Mathematics, vol. 1558, 1466–1469 (2013).
11. Shahzad, U., Hanif, M., Koyuncu, N. & Garcia, A. V. A family of ratio estimators in stratified random sampling utilizing auxiliary attribute alongside the nonresponse. J. Stat. Theory Appl. 18(1), 12–25 (2019).
12. Zaman, T. Efficient estimators of population mean using auxiliary attribute in stratified random sampling. Adv. Appl. Stat. 56(2), 153–171 (2019).
13. Zaman, T. An efficient exponential estimator of the mean under stratified random sampling. Math. Popul. Stud. 28(2), 104–121 (2021).
14. Hussain, S., Ahmad, S., Saleem, M. & Akhtar, S. Finite population distribution function estimation with dual use of auxiliary information under simple and stratified random sampling. PLoS ONE 15(9), e0239098 (2020).
15. Hussain, S., Akhtar, S. & El-Morshedy, M. Modified estimators of finite population distribution function based on dual use of auxiliary information under stratified random sampling. Sci. Prog. 105(3), 00368504221128486 (2022).
16. Zaman, T. & Bulut, H. An efficient family of robust-type estimators for the population variance in simple and stratified random sampling. Commun. Stat.-Theory Methods (2021).
17. Zaman, T. & Kadilar, C. Exponential ratio and product type estimators of the mean in stratified two-phase sampling. AIMS Math. 6(5), 4265–4279 (2021).
18. Ahmad, S. et al. Dual use of auxiliary information for estimating the finite population mean under the stratified random sampling scheme. J. Math. 1–12 (2021).
19. Ahmad, S. et al. Improved estimation of finite population variance using dual supplementary information under stratified random sampling. Math. Probl. Eng. 2022 12. https://doi.org/10.1155/2022/3813952 (2022).
20. Shahzad, U., Ahmad, I., Almanjahie, I. M., Al-Noor, N. H. & Hanif, M. A novel family of variance estimators based on L-moments and calibration approach under stratified random sampling. Commun. Stat.-Simul. Comput. 1–14. https://doi.org/10.1080/03610918.2021.1945629 (2021).
21. Shahzad, U., Ahmad, I., Almanjahie, I. & Al-Noor, N. H. L-Moments based calibrated variance estimators using double stratified sampling. Comput. Mater. Continua 68(3), 3411–3430 (2021).
22. Shahzad, U., Ahmad, I., García-Luengo, A. V., Zaman, T., Al-Noor, N. H. & Kumar, A. Estimation of coefficient of variation using calibrated estimators in double stratified random sampling. Mathematics 11, 252 (2023).
23. Neyman, J. On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selecting. J. R. Stat. Soc. 97, 558–606 (1934).
24. Cochran, W. G. Sampling Techniques 3rd edn. (Wiley, 1977).


Acknowledgements

We would like to express our sincere gratitude to the anonymous reviewers and the editor for their valuable comments, feedback, and suggestions, which greatly improved the quality and impact of this manuscript. Their thoughtful and constructive criticisms and insights have been immensely helpful in shaping the final version of this paper. We appreciate their time, effort, and expertise in reviewing and editing this work. We also thank Dhanashree Bhure, Editorial Support at Scientific Reports, for assisting and guiding us in the proper submission of the manuscript.

Author information

Authors and Affiliations

School of Studies in Statistics, Vikram University, Ujjain, M.P., 456010, India
Housila P. Singh & Rajesh Tailor
Indian Agricultural Statistics Research Institute, ICAR, New Delhi, 110012, India
Anurag Gupta


Contributions

H.P.S. conceived the idea of the estimators. R.T. carried out the theoretical study and comparison. A.G. carried out the empirical study of the estimators and drafted the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Anurag Gupta.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Singh, H.P., Gupta, A. & Tailor, R. Efficient class of estimators for finite population mean using auxiliary attribute in stratified random sampling. Sci Rep 13, 10253 (2023). https://doi.org/10.1038/s41598-023-34603-z


Received: 21 December 2022
Accepted: 04 May 2023
Published: 24 June 2023
Version of record: 24 June 2023
DOI: https://doi.org/10.1038/s41598-023-34603-z

This article is cited by

Evaluating the performance of logarithmic type estimators using auxiliary attribute
- Shashi Bhushan
- Anoop Kumar
Life Cycle Reliability and Safety Engineering (2023)
