Introduction

In recent years, the need for more flexible and adaptable distributions has become increasingly evident. The goal of generalizing distributions is to develop models that more effectively capture diverse data patterns and characteristics, thereby broadening their applicability across various fields. This flexibility enables distributions to accurately represent complex phenomena and facilitate informed decision-making. Researchers are introducing novel probability distributions by employing different methodologies, often involving the expansion of distribution parameters. These techniques enhance the versatility and practicality of new distributions compared to existing ones, with the potential to derive parametric distributions by modifying the number of parameters in an established model.

The Lomax (Lx) distribution, introduced by Lomax1, is utilized for modeling business failure data and has broad applications across fields such as income and wealth disparity, city size, medicine, actuarial science, biological sciences, engineering, and reliability modeling. It is employed in simulating data related to income and wealth, analyzing receiver operating characteristic curves, assessing business size, reliability, life testing, and Hirsch-related statistics.

The Lx distribution has proven valuable in fields such as survival analysis, reliability engineering, and actuarial science due to its flexibility in modeling heavy-tailed data. Its close relationship with the Weibull distribution further enhances its applicability, offering a versatile tool for understanding various real-world phenomena.

Many researchers have advanced the modeling of heavy-tailed data by generalizing the functional forms of the Lx distribution. Notable examples include the exponentiated Lx2, beta Lx3, Poisson Lx4, exponential Lx5, gamma Lx6, Weibull Lx7, power Lx8, exponentiated Weibull Lx9, Marshall–Olkin exponential Lx10, type II Topp–Leone power Lx11, Fréchet Topp–Leone Lx (FTLLx)12, McDonald-Lx (MCLx) and Kumaraswamy Lx (KwLx)13, beta exponentiated Lx (BELx)14, transmuted Weibull Lx (TWLx)15, Burr X Lx (BXLx)16, odd exponentiated half-logistic Lx (OEHLLx)17, Lx Weibull18, generalized odd Lindley-Lx19, odd Lx log-logistic (OLxLL)20, Marshall–Olkin power Lx (MOPLx)21, extended odd Weibull Lx22, odd Lomax–Lx23, and extended exponentiated Lx24 distributions. These generalized distributions improve the ability to capture complex patterns in heavy-tailed datasets and offer more robust tools for analysis.

The present study aims to extend the Lx distribution by incorporating the odd Pareto-G (OP-G) family, as proposed by Hussein et al.25, by adding two additional shape parameters to the Lx model. The resulting model is referred to as the odd Pareto–Lomax (OPLx) distribution. We have derived several mathematical properties of the OPLx distribution and estimated its parameters using various techniques. Comprehensive simulations were conducted to assess the performance of these estimators. Furthermore, the OPLx distribution was applied to three real-life datasets from different fields to demonstrate its practical utility.

The rest of the paper is organized as follows. In Sect. 2, the OPLx distribution is presented. Section 3 provides the properties of the OPLx model. In Sect. 4, the OPLx parameters are estimated. A simulation study is presented in Sect. 5. Section 6 provides three examples of practical data applications. Final remarks are presented in Sect. 7.

The OPLx distribution

Let the random variable (rv) \(X\) follow the Lx distribution1, then the cumulative distribution function (CDF) of \(X\) takes the form

$$G\left(x;\alpha,\beta\right)=1-{\left(1+\frac{x}{\beta}\right)}^{-\alpha},x>0,$$

where \(\alpha>0\) and \(\beta>0\) are the shape and scale parameters, respectively.

The corresponding probability density function (PDF) of the Lx distribution is

$$g\left(x;\alpha,\beta\right)=\frac{\alpha}{\beta}{\left(1+\frac{x}{\beta}\right)}^{-\left(\alpha+1\right)},x>0.$$

Recently, Hussein et al.25 introduced the OP-G family, which can be defined by the following CDF

$$F\left(x;\theta,\zeta\right)=1-{\left\{\frac{\theta\left[1-G\left(x\right)\right]}{G\left(x\right)+\theta\left[1-G\left(x\right)\right]}\right\}}^{\zeta},x>0,\theta,\zeta>0,$$
(1)

where \(\theta\) and \(\zeta\) are two shape parameters and \(G\left(x\right)\) is the baseline CDF.

The corresponding PDF of (1) is defined by

$$f\left(x;\theta,\zeta\right)=\frac{\zeta{\theta}^{\zeta}g\left(x\right)}{{\left[1-G\left(x\right)\right]}^{2}}{\left\{\frac{\left(1-G\left(x\right)\right)}{G\left(x\right)+\theta\left[1-G\left(x\right)\right]}\right\}}^{\zeta+1}.$$

By inserting the CDF of the Lx model in Eq. (1), we obtain the CDF of the OPLx distribution as follows

$$F\left(x;\alpha,\beta,\theta,\zeta\right)=1-{\left\{\frac{\theta{\left(1+\frac{x}{\beta}\right)}^{-\alpha}}{1-\left(1-\theta\right){\left(1+\frac{x}{\beta}\right)}^{-\alpha}}\right\}}^{\zeta},x>0,\alpha,\beta,\theta,\zeta>0.$$
(2)

The corresponding PDF of the OPLx distribution follows as

$$f\left(x;\alpha,\beta,\theta,\zeta\right)=\frac{\alpha\zeta{\theta}^{\zeta}{\left(1+\frac{x}{\beta}\right)}^{-\zeta\alpha-1}}{\beta{\left[1-\left(1-\theta\right){\left(1+\frac{x}{\beta}\right)}^{-\alpha}\right]}^{\zeta+1}},x>0.$$
(3)

Therefore, a rv with PDF (3) is denoted by \(X\sim\)OPLx\(\left(\alpha,\beta,\theta,\zeta\right)\).
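
As a quick consistency check on Eq. (2) (our own remark, not a result stated above), setting \(\theta=1\) makes the denominator equal to one, so that

$$F\left(x;\alpha,\beta,1,\zeta\right)=1-{\left(1+\frac{x}{\beta}\right)}^{-\alpha\zeta},x>0,$$

which is the Lx CDF with shape parameter \(\alpha\zeta\) and scale parameter \(\beta\). This special case is convenient for checking numerical implementations of the OPLx functions against known Lx results.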

The hazard rate function (HRF) of the OPLx distribution follows as

$$h\left(x;\alpha,\beta,\theta,\zeta\right)=\frac{\alpha\zeta}{\beta}{\left(1+\frac{x}{\beta}\right)}^{-1}{\left[1-\left(1-\theta\right){\left(1+\frac{x}{\beta}\right)}^{-\alpha}\right]}^{-1}.$$

Figures 1 and 2 provide some plots for the PDF and HRF of the OPLx distribution. These plots indicate that the OPLx PDF can be left-skewed and reversed-J shaped. The HRF of the OPLx distribution can be decreasing, upside-down bathtub, increasing, and reversed-J shaped.
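
The closed forms of the CDF (2), the PDF (3), and the HRF translate directly into code. The following is a minimal Python sketch rather than the implementation used for the figures (the paper reports working in R); the function names `oplx_cdf`, `oplx_pdf`, and `oplx_hrf` are hypothetical.

```python
def oplx_cdf(x, alpha, beta, theta, zeta):
    """CDF of the OPLx distribution, Eq. (2), for x >= 0."""
    y = (1.0 + x / beta) ** (-alpha)  # Lomax survival term
    return 1.0 - (theta * y / (1.0 - (1.0 - theta) * y)) ** zeta


def oplx_pdf(x, alpha, beta, theta, zeta):
    """PDF of the OPLx distribution, Eq. (3), for x >= 0."""
    psi = 1.0 + x / beta
    denom = 1.0 - (1.0 - theta) * psi ** (-alpha)
    num = alpha * zeta * theta ** zeta * psi ** (-zeta * alpha - 1.0)
    return num / (beta * denom ** (zeta + 1.0))


def oplx_hrf(x, alpha, beta, theta, zeta):
    """Hazard rate f(x) / [1 - F(x)] in its simplified closed form."""
    psi = 1.0 + x / beta
    denom = 1.0 - (1.0 - theta) * psi ** (-alpha)
    return alpha * zeta / (beta * psi * denom)
```

A simple way to validate such a sketch is to confirm that `oplx_hrf` agrees with `oplx_pdf / (1 - oplx_cdf)` and that a finite-difference derivative of `oplx_cdf` reproduces `oplx_pdf`.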

Fig. 1

Plots of the PDF of the OPLx distribution for some parameter values.

Fig. 2

Plots of the HRF of the OPLx distribution for some parameter values.

Some properties

This section provides some key properties of the OPLx distribution.

Mixture representation

In this section, we provide a useful mixture representation of the PDF of the OPLx distribution. According to Hussein et al.25, the PDF of the OPLx distribution can be expressed as

$$f\left(x\right)=\sum_{j=0}^{\infty}{s}_{j}\left(j+1\right)\frac{\alpha}{\beta}{\left(1+\frac{x}{\beta}\right)}^{-\left(\alpha+1\right)}{\left[1-{\left(1+\frac{x}{\beta}\right)}^{-\alpha}\right]}^{j},$$
(4)

where

$${s}_{j}={s}_{j}\left(\zeta,\theta\right)=\left\{\begin{array}{c}-{U}_{j}\left(\zeta,\theta\right),\theta<2;\\-{V}_{j}\left(\zeta,\theta\right),\theta\ge2,\end{array}\right.$$

and \({U}_{j}\left(\zeta,\theta\right)\) (for \(j\ge1\)) can be determined, recursively, by

$${U}_{j}=\frac{1}{{A}_{0}}\left({B}_{j}-\sum_{r=1}^{j}{A}_{r}{U}_{j-r}\right),{U}_{0}=\frac{{B}_{0}}{{A}_{0}},$$

and \({V}_{j}\left(\zeta,\theta\right)\) (for \(j\ge1\)) is determined as follows

$${V}_{j}=\frac{1}{{C}_{0}}\left({D}_{j}-\sum_{r=1}^{j}{C}_{r}{V}_{j-r}\right),{V}_{0}=\frac{{D}_{0}}{{C}_{0}}.$$

Using the generalized binomial series (GBS), we can write

$${\left[1-{\left(1+\frac{x}{\beta}\right)}^{-\alpha}\right]}^{j}=\sum_{k=0}^{\infty}{\left(-1\right)}^{k}\left(\begin{array}{c}j\\k\end{array}\right){\left(1+\frac{x}{\beta}\right)}^{-k\alpha}.$$

Applying the GBS to (4), we obtain

$$f\left(x\right)=\sum_{j,k=0}^{\infty}{s}_{j}{\left(-1\right)}^{k}\left(\begin{array}{c}j\\k\end{array}\right)\frac{\left(j+1\right)}{\left(k+1\right)}\frac{\alpha\left(k+1\right)}{\beta}{\left(1+\frac{x}{\beta}\right)}^{-\alpha\left(k+1\right)-1}=\sum_{k=0}^{\infty}{\delta}_{k}{g}_{\alpha\left(k+1\right),\beta}\left(x\right),$$
(5)

where \({\delta}_{k}=\sum_{j=0}^{\infty}{s}_{j}{\left(-1\right)}^{k}\left(\begin{array}{c}j\\k\end{array}\right)\frac{\left(j+1\right)}{\left(k+1\right)}\) and \({g}_{\alpha\left(k+1\right),\beta}\left(x\right)\) is the PDF of the Lx distribution with parameters \(\alpha\left(k+1\right)\) and \(\beta\). Thus, several mathematical properties of the OPLx distribution follow simply from those properties of the Lx distribution with parameters \(\alpha\left(k+1\right)\) and \(\beta\).

Quantile function

The quantile function (QF) of the OPLx distribution follows, by inverting Eq. (2), as

$$Q\left(u\right)=\beta\left\{{\left(1-\frac{\theta-\theta{\left(1-u\right)}^{\frac{1}{\zeta}}}{\theta+\left(1-\theta\right){\left(1-u\right)}^{\frac{1}{\zeta}}}\right)}^{-\frac{1}{\alpha}}-1\right\},u\in\left(0,1\right).$$
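
Because the QF is available in closed form, OPLx variates can be generated by inverse transform sampling, \(X=Q(U)\) with \(U\) uniform on \((0,1)\). The sketch below is illustrative only; the names `oplx_quantile`, `oplx_cdf`, and `oplx_sample` and the use of Python's `random` module are our assumptions, not part of the original study.

```python
import random


def oplx_quantile(u, alpha, beta, theta, zeta):
    """Quantile function Q(u) for 0 <= u < 1, obtained by inverting Eq. (2)."""
    d = (1.0 - u) ** (1.0 / zeta)
    inner = 1.0 - (theta - theta * d) / (theta + (1.0 - theta) * d)
    return beta * (inner ** (-1.0 / alpha) - 1.0)


def oplx_cdf(x, alpha, beta, theta, zeta):
    """CDF of Eq. (2), used here only to check the inversion."""
    y = (1.0 + x / beta) ** (-alpha)
    return 1.0 - (theta * y / (1.0 - (1.0 - theta) * y)) ** zeta


def oplx_sample(n, alpha, beta, theta, zeta, seed=None):
    """Draw n variates by inverse transform sampling."""
    rng = random.Random(seed)  # local generator, so the global state is untouched
    return [oplx_quantile(rng.random(), alpha, beta, theta, zeta) for _ in range(n)]
```

A round trip `oplx_cdf(oplx_quantile(u, ...), ...)` should return `u` up to rounding error, which gives a direct check of the inversion.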

The Bowley skewness (BS) (Kenney and Keeping26) is one of the earliest skewness measures, which is defined by \(S=\left[Q\left(\frac{3}{4}\right)+Q\left(\frac{1}{4}\right)-2Q\left(\frac{1}{2}\right)\right]/\left[Q\left(\frac{3}{4}\right)-Q\left(\frac{1}{4}\right)\right]\). The Moors kurtosis (MK) (Moors27) depends on octiles, and it is defined by \(K=\left[Q\left(\frac{3}{8}\right)-Q\left(\frac{1}{8}\right)+Q\left(\frac{7}{8}\right)-Q\left(\frac{5}{8}\right)\right]/\left[Q\left(\frac{6}{8}\right)-Q\left(\frac{2}{8}\right)\right]\). The plots of the BS and MK of the OPLx distribution for \(\zeta=1\) and selected choices of \(\alpha\), \(\beta\) and \(\theta\) are presented in Fig. 3. These plots are obtained for \(\beta=0.3\) and \(\theta=1.2\). They show that the shape of the OPLx distribution depends significantly on the value of \(\alpha\). Further, the OPLx distribution can be used to model positively skewed data.
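
Both measures need only a handful of quantile evaluations. The following Python sketch (with hypothetical function names) computes them from the QF given above; it is meant as an illustration, not as the code behind Fig. 3.

```python
def oplx_quantile(u, alpha, beta, theta, zeta):
    """Quantile function Q(u) of the OPLx distribution."""
    d = (1.0 - u) ** (1.0 / zeta)
    inner = 1.0 - (theta - theta * d) / (theta + (1.0 - theta) * d)
    return beta * (inner ** (-1.0 / alpha) - 1.0)


def bowley_skewness(alpha, beta, theta, zeta):
    """Quartile-based skewness S."""
    q = lambda u: oplx_quantile(u, alpha, beta, theta, zeta)
    return (q(0.75) + q(0.25) - 2.0 * q(0.5)) / (q(0.75) - q(0.25))


def moors_kurtosis(alpha, beta, theta, zeta):
    """Octile-based kurtosis K."""
    q = lambda u: oplx_quantile(u, alpha, beta, theta, zeta)
    return (q(3 / 8) - q(1 / 8) + q(7 / 8) - q(5 / 8)) / (q(6 / 8) - q(2 / 8))
```

Since \(Q(u)\) is proportional to \(\beta\), both ratios are free of the scale parameter. For \(\alpha=\theta=\zeta=1\) the QF reduces to \(\beta u/(1-u)\), which gives \(S=1/2\) and \(K=76/35\) and can serve as a hand-checkable test case.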

Fig. 3

The plots of the BS and MK of the OPLx distribution for some parameter values.

Moments

The \(r\)th moment of the OPLx distribution follows from (5) as

$${\mu}_{r}^{{\prime}}={\int}_{0}^{\infty}{x}^{r}\sum_{k=0}^{\infty}{\delta}_{k}{g}_{\alpha\left(k+1\right),\beta}\left(x\right)dx.$$

Then, we have

$${\mu}_{r}^{{\prime}}=\sum_{k=0}^{\infty}{\delta}_{k}\frac{{\beta}^{r}{\Gamma}\left(\alpha\left(k+1\right)-r\right){\Gamma}\left(1+r\right)}{{\Gamma}\left(\alpha\left(k+1\right)\right)},$$
(6)

for \(\alpha\left(k+1\right)>r\).

Setting \(r=1,2,3,\) and \(4\), respectively, we obtain the first four moments of the OPLx distribution.
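
In practice, the moments can also be evaluated by direct numerical integration of \({x}^{r}f\left(x\right)\), which avoids computing the series coefficients. The sketch below is our own illustration: it applies the substitution \(t=x/(x+\beta)\) to map \((0,\infty)\) onto \((0,1)\) and then uses a midpoint rule. It assumes \(\alpha\zeta\ge r+1\), so that the transformed integrand stays bounded; the function names are hypothetical.

```python
import math


def oplx_raw_moment(r, alpha, beta, theta, zeta, n=20000):
    """Midpoint-rule approximation of E[X^r] after t = x / (x + beta)."""
    if alpha * zeta <= r:
        raise ValueError("the integral diverges unless alpha * zeta > r")
    const = alpha * zeta * theta ** zeta * beta ** r
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        d = 1.0 - (1.0 - theta) * (1.0 - t) ** alpha
        total += t ** r * (1.0 - t) ** (alpha * zeta - r - 1.0) / d ** (zeta + 1.0)
    return const * total * h


def lomax_raw_moment(r, shape, scale):
    """Closed-form r-th moment of the Lx distribution (requires shape > r)."""
    return scale ** r * math.gamma(shape - r) * math.gamma(1.0 + r) / math.gamma(shape)
```

For \(\theta=1\) the OPLx distribution is an Lx distribution with shape \(\alpha\zeta\), so `oplx_raw_moment(r, alpha, beta, 1.0, zeta)` can be compared with `lomax_raw_moment(r, alpha * zeta, beta)` as a sanity check.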

The \(s\)th incomplete moment of the OPLx distribution is given by

$${\psi}_{s}\left(t\right)={\int}_{0}^{t}{x}^{s}f\left(x\right)dx=\sum_{k=0}^{\infty}{\delta}_{k}{\int}_{0}^{t}{x}^{s}{g}_{\alpha\left(k+1\right),\beta}\left(x\right)dx.$$

Then, \({\psi}_{s}\left(t\right)\) reduces to

$${\psi}_{s}\left(t\right)=\sum_{k=0}^{\infty}{\delta}_{k}\alpha\left(k+1\right){\beta}^{s}{B}_{t/(t+\beta)}\left(s+1,\alpha\left(k+1\right)-s\right),$$

where \({B}_{x}\left(p,q\right)={\int}_{0}^{x}{u}^{p-1}{\left(1-u\right)}^{q-1}du\) is the incomplete beta function and the condition (\(\alpha\left(k+1\right)>s\)) is required to ensure the existence of the \(s\)th moment.

The MGF of the OPLx distribution follows as

$$M\left(t\right)=E\left({e}^{tX}\right)=\sum_{r=0}^{\infty}\frac{{t}^{r}}{r!}E\left({x}^{r}\right)=\sum_{r,k=0}^{\infty}{\delta}_{k}\frac{{t}^{r}}{r!}{\int}_{0}^{\infty}{x}^{r}{g}_{\alpha\left(k+1\right),\beta}\left(x\right)dx.$$

Hence, the MGF of \(X\) reduces to

$$M\left(t\right)=\sum_{k,r=0}^{\infty}{\delta}_{k}\frac{{t}^{r}}{r!}\frac{{\beta}^{r}{\Gamma}\left(\alpha\left(k+1\right)-r\right){\Gamma}\left(1+r\right)}{{\Gamma}\left(\alpha\left(k+1\right)\right)},$$

where \(\alpha\left(k+1\right)>r\). Additionally, it is important to note that the MGF of the Lomax distribution is not finite for \(t>0\), because its polynomially decaying tail makes \(E\left({e}^{tX}\right)\) diverge. Since the OPLx distribution retains the Lomax kernel in its PDF, the same limitation applies, and the series above should be read as a formal expansion in which only the terms with \(r<\alpha\left(k+1\right)\) are finite.

Table 1 provides the mean, say, \({\mu}_{x}\), variance, say, \({\sigma}_{x}^{2}\), skewness, say, \({\psi}_{1}\), and kurtosis, say, \({\psi}_{2}\), of the OPLx distribution, which are computed numerically for different values of \(\zeta\), \(\theta\), \(\beta\), and \(\alpha\) using the R statistical software. From Table 1, we observe that the skewness of the OPLx distribution ranges from 1.2727 to 20.4412 and that its kurtosis ranges from 5.1576 to 968.0334 over the parameter values considered. Hence, the OPLx model can be used effectively to model right-skewed data.

Table 1 Numerical values of \({\mu}_{x}\), \({\sigma}_{x}^{2}\), \({\psi}_{1}\), and \({\psi}_{2}\) of the OPLx distribution for different values of \(\zeta\), \(\theta\), \(\beta\), and \(\alpha\).

Mean residual life and mean inactivity time

The mean residual life (MRL) represents the expected additional life length for a unit, which is alive at age \(t,\) and it is defined by \({M}_{X}\left(t\right)=E\left(X-t|X>t\right),\) for \(t>0\). The MRL of \(X\) is

$${M}_{X}\left(t\right)=\frac{{\mu}_{1}^{{\prime}}-{\psi}_{1}\left(t\right)}{S\left(t\right)}-t,$$

where \({\mu}_{1}^{{\prime}}\) refers to the mean of \(X\) which follows directly from Eq. (6) with \(r=1\) and \(S\left(t\right)\) is the survival function (SF) of the OPLx distribution.

Hence, the MRL of the OPLx distribution reduces to

$${M}_{X}\left(t\right)=\frac{\beta}{S\left(t\right)}\left[\sum_{k=0}^{\infty}\frac{{\delta}_{k}}{\alpha\left(k+1\right)-1}-\sum_{k=0}^{\infty}{\delta}_{k}\alpha\left(k+1\right){B}_{t/(t+\beta)}\left(2,\alpha\left(k+1\right)-1\right)\right]-t.$$

The mean inactivity time (MIT) represents the waiting time elapsed since the failure of an item on condition that this failure had occurred in \((0,t)\). The MIT is defined by \({m}_{X}\left(t\right)=E\left(t-X|X\le t\right),\) for \(t>0\).

The MIT of \(X\) has the form

$${m}_{X}\left(t\right)=t-\frac{{\psi}_{1}\left(t\right)}{F\left(t\right)}.$$

The MIT of the OPLx distribution reduces to

$${m}_{X}\left(t\right)=t-\frac{1}{F\left(t\right)}\sum_{k=0}^{\infty}{\delta}_{k}\alpha\left(k+1\right)\beta{B}_{t/(t+\beta)}\left(2,\alpha\left(k+1\right)-1\right).$$
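
As a numerical alternative to the series expressions, the MRL and MIT can be computed from the equivalent identities \({M}_{X}\left(t\right)={\int}_{t}^{\infty}S\left(x\right)dx/S\left(t\right)\) and \({m}_{X}\left(t\right)={\int}_{0}^{t}F\left(x\right)dx/F\left(t\right)\), both of which follow by integration by parts. The Python sketch below is our own illustration with hypothetical names; the MRL integral uses the substitution \(v={\left(1+x/\beta\right)}^{-1}\) and assumes \(\alpha\zeta\ge2\) so that the integrand is bounded.

```python
def oplx_sf(x, alpha, beta, theta, zeta):
    """Survival function S(x) = 1 - F(x) of the OPLx distribution."""
    y = (1.0 + x / beta) ** (-alpha)
    return (theta * y / (1.0 - (1.0 - theta) * y)) ** zeta


def oplx_mrl(t, alpha, beta, theta, zeta, n=20000):
    """Mean residual life via a midpoint rule in v = 1 / (1 + x / beta)."""
    v_t = 1.0 / (1.0 + t / beta)
    h = v_t / n
    total = 0.0
    for i in range(n):
        v = (i + 0.5) * h
        d = 1.0 - (1.0 - theta) * v ** alpha
        total += v ** (alpha * zeta - 2.0) / d ** zeta
    tail_integral = beta * theta ** zeta * total * h
    return tail_integral / oplx_sf(t, alpha, beta, theta, zeta)


def oplx_mit(t, alpha, beta, theta, zeta, n=20000):
    """Mean inactivity time via a midpoint rule for the integral of F on (0, t)."""
    h = t / n
    total = sum(1.0 - oplx_sf((i + 0.5) * h, alpha, beta, theta, zeta) for i in range(n))
    return total * h / (1.0 - oplx_sf(t, alpha, beta, theta, zeta))
```

For \(\theta=1\) the MRL reduces to the known Lx result \(\left(\beta+t\right)/\left(\alpha\zeta-1\right)\), which provides a convenient check.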

Order statistics

Let \({x}_{1},\dots,{x}_{n}\) be a random sample from the OPLx model, then, the PDF of the \(i\)th order statistic, say, \({X}_{i:n}\), is defined by

$${f}_{i:n}\left(x\right)=\frac{n!}{\left(i-1\right)!\left(n-i\right)!}\sum_{k=0}^{n-i}{\left(-1\right)}^{k}\left(\begin{array}{c}n-i\\k\end{array}\right)f\left(x\right)F{\left(x\right)}^{k+i-1}.$$

Substituting the CDF (2) and the PDF (3), the PDF of \({X}_{i:n}\) for the OPLx model becomes

$${f}_{i:n}\left(x\right)=\frac{\alpha}{\beta}\sum_{k=0}^{n-i}\frac{{\left(-1\right)}^{k}n!}{\left(i-1\right)!\left(n-i\right)!}\left(\begin{array}{c}n-i\\k\end{array}\right)\frac{\zeta{\theta}^{\zeta}{{\Psi}}^{-\left(\zeta\alpha+1\right)}}{{\left[1-\left(1-\theta\right){{\Psi}}^{-\alpha}\right]}^{\zeta+1}}{\left\{1-{\left\{\frac{\theta{{\Psi}}^{-\alpha}}{1-\left(1-\theta\right){{\Psi}}^{-\alpha}}\right\}}^{\zeta}\right\}}^{k+i-1},$$

where \({\Psi}=1+\frac{x}{\beta}\).

The CDF of \({X}_{i:n}\) is defined by

$${F}_{i:n}\left(x\right)=\sum_{r=i}^{n}\left(\begin{array}{c}n\\r\end{array}\right){\left[F\left(x\right)\right]}^{r}{\left[1-F\left(x\right)\right]}^{n-r}.$$

Hence, the CDF of \({X}_{i:n}\) for the OPLx model reduces to

$${F}_{i:n}\left(x\right)=\sum_{r=i}^{n}\left(\begin{array}{c}n\\r\end{array}\right){\left\{1-{\left[\frac{\theta{{\Psi}}^{-\alpha}}{1-\left(1-\theta\right){{\Psi}}^{-\alpha}}\right]}^{\zeta}\right\}}^{r}{\left\{{\left[\frac{\theta{{\Psi}}^{-\alpha}}{1-\left(1-\theta\right){{\Psi}}^{-\alpha}}\right]}^{\zeta}\right\}}^{n-r}.$$
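
The order-statistic formulas are finite sums and are straightforward to evaluate. The following Python sketch (hypothetical names, given only as an illustration) implements the CDF and the alternating-sum form of the PDF of \({X}_{i:n}\).

```python
from math import comb, factorial


def oplx_cdf(x, alpha, beta, theta, zeta):
    y = (1.0 + x / beta) ** (-alpha)
    return 1.0 - (theta * y / (1.0 - (1.0 - theta) * y)) ** zeta


def oplx_pdf(x, alpha, beta, theta, zeta):
    psi = 1.0 + x / beta
    denom = 1.0 - (1.0 - theta) * psi ** (-alpha)
    num = alpha * zeta * theta ** zeta * psi ** (-zeta * alpha - 1.0)
    return num / (beta * denom ** (zeta + 1.0))


def order_stat_cdf(x, i, n, params):
    """CDF of the i-th order statistic in a sample of size n."""
    F = oplx_cdf(x, *params)
    return sum(comb(n, r) * F ** r * (1.0 - F) ** (n - r) for r in range(i, n + 1))


def order_stat_pdf(x, i, n, params):
    """PDF of the i-th order statistic via the alternating-sum expansion."""
    F = oplx_cdf(x, *params)
    f = oplx_pdf(x, *params)
    const = factorial(n) / (factorial(i - 1) * factorial(n - i))
    series = sum((-1) ** k * comb(n - i, k) * F ** (k + i - 1) for k in range(n - i + 1))
    return const * f * series
```

Two boundary cases are useful checks: for \(i=n\) the CDF equals \(F{\left(x\right)}^{n}\), and for \(i=1\) it equals \(1-{\left[1-F\left(x\right)\right]}^{n}\).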

Rényi entropy

The Rényi entropy of the rv \(X\) represents a measure of uncertainty variation. The Rényi entropy is defined by

$${I}_{\delta}=\frac{1}{1-\delta}\text{log}\left({\int}_{-\infty}^{\infty}f{\left(x\right)}^{\delta}dx\right),\delta>0\text{ and }\delta\ne1.$$

Using the PDF (3), we have

$$f{\left(x\right)}^{\delta}={\left(\frac{\alpha\zeta{\theta}^{\zeta}}{\beta}\right)}^{\delta}{\left(1+\frac{x}{\beta}\right)}^{-\delta\left(\zeta\alpha+1\right)}{\left[1-\left(1-\theta\right){\left(1+\frac{x}{\beta}\right)}^{-\alpha}\right]}^{-\delta\left(\zeta+1\right)}.$$

Considering the GBS to the last term, it reduces to

$${\left[1-\left(1-\theta\right){\left(1+\frac{x}{\beta}\right)}^{-\alpha}\right]}^{-\delta\left(\zeta+1\right)}=\sum_{k=0}^{\infty}{\left(-1\right)}^{k}\left(\begin{array}{c}-\delta\left(\zeta+1\right)\\k\end{array}\right)\left[{\left(1-\theta\right)}^{k}{\left(1+\frac{x}{\beta}\right)}^{-\alpha k}\right].$$

After some algebra, we obtain

$$f{\left(x\right)}^{\delta}=\sum_{k=0}^{\infty}{\left(-1\right)}^{k}{\left(1-\theta\right)}^{k}{\left(\frac{\alpha\zeta{\theta}^{\zeta}}{\beta}\right)}^{\delta}\left(\begin{array}{c}-\delta\left(\zeta+1\right)\\k\end{array}\right){\left(1+\frac{x}{\beta}\right)}^{-\delta\left(\zeta\alpha+1\right)-\alpha k}.$$

Then, the Rényi entropy of the OPLx distribution reduces to

$${I}_{\delta}=\frac{1}{1-\delta}\text{log}\left(\sum_{k=0}^{\infty}{s}_{k}{\int}_{0}^{\infty}{\left(\frac{\alpha}{\beta}\right)}^{\delta}{\left(1+\frac{x}{\beta}\right)}^{-\delta\left(\zeta\alpha+1\right)-\alpha k}dx\right),$$

where

$${s}_{k}={\left(-1\right)}^{k}{\left(1-\theta\right)}^{k}{\left(\zeta{\theta}^{\zeta}\right)}^{\delta}\left(\begin{array}{c}-\delta\left(\zeta+1\right)\\k\end{array}\right).$$

Finally, \({I}_{\delta}\) can be expressed as

$${I}_{\delta}=\frac{1}{1-\delta}\text{log}\left({\alpha}^{\delta}{\beta}^{1-\delta}\sum_{k=0}^{\infty}\frac{{s}_{k}}{\delta\left(\zeta\alpha+1\right)+\alpha k-1}\right),$$

where \(\delta\left(\zeta\alpha+1\right)+\alpha k>1\).
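
The series form of \({I}_{\delta}\) can be checked numerically. In the Python sketch below (our own illustration, with hypothetical names), the factor \(1/\left[\delta\left(\zeta\alpha+1\right)+\alpha k-1\right]\), which arises from \({\int}_{0}^{\infty}{\left(1+x/\beta\right)}^{-c}dx=\beta/\left(c-1\right)\), is kept inside the sum over \(k\). The series is truncated after a fixed number of terms and assumes \(\left|1-\theta\right|<1\), the range in which the binomial expansion converges for all \(x>0\). A midpoint-rule integral of \(f{\left(x\right)}^{\delta}\), after the substitution \(t=x/(x+\beta)\) and assuming \(\delta\left(\zeta\alpha+1\right)\ge2\), is included for comparison.

```python
import math


def renyi_entropy_series(delta, alpha, beta, theta, zeta, terms=200):
    """Truncated series for the Renyi entropy; requires |1 - theta| < 1."""
    if not abs(1.0 - theta) < 1.0:
        raise ValueError("series sketch assumes 0 < theta < 2")
    a = delta * (zeta + 1.0)
    coef = 1.0  # (-1)^k * binom(-a, k) = a (a + 1) ... (a + k - 1) / k!
    total = 0.0
    for k in range(terms):
        c = delta * (zeta * alpha + 1.0) + alpha * k
        total += coef * (1.0 - theta) ** k / (c - 1.0)
        coef *= (a + k) / (k + 1.0)
    integral = (zeta * theta ** zeta) ** delta * alpha ** delta * beta ** (1.0 - delta) * total
    return math.log(integral) / (1.0 - delta)


def renyi_entropy_quad(delta, alpha, beta, theta, zeta, n=20000):
    """Midpoint-rule evaluation of the same entropy, for comparison."""
    const = (alpha * zeta * theta ** zeta / beta) ** delta * beta
    expo = delta * (zeta * alpha + 1.0) - 2.0
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        d = 1.0 - (1.0 - theta) * (1.0 - t) ** alpha
        total += (1.0 - t) ** expo / d ** (delta * (zeta + 1.0))
    return math.log(const * total * h) / (1.0 - delta)
```

For \(\theta=1\) only the \(k=0\) term survives, and the series collapses to the Rényi entropy of an Lx distribution with shape \(\alpha\zeta\).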

Methods of estimation

In this section, we employ eight methods for parameter estimation, namely: maximum likelihood (ML), least squares (LS), weighted LS (WLS), maximum product of spacing (MPS), percentiles (PC), Cramér–von Mises (CRVM), Anderson–Darling (AD), and right-tail AD (RAD) estimators. For all eight estimation methods, the parameter estimates were obtained numerically by optimizing the corresponding objective functions. In particular, the optimization was carried out using the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm implemented in the optim function of the R software.

Let \({x}_{1},{x}_{2},\dots,{x}_{n}\) be a random sample of size \(n\) from the OPLx distribution and let \({x}_{1:n},{x}_{2:n},\dots,{x}_{n:n}\) be their associated order statistics. Then, the log-likelihood function reduces to

$$\mathcal{l}=n\text{log}\alpha+n\text{log}\zeta-n\text{log}\beta+n\zeta\text{log}\theta-\left(\zeta\alpha+1\right)\sum_{i=1}^{n}\text{log}\left({b}_{i}\right)-\left(\zeta+1\right)\sum_{i=1}^{n}\text{log}\left[1-\left(1-\theta\right){{b}_{i}}^{-\alpha}\right],$$

where \({b}_{i}=1+{x}_{i}/\beta\).

The ML estimators (MLE) for \(\alpha,\beta,\theta,\) and \(\zeta\) can be obtained by maximizing the last equation with respect to these parameters. Alternatively, the MLE can be determined by solving the corresponding score functions:

$$\frac{\partial\mathcal{l}}{\partial\alpha}=\frac{n}{\alpha}-\zeta\sum_{i=1}^{n}\text{log}\left({b}_{i}\right)-\left(\zeta+1\right)\sum_{i=1}^{n}\frac{\left(1-\theta\right){{b}_{i}}^{-\alpha}\text{log}\left({b}_{i}\right)}{\left[1-\left(1-\theta\right){{b}_{i}}^{-\alpha}\right]},$$
$$\frac{\partial\mathcal{l}}{\partial\beta}=\frac{-n}{\beta}+\left(\zeta\alpha+1\right)\sum_{i=1}^{n}\frac{{x}_{i}}{{\beta}^{2}{b}_{i}}+\left(\zeta+1\right)\sum_{i=1}^{n}\frac{\alpha\left(1-\theta\right){{b}_{i}}^{-\alpha-1}{x}_{i}}{{\beta}^{2}\left[1-\left(1-\theta\right){{b}_{i}}^{-\alpha}\right]},$$
$$\frac{\partial\mathcal{l}}{\partial\theta}=\frac{n\zeta}{\theta}-\left(\zeta+1\right)\sum_{i=1}^{n}\frac{{{b}_{i}}^{-\alpha}}{\left[1-\left(1-\theta\right){{b}_{i}}^{-\alpha}\right]}$$

and

$$\frac{\partial\mathcal{l}}{\partial\zeta}=\frac{n}{\zeta}+n\text{log}\theta-\alpha\sum_{i=1}^{n}\text{log}\left({b}_{i}\right)-\sum_{i=1}^{n}\text{log}\left[1-\left(1-\theta\right){{b}_{i}}^{-\alpha}\right].$$
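
The log-likelihood and its four score components can be coded directly from the expressions above. The Python sketch below is an illustration with hypothetical names (`oplx_loglik`, `oplx_score`); it is not the R code used in the study, and in practice the functions would be passed to a numerical optimizer.

```python
import math


def oplx_loglik(params, data):
    """Log-likelihood of the OPLx model for params = (alpha, beta, theta, zeta)."""
    alpha, beta, theta, zeta = params
    n = len(data)
    s_logb = 0.0
    s_logd = 0.0
    for x in data:
        b = 1.0 + x / beta
        s_logb += math.log(b)
        s_logd += math.log(1.0 - (1.0 - theta) * b ** (-alpha))
    return (n * math.log(alpha) + n * math.log(zeta) - n * math.log(beta)
            + n * zeta * math.log(theta)
            - (zeta * alpha + 1.0) * s_logb - (zeta + 1.0) * s_logd)


def oplx_score(params, data):
    """Score vector: partial derivatives with respect to alpha, beta, theta, zeta."""
    alpha, beta, theta, zeta = params
    n = len(data)
    g_a = n / alpha
    g_b = -n / beta
    g_t = n * zeta / theta
    g_z = n / zeta + n * math.log(theta)
    for x in data:
        b = 1.0 + x / beta
        y = b ** (-alpha)
        d = 1.0 - (1.0 - theta) * y
        logb = math.log(b)
        g_a -= zeta * logb + (zeta + 1.0) * (1.0 - theta) * y * logb / d
        g_b += (zeta * alpha + 1.0) * x / (beta ** 2 * b)
        g_b += (zeta + 1.0) * alpha * (1.0 - theta) * y * x / (beta ** 2 * b * d)
        g_t -= (zeta + 1.0) * y / d
        g_z -= alpha * logb + math.log(d)
    return (g_a, g_b, g_t, g_z)
```

Comparing `oplx_score` with central finite differences of `oplx_loglik` is a simple way to catch sign or indexing mistakes before running an optimizer.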

The LS estimators (LSE) are obtained by minimizing the following function:

$$S\left(\alpha,\beta,\theta,\zeta\right)=\sum_{i=1}^{n}{\left(F\left({x}_{i:n};\alpha,\beta,\theta,\zeta\right)-\frac{i}{n+1}\right)}^{2},$$

with respect to \(\alpha,\beta,\theta,\text{a}\text{n}\text{d}\zeta\).

Similarly, these LSE are also calculated by solving the following equations (for \(k=1,2,3,4\)):

$$\sum_{i=1}^{n}\left(1-{\left\{\frac{\theta{b}_{i:n}^{-\alpha}}{1-\left(1-\theta\right){b}_{i:n}^{-\alpha}}\right\}}^{\zeta}-\frac{i}{n+1}\right){\psi}_{k}\left(\left.{x}_{i:n}\right|\omega\right)=0,$$

where \({b}_{i:n}=\left(1+{x}_{i:n}/\beta\right)\), \(\omega={\left(\alpha,\beta,\theta,\zeta\right)}^{T}\) is a vector of parameters, and

$${\psi}_{1}\left(\left.{x}_{i:n}\right|\omega\right)=\frac{\partial}{\partial\alpha}F\left({x}_{i:n};\omega\right)=\frac{\zeta{\theta}^{\zeta}{b}_{i:n}^{-\alpha\zeta}\text{log}{b}_{i:n}}{\vartheta{\left({x}_{i:n},\alpha,\theta,\beta\right)}^{\zeta+1}},$$
(7)
$${\psi}_{2}\left(\left.{x}_{i:n}\right|\omega\right)=\frac{\partial}{\partial\beta}F\left({x}_{i:n};\omega\right)=-\frac{\alpha\zeta{\theta}^{\zeta}\left(\frac{{x}_{i:n}}{{\beta}^{2}}\right){b}_{i:n}^{-\alpha\zeta-1}}{\vartheta{\left({x}_{i:n},\alpha,\theta,\beta\right)}^{\zeta+1}},$$
(8)
$${\psi}_{3}\left(\left.{x}_{i:n}\right|\omega\right)=\frac{\partial}{\partial\theta}F\left({x}_{i:n};\omega\right)=-\frac{\zeta{\theta}^{\zeta-1}{b}_{i:n}^{-\alpha\zeta}\left(1-{b}_{i:n}^{-\alpha}\right)}{\vartheta{\left({x}_{i:n},\alpha,\theta,\beta\right)}^{\zeta+1}}$$
(9)

and

$${\psi}_{4}\left(\left.{x}_{i:n}\right|\omega\right)=\frac{\partial}{\partial\zeta}F\left({x}_{i:n};\omega\right)=-{\left[\frac{\theta{b}_{i:n}^{-\alpha}}{\vartheta\left({x}_{i:n},\alpha,\theta,\beta\right)}\right]}^{\zeta}\text{log}\left(\frac{\theta{b}_{i:n}^{-\alpha}}{\vartheta\left({x}_{i:n},\alpha,\theta,\beta\right)}\right),$$
(10)

where \(\vartheta({x}_{i:n},\alpha,\theta,\beta)=1-\left(1-\theta\right){b}_{i:n}^{-\alpha}\).

The WLS estimators (WLSE) of the OPLx parameters can be determined by minimizing the function:

$$W\left(\alpha,\beta,\theta,\zeta\right)=\sum_{i=1}^{n}{A}_{i}{\left\{1-{\left[\frac{\theta{b}_{i:n}^{-\alpha}}{1-\left(1-\theta\right){b}_{i:n}^{-\alpha}}\right]}^{\zeta}-\frac{i}{n+1}\right\}}^{2},$$

where \({A}_{i}={\left(n+1\right)}^{2}\left(n+2\right)/\left[i\left(n-i+1\right)\right]\).

Moreover, the WLSE of the OPLx parameters are also obtained by solving the following equations:

$$\sum_{i=1}^{n}{A}_{i}\left\{1-{\left[\frac{\theta{b}_{i:n}^{-\alpha}}{1-\left(1-\theta\right){b}_{i:n}^{-\alpha}}\right]}^{\zeta}-\frac{i}{n+1}\right\}{\psi}_{k}\left(\left.{x}_{i:n}\right|\omega\right)=0,$$

where \({\psi}_{k}\) are provided in Eqs. (7)–(10) for \(k=1,2,3,4\).

The CRVM estimators (CRVME) of the OPLx parameters are obtained by minimizing the following function:

$$C\left(\alpha,\beta,\theta,\zeta\right)=\frac{1}{12n}+\sum_{i=1}^{n}{\left(F\left({x}_{i:n}\right)-\frac{2i-1}{2n}\right)}^{2}.$$

Furthermore, the CRVME are determined by solving the following equation:

$$\sum_{i=1}^{n}\left(1-{\left\{\frac{\theta{b}_{i:n}^{-\alpha}}{1-\left(1-\theta\right){b}_{i:n}^{-\alpha}}\right\}}^{\zeta}-\frac{2i-1}{2n}\right){\psi}_{k}\left(\left.{x}_{i:n}\right|\omega\right)=0.$$
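
The LS, WLS, and CRVM criteria differ only in the plotting positions and weights applied to the fitted CDF at the ordered observations. The following Python sketch (hypothetical function names, shown only as an illustration) writes the three objective functions so that they could be handed to any general-purpose minimizer.

```python
def oplx_cdf(x, alpha, beta, theta, zeta):
    y = (1.0 + x / beta) ** (-alpha)
    return 1.0 - (theta * y / (1.0 - (1.0 - theta) * y)) ** zeta


def ls_objective(params, data):
    """Least-squares criterion with plotting positions i / (n + 1)."""
    xs = sorted(data)
    n = len(xs)
    return sum((oplx_cdf(x, *params) - i / (n + 1)) ** 2
               for i, x in enumerate(xs, start=1))


def wls_objective(params, data):
    """Weighted least squares with weights (n+1)^2 (n+2) / [i (n-i+1)]."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        weight = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
        total += weight * (oplx_cdf(x, *params) - i / (n + 1)) ** 2
    return total


def crvm_objective(params, data):
    """Cramer-von Mises criterion with plotting positions (2i - 1) / (2n)."""
    xs = sorted(data)
    n = len(xs)
    return 1.0 / (12 * n) + sum((oplx_cdf(x, *params) - (2 * i - 1) / (2 * n)) ** 2
                                for i, x in enumerate(xs, start=1))
```

When the data coincide exactly with the model quantiles at the plotting positions, the LS and WLS criteria vanish and the CRVM criterion attains its lower bound \(1/\left(12n\right)\).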

The AD estimators (ADE) of the OPLx parameters can be obtained by minimizing the following function:

$$AD=-n-\frac{1}{n}\sum_{i=1}^{n}\left(2i-1\right)\left\{\text{log}F\left({x}_{i:n}\right)+\text{log}S\left({x}_{n+1-i:n}\right)\right\},$$

where \(S\left(\cdot\right)\) denotes the SF. The ADE can also be obtained by solving the following equations:

$$\sum_{i=1}^{n}\left(2i-1\right)\left[\frac{{\psi}_{k}\left({x}_{i:n}\right)}{F\left({x}_{i:n}\right)}-\frac{{\psi}_{k}\left({x}_{n+1-i:n}\right)}{S\left({x}_{n+1-i:n}\right)}\right]=0.$$

The RAD estimators (RADE) of the OPLx parameters are obtained by minimizing the following function with respect to \(\alpha,\beta,\theta\) and \(\zeta\)

$$R\left(\alpha,\beta,\theta,\zeta\right)=\frac{n}{2}-2\sum_{i=1}^{n}F\left({x}_{i:n}\right)-\frac{1}{n}\sum_{i=1}^{n}\left(2i-1\right)\text{log}S\left({x}_{n+1-i:n}\right),$$

where \(S\left({x}_{n+1-i:n}\right)\) denotes the SF evaluated at \({x}_{n+1-i:n}\).

The uniform spacings of a random sample of size \(n\) from the OPLx distribution are defined by

$${D}_{i}=F\left({x}_{i:n}\right)-F\left({x}_{i-1:n}\right),i=1,\dots,n+1,$$

where \(F\left({x}_{0:n}\right)=0\), \(F\left({x}_{n+1:n}\right)=1\), and \(\sum_{i=1}^{n+1}{D}_{i}=1.\) The MPS estimators (MPSE) of the OPLx parameters can be obtained by maximizing

$$G=\frac{1}{n+1}\sum_{i=1}^{n+1}\text{log}\left({D}_{i}\right).$$

The MPSE of the OPLx parameters can also be calculated by solving

$$\frac{1}{n+1}\sum_{i=1}^{n+1}\frac{1}{{D}_{i}}\left[{\psi}_{k}\left({x}_{i:n}\right)-{\psi}_{k}\left({x}_{i-1:n}\right)\right]=0.$$
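
The AD, RAD, and MPS criteria can be written compactly in code. The Python sketch below is our own illustration with hypothetical names. It uses the standard AD pairing, in which the SF is evaluated at \({x}_{n+1-i:n}\) as in the estimating equations, and it assumes distinct observations so that every spacing is positive.

```python
import math


def oplx_cdf(x, alpha, beta, theta, zeta):
    y = (1.0 + x / beta) ** (-alpha)
    return 1.0 - (theta * y / (1.0 - (1.0 - theta) * y)) ** zeta


def ad_objective(params, data):
    """Anderson-Darling criterion (to be minimized)."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i in range(1, n + 1):
        F_i = oplx_cdf(xs[i - 1], *params)
        S_rev = 1.0 - oplx_cdf(xs[n - i], *params)  # SF at x_{n+1-i:n}
        total += (2 * i - 1) * (math.log(F_i) + math.log(S_rev))
    return -n - total / n


def rad_objective(params, data):
    """Right-tail Anderson-Darling criterion (to be minimized)."""
    xs = sorted(data)
    n = len(xs)
    sum_F = sum(oplx_cdf(x, *params) for x in xs)
    total = sum((2 * i - 1) * math.log(1.0 - oplx_cdf(xs[n - i], *params))
                for i in range(1, n + 1))
    return n / 2.0 - 2.0 * sum_F - total / n


def mps_objective(params, data):
    """Mean log-spacing (to be maximized)."""
    xs = sorted(data)
    n = len(xs)
    F = [0.0] + [oplx_cdf(x, *params) for x in xs] + [1.0]
    return sum(math.log(F[i] - F[i - 1]) for i in range(1, n + 2)) / (n + 1)
```

The MPS criterion reaches its maximum value \(-\text{log}\left(n+1\right)\) when all spacings are equal, which gives a quick check of the indexing.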

Let \({u}_{i}=i/(n+1)\) be an unbiased estimator of \(F\left({x}_{i:n}\vert\alpha,\beta,\theta,\zeta\right)\). Then, the PC estimators (PCE) of the OPLx parameters are obtained by minimizing the following function

$$P\left(\alpha,\beta,\theta,\zeta\right)={\sum}_{i=1}^{n}{\left\{{x}_{i:n}-\beta\left[{\left(1-\frac{\theta-\theta{d}_{i}}{\theta+\left(1-\theta\right){d}_{i}}\right)}^{-\frac{1}{\alpha}}-1\right]\right\}}^{2},$$

with respect to \(\alpha,\beta,\theta,\) and \(\zeta,\) where \({d}_{i}={\left(1-{u}_{i}\right)}^{\frac{1}{\zeta}}\).

The minimization can be performed by differentiating \(P\left(\alpha,\beta,\theta,\zeta\right)\) with respect to each parameter, equating the resulting first-order partial derivatives to zero, and solving the resulting nonlinear system numerically using iterative techniques such as the Newton–Raphson or quasi-Newton methods. The resulting PCE minimize the squared deviation between the empirical quantiles and the theoretical quantiles of the fitted model.
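
The percentile criterion compares each ordered observation with the model quantile at \({u}_{i}=i/(n+1)\). A minimal Python sketch of the objective function, with hypothetical names and given only as an illustration, is:

```python
def oplx_quantile(u, alpha, beta, theta, zeta):
    """Quantile function Q(u) of the OPLx distribution."""
    d = (1.0 - u) ** (1.0 / zeta)
    inner = 1.0 - (theta - theta * d) / (theta + (1.0 - theta) * d)
    return beta * (inner ** (-1.0 / alpha) - 1.0)


def pc_objective(params, data):
    """Sum of squared gaps between ordered data and model quantiles."""
    xs = sorted(data)
    n = len(xs)
    return sum((x - oplx_quantile(i / (n + 1), *params)) ** 2
               for i, x in enumerate(xs, start=1))
```

Because the criterion works on the data scale rather than the probability scale, large observations in the right tail carry more weight here than in the LS criterion.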

Simulation analysis

This section presents a simulation study that evaluates the performance of the eight estimation methods discussed previously. The effectiveness of these methods is assessed based on three key metrics: the absolute biases (\(\left|\text{BIAS}\right|\)), mean square errors (MSE), and mean relative errors (MRE). These metrics are calculated using the following equations:

$$\left|\text{BIAS}\right|=\frac{1}{N}\sum_{i=1}^{N}\left|{\widehat{\omega}}_{i}-\omega\right|,$$
$$\text{MSE}=\frac{1}{N}\sum_{i=1}^{N}{\left|{\widehat{\omega}}_{i}-\omega\right|}^{2}$$

and

$$\text{MRE}=\frac{1}{N}\sum_{i=1}^{N}\left|{\widehat{\omega}}_{i}-\omega\right|/\omega,$$

where \(\omega={\left(\alpha,\beta,\theta,\zeta\right)}^{T}\).
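
The three metrics are simple averages over the \(N\) replications. The Python sketch below (a hypothetical helper, applied to one parameter at a time) computes them from a list of estimates and the true parameter value.

```python
def estimation_metrics(estimates, true_value):
    """Return (|BIAS|, MSE, MRE) for one parameter over N replications."""
    n_rep = len(estimates)
    abs_err = [abs(est - true_value) for est in estimates]
    abs_bias = sum(abs_err) / n_rep
    mse = sum(err ** 2 for err in abs_err) / n_rep
    mre = sum(err / true_value for err in abs_err) / n_rep
    return abs_bias, mse, mre
```

For instance, the estimates 0.4, 0.6, and 0.5 of a true value 0.5 give \(\left|\text{BIAS}\right|=0.2/3\), \(\text{MSE}=0.02/3\), and \(\text{MRE}=0.4/3\).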

The simulated observations from the OPLx model are obtained using its QF. We generate \(N=5000\) random samples, say, \({x}_{1},\dots,{x}_{n}\), for different sizes of \(n=\) 20, 50, 200 and 500 and different parametric values of \(\alpha=\left\{0.5,0.75,1.5,2.75\right\}\), \(\beta=\left\{0.5,0.75,1.5,2\right\}\), \(\theta=\left\{0.67,0.75,1.5,2.5\right\}\) and \(\zeta=\left\{0.67,1.5,2.5,3\right\}\).

The simulation experiments were conducted using R software (version 4.4.0; R Core Team, 2024)28 with the nlminb function from the stats package to estimate the parameters of the OPLx distribution. Tables 2, 3, 4, 5, 6, 7, 8 and 9 summarize the |BIAS|, MSE, and MRE values for the eight estimation methods (MLE, LSE, WLSE, CRVME, MPSE, PCE, ADE, and RADE) under different parameter settings and sample sizes (n = 20, 50, 200, and 500). In each table, superscripts denote the ranking of each estimator’s performance for a given parameter, while the column labeled \(\sum Ranks\) reports the total performance score across all parameters.

Across all parameter combinations, the simulation results demonstrate that the performance of each estimator improves as the sample size increases, confirming the consistency of all methods. Specifically, both MRE and MSE values decrease with increasing n, as expected for asymptotically unbiased estimators.

However, notable differences emerge when comparing methods. The RAD and AD estimators consistently achieve smaller bias and MSE than the traditional methods (ML, LS, and WLS), particularly for small and moderate sample sizes. This highlights their robustness and superior small-sample efficiency. In contrast, the ML method tends to exhibit larger bias and variability, especially when sample sizes are small, indicating its sensitivity to initial parameter values and potential convergence issues in complex likelihood surfaces.

The results in Table 10, which summarize the cumulative and partial rankings of all estimation techniques, confirm these observations. The RADE method attains the best overall rank (total score = 62), followed by ADE and PCE, suggesting that robust and adjusted estimation techniques outperform classical and likelihood-based counterparts. These findings demonstrate the practical advantage of using the RAD approach for precise and stable estimation of the OPLx distribution parameters, regardless of parameter configuration or sample size.

The absolute biases, MSE, and MRE reported in the simulation tables are graphically summarized in Figs. 4, 5 and 6. These plots show that the parameter estimates remain stable across different sample sizes. Moreover, the steady decline in MSE values as the sample size (\(n\)) increases supports the consistency and reliability of all eight estimators examined.

Fig. 4

Absolute bias values for the OPLx parameters obtained using eight estimation methods across varying sample sizes.

Table 2 Simulation results of BIAS, MSE, and MRE for eight estimators under different sample sizes with parameters \(\alpha=0.5,\beta=0.75,\theta=0.67,\zeta=1.5\).
Table 3 Simulation results of BIAS, MSE, and MRE for eight estimators under different sample sizes with parameters \(\alpha=0.75,\beta=0.75,\theta=0.67,\zeta=1.5\).
Table 4 Simulation results of BIAS, MSE, and MRE for eight estimators under different sample sizes with parameters \(\alpha=1.5,\beta=0.75,\theta=0.67,\zeta=1.5\).
Table 5 Simulation results of BIAS, MSE, and MRE for eight estimators under different sample sizes with parameters \(\alpha=2.75,\beta=0.75,\theta=0.67,\zeta=1.5\).
Table 6 Simulation results of BIAS, MSE, and MRE for eight estimators under different sample sizes with parameters \(\alpha=0.5,\beta=1.5,\theta=0.67,\zeta=1.5\).
Table 7 Simulation results of BIAS, MSE, and MRE for eight estimators under different sample sizes with parameters \(\alpha=0.5,\beta=0.75,\theta=1.5,\zeta=1.5\).
Table 8 Simulation results of BIAS, MSE, and MRE for eight estimators under different sample sizes with parameters \(\alpha=0.5,\beta=0.75,\theta=0.67,\zeta=2.5\).
Table 9 Simulation results of BIAS, MSE, and MRE for eight estimators under different sample sizes with parameters \(\alpha=2.75,\beta=2,\theta=2.5,\zeta=3\).
Table 10 Partial and overall ranks of estimation methods across different parameter combinations and sample sizes.
Fig. 5

MSE values for the OPLx parameters under eight estimation methods and different sample sizes.

Fig. 6

MRE values for the OPLx parameters across eight estimation methods and sample sizes.

Three real data applications

This section sheds light on the real data applications of the OPLx distribution to illustrate its flexibility in modeling three real-life datasets. The first dataset refers to remission times (in months) for bladder cancer patients29, and it consists of 128 observations. The second dataset refers to survival times (in days) of 72 guinea pigs infected with virulent tubercle bacilli30. The third dataset represents the fatigue fracture life of Kevlar 373/epoxy subjected to constant pressure at 90% stress level until all had failed31,32.

The OPLx model is compared with other competing Lx models including the FTLLx, MCLx, BELx, KwLx, BXLx, OEHLLx, OLxLL, TWLx, Weibull (W), gamma (Ga), and Lx distributions.

Model selection is based on information criteria (IC) and goodness-of-fit measures: the Akaike IC (AIC), the consistent AIC (CAIC), the negative maximized log-likelihood \(-\mathcal{l}\) (where \(\mathcal{l}\) denotes the maximized log-likelihood), the Cramér–von Mises (\({W}^{*}\)) and Anderson–Darling (\({A}^{*}\)) statistics, and the Kolmogorov–Smirnov (KS) statistic with its associated p-value (KS p-value).
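
As a rough illustration of how such criteria are obtained from a fitted model, the sketch below computes AIC, CAIC and the KS statistic in Python. It is not the authors' code. The function names are hypothetical, CAIC is taken in Bozdogan's form \(-2\mathcal{l}+k(\ln n+1)\), which may differ from the variant used in this paper, and the plain two-parameter Lomax CDF stands in for the OPLx CDF.

```python
import math


def aic(loglik, k):
    """Akaike IC from the maximized log-likelihood and k estimated parameters."""
    return 2 * k - 2 * loglik


def caic(loglik, k, n):
    """Consistent AIC in Bozdogan's form (an assumption; the paper's variant may differ)."""
    return -2 * loglik + k * (math.log(n) + 1)


def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at the i-th order statistic.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def lomax_cdf(x, alpha, lam):
    """CDF of the two-parameter Lomax distribution, used here as a stand-in model."""
    return 1.0 - (1.0 + x / lam) ** (-alpha)
```

Lower AIC, CAIC and KS values indicate a better fit. The KS p-value and the \({W}^{*}\) and \({A}^{*}\) statistics are not computed in this sketch.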

The results of the fitted models for the three datasets are summarized in Tables 11, 12 and 13. These tables report AIC, CAIC, \({W}^{*}\), \({A}^{*}\), \(-\mathcal{l}\), the KS statistic, and the KS p-value. The proposed OPLx distribution consistently achieves the lowest values of all information criteria and the highest KS p-value, indicating superior performance compared with all competing models. This suggests that the OPLx distribution provides the best fit to the bladder cancer, guinea pig survival, and fatigue fracture datasets.

Tables 14, 15 and 16 present the ML estimates (MLEs) of the parameters of the fitted models, along with their corresponding standard errors (SEs) in parentheses. These results confirm the flexibility and stability of the OPLx distribution’s parameter estimates across diverse data types, further supporting its suitability for modeling various real-world lifetime datasets.

Table 11 Goodness-of-fit statistics for different fitted distributions to the bladder cancer data.
Table 12 Goodness-of-fit statistics for different fitted distributions to the guinea pigs data.

To enhance interpretability, the key goodness-of-fit comparisons are illustrated graphically in Figs. 7, 8, 9, 10, 11 and 12, which provide a clearer visual summary of the relative performance of the fitted models. Figs. 7, 8 and 9 present the fitted functions for the three analyzed datasets. The OPLx model fits the empirical data closely, capturing both the central and tail behaviors of the distributions. Figs. 10, 11 and 12 display the quantile–quantile (QQ) plots, which are used to visually assess the adequacy of the OPLx distribution compared to the competing models. In these plots, the theoretical quantiles from each fitted model are plotted against the empirical quantiles of the observed data. For the OPLx distribution, the plotted points align closely with the 45° reference line across all datasets, indicating that the theoretical and empirical quantiles are in strong agreement. This close alignment suggests that the OPLx model represents the entire data range well, including both the tails and the central region. In contrast, deviations from the line in the QQ plots for other models indicate poorer fits. The QQ plots therefore support the conclusion that the OPLx distribution offers a better overall fit to the data than the competing models.
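
The construction of the QQ plots described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' procedure: the plotting positions \((i-0.5)/n\) are one common choice, the function names are hypothetical, the data values are made up, and the two-parameter Lomax quantile function stands in for the OPLx quantile function.

```python
def lomax_quantile(p, alpha, lam):
    """Quantile function of the two-parameter Lomax distribution (stand-in model)."""
    return lam * ((1.0 - p) ** (-1.0 / alpha) - 1.0)


def qq_pairs(data, quantile):
    """Pair each theoretical quantile with the matching empirical quantile.

    The i-th order statistic is matched with the model quantile at the
    plotting position (i - 0.5) / n, which is an assumed convention.
    """
    xs = sorted(data)
    n = len(xs)
    return [(quantile((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]


# Illustrative values only; these are not taken from the paper's datasets.
pairs = qq_pairs([3.0, 1.0], lambda p: lomax_quantile(p, alpha=1.0, lam=1.0))
```

Plotting the first element of each pair against the second and comparing the points with the 45° line gives the QQ plot; points far from the line flag regions where the fitted model misrepresents the data.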

Table 13 Goodness-of-fit statistics for different fitted distributions to the fatigue fracture data.
Table 14 ML estimates (with SE in brackets) of different distributions for bladder cancer data.
Table 15 ML estimates (with SE in brackets) of different distributions for guinea pigs data.
Table 16 ML estimates (with SE in brackets) of different distributions for fatigue fracture data.
Fig. 7 Plots of the fitted functions of the OPLx distribution for bladder cancer data.

Fig. 8 Plots of the fitted functions of the OPLx distribution for guinea pigs data.

Fig. 9 Plots of the fitted functions of the OPLx distribution for fatigue fracture data.

Fig. 10 The QQ plots of the OPLx distribution and other distributions for bladder cancer data.

Fig. 11 The QQ plots of the OPLx distribution and other distributions for guinea pigs data.

Fig. 12 The QQ plots of the OPLx distribution and other distributions for fatigue fracture data.

Discussion

The OPLx distribution was used to model three real-life datasets, demonstrating its flexibility and superior fit when compared to several competing models, namely the FTLLx, MCLx, BELx, TWLx, KwLx, BXLx, OEHLLx, OLxLL, W, Ga, and Lx distributions. The goodness-of-fit statistics for the OPLx distribution and the other models are presented in Tables 11, 12 and 13 for each of the three datasets. For all three datasets, the OPLx distribution yielded the lowest values of the information criteria and goodness-of-fit statistics and the highest KS p-value, indicating the best fit among the models considered. It outperformed all other models, including FTLLx and TWLx, which were the closest competitors. Visual comparisons, shown in Figs. 7, 8 and 9, further confirm that the OPLx model closely aligns with the observed data patterns, underscoring its effectiveness for modeling such datasets.

Tables 14, 15 and 16 further demonstrate that the OPLx model accurately captures the relationships within the datasets. The associated SEs of these estimates reflect the precision of the OPLx model, solidifying its reliability for real-world data analysis.

In conclusion, the OPLx distribution proves to be a highly versatile and robust model for various real-world applications, offering a superior fit compared to both traditional distributions such as the W and Ga, and other Lx-based models. Its ability to accommodate diverse failure rate behaviors makes it a valuable tool in fields such as medical research, engineering, and materials science. Future research could explore extending the OPLx model to incorporate covariates or other advanced techniques to further enhance its applicability across different types of data.

Conclusions

This paper introduces the odd Pareto–Lomax (OPLx) distribution, a novel four-parameter extension of the Lomax model. The OPLx distribution is notable for its ability to exhibit a wide range of failure rate behaviors. Through a comprehensive analysis of the model’s properties and the use of maximum likelihood estimation alongside seven other classical estimation methods, we demonstrate the versatility and robustness of the OPLx distribution for a variety of sample sizes. Simulation results confirm the reliability of these estimators, with the RTAD approach emerging as the most effective method for parameter estimation. Furthermore, when applied to three real-world datasets, the OPLx distribution outperforms other Lomax-based models. The OPLx distribution represents a significant advancement in the modeling of data with extreme values or heavy-tailed characteristics, areas where traditional models typically struggle. Its ability to capture complex hazard rate behaviors and varying data patterns makes it a powerful tool for researchers and practitioners dealing with datasets that exhibit non-standard failure rate behavior. The findings of this study have important implications, offering a more flexible and accurate alternative to existing models and contributing to more effective modeling in fields such as reliability analysis, risk assessment, and survival analysis. By addressing the limitations of existing models, the OPLx distribution provides new opportunities for more accurate data analysis and opens avenues for further research on the extension and application of heavy-tailed distributions in various domains.

The OPLx distribution’s ability to better model extreme events can be invaluable for detecting outliers and rare occurrences, which are crucial in many Artificial Intelligence and Machine Learning tasks, such as fraud detection, fault diagnosis, and predictive maintenance. Additionally, the OPLx distribution can be adopted in machine learning models that deal with imbalanced datasets, where rare but significant instances (such as rare events or extreme behaviors) need to be given more weight for accurate prediction. In future work, we plan to explore the application of the OPLx distribution in machine learning, particularly for analyzing medical data. Medical datasets often involve rare and extreme events, where traditional methods may fall short in accurately capturing these phenomena. By leveraging the flexibility and robustness of the OPLx distribution, we aim to enhance the modeling of heavy-tailed distributions and improve prediction accuracy. This approach holds the potential to surpass conventional methods, offering more reliable insights in medical applications and other fields dealing with extreme values. We look forward to investigating this promising direction in forthcoming research.