Abstract
In the recent years, there has been a notable shift in the landscape of statistical and data science research, with increasing attention directed toward the development of advanced probability distributions aimed at addressing the challenges posed by medical and health-related data. In response to this need, the present study introduces a novel probability distribution model termed as the Exponentiated Odd Lomax Power Lindley distribution constructed by embedding the Power Lindley distribution within the Lomax family framework through the T-X transformation technique. This article delves into the application of probability distributions in medical data analysis, with a special focus on their role in understanding and managing the COVID-19 data. A comprehensive theoretical investigation is conducted, deriving key statistical properties including the probability and cumulative distribution functions, reliability functions, moments, moment generating function, order statistics, and entropy measures. To ensure accurate parameter estimation, several estimation methods are employed and rigorously compared through simulation studies. The real-world applicability of the developed distribution is illustrated through the modelling of COVID-19 datasets, where its performance is evaluated against a range of well-established probability distribution models. Empirical findings demonstrate the superior adaptability and modelling accuracy of the distribution while analysing real-world datasets.
Similar content being viewed by others
Introduction
In the field of medical science, there has been an ongoing challenge and need in transforming vague and complex data into useful insights that can improve healthcare outcomes. Given the sophisticated nature of human health influenced by environmental, biological and behavioural factors, the data is rarely comprehensible. It is generally high-dimensional, noisy and subject to varied degrees of uncertainty, making it complicated to draw specific conclusions. To overcome these challenges, the application of probability distribution models has become increasingly indispensable. Such models provide systematic approach to quantify and understand the inherent randomness in medical science data, enabling researchers to draw meaningful insights that are otherwise unpredictable. In particular, the COVID-19 pandemic has emphasized the significance of probabilistic models in studying and analysing global health problems. The stochastic nature of the pandemic, where outcomes are affected by different variables, need novel statistical approaches to model and forecast.
In response to this global challenge, numerous researchers have analysed COVID-19 data using a variety of probabilistic models to better understand its spread and impact. These efforts primarily focused on proposing models that offer improved fit and predictive accuracy for pandemic-related data. Such developments are useful for supporting evidence-based decision making and improving diagnostic accuracy. For example, Exponentiated transformation of Gumbel type-II distribution is studied by Sindhu et al.1 to model COVID-19 death case data. In this study, the authors have examined various mathematical properties of the model and have given their in-depth analysis. Hassan et al.2 provided a new generalization of the inverted Topp-Leone model by combining with the Kumaraswamy-G family. The application of the developed distribution to model COVID-19 data, demonstrated that this new distribution provides better fit than the other existing distributions. Azm et al.3 developed transmuted generalised Lomax distribution to analyse COVID-19 datasets. Odd Weibull Inverted Topp-Leone distribution explored by Almetwally4 as a generalisation of inverted Topp-Leone distribution with the Weibull-G family. Similarly, Almongy et al.5 studied extended Odd Weibull Rayleigh distribution, specifically designed to model COVID-19 mortality rates. The authors6,7,8,9 and10 have explored the novel probability distribution models and have demonstrated their superiority to model real-world scenarios through their application to COVID-19 pandemic. By compounding the Half-Logistic and Odd Frechet-G class, Bhat et al.11 introduced the Odd Frechet Half-Logistic distribution. Generalising the Lindley distribution by DUS transformation, the authors12 highlighted the effectiveness of the developed model while analysing real-life data sets. More recently, Qayoom et al.13 proposed a novel extension of the Rayleigh distribution. This new model was validated using COVID-19 data and compared with several established distributions. In a related contribution, Rather and Subramanian14 introduced the exponentiated Mukherjee-Islam distribution and demonstrated its practical utility through empirical applications. Extending this research field, Rather and Ozel15 developed the Weighted Power Lindley distribution, and in a subsequent study16, introduced a length-biased variant of the same. Both models were analysed in terms of their structural properties and real-world relevance. Moreover, the authors17,18, and19 have made notable contributions to the field by developing innovative extensions of probability distribution models and validating their effectiveness through real-world applications.
All these advancements reflect a broader trend in recent years to development flexible probability models that can better capture the complex characteristics of real-world data. One effective strategy to achieve such flexibility is through the use of generator functions that can transform existing baseline distributions into new classes of probability models. In this article, we used the transformed-transformer (T-X) family by Alzaatreh et al.20 to introduce new generalized probability distribution model, with cumulative distribution function (CDF) as:
,
Where \(f(v)\) is the probability density function of a random variable \(V\) taking values in the range \(\left[ {m_{1} , m_{2} } \right]\) such that \(- \infty \le m_{1} < m_{2} \le + \infty\); \(Z\) is another random variable with CDF \(G_{Z} (z)\), and \(H(z)\) is a function of \(G_{Z} (z)\). This function takes different forms as mentioned by Alzaatreh et al.20. In this study, we consider the exponentiated odds function form,
Therefore, the resulting cumulative distribution function of the new generalised probability distribution is
where, \(F_{V} (v)\) is the cumulative distribution function of random variable \(V\).
Recently, many researchers have focused on extending the flexibility of probability distributions by applying the T-X transformation approach. For instance, Hassan and Mohamed21 introduced the Weibull Inverse Lomax distribution, a four-parameter lifetime model. The authors22 developed the new generalization of the Power Lomax distribution with Topp-Leone type-II family. Using the Gompertz-G family, Gompertz Frechet distribution is introduced by Oguntunde et al.23. Odd Frechet inverse Lomax distribution studied by ZeinEldin et al.24 as an extension of inverse Lomax distribution using Frechet-G family. Marshall-Olkin Alpha Power Weibull distribution introduced by Almetwally et al.25 as an extended generalization of the Weibull distribution. The new Exponential-X Frechet distribution developed by Alzeley et al.26 through T-X method to increase the flexibility of Frechet distribution. The Odd Lomax inverted Nadarajah-Haghighi distribution has been studied by Almongy et al.27 by incorporating inverted Nadarajah-Haghighi distribution with the Lomax-G family. Similarly, the authors28,29,30,31,32,33,34,35 and36 have explored many probability distribution models using T-X transformation approach and have demonstrated their practical utility while analysing real-life datasets. Building upon this foundation, our study introduces a novel extension of the Power Lindley distribution, aiming to achieve superior performance compared to existing models. By capturing the complex dynamics of the data more effectively, this new distribution has the potential to offer valuable insights and contribute meaningfully to public health planning and decision-making.
The structure of the manuscript is organised to present a comprehensive understanding of the proposed model. Section “Exponentiated Odd Lomax Power Lindley distribution” outlines the derivation of the probability density function, cumulative distribution, survival function and failure rate, together with their graphical representation. Section “Linear Representation” is dedicated to a linear representation of the various theoretical properties of the developed model mentioned in Section “Exponentiated Odd Lomax Power Lindley distribution”. Section “Statistical Properties” explores the statistical moments, and reliability analysis measures of the underlying probability distribution model. Section “Ordered statistics” and Section “Entropy measures” address ordered statistics and entropy measures respectively. Parameter estimation techniques are discussed in detail under Sect. “Parameter Estimation”. Section “Simulation” presents an extensive simulation study to assess the performance of the different estimation techniques. In Sect. “Application”, the applicability of the model is demonstrated through analysis of real-life data sets. Finally, Sect. “Conclusion” concludes the manuscript with a summary of key findings and discussions.
Exponentiated odd lomax power lindley distribution
In this study, we introduce a novel and flexible probability distribution, termed the Exponentiated Odd Lomax Power Lindley (EOLxPL) distribution, by integrating the Power Lindley (PL) distribution as the baseline model within the Lomax-G family framework, as proposed by Dhungana and Kumar30. This construction utilizes the versatile T-X transformation method, previously outlined in Sect. “Introduction”, which facilitates the development of complex distributional forms through systematic approach.
The Lomax-G family is a well-known heavy-tailed model extensively employed in diverse domains such as reliability engineering and survival analysis, medical sciences, actuarial science, economics, and environmental studies. Its capacity to accommodate extreme values and long right tails makes it a suitable generator for constructing new, flexible families of distributions such as Lomax-normal, Lomax–Weibull, Lomax-loglogistic, Lomax–Pareto distributions and many more. These models have shown greater flexibility in analysing real-life data.
The probability density function (PDF) and cumulative distribution function (CDF) of the Lomax distribution37 are respectively defined as:
and
An essential tool used in enhancing distributional shapes is the Exponentiated Odd transformation, which modifies the baseline CDF by introducing parameters that offer control over skewness, kurtosis, and tail behaviour. This transformation, as introduced by Alzaatreh et al.20, is particularly effective when modelling datasets with heavy tails, asymmetry, or multimodal patterns. The general form of the Exponentiated Odd transformation applied to a baseline CDF is given by:
In the formulation of the EOLxPL distribution, we extend the Lomax generator with the exponentiated odd transformation applied to the Power Lindley distribution using the T-X methodology. This results in a five-parameter model that summarises the combined advantages of the three foundational techniques: The Power Lindley distribution for baseline flexibility, the Lomax distribution for tail robustness, and the exponentiated odd transformation for shape control.
The PL distribution, a two-parameter model introduced and extensively analysed by Ghitany et al.38, offers greater flexibility compared to the standard Lindley distribution. It allows for more varied shapes in the density and hazard rate functions, as well as adjustments in skewness and kurtosis. Assuming that \(Z\) be a random variable following PL distribution with parameters \((\gamma ; \delta )\). The PDF and the corresponding CDF are given by:
and
Now using Eq. (2) in the transformation explained in Eq. (1) we have
where \(G_{Z} (z)\) is the CDF of PL distribution defined in Eq. (4). Substituting the Eq. (4) in Eq. (5), we get
The expression represented by Eq. (6) is the required CDF of the newly explored probability distribution model named as EOLxPL distribution. The graphical pattern of the CDF of EOLxPL distribution is shown in Fig. 1, Fig. 2 as below:
To derive the PDF of EOLxPL distribution, we take the derivative of Eq. (5) with respect to z, which gives:
where \(g(z; \gamma ,\delta )\) and \(G_{Z} (z)\) are the PDF and CDF of PL distribution defined in Eq. (3) and (4) respectively. Substituting Eq. (3) and (4) in Eq. (7) we obtain
Equation (8) is the required PDF of EOLxPL distribution. The behaviour of the PDF of EOLxPL distribution is graphical explained in Fig. 3, Fig. 4 for different combination of parameters.
The PDF plot as shown in Fig. 3, demonstrate how variations in the shape parameters \(\gamma\) and \(\delta\), with fixed values of other parameters \(\alpha = 1.3\), \(\beta = 1.1\), and \(\lambda = 4\) systematically influence the distributions form. A consistent pattern emerges as \(\gamma\) and \(\delta\) increase: the PDF becomes more peaked, less skewed, and increasingly concentrated around lower values of the random variable \(Z\). Additionally, the tail behaviour changes significantly—distributions with smaller shape parameters exhibit heavier right tails, indicating a greater likelihood of extreme outcomes, while those with larger parameters display shorter, and lighter tails. The overall trend indicates a transition from broad, skewed, and heavy-tailed distributions to sharply peaked forms. Similarly, we can interpret Fig. 4.
Thus, the behaviour of PDF is strongly influenced by its underlying parameters, namely \(\alpha\),\(\beta\), \(\gamma\), \(\delta\), and \(\lambda\), which respectively represent shape and scale. The parameters \(\alpha\), \(\gamma\), \(\delta\), and \(\lambda\) serve as shape parameters, governing the overall form of the distribution, including features like skewness, tail behaviour, and peak sharpness. The scale parameter \(\beta\) controls the spread or dispersion of the distribution. Collectively, these parameters allow for high flexibility, enabling the PDF to adapt to a wide range of data characteristics by adjusting its symmetry, concentration, and central tendency. Such parameter-driven adaptability makes the distribution particularly effective for modelling diverse real-world datasets across various applications.
The survival function, which describe the behaviour of the EOLxPL distribution over time, are mathematically expressed as follows:
The graphical behaviour of the survival function over different values of parameters is presented in Fig. 5 and Fig. 6.
Figure 5 and Fig. 6 illustrate the impact of varying different parameter values on the survival function’s behaviour. In Fig. 5, when parameters \(\gamma\), \(\delta\), and \(\lambda\), are varied while keeping the remaining parameters \(\alpha\) and \(\beta\) constant, the survival curve shifts towards the origin. This indicates a higher likelihood of early failures, leading to a reduced expected lifespan and diminished system reliability. Conversely, in Fig. 6, parameters \(\alpha\),\(\beta\), and \(\lambda\) while holding \(\gamma\), and \(\beta\) constant, results in the survival function shifting to the right. This shift signifies prolonged survival times, suggesting a lower failure rate, increased system longevity, and enhanced overall reliability.
The failure rate is a time-dependent function that quantifies the instantaneous likelihood of failure, given that the system or component has survived up to that specific time. For the EOLxPL distribution, the failure rate denoted by \(F_{R} (z)\), is given by:
Figure 7 depicts the variation in the failure rate of the EOLxPL distribution corresponding to different parameter values.
The failure rate plot shown in Fig. 7, demonstrates the flexibility and richness of the model in capturing various failure rate behaviours through parameter variation. The curves exhibit a wide range of shapes, including monotonically increasing and decreasing, unimodal (hump-shaped), and bathtub-shaped hazard rates. This diversity reflects the model’s ability to represent systems with different reliability characteristics—such as early-life failures, aging effects, and systems with constant or varying risk over time. Such adaptability makes the model highly suitable for real-world reliability and survival analysis data where failure patterns are often complex and non-linear.
Linear representation
In this section, we will provide linear representation of the various theoretical properties of the EOLxPL distribution mentioned in Section “Exponentiated Odd Lomax Power Lindley distribution” such as PDF, CDF, survival function and failure rate. With the help of these linear representations, various statistical properties of the developed model can be easily evaluated. Such approach aims to enhance our ability to examine and interpret the behaviour of the EOLxPL distribution and draw valuable insights for its application and further development.
Consider the power series expansion for \((1 + \Delta )^{ - s}\) and \((1 - \Delta )^{ - s}\) provided \(|\Delta | < 1\), which are respectively expressed mathematically as
and
From Eq. (5), we have
To ensure the validity of the series expansion used in the expression
it is necessary to establish the convergence of the associated power series. Specifically, we express this term as a generalized binomial expansion of the form \((1 + \Delta )^{ - s} = \sum\limits_{i = 0}^{\infty } {\left( {{}^{ - s}C_{i} } \right) \Delta^{i} }\), which converges absolutely for \(|\Delta | < 1\). In our case, the expansion variable is
where \(G_{Z} (z)\) is the CDF, satisfying \(0 \le G_{Z} (z) \le 1\), so for the convergence of the expansion across the domain of interest, the following condition of convergence must be satisfied
Under this condition the expansion.
\(\left\{ {1 + \frac{1}{\beta }\left( {\frac{{G_{Z} (z)}}{{1 - G_{Z} (z)}}} \right)^{\lambda } } \right\}^{( - \alpha )} = \sum\limits_{i = 0}^{\infty } {\left( {{}^{ - \alpha }C_{i} } \right) \frac{1}{{\beta^{i} }}\left( {\left( {\frac{{G_{Z} (z)}}{{1 - G_{Z} (z)}}} \right)^{\lambda } } \right)^{i} }\) converges absolutely.
Now, Using the power series expansion mentioned in Eq. (9) and Eq. (10), we obtain
Again using the power series expansion for \(\left[ {1 - \left\{ {1 + \left( {\frac{\delta }{\delta + 1}} \right)z^{\gamma } } \right\}e^{{ - \delta z^{\gamma } }} } \right]^{\lambda i + j}\), and it will be expanded as
On substituting the value of above expression in Eq. (11), the required representation of the CDF of EOLxPL distribution is
where,
Similarly, the PDF, survival function and failure rate of EOLxPL distribution using power series expansions can be expressed as
where,
and
Statistical properties
Moments
The \(q^{th}\) moment about origin for the EOLxPL distribution usually denoted by \(\mu^{\prime}_{q}\) is given by
For computing the moments of the EOLxPL distribution, we will use linear representation of the PDF explained in the Eq. (13). Therefore, from Eq. (14) we have
Now make transformation \(z^{\gamma } = y\), so \(dz = \frac{{y^{{\frac{1 - \gamma }{\gamma }}} }}{\gamma }dy\). Also \({\text{when}} z \to 0,{\text{then}} y \to 0\) and \({\text{when}} z \to \infty , {\text{then}} y \to \infty\).
Putting \(q = 1, 2, 3, 4\) in Eq. (15), we obtain first four moments about origin for the EOLxPL distribution and are expressed as
So, the variance of the EOLxPL distribution usually denotes as \(\mu_{2}\) is calculated from the above expressions using the relation
A detailed overview is presented in Table 1, highlighting how the central tendency, dispersion, and shape characteristics of the EOLxPL distribution vary with changes in its parameters.
Incomplete moments
The \(q^{th}\) incomplete moment about origin based on the EOLxPL distribution is defined as
Using the above transformation and after simplification we get,
Moment generating function
The moment generating function associated with EOLxPL distribution is given by
Similarly, the characteristics function \(Cz(t)\) and the cumulant generating function \(Kz(t)\) are mathematically expressed as
Mean residual life and mean past life
In the field of reliability theory and survival analysis, the concept of mean residual life (MRL) and mean past life (MPL) are essential for studying the behaviour of lifetimes of systems or components. These functions provide useful insights about the remaining and elapsed lifetime of a component, which are essential in modelling, diagnostics, and decision-making in engineering, biomedical studies, and actuarial science.
The MRL determines the average time a component is expected to continue working after time \(T\), provided it is still working at time \(T\). This measure is important in maintenance and warranty analysis, helping to assess when a component may likely fail after operating for a certain period. Under the assumption that the lifetime \(T\) of the component follows EOLxPL distribution, the MRL is defined as:
where, \(\Phi_{T} (t)\) is the CDF of EOLxPL distribution with expression given in Eq. (12).
Similarly, The MPL measures the average lifetime of the components that have failed before or at time T. It presents the past behaviour of failures and helps in studying post-failure analysis of the component. The MPL is important in the fields like warranty return analysis, used product market analysis, forensics, and many more areas. For the lifetime \(T\) of the component following EOLxPL distribution, the MPL is defined as:
Ordered statistics
Ordered statistics is a fundamental concept in statistical theory and data analysis, focusing on the properties and behaviour of data points when arranged in a specific order. Specifically, it involves studying the values that correspond to certain positions in a sorted sequence of observations. Ordered statistics plays an important role in both descriptive and inferential statistics, helping to characterize data, identify trends, and perform hypothesis testing.
Ordered statistics refer to the statistics that are derived from the values of sample observations once they have been arranged in ascending or descending order. Consider a random sample \(z_{1} , z_{2} , z_{3} ,..., z_{n}\) of size \(n\) following EOLxPL distribution. Then the ordered statistics corresponding to the given random sample is \(\left( {Z_{(1)} , Z_{(2)} , Z_{(3)} ,..., Z_{(n)} } \right)\) such that.
\(Z_{(1)} = \min (z_{1} , z_{2} , z_{3} ,..., z_{n} )\) and \(Z_{(n)} = \max (z_{1} , z_{2} , z_{3} ,..., z_{n} )\).
In this section, we will obtain the PDF for the minimum and maximum ordered statistics. The PDF of the \(i^{th}\) ordered statistics based on EOLxPL distribution is
On substituting Eq. (6) and Eq. (8) in the above expression, the required PDF of the \(i^{th}\) ordered statistics is given by
Put \(i = 1 {\text{and}} n\) in Eq. (16), we get the PDF of minimum and maximum ordered statistics for EOLxPL distribution respectively. That.
Entropy measures
Entropy is a foundational concept in several scientific disciplines, including information theory, physics (thermodynamics), statistics, and even machine learning. While its specific interpretation varies by field, the core idea revolves around uncertainty, randomness, or information content.
Entropy was formally introduced by Claude Shannon39 as a measure of the uncertainty or unpredictability associated with a random variable or information source. If the outcome of a random process is highly predictable, entropy is low (close to zero). If the outcome is very uncertain or random, entropy is high. In this section we consider two entropy measures namely Renyi’s entropy40 and Tsalli’s entropy41 for the EOLxPL distribution. The Renyi’s entropy associated with EOLxPL distribution is defined as:
Using the power series expression for \(\left[ {1 + \frac{1}{\beta }\left\{ {\frac{{1 - \left( {1 + \left( {\frac{\delta }{\delta + 1}} \right)z^{\gamma } } \right)e^{{ - \delta z^{\gamma } }} }}{{1 - \left\{ {1 - \left( {1 + \left( {\frac{\delta }{\delta + 1}} \right)z^{\gamma } } \right)e^{{ - \delta z^{\gamma } }} } \right\}}}} \right\}^{\lambda } } \right]^{ - \varsigma (\alpha + 1)}\)we get,
Again, using power series expression and after some simplification, we obtain
Now make a substitution \(w = z^{\gamma }\). So,
Also, when \(z \to 0, {\text{then}} w \to 0\) and when \(z \to \infty , {\text{then}} w \to \infty\). Therefore,
This is the required expression for the Renyi’s entropy for the EOLxPL distribution. Similarly, the Tsalli’s entropy based of EOLxPL distribution is given by
Parameter estimation
In statistical modelling, estimating the parameter(s) of a probability distribution is fundamental for fitting the model to observed data. This section discusses a range of estimation techniques employed to determine the unknown parameter(s) of a distribution. Each method operates by optimizing a particular objective function, which is formulated based on the observed sample and depends on the parameter(s) of interest. The following are the principal methods considered:
Maximum likelihood estimation is a widely used method for estimating the parameters of the probability distribution models. It involves constructing the likelihood function, which represents the joint probability density function of the observed sample given the parameters. The estimate of the parameters is obtained by maximizing this function with respect to each of the parameter. For computational convenience, the log-likelihood function is often used, and the parameter are estimated by solving the resulting likelihood equation. For EOLxPL distribution, the objective function for estimating distribution parameters is mathematically given by:
where, \(\varphi (z; \alpha , \beta , \gamma , \delta , \lambda )\) is the PDF of EOLxPL distribution, and \(z_{1} , z_{2} , z_{3} ,..., z_{n}\) is the random sample of size \(n\) drawn from EOLxPL distribution.
The Anderson–Darling estimation method (ADEM) is another approach for estimating parameters. This method focuses on minimizing the Anderson–Darling statistics, which quantifies the distance between the empirical cumulative distribution function and the theoretical cumulative distribution function of the model. It gives more weight to the tails of the distribution, making it suitable for application where tail behaviour is of particular interest. The objective function for Anderson–Darling method in order to obtain parameters of EOLxPL distribution is expressed as
where, \(\Phi_{Z} (z)\), and \(S_{Z} (z)\) denotes the CDF and survival function respectively of EOLxPL distribution.
Right-tailed Anderson–Darling estimation method is a variant of ADEM that emphasizes the right tail of the distribution. The objective function of this method is a modified version of the standard Anderson–Darling statistics and is mathematically expressed as:
Similarly, the left-tailed version of the ADEM focuses on the lower tail of the distribution. The estimate of the parameters of the EOLxPL distribution under this method is obtained by minimizing the objective function given by:
The Cramer-Von Mises method estimates parameter(s) by minimizing a Cramer-Von Mises statistic that evaluates the squared difference between the empirical cumulative distribution function and the theoretical cumulative distribution function of the model across all data points. Unlike the ADEM, it assigns equal weight to all parts of the distribution, offering a more uniform fit over the entire range. For Cramer-Von Mises estimation method, the objective function to estimate parameters of EOLxPL distribution is equal to:
Least squares estimation method aims to minimize the sum of squared differences between the empirical cumulative distribution function and the theoretical cumulative distribution function values. In this method the objective function that is to be minimized is generally presented as:
As an extension of least squares estimation, the weighted least square estimation method introduces a weight function into the objective function. This allows the method to assign different levels of importance to different regions of the distribution, thereby improving flexibility and robustness. The estimation involves minimizing the weighted sum of squared differences between the empirical and theoretical cumulative distribution functions, and is enumerated as:
The maximum product of spacing method of estimation is based on maximizing the product of spacing between successive theoretical cumulative distribution function values at ordered sample points. The objective function that needs to maximized to estimate unknown parameters represents the geometric mean of these spacing, which reflects how well the model fits the data and is mathematically expressed as:
where, \(\Omega_{i} = \Phi_{{Z_{(i)} }} (z) - \Phi_{{Z_{(i - 1)} }} (z)\), and \(\Phi_{{Z_{(i)} }} (z)\) is the CDF of \(i^{th}\) ordered statistics of EOLxPL distribution.
In this approach, the estimation is performed by minimizing the sum of absolute differences between adjacent spacing of the theoretical CDF evaluated at ordered sample values. So, the objective function in order to estimate unknown parameters of the given distribution is given by:
The minimum spacing absolute log-distance estimation method is based on taking the logarithm of the absolute spacing differences before summing them. The log transformation helps to stabilize variance and enhance estimation performance in distributions where spacing varies significantly across the support. To obtain estimates of the unknown parameters of the EOLxPL distribution under minimum spacing absolute log-distance estimation method, the following objective function is to be minimized:
Simulation
To evaluate the performance of the various parameter estimation techniques proposed in Sect. “Parameter Estimation” for the EOLxPL distribution, a comprehensive Monte Carlo simulation study is carried out using the R-software42. The primary aim of this study is to systematically compare the efficiency, precision, and reliability of each estimation method under different sampling conditions. The following terminology is used to denote these estimation methods:
This simulation framework is designed to assess the estimators using three widely accepted performance measures: Bias, Mean Squared Error (MSE), and Mean Relative Error (MRE). These metrics respectively quantify the average deviation of the estimator from the true parameter, the variability of the estimation error, and the relative magnitude of the error in relation to the true parameter value. Together, they offer a holistic view of the quality of each estimation method. Under this simulation study, the random samples are generated from the EOLxPL distribution using the inverse cumulative distribution function method. The four distinct sets of initial parameter values used in this simulation study are:
Set 1-\((\alpha = 0.25, \beta = 0.05, \gamma = 0.50, \delta = 0.15, \lambda = 0.90)\).
Set 2-\((\alpha = 0.40, \beta = 0.10, \gamma = 0.35, \delta = 0.25, \lambda = 0.70)\).
Set 3-\((\alpha = 0.15, \beta = 1.05, \gamma = 1.30, \delta = 0.35, \lambda = 1.15)\).
and Set 4-\((\alpha = 0.20, \beta = 0.95, \gamma = 1.15, \delta = 0.25, \lambda = 1.05)\).
These values are selected to cover a wide range of possible conditions for the given parameters defined on \((0,\infty )\). These values represent both small and moderately large cases, allowing us to see how different estimation methods behave across different parts of the parameter space. This range helps test whether each method is stable and reliable when the parameters change in size. Some estimation techniques may work well when parameters are small but may lose accuracy as values increase. By using a varied set of values, we can better understand if a method performs consistently or only under certain conditions. To ensure the robustness of the results, each simulation is replicated 1000 times for a series of predefined sample sizes: 20, 50, 100, 150, 250, 400, and 600. These sample sizes are chosen to reflect small to moderately large data scenarios that may arise in practical applications. For each replicate, parameter estimates are obtained using the different estimation techniques under consideration. The performance metrics (Bias, MSE, and MRE) are then computed across all replications for each method and sample size. The corresponding mathematical expressions are given by:
\(Bias(\hat{\Theta }) = \frac{1}{r}\sum\limits_{m = 1}^{r} {|\hat{\Theta }_{m} - \Theta |}\),\(MSE = \frac{1}{r}\sum\limits_{m = 1}^{r} {\left( {\hat{\Theta }_{m} - \Theta } \right)}^{2}\), and \(MRE = \frac{1}{r}\sum\limits_{m = 1}^{r} {\left| {\frac{{\hat{\Theta } - \Theta }}{\Theta }} \right|}\).
Where, \(\hat{\Theta }_{m}\) denotes the estimate of the parameter in the \(m^{th}\) replication, \(\Theta\) is the true parameter value, and \(r\) is the total number of replications or iterations.
The computed values for each metric across all the methods and sample sizes are presented in Table 2, Table 3, Table 4 and Table 5. These results provide a detailed comparative analysis that highlights the relative strengths and limitations of each estimation technique. Specifically, they allow us to identify which methods perform more consistently as sample size increases and which technique may be more sensitive to sample variability.
To evaluate and compare the performance of various estimation methods, ranks were assigned to the Bias, MSE, and MRE values based on their magnitudes based on their magnitudes in ascending for each parameter corresponding to the same sample size. Table 6 presents the partial and overall ranks for each estimation method, derived by summing the individual ranks of Bias, MSE, and MRE. As seen in Table 2 to Table 5, all estimation methods exhibit reduced Bias, MSE, and MRE with increasing sample size, indicating improved accuracy and consistency of the estimators. Table 6 shows that the minimum spacing absolute log-distance estimation method consistently achieved the lowest overall rank followed by minimum spacing absolute distance and maximum likelihood estimation methods, demonstrating their superiority in estimating parameters of the EOLxPL distribution.
Overall, this simulation study offers valuable insights into practical performance of the proposed estimators, supporting theoretical findings and guiding the selection of appropriate estimation methods for empirical application involving the EOLxPL distribution.
Application
This section illustrates the practical utility and effectiveness of the newly proposed EOLxPL distribution in modelling real-world data. To validate its performance, we analyse two real life datasets pertaining to COVID-19 cases from China and Italy. The objective is to demonstrate the superiority of the EOLxPL distribution compared to several existing models, namely Exponentiated Odd Lomax Log Logistic (EOLxLogL), Exponentiated Odd Lomax Frechet (EOLxFrt), Exponentiated Odd Lomax Lomax (EOLxLx), Frechet (Frt), Lomax (Lx) and Power Lindley (PL) distributions. All data analysis and computations are carried out using the R-software42.
Dataset 1: This data set is about number of daily deaths due to COVID-19 in China from 23rd January to 28th March. The data set has been taken from (https://www.worldometers.info/coronavirus/country/china/) and had been studied by Sindhu et al43.:
8, 16, 15, 24, 26, 26, 38, 43, 46, 45,57,64,65,73,73,86,89,97,108, 97, 146, 121, 143, 142, 105, 98, 136, 114, 118, 109, 97, 150, 71, 52, 29, 44, 47, 35, 42, 31, 38, 31, 30, 28, 27, 22, 17, 22, 11, 7, 13, 10, 14, 13, 11, 8, 3, 7, 6, 9, 7, 4, 6, 5, 3, 5
Dataset 2:The second dataset presents COVID-19 mortality rates in Italy over a period of 59 days recorded from February 27 to April 27, 2020. This dataset is also studied by Almongy et al5.:
4.571, 7.201, 3.606, 8.479, 11.410, 8.961, 10.919, 10.908, 6.503, 18.474, 11.010, 17.337, 16.561, 13.226, 15.137, 8.697, 15.787, 13.333, 11.822, 14.242, 11.273, 14.330, 16.046, 11.950, 10.282, 11.775, 10.138, 9.037, 12.396, 10.644, 8.646, 8.905, 8.906, 7.407, 7.445, 7.214, 6.194, 4.640, 5.452, 5.073, 4.416, 4.859, 4.408, 4.639, 3.148, 4.040, 4.253, 4.011, 3.564, 3.827, 3.134, 2.780, 2.881, 3.341, 2.686, 2.814, 2.508, 2.450, 1.518.
The summary statistics for the given real life datasets are presented in Table 7 to get idea about the behaviour of such dataset.
The maximum likelihood estimator (MLE) and the corresponding standard error of the parameters involved in the EOLxPL distribution and the other well-known distribution used for comparative study has enumerated in Table 8 as below:
To evaluate and compare the goodness-of-fit and modelling capabilities of the EOLxPL distribution against the aforementioned distributions, we employ three different methodologies: Information criteria measures, goodness-of-fit tests, and graphical diagnostics.
Information criteria measures
To assess model adequacy, we utilize several information-based criteria, including Log-likelihood (−2logL), Akaike Information Criterion (AIC), Corrected Akaike Information Criterion (CAIC), Hannan-Quinn Information Criterion (HQIC), and Bayesian Information Criterion (BIC). These metrics are widely used to identify the most appropriate model, with lower values indicating a better fit. The corresponding values for each distribution are provided in Table 9 and Table 10. Based on these results, the EOLxPL distribution consistently exhibits the lowest values across all criteria, indicating its superior performance in fitting the real-world datasets compared to the competing models.
Goodness of fit test
To further validate the fit of the proposed model, we apply goodness-of-fit tests: The Kolmogorov–Smirnov (K-S), Cramer-von Mises (CVM), and Anderson–Darling (A-D) statistics. These tests compare the empirical distribution of the data with the theoretical distributions under consideration. The computed test statistics and corresponding p-values for each distribution are also summarized in Table 9 and Table 10. The EOLxPL distribution yields the smallest test statistics and the highest p-values, indicating the best agreement with the empirical data and confirming its superiority over other distributions.
Graphical measures
Visual tools provide an intuitive understanding of how well a distribution models real-world data. To this end, several graphical analyses are presented, including: Histograms over laid with estimated probability density functions, plots comparing empirical and theoretical cumulative distribution functions, estimated (or empirical) and theoretical survival function plot, probability–probability (P–P) plot and quantile–quantile (Q–Q) plot. These visualizations are shown below in Fig. 8, Fig. 9, Fig. 10, Fig. 11, Fig. 12, Fig. 13, Fig. 14, Fig. 15, Fig. 16 and Fig. 17. The EOLxPL distribution is observed to align more closely with the empirical data across all graphical measures, outperforming the other considered distributions for both datasets.
Daily mortality rates due to COVID-19.
Conclusion
This study presents a significant advancement in statistical modelling through the introduction of the EOLxPL distribution model, developed by embedding the Power Lindley distribution within the Lomax family via the T-X transformation method. The proposed model demonstrates a high degree of flexibility and adaptability, addressing limitations of existing distributions in capturing real-world data complexities. A thorough theoretical examination was undertaken, encompassing the derivation of fundamental statistical characteristics, including the probability density function, cumulative distribution function, survival and hazard rate functions, moments, moment generating function. ordered statistics and entropy measures. These derivations provide a robust foundation for both theoretical exploration and practical applications. To ensure the accuracy and reliability of parameter estimation for the EOLxPL distribution, several estimation techniques were analysed. Among them, the minimum spacing absolute log-distance estimation method emerged as the most efficient, offering superior performance in terms of bias, MSE, and MRE, as confirmed by comprehensive Monte Carlo simulation studies. These findings offer valuable guidance for selecting optimal estimation methods in empirical applications. The practical utility of the EOLxPL distribution was further validated through its application to real-world datasets from the COVID-19 pandemic. Comparative goodness-of-fit analyses demonstrated its superior performance over several established models, including the EOLxLogL, EOLxFrt, EOLxLx, Frt, Lx, and PL distributions. These results affirm the model’s capacity to effectively capture diverse data structures, making it a versatile and valuable tool for a wide range of applied statistical tasks.
Data availability
The data that supports the findings of this study are available within the article.
References
Sindhu, T. N., Shafiq, A. & Al-Mdallal, Q. M. Exponentiated transformation of Gumbel Type-II distribution for modeling COVID-19 data. Alex. Eng. J. 60(1), 671–689. https://doi.org/10.1016/j.aej.2020.09.060 (2020).
Hassan, A. S., Almetwally, E. M., & Ibrahim, G. M. (2021). Kumaraswamy inverted Topp–Leone distribution with applications to COVID-19 data. Computers, Materials & Continua/Computers, Materials & Continua (Print). 68(1) 337–358. https://doi.org/10.32604/cmc.2021.013971
Azm, W. S. et al. A new transmuted generalized Lomax distribution: Properties and applications to COVID-19 data. Comput. Intell. Neurosci. https://doi.org/10.1155/2021/5918511 (2021).
Almetwally, E. M. The Odd Weibull inverse Topp-Leone distribution with applications to COVID-19 data. Ann. Data Sci. 9(1), 121–140. https://doi.org/10.1007/s40745-021-00329-w (2021).
Almongy, H. M., Almetwally, E. M., Aljohani, H. M., Alghamdi, A. S. & Hafez, E. A new extended rayleigh distribution with applications of COVID-19 data. Results Phys. 23, 104012. https://doi.org/10.1016/j.rinp.2021.104012 (2021).
Liu, X. et al. Modeling the survival times of the COVID-19 patients with a new statistical model: A case study from China. PLoS ONE 16(7), e0254999. https://doi.org/10.1371/journal.pone.0254999 (2021).
Hossam, E. et al. A novel extension of Gumbel distribution: Statistical inference with Covid-19 application. Alex. Eng. J. 61(11), 8823–8842. https://doi.org/10.1016/j.aej.2022.01.071 (2022).
Alsuhabi, H. et al. A superior extension for the Lomax distribution with application to Covid-19 infections real data. Alex. Eng. J. 61(12), 11077–11090. https://doi.org/10.1016/j.aej.2022.03.067 (2022).
Gemeay, A. M. et al. General two-parameter distribution: Statistical properties, estimation, and application on COVID-19. PLoS ONE 18(2), e0281474. https://doi.org/10.1371/journal.pone.0281474 (2023).
Ahmad, A., Alsadat, N., Rather, A. A., Meraou, M. & El-Din, M. M. M. A novel statistical approach to COVID-19 variability using the Weibull-Inverse Nadarajah Haghighi distribution. Alex. Eng. J. 107, 950–962. https://doi.org/10.1016/j.aej.2024.08.008 (2024).
Bhat, A. A. et al. A novel extension of half-logistic distribution with statistical inference, estimation and applications. Sci. Rep. https://doi.org/10.1038/s41598-024-53768-9 (2024).
Qayoom, D., Rather, A. A., Alsadat, N., Hussam, E. & Gemeay, A. M. A new class of Lindley distribution: System reliability, simulation and applications. Heliyon https://doi.org/10.1016/j.heliyon.2024.e38335 (2024).
Qayoom, D. et al. Development of a novel extension of Rayleigh distribution with application to COVID-19 data. Sci. Rep. https://doi.org/10.1038/s41598-025-03645-w (2025).
Rather, A. A. & Subramanian, C. Exponentiated Mukherjee-Islam distribution. J. Stat. Appl. Prob. 7(2), 357–361. https://doi.org/10.18576/jsap/070213 (2018).
Rather, A. A. & Özel, G. The Weighted Power Lindley distribution with applications on the life time data. Pak. J. Stat. Op. Res. https://doi.org/10.18187/pjsor.v16i2.2931 (2020).
Rather, A. A. & Ozel, G. A new length-biased Power Lindley distribution with properties and its applications. J. Stat. Manag. Syst. 25(1), 23–42. https://doi.org/10.1080/09720510.2021.1920665 (2021).
Qayoom, D. & Rather, A. A. Weighted Transmuted Mukherjee-Islam distribution with statistical properties. Reliab.: Theo. Appl. 19, 124–137. https://doi.org/10.24412/1932-2321-2024-278-124-137 (2024).
Qayoom, D. & Rather, A. A. A comprehensive study of length-biased transmuted distribution. Reliab.: Theo. Appl. 19(78), 291–304. https://doi.org/10.24412/1932-2321-2024-278-291-304 (2024).
Rather, A. A., Subramanian, C., Al-Omari, A. I. & Alanzi, A. R. A. Exponentiated Ailamujia distribution with statistical inference and applications of medical data. J. Stat. Manag. Syst. 25(4), 907–925. https://doi.org/10.1080/09720510.2021.1966206 (2022).
Alzaatreh, A., Lee, C. & Famoye, F. A new method for generating families of continuous distributions. METRON 71(1), 63–79. https://doi.org/10.1007/s40300-013-0007-y (2013).
Hassan, A. S. & Mohamed, R. E. Weibull inverse Lomax distribution. Pak. J. Stat. Op. Res. https://doi.org/10.18187/pjsor.v15i3.2378 (2019).
Al-Marzouki, S., Jamal, F., Chesneau, C. & Elgarhy, M. Type II Topp Leone Power Lomax distribution with applications. Mathematics 8(1), 4. https://doi.org/10.3390/math8010004 (2019).
Oguntunde, P. E., Khaleel, M. A., Ahmed, M. T. & Okagbue, H. I. The Gompertz Fréchet distribution: Properties and applications. Cogent Math. Stat. 6(1), 1568662. https://doi.org/10.1080/25742558.2019.1568662 (2019).
ZeinEldin, R. A., Haq, M. a. U., Hashmi, S., Elsehety, M., & Elgarhy, M. Statistical inference of Odd Fréchet inverse Lomax distribution with applications. Complexity. 2020, 1–20. https://doi.org/10.1155/2020/4658596 (2020).
Almetwally, E. M. et al. Marshall-Olkin Alpha Power Weibull distribution: Different methods of estimation based on Type-I and Type-II censoring. Complexity. https://doi.org/10.1155/2021/5533799 (2021).
Alzeley, O. et al. Statistical Inference under censored data for the new Exponential-X Fréchet distribution: Simulation and application to leukemia Data. Comput. Intell. Neurosci. https://doi.org/10.1155/2021/2167670 (2021).
Almongy, H. M., Almetwally, E. M., Ahmad, H. H. & Al-Nefaie, A. H. Modeling of COVID-19 vaccination rate using odd Lomax inverted Nadarajah-Haghighi distribution. PLoS ONE 17(10), e0276181. https://doi.org/10.1371/journal.pone.0276181 (2022).
Bhatti, F. A., Hamedani, G. G., Korkmaz, M. Ç., Sheng, W. & Ali, A. On the Burr XII-moment exponential distribution. PLoS ONE 16(2), e0246935. https://doi.org/10.1371/journal.pone.0246935 (2021).
Ogunde, A. A. & Adeniji, O. E. Type II Topp-Leone Bur XII distribution: Properties and applications to failure time data. Sci. Afr. 16, e01200. https://doi.org/10.1016/j.sciaf.2022.e01200 (2022).
Dhungana, G. P. & Kumar, V. Exponentiated Odd Lomax exponential distribution with application to COVID-19 death cases of Nepal. PLoS ONE 17(6), e0269450. https://doi.org/10.1371/journal.pone.0269450 (2022).
Nagarjuna, V. B. V., Vardhan, R. V. & Chesneau, C. Nadarajah-Haghighi Lomax Distribution and its applications. Math. Comput. Appl. 27(2), 30. https://doi.org/10.3390/mca27020030 (2022).
Alotaibi, R., Almetwally, E. M., Ghosh, I. & Rezk, H. Classical and Bayesian inference on finite mixture of Exponentiated Kumaraswamy Gompertz and Exponentiated Kumaraswamy Fréchet distributions under Progressive Type II Censoring with Applications. Mathematics 10(9), 1496. https://doi.org/10.3390/math10091496 (2022).
Ogunde, A. A., Chukwu, A. U. & Oseghale, I. O. The Kumaraswamy Generalized inverse Lomax distribution and applications to reliability and survival data. Sci. Afr. 19, e01483. https://doi.org/10.1016/j.sciaf.2022.e01483 (2022).
Bhat, A. A. et al. The odd Lindley Power Rayleigh distribution: Properties, classical and Bayesian estimation with applications. Sci. Afr. 20, e01736. https://doi.org/10.1016/j.sciaf.2023.e01736 (2023).
Alsadat, N. et al. The novel Kumaraswamy Power Frechet distribution with data analysis related to diverse scientific areas. Alex. Eng. J. 70, 651–664. https://doi.org/10.1016/j.aej.2023.03.003 (2023).
Kailembo, B. B., Gadde, S. R. & Kirigiti, P. J. Application of Odd Lomax Log-logistic distribution to cancer data. Heliyon 10(6), e27376. https://doi.org/10.1016/j.heliyon.2024.e27376 (2024).
Lomax, K. S. Business failures: Another example of the analysis of failure data. J. Am. Stat. Assoc. 49, 847–852 (1954).
Ghitany, M. E., Al-Mutairi, D. K., Balakrishnan, N. & Al-Enezi, L. J. Power Lindley distribution and associated inference. Comput. Stat. Data Anal. 64, 20–33 (2013).
Shannon, C. E. A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423. https://doi.org/10.1002/j.1538-7305.1948.tb01338.x (1948).
Alfréd, R. On measures of information and entropy. Proc. Fourth Berkeley Symp. Math., Stat. Prob. 1960, 547–561 (1961).
Constantino, T. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys. 52(1–2), 479–487 (1988).
R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria (2023).https://www.R-project.org/
Sindhu, T. N., Shafiq, A. & Al-Mdallal, Q. M. On the analysis of number of deaths due to Covid −19 outbreak data using a new class of distributions. Results Phys. 21, 103747. https://doi.org/10.1016/j.rinp.2020.103747 (2020).
Funding
The authors would like to acknowledge the Deanship of Graduate Studies and Scientific Research, Taif University for funding this work.
Author information
Authors and Affiliations
Contributions
All the authors have equally contributed.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Qayoom, D., Rather, A.A., Alotaibi, E.S. et al. A novel extension of the power lindley distribution with statistical properties and application to COVID-19 data. Sci Rep 15, 30486 (2025). https://doi.org/10.1038/s41598-025-15256-6
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-025-15256-6