Introduction

The Weibull model is among the most commonly used models for survival and reliability analysis in many domains; however, it is less suitable when the data show a non-monotonic failure rate, because its hazard rate function can only be increasing, decreasing or constant. Various classical distributions have frequently been used for modeling actuarial and econometric data, but they often fail to offer a sufficient fit. Model flexibility can be improved through generalization, and this practice has become quite common in recent times. New generalizations, new families of distributions and parameter induction approaches not only enrich the statistical literature but also enable researchers and practitioners to choose among flexible models to achieve better fits. Readers are referred to Lee et al.1, Maurya and Nadarajah2, Tahir and Cordeiro3 and Tahir and Nadarajah4 for a detailed discussion of various parameter induction approaches in baseline models. The compounding approach is frequently used, as it yields new models by combining two or more similar or dissimilar models, taking into account the nature of the random variable(s) and their supports.

Two different approaches can be used in the compounding method: one is based on the zero-truncated power series distribution, and the other uses zero-truncated continuous lifespan models. Using the zero-truncated Poisson, geometric, logarithmic, binomial, negative-binomial or power series distribution has two main benefits: (1) the value zero seldom occurs in real data sets, so it has to be treated as excluded, and (2) the compounding process is built on the number of complementary risks for component failures, and this number must therefore be larger than or equal to one. Here we discuss the Poisson-G class, because its structure closely resembles, and can be compared to, our proposed class of distributions. Since the Poisson distribution is a widely used discrete model for count data, its compound models have also been studied extensively in the continuous setting, owing to its versatility and simplicity in practical use2. The cumulative distribution function (cdf) of the complementary Poisson-G (CP-G) class for a series structure, obtained by considering a truncated random variable, is given by

$$\begin{aligned} F\left( x\right) =\frac{e^{\lambda G\left( x\right) }-1}{e^{\lambda }-1}, \end{aligned}$$
(1)

where \(\lambda\) is the parameter of the Poisson distribution and \(G(\cdot )\) is the cdf of any baseline or parent model. Note that Castellares et al.5 achieved somewhat better fits than the Poisson model using a discrete Bell distribution (DBellD) constructed from the well-known Bell numbers6, with probability mass function (pmf)

$$\begin{aligned} \text {P}(X=x)=\frac{\lambda ^{x}e^{-e^{\lambda }+1}B_{x}}{x!},\qquad \qquad x=0,1,2,\ldots , \end{aligned}$$
(2)

where \(B_{x}\) are the Bell numbers. The DBellD has several useful characteristics: it has a single parameter, it belongs to the one-parameter exponential family of distributions, and it is infinitely divisible. Although the Poisson model is not nested in the Bell model, the Bell model tends to the Poisson distribution for small values of the parameter. These properties inspired the development of a generalized class based on the DBellD, which can be compared mathematically and empirically with the CP-G class and its particular models. Fayomi et al.7 extended the DBellD to a generalized class, the exponentiated Bell-G (EBell-G) family, whose cdf is given in Ref. 7 as
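As a quick numerical illustration of the pmf in Eq. (2), the following minimal Python sketch builds the Bell numbers with the Bell triangle and checks that the probabilities sum to one. The function names `bell_numbers` and `bell_pmf`, the truncation at 60 terms and the value \(\lambda =0.5\) are illustrative assumptions, not part of the original study.

```python
import math

def bell_numbers(count):
    """Return [B_0, ..., B_{count-1}] computed with the Bell triangle."""
    bells = []
    row = [1]
    for _ in range(count):
        bells.append(row[0])
        # The next row starts with the last entry of the current row.
        nxt = [row[-1]]
        for value in row:
            nxt.append(nxt[-1] + value)
        row = nxt
    return bells

def bell_pmf(x, lam, bells):
    """P(X = x) = lam^x * exp(1 - e^lam) * B_x / x!, as in Eq. (2)."""
    ratio = bells[x] / math.factorial(x)  # exact big-int ratio, then float
    return lam ** x * math.exp(1.0 - math.exp(lam)) * ratio

# Illustrative values (assumptions): lam = 0.5, series truncated at 60 terms.
lam = 0.5
bells = bell_numbers(60)
probs = [bell_pmf(x, lam, bells) for x in range(60)]
total = sum(probs)
mean = sum(x * p for x, p in enumerate(probs))
print(total, mean)
```

The truncated sum should be numerically indistinguishable from one, because \(\sum _{x\ge 0}B_{x}\lambda ^{x}/x!=\exp (e^{\lambda }-1)\); the same generating function implies a mean of \(\lambda e^{\lambda }\).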

$$\begin{aligned} F\left( x\right) =\frac{1-e^{-e^{\lambda }\left( 1-e^{-\lambda G^{\theta }(x)}\right) }}{1-e^{1-e^{\lambda }}}, \end{aligned}$$
(3)

where \(\lambda\) and \(\theta\) represent the Bell parameter and the shape parameter, respectively. For the case of complementary and competing risks, Algarni8 proposed its complementary version, the complementary Bell-G (CBell-G) family. In many cases, information regarding the specific factor that caused the failure is unavailable, and the only information provided is the lifetime of the maximum or minimum among all the risk factors. Such phenomena regularly occur in various fields such as reliability, biology, data science, actuarial sciences, and health care.

In this paper, we revisit the extended Weibull model in connection with the CBell-G family of distributions8. In particular, we develop the complementary Bell Weibull (CBellW) model, derive its properties and discuss various practical applications. The proposed CBellW model has the following characteristics:

  • It is more flexible than the well-known complementary Poisson Weibull model.

  • It is tractable, has three parameters, and has a comparatively simple probability density function (pdf) and cdf.

  • It has a very good fit for heavy-tailed and skewed data.

  • It works well when the Weibull, exponential or Burr distribution is used as baseline model, and the failure rate function can have various shapes, including unimodal, upside-down bathtub, increasing or decreasing shapes.

Note that the Weibull distribution is also an important lifetime distribution and there are several recent modifications9,10,11.

The paper is organized as follows. “The CBellW distribution and its properties” section presents the CBellW model and its key distributional properties. “Simulation study” section discusses the results of the simulation study, while “Real-life applications” section focuses on various applications of the CBellW model using six real data sets. Finally, the paper concludes in “Concluding remarks” section.

The CBellW distribution and its properties

General distributional properties

Practitioners can employ the CBellW distribution to analyze various types of data because of the failure rate function’s versatility. Consider the baseline cdf and pdf for the Weibull distribution, \(G\left( x\right) =1-\exp \left( -\left( \frac{x}{\alpha }\right) ^{\beta }\right)\) and \(g\left( x\right) =\frac{\beta }{\alpha ^{\beta }}\left( x\right) ^{\beta -1}\exp \left( -\left( \frac{x}{\alpha }\right) ^{\beta }\right) ,\) for \(x>0\), \(\alpha >0\) and \(\beta >0\), respectively. Then, the cdf of the CBellW distribution is as follows:

$$\begin{aligned} F(x;\lambda ,\,\alpha ,\,\beta )=\frac{\exp \left( e^{\lambda \left[ 1-\exp \left( -\left( x/\alpha \right) ^{\beta }\right) \right] }-1\right) -1}{\exp \left( e^{\lambda }-1\right) -1}, \end{aligned}$$
(4)

where \(x>0\), \(\lambda >0\), \(\alpha >0\) and \(\beta >0\). The pdf corresponding to Eq. (4) is as follows:

$$\begin{aligned} f(x;\lambda ,\,\alpha ,\,\beta )= & {} \lambda \,\frac{\beta }{\alpha ^{\beta }}\left( x\right) ^{\beta -1}\exp \left( -\left( x/\alpha \right) ^{\beta }\right) \,\exp \left[ \lambda \,\biggl (1-\exp \left[ -\left( x/\alpha \right) ^{\beta }\right] \biggr )\right] \\{} & {} \exp \left( e^{\lambda \Bigl (1-\exp \left[ -\left( x/\alpha \right) ^{\beta }\right] \Bigr )}-1\right) \,\left[ \exp \left( e^{\lambda }-1\right) -1\right] ^{-1}\nonumber . \end{aligned}$$
(5)

The survival function related to the CBellW distribution is as follows:

$$\begin{aligned} S(x;\lambda ,\,\alpha ,\,\beta )=\frac{\exp \left( e^{\lambda }-1\right) -\exp \left( e^{\lambda \,\Bigl (1-\exp \left[ -\left( x/\alpha \right) ^{\beta }\right] \Bigr )}-1\right) }{\exp \left( e^{\lambda }-1\right) -1}. \end{aligned}$$
(6)

The hazard rate function (hrf) is the ratio \(h(x)=\frac{f(x)}{1-F(x)}=\frac{f(x)}{S(x)}\) and can be obtained using Eqs. (5) and (6). The quantile function (qf) of the CBellW distribution is as follows:

$$\begin{aligned} Q\left( u\right) =\alpha \Biggl (-\log \left[ 1-\Biggl \{\lambda ^{-1}\,\log \left[ 1+\log \left\{ 1+u[\exp (e^{\lambda }-1)-1]\right\} \right] \Biggr \}\right] \Biggr )^{1/\beta }, \end{aligned}$$
(7)

where \(u\in (0,1)\). As the qf has a closed-form solution, it can be used to obtain L-moments, and it is suitable for designing a group acceptance sampling plan (GASP) as well as for computing various actuarial risk measures. Figure 1 shows that the pdf of the CBellW distribution can be symmetric, reversed-J shaped or right-skewed. In general, the hrf of many engineering systems first decreases, then remains roughly constant, and finally increases. In reliability theory, these three phases are referred to as the “burn-in”, “random” and “wear-out” failure zones. The hrf plots show flexible shapes, such as increasing, decreasing and increasing–decreasing shapes, which characterize the lifetime distribution. The model can represent the second phase of the bathtub-shaped failure rate because it has a long constant failure rate period, as shown in Fig. 1. Figure 2 shows the mean, variance, skewness and kurtosis of the CBellW model with the scale parameter fixed at \(\alpha =1\): the mean and variance tend to increase as \(\lambda\) increases, whereas the skewness and kurtosis decrease.
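The functions in Eqs. (4)–(7) translate directly into code. The following Python sketch is one possible implementation, using `math.expm1` and `math.log1p` for numerical stability; the function names and the parameter values in the usage lines are illustrative assumptions.

```python
import math

def cbellw_cdf(x, lam, alpha, beta):
    """cdf of Eq. (4); g is the Weibull baseline cdf G(x)."""
    g = -math.expm1(-((x / alpha) ** beta))
    return math.expm1(math.expm1(lam * g)) / math.expm1(math.expm1(lam))

def cbellw_pdf(x, lam, alpha, beta):
    """pdf of Eq. (5)."""
    u = (x / alpha) ** beta
    g = -math.expm1(-u)
    weibull_pdf = beta / alpha ** beta * x ** (beta - 1) * math.exp(-u)
    num = lam * weibull_pdf * math.exp(lam * g) * math.exp(math.expm1(lam * g))
    return num / math.expm1(math.expm1(lam))

def cbellw_hrf(x, lam, alpha, beta):
    """Hazard rate f(x) / [1 - F(x)]."""
    return cbellw_pdf(x, lam, alpha, beta) / (1.0 - cbellw_cdf(x, lam, alpha, beta))

def cbellw_quantile(u, lam, alpha, beta):
    """Quantile function of Eq. (7) for u in (0, 1)."""
    d = math.expm1(math.expm1(lam))               # exp(e^lam - 1) - 1
    w = math.log1p(math.log1p(u * d)) / lam       # value of the baseline cdf G
    return alpha * (-math.log1p(-w)) ** (1.0 / beta)

# Illustrative parameter values (assumptions, not fitted estimates).
lam, alpha, beta = 0.7, 2.0, 1.5
median = cbellw_quantile(0.5, lam, alpha, beta)
print(median, cbellw_cdf(median, lam, alpha, beta), cbellw_hrf(median, lam, alpha, beta))
```

Because the qf is available in closed form, random variates can be generated by inverse transform sampling, i.e. by evaluating `cbellw_quantile` at uniform random numbers.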

Figure 1

Plots of pdf and hrf of the CBellW for different parameter values.

Figure 2

Graphical illustration of mean, variance, skewness and kurtosis of the CBellW model for different parameter values.

Proposition 1

The pdf of the CBellW distribution can be expressed in the form

$$\begin{aligned} f(x)=\sum _{n=0}^{\infty }t_{n}\pi \left[ x;\left( n+1\right) \alpha ,\beta \right] , \end{aligned}$$
(8)

where \(t_{n}=\sum _{v=0}^{\infty }\zeta _{v}\left( v+1\right) \left( -1\right) ^{n}\left( {\begin{array}{c}v\\ n\end{array}}\right)\), \(\zeta _{v}\) is defined in Eq. (43) (see Appendix), and the last term \(\pi \left[ x;\left( n+1\right) \alpha ,\beta \right] =\frac{\beta }{\alpha ^{\beta }}\left( x\right) ^{\beta -1}\exp \left( -\left[ n+1\right] \left( x/\alpha \right) ^{\beta }\right)\) is the Weibull pdf.

Proof

Using Eq. (41) yields

$$\begin{aligned} f(x)=\sum _{v=0}^{\infty }\zeta _{v}\left( v+1\right) \,\frac{\beta }{\alpha ^{\beta }}\left( x\right) ^{\beta -1}\exp \left( -\left( x/\alpha \right) ^{\beta }\right) \left( 1-\exp \left[ -\left( x/\alpha \right) ^{\beta }\right] \right) ^{v}, \end{aligned}$$
(9)

and by applying binomial expansion to the last term we get

$$\begin{aligned} {\left( 1-\exp \left[ -\left( x/\alpha \right) ^{\beta }\right] \right) ^{v}=\sum _{n=0}^{v}\left( -1\right) ^{n}\left( {\begin{array}{c}v\\ n\end{array}}\right) \exp \left[ -n\left( x/\alpha \right) ^{\beta }\right] ,} \end{aligned}$$

and the above expression reduces to

$$\begin{aligned} {f(x)=\sum _{v=0}^{\infty }\sum _{n=0}^{v}\zeta _{v}\left( v+1\right) \left( -1\right) ^{n}\left( {\begin{array}{c}v\\ n\end{array}}\right) \,\frac{\beta }{\alpha ^{\beta }}\left( x\right) ^{\beta -1}\exp \left[ -\left[ n+1\right] \left( x/\alpha \right) ^{\beta }\right] .} \end{aligned}$$

Interchanging the order of summation, and noting that the binomial coefficient vanishes for \(n>v\), gives the desired result and completes the proof of Proposition 1. \(\square\)

The general result of Proposition 1 shows that the CBellW pdf is a linear combination of Weibull densities. Therefore, several mathematical properties of CBellW can be derived from those of the Weibull distribution. Some of them will be presented below.

Ordinary and incomplete moments

The mean and variance of the CBellW distribution can be obtained by using Eq. (10), where \(\text {mean}=\mu _{1}^{\prime }\) and \(\text {variance}=\mu _2=\mu _{2}^{\prime }-\left( \mu _{1}^{\prime }\right) ^{2}.\) Moreover, the first four central moments can be obtained using the well-established relationship between ordinary and central moments. The moment-based measures of skewness and kurtosis are \(\beta _{1}=\frac{\mu _{3}^{2}}{\mu _{2}^{3}}\) and \(\beta _{2}=\frac{\mu _{4}}{\mu _{2}^{2}}\), respectively, where \(\mu _{3}=\mu _{3}^{\prime }-3\mu _{2}^{\prime }\mu _{1}^{\prime }+2(\mu _{1}^{\prime })^{3}\) and \(\mu _{4}=\mu _{4}^{\prime }-4\mu _{3}^{\prime }\mu _{1}^{\prime }+6\mu _{2}^{\prime }(\mu _{1}^{\prime })^{2}-3(\mu _{1}^{\prime })^{4}\). Pearson’s coefficients of skewness and kurtosis are given by \(\sqrt{\beta _{1}}\) and \(\beta _{2}-3\), respectively. The rth raw or ordinary moment of the CBellW distribution is given by

$$\begin{aligned} \mu _{r}^{\prime }=\mathbb {E}(X^{r})=\alpha ^{r}\Gamma \left( \frac{r}{\beta }+1\right) \sum _{n=0}^{\infty }t_{n}\frac{1}{\left( n+1\right) ^{1+\frac{r}{\beta }}}, \end{aligned}$$
(10)

where \(t_n\) is as in Proposition 1. Incomplete moments also have many important applications. For instance, they are essential when calculating the average waiting time, deviations, conditional moments, measures of income disparity, etc. The sth incomplete moment is defined by \(\mu _{s}\left( x\right) =\int _{0}^{x}u^{s}f(u)du\). Using Eq. (41), we get

$$\begin{aligned} \mu _{s}\left( x\right) =\mathbb {E}(X^{s}1_{\{X\le x\}})=\alpha ^{s}\sum _{n=0}^{\infty }\frac{t_{n}}{(n+1)^{s/\beta +1}}\gamma \left( \frac{s}{\beta }+1,\left[ n+1\right] \left( x/\alpha \right) ^{\beta }\right) , \end{aligned}$$
(11)

where \(t_n\) is given in Proposition 1, and \(\gamma \left( a,b\right) =\int _{0}^{b}w^{a-1}e^{-w}dw\) is the lower incomplete gamma function.
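The series in Eq. (10) requires the coefficients \(t_n\), which depend on quantities defined in the Appendix. As a hedged numerical alternative, the raw moments can be approximated from the qf in Eq. (7) through \(\mathbb {E}(X^{r})=\int _{0}^{1}Q(u)^{r}du\). The sketch below uses a simple midpoint rule; the function names, the grid size and the parameter values are assumptions made for illustration only, and this is not the computation used to produce Fig. 2.

```python
import math

def cbellw_quantile(u, lam, alpha, beta):
    """Quantile function of Eq. (7) for u in (0, 1)."""
    d = math.expm1(math.expm1(lam))
    w = math.log1p(math.log1p(u * d)) / lam
    return alpha * (-math.log1p(-w)) ** (1.0 / beta)

def raw_moment(r, lam, alpha, beta, grid=20000):
    """Midpoint-rule approximation of E(X^r) = int_0^1 Q(u)^r du."""
    h = 1.0 / grid
    return h * sum(
        cbellw_quantile((k + 0.5) * h, lam, alpha, beta) ** r for k in range(grid)
    )

def moment_summary(lam, alpha, beta):
    """Mean, variance, beta_1 and beta_2 from the first four raw moments."""
    m1, m2, m3, m4 = (raw_moment(r, lam, alpha, beta) for r in (1, 2, 3, 4))
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return {
        "mean": m1,
        "variance": mu2,
        "beta1": mu3 ** 2 / mu2 ** 3,
        "beta2": mu4 / mu2 ** 2,
    }

# Illustrative parameter values (assumptions).
print(moment_summary(lam=0.7, alpha=1.0, beta=1.5))
```

The midpoint rule loses accuracy for higher moments of heavy-tailed cases, so a finer grid or an adaptive quadrature routine may be preferable in practice.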

Moment generating function

In probability theory and statistics, several functions and measures are used to characterize a distribution of interest, namely the moment generating function (mgf), the characteristic function, the rth moments, the qf, etc. Let X be a random variable with pdf f(x) given in Eq. (8). The mgf is defined by \(M\left( t\right) =E\left( e^{tX}\right) =\int e^{tx}f\left( x\right) dx\). Here, we use the Wright generalized hypergeometric function,

$$\begin{aligned} _{p}\Psi _{q}\left[ \begin{array}{cc} \left( \alpha _{1},A_{1}\right) ,\ldots , &{} \left( \alpha _{p},A_{p}\right) \\ \left( \beta _{1},B_{1}\right) ,\ldots , &{} \left( \beta _{q},B_{q}\right) \end{array};x\right] =\sum _{n=0}^{\infty }\frac{\Pi _{j=1}^{p}\Gamma \left( \alpha _{j}+A_{j}n\right) }{\Pi _{j=1}^{q}\Gamma \left( \beta _{j}+B_{j}n\right) }\frac{x^{n}}{n!}, \end{aligned}$$
(12)

to derive the mgf. Considering

$$\begin{aligned} f(x;\lambda ,\alpha ,\beta )=\sum _{n=0}^{\infty }t_{n}\frac{\beta }{\alpha ^{\beta }}\left( x\right) ^{\beta -1}\exp \left[ -\left( \Omega \,x\right) ^{\beta }\right] , \end{aligned}$$
(13)

for computational ease we set \(\Omega =\frac{\left( n+1\right) ^{1/\beta }}{\alpha }\), which depends on n, and by expanding the series \(e^{tx}=\sum _{m=0}^{\infty }\frac{t^{m}}{m!}x^{m}\) and integrating term by term, using \(\alpha ^{\beta }\Omega ^{\beta }=n+1\), we obtain

$$\begin{aligned} M\left( t\right) =\sum _{n=0}^{\infty }\frac{t_{n}}{n+1}\sum _{m=0}^{\infty }\frac{(\frac{t}{\Omega })^{m}}{m!}\Gamma \left( m/\beta +1\right) . \end{aligned}$$

Hence, we have the following expression for the mgf:

$$\begin{aligned} M\left( t\right) =\sum _{n=0}^{\infty }\frac{t_{n}}{n+1}\,\,_{1}\Psi _{0}\left[ \begin{array}{cc} \left( 1,1/\beta \right) \\ - \end{array};\frac{t}{\Omega }\right] . \end{aligned}$$
(14)

Reliability

Reliability measures arise in applications in many fields. In the stress–strength setting, a component with random strength \(X_1\) is subjected to a random stress \(X_2\): the component fails if the applied stress exceeds its strength, and it operates satisfactorily if \(X_1>X_2\). The reliability is therefore \(R=P(X_{2}<X_{1})\). Here, we derive the reliability of the CBellW model when \(X_{1}\) and \(X_{2}\) are independent, with pdf \(f(x;\lambda _{1},\alpha ,\,\beta )\) and cdf \(F(x;\lambda _{2},\alpha ,\,\beta )\), respectively, and share identical scale \((\alpha )\) and shape \((\beta )\) parameters. It is then given by

$$\begin{aligned} R=\intop _{0}^{\infty }f_{1}\left( x\right) \,F_{2}\left( x\right) \,dx. \end{aligned}$$

By using Eqs. (41) and (42), we get

$$\begin{aligned} f(x;\lambda _{1},\alpha ,\,\beta )=\sum _{v=0}^{\infty }\zeta _{v}\left( \lambda _{1}\right) \,\left( v+1\right) \,\left\{ \frac{\beta }{\alpha ^{\beta }}\left( x\right) ^{\beta -1}\exp \left[ -\left( x/\alpha \right) ^{\beta }\right] \right\} \,\left\{ 1-\exp \left[ -\left( x/\alpha \right) ^{\beta }\right] \right\} ^{v} \end{aligned}$$

and

$$\begin{aligned} F(x;\lambda _{2},\alpha ,\,\beta )=\sum _{t=0}^{\infty }\zeta _{t}\left( \lambda _{2}\right) \left\{ 1-\exp \left[ -\left( x/\alpha \right) ^{\beta }\right] \right\} ^{t+1}, \end{aligned}$$

where \(\zeta _v\) is defined in Eq. (43), so it holds

$$\begin{aligned} R=\sum _{v=0}^{\infty }\zeta _{v}\left( \lambda _{1}\right) \,\left( v+1\right) \,\sum _{t=0}^{\infty }\zeta _{t}\left( \lambda _{2}\right) \,I\left( \alpha ,\,\beta ,v,\,t\right) \end{aligned}$$

with

$$\begin{aligned} I\left( \alpha ,\,\beta ,v,\,t\right) =\intop _{0}^{\infty }\left\{ \frac{\beta }{\alpha ^{\beta }}\left( x\right) ^{\beta -1}\exp \left[ -\left( x/\alpha \right) ^{\beta }\right] \right\} \,\left\{ 1-\exp \left[ -\left( x/\alpha \right) ^{\beta }\right] \right\} ^{v+t+1}dx. \end{aligned}$$

Applying the binomial expansion and by simplifying, we get

$$\begin{aligned} {I\left( \alpha ,\,\beta ,v,\,t\right) =\sum _{z=0}^{v+t+1}\left( -1\right) ^{z}\left( {\begin{array}{c}v+t+1\\ z\end{array}}\right) \intop _{0}^{\infty }\frac{\beta }{\alpha ^{\beta }}\left( x\right) ^{\beta -1}\exp \left[ -\left[ z+1\right] \left( x/\alpha \right) ^{\beta }\right] dx,} \end{aligned}$$
$$\begin{aligned} {I\left( \alpha ,\,\beta ,v,\,t\right) =\sum _{z=0}^{v+t+1}s_{z}\left[ z+1\right] ^{-1},} \end{aligned}$$

where \(s_{z}=\left( -1\right) ^{z}\left( {\begin{array}{c}v+t+1\\ z\end{array}}\right) .\)
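The integral \(I\left( \alpha ,\,\beta ,v,\,t\right)\) does not actually depend on \(\alpha\) or \(\beta\): the substitution \(u=1-\exp \left[ -\left( x/\alpha \right) ^{\beta }\right]\) turns it into \(\int _{0}^{1}u^{v+t+1}du=1/(v+t+2)\). The short Python sketch below checks, with exact rational arithmetic, that the alternating binomial sum above agrees with this closed form; the function name `integral_I` is an illustrative assumption.

```python
from fractions import Fraction
from math import comb

def integral_I(v, t):
    """Alternating binomial sum for I(alpha, beta, v, t), evaluated exactly."""
    m = v + t + 1
    return sum(Fraction((-1) ** z * comb(m, z), z + 1) for z in range(m + 1))

# Compare the sum with the closed form 1 / (v + t + 2) for small indices.
for v in range(4):
    for t in range(4):
        print(v, t, integral_I(v, t), Fraction(1, v + t + 2))
```

Using the closed form avoids the cancellation that the alternating sum would suffer in floating-point arithmetic for large \(v+t\).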

Residual and reversed residual life

The nth moment of the residual life of X is given by

$$\begin{aligned} m_{n}\left( t\right) =\frac{1}{1-F\left( t\right) }\intop _{t}^{\infty }\left( x-t\right) ^{n}dF\left( x\right) . \end{aligned}$$

By using Eq. (8) and the binomial expansion \(\left( x-t\right) ^{n}=\sum _{r=0}^{n}\left( {\begin{array}{c}n\\ r\end{array}}\right) \left( -t\right) ^{n-r}x^{r}\), one gets

$$\begin{aligned} m_{n}\left( t\right) =\frac{1}{1-F\left( t\right) }\sum _{p=0}^{\infty }\sum _{r=0}^{n}t_{p}^{*}\frac{\beta }{\alpha ^{\beta }}\intop _{t}^{\infty }x^{r}\left( x\right) ^{\beta -1}\exp \left[ -\left[ p+1\right] \left( x/\alpha \right) ^{\beta }\right] dx, \end{aligned}$$

where \(t_{p}\) is the coefficient \(t_{n}\) of Proposition 1 with index p, and \(t_{p}^{*}=t_{p}\left( {\begin{array}{c}n\\ r\end{array}}\right) \left( -t\right) ^{n-r}\). Evaluating the integral yields

$$\begin{aligned} m_{n}\left( t\right) =\frac{1}{1-F\left( t\right) }\sum _{p=0}^{\infty }\sum _{r=0}^{n}t_{p}^{*}\frac{\alpha ^{r}}{\left[ p+1\right] ^{\frac{r}{\beta }+1}}\Gamma \left( \frac{r}{\beta }+1,\left[ p+1\right] \left( t/\alpha \right) ^{\beta }\right) , \end{aligned}$$
(15)

where \(\Gamma (a,b)=\int _{b}^{\infty }w^{a-1}e^{-w}dw\) is the upper incomplete gamma function. The mean residual life of X is obtained by setting \(n=1\) in Eq. (15). The following expression gives the nth moment of the reversed residual life:

$$\begin{aligned} M_{n}\left( t\right) =\frac{1}{F\left( t\right) }\intop _{0}^{t}\left( t-x\right) ^{n}dF\left( x\right) , \end{aligned}$$
$$\begin{aligned} {M_{n}\left( t\right) =\frac{\beta }{\alpha ^{\beta }F\left( t\right) }\sum _{p=0}^{\infty }\sum _{r=0}^{n}t_{p}^{**}\intop _{0}^{t}x^{r}\left( x\right) ^{\beta -1}\exp \left[ -\left[ p+1\right] \left( x/\alpha \right) ^{\beta }\right] dx,} \end{aligned}$$
(16)

where \(t_{p}^{**}=t_{p}\left( {\begin{array}{c}n\\ r\end{array}}\right) \left( -1\right) ^{r}t^{n-r}\) and \(t_{p}\) is the coefficient \(t_{n}\) of Proposition 1 with index p. Evaluating the integral gives

$$\begin{aligned} M_{n}\left( t\right) =\frac{1}{F\left( t\right) }\sum _{p=0}^{\infty }\sum _{r=0}^{n}t_{p}^{**}\frac{\alpha ^{r}}{\left[ p+1\right] ^{\frac{r}{\beta }+1}}\gamma \left( \frac{r}{\beta }+1,\left[ p+1\right] \left( t/\alpha \right) ^{\beta }\right) , \end{aligned}$$
(17)

where \(\gamma (a,b)=\int _{0}^{b}w^{a-1}e^{-w}dw\) is the lower incomplete gamma function. The mean reversed residual life, or mean inactivity time, of X is obtained by setting \(n=1\) in Eq. (17).

Entropy measures

Entropy measures quantify the uncertainty of a random variable. Here, we present important entropy measures including the Rényi entropy, the Havrda and Charvat (HC) entropy, the Arimoto entropy and the Tsallis entropy based on the CBellW model. Moreover, we evaluate their numerical values, which show flexibility under the CBellW model. For more details, readers are referred to Ref. 12. In the following, let X \(\sim\) CBellW (\(\alpha ,\,\beta ,\,\lambda\)).

The Rényi entropy is given by

$$\begin{aligned} R_{\delta }=\left( 1-\delta \right) ^{-1}\log \left[ \sum _{k=0}^{\infty }Q_{k}\,\frac{\beta ^{\delta -1}\,\alpha ^{1-\delta }}{\left[ k+\delta \right] ^{\delta -\frac{\delta }{\beta }+\frac{1}{\beta }}}\Gamma \left( \delta -\frac{\delta }{\beta }+\frac{1}{\beta }\right) \right] , \end{aligned}$$
(18)

where \(Q_{k}=\sum _{b=0}^{\infty }Q_{b}\left( -1\right) ^{k}\left( {\begin{array}{c}b\\ k\end{array}}\right)\) and

$$\begin{aligned} Q_{b}=\frac{\left( 1+\delta \right) ^{b}\lambda ^{\left( \delta +b\right) }}{\left[ \exp \left( e^{\lambda }-1\right) -1\right] ^{\delta }b!}\sum _{t,s}^{\infty }\left( -1\right) ^{\left( t+s\right) }\frac{\delta ^{t}}{t!}. \end{aligned}$$
(19)

The HC entropy is given by

$$\begin{aligned} HC_{\delta }=\frac{1}{2^{1-\delta }-1}\left[ \sum _{k=0}^{\infty }Q_{k}\,\frac{\beta ^{\delta -1}\,\alpha ^{1-\delta }}{\left[ k+\delta \right] ^{\delta -\frac{\delta }{\beta }+\frac{1}{\beta }}}\Gamma \left( \delta -\frac{\delta }{\beta }+\frac{1}{\beta }\right) -1\right] . \end{aligned}$$
(20)

The Arimoto entropy is given as follows:

$$\begin{aligned} A_{\delta }=\frac{\delta }{1-\delta }\left\{ \left[ \sum _{k=0}^{\infty }Q_{k}\,\frac{\beta ^{\delta -1}\,\alpha ^{1-\delta }}{\left[ k+\delta \right] ^{\delta -\frac{\delta }{\beta }+\frac{1}{\beta }}}\Gamma \left( \delta -\frac{\delta }{\beta }+\frac{1}{\beta }\right) \right] ^{\frac{1}{\delta }}-1\right\} \end{aligned}$$
(21)

The Tsallis entropy is given by

$$\begin{aligned} T_{\delta }=\frac{1}{\delta -1}\left[ 1-\sum _{k=0}^{\infty }Q_{k}\,\frac{\beta ^{\delta -1}\,\alpha ^{1-\delta }}{\left[ k+\delta \right] ^{\delta -\frac{\delta }{\beta }+\frac{1}{\beta }}}\Gamma \left( \delta -\frac{\delta }{\beta }+\frac{1}{\beta }\right) \right] . \end{aligned}$$
(22)

See Table 1 for exemplary numerical computations of the above entropy measures.
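All four measures depend on the pdf only through the integral \(J_{\delta }=\int _{0}^{\infty }f(x)^{\delta }dx\), which the bracketed series in Eqs. (18) and (20)–(22) is understood to represent (an assumption of this sketch). As a hedged numerical cross-check that does not require the coefficients \(Q_{k}\), the Python sketch below approximates \(J_{\delta }=\int _{0}^{1}f(Q(u))^{\delta -1}du\) with a midpoint rule and inserts it into the standard entropy definitions. The function names, grid size and parameter values are illustrative assumptions and are not the settings behind Table 1.

```python
import math

def cbellw_pdf(x, lam, alpha, beta):
    """pdf of Eq. (5)."""
    u = (x / alpha) ** beta
    g = -math.expm1(-u)
    weibull_pdf = beta / alpha ** beta * x ** (beta - 1) * math.exp(-u)
    num = lam * weibull_pdf * math.exp(lam * g) * math.exp(math.expm1(lam * g))
    return num / math.expm1(math.expm1(lam))

def cbellw_quantile(u, lam, alpha, beta):
    """Quantile function of Eq. (7)."""
    d = math.expm1(math.expm1(lam))
    w = math.log1p(math.log1p(u * d)) / lam
    return alpha * (-math.log1p(-w)) ** (1.0 / beta)

def power_integral(delta, lam, alpha, beta, grid=20000):
    """Midpoint approximation of J = int f(x)^delta dx via x = Q(u)."""
    h = 1.0 / grid
    total = 0.0
    for k in range(grid):
        x = cbellw_quantile((k + 0.5) * h, lam, alpha, beta)
        total += cbellw_pdf(x, lam, alpha, beta) ** (delta - 1.0)
    return h * total

def entropies(delta, lam, alpha, beta):
    """Renyi, HC, Arimoto and Tsallis entropies for delta > 0, delta != 1."""
    j = power_integral(delta, lam, alpha, beta)
    return {
        "renyi": math.log(j) / (1.0 - delta),
        "havrda_charvat": (j - 1.0) / (2.0 ** (1.0 - delta) - 1.0),
        "arimoto": delta / (1.0 - delta) * (j ** (1.0 / delta) - 1.0),
        "tsallis": (1.0 - j) / (delta - 1.0),
    }

# Illustrative parameter values (assumptions).
print(entropies(delta=2.0, lam=0.7, alpha=1.0, beta=1.5))
```

For \(\lambda \rightarrow 0\) with \(\alpha =\beta =1\) the model approaches the unit exponential distribution, for which \(J_{2}=1/2\); this gives a simple sanity check of the implementation.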

Table 1 Numerical computation of entropy measures.

Parameter estimation

The log-likelihood function L related to the parameter vector \(\theta =(\lambda ,\alpha ,\beta )^\top\) in Eq. (5) is given by

$$\begin{aligned} L= & {} n\log (\lambda )+n\log (\beta )-n\beta \log \left( \alpha \right) +(\beta -1)\sum _{i=1}^{n}\log x_{i}-\sum _{i=1}^{n}\left( x_{i}/\alpha \right) ^{\beta }\\{} & {} \quad +\lambda \sum _{i=1}^{n}\,\biggl (1-\exp \left[ -\left( x_{i}/\alpha \right) ^{\beta }\right] \biggr )+\sum _{i=1}^{n}\left\{ e^{\lambda \Bigl (1-\exp \left[ -\left( x_{i}/\alpha \right) ^{\beta }\right] \Bigr )}-1\right\} \\{} & {} \quad -n\log \left\{ \exp \left[ e^{\lambda }-1\right] -1\right\} . \end{aligned}$$

The components of the score vector \(U(\theta )\) are as follows:

$$\begin{aligned} U_{\lambda }= & {} \frac{n}{\lambda }+\sum _{i=1}^{n}\,\biggl (1-\exp \left[ -\left( x_{i}/\alpha \right) ^{\beta }\right] \biggr )+\sum _{i=1}^{n}e^{\lambda \Bigl (1-\exp \left[ -\left( x_{i}/\alpha \right) ^{\beta }\right] \Bigr )}\Bigl (1-\exp \left[ -\left( x_{i}/\alpha \right) ^{\beta }\right] \Bigr )\\{} & {} -\frac{n\,e^{\lambda }\,\exp \left[ e^{\lambda }-1\right] }{\left\{ \exp \left[ e^{\lambda }-1\right] -1\right\} }, \\ U_{\alpha }= & {} -\frac{n\beta }{\alpha }+\frac{\beta }{\alpha ^{2}}\sum _{i=1}^{n}x_{i}\left( x_{i}/\alpha \right) ^{\beta -1}-\frac{\lambda \beta }{\alpha ^{2}}\sum _{i=1}^{n}x_{i}\exp \left[ -\left( x_{i}/\alpha \right) ^{\beta }\right] \,\left( x_{i}/\alpha \right) ^{\beta -1}\\{} & {} \quad -\sum _{i=1}^{n}\,\frac{\beta \lambda x_{i}\left( \frac{x_{i}}{\alpha }\right) {}^{\beta -1}e^{\lambda \left( 1-e^{-\left( \frac{x_{i}}{\alpha }\right) {}^{\beta }}\right) -\left( \frac{x_{i}}{\alpha }\right) {}^{\beta }}}{\alpha ^{2}}, \\ U_{\beta }= & {} \frac{n}{\beta }-n\log \left( \alpha \right) +\sum _{i=1}^{n}\log x_{i}-\sum _{i=1}^{n}\left( x_{i}/\alpha \right) ^{\beta }\log \left[ \frac{x_{i}}{\alpha }\right] \\{} & {} \quad +\lambda \sum _{i=1}^{n}e^{-\left( \frac{x_{i}}{\alpha }\right) {}^{\beta }}\left( \frac{x_{i}}{\alpha }\right) {}^{\beta }\log \left( \frac{x_{i}}{\alpha }\right) +\sum _{i=1}^{n}\lambda \left( \frac{x_{i}}{\alpha }\right) {}^{\beta }\log \left( \frac{x_{i}}{\alpha }\right) e^{\lambda \left( 1-e^{-\left( \frac{x_{i}}{\alpha }\right) {}^{\beta }}\right) -\left( \frac{x_{i}}{\alpha }\right) {}^{\beta }}. \end{aligned}$$

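Setting the score components equal to zero and solving the resulting system of non-linear equations yields the maximum likelihood estimates of the respective parameters. Since no closed-form solution is available, the equations have to be solved numerically, for example with iterative optimization routines.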
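Before handing the log-likelihood to a numerical optimizer, it can be useful to check the analytic score against a finite-difference derivative. The Python sketch below implements the log-likelihood and the component \(U_{\lambda }\) as written above and compares the latter with a central difference. The function names, the small data vector and the parameter values are hypothetical and serve only as an illustration.

```python
import math

def loglik(lam, alpha, beta, data):
    """Log-likelihood of the CBellW model for positive observations."""
    n = len(data)
    total = n * (math.log(lam) + math.log(beta) - beta * math.log(alpha))
    total -= n * math.log(math.expm1(math.expm1(lam)))
    for x in data:
        u = (x / alpha) ** beta
        g = -math.expm1(-u)  # Weibull cdf at x
        total += (beta - 1) * math.log(x) - u + lam * g + math.expm1(lam * g)
    return total

def score_lambda(lam, alpha, beta, data):
    """Analytic score component U_lambda."""
    n = len(data)
    e = math.exp(math.expm1(lam))  # exp(e^lam - 1)
    s = n / lam - n * math.exp(lam) * e / (e - 1.0)
    for x in data:
        g = -math.expm1(-((x / alpha) ** beta))
        s += g + math.exp(lam * g) * g
    return s

# Hypothetical data and parameter values, for illustration only.
data = [0.5, 1.2, 0.8, 2.0, 1.5]
lam, alpha, beta, h = 0.7, 2.0, 1.5, 1e-6
numeric = (loglik(lam + h, alpha, beta, data) - loglik(lam - h, alpha, beta, data)) / (2 * h)
print(numeric, score_lambda(lam, alpha, beta, data))
```

The same check can be repeated for \(U_{\alpha }\) and \(U_{\beta }\); agreement of the two values indicates that the log-likelihood and the score are implemented consistently.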

Simulation study

In this section, we conduct a simulation study of the parameter estimates of the proposed CBellW model to analyze their performance for various sample sizes \(n=20,25,30,\ldots ,250\). We simulated \(N=1000\) samples that are replicated 5000 times. We fix the scale parameter at \(\alpha =2\) for two different sets and vary the shape parameters \(\beta\) and \(\lambda\) in various combinations. In particular, we consider \(\text {set I}=[\beta = 6.0,\, \lambda = 0.70]\) and \(\text {set II}=[\beta = 5.0,\,\lambda = 1.20]\).

According to the results of the simulation study in Tables 2 and 3, the bias and the mean squared error (MSE) of the parameter estimates decrease as the sample size increases. Therefore, the CBellW model parameters may be estimated, and confidence intervals can be constructed, using the maximum likelihood estimators (MLEs) and their asymptotic results. The MSEs and biases for set I and set II are illustrated in Figs. 3 and 4, respectively. The following Eqs. (23) and (24),

$$\begin{aligned} \text {MSE}(\hat{\Theta })=\sum _{r=1}^{5000}\frac{(\hat{\Theta }_{r}-\Theta )^{2}}{5000} \end{aligned}$$
(23)

and

$$\begin{aligned} \text {Bias}(\hat{\Theta })=\sum _{r=1}^{5000}\frac{\hat{\Theta }_{r}}{5000}-\Theta , \end{aligned}$$
(24)

where \(\hat{\Theta }_{r}\) denotes the estimate obtained in the rth replication, are used to evaluate the MSE and bias of the estimates, respectively.
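The mechanics of Eqs. (23) and (24) can be sketched in a few lines of Python. To keep the example short and dependency-free, the sketch below does not compute MLEs; as a loudly labeled stand-in, it estimates the median of the CBellW model by the sample median, with samples drawn by inverse transform sampling from the qf in Eq. (7). The function names, the number of replications and the seed are assumptions for illustration.

```python
import random
import statistics
import math

def cbellw_quantile(u, lam, alpha, beta):
    """Quantile function of Eq. (7)."""
    d = math.expm1(math.expm1(lam))
    w = math.log1p(math.log1p(u * d)) / lam
    return alpha * (-math.log1p(-w)) ** (1.0 / beta)

def bias_and_mse(estimates, true_value):
    """Bias and MSE over replications, following Eqs. (23) and (24)."""
    reps = len(estimates)
    bias = sum(estimates) / reps - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / reps
    return bias, mse

def simulate_median(n, reps, lam, alpha, beta, seed=1):
    """Stand-in experiment: sample median as estimator of the true median."""
    rng = random.Random(seed)
    true_median = cbellw_quantile(0.5, lam, alpha, beta)
    estimates = []
    for _ in range(reps):
        sample = [cbellw_quantile(rng.random(), lam, alpha, beta) for _ in range(n)]
        estimates.append(statistics.median(sample))
    return bias_and_mse(estimates, true_median)

# Set I parameter values from the text; 300 replications is an assumption.
for n in (20, 200):
    print(n, simulate_median(n, reps=300, lam=0.70, alpha=2.0, beta=6.0))
```

Replacing the sample median by a numerical maximizer of the log-likelihood would turn this into a simulation of the kind reported in Tables 2 and 3.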

Table 2 Output summary of simulation study regarding set I.
Table 3 Output summary of simulation study regarding set II.
Figure 3

Graphical illustration of biases and MSEs for varying sample sizes for set I.

Figure 4

Graphical illustration of biases and MSEs for varying sample sizes for set II.

Real-life applications

This section aims to practically implement the CBellW model on real data sets to demonstrate the benefits of the proposed model. In “Modeling of COVID-19 and cancer data” section, we apply the CBellW model to four medical data sets, and in the following “Designing a GASP with application to Guinea pigs data” and “Actuarial measures with applications to auto-mobile collision claims data” section, we design a GASP (with application to Guinea pigs data) and compute risk measures by using actuarial data, respectively. We also compare several Weibull-based models such as the complementary Poisson Weibull (CPW)3, alpha power Weibull (APW)13, transmuted Weibull (TW)14, beta Weibull (BW)15, Marshall Olkin Weibull (MOW)16, Weibull claim (W-claim)17, gamma Weibull (GW)18, Gull alpha power Weibull (GAPW)19, and exponentiated exponential (EE) with the proposed CBellW model.

The first data set was recently used in Ref. 20 and comprises daily confirmed COVID-19 death cases. It consists of 89 observations with an average of 18.72 daily reported deaths. The second data set represents the survival times of head and neck cancer patients treated with radiotherapy (RT). It consists of 58 observations with a mean survival time of 226.17 and was also used in Ref. 21. The third data set contains the remission times (in months) of 128 blood cancer patients, with a mean remission time of 9.37 months, and was recently examined by many authors, including Hamdeni et al.22 and Ref. 23. The fourth data set was recently used in Ref. 23 and represents the survival times in days of 73 patients diagnosed with acute bone cancer, with a mean survival time of 3.76. The fifth data set24 represents the survival data of guinea pigs infected with virulent tubercle bacilli. Guinea pigs are known to be highly susceptible to human tuberculosis, which is one of the reasons for selecting them for this study. The sixth data set is extracted from the Insurance Data R package25 and represents UK auto-mobile collision claims. It consists of 32 observations (in pounds) related to the severity of claims. The observations are divided by 100 for computational purposes (this does not affect statistical inference). The descriptive statistics for all the data sets are shown in Table 4, whereas data sets 1–5 are given in Table 5 (for data set 6, see Ref. 25).

Table 4 Descriptive information on the data sets.
Table 5 Real data sets.

Modeling of COVID-19 and cancer data

From a medical perspective, policy makers are always interested in accurate estimates to enable better planning for disease management and control. There are several flexible models that are commonly used for this purpose, e.g., Klakattawi et al.23 used an extended Weibull model for cancer patients survival analysis. Badr et al.26 employed an extended Weibull distribution on survival data. Zichuan et al.27 analysed bladder cancer data also by using an extended Weibull distribution, and Wang et al.28 introduced an exponent power Weibull model to analyze medical data.

In the following, we focus on data sets 1–4 (COVID-19 and cancer data). Table 6 displays the MLEs and standard errors (SEs) of the estimates for the fitted models. The AIC, CAIC, BIC and HQIC are shown in Table 7, along with other important metrics such as the Anderson–Darling (A), Cramér–von Mises (W) and Kolmogorov–Smirnov (K–S) statistics and the corresponding p-values. See also the related visualizations in Figs. 5, 6, 7, 8, 9 and 10. Among the competing models, the one with the highest p-value and the lowest values of the information criteria is deemed the best; by this criterion, the proposed three-parameter CBellW model outperforms the other well-known models.

Table 6 Fitted models with parameter estimates and standard errors.
Table 7 Detailed summary of model selection criteria.
Figure 5

TTT plots of Data-1 – Data-4.

Figure 6

Estimated pdf, cdf, hrf, and survival function for Data-1.

Figure 7

Estimated pdf, cdf, hrf, and survival function for Data-2.

Figure 8

Estimated pdf, cdf, hrf, and survival function for Data-3.

Figure 9

Estimated pdf, cdf, hrf, and survival function for Data-4.

Figure 10

Probability–Probability (P–P) plot of Data-1-4.

Designing a GASP with application to Guinea pigs data

Product quality is one of the most important characteristics that distinguish different goods in a global market. Before approving or rejecting a lot, particular quality control procedures are carried out in accordance with different sampling schemes. Under an acceptance sampling plan, a lot of items is accepted or rejected depending on the quality of the items assessed in a sample taken from the lot29. The group acceptance sampling plan (GASP) inspects multiple items at once, depending on the number of testers available to the experimenter, whereas the ordinary acceptance sampling plan (OASP) only inspects one item at a time.

This section illustrates a GASP under the assumption that the lifetime of an item follows the CBellW model, with cdf as in Eq. (26) and known parameters \(\beta\) and \(\lambda\). For a GASP, a sample of size \(n=rg\) is drawn, divided into g groups of r items each, and put on life test for a predetermined period of time. If more than c failures occur in any group, where c is the acceptance number, the experiment is declared a failure. GASPs have been described by many authors; see, e.g.,30,31,32,33,34. When designing the GASP, the quality parameter is taken to be either the mean or the median; for skewed distributions, however, the median is typically preferred30. The GASP is based on the following steps:

  • Identify the number of groups g.

  • Assign r items to each group for the life test after selecting gr items at random from a lot; in the life test, \(n= gr\) is the necessary sample size.

  • Set the life test’s termination time \(t_0\) and the acceptance number c for each group.

  • Finally, decide whether to accept or reject the lot: the lot is accepted if at most c nonconforming units are observed in every group, and rejected if more than c nonconforming units are observed in any group.

The probability of accepting a lot is given as follows:

$$\begin{aligned} p_{a\left( p\right) }=\left[ \sum _{i=0}^{c}\left( {\begin{array}{c}r\\ i\end{array}}\right) p^{i}\left[ 1-p\right] ^{r-i}\right] ^{g}, \end{aligned}$$
(25)
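
As a minimal sketch of Eq. (25) in Python, the acceptance probability can be evaluated directly from the binomial sum. The function name `acceptance_probability` and the numerical inputs are illustrative assumptions, not values taken from Table 8.

```python
from math import comb


def acceptance_probability(p: float, r: int, c: int, g: int) -> float:
    """Eq. (25): probability that each of g groups of r items shows at most c failures."""
    # Probability that a single group of r items has at most c failures
    per_group = sum(comb(r, i) * p**i * (1.0 - p) ** (r - i) for i in range(c + 1))
    # Eq. (25) raises the per-group probability to the power g
    return per_group**g


# Hypothetical inputs chosen only to show the call
print(acceptance_probability(p=0.1, r=5, c=1, g=3))
```

For fixed p, r and c the returned value can only decrease as g grows, which is why a stricter bound on the acceptance probability calls for more groups.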

where p denotes the probability that an item in a group fails before \(t_0\); it is obtained by inserting Eq. (7) in Eq. (4):

$$\begin{aligned} m=\alpha \left[ -\log \left( 1-\left\{ \lambda ^{-1}\,\log \left[ 1+\log \left\{ 1+p[\exp (e^{\lambda }-1)-1]\right\} \right] \right\} \right) \right] ^{1/\beta }, \end{aligned}$$
(26)

In the following, let

$$\begin{aligned} \zeta =\left[ -\log \left( 1-\left\{ \lambda ^{-1}\,\log \left[ 1+\log \left\{ 1+p[\exp (e^{\lambda }-1)-1]\right\} \right] \right\} \right) \right] ^{1/\beta }. \end{aligned}$$

By substituting \(\alpha =m/\zeta\) and \(t=a_{1}m_{0}\) in Eq. (4), we obtain the probability of a failure as

$$\begin{aligned} F_{\text {CBellW}}(t)=\frac{\exp \left[ e^{\lambda \left\{ 1-\exp \left[ -\left( \frac{a_{1}\zeta }{r_{2}}\right) ^{\beta }\right] \right\} }-1\right] -1}{\exp \left[ e^{\lambda }-1\right] -1}. \end{aligned}$$
(27)
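
The substitution behind Eq. (27) can be spelled out in one line. This is a sketch that assumes, as the form of Eq. (27) suggests, that the cdf in Eq. (4) depends on t only through \((t/\alpha )^{\beta }\); with \(r_2=m/m_0\),

$$\begin{aligned} \left( \frac{t}{\alpha }\right) ^{\beta }=\left( \frac{a_{1}m_{0}}{m/\zeta }\right) ^{\beta }=\left( \frac{a_{1}\zeta }{m/m_{0}}\right) ^{\beta }=\left( \frac{a_{1}\zeta }{r_{2}}\right) ^{\beta }. \end{aligned}$$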

Given \(a_1\) and \(r_2\), where \(r_2=m/m_0\), p can be calculated from Eq. (27) for chosen \(\beta\) and \(\lambda\). The two failure probabilities corresponding to the consumer’s and producer’s risk are denoted by \(p_1\) and \(p_2\), respectively. For given values of \(\theta\), \(\lambda\), \(r_2\), \(a_1\), \(\beta\), and \(\gamma\), we have to determine the design parameters (c, g) that simultaneously satisfy the following two inequalities, in which \(\beta\) and \(\gamma\) on the right-hand sides denote the consumer’s and the producer’s risk, respectively (this \(\beta\) is not the shape parameter of the CBellW model):

$$\begin{aligned} p_{a\left( p_{1}|\frac{m}{m_{0}}=r_{1}\right) }=\left[ \sum _{i=0}^{c}\left( {\begin{array}{c}r\\ i\end{array}}\right) p_{1}^{i}\left[ 1-p_{1}\right] ^{r-i}\right] ^{g}\le \beta , \end{aligned}$$
(28)

and

$$\begin{aligned} p_{a\left( p_{2}|\frac{m}{m_{0}}=r_{2}\right) }=\left[ \sum _{i=0}^{c}\left( {\begin{array}{c}r\\ i\end{array}}\right) p_{2}^{i}\left[ 1-p_{2}\right] ^{r-i}\right] ^{g}\ge 1-\gamma , \end{aligned}$$
(29)

where \(r_1\) and \(r_2\) denote the ratio \(m/m_0\) at the consumer’s risk and the producer’s risk, respectively, and the failure probabilities used in Eqs. (28) and (29) are given by the following Eqs. (30) and (31) for the CBellW model:

$$\begin{aligned} p_{1}=\frac{\exp \left[ e^{\lambda \left\{ 1-\exp \left[ -\left( a_{1}\zeta \right) ^{\beta }\right] \right\} }-1\right] -1}{\exp \left[ e^{\lambda }-1\right] -1} \end{aligned}$$
(30)

and

$$\begin{aligned} p_{2}=\frac{\exp \left[ e^{\lambda \left\{ 1-\exp \left[ -\left( \frac{a_{1}\zeta }{r_{2}}\right) ^{\beta }\right] \right\} }-1\right] -1}{\exp \left[ e^{\lambda }-1\right] -1}. \end{aligned}$$
(31)
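
The search for the design parameters can be sketched in Python as follows. The sketch assumes that the quality parameter is the median, so that \(\zeta\) is evaluated at \(p=0.5\), and that \(r_1=1\) as in Eq. (30). The function names, the producer's risk of 0.05 and the ratio \(r_2=4\) are hypothetical choices for illustration, and the output is not claimed to reproduce Table 8.

```python
from math import comb, exp, log


def zeta(p, beta, lam):
    """Bracketed term of Eq. (26): the p-th quantile divided by alpha."""
    inner = log(1.0 + log(1.0 + p * (exp(exp(lam) - 1.0) - 1.0))) / lam
    return (-log(1.0 - inner)) ** (1.0 / beta)


def failure_probability(a1, ratio, beta, lam, p_quality=0.5):
    """Eqs. (30)-(31): failure probability before t0 = a1*m0 when m/m0 = ratio."""
    z = zeta(p_quality, beta, lam)  # assumption: quality parameter is the median
    weib = 1.0 - exp(-((a1 * z / ratio) ** beta))
    return (exp(exp(lam * weib) - 1.0) - 1.0) / (exp(exp(lam) - 1.0) - 1.0)


def acceptance_probability(p, r, c, g):
    """Eq. (25): each of g groups of r items has at most c failures."""
    per_group = sum(comb(r, i) * p**i * (1.0 - p) ** (r - i) for i in range(c + 1))
    return per_group**g


def find_design(p1, p2, r, consumer_risk, producer_risk, g_max=10000):
    """Smallest c, then smallest g, meeting Eqs. (28) and (29); None if none is found."""
    for c in range(r):
        for g in range(1, g_max + 1):
            if acceptance_probability(p1, r, c, g) <= consumer_risk:
                # Smallest g meeting Eq. (28); a larger g would only lower
                # the acceptance probability at p2, so test Eq. (29) here.
                if acceptance_probability(p2, r, c, g) >= 1.0 - producer_risk:
                    return c, g
                break
    return None


beta_hat, lam_hat = 0.7330, 2.0201  # shape and lambda quoted in the text
p1 = failure_probability(a1=1.0, ratio=1.0, beta=beta_hat, lam=lam_hat)
p2 = failure_probability(a1=1.0, ratio=4.0, beta=beta_hat, lam=lam_hat)  # r2 = 4 is hypothetical
print(p1, p2, find_design(p1, p2, r=10, consumer_risk=0.25, producer_risk=0.05))
```

The search takes the smallest c first and, for that c, the smallest g, which is one reasonable reading of "minimal g and c"; other tie-breaking rules are possible.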

Table 8 shows the design parameters obtained for \(\beta =0.7330\) and \(\lambda =2.0201\) and two levels of r (5 and 10). The analysis shows that the number of groups tends to increase as the consumer’s risk \(\beta\) is reduced. Moreover, the number of groups declines rapidly as \(r_2\) increases. Beyond a certain point, however, g and c remain constant while the probability of accepting a lot increases. The proposed GASP is shown in Table 8 for consumer’s risk \(\beta =0.25\), \(a_1=1\), \(\lambda =2.0201\) and \(r=10\), where g decreases and the OC value increases (see also Table 9).

Recently, Sivakumar et al.24 designed a GASP under the odd generalized exponential log-logistic model by analyzing survival data from guinea pigs that had been exposed to virulent tubercle bacilli. One of the reasons researchers chose guinea pigs for this investigation was their known high susceptibility to human tuberculosis. Here, we consider only the observations in which all animals in a single cage are under the same regimen. The data were also studied by Bjerkedal35. The data set consists of 72 observations of survival time with mean and median values of 1.77 and 1.51 days, respectively. See Fig. 11 for visualizations of the data set. The K–S test led to a p-value of 0.617 and a maximum difference between real and fitted data of 0.089. In comparison to the odd generalized exponential log-logistic model24, the three-parameter CBellW model fits the data better (K–S statistic 0.0774 and p-value 0.7809). The estimated parameters (SEs) are \(\hat{\alpha }=0.3418\) (0.1661), \(\hat{\beta }=0.7330\) (0.1351) and \(\hat{\lambda }=2.0201\) (0.2990). Table 8 shows the GASP under the CBellW model evaluated at the MLEs, giving the minimal g and c for \(r=5\) and \(r=10\) and for \(a_1=0.5\) and 1. For \(r=5\), 90 groups, i.e., 450 \((=90 \cdot 5)\) units in total, are required for life testing. The number of groups or units that must be tested under identical conditions is, however, significantly reduced when \(r=10\): a total of 12 groups, i.e., 120 \((=12\cdot 10)\) items, is then required. Here, a group size of 10 is therefore preferred. Under the CBellW model, as the true median life grows, the number of groups decreases and the OC values rise.

Table 8 GASP under the CBellW model, \(\beta =0.7330\) and \(\lambda =2.0201\).
Table 9 Proposed GASP.
Figure 11

TTT plot and estimated pdf, cdf, hrf, K–M and P–P plot for Data-5.

Actuarial measures with applications to automobile collision claims data

Due to their adaptability and potential for precise predictions, extended models have become popular for investigating actuarial data. Many authors have used these models and emphasized their advantages36,37,38,39,40,41. As for the proposed CBellW model, we discuss different risk measures in the following.

The Value at Risk (VaR) is a statistical measure used in finance and risk management to estimate the potential losses on a financial portfolio or investment over a specified time horizon and at a given confidence level q. It quantifies the maximum loss that an investment or portfolio is expected to suffer under normal market conditions over a defined time frame. If a random variable X follows the CBellW distribution, then the following expression defines its VaR:

$$\begin{aligned} \text {VaR}_{q}=\alpha \Biggl (-\log \left[ 1-\Biggl \{\lambda ^{-1}\,\log \left[ 1+\log \left\{ 1+q[\exp (e^{\lambda }-1)-1]\right\} \right] \Biggr \}\right] \Biggr )^{1/\beta } \end{aligned}$$
(32)
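
Since Eq. (32) is in closed form, it can be evaluated directly. The following Python sketch implements it together with the cdf in the form that appears in Eq. (27), with \((x/\alpha )^{\beta }\) as the argument (an assumption about Eq. (4), which is not reproduced in this section). The parameter values are the guinea pig MLEs quoted earlier and serve only as an illustration; they are not the estimates for the claims data.

```python
from math import exp, log


def cbellw_cdf(x, alpha, beta, lam):
    """CBellW cdf, written in the form of Eq. (27) with argument (x/alpha)**beta."""
    weib = 1.0 - exp(-((x / alpha) ** beta))
    return (exp(exp(lam * weib) - 1.0) - 1.0) / (exp(exp(lam) - 1.0) - 1.0)


def cbellw_var(q, alpha, beta, lam):
    """Eq. (32): Value at Risk (quantile) of the CBellW model at level 0 < q < 1."""
    inner = log(1.0 + log(1.0 + q * (exp(exp(lam) - 1.0) - 1.0))) / lam
    return alpha * (-log(1.0 - inner)) ** (1.0 / beta)


# Illustrative parameter values only (guinea pig MLEs from the text)
alpha_hat, beta_hat, lam_hat = 0.3418, 0.7330, 2.0201
for q in (0.50, 0.90, 0.95, 0.99):
    v = cbellw_var(q, alpha_hat, beta_hat, lam_hat)
    # Round trip: the cdf evaluated at VaR_q should return q
    print(q, v, cbellw_cdf(v, alpha_hat, beta_hat, lam_hat))
```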

The Expected Shortfall (ES), developed by Artzner et al.42 and typically regarded as superior to VaR, is another important financial indicator. It can be computed by

$$\begin{aligned} \text {ES}_{q}\left( x\right) =\frac{1}{q}\intop _{0}^{q}\text {VaR}_{x}\,dx, \end{aligned}$$
(33)

for \(0<q<1\) and VaR given by (32).
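
Eq. (33) has no obvious closed form here, so a numerical approximation is a natural choice. The sketch below uses a plain midpoint rule in Python; the function names, the number of subintervals and the parameter values (the guinea pig MLEs, used only for illustration) are assumptions, and the R package VaRES mentioned later in the text is not used.

```python
from math import exp, log


def cbellw_var(q, alpha, beta, lam):
    """Eq. (32): Value at Risk (quantile) of the CBellW model at level 0 < q < 1."""
    inner = log(1.0 + log(1.0 + q * (exp(exp(lam) - 1.0) - 1.0))) / lam
    return alpha * (-log(1.0 - inner)) ** (1.0 / beta)


def expected_shortfall(q, alpha, beta, lam, n=20000):
    """Eq. (33): ES_q = (1/q) * integral of VaR_x over (0, q), midpoint rule."""
    h = q / n
    # Midpoints (k + 0.5) * h stay strictly inside (0, q)
    total = sum(cbellw_var((k + 0.5) * h, alpha, beta, lam) for k in range(n))
    return total * h / q


# Illustrative parameter values only
alpha_hat, beta_hat, lam_hat = 0.3418, 0.7330, 2.0201
for q in (0.70, 0.90, 0.95):
    print(q, cbellw_var(q, alpha_hat, beta_hat, lam_hat),
          expected_shortfall(q, alpha_hat, beta_hat, lam_hat))
```

Because \(\text {VaR}_{x}\) is increasing in x, the average in Eq. (33) is always below \(\text {VaR}_{q}\) itself, which gives a quick sanity check on the output.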

The Tail Value at Risk (TVaR), or tail conditional expectation (TCE), is the expected value of the loss in the event that it exceeds the VaR:

$$\begin{aligned} \text {TVaR}_{q}\left( x\right) =\frac{1}{1-q}\intop _{\text {VaR}_{q}}^{\infty }x\,f\left( x\right) dx \end{aligned}$$
(34)

By using Eq. (11), we get:

$$\begin{aligned} \text {TVaR}_{q}\left( x\right) =\alpha \left[ 1-q\right] ^{-1}\sum _{n=0}^{\infty }\frac{t_{n}}{(n+1)^{1/\beta +1}}\,\gamma \left( \frac{1}{\beta }+1,\left[ n+1\right] \left( \frac{\text {VaR}_{q}}{\alpha }\right) ^{\beta }\right) \end{aligned}$$
(35)

The Tail Variance (TV) is defined by the following expression:

$$\begin{aligned} \text {TV}_{q}\left( x\right) =E\left[ X^{2}|X>x_{q}\right] -\left[ \text {TVaR}_{q}\right] ^{2} \end{aligned}$$
(36)

Considering \(I=E\left[ X^{2}|X>x_{q}\right]\), i.e.,

$$\begin{aligned} I=\frac{1}{1-q}\intop _{\text {VaR}_{q}}^{\infty }x^{2}\,f_{\text {CBellW}}\left( x\right) dx, \end{aligned}$$

leads us to

$$\begin{aligned} I=\alpha ^{2}\left[ 1-q\right] ^{-1}\sum _{n=0}^{\infty }\frac{t_{n}}{(n+1)^{2/\beta +1}}\,\gamma \left( \frac{2}{\beta }+1,\left[ n+1\right] \left( \frac{\text {VaR}_{q}}{\alpha }\right) ^{\beta }\right) . \end{aligned}$$
(37)

By inserting Eqs. (35) and (37) in Eq. (36), we obtain the expression for TV for the CBellW model.

The Tail Variance Premium (TVP) combines information on both central tendency and dispersion. It is defined by

$$\begin{aligned} \text {TVP}_{q}\left( X\right) =\text {TVaR}_{q}+\delta \text {TV}_{q}, \end{aligned}$$
(38)

where \(0<\delta <1\). By inserting Eqs. (36) and (35) in Eq. (38), we obtain the Tail Variance Premium for the CBellW model.
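
As a numerical cross-check of the series expressions, the tail measures can also be approximated from the quantile function alone, using the identities \(\text {TVaR}_{q}=(1-q)^{-1}\int _{q}^{1}\text {VaR}_{u}\,du\) and \(E[X^{2}|X>\text {VaR}_{q}]=(1-q)^{-1}\int _{q}^{1}\text {VaR}_{u}^{2}\,du\), which hold for continuous distributions. The Python sketch below is a swapped-in technique (a crude midpoint rule), not the series of Eqs. (35) and (37); the function names, the parameter values and \(\delta =0.5\) are illustrative assumptions.

```python
from math import exp, log


def cbellw_var(q, alpha, beta, lam):
    """Eq. (32): Value at Risk (quantile) of the CBellW model at level 0 < q < 1."""
    inner = log(1.0 + log(1.0 + q * (exp(exp(lam) - 1.0) - 1.0))) / lam
    return alpha * (-log(1.0 - inner)) ** (1.0 / beta)


def tail_measures(q, alpha, beta, lam, delta, n=100000):
    """Approximate TVaR (Eq. 34), TV (Eq. 36) and TVP (Eq. 38) by quantile integration."""
    h = (1.0 - q) / n
    s1 = 0.0
    s2 = 0.0
    for k in range(n):
        # Midpoints stay strictly below 1, where the quantile diverges
        v = cbellw_var(q + (k + 0.5) * h, alpha, beta, lam)
        s1 += v
        s2 += v * v
    tvar = s1 / n              # (1/(1-q)) * integral of VaR_u over (q, 1)
    second_moment = s2 / n     # (1/(1-q)) * integral of VaR_u**2 over (q, 1)
    tv = second_moment - tvar**2
    tvp = tvar + delta * tv
    return tvar, tv, tvp


# Illustrative parameter values only (guinea pig MLEs from the text)
print(tail_measures(q=0.95, alpha=0.3418, beta=0.7330, lam=2.0201, delta=0.5))
```

The integrands grow without bound as u approaches 1, so the midpoint rule converges slowly; the result is best treated as a rough check rather than a high-precision value.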

In the following, we apply VaR and ES, by way of example, to the UK automobile collision claims data set. Various visualizations of the data set can be seen in Fig. 12. Table 10 gives the MLEs and SEs of the estimates for the fitted models. Table 11 reports AIC, CAIC, BIC, and HQIC along with p-values and the results of the A, W, and K–S tests.

Table 10 Fitted models with parameter estimates and standard errors.
Table 11 Detailed summary of model selection measures.
Figure 12

Estimated plots of pdf, cdf, hrf, K–M, P–P and TTT for Data-6.

Table 12 and Fig. 13 provide numerical and graphical representations, respectively, of both VaR and ES. By using the MLEs for the data set, the proposed CBellW model and the Weibull model are compared in terms of their VaR and ES. Note that a distribution is considered to have a heavier tail compared to another distribution when the associated risk measures yield larger values. Table 12 shows that the CBellW model has larger values of both risk measures than its counterpart, the Weibull model. Figure 13 also reveals that the proposed model has a heavier tail than the Weibull model. The readers are referred to Chan et al.43 for numerical computations of ES and VaR using the R package VaRES.

Table 12 VaR and ES for the CBellW and W model based on MLEs.
Figure 13

Graphical illustration of VaR and ES based on the MLEs.

Concluding remarks

In this paper, we have studied the CBellW model based on the CBell-G family of distributions. The failure rate function of the CBellW model can take different forms, which makes it a flexible model for real-world applications in numerous areas. We derived and discussed the key properties of the CBellW model in detail. The effectiveness of the CBellW model has been evaluated in real data applications (COVID-19, cancer, quality control, and actuarial data), where it has been compared with several established models. The analysis revealed that the proposed CBellW model provides a better fit than its competitors on these data sets. The introduced distribution family contributes to the existing literature in that it builds upon the DBellD, inheriting the advantageous properties associated with Bell distributions. Hence, the proposed CBellW distribution family presents a promising alternative with the potential to outperform the well-established CP-G family. Since the quantile function of the CBellW model has a closed form, one possible direction for future work is to use it for quantile regression analysis.