Abstract
This paper presents a novel technique for creating distribution families by combining the alpha power transformation with the cosine function. The proposed technique is named the cosine alpha power-generated family. The Weibull distribution is used as the baseline to produce a specific model of this family, called the cosine alpha power-Weibull (CAP-W) distribution. Its statistical characteristics are investigated, including quantiles, Rényi entropies, and order statistics. The CAP-W density function can be right-skewed, symmetric, or decreasing, and its hazard rate function can be J-shaped, upside-down bathtub-shaped, increasing, or decreasing. Four estimation methods (maximum likelihood, ordinary least squares, weighted least squares, and Cramér–von Mises) are used to estimate the distribution parameters, and a simulation study is carried out to examine their performance. Furthermore, the efficiency of the proposed distribution is demonstrated using four real data sets. Finally, the log cosine alpha power Weibull regression model is constructed and examined with a real dataset.
Introduction
Statistical models are commonly used in real-life situations to analyze and explain datasets. Classical distributions such as the Weibull, Lomax, and exponential are widely employed for this purpose. However, these distributions cannot always fit complicated datasets well. Therefore, to make classical distributions more flexible and adaptable to data from different fields of study, several researchers have extended them by producing new families of distributions. Various techniques for generalizing distributions have been proposed in the literature. These include the exponentiated family by 1, the T-X generalized family by 2, and the alpha power (AP) transformation family by 3. The cumulative distribution function (CDF) of the AP family is given by
$$\begin{aligned} F_{AP}(w) =\left\{ \begin{array}{rcl} \frac{1-\alpha ^{F(w)}}{1-\alpha } & \text{ if } & \alpha >0 ,\alpha \ne 1,\\ F(w) & \text{ if } & \alpha =1, \end{array}\right. \qquad (1) \end{aligned}$$
and the corresponding probability density function (PDF) is
$$\begin{aligned} f_{AP}(w) =\left\{ \begin{array}{rcl} \frac{\log \alpha }{\alpha -1}f(w)\alpha ^{F(w)} & \text{ if } & \alpha >0 ,\alpha \ne 1,\\ f(w) & \text{ if } & \alpha =1, \end{array}\right. \qquad (2) \end{aligned}$$
where F(w) represents the CDF and f(w) denotes the PDF of any continuous distribution.
Many distributions have been produced from the AP family, such as the AP exponential distribution by 3, the AP Weibull distribution by 4, the AP Fréchet distribution by 5, the AP inverted Topp–Leone distribution by 6, the AP generalized Pareto distribution by 7, and the AP Lindley distribution by 8.
Recently, the trigonometric transformation of conventional distributions has attracted considerable interest because of its appealing features, such as the capacity to improve the properties of existing distributions without adding parameters to those of the baseline model. Trigonometric functions are also flexible: their periodic form shapes how the distribution curve behaves as the parameter values change.
Numerous studies have introduced new families and distributions using the trigonometric transformation. Examples include the sin-G 10,11, the tan-G 11,12, the sec-G 11,13, the sine unit exponentiated half-logistic distribution 14, the new weighted sine-G family 15, the weighted sine very flexible Weibull distribution 16, the tangent exponential-G family 17, the new logarithmic tangent-U family 18, the new cotangent-G family 19, the tan Weibull loss distribution 20, the exponentiated arctan-X family 21, and the sine alpha power-G family 22, among others.
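Souza 11,23 introduced a new trigonometric approach that produces a family of distributions known as the cosine-G (CG) family by applying a cosine transformation to the CDF of a baseline distribution. The CDF of the CG family is
$$\begin{aligned} G_{CG}(w)=1-\cos \left[ \frac{\pi }{2}F(w)\right] ,~w\in \mathbb {R}, \qquad (3) \end{aligned}$$
and its PDF is
$$\begin{aligned} g_{CG}(w)=\frac{\pi }{2}f(w)\sin \left[ \frac{\pi }{2}F(w)\right] . \qquad (4) \end{aligned}$$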
Several studies have extended the cosine-G framework to develop new families with varying shapes that are flexible in modelling data. Examples include the novel arc cosine-\(\Psi\) class of distributions 24, the extended cosine generalized family 25, the cosine Topp–Leone family 26, the cosine pie-power odd-G family 27, the cosine inverse Lomax-G family 28, the cosine Weibull family 29, the new cosine trigonometric-G family 30, the new class of cos-G family 31, and the cosine Kumaraswamy-G family 32.
In this paper, our main goal is to produce a novel family of the CG using the AP family as the baseline distribution in the CG. This new family is known as the cosine alpha power generated (CAP-G) family of distributions. The primary features for creating this novel family are as follows:
- A distinctive feature of the CAP-G family is that it gains flexibility in fitting various datasets without introducing extra parameters.
- The CAP-G family is formulated within a trigonometric framework, yielding a novel trigonometric distribution that can capture a wide range of shapes and tail behaviors. In particular, the periodic nature of the trigonometric family shapes the forms of the density and hazard rate functions and adds versatility to the family’s special models, which produces novel and flexible models.
- Incorporating the alpha-power transformation within the proposed family provides additional flexibility to control both the degree of skewness and the heaviness of the tails.
- The CAP-G family is expected to provide better fit and forecasting precision for datasets displaying patterns that are difficult to capture with classical distributions.
- When F(w) is a baseline CDF, transformations such as \(\sin (\frac{\pi }{2}F(w))\) or \(1-\cos (\frac{\pi }{2}F(w))\) map the unit interval [0,1] into itself and remain continuous and monotone over the relevant range; therefore, the resulting composition is a valid CDF, where \(w \in \mathbb {R}\).
This paper then selects the Weibull distribution as the baseline distribution, yielding the CAP-W distribution. This choice is motivated by the widely recognized versatility of the Weibull distribution for modeling lifetime and reliability data and by its capability to accommodate increasing, decreasing, and constant hazard rate patterns. The resulting CAP-W distribution therefore exhibits various curve shapes for both the PDF and the hazard rate function, indicating its adaptability for describing real-life data in a variety of application fields. Moreover, the Weibull distribution includes the exponential and Rayleigh distributions as special cases, which further reinforces its appropriateness as a baseline model. Additionally, the CAP-W provides a better fit than four rival models across four real data sets. Finally, a novel log location-scale regression model is constructed.
The remaining sections are arranged as follows. Section "The cosine AP-generated family" describes the CAP-G family along with its characteristics. Section "Properties of cosine AP- Weibull distribution" presents the CAP-W distribution and analyzes its important statistical features. Four approaches are employed to estimate the distribution parameters in Section "Estimation methods". In Section "Simulation study", a Monte Carlo simulation is carried out to assess the accuracy of the estimates. Section "Applications" examines four real-world data sets to demonstrate the CAP-W’s efficiency. The log CAP-W (LCAP-W) regression model is discussed in Section "The log CAP-W regression model". Finally, the conclusions of this study are given in Section "Conclusions".
The cosine AP-generated family
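Let W be a random variable (RV) that follows the CAP-G family with CDF \(G_{CAP-G}(w)\). Then \(G_{CAP-G}(w)\) is obtained by replacing F(w) in Eq. (3) with Eq. (1), which gives
$$\begin{aligned} G_{CAP-G}(w)=1-\cos \left[ \frac{\pi }{2}\left( \frac{1-\alpha ^{F(w)}}{1-\alpha }\right) \right] ,~w\in \mathbb {R},~~\alpha >0,\alpha \ne 1, \qquad (5) \end{aligned}$$
and the related PDF is
$$\begin{aligned} g_{CAP-G}(w)=\frac{\pi \log \alpha }{2(\alpha -1)}f(w)\alpha ^{F(w)}\sin \left[ \frac{\pi }{2}\left( \frac{1-\alpha ^{F(w)}}{1-\alpha }\right) \right] , \qquad (6) \end{aligned}$$
where \(\alpha\) is the transformation parameter, and F(w) and f(w) are the CDF and PDF of the baseline distribution, respectively.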
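The CAP-G family’s hazard rate function, \(h_{CAP-G}(w)\), is given by
$$\begin{aligned} h_{CAP-G}(w)=\frac{\pi \log \alpha }{2(\alpha -1)}f(w)\alpha ^{F(w)}\tan \left[ \frac{\pi }{2}\left( \frac{1-\alpha ^{F(w)}}{1-\alpha }\right) \right] , \qquad (7) \end{aligned}$$
and the CAP-G family’s survival function, \(s_{CAP-G}(w)\), is
$$\begin{aligned} s_{CAP-G}(w)=\cos \left[ \frac{\pi }{2}\left( \frac{1-\alpha ^{F(w)}}{1-\alpha }\right) \right] . \qquad (8) \end{aligned}$$
To obtain the quantile function of the CAP-G family, the CDF in Eq. (5) is inverted as follows:
$$\begin{aligned} Q_{CAP-G}(p)=F^{-1}\left( \frac{\log \left[ 1+\frac{2(\alpha -1)}{\pi }\arccos (1-p)\right] }{\log \alpha }\right) ,~0<p<1, \qquad (9) \end{aligned}$$
where \(F^{-1}\) denotes the quantile function of the baseline distribution.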
The related median, lower quartile, and higher quartile are determined by replacing \(p = 0.5, 0.25\), and 0.75, respectively, in Eq. (9).
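The construction of the family can be sketched numerically. The snippet below is a minimal illustration, assuming the CAP-G CDF has the form \(1-\cos [\frac{\pi }{2}(\alpha ^{F(w)}-1)/(\alpha -1)]\) implied by composing the cosine-G and AP transformations; the function names and the unit-rate exponential baseline are hypothetical choices made only for this example.

```python
import math

def cap_g_cdf(w, alpha, base_cdf):
    """CAP-G CDF: cosine transform applied to the alpha power transform of F(w)."""
    ap = (alpha ** base_cdf(w) - 1.0) / (alpha - 1.0)  # AP step, requires alpha != 1
    return 1.0 - math.cos(0.5 * math.pi * ap)

def cap_g_quantile(p, alpha, base_quantile):
    """Invert the CAP-G CDF for 0 < p < 1, given the baseline quantile function."""
    ap = 2.0 * math.acos(1.0 - p) / math.pi                     # undo the cosine step
    u = math.log(1.0 + (alpha - 1.0) * ap) / math.log(alpha)    # undo the AP step
    return base_quantile(u)

# Hypothetical baseline used only for illustration: unit-rate exponential.
def exp_cdf(w):
    return 1.0 - math.exp(-w)

def exp_quantile(u):
    return -math.log(1.0 - u)

alpha = 2.5  # illustrative value, not taken from the paper
q1, median, q3 = (cap_g_quantile(p, alpha, exp_quantile) for p in (0.25, 0.5, 0.75))
print(q1, median, q3)
```

Evaluating `cap_g_cdf` at the three returned points should give back 0.25, 0.5, and 0.75, which is a quick way to check that the inversion is consistent with the assumed CDF.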
Special models of the CAP-G family
Cosine AP Weibull distribution
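The CAP-W distribution is a member of the CAP-G family. Let F(w) in Eq. (5) be the Weibull distribution’s CDF 33; then the CDF of the CAP-W distribution is
$$\begin{aligned} G_{CAP-W}(w)=1-\cos \left[ \frac{\pi }{2}\left( \frac{1-\alpha ^{1-\exp (-\eta w^{\theta })}}{1-\alpha }\right) \right] ,~w>0,~~\alpha >0,\alpha \ne 1,~\eta ,\theta >0. \qquad (10) \end{aligned}$$
The associated PDF can be written as follows:
$$\begin{aligned} g_{CAP-W}(w)=\frac{\pi \theta \eta \log \alpha }{2(\alpha -1)}w^{\theta -1}\exp (-\eta w^{\theta })\alpha ^{1-\exp (-\eta w^{\theta })}\sin \left[ \frac{\pi }{2}\left( \frac{1-\alpha ^{1-\exp (-\eta w^{\theta })}}{1-\alpha }\right) \right] . \qquad (11) \end{aligned}$$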
Figure 1 shows that the PDF of the CAP-W distribution takes a variety of symmetric, decreasing, and right-skewed shapes, illustrating the flexibility of the distribution under various combinations of the parameters \(\alpha\), \(\eta\) and \(\theta\). The transformation parameter \(\alpha\) acts as a shape-controlling parameter that governs the degree of skewness and tail heaviness, while the scale and shape parameters \(\eta\) and \(\theta\) of the baseline Weibull distribution modulate the spread and the peak, respectively.
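As a sanity check on the density, the sketch below evaluates the CAP-W CDF and PDF and integrates the PDF numerically. It assumes the Weibull baseline \(F(w)=1-\exp (-\eta w^{\theta })\) and the CAP-G construction described above; the parameter values and the helper `midpoint_integral` are hypothetical and chosen only for illustration.

```python
import math

def cap_w_cdf(w, alpha, eta, theta):
    """CAP-W CDF for w > 0, alpha > 0, alpha != 1, eta > 0, theta > 0."""
    F = 1.0 - math.exp(-eta * w ** theta)            # Weibull baseline CDF
    ap = (alpha ** F - 1.0) / (alpha - 1.0)          # alpha power transform
    return 1.0 - math.cos(0.5 * math.pi * ap)

def cap_w_pdf(w, alpha, eta, theta):
    """CAP-W PDF, the derivative of cap_w_cdf with respect to w."""
    F = 1.0 - math.exp(-eta * w ** theta)
    f = theta * eta * w ** (theta - 1.0) * math.exp(-eta * w ** theta)
    ap = (alpha ** F - 1.0) / (alpha - 1.0)
    scale = 0.5 * math.pi * math.log(alpha) / (alpha - 1.0)
    return scale * f * alpha ** F * math.sin(0.5 * math.pi * ap)

def midpoint_integral(func, lo, hi, steps=20000):
    """Plain midpoint rule; adequate for a smooth density on a bounded range."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

params = (0.5, 1.0, 2.0)  # (alpha, eta, theta), illustrative values only
area = midpoint_integral(lambda w: cap_w_pdf(w, *params), 0.0, 6.0)
print(area, cap_w_cdf(6.0, *params))
```

Both printed values should be very close to 1 for these parameters, since almost all of the probability mass lies below w = 6.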
Cosine AP exponential distribution
Applying the exponential distribution 34 as the baseline distribution, the CDF and PDF of the cosine AP exponential (CAP-E) distribution are provided by
where \(\lambda\) is the scale parameter of the exponential distribution. Figure 2 illustrates the PDF plots of the CAP-E distribution, showing its right-skewed and nearly symmetrical forms.
Cosine AP Rayleigh distribution
The cosine AP Rayleigh (CAP-R) distribution is obtained by inserting the CDF of Rayleigh distribution 35 in Eq. (5). Therefore, the CDF of the CAP-R distribution is displayed as
and the CAP-R distribution’s PDF is provided as
Figure 3 illustrates the PDF plots of the CAP-R distribution, which show right-skewed, nearly symmetric, and symmetric shapes.
Cosine AP Lomax distribution
The cosine AP Lomax (CAP-L) distribution is obtained by substituting the CDF of the Lomax distribution 36 into Eq. (5). Hence, the CAP-L distribution’s CDF is given by
and the PDF will be
The CAP-L density is right-skewed, as observed in Fig. 4.
Cosine AP Lindley distribution
By applying the CDF and PDF of the Lindley distribution 37 in Eq. (5) and Eq. (6), the CDF and PDF of the cosine AP Lindley (CAP-Li) distribution are given, respectively, as
and
where \(\theta\) is the shape parameter of Lindley distribution. Figure 5 shows PDF graphs for CAP-Li with its right-skewed and nearly symmetric shapes.
Properties of cosine AP- Weibull distribution
This section focuses on producing and discussing the various features of CAP-W distribution.
Hazard rate and survival functions
The CAP-W distribution’s hazard rate and survival functions are provided by
and
respectively. Figure 6 illustrates several shapes of the CAP-W hazard function for different values of the parameters \(\alpha\), \(\eta\) and \(\theta\), including upside-down bathtub, J-shaped, decreasing, and increasing forms, indicating that the distribution is flexible enough to model different real-world datasets.
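The hazard shapes can be explored numerically with a short sketch. It assumes the survival function is the cosine of the transformed AP-Weibull CDF and the hazard is the ratio of density to survival, which follows from the CAP-G construction; the parameter sets below are hypothetical and are not the ones used in Fig. 6.

```python
import math

def _ap_weibull(w, alpha, eta, theta):
    """Return the AP transform of the Weibull CDF and the Weibull CDF itself."""
    F = 1.0 - math.exp(-eta * w ** theta)
    return (alpha ** F - 1.0) / (alpha - 1.0), F

def cap_w_survival(w, alpha, eta, theta):
    ap, _ = _ap_weibull(w, alpha, eta, theta)
    return math.cos(0.5 * math.pi * ap)

def cap_w_hazard(w, alpha, eta, theta):
    ap, F = _ap_weibull(w, alpha, eta, theta)
    f = theta * eta * w ** (theta - 1.0) * math.exp(-eta * w ** theta)
    scale = 0.5 * math.pi * math.log(alpha) / (alpha - 1.0)
    # pdf / survival: the sine over cosine ratio collapses to a tangent
    return scale * f * alpha ** F * math.tan(0.5 * math.pi * ap)

# Illustrative (alpha, eta, theta) triples, not taken from the paper.
for params in [(0.5, 1.0, 0.8), (3.0, 0.5, 2.0)]:
    values = [round(cap_w_hazard(w, *params), 4) for w in (0.25, 0.5, 1.0, 1.5)]
    print(params, values)
```

Scanning a finer grid of w values for several parameter triples is a simple way to see which combinations produce monotone or non-monotone hazards.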
Expansion for the cosine AP-Weibull density
Inserting the following series expansion of the sine function
into Eq. (11), we get the CAP-W distribution’s PDF by the following:
Then, utilizing the following binomial series expansion
we obtain:
Furthermore, by employing the following series representation:
the CAP-W’s PDF will be written as
where
Quantile function and median
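The CAP-W quantile function is derived from Eq. (10) as follows:
$$\begin{aligned} Q_{CAP-W}(p)=\left[ -\frac{1}{\eta }\log \left( 1-\frac{\log \left( a(p)/\pi \right) }{\log \alpha }\right) \right] ^{\frac{1}{\theta }},~0<p<1, \qquad (18) \end{aligned}$$
where \(a(p)=\pi +2(\alpha -1)\arccos (1-p)\). The median is obtained as
$$\begin{aligned} Q_{CAP-W}(0.5)=\left[ -\frac{1}{\eta }\log \left( 1-\frac{\log \left( b/\pi \right) }{\log \alpha }\right) \right] ^{\frac{1}{\theta }}, \end{aligned}$$
where \(b=\pi +2(\alpha -1)\arccos (1-0.5)\).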
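The quantile function also gives a direct way to draw random samples by inverse transform sampling, which is how the simulation study generates data. The sketch below is a minimal illustration, assuming the CAP-W CDF \(1-\cos [\frac{\pi }{2}(\alpha ^{F(w)}-1)/(\alpha -1)]\) with Weibull baseline \(F(w)=1-\exp (-\eta w^{\theta })\); the parameter values, seed, and sample size are hypothetical.

```python
import math
import random

def cap_w_cdf(w, alpha, eta, theta):
    F = 1.0 - math.exp(-eta * w ** theta)
    return 1.0 - math.cos(0.5 * math.pi * (alpha ** F - 1.0) / (alpha - 1.0))

def cap_w_quantile(p, alpha, eta, theta):
    """Inverse of cap_w_cdf for 0 < p < 1."""
    a = math.pi + 2.0 * (alpha - 1.0) * math.acos(1.0 - p)   # a(p) from the text
    F = math.log(a / math.pi) / math.log(alpha)              # baseline CDF level
    F = min(max(F, 0.0), 1.0 - 1e-15)                        # guard against rounding
    return (-math.log1p(-F) / eta) ** (1.0 / theta)          # Weibull quantile

def cap_w_sample(n, alpha, eta, theta, rng):
    """Inverse transform sampling: push uniform draws through the quantile function."""
    return [cap_w_quantile(rng.random(), alpha, eta, theta) for _ in range(n)]

alpha, eta, theta = 2.0, 0.5, 1.5      # illustrative values only
rng = random.Random(2024)              # fixed seed for reproducibility
sample = cap_w_sample(5000, alpha, eta, theta, rng)
median = cap_w_quantile(0.5, alpha, eta, theta)
share_below = sum(w <= median for w in sample) / len(sample)
print(median, share_below)
```

Roughly half of the simulated values should fall below the theoretical median, which gives a quick check that the sampler and the quantile function agree.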
Skewness and kurtosis
The Moors kurtosis 38 and Galton skewness 39 of the CAP-W distribution can be derived by applying Eq. (18) into the following measures:
Figure 7 shows the behavior of the skewness and kurtosis of the CAP-W model, with the transformation parameter \(\alpha\) varying approximately from 0.05 to 0.40 and the shape parameter \(\theta\) from 1.0 to 4.0. As \(\alpha\) increases, the skewness decreases, with the rate of change modulated by \(\theta\). This indicates that the parameter \(\alpha\) of the cosine-AP family controls the skewness of the distribution. Regarding the kurtosis, small values of \(\alpha\) lead to larger kurtosis (a highly peaked distribution), whereas increasing \(\alpha\) produces a distribution with lower kurtosis, which matches the conclusion drawn from the Galton skewness.
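Both measures depend only on quantiles, so they are straightforward to compute. The sketch below uses the standard definitions, Galton skewness \([Q(3/4)+Q(1/4)-2Q(1/2)]/[Q(3/4)-Q(1/4)]\) and Moors kurtosis \([Q(7/8)-Q(5/8)+Q(3/8)-Q(1/8)]/[Q(6/8)-Q(2/8)]\), together with the CAP-W quantile function under the Weibull baseline \(F(w)=1-\exp (-\eta w^{\theta })\). The values of \(\eta\), \(\theta\), and the \(\alpha\) grid are hypothetical.

```python
import math

def cap_w_quantile(p, alpha, eta, theta):
    """CAP-W quantile function for 0 < p < 1 (alpha > 0, alpha != 1)."""
    a = math.pi + 2.0 * (alpha - 1.0) * math.acos(1.0 - p)
    F = math.log(a / math.pi) / math.log(alpha)
    return (-math.log1p(-F) / eta) ** (1.0 / theta)

def galton_skewness(q):
    """Quartile-based skewness; q is a quantile function of one argument."""
    return (q(0.75) + q(0.25) - 2.0 * q(0.5)) / (q(0.75) - q(0.25))

def moors_kurtosis(q):
    """Octile-based kurtosis; q is a quantile function of one argument."""
    return (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / (q(6 / 8) - q(2 / 8))

eta, theta = 1.0, 2.0  # illustrative values only
for alpha in (0.05, 0.2, 0.4):
    q = lambda p, a=alpha: cap_w_quantile(p, a, eta, theta)
    print(alpha, round(galton_skewness(q), 4), round(moors_kurtosis(q), 4))
```

Because both measures are ratios of quantile differences, they exist even when moments do not and are unaffected by the scale of the data.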
Moments
If W has a CAP-W distribution, then the rth moment of W can be acquired via
Substituting Eq. (16) into Eq. (19), we have
Let \(u=(s_{3}+1)\eta w^{\theta }\), we get
Then, we obtain the formula for the rth moment of CAP-W as
where \(\varphi _{s_{3}}\) is given by Eq. (17).
Table 1 shows that when \(\theta\) increases, the mean and variance of the CAP-W decrease for fixed \(\alpha\) and \(\eta\).
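Moments can also be approximated numerically without the series expansion, using the identity \(E(W^{r})=\int _{0}^{1}Q(p)^{r}dp\). The sketch below applies a midpoint rule to this integral; it is an approximation for illustration, not the closed-form expression of the paper, and the parameter values are hypothetical.

```python
import math

def cap_w_quantile(p, alpha, eta, theta):
    """CAP-W quantile function for 0 < p < 1 (alpha > 0, alpha != 1)."""
    a = math.pi + 2.0 * (alpha - 1.0) * math.acos(1.0 - p)
    F = math.log(a / math.pi) / math.log(alpha)
    return (-math.log1p(-F) / eta) ** (1.0 / theta)

def cap_w_raw_moment(r, alpha, eta, theta, steps=20000):
    """Midpoint-rule approximation of E[W^r] = integral of Q(p)^r over (0, 1)."""
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) / steps
        total += cap_w_quantile(p, alpha, eta, theta) ** r
    return total / steps

alpha, eta = 2.0, 1.0  # illustrative values only
for theta in (1.0, 2.0, 3.0):
    mean = cap_w_raw_moment(1, alpha, eta, theta)
    variance = cap_w_raw_moment(2, alpha, eta, theta) - mean ** 2
    print(theta, round(mean, 4), round(variance, 4))
```

The approximation slightly truncates the upper tail, so a larger `steps` value is advisable for high-order moments or heavy-tailed parameter settings.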
Moment generating function
The moment generating function, \(M_W(t)\), for the CAP-W distribution is indicated by
Setting Eq. (20) into Eq. (21), we get the following:
where \(\varphi _{s_{3}}\) is given by Eq. (17).
Characteristic function
The CAP-W characteristic function, \(\phi _W(t)\), is derived as
Using Eq. (20) in Eq. (22), we get the following:
where \(\varphi _{s_{3}}\) is provided in Eq. (17).
Rényi entropies
The Rényi entropy quantifies the degree of uncertainty or variability in the RV W. It is supplied by
where
By substituting g(w) indicated in Eq. (11) into Eq. (23), we obtain
and \(\psi =\int _{0}^{\infty } {\left( w^{v (\theta -1)}\right) }\exp (-v\eta w^{\theta })~{\left( \alpha ^{-v\exp (-\eta w^{\theta })}\right) } \sin ^{v}\left[ \frac{\pi }{2}~\left( \frac{1-\alpha ^{1-\exp (-\eta w^{\theta })}}{1-\alpha }\right) \right] dw.\)
According to 11, the Taylor series formula of \(T(s)=\sin ^{v}[\frac{\pi }{2}s]\) can be expressed as
where \(b_{k}=\frac{(-1)^{k}T^{(k)}(1)}{k!}\), and \(T^{(k)}(1)\) denotes the \(k^{th}\) derivative of T(.) calculated at the value 1.
Thus, by expanding the \(\sin ^{v}\left[ \frac{\pi }{2}~\left( \frac{1-\alpha ^{1-\exp (-\eta w^{\theta })}}{1-\alpha }\right) \right]\) component, we get
Employing the expansions in Eq. (14) and Eq. (15) yields
where \(\rho =\left( \begin{array}{c} k \\ r \end{array} \right) \left( \begin{array}{c} r \\ s_{2} \end{array}\right) w^{v(\theta -1)}\exp (-(v+s_{3})\eta w^{\theta })\).
Substituting \(g^{v}(w)\) from Eq. (24) into Eq. (23) and computing the integral, the Rényi entropy of the CAP-W distribution is obtained as
and
Order statistics
Let \(W_{1}, W_{2},\ldots , W_{m}\) be a random sample from the CAP-W distribution. The PDF of the kth order statistic \(W_{k:m}\), denoted by \(g_{k:m}(x)\), can be expressed as
Employing the binomial expansion in Eq. (14) in Eq. (25), we get
By putting Eq. (10) and Eq. (11) into Eq. (26), we obtain
Estimation methods
In this section, the parameters of the CAP-W distribution are estimated using four methods: maximum likelihood estimation (MLE), ordinary least squares (LS), weighted least squares (WLS), and Cramér–von Mises (CVM). These estimation procedures are well established and are commonly applied in reliability and lifetime analyses. The MLE is the standard likelihood-based method, while LS and WLS yield regression-type estimates by minimizing squared deviations between the empirical and theoretical distribution functions. The CVM estimator is a minimum distance approach that depends on the integrated squared difference between the empirical and model CDFs.
Maximum likelihood estimation method
Let \(w_1,w_2,\ldots ,w_n\) be a random sample from the CAP-W distribution. Then, the log-likelihood \((\ell )\) for the parameter vector \(\Omega = (\alpha , \eta ,\theta )\) is:
The first partial derivatives of Eq. (27) with respect to \(\eta\), \(\alpha\), and \(\theta\) are as follows:
Eqs. (28)–(30) cannot be solved analytically in closed form. As a result, a numerical optimization technique is applied to estimate the parameters.
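The log-likelihood that such an optimizer would maximize can be written as the sum of log densities. The sketch below assumes the CAP-W density implied by the CAP-G construction with Weibull baseline \(F(w)=1-\exp (-\eta w^{\theta })\). The sample is the first ten waiting times of the first data set in the Applications section, and the coarse grid search is only a stand-in for a proper numerical optimizer; the candidate values are hypothetical.

```python
import math

def cap_w_loglik(params, data):
    """CAP-W log-likelihood; returns -inf outside the parameter space."""
    alpha, eta, theta = params
    if alpha <= 0.0 or alpha == 1.0 or eta <= 0.0 or theta <= 0.0:
        return float("-inf")
    n = len(data)
    total = n * (math.log(0.5 * math.pi) + math.log(math.log(alpha) / (alpha - 1.0))
                 + math.log(theta) + math.log(eta))
    for w in data:
        F = 1.0 - math.exp(-eta * w ** theta)           # Weibull baseline CDF
        ap = (alpha ** F - 1.0) / (alpha - 1.0)         # alpha power transform
        total += ((theta - 1.0) * math.log(w) - eta * w ** theta
                  + F * math.log(alpha) + math.log(math.sin(0.5 * math.pi * ap)))
    return total

# First ten waiting times from the first data set (illustrative subset).
sample = [0.8, 0.8, 1.3, 1.5, 1.8, 1.9, 1.9, 2.1, 2.6, 2.7]

# Coarse grid of hypothetical candidates, standing in for a numerical optimizer.
grid = [(a, e, t) for a in (0.5, 2.0, 5.0) for e in (0.05, 0.2, 0.5) for t in (1.0, 2.0, 3.0)]
best = max(grid, key=lambda p: cap_w_loglik(p, sample))
print(best, cap_w_loglik(best, sample))
```

In practice the same function would be passed (with its sign flipped) to a general-purpose minimizer, with the grid result serving as a starting value.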
Ordinary and weighted least-squares methods
The ordinary least squares (LS) and weighted least squares (WLS) estimation methods were suggested by 40. The LS estimates are obtained by minimizing the following function:
with respect to the unknown parameters. Thus, the LS estimate of the parameter vector \(\Omega\) is calculated by minimizing
The WLS estimators for the CAP-W parameters can be determined by minimizing the following objective function:
or
with respect to \(\Omega\).
Cramér–von Mises method
The Cramér–von Mises (CVM) technique, a minimum distance estimator, was introduced by 41 to estimate unknown parameters by minimizing the following function:
with respect to \(\alpha\), \(\eta\), and \(\theta\).
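The three distance-based criteria can be sketched as objective functions of the ordered sample \(w_{(1)}\le \ldots \le w_{(n)}\). The forms used here are the standard ones: LS compares the fitted CDF with \(i/(n+1)\), WLS adds the weights \((n+1)^{2}(n+2)/[i(n-i+1)]\), and CVM uses \(1/(12n)\) plus squared deviations from \((2i-1)/(2n)\). These are assumed to match the objective functions of this section; the function names, the data subset, and the trial parameter values are hypothetical.

```python
import math

def cap_w_cdf(w, alpha, eta, theta):
    F = 1.0 - math.exp(-eta * w ** theta)
    return 1.0 - math.cos(0.5 * math.pi * (alpha ** F - 1.0) / (alpha - 1.0))

def ls_objective(params, data, cdf):
    """Ordinary least squares distance between fitted CDF and i/(n+1)."""
    x, n = sorted(data), len(data)
    return sum((cdf(x[i - 1], *params) - i / (n + 1)) ** 2 for i in range(1, n + 1))

def wls_objective(params, data, cdf):
    """Weighted version with weights (n+1)^2 (n+2) / (i (n-i+1))."""
    x, n = sorted(data), len(data)
    return sum((n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
               * (cdf(x[i - 1], *params) - i / (n + 1)) ** 2 for i in range(1, n + 1))

def cvm_objective(params, data, cdf):
    """Cramer-von Mises distance: 1/(12n) + sum of squared deviations from (2i-1)/(2n)."""
    x, n = sorted(data), len(data)
    return 1.0 / (12 * n) + sum((cdf(x[i - 1], *params) - (2 * i - 1) / (2 * n)) ** 2
                                for i in range(1, n + 1))

# First ten waiting times from the first data set (illustrative subset).
sample = [0.8, 0.8, 1.3, 1.5, 1.8, 1.9, 1.9, 2.1, 2.6, 2.7]
trial = (2.0, 0.2, 2.0)  # hypothetical (alpha, eta, theta)
for objective in (ls_objective, wls_objective, cvm_objective):
    print(objective.__name__, objective(trial, sample, cap_w_cdf))
```

Each estimator is the parameter vector that minimizes the corresponding objective, so any of these functions can be handed directly to a numerical minimizer.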
Simulation study
In this part, Monte Carlo simulation will be used to assess the performance of the four methods of estimation based on sample size (n). We use the CAP-W quantile function to generate 1000 random samples with \(n =165, 200, 300,500,\) and 700 to investigate the following parameter sets:
For every sample size, parameter estimates were computed, and the mean square error (MSE) was calculated using
where \(\Phi = (\alpha , \eta ,\theta )\). Estimates, absolute biases, and MSE values of \(\alpha\), \(\eta\), and \(\theta\) obtained with the MLE, LS, WLS, and CVM methods are shown in Tables 2, 3, 4 and 5. According to these tables, all the estimation techniques studied behave consistently: as the sample size becomes larger, the MSE and absolute bias decrease in most cases, and the parameter estimates approach the true values. Moreover, Tables 6, 7 and 8 report the coverage probability and interval length for the parameter estimates at a nominal coverage of \(1-\alpha =0.95\). The tables show that, in almost all cases, the coverage probability increases toward the 95% nominal level while the length decreases. Furthermore, Figures 8, 9, 10 and 11 illustrate the MSE of the estimates for different sample sizes. The LS technique appears more efficient in estimating the parameter \(\alpha\), while the MLE is more accurate than the other techniques in estimating \(\eta\) and \(\theta\) of the CAP-W distribution.
Applications
This section examines the flexibility and efficiency of the CAP-W distribution using four real-life data sets in different fields. The datasets have been given below.
First data set
The first data set gives the waiting times (in minutes) before service for one hundred bank customers, as reported by 42. The data are presented below:
0.8, 0.8, 1.3, 1.5, 1.8, 1.9, 1.9, 2.1, 2.6, 2.7, 2.9, 3.1, 3.2, 3.3, 3.5, 3.6, 4.0, 4.1, 4.2, 4.2, 4.3, 4.3, 4.4, 4.4, 4.6, 4.7, 4.7, 4.8, 4.9, 4.9, 5.0, 5.3, 5.5, 5.7, 5.7, 6.1, 6.2, 6.2, 6.2, 6.3, 6.7, 6.9, 7.1, 7.1, 7.1, 7.1, 7.4, 7.6, 7.7, 8.0, 8.2, 8.6, 8.6, 8.6, 8.8, 8.8, 8.9, 8.9, 9.5, 9.6, 9.7, 9.8, 10.7, 10.9, 11.0, 11.0, 11.1, 11.2, 11.2, 11.5, 11.9, 12.4, 12.5, 12.9, 13.0, 13.1, 13.3, 13.6, 13.7, 13.9, 14.1, 15.4, 15.4, 17.3, 17.3, 18.1, 18.2, 18.4, 18.9, 19.0, 19.9, 20.6, 21.3, 21.4, 21.9, 23.0, 27.0, 31.6, 33.1, 38.5.
Second data set
The second dataset shows the active repair times (in hours) for a transceiver used for aerial communication. These data were obtained from 19,43 and are given below:
0.50, 0.60, 0.60, 0.70, 0.70, 0.70, 0.80, 0.80, 1.00, 1.00, 1.00, 1.00, 1.10, 1.30, 1.50, 1.50, 1.50, 1.50, 2.00, 2.00, 2.20, 2.50, 2.70, 3.00, 3.00, 3.30, 4.00, 4.00, 4.50, 4.70, 5.00, 5.40, 5.40, 7.00, 7.50, 8.80, 9.00, 10.20, 22.00, 24.50.
Third data set
The third dataset includes the survival times for a group of patients with head and neck cancer who were treated with both radiation and chemotherapy (RT+CT). The dataset, taken from 44, is as follows: 12.20, 23.56, 23.74, 25.87, 31.98, 37, 41.35, 47.38, 55.46, 58.36, 63.47, 68.46, 78.26, 74.47, 81.43, 84, 92, 94, 110, 112, 119, 127, 130, 133, 140, 146, 155, 159, 173, 179, 194, 195, 209, 249, 281, 319, 339, 432, 469, 519, 633, 725, 817, 1776.
Fourth data set
The dataset illustrates the number of failures in the air conditioning systems of jet aircraft. This data were examined by 45,46. The data are 194, 413, 90, 74, 55, 23, 97, 50, 359, 50, 130, 487, 57, 102, 15, 14, 10, 57, 320, 261, 51, 44, 9, 254, 493, 33, 18, 209, 41, 58, 60, 48, 56, 87, 11, 102, 12, 5, 14, 14, 29, 37, 186, 29, 104, 7, 4, 72, 270, 283, 7, 61, 100, 61, 502, 220, 120, 141, 22, 603, 35, 98, 54, 100, 11, 181, 65, 49, 12, 239, 14, 18, 39, 3, 12, 5, 32, 9, 438, 43, 134, 184, 20, 386, 182, 71, 80, 188, 230, 152, 5, 36, 79, 59, 33, 246, 1, 79, 3, 27, 201, 84, 27, 156, 21, 16, 88, 130, 14, 118, 44, 15, 42, 106, 46, 230, 26, 59, 153, 104, 20, 206, 5, 66, 34, 29, 26, 35, 5, 82, 31, 118, 326, 12, 54, 36, 34, 18, 25, 120, 31, 22, 18, 216, 139, 67, 310, 3, 46, 210, 57, 76, 14, 111, 97, 62, 39, 30, 7, 44, 11, 63, 23, 22, 23, 14, 18, 13, 34, 16, 18, 130, 90, 163, 208, 1, 24, 70, 16, 101, 52, 208, 95, 62, 11, 191, 14, 71.
Table 9 displays a variety of summary statistics for the datasets, which provide an overview of the data, such as sample sizes and measures of central tendency and variability, to illustrate the appropriateness of the CAP-W distribution for these datasets.
The CAP-W distribution is compared with four rival distributions, whose CDFs are as follows:
- Gull AP Weibull (GAPW) distribution in 47:
$$\begin{aligned} F_{GAPW}(x)=\frac{\alpha (1-\exp (-\theta x^{\gamma }))}{\alpha ^{1-\exp (-\theta x^{\gamma })}},~x>0,~~\alpha ,\gamma ,\theta >0. \end{aligned}$$
- AP Weibull (APW) distribution in 4:
$$\begin{aligned} F_{APW}(w) =\left\{ \begin{array}{rcl} \frac{1}{1-\alpha }(1-\alpha ^{1-\exp (-\lambda w^{\beta })}) & \text{ if } & \alpha >0 ,\alpha \ne 1,\\ 1-\exp (-\lambda w^{\beta }) & \text{ if } & \alpha =1, \end{array}\right. \end{aligned}$$
where \(w>0, ~~ \lambda ,\beta >0\).
- New cosine Weibull (NCW) distribution in 48:
$$\begin{aligned} F_{NCW}(w)=1-\cos \left[ \frac{\pi (2-2^{\exp (-\alpha w^{\beta })})}{2}\right] ,~w>0,~~\alpha ,\beta >0. \end{aligned}$$
- Sine exponential (SX) distribution in 10:
$$\begin{aligned} G_{SX}(w)=\cos \left[ \frac{\pi }{2} \exp (-\theta w)\right] , ~w>0,~~\theta >0. \end{aligned}$$
These distributions were chosen because they are widely used for reliability and lifetime data and collectively exhibit a variety of shapes and tail behaviors. They share structural features with the proposed distribution, being built on Weibull and exponential baselines, which permits a fair and informative comparison in terms of flexibility and goodness of fit under varying non-monotonic failure patterns.
The efficiency of the CAP-W distribution is examined using goodness-of-fit (GoF) criteria, including the negative log-likelihood (\(-\hat{\ell }\)), Akaike information criterion (\(A_{1}\)), Bayesian information criterion (\(A_{2}\)), consistent Akaike information criterion (\(A_{3}\)), and the non-parametric Kolmogorov–Smirnov statistic (\(A_{4}\)) with its p-value (\(A_{5}\)).
The MLEs and their standard errors (SEs) for the CAP-W, GAPW, APW, NCW, and SX distributions fitted to the four data sets are provided in Table 10. Based on the outcomes in Tables 11, 12, 13 and 14, the CAP-W distribution has the smallest values of \(-\hat{\ell }\), \(A_{1}\), \(A_{2}\), \(A_{3}\), and \(A_{4}\), with a high \(A_{5}\). This indicates that, for the four real data sets, the CAP-W distribution gives the best fit among the rival distributions. Figures 12, 13, 14 and 15 display the estimated PDF and CDF of the CAP-W distribution and the competing distributions, which support its superiority in fitting the different datasets. Figures 16, 17, 18 and 19 show the P-P plots for the CAP-W distribution and the competing distributions, based on the empirical and theoretical CDFs of the four datasets. The plotted points for the CAP-W distribution fall very close to the diagonal line, indicating that its theoretical CDF matches the empirical CDF more closely than those of the other models. This provides further evidence of the better performance of the proposed model.
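The criteria in these tables can be computed from a maximized log-likelihood \(\hat{\ell }\), the number of parameters k, and the sample size n. The sketch below uses the usual definitions of AIC, BIC, and the Hannan-Quinn criterion, and the standard Kolmogorov–Smirnov distance. For the consistent AIC it assumes Bozdogan's form \(-2\hat{\ell }+k(\log n+1)\), which may differ from the variant used in the tables. The numerical inputs are hypothetical.

```python
import math

def aic(loglik, k):
    return 2.0 * k - 2.0 * loglik

def bic(loglik, k, n):
    return k * math.log(n) - 2.0 * loglik

def caic(loglik, k, n):
    # Assumed form (Bozdogan); other definitions appear in the literature.
    return k * (math.log(n) + 1.0) - 2.0 * loglik

def hqic(loglik, k, n):
    return 2.0 * k * math.log(math.log(n)) - 2.0 * loglik

def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and a fitted CDF of one argument."""
    x, n = sorted(data), len(data)
    return max(max(i / n - cdf(x[i - 1]), cdf(x[i - 1]) - (i - 1) / n)
               for i in range(1, n + 1))

# Hypothetical fit summary: log-likelihood, number of parameters, sample size.
loglik, k, n = -317.0, 3, 100
print(aic(loglik, k), bic(loglik, k, n), caic(loglik, k, n), hqic(loglik, k, n))
print(ks_statistic([0.1, 0.5, 0.9], lambda u: u))
```

Smaller values of each criterion and of the Kolmogorov–Smirnov distance indicate a better fit, which is how the competing models are ranked.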
The log CAP-W regression model
In this section, we present the LCAP-W regression model. Using the transformation \(Y = \log (W)\) in Eq. (11) and re-parameterization \(\theta =\frac{1}{\sigma }\), \(\eta =e^{\frac{-\mu }{\sigma }}\) yields the LCAP-W. Thus, the LCAP-W PDF will be displayed by
where \(\mu \in \mathbb {R}\) is the location parameter, \(\sigma >0\) is the scale parameter, and \(\alpha >0,\alpha \ne 1.\) The associated CDF is given by
The survival function will be supplied by
and hazard function is
Also, the quantile function is given by
The standardized RV, \(z =\frac{y-\mu }{\sigma }\), produces the following PDF:
with the survival function stated as
In many real-world situations, factors such as cholesterol, blood pressure, and weight affect life expectancy. Several regression models are used in survival analysis; one widely used model is the log-location-scale regression, which is applied to estimate univariate survival functions for censored data 34. Log-location-scale regression models based on continuous distributions have been recommended in a number of studies across multiple disciplines, such as 26,49,50,51. Thus, we propose a novel linear location-scale regression model based on the LCAP-W:
where \(\nu =(\nu _{1},\ldots ,\nu _{p})^{T}, \alpha ,\sigma >0\), and \(z_{j}\) is the random error with PDF in Eq. (34). The explanatory vector is \(w^{T}_{j}=(w_{j1},\ldots ,w_{jp})^{T}\). The parameter \(\mu _{j}=\nu w^{T}_{j}\) represents the location of \(y_{j}\), and the location parameter vector \(\mu =(\mu _{1},\ldots ,\mu _{n})^T\) can be stated as a linear model \(\mu =\nu w\), where \(w=(w_{1},\ldots ,w_{n})^T\) is the known model matrix.
Estimation of the LCAP-W regression model
The parameters of the LCAP-W regression model are estimated using the maximum likelihood technique. Let \((y_{1},w_{1}),\ldots ,(y_{n},w_{n})\) be a random sample of n observations from a right-censored lifetime dataset, where \(y_{j} =\min [\log (t_{j}),\log (c_{j})]\), \(t_j\) is the lifetime, and \(c_{j}\) is the censoring time. Then the likelihood function for the parameter vector \(\Omega =(\alpha ,\sigma ,\nu )\) is given by
where \(\tau _j=\left\{ \begin{array}{rcl} 1 & \text{ if } & y_j=\log (t_j) \\ 0& \text{ if } & y_j=\log (c_j)\end{array} \right.\) is the censoring indicator. The log-likelihood function for \(\Omega\) reduces to
where d is the number of uncensored observations and \(z_{j}=(y_{j}-\nu w^{T}_{j})/\sigma\). The MLEs of the parameters are obtained by maximizing the log-likelihood function in Eq. (38). The optim function in R may be employed to compute the MLEs.
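The censored log-likelihood can be sketched as follows. Under the stated reparameterization, \(\eta w^{\theta }=e^{z}\) with \(z=(y-\mu )/\sigma\), so the baseline CDF becomes \(1-\exp (-e^{z})\); uncensored observations contribute the log density and censored ones the log survival function. The six rows below are taken from the leukemia data listed in the application subsection, while the intercept column, the function names, and the trial parameter values are assumptions made for illustration.

```python
import math

def lcapw_log_pdf_sf(y, mu, sigma, alpha):
    """Return (log density, log survival) of the LCAP-W at y."""
    z = (y - mu) / sigma
    ez = math.exp(z)
    F = -math.expm1(-ez)                                   # 1 - exp(-exp(z))
    ap = math.expm1(F * math.log(alpha)) / (alpha - 1.0)   # alpha power transform
    ap = min(ap, 1.0)                                      # guard against rounding above 1
    log_pdf = (math.log(0.5 * math.pi) + math.log(math.log(alpha) / (alpha - 1.0))
               - math.log(sigma) + z - ez + F * math.log(alpha)
               + math.log(math.sin(0.5 * math.pi * ap)))
    log_sf = math.log(math.cos(0.5 * math.pi * ap))
    return log_pdf, log_sf

def lcapw_reg_loglik(nu, sigma, alpha, y, X, status):
    """Right-censored log-likelihood; status 1 = observed lifetime, 0 = censored."""
    total = 0.0
    for y_j, x_j, d_j in zip(y, X, status):
        mu_j = sum(b * x for b, x in zip(nu, x_j))         # linear predictor
        log_pdf, log_sf = lcapw_log_pdf_sf(y_j, mu_j, sigma, alpha)
        total += log_pdf if d_j == 1 else log_sf
    return total

# Six rows of the leukemia data; columns are (intercept, AG, WBC).
y = [math.log(t) for t in (65, 140, 100, 56, 65, 17)]
X = [(1, 1, 2.3), (1, 1, 0.75), (1, 1, 4.3), (1, 0, 4.4), (1, 0, 3), (1, 0, 4)]
status = [1, 0, 1, 1, 1, 1]
nu, sigma, alpha = (3.0, 1.0, -0.01), 1.2, 2.0             # hypothetical trial values
ll = lcapw_reg_loglik(nu, sigma, alpha, y, X, status)
print(ll)
```

Maximizing this function over \((\alpha ,\sigma ,\nu )\) with a general-purpose optimizer plays the same role as the call to optim described above.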
Applications for the LCAP-W regression model
The effectiveness of the LCAP-W regression model is studied using a real dataset and compared with a variety of competing models. The dataset is referred to as the Leukemia Data (LD) and was provided by Lawless in 34. The data are presented below:
y: 65, 140, 100, 134, 16, 106, 121, 4, 39, 121, 56, 26, 22, 1, 1, 5, 65, 56, 65, 17, 7, 16, 22, 3, 4, 2, 3, 8, 4, 3, 30, 4, 43.
AG: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.
WBC: 2.3, 0.75, 4.3, 2.6, 6, 10.5, 10, 17, 5.4, 7, 9.4, 32, 35, 100, 100, 52, 100, 4.4, 3, 4, 1.5, 9, 5.3, 10, 19, 27, 28, 31, 26, 21, 79, 100, 100.
status: 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1.
The study includes the following variables: \(t_{j}\): survival time; \(y_{j}\): log survival time; \(\text{status}_{j}\): censoring indicator (0 = censored, 1 = observed lifetime); \(w_{j1}\): white blood cell characteristics (AG) test (0 = negative, 1 = positive); \(w_{j2}\): white blood cell count (WBC).
Thus, the fitted model may be stated like the following:
The MLEs and their SEs are shown in Table 15. The criteria \(A_{1}\), \(A_{2}\), \(A_{3}\), and the Hannan-Quinn information criterion (\(A_{6}\)) are displayed in Table 16. The LCAP-W regression model is compared with the log cosine Topp–Leone Weibull (LCTLWe) in 26, the log tangent Topp–Leone Weibull (LTTLWe) in 49, the log sine alpha power Weibull (LSAPWe) in 22, and the log Marshall-Olkin odd log-logistic Weibull (LMOOLLWe) in 51. Table 16 shows that, compared with the competing models, the LCAP-W has the lowest values of all criteria, demonstrating that it best fits the leukemia data. The coefficients for the variables AG and WBC are consistently positive and negative, respectively, in each of the fitted models.
Residual analysis
Residual analysis is used to assess the adequacy of the fitted model and to detect outlying observations. This study uses martingale residuals and deviance residuals.
Martingale residual
The martingale residual is described by 52 as follows:
where \(\delta _{j}\) is the censoring indicator, with \(\delta _{j}=0\) if the jth observation is censored and \(\delta _{j}=1\) if it is uncensored, and \(s(y_{j},\hat{\Omega })\) is the survival function of the LCAP-W.
The LCAP-W regression model’s martingale residual is
where \(\hat{z_{j}}=(y_{j}-\hat{\nu }w^{T}_{j})/\hat{\sigma }\). The value of \(r_{M_j}\) ranges from \(-\infty\) to \(+1\) and is skewed.
Deviance residual
The deviance residual is a modification of the martingale residual that reduces skewness and makes the residuals more symmetric around zero. It can be stated as
where \(r_{M_j}\) is given in Eq. (40). The LCAP-W regression model’s deviance residual is
Figure 20 plots the deviance residuals against the observation index for the dataset. All observations lie inside the interval \((-3,3)\), so no observation appears to be an outlier. As a result, the fitted model is well suited to these data.
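Both residuals can be computed directly from the fitted survival probabilities. The sketch below uses the standard forms \(r_{M_j}=\delta _{j}+\log s(y_{j},\hat{\Omega })\) and \(r_{D_j}=\text{sign}(r_{M_j})\{-2[r_{M_j}+\delta _{j}\log (\delta _{j}-r_{M_j})]\}^{1/2}\), which are assumed to coincide with the definitions of this section. The survival values in the example are hypothetical rather than fitted values from the leukemia data.

```python
import math

def martingale_residual(delta, survival):
    """delta: 1 if uncensored, 0 if censored; survival: fitted S(y) in (0, 1]."""
    return delta + math.log(survival)

def deviance_residual(delta, survival):
    r_m = martingale_residual(delta, survival)
    # For uncensored cases, delta - r_m = -log S(y) > 0 whenever S(y) < 1.
    log_term = math.log(delta - r_m) if delta == 1 else 0.0
    inner = max(0.0, -2.0 * (r_m + log_term))      # clamp tiny negative rounding errors
    return math.copysign(1.0, r_m) * math.sqrt(inner)

# Hypothetical fitted survival probabilities with their censoring indicators.
cases = [(1, 0.80), (1, 0.35), (0, 0.50), (1, 0.05)]
for delta, s in cases:
    print(delta, s, round(martingale_residual(delta, s), 4),
          round(deviance_residual(delta, s), 4))
```

Plotting the deviance residuals against the observation index, as in Fig. 20, then only requires evaluating the fitted LCAP-W survival function at each \(y_{j}\).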
Simulation study for the LCAP-W regression model
We perform simulation studies to examine the performance of the MLE in the LCAP-W regression model for a variety of sample sizes (n), parameter values, and censoring percentages. The lifetimes \(w_1,\ldots , w_n\) were generated from the CAP-W distribution defined in Eq. (11), with the reparameterization \(\theta =\frac{1}{\sigma }, \eta =e^{-\frac{\mu }{\sigma }}\), and by assuming \(\mu _j = \nu _0 + \nu _1 w_j\), where the covariate \(w_j\) is generated from the uniform distribution on (0, 1). The censoring times \(c_1,c_2,\ldots , c_n\) are generated from a uniform distribution on \((0,\tau )\), where \(\tau\) was adjusted until censoring percentages of \(0.1, 0.2, \text {and}~ 0.5\) were obtained. The log-lifetimes considered for each fit are computed as \(y_{j}=\text {min}[\log (w_j),\log (c_j)]\). The simulation was repeated \(N=1000\) times with various sample sizes: n = 60, 120, 250, 350, and 500. The values of the parameters were:
- Set I \(\nu _{0}=0.70,\nu _{1}=1.09,\sigma =3.20,\alpha =2.70.\)
- Set II \(\nu _{0}=1.30,\nu _{1}=0.07,\sigma =4.10,\alpha =4.5.\)
For every parameter, the estimate and MSE are computed, and the outcomes are shown in Tables 17, 18 and 19. These tables demonstrate that as the sample size rises, the estimates tend to approach the true parameter values and the MSE of the estimates decreases. The outcomes show that the maximum likelihood approach estimates the parameters of the LCAP-W regression model consistently.
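The data-generation step of this simulation can be sketched as follows. Under the reparameterization, the log-lifetime is \(\mu _{j}+\sigma \log [-\log (1-F)]\), where F is the Weibull CDF level obtained by inverting the CAP transformation at a uniform draw. The parameter values are those of Set I, while the function name, the seed, and the candidate values of \(\tau\) are hypothetical; in the study, \(\tau\) was tuned to reach the target censoring percentages.

```python
import math
import random

def simulate_censored(n, nu0, nu1, sigma, alpha, tau, rng):
    """Return rows (y, x, status) with y = min(log lifetime, log censoring time)."""
    rows = []
    for _ in range(n):
        x = rng.random()                                    # covariate ~ U(0, 1)
        mu = nu0 + nu1 * x
        p = min(max(rng.random(), 1e-12), 1.0 - 1e-12)      # keep p strictly inside (0, 1)
        ap = 2.0 * math.acos(1.0 - p) / math.pi             # undo the cosine step
        F = math.log(1.0 + (alpha - 1.0) * ap) / math.log(alpha)
        F = min(max(F, 1e-15), 1.0 - 1e-15)                 # guard against rounding
        log_t = mu + sigma * math.log(-math.log1p(-F))      # log of the CAP-W lifetime
        log_c = math.log(tau * (1.0 - rng.random()))        # censoring time ~ U(0, tau]
        rows.append((min(log_t, log_c), x, 1 if log_t <= log_c else 0))
    return rows

rng = random.Random(7)                                       # fixed seed for reproducibility
for tau in (50.0, 500.0, 5000.0):                            # hypothetical candidates
    rows = simulate_censored(500, 0.70, 1.09, 3.20, 2.70, tau, rng)
    censored_share = 1.0 - sum(status for _, _, status in rows) / len(rows)
    print(tau, round(censored_share, 3))
```

Repeating the generation for a range of \(\tau\) values and keeping the one whose censored share is closest to the target mirrors the tuning described above.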
Conclusions
In this study, we proposed a new technique for creating a family of distributions with extra flexibility for modeling real-life data in a variety of fields without adding additional parameters. The technique combines two well-known methods: the cosine-G and alpha power transformations. The proposed approach is known as the cosine alpha power-G family. To illustrate the CAP-G family, a specific model called the cosine alpha power-Weibull was created. The density function graphs reveal that the CAP-W distribution can be symmetric, right-skewed, or decreasing, whereas the hazard rate function can be increasing, decreasing, upside-down bathtub-shaped, or J-shaped. Various mathematical features of the CAP-W are obtained, including the quantile function, moments, Rényi entropies, and order statistics. The CAP-W parameters are estimated using four methods (MLE, LS, WLS, and CVM), and Monte Carlo simulations are used to examine the performance of the estimators. The simulation results indicate that the MLE and LS techniques showed superior efficiency and accuracy in estimating the parameters. We analyzed four real-world data applications, in which the CAP-W distribution was the best model against its rivals. Furthermore, we introduced a novel location-scale regression model known as the LCAP-W. We anticipate that the suggested approach and family will find applications in several domains. One direction for future research is the generation of new distribution families using the proposed approach. In addition, Bayesian and classical estimation methods can be compared to identify the best estimation approach.
Data availability
The datasets used and analyzed in this study are available in the Figshare repository, https://doi.org/10.6084/m9.figshare.31073341.
References
Gupta, R. C., Gupta, P. L. & Gupta, R. D. Modeling failure time data by Lehman alternatives. Commun. Stat. Theory Methods 27(4), 887–904 (1998).
Alzaatreh, A., Lee, C. & Famoye, F. A new method for generating families of continuous distributions. Metron 71(1), 63–79 (2013).
Mahdavi, A. & Kundu, D. A new method for generating distributions with an application to exponential distribution. Commun. Stat. Theory Methods 46(13), 6543–6557 (2017).
Nassar, M., Alzaatreh, A., Mead, M. & Abo-Kasem, O. Alpha power Weibull distribution: Properties and applications. Commun. Stat. Theory Methods 46(20), 10236–10252 (2017).
Nasiru, S., Mwita, P. N. & Ngesa, O. Alpha power transformed Frechet distribution. Appl. Math. Inf. Sci. 13(1), 129–141 (2019).
Ibrahim, G. M., Hassan, A. S., Almetwally, E. M. & Almongy, H. M. Parameter estimation of alpha power inverted Topp-Leone distribution with applications. Intell. Autom. Soft Comput. 29(2), 353–371 (2021).
Bleed, S. O., Attwa, R. A., Ali, R. F. M. & Radwan, T. On alpha power transformation generalized Pareto distribution and some properties. J. Appl. Math. 2024(1), 6270350 (2024).
Dugasa, S. J., Goshu, A. T. & Arero, B. G. Alpha power transformation of the Lindley probability distribution. J. Probab. Stat. 2024(1), 9068114 (2024).
Benchiha, S. et al. A new sine family of generalized distributions: Statistical inference with applications. Math. Comput. Appl. 28(4), 83 (2023).
Kumar, D., Singh, U. & Singh, S. K. A new distribution using sine function-its application to bladder cancer patients data. J. Stat. Appl. Probab. 4(3), 417 (2015).
Souza, L. New trigonometric classes of probabilistic distributions. Ph.D. Thesis, Universidade Federal Rural de Pernambuco, Recife, Brazil (2015).
Souza, L. et al. Tan-G class of trigonometric distributions and its applications. Cubo 23(1), 1–20 (2021).
Souza, L. et al. Sec-G class of distributions: Properties and applications. Symmetry 14(2), 299 (2022).
Genç, M. & Özbilen, O. Sine unit exponentiated half-logistic distribution: Theory, estimation, and applications in reliability modeling. Mathematics 13(11), 1871 (2025).
Heydari, T., Zare, K., Shokri, S., Khodadadi, Z. & Almaspoor, Z. A new sine-based probabilistic approach: Theory and Monte Carlo simulation with reliability application. J. Math. 2024(1), 9593193 (2024).
Wang, Y., Albalawi, O., Alshanbari, H. M. & Alsubaie, H. H. A modified cosine-based probability distribution: Its mathematical features with statistical modeling in sports and reliability prospects. Alex. Eng. J. 109, 322–333 (2024).
Hussam, E., Sapkota, L. P. & Gemeay, A. M. Tangent exponential-G family of distributions with applications in medical and engineering. Alex. Eng. J. 105, 181–203 (2024).
Alsolmi, M. M. A new logarithmic Tangent-U family of distributions with reliability analysis in engineering data. Comput. J. Math. Stat. Sci. 4(1), 258–282 (2025).
Ahmad, A. et al. Deriving the new cotangent Fréchet distribution with real data analysis. Alex. Eng. J. 105, 12–24 (2024).
Abonongo, J. Properties and applications of the Tan Weibull loss distribution. Kuwait J. Sci. 52(1), 100304 (2025).
Rahman, H. Exponentiated Arctan-X family of distribution: Properties, simulation and applications to insurance data. Thail. Stat. 23(1), 199–216 (2025).
Alghamdi, A. S., ALoufi, S. F. & Baharith, L. A. The sine alpha power-G family of distributions: Characterizations, regression modeling, and applications. Symmetry 17(3), 468 (2025).
Souza, L. et al. General properties for the Cos-G class of distributions with applications. Eurasian Bull. Math. 2(2), 63–79 (2019).
Ahmad, A. et al. Introducing novel arc cosine-class of distribution with theory and data evaluation related to coronavirus. Sci. Rep. 15(1), 13069 (2025).
Mahmood, Z. et al. An extended cosine generalized family of distributions for reliability modeling: Characteristics and applications with simulation study. Math. Probl. Eng. 2022, 3634698 (2022).
Nanga, S., Nasiru, S. & Dioggban, J. Cosine Topp-Leone family of distributions: Properties and regression. Res. Math. 10(1), 2208935 (2023).
Kumar, P. et al. A new class of cosine trigonometric lifetime distribution with applications. Alex. Eng. J. 106, 664–674 (2024).
Bashiru, S. O. et al. A hybrid cosine inverse Lomax-G family of distributions with applications in medical and engineering data. Niger. J. Technol. Dev. 22, 261–278 (2025).
Zamanah, E. & Nasiru, S. Cosine Weibull family of distributions: Properties, simulation, and applications to medical data. Int. J. Math. Math. Sci. 2025(1), 3059057 (2025).
Jin, N., Wang, Y., Cheng, S. & He, Y. A new probability distribution with properties and statistical analysis of the human resource and radiation data. J. Radiat. Res. Appl. Sci. 18(4), 101927 (2025).
Kumar, P., Sapkota, L. & Kumar, V. A new class of COS-G family of distributions with applications. Reliab. Theory Appl. 20(1), 105–123 (2025).
Ibrahim, A., Isa, A., Bashiru, S. & Chinedu, A. Cosine Kumaraswamy family of distributions: Properties and applications to real-world datasets. Sigma 43(6), 2185–2196 (2025).
Murthy, D. N. P. Weibull models (Wiley, 2004).
Lawless, J. F. Statistical models and methods for lifetime data (Wiley, 2011).
Shen, Z. et al. A new generalized Rayleigh distribution with analysis to big data of an online community. Alex. Eng. J. 61(12), 11523–11535 (2022).
Lomax, K. S. Business failures: Another example of the analysis of failure data. J. Am. Stat. Assoc. 49(268), 847–852 (1954).
Lindley, D. V. Fiducial distributions and Bayes’ theorem. J. R. Stat. Soc. Ser. B (Methodol.) 20, 102–107 (1958).
Moors, J. J. A. A quantile alternative for kurtosis. J. R. Stat. Soc. Ser. D (The Statistician) 37(1), 25–32 (1988).
Galton, F. Inquiries into Human Faculty and Its Development (Macmillan, 1883).
Swain, J. J., Venkatraman, S. & Wilson, J. R. Least-squares estimation of distribution functions in Johnson’s translation system. J. Stat. Comput. Simul. 29(4), 271–297 (1988).
Choi, K. & Bulgren, W. G. An estimation procedure for mixtures of distributions. J. R. Stat. Soc. Ser. B (Methodol.) 30(3), 444–460 (1968).
Ghitany, M. E., Atieh, B. & Nadarajah, S. Lindley distribution and its application. Math. Comput. Simul. 78(4), 493–506 (2008).
Jorgensen, B. Statistical properties of the generalized inverse Gaussian distribution. (Springer, 2012).
Efron, B. Logistic regression, survival analysis, and the Kaplan-Meier curve. J. Am. Stat. Assoc. 83(402), 414–425 (1988).
Cordeiro, G. & Lemonte, A. The \(\beta\)-Birnbaum-Saunders distribution: An improved distribution for fatigue life modeling. Comput. Stat. Data Anal. 55(3), 1445–1461 (2011).
Ali, M., Ali, I., Yousof, H. & Ahmed, M. G families of probability distributions: theory and practices. (CRC Press, 2023).
Ijaz, M., Asim, S. M., Alamgir, Farooq, M., Khan, S. A. & Manzoor, S. A Gull Alpha Power Weibull distribution with applications to real and simulated data. PLoS ONE 15(6), e0233080 (2020).
Ahmad, A., Jallal, M. & Mubarak, S. A. New cosine-generator with an example of Weibull distribution: Simulation and application related to banking sector. Reliab. Theory Appl. 18(1(72)), 133–145 (2023).
Nanga, S., Nasiru, S. & Dioggban, J. Tangent Topp-Leone family of distributions. Sci. Afr. 17, e01363 (2022).
Baharith, L. A., Al-Beladi, K. M. & Klakattawi, H. S. The odds exponential-Pareto IV distribution: Regression model and application. Entropy 22(5), 497 (2020).
Cordeiro, G. M., Tahir, M. H., Vasconcelos, J. C., Ortega, E. M. & Hussain, M. A. A new extended log-Weibull regression: Simulations and applications. Hacet. J. Math. Stat. 50(3), 855–871 (2021).
Barlow, W. E. & Prentice, R. L. Residuals for relative risk regression. Biometrika 75(1), 65–74 (1988).
Funding
The authors declare that they received no financial assistance related to the research of this study.
Author information
Contributions
S.F.A.: Methodology, Formal analysis, Software, Writing-original draft preparation, Writing-review and editing. A.S.A.: Investigation, Formal analysis, Writing-review and editing. Both authors reviewed and approved the final manuscript for publication.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Alghamdi, A.S., ALoufi, S.F. A new family of alpha power-G using cosine function with applications and regression modeling. Sci Rep 16, 6617 (2026). https://doi.org/10.1038/s41598-026-36324-5
DOI: https://doi.org/10.1038/s41598-026-36324-5