Abstract
A new one-parameter discrete distribution, the Poisson Haq (PH) distribution, is proposed by mixing the Poisson distribution with an independently distributed Haq random variable. The model extends the Poisson distribution and is suited to over-dispersed count datasets. Various useful statistical properties of the PH distribution are derived and discussed. The failure rate of the proposed distribution is “increasing” and “upside bathtub” shaped. The model parameter is estimated by the method of moments and by maximum likelihood. A parametric regression model for count datasets is also developed from the proposed distribution. A simulation study is conducted to assess the performance and behavior of the proposed estimators. The study shows that the new count model adequately describes three medical datasets: the number of patients infected with the Nipah virus, the number of mammalian cytogenetic dosimetry lesions, and the length of hospital stay. The model parameter is additionally estimated using a Bayesian approach with a gamma prior. Compared with widely used alternatives such as the Poisson (AIC = 145.16, BIC = 147.19), Poisson moment exponential (AIC = 137.53, BIC = 139.56), and Poisson-XLindley (AIC = 135.86, BIC = 137.88) distributions, among others, the PH model fits better, as shown by its lower AIC (135.78) and BIC (137.81) values for the first dataset; similar results hold for the second dataset. Finally, the PH regression model is applied to the length of hospital stay dataset to validate its fit.
Introduction
The complexity of data modeling has risen significantly with the growth of data collection across fields such as engineering, medicine, ecology, epidemiology, and renewable energy1,2,3,4,5,6,7. The Poisson distribution is the principal statistical tool for analyzing count data. Its defining property is equidispersion: the variance equals the mean. In practice, the Poisson distribution is often unsuitable because real count data are markedly under- or overdispersed. Overdispersion in particular is frequently seen in count data and presents serious difficulties for statistical modeling.
Mixed Poisson models are adaptable tools for analyzing count data that exhibit heterogeneity and overdispersion. Over the years, researchers have introduced many such models by mixing the Poisson distribution with various continuous distributions. Some examples follow. The generalized Poisson-Lindley distribution8 has been applied in reliability and biological studies. The Poisson Amarendra distribution9 arises by compounding the Poisson and Amarendra distributions and has been used to model ecological and insurance claims datasets. The Poisson Garima distribution10 effectively captures overdispersed count data and has been applied in health and social sciences. The Poisson Shanker distribution, proposed in ref. 11, has been shown to perform well in actuarial and demographic studies. Ref. 12 introduced the Poisson pseudo-Lindley distribution and explored several of its statistical properties, including descriptive measures, the quantile function, and the Lorenz curve; the parameters were estimated by maximum likelihood estimation (MLE), and the applicability of the model was demonstrated using two real-world datasets. The Bernoulli Poisson moment exponential distribution13 was studied along with its generating functions and mathematical characteristics; parameter estimation was conducted by MLE and its applicability was established using three datasets. The Poisson quasi-Lindley distribution was put forward in ref. 14, which comprehensively studied its reliability properties, conducted an extensive simulation study, and estimated the parameters by MLE. Ref. 15 introduced the one-parameter Poisson Agu-Eghwerido distribution and derived its properties, including factorial moments, generating functions, and entropies; the parameter was estimated using both classical and Bayesian approaches. Ref. 16 proposed the discrete Poisson Aradhana distribution for overdispersed datasets, derived various statistical measures to explore the behavior of the model, and estimated the parameter by maximum likelihood alongside a comprehensive simulation analysis.
Ref. 17 proposed the Poisson moment exponential distribution and applied it to four datasets to demonstrate its practicality; the parameter was estimated using seven different estimation approaches. Ref. 18 investigated the Poisson XLindley distribution for right-skewed, leptokurtic, and overdispersed datasets, explored numerous reliability characteristics, and estimated the parameter using six different estimation techniques. The Poisson Mirra distribution, its regression model, and its first-order integer-valued autoregressive process (INAR-1) were introduced in ref. 19; the INAR-1 model parameters were estimated using the Yule-Walker, conditional maximum likelihood, and conditional least squares methods. The Poisson new XLindley distribution was introduced20, and its fundamental mathematical and statistical properties were studied explicitly. Ref. 21 developed a two-parameter discrete Poisson mixing distribution and derived its key properties; a new count regression model was proposed, and the applicability of the model was demonstrated using asymmetric datasets. Ref. 22 introduced the Poisson entropy-based weighted exponential distribution for modeling right-skewed data with heavy tails; parameters were estimated using MLE and Bayesian methods, and applicability was demonstrated with three real-world datasets. Ref. 23 developed the Poisson quasi-Shanker distribution, derived its key properties, estimated its parameters, and supported its applicability through simulation and real datasets. Ref. 24 proposed the Poisson quasi-XLindley distribution, a two-parameter discrete model, analyzed several of its key statistical properties, applied it to two real datasets, and integrated it into a count regression framework.
Further examples of discrete models include the discrete Poisson-Lindley distribution25, the new geometric distribution26, the discrete extended odd Weibull exponential distribution27, the discrete half-logistic distribution28, the discrete exponentiated moment exponential distribution29, the Poisson Komal distribution30, and the Poisson Xrama distribution31.
Each of these compound models offers specific benefits and has limitations in use. Rapid technological development produces large volumes of complex, high-dimensional data across healthcare, finance, social science, and engineering. Many existing models struggle to reflect intricate statistical patterns such as extreme over-dispersion, zero inflation, and non-normal distributional forms. Emerging datasets therefore call for statistical models that are interpretable, practical, and computationally efficient. This study introduces a novel Poisson compound model based on the Haq distribution, designed to better capture the variation present in complex datasets. The Haq distribution is a flexible one-parameter probability model for data exhibiting overdispersion and skewness, with useful reliability characteristics. Because it is derived as a mixture of the exponential and Xgamma distributions, it can capture variation in real-world data more effectively than many existing models. The Haq distribution offers a heavier right tail and greater flexibility in describing over-dispersion than standard distributions such as the Lindley and moment-exponential distributions. Using the Haq distribution as the mixing distribution in the Poisson framework gives better control over tail behavior, extreme values, and skewness, in line with real-world count data patterns. Its mathematical structure also allows the pmf, moments, and other properties to be obtained by straightforward integration.
The Haq distribution was originally presented in ref. 32. It is obtained by mixing the Xgamma density \(f_{1}(x)\) and the exponential density \(f_{2}(x)\), both with parameter \(\theta\), using mixing proportions \(p_{1}=\frac{1}{1+\theta}\) and \(p_{2}=\frac{\theta}{1+\theta}\). The probability density function of the Haq distribution is given as
\[f(x;\theta)=\frac{\theta^{2}}{(1+\theta)^{2}}\left(2+\theta+\frac{\theta}{2}x^{2}\right)e^{-\theta x},\quad x>0,\ \theta>0.\]
As stated earlier, the current work introduces the Poisson-Haq distribution, a novel mixed Poisson distribution formed by combining the Poisson and Haq distributions. The specific aims of the study are:
- To develop the new Poisson-Haq distribution and derive its key mathematical properties, including the probability mass function, moments, and other essential characteristics.
- To estimate the parameter of the proposed distribution using the maximum likelihood and method of moments techniques, and to conduct a comprehensive simulation study to evaluate the performance and reliability of these estimators.
- To further estimate the parameter using Bayesian estimation methods, providing an alternative approach to inference.
- To demonstrate the practical applicability and adequacy of the Poisson-Haq distribution by fitting it to two medical datasets, thereby illustrating its utility in real-world scenarios.
The structure of the study is as follows. Section “Poisson Haq distribution” presents the derivation of the Poisson Haq distribution, along with plots of its probability mass function and hazard rate function. Section “Mathematical characteristics of PH distribution” derives the key theoretical properties. Section “Parameter estimation” details statistical inference for the parameter of the new distribution using both the method of moments and maximum likelihood estimation. A new count regression model is presented in Section “Poisson Haq regression model”. Section “Data applications” presents the application of the new model to medical datasets. Section “Bayesian analysis” covers an analysis based on the Bayesian approach, and Section “Conclusion” contains the concluding remarks.
Poisson Haq distribution
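A random variable X is said to follow a PH distribution if it admits the following stochastic representation:
\[X\mid\lambda\sim\text{Poisson}(\lambda),\qquad \lambda\mid\theta\sim\text{Haq}(\theta),\quad \lambda>0,\ \theta>0.\]
Using Eq. (2) and the probability mass function of the Poisson distribution, the PH distribution is obtained as follows:
\[P(X=x)=\int_{0}^{\infty}\frac{e^{-\lambda}\lambda^{x}}{x!}\,\frac{\theta^{2}}{(1+\theta)^{2}}\left(2+\theta+\frac{\theta}{2}\lambda^{2}\right)e^{-\theta\lambda}\,d\lambda.\]
Applying the standard gamma integral \(\int_{0}^{\infty}x^{n-1}e^{-\alpha x}\,dx=\frac{\Gamma(n)}{\alpha^{n}}\) term by term gives
\[P(X=x)=\frac{\theta^{2}\left[2(2+\theta)(1+\theta)^{2}+\theta(x+1)(x+2)\right]}{2(1+\theta)^{x+5}},\quad x=0,1,2,\dots\]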
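The derivation above can be checked numerically. The following is a minimal sketch, assuming the Haq density implied by the stated mixing proportions, \(f(\lambda;\theta)=\frac{\theta^{2}}{(1+\theta)^{2}}\left(2+\theta+\frac{\theta}{2}\lambda^{2}\right)e^{-\theta\lambda}\), and the closed-form pmf that follows from the gamma integral. The function names `ph_pmf` and `ph_pmf_by_quadrature` are hypothetical, and the value of `theta` is illustrative only.

```python
import math


def ph_pmf(x: int, theta: float) -> float:
    """P(X = x) under the reconstructed Poisson-Haq pmf (an assumption, see lead-in)."""
    if x < 0 or theta <= 0:
        raise ValueError("need x >= 0 and theta > 0")
    bracket = 2.0 * (2.0 + theta) * (1.0 + theta) ** 2 + theta * (x + 1) * (x + 2)
    # Log scale keeps (1 + theta)**(x + 5) from overflowing for large x.
    log_p = (2.0 * math.log(theta) + math.log(bracket)
             - math.log(2.0) - (x + 5) * math.log1p(theta))
    return math.exp(log_p)


def ph_pmf_by_quadrature(x: int, theta: float,
                         upper: float = 40.0, steps: int = 40000) -> float:
    """Midpoint-rule integral of Poisson(x | lam) * Haq(lam | theta) over (0, upper)."""
    h = upper / steps
    c = theta ** 2 / (1.0 + theta) ** 2
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        poisson = math.exp(-lam) * lam ** x / math.factorial(x)
        haq = c * (2.0 + theta + 0.5 * theta * lam ** 2) * math.exp(-theta * lam)
        total += poisson * haq
    return total * h


if __name__ == "__main__":
    theta = 1.5  # illustrative value only
    print(sum(ph_pmf(x, theta) for x in range(400)))        # should be close to 1
    print(ph_pmf(2, theta), ph_pmf_by_quadrature(2, theta))  # should agree closely
```

The closed form is evaluated on the log scale so that large counts do not overflow; the quadrature routine is only an independent check of the mixing integral and is not needed in practice.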
The behavior of the pmf at the lower and upper limits is described by
.
and
.
Figure 1 shows the pmf of the PH distribution for various parameter values. The distribution is unimodal and skewed to the right.
Equation (3) provides the expression for the cumulative distribution function (cdf) of the PH distribution.
Using Eq. (3), the survival function is obtained as
The hazard function, also known as the failure rate, is defined as the ratio of the pmf to the survival function. The failure rate shows how the risk of an event changes over time. An increasing failure rate means events become more likely as time passes, as with machines wearing out. A decreasing failure rate means the risk is highest early on, as when products tend to fail soon after production. The hazard rate of the PH distribution is given by
.
The reverse hazard rate (also called the past failure rate) is a reliability measure of the probability of failure in a small interval just before time t, given that failure has occurred by time t. For a discrete variable, it is the conditional probability that the event occurs exactly at x given that it has occurred by x, which is useful for early detection in areas such as equipment monitoring or healthcare. Mathematically, it is defined as the ratio of the pmf to the distribution function and is given as
.
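The reliability functions defined above can be evaluated by direct summation of the pmf. The sketch below assumes the reconstructed pmf \(P(X=x)=\frac{\theta^{2}[2(2+\theta)(1+\theta)^{2}+\theta(x+1)(x+2)]}{2(1+\theta)^{x+5}}\). The source does not state whether the survival function is \(P(X>x)\) or \(P(X\ge x)\), so the hypothetical `inclusive` flag supports both conventions; all function names are illustrative.

```python
import math


def ph_pmf(x: int, theta: float) -> float:
    """Reconstructed Poisson-Haq pmf (an assumption, see lead-in)."""
    bracket = 2.0 * (2.0 + theta) * (1.0 + theta) ** 2 + theta * (x + 1) * (x + 2)
    return math.exp(2.0 * math.log(theta) + math.log(bracket)
                    - math.log(2.0) - (x + 5) * math.log1p(theta))


def ph_cdf(x: int, theta: float) -> float:
    """F(x) = P(X <= x); an empty sum (x < 0) gives 0."""
    return sum(ph_pmf(k, theta) for k in range(x + 1))


def ph_survival(x: int, theta: float, inclusive: bool = False) -> float:
    """P(X > x) by default, or P(X >= x) when inclusive=True."""
    upto = x - 1 if inclusive else x
    return 1.0 - ph_cdf(upto, theta)


def ph_hazard(x: int, theta: float, inclusive: bool = False) -> float:
    """Ratio of the pmf to the survival function under the chosen convention."""
    return ph_pmf(x, theta) / ph_survival(x, theta, inclusive)


def ph_reverse_hazard(x: int, theta: float) -> float:
    """Ratio of the pmf to the distribution function."""
    return ph_pmf(x, theta) / ph_cdf(x, theta)


if __name__ == "__main__":
    for theta in (0.5, 1.5, 5.0):  # illustrative values only
        row = [round(ph_hazard(x, theta, inclusive=True), 4) for x in range(6)]
        print(theta, row)
```

Computing the survival function as one minus a partial sum loses precision far in the tail; for plotting the first few dozen support points it is adequate.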
Figure 2 shows the hazard function of the PH distribution for different parameter values. The shape of \(h(x)\) changes markedly as \(\theta\) increases. For small \(\theta\), the hazard rate is increasing. For moderate \(\theta\), the curve rises more steeply. For larger \(\theta\), the shape shifts to a bathtub form: the hazard rate first dips and then rises again.
Mathematical characteristics of PH distribution
In this section, we derive several essential mathematical properties of the Poisson Haq distribution and examine their behavior.
Moments
The rth factorial moment of the random variable X can be obtained as
.
Letting \(y=x-r\), we obtain the following.
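As a hedged cross-check of the factorial moments, one can condition on the Poisson rate instead of substituting \(y=x-r\). For a Poisson variable, \(E[X(X-1)\cdots(X-r+1)\mid\lambda]=\lambda^{r}\), so the rth factorial moment of X equals the rth raw moment of the mixing distribution. The expression below assumes the Haq density implied by the stated mixing proportions, \(f(\lambda;\theta)=\frac{\theta^{2}}{(1+\theta)^{2}}\left(2+\theta+\frac{\theta}{2}\lambda^{2}\right)e^{-\theta\lambda}\); it is a reconstruction and may differ in form from the authors' expression.

```latex
\mu'_{(r)} = E\left[X(X-1)\cdots(X-r+1)\right] = E\left[\lambda^{r}\right]
= \frac{\theta^{2}}{(1+\theta)^{2}} \int_{0}^{\infty} \lambda^{r}
  \left(2+\theta+\frac{\theta}{2}\lambda^{2}\right) e^{-\theta\lambda}\,d\lambda
= \frac{r!\,\left[2\theta(2+\theta)+(r+1)(r+2)\right]}{2\,\theta^{r}(1+\theta)^{2}},
\qquad
\mu'_{(1)} = \frac{\theta^{2}+2\theta+3}{\theta(1+\theta)^{2}} .
```

Setting \(r=0\) returns 1, as it must for a proper distribution, and \(r=1\) gives the mean.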
Moment generating function
The moment-generating function can be derived as
.
Using the geometric series formula\(\:\:\sum\:_{x=0}^{\infty\:}a{r}^{x}=\frac{a}{1-r}\) and after simplification, we obtain the following result.
After simplification,
Similarly, the characteristic function (CF) and the probability-generating function of the PH distribution are derived using the same approach as the moment-generating function and are given below, respectively.
and
.
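For orientation, the generating functions of a mixed Poisson variable can also be reached by conditioning on the rate: \(E[s^{X}\mid\lambda]=e^{\lambda(s-1)}\). The sketch below assumes the Haq density implied by the stated mixing proportions and is a reconstruction, not necessarily the authors' exact form.

```latex
G_X(s) = E\left[e^{\lambda(s-1)}\right]
= \frac{\theta^{2}}{(1+\theta)^{2}}
  \left[\frac{2+\theta}{1+\theta-s}+\frac{\theta}{(1+\theta-s)^{3}}\right],
  \quad s < 1+\theta,
\qquad
M_X(t) = G_X\left(e^{t}\right), \qquad \phi_X(t) = G_X\left(e^{it}\right).
```

Two quick checks support this form: \(G_X(1)=1\), and \(G_X'(1)=\frac{\theta^{2}+2\theta+3}{\theta(1+\theta)^{2}}\), which is the mean of the distribution.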
The first four moments about the origin are
.
Variance \(\:\left(Var\right)\) and Index of dispersion \(\:\left(ID\right)\) of PH distribution are given by
.
and
.
The coefficient of skewness (CS) and the coefficient of kurtosis (CK) can be determined using the following formulas.
and
.
- Table 1 shows that as θ increases, the mean decreases rapidly while the variance decreases more slowly, so the Index of Dispersion (ID) remains above 1. This persistent overdispersion reflects the typical behavior of real-world count data involving rare but extreme events, such as insurance claims, equipment failures, or hospital admissions.
- At the same time, both the coefficient of skewness (CS) and the coefficient of kurtosis (CK) increase significantly with \(\theta\). The increase in skewness indicates growing asymmetry with a longer right tail, meaning that large counts become increasingly probable relative to the mean. The sharp increase in kurtosis signifies heavier tails and a higher peak, characteristic of datasets with a preponderance of small counts punctuated by occasional large outliers.
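The behavior of the mean, variance, and ID described above can be reproduced with a short calculation. The sketch below assumes the reconstructed factorial moment \(\mu'_{(r)}=\frac{r!\,[2\theta(2+\theta)+(r+1)(r+2)]}{2\theta^{r}(1+\theta)^{2}}\), from which \(Var(X)=\mu'_{(2)}+\mu'_{(1)}-\mu'^{2}_{(1)}\). The function names are hypothetical and the parameter values are illustrative; the printed numbers are not taken from Table 1.

```python
import math


def ph_factorial_moment(r: int, theta: float) -> float:
    """r-th factorial moment under the reconstructed formula (an assumption)."""
    numerator = math.factorial(r) * (2.0 * theta * (2.0 + theta) + (r + 1) * (r + 2))
    return numerator / (2.0 * theta ** r * (1.0 + theta) ** 2)


def ph_summary(theta: float) -> dict:
    """Mean, variance, and index of dispersion (variance / mean)."""
    m1 = ph_factorial_moment(1, theta)
    m2 = ph_factorial_moment(2, theta)
    variance = m2 + m1 - m1 ** 2
    return {"mean": m1, "variance": variance, "id": variance / m1}


if __name__ == "__main__":
    for theta in (0.5, 1.0, 2.0, 5.0):  # illustrative values only
        s = ph_summary(theta)
        print(f"theta={theta}: mean={s['mean']:.4f}, "
              f"var={s['variance']:.4f}, ID={s['id']:.4f}")
```

Because the ID of any mixed Poisson variable equals \(1+Var(\lambda)/E(\lambda)\), it exceeds 1 for every \(\theta>0\), which is the overdispersion property the text relies on.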
Parameter Estimation
This section covers parameter estimation using two widely recognized approaches: the method of moments and maximum likelihood estimation. Both approaches are described in detail, providing insight into their usefulness for estimating the parameter of the PH distribution. Furthermore, a detailed simulation study is carried out to assess the accuracy of these estimates under different scenarios. This analysis aids in understanding the practical performance of the estimation techniques in real-world applications.
Maximum likelihood estimator (MLE)
The log-likelihood function for the PH distribution can be written as
.
To maximize the log-likelihood above, we differentiate it with respect to θ and obtain
.
Eq. (10) has no closed-form solution in θ, so the ML estimator is obtained by solving it numerically.
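The numerical step can be sketched as follows. Under the reconstructed pmf, the log-likelihood for a sample \(x_{1},\dots,x_{n}\) is \(\ell(\theta)=2n\log\theta-n\log 2-\left(\sum x_{i}+5n\right)\log(1+\theta)+\sum\log\left[2(2+\theta)(1+\theta)^{2}+\theta(x_{i}+1)(x_{i}+2)\right]\); this form is an assumption derived from the stated mixture. Instead of solving the score equation, the sketch maximizes \(\ell\) directly with a golden-section search over \(\log\theta\), which is a swapped-in technique and not necessarily what the authors used. The data are hypothetical.

```python
import math


def ph_loglik(theta: float, data: list) -> float:
    """Log-likelihood under the reconstructed Poisson-Haq pmf (an assumption)."""
    n = len(data)
    total = (2.0 * n * math.log(theta) - n * math.log(2.0)
             - (sum(data) + 5 * n) * math.log1p(theta))
    for x in data:
        total += math.log(2.0 * (2.0 + theta) * (1.0 + theta) ** 2
                          + theta * (x + 1) * (x + 2))
    return total


def ph_mle(data: list, lo: float = 1e-4, hi: float = 1e4, tol: float = 1e-10) -> float:
    """Golden-section search for the maximizer of the log-likelihood over log(theta)."""
    if sum(data) == 0:
        raise ValueError("all-zero sample: the likelihood has no interior maximum")

    def f(u: float) -> float:
        return ph_loglik(math.exp(u), data)

    a, b = math.log(lo), math.log(hi)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:  # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = f(c)
        else:        # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = f(d)
    return math.exp(0.5 * (a + b))


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper's datasets.
    counts = [0] * 30 + [1] * 15 + [2] * 7 + [3] * 2 + [5]
    print(ph_mle(counts))
```

Searching over \(\log\theta\) keeps the estimate positive without explicit constraints. The search assumes the log-likelihood has a single interior maximum, which is plausible here but not proved in this sketch.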
Method of moments estimator (MME)
The MME can be obtained by setting the population mean equal to the sample mean. Thus, the MME of \(\:\theta\:\), denoted as \(\:\widehat{\theta\:}\), is derived by solving the following equation.
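A minimal sketch of the moment estimator follows, assuming the reconstructed mean \(E(X)=\frac{\theta^{2}+2\theta+3}{\theta(1+\theta)^{2}}\), which is strictly decreasing in \(\theta\), so the moment equation \(\bar{x}=E(X)\) has a unique root. The sketch finds that root by bisection on a logarithmic scale; `ph_mean` and `ph_mme` are hypothetical names and the sample is illustrative.

```python
import math


def ph_mean(theta: float) -> float:
    """Mean of the Poisson-Haq distribution under the reconstructed formula."""
    return (theta ** 2 + 2.0 * theta + 3.0) / (theta * (1.0 + theta) ** 2)


def ph_mme(data: list) -> float:
    """Solve ph_mean(theta) = sample mean by bisection on a log scale."""
    xbar = sum(data) / len(data)
    if xbar <= 0:
        raise ValueError("the sample mean must be positive")
    lo, hi = 1e-8, 1e8
    for _ in range(200):
        mid = math.sqrt(lo * hi)   # geometric midpoint
        if ph_mean(mid) > xbar:    # mean too large, so theta is too small
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    sample = [0, 1, 0, 2, 1, 0, 0, 3]  # hypothetical counts
    theta_hat = ph_mme(sample)
    print(theta_hat, ph_mean(theta_hat))
```

Bisection is used here because it needs only the monotonicity of the mean; the cubic moment equation could equally be solved in closed form.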
Simulation
We present a simulation study to assess the performance of both estimation methods discussed in the previous subsections. To investigate how the estimators behave for small and large samples, we consider sample sizes n = 25, 50, 100, 200, and 300. The simulation is repeated N = 10,000 times to ensure robust results. From the resulting estimates, we calculate four performance metrics: the average estimate (AE), absolute bias (AB), mean relative error (MRE), and mean square error (MSE).
The calculated values of these performance measures are summarized in Table 2. Additionally, heatmaps of the AB, MRE, and MSE measures are presented in Fig. 3.
Table 2 and Fig. 3 present the simulation findings for the MLE and MME methods for different sample sizes and various choices of the parameter \(\theta\). The following is observed:
- In the majority of cases, maximum likelihood estimation gives smaller values of AB and MRE than the MME, indicating that the MLE tends to be more accurate, especially for higher parameter values and sample sizes.
- As the sample size increases, both estimators tend towards the true value of the parameter. This is the behavior expected of consistent estimators.
- As the parameter \(\theta\) increases, the difference between the MME and the MLE becomes more noticeable, with the MLE consistently showing smaller errors.
- Both the MLE and the MME show a reduction in AB and MSE as the sample size increases. However, the MME tends to have slightly larger AB and MSE values than the MLE, particularly for higher values of the parameter.
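The simulation design described in this subsection can be sketched as follows. Under the stated mixing proportions, the Haq distribution is itself a mixture of an exponential distribution with rate \(\theta\) (weight \(\theta(2+\theta)/(1+\theta)^{2}\)) and a gamma distribution with shape 3 and rate \(\theta\) (weight \(1/(1+\theta)^{2}\)), so a PH variate can be drawn by sampling the rate and then a Poisson count. The metric definitions used here (AB as the mean absolute deviation of the estimates from the true value, MRE as AB divided by \(\theta\)) are assumptions, since the source does not spell them out. Only the moment estimator is shown to keep the sketch short, and the replication count is far smaller than the N = 10,000 of the study.

```python
import math
import random


def ph_mean(theta: float) -> float:
    """Mean of the Poisson-Haq distribution under the reconstructed formula."""
    return (theta ** 2 + 2.0 * theta + 3.0) / (theta * (1.0 + theta) ** 2)


def mme_from_mean(xbar: float) -> float:
    """Moment estimate: solve ph_mean(theta) = xbar by bisection on a log scale."""
    lo, hi = 1e-8, 1e8
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if ph_mean(mid) > xbar:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def sample_ph(theta: float, rng: random.Random) -> int:
    """Draw one PH variate: Haq-distributed rate, then a Poisson count."""
    if rng.random() < 1.0 / (1.0 + theta) ** 2:
        lam = rng.gammavariate(3.0, 1.0 / theta)  # shape 3, scale 1/theta
    else:
        lam = rng.expovariate(theta)              # rate theta
    # Knuth's multiplication method; adequate for the small rates arising here.
    limit, count, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def simulate(theta: float, n: int, reps: int, seed: int = 1) -> dict:
    """Return AE, AB, MRE, and MSE of the moment estimator over `reps` samples."""
    rng = random.Random(seed)
    estimates = []
    while len(estimates) < reps:
        xbar = sum(sample_ph(theta, rng) for _ in range(n)) / n
        if xbar == 0:  # the moment equation has no solution; redraw
            continue
        estimates.append(mme_from_mean(xbar))
    ae = sum(estimates) / reps
    ab = sum(abs(e - theta) for e in estimates) / reps
    mse = sum((e - theta) ** 2 for e in estimates) / reps
    return {"AE": ae, "AB": ab, "MRE": ab / theta, "MSE": mse}


if __name__ == "__main__":
    for n in (25, 50, 100):  # a subset of the sample sizes in the text
        print(n, simulate(theta=1.5, n=n, reps=300))
```

Replacing `mme_from_mean` with a maximum likelihood routine would give the corresponding MLE column of such a table.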
Poisson Haq regression model
A new count regression model based on Poisson Haq distribution is proposed in this section. The PMF is defined in terms of parameter \(\:\theta\:\), which is a transformation of the mean parameter \(\:\mu\:>0\). The transformation \(\:\theta\:=\theta\:\left(\mu\:\right)\) is presented by:
where \(\:A=1+3\mu\:+30{\mu\:}^{2}+{\mu\:}^{3}+3\sqrt{3}\sqrt{2{\mu\:}^{2}+6{\mu\:}^{3}+33{\mu\:}^{4}+2{\mu\:}^{5}}\) (the proof is given in the Appendix).
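The transformation can be illustrated numerically. Setting the reconstructed mean \(\frac{\theta^{2}+2\theta+3}{\theta(1+\theta)^{2}}\) equal to \(\mu\) gives the cubic \(\mu\theta^{3}+(2\mu-1)\theta^{2}+(\mu-2)\theta-3=0\), and Cardano's formula for its single real root yields \(\theta(\mu)=\frac{1-2\mu+A^{1/3}+(1+\mu)^{2}A^{-1/3}}{3\mu}\) with the same \(A\) as above. This closed form is a reconstruction that reproduces the stated \(A\) exactly, but it should be treated as an assumption until checked against the Appendix. The function names are hypothetical.

```python
import math


def ph_mean(theta: float) -> float:
    """Mean of the Poisson-Haq distribution under the reconstructed formula."""
    return (theta ** 2 + 2.0 * theta + 3.0) / (theta * (1.0 + theta) ** 2)


def theta_from_mu(mu: float) -> float:
    """Real root of the mean equation via Cardano's formula (a reconstruction)."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    a_term = (1.0 + 3.0 * mu + 30.0 * mu ** 2 + mu ** 3
              + 3.0 * math.sqrt(3.0)
              * math.sqrt(2.0 * mu ** 2 + 6.0 * mu ** 3 + 33.0 * mu ** 4 + 2.0 * mu ** 5))
    root = a_term ** (1.0 / 3.0)
    return (1.0 - 2.0 * mu + root + (1.0 + mu) ** 2 / root) / (3.0 * mu)


if __name__ == "__main__":
    for mu in (0.2, 1.0, 5.0):  # illustrative means
        theta = theta_from_mu(mu)
        print(mu, theta, ph_mean(theta))  # the last column should reproduce mu
```

For very large \(\mu\) the numerator involves cancellation between terms of similar size, so a root-finder on the mean function may be numerically safer in that regime.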
The PMF of the PH distribution for the count variable \(Y\) is given by:
We denote this probability model by \(PH\left(\theta,\mu\right)\), where \(\mu\) serves as the mean parameter.
Assume the count response variable \(Y_{i}\) for the i-th observation follows the \(PH\left(\theta,\mu\right)\) model, and let \(E\left(Y_{i}\right)=\mu_{i}\). To relate the mean \(\mu_{i}\) to the explanatory variables, we use the log-link function:
\[\log\left(\mu_{i}\right)=x_{i}^{T}\beta\]
where \(\beta=(\beta_{0},\beta_{1},\dots,\beta_{p})^{T}\) is the vector of regression coefficients, and \(x_{i}=\left(1,x_{i1},x_{i2},\dots,x_{ip}\right)^{T}\) is the vector of explanatory variables for the i-th observation, with a leading 1 for the intercept. Substituting \(\mu_{i}=\exp\left(x_{i}^{T}\beta\right)\) and the re-parameterization \(\theta\left(\mu_{i}\right)\) into the pmf, we obtain a regression model that can be fitted to real-world count data. The log-likelihood function of the new count regression model is given by
\[\ell\left(\beta\right)=\sum_{i=1}^{n}\log P\left(Y_{i}=y_{i};\theta\left(\mu_{i}\right)\right),\quad \mu_{i}=\exp\left(x_{i}^{T}\beta\right).\]
The regression parameters are estimated by maximizing the above log-likelihood function with respect to \(\beta\) using numerical optimization in R.
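The log-likelihood of the regression model can be sketched as below. The authors fit the model in R; this Python version is only an illustration of how the log link, the re-parameterization \(\theta(\mu)\), and the pmf combine. It assumes the reconstructed pmf and the reconstructed closed form \(\theta(\mu)=\frac{1-2\mu+A^{1/3}+(1+\mu)^{2}A^{-1/3}}{3\mu}\); all function names, the design matrix, and the responses are hypothetical.

```python
import math


def ph_log_pmf(y: int, theta: float) -> float:
    """Log of the reconstructed Poisson-Haq pmf (an assumption)."""
    bracket = 2.0 * (2.0 + theta) * (1.0 + theta) ** 2 + theta * (y + 1) * (y + 2)
    return (2.0 * math.log(theta) + math.log(bracket)
            - math.log(2.0) - (y + 5) * math.log1p(theta))


def theta_from_mu(mu: float) -> float:
    """Map the mean mu to theta via the reconstructed closed form (an assumption)."""
    a_term = (1.0 + 3.0 * mu + 30.0 * mu ** 2 + mu ** 3
              + 3.0 * math.sqrt(3.0)
              * math.sqrt(2.0 * mu ** 2 + 6.0 * mu ** 3 + 33.0 * mu ** 4 + 2.0 * mu ** 5))
    root = a_term ** (1.0 / 3.0)
    return (1.0 - 2.0 * mu + root + (1.0 + mu) ** 2 / root) / (3.0 * mu)


def ph_reg_loglik(beta: list, design: list, y: list) -> float:
    """Sum of log pmf values with mu_i = exp(x_i' beta); rows of design start with 1."""
    total = 0.0
    for row, yi in zip(design, y):
        eta = sum(b * x for b, x in zip(beta, row))  # linear predictor
        mu = math.exp(eta)                           # inverse of the log link
        total += ph_log_pmf(yi, theta_from_mu(mu))
    return total


if __name__ == "__main__":
    # Hypothetical design matrix (intercept plus one binary covariate) and responses.
    design = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    y = [0, 2, 1, 4]
    print(ph_reg_loglik([0.1, 0.5], design, y))
```

Passing the negative of `ph_reg_loglik` to any general-purpose minimizer gives the maximum likelihood estimates of \(\beta\).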
Data applications
In this section, we compare the new count distribution with widely recognized discrete probability distributions to assess its applicability and adequacy. We use three datasets: the number of mammalian cytogenetic dosimetry lesions and the remission times (in months) of Nipah virus infection are used to compare distributions, and the length of hospital stay is used for the regression model. The competing distributions, each offering features suited to different types of data, are the Poisson moment exponential (PME)17, Poisson XLindley (PXL)18, Poisson Ramos-Louzada (PRL)33, and Poisson entropy-based weighted exponential (PEWE)22 distributions. Additionally, we consider the standard Poisson distribution, widely used for modeling rare events. The probability mass functions of these models are given below.
and
.
The parameters of all considered distributions are estimated by MLE, that is, by maximizing the likelihood function given the observed data. The best-fitting distribution is determined using information criteria and goodness-of-fit measures: the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the Chi-Square goodness-of-fit test. The best fit is the distribution with the smallest AIC, BIC, and Chi-Square test statistic and the largest log-likelihood and Chi-Square p-value.
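The selection criteria named above follow standard definitions: AIC = 2k − 2ℓ and BIC = k ln(n) − 2ℓ, where ℓ is the maximized log-likelihood, k the number of parameters, and n the sample size; the Chi-Square statistic sums (observed − expected)² / expected over cells. The sketch below uses made-up inputs purely for illustration; none of the numbers come from the paper's tables.

```python
import math


def aic(loglik: float, k: int) -> float:
    """Akaike Information Criterion: 2k - 2 * log-likelihood."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian Information Criterion: k * ln(n) - 2 * log-likelihood."""
    return k * math.log(n) - 2.0 * loglik


def chi_square_stat(observed: list, expected: list) -> float:
    """Pearson statistic: sum of (O - E)^2 / E over the frequency cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Hypothetical values for a one-parameter model fitted to 50 observations.
    print(aic(-100.0, k=1), bic(-100.0, k=1, n=50))
    print(chi_square_stat([10, 5], [8.0, 7.0]))
```

For one-parameter models fitted to the same data, AIC and BIC differ by the constant ln(n) − 2, so both criteria rank such models identically.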
Infected patients of Nipah virus data
The first dataset concerns the survival time (in months) of residents of Kerala, an Indian state, who were infected by the Nipah virus in 2017 (ref. 34). Descriptive statistics for this dataset are: mean = 0.7636, variance = 0.8311, skewness = 1.8575, and kurtosis = 6.1792. Figure 4 presents descriptive plots of the dataset: a violin plot combining distributional information with summary statistics, a boxplot highlighting the distribution and potential outliers, a Q-Q plot comparing the data with a theoretical distribution to assess normality, and a histogram showing the frequency distribution.
The parameter estimates, observed frequencies (Obs. Fr.), and expected frequencies (Exp. Fr.) of all fitted distributions, along with goodness-of-fit measures for the Nipah virus dataset, are given in Table 3.
Table 3 shows that the PH distribution achieves the smallest AIC, BIC, and Chi-Square test statistic values, indicating that it offers the best fit to the observed data among all the competing distributions. Furthermore, Fig. 5 compares the observed and expected frequencies for the considered count models. This graphical representation gives a clearer picture of how well each probability distribution aligns with the data.
Mammalian cytogenetic dosimetry lesions data
The second dataset represents the number of mammalian cytogenetic dosimetry lesions induced by exposure to streptonigrin (NSC-45383) in rabbit lymphoblasts at a dosage of 70 μg/kg35. We first computed descriptive statistics for this dataset: the mean lesion count is 0.5400 with a variance of 1.3319, the skewness is 1.7109, and the coefficient of kurtosis is 5.6532. These measures indicate a positively skewed distribution with heavy tails, in which most counts are small and extreme lesion counts are relatively infrequent.
Additionally, to better understand these patterns, descriptive plots including a boxplot, histogram, violin plot, and Q-Q plot are shown in Fig. 6. These plots complement the numerical summaries by highlighting the overall pattern, central tendency, shape, and outliers of the data.
The MLEs, observed and expected frequencies, and goodness-of-fit metrics for the cytogenetic dosimetry lesions data are presented in Table 4. The observed and expected frequencies for all fitted distributions are also plotted in Fig. 7.
It is observed that the Poisson-Haq distribution fits this dataset better than the other distributions.
Application of the PH regression model
We now consider the dataset reported in ref. 36, which records the length of hospital stay of cardiovascular patients. A total of 3589 observations are available through the COUNT package in R. The dataset is known as the AZPRO data. The purpose of this analysis is to examine how the response variable, the length of patients’ stay at the hospital (los), is affected by the cardiovascular procedure (procedure), gender (sex), admission type (admit), and age. A bar plot of the response variable is shown in Fig. 8.
The following log-link function relates the explanatory variables to the response mean:
\[\log\left(\mu_{i}\right)=\beta_{0}+\beta_{1}\,\text{procedure}_{i}+\beta_{2}\,\text{sex}_{i}+\beta_{3}\,\text{admit}_{i}+\beta_{4}\,\text{age}_{i}.\]
The data are fitted with the Poisson regression model, the Poisson-transmuted record type exponential (PTRTE)37 regression model, and the proposed PH regression model. The parameter estimates and model selection measures are presented in Table 5.
Table 5 reports the results for the proposed regression model and the competing regression models. All the competing regression models are significant, with p-values below 5%. Furthermore, the table confirms that the PH regression model fits the data best, with the highest log-likelihood and the lowest AIC and BIC values.
Bayesian analysis
In this section, the Bayesian approach is employed to estimate the parameter of the new count distribution. The Bayesian technique offers a powerful framework for parameter estimation by incorporating prior information about the parameter(s) in the form of prior distribution(s). For this purpose, we assume a gamma prior for the parameter \(\theta\), reflecting our prior belief about its plausible values. The posterior density combines the observed data with the prior information and is obtained using Bayes’ theorem: it is proportional to the product of the likelihood function and the prior density. The resulting posterior distribution delivers an updated estimate of \(\theta\) that reflects both the prior information and the evidence from the data.
The gamma distribution is a widely used prior. It is convenient because it is conjugate to several likelihood functions commonly used in the Bayesian paradigm, and its support is the positive real line, matching \(\theta>0\).
The posterior density for parameter \(\:\theta\:\) is given by
.
It is easily seen from Eq. (11) that the posterior density is not available in closed form, necessitating the use of computational techniques for parameter estimation. To obtain the posterior summaries of interest, we use the Markov Chain Monte Carlo (MCMC) technique, which is well suited for sampling from complex posterior densities.
The simulation program generated 1,007,000 samples from the posterior distribution. The first 7,000 samples are discarded as a burn-in period to reduce the impact of the starting values on the final parameter estimates. A thinning interval of 200 is used to reduce autocorrelation among successive samples, leaving (1,007,000 − 7,000)/200 = 5,000 approximately independent draws. The mean of these draws is used as the Bayes estimate. To check the accuracy and validity of the results, convergence diagnostics are performed. Trace plots of the sampled values are examined to assess the stability of the Markov chain. Additionally, we use the Geweke diagnostic, which compares the means of two non-overlapping segments of the chain through a z-score, that is, the difference in means normalized by its asymptotic standard error. Convergence is deemed satisfactory if the absolute value of the z-score is less than 1.96. All calculations and analyses are performed using the MCMCpack package in R.
The posterior summaries, including the posterior mean, standard deviation, Geweke diagnostic scores, and highest posterior density (HPD) intervals, are listed in Table 6. These findings give an overview of the Bayesian estimates and their associated uncertainties. The posterior samples of the PH distribution parameter for both datasets are shown in Figs. 9 and 10.
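The sampling scheme described above can be illustrated with a small random-walk Metropolis sampler. This is not the MCMCpack routine the authors used; it is a minimal sketch under several assumptions: the reconstructed PH log-likelihood, a Gamma(a, b) prior with hypothetical hyperparameters a = 1 and b = 0.1, proposals on \(\log\theta\), and a simplified Geweke-type z-score that uses plain sample variances instead of spectral density estimates. The chain length, burn-in, and thinning are much smaller than in the study, and the data are hypothetical.

```python
import math
import random


def ph_loglik(theta: float, data: list) -> float:
    """Log-likelihood under the reconstructed Poisson-Haq pmf (an assumption)."""
    n = len(data)
    total = (2.0 * n * math.log(theta) - n * math.log(2.0)
             - (sum(data) + 5 * n) * math.log1p(theta))
    for x in data:
        total += math.log(2.0 * (2.0 + theta) * (1.0 + theta) ** 2
                          + theta * (x + 1) * (x + 2))
    return total


def log_post_phi(phi: float, data: list, a: float, b: float) -> float:
    """Log posterior of phi = log(theta): likelihood, gamma prior, and Jacobian."""
    theta = math.exp(phi)
    return ph_loglik(theta, data) + a * phi - b * theta


def metropolis(data, a=1.0, b=0.1, n_iter=6000, burn_in=1000, thin=5,
               step=0.3, seed=42):
    """Random-walk Metropolis on log(theta); returns thinned post-burn-in draws."""
    rng = random.Random(seed)
    phi = 0.0  # start at theta = 1
    lp = log_post_phi(phi, data, a, b)
    draws = []
    for i in range(n_iter):
        proposal = phi + rng.gauss(0.0, step)
        lp_prop = log_post_phi(proposal, data, a, b)
        if math.log(1.0 - rng.random()) < lp_prop - lp:  # accept or reject
            phi, lp = proposal, lp_prop
        if i >= burn_in and (i - burn_in) % thin == 0:
            draws.append(math.exp(phi))
    return draws


def geweke_z(draws, first=0.1, last=0.5):
    """Simplified Geweke-type z-score comparing early and late chain segments."""
    head = draws[: int(first * len(draws))]
    tail = draws[int((1.0 - last) * len(draws)):]

    def mean_var(values):
        m = sum(values) / len(values)
        return m, sum((v - m) ** 2 for v in values) / (len(values) - 1)

    m1, v1 = mean_var(head)
    m2, v2 = mean_var(tail)
    return (m1 - m2) / math.sqrt(v1 / len(head) + v2 / len(tail))


if __name__ == "__main__":
    counts = [0] * 30 + [1] * 15 + [2] * 7 + [3] * 2 + [5]  # hypothetical data
    chain = metropolis(counts)
    print(sum(chain) / len(chain), geweke_z(chain))
```

Proposing on \(\log\theta\) keeps every draw positive; the extra term `a * phi` combines the prior exponent \(a-1\) with the Jacobian of the log transformation.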
Conclusion
This study presents and examines a novel one-parameter probability distribution for count data. The model is derived using the mixed Poisson compounding technique and is named the Poisson Haq (PH) distribution. The PH distribution addresses the need for modeling count datasets that exhibit overdispersion. The failure rate of the PH distribution shows an increasing pattern.
The essential statistical and reliability characteristics of the PH distribution were derived mathematically and examined numerically, including the cumulative distribution function, survival and hazard functions, moments, and associated measures. The mean and variance of the PH distribution decrease as the parameter increases, which implies that the distribution becomes more concentrated around lower values for higher values of the parameter. The coefficients of skewness and kurtosis increase with the parameter: the distribution becomes more skewed to the right, with a more pronounced peak and heavier tails. The dispersion index gradually decreases for larger values of the parameter.
The distribution parameter has been estimated using the method of moments and maximum likelihood estimation. As the sample size increases, both estimators tend towards the true value of the parameter, and both show a reduction in bias and MSE. However, the MME tends to have slightly larger bias and MSE values than the MLE, particularly for higher values of the parameter. Further, as the parameter \(\theta\) increases, the difference between the MME and the MLE becomes more noticeable, with the MLE consistently showing smaller errors.
To demonstrate the versatility of the new distribution, three datasets related to medical science are considered. The first concerns the number of patients infected with the Nipah virus, the second the number of mammalian cytogenetic dosimetry lesions, and the third the length of hospital stay. Comparative analyses show that the new distribution describes these datasets adequately compared with the competing distributions considered. Additionally, the Bayesian approach is employed to estimate the parameter of the PH distribution.
Future research aims to investigate additional modifications and applications of the PH distribution. Potential directions include, but are not limited to, exploring actuarial measures, examining reliability characteristics such as the mean residual life function and entropy, and developing truncated, zero-inflated, and neutrosophic forms of the model. Additionally, this distribution can be applied to population size estimation. These advancements are anticipated to enhance the versatility and applicability of the proposed distribution.
Data availability
All data generated or analysed during this study are included in this published article.
References
Saha, P., Biswas, S. K., Biswas, M. H. A. & Ghosh, U. An SEQAIHR model to study COVID-19 transmission and optimal control strategies in Hong kong, 2022. Nonlinear Dyn. 111, 6873–6893 (2023).
Saha, P., Mondal, B. & Ghosh, U. Dynamical behaviors of an epidemic model with partial immunity having nonlinear incidence and saturated treatment in deterministic and stochastic environments. Chaos Solitons Fractals. 174, 113775 (2023).
Saha, P., Sikdar, G. C. & Ghosh, U. Transmission dynamics and control strategy of single-strain dengue disease. Int. J. Dyn. Control. 11, 1396–1414 (2023).
Saha, P., Pal, K. K., Ghosh, U. & Tiwari, P. K. Effects of vaccination and saturated treatment on COVID-19 transmission in india: deterministic and stochastic approaches. J Biol. Syst, 1–47 (2024).
Saha, P., Pal, K., Ghosh, K. & Kumar Tiwari, P. U. Dynamic analysis of deterministic and stochastic SEIR models incorporating the Ornstein–Uhlenbeck process. Chaos Interdiscip J. Nonlinear Sci 35, (2025).
Tian, R., Zhang, F., Du, H. & Wang, P. Optimization design method for multi-stress accelerated degradation test based on Tweedie exponential dispersion process. Appl. Math. Model. 135, 684–707 (2024).
Tian, R., Zhang, F., Du, H. & Wang, P. Exponential dispersion accelerated degradation modelling and reliability assessment considering initial value and processes heterogeneity. Maint Reliab. I Niezawodn 26, (2024).
Bhati, D., Sastry, D. V. S. & Maha Qadri, P. Z. A new generalized Poisson-Lindley distribution: applications and properties. Austrian J. Stat. 44, 35–51 (2015).
Shanker, R. The discrete Poisson-Amarendra distribution. Int. J. Stat. Distrib. Appl. 2, 14–21 (2016).
Shanker, R. The discrete Poisson-Garima distribution. Biometrics Biostat Int. J. 5, 1–7 (2017).
Shanker, R. R., Fesshaye, H., Shanker, R. R., Leonida, T. A. & Sium, S. On discrete Poisson-Shanker distribution and its applications. Biometrics Biostat Int. J. 5, 121 (2017).
Zeghdoudi, H. & Nedjar, S. On Poisson pseudo Lindley distribution: properties and applications. J. Probab. Stat. Sci. 15, 19–28 (2017).
Alrumayh, A. Bernoulli Poisson Moment Exponential Distribution: Mathematical Properties, Regression Model, and Applications. Int. J. Math. Math. Sci. (2024).
Grine, R. & Zeghdoudi, H. On Poisson quasi-Lindley distribution and its applications. J. Mod. Appl. Stat. Methods. 16, 403–417 (2017).
Alamri, O. A. Classical and Bayesian estimation of discrete Poisson Agu-Eghwerido distribution with applications. Alexandria Eng. J. 109, 768–777 (2024).
Shanker, R. & Shukla, K. K. A quasi Poisson-Aradhana distribution. Hung. Stat. Rev. 3, 3–17 (2020).
Ahsan-ul-Haq, M. On Poisson moment exponential distribution with applications. Ann. Data Sci. https://doi.org/10.1007/s40745-022-00400-0 (2022).
Ahsan-ul-Haq, M., Al-Bossly, A., El-Morshedy, M. & Eliwa, M. S. Poisson XLindley Distribution for Count Data: Statistical and Reliability Properties with Estimation Techniques and Inference. Comput. Intell. Neurosci. (2022).
Maya, R., Irshad, M. R., Chesneau, C., Nitin, S. L. & Shibu, D. S. On discrete Poisson–Mirra distribution: regression, INAR (1) process and applications. Axioms 11, 193 (2022).
Seghier, F. Z., Ahsan-ul-Haq, M., Zeghdoudi, H. & Hashmi, S. A. New generalization of Poisson distribution for Over-dispersed, count data: mathematical properties, regression model and applications. Lobachevskii J. Math. 44, 3850–3859 (2023).
Alrumayh, A. & Khogeer, H. A. A new Two-Parameter discrete distribution for overdispersed and asymmetric data: its properties, estimation, regression model, and applications. Symmetry (Basel). 15, 1289 (2023).
Alomair, A. & Ahsan-ul-Haq, M. A new extension of Poisson distribution for asymmetric count data: theory, classical and Bayesian estimation with application to lifetime data. PeerJ Comput. Sci. 9, e1748 (2023).
Zaagan, A. A. & Mahnashi, A. M. Analysis of leukemia and forest fires data using new Poisson Quasi-Shanker distribution. Alexandria Eng. J. 104, 701–709 (2024).
Alghamdi, F. M. et al. Discrete Poisson Quasi-XLindley distribution with mathematical properties, regression model, and data analysis. J. Radiat. Res. Appl. Sci. 17, 100874 (2024).
Al-Babtain, A. A., Gemeay, A. M. & Afify, A. Z. Estimation methods for the discrete Poisson-Lindley and discrete Lindley distributions with actuarial measures and applications in medicine. J. King Saud Univ. 33, 101224 (2021).
El-Alosey, A. R. & Gemeay, A. M. A novel version of geometric distribution: method and application. Comput. J. Math. Stat. Sci. 4, 1–16 (2025).
Nagy, M. et al. The New Novel Discrete Distribution with Application on COVID-19 Mortality Numbers in Kingdom of Saudi Arabia and Latvia. Complexity (2021).
Teamah, A. E. A. M., Elbanna, A. A. & Gemeay, A. M. Discrete Half-Logistic distribution: statistical properties, estimation, and application. J. Stat. Appl. Probab. 13, 273–284 (2024).
Ahmad, K. et al. Statistical inference on the exponentiated moment exponential distribution and its discretization. J. Radiat. Res. Appl. Sci. 17, 101116 (2024).
Alomair, A. M. & Ahsan-ul-Haq, M. A new mixed Poisson Komal distribution with application on radiation, agricultural and medical sciences data. J. Radiat. Res. Appl. Sci. 18, 101500 (2025).
Alomair, A. M. & Ahsan-ul-Haq, M. Analysis of radiation and corn borer data using discrete Poisson Xrama distribution. J. Radiat. Res. Appl. Sci. 18, 101388 (2025).
Ahsan-ul-Haq, M. Statistical analysis of Haq distribution: Estimation and applications. Pakistan J. Stat. 38, 473–490 (2022).
Alkhairy, I. Classical and Bayesian inference for the discrete Poisson Ramos-Louzada distribution with application to COVID-19 data. Math. Biosci. Eng. 20, 14061–14080 (2023).
Seghier, F. Z., Zeghdoudi, H. & Benchaabane, A. A size-biased Poisson-gamma Lindley distribution with application. Eur. J. Stat. 1, 132–147 (2021).
Catcheside, D. G., Lea, D. E. & Thoday, J. M. Types of chromosome structural change induced by the irradiation of Tradescantia microspores. J. Genet. 47, 113–136 (1946).
Hosmer, D. W. & Lemeshow, S. Applied Logistic Regression. (2000).
Erbayram, T. & Akdoğan, Y. A new discrete model generated from mixed Poisson transmuted record type exponential distribution. Ric Di Mat. https://doi.org/10.1007/s11587-022-00755-9 (2023).
Acknowledgements
This work was supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Grant No. KFU251035].
Author information
Contributions
Abdullah M. Alomair: conceptualization and methodology; Faisal Ayyaz: writing manuscript; Saadia Tariq: supervision and editing; Muhammad Ahsan-ul-Haq: conceptualization, methodology, writing. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
The solution of re-parameterization for \(\:\theta\:\) in terms of \(\:\mu\:\).
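We are given the equation:
\[\mu=\frac{\theta^{2}+2\theta+3}{\theta\left(1+\theta\right)^{2}}.\]
Multiply both sides by the denominator:
\[\mu\,\theta\left(1+\theta\right)^{2}=\theta^{2}+2\theta+3.\]
Expand the left-hand side:
\[\mu\theta^{3}+2\mu\theta^{2}+\mu\theta=\theta^{2}+2\theta+3.\]
Move all terms to one side:
\[\mu\theta^{3}+\left(2\mu-1\right)\theta^{2}+\left(\mu-2\right)\theta-3=0.\]
Divide through by \(\:\mu\:\):
\[\theta^{3}+\left(2-\frac{1}{\mu}\right)\theta^{2}+\left(1-\frac{2}{\mu}\right)\theta-\frac{3}{\mu}=0.\]
This is a cubic equation in \(\:\theta\:\). Using the cubic formula, we get:
\[\theta=\frac{1}{3\mu}\left[1-2\mu+A^{1/3}+\frac{\left(1+\mu\right)^{2}}{A^{1/3}}\right],\]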
where \(\:A=1+3\mu\:+30{\mu\:}^{2}+{\mu\:}^{3}+3\sqrt{3}\sqrt{2{\mu\:}^{2}+6{\mu\:}^{3}+33{\mu\:}^{4}+2{\mu\:}^{5}}\).
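The re-parameterization can be checked numerically with a round trip from \(\:\theta\:\) to \(\:\mu\:\) and back. The sketch below is illustrative only. It assumes the mean relation \(\:\mu=({\theta}^{2}+2\theta+3)/(\theta{(1+\theta)}^{2})\:\) and the closed form \(\:\theta=[1-2\mu+{A}^{1/3}+{(1+\mu)}^{2}/{A}^{1/3}]/(3\mu)\:\), both chosen here because they are algebraically consistent with the expression for \(\:A\:\) above. The function names `ph_mean` and `theta_from_mean` are hypothetical.

```python
import math


def ph_mean(theta: float) -> float:
    """Mean as a function of theta (assumed relation, see lead-in)."""
    return (theta**2 + 2 * theta + 3) / (theta * (1 + theta) ** 2)


def theta_from_mean(mu: float) -> float:
    """Invert the assumed mean relation via the cubic formula.

    A is the quantity defined in the Appendix; it is positive for mu > 0,
    so the real cube root can be taken directly.
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    a = (
        1 + 3 * mu + 30 * mu**2 + mu**3
        + 3 * math.sqrt(3)
        * math.sqrt(2 * mu**2 + 6 * mu**3 + 33 * mu**4 + 2 * mu**5)
    )
    c = a ** (1.0 / 3.0)  # real cube root of A
    return (1 - 2 * mu + c + (1 + mu) ** 2 / c) / (3 * mu)


if __name__ == "__main__":
    # Round trip: theta -> mu -> theta should return the starting value.
    for theta in (0.2, 1.0, 3.5):
        mu = ph_mean(theta)
        print(theta, mu, theta_from_mean(mu))
```

Such an inverse is what a regression model parameterized by the mean would need: a linear predictor yields \(\:\mu\:\) (for example through a log link, which is an assumption here), and \(\:\theta\:\) is then recovered from \(\:\mu\:\) before evaluating the likelihood.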
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Alomair, A.M., Ayyaz, F., Tariq, S. et al. Discrete Poisson Haq distribution with mathematical properties and count data modeling. Sci Rep 15, 23281 (2025). https://doi.org/10.1038/s41598-025-07223-y
DOI: https://doi.org/10.1038/s41598-025-07223-y