Introduction

Enzymes play a crucial role in regulating homeostasis by metabolizing endogenous substrates and xenobiotics, including drugs and environmental chemicals1,2. This enzyme-mediated metabolism can be inhibited when a substance, called an inhibitor, reversibly binds to the free enzyme (competitive inhibitor), the enzyme-substrate complex (uncompetitive inhibitor), or both (mixed inhibitor), a phenomenon known as enzyme inhibition (Fig. 1a).

Fig. 1: The canonical approach for estimating inhibition constants.

a An inhibitor (I) can suppress the enzyme-catalyzed reaction from a substrate (S) to a product (P) by binding to a free enzyme (E) (competitive), the enzyme-substrate complex (C) (uncompetitive), or both (mixed), forming reversible complexes (Y and B). Here, \({S}_{T}=S+C+P+B\) and \({I}_{T}=I+Y+B\) denote the total substrate and inhibitor concentrations, respectively, and \({K}_{M}=\frac{{k}_{2}+{k}_{-1}}{{k}_{1}}\) denotes the Michaelis-Menten (MM) constant. The dissociation constants of each binding (\({K}_{{ic}}=\frac{{k}_{-3}}{{k}_{3}}\) and \({K}_{{iu}}=\frac{{k}_{-4}}{{k}_{4}}\)), known as the inhibition constants, determine which inhibition type is dominant. b The canonical approach for estimating the inhibition constants and types involves the following steps: (i) The inhibitor concentration leading to half of the % control activity (\(I{C}_{50}\)) is estimated from experiments with a single \({S}_{T}\) and varying \({I}_{T}\). Here, the % control activity is the percentage ratio of the initial velocity with inhibitor to the initial velocity without inhibitor. (ii) For various combinations of \({S}_{T}\) values ranging from \(\frac{1}{5}{K}_{M}\) to \(5{K}_{M}\) and \({I}_{T}\) values ranging from \(\frac{1}{3}I{C}_{50}\) to \(3I{C}_{50}\), including \({I}_{T}=0\), initial velocities are measured. (iii) By fitting Eq. 1 to the initial velocity data, the inhibition constants are estimated. It is unclear whether the dataset used in the canonical approach is sufficient or necessary for precise estimation.

Analyzing enzyme inhibition is essential for predicting risks in drug development and clinical practice. Enzyme inhibition can occur when multiple drugs are co-administered, as medicinal drugs may inhibit not only their target enzymes but also drug-metabolizing enzymes, particularly cytochrome P450 (CYP) enzymes3. Inhibition of CYP can lead to an unwanted delay in the metabolism of other drugs in vivo, potentially compromising the drugs’ safety. To predict these risks, the U.S. Food and Drug Administration has recommended evaluating the inhibitor’s potency during drug development4,5. Enzyme inhibition is also a major clinical concern and necessitates drug dose adjustments. While dose adjustments can help mitigate risks in cases of competitive inhibitors6,7,8, they are less effective for uncompetitive inhibitors, making clinical management more challenging8. Mixed inhibition involves complex interactions between substrates and inhibitors, with outcomes highly dependent on their concentrations6. The importance of enzyme inhibition analysis extends beyond drug development and management; it also plays a crucial role in food technology, particularly in preventing food browning and spoilage9.

However, direct in vivo analysis of enzyme inhibition is challenging due to the complexity of biological systems and the intensity of resources required10,11. Thus, in vivo enzyme inhibition is often predicted using a mathematical model based on parameters derived from in vitro experiments5. The key parameters in this analysis are the inhibition constants, which represent the dissociation constants between the inhibitor and enzyme or enzyme-substrate complex4,12 (\({K}_{{ic}}\) and \({K}_{{iu}}\) in Fig. 1a). As the inhibition constants characterize not only the potency but also the mechanism of the enzyme inhibition (Fig. 1a right), their estimation needs to be not only accurate but also precise (i.e., narrow confidence interval) to ensure the reliability of enzyme inhibition analysis13.

The inhibition constants are estimated by fitting the inhibition models for different types of enzyme inhibition14 (see “Results” for details) to the initial reaction velocity data, an approach that has been utilized in more than 68,000 studies since its introduction in 1930 (ref. 15). Traditionally, the initial reaction velocity data are obtained from experiments that combine multiple substrate and inhibitor concentrations14. This canonical experimental condition allows the accurate and precise estimation of inhibition constants for competitive and uncompetitive models, each of which involves a single inhibition constant16,17. Recent studies have shown that even a single inhibitor concentration can lead to precise and accurate estimation for models with a single inhibition constant16. However, these methods rely heavily on prior knowledge of the inhibition type (i.e., competitive or uncompetitive), which is often unavailable, thereby considerably limiting their broader applicability.

Notably, when there is no prior knowledge of the inhibition type, the mixed inhibition type model, which involves two inhibition constants, is applicable, since it can describe all types of enzyme inhibitions18. However, the experimental conditions necessary for ensuring precise and accurate estimation have not been clearly identified. It is also unclear whether the conventional experimental conditions are sufficient for accurate and precise estimation. Consequently, when prior information about the type of inhibition is unavailable, both mixed and other types might be falsely reported, even for inhibition of the same enzyme. For instance, the enzyme inhibition between midazolam as a substrate and ketoconazole as an inhibitor for CYP3A4 has been actively studied, but both mixed and competitive types have been reported19. Taken together, the experimental conditions under which inhibition constants of all inhibition types can be accurately and precisely estimated remain unknown.

Here, we develop an efficient framework leading to the accurate and precise estimation of inhibition constants for all inhibition types, including mixed inhibition, without any prior information. Specifically, we analyzed the mathematical models and the error landscape to identify an optimal experimental design for estimating two inhibition constants. This revealed that accurate and precise estimation is possible with data obtained by using a single inhibitor concentration that is greater than the half-maximal inhibitory concentration (\(I{C}_{50}\)), the inhibitor concentration exerting 50% inhibition. Importantly, despite the reduced data, the precision and accuracy were dramatically improved when we utilized the harmonic mean relationship between \(I{C}_{50}\) and inhibition constants for the estimation. This approach, referred to as the 50-BOA (\(I{C}_{50}\)-Based Optimal Approach), enables accurate and precise estimation of inhibition constants for triazolam-ketoconazole and chlorzoxazone-ethambutol using much less data than is typically required in conventional methods. We provide ready-to-use MATLAB and R packages that automate the estimation of inhibition constants and identification of inhibition types based on the 50-BOA. As the 50-BOA leads to accurate and precise estimation with a small amount of data, it is a more efficient approach that could substantially enhance the in vitro estimation process of enzyme inhibition in various fields such as drug development and food chemistry.

Results

The canonical approach for estimation of the inhibition constant

Before presenting the main finding of this study, we first detail the mathematical model that describes enzyme inhibition and the canonical approach for estimating inhibition constants based on this model.

Enzyme-catalyzed reactions involve a substrate (S) binding to a free enzyme (E) to form a reversible enzyme-substrate complex (C), which then converts to products (P) (Fig. 1a). These reactions can be suppressed by an inhibitor (I), which binds to either E or C, forming reversible enzyme-inhibitor complexes (Y) or enzyme-substrate-inhibitor complexes (B), with dissociation constants of \({K}_{{ic}}=\frac{{k}_{-3}}{{k}_{3}}\) or \({K}_{{iu}}=\frac{{k}_{-4}}{{k}_{4}}\), respectively (enzyme inhibition; Fig. 1a). The dissociation constants, referred to as inhibition constants, characterize the potency of the inhibition; lower inhibition constants indicate higher binding affinity between I and E or I and C, leading to higher inhibitory potency.

Additionally, the relative magnitude of the two inhibition constants determines the mechanism of the inhibition (i.e., inhibition type). Specifically, if \({K}_{{ic}}\, \ll \, {K}_{{iu}}\), I predominantly binds to E, competing with S (i.e., competitive inhibition). Conversely, if \({K}_{{iu}} \, \ll \, {K}_{{ic}}\), I predominantly binds to C, suppressing the enzyme-catalyzed reaction in an uncompetitive way (i.e., uncompetitive inhibition). If the inhibition constants have comparable magnitude, I binds to both E and C with similar affinities, suppressing the enzyme-catalyzed reaction in both competitive and uncompetitive ways (mixed inhibition).

The initial velocity of product formation (V0) can be described by the equation:

$${V}_{0}=\frac{{V}_{\max }{S}_{T}}{{K}_{M}\left(1+\frac{{I}_{T}}{{K}_{{ic}}}\right)+{S}_{T}\left(1+\frac{{I}_{T}}{{K}_{{iu}}}\right)}.$$
(1)

where \({S}_{T}=S+C+P+B\), \({I}_{T}=I+Y+B\), and \({E}_{T}=E+C+Y+B\) denote the total substrate, inhibitor, and enzyme concentrations, respectively, \({V}_{\max }={k}_{2}{E}_{T}\) is the maximal velocity, and \({K}_{M}\) is the Michaelis-Menten (MM) constant. Since Eq. 1 is a general equation that can describe not only mixed (\({K}_{{ic}} \approx {K}_{{iu}}\)) inhibition but also competitive (\({K}_{{ic}}\, \ll \, {K}_{{iu}}\)) and uncompetitive (\({K}_{{iu}}\, \ll \, {K}_{{ic}}\)) inhibition, it allows for the simultaneous estimation of inhibition constants and identification of inhibition type without prior knowledge of the inhibition type20.
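
To make the limiting behaviour of Eq. 1 concrete, the following minimal Python sketch evaluates it for mixed, competitive, and uncompetitive parameter choices. The function name `initial_velocity` and all numerical values are illustrative assumptions rather than part of the published MATLAB or R packages; an infinite inhibition constant is used here to represent the absence of the corresponding binding.

```python
import math


def initial_velocity(s_t, i_t, v_max, k_m, k_ic, k_iu):
    """Eq. 1: initial velocity for total substrate s_t and total inhibitor i_t."""
    competitive_term = k_m * (1.0 + i_t / k_ic)    # inhibitor binding to E
    uncompetitive_term = s_t * (1.0 + i_t / k_iu)  # inhibitor binding to C
    return v_max * s_t / (competitive_term + uncompetitive_term)


# Illustrative values only: K_M = 1 uM, V_max = 0.1, S_T = K_M, I_T = 1 uM.
control = initial_velocity(1.0, 0.0, 0.1, 1.0, k_ic=1.0, k_iu=1.0)
mixed = initial_velocity(1.0, 1.0, 0.1, 1.0, k_ic=1.0, k_iu=1.0)
competitive = initial_velocity(1.0, 1.0, 0.1, 1.0, k_ic=1.0, k_iu=math.inf)
uncompetitive = initial_velocity(1.0, 1.0, 0.1, 1.0, k_ic=math.inf, k_iu=1.0)

# Fraction of control activity remaining under each parameter choice.
print(mixed / control, competitive / control, uncompetitive / control)
```

Because the same function covers all three types, fitting it does not require choosing the inhibition type in advance; the type is read off from the fitted constants.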

The conventional method of estimating inhibition types and constants follows these steps (canonical approach; Fig. 1b): First, to set a reference for choosing appropriate total inhibitor concentrations (\({I}_{T}\)), the half-maximal inhibitory concentration (\({{IC}}_{50}\)) is estimated from the % control activity data over a range of \({I}_{T}\) with a single \({S}_{T}\), usually equal to \({K}_{M}\) (refs. 14,21) (Fig. 1b (i)). After estimating \({{IC}}_{50}\), an experimental design is established using \({S}_{T}\) at \(0.2{K}_{M}\), \({K}_{M}\), and \(5{K}_{M}\) and \({I}_{T}\) at 0, \(\frac{1}{3}{{IC}}_{50}\), \({{IC}}_{50}\), and \(3{{IC}}_{50}\) (ref. 14) (Fig. 1b (ii)). For each combination of concentrations, the initial velocity is measured. Then, by fitting Eq. 1 to the data, inhibition constants are estimated (Fig. 1b (iii)). However, it remains uncertain whether this empirically established canonical approach is sufficient and whether all of these data are necessary for accurate and precise estimation.

Experimental data with low total inhibitor concentrations (\({\boldsymbol{I}}_{\boldsymbol{T}}\)) do not provide information for estimation of inhibition constants

To investigate which experimental setups lead to accurate and precise estimation based on Eq. 1, we generated initial velocity data through simulation with an experimental setup of \({S}_{T}\), \({I}_{T}\), and true inhibition constants (Fig. 2a; see “Methods” for details). We then calculated the mean squared relative error (fitting error) between the simulated data and the mixed inhibition model (Eq. 1) for a range of candidate pairs of inhibition constants (Fig. 2a). By assigning dark and bright colors for low and high fitting errors, respectively, we plotted a heatmap of the fitting error landscape (Fig. 2b; see “Methods” for details). Using the heatmap, we analyzed which experimental setup can lead to precise estimation.
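
The error-landscape computation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure from “Methods”: the data are noise-free, the relative error is normalized by the observed velocity, candidate constants lie on a log-spaced grid from 0.01 to 100 μM, and the threshold of 0.01 used to count "low-error" pairs is arbitrary. All function names are hypothetical.

```python
def initial_velocity(s_t, i_t, v_max, k_m, k_ic, k_iu):
    """Eq. 1 (general mixed inhibition model)."""
    return v_max * s_t / (k_m * (1.0 + i_t / k_ic) + s_t * (1.0 + i_t / k_iu))


def fitting_error(data, v_max, k_m, k_ic, k_iu):
    """Mean squared relative error between observed and modelled velocities."""
    squared = [
        ((v_obs - initial_velocity(s_t, i_t, v_max, k_m, k_ic, k_iu)) / v_obs) ** 2
        for s_t, i_t, v_obs in data
    ]
    return sum(squared) / len(squared)


def log_grid(low_exp, high_exp, per_decade):
    """Log-spaced candidate values from 10**low_exp to 10**high_exp."""
    steps = (high_exp - low_exp) * per_decade
    return [10.0 ** (low_exp + i / per_decade) for i in range(steps + 1)]


def error_landscape(data, v_max, k_m, candidates):
    """Rows index candidate K_iu, columns index candidate K_ic."""
    return [
        [fitting_error(data, v_max, k_m, k_ic, k_iu) for k_ic in candidates]
        for k_iu in candidates
    ]


V_MAX, K_M = 0.1, 1.0           # illustrative values, as in Fig. 2
TRUE_K_IC = TRUE_K_IU = 1.0     # assumed true constants (uM)
CANDIDATES = log_grid(-2, 2, 4)  # 0.01 to 100 uM, 4 points per decade


def low_error_pairs(i_t, threshold=0.01):
    """Count candidate pairs whose fitting error is below the threshold."""
    data = [
        (s_t, i_t, initial_velocity(s_t, i_t, V_MAX, K_M, TRUE_K_IC, TRUE_K_IU))
        for s_t in (0.2 * K_M, K_M, 5.0 * K_M)
    ]
    landscape = error_landscape(data, V_MAX, K_M, CANDIDATES)
    return sum(err < threshold for row in landscape for err in row)


for i_t in (0.1, 10.0):
    print(i_t, low_error_pairs(i_t))
```

A larger count of low-error pairs corresponds to a broader dark region in the heatmap, that is, to less precise estimation.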

Fig. 2: Using a single inhibitor concentration larger than inhibition constants leads to precise estimation.

a The initial velocity data (red dots), obtained based on a true \({K}_{{ic}}\) and \({K}_{{iu}}\) pair with a single \({I}_{T}\) value and several \({S}_{T}\) values (see “Methods” for details), were compared with Eq. 1 of different parameter pairs (e.g., solid and dashed lines). b For each parameter pair, the fitting error between Eq. 1 and the data was calculated. The heatmap representing the fitting error landscape was plotted for a range of parameter pairs. c When \({I}_{T}=0.1\) μM was much smaller than the true parameter values (\({K}_{{ic}}={K}_{{iu}}=1\) μM), a wide range of parameter pairs (e.g., ▲, , ) led to low fitting errors, indicating imprecise estimation. d The fitted curves with these parameter pairs matched the data well. e, f When \({I}_{T}=2\) μM was greater than the true \({K}_{{ic}}=1\) μM but smaller than the true \({K}_{{iu}}=10\) μM, the pairs of the same \({K}_{{ic}}\) but distinct \({K}_{{iu}}\) values (e.g., ▲, ) led to accurate fitting, indicating precise \({K}_{{ic}}\) but not \({K}_{{iu}}\) estimation. g, h Conversely, when \({I}_{T}=2\) μM exceeded the true \({K}_{{iu}}=1\) μM but was much smaller than the true \({K}_{{ic}}=10\) μM, the pairs of the same \({K}_{{iu}}\) but distinct \({K}_{{ic}}\) values (e.g., , ) led to accurate fitting, indicating precise \({K}_{{iu}}\) but not \({K}_{{ic}}\) estimation. i, j When \({I}_{T}=10\) μM was greater than or comparable to both true parameter values (\({K}_{{ic}}={K}_{{iu}}=1\) μM), precise estimation for both \({K}_{{ic}}\) and \({K}_{{iu}}\) was possible (). For c–j, the true data were obtained with \({S}_{T}=0.2{K}_{M}\), \({K}_{M}\), and \(5{K}_{M}\) (\({K}_{M}=1\) μM). The initial velocity data were normalized by \({V}_{\max }=0.1\) μM\(/\min /{{\rm{mg}}} \,{{\rm{protein}}}\).

Our analysis revealed that the precision of estimating the inhibition constants varies depending on the relationship between \({I}_{T}\) and the inhibition constants \({K}_{{ic}}\) and \({K}_{{iu}}\) (Fig. 2c–j), which can be divided into four cases. First, when \({I}_{T}\) is much lower than \({K}_{{ic}}\) and \({K}_{{iu}}\) (\({I}_{T}\, \ll \, {K}_{{ic}}\) and \({I}_{T}\, \ll \, {K}_{{iu}}\)), the dark region is broadly distributed across \({K}_{{ic}}\) and \({K}_{{iu}}\) (Fig. 2c). This indicates that there are many pairs of \({K}_{{ic}}\) and \({K}_{{iu}}\) that can fit the given data (Fig. 2d) and that precise estimation is not possible due to the identifiability issue. Second, when \({I}_{T}\) is higher than \({K}_{{ic}}\) but not \({K}_{{iu}}\) (\({K}_{{ic}} \, \le \, {I}_{T} \, \ll \, {K}_{{iu}}\)), the dark region is broadly distributed only in the \({K}_{{iu}}\) direction (Fig. 2e). This indicates that pairs whose \({K}_{{ic}}\) is similar to the true value but whose \({K}_{{iu}}\) differs from it can fit the given data (Fig. 2f), and that precise estimation is possible for \({K}_{{ic}}\) but not for \({K}_{{iu}}\) due to the identifiability issue for \({K}_{{iu}}\). Third, conversely, when \({I}_{T}\) is higher than \({K}_{{iu}}\) but not \({K}_{{ic}}\) (\({K}_{{iu}}\le {I}_{T}\ll {K}_{{ic}}\)), the dark region is broadly distributed only in the \({K}_{{ic}}\) direction (Fig. 2g). This indicates that pairs whose \({K}_{{iu}}\) is similar to the true value but whose \({K}_{{ic}}\) differs from it can fit the given data (Fig. 2h), and that precise estimation is possible for \({K}_{{iu}}\) but not for \({K}_{{ic}}\) due to the identifiability issue for \({K}_{{ic}}\). Finally, when \({I}_{T}\) is at least as high as both \({K}_{{ic}}\) and \({K}_{{iu}}\) (\({I}_{T} \, \ge \, {K}_{{ic}}\) and \({I}_{T} \, \ge \, {K}_{{iu}}\)), the dark region is narrowly distributed (Fig. 2i). This indicates that only pairs of \({K}_{{ic}}\) and \({K}_{{iu}}\) with values similar to the true inhibition constants can fit the given data (Fig. 2j), and that precise estimation is possible without an identifiability issue.

These changes in precision depending on \({I}_{T}\) occur because Eq. 1 reduces to a different approximate form for each \({I}_{T}\) setup. When \({I}_{T} \, \ll \, {K}_{{ic}}\) and \({I}_{T} \, \ll \, {K}_{{iu}}\), the terms involving \({K}_{{ic}}\) and \({K}_{{iu}}\) are both negligible in Eq. 1 since \(1+\frac{{I}_{T}}{{K}_{{ic}}}\approx 1\) and \(1+\frac{{I}_{T}}{{K}_{{iu}}}\approx 1\). This results in an identifiability issue for both \({K}_{{ic}}\) and \({K}_{{iu}}\). As \({I}_{T}\) increases above one of the inhibition constants (i.e., \({K}_{{ic}} \, \le \, {I}_{T} \, \ll \, {K}_{{iu}}\) or \({K}_{{iu}} \, \le \, {I}_{T} \, \ll \, {K}_{{ic}}\)), the term involving the inhibition constant that is much higher than \({I}_{T}\) becomes negligible since \(1+\frac{{I}_{T}}{{K}_{{iu}}}\approx 1\) or \(1+\frac{{I}_{T}}{{K}_{{ic}}}\approx 1\), respectively. This results in an identifiability issue only for the omitted inhibition constant.
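
As a worked restatement of these approximations (obtained by substituting the stated limits into Eq. 1, not an additional result), the three reduced forms can be written out explicitly. When \({I}_{T} \ll {K}_{{ic}}\) and \({I}_{T} \ll {K}_{{iu}}\), neither inhibition constant remains in the expression:

$${V}_{0}\approx \frac{{V}_{\max }{S}_{T}}{{K}_{M}+{S}_{T}}.$$

When \({K}_{{ic}} \le {I}_{T} \ll {K}_{{iu}}\), only \({K}_{{ic}}\) remains:

$${V}_{0}\approx \frac{{V}_{\max }{S}_{T}}{{K}_{M}\left(1+\frac{{I}_{T}}{{K}_{{ic}}}\right)+{S}_{T}}.$$

When \({K}_{{iu}} \le {I}_{T} \ll {K}_{{ic}}\), only \({K}_{{iu}}\) remains:

$${V}_{0}\approx \frac{{V}_{\max }{S}_{T}}{{K}_{M}+{S}_{T}\left(1+\frac{{I}_{T}}{{K}_{{iu}}}\right)}.$$

In each case, a constant that has dropped out of the approximate expression cannot be constrained by the data, which is the identifiability issue seen in Fig. 2c, e, g.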

In conclusion, contrary to the canonical approach varying \({I}_{T}\), our results suggest that \({I}_{T}\) lower than \({K}_{{ic}}\) and \({K}_{{iu}}\) is unnecessary for accurate and precise estimation of inhibition constants. This result is consistent with previous studies that investigated experimental design for accurate estimation of the inhibition constant with competitive and uncompetitive inhibition; those studies found that the ratio \(\frac{{I}_{T}}{{K}_{{ic}}}\) (competitive inhibition) or \(\frac{{I}_{T}}{{K}_{{iu}}}\) (uncompetitive inhibition) lower than 1 induces inaccurate estimation and type classification16. We have now generalized this for all inhibition types.

IC50 can serve as an experimental criterion for avoiding identifiability issues

In the previous section, we suggested a criterion of appropriate \({I}_{T}\) for precise estimation of inhibition constants. This criterion is based on the relationship between \({I}_{T}\) and inhibition constants. However, since the inhibition constants are not known prior to estimation, it remains challenging to determine an appropriate \({I}_{T}\) value for experiments. In contrast, \({{IC}}_{50}\) can be determined before estimating the inhibition constants (Fig. 1b). Thus, \(I{C}_{50}\) could serve as an experimental criterion for determining an appropriate \({I}_{T}\). For this, we examined the relationship between \({{IC}}_{50}\) and inhibition constants.

A key relationship between \(I{C}_{50}\) and inhibition constants comes from a well-established formula in enzyme kinetics, called the Cheng-Prusoff equation22,23 (Fig. 3a):

$$\frac{1}{I{C}_{50}}=\frac{\alpha }{{K}_{{ic}}}+\frac{1-\alpha }{{K}_{{iu}}}=\frac{1}{H\left({K}_{{ic}},{K}_{{iu}}\right)},\alpha=\frac{{K}_{M}}{{S}_{T}+{K}_{M}}.$$
(2)

Fig. 3: Using an inhibitor concentration above IC50 with IC50 regularization allows for precise estimation.

a \(I{C}_{50}\), measured with \({S}_{T}\), is a weighted harmonic mean of \({K}_{{ic}}\) and \({K}_{{iu}}\), with the weight of \(\alpha=\frac{{K}_{M}}{{S}_{T}+{K}_{M}}\) (Eq. 2). b–d Since \(I{C}_{50}\) is always greater than or equal to the smaller of \({K}_{{ic}}\) and \({K}_{{iu}}\), using \({I}_{T}\ge I{C}_{50}\) allows for precise estimation of \({K}_{{ic}}\) (b), \({K}_{{iu}}\) (c), or both (d) when \({K}_{{ic}}\) and \({K}_{{iu}}\) differ within 10-fold (see Supplementary Fig. 1 for other cases). e With known \({{IC}}_{50}\), the unknown \({K}_{{ic}}\) and \({K}_{{iu}}\) need to satisfy the weighted harmonic mean equation, \(H\left({K}_{{ic}},{K}_{{iu}}\right)=I{C}_{50}\), which can be incorporated into estimation via a regularization term (regularization error; Eq. 3). f–h The heatmap shows regularization error, where the low regularization error region (black region) corresponds to the \({K}_{{ic}}\) and \({K}_{{iu}}\) values satisfying the constraint \(H\left({K}_{{ic}},{K}_{{iu}}\right)=I{C}_{50}\). i–k Incorporating the regularization error into the fitting error (total error) allowed for precise estimation of \({K}_{{ic}}\) and \({K}_{{iu}}\) (dashed lines), reducing the low error region (black region) compared to unregularized cases (b–d). l The \(I{C}_{50}\)-based optimal approach (50-BOA) uses only a single \({I}_{T}\ge {{IC}}_{50}\) along with the \({{IC}}_{50}\) regularization, considerably reducing the number of required experiments for precise and accurate estimation, compared to the canonical approach (Fig. 1b).

This equation holds when Eq. 1 accurately predicts the initial velocity of product formation. According to this equation, \({{IC}}_{50}\) is the weighted harmonic mean of \({K}_{{ic}}\) and \({K}_{{iu}}\) with weight \(\alpha\). This indicates that \(I{C}_{50}\) satisfies one of the following three cases: \({K}_{{ic}} \, < \, I{C}_{50} \, < \, {K}_{{iu}}\) (Fig. 3b), \({K}_{{iu}} \, < \, I{C}_{50} \, < \, {K}_{{ic}}\) (Fig. 3c), or \({K}_{{ic}} \approx I{C}_{50} \approx {K}_{{iu}}\) (Fig. 3d). Thus, if \({I}_{T} \, \ge \, {{IC}}_{50}\) in the experiment, precise estimation is possible for only \({K}_{{ic}}\) in the first case (Fig. 3b) and only \({K}_{{iu}}\) in the second case (Fig. 3c). On the other hand, in the third case, precise estimation is possible for both \({K}_{{ic}}\) and \({K}_{{iu}}\) (Fig. 3d).
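
The relationship in Eq. 2 can be checked numerically with a short Python sketch: computing \(I{C}_{50}\) as the weighted harmonic mean and substituting it into Eq. 1 should give half of the uninhibited velocity. The function names and parameter values are illustrative assumptions.

```python
def initial_velocity(s_t, i_t, v_max, k_m, k_ic, k_iu):
    """Eq. 1 (general mixed inhibition model)."""
    return v_max * s_t / (k_m * (1.0 + i_t / k_ic) + s_t * (1.0 + i_t / k_iu))


def ic50_from_constants(k_ic, k_iu, s_t, k_m):
    """Eq. 2: IC50 as the weighted harmonic mean H(K_ic, K_iu)."""
    alpha = k_m / (s_t + k_m)  # weight on the K_ic term
    return 1.0 / (alpha / k_ic + (1.0 - alpha) / k_iu)


# Illustrative values: K_M = 1 uM, S_T = K_M, K_ic = 1 uM, K_iu = 10 uM.
K_M, V_MAX, S_T = 1.0, 0.1, 1.0
k_ic, k_iu = 1.0, 10.0

ic50 = ic50_from_constants(k_ic, k_iu, S_T, K_M)
activity = (
    initial_velocity(S_T, ic50, V_MAX, K_M, k_ic, k_iu)
    / initial_velocity(S_T, 0.0, V_MAX, K_M, k_ic, k_iu)
)
print(ic50, activity)  # IC50 lies between K_ic and K_iu; activity is about 0.5
```

Because a weighted harmonic mean always lies between its two arguments, the computed \(I{C}_{50}\) falls between \({K}_{{ic}}\) and \({K}_{{iu}}\), which is the first of the three cases listed above.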

Incorporating IC50-based regularization enhances precision across a broader range

Although using \({I}_{T} \, \ge \, I{C}_{50}\) is sufficient for the precise estimation of at least one inhibition constant, identifiability issues persist for the other inhibition constant when it is considerably greater than \(I{C}_{50}\) (Fig. 3b, c). To address this limitation, we incorporated \(I{C}_{50}\) information into the estimation process to enhance the precision. Specifically, given that \(I{C}_{50}\) is known, the inhibition constants must satisfy Eq. 2 (i.e., \(I{C}_{50}=H\left({K}_{{ic}},\,{K}_{{iu}}\right)\); Fig. 3e). To incorporate this constraint into the estimation process, we defined a regularization error as the square of the relative error between \(I{C}_{50}\) and \(H\left({K}_{{ic}},{K}_{{iu}}\right)\) (Fig. 3e–h):

$${\left[\frac{I{C}_{50}-H\left({K}_{{ic}},{K}_{{iu}}\right)}{I{C}_{50}}\right]}^{2}.$$
(3)

We then combined this regularization error, weighted by a regularization constant (\(\lambda\)), with the fitting error to define a loss function (total error): total error = fitting error + \(\lambda \times\) regularization error. The value of \(\lambda\) was selected to minimize the cross-validation error (Supplementary Fig. 2; see “Methods” for details). We then estimated the inhibition constants by minimizing the total error, rather than the fitting error. This allows for the incorporation of both initial velocity data and \(I{C}_{50}\) information simultaneously in the estimation process.
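
A minimal Python sketch of this loss function is given below. It assumes noise-free data, a fitting error normalized by the observed velocity, an \(I{C}_{50}\) measured at \({S}_{T}={K}_{M}\) (so \(\alpha=0.5\)), and a fixed illustrative \(\lambda=1\) in place of the cross-validated value; all function names are hypothetical.

```python
def initial_velocity(s_t, i_t, v_max, k_m, k_ic, k_iu):
    """Eq. 1 (general mixed inhibition model)."""
    return v_max * s_t / (k_m * (1.0 + i_t / k_ic) + s_t * (1.0 + i_t / k_iu))


def fitting_error(data, v_max, k_m, k_ic, k_iu):
    """Mean squared relative error over (s_t, i_t, v_obs) observations."""
    squared = [
        ((v_obs - initial_velocity(s_t, i_t, v_max, k_m, k_ic, k_iu)) / v_obs) ** 2
        for s_t, i_t, v_obs in data
    ]
    return sum(squared) / len(squared)


def weighted_harmonic_mean(k_ic, k_iu, alpha):
    """H(K_ic, K_iu) from Eq. 2."""
    return 1.0 / (alpha / k_ic + (1.0 - alpha) / k_iu)


def regularization_error(ic50, alpha, k_ic, k_iu):
    """Eq. 3: squared relative error between IC50 and H(K_ic, K_iu)."""
    return ((ic50 - weighted_harmonic_mean(k_ic, k_iu, alpha)) / ic50) ** 2


def total_error(data, ic50, alpha, lam, v_max, k_m, k_ic, k_iu):
    """Total error = fitting error + lambda * regularization error."""
    return fitting_error(data, v_max, k_m, k_ic, k_iu) + lam * regularization_error(
        ic50, alpha, k_ic, k_iu
    )


# Illustrative setup: true K_ic = 1 uM, K_iu = 10 uM, single I_T = 2 uM.
V_MAX, K_M, ALPHA, LAM, I_T = 1.0, 1.0, 0.5, 1.0, 2.0
ic50 = weighted_harmonic_mean(1.0, 10.0, ALPHA)
data = [(s, I_T, initial_velocity(s, I_T, V_MAX, K_M, 1.0, 10.0)) for s in (0.2, 1.0, 5.0)]

# A candidate with the correct K_ic but a 10-fold overestimated K_iu.
fit_wrong = fitting_error(data, V_MAX, K_M, 1.0, 100.0)
total_wrong = total_error(data, ic50, ALPHA, LAM, V_MAX, K_M, 1.0, 100.0)
total_true = total_error(data, ic50, ALPHA, LAM, V_MAX, K_M, 1.0, 10.0)
print(fit_wrong, total_wrong, total_true)
```

The regularization term adds a penalty to candidates that fit the velocity data reasonably well but are inconsistent with the measured \(I{C}_{50}\), while leaving the error at the true pair unchanged.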

This approach yielded more precise estimations of both inhibition constants compared to just using the fitting error. That is, when \({K}_{{iu}} \approx I{C}_{50} \approx {K}_{{ic}}\) (Fig. 3d), estimation with the total error further enhanced precision (Fig. 3k). Importantly, even when \(I{C}_{50}\) is lower than one of the inhibition constants, precise estimation becomes possible. Specifically, when \({K}_{{ic}} \, < \, I{C}_{50} \, < \, {K}_{{iu}}\), precise estimation of both constants was achieved using the total error (Fig. 3i), which was not possible for \({K}_{{iu}}\) using only the fitting error (Fig. 3b). Similarly, when \({K}_{{iu}} \, < \, I{C}_{50} \, < \, {K}_{{ic}}\), both constants were precisely estimated using the total error (Fig. 3j), which was not possible for \({K}_{{ic}}\) with only the fitting error (Fig. 3c). However, in the competitive case (\({K}_{{iu}}\) is much greater than \({K}_{{ic}}\)) and the uncompetitive case (\({K}_{{ic}}\) is much greater than \({K}_{{iu}}\)), precise estimation of \({K}_{{iu}}\) and \({K}_{{ic}}\), respectively, is not possible even with the regularization (Supplementary Fig. 1). Note that for the competitive (or uncompetitive) case, the estimation of \({K}_{{iu}}\) (or \({K}_{{ic}}\)) is not critical.

Since this approach relies on \(I{C}_{50}\), the error in the estimation of \(I{C}_{50}\) could propagate to the estimation of inhibition constants. Nevertheless, the error of \(I{C}_{50}\) has minimal effect on the estimation of the dominant inhibition constants: \({K}_{{ic}}\) when \({K}_{{ic}} \, < \, I{C}_{50} \, < \, {K}_{{iu}}\) (Supplementary Fig. 3a), \({K}_{{iu}}\) when \({K}_{{iu}} \, < \, I{C}_{50} \, < \, {K}_{{ic}}\) (Supplementary Fig. 3e), and both \({K}_{{ic}}\) and \({K}_{{iu}}\) when \({K}_{{ic}} \approx I{C}_{50} \approx {K}_{{iu}}\) (Supplementary Fig. 3g, h). This robustness arises because as the bias in \(I{C}_{50}\) increases, the influence of the regularization term is automatically reduced through a decrease in \(\lambda\) (Supplementary Fig. 3c, f, i). This adjustment is facilitated by the cross-validation, which selects \(\lambda\) by evaluating how well the estimates obtained with the \(I{C}_{50}\) information align with the initial velocity data (Supplementary Fig. 2; see “Methods” for details). Consequently, this approach enables precise estimations of the dominant inhibition constants even when \(I{C}_{50}\) is biased.

Based on these findings, we propose an \(I{C}_{50}\)-based optimal approach for precise estimation of inhibition constants that we call 50-BOA (Fig. 3l): First, \(I{C}_{50}\) is estimated using the same method as in the canonical approach (Fig. 3l (i)). After estimating \(I{C}_{50}\), an experimental design is established using \({S}_{T}\) at \(0.2{K}_{M}\), \({K}_{M}\), and \(5{K}_{M}\) and a single \({I}_{T}\) that satisfies \({I}_{T} \, \ge \, I{C}_{50}\) (Fig. 3l (ii)). For each combination of concentrations, the initial velocity is measured. Then, by fitting Eq. 1 to the data with the regularization term (Eq. 3), the inhibition constants minimizing the total error are estimated (Fig. 3l (iii)). The 50-BOA is expected to achieve a precise, accurate, and efficient estimation of the inhibition constants, as it uses only the measurements necessary for precise estimation from those used in the canonical approach.
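
The three steps above can be sketched end to end in Python. This is a simplified illustration under several assumptions, not the published MATLAB or R implementation: \({V}_{\max }\) and \({K}_{M}\) are treated as known, the data are noise-free, \(\lambda\) is fixed at 1 rather than chosen by cross-validation, and a coarse log-spaced grid search stands in for a numerical optimizer. All names are hypothetical.

```python
import itertools


def initial_velocity(s_t, i_t, v_max, k_m, k_ic, k_iu):
    """Eq. 1 (general mixed inhibition model)."""
    return v_max * s_t / (k_m * (1.0 + i_t / k_ic) + s_t * (1.0 + i_t / k_iu))


def harmonic_mean(k_ic, k_iu, alpha):
    """H(K_ic, K_iu) from Eq. 2."""
    return 1.0 / (alpha / k_ic + (1.0 - alpha) / k_iu)


def total_error(data, i_t, ic50, alpha, lam, v_max, k_m, k_ic, k_iu):
    """Fitting error (mean squared relative error) plus lambda times Eq. 3."""
    fit = sum(
        ((v_obs - initial_velocity(s_t, i_t, v_max, k_m, k_ic, k_iu)) / v_obs) ** 2
        for s_t, v_obs in data
    ) / len(data)
    reg = ((ic50 - harmonic_mean(k_ic, k_iu, alpha)) / ic50) ** 2
    return fit + lam * reg


def estimate_50boa(data, i_t, ic50, alpha, lam, v_max, k_m, candidates):
    """Return the (K_ic, K_iu) grid pair that minimizes the total error."""
    return min(
        itertools.product(candidates, candidates),
        key=lambda pair: total_error(data, i_t, ic50, alpha, lam, v_max, k_m, *pair),
    )


# Step (i): IC50 measured at S_T = K_M, so alpha = 0.5 (simulated here).
V_MAX, K_M, ALPHA, LAM = 1.0, 1.0, 0.5, 1.0
TRUE_K_IC, TRUE_K_IU = 1.0, 10.0
ic50 = harmonic_mean(TRUE_K_IC, TRUE_K_IU, ALPHA)

# Step (ii): three S_T values and a single I_T >= IC50.
i_t = 2.0
data = [
    (s_t, initial_velocity(s_t, i_t, V_MAX, K_M, TRUE_K_IC, TRUE_K_IU))
    for s_t in (0.2 * K_M, K_M, 5.0 * K_M)
]

# Step (iii): minimize the total error over a log grid from 0.01 to 100 uM.
candidates = [10.0 ** (-2 + i / 4) for i in range(17)]
estimate = estimate_50boa(data, i_t, ic50, ALPHA, LAM, V_MAX, K_M, candidates)
print(estimate)
```

With real, noisy measurements the minimum would not generally sit exactly on a grid point, so a continuous optimizer and confidence intervals would be needed; the grid search is used here only to keep the sketch short and transparent.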

The 50-BOA demonstrates precise, accurate, and efficient estimation for mixed inhibition experimental data

We evaluated whether the 50-BOA could precisely, accurately, and efficiently estimate inhibition constants in the case of mixed inhibition using real experimental data. For this, we utilized the triazolam (substrate) and ketoconazole (inhibitor) pair data derived from experiments varying \({S}_{T}\) (\(10\) to \(500\) μM) and \({I}_{T}\) (\(0\), \(0.01\) to \(0.5\) μM), spanning ranges around \({K}_{M}=72\) μM and \(I{C}_{50}=0.04\) μM, respectively20 (Fig. 4a).

Fig. 4: The 50-BOA enables precise and accurate estimation for experimental mixed inhibition data.

a The normalized initial velocities (\(\frac{{V}_{0}}{{V}_{\max }}\); \({V}_{\max }=1.02\) μM/min/mg protein) of triazolam with its inhibitor ketoconazole (\(I{C}_{50}=0.040\) μM when \({S}_{T}=50\) μM; \({K}_{M}=72\) μM) were measured for various triazolam (\({S}_{T}=10\), \(25\), \(50\), \(100\), \(250\), and \(500\) μM) and ketoconazole (\({I}_{T}=0\), \(0.01\), \(0.025\), \(0.05\), \(0.1\), \(0.25\), and \(0.5\) μM) concentration combinations, and fitted with Eq. 1 (solid lines). b Heatmaps of the total errors without (upper row) and with (lower row) regularization for entire or single \({I}_{T}\) setups. Using the entire \({I}_{T}\) range (i.e., canonical condition) led to precise \({K}_{{ic}}\) and \({K}_{{iu}}\) estimation, identifying mixed inhibition. For single \({I}_{T}\) setups, when \({I}_{T} \, < \, I{C}_{50}\) (e.g., \({I}_{T}=0.01\) μM), estimation of \({K}_{{ic}}\) and \({K}_{{iu}}\) was imprecise, as indicated by the low-error markers at \({K}_{{ic}}=0.041\) μM, \({K}_{{iu}}=0.041\) μM (circle), \({K}_{{ic}}=0.041\) μM, \({K}_{{iu}}=100\) μM (triangle), and \({K}_{{ic}}=100\) μM, \({K}_{{iu}}=0.041\) μM (rectangle). However, precision became comparable to the canonical condition as \({I}_{T}\) increased through \({I}_{T}\approx I{C}_{50}\) (e.g., \({I}_{T}=0.05\) μM) to \({I}_{T} > I{C}_{50}\) (e.g., \({I}_{T}=0.5\) μM). Regularization enhanced precision to match the canonical condition. c, d Estimated parameters (dots and asterisks) and 95% confidence intervals (error bars). Utilizing a single \({I}_{T}\) lower than \(I{C}_{50}\) led to inaccurate and imprecise estimations for \({K}_{{ic}}\) and \({K}_{{iu}}\) (i.e., out of the 1.5-fold range (dotted lines) of the estimated parameters (triangle)). On the other hand, the 50-BOA (e.g., \({I}_{T}=0.5\) μM; diamonds) allows for precise and accurate estimation of both parameters comparable to the canonical condition (c, d). e The 50-BOA (diamonds) requires only one seventh of the experiments required for the canonical condition (triangle) with comparable precision and accuracy.

Using the entire \({S}_{T}\) and \({I}_{T}\) dataset (canonical condition), both inhibition constants were precisely estimated (\({K}_{{ic}}=0.041\) μM, \({K}_{{iu}}=0.041\) μM), as demonstrated by the narrow low-error region in the landscape (Fig. 4b) and the narrow confidence intervals (CIs; Fig. 4c, d). As the estimated \({K}_{{ic}}\) and \({K}_{{iu}}\) values did not considerably differ from each other, the inhibition was classified as the mixed type.

Next, we evaluated the precision, accuracy, and efficiency of the estimation using the data from each single \({I}_{T}\) only. When \({I}_{T} \, < \, I{C}_{50}\), both inhibition constants were imprecisely estimated even with regularization, as shown by the broader low-error (i.e., black) region in the landscape (Fig. 4b; \({I}_{T}=0.01\) μM) and the wider CIs (Fig. 4c, d; \({I}_{T}=0.01,\) \(0.025\) μM) compared to the canonical condition. This imprecision could lead to misclassification of the inhibition type. For instance, \({K}_{{ic}}=0.041\) μM, \({K}_{{iu}}=0.041\) μM (mixed; circle in Fig. 4b), \({K}_{{ic}}=0.041\) μM, \({K}_{{iu}}=100\) μM (competitive; triangle in Fig. 4b), and \({K}_{{ic}}=100\) μM, \({K}_{{iu}}=0.041\) μM (uncompetitive; rectangle in Fig. 4b) all showed low error in the landscape (Fig. 4b; \({I}_{T}=0.01\) μM). Additionally, the estimated \({K}_{{ic}}\) (Fig. 4c; \({I}_{T}=0.025\) μM) and \({K}_{{iu}}\) (Fig. 4d; \({I}_{T}=0.01\) μM) values fell outside the 1.5-fold ranges of those from the canonical condition, indicating inaccurate estimations. Here, the 1.5-fold range is a widely accepted criterion for evaluating prediction accuracy in the field of drug metabolism and pharmacokinetics24,25,26,27,28,29,30,31,32.

The imprecision and inaccuracy of the estimation with \({I}_{T} \, < \, I{C}_{50}\) were gradually resolved as \({I}_{T}\) increased. When \({I}_{T}\approx I{C}_{50}\), both inhibition constants were accurately estimated with all estimates falling within 1.5-fold ranges (Fig. 4c, d; \({I}_{T}=0.05\) μM). However, the estimation still showed a broader low-error region in the landscape (Fig. 4b; \({I}_{T}=0.05\) μM) and wider CIs (Fig. 4c, d; \({I}_{T}=0.05\) μM) compared to the canonical condition. When \({I}_{T}\) was further increased so that \({I}_{T} \, > \, I{C}_{50}\), both inhibition constants were precisely and accurately estimated, showing a narrow low-error region in the landscape (Fig. 4b; \({I}_{T}=0.5\) μM) and CIs (Fig. 4c, d; \({I}_{T}=0.25,\) \(0.5\) μM) comparable to the canonical approach. This precision was further enhanced by adding regularization (Fig. 4c, d; 50-BOA). Therefore, the 50-BOA achieves precise and accurate estimation comparable to the canonical condition while substantially enhancing efficiency by reducing the number of required experiments to one seventh (Fig. 4e).

Notably, the low-error region when using single \({I}_{T} \, > \, I{C}_{50}\) data (Fig. 4b; \({I}_{T}=0.5\) μM) was more narrowly defined than that observed in the canonical approach (Fig. 4b; Entire \({I}_{T}\)). This finding indicates that using more data did not improve precision, which is counterintuitive. To investigate this, we analyzed how noise in the data affects estimation depending on the \({I}_{T}\) level. Our analysis revealed that estimation is considerably more sensitive to experimental noise in \({I}_{T} \, < \, I{C}_{50}\) data compared to \({I}_{T} \, > \, I{C}_{50}\) data (Supplementary Fig. 4a). Consequently, adding \({I}_{T} \, < \, I{C}_{50}\) data to \({I}_{T} \, > \, I{C}_{50}\) data increased the sensitivity of the estimation to noise, ultimately decreasing precision compared to using \({I}_{T} \, > \, I{C}_{50}\) data exclusively (Supplementary Fig. 4b).

The 50-BOA demonstrates precise, accurate, and efficient estimation for competitive inhibition experimental data

We further evaluated whether the 50-BOA could also achieve precise, accurate, and efficient estimation in the case of competitive or uncompetitive inhibition using real experimental data. We utilized the chlorzoxazone (substrate) and ethambutol (inhibitor) pair data derived from experiments varying \({S}_{T}\) (\(12.5\) to \(100\) μM) and \({I}_{T}\) (\(0\), \(0.41\) to \(11.1\) μM) spanning around \({K}_{M}=39.1\) μM and \(I{C}_{50}=4\) μM, respectively33 (Fig. 5a).

Fig. 5: The 50-BOA allows for precise and accurate estimation for experimental competitive inhibition data.
figure 5

a The normalized initial velocities (\(\frac{{V}_{0}}{{V}_{\max }}\); \({V}_{\max }=11.77\) μM/min−1\(/{{\rm{mg}}}\) \({{\rm{protein}}}\)) of chlorzoxazone with its inhibitor ethambutol (\({{{\rm{I}}}}{{{{\rm{C}}}}}_{50}=4\) μM when \({S}_{T}=50\) μM\({{;}}\) \({K}_{M}=39.1\) μM) were measured for various chlorzoxazone (\({S}_{T}=12.5\), \(25\), \(50\), \(75\), and \(100\) μM) and ethambutol (\({I}_{T}=0\), \(0.41\), \(1.23\), \(3.7\), and \(11.1\) μM) concentration combinations and fitted with Eq. 1 (solid lines). b Heatmaps of the total errors without (upper row) and with (lower row) regularization for entire \({I}_{T}\) or single \({I}_{T}\) setups. Using the entire \({I}_{T}\) (i.e., canonical condition) led to precise \({K}_{{ic}}\) estimation, while \({K}_{{iu}}\) estimation remained imprecise in a wide range even with regularization, indicating competitive inhibition. For single \({I}_{T}\) setups, the precision of the \({K}_{{ic}}\) estimation became comparable to the canonical condition, as \({I}_{T}\) increased from \({I}_{T} \, < \, {{{\rm{I}}}}{{{{\rm{C}}}}}_{50}\) (e.g., \({I}_{T}=0.41\) μM) through \({I}_{T}\approx {{{\rm{I}}}}{{{{\rm{C}}}}}_{50}\) (e.g., \({I}_{T}=3.7\) μM) to \({I}_{T} \, > \, {{{\rm{I}}}}{{{{\rm{C}}}}}_{50}\) (e.g., \({I}_{T}=11.1\) μM). c The \({K}_{{ic}}\) values (dots) with 95% confidence intervals (error bars) estimated with regularization. Using a single \({I}_{T}\) lower than \({{{\rm{I}}}}{{{{\rm{C}}}}}_{50}\) led to inaccurate or imprecise estimation (i.e., out of the 1.5-fold range (dotted lines) of the estimated \({K}_{{ic}}\) (triangle)), while the 50-BOA (diamond) enables precise and accurate estimation comparable to the canonical condition. d The asymptotic distribution of estimated \({K}_{{iu}}\) (see “Methods” for details) from the canonical condition is located at a substantially high value (\(\approx {10}^{16}\) μM; dotted rectangle), reflecting the competitive inhibition type. 
Using a single \({I}_{T}\) lower than \({{{\rm{I}}}}{{{{\rm{C}}}}}_{50}\) led to an unexpected mode near \(10\) μM (dashed rectangle), indicating potential misclassification of the inhibition type as mixed. Conversely, the 50-BOA provides \({K}_{{iu}}\) distribution consistent with the canonical condition, avoiding such misclassification. e The 50-BOA (diamond) requires only one fifth of the experiments required for the canonical condition (triangle) with comparable precision and accuracy.

Using the entire \({S}_{T}\) and \({I}_{T}\) dataset (canonical condition), only \({K}_{{ic}}\) was precisely estimated, as demonstrated by the narrow vertical region showing low error in landscape (Fig. 5b) and confidence interval (CI) for \({K}_{{ic}}\) (Fig. 5c). This vertical region extended within a range of much higher \({K}_{{iu}}\) than \({K}_{{ic}}\) with and without regularization, indicating the competitive inhibition type. The competitive type was also indicated by the unimodal distribution of the estimated \({K}_{{iu}}\) located near \({K}_{{iu}}={10}^{16}\) μM \(\gg \, {K}_{{ic}}\) (Fig. 5d).

Next, we evaluated the precision, accuracy, and efficiency of the estimation using data from each single \({I}_{T}\) only. When \({I}_{T} \, < \, I{C}_{50}\), \({K}_{{ic}}\) was imprecisely and inaccurately estimated, as shown by the broader low error region in the landscape (Fig. 5b; \({I}_{T}=0.41\) μM) and confidence interval (CI; Fig. 5c; \({I}_{T}=1.23\) μM) compared to the canonical condition. The estimated \({K}_{{ic}}\) value also fell outside the 1.5-fold range of that from the canonical condition, indicating inaccurate estimation (Fig. 5c; \({I}_{T}=0.41\) μM). Additionally, the classification of inhibition type was imprecise, as the low error region in the landscape (Fig. 5b; \({I}_{T}=0.41\) μM) extended to around \(10\) μM (mixed). The imprecise type classification was also demonstrated by the distribution of the estimated \({K}_{{iu}}\) (Fig. 5d; \({I}_{T}=0.41,\) \(1.23\) μM), which had two modes located near \({K}_{{iu}}=10\) μM\(\, \gtrsim \, {K}_{{ic}}\) (mixed) and \({K}_{{iu}}={10}^{16}\) μM\(\, \gg \,{K}_{{ic}}\) (competitive).

The imprecision and inaccuracy of the estimation with \({I}_{T}\, < \,I{C}_{50}\) were gradually resolved as \({I}_{T}\) increased over \(I{C}_{50}\). When \({I}_{T}\approx I{C}_{50}\), \({K}_{{ic}}\) was precisely and accurately estimated, shown by the narrow vertical region showing low error in landscape (Fig. 5b; \({I}_{T}=3.7\) μM) and CIs (Fig. 5c; \({I}_{T}=3.7\) μM) for \({K}_{{ic}}\) comparable to the canonical condition. Estimated \({K}_{{ic}}\) also fell within the 1.5-fold range (Fig. 5c; \({I}_{T}=3.7\) μM). However, type classification remained imprecise as the distribution of the estimated \({K}_{{iu}}\) ranged from around \(10\) μM to \({10}^{16}\) μM (Fig. 5d; \({I}_{T}=3.7\) μM). When \({I}_{T}\) was further increased so that \({I}_{T}\, > \,I{C}_{50}\), type classification was precise, as shown by the unimodal distribution of estimated \({K}_{{iu}}\) located near \({K}_{{iu}}={10}^{16}\) μM\(\, \gg \,{K}_{{ic}}\) (Fig. 5d; \({I}_{T}=11.1\) μM; 50-BOA). As \({K}_{{ic}}\, \ll \,{K}_{{iu}}\), this inhibition was successfully classified as the competitive type. Therefore, the 50-BOA provided a precise and accurate estimation comparable to the canonical condition, while achieving greater efficiency by reducing the number of required experiments to one fifth (Fig. 5e).

Extending the 50-BOA to systems with multiple substrate types or cooperativity

The 50-BOA was originally developed based on the mixed inhibition model (Eq. 1), which describes typical enzyme inhibition involving a single substrate and where initial velocities follow a hyperbolic relationship with \({S}_{T}\). However, when multiple substrate types are involved, or when enzyme-substrate binding or enzyme-inhibitor binding exhibits cooperativity, the mixed inhibition model fails to adequately describe the systems. Such systems require alternative models34. This raises the question of whether the 50-BOA, using data points from a single high \({I}_{T}\) with \(I{C}_{50}\)-based regularization, remains effective for systems described by the alternative models, enabling accurate and precise estimation of inhibition constants. To investigate this, we examined three systems that differ from the system described by the mixed inhibition model (Eq. 1), requiring alternative models for initial velocities34: a system in which two different substrates bind to an enzyme at distinct binding sites (see Supplementary Notes 1.1 and 1.2 for details); a system exhibiting cooperative enzyme-substrate binding (see Supplementary Notes 1.3 and 1.4 for details); and a system exhibiting cooperative enzyme-inhibitor binding (see Supplementary Notes 1.5 and 1.6 for details). For each system, we applied the 50-BOA with a loss function modified to align with the respective alternative models for initial velocities. We found that, even under conditions diverging from the mixed inhibition model, the 50-BOA provided precise and accurate estimations comparable to those obtained under canonical conditions. These results suggest the extensibility of the 50-BOA approach to more complex and realistic systems.

Discussion

Traditionally, estimating inhibition constants (\({K}_{{ic}}\), \({K}_{{iu}}\)) involves fitting the inhibition model (Eq. 1) to initial reaction velocity data obtained from various substrate and inhibitor concentrations (Fig. 1). It is well established that using a range of substrate and inhibitor concentrations ensures accurate and precise estimation of inhibition constants when the type of inhibition (e.g., competitive and uncompetitive) is known in advance16. However, identifying the inhibition type often requires accurate and precise estimation of inhibition constants, giving rise to a “chicken and egg” problem. This problem can be addressed when one can accurately and precisely estimate the inhibition constants of the mixed inhibition model, which covers all inhibition types and thus does not require prior knowledge.

Here, we identified the optimal design by examining the error landscape (Fig. 2). Surprisingly, accurate and precise estimation of both \({K}_{{ic}}\) and \({K}_{{iu}}\) of the mixed inhibition model is possible using initial velocities measured at a single inhibitor concentration greater than \(I{C}_{50}\) (Fig. 3). This successful estimation with a much smaller amount of data compared to the conventional experimental condition (i.e., a wide range of inhibitor concentrations) was made possible by incorporating the relationship between \({K}_{{ic}}\), \({K}_{{iu}}\), and \(I{C}_{50}\) as a regularization term (Eqs. 2, 3) in the estimation process along with conventional data fitting. Based on these results, we proposed the 50-BOA (\(I{C}_{50}\)-Based Optimal Approach), which enables accurate and precise estimation with less experimental data compared to the canonical approach (Fig. 3). The 50-BOA reduces the consumption of reagents and enzymes and minimizes the necessary experimental repetitions. As a result, it can enhance the overall efficiency of the evaluation and development processes in fields such as drug discovery12,23 and food technology9, ultimately providing cost and labor savings.

We demonstrated that, even for systems with multiple substrates (Supplementary Notes 1.1 and 1.2), enzyme-substrate binding cooperativity (Supplementary Notes 1.3 and 1.4), or enzyme-inhibitor binding cooperativity (Supplementary Notes 1.5 and 1.6), the 50-BOA can precisely and accurately estimate inhibition constants with a modified loss function based on alternative initial velocity models. This suggests the potential of the 50-BOA approach when applied to more complex and realistic systems, such as those in which both enzyme-substrate and enzyme-inhibitor binding exhibit cooperativity, or where a cooperative inhibitor binds to an enzyme through both competitive and uncompetitive mechanisms. Investigating these scenarios further would be valuable.

To maximize the utility and accessibility of the 50-BOA, we provide the automated 50-BOA as a package in MATLAB and R. This package supports not only systems with mixed inhibition (Eq. 1) but also those involving multiple substrate types (Supplementary Fig. 5) or cooperativity (Supplementary Figs. 6, 7), enabling users to easily verify the accuracy and precision of inhibition constant estimations visually. Users simply supply the initial reaction velocity data with the corresponding substrate and inhibitor concentrations in Excel format, together with \({V}_{\max }\), \({K}_{M}\), and \(I{C}_{50}\). The package then returns the estimated inhibition constants and an error landscape (see Supplementary Notes 1.8 for details). By examining the width of the low-error region in the landscape, users can intuitively assess the precision of their estimation, and the shape of the error landscape allows simultaneous confirmation of the inhibition type.

To further enhance flexibility, the package allows incorporation of various error structures. By default, the 50-BOA is based on the sum of squared relative errors, assuming normally distributed initial velocity measurement errors with standard deviations proportional to the magnitude of the initial velocity, consistent with prior studies17. However, error structures in real systems can deviate from this assumption, with error standard deviation potentially depending on different powers of initial velocities or on \({S}_{T}\) and \({I}_{T}\), rather than solely on initial velocities35. We found that across various error structures, the 50-BOA provides precise and accurate estimates comparable to the conventional approach whenever error standard deviation is positively correlated with initial velocity (Supplementary Fig. 9a–o). In contrast, when error standard deviation is not positively correlated with initial velocity, neither the 50-BOA nor the conventional approach achieves precision and accuracy (Supplementary Fig. 9p–q). Importantly, modifying the loss functions to account for these error structures (Supplementary Notes 1.7) enabled both the 50-BOA and the conventional approach to achieve precise and accurate estimates (Supplementary Fig. 9r). This finding aligns with a previous study emphasizing the critical role of incorporating the correct error structure for precise kinetic data analysis35. To facilitate this, we have made the 50-BOA package available for incorporating custom error structures (see Supplementary Notes 1.8 for details).

While the package supports such modifications, this approach requires accurately identifying the error structure beforehand. If identifying the error structure is challenging, an alternative is to perform replicate experiments to obtain reliable point estimates of the initial velocity at a number of experimental points equal to the number of parameters to be estimated. These initial velocity data could then be used to solve a system of algebraic equations derived from the initial velocity equation (e.g., Eq. 1) to estimate the parameters, as suggested by a previous study36.

While the 50-BOA can achieve precision comparable to that of the canonical approach, further improvements can be made by increasing the number of experimental conditions. For this, two potential strategies can be considered: increasing the number of \({S}_{T}\) values while keeping \({I}_{T}\) constant, or using additional \({I}_{T}\) values. While both strategies improve precision, we found that adding \({I}_{T} \, > \, I{C}_{50}\) values yields a more substantial improvement in precision compared to increasing \({S}_{T}\) values while keeping \({I}_{T}\) constant (Supplementary Fig. 10). Therefore, if additional experiments are planned, focusing on \({I}_{T} \, > \, I{C}_{50}\) values higher than those originally used is an effective strategy for achieving greater precision.

Thus far, we have recommended using a single \({I}_{T}\) value that satisfies \({I}_{T} \, \ge \, I{C}_{50}\) without providing an explicit upper limit. While using the highest possible \({I}_{T}\) value appears favorable based on our analysis, practical constraints may limit this. For example, excessively high \({I}_{T}\) values may inhibit non-target enzymes, thereby reducing the selectivity of the inhibitor37. Furthermore, an excessive inhibitor concentration can completely suppress metabolite production due to strong inhibitory effects, thereby compromising the accuracy of initial velocity measurements and leading to imprecise predictions of inhibition constants. Therefore, it is generally advisable to use an \({I}_{T}\) value slightly larger than \(I{C}_{50}\) or at most a few times larger, as suggested by conventional approaches. Investigating a more precise upper limit for \({I}_{T}\), while accounting for experimental and measurement constraints, would be a valuable direction for future research.

To address the chicken-and-egg problem of identifying inhibition types and estimating parameters, previous studies have primarily focused on accurately determining the inhibition type before parameter estimation. Traditionally, the identification of the inhibition type relies on visual methods using linear versions of inhibition models, such as Lineweaver-Burk or Dixon plots38,39. However, these linearization techniques can introduce substantial errors17. A more contemporary approach involves nonlinear fitting to the models representing mixed (Eq. 1), competitive (Eq. 1 with \(1+\frac{{I}_{T}}{{K}_{{iu}}}=1\)), and uncompetitive (Eq. 1 with \(1+\frac{{I}_{T}}{{K}_{{ic}}}=1\)) types, selecting the best model based on criteria such as the coefficient of determination (\({R}^{2}\))16. However, this approach does not consider the differences in the model complexity. While the model complexity can be addressed by using alternative criteria such as AIC (Akaike information criterion) or BIC (Bayesian information criterion)16, the choice of the model selection criterion remains somewhat subjective40. These limitations have led to different inhibition types being reported even for the same enzyme inhibition19. Moreover, while a prior study suggested optimal experimental designs to distinguish mixed from competitive inhibition to enhance model discrimination, the suggested design relies on prior estimates of unknown parameters41. To address these limitations, the 50-BOA uses a mixed inhibition model that unifies all inhibition types under a single equation (Eq. 1). This allows for simultaneous estimation of inhibition constants and identification of inhibition type, without the need for distinct optimization steps for model discrimination and prior parameter estimation.

Inhibition constants can be estimated through model fitting, but they are often determined by using the simplified Cheng-Prusoff equation (Eq. 4) based solely on \(I{C}_{50}\) information21,23,42 as follows:

$$I{C}_{50}=\left\{\begin{array}{c}\begin{array}{cc}{K}_{{ic}}\left(1+\frac{{S}_{T}}{{K}_{M}}\right) & {{\rm{if \,competitive}}}\end{array}\hfill \\ \begin{array}{cc}{K}_{{iu}}\left(1+\frac{{K}_{M}}{{S}_{T}}\right) & {{\rm{if \,uncompetitive}}}\end{array}\end{array}\right..$$
(4)

This approach is known for its simplicity and high accuracy21. However, since the Cheng-Prusoff equation varies depending on the inhibition type, estimating inhibition constants solely from \(I{C}_{50}\) data requires prior knowledge. On the other hand, the 50-BOA uses the general Cheng-Prusoff equation (Eq. 2), which does not require inhibition type, as a regularization term (Eq. 3) in the estimation process. Incorporating the Cheng-Prusoff equation into the fitting process leads to more precise estimations than those from the canonical inhibition model-based method.
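As a hedged illustration of the two relations discussed above, the following Python sketch converts an \(I{C}_{50}\) into an inhibition constant with the type-specific form (Eq. 4) and evaluates a type-free form. The function names are hypothetical. The general expression is an assumption: it is what one obtains by setting the mixed-model velocity to half of its uninhibited value, and it is presumed to correspond to Eq. 2.

```python
import math


def ki_from_ic50(ic50, s_t, k_m, inhibition_type):
    """Type-specific Cheng-Prusoff conversion (Eq. 4); the type must be known in advance."""
    if inhibition_type == "competitive":
        return ic50 / (1.0 + s_t / k_m)   # returns K_ic
    if inhibition_type == "uncompetitive":
        return ic50 / (1.0 + k_m / s_t)   # returns K_iu
    raise ValueError("Eq. 4 covers only competitive or uncompetitive inhibition")


def ic50_general(k_ic, k_iu, s_t, k_m):
    """Assumed type-free relation: IC50 = (S_T + K_M) / (K_M / K_ic + S_T / K_iu)."""
    return (s_t + k_m) / (k_m / k_ic + s_t / k_iu)


# Values reported for chlorzoxazone-ethambutol: IC50 = 4 uM at S_T = 50 uM, K_M = 39.1 uM.
k_ic = ki_from_ic50(4.0, 50.0, 39.1, "competitive")
# An infinite K_iu reduces the general relation to the competitive case.
ic50_back = ic50_general(k_ic, math.inf, 50.0, 39.1)
```

Under the competitive assumption, these inputs give a \({K}_{{ic}}\) of roughly 1.76 μM, and substituting it back into the general relation with \({K}_{{iu}}\to \infty\) recovers the original \(I{C}_{50}\). The type-free form needs no such assumption, which is why it can serve as a regularization term.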

Previous studies also have investigated optimal experimental designs for estimating inhibition constants, resulting in the recommendation of D-optimal design43. D-optimal design identifies experimental conditions that maximize the determinant of the information matrix, whose inverse approximates the volume of the confidence ellipsoid for parameter estimates. It recommends measuring initial velocities multiple times at the theoretical minimum number of experimental points—equal to the number of parameters—to maximize the determinant of the information matrix. However, this design depends on the unknown true parameter values, the very target of the estimation. As a result, this approach requires prior estimates of parameters that are close to the true values to achieve precise estimation. Furthermore, the design aims to minimize the overall volume of the confidence ellipsoid rather than focusing on the confidence intervals of individual parameters. As a result, it can lead to precise estimation of one parameter at the expense of others, as noted in previous research36. In this study, we addressed these limitations of D-optimal design. Our approach provides a straightforward experimental design based on known values, such as \(I{C}_{50}\), rather than relying on unknown true parameters. Additionally, by incorporating \(I{C}_{50}\) information as a regularization term in the optimization process, we achieved precise estimation for both inhibition constants while requiring only the theoretical minimum number of experimental points, which is equal to the number of parameters being estimated (i.e., two \({S}_{T}\) values and one \({I}_{T}\) value).

This study mainly focused on enzyme inhibition scenarios where the initial velocity data is described by the mixed inhibition model (Eq. 1). However, this model fails to capture certain cases, such as direct interactions between inhibitors and substrates. In these situations, applying the mixed inhibition model results in a poor fit, revealing discrepancies between the actual dynamics and those represented by the model. However, applying the 50-BOA, which uses fewer data points than the conventional approach, might allow the mixed inhibition model to fit the data more closely because the critical data points that reveal these discrepancies are excluded. This raises a concern that the 50-BOA cannot distinguish whether the data is from a system involving direct inhibitor-substrate interactions or one involving mixed inhibition. Nevertheless, our analysis showed that when the mixed inhibition model was fitted to data obtained from a single, sufficiently high \({I}_{T}\) involving direct inhibitor-substrate interactions, the goodness-of-fit was as low as that obtained with the conventional approach (see Supplementary Notes 1.9 and Supplementary Fig. 11 for details). Therefore, even with a reduced dataset, the 50-BOA remains effective in distinguishing between direct inhibitor-substrate interactions and mixed inhibition.

Previously, we found that even when the \(I{C}_{50}\) value is biased, the 50-BOA can accurately and precisely estimate the dominant inhibition constant (e.g., \({K}_{{ic}}\) when \({K}_{{ic}} \, < \, I{C}_{50} \, < \, {K}_{{iu}}\)) (Supplementary Fig. 3a, b, d, h). However, the precise estimation of the other non-dominant inhibition constant (e.g., \({K}_{{iu}}\) when \({K}_{{ic}} \, < \, I{C}_{50} \, < \, {K}_{{iu}}\)) still requires accurate determination of \(I{C}_{50}\) (Supplementary Fig. 3e, g). Prior research has identified the minimum requirements for accurately and precisely estimating \(I{C}_{50}\)44. However, when inhibition occurs gradually with increasing inhibitor concentrations, even following these minimum requirements may result in imprecise \(I{C}_{50}\) estimates. Thus, future research could explore alternative experimental designs less dependent on \(I{C}_{50}\) when its estimation lacks precision. Additionally, it would be worth considering whether the data used to estimate \(I{C}_{50}\) could be leveraged to optimize the experimental design.

Even for typical enzyme mixed inhibition scenarios, the 50-BOA may exhibit reduced accuracy in scenarios where enzyme concentrations are high, such as in vivo experiments. The MM equation, which describes substrate metabolism, is only accurate when enzyme concentrations are sufficiently low, relative to the substrate concentration and \({K}_{M}\)45,46,47. This limitation extends to most equations based on the MM equation. For instance, drug clearance or enzyme induction equations have been shown to be inaccurate at high enzyme concentrations in vivo, necessitating alternative equations28,29,48. The same caution applies to enzyme inhibition, where the enzyme concentrations in in vitro experiments must be carefully controlled. Additionally, when dealing with in vivo conditions, it is important to be cautious about using inhibition models, which rely on the MM equation. Nonetheless, the inhibition model-based approaches have been used even in high enzyme concentration scenarios12,49. Therefore, in cases of high enzyme concentrations, whether in vivo or in vitro experiments where high enzyme concentrations are necessary50, the estimates obtained using the 50-BOA may be inaccurate. Recent studies have introduced new equations that address the inaccuracy of the MM equation, offering a more accurate, precise, and efficient estimation48,51,52,53. Since the development of equations to resolve the inaccuracy of inhibition models is ongoing46,54,55, future research could focus on developing a universally accurate and precise estimation method, building upon the methodology presented in this study.

Methods

Data

Estimating inhibition constants \({K}_{{ic}}\) and \({K}_{{iu}}\) requires initial velocity data. This data was either obtained through ordinary differential equation (ODE) simulations or drawn from previously reported experimental results. Simulation-generated data was utilized to examine how different setups of substrate (\({S}_{T}\)) and inhibitor (\({I}_{T}\)) concentrations impact the estimation of inhibition constants. This data was derived from time-series product (\(P\)) data, generated via full model simulation using the function ode15s built into MATLAB R2023a. The full model was built based on mass-action kinetics models of enzyme inhibition (Eq. 5; Fig. 1a).

$$\frac{{dS}}{{dt}} =-{k}_{1}{ES}-{k}_{5}{YS}+{k}_{-1}C+{k}_{-5}B\\ \frac{{dE}}{{dt}} =-{k}_{1}{ES}-{k}_{3}{EI}+\left({k}_{-1}+{k}_{2}\right)C+{k}_{-3}Y\\ \frac{{dC}}{{dt}} =-\left({k}_{-1}+{k}_{2}\right)C-{k}_{4}{CI}+{k}_{1}{ES}+{k}_{-4}B\\ \frac{{dI}}{{dt}} ={k}_{-3}Y+{k}_{-4}B-{k}_{3}{EI}-{k}_{4}{CI}\\ \frac{{dB}}{{dt}} =-\left({k}_{-4} +{k}_{-5}\right)B+{k}_{4}{CI}+{k}_{5}{YS}\\ \frac{{dY}}{{dt}} =-{k}_{-3}Y-{k}_{5}{YS}+{k}_{3}{EI}+{k}_{-5}B\\ \frac{{dP}}{{dt}} ={k}_{2}C$$
(5)

To express the ODEs in terms of the parameters \({V}_{\max }\), \({K}_{M}\), \({K}_{{ic}}\), and \({K}_{{iu}}\), we rewrote the reaction rates in the full model using these parameters (Eq. 6). This modification was based on each parameter’s definition, with the ratio \(\frac{{k}_{-5}}{{k}_{5}}=\frac{{K}_{{iu}}}{{K}_{{ic}}}{K}_{M}\) determined by the detailed balance condition.

$$\frac{{dS}}{{dt}} =-\frac{1}{{K}_{M}}\left({k}_{-1}+\frac{{V}_{\max }}{{E}_{T}}\right){ES}-\frac{{K}_{{ic}}}{{K}_{{iu}}{K}_{M}}{k}_{-5}{YS}+{k}_{-1}C+{k}_{-5}B\\ \frac{{dE}}{{dt}} =-\frac{1}{{K}_{M}}\left({k}_{-1}+\frac{{V}_{\max }}{{E}_{T}}\right){ES}-\frac{{k}_{-3}}{{K}_{{ic}}}{EI}+\left({k}_{-1}+\frac{{V}_{\max }}{{E}_{T}}\right)C+{k}_{-3}Y\\ \frac{{dC}}{{dt}} =-\left({k}_{-1}+\frac{{V}_{\max }}{{E}_{T}}\right)C-\frac{{k}_{-4}}{{K}_{{iu}}}{CI}+\frac{1}{{K}_{M}}\left({k}_{-1}+\frac{{V}_{\max }}{{E}_{T}}\right){ES}+{k}_{-4}B\\ \frac{{dI}}{{dt}} ={k}_{-3}Y+{k}_{-4}B-\frac{{k}_{-3}}{{K}_{{ic}}}{EI}-\frac{{k}_{-4}}{{K}_{{iu}}}{CI}\\ \frac{{dB}}{{dt}} =-\left({k}_{-4}+{k}_{-5}\right)B+\frac{{k}_{-4}}{{K}_{{iu}}}{CI}+\frac{{K}_{{ic}}}{{K}_{{iu}}{K}_{M}}{k}_{-5}{YS}\\ \frac{{dY}}{{dt}} =-{k}_{-3}Y-\frac{{K}_{{ic}}}{{K}_{{iu}}{K}_{M}}{k}_{-5}{YS}+\frac{{k}_{-3}}{{K}_{{ic}}}{EI}+{k}_{-5}B\\ \frac{{dP}}{{dt}} =\frac{{V}_{\max }}{{E}_{T}}C$$
(6)

The initial conditions were set to \(S\left(0\right)={S}_{T}\), \(E\left(0\right)={E}_{T}=0.01\) μM, \(C\left(0\right)=0\), \(I\left(0\right)={I}_{T}\), \(B\left(0\right)=0\), \(Y\left(0\right)=0\), and \(P\left(0\right)=0\), and the inverse reaction rates \({k}_{-1}\), \({k}_{-3}\), \({k}_{-4}\), and \({k}_{-5}\) were all set to 100. To compute initial velocity data, we recorded the time (\(\tau\)) at which the product reached 1% of the initial substrate concentration and the product concentration (\({P}_{{exact}}\)) at that time56. We then obtained the observed product value (\({P}_{{obs}}\)) by adding a normally distributed error \(\epsilon \sim N\left(0,0.01\right)\) to \({P}_{{exact}}\) as follows17:

$${P}_{{obs}}={P}_{{exact}}+\epsilon \times {P}_{{exact}}.$$

The initial velocity (\({V}_{0}\)) was then calculated using the formula \({V}_{0}=\frac{{P}_{{obs}}}{\tau }\).

We also aimed to validate our methodology by assessing its ability to provide accurate and precise estimations using real experimental data. The data included initial velocity data and the \(I{C}_{50}\) value from two sources: the inhibition between triazolam (substrate) and ketoconazole (inhibitor) reported as mixed inhibition, and the inhibition between chlorzoxazone (substrate) and ethambutol (inhibitor) reported as competitive inhibition. The triazolam-ketoconazole data was extracted from the initial velocity graph reported in a previous study20 using PlotDigitizer, a valid and reliable tool for data extraction57. The \(I{C}_{50}\) value was taken from the same study. For the chlorzoxazone-ethambutol data, both the initial velocity data and the \(I{C}_{50}\) value were collected as reported in a previous study33.

To evaluate the performance of the 50-BOA extended to enzyme inhibition systems involving multiple substrate types (Supplementary Fig. 5), substrate cooperativity (Supplementary Fig. 6), or inhibitor cooperativity (Supplementary Fig. 7), we generated synthetic data using alternative velocity equations (see Supplementary Notes 1.26 for details). Using each model, we first computed the exact initial velocity (\({V}_{{exact}}\)) under a specific experimental setup. The observed initial velocity (\({V}_{{obs}}\)) was then generated by adding a normally distributed error \(\epsilon \sim N\left(0,0.01\right)\) to \({V}_{{exact}}\) as follows:

$${V}_{{obs}}={V}_{{exact}}+\epsilon \times {V}_{{exact}}.$$

The percentage of control activity data was also generated from alternative velocity equations. For each model, we generated the exact % control activity data \(({\%}_{{exact}})\), as the percentage of initial velocities at each \({I}_{T}\) relative to the initial velocity at \({I}_{T}=0\), across \({I}_{T}\) values ranging from \(1{0}^{-3}\) to \(1{0}^{3}\) μM. The observed % control activity data \(({\%}_{{obs}})\) was then obtained by adding a normally distributed error \(\epsilon \sim N(0,1)\) to \({\%}_{{exact}}\) (i.e., \({\%}_{{obs}}={\%}_{{exact}}+\epsilon\)). By fitting the equation44

$$f({I}_{T})=\frac{100}{1+{\left({I}_{T}/I{C}_{50}\right)}^{\alpha }}$$

to the \({\%}_{{obs}}\) data, the \(I{C}_{50}\) and α values that minimized the sum of squared errors between the data and \(f\) were estimated.
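A minimal sketch of this fit is given below. It assumes a brute-force grid search in place of the optimizer actually used, which the text does not specify. The function names, grid bounds, and demonstration values are illustrative choices only.

```python
import math


def percent_control_activity(i_t, ic50, alpha):
    """f(I_T) = 100 / (1 + (I_T / IC50)^alpha)."""
    return 100.0 / (1.0 + (i_t / ic50) ** alpha)


def fit_ic50(i_values, pct_values):
    """Return the (IC50, alpha) pair on the grid with the smallest sum of squared errors."""
    best_sse, best_ic50, best_alpha = math.inf, None, None
    for a in range(601):                      # IC50 from 1e-3 to 1e3 in 0.01 log10 steps
        ic50 = 10.0 ** (-3.0 + 0.01 * a)
        for b in range(31):                   # alpha from 0.5 to 2.0 in steps of 0.05
            alpha = 0.5 + 0.05 * b
            sse = sum((y - percent_control_activity(x, ic50, alpha)) ** 2
                      for x, y in zip(i_values, pct_values))
            if sse < best_sse:
                best_sse, best_ic50, best_alpha = sse, ic50, alpha
    return best_ic50, best_alpha


# Noise-free demonstration data over I_T = 1e-3 ... 1e3 (hypothetical IC50 = 0.05, alpha = 1).
i_values = [10.0 ** (-3.0 + 0.5 * k) for k in range(13)]
pct_values = [percent_control_activity(x, 0.05, 1.0) for x in i_values]
ic50_hat, alpha_hat = fit_ic50(i_values, pct_values)
```

A gradient-based least-squares routine would normally replace the grid search; the grid is used here only to keep the example dependency-free.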

Error landscape

The accuracy and precision of the estimated inhibition constants under various experimental conditions were evaluated using error landscapes. These error landscapes represent the shape of the loss function used in estimation, visualized within an x-y plane defined by specific ranges of \({K}_{{ic}}\) and \({K}_{{iu}}\). To establish these ranges, we set minimum (\({K}_{{ic}}^{\min }\), \({K}_{{iu}}^{\min }\)) and maximum (\({K}_{{ic}}^{\max }\), \({K}_{{iu}}^{\max }\)) values for candidates of estimated \({K}_{{ic}}\) and \({K}_{{iu}}\). We then divided these values into 100 parts on a logarithmic scale, generating candidate values for \({K}_{{ic}}\) (\({K}_{c}\)) and \({K}_{{iu}}\) (\({K}_{u}\)) as follows:

$${K}_{c}:=\left\{{K}_{{ic},1}={K}_{{ic}}^{\min },{K}_{{ic},2},\ldots,{K}_{{ic},100}={K}_{{ic}}^{\max }\right\},\\ {K}_{u}:=\left\{{K}_{{iu},1}={K}_{{iu}}^{\min },{K}_{{iu},2},\ldots,{K}_{{iu},100}={K}_{{iu}}^{\max }\right\}.$$

To analyze these candidates, we compared the initial velocity data (\({V}_{0,1}^{*},\ldots,{V}_{0,m}^{*}\)) obtained from experimental setups with the substrate (\(S:=\left\{{S}_{T,1},\ldots,{S}_{T,m}\right\}\)) and inhibitor (\(I:=\left\{{I}_{T,1},\ldots,{I}_{T,m}\right\}\)) concentrations to the computed initial velocity data. This computed data (\({V}_{0}\left({K}_{{ic},i},{K}_{{iu},j},{S}_{T,1},{I}_{T,1}\right),\ldots,{V}_{0}\left({K}_{{ic},i},{K}_{{iu},j},{S}_{T,m},{I}_{T,m}\right)\)) was obtained by substituting each candidate \({K}_{{ic},i}\in {K}_{c}\), \({K}_{{iu},j}\in {K}_{u}\) and the experimental setups into the inhibition model (Eq. 1). We compared these data by calculating the fitting error (Eq. 7) for each candidate as follows:

$${{{\rm{fitting}}}}\, {{{\rm{error}}}}\left({K}_{{ic},i},{K}_{{iu},j},S,I\right)=\frac{1}{m}{\sum}_{k=1}^{m}{\left[\frac{{V}_{0,k}^{*}-{V}_{0}\left({K}_{{ic},i},{K}_{{iu},j},{S}_{T,k},{I}_{T,k}\right)}{{V}_{0,k}^{*}}\right]}^{2}.$$
(7)

By plotting these calculated fitting errors on the x-y plane, we created an error landscape, with \({K}_{{ic}}\) on the x-axis, \({K}_{{iu}}\) on the y-axis, and colors representing the fitting error values for each candidate (Fig. 2b).
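The construction of the error landscape can be sketched in Python as follows. This is an illustration under stated assumptions, not the published implementation: the rate law in `mixed_velocity` is the standard mixed inhibition form and is assumed to match Eq. 1, and all function names, candidate ranges, and parameter values are hypothetical.

```python
import math


def mixed_velocity(s_t, i_t, k_ic, k_iu, v_max, k_m):
    """Standard mixed inhibition rate law (assumed to be the form of Eq. 1)."""
    return v_max * s_t / (k_m * (1.0 + i_t / k_ic) + s_t * (1.0 + i_t / k_iu))


def log_grid(lo, hi, n=100):
    """n candidate values evenly spaced on a log10 scale, endpoints included."""
    a, b = math.log10(lo), math.log10(hi)
    return [10.0 ** (a + (b - a) * i / (n - 1)) for i in range(n)]


def fitting_error(k_ic, k_iu, data, v_max, k_m):
    """Mean squared relative error (Eq. 7); data holds (S_T, I_T, V0*) triples."""
    return sum(((v - mixed_velocity(s, i, k_ic, k_iu, v_max, k_m)) / v) ** 2
               for s, i, v in data) / len(data)


def error_landscape(data, v_max, k_m, lo=1e-3, hi=1e3, n=100):
    """Rows follow K_iu candidates (y-axis); columns follow K_ic candidates (x-axis)."""
    k_c, k_u = log_grid(lo, hi, n), log_grid(lo, hi, n)
    errors = [[fitting_error(c, u, data, v_max, k_m) for c in k_c] for u in k_u]
    return k_c, k_u, errors


# Hypothetical noise-free data: V_max = 1, K_M = 1, true K_ic = 0.1, true K_iu = 1.
data = [(s, i, mixed_velocity(s, i, 0.1, 1.0, 1.0, 1.0))
        for s in (0.2, 1.0, 5.0) for i in (0.0, 0.03, 0.1, 0.3)]
k_c, k_u, errors = error_landscape(data, v_max=1.0, k_m=1.0)
row, col = min(((r, c) for r in range(100) for c in range(100)),
               key=lambda rc: errors[rc[0]][rc[1]])
best_k_ic, best_k_iu = k_c[col], k_u[row]
```

Passing `errors` to any heatmap routine with logarithmic axes gives a plot analogous to Fig. 2b. The width of the low-error region then reflects precision.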

Regularization

We incorporated a regularization term into the loss function used for estimation. This regularization term (Eq. 3) was defined as the squared relative error between the experimentally obtained \(I{C}_{50}\) and the \(I{C}_{50}\) calculated using the Cheng-Prusoff equation (Eq. 2). During estimation with regularization, the loss function was the total error, defined as the fitting error plus the regularization term weighted by a regularization constant (\(\lambda\)). The value of \(\lambda\) controls the bias-variance trade-off. A very small \(\lambda\) may lead to overfitting, resulting in low bias but high variance. Conversely, a very large \(\lambda\) can reduce variance but may introduce bias due to underfitting. Therefore, selecting an appropriate \(\lambda\) is crucial for effective estimation with regularization.
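The regularized loss can be sketched as follows. Eqs. 1 to 3 are not reproduced in this section, so the sketch rests on stated assumptions: the standard mixed-inhibition rate law for Eq. 1, the corresponding Cheng-Prusoff relation \(I{C}_{50}=\left({S}_{T}+{K}_{M}\right)/\left({K}_{M}/{K}_{{ic}}+{S}_{T}/{K}_{{iu}}\right)\) for Eq. 2 (obtained by setting the inhibited velocity to half the uninhibited one), and normalization of the regularization term by the experimental \(I{C}_{50}\). The function names and the argument `s_ic50` (the substrate concentration of the \(I{C}_{50}\) experiment) are hypothetical.

```python
import numpy as np

def v0_mixed(kic, kiu, s, i, vmax=1.0, km=1.0):
    """Assumed form of Eq. 1: standard mixed-inhibition rate law."""
    return vmax * s / (km * (1.0 + i / kic) + s * (1.0 + i / kiu))

def ic50_cheng_prusoff(kic, kiu, s, km=1.0):
    """Assumed form of Eq. 2: IC50 implied by (Kic, Kiu) at substrate level s."""
    return (s + km) / (km / kic + s / kiu)

def total_error(kic, kiu, s, i, v_obs, ic50_obs, s_ic50, lam, vmax=1.0, km=1.0):
    """Fitting error (Eq. 7) plus lam times the IC50 regularization term."""
    v_model = v0_mixed(kic, kiu, s, i, vmax, km)
    fit = np.mean(((v_obs - v_model) / v_obs) ** 2)
    # Assumption: the relative error is normalized by the experimental IC50.
    reg = ((ic50_obs - ic50_cheng_prusoff(kic, kiu, s_ic50, km)) / ic50_obs) ** 2
    return fit + lam * reg
```

With \(\lambda =0\) the function reduces to the fitting error alone, and increasing \(\lambda\) pulls the estimates toward pairs \(\left({K}_{{ic}},{K}_{{iu}}\right)\) that reproduce the measured \(I{C}_{50}\).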

To avoid overfitting and underfitting, we used leave-one-out cross-validation58 to determine \(\lambda\) within a predefined range \(\Lambda\) (Supplementary Fig. 2). For each candidate value \(r\in \Lambda\), we proceeded as follows. One data point of the initial velocity data was excluded and designated as test data, while the remaining data were used as training data. By fitting the training data, that is, minimizing the total error with \(\lambda =r\), the estimates \({\hat{K}}_{{ic}}\) and \({\hat{K}}_{{iu}}\) were obtained. By substituting \({\hat{K}}_{{ic}}\) and \({\hat{K}}_{{iu}}\) into the inhibition model (Eq. 1), we computed the initial velocity at the test condition and calculated the squared relative error (test error) between the test data and the computed initial velocity. We repeated this process for every choice of test data point and summed the test errors to obtain the cross-validation error (cv error) for \(r\). After repeating this entire procedure for all \(r\in \Lambda\), the \(r\) with the smallest cv error was selected as \(\lambda\).
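The cross-validation loop can be sketched as follows. To keep the example self-contained, the minimization of the total error is replaced by an exhaustive search over a log-spaced grid of candidates; this is a simplification for illustration and not necessarily the optimizer used in this study. As in the other sketches, the forms assumed for Eqs. 1 and 2 and all function and argument names (`fit_grid`, `select_lambda`, `s_ic50`, `grid`) are hypothetical.

```python
import numpy as np

def v0_mixed(kic, kiu, s, i, vmax=1.0, km=1.0):
    """Assumed form of Eq. 1: standard mixed-inhibition rate law."""
    return vmax * s / (km * (1.0 + i / kic) + s * (1.0 + i / kiu))

def ic50_model(kic, kiu, s, km=1.0):
    """Assumed form of Eq. 2 (Cheng-Prusoff relation for mixed inhibition)."""
    return (s + km) / (km / kic + s / kiu)

def fit_grid(s, i, v_obs, ic50_obs, s_ic50, lam, grid, vmax=1.0, km=1.0):
    """Return the (Kic, Kiu) grid pair minimizing fitting error + lam * regularization."""
    kc, ku = grid[None, :], grid[:, None]
    v = v0_mixed(kc[..., None], ku[..., None], s, i, vmax, km)  # shape (n, n, m)
    fit = np.mean(((v_obs - v) / v_obs) ** 2, axis=2)
    reg = ((ic50_obs - ic50_model(kc, ku, s_ic50, km)) / ic50_obs) ** 2
    row, col = np.unravel_index(np.argmin(fit + lam * reg), fit.shape)
    return grid[col], grid[row]

def select_lambda(s, i, v_obs, ic50_obs, s_ic50, lambdas, grid, vmax=1.0, km=1.0):
    """Leave-one-out cross-validation over the candidate values in lambdas."""
    m = len(v_obs)
    cv_errors = np.zeros(len(lambdas))
    for a, lam in enumerate(lambdas):
        for k in range(m):
            train = np.arange(m) != k  # hold out data point k as test data
            kic, kiu = fit_grid(s[train], i[train], v_obs[train],
                                ic50_obs, s_ic50, lam, grid, vmax, km)
            pred = v0_mixed(kic, kiu, s[k], i[k], vmax, km)
            cv_errors[a] += ((v_obs[k] - pred) / v_obs[k]) ** 2
    return lambdas[int(np.argmin(cv_errors))], cv_errors
```

The function returns both the selected \(\lambda\) and the cv error of every candidate, so that the dependence of the cv error on \(\lambda\) can be inspected.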

Computation of asymptotic confidence interval and distribution

The confidence interval is a key indicator of the precision of an estimate: a narrower confidence interval indicates a more precise estimate. To obtain the confidence intervals for the estimated \({K}_{{ic}}\) and \({K}_{{iu}}\), we used a bootstrapping approach59. In this method, we resampled the data with replacement 1000 times, with each resample containing the same number of data points as the original data. We then fit the inhibition model to each resampled dataset to estimate \({K}_{{ic}}\) and \({K}_{{iu}}\). The resulting estimates were represented in a histogram to derive the asymptotic distribution of the inhibition constants. By excluding the lowest 2.5% and the highest 2.5% of this distribution, that is, by taking the range between its 2.5th and 97.5th percentiles, we determined the 95% confidence interval for each inhibition constant.
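A percentile bootstrap of this kind can be sketched as follows. As in the other sketches, the fit is simplified to an exhaustive search over a log-spaced grid and omits the regularization term, the form of Eq. 1 is assumed, and the names `fit_grid` and `bootstrap_ci` as well as the fixed random seed are hypothetical choices made for the example.

```python
import numpy as np

def v0_mixed(kic, kiu, s, i, vmax=1.0, km=1.0):
    """Assumed form of Eq. 1: standard mixed-inhibition rate law."""
    return vmax * s / (km * (1.0 + i / kic) + s * (1.0 + i / kiu))

def fit_grid(s, i, v_obs, grid, vmax=1.0, km=1.0):
    """Return the (Kic, Kiu) grid pair minimizing the fitting error (Eq. 7)."""
    v = v0_mixed(grid[None, :, None], grid[:, None, None], s, i, vmax, km)
    err = np.mean(((v_obs - v) / v_obs) ** 2, axis=2)
    row, col = np.unravel_index(np.argmin(err), err.shape)
    return grid[col], grid[row]

def bootstrap_ci(s, i, v_obs, grid, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence intervals for (Kic, Kiu)."""
    rng = np.random.default_rng(seed)
    m = len(v_obs)
    estimates = np.empty((n_boot, 2))
    for b in range(n_boot):
        idx = rng.integers(0, m, size=m)  # m indices drawn with replacement
        estimates[b] = fit_grid(s[idx], i[idx], v_obs[idx], grid)
    tail = 100.0 * (1.0 - level) / 2.0  # 2.5 for a 95% interval
    lower, upper = np.percentile(estimates, [tail, 100.0 - tail], axis=0)
    return estimates, lower, upper
```

Here `estimates[:, 0]` and `estimates[:, 1]` hold the bootstrap estimates of \({K}_{{ic}}\) and \({K}_{{iu}}\), which can be shown as histograms, while `lower` and `upper` give the bounds of the two intervals in the same order.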

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.