Introduction

A hospital stay is a time-ordered, limited-component process, following a recognizable narrative-of-care rooted in the primary diagnosis. However, these narratives frequently break down. Patient decompensation, haphazard care protocols, or disruptive external circumstances, like the Coronavirus Disease 2019 (COVID-19) pandemic, are noteworthy not because of what is done—these are the same tests, medications, and vital signs that are typically done in any hospital—but because of when they are done. The electronic health record (EHR) thus harbors extensive latent data on the time-structured dynamics of the health care process1,2,3.

The information theory concepts of surprisal, entropy, and ergodicity help reveal these dynamics. Departures from the standard narrative are considered surprises and offer quantitative insight into clinical decision-making.

A dynamical system is deemed ergodic when the statistical properties observed over a single long-term trajectory (the time average) are the same as those observed across all accessible states at a single moment (the space average)4. Ergodicity simplifies analysis of complex systems, lends confidence to long-term forecasting, and adapts well to fields as diverse as biology, physics, and economics5,6,7,8.

We follow convention in defining surprisal as the logarithm of the inverse of probability, a term synonymous with “information content”9. The average surprisal of events is the eponymous Shannon entropy, which was later extended to dynamical systems by Kolmogorov and Sinai10,11,12. In our context, higher entropy arises from a more uniform distribution of lab test orders, while lower entropy connotes a more regular pattern, as seen when lab test orders cluster around scheduled rounding times.
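The contrast between clustered and uniform ordering can be made concrete with a short calculation (a sketch in Python; the hourly counts are invented for illustration):

```python
import math

def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def to_probs(counts):
    """Normalize hourly counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical hourly order counts over 24 hours.
clustered = [0] * 24
clustered[5] = 90    # most orders at the 5:00 A.M. scheduled draw
clustered[17] = 10   # a few at 5:00 P.M.
uniform = [10] * 24  # ad hoc ordering spread evenly through the day

h_clustered = entropy(to_probs(clustered))
h_uniform = entropy(to_probs(uniform))

# Clustered ordering has much lower entropy than uniform ordering,
# whose entropy attains the maximum, ln(24) ≈ 3.18.
print(h_clustered, h_uniform)
```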

Any type of mathematical modeling that aims to predict the future from the past relies on some form of temporal consistency—either the future will be like the past, or the trends in the future will resemble trends in the past. However, model performance is limited by data drift, where input data and their relationship with the target outcome change, such as with changes in practice patterns or documentation habits within and between hospitals13,14,15,16,17. Machine learning operations (MLOps) is a discipline tasked with continuous monitoring for data drift over time and space13,16,18. Yet the field is still nascent, lacks consensus definitions, and its standard tools of mean and variance do not adequately capture higher-dimensional processes1,19,20,21,22. Demonstrating ergodicity and tracking the entropy of ergodic systems can indicate when the assumption of stationarity is being violated. Moreover, this can be observed at any level of a complex system, allowing for rigorous analysis of granular changes in dynamic systems with temporal data.

We hypothesize that (1) ergodic properties are present at the level of hospital units but not across entire hospitals, (2) surprisal and entropy serve as indicators of clinical behavior and data drift, and (3) disruptions to hospital systems, such as the COVID-19 pandemic, are associated with an increase in entropy. We applied standard measures of entropy to the hour-by-hour probabilities of routine laboratory orders. Demonstrating these findings lays the groundwork for further research into using ergodicity and entropy for operational management, model vigilance, and evaluating the utility of predictive algorithms in health care.

Results

Surprisal analysis demonstrates latent information in ordering practices

Figure 1 depicts the relative frequency of lab orders, treated as empirical estimates of probability, for each hour of the day based on pre-COVID (2018-2020) data from an inpatient ward at Columbia University Irving Medical Center (CUIMC). Derivations of these numbers and explanations of their definitions can be found in the Methods section. In each matrix, rows are lab tests, columns are hours of the day, and cells are color-coded to represent the probability p or the surprisal (-ln p) of the lab being ordered at that time. Individual probability values range from 0 to 0.27, with surprisal ranging from 1.31 to 5.2 over 24 h. We selected five different labs to represent repeat and ad hoc ordering practices. For the former, we chose serum sodium (Na) concentrations to represent serum chemistries and hemoglobin level (Hgb) for blood counts, which are typically drawn daily on admitted patients. For the latter, we selected partial pressure of oxygen in arterial blood (PaO2) from arterial blood gas analysis, which requires more invasive arterial access, and fibrinogen, a measure of coagulopathy that is typically a one-time order. Partial Thromboplastin Time (PTT), an additional measure of coagulopathy, may be routine or ad hoc.

Fig. 1: Hourly heatmap of probability and surprisal for a sample inpatient ward (floor).

We show the relationship between probability and surprisal. a In the probability heatmap, lighter colors mean higher probabilities. Each row sums to 1. Higher probabilities are found during business hours. The band at 5:00 P.M. shows the time that routine, repeat AM lab orders are released so that they may be drawn and collected at 5:00 A.M. in time for morning hospital rounds. b In the surprisal heatmap, lighter colors imply higher surprisal, the log inverse of probability. The highest surprisal is noted overnight. CUIMC Columbia University Irving Medical Center, PTT Partial Thromboplastin Time, PaO2 Partial Pressure of Oxygen.

We find that surprisal follows a diurnal pattern with a decreased probability of overnight orders for draws. This pattern was most evident in the repeat labs (Na, Hgb) and least evident in the ad hoc labs (PTT), which demonstrated a more uniform distribution throughout the day as care providers place one-time orders.

In addition to demonstrating the relationship between probability and surprisal, Fig. 1 also highlights the digital behavior of EHRs, as noted by the low-surprisal (dark) bands. When a care provider places an order, the EHR records three distinct times: order placement, collection, and result. This analysis uses the first of these (order placement), but it is complicated by the priority of the lab order. “STAT” orders are always released immediately. However, when routine labs are to be repeated at future intervals, they are stored in batches and released at different times of day, depending on the institution’s practices. At CUIMC, the lab system releases repeat lab orders 36 h before they are due. For example, at 5:00 P.M., the laboratory system releases recurring lab orders due to be collected at 5:00 A.M. two days later. Since repeat labs are typically drawn at 5:00 A.M. for morning rounds on the floor, the surprisal for repeat floor labs like sodium and hemoglobin is lowest at 5:00 P.M.

Extending this analysis across units and hospitals reveals that each has a unique surprise thumbprint. Figure 2 shows the addition of individual departments and two other hospitals, along with the measured surprisal for Na (left column) and PaO2 (right column). At each institution, surprisal was most evenly distributed throughout the day in the Emergency Departments (EDs), reflecting ad hoc ordering as patients appeared at random intervals.

Fig. 2: Surprisal heatmaps by hospital and unit.

Surprisal for two laboratory tests across different units at three hospitals. Lighter colors indicate higher surprisal. Surprisal for hemoglobin (a) and PaO2 (b) across units at UF. Surprisal for hemoglobin (c) and PaO2 (d) across units at UVA. At CUIMC (e, f), MICU A is staffed by advanced practice providers and MICU B by resident physicians; both are under the direction of attending physicians. This illustrates different surprisal patterns even in units that appear nominally similar. UF University of Florida, UVA University of Virginia, CUIMC Columbia University Irving Medical Center, ED emergency department, STICU Surgical Trauma Intensive Care Unit, NEURO Neurologic Intensive Care Unit, IMC Intermediate Medical Care Unit, MICU Medical Intensive Care Unit, NICU Neonatal Intensive Care Unit.

Other patterns are explained by the behavior of each institution’s EHR. At the University of Florida (UF), same-day tests ordered as “routine” are released immediately if they are set to be drawn before midnight and are otherwise released at midnight. Therefore, UF showed very low surprisal only at midnight (Fig. 2a, b) when repeat labs are released, with more even distribution throughout the day for both Na and PaO2. PaO2 behaved more as a repeat lab in the Neuro and Trauma Surgery Intensive Care Units (NEURO and STICU), perhaps because of the large number of mechanically ventilated patients.

At the University of Virginia (UVA), repeat labs are released four hours before they are scheduled to be collected. Therefore, UVA’s heatmaps showed bands of low surprisal at 3:00 A.M. and every 4 h thereafter (Fig. 2c, d). Surprisal is high between these draws, suggesting STAT orders, and approaches its maximum on the floor and in the Neonatal Intensive Care Unit (NICU). PaO2 has a more even distribution at UVA and does not behave as a repeat lab in the STICU.

CUIMC showed a more diurnal dispersal of surprisal (Fig. 2e, f) across units, though each had its own pattern, and there was less difference between Na and PaO2 order times. Medical Intensive Care Unit (MICU) A cares for all ECMO (Extracorporeal Membrane Oxygenation) patients and is staffed by advanced practice providers. In contrast, MICU B cares for a variety of non-ECMO, medically complex conditions and is staffed by residents. Both units are under the direction of faculty physicians. For ECMO patients in MICU A, labs are typically ordered STAT, with orders placed at the start of shifts and tapering off throughout the day; high surprisal is noted overnight for both Na and PaO2. In MICU B, by contrast, repeat morning labs are released at 3:00 P.M. (for 3:00 A.M. draws), with another low-surprisal band at 3:00 A.M. for twice-daily lab draws. MICU B also places STAT orders at the start of the shift (6:00 A.M.) and has higher surprisal overnight. The Neuro-ICU has two bands of low surprisal, at 12:00 A.M. and 12:00 P.M., for its repeat labs.

Ergodicity is present in hospital units

An ergodic system is one where any snapshot of the whole system can be trusted to reflect the behavior of its components over time. To test the idea that an individual hospital unit––with its stable selection of patient types, unit protocols, and provider populations––might be ergodic, we calculated Jensen-Shannon divergences, a similarity metric for probability distributions, both between the surprisal patterns of randomly selected individual beds (NICUB09 and 4103 A) and their respective units, and between the units themselves (FLOOR and NICU)23 (Fig. 3). Generally, in an ergodic unit, the distribution of events for a single bed over a long period (time average) would be similar to the distribution of events over all beds in a single snapshot of time (space average).

Fig. 3: Jensen-Shannon plots over time and space.

We highlight a single bed in each unit (bed NICUB09 and bed 4103 A for the NICU and floor, respectively) to show that the divergence between the bed and unit (bottom lines) is negligible while the difference between units is substantial (top line). JS Jensen-Shannon.

We averaged probabilities of Na lab orders by hour over 3 years for the individual beds (time-averages), and we compared those distributions, month-by-month, to monthly averages of the units to which they belonged (space-averages). The result is shown in the two lower lines of the figure. We also compared the month-by-month distributions of the two units. The result is shown in the upper line. The average divergence between the individual beds and their respective units was small for both the NICU and inpatient ward (floor), at 0.013 and 0.007, respectively. Our interpretation is that the time average of lab orders for a single bed over a long period is similar to the time and space average of the unit24. These distributions of probabilities, with similar mean, variance, and correlation, are consistent with ergodicity. On the other hand, the average Jensen-Shannon divergence between the neonatal ICU and cardiac in-patient ward was more than ten-fold larger at 0.192.
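The bed-versus-unit comparison can be sketched with SciPy; note that `scipy.spatial.distance.jensenshannon` returns the square root of the divergence, so we square it. The distributions below are invented stand-ins for hourly order probabilities:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(0)

# Hypothetical hourly order-probability distributions (24 bins, each summing to 1).
unit = np.ones(24)
unit[5] = 20.0                   # a strong 5:00 A.M. peak for the unit
unit /= unit.sum()

bed = np.abs(unit + rng.normal(0.0, 0.002, 24))   # a bed within the same unit
bed /= bed.sum()

other_unit = np.roll(unit, 12)   # a unit with a shifted daily rhythm

# jensenshannon returns sqrt(JSD); square to obtain the divergence itself.
jsd_bed_unit = jensenshannon(bed, unit) ** 2
jsd_between_units = jensenshannon(unit, other_unit) ** 2

# The bed diverges little from its own unit; the two units differ markedly.
print(jsd_bed_unit, jsd_between_units)
```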

Thus, hospital units behave as ergodic systems. On the other hand, the whole hospital does not; rather, we find that the distributions of lab test order times vary from unit to unit. As expected for ergodic systems, the time series of the divergences, shown in Fig. 3, were stationary (Augmented Dickey-Fuller tests, p < 0.001).

Entropy falls after the first day of hospital admission and is independent of the number of lab test orders

Entropy is the weighted average of surprisal. As patients are admitted at any time of day or night, we expected admission lab orders to show higher entropy than orders placed later in the admission. To test this, we generated surprisal heatmaps filtered by hours and days following admission. Within two hours of admission (Fig. 4a), the distributions of surprisal across the day are relatively even at UF, with the notable exception of the Intermediate Care Unit (IMC), where patients are typically transferred from an ICU bed. Over the following days (Fig. 4b, c), the probability of routine orders released at midnight is high, suggesting that regular patterns emerge on the inpatient ward, with higher surprisal for ad hoc labs. Accordingly, entropy is highest on the first day of admission, as labs are ordered at unscheduled times. After admission, entropy falls as most labs cluster at one or two time points for scheduled lab draws (Fig. 4d).

Fig. 4: Entropy declines on average over a patient stay.

Surprisal heatmaps based on day of admission with histogram demonstrating overall values. a Low surprisal, as reflected in the darker colors, is shown within two hours of admission, as patients can arrive in the ED or unit at any time, though transfers to ICU or Floor happen later in the day. b There is increased surprisal of lab orders outside routine times, as reflected by the lighter colors distributed evenly throughout the day. c This pattern continues on the second day of admission. d Total entropy declines based on the day of admission.

To ensure that entropy was not simply a function of the number of test orders, we plotted the entropy of lab test order times as a function of the number of orders (Supplementary Fig. 1). Each data point is the entropy calculated for a 28-day window. We initially observed that the Emergency Departments of all hospitals had the highest number of orders and the highest entropy; the UVA neonatal ICU had the fewest laboratory test orders and the lowest entropy. However, inspection of the plots does not show consistent correlations. Specifically, in some cases, entropy is unchanged in a unit over a wide range of lab test order counts. For example, the Neurologic ICU at UF has a mean entropy of 1.57 over an extensive range of test numbers (218–1693 orders), and the inpatient ward (floor) has an entropy of 1.12–1.47 over an even more extensive range (4270–8268 orders). Moreover, some hospitals had a wide range of entropy in different units with similar numbers of lab test orders. For example, entropy in two UVA ICUs and one in-patient ward ranged from 1.23 to 2.41 for comparable numbers of orders. Likewise, at CUIMC, the entropies for one in-patient ward and MICU B ranged from 1.63 to 3.01 over a similar number of orders. Thus, entropy contains information about a hospital unit independent of the number of labs ordered on that unit.
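The near-independence of the entropy estimate from order volume can be illustrated by subsampling a fixed hourly distribution (a simulation sketch; the distribution is invented):

```python
import numpy as np

rng = np.random.default_rng(7)

# A hypothetical hourly ordering pattern: a morning peak over a low baseline.
weights = np.ones(24)
weights[5] = 30.0
true_p = weights / weights.sum()
true_entropy = -np.sum(true_p * np.log(true_p))

def plugin_entropy(n_orders):
    """Entropy of the empirical hourly distribution from n_orders draws."""
    counts = np.bincount(rng.choice(24, size=n_orders, p=true_p), minlength=24)
    p = counts / counts.sum()
    return -np.sum(p[p > 0] * np.log(p[p > 0]))

# Estimates at very different order volumes remain close to the true entropy,
# so entropy is not simply a proxy for how many labs are ordered.
estimates = {n: plugin_entropy(n) for n in (500, 2000, 8000)}
print(true_entropy, estimates)
```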

Entropy as a hospital metric: impact of the Coronavirus-19 pandemic

Figure 5 shows the time course of entropy over three years that include the COVID-19 pandemic. We also show the number of COVID-19 hospitalizations (plotted in gray) for calibration. For this analysis, the entropy lines show values computed over successive 28-day windows. Before COVID-19, monthly entropy generally oscillated around a characteristic mean in each of the wards and each of the hospitals. Again, we show that entropy was highest, and near maximum, for the ED in all hospitals.

Fig. 5: Effects of COVID-19 on entropy measures.

The Shannon Entropy (left y-axis), as plotted against time (x-axis) and COVID-19 hospitalizations (right y-axis), reveals changes in unit entropy patterns during the pandemic. a UF was minimally impacted by the pandemic, as demonstrated by the low number of COVID-19 hospitalizations with stable fluctuations in entropy. b This was also true for UVA. c At CUIMC, all units except the ED showed a drop in entropy following the spike in COVID hospitalizations, with only MICU A and B appearing to return to baseline.

Entropy changed little during the pandemic in the UF and UVA hospitals, where the proportion of COVID-19 patients was low. In contrast, at CUIMC, where the effect of the pandemic was most severe, there were marked changes correlating with the steep rise in COVID cases. Entropy fell across all units other than the ED, where it remained near the maximum. The change in entropy persisted in some units and appeared to reverse in others, with the MICUs returning to pre-pandemic levels and the floor appearing to remain lower. We thus see how entropy measures can visualize shifts in patient population and the impact of these changes on clinician ordering behavior.

Surprisal measures inform predictive models

To test the possibility that surprisal may signal suspicion of an imminent clinical event, we examined its ability to predict hemorrhage in a previously collected cardiorespiratory monitoring dataset of 3688 patients consecutively admitted to the University of Virginia Medical ICU, 141 of whom had 155 hemorrhagic events, defined as transfusion of three units of packed red blood cells within 24 h with no transfusions in the preceding day25. The features included the means, variances, and pairwise correlation coefficients of heart rate, respiratory rate, and oxygen saturation, along with dynamical measures such as detrended fluctuation analysis and the coefficient of sample entropy. We used regularized logistic regression adjusted for repeated measures26. We found that the surprisal of the reporting time of a hemoglobin level was a statistically significant predictor even when all other continuous cardiorespiratory monitoring parameters were considered (p < 0.001). While the most important predictor was oxygen saturation variability, hemoglobin surprisal was the next most informative and had the same impact on the model as blood pressure. Acknowledging that clinician suspicions are sometimes incorrect, if we assume the clinician ordered a non-routine hemoglobin on a clinical suspicion of hemorrhage, then the model’s prediction may depend on the provider’s suspicion, which may indicate label leakage.

Discussion

We quantified the dynamics of laboratory test-ordering behavior in fifteen clinical units in three hospital systems. The theoretical bases are Shannon entropy and ergodicity theory. We and others consider hospitals and their EHRs as complex dynamic systems5,7,8,27. We made three important observations:

  1. From the lens of lab-ordering practices, hospitals are collections of approximations of ergodic systems.

  2. External circumstances, such as the COVID-19 pandemic, can alter the dynamical properties of hospitals.

  3. Surprisal and entropy differ among hospitals, units, and care teams.

Our first finding is that a hospital unit—but not a whole hospital—can effectively behave as an ergodic dynamical system. These systems exhibit local distributional stability not only in their mean and variance, the current standard of traditional statistical process monitoring20, but also in their higher-order distributional moments. Knowing when a system is ergodic, and when it is not, has strong implications for how it can be modeled: from detecting changes in ecosystem behavior28, to optimizing time-average growth rates in economics29, to understanding deviations between individual and group choices30.

Furthermore, if the system is approximately stationary, we can apply the principle of maximum entropy to infer the most likely distribution of events, here lab test order times, given what is already known9. This principle asserts that, among all probability distributions satisfying a given set of constraints (such as average order volume or known diurnal patterns), the one with the highest entropy is the most probable distribution31,32. In our context, Emergency Departments, where lab orders are placed continuously in response to unscheduled arrivals, resemble unconstrained, high-entropy systems. In contrast, inpatient wards exhibit constraints such as fixed rounding schedules or daily labs.

Applying the maximum entropy principle not only helps characterize existing constraints but also enables simulation of hypothetical operational changes1,5,33. For example, we can estimate how the distribution of lab orders would shift if a new constraint were introduced, such as eliminating overnight phlebotomy services between midnight and 6:00 AM. Conversely, we can simulate the impact of altering or removing existing constraints—for instance, redistributing effort by increasing the ratio of daytime to nighttime staff. We would write the equation for the system’s entropy, encode the operational constraints as mathematical expressions, and then maximize the entropy to yield a principled, quantitative forecast of how the workflow structure will respond9. This could similarly be applied to monitoring patient throughput, tracking adherence to ordered protocols, or correlating interactions between these variables and patient outcomes1,5,33. This approach enables data-driven scenario analysis and resource optimization grounded in information theory.
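As a toy illustration of that procedure, consider the overnight-phlebotomy scenario above: if the hourly probabilities between midnight and 6:00 A.M. are constrained to zero and nothing else is assumed, the maximum-entropy forecast is uniform over the remaining 18 hours. The sketch below verifies this numerically (the constraint is hypothetical, chosen only for illustration):

```python
import numpy as np
from scipy.optimize import minimize

HOURS = 24
blocked = np.arange(0, 6)                       # hypothetical: no draws midnight-6 A.M.
allowed = np.setdiff1d(np.arange(HOURS), blocked)

def neg_entropy(p):
    """Negative Shannon entropy (natural log), clipped for numerical safety."""
    p = np.clip(p, 1e-12, None)
    return np.sum(p * np.log(p))

# Maximize entropy over the allowed hours, subject to probabilities summing to 1.
# Start from a deliberately non-uniform feasible point.
x0 = np.linspace(1.0, 2.0, allowed.size)
x0 /= x0.sum()
res = minimize(
    neg_entropy, x0, method="SLSQP",
    constraints=[{"type": "eq", "fun": lambda p: p.sum() - 1.0}],
    bounds=[(0.0, 1.0)] * allowed.size,
)

# With no further constraints, the optimum is uniform over the 18 allowed hours,
# with entropy ln(18).
p_star = res.x
print(p_star.max() - p_star.min(), -neg_entropy(p_star), np.log(allowed.size))
```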

Our second finding relates to the COVID-19 pandemic, which disrupted hospital practice, often to extraordinary degrees. The entropy of lab orders changed little in two academic centers with moderate numbers of COVID-19 admissions but fell at CUIMC, a pandemic epicenter. Our suspicion that hospital care would become disordered was not borne out by entropy measures of laboratory test orders. Instead, we suspect that the orderliness of care increased at the most COVID-heavy hospital because the patient population became more uniform: patients shared the same diagnosis, were treated by the same protocols, and were not newly admitted; thus, the entropy fell. Changes in overall entropy in an ergodic unit can therefore reflect changes in the patients themselves, the care provided to them, or both.

Our third finding—that surprisal and entropy vary across hospitals, units, and care teams—suggests that predictive modeling may benefit from tailoring to the operational scale where local distributional stability is observed. A model may treat the structural representation of the patient and the trajectory of their illness as independent of the care environment14,15, yet experience teaches us that approaches to patient care often differ in time and space17,34,35,36,37. Variations in health care practices and information technology infrastructure across sites contribute to data drift and temporal degradation, known limitations of EHR-based prediction schemes14,16,17. With entropy and surprisal, these changes are identifiable in real time, offering timely, interpretable, and more rigorous change detection.

Surprisal analysis highlights another bias in mathematical modeling. When predicting an outcome like sepsis, models without careful feature selection can cheat38,39. The presence of unscheduled orders alone may inform the model of an increased risk of decompensation, of which the provider who placed the order is already well aware, a concept known as data leakage38,39. We demonstrated that the surprisal of a hemoglobin order, a marker of clinical suspicion, predicts hemorrhage as successfully as falling blood pressure, a marker of pathophysiology. This finding complements Kamran et al.’s demonstration that excluding data collected after clinical suspicion arises erases model performance40. Even after adjusting for data drift, surprise orders may thus inappropriately bias a model.

Our study has limitations. First, as a novel measure of the orderliness of hospital care without direct comparisons to previous work, we are limited to hypothetical explorations of differences in entropy between hospitals and units. Our conclusions are based on our understanding of hospital workflow and may not represent objective truths, though we believe our analysis to be robust and concordant with practice patterns at our respective hospitals. Second, we acknowledge that this form of ergodicity deviates from the strict form described in statistical mechanics, which would require the space average to be the average over all the beds in a selected hour and that all hours are equivalent. We instead selected arbitrary time windows to distinguish between time- and space-averages. Over longer or shorter windows, results may have differed, though we note the very small distance between distributions of a bed from its unit compared to the very large distance between units.

We conclude that hospitals are collections of ergodic dynamic systems in which information can be quantified and monitored for change. We propose that entropy-based metrics are useful features in algorithmic integrity and modeling hospital-wide enterprise management.

Methods

Study populations

We studied three quaternary care, academic referral hospital systems. The University of Virginia Hospital (UVA) is a 671-bed hospital with a largely rural catchment area. The University of Florida Health (UF) is a 1162-bed hospital with a similarly large rural catchment area. New York-Presbyterian/Columbia University Irving Medical Center (CUIMC) is a 738-bed medical center serving an urban population in northern Manhattan, as well as a wide referral base.

Data sources

We collected data from January 2018 to May 2021 to capture information about practice patterns before and during the COVID-19 pandemic. Pre-COVID and COVID times were defined according to the institution’s first COVID-19 admission date. Units were chosen to capture a variety of services and levels of care. The total number of beds and labs for each institution are shown in Supplementary Table 1.

When a care provider places an order, the EHR records three distinct times: order placement, collection, and result. This analysis uses the first of these (order placement), but it is complicated by the priority of the lab ordered. “STAT” orders are always released immediately. However, when repeat, routine labs are to be drawn at future intervals, they are stored in batches and released (“placed”) at different times of day depending on the institution’s EHR. At CUIMC, the lab system releases repeat lab orders 36 hours before they are due. For example, at 5:00 P.M., the laboratory system releases recurring lab orders due to be collected at 5:00 A.M. two days later. At UVA, routine labs are released four hours before they are scheduled to be collected. At UF, same-day tests ordered as “routine” are released immediately if they are set to be drawn before midnight and are otherwise released at midnight. Logical Observation Identifiers Names and Codes (LOINC) standards were used to identify labs.
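The release rules above can be codified as a small helper (a sketch; the function name and the simplified handling of placement versus due times are our own constructions, not any EHR's logic or API):

```python
from datetime import datetime, timedelta

def release_time(placed: datetime, due: datetime, priority: str, site: str) -> datetime:
    """Approximate release time of a lab order under each site's stated rules."""
    if priority == "STAT":
        return placed                        # STAT orders release immediately
    if site == "CUIMC":
        return due - timedelta(hours=36)     # repeats release 36 h before collection
    if site == "UVA":
        return due - timedelta(hours=4)      # repeats release 4 h before collection
    if site == "UF":
        if due.date() == placed.date():
            return placed                    # same-day routine: released immediately
        # otherwise released at the midnight following placement
        return datetime.combine(placed.date(), datetime.min.time()) + timedelta(days=1)
    raise ValueError(f"unknown site: {site}")

# A CUIMC repeat due at 5:00 A.M. on June 3 releases at 5:00 P.M. on June 1.
print(release_time(datetime(2019, 6, 1, 9, 0), datetime(2019, 6, 3, 5, 0),
                   "ROUTINE", "CUIMC"))
```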

This study was approved by the University of Florida Institutional Review Board (201600262), the Columbia University Human Research Protection Office (AAAV0256), and the University of Virginia Institutional Review Board for Health Sciences Research (22152).

Entropy analysis

For each hour of the day at each unit of each institution, each laboratory test has an empiric probability p of being ordered based on the relative frequency of orders placed at each hour, normalized over 24 h. The probability of a Na test being ordered at 2:00 A.M. might be lower (e.g., p(Na, 0200) = 4%) than one being ordered before rounds at 6:00 A.M. (e.g., p(Na, 0600) = 20%). Surprisal is the natural logarithm of the inverse of probability:

$$S=-\ln p(X,i).$$
(1)

Here, X denotes a single variety of lab test (e.g., Na), and p(X,i) is the probability that the test was ordered in the ith hour. Thus, using the above probabilities, the surprisal of a routine lab ordered at 2:00 A.M. (S = 3.2) is twice that of a lab ordered at 6:00 A.M. (S = 1.6). We note that “surprisal” may be used interchangeably with the negative logarithm of the probability, information content, or Shannon information.

To characterize the dynamical system of lab order times, we take the Shannon entropy in Eq. (2), or the weighted average of the surprisal4:

$$H(X)=-\sum_{i=1}^{24} p(X,i)\,\ln p(X,i)$$
(2)

This weighted average provides a baseline from which to compare the information provided by individual events26,27. When all labs for the day are ordered at the same time (say, hour i), then p(X,i) = 1, and entropy is 0, its lower bound. When events occur with equal hourly probability and p(X,i) = 1/24, entropy is at its upper bound or maximum, ln 24 or 3.18.
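The worked numbers above can be checked directly:

```python
import math

# Surprisal values from the 2:00 A.M. vs 6:00 A.M. example.
s_0200 = -math.log(0.04)   # ≈ 3.2
s_0600 = -math.log(0.20)   # ≈ 1.6

# Entropy bounds: 0 when all orders fall in one hour, ln 24 when uniform.
concentrated = [1.0] + [0.0] * 23
uniform = [1 / 24] * 24
h = lambda ps: -sum(p * math.log(p) for p in ps if p > 0)

print(round(s_0200, 1), round(s_0600, 1), h(concentrated), round(h(uniform), 2))
```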

For our analysis, we calculated the lab test frequencies for each hour of the day within a 28-day window and normalized them to create a probability distribution. We moved the window in non-overlapping, 7-day strides beginning on the day of admission. This allowed for a time series estimate of the entropy for each unit in each hospital. Verifying our calculations using a well-known method for entropy estimation in sparse data did not change the rank order or distribution of the results41.
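The windowed entropy time series can be sketched as follows (synthetic timestamps; the 28-day window and 7-day stride match the text, while the order pattern is invented):

```python
import numpy as np

rng = np.random.default_rng(3)

def window_entropies(order_hours, order_days, total_days, window=28, stride=7):
    """Entropy of the hour-of-day distribution within sliding windows."""
    entropies = []
    for start in range(0, total_days - window + 1, stride):
        mask = (order_days >= start) & (order_days < start + window)
        counts = np.bincount(order_hours[mask], minlength=24)
        p = counts / counts.sum()
        entropies.append(-np.sum(p[p > 0] * np.log(p[p > 0])))
    return np.array(entropies)

# Hypothetical orders over 140 days with strong 5:00 A.M. clustering.
n = 5000
days = rng.integers(0, 140, n)
hours = rng.choice(24, size=n,
                   p=np.r_[np.full(5, 0.01), 0.6, np.full(18, 0.35 / 18)])
ts = window_entropies(hours, days, total_days=140)
print(ts.round(2))
```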

Ergodicity analysis

An ergodic system is one where the long-term behavior of a single system (time average) is equivalent to the average behavior of many similar systems at a given moment (space average). We expect the long-time-averages of events in a single bed to be the same as the corresponding long-time-average of the unit. This could hold along a spectrum between two extremes: (i) the long-time-average may be dominated by stability, as in a Neonatal Intensive Care Unit (NICU); or (ii) the long-time-average may be dominated by instability, as in the Emergency Department.

We tested for ergodicity of hospital units by comparing the entropy of Na lab test orders for a ward over a short epoch (one month) with that of a single bed over a much longer period (three years). These quantities represent the space- and time-averages, respectively, and the expected result for an ergodic system is that these entropies are similar. We studied a discrete set of five labs (X= sodium, hemoglobin, arterial oxygen, partial thromboplastin time, and fibrinogen). Time is made discrete and cyclic, by hour of the day, i = 1:24. The dynamical variable studied, N(X,i), is the number N of lab orders of type X in hour i. This is converted to a probability p(X,i) = N(X,i) divided by the long-time-average of the total number of orders of X in 24 h, then that probability is converted to surprisal. We calculated the Jensen-Shannon divergence, a standard metric of the similarity of probability distributions, both between the individual bed and its unit and between the two units. We performed the tests in an acute cardiac inpatient ward where acuity is high and stays are short, and in the neonatal ICU, where acuity is generally lower and stays are much longer. To test for ergodicity of the whole hospital, we measured the difference in entropy between these two units. These are approximations of ergodicity and can be taken, at minimum, to demonstrate distributional stationarity of the variables.

Regression modeling

For multivariable modeling, we assessed 11 physiologic variables: means, variances, and pairwise correlation coefficients of heart rate, respiratory rate, and oxygen saturation; detrended fluctuation analysis42; and the coefficient of sample entropy43. We used multivariable logistic regression adjusted for repeated measures to relate physiologic data to the hemorrhage outcome on the entire cohort44. We systematically built the model by: (1) removing, blinded to the outcome, the most predictable features (those correlated with other features at R2 greater than 0.9), (2) imputing missing values with median values for the study population, (3) building a model with all remaining features and restricted cubic splines (three knots) on each feature with enough unique values25,44, adjusting for repeated measures using the Huber-White method44, and (4) using ridge regression45 to penalize model coefficients, shrinking the effective degrees of freedom to maximize the corrected Akaike information criterion46,47.