Introduction

Osteoporosis affects one in three women and one in five men older than 50 years1. It leads to fragility fractures, most commonly of the wrist, spine, and hip. A quarter of these are hip fractures, the most severe type, which carry high mortality and poor outcomes after surgery1,2. Data from the Global Burden of Disease (GBD) Study 2019 showed a 92.7% increase in hip fractures from 1990 to 2019, affecting 14.2 million people2. In Brazil, one of the countries undergoing rapid population ageing, the Department of Informatics of the Unified Health System (Datasus) recorded 480,652 hospitalisations in ten years (2008–2018), an increase of 76.9%3.

It is challenging to assess perioperative risk in older people, including those with hip fractures, because cognitive impairment, multimorbidity, frailty, and other age-related physiological decline increase the risk of poor outcomes4. Several prognostic scores have been studied to assist general perioperative care6,7,8, such as the American Society of Anesthesiologists (ASA) score7,8,9, one of the most widely used. However, no single tool has proven superior to the others for assessing perioperative risk in older people. Other commonly suggested tools are frailty screenings, cognitive and functional assessments, and objective tools that are part of the comprehensive geriatric assessment5,9. In addition, perioperative risk assessment can be tailored to specific surgical situations, such as hip fractures5. Because it was developed and validated for older people in the UK, the Nottingham Hip Fracture Score (NHFS) has been proposed as a scoring system to guide decision-making and clinical care in older people with hip fractures10.

The NHFS was published in 2008, based on a cohort of 4967 patients from Queen’s Medical Center (Nottingham, UK)10. It was designed as a predictive tool for 30-day mortality after hip fracture and validated for 1-year mortality11. The score is multi-domain, quantitative, and easy to apply. It assesses age, gender, comorbidities, malignancies, haemoglobin levels, and cognition through a short questionnaire, and it infers frailty by identifying those living in long-term care facilities (LTCF). The NHFS has been validated outside the UK, including in Dutch, Australian, Swedish, Chinese, and Greek patients12,13,14,15,16. However, evidence of its applicability in low- and middle-income countries remains insufficient. Indeed, the prevalence of multimorbidity, the place of residence, and cognitive impairment, all factors included in the NHFS, may differ between countries. For instance, South America has a higher prevalence of comorbidities and dementia than Europe, along with fewer people living in care homes17,18,19. In addition, the patient’s journey differs from that in developed countries, where surgery takes place within two days of admission, while in Brazil and Chile it generally takes more than five days7,20,21,22. This delay could, in part, influence postoperative outcomes, with higher mortality rates in Brazil20,21.

Therefore, it is worth investigating whether the NHFS performs well in the Brazilian setting, which has yet to be explored. This cohort study aimed to evaluate the performance of the NHFS in predicting 30-day mortality after hip fracture in an older Brazilian population at a public hospital in the countryside of São Paulo State. Considering the challenge of assessing perioperative risk in older people and the heterogeneity of perioperative care in Brazil, implementing the NHFS can improve care by identifying priorities, supporting decision-making, facilitating communication with patients and families, and guiding therapy by informing multidisciplinary teams and critical care needs5. In addition, having a preoperative tool validated in different populations and healthcare systems can help gather and compare data from various centres and facilitate the implementation of best practices.

Methods

Study design and setting

We performed an observational study at a tertiary university hospital in Botucatu (São Paulo State, Brazil). The hospital provides care for the local community and a catchment population of two million people23. Data were collected routinely as part of care provision, commencing during the COVID-19 pandemic. In April 2022, the Institutional Ethics Board approved the inclusion of these routinely collected data as part of a cohort study (Number 49339121.2.0000.5411). The “Cohort of Older People Who Suffered Low-impact Trauma and Hip Fracture” (COeSTa-NHFS Study) was used to identify the present sample, which was observed from admission to 30 days after fracture. Data were collected in REDCap (14.4.0, Vanderbilt University) and then exported to Microsoft Excel (Microsoft 365 MSO, Version 2409, Build 16.0.18025.20030). The manuscript was written following the STROBE checklist.

Patients

Patients met the inclusion criteria if they were aged 60 years or above, were admitted due to low-energy trauma, defined as a fall from standing height or less24,25, had a hip fracture (femoral neck, intertrochanteric, or subtrochanteric), and were scheduled for surgery. Subjects were not included if they had a suspected non-osteoporotic fracture, were receiving terminal palliative care because of previous severe illness, refused surgery, or were transferred to another hospital. In addition, participants were excluded if they died before surgery or had missing data needed to calculate the NHFS, as shown in Fig. 1.

Fig. 1

Flowchart representing the patient’s trajectory throughout the cohort follow-up of patients with hip fracture in a tertiary centre from Sao Paulo, Brazil, 2020–2023.

Data variables and sources

The Internal Medicine Perioperative Assessment and Risk Management generally evaluates functional capacity, ASA score, and cardiac, nutritional, pulmonary, renal, thromboembolic, and delirium risk, together with the management of comorbidities. For patients with hip fractures, the NHFS and the Abbreviated Mental Test Score (AMTS)26 for cognitive screening were added to the general clinical assessment. Regarding surgical treatment, most femoral neck fractures were treated with arthroplasty, or with internal fixation if nondisplaced or valgus impacted. Intertrochanteric and subtrochanteric fractures were fixed with intramedullary nails or sliding hip screws27.

Family members or medical records were consulted to obtain clinical and laboratory data. For patients without an available AMTS, if the clinical examination described them as “unconscious” or “confused,” we inferred AMTS ≤ 6, which is the NHFS cutoff for cognitive dysfunction. Conversely, if they were described as “conscious” or “orientated in space and time,” we considered AMTS > 6. The participant was excluded if there were conflicting opinions or the information could not be obtained. We inferred the data for 39 (7.7%) patients.

Haemoglobin was measured by colourimetric assay. Comorbidities were defined as preexisting cardiovascular disease (hypertension, atrial fibrillation, valvular heart disease, heart failure, ischaemic heart disease), cerebrovascular disease (stroke or transient ischaemic attack), respiratory disease (asthma or chronic obstructive pulmonary disease), malignant non-invasive skin cancer, previous renal disease, or diabetes. Acute kidney injury and pulmonary infection were not considered.

Strategies to prevent bias included double-checking sorted inputs, including all subjects who met the inclusion criteria in the 3.5-year time frame, and keeping the statistical analysis blinded and separate from data collection. The primary outcome, 30-day mortality, was ascertained from electronic medical records, follow-up medical appointments, or telephone calls. Our primary exposure variable was the NHFS. ASA was also investigated because it is a classic score used worldwide.

The secondary outcomes, such as postoperative complications, were extracted from medical records at any time during hospitalisation. They included pneumonia, thromboembolic events (pulmonary embolism and/or deep venous thrombosis), infection (urinary tract infection, surgical site infection), and delirium.

Assessing perioperative risk

NHFS, ASA, and functional capacity in metabolic equivalents (METs) were used to assess the general perioperative risks. The NHFS ranges from zero to ten and is the sum of all items. The items and points used to calculate the NHFS are age ≤ 65 years (zero points), 66–85 years (three points), and ≥ 86 years (four points), plus one point for each of the following: male gender, haemoglobin at admission (≤ 10 g/dL), living in an LTCF, number of comorbidities (greater than or equal to two), malignancy in the last 20 years, and AMTS at admission (≤ 6 out of 10)10. Considering that the score is objective and the items involved are accepted worldwide, the English version of the NHFS was used except for the AMTS, which was applied in the version cross-culturally adapted to Brazilian Portuguese28. The NHFS is described in detail in the supplementary file (S1).
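The item scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's data-entry tool: the `Patient` class, its field names, and the example patient are hypothetical, while the point values and the > 4 cut-off are those stated in the text.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Admission data needed for the NHFS (field names are illustrative)."""
    age_years: int
    male: bool
    haemoglobin_g_dl: float
    lives_in_ltcf: bool
    n_comorbidities: int
    malignancy_last_20_years: bool
    amts: int  # Abbreviated Mental Test Score, 0-10


def nhfs(p: Patient) -> int:
    """Sum the item points described in the text (range 0-10)."""
    if p.age_years <= 65:
        score = 0
    elif p.age_years <= 85:
        score = 3
    else:
        score = 4
    score += int(p.male)
    score += int(p.haemoglobin_g_dl <= 10.0)
    score += int(p.lives_in_ltcf)
    score += int(p.n_comorbidities >= 2)
    score += int(p.malignancy_last_20_years)
    score += int(p.amts <= 6)
    return score


def high_risk(p: Patient) -> bool:
    """NHFS > 4 is the cut-off used in the survival analysis."""
    return nhfs(p) > 4


# Hypothetical patient: 82-year-old woman, Hb 9.6 g/dL, three comorbidities.
example = Patient(82, False, 9.6, False, 3, False, 8)
print(nhfs(example), high_risk(example))
```

For the hypothetical patient, age contributes three points and haemoglobin and comorbidities one point each, so the score is 5 and the patient falls in the NHFS > 4 group.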

ASA physical status was graded on levels ranging from 1 to 6, as previously described7,8. Functional capacity was assessed in METs using the Brazilian version of the Duke Activity Status Index (DASI), where < 4 was defined as poor functional capacity29.

Sample size

The sample size was calculated with a power of 0.8 and an alpha error of 5%. According to Green et al.30, for a small effect size and two predictors, nearly 480 patients should be included.
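As a rough illustration of how a figure of this magnitude can arise, the sketch below uses the approximation commonly attributed to Green for multiple regression at a power of 0.8 and alpha of 0.05, N ≥ L / f², with L = 6.4 + 1.65m − 0.05m² for m predictors. It is an assumption that this is the relevant rule; the text does not state which formula or table was used, and the function name and the conventional small effect size f² = 0.02 are illustrative.

```python
import math


def green_sample_size(n_predictors: int, f_squared: float) -> int:
    """Approximate minimum N for power 0.8 and alpha 0.05 (assumed rule of thumb)."""
    lam = 6.4 + 1.65 * n_predictors - 0.05 * n_predictors ** 2
    # Round before taking the ceiling to avoid floating-point artefacts.
    return math.ceil(round(lam / f_squared, 6))


# Two predictors and a conventional "small" effect size (f^2 = 0.02).
print(green_sample_size(2, 0.02))
```

The approximation gives 475 for two predictors and a small effect, which is of the same order as the roughly 480 patients stated above.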

Statistical analysis

Normality tests, such as the Shapiro–Wilk test, were used to assess the distribution of continuous variables. Normally distributed variables were expressed as means ± standard deviation and non-normally distributed variables as medians with 25th and 75th percentiles. Categorical data were described as absolute numbers (N) and frequencies (%).

Univariate analysis assessed the differences between survivors and patients who died within 30 days, using the Student t-test or the Mann–Whitney test according to the distribution pattern. Fisher’s exact test or the χ² test was used to compare categorical variables. The Z-test was used to compare the proportions between the observed mortality at each NHFS level in our study and the mortality estimated by the NHFS.
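The comparison of an observed mortality proportion with an expected one can be sketched as a one-sample Z-test. This is a minimal sketch of the general technique under the normal approximation, not necessarily the exact variant used in the study; the function name and the counts in the example are hypothetical.

```python
import math


def one_sample_proportion_z(deaths: int, n: int, expected: float) -> tuple[float, float]:
    """Return (z, two-sided p-value) for observed deaths/n against an expected proportion.

    `expected` must lie strictly between 0 and 1.
    """
    observed = deaths / n
    standard_error = math.sqrt(expected * (1.0 - expected) / n)
    z = (observed - expected) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical score level: 12 deaths among 100 patients, 10% expected.
z, p = one_sample_proportion_z(12, 100, 0.10)
print(round(z, 3), round(p, 3))
```

With few patients at a score level, the normal approximation becomes unreliable, which is one reason why sparsely populated levels can behave differently in this kind of comparison.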

Survival models were constructed using Cox proportional hazards regression, time-dependent Cox regression, and Kaplan–Meier survival curves to investigate the performance of NHFS and ASA in predicting mortality. Data from patients, alive or not, were censored 30 days after the fracture. NHFS and ASA were analysed as categorical variables (> 4 vs. ≤ 4 and ≥ 3 vs. < 3, respectively). These cut-offs were selected based on previous studies, which identified a high rate of complications above these levels5,8,10.
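To make the censoring at 30 days concrete, the following is a minimal pure-Python sketch of the Kaplan–Meier product-limit estimator. It is illustrative only; the analyses in the study were run in statistical software, and the function name and toy follow-up data below are hypothetical.

```python
def kaplan_meier(times, events, horizon=30):
    """Return [(time, survival probability)] at each death time up to `horizon`.

    times  : follow-up in days for each patient
    events : True if the patient died at that time, False if censored
    Follow-up beyond `horizon` is administratively censored at `horizon`.
    """
    data = []
    for t, died in zip(times, events):
        if t > horizon:
            data.append((horizon, False))
        else:
            data.append((t, bool(died)))

    survival = 1.0
    curve = []
    for t in sorted({u for u, died in data if died}):
        at_risk = sum(1 for u, _ in data if u >= t)
        deaths = sum(1 for u, died in data if u == t and died)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy data: deaths at days 5 and 20; one death at day 40 is censored at day 30.
print(kaplan_meier([5, 12, 40, 30, 20], [True, False, True, False, True]))
```

In the toy data, survival drops to 4/5 at day 5 and to 4/5 × 2/3 at day 20, because the patient censored at day 12 has left the risk set by then.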

All variables were evaluated for proportionality of hazards before the Cox regression analysis. Categorical variables were assessed using Kaplan–Meier curves, and continuous variables were evaluated using a scatter plot of time against the residuals obtained from their isolated evaluation in Cox regression. After these assessments, sex and age did not present proportional hazards, while NHFS, ASA, and fracture type did.

Therefore, the first model was built to evaluate the capacity of the NHFS to predict 30-day mortality. In the univariate analysis, the classification of fractures as extracapsular or intracapsular was associated with mortality. Accordingly, Cox proportional hazards regression was performed with the NHFS and the fracture classification, since both presented proportional hazards. Sex and age were not included in these adjustments because the NHFS calculation already considers these variables.

The second model evaluated the capacity of the ASA score to predict 30-day mortality. The variables associated with mortality in the univariate analysis, age and fracture classification, were added to the model. Sex showed no association but was added to the model due to its epidemiological importance. Because age did not present proportional hazards, the analysis used time-dependent Cox regression, with age entered as a time-dependent covariate.

The Kaplan–Meier survival curves were built and compared with the log-rank test, with results expressed as a hazard ratio and 95% confidence interval. The accuracy of NHFS and ASA in predicting 30-day mortality was assessed by the receiver operating characteristic (ROC) curve, represented by a plot of sensitivity vs. 1 − specificity, the area under the curve (AUC), and a 95% confidence interval for the AUC. The highest Youden index [(sensitivity% + specificity%) − 100] was used to find the cutoff value.
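The cutoff selection by the Youden index and the AUC can be sketched as follows. This is a minimal sketch on hypothetical toy data, assuming that a score at or above the cutoff counts as a positive prediction; the function names are illustrative, the index is expressed here as a proportion rather than a percentage, and the AUC is computed through its rank (Mann–Whitney) interpretation rather than by the software used in the study.

```python
def youden_cutoff(scores, died):
    """Return (cutoff, J, sensitivity, specificity) maximising J = sens + spec - 1.

    Assumes at least one death and one survivor in the data.
    """
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, died) if d and s >= c)
        fn = sum(1 for s, d in zip(scores, died) if d and s < c)
        tn = sum(1 for s, d in zip(scores, died) if not d and s < c)
        fp = sum(1 for s, d in zip(scores, died) if not d and s >= c)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1.0
        if best is None or j > best[1]:
            best = (c, j, sensitivity, specificity)
    return best


def auc(scores, died):
    """Probability that a random death has a higher score than a random survivor."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: eight patients with their score and 30-day outcome.
toy_scores = [2, 3, 4, 4, 5, 5, 6, 7]
toy_died = [False, False, False, False, False, True, True, True]
print(youden_cutoff(toy_scores, toy_died), auc(toy_scores, toy_died))
```

In the toy data the index peaks at a cutoff of 5, where all three deaths are detected and four of five survivors are correctly classified.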

Patients with NHFS > 4 vs. ≤ 4 were compared regarding complications using the categorical variable tests described above.

The analyses were performed using RStudio (version 4.2.2)31 and Jamovi (version 2.3)32. GraphPad 6.0 was used to draw the graphs, and QGIS 3.12.0 was used to draw the map. All analyses considered a significance level of 5% (p-value < 0.05). Missing data were not considered, as shown in the results.

Results

Baseline general characteristics

Between 1 August 2020 and 31 December 2023, 581 patients presented with a hip fracture and were evaluated by the internal medicine team for perioperative assessment. Five hundred forty-eight met the inclusion criteria, and after the initial evaluation, forty-five patients were excluded, as shown in Fig. 1. Thus, 503 patients were included. They were of advanced age (79.4 ± 9.31 years) and predominantly women, 369 (73%). Falling from height (including stumbling and dizziness) was the primary trauma mechanism, 437 (87%). Only 161 patients had information about the type of shoes; of these, 111 (69%) wore slippers or flip-flops during the fall. Extracapsular hip fractures (transtrochanteric and subtrochanteric) represented 58% of cases, while intracapsular fractures represented the remainder. Treatment of osteoporosis was infrequent, reaching 21 (4%) patients. More than two comorbidities were present in 223 (44%), and polypharmacy was frequent, with 262 (52%) of the population studied using five or more medications (Supplementary file S2).

Regarding the patient pathway from fracture to outcome, Fig. 2 shows that patients came from different municipalities in the region covered by the study centre. Most patients (83%) arrived at the hospital within 48 h after the fracture; however, the median time from admission to surgery was 5 (3–8) days, and the median length of stay was 7 (5–11) days.

Fig. 2

Map showing the cities where the hip fractures occurred among patients hospitalised in a tertiary centre from Sao Paulo, Brazil, 2020–2023.

Univariate analysis of 30-day mortality or survival

Table 1 compared two groups according to the outcome of 30-day mortality: 458 (91%) survived the first 30 days after fracture, and 45 (9%) died in the same period. The analysis of the variables needed to calculate the NHFS showed no significant differences concerning gender, malignancies, and comorbidities. The patients in the surviving group were nearly ten years younger than those in the 30-day mortality group (p-value < 0.001). Living in an LTCF and lower AMTS and haemoglobin levels were more frequent in the 30-day mortality group.

Table 1 Comparative analysis of patients who underwent surgical procedures to treat hip fracture and 30-day mortality in a tertiary centre from Sao Paulo, Brazil, 2020–2023.

More people in the 30-day mortality group presented NHFS > 4, and their mean score was also higher (p-value < 0.001 for both). Among the traditional risk scores, a significant difference in outcome was observed between patients with ASA scores ≥ 3 and those with lower scores (p-value = 0.036), but no difference was found for functional capacity assessed in METs.

Extracapsular and intertrochanteric fractures were more common in the 30-day mortality group (p-values = 0.017 and 0.025, respectively). Although 421 (84%) patients had surgery more than 48 h after admission, neither the delay in surgery nor the number of days from admission to surgery differed between the survivors and those who died within 30 days (p-value = 0.36 and p-value = 0.35, respectively). However, a secondary analysis showed that the patients who underwent surgery within the first two days were older, 82.5 years (75.5–89.0), and had a higher NHFS, 5 (4–6), when compared with the survivor’s age, 80.0 years (72.0–86.0), and NHFS, 4 (4–5); p-value = 0.037 and p-value = 0.036, respectively.

NHFS performance in 30-day mortality and complications

Table 2 showed that the mortality rates at each level of NHFS in the present study’s population were similar to those of a cohort conducted in three centres in the UK33. No significant difference was found between the observed and estimated proportions for almost all NHFS scores, except for the initial scores (0 and 2). This difference is likely due to the small number of cases with these two scores.

Table 2 NHFS observed in patients with hip fracture and 30-day mortality in a tertiary centre from Sao Paulo, Brazil, 2020–2023, compared with a Cohort in the UK and estimated mortality by each score category for preoperative assessment.

A Kaplan–Meier curve showed the 30-day survival probability according to the NHFS cutoff at 4 (Fig. 3A). Mortality at the end of follow-up differed between the groups (HR 3.94; 95% CI 2.19–7.07; p-value < 0.001). Figure 3B showed the curve for ASA ≥ 3 (HR 2.58; 95% CI 1.43–4.63; p-value = 0.001).

Fig. 3

Kaplan–Meier survival curves. (A) NHFS ≤ 4 vs. > 4 and 30-day mortality (HR 3.94; 95% CI 2.19–7.07; p-value < 0.001). (B) ASA 1–2 vs. 3–4 and 30-day mortality (HR 2.58; 95% CI 1.43–4.63; p-value = 0.001).

Figure 4A presents the ROC curves with the corresponding sensitivity and 1 − specificity values for the accuracy of NHFS and ASA in predicting 30-day mortality. NHFS demonstrated a reasonable AUC of 0.743 (95% CI 0.677–0.810), with a cut-off at 5 (sensitivity 82.2%; specificity 53.7%). ASA showed a lower AUC of 0.648 (95% CI 0.566–0.730). Figure 4B showed that the NHFS discriminated 30-day mortality better in neck fractures (AUC 0.762; 95% CI 0.628–0.896) than in extracapsular fractures (AUC 0.715; 95% CI 0.634–0.797).

Fig. 4

ROC curves for postoperative mortality after hip fracture surgery. (A) Nottingham Hip Fracture Score (NHFS) in green and American Society of Anesthesiologists (ASA) score in blue. (B) NHFS in neck fracture patients in blue and in subtrochanteric and intertrochanteric fracture patients in green.

In the Cox multiple regression adjusted for fracture classification, NHFS > 4 increased the risk of 30-day mortality (HR 4.55; 95% CI 2.11–9.82; p-value < 0.001), as did ASA score, adjusted for fracture classification, age, and gender (HR 2.43; 95% CI 1.21–4.85; p-value = 0.012). The Cox analysis is available in Table 3.

Table 3 Cox regression to assess the association of NHFS and ASA scores with 30-day mortality in a tertiary centre from Sao Paulo, Brazil, 2020–2023.

Finally, 249 (49.5%) patients had NHFS > 4, and they presented more overall surgical complications, 87 (36%), than patients with NHFS ≤ 4, 46 (18%); p-value < 0.001. The complications included pneumonia, delirium, thromboembolic events, and infection. Data are available in the supplemental file (S3).

Discussion

After implementing the institutional care protocol for preoperative assessment of older patients with hip fractures, the database analysis showed that even in a single centre outside Europe, in the countryside of São Paulo State, Brazil, the NHFS was associated with 30-day mortality, with a performance very similar to that of the original and other cohorts. Our study reflects a real-world clinical application, as the scores were used as intended by a large group of physicians and residents in their routine practice, supervised by trained internal medicine clinicians. Since information about perioperative risk in older people with hip fractures in Brazil is lacking, the novelty of the present study is to show that the NHFS can help detect risks, guide in-hospital care and decision-making, and describe the epidemiology of these patients.

The analysis of whether the NHFS would fit our setting started by comparing the characteristics of our population with those of the hip fracture population in which the NHFS originated, published in 2008. Demographically, the descriptive analysis showed that our population had similarities with the one included in creating and validating the NHFS in the UK10. The predominance of the female gender (~ 75%), the frequency of altered AMTS (~ 40%), and a mean age near 80 years were observed in both studies. More than two comorbidities were twice as frequent in our study as in the original UK study10. Another difference occurred in the item living in LTCF, which was present in 3% of the patients in this study versus 25% in the British one. In fact, in Brazil, nearly 1% of the population lives in care homes19, which is lower than in the UK, estimated at 2.5% in 202134. Therefore, the populations were similar, with some particularities.

Nevertheless, the NHFS performance in the present study was very similar to that in the UK population. Our findings indicate an association between higher NHFS values on admission and the probability of death 30 days after the fracture. The actual mortality of our patients was similar to that observed in the original study published in 2008 (9.0% vs 10.2%, respectively)10. After that, in 2010, the UK started the “best practice tariff” (BPT) strategy, incentivising hospitals to deliver best practice care to older people with hip fractures35. Later, a study published in 2012 validated the NHFS during the transition to BPT implementation in three centres in the UK, when the 30-day mortality decreased to 6.6%33. More recently, in 2021, another UK study recalibrated the NHFS after BPT in a single centre where the 30-day mortality was 6.3%36. Interestingly, NHFS > 4 indicated higher risks in all UK studies. The same was observed in our study, where NHFS > 4 increased 30-day mortality risk by 4.55 times, even after adjustment for the anatomical fracture type. The survival curve also illustrated the differences between NHFS of four and below vs. five and above.

In addition, the AUC of the ROC curve in our study (0.743) was slightly better than that of the original one in the UK (0.715)10. Since the UK study included only neck fractures and we included all proximal femoral fractures (intracapsular and extracapsular), two other ROC curves were drawn separately, one for neck fractures (AUC = 0.762) and the other for subtrochanteric or intertrochanteric fractures (AUC = 0.715). All the curves (AUC from 0.7 to 0.8) suggested acceptable discrimination of the NHFS for 30-day mortality37.

Another aspect explored in the present study was the progressive increase in 30-day mortality across NHFS levels from 1 to 10. Because Brazil has no BPT, only isolated initiatives such as the organisation of interprofessional clinical care in the present study, we compared mortality at each level of the score with that of the multicentre UK study, which was performed during the transition to BPT implementation33. As observed in Table 2, there were no differences in mortality by NHFS level between the present study and the one conducted in the UK, nor between the observed mortality in Brazil and the mortality estimated by the tool.

The capacity of the NHFS to predict mortality extends beyond the UK. A single-centre study in Australia reported a predictive value of 0.760 (0.631–0.888)13, and a study in China reported 0.791 (0.709–0.873)15. In a Dutch population of older people with intracapsular hip fractures, the mortality observed was 10% for NHFS 5 and 19.9% for NHFS 6, similar to our data12. Therefore, the NHFS applies to different populations and continents, even with contrasts in individual characteristics.

Regarding the use of the NHFS over other screening tools, its performance is comparable to, and sometimes better than, that of other preoperative predictive scores. In previous studies, the 30-day predictive values of ASA38 (0.600; 0.488–0.711), Donati10,39 (0.717; 0.699–0.735), Charlson Comorbidity Index (CCI)38 (0.590; 0.482–0.698), and POSSUM38 (0.635; 0.518–0.751) were not superior to those of the NHFS in the original study and in ours. Also, compared with general predictive models for non-cardiac surgery, its performance was not inferior to that of frequently used scores such as Lee’s Revised Cardiac Risk Index40 (AUC 0.620; 0.54–0.78). The data in the present study showed that ASA ≥ 3, adjusted for sex, age, and anatomical type of fracture, increased the risk of 30-day mortality by 2.4 times, which was lower than the hazard ratio for the NHFS. In a qualitative systematic review, the authors identified 25 scoring systems in hip fracture patients. Only ASA, CCI, and NHFS were used preoperatively in more than two studies. The authors, therefore, concluded that ASA did not perform well because it is not robust enough in this population41. Other studies suggested that ASA relies on a subjective evaluation5,15, and most of our patients were ASA ≥ 3, reducing its discriminative power. Therefore, the advantage of the NHFS over the other scores is that it is brief, objective, and easy to apply, with few variables, while maintaining its predictive accuracy and good performance in different countries12,13,14,15,16.

In addition to the 30-day mortality, more complications were observed in patients with NHFS > 4, but complications were not more frequent in the patients who died within 30 days. A previous study reported the same finding, and its authors suggested that complications might be related to some pathophysiological characteristics, such as sarcopenia, not measured in the score18.

The study has limitations, such as the single-centre data collection in one region of Brazil, where data were collected as part of care provision and the medical residents rotated every month. The sample size is smaller than that of the UK studies, and few patients had NHFS values from 7 to 10. In addition, the care pathway was markedly different from that of the UK centres, since most surgeries were performed more than 48 h after trauma, and discharge generally happened the day after surgery. Fewer than 20% of the patients underwent surgical correction in the first two days after admission. Indeed, the centre of this study is a 500-bed hospital that serves 2 million people in the regions named “Polo Cuesta and Vale do Jurumirim.” A tertiary hospital that serves 2 million people in the UK has three times more beds42. So, it is possible to observe a higher bed pressure in our overloaded hospital. Nevertheless, our setting is similar to that of other centres in Latin America. For instance, the time to surgery in one public health system in Chile was 5 (3–9) days22. The time from fracture to surgery and from hospital admission to surgery did not influence mortality in our study. However, patients with higher age and NHFS were prioritised for early surgery. Thus, such differences may have affected the findings and the observed higher mortality. There were 22 (4.3%) missing data for the ASA score and 39 (7.7%) for the AMTS, which were inferred as > 6 or ≤ 6. However, all the patients were reached at 30 days to collect the outcomes. The strengths of the study are the consistency of the findings when compared with other countries and the innovation of using the NHFS in the real world with data collection in Brazil.

It is essential to note that Brazilian data on 30-day mortality are scarce, ranging from 17.65 to 18.4%43 in two retrospective studies, which is excessively high. Implementing risk screening such as the NHFS helps guide therapy, focusing on more caution and delivering multidisciplinary care for patients with scores > 4. It can also help gather, compare, and audit data in one or more centres. Setting a universal perioperative risk score, like the NHFS, could work as a trigger or a sentinel to rethink the policy implementation of hip fracture care in Brazil and other middle- and low-income countries.

Conclusions

The presented data indicate that the NHFS is easily implementable in real-world clinical practice and that it estimated the 30-day mortality risk after hip fracture in older patients similarly to the original study. Even in different settings, an NHFS higher than 4 increases the risk of 30-day mortality, raising the relevance of the score to inform clinical practice. The present study might motivate other centres to consider the NHFS in their perioperative risk assessment routine. Future research could explore the performance of the NHFS in supporting audit and data gathering, guiding therapy and decision-making, and informing national policy implementation.