Introduction

An ultra-marathon is defined as any running race longer than the classical marathon distance or any running event of 6 h or longer1. Ultra-marathons can be held as distance-limited races such as 50 km, 50 miles, 100 km, 100 miles or longer2 or as time-limited races of 6 h, 12 h, 24 h, 48 h, 72 h, 6 days, and 10 days3. In addition to these single-stage ultra-marathons, multi-stage ultra-marathon races also exist, where athletes must cover an ultra-marathon distance daily for several days4.

In September 1977, Jim Shapiro made a significant impact on public awareness of ultra-marathon running when he wrote a magazine article that was published in many Sunday newspapers (https://ultrarunninghistory.com/wp-content/uploads/2018/11/shapiro1977.pdf). Ultra-marathon running has enjoyed high and unbroken popularity for decades (www.runnersworld.com/uk/news/a36528666/ultrarunning-more-popular-than-ever/). Today, hundreds of thousands of runners participate in several thousand ultra-marathon running races held worldwide (https://worldathletics.org/disciplines/ultra-running/ultra-running). Among the time-limited ultra-marathons, and despite being the fourth-longest run, the 48-hour run is less popular than shorter races like the 24-hour run; nonetheless, its popularity is increasing5. A study investigating time-limited races of 6 h to 10 days found that most runners preferred to compete in 6 h, 12 h, and 24 h races5.

The relatively low interest in athletes competing in 48 h is also present in scientific research, where only a few studies investigated 48-hour runners. Studies investigated aspects of training6, body composition6, the age of peak performance7,8, the influence on the metabolism9,10, and the sex difference in performance11.

Since the 48-hour run is the race of particular interest in this investigation, yet still growing and lacking a large sample of participants, it would be of interest to know where the fastest 48-hour ultra-marathoners originate from. Older studies reported that 6-hour12 and 12-hour13 runners mainly originated from Europe.

However, nothing is known about the origin of athletes competing in this specific race format or about where the fastest races are held. Therefore, we intended to investigate (i) where the fastest 48-hour runners originate from and (ii) where the fastest race courses in 48-hour running are located. Based on the existing reports for 6- and 12-hour races, we hypothesized that the fastest 48-hour runners would, like the 6-hour and 12-hour ultra-marathoners, originate from Europe.

Method

Ethical approval

This study was approved by the Institutional Review Board of Kanton St. Gallen, Switzerland, with a waiver of the requirement for informed consent of the participants as the study involved the analysis of publicly available data (EKSG 01/06/2010). The study was conducted in accordance with recognized ethical standards according to the Declaration of Helsinki adopted in 1964 and revised in 2013.

Data analysis

Data download

The race data were downloaded from the official website of Deutsche Ultramarathon Vereinigung (DUV) (https://statistik.d-u-v.org/geteventlist.php) using a Python script. Each race record included the athlete’s first and last name, age group, gender, country of origin, race location and year, race duration, and performance (i.e., achieved distance in km and running speed in km/h). Country information is shown with three-letter (ISO 3) country codes. We also included race course characteristics such as elevation (flat or hilly) and running surface (asphalt, cement, granite, grass, gravel, sand, track, or trail) as reported in the race course descriptions for the different races.

Data pre-processing

The data were checked for consistency, and any impossible (too fast running speed) or incomplete records (e.g. missing first name, last name, age, performance) were removed. Additionally, to reduce noise and maintain the statistical significance of the results, race records from athlete countries with only one unique runner were removed. While these measures allow for an easier interpretation of the results, we must acknowledge that some countries and runners that would otherwise have been included were thereby excluded.
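The filtering steps described above can be sketched as follows. This is a minimal illustration in plain Python, not the script used in the study: the field names (`first_name`, `speed_kmh`, `athlete_country`, and so on), the helper `clean_records`, and the 10 km/h plausibility ceiling are all assumptions made for the example, since the cut-off that was actually applied is not reported.

```python
from collections import defaultdict

# Assumed field names; the real DUV export may label these differently.
REQUIRED_FIELDS = ("first_name", "last_name", "age_group", "speed_kmh")


def clean_records(records, max_speed_kmh=10.0):
    """Drop incomplete or implausible records, then single-runner countries.

    max_speed_kmh is a hypothetical plausibility ceiling chosen for this
    sketch; the study does not state the threshold it used.
    """
    # Step 1: keep only complete records with a plausible running speed.
    valid = [
        r for r in records
        if all(r.get(f) not in (None, "") for f in REQUIRED_FIELDS)
        and 0 < r["speed_kmh"] <= max_speed_kmh
    ]

    # Step 2: collect the unique runners of every athlete country.
    runners_by_country = defaultdict(set)
    for r in valid:
        runners_by_country[r["athlete_country"]].add((r["first_name"], r["last_name"]))

    # Step 3: remove countries represented by only one unique runner.
    return [r for r in valid if len(runners_by_country[r["athlete_country"]]) > 1]


records = [
    {"first_name": "A", "last_name": "One", "age_group": 45, "speed_kmh": 6.1, "athlete_country": "USA"},
    {"first_name": "B", "last_name": "Two", "age_group": 50, "speed_kmh": 5.4, "athlete_country": "USA"},
    # Implausibly fast for a 48-hour race.
    {"first_name": "C", "last_name": "Three", "age_group": 40, "speed_kmh": 25.0, "athlete_country": "USA"},
    # Incomplete: missing first name.
    {"first_name": "", "last_name": "Four", "age_group": 35, "speed_kmh": 5.0, "athlete_country": "FRA"},
    # Only runner from this country.
    {"first_name": "E", "last_name": "Five", "age_group": 45, "speed_kmh": 5.8, "athlete_country": "ISL"},
]
cleaned = clean_records(records)
```

Identifying a runner by first and last name alone is itself a simplification; namesakes would be merged under this rule.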

Country rankings by number of race records and unique runners

The dataset was aggregated by the values in the athlete country column and then sorted by participation (number of records) to rank the athletes’ countries by number of race records. Similarly, to rank the countries holding the events by number of race records, the dataset was aggregated by the values in the Event country column and then sorted by participation. In all cases, running speed descriptive stats (mean, std, max, and min) and the number of race records and unique runners are calculated for each country. The mean race (running) speed is color-coded, with darker colors corresponding to higher values (faster running speeds).
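The aggregation described above might look like the following sketch. It uses only the Python standard library so that it is self-contained; the study reports using pandas. The field names (`runner_id`, `athlete_country`, `event_country`, `speed_kmh`) and the helper `rank_countries` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def rank_countries(records, country_key):
    """Aggregate records per country and sort by number of race records."""
    groups = defaultdict(list)
    for r in records:
        groups[r[country_key]].append(r)

    table = []
    for country, rows in groups.items():
        speeds = [r["speed_kmh"] for r in rows]
        table.append({
            "country": country,
            "records": len(rows),
            "unique_runners": len({r["runner_id"] for r in rows}),
            "mean": mean(speeds),
            # The sample standard deviation needs at least two values.
            "std": stdev(speeds) if len(speeds) > 1 else 0.0,
            "max": max(speeds),
            "min": min(speeds),
        })
    # Highest participation first; ties are broken alphabetically.
    return sorted(table, key=lambda row: (-row["records"], row["country"]))


records = [
    {"runner_id": 1, "athlete_country": "USA", "event_country": "USA", "speed_kmh": 5.0},
    {"runner_id": 1, "athlete_country": "USA", "event_country": "FRA", "speed_kmh": 6.0},
    {"runner_id": 2, "athlete_country": "USA", "event_country": "USA", "speed_kmh": 4.0},
    {"runner_id": 3, "athlete_country": "JPN", "event_country": "FRA", "speed_kmh": 7.0},
    {"runner_id": 4, "athlete_country": "JPN", "event_country": "FRA", "speed_kmh": 8.0},
]
by_athlete = rank_countries(records, "athlete_country")
by_event = rank_countries(records, "event_country")
```

The same function serves both rankings; only the grouping column changes.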

Regression modeling

We built and evaluated a predictive model for 48-hour race performance based on the XG Boost Regressor algorithm. To have a baseline to compare against, an OLS-based Multivariate Linear Regressor (MLR) was also built. The following variables are used as predictors, or inputs to the models:

  • Athlete_gender_ID.

  • Age_group_ID.

  • Athlete_country_ID.

  • Event_country_ID.

  • Elevation_ID.

  • Binary indicators of running surface (RS):

    • RS_asphalt.

    • RS_cement.

    • RS_granite.

    • RS_grass.

    • RS_gravel.

    • RS_sand.

    • RS_track.

    • RS_trail.

The predicted variable, or model output, is the race (running) speed (km/h). Two evaluation metrics, MAE and R2, are calculated, and several model explainability techniques are used, including the model’s relative feature importances, partial dependence plots (PDP), prediction distribution plots, and SHAP aggregated values charts. After several iterations and tests, the optimal XG Boost model parameters and accuracy scores were:

  • 1500 estimators (learners or trees).

  • Learning rate of 0.5.

  • R2 score of 0.49 (in-sample test).

  • MAE of 0.77 km/h.

In comparison, the MLR model obtained an R2 score of 0.178, which is considerably lower, indicating that the XG Boost model captured substantially more of the variability in running speed than the more traditional linear regressor.
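For readers less familiar with the two metrics, the sketch below computes MAE and R2 from their standard definitions. The observed and predicted speeds are made-up numbers for illustration only and are unrelated to the study data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Share of the variance in y_true that the predictions account for."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1 - ss_res / ss_tot


# Illustrative running speeds in km/h (not study data).
observed = [4.0, 5.0, 6.0, 7.0]
predicted = [4.5, 5.0, 5.5, 7.0]

mae = mean_absolute_error(observed, predicted)
r2 = r2_score(observed, predicted)
```

Read this way, an MAE of 0.77 km/h is the typical size of a prediction error, and an R2 of 0.49 means the model accounts for roughly half of the variance in running speed.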

Numerical encoding of categorical variables

Before the XG Boost model can be trained, the predictors’ values must be converted (encoded) into numerical data. The Athlete gender variable was encoded as female = 0 and male = 1. The Age group variable was already numerically encoded in 5-year groups (except group 18, which represents runners under 20 years, and group 75, which represents 75 years and over). To compare with existing studies, we used 5-year age groups although U23 and M/W23 IAAF and WMA categories exist (https://statistik.d-u-v.org/faq.php#CatCalc). The Athlete country and Event country variables were encoded based on their position in the respective rankings, with the countries with the highest participation first. The Elevation variable is encoded as 0 = flat and 1 = hilly. The Running surface variable is one-hot encoded, giving rise to a set of 8 new binary variables that indicate whether the race takes place on that surface (1) or not (0). These flags are mutually exclusive, as each race takes place entirely on a single running surface.
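The encoding scheme can be sketched as below. The output column names follow the predictor list given earlier; the input field names (`gender`, `surface`, and so on), the "F"/"M" labels, and the choice to start the country ranks at 0 are assumptions of this example rather than details confirmed by the study.

```python
from collections import Counter

SURFACES = ("asphalt", "cement", "granite", "grass", "gravel", "sand", "track", "trail")


def rank_codes(values):
    """Map each category to its participation rank (0 = most records)."""
    counts = Counter(values)
    # Sort by descending count; ties are broken alphabetically.
    ordered = sorted(counts, key=lambda c: (-counts[c], c))
    return {category: rank for rank, category in enumerate(ordered)}


def encode(record, athlete_codes, event_codes):
    """Turn one raw race record into the numerical predictors."""
    row = {
        "Athlete_gender_ID": 0 if record["gender"] == "F" else 1,
        "Age_group_ID": record["age_group"],  # already numeric (e.g. 45)
        "Athlete_country_ID": athlete_codes[record["athlete_country"]],
        "Event_country_ID": event_codes[record["event_country"]],
        "Elevation_ID": 0 if record["elevation"] == "flat" else 1,
    }
    # One-hot encoding: exactly one surface flag is set to 1.
    for surface in SURFACES:
        row[f"RS_{surface}"] = 1 if record["surface"] == surface else 0
    return row


records = [
    {"gender": "M", "age_group": 45, "athlete_country": "USA", "event_country": "USA",
     "elevation": "flat", "surface": "track"},
    {"gender": "F", "age_group": 50, "athlete_country": "USA", "event_country": "FRA",
     "elevation": "hilly", "surface": "trail"},
    {"gender": "M", "age_group": 40, "athlete_country": "JPN", "event_country": "FRA",
     "elevation": "flat", "surface": "track"},
]
athlete_codes = rank_codes(r["athlete_country"] for r in records)
event_codes = rank_codes(r["event_country"] for r in records)
encoded = [encode(r, athlete_codes, event_codes) for r in records]
```

Rank-based country codes impose an ordering by participation. Tree-based models can split on such codes freely, whereas a linear model would read them as a numeric scale.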

Model training and evaluation strategy

The optimal hyperparameters were identified by iteratively training and evaluating models with different test splits, numbers of estimators, and learning rates. The best R2 score of almost 0.5 was obtained when no test data were held out, that is, when the full dataset was used for training the model and its predictive accuracy was then tested against this same full dataset (in-sample testing). Holding out test data is of paramount importance when ML models are built for predictive use cases, because in-sample scores overestimate performance on unseen data. However, this is not a problem in our case, as our only aim is to understand the relationships between the predictors and the target variable in the dataset.
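The difference between in-sample and held-out evaluation can be illustrated with a deliberately simple stand-in model. The sketch below does not use XG Boost; it uses a hypothetical lookup model that memorises the mean speed of each feature combination, which mimics how a very flexible model can fit its training rows closely. The data values are invented for the example.

```python
from collections import defaultdict


def fit_lookup_model(rows):
    """Memorise the mean speed of every feature combination seen in training."""
    groups = defaultdict(list)
    for key, speed in rows:
        groups[key].append(speed)
    means = {key: sum(v) / len(v) for key, v in groups.items()}
    overall = sum(speed for _, speed in rows) / len(rows)
    # Unseen feature combinations fall back to the overall training mean.
    return lambda key: means.get(key, overall)


def r2_score(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def score(model, rows):
    return r2_score([s for _, s in rows], [model(k) for k, _ in rows])


# (surface, age group) -> running speed in km/h; illustrative values only.
data = [
    (("track", 45), 6.5), (("track", 45), 6.1),
    (("track", 50), 6.0), (("track", 50), 5.2),
    (("trail", 45), 5.0), (("trail", 45), 5.6),
    (("trail", 50), 4.4), (("trail", 50), 4.8),
]
train, test = data[0::2], data[1::2]  # alternate rows into train and test

model = fit_lookup_model(train)
r2_in_sample = score(model, train)  # scored on the rows it was fitted on
r2_held_out = score(model, test)    # scored on rows it has never seen
```

On this toy data the model reproduces its training rows perfectly but scores below zero on the held-out rows, which is why an in-sample R2 is best read as a description of the dataset rather than as an estimate of predictive accuracy.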

Model interpretation

To assess how each predictor contributed to the model output, we computed the model’s relative feature importances, the PDP plots, and the model prediction distributions. The PDP plots show the relative amount of change in the model output for the different values of each predicting variable with respect to a reference value (value 0). From them, we developed the prediction distribution plots, which use boxplots to show the distribution of the model predictions of average race speed. Additionally, we calculated the SHAP aggregated values, which display in a compact manner how each predictor influences the model predictions, both in magnitude and sign. All computation and analysis were done using a Jupyter notebook (Google Colab) and Python and associated libraries (pandas, numpy, xgboost, pdpbox, sklearn, matplotlib, seaborn…).
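The idea behind a partial dependence curve can be sketched without any plotting library: force one predictor to each value of interest for every row, average the model predictions, and express the result relative to the reference value 0. The `toy_model` below is a hypothetical stand-in for the fitted regressor, with invented coefficients, and the study itself reports using pdpbox for this step.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each value in `grid`."""
    curve = {}
    for value in grid:
        # Copy every row, overriding only the feature under study.
        modified = [{**row, feature: value} for row in rows]
        curve[value] = sum(predict(r) for r in modified) / len(modified)
    return curve


def toy_model(row):
    # Hypothetical stand-in for the trained model (speed in km/h).
    return 5.0 + 0.8 * row["RS_track"] - 0.5 * row["Elevation_ID"]


rows = [
    {"RS_track": 1, "Elevation_ID": 0},
    {"RS_track": 0, "Elevation_ID": 1},
    {"RS_track": 0, "Elevation_ID": 0},
    {"RS_track": 1, "Elevation_ID": 1},
]
pdp = partial_dependence(toy_model, rows, "Elevation_ID", grid=[0, 1])

# Express the curve relative to the reference value 0, as in the PDP plots.
relative = {value: pdp[value] - pdp[0] for value in pdp}
```

With these invented coefficients, moving from a flat (0) to a hilly (1) course lowers the average predicted speed by 0.5 km/h, which is the kind of shift a PDP plot displays.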

Results

A total of 16,233 race records from 7,075 unique runners from 60 different countries participating in races held in 36 different countries between 1980 and 2022 were analyzed. Table 1 summarizes the race records by the 60 countries of origin of the athletes. Participation is well spread across many countries, with USA, France, Germany, and Australia at the top of participants’ rankings. Athletes from Israel, Japan and Iceland have the fastest average running speeds.

Table 1 Athlete country ranking.

Figure 1 presents the race performance by gender and country of origin with the countries with most athletes on top. In several countries, no women competed in this race format.

Fig. 1

Race performance by gender and country of origin.

Table 2 summarizes the events of 36 different countries. The Czech Republic joins USA, France, and Australia in the top rankings of records by country of the event, whilst the fastest average running speeds were obtained in Japan, France, Great Britain, Netherlands, and Egypt.

Table 2 Event country ranking.

Figure 2 presents the race performance by gender and country of event with the countries with most athletes on top. In several events, no women competed in this race format.

Fig. 2

Race performance by gender and country of event.

Participation and men-to-women ratio over years

Figure 3 presents the participation trends across years and the resulting men-to-women ratio. While the number of both women and men increased over years, the men-to-women ratio decreased.

Fig. 3

Participation and men-to-women ratio over the years.
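The yearly men-to-women ratio can be derived from the race records as in the sketch below. The field names (`year`, `gender`), the "F"/"M" labels, and the decision to skip years without female finishers are assumptions of this example; the counts are invented.

```python
from collections import Counter


def men_to_women_ratio(records):
    """Return {year: number of men / number of women} for each year."""
    men = Counter(r["year"] for r in records if r["gender"] == "M")
    women = Counter(r["year"] for r in records if r["gender"] == "F")
    # Years without female finishers have no defined ratio and are skipped.
    return {year: men[year] / women[year] for year in sorted(women) if women[year] > 0}


records = (
    [{"year": 1990, "gender": "M"}] * 3 + [{"year": 1990, "gender": "F"}] * 1
    + [{"year": 2020, "gender": "M"}] * 4 + [{"year": 2020, "gender": "F"}] * 2
    + [{"year": 1985, "gender": "M"}] * 2  # no women: excluded from the ratio
)
ratio = men_to_women_ratio(records)
```

In this toy example both counts grow between 1990 and 2020 while the ratio falls from 3.0 to 2.0, the same qualitative pattern described for Fig. 3.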

Figure 4 presents the race performance by gender and age group. For both women and men, runners in age group 45–49 were the fastest. Figure 5 shows the race performance by gender and running surface. Both women and men were the fastest in track running.

Fig. 4

Race performance by gender and age group.

Fig. 5

Race performance by gender and running surface.

Multivariate linear regressor (MLR)

The MLR model achieved an R2 of 0.178, and all predictors had a significant effect on the model output.

XG boost model evaluation metrics and features relative importances

The optimal XG Boost regressor was built with n_estimators = 1500 and learn_rate = 0.5, and trained and tested over the full sample, obtaining an R2 of 0.49, which indicates a fair effect of the predicting variables on the model output. This means that almost 50% of the 48-hour running speed variability can be explained by the 12 predictors. The relative feature importance analysis indicates that the variables Elevation and RS_track are the most important ones (Fig. 6). This is based on entropy reduction, meaning that those two variables allow splitting the dataset into more deterministic subsets better than the rest. These two variables are also ranked high in the SHAP aggregated analysis with respect to their effect on the model output, confirming them as the most relevant as far as race performance (running speed) is concerned.

Fig. 6

XG boost model evaluation metrics and features relative importances.

SHAP aggregated values chart for XG boost model

The SHAP aggregated values chart ranks the model features by the magnitude of their contribution to the model’s predicted values (Fig. 7) and also indicates the sign of this contribution. The RS_track and Elevation_ID variables, in positions 3 and 5, respectively, show the clearest separation between red and blue dots. This is consistent with the relative feature importance analysis and indicates that track races (RS_track = 1, red dots) accumulate on the right side, indicating faster speeds, whereas hilly races (Elevation_ID = 1, red dots) accumulate on the left, indicating slower speeds. Age_group_ID shows red dots (higher ages) on the left side, hence reducing the race (running) speed; however, values in the mid-range (purple dots) seem to accumulate on the right side, indicating that middle-aged runners achieve slightly faster race (running) speeds. Similarly, Gender_ID, with only two values (0 for women and 1 for men), shows red dots (men’s data points) slightly on the positive side, whilst blue dots (women’s data points) are present on both sides, suggesting that male dominance is not so clear in the 48-hour ultra-marathon race format.

Fig. 7

SHAP aggregated values chart for XG Boost model.
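To make the sign and magnitude shown in a SHAP chart concrete, the sketch below computes exact Shapley values by brute force for a single prediction. It is a simplification under stated assumptions: `toy_model` is a hypothetical stand-in with invented coefficients, and absent features are replaced by a single baseline row, whereas the SHAP library averages over a background dataset and uses faster tree-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x` relative to one baseline row."""
    features = list(x)
    n = len(features)

    def value(subset):
        # Features in `subset` take their value from x, the rest from baseline.
        row = {f: (x[f] if f in subset else baseline[f]) for f in features}
        return predict(row)

    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Shapley weight of a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {f}) - value(coalition))
        phi[f] = total
    return phi


def toy_model(row):
    # Hypothetical speed model (km/h) with one interaction term.
    return (5.0 + 0.8 * row["RS_track"] - 0.5 * row["Elevation_ID"]
            - 0.2 * row["RS_track"] * row["Elevation_ID"]
            + 0.3 * row["Athlete_gender_ID"])


x = {"RS_track": 1, "Elevation_ID": 1, "Athlete_gender_ID": 1}
baseline = {"RS_track": 0, "Elevation_ID": 0, "Athlete_gender_ID": 0}
phi = shapley_values(toy_model, x, baseline)
```

A positive value (here for RS_track) corresponds to a dot on the right of the chart and a negative value (here for Elevation_ID) to a dot on the left; the values sum to the difference between the prediction for `x` and the prediction for the baseline.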

Prediction distributions and target plots value plots for XG boost model

The difference between male and female predictions is ~ 0.7 km/h (median value) or ~ 0.5 km/h (mean target value). Age group 45–49 years leads in both participation and performance (Fig. 8). In the athlete country charts, Japan, Israel and Iceland obtain the fastest predictions (median values) (Fig. 9). In the event country charts, we see a repetition of the descriptive stats with Japan, France, Great Britain, the Netherlands, and Egypt obtaining the highest predictions (Fig. 10).

Fig. 8

Prediction distributions and target plots value plots for age group.

Fig. 9

Prediction distributions and target plots value plots for origin of the athlete.

Fig. 10

Prediction distributions and target plots value plots for the country where the race was held.

Discussion

In this study, we investigated where the fastest 48-hour runners originated and where the fastest 48-hour race courses are located, with the hypothesis that the fastest 48-hour runners would originate from Europe. The main findings were that (i) participation was highest in the USA, France, Germany, and Australia, (ii) the fastest average running speeds were achieved by athletes from Japan, Israel, and Iceland, (iii) the fastest races were held in Japan, France, Great Britain, the Netherlands, and Egypt, (iv) elevation of the course (flat course) and the running surface (track) were the most important variables for fast 48-hour races, and (v) the country of origin of the athlete and the country where the event was held led to the broadest difference in the predicted running speed range. Additional findings were that (vi) men were ~ 0.4 km/h faster than women, and (vii) the fastest running speeds were achieved by runners in age group 45–49 years, and most finishers were in the same age group.

Race course characteristics are the most important predictors

The first and most important finding was that the XG Boost model showed that elevation of the course (flat course) and the running surface (track) were the most important variables.

This finding confirms recent findings for longer running race durations such as the 72-hour14 and the 10-day15 ultra-marathons. In the 72-hour ultra-marathons, the fastest races were held on track, followed by road, and then trail14. Furthermore, flat race courses led to faster running speeds than hilly race courses14. In the 10-day ultra-marathons, running on asphalt was faster than running on any other race course surface15.

Participation in 48-hour races

We found that participation was not dominated by athletes from a specific continent or country; runners from the USA, European countries such as France and Germany, and Australia were among the most numerous participants. Similar findings were reported for 6-hour races, where most athletes were from Germany, France, Netherlands, Russia, Austria, Italy, USA, Norway, Sweden, Canada, Belgium, Czech Republic, Denmark, Australia, and Japan12. Also, in 12-hour running, USA, France, and Germany were the countries with the most participants and from which most participants originated13. It is important to note that the statement that most ultra-marathon runners come from Europe may not be strictly accurate, as the popularity of ultra-marathons has been growing worldwide, and athletes from various regions participate in these competitions16.

It is important to note that participation in ultra-marathons has been expanding globally over time, with athletes from various parts of the world competing in such events17. Various factors can influence the presence of ultra-marathon runners in different regions, and the sport’s popularity continues to grow internationally18. Some European countries have a strong sports tradition and a rich history in long-distance running. This may have influenced the local sports culture and stimulated interest in more extensive athletic challenges19. European countries often have good sports infrastructure, including running trails, national parks, and high-quality training facilities (https://www.coe.int/en/web/sport/mapping-of-sports-facilities). Additionally, European countries have the financial resources to cover travel expenses, participation fees and other related costs (https://single-market-economy.ec.europa.eu/sectors/tourism/eu-funding-and-businesses/business-portal/financing-your-business/tourism-related-taxes-across-eu_en). The diversity of climates and geographies in Europe offers a variety of terrains for training, from mountains to plains20. This can prepare runners for a wide range of conditions encountered in ultra-marathons21.

Institutional support and sponsorship might also play a role. In some European countries, athletes may receive institutional support and sponsorship more easily, enabling them to dedicate themselves more fully to ultra-marathon training22. In some European countries, a culture of well-being and emphasis on physical activity may encourage more people to engage in sports, including long-distance running23. Europe has established events and competitions in ultra-marathon running24. Hosting well-established ultra-marathon events in Europe can attract local and international runners, contributing to the development of the ultra-marathon scene in the region16.

As expected, runners competing in time-limited races originate from the USA, Europe, and Australia. Thus, the origin of participants in 48-hour races was similar to that of participants in time-limited races of shorter duration. These similarities may be attributed to the similar physiological demands of these races.

The first 48-hour performances in the USA were recorded in the 19th century as 48-hour splits in the ‘Int. Astley Belt Long Distance Championship’ in 1879 and the ‘New York 6-day race’ held in 1882 and 1888. In the 1970s, 48-hour races restarted, again with split times, in the ‘David Copperfield 48-Hour Track Ultramarathon’ in 1979, held as a track race. In the 1980s, mainly indoor races (Haverford Indoor 48 h), track races (The Last Annual Rowdy Ultimate 48 h Track Race) and splits (The New Astley Belt Six Day Race, NYRRC Six-Day Race) were held. By the end of the 1980s, the 48-hour split of ‘Across the Years’ was recorded. In the 1990s, 48-hour splits were recorded from the ‘Sri Chinmoy Ultra Trio 1300 Mile Race’, ‘Sri Chinmoy Ultra Trio 1000 Mile Race’ and ‘Sri Chinmoy Ultra Trio 700 Mile Race’. 48-hour splits were also recorded in the ‘Sri Chinmoy 6 Day Race’, ‘Sri Chinmoy 10 Day Race’ and the ‘Self-Transcendence 3100 Mile Race’. Single 48-hour races were also held, such as the ‘Ultracentric 48-Hour Run’, ‘Silverton 48-hour Challenge’ and ‘Merrill’s Mile 48-Hour Run’. After 2010, the number of 48-hour events in the USA increased considerably to about two dozen a year.

In France, 48-hour races started in the early 1980s with ‘48 Heures de Toulouse’ and ‘48 Heures Nice’, held as track races. The ‘48 heures pédestres de Surgères’ also started in the 1980s. 48-hour splits started with ‘Les Six Jours de La Rochelle’, held as a track race. Between 1990 and 2010, ‘48 heures pédestres de Surgères’ was held. From 2007 until 2013, ‘48 heures d’Antibes’ was held. Since 2008, ‘Les 48 heures de Royan’ has been held. Since 2006, a 48-hour split has been recorded in ‘Les 6 jours d’Antibes’.

The first 48-hour run in Germany was held in 1994 with the ‘48-Stunden-Self-Transcendence-Lauf Köln’. From 2004 to 2007, a 48-hour split was recorded in the ‘6-Tage-Lauf Erkrath’, held on a 400 m track. In 2010 and 2011, the ‘Internationaler 48 h Bahnlauf in Köln’ was held as a track race.

In Australia, the first race was the ‘Grenfell 48 h’ in 1880. The next 48-hour performances were recorded 100 years later as 48-hour splits in the ‘Colac Ultra Marathon’ and ‘Six-Days-Colac’, held as track races. In the 1990s, more 48-hour splits were recorded in the ‘Campbelltown Australian 6 Day Race’ as a track race and the ‘Caboolture Multi-Day Event’ as a road race. The ‘Australian 48-Hour Track Championships’ in 1993 and the ‘Australian & Queensland 48-Hour Track’ in 1994 were held as track races. Since 2000, a few 48-hour races have been held, with the ‘Gold Coast 48 Hours Track Race’ and ‘Caboolture Historical Village 48 h’ as track races. Around 2020, the number of races increased, with 48-hour splits in the ‘Adelaide 6 day’ and the ‘Sri Chinmoy 48 Hour Track Festival Canberra’ in addition to single 48-hour runs.

Japanese are the fastest 48-hour runners

We found that athletes from Japan, Israel, and Iceland achieved the absolute fastest average running speeds, and could, therefore, not entirely confirm our hypothesis that the fastest 48-hour runners would originate from Europe. The athlete’s origin seems important in a specific race format. Reasons behind the success of specific nations in the field of athletics could be the availability of sports infrastructure25, the potential for altitude training26, and the role of genetic predisposition27.

Regarding Japanese 48-hour runners, the best men achieved more than 400 km and were mainly running in European races such as Surgères (France) and Brno (Czech Republic) but also in Taipei City (Taiwan). Similarly, the fastest Japanese woman achieved nearly 400 km and competed in European races such as Surgères (France), Brno (Czech Republic), and Athens (Greece), as well as in Taipei City (Taiwan). Japanese ultra-marathoners are not only the fastest in 48-hour running. They are the most numerous among the best 100-km ultra-marathoners28 and also the fastest 100-km ultra-marathoners29. Potential explanations for the Japanese dominance in ultra-marathon running could be their healthy lifestyle, their health status with low triglycerides and elevated high-density lipoprotein cholesterol, and their vigorous training with average monthly running distances of 257.2 ± 128.9 km for male and 209.0 ± 86.2 km for female runners30. Regarding Israel, the best men achieved slightly more than 300 km or less and, apart from races in the USA (New York) and South Africa (Pietermaritzburg and Johannesburg), competed preferably in Athens (Greece). The best 48-hour runners from Iceland achieved between 330 and 350 km and competed preferably in Ronne (Denmark). Therefore, it was noticed that the fastest runners in the 48-hour race format competed in races far away from their country of origin and had to undertake very demanding travel, in terms of fatigue and cost, to participate. It was assumed that the fastest runners were more ‘professional’, in contrast to the relatively more numerous local runners, who could be more ‘recreational’ athletes.

Where are the fastest races offered?

We found that the fastest races were held in Japan, France, Great Britain, the Netherlands, and Egypt. In Japan, we found three 48-hour races: ‘Hirosaki 48 h Ultra Run’, ‘Tokyo Int. 48 h Run in Tokyo Gakugei University’ and ‘Tokyo Gakugei University 48-hour experimental run’. In France, more races have been offered since 1982, with ‘48 Heures de Toulouse’ and ‘48 Heures Nice’ held as track races. The best performances were achieved in ‘48 Heures pédestres de Surgères’ and ‘48 Heures Pedestre a Montauban’, held as track races, and ‘Les Six Jours de La Rochelle’, held as an indoor track race. Yiannis Kouros achieved 452 km in Montauban in 1985. In Great Britain, the ‘Gloucester 48 h Track Race’, the ‘48 h Track Race Nottingham’, and the ‘Blackpool 48 hours track race’ were held as track races. In the Netherlands, the ‘72 h Lauf in Deventer’, a road-based ultra-marathon, recorded a 48-hour split. For Egypt, the ‘Cairo 48hour World Run Challenge’ was held.

Sex difference in performance

Apart from the origin of the runners and the fastest race locations, we also investigated the sex difference in performance and found that men were ~ 0.4 km/h faster than women. In ultra-running, it is well-known that men run faster than women31. Recent studies showed that the gap decreases with increasing race distance/duration31 and increasing age of the athletes11. Despite some disadvantages (e.g., lower O2-carrying capacity), Tiller et al. reported that greater fatigue resistance, greater substrate efficiency, and lower energetic demands would be advantageous for women in ultra-marathon running32. Although studies reported that women narrowed the gap to men in ultra-marathon running33,34,35, women seemed only to narrow the gap to men in older age groups36,37,38.

Master runners are the fastest

A last important finding was that most finishers were in age group 45–49 years, and runners in age group 45–49 years achieved the fastest running speeds. Often, the best ultra-marathoners in very long distances or durations are master runners39. With the age group 45–49 years, we can confirm the results of an older study investigating the age of peak ultra-marathon performance for runners competing in time-limited ultra-marathons held from 6 to 240 h (i.e., 10 days) during 1975–2013, reporting a peak of ~ 46.8 years for 48-hour runners7. This was an interesting finding considering that the peak of physiological characteristics related to long-distance running performance occurs at a much earlier age40. Thus, this finding highlighted that factors other than physiological parameters, such as experience and motivation, might explain the high rates of participation and performance in finishers in age group 45–49 years41. One of the factors to be explored is the impact of age/longevity on sports performance42. Understanding how the aging process influences an individual’s ability to excel in sports involves delving into both physiological (e.g. muscle mass and strength, cardiovascular function, bone density and injury risk, and hormonal changes) and psychological (e.g. motivation and goal setting, cognitive function and decision-making, and social and environmental factors) aspects43. As individuals age, there is a natural decline in muscle mass and strength. This can affect the power and explosiveness required in certain sports, potentially impacting performance44,45. Age-related changes in cardiovascular function, such as a reduction in maximum heart rate and aerobic capacity, can influence endurance sports44. Aging is associated with a decline in bone density, increasing the risk of fractures and injuries46. Hormonal shifts, particularly in testosterone and estrogen levels, can impact muscle maintenance and recovery performance44. We also need to consider physiological adaptations associated with the aging process in the context of running47. Performance at older ages is associated with greater longevity and with older populations observed in certain countries in Europe and around the world, e.g., Japan30,48.

Physiological declines, such as a reduction in VO₂ max, are compensated for by experience, pacing strategies and psychological resilience in master athletes49. The mental toughness, accumulated life experience and motivation of older athletes are crucial to their success50. Regarding psychological aspects, the motivations of athletes may evolve with age51. Aging athletes may develop enhanced mental toughness and resilience through accumulated life experiences52. Cognitive function, including decision-making and reaction time, can be influenced by age52. The social and environmental context in which an athlete ages can also play a role. Exploring the support systems, training environments, and societal attitudes toward aging athletes can provide insights into the broader context of sports performance53.

The observed patterns in the study’s results may be attributed to a combination of physiological and psychological factors. The combination of genetic, training-related, nutritional, and psychological factors likely contributes to the observed variations in 48-hour ultramarathon performance across different countries and age groups54. Differences in genetic predispositions and anthropometric characteristics among athletes from various regions could contribute to variations in running performance54,55. Genetic factors influencing endurance, muscle composition, and metabolic efficiency may differ across populations55.

Training and environmental adaptations might also play a role. Athletes from Japan, Israel, and Iceland may benefit from specific training methodologies and environmental conditions that enhance their physiological adaptations for ultramarathon running56. Training at high altitudes, exposure to diverse terrains, and climate-specific conditioning could play crucial roles57. Variances in nutritional practices and dietary habits among athletes from different countries may influence energy metabolism and endurance58. Cultural preferences in dietary choices could impact the athletes’ ability to sustain high-intensity efforts over 48 h59. Cultural and individual differences in mental toughness and motivation might contribute to variations in performance60. Athletes from specific countries or regions may possess a cultural emphasis on endurance sports, fostering a mindset conducive to pushing through challenges during prolonged races61. Societal attitudes and cultural norms related to endurance sports may influence the psychological resilience of athletes62. Countries with a strong tradition in ultramarathon running may produce athletes with a mindset better suited to handle the mental demands of a 48-hour race63. The presence of a supportive community and peer influence within a country or region can significantly impact an athlete’s psychological well-being and performance64. The observed performance trends in the 45–49 years age group could be influenced by cultural attitudes toward aging and physical activity65. Societal perceptions of aging may impact the psychological approach of athletes in this age range, potentially leading to outstanding performances66. In conclusion, a comprehensive exploration of the impact of age and longevity on sports performance requires a nuanced examination of both physiological and psychological factors.

Limitations and implications for future research

The results were a summary of the observations across both descriptive charts (i.e., target plots) and model interpretability charts (PDP and prediction plots). Some countries with small sample sizes but with faster runners might have been excluded from the analysis due to the methodology used. Athletes could have changed their country of residence/nationality and started competing for another country, which was not considered in the present study. Factors such as altitude adaptation, sports infrastructure and cultural emphasis on endurance sports may provide a more nuanced explanation but were not considered67. Although we found a global diversity of participation in ultra-running, we were not able to link the findings to broader socio-economic or cultural factors, such as the level of institutional support, the existence of sports traditions, or the economic accessibility of ultra-races in different countries. On the other hand, the strength of the present study was its novel methodological approach, since it was the first time that a machine learning model was used to predict 48-hour running performance from age, gender, country of origin, and event country. Furthermore, our findings provide practical information for professionals (e.g., sports scientists and coaches) working with ultramarathon runners to set optimal performance goals depending on the event country. Future studies might examine in detail the physiological and psychological factors that may contribute to the performance gap between men and women.

Conclusion

In summary, elevation of the course (flat course) and the running surface (track) were the most important variables in the XG Boost model. Country of origin of the athlete and country where the event was held were the most important features in the SHAP analysis. Athletes from Japan, Israel, and Iceland achieved the fastest average running speed. The fastest races were held in Japan, France, Great Britain, the Netherlands, and Egypt. Any athlete intending to achieve a personal best performance in this race format can benefit from these findings by selecting the most appropriate race course.