The regulatory status of health apps that employ gamification

Freyer, Oscar; Wrona, Kamil J.; de Snoeck, Quentin; Hofmann, Moritz; Melvin, Tom; Stratton-Powell, Ashley; Wicks, Paul; Parks, Acacia C.; Gilbert, Stephen

doi:10.1038/s41598-024-71808-2

Article
Open access
Published: 09 September 2024

Scientific Reports volume 14, Article number: 21016 (2024)

Abstract

Smartphone applications are one of the main delivery modalities in digital health. Many of these mHealth apps use gamification to engage users, improve user experience, and achieve better health outcomes. Yet, it remains unclear whether gamified approaches help to deliver effective, safe, and clinically beneficial products to users. This study examines the compliance of 69 gamified mHealth apps with the EU Medical Device Regulation and assesses the specific risks arising from the gamified nature of these apps. Of the identified apps, 32 (46.4%) were considered non-medical devices; seven (10.1%) were already cleared/approved by the regulatory authorities, and 31 (44.9%) apps were assessed as likely non-compliant or potentially non-compliant with regulatory requirements. These applications and one approved application were assessed as on the market without the required regulatory approvals. According to our analysis, a higher proportion of these apps would be classified as medical devices in the US. The level of risk posed by gamification remains ambiguous. While most apps showed only a weak link between the degree of gamification and potential risks, this link was stronger for those apps with a high degree of gamification or an immersive game experience.

Introduction

Smartphone applications (i.e., mHealth apps) are one of the main delivery modalities in digital health¹. As with other end-user-facing apps, user engagement is critical to the efficacy of these healthcare apps^2,3. One strategy to augment engagement and enhance user experience in these apps involves integrating elements that have demonstrated efficacy in digital games^4,5,6,7.

The integration of these game elements can be approached in two different ways. One is through the development of fully-fledged games, commonly referred to as ‘Serious Games’ (SG)⁸. Alternatively, specific game design elements (e.g., ‘points,’ ‘badges,’ or ‘leaderboards’) can be used within non-gaming apps, known as gamification^8,9,10. The definition and distinction of both terms are still much debated. Various definitions exist, some of which are incongruent^7,10. This is further complicated by the fact that gamification can be incorporated into apps to differing degrees, and in extreme cases, it is difficult to distinguish between gamification and SGs¹¹. SGs are often defined as an “Interactive computer application, with or without significant hardware components, that has a challenging goal, is fun to play and engaging, incorporates some scoring mechanism, and supplies the user with skills, knowledge, or attitudes useful in reality […]⁸,” while gamification is often defined as “the use of game design elements in non-game contexts⁹.” This definition is limited^10,11 and was extended by Sailer et al. in 2017, who defined gamification as the “[…] process of making activities in non-game contexts more game-like by using game design elements⁴.” SGs, however, are inspired by entertainment games in appearance and technology and often need specific hardware (e.g., powerful computers, VR headsets, or gaming controllers)^8,12. In contrast, gamified apps utilize game elements like points, badges, high scores, leaderboards, and storytelling⁴ and generally have cheaper development and fewer hardware constraints than serious games. The differences in and application of these approaches are described in Fig. 1.

The number of gamified apps on the market is growing^15,16,17, with many examples among fitness, business, and education apps^18,19. Healthcare-specific gamified approaches are also growing^20,21; they are increasingly linked to wearable devices (e.g., smartwatches) and are applied in chronic disease management, therapy, nutrition, and mental health^21,22,23. A further group comprises apps that serve as platforms for healthcare organizations such as insurance companies²⁰.

The evidence on the effectiveness, safety, and benefits of gamification in apps is mixed. One systematic review found suggestive evidence of beneficial effects on health-related behaviors, particularly in promoting physical activity, nutritional awareness, healthy eating, and medication adherence, with positive effects in 59% of included interventions and neutral or mixed effects in 41% of the included studies²². A meta-analysis found a statistically significant but not clinically relevant improvement in knowledge (mean difference 0.88; 95% confidence interval 0.05 to 1.75; p < 0.05) and no effect on BMI²⁴. Behavioral outcomes seem to be positively influenced, with some potential in mental health²². Yet, another systematic review indicated that gamification elements within mental health apps did not significantly boost the effectiveness of interventions or user adherence²⁵. Gamification has shown a favorable influence on food selection and related nutritional behaviors in children and adolescents in the post-intervention phase (an increase in fruit and vegetable servings of about 0.67)²⁴. In diabetic populations, gamified interventions have been associated with improved medication adherence and HbA1c control (mean difference −0.21; 95% confidence interval −0.37 to −0.05; p = 0.01)²⁶ and reduced high-fat food consumption in diabetes self-management²⁷. However, the lasting effect of gamification is not yet fully understood, as shown by the unknown long-term effects on blood glucose control beyond a 12-month period²⁶. In addition, insufficient technical implementation of gamification elements or the use of these elements without understanding the underlying psychological mechanisms limits user engagement²⁸. A recurring limitation in serious games and gamification research is the lack of robust study designs and randomized controlled trials, making it difficult to draw concrete conclusions^{22,24,27,29,30}.

When developing gamified applications for clinical practice rather than research purposes, new challenges might arise, such as ensuring clinical safety and managing risks associated with their use. Once gamified apps have a medical purpose, e.g., the treatment of specific diseases, they must comply with existing stringent regulatory requirements like all devices with a medical purpose. This includes obtaining clearance or approval as medical devices (MDs) and adhering to rigorous quality management frameworks encompassing development, release, audit, and clinical evidence of safety and efficacy. This procedure is intended not only to assess the potential benefits of an application but primarily to mitigate the risks that use of the product poses to end users, regardless of the actual clinical improvements. Globally, MDs, including gamified apps, fall under various regulatory jurisdictions, such as the EU’s Medical Device Regulation 2017/745 (MDR)³¹ and the US Federal Food, Drug, and Cosmetic Act (FFDCA)³². In the EU, a device (e.g., software, including SGs) qualifies as an MD if, among other things, it is intended by the manufacturer to be used for “(…) diagnosis, prevention, monitoring, prediction, prognosis, treatment or alleviation of disease (…)”, injury or disability, or for the control or support of conception³¹. In addition to the binary qualification as a medical device or not, many jurisdictions classify devices into certain classes (e.g., Class I to Class III) depending on the risk they pose to the user^31,32. This risk classification is based on defined rules for different kinds of devices, e.g., software, surgical instruments, or invasive devices^31,33. Supplemental Material 1 provides a detailed description of the steps for qualification as a medical device and for classification into specific risk classes with examples.
If a regulator identifies an MD that has been placed on the market without meeting these requirements, it would issue warning letters demanding removal of the product, followed by enforcement action, which could include heavy fines or even custodial sentences. The latter are rare, and regulators have limited capacity for market surveillance. Therefore, unapproved apps that are illegally on the market persist in the app stores³⁴. An investigation of drug-dosage calculators, which are also considered software as a medical device (SaMD), has shown that the regulatory compliance of these freely available apps is poor and that they pose a risk to patients³⁵. Although there is no definitive evidence of widespread actual harm to patients caused by SaMD malfunctions^36,37, the task of regulation is to prevent such harm as far as possible. Most app-based MDs are distributed via the Apple and Google app stores, which are considered distributors and often importers under MDR³¹; the stores are therefore responsible for ensuring conformity with regulations³⁸.

Interpreting the regulatory clearance/approval process can already be challenging for non-software MD manufacturers or traditional software developers^39,40. Interpretation is yet more challenging for cutting-edge technologies, including artificial intelligence (AI), generative artificial intelligence (e.g., large language models)^41,42,43,44, or gamification. Here, it can be difficult to determine both whether an app’s intended purpose qualifies it as a medical device and how the app should be classified (grouped into regulatory ‘risk classes’ of medical devices). Also, since these apps could pose previously unknown risks, assessing their risks, including psychological effects, situation-specific interfaces, individualized user experiences, or device performance, can be challenging⁴⁵.

Despite the increasing popularity of gamified health apps, it remains unclear whether, as a category, they are effective or meet the regulatory requirements placed on them to ensure public safety. Even though the regulatory path of gamified applications is the same as that of non-gamified applications, this relatively new technology poses particular regulatory challenges within existing legislation⁴⁵. Existing regulations provide guidance on how to assess and mitigate the risks of certain aspects of SaMD, including audio-visual technologies⁴⁶, the use of AI⁴⁷, or cybersecurity⁴⁸. However, gamification differs from these aspects covered by existing guidance due to the unique characteristics described above. Therefore, this study assesses the regulatory compliance of the EU’s most popular gamified health apps, available via the Google PlayStore and Apple AppStore. It examines the evidence supporting their medical claims (including diagnostic and therapeutic claims) and compliance with the EU’s MDR. The study includes a qualitative comparison of the regulatory compliance of the identified apps in the US. We hypothesize that (H1) many gamified apps are on the market without the required MD approval and that (H2) gamified apps on the market without the required MD approval have hazardous or uncontrolled aspects to their gamification that could lead to patient harm.

Results

General results

The search was conducted on 27 July 2023, and the completed database contained 1674 apps (flowchart in Fig. 2). Of these, 807 apps were removed as duplicates, and four could not be accessed because their pages were not retrievable in the app stores. The remaining 863 apps were then examined by the mHealth assessment panel, and 328 met the criteria for mHealth apps. The gamification assessment panel examined those 328 apps and considered 77 to be gamified. Eight apps were excluded since they functioned solely as interfaces for sensors or hardware medical devices. The remaining 69 apps were examined by the regulatory, clinical evidence, and risk expert panel. Seven of these 69 apps were already approved as medical devices in the EU, two according to the Medical Device Directive (MDD)⁴⁹ and five according to MDR³¹. All five MDR-cleared apps and one of the two MDD-approved apps were considered to have the correct risk class and CE-marking under the regulations as they currently apply (allowing for the transition arrangements for the older MDD regulations). The other MDD app was assessed as likely to be up-classified in its risk class under MDR (from MDD Class I to MDR Class IIa). All seven MD-approved apps were assessed as qualifying as MDs by the panel. Of the 62 unapproved apps, 32 were considered ‘non-MDs,’ 10 were considered potential MDs, and 20 were considered MDs. The strength of agreement between the reviewers regarding the qualification as MD was fair (Fleiss' kappa = 0.40).
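The screening counts in this paragraph can be reconciled with a few lines of arithmetic. The sketch below is only a consistency check on the figures quoted in the text; the variable names are illustrative and not taken from the study.

```python
# Counts as reported in the text; names are illustrative only.
identified = 1674
duplicates = 807
not_retrievable = 4

screened = identified - duplicates - not_retrievable  # examined by the mHealth panel
mhealth = 328                 # met the mHealth criteria
gamified = 77                 # judged gamified
hardware_interfaces = 8       # excluded: pure sensor/hardware front ends
assessed = gamified - hardware_interfaces  # reached the regulatory panel

approved = 7
unapproved = assessed - approved
unapproved_split = {"non-MD": 32, "potential MD": 10, "MD": 20}

assert screened == 863
assert assessed == 69
assert unapproved == 62 == sum(unapproved_split.values())
print(screened, assessed, unapproved)
```

Each stage of the flow matches the reported totals, including the split of the 62 unapproved apps.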

Apps that qualify as MDs have to follow a defined regulatory process. Considering all 69 assessed apps, 32 (46.4%) do not need to follow the regulatory process since they do not qualify as MDs. Six apps (8.7%) were rated by the panel as compliant with the required regulatory process. Ten apps (14.5%) were considered as potentially on the market without the required regulatory approvals, and 21 (30.4%) as on the market without the required regulatory approvals. The strength of agreement between the reviewers on regulatory compliance was fair (Fleiss' kappa = 0.37).
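For readers unfamiliar with the agreement statistic reported here, the following is a minimal sketch of how Fleiss' kappa is computed from a table of votes. The vote table is invented for illustration (three hypothetical reviewers, five hypothetical apps) and does not reproduce the study's data; the panel size is an assumption.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to category j.
    Every subject must have been rated by the same number of raters (>= 2).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")

    n_categories = len(counts[0])
    total = n_subjects * n_raters
    # Proportion of all ratings that fell into each category.
    p_category = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Per-subject agreement: share of rater pairs that chose the same category.
    p_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(p_subject) / n_subjects
    expected = sum(p * p for p in p_category)
    # Undefined (division by zero) if every rating falls into a single category.
    return (observed - expected) / (1 - expected)


# Hypothetical votes from three reviewers on five apps (not study data).
# Columns: non-MD, potential MD, MD.
toy_votes = [
    [3, 0, 0],
    [0, 0, 3],
    [2, 1, 0],
    [1, 1, 1],
    [0, 1, 2],
]
print(round(fleiss_kappa(toy_votes), 3))
```

On the commonly used Landis and Koch scale, values between 0.21 and 0.40 are labelled fair, which is the band the reported values of 0.37 and 0.40 fall into.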

All seven approved apps and two unapproved apps (nine in total, 13.0%) presented evidence on their websites, in the form of peer-reviewed publications, supporting the effectiveness of their medical claims.

Categorical analysis

The apps are distributed across 24 medical indication/intended purpose categories. The largest category of apps was cardiovascular health management apps (n = 11), followed by apps assisting with addictions (n = 8), health insurance management apps (n = 6), diabetes management apps (n = 5), and pregnancy management apps (n = 5). The remaining apps were spread over categories containing only one to three apps each, e.g., clinical guidance apps for healthcare professionals (HCPs) or symptom checkers. A detailed list of all categories can be found in Fig. 3.

The apps with the highest risk class as assessed by the regulatory panel can be found in the categories of clinical guidance for HCP apps (Class IIa–III), diabetes management apps (Class IIb and Class IIb–III), and cardiovascular health management apps (Class IIa–III). All apps assisting with addictions, apps that foster physical activity, medication trackers, and parenting apps were considered ‘non-MD.’ The degree of compliance with applicable regulations varied greatly between app categories: in some categories, most or all apps were compliant with regulations (e.g., symptom checker, mental health apps), while in other categories, all or nearly all apps were considered not to have the correct risk class and CE-marking under the regulations as they currently apply (cardiovascular health management, neurological rehabilitation, diabetes management, urological prevention/therapy, musculoskeletal therapy). Among the approved apps, two were intended for obesity management and two for diabetes management.

Qualification as a medical device and risk classification

Thirty-two apps (46.4%) were considered not to qualify as MDs under EU law, while 10 (14.5%) were considered potential MDs, and 27 (39.1%) were considered MDs. The reliability of agreement between the reviewers’ ‘most-likely’ votes for risk classification was fair (Fleiss' kappa = 0.33). Where the panel’s votes were unanimous, ten apps would fall in Class I (14.5%), seven in Class IIa (10.1%), and one in Class IIb (1.4%). For non-unanimous votes, the range was quite broad. Four apps (5.8%) ranged between ‘non-MD’ and Class IIa, another three (4.3%) between Class IIa and Class III, four apps (5.8%) ranged between ‘non-MD’ and Class IIb and eight apps (11.6%) were considered ‘non-MD’ or Class IIa. A detailed presentation of the panel’s votes, the range of the panel’s ‘most-likely’ votes on the qualification/classification, and other qualification/classifications considered as possible but less likely by the panel are shown in Fig. 4.

App store overview

Twenty-one apps (30.4%) were available only on Android, 27 (39.1%) only on iPhone, and 21 (30.4%) on both. The share of apps that might be on the market without the required approval was highest among Android-only apps, at 14 out of 21 apps (66.7%), compared to 13 out of 27 apps (48.1%) among iPhone-only apps.

Most apps fell in the category of ‘Medical’ (n = 42, 60.9%). Of these apps, 27 (64.3%) could be considered MDs or potential MDs, but according to the expert panel, only six were on the market with the correct risk class and CE-marking under the regulations as they currently apply; the other 21 were not. All seven already cleared/approved apps were in this category. Among the ‘Health & Fitness’ apps (n = 27, 39.1%), 10 (37.0%) were considered MDs or potential MDs by the regulatory panel. None of them had the required clearance/approval. Forty-four apps (63.8%) were free of charge, and 25 (36.2%) were in the ‘Paid’ section. Further details can be found in Table 1.

Table 1 Regulatory compliance of the mHealth apps in both major app stores, Apple AppStore and Google PlayStore.

Gamification analysis

The average degree of gamification, defined as the subjective perception of the effectiveness of gamification (‘gamefeel’) rather than a quantitative measure of the number of gamification elements, was assessed as medium (2.3 out of 5), with the highest degree of gamification assessed as 4. Five apps used the element ‘points,’ four the element ‘badges,’ six implemented ‘performance graphs,’ one included ‘avatars,’ and one included ‘storytelling elements.’ No app used ‘leaderboards’ or ‘multiplayer features’ to foster competition. Four apps were intended for cardiovascular health management, one for mental health management, one for diabetes management, one for pregnancy management, and one for neurological rehabilitation. Details can be found in Table 2.

Table 2 Characteristics of apps assessed in the gamification analysis.

Analysis of app risks related to gamification

The most frequently identified risk across the subset of eight apps in the in-depth gamification analysis was the possibility of users being misled by incorrect medical information. This was predominantly attributed either to software bugs, which could result from poor programming or inadequate testing, or to conceptual problems in the design process, e.g., of interfaces. This risk was frequently classified as moderate in severity. The link between this risk and the gamification elements was generally assessed as weak, particularly in apps with a lower degree of ‘gamefeel’ (1 and 2 out of 5). Apps with a higher ‘gamefeel’ were assessed as having a stronger linkage between gamification and risk. These risks include the danger of users developing an over-reliance on the gamified tool caused by more compelling engagement mechanisms. Because the assessment process was based on ISO 14971, similar hazards, foreseeable sequences of events, hazardous situations, and harms in different apps could lead to different severities. The same applies to the association between gamification and risk, as this relationship depends on whether gamification elements alone have an impact on risk or whether other aspects of the app have an effect as well. Details can be found in Table 3.

Table 3 Specific risks of gamification elements in mHealth apps.

US comparison

Of the 69 apps assessed by the regulatory panel, 63 were available on the US market and were therefore also evaluated by a US regulatory expert. These included all seven apps with EU regulatory approval. According to the expert assessment, 11 of the 63 apps (17.5%) were considered as not qualifying as MD under US regulations, 41 apps (65.1%) would fall in Class I, 10 in Class II (15.9%), and zero in Class III. One app could not be assessed due to uncertainties of the intended use, an inadequate description, and insufficient material.

Compared with the EU regulatory assessment, more apps would qualify as MDs in the US (51 of 63, 81.0%) than in the EU (37 of 69, 53.6%). The proportion of apps assessed as being on the market without the required approval is smaller in the US (7 of 63, 11.1%) than in the EU (31 of 69, 44.9%). The expert attributed this to the fact that although the definition for Class I includes many functionalities/purposes in its scope, many of these products’ regulatory pathways would be through the self-declared enforcement discretion route⁵¹ or the self-declared 510(k) exempt route⁵² with a low accompanying regulatory effort for developers⁵³.
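The proportions in this comparison follow directly from the counts given in the US and EU results. The sketch below recomputes them as a check; note that the EU figure of 37 combines MDs (27) and potential MDs (10), and that the helper name `share` is illustrative.

```python
def share(part: int, whole: int) -> str:
    """Format part/whole as a percentage with one decimal place."""
    return f"{part / whole:.1%}"


# US assessment of the 63 apps available on the US market.
us = {"non-MD": 11, "Class I": 41, "Class II": 10, "Class III": 0, "not assessable": 1}
us_total = sum(us.values())
us_md = us["Class I"] + us["Class II"] + us["Class III"]

# EU assessment of all 69 apps: MDs plus potential MDs.
eu_total, eu_md = 69, 27 + 10

rows = {
    "qualify as MD": (share(us_md, us_total), share(eu_md, eu_total)),
    "lacking required approval": (share(7, us_total), share(31, eu_total)),
}
for label, (us_pct, eu_pct) in rows.items():
    print(f"{label}: US {us_pct}, EU {eu_pct}")
```

The two denominators differ (63 versus 69 apps), so the percentages compare shares of each market's assessed set rather than identical samples.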

Seven apps (11.1%) were considered on the market without the required regulatory clearance/approval, all of which would fall in Class II in the US. One of these apps has already been approved according to MDD in the EU. Its available evidence would, however, not be sufficient for the FDA approval process. In contrast to the EU, the three apps for neurological rehabilitation fall into Class II in the US (these were judged to be in Class I in the EU). The risk of a cardiovascular health management app that uses LLMs was assessed as lower in the US (Class II, EU: Class IIb or III). Details can be found in Fig. 5.

Discussion

Principal results

Our study has several findings that have implications for regulatory oversight and market compliance of publicly available gamified mHealth apps in the EU. Out of 863 apps analyzed, 69 were gamified mHealth apps. The panel considered 37 apps (53.6%) as MDs or potential MDs, necessitating appropriate clearance. Only seven apps (10.1% of all 69) were already considered to have the CE-marking under the regulations as they currently apply, six of them in the appropriate risk class. One app was judged to face an up-classification with the transition from MDD to MDR on 31 December 2028⁵⁴. In total, 31 apps (44.9%) were assessed as not or potentially not compliant with the regulatory requirements, which confirms our first hypothesis (H1).

Additionally, only nine (13.0%) of the apps provide evidence for their effectiveness, among them all seven cleared/approved apps^{4,22,24,27,29,30,45,55,56}. There is therefore a large evidence gap for the majority of gamified apps and serious games. The lack of clinical evidence (or its non-publication by the manufacturers) and the lack of appropriate regulatory approval are concerning, as lay people with no medical training could download these consumer-facing apps.

Only 14.5% (n = 10) of the assessed apps would fall into a low-risk category (Class I under MDR), whereas 39.1% (n = 27) could be classified as Class IIa or higher. MDs that fall into Class I, based on the MDR rules, have an easier pathway to regulatory approval since they can self-declare their conformity with the MDR without the involvement of a notified body^31,57. In comparison, higher-risk apps must be developed within a sophisticated and certified quality management system, bringing many quality requirements and increased regulatory burden to market access and involving a notified body for the conformity assessment^31,53. This process is presented in detail in Supplemental Material 1. It is highly unlikely that any of the 23 (33.3%) apps that are on the market without a CE-mark but assessed as Class IIa or higher will have undergone any standardized process of design, testing, risk assessment, and post-market surveillance, which are judged necessary by legislatures for apps in these categories.

The reliability of agreement among reviewers on the binary decision (MD or not MD) was fair (Fleiss’ kappa = 0.40), while the reliability of agreement for the risk classification itself (non-MD to Class III) was noticeably lower but still fair (Fleiss’ kappa = 0.33). This could have several causes: firstly, a large variation in the reviewers’ perspectives, indicating how challenging it is, even for experts, to classify gamified apps; secondly, a regulatory framework that is ambiguous; and thirdly, a regulatory framework challenged by the continuous emergence of new software approaches and app types that were not anticipated when the regulations were written. We suspect that the reason lies in a combination of all three explanations, considering existing evidence that the practical classification of MDs is ambiguous and discussed by courts⁵⁸ and among experts⁵⁹. Additionally, the ruleset for SaMD remains unclear and relatively broad, as stated in MDCG 2019-11⁶⁰. Some experts have even challenged whether any mHealth apps can be classified as MDR Class I devices, and some competent authorities have taken this view, refusing to accept the registration of any Class I apps⁶¹, while other experts and competent authorities have accepted the registration of Class I apps. This ambiguity is highlighted by the classification assessments of individual apps in this study, often ranging between Class I and IIa (Fig. 4).

Specific risks linked to gamification

Our assessment shows that gamification was used in both approved MDs and in apps with a clear medical purpose that are available through the app stores despite lacking the required clearances/approvals. The second research question centered on whether the observed lack of compliance means risks for patients that might be related to gamification. The presence or absence of gamified elements does not solely determine an app’s qualification as a medical device since, here, the app’s intended purpose, which is defined by the developer, is crucial. However, gamification elements are of great importance for the risk assessment process, an integral part of the MD approval process, particularly when they are integral to the app’s medical purpose. Thus, they may affect the MD’s risk classification. While it is likely that apps that are approved/released as MD have considered the risks associated with gamification in their design, risk assessment, usability engineering, and mitigation measures as required by the regulations, it is highly unlikely that any of these formalized approaches to safe design have been followed for apps that are on the market without the required clearance/approval.

The in-depth analysis of a subset of eight apps was focused on specific risks associated with gamification elements and harms that could emerge from the failure of such elements. In a standardized approach defined in ISO 14971⁵⁰, we have described risks related to hypothetical scenarios as well as risks that occurred while testing the app. This approach would be a requirement for the approval of MDs.

Our results show that for most of the apps examined, the scenarios of failure of gamification elements posed little risk to patients. This is due to the following: (i) the main risk to users was being misled by incorrect medical information, resulting in mild to moderate harm; and (ii) most of the apps studied had a low level of gamification, and we assessed that there was a low association between gamification and risk in the app. Some of these risks could be caused by software bugs, which could be addressed through updates, while others may stem from foundational design flaws requiring more substantive revisions. For apps with a higher degree of gamification, we found a stronger link between risks and gamification elements.

Of the eight apps included in the in-depth analysis, three had a strong connection between gamification and risk and possessed potentially hazardous or uncontrolled aspects to their gamification that could lead to patient harm. The first app (23_CARD) provided an interactive chat tool powered by ChatGPT-4 (a large language model-based chatbot⁶²), branded as an “AI doctor.” This tool claims to offer health advice based on deep learning algorithms, presenting a user-friendly interface adorned with comic-style 3D elements and avatars. Its gamification strategy hinges on a credit-based reward system designed to encourage daily logins and advertisement viewing, using gift timers and push notifications to foster habit-forming user behaviors. The number of interactions with the chatbots is limited by an in-app currency, which could lead to incomplete health assistance. The user would then rely on partial information, potentially resulting in delayed or incorrect medical treatments. Moreover, the app risks disseminating incorrect medical information, which occurred during the testing, misleading users about the severity or treatment of their conditions by providing contradictory advice under the guise of a specialized “AI doctor.” This misinformation could inadvertently direct users to make harmful medical decisions based on incomplete or inaccurate data. Additionally, since some of the app’s features are non-gamified while others are, the app poses the risk that patients neglect these non-gamified features (e.g., blood pressure management), leading to incorrect or delayed treatment.

The second app (43_DIAB), intended for use by people with diabetes, incentivized users to enter their blood glucose measurement data by awarding them ‘points.’ It used this data to calculate insulin doses for self-administration. The app uses only five out of 81 items in the questionnaire for insulin dosage calculation and also rewards entering ‘0’ as a value (e.g., if no insulin was taken). If those constraints are implemented poorly (e.g., no option for adding ‘0’ as a value and still getting rewarded), or if no consistency check of the input is performed, the user could ‘cheat’ to gain more points, e.g., by entering impossible hours of training, implausible carbohydrate intake, or wrong insulin injection dosages. Those wrong values could lead to false analyses and health recommendations by the app and hence to worse clinical outcomes or, in a worst-case scenario, to high-risk situations for the patient, e.g., hypoglycemia if too much insulin is injected. An additional risk identified is that some features of the app are less gamified than others. This could lead to neglect of the less gamified features, leading to incorrect or delayed treatment. We acknowledge that we have not had the opportunity to review the material provided for the regulatory process for this specific app. These reports would include both usability and clinical testing and may demonstrate that the approach taken in the app is indeed safe.

The third app (67_NEUR) was rated as having a high ‘gamefeel’, although it only has one gamification element as defined by Sailer et al. (2017). Even if it cannot be described as a fully immersive, full-fledged game due to its limited visual presentation and the absence of relevant game aspects, it still represents an edge case. The gamified character in this app was much more strongly linked to the intended purpose than in the other applications analyzed, leading to a limited immersive experience for the users. As a result, the connection between the gamification elements and potential hazards was higher than in apps that included gamification elements but no immersive experience. One gamification-related risk identified in this and several other apps was that of over-reliance on the apps due to their engaging design and the stimulation effect of gamification. In scenarios of app malfunction, gamification of this nature could exacerbate delays to the start of effective treatment, as the gamification could promote ongoing use. The stronger the gamification was linked to the intended purpose of the app, the stronger the connection between risk and gamification was among the apps analyzed. Whilst no fully immersive SGs could be included in our study, 67_NEUR could point to interesting future research. Since the intended medical purpose of immersive SGs is particularly strongly related to gameplay, the connection between the degree of gamification and risk observed in our study could be investigated better.

Considering the complex and varied link between gamification and risks, Hypothesis 2 could neither be definitively accepted nor rejected. However, a small number of dangerous gamified apps were identified on the market that are not approved/cleared and where there is a clear need for action by the regulatory authorities and the app store in question (Google PlayStore).

Implications for the app stores

Since Google and Apple have an effective duopoly on the mobile market⁶³, they play a pivotal role in distributing consumer-facing health apps. As importers or distributors under MDR³¹, the app stores are legally obliged to ensure regulatory compliance of the apps they offer³⁸. Our results on gamified apps support previous findings on other types of mHealth apps that neither Apple nor Google adequately meets those requirements^34,35. This study shows that the availability of unapproved apps, despite the legal requirement for MD approval, is greater on the Google PlayStore (42.9%) than on Apple’s AppStore (35.4%). This observation is even more pronounced for gamified apps exclusive to the Google PlayStore, where 66.7% are unapproved despite being assessed in this study as requiring MD approval.

Additionally, the naming convention of the categories in both app stores remains unclear. Most apps considered in this study as MDs are in the ‘Medical’ category (n = 27), but some are also found in the ‘Health&Fitness’ category (n = 10), and none of the apps in this latter category are adequately cleared/approved. While it was expected that many MDs would be found in the ‘Medical’ category, the high number of MDs in the ‘Health&Fitness’ category was surprising since fitness apps are generally not considered MDs^31,33,60,64. A clear categorization could make distinguishing between MD and non-MD apps easier for users and HCPs.

Furthermore, apps communicate their MD status and associated risks in various ways. Some explicitly mention their regulatory classification and potential risks in their descriptions or supplementary documentation, such as websites or publications, and some do not at all. This inconsistency makes it difficult for users and HCPs to assess an app’s regulatory status and associated risk.

A balanced approach is necessary for app stores to address those challenges. Although regulatory bodies have the responsibility of approving new MDs, their limited resources make it difficult to provide complete oversight of all newly published applications. Here, the role of the app stores as market entry points becomes critical. Since they already have expertise in the automated screening and analysis of applications, they could verify regulatory approval or certification for apps with a medical purpose and implement basic checks to ensure content accuracy³⁴. The aim is to make it impossible to download clearly unsafe and illegal products from the app stores.

Assessment of the US approval status

Comparing the EU market with the US, we found a higher proportion of MD apps in the US. Nonetheless, due to the FDA’s enforcement discretion rules, fewer apps were marketed without the required approval in the US compared to the EU. This goes hand-in-hand with our finding that most assessed apps would fall into Class I and, therefore, be under enforcement discretion. The observation of a more flexible US regulatory classification approach for low-risk apps is consistent with the findings of other studies⁵³. We observed that the FDA product classification database⁶⁵ provided greater transparency around regulatory decisions and evidence than the EU counterpart database, which is only partially operational and will provide less transparency on completion⁶⁶.

This has two main implications for the EU. Firstly, the US system enables low-risk Class I digital apps to reach the market more rapidly, potentially spurring faster innovation⁵³. Secondly, the easy access to information on regulatory decisions and evidence from the FDA databases provides openness on app classification decisions and associated evidence, which is lacking in the EU and is not planned for in the EU regulatory framework³¹. The greater US openness allows all stakeholders, including clinicians and the public, to examine the basis for regulatory approvals.

Implications for the regulation of gamified MDs

Our analysis shows that many gamified mHealth apps would be considered MDs under current regulations. As such, gamified MDs, like non-gamified MDs, must comply with the applicable laws regulating them. While the general compliance of mHealth apps (gamified and non-gamified) was questioned by other researchers³⁵, the compliance issues might have different causes, including unawareness or insufficient guidance documents, although existing regulations provide guidance on certain aspects of SaMD, including audio-visual design⁴⁶, AI⁴⁴ and cybersecurity^48,67. For gamified apps, this lack of compliance could be caused by a notable lack of guidance documents addressing the intricate relationship between gamification elements, engagement design, and their impacts on health outcomes. The unique characteristics of such apps should be considered in the current regulatory approval process along with other aspects of the app’s safety and performance within the assessment, validation, and regulatory frameworks. A tailored framework, guidance, or standard should delineate the specific risk categories associated with gamification and outline the required steps for design and human factors evaluation, as well as for clinical evaluation, considering the prolonged and dynamic user activity characteristics of gamified apps. Such a framework should complement existing regulatory legislation, not replace it. It is intended to provide guidance on how these products are manufactured and how their risks can be assessed within the framework of the applicable laws. Similar guidance documents exist for other risk aspects of MDs, e.g. cybersecurity^48,67 or AI⁴⁷. These horizontal features are present in many MDs and create additional risks, while the corresponding guidance documents help to assess and mitigate them. Our paper is a step towards how this can also be done for gamification. The increasing number of gamified health apps and the ambiguities around current regulations presented in this study underline the necessity for specific guidelines.

Additionally, there is a need for a more transparent and accessible system to build trust in the EU medical device landscape, as has been recognized for implantable medical devices in the past⁶⁸. This particularly applies to gamified mHealth apps, of which, in the judgment of the expert panel in this study, a surprisingly high proportion are on the market without the required approvals.

Limitations

This analysis has several limitations. The inclusion criteria limit assessed apps to the most populous EU countries, and this potentially limits the generalizability of results to all EU member states. Additionally, due to feasibility reasons, we had to limit our search of the app stores to the categories ‘Medical’ and ‘Health&Fitness’. Thus, we might have missed a small number of apps that have been misclassified by the app stores or developers. Assessment panel members had free choice of whether to download, install, and explore the functionality of the mHealth app or to assess the app based on the developer-provided descriptions, images, and videos (on the app stores, developer websites, and research publications). A minority of apps were downloaded. This is, however, compatible with the assessment of apps based on the developer-stated claims as they relate to the developer-stated functionality, which is also the core approach of the regulatory approval process. Only mobile apps were included in this study.

The definition of gamification and of gamification elements is still an area of discussion among researchers; an agreed consensus definition does not yet exist, and multiple researchers propose different lists of gamification elements^4,22,69,70. We used the definition proposed by Sailer et al. (2017)⁴, which is widely recognized. However, competing definitions also exist^9,11,13, which could limit our findings.

The identification of evidence on existing apps could be limited as developers of apps, whether MD or non-MD, are not obliged to put evidence in the public domain. As the EU EUDAMED database is still under development, there is no single source in which to search definitively for approved MDs and their risk classes. A detailed search of the app store descriptions, the developers’ websites, research publications, and a general internet search was conducted. The qualification of apps as MDs in the EU is largely based on the manufacturer’s reported intended purpose. It was not always possible for the assessment panel to base their decisions on a stated intended purpose since not all apps clearly stated this. In these cases, the panel’s judgment was based on all available information on claims and functionality.

The analysis of specific risks of gamification was only conducted for a subset of eight apps, which had to be available in German and English, free of charge, available in the Google PlayStore, and accessible without a doctor’s prescription. Additionally, the degree of gamification was judged on a self-developed arbitrary scale. Thus, findings about the intersection of risks and gamification could be limited by the arbitrary character of this scale.

Conclusion

In the emerging field of gamified health apps, few commercial products meet the strict standards of medical device regulations, in particular, the EU’s MDR. Potentially unsafe and illegal products should not be available in the app stores, and the stores themselves and regulators should take action to remove them from the market in accordance with the applicable laws regulating MDs. Therefore, app stores must implement and enforce appropriate review procedures so that healthcare professionals and consumers can have confidence in the safety and efficacy of the apps they download. While the level of compliance for such apps is inadequate, the specific level of risk for most apps remains uncertain. In many cases, we found that there is only a weak link between the gamification elements and the potential risks of the apps. However, for the apps identified that have a high degree of gamification, the link between risks and gamification elements was stronger. This ambiguity highlights the need for further research to better understand and address the unique challenges that gamified health apps present in regulatory and safety contexts.

Methods

Overview

We used a 4-step approach adapted from other studies^14,71,72. First, a database of apps was set up and cleaned. Second, all apps were assessed against defined criteria to determine whether they could be considered mHealth apps. Third, the remaining apps were evaluated to see whether they could be considered gamified. Fourth, the remaining apps were considered by a panel of regulatory experts who assessed their regulatory compliance. Each assessment was carried out using defined evaluation criteria. To assess whether the hypothesized lack of compliance, if confirmed, implies risks for patients that might be related to gamification, an in-depth analysis of the association between gamification and risks was conducted for a subset of apps, following an approach defined in ISO 14971. Additionally, a qualitative comparison of regulatory compliance in the US was conducted for those apps on the US market.

The statistical analysis was conducted using Python version 3.11.6 and the following libraries: pandas, NumPy, and statsmodels. Data visualizations were created using Matplotlib. To calculate the reliability of agreement between the regulatory panel members, a Fleiss’ kappa measurement was performed⁷³; the interpretation was based on Landis and Koch⁷⁴.
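
As an illustration of the agreement statistic, the following is a minimal sketch of Fleiss’ kappa with the Landis and Koch interpretation bands, written in plain Python so that the calculation is visible. It is not the analysis code used in this study; the function names `fleiss_kappa` and `landis_koch` and the input layout (one list of labels per app) are assumptions made for this example. The statsmodels library offers an equivalent routine in `statsmodels.stats.inter_rater`.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for subjects that were each rated by the same number of raters.

    ratings: list of lists, one inner list of category labels per subject (app).
    categories: all labels a rater may assign.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])

    # counts[i][j]: number of raters who assigned subject i to category j
    counts = []
    for subject in ratings:
        tally = Counter(subject)
        row = [tally[c] for c in categories]
        if len(subject) != n_raters or sum(row) != n_raters:
            raise ValueError("unequal number of ratings or unknown label")
        counts.append(row)

    # Observed agreement: mean of the per-subject agreement P_i
    p_i = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall category proportions
    total = n_subjects * n_raters
    p_j = [sum(row[j] for row in counts) / total for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)

    if p_e == 1:
        return float("nan")  # kappa is undefined if only one category was ever used
    return (p_bar - p_e) / (1 - p_e)


def landis_koch(kappa):
    """Verbal label for kappa following the bands of Landis and Koch."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"
```

With four panel members, each inner list would hold the four class labels assigned to one app.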

Database setup

Since the app stores are organized regionally, and no EU-wide top list exists, each country had to be assessed individually. To keep the total number of apps manageable, only apps from the five EU countries with the largest populations, Germany, France, Italy, Spain, and Poland, which comprise 66% of the EU population, were included⁷⁵. Besides apps listed in the Category “Medical” in both app stores, we also included apps listed in the heterogeneous category of “Health and Fitness,” which includes apps with a medical purpose, e.g., cardiovascular health tracking apps and non-health related apps like hiking navigation apps. The Top 50 “free” and “paid” apps as listed by the stores in the categories “Medical” and “Health and Fitness” in the iOS AppStore and Google PlayStore in the selected countries were identified through the search function of the web versions of the iOS AppStore and the Google PlayStore on the 27th of July, 2023 and included in the study database. The manufacturer’s name, store link, rank, and operating system were listed for each app. The database was set up using Microsoft Excel for Mac Version 16.78.
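
The size of the raw sample follows from the sampling frame described above. The short sketch below enumerates the top lists and gives an upper bound on the number of raw listings before cleaning, assuming every list returned a full 50 entries; the variable names are chosen for this example only.

```python
from itertools import product

countries = ["Germany", "France", "Italy", "Spain", "Poland"]
stores = ["iOS AppStore", "Google PlayStore"]
categories = ["Medical", "Health and Fitness"]
price_types = ["free", "paid"]
TOP_N = 50  # Top 50 per list

# One top list per combination of country, store, category and price type
top_lists = list(product(countries, stores, categories, price_types))

# Upper bound on raw listings, assuming each list is complete (an assumption)
max_listings = len(top_lists) * TOP_N
```

The same app can appear in several of these lists, which is why the duplicate removal step is needed.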

App store listings were systematically reviewed, and duplicates, identified by the name of the app, store URL, and developer identity, were removed. Multilingual duplicates were omitted. Complimentary free versions of premium apps, inaccessible apps, and collections of multiple apps were also excluded.
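
The duplicate removal could be sketched as follows. This is an illustrative reconstruction, not the procedure used in the study, which was carried out in a spreadsheet. The `Listing` record and the rule of treating two listings as duplicates only if name, store URL, and developer all match are assumptions; the text does not state how the three identifiers were combined, and multilingual duplicates would need an additional manual check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    name: str
    store_url: str
    developer: str
    country: str
    rank: int


def dedupe(listings):
    """Keep the first listing for each (name, store URL, developer) key."""
    seen = set()
    unique = []
    for item in listings:
        # Normalise case and surrounding whitespace before comparing (assumption)
        key = (
            item.name.strip().casefold(),
            item.store_url.strip(),
            item.developer.strip().casefold(),
        )
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```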

Materials for the regulatory expert assessment

For all gamified mHealth apps, a store link and the main features of each app were provided. Since there is no standardized way in the app stores to report the state of regulatory approval and risks of an application, for each app, the EUDAMED database, the FDA databases, the DiGA directory (a directory of Digital Health Applications that can be prescribed by physicians and psychotherapists and are reimbursed by health insurers)⁷⁶, the manufacturer’s website, and the app were searched for existing clearance/approval. If the app had already been cleared/approved by any regulatory body, all linked and publicly available materials were presented to the panel.

Evidence for the apps was searched in a systematic literature review via Google Scholar and PubMed using the search term “[app name]” and via a Google web search using the search terms (“[app name]” AND study). Additionally, the websites of the manufacturer and, in case of existing clearance/approval, the corresponding database^65,66,76 were screened. All identified peer-reviewed articles were presented to the panel members.

Selection and assessment of apps

Composition and expertise of the panels

Three different panels were involved in the assessment stage of the applications. The first panel for the mHealth assessment consisted of one physician and game developer (author O.F.), one public health and gamification researcher (author K.J.W.), and one expert in medical device software (author Q.S.).

The second panel, responsible for the gamification assessment, consisted of one physician and game developer (author O.F.), one public health and gamification researcher (author K.J.W.), and one developer of serious games (author M.H.).

The third panel, responsible for the regulatory assessment, consisted of one professor for regulatory science and consultant for regulatory affairs (author S.G.), one professor of medical device regulatory affairs and former senior medical officer at the Health Products Regulatory Authority (HPRA) (author T.M.), one digital health consultant (author P.W.), and one regulatory consultant and former Senior Technical Assessor at the UK Competent Authority (MHRA) (author A.S.P.).

mHealth assessment

The cleaned dataset was examined by three reviewers who independently assessed in a binary decision whether an app could be considered an mHealth app based on predefined criteria applied to the descriptions and available picture and video material in the app stores and on the manufacturer’s website. The reviewers used the definition of the WHO extended by Tomlinson et al. to determine whether an app is an mHealth app. This definition describes mHealth as “(…) medical and public health practice delivered on digital devices, covering the use of mobile phones to improve point of service data collection, care delivery, and patient communication to the use of alternative wireless devices for real-time medication monitoring and adherence support^77,78.” All apps that did not meet this definition were excluded. Each app was categorized according to its primary purpose, as stated in its description. For non-English descriptions, two translation tools, “Google Translate” and “DeepL,” were used, and the resultant translations were checked for consistency. For German and French apps, the translation quality was verified by a native speaker. All apps from Spain, Italy, and Poland provided versions of the app and the app store page in English, which served as a base for the consistency check. Divergent opinions between the reviewers were recorded, and the assessment of the majority of the reviewers was followed.
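
The majority rule for the binary inclusion decisions can be sketched as below. The function name `majority_decision` and the returned unanimity flag, which stands in for the recording of divergent opinions, are assumptions made for this example.

```python
def majority_decision(votes):
    """Return (decision, unanimous) for an odd number of binary reviewer votes."""
    if len(votes) % 2 == 0:
        raise ValueError("an odd number of reviewers is needed to avoid ties")
    yes = sum(bool(v) for v in votes)
    decision = yes > len(votes) / 2        # include the app if most reviewers said yes
    unanimous = yes in (0, len(votes))     # False means a divergent opinion to record
    return decision, unanimous
```

With three reviewers, two matching votes are always enough to settle inclusion or exclusion.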

Gamification assessment

The dataset of the remaining apps was examined by three reviewers who independently assessed in a binary decision whether an app could be considered gamified based on predefined criteria applied to the descriptions and available picture and video material in the app stores and on the manufacturer’s website. The reviewers used the definition by Sailer et al., 2017⁴. The authors describe gamification as making activities in non-game contexts more game-like using the following game design elements. “Points” are awarded for specific accomplishments in a game. They signify a player’s progress with experience, redeemability, or reputation points. “Badges” are visual symbols of player achievements. They validate a player’s successes, serve as status symbols, provide feedback, and can influence player decisions. “Leaderboards” rank players based on their achievements compared to others. “Performance Graphs” are visual representations comparing a player’s performance to past results. “Meaningful Stories” are narratives that contextualize in-game activities, adding depth beyond scoring points. “Avatars” are visual icons representing players in the game. Players often choose or design them. “Teammates” are other players or computer-controlled characters promoting dynamics like conflict, competition, or cooperation⁴. This list by Sailer et al., however, is not necessarily exhaustive, with multiple competing lists existing^22,69,70. Any apps that did not meet the definition were excluded. Additionally, apps that function primarily as input software for a specific other device (e.g., a sensor) were excluded as they do not function in isolation to deliver a medical purpose.

Regulatory expert assessment

A panel of four experts in medical device clinical evidence, risks, and regulation examined all remaining apps. Each reviewer was provided with the complete list of 69 apps in randomized order. The assessment was performed individually without communication between assessors.

The assessment of the regulatory panel was based on the given definition of MDs in the Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on medical devices³¹. Here, a MD is defined as “any instrument, apparatus, appliance, software, implant, reagent, material or other article intended by the manufacturer to be used, alone or in combination, for human beings for one or more of the following specific medical purposes: diagnosis, prevention, monitoring, prediction, prognosis, treatment or alleviation of disease; diagnosis, monitoring, treatment, alleviation of, or compensation for, an injury or disability; investigation, replacement or modification of the anatomy or of a physiological or pathological process or state; providing information by means of in vitro examination of specimens derived from the human body, including organ, blood and tissue donations, and which does not achieve its principal intended action by pharmacological, immunological or metabolic means, in or on the human body, but which may be assisted in its function by such means³¹”. The risk classification was based on ‘Rule 11’ of the MDR, which states that “Software intended to provide information which is used to take decisions with diagnosis or therapeutic purposes is classified as class IIa, except if such decisions have an impact that may cause: death or an irreversible deterioration of a person’s state of health, in which case it is in class III; or a serious deterioration of a person’s state of health or a surgical intervention, in which case it is classified as class IIb. Software intended to monitor physiological processes is classified as class IIa, except if it is intended for monitoring of vital physiological parameters, where the nature of variations of those parameters is such that it could result in immediate danger to the patient, in which case it is classified as class IIb. 
All other software is classified as class I³¹,” on the MDCG 2021–24 Guidance on classification of medical devices³³ and on the experts’ interpretation and knowledge of the application of those rules. A standardized scale was used for the assessment based on the regulatory legislation. All experts had used those documents in the past and were familiar with their content and rules. Additionally, a set of risk class definitions based on those guidelines was provided to the panelists to establish a common ground and understanding of the applicable rules. An application could either be assessed as “Non-MD” if it does not qualify as an MD in the eyes of the expert, or it could fall in one of the four risk classes defined by the regulatory legislation, “MD Class I,” “MD Class IIa,” “MD Class IIb,” or “MD Class III.” Criteria used by the panel for the risk classification are provided in Table 4, and example apps for each risk class are shown in Supplemental Material 1.
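
The structure of Rule 11 as quoted above can be written as a small decision function. This is only a reading aid under stated assumptions: it applies to software that already qualifies as an MD, the parameter names are invented for this example, and taking the higher class when both the decision-support and the monitoring limb apply reflects our reading of the general classification principle rather than the wording of Rule 11 itself. It does not replace the expert judgment used by the panel.

```python
from enum import Enum


class Impact(Enum):
    OTHER = 0         # none of the exceptions named in Rule 11
    SERIOUS = 1       # serious deterioration of health or a surgical intervention
    IRREVERSIBLE = 2  # death or irreversible deterioration of health


CLASS_ORDER = ["I", "IIa", "IIb", "III"]


def rule_11_class(informs_decisions=False, impact=Impact.OTHER,
                  monitors_physiology=False, vital_immediate_danger=False):
    """Risk class under the quoted Rule 11 text for software that is an MD."""
    candidates = ["I"]  # "All other software is classified as class I"
    if informs_decisions:
        # Information used for decisions with diagnosis or therapeutic purposes
        by_impact = {Impact.IRREVERSIBLE: "III", Impact.SERIOUS: "IIb"}
        candidates.append(by_impact.get(impact, "IIa"))
    if monitors_physiology:
        # Monitoring of physiological processes; IIb for vital parameters
        # whose variation could mean immediate danger to the patient
        candidates.append("IIb" if vital_immediate_danger else "IIa")
    # Assumption: if both limbs apply, the higher class is taken
    return max(candidates, key=CLASS_ORDER.index)
```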

Table 4 Risk classification framework.

Every reviewer had to: (i) report the classification into which the app is likely to fall, with an assignment of the “most-likely” classification and any number of less likely but also possible classifications; (ii) report their degree of confidence in their “most-likely” classification assignment, on a scale from 1 (low confidence) to 5 (high confidence); and (iii) if an app already had regulatory clearance/approval, decide whether the app could be considered as adequately cleared/approved (particularly important as lower risk classes are self-declared by developers). The judgment was based on the intended purpose, as presented by the developer. Assessors were instructed to comment on any deviation between the developer’s stated intended purpose and actual delivered app functionality that would lead to a different risk class.

After the evaluation, the results were combined and analyzed. A vote of at least 3 out of 4 reviewers was defined as a definite vote. Compliance was then assessed as ‘regulatory compliant’ if the rules applicable to the individual app were followed or as ‘regulatory non-compliant’ in the event of deviations from the applicable regulations. Apps that did not qualify as MDs, in the opinion of the experts, were always assessed as ‘regulatory compliant’, as they are not subject to any special rules within the meaning of the MDR.
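
The aggregation rule can be sketched as follows. The function names, the `rules_followed` flag (a stand-in for the panel's judgment that the rules applicable to the app were met), and the label returned when no definite vote exists are assumptions made for this example and are not taken from the study.

```python
from collections import Counter


def definite_vote(votes, threshold=3):
    """Return the class chosen by at least `threshold` reviewers, else None."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= threshold else None


def compliance(consensus, rules_followed):
    """Map a consensus class to a compliance label.

    rules_followed: hypothetical flag for whether the applicable rules were met.
    """
    if consensus is None:
        return "no definite vote"          # placeholder label (assumption)
    if consensus == "Non-MD":
        return "regulatory compliant"      # no special rules apply under the MDR
    return "regulatory compliant" if rules_followed else "regulatory non-compliant"
```

With four reviewers and a threshold of three, at most one class can reach a definite vote, so the result is unambiguous.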

Gamification analysis

To assess the specific risks of the apps associated with their gamified aspects, an in-depth analysis of a subset of eight apps was carried out independently by two reviewers, one professor for regulatory science and consultant for regulatory affairs (author S.G.), and one physician and game developer (author O.F.). Only apps with a majority panel decision of MD status (3 out of 4 votes) were included. Selected apps had to be available in German and English, free of charge, available in the Google PlayStore, and accessible without a doctor’s prescription. Apps with similar functionalities by the same developer were excluded. Each app was downloaded and tested for at least 10 min. The number of gamification elements, following the adopted definition⁴, was stated, and the primary purpose of each app was assessed. Both reviewers had to report their personal game feel on an arbitrary scale from 1 (lowest) to 5 (highest). The scale was developed by three gamification experts (O.F., K.J.W., M.H.) to determine the degree of gamification. Both assessors had to answer the question “How close is the experience, interaction and appearance of this app to a full-fledged game?” on a scale from 1 (infrequent or minimal non-immersive gamefeel through gamification elements), 2 (occasional or moderate non-immersive gamefeel through gamification elements), 3 (occasional immersive aspects of gamefeel, but separated by user experience with no or minimal game feel), 4 (frequent immersive aspects of game experience throughout), and 5 (Fully immersive game experience throughout, the app experience feeling fully like a game). The rationale behind using this self-developed scale was that existing scales make the degree of gamification dependent on the number of elements used⁷⁰, leaving out the perceptual level^80,81. However, since the number of elements does not necessarily correlate with the quality of gamification or the game feel²⁵, we had to resort to a self-developed scale.

The specific medical risks that emerge from or are connected with the implemented gamification elements were assessed according to medical device risk assessment principles in the applicable compulsory standard ISO 14971⁵⁰. Firstly, hazards (a potential source of harm), the foreseeable sequence of events (sequence of events or circumstances that lead to a hazardous situation), the hazardous situation (circumstance in which people are exposed to a hazard), and the harm (injury or damage to the health of people) to the user were defined. The severity of harm, which is dependent on the intended use of the app, was then rated from 1 (Negligible—Inconvenience or temporary discomfort), 2 (Minor—Temporary injury or physical and mental impairment requiring simple or no professional medical intervention), 3 (Moderate—Temporary injury or physical or mental impairment requiring professional medical intervention (excluding surgical intervention)), 4 (Severe–Permanent physical or mental impairment or life-threatening injury requiring surgical intervention) to 5 (Catastrophic—User death as a direct result of hazard). The strength of the association of gamification elements with risks was reported on a scale from none (Gamification elements are not responsible for any part of the reported risk), weak (Gamification elements are minimally responsible for the reported risk, contributing insignificantly, with other aspects of the app being more responsible), moderate (Gamification elements are moderately responsible for the reported risk, contributing to an increase in risk likelihood or severity, with other aspects of the app being similarly responsible), to strong (Gamification is a major factor in the reported risk, significantly increasing its likelihood or severity, with other aspects of the app being minimally responsible).
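
One way to record an assessment of this kind is sketched below. The record layout and the numeric coding of the association scale are assumptions made for this example (the study reports the association as labels only), and the values in the example entry are placeholders, not ratings reported by the reviewers.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    NEGLIGIBLE = 1
    MINOR = 2
    MODERATE = 3
    SEVERE = 4
    CATASTROPHIC = 5


class Association(IntEnum):
    # Numeric coding is an assumption; it only preserves the stated order
    NONE = 0
    WEAK = 1
    MODERATE = 2
    STRONG = 3


@dataclass
class RiskEntry:
    app_id: str
    hazard: str                # potential source of harm
    sequence_of_events: str    # foreseeable events leading to the situation
    hazardous_situation: str   # circumstance in which people are exposed
    harm: str                  # injury or damage to health
    severity: Severity
    association: Association   # link between gamification and the risk


def gamification_linked(entries, minimum=Association.MODERATE):
    """Entries whose risk is at least `minimum` attributable to gamification."""
    return [e for e in entries if e.association >= minimum]


# Hypothetical entry with placeholder ratings
example = RiskEntry(
    app_id="EXAMPLE",
    hazard="points awarded for data entry without a consistency check",
    sequence_of_events="user enters implausible values to collect points",
    hazardous_situation="a recommendation is derived from wrong input",
    harm="incorrect or delayed treatment",
    severity=Severity.MODERATE,
    association=Association.STRONG,
)
```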

US assessment

One senior expert for US regulation performed the comparative US market analysis. All apps that were not available in the US app store were excluded. The definitions followed FDA guidelines^51,65,82 and the FFDCA³². In the first step, an app was defined as an MD based on whether the product acts on a disease or disordered state and references the diagnosis, cure, mitigation, treatment, or disease prevention based on the app’s claim and intended use. In the second step, the specific risk class an MD would likely fall under (Class I, Class II, Class III) was assigned based on the expert’s knowledge. If an app’s claims were closer to mitigation (e.g., managing symptoms of MD) or if the product acts on a symptom rather than the disease itself (e.g., managing feelings of anxiety, where anxiety is not a disease), the device was considered Class I. The device was considered Class II if the claims were specific to diagnosis, cure, or treatment. Since there are currently no software-only digital products in the US in Class III, such a classification was considered unlikely. In the third step, the most probable regulatory pathway was defined. Class I MDs mainly belong under the self-declared enforcement discretion status when they pose a low risk without needing evaluation or evidence⁵¹ or could belong in the self-declared 510(k) exempt status⁵². Class II MDs could also belong to the 510(k) exempt status, follow the 510(k) pathway if a predicate is available, or follow the De Novo pathway if no such predicate is available⁸².
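
The three steps can be summarized in a simplified decision sketch. It is not the expert’s procedure, which relied on professional judgment. The function and parameter names are hypothetical, and two choices are assumptions made for this example: a product acting only on a symptom is treated as Class I even if treatment is claimed, and claims limited to prevention fall through to Class I.

```python
MD_CLAIMS = frozenset({"diagnosis", "cure", "mitigation", "treatment", "prevention"})
CLASS_II_CLAIMS = frozenset({"diagnosis", "cure", "treatment"})


def us_assessment_sketch(claims, acts_on_health_condition,
                         symptom_only=False, predicate_available=False):
    """Simplified three-step US assessment: device status, class, pathways."""
    claims = set(claims)

    # Step 1: device status from the claims and the intended use
    if not acts_on_health_condition or not (claims & MD_CLAIMS):
        return {"device": False, "risk_class": None, "pathways": []}

    # Step 2: likely risk class (Class III was considered unlikely for software)
    if (claims & CLASS_II_CLAIMS) and not symptom_only:
        risk_class = "Class II"
        # Step 3: exempt status, 510(k) with a predicate, or De Novo without one
        pathways = ["510(k) exempt", "510(k)" if predicate_available else "De Novo"]
    else:
        risk_class = "Class I"
        # Step 3: enforcement discretion or 510(k) exempt status
        pathways = ["enforcement discretion", "510(k) exempt"]

    return {"device": True, "risk_class": risk_class, "pathways": pathways}
```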

Data availability

Access to the de-identified dataset of gamified mHealth-Apps can be made available by contacting the corresponding author.

Code availability

Only publicly available libraries were used for the statistical analysis.

References

Global Industry Analysts. mHealth Apps. https://www.marketresearch.com/Global-Industry-Analysts-v1039/mHealth-Apps-34001825/ (2023).
Michie, S., Yardley, L., West, R., Patrick, K. & Greaves, F. Developing and evaluating digital interventions to promote behavior change in health and health care: Recommendations resulting from an international workshop. J. Med. Internet Res. 19, e7126 (2017).
Flaherty, S. J., McCarthy, M., Collins, A. M., McCafferty, C. & McAuliffe, F. M. Exploring engagement with health apps: the emerging importance of situational involvement and individual characteristics. Eur. J. Mark. 55, 122–147 (2021).
Sailer, M., Hense, J. U., Mayr, S. K. & Mandl, H. How gamification motivates: An experimental study of the effects of specific game design elements on psychological need satisfaction. Comput. Hum. Behav. 69, 371–380 (2017).
Rigby, S. & Ryan, R. M. Glued to Games: How Video Games Draw Us In and Hold Us Spellbound (Bloomsbury Publishing, New York, 2011).
Hamari, J., Koivisto, J. & Sarsa, H. Does gamification work?—A literature review of empirical studies on gamification. In 2014 47th Hawaii International Conference on System Sciences 3025–3034. https://doi.org/10.1109/HICSS.2014.377 (2014).
Krath, J., Schürmann, L. & von Korflesch, H. F. O. Revealing the theoretical basis of gamification: A systematic review and analysis of theory in research on gamification, serious games and game-based learning. Comput. Hum. Behav. 125, 106963 (2021).
Bergeron, B. P. Developing Serious Games (Charles River Media, Hingham, 2006).
Deterding, S., Dixon, D., Khaled, R. & Nacke, L. From game design elements to gamefulness: defining ‘gamification’. In Proceedings of the 15th International Academic MindTrek Conference: Envisioning Future Media Environments 9–15 (Association for Computing Machinery, New York, NY, USA, 2011). https://doi.org/10.1145/2181037.2181040.
Warsinsky, S., Schmidt-Kraepelin, M., Rank, S., Thiebes, S. & Sunyaev, A. Conceptual ambiguity surrounding gamification and serious games in health care: literature review and development of game-based intervention reporting guidelines (GAMING). J. Med. Internet Res. 23, e30390 (2021).
Werbach, K. (Re)Defining gamification: A process approach. In Persuasive Technology (eds Spagnolli, A. et al.) 266–272 (Springer International Publishing, Cham, 2014).
Liu, D., Santhanam, R. & Webster, J. Toward Meaningful Engagement: A Framework for Design and Research of Gamified Information Systems. MIS Q. Forthcoming, (2016).
Schmidt-Kraepelin, M., Thiebes, S., Tran, M. C. & Sunyaev, A. Whats in the Game? Developing a Taxonomy of Gamification Concepts for Health Apps. In Proc. 51th Hawaii Int. Conf. Syst. Sci. HICSS 2018 1217 (2018).
Schmidt-Kraepelin, M., Toussaint, P. A., Thiebes, S., Hamari, J. & Sunyaev, A. Archetypes of gamification: Analysis of mHealth apps. JMIR MHealth UHealth 8, e19280 (2020).
Lister, C., West, J. H., Cannon, B., Sax, T. & Brodegard, D. Just a Fad? gamification in health and fitness apps. JMIR Serious Games 2, e3413 (2014).
Global Industry Analysts. Gamification. https://www.marketresearch.com/Global-Industry-Analysts-v1039/Gamification-33789259/ (2023).
Global Market Database. Global Market Size of Gamification Market—10 Year Market Forecast. https://globalmarketdatabase.com/product/global-market-size-of-gamification-market-10-year-market-forecast/ (2021).
Dicheva, D., Dichev, C., Agre, G. & Angelova, G. Gamification in education: A systematic mapping study. J. Educ. Technol. Soc. 18, 75–88 (2015).
Rodrigues, L. F., Oliveira, A. & Rodrigues, H. Main gamification concepts: A systematic mapping study. Heliyon 5, e01993 (2019).
Expert Market Research. Global Healthcare Gamification Market Report and Forecast 2023–2031. https://www.marketresearch.com/Expert-Market-Research-v4220/Global-Healthcare-Gamification-Forecast-34718757/ (2023).
Global Market Database. Global Market Size of Healthcare Distribution Market—10 Year Market Forecast. https://globalmarketdatabase.com/product/global-market-size-of-healthcare-distribution-market-10-year-market-forecast/ (2022).
Johnson, D. et al. Gamification for health and wellbeing: A systematic review of the literature. Internet Interv. 6, 89–106 (2016).
Tolks, D. et al. Spielerische ansätze in prävention und gesundheitsförderung: Serious games und gamification. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz 63, 698–707 (2020).
Suleiman-Martos, N. et al. Gamification for the improvement of diet, nutritional habits, and body composition in children and adolescents: A systematic review and meta-analysis. Nutrients 13, 2478 (2021).
Six, S. G., Byrne, K. A., Tibbett, T. P. & Pericot-Valverde, I. Examining the effectiveness of gamification in mental health apps for depression: Systematic review and meta-analysis. JMIR Ment. Health 8, e32199 (2021).
Kaihara, T. et al. Impact of gamification on glycaemic control among patients with type 2 diabetes mellitus: A systematic review and meta-analysis of randomized controlled trials. Eur. Heart J. Open 1, oeab030 (2021).
Kwan, Y. H. et al. A systematic review of nudge theories and strategies used to influence adult health behaviour and outcome in diabetes management. Diabetes Metab. 46, 450–460 (2020).
Darejeh, A. & Salim, S. S. Gamification solutions to enhance software user engagement—a systematic review. Int. J. Hum. Comput. Interact. 32, 613–642 (2016).
Silva, R. D. O. S. et al. Effect of digital serious games related to patient care in pharmacy education: A systematic review. Simul. Gaming 52, 554–584 (2021).
Sardi, L., Idri, A. & Fernández-Alemán, J. L. A systematic review of gamification in e-Health. J. Biomed. Inform. 71, 31–48 (2017).
European Parliament, European Council. Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on Medical Devices, Amending Directive 2001/83/EC, Regulation (EC) No 178/2002 and Regulation (EC) No 1223/2009 and Repealing Council Directives 90/385/EEC and 93/42/EEC (Text with EEA Relevance). (2017).
U.S. Congress. United States Code: Federal Food, Drug, and Cosmetic Act. (1938).
Medical Device Coordination Group (MDCG). MDCG 2021–24 Guidance on Classification of Medical Devices—October 2021. (2021).
Sadare, O., Melvin, T., Harvey, H., Vollebregt, E. & Gilbert, S. Can Apple and Google continue as health app gatekeepers as well as distributors and developers?. Npj Digit. Med. 6, 1–7 (2023).
Koldeweij, C. et al. CE accreditation and barriers to CE marking of pediatric drug calculators for mobile devices: Scoping review and qualitative analysis. J. Med. Internet Res. 23, e31333 (2021).
Ceross, A. & Bergmann, J. Tracking the presence of software as a medical device in US food and drug administration databases: Retrospective data analysis. JMIR Biomed. Eng. 6, e20652 (2021).
Therapeutic Goods Administration. Actual and Potential Harm Caused by Medical Software. https://www.tga.gov.au/resources/publication/publications/actual-and-potential-harm-caused-medical-software (2020).
European Coordination Committee of the Radiological, Electromedical and Healthcare IT Industry. COCIR Impact Paper Medical Device Regulation Medical Software. (2017).
Gilbert, S. et al. Learning from experience and finding the right balance in the governance of artificial intelligence and digital health technologies. J. Med. Internet Res. 25, e43682 (2023).
Melvin, T. The European medical device regulation-what biomedical engineers need to know. IEEE J. Transl. Eng. Health Med. 10, 1–5 (2022).
Torous, J., Stern, A. D. & Bourgeois, F. T. Regulatory considerations to keep pace with innovation in digital health products. Npj Digit. Med. 5, 1–4 (2022).
Gilbert, S., Harvey, H., Melvin, T., Vollebregt, E. & Wicks, P. Large language model AI chatbots require approval as medical devices. Nat. Med. 29, 2396–2398 (2023).
Saenz, A. D., Harned, Z., Banerjee, O., Abràmoff, M. D. & Rajpurkar, P. Autonomous AI systems in the face of liability, regulations and costs. Npj Digit. Med. 6, 1–3 (2023).
Freyer, O., Wiest, I. C., Kather, J. N. & Gilbert, S. A future role for health applications of large language models depends on regulators enforcing safety standards. Lancet Digit. Health. 6(9), e662–e672. https://doi.org/10.1016/S2589-7500(24)00124-9 (2024).
Damaševičius, R., Maskeliūnas, R. & Blažauskas, T. Serious games and gamification in healthcare: A meta-review. Information 14, 105 (2023).
International Organization for Standardization. IEC 62366-1:2015. (2015).
IMDRF AIMD Working Group. Machine Learning-enabled Medical Devices—A subset of Artificial Intelligence enabled Medical Devices: Key Terms and Definitions. (2021).
Medical Device Coordination Group (MDCG). MDCG 2019-16 Guidance on Cybersecurity for Medical Devices. https://ec.europa.eu/docsroom/documents/41863 (2020).
Council of the European Union. Council Directive 93/42/EEC of 14 June 1993 concerning medical devices. OJ L vol. 169 (1993).
International Organization for Standardization. ISO 14971:2019. (2019).
U.S. Food and Drug Administration (FDA). Policy for Device Software Functions and Mobile Medical Applications. (2022).
U.S. Food and Drug Administration (FDA). Medical Device Exemptions 510(k) and GMP Requirements. (2023).
Fink, M. & Akra, B. Comparison of the international regulations for medical devices–USA versus Europe. Injury 54, 110908 (2023).
Flowchart to assist in deciding whether or not a device is covered by the extended MDR transitional period. https://health.ec.europa.eu/latest-updates/flowchart-assist-deciding-whether-or-not-device-covered-extended-mdr-transitional-period-2023-08-23_en.
Drummond, D., Monnier, D., Tesnière, A. & Hadchouel, A. A systematic review of serious games in asthma education. Pediatr. Allergy Immunol. 28, 257–265 (2017).
Rodriguez, D. M., Teesson, M. & Newton, N. C. A systematic review of computerised serious educational games about alcohol and other drugs for adolescents. Drug Alcohol Rev. 33, 129–135 (2014).
Keutzer, L. & Simonsson, U. S. Medical device apps: An introduction to regulatory affairs for developers. JMIR MHealth UHealth 8, e17567 (2020).
Hanseatisches Oberlandesgericht Hamburg 3. Zivilsenat. Case 3 W 30/23. (2023).
Johner. MDR Rule 11: The Classification Nightmare. https://www.johner-institute.com/articles/regulatory-affairs/and-more/mdr-rule-11-software/ (2017).
MDCG 2019-11 Guidance on qualification and classification of software in regulation (EU) 2017/745—MDR and Regulation (EU) 2017/746—IVDR. (2022).
Eidel, O. The MDR class I software situation. https://openregulatory.com/mdr-class-i-software-situation/ (2022).
OpenAI. GPT-4 Technical report. https://ar5iv.labs.arxiv.org/html/2303.08774 (2023).
Statista. Number of Apps Available in Leading App Stores as of 3rd Quarter 2022. https://www.statista.com/statistics/276623/number-of-apps-available-in-leading-app-stores/ (2022).
European Commission. Manual on Borderline and Classification under Regulations (EU) 2017/745 and 2017/746 - Version 2 - December 2022.
U.S. Food and Drug Administration (FDA). Product classification database. https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfpcd/classification.cfm (2023).
European Commission. EUDAMED database - EUDAMED. https://ec.europa.eu/tools/eudamed/#/screen/home (2020).
Freyer, O. et al. Consideration of cybersecurity in the benefit-risk analysis of medical devices: A scoping review and recommendations, 05 August 2024, PREPRINT (Version 1) available at Research Square https://doi.org/10.21203/rs.3.rs-4816554/v1 (2024).
Fraser, A. G. et al. The need for transparency of clinical evidence for medical devices in Europe. The Lancet 392, 521–530 (2018).
Cugelman, B. Gamification: What it is and why it matters to digital health behavior change developers. JMIR Serious Games 1, e3139 (2013).
Rajani, N. B., Weth, D., Mastellos, N. & Filippidis, F. T. Use of gamification strategies and tactics in mobile applications for smoking cessation: A review of the UK mobile app market. BMJ Open 9, e027883 (2019).
Thiebes, S. et al. Valuable genomes: Taxonomy and archetypes of business models in direct-to-consumer genetic testing. J. Med. Internet Res. 22, e14890 (2020).
Remane, G., Nickerson, R., Hanelt, A., Tesch, J. & Kolbe, L. A Taxonomy of Carsharing Business Models. (2016).
Fleiss, J. L. Measuring nominal scale agreement among many raters. Psychol. Bull. 76, 378–382 (1971).
Landis, J. R. & Koch, G. G. The measurement of observer agreement for categorical data. Biometrics 33, 159–174 (1977).
Eurostat. Population on 1 January by age and sex 1960–2022. (2023).
Bundesinstitut für Arzneimittel und Medizinprodukte (BfArM). DiGA-Directory. https://diga.bfarm.de/de (2020).
Tomlinson, M., Rotheram-Borus, M. J., Swartz, L. & Tsai, A. C. Scaling up mHealth: Where is the evidence?. PLOS Med. 10, e1001382 (2013).
WHO Global Observatory for eHealth. mHealth: New Horizons for Health through Mobile Technologies: Second Global Survey on eHealth. https://iris.who.int/handle/10665/44607 (2011).
Medical Device Coordination Group (MDCG). MDCG 2019–11 Guidance on Qualification and Classification of Software in Regulation (EU) 2017/745— MDR and Regulation (EU) 2017/746— IVDR. (2019).
Huotari, K. & Hamari, J. A definition for gamification: Anchoring gamification in the service marketing literature. Electron. Mark. 27, 21–31 (2017).
Bateman, C., Lowenhaupt, R. & Nacke, L. Player typology in theory and practice. In Proc. DiGRA 2011 Conf. Think Des. Play (2012).
U.S. Food and Drug Administration (FDA). The device development process. https://www.fda.gov/patients/learn-about-drug-and-device-approvals/device-development-process (2020).

Acknowledgements

This work was supported by the German Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung, BMBF) through the European Union-financed NextGenerationEU program under grant number 16KISA100K, project PATH–‘Personal Mastery of Health and Wellness Data.’ Author K.J.W. is supported by the Ministerium für Kultur und Wissenschaft des Landes Nordrhein-Westfalen (MKW NRW). The sole responsibility for the content of this publication lies with the authors.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Else Kröner Fresenius Center for Digital Health, TUD Dresden University of Technology, Dresden, Germany
Oscar Freyer & Stephen Gilbert
WhalesDontFly H&F GmbH, Berlin, Germany
Oscar Freyer & Moritz Hofmann
Bielefeld University of Applied Sciences and Arts, Bielefeld, Germany
Kamil J. Wrona
Therapixel, Nice, France
Quentin de Snoeck
School of Medicine, Trinity College, University of Dublin, Dublin, Ireland
Quentin de Snoeck, Tom Melvin & Ashley Stratton-Powell
RQM+, Altrincham, Cheshire, UK
Ashley Stratton-Powell
Wicks Digital Health, Advantage House, Stowe Court, Lichfield, UK
Paul Wicks
Liquid Amber, Cortland, OH, USA
Acacia C. Parks

Contributions

Authors O.F. and S.G. developed the concept of the manuscript. Authors O.F., K.J.W., and Q.S. were part of the mHealth panel. Authors O.F., K.J.W., and M.H. were part of the gamification panel. Authors S.G., T.M., A.S.-P., and P.W. were part of the regulatory panel. Author A.P. was responsible for the US assessment. Authors O.F. and S.G. wrote the first draft of the manuscript. Authors O.F., K.J.W., Q.S., M.H., T.M., A.S.-P., P.W., A.P., and S.G. contributed to the writing, interpretation of the content, and editing of the manuscript, revising it critically for important intellectual content. Authors O.F., K.J.W., Q.S., M.H., T.M., A.S.-P., P.W., A.P., and S.G. had final approval of the completed version. Authors O.F., K.J.W., Q.S., M.H., T.M., A.S.-P., P.W., A.P., and S.G. take accountability for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Corresponding author

Correspondence to Stephen Gilbert.

Ethics declarations

Competing interests

P.W. is employed by Wicks Digital Health Ltd, which has received funding from Ada Health, AstraZeneca, Biogen, Bold Health, Corevitas, EIT, Endava, Happify, HealthUnlocked, Inbeeo, Kheiron Medical, Lindus Health, MedRhythms, Okko Health, PatientsLikeMe, Sano Genetics, The Learning Corp, The Wellcome Trust, THREAD Research, Unite Genomics, VeraSci, and Woebot Health. P.W. and spouse are shareholders of WDH Investments, Ltd., which owns stock in Avayl GmbH, BlueSkeye AI Ltd, Earswitch Ltd, Lucida Medical Ltd, RunYourself Ltd, Sano Genetics Ltd, and Una Health GmbH. S.G. declares a nonfinancial interest as an Advisory Group member of the EY-coordinated “Study on Regulatory Governance and Innovation in the field of Medical Devices” conducted on behalf of the DG SANTE of the European Commission. S.G. declares the following competing financial interests: he has or has had consulting relationships with Una Health GmbH, Lindus Health Ltd., Flo Ltd, Thymia Ltd., FORUM Institut für Management GmbH, High-Tech Gründerfonds Management GmbH, and Ada Health GmbH and holds share options in Ada Health GmbH. A.S.-P. is a Principal Regulatory Consultant for RQM+ but does not have competing interests associated with this research. O.F., K.J.W., Q.S., M.H., T.M., and A.P. declare no competing interests with this research.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Freyer, O., Wrona, K.J., de Snoeck, Q. et al. The regulatory status of health apps that employ gamification. Sci Rep 14, 21016 (2024). https://doi.org/10.1038/s41598-024-71808-2

Received: 10 May 2024
Accepted: 30 August 2024
Published: 09 September 2024
Version of record: 09 September 2024
DOI: https://doi.org/10.1038/s41598-024-71808-2

This article is cited by

Implementing exergames into healthcare for chronic conditions – insights from stakeholders: a qualitative study
- Marianna Antoniadou
- Aurel Zelko
- Leonie Klompstra
Health Research Policy and Systems (2025)
If a therapy bot walks like a duck and talks like a duck then it is a medically regulated duck
- Max Ostermann
- Oscar Freyer
- Stephen Gilbert
npj Digital Medicine (2025)
