Introduction

Artificial intelligence (AI) has made remarkable inroads in healthcare, offering opportunities to enhance patient care and operational efficiency. However, the integration of AI surfaces many ethical, legal, and social challenges. In order to mitigate risks, government agencies1,2, corporations3,4, and global groups like the World Health Organization5 have introduced principles to govern the use of AI.

Healthcare delivery organizations (HDOs) can be explicitly or implicitly required to comply with high-level principles. For example, as a federal agency, the Veterans Health Administration must comply with the principles for AI use listed in Executive Order 139606. Other HDOs voluntarily align with the White House Blueprint for an AI Bill of Rights2 and have gone as far as making public commitments to the safe, secure, and trustworthy use of AI in healthcare7. While the current state of AI regulation relies largely on voluntary compliance with existing principles, HDOs will inevitably have to satisfy an increasing number of mandatory requirements as the regulatory landscape expands.

Practically navigating these commitments, however, is hard. As it stands, HDOs must wade through many different sets of principles that are not always aligned with one another. The principles are also high-level, failing to account for the nuanced realities and experiences healthcare professionals encounter. Yet the process of considering and implementing these principles is critical both to ensuring the ethical and legally compliant use of AI in healthcare today and to preparing for compliance with future mandatory regulations. HDOs must grapple with moving from principles to practices.

To help bridge this gap, Health AI Partnership (HAIP) conducted interviews with diverse stakeholders across the United States (US) and released an initial set of best practices for AI adoption in 20238. Shortly after releasing the best practices, HDOs began asking how the best practices mapped to commonly cited principles.

This comment describes the first formal effort to help HDOs translate between high-level AI principles and on-the-ground best practices. We present four novel contributions. First, we evaluate the extent of alignment between a small number of key AI regulatory frameworks and guiding principles. Second, we assess the extent to which the adoption of on-the-ground best practices can be used to ensure an HDO will consider and address all key guiding AI principles. Third, we provide practical strategies to empower HDOs to navigate the rapidly evolving AI regulatory landscape. Fourth, we highlight areas of overlap and gaps between on-the-ground practices and principles to inform opportunities to strengthen future principles and best practices.

Mapping AI guidelines and HDO best practices

Sourcing AI principles

Our analysis of AI principles focuses on 8 key frameworks that have recently been published and are particularly relevant to HDOs in the US. Five frameworks were put forth by the US government between 2021 and 2023: three by the US executive branch2,6,9, one by the US National Institute of Standards and Technology (NIST)1, and one draft guidance document by the US Food and Drug Administration (FDA)10. One framework was put forth by the World Health Organization (WHO) in 20215; it was included to bring a global perspective alongside the otherwise US-centric sources, as health system leaders in many countries reference the WHO for guidance. We supplement these government and global frameworks with the two most highly cited systematic reviews of AI principles11,12. We focus on these 8 key frameworks because they are either from entities that HDOs seek to align with or are foundational academic contributions to the field of responsible AI. Table 1 summarizes the included frameworks.

Table 1 Overview of the frameworks

We do not seek to account for all published AI principles but rather highlight the principles most relevant to US HDO leaders. We recommend the two most highly cited systematic reviews of AI principles for readers seeking comprehensive analyses11,12.

Sourcing best practices

Our analysis relies on the 31 best practice guides across 8 key decision points developed by HAIP to empower healthcare professionals to use AI safely, effectively, and equitably. The HAIP best practices and key decision points were derived through a rigorous process that combined a review of published literature and nearly 90 interviews with clinical, technical, and operational leaders from HDOs across the US. The qualitative analysis is presented in detail elsewhere and is the most extensive study to date of AI governance practices in US HDOs8. The HAIP best practices are publicly available at healthaipartnership.org and listed in Table 2.

Table 2 HAIP best practice guides

Mapping between AI principles and HAIP best practices

Different frameworks use different terms to refer to the same concept. For example, the AI Bill of Rights framework describes the concept of mitigating bias with the term “algorithmic discrimination protections,” while the Principled AI framework describes the same concept with the term “fairness and non-discrimination”. To minimize the redundancy of principles while maintaining conceptual meaning, we grouped similar principles together. After grouping similar terms, the number of principles across the 8 frameworks decreased from 58 to 13. Details regarding redundant principles are included in Supplementary Table 1. The 13 distinct principles, which we hereafter refer to as synthesized principles, are listed in Table 3.
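The grouping step can be thought of as mapping each raw principle term to a canonical synthesized principle and then deduplicating. A minimal sketch follows; the term-to-principle pairs shown are hypothetical illustrations, not the actual mapping, which is given in Supplementary Table 1.

```python
# Hypothetical example of collapsing synonymous raw principle terms into
# synthesized principles. These pairs are illustrative only.
raw_term_to_synthesized = {
    "algorithmic discrimination protections": "Prevention of bias/discrimination",
    "fairness and non-discrimination": "Prevention of bias/discrimination",
    "explainability": "Transparency and explainability",
    "transparency": "Transparency and explainability",
}

# Deduplicating the canonical names yields the synthesized principle set:
# here, 4 raw terms collapse into 2 synthesized principles.
synthesized_principles = sorted(set(raw_term_to_synthesized.values()))
print(len(raw_term_to_synthesized), "raw terms ->",
      len(synthesized_principles), "synthesized principles")
```

Applied to the real data, the same deduplication takes the 58 raw principles down to the 13 synthesized principles in Table 3.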

Table 3 Synthesized principles definitions

We then mapped the 31 HAIP best practice guides to the 13 synthesized principles. If a practice guide provided actionable recommendations related to a synthesized principle, it was mapped to that principle. A single guide could be mapped to multiple synthesized principles. Three of our team members (N.P., A.H., M.L.) separately reviewed and adjudicated the mapping process to ensure consistency.
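The many-to-many mapping described above can be represented as a simple relation from guides to principles, which can then be inverted and tallied to produce the counts reported in Tables 4 and 5. The sketch below uses hypothetical guide-to-principle assignments for illustration; the actual assignments are in Table 5.

```python
from collections import defaultdict

# Each HAIP best practice guide maps to the synthesized principles for
# which it offers actionable recommendations (hypothetical assignments).
guide_to_principles = {
    "3.1 Define performance targets": {"Transparency and explainability",
                                       "Responsibility and accountability"},
    "1.1 Identify problems across the organization": {"Beneficence"},
}

# Invert the relation to see which guides address each principle.
principle_to_guides = defaultdict(set)
for guide, principles in guide_to_principles.items():
    for principle in principles:
        principle_to_guides[principle].add(guide)

# Coverage percentages as reported in Table 5: a guide addressing
# n of the 13 synthesized principles covers 100 * n / 13 percent.
TOTAL_PRINCIPLES = 13
for guide, principles in guide_to_principles.items():
    pct = 100 * len(principles) / TOTAL_PRINCIPLES
    print(f"{guide}: n = {len(principles)} ({pct:.2f}%)")
```

This is how a guide covering 8 of the 13 synthesized principles yields the 61.54% figure reported in the findings below.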

Findings

Alignment of principles across frameworks

From the mapping process, we determined the extent of alignment between each framework and the synthesized principles. Table 4 summarizes these findings. Of the frameworks, the Global Landscape of AI Ethics Guidelines (n = 8), EO 14110 (n = 7), Principled AI (n = 7), and NIST RMF (n = 6) included the highest number of synthesized principles. The Blueprint for an AI Bill of Rights (n = 5) and FDA PCCP guidance Appendix A (n = 4) included the fewest synthesized principles. We also determined the frequency with which synthesized principles appeared in the frameworks, visualized in Fig. 1. Data privacy, Transparency and explainability, and Responsibility and accountability appeared in the largest number of frameworks, while Sustainability and Government infrastructure appeared in the fewest.

Fig. 1: Synthesized principle occurrence in framework.

Bar chart displaying the number of frameworks that reference each synthesized principle in the mapping exercise.

Table 4 Framework to synthesized principles mapping

Alignment of HAIP best practices to synthesized principles

We also determined the extent to which the HAIP best practice guides incorporated actionable steps to fulfill the synthesized principles. Table 5 summarizes these findings. Topic guides for ‘(3.1) Define performance targets’ (n = 8, 61.54%) and ‘(6.3) Prevent inappropriate use of AI’ (n = 8, 61.54%) aligned with the most synthesized principles. Topic guides for ‘(1.1) Identify problems across the organization’ (n = 1, 7.69%), ‘(1.2) Prioritize problems’ (n = 1, 7.69%), and ‘(8.4) Minimize disruptions from decommissioning’ (n = 1, 7.69%) aligned with the fewest.

Table 5 Synthesized principles to HAIP best practice guides mapping

Additionally, we determined the synthesized principles that appeared most frequently across all 31 HAIP best practice guides. These results are also displayed in Table 5. The synthesized principle of Responsibility and accountability was addressed by the most guides (n = 17, 54.84%), followed by Respect for humanity/autonomy (n = 16, 51.61%) and Prevention of bias/discrimination (n = 16, 51.61%). The synthesized principles addressed by the fewest best practice guides were Government infrastructure (n = 0, 0%) and Sustainability (n = 4, 12.90%).

Use of HAIP best practices to address principles in regulatory frameworks

HAIP best practices varied in their relevance to the different frameworks. Of the HAIP best practice guides, 71% applied to all 8 frameworks, while only 3 guides (‘[1.1] Identify Problems Across Organization’, ‘[1.2] Prioritize Problems’, and ‘[8.4] Minimize Disruptions from Decommissioning’) applied to fewer than 6 frameworks. These results are shown in Fig. 2. With the results of the mapping, we can determine which HAIP best practice guides support compliance with each of the 8 frameworks. These results are displayed in Table 6.
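The framework-coverage determination is a composition of two relations: a guide applies to a framework when at least one of the guide's synthesized principles is included in that framework. A minimal sketch, with hypothetical guide, framework, and principle assignments:

```python
# Hypothetical data: which principles each guide addresses, and which
# principles each framework includes. Names are illustrative only.
guide_to_principles = {
    "1.2 Prioritize problems": {"Beneficence"},
    "6.3 Prevent inappropriate use of AI": {"Data privacy",
                                            "Responsibility and accountability"},
}
framework_to_principles = {
    "NIST RMF": {"Data privacy", "Responsibility and accountability"},
    "WHO 2021": {"Beneficence", "Data privacy"},
}

# A guide "applies to" a framework when the intersection of the guide's
# principles and the framework's principles is non-empty.
guide_to_frameworks = {
    guide: {fw for fw, fw_principles in framework_to_principles.items()
            if principles & fw_principles}
    for guide, principles in guide_to_principles.items()
}
```

Counting the resulting framework sets per guide is what produces Fig. 2, and inverting the composed relation produces the guide-to-framework mapping in Table 6.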

Fig. 2: Framework coverage by HAIP best practice guides.

Bar chart displaying the number of frameworks represented by HAIP best practice guides.

Table 6 HAIP best practice guides to framework mapping

Identifying overlaps and gaps between principles and practices

Our final analysis maps the inclusion of synthesized principles in the 8 frameworks against the inclusion of synthesized principles in the 31 best practice guides. The findings are visualized in Fig. 3. The synthesized principle that is most poorly represented in frameworks (n = 1) and best practices guides (n = 0) is Government infrastructure. On the other hand, the synthesized principle that is most prominently represented in frameworks (n = 6) and best practice guides (n = 17) is Responsibility and accountability.

Fig. 3: Synthesized principle alignment.

Chart displaying the number of guides vs. the number of frameworks covering a synthesized principle, with synthesized principles plotted in quadrants.

Discussion

Our analysis highlights the complex challenges that HDOs in the US face in navigating the rapidly evolving regulatory landscape of AI. As highlighted in Table 4, no two AI frameworks are the same: frameworks include different numbers and different sets of AI principles. HDOs will either need to aggregate and prioritize principles from government documents and published literature themselves or address the full breadth of principles covered across all the guidance. Fortunately, our analysis demonstrates that the many variations of raw principles presented in different frameworks can be distilled into 13 synthesized principles. Rather than addressing specific frameworks individually, HDOs can prioritize among the 13 synthesized principles to put into practice.

Our findings create a practical workflow for HDOs to adopt and implement HAIP best practice guides. For example, referencing Table 5, an HDO concerned with the Transparency and explainability of a specific AI product (a principle highlighted in many frameworks) may choose to reference the HAIP best practice guide ‘1.3 Identify potential downstream impacts’. This guide outlines strategies such as process mapping, conducting focus groups with affected parties, and creating standardized assessment rubrics that help the HDO thoroughly address that synthesized principle. Our findings indicate that by adopting the core set of 31 best practices across the AI product lifecycle, an HDO will address all 13 synthesized principles that appear in key regulatory and peer-reviewed AI frameworks. On one hand, this demonstrates the extent to which AI principles are already implicitly embedded within the operational practices of HDOs. On the other hand, it highlights the strong position HDOs are in, compared with other industries, to use AI safely, effectively, and equitably. Many of the synthesized principles, such as Data privacy, Transparency and explainability, and Responsibility and accountability, have rich legal and regulatory precedents in healthcare.

We do find several gaps between the content of the synthesized principles captured by key AI frameworks and the HAIP best practice guides. These gaps may provide insight into the differences between high-level AI principles and practical “on-the-ground” considerations. First, key AI frameworks do not cover the tail ends of the AI product lifecycle as thoroughly as the HAIP best practice guides do. This is most apparent in the content of the first two HAIP best practice guides, ‘1.1 Identify problems across the organization’ and ‘1.2 Prioritize problems’, which cover the beginning of the AI product lifecycle, as well as the last two HAIP best practice guides, ‘8.4 Minimize disruptions from decommissioning’ and ‘8.5 Disseminate information about updates to end users’, which cover the end of the AI product lifecycle. This content is a key focus of HAIP’s best practice guides yet is not addressed as thoroughly in the synthesized principles, as seen in Tables 5 and 6. This may highlight the need for AI frameworks to consider more rigorously whether AI solutions are being applied to appropriate problems and the conditions under which AI solutions are decommissioned and updated. Similarly, the considerations in the HAIP best practice guide ‘4.1 Design and test workflow for clinicians’ were not covered to the same extent in the synthesized principles. This finding could mean that AI frameworks are not capturing the importance of sociotechnical challenges that occur at the interface of human users and AI technologies.

Lastly, our findings highlight an opportunity for several synthesized principles to be more prominently featured in both AI frameworks and HAIP best practice guides. In Fig. 3, we find five synthesized principles that are featured in two or fewer AI frameworks and 10 or fewer HAIP best practice guides (visualized in the bottom left quadrant separated by the red lines). These synthesized principles are: Government infrastructure, Sustainability, Economic regulation, Beneficence, and Workforce considerations. Many of these poorly represented principles share a common theme: properly addressing them will require increased coordination between regulatory bodies and individual HDOs. For example, a single HDO has little control over government infrastructure or economic regulation; collaboration on a greater scale is required. We discuss considerations for each of these principles in turn.

First, there is an urgent need for government investment in infrastructure to improve the use of AI in healthcare. No HAIP best practice guide directly addresses this; however, there is a significant opportunity for the government to play a more prominent role in supporting the safe, effective, and equitable use of AI within HDOs. Many courses of action have been suggested in the literature; a prominently discussed strategy is for the FDA and other government bodies to regulate AI tools as “medical devices”13,14. In addition to central regulation, experts have suggested that local regulation will be necessary to account for differences in care, patients, and system performance15. Second, there is an urgent need for AI frameworks and HDOs to prioritize sustainability in the use of AI. HDOs are increasingly called upon to reduce greenhouse gas emissions, and decarbonization strategies can include the careful selection of AI technologies16. There is an opportunity for both HAIP and government agencies to develop best practices in this domain. Third, there is a gap in the economic regulation of AI in HDOs. This speaks to the urgent need for more mature reimbursement mechanisms to support the responsible use of AI in healthcare. Experts have proposed many models, including reimbursement tied to value and outcomes to prevent overuse of AI, advance market commitments and time-limited reimbursements for new products, and financial incentives for bias mitigation17,18. The literature also suggests that clearer regulation may incentivize innovation and increase reimbursement for developers of AI products13. Fourth, and perhaps most surprisingly, the use of AI can more prominently prioritize beneficence to positively impact people’s well-being. AI is increasingly framed as a solution to inefficiency through automation, but in healthcare and other industries its impact on people deserves more prominent consideration. Lastly, AI frameworks and HAIP best practice guides poorly capture workforce considerations. In contrast to the traditional model in which the medical professional is responsible for the tools they use, the burden of responsibility may shift toward the vendor as AI products become more advanced19. The expanding use of AI will also affect the labor force in a multitude of ways, replacing some tasks currently carried out by humans and creating new roles that require different skillsets20. Despite concerns about the future of work and the potential displacement of skilled labor, there is an opportunity to strengthen best practices to improve the experience of workers who will increasingly interact with AI.

Conclusion

As AI products become more deeply ingrained in healthcare, regulatory considerations expand with them. HDOs will be expected to comply with regulatory frameworks that are constantly evolving and difficult to translate into practice. HAIP bridges this gap by creating best practice guides for HDO leaders that translate regulatory principles into practice. This enables HDOs to align their AI governance efforts with regulatory priorities. This process is widely applicable and adaptable as new frameworks emerge. We hope that our analysis and findings serve as a blueprint for healthcare AI regulatory compliance as the field continues to mature.