Africa, the cradle of human genetic diversity and home to over 1.4 billion people, is uniquely positioned to lead in data science for health and innovation. The continent bears a disproportionate share of global health challenges, facing both infectious and non-communicable diseases1, now intensified by the cascading effects of climate change on public health, livelihoods, economies, agriculture, and biodiversity2,3. Data science presents a unique leapfrogging opportunity to bypass legacy systems and directly adopt or develop cutting-edge technologies to strengthen public health systems, drive scientific research, and build a burgeoning health innovation economy4.

This leapfrogging is already underway. In Uganda, mHealth platforms like mTrac are used for real-time disease surveillance, while in Cameroon, a data science social innovation, GiftedMom, is used in some settings to support maternal healthcare. Zimbabwe’s AfyaPap applies AI for disease prediction and clinical decision support, and Ghana’s minoHealth leverages cloud-based systems to enhance health record analytics and coordination of patient care. H3Africa, in collaboration with commercial actors like Illumina Inc., is using African genomic data to develop research tools tailored to African populations. Kenya’s AfyaRekod uses blockchain to provide secure health data management. Meanwhile, the South African start-up Zindi connects data scientists across 50 African countries to apply machine learning to real-world public health problems. Emerging programmes like Data Science for Health Discovery and Innovation in Africa (DS-I Africa), SickleInAfrica5, and AfriGen-D are in the process of building data infrastructures that integrate machine learning, ethics, national data protection legislation, and governance frameworks rooted in African worldviews6,7,8.

As Africa expands its data science capabilities, long-standing ethical concerns and structural inequities in relation to data sharing continue to define research practices on the continent. For example, much of the data collected in Africa fuels scientific and commercial progress elsewhere, sometimes without equitable recognition or benefits for African scientists and communities. To address this, African data science initiatives should adopt an ethical data culture that deliberately pays attention to fairness, inclusivity, harm prevention, and reciprocity across the data lifecycle. We propose a framework for data governance anchored in data justice and solidarity. The framework is accompanied by an operational roadmap that outlines actionable steps to embed these principles into data-intensive health research in Africa.

Toward a responsible data-sharing culture in African health research

There is growing recognition of the need to move beyond extractive, tokenistic, and charity-driven approaches in data-driven global health collaborations, as reflected in scientific ideologies such as ‘liberating the bottom billion’, ‘helping the unbanked [Africans]’ and ‘connecting the unconnected’9,10. For African data science to fulfil its promise of improving health outcomes and driving scientific innovation on the continent and in the global health space, data science initiatives must reimagine data governance so that it promotes the agency of African scientists over data generated from their projects, as well as benefits for their communities. One way to achieve this is for data science stakeholders to adopt an ethical data culture, which we define as responsible, fair, and equitable practices and attitudes regarding how data is collected, shared, accessed, and reused in scientific research and innovation.

We propose a four-pillar framework for building an ethical data culture (Fig. 1). The pillars are grounded in the intersecting principles of data justice and, to a lesser extent, data solidarity. Data justice refers to shared practices of actively listening to, and prioritising, the concerns of vulnerable and marginalised populations11. It encourages fairness, non-discrimination, inclusivity, visibility, and mutual accountability in collecting, analysing, and using data for human flourishing12. A related concept, data solidarity, builds on this by emphasising collective moral responsibility, ethical sharing, and equitable return of benefits to communities13. Our proposed framework is not a technical protocol or regulatory checklist, but a values-based foundation and a practical roadmap for researchers, institutions, funders, and communities navigating ethical, social, and political dilemmas in data-intensive science. It is designed to complement policy and legal instruments such as the AU Data Policy Framework and national data protection laws.

Fig. 1: Framework for integrating a novel data sharing culture in health research initiatives in Africa.

Conceptual framework for integrating a data-sharing culture into health research initiatives in Africa. It is structured around two normative pillars (green): Data Justice and Data Solidarity, which should guide the development of responsible and equitable data-sharing practices. These pillars can be operationalised through four interrelated domains, each associated with specific principles and implementation strategies.

Pillar 1: a culture of fairness and equity

This pillar upholds the need to redress historical and ongoing imbalances in global health research. African researchers often face pressure to share data prematurely, before they have had adequate opportunity to analyse it, publish, or secure recognition. A culture of fairness requires establishing norms and mechanisms that protect the scientific freedom of African scientists to lead research, determine data-sharing timelines, and co-develop global data-sharing collaborations on equal terms. Equity also demands the recognition of African intellectual contributions and appropriate credit through authorship, co-ownership of intellectual property, and participation in downstream translation of data science research.

Pillar 2: a culture of inclusivity and visibility

Ethical data practices must ensure that the voices and concerns of data subjects, especially from underrepresented or historically marginalised communities, are visible in how data is shared, governed, and reused. This includes participatory mechanisms for consent, community engagement in decision-making, and structures that make data flows transparent and accountable. Inclusivity goes beyond representation; it means building research agendas around the lived realities of communities and ensuring they are meaningfully involved in how their data informs science, policy, and health innovation.

Pillar 3: a culture of harm prevention and oversight

Data misuse, unauthorised access, and exploitation are risks that persist in underregulated and highly networked data environments. A responsible data culture must embed ethical, legal, and technical safeguards to protect data providers, communities, and researchers alike from harm. This includes putting in place security protocols and identifying mechanisms to ensure compliance with personal data protection legislation, such as trusted research environments that allow data to be shared without loss of local control6,14. Research ethics and data oversight bodies should also be equipped to monitor compliance, enforce legal and ethical standards, and ensure accountability across research partnerships.

Pillar 4: a culture of reciprocity

A just data culture must actively return value to the communities whose data enables research. This includes formalised benefit-sharing agreements, local capacity building, and translation of research findings into tangible health, social, or economic benefits. Reciprocity is also about closing the gap between data collection and real-world impact by ensuring that the contributions of African communities do not end with data extraction but are reflected in the outcomes and accessibility of the innovations their data makes possible. Strategic legal and policy tools such as data licences, custodianship models, and public-good IP frameworks can help ensure that benefits circulate within local systems15.

A roadmap for advancing an ethical data culture in Africa

Translating the principles of fairness, inclusivity, harm prevention, and reciprocity into practice requires more than ethical aspirations. It demands institutional action, policy and legal frameworks, technical mechanisms for data sharing, and investment in data governance infrastructure that centres African autonomy and public value. Drawing from our experiences, we propose the following operational strategies to help embed an ethical data culture into African health research systems.

Promoting data sovereignty and accountability

Research data is the lifeblood of scientific progress, central to publications, career advancement, funding acquisition, and academic visibility. In African health research contexts, the understandable hesitation amongst scientists and biotech or health innovation firms to share data is informed not only by professional considerations but also by long-standing experiences of extractive research practices, where African-generated data has been used without appropriate attribution, oversight, or benefit-sharing. Despite this, African researchers have consistently shown a willingness to share data, provided that ethical safeguards and accountable governance mechanisms are in place to prevent misuse, ensure fairness, and return benefits to contributing communities16.

To address these legitimate concerns around extractive health research practices, data science initiatives in Africa must transition from trust-based models that rely solely on goodwill to systems rooted in accountability, transparency, scientific freedom, and equity. One promising pathway is the development and adoption of Controlled Access Data Environments (CADEs), which are secure digital platforms that enable researchers to analyse datasets collaboratively without requiring data to leave their country or institution of origin. CADEs can help ensure compliance with legislation that emphasises data sovereignty. They also empower researchers to retain custodianship over data while engaging in global collaborations on terms that are beneficial to all.

The CADE model is not theoretical; it is feasible if there is a will amongst all stakeholders. The H3Africa initiative, for example, is implementing a semi-controlled access model through its Data and Biospecimen Access Committee17. This body oversees the ethical use of genomic and health data by evaluating requests based on relevance, collaborative equity, and alignment with African health priorities. Although not a CADE in the technical sense, the committee mirrors a core CADE principle: ensuring that data remains within the control and oversight of African institutions. Furthermore, the DS-I Africa eLwazi project is designing next-generation data infrastructures and computing environments that would allow researchers to discover, access, and analyse data across multiple storage locations and to visualise the results without downloading the data.
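The core idea of a controlled access environment, analysis without data leaving its custodian, can be illustrated with a minimal Python sketch. It is not a description of how H3Africa, eLwazi, or any existing platform is implemented. The class name, the approval step, the audit log, and the small-group suppression threshold (`MIN_CELL_SIZE`) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Optional, Set

# Hypothetical disclosure threshold: results computed from fewer records
# than this are withheld. The value is an assumption, not from the article.
MIN_CELL_SIZE = 5


@dataclass
class ControlledAccessEnvironment:
    """Keeps records with the custodian and releases only aggregate results."""

    custodian: str
    _records: List[Dict[str, float]] = field(default_factory=list, repr=False)
    _approved: Set[str] = field(default_factory=set, repr=False)
    audit_log: List[str] = field(default_factory=list)

    def approve(self, researcher: str) -> None:
        """Record that an access committee has approved this researcher."""
        self._approved.add(researcher)

    def mean_of(self, researcher: str, column: str) -> Optional[float]:
        """Return the mean of one column, never the underlying rows."""
        if researcher not in self._approved:
            self.audit_log.append(f"DENIED {researcher} mean({column})")
            raise PermissionError(f"{researcher} has no approved access")
        values = [row[column] for row in self._records if column in row]
        if len(values) < MIN_CELL_SIZE:
            # Too few records: withhold the result to limit re-identification.
            self.audit_log.append(f"SUPPRESSED {researcher} mean({column})")
            return None
        self.audit_log.append(f"RELEASED {researcher} mean({column})")
        return mean(values)


if __name__ == "__main__":
    rows = [{"haemoglobin": v} for v in (10.0, 11.0, 12.0, 13.0, 14.0)]
    cade = ControlledAccessEnvironment("Hypothetical Institute", rows)
    cade.approve("external_researcher")
    print(cade.mean_of("external_researcher", "haemoglobin"))
    print(cade.audit_log)
```

In this sketch the external researcher only ever receives a summary statistic, and every request, whether released, suppressed, or denied, leaves a trace that an oversight body could review.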

From paternalistic to community-centred data governance

An ethical data culture requires that community agency is not a one-time checkbox, but a sustained and responsive process. To this end, integrating dynamic consent mechanisms could help ensure that data subjects have a voice and retain control over how their data is used throughout the research lifecycle. This could be achieved by using widely accessible digital technologies, such as SMS and mobile apps, to enable data subjects to update their consent preferences over time, as their concerns and expectations evolve. However, digital tools alone will be insufficient in African contexts because access to technology and levels of digital literacy are uneven. Therefore, dynamic consent must be complemented with offline channels such as printed newsletters, radio programming, or community meetings to ensure wider reach. For this to be possible, public engagement should be funded as a central component of data science initiatives, and not relegated to outreach or auxiliary roles. Embedding these relational practices in data science initiatives18 would strengthen respect for communities, accountability, and shared decision making, all of which are important to sustaining the core pillars of an ethical data culture.
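One way to picture dynamic consent is as an append-only record of decisions in which the most recently recorded decision for a given purpose applies, and the absence of any decision means no consent. The Python sketch below assumes exactly that model. The class and field names, the purpose labels, and the channel labels (for example `"sms"` or `"community_meeting"`) are hypothetical and are not drawn from any existing consent platform.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class ConsentEvent:
    """One consent decision, captured through an online or offline channel."""

    participant_id: str
    purpose: str          # e.g. "genomic_research" (hypothetical label)
    granted: bool
    channel: str          # e.g. "sms", "mobile_app", "community_meeting"
    recorded_at: datetime


class DynamicConsentLedger:
    """Append-only log: earlier decisions are kept, later ones take effect."""

    def __init__(self) -> None:
        self._events: List[ConsentEvent] = []

    def record(self, participant_id: str, purpose: str, granted: bool,
               channel: str,
               recorded_at: Optional[datetime] = None) -> ConsentEvent:
        event = ConsentEvent(
            participant_id, purpose, granted, channel,
            recorded_at or datetime.now(timezone.utc),
        )
        self._events.append(event)
        return event

    def is_permitted(self, participant_id: str, purpose: str) -> bool:
        """Apply the most recently recorded decision; default to no consent."""
        for event in reversed(self._events):
            if (event.participant_id == participant_id
                    and event.purpose == purpose):
                return event.granted
        return False

    def history(self, participant_id: str) -> List[ConsentEvent]:
        """Full decision trail for one participant, oldest first."""
        return [e for e in self._events if e.participant_id == participant_id]


if __name__ == "__main__":
    ledger = DynamicConsentLedger()
    ledger.record("P001", "genomic_research", True, "sms")
    # The participant later withdraws at a community meeting; a staff member
    # enters the decision so that the offline channel is treated equally.
    ledger.record("P001", "genomic_research", False, "community_meeting")
    print(ledger.is_permitted("P001", "genomic_research"))
```

Recording the channel alongside each decision keeps offline routes such as community meetings on the same footing as SMS or app updates, and the retained history supports later accountability.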

Reciprocity: benefit-sharing, commercialisation and licensing

Too often, African research participants and their communities remain disconnected from the innovations their data drives, whether these are commercial products, clinical interventions, or academic outputs. This could eventually undermine public trust in science. A central pillar of an ethical data culture is reciprocity, which is the principle that communities that contribute data must share in the benefits their data enables. This principle can be operationalised through formal benefit-sharing agreements and legally enforceable mechanisms that ensure research translates into health, social, and economic value for scientists and communities that provided data. To achieve this, benefit-sharing must become a component of funded projects, ethics review processes, and institutional policies. Benefit-sharing agreements can include access to interventions or diagnostics developed from the research; capacity-building investments, such as training, data infrastructure, and institutional strengthening; co-authorship and recognition in academic publications and patent applications; and revenue-sharing mechanisms where commercialisation is pursued.

Another pathway for operationalising a culture of reciprocity is through ethical commercialisation and data licensing. When guided by equitable principles, commercialisation can empower African institutions to transition from the role of data suppliers to full participants in the health innovation economy19. African research institutions must establish data licensing frameworks that govern how data is reused, by whom, and under what conditions. These licences should define permissible uses, mandate appropriate attribution, and include enforceable obligations for benefit-sharing or reinvestment into the health systems that contribute data. This could be done by engaging legal and innovation offices within universities and research institutes to ensure data use aligns with institutional and public interests. In parallel, strong intellectual property protections are necessary to ensure that African researchers are not merely data providers, but also co-creators of innovation, with legal rights to patent, license, and commercialise outputs derived from their data. Furthermore, there is a need to develop data custodianship models, rather than narrow conceptions of data ownership, to manage data stewardship in ways that appropriately balance the interests of researchers, institutions, funders, and data subjects.
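The three licence elements named above (permissible uses, attribution, and benefit-sharing obligations) could in principle be expressed in machine-readable form so that reuse requests are screened consistently before legal review. The following Python sketch is illustrative only. The `DataLicence` and `ReuseRequest` structures, their field names, and the use labels are hypothetical, and such a check would complement rather than replace a legally drafted agreement.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class DataLicence:
    """Hypothetical machine-readable summary of a data licence."""

    dataset: str
    custodian: str
    permitted_uses: FrozenSet[str]
    attribution_required: bool = True
    benefit_sharing_required: bool = True


@dataclass(frozen=True)
class ReuseRequest:
    """Hypothetical request submitted by a prospective data user."""

    requester: str
    intended_use: str
    will_attribute: bool
    benefit_sharing_plan: str = ""


def check_request(licence: DataLicence, request: ReuseRequest) -> List[str]:
    """Return the licence conditions the request fails; empty means none."""
    problems: List[str] = []
    if request.intended_use not in licence.permitted_uses:
        problems.append(f"use '{request.intended_use}' is not permitted")
    if licence.attribution_required and not request.will_attribute:
        problems.append("attribution to the custodian is required")
    if (licence.benefit_sharing_required
            and not request.benefit_sharing_plan.strip()):
        problems.append("a benefit-sharing plan is required")
    return problems


if __name__ == "__main__":
    licence = DataLicence(
        dataset="hypothetical_cohort_v1",
        custodian="Hypothetical University",
        permitted_uses=frozenset({"academic_research", "public_health"}),
    )
    request = ReuseRequest(
        requester="Example Biotech",
        intended_use="commercial_product",
        will_attribute=True,
    )
    for problem in check_request(licence, request):
        print(problem)
```

Returning the full list of unmet conditions, rather than a single yes or no, gives an access committee or innovation office a concrete basis for negotiating revised terms with the requester.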

Conclusion

As Africa becomes an increasingly vital contributor to global health data, the continent must chart a governance path that resists extractive paradigms and centres African agency, autonomy, and innovation. This requires reimagining data sharing through frameworks that promote fairness, inclusivity, harm prevention, and reciprocity. By investing in controlled access infrastructures, dynamic consent mechanisms, benefit-sharing agreements, and locally appropriate licensing models, African institutions can steward data in ways that empower communities and catalyse local health and biotech ecosystems. Fundamentally, ethical data governance must safeguard scientific freedom, so that African researchers have the autonomy to define research priorities and control how their data is used and commercialised. We have also provided a roadmap that offers a practical and principled foundation for realising these goals and ensuring that data science for health and innovation in Africa is responsive to the lived realities, aspirations, and knowledge systems of African scientists and communities. The framework and roadmap are intended to supplement and not replace existing legal, technical, and institutional efforts.

Reflexivity statement

This framework for ethical data culture in African health research is informed by our collective experiences as African researchers who work across bioethics, data science, the humanities, and biomedical sciences, and who are actively engaged in multi-country, externally funded, data-intensive projects. Our perspectives are shaped by our positions within relatively well-resourced research networks that are not equally accessible to all African scientists. Nonetheless, our views are also informed by our empirical work and collaborations with researchers in differently resourced institutions, who have consistently voiced the need for greater control over African-generated data. We also recognise our limited exposure to commercial data pipelines and private-sector innovation ecosystems in Africa. Therefore, our views may not represent the full spectrum of experiences of data sharing on the continent. However, the framework we propose is offered as a small contribution to a broader and necessary dialogue on advancing ethical and socially responsive data practices in African health research. Overall, we view our current framing of data justice as a potential prerequisite that may play a crucial role in, and influence, the development and functioning of legal and economic structures for the governance of health data sharing.