
Comment

  • A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles was proposed in 2016 as a prerequisite for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply, at a high level, to other digital assets, and over time the FAIR guiding principles have been re-interpreted and extended to include the software, tools, algorithms, and workflows that produce data. FAIR principles are now being adapted to the context of AI models and datasets. Here, we present the perspectives, vision, and experiences of researchers from different countries, disciplines, and backgrounds who are leading the definition and adoption of FAIR principles in their communities of practice, and we discuss the outcomes that may result from pursuing and incentivizing FAIR AI research. The material for this report builds on the FAIR for AI Workshop held at Argonne National Laboratory on June 7, 2022.

    • E. A. Huerta
    • Ben Blaiszik
    • Ruike Zhu
    Comment | Open Access
  • The Minimum Information for High Content Screening Microscopy Experiments (MIHCSME) is a metadata model and reusable tabular template for sharing and integrating high content imaging data. It has been developed by combining the ISA (Investigations, Studies, Assays) metadata standard with a semantically enriched instantiation of REMBI (Recommended Metadata for Biological Images). The tabular template provides an easy-to-use, practical implementation of REMBI specifically for High Content Screening (HCS) data; a hypothetical sketch of such a tabular record is given after this list. In addition, ISA compliance enables broader integration with other types of experimental data, paving the way for visual omics and multi-omics integration. We show the utility of MIHCSME for HCS data using multiple examples from the Leiden FAIR Cell Observatory, a Euro-Bioimaging flagship node for high content screening and the pilot node for implementing Findable, Accessible, Interoperable and Reusable (FAIR) bioimaging data throughout the Netherlands Bioimaging network.

    • Rohola Hosseini
    • Matthijs Vlasveld
    • Katherine J. Wolstencroft
    Comment | Open Access
  • Medical real-world data stored in clinical systems represents a valuable knowledge source for medical research, but its use is still hampered by a range of technical and cultural challenges. Analyzing these challenges and suggesting measures for improvement are crucial steps toward making this resource more usable. This comment presents such an analysis from the research perspective.

    • Julia Gehrmann
    • Edit Herczog
    • Oya Beyan
    Comment | Open Access
  • A data commons is a cloud-based data platform with a governance structure that allows a community to manage, analyze, and share its data. Data commons give a research community the ability to manage and analyze large datasets using the elastic scalability of cloud computing and to share data securely and compliantly, thereby accelerating the pace of research. Over the past decade, a number of data commons have been developed, and we discuss some of the lessons learned from this effort.

    • Robert L. Grossman
    Comment | Open Access
  • With the increased availability of disaggregated conflict event data for analysis, there are new and old concerns about bias. All data have biases, which we define as an inclination, prejudice, or directionality to information. In conflict data, there are often perceptions of damaging bias, and skepticism can emanate from several areas, including concerns about whether data collection procedures create systematic omissions, inflations, or misrepresentations. As curators and analysts of large, popular data projects, we are uniquely aware of the biases that are present when collecting and using event data. We contend that it is necessary to advance an open and honest discussion about the responsibilities of all stakeholders in the data ecosystem – collectors, researchers, and those interpreting and applying findings – to reflect thoughtfully and transparently on those biases, use data in good faith, and acknowledge limitations. We therefore propose an agenda for data responsibility that spans both data collection and its critical interpretation.

    • Erin Miller
    • Roudabeh Kishi
    • Caitriona Dowd
    Comment | Open Access
  • The Multidisciplinary drifting Observatory for the Study of Arctic Climate (MOSAiC) is a multinational interdisciplinary endeavor of a large earth system sciences community.

    • Stephan Frickenhaus
    • Daniela Ransby
    • Marcel Nicolaus
    Comment | Open Access
  • The biomedical research community is investing heavily in biomedical cloud platforms. Cloud computing holds great promise for addressing challenges with big data and ensuring reproducibility in biology. However, despite their advantages, cloud platforms in and of themselves do not automatically support FAIRness. The global push to develop biomedical cloud platforms has led to new challenges, including platform lock-in, difficulty integrating across platforms, and duplicated effort for both users and developers. Here, we argue that these difficulties are systemic and emerge from incentives that encourage development effort on self-sufficient platforms and data repositories instead of interoperable microservices. We argue that many of these issues would be alleviated by prioritizing microservices and access to modular data in smaller chunks or summarized form (a minimal sketch of such a data-summary microservice is given after this list). We propose that emphasizing modularity and interoperability would lead to a more powerful Unix-like ecosystem of web services for biomedical analysis and data retrieval. We challenge funders, developers, and researchers to support a vision to improve interoperability through microservices as the next generation of cloud-based bioinformatics.

    • Nathan C. Sheffield
    • Vivien R. Bonazzi
    • Andrew D. Yates
    Comment | Open Access
  • In response to COVID-19, governments worldwide are implementing public health and social measures (PHSM) that substantially impact many areas beyond public health. The new field of PHSM data science collects, structures, and disseminates data on PHSM; here, we report the main achievements, challenges, and focus areas of this novel field of research.

    • Cindy Cheng
    • Amélie Desvars-Larrive
    • Sophia Alison Zweig
    Comment | Open Access
  • Digital services such as repositories and science gateways have become key resources for the neuroscience community, but users often struggle to orient themselves in the service landscape and find the best fit for their particular needs. INCF has developed a set of recommendations and associated criteria, aimed at the neuroscience community and written from a FAIR neuroscience perspective, for choosing, or setting up and running, a repository or science gateway.

    • Malin Sandström
    • Mathew Abrams
    • Wojtek J. Goscinski
    Comment | Open Access
  • The Brain Imaging Data Structure (BIDS) is a standard for organizing and describing neuroimaging datasets, serving not only to facilitate the process of data sharing and aggregation, but also to simplify the application and development of new methods and software for working with neuroimaging data. Here, we present an extension of BIDS to include positron emission tomography (PET) data, also known as PET-BIDS, and share several open-access datasets curated following PET-BIDS, along with tools for conversion, validation, and analysis of PET-BIDS datasets (an illustrative sketch of a minimal PET-BIDS-style layout is given after this list).

    • Martin Norgaard
    • Granville J. Matheson
    • Melanie Ganz
    Comment | Open Access
  • Measuring and monitoring non-pharmaceutical interventions is important yet challenging due to the need to clearly define and encode non-pharmaceutical interventions, to collect geographically and socially representative data, and to accurately document the timing at which interventions are initiated and changed. These challenges highlight the importance of integrating and triangulating across multiple databases and the need to expand and fund the mandate for public health organizations to track interventions systematically.

    • Yannan Shen
    • Guido Powell
    • David L. Buckeridge
    Comment | Open Access
  • As big data, open data, and open science advance to increase access to complex and large datasets for innovation, discovery, and decision-making, Indigenous Peoples’ rights to control and access their data within these data environments remain limited. Operationalizing the FAIR Principles for scientific data with the CARE Principles for Indigenous Data Governance enhances machine actionability and brings people and purpose to the fore to resolve Indigenous Peoples’ rights to and interests in their data across the data lifecycle.

    • Stephanie Russo Carroll
    • Edit Herczog
    • Shelley Stall
    Comment | Open Access
  • Open access to global forest data, especially ground-measured (in situ) records, is critical for saving the world’s forest systems. Integrated approaches to achieve sustainable data openness will involve legal assurances, shared ethics, innovative funding schemes and capacity development.

    • Jingjing Liang
    • Javier G. P. Gamarra
    Comment | Open Access
  • Development of world-class artificial intelligence (AI) for medical imaging requires access to massive amounts of training data from clinical sources, but effective data sharing is often hindered by uncertainty regarding data protection. We describe an initiative to reduce this uncertainty through a policy describing a national community consensus on sound data sharing practices.

    • Joel Hedlund
    • Anders Eklund
    • Claes Lundström
    Comment | Open Access
  • “Speak to the past and it shall teach thee”. I first read those words on a dedication tablet within the John Carter Brown library at Brown University where I was a graduate student. Little did I know the phrase would accurately describe the next three and a half decades of my career. Paleoclimate data are the language we use to look into the past to understand ourselves and ultimately our future.

    • Harry Dowsett
    Comment | Open Access
  • An efficient response to the pandemic through the mobilization of the larger scientific community is challenged by the limited reusability of the available primary genomic data. Here, the Genomic Standards Consortium board highlights the essential need for contextual genomic data FAIRness to empower key data-driven biological questions.

    • Lynn M. Schriml
    • Maria Chuvochina
    • Ramona Walls
    Comment | Open Access
  • As information and communication technology has become pervasive in our society, we are increasingly dependent on both digital data and repositories that provide access to and enable the use of such resources. Repositories must earn the trust of the communities they intend to serve and demonstrate that they are reliable and capable of appropriately managing the data they hold.

    • Dawei Lin
    • Jonathan Crabtree
    • John Westbrook
    Comment | Open Access
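To make the MIHCSME item above more concrete, the following is a hedged sketch of what a "one row per imaging assay", ISA-style tabular metadata record for a High Content Screening experiment could look like. All column names and values below are hypothetical placeholders chosen for illustration; the authoritative field list and controlled vocabularies are defined by the MIHCSME template itself.

```python
# Minimal, hypothetical sketch of an ISA-style tabular metadata record for a
# High Content Screening (HCS) experiment, loosely inspired by the MIHCSME /
# REMBI idea described above. Field names are illustrative, not the real template.
import pandas as pd

hcs_metadata = pd.DataFrame([{
    # ISA-like study context (hypothetical column names)
    "Investigation Identifier": "INV-0001",
    "Study Identifier": "STU-0001",
    "Assay Identifier": "ASY-0001",
    # Sample / biology (REMBI-inspired, illustrative only)
    "Organism": "Homo sapiens",
    "Cell Line": "HeLa",
    "Perturbation": "compound X, 10 uM, 24 h",
    # Image acquisition (illustrative only)
    "Imaging Method": "spinning disk confocal",
    "Objective Magnification": "20x",
    "Channels": "DAPI;GFP",
    "Plate": "plate_01",
    "Well": "B03",
}])

# Persisting the record as a tab-separated file keeps it human-readable and
# easy to integrate with other ISA-formatted experimental metadata.
hcs_metadata.to_csv("hcs_metadata.tsv", sep="\t", index=False)
print(hcs_metadata.T)
```

The tab-separated layout is only meant to convey the flavour of a reusable tabular template; rows for additional wells or plates would simply be appended with the same columns.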
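As a rough illustration of the microservice-oriented approach argued for in the biomedical cloud platforms item above, the sketch below exposes a single, modular endpoint that returns a summarized slice of a dataset rather than the full data. It uses only the Python standard library; the endpoint path, the in-memory dataset, and the summary statistics are hypothetical stand-ins, not any particular platform's API.

```python
# Hedged sketch of a small, single-purpose web service: one endpoint that
# returns a summarized chunk of data, composable with other such services.
# The dataset and URL scheme are illustrative placeholders.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for a large dataset; a real service would query a backing store.
EXPRESSION = {"geneA": [1.0, 2.0, 3.0], "geneB": [4.0, 5.0, 6.0]}

class SummaryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # e.g. GET /summary/geneA -> {"gene": "geneA", "mean": 2.0, "n": 3}
        parts = self.path.strip("/").split("/")
        if len(parts) == 2 and parts[0] == "summary" and parts[1] in EXPRESSION:
            values = EXPRESSION[parts[1]]
            body = json.dumps({"gene": parts[1],
                               "mean": sum(values) / len(values),
                               "n": len(values)}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404, "unknown resource")

if __name__ == "__main__":
    # Serve on localhost:8000; other services can consume this endpoint
    # without depending on the rest of any platform.
    HTTPServer(("localhost", 8000), SummaryHandler).serve_forever()
```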
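Similarly, to illustrate the PET-BIDS item above, here is a hedged sketch of a minimal BIDS-style directory for a PET acquisition, generated with Python's standard library. The general sub-<label>/pet/ layout follows the BIDS pattern, but the subject label, sidecar keys, and values shown are illustrative assumptions; the normative filenames, required metadata fields, and validation rules are defined by the BIDS/PET-BIDS specification and its validator tools.

```python
# Hedged sketch of a minimal BIDS-style layout for a PET dataset, in the
# spirit of the PET-BIDS extension described above. Sidecar keys and values
# are illustrative placeholders, not the normative specification.
import json
from pathlib import Path

root = Path("my_pet_dataset")
pet_dir = root / "sub-01" / "pet"
pet_dir.mkdir(parents=True, exist_ok=True)

# Top-level dataset description (a BIDS requirement; fields shown are a subset).
(root / "dataset_description.json").write_text(json.dumps({
    "Name": "Example PET dataset",
    "BIDSVersion": "1.7.0",
}, indent=2))

# Placeholder image file plus a JSON sidecar with illustrative PET metadata.
(pet_dir / "sub-01_pet.nii.gz").touch()
(pet_dir / "sub-01_pet.json").write_text(json.dumps({
    "Manufacturer": "ExampleVendor",    # illustrative value
    "TracerName": "ExampleTracer",      # illustrative value
    "InjectedRadioactivity": 185.0,     # MBq, illustrative value
    "Units": "Bq/mL",
}, indent=2))

# Print the resulting layout for inspection.
for path in sorted(root.rglob("*")):
    print(path)
```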
