A new class of AI models, called foundation models, has entered healthcare. Foundation models violate several basic principles of the standard machine learning paradigm for assessing reliability, making it necessary to rethink what guarantees are required to establish warranted trust in them.
Ruijiang Li et al. assess the reproducibility of a variational graph encoder-based framework and examine its reusability for chemical toxicity prediction. They explore how a generalist model can function as a specialist model with adaptation.
Schmidgall et al. describe a pathway for building general-purpose machine learning models for robot-assisted surgery, including mechanisms for avoiding risk and handing over control to surgeons, and improving safety and outcomes beyond demonstration data.
Automating the image analysis process for oncologic whole-body positron emission tomography–computed tomography data is a key area of interest. Gatidis et al. describe the autoPET 2022 challenge, an international competition focused on the segmentation of metabolically active tumour lesions, aiming to advance techniques in the field.
A transformer-based approach called Translatomer is presented, which models cell-type-specific translation from messenger RNA expression and gene sequence, bridging the gap between messenger RNA and protein levels as well as providing a mechanistic insight into the genetic regulation of translation.
Accurate prediction of T cell receptor (TCR)–antigen recognition remains a challenge. Zhang et al. propose a contrastive transfer learning model to predict TCR–pMHC binding that enables interpretable analyses of epitope-specific T cells and can decipher residue-level interactions.
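Contrastive transfer learning of this kind typically embeds TCRs and epitopes in a shared space and trains matched pairs to be closer than mismatched ones. The sketch below is a generic InfoNCE-style contrastive loss in NumPy for illustration only; the function and variable names are hypothetical and do not reflect the authors' model or API.

```python
import numpy as np

def info_nce(tcr_emb, epitope_emb, temperature=0.1):
    # Generic contrastive (InfoNCE) loss: matched TCR/epitope pairs
    # (the diagonal of the similarity matrix) are pulled together,
    # mismatched in-batch pairs are pushed apart.
    a = tcr_emb / np.linalg.norm(tcr_emb, axis=1, keepdims=True)
    b = epitope_emb / np.linalg.norm(epitope_emb, axis=1, keepdims=True)
    logits = a @ b.T / temperature                 # scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # diagonal = matched pairs

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 32))                       # toy embeddings
aligned = info_nce(z, z)                           # perfectly matched pairs
mismatched = info_nce(z, rng.normal(size=(8, 32))) # unrelated pairs
print(aligned, mismatched)
```

With matched pairs the loss is near zero, while unrelated embeddings give a loss near log(batch size); interpretability analyses can then probe which residues drive the learned similarities.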
Distinguishing between real and fabricated facts has long been a societal challenge. As the Internet becomes increasingly littered with AI-generated content, the need for curation and safeguarding of high-quality data and information is more crucial than ever.
Designing molecules in drug design is challenging as it requires optimizing multiple, potentially competing qualities. Wu and colleagues present a prompt-based molecule optimization method that can be trained from single-property data.
Neural-network-based solvers for partial differential equations (PDEs) have difficulty capturing high-frequency modes when learning complex functions, whereas classical solvers damp high-frequency modes quickly but converge slowly on low-frequency ones. Zhang and colleagues propose a hybrid numerical PDE solver that combines a Deep Operator Network with traditional relaxation methods, leading to balanced convergence across the eigenmode spectrum for a wide range of PDEs.
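The complementary frequency behaviour that motivates this hybrid can be seen directly in a classical smoother. The sketch below, a minimal illustration rather than the authors' method, applies weighted Jacobi relaxation to a 1D Laplace problem with zero boundary conditions (exact solution zero): an oscillatory initial error is damped almost completely within a few dozen sweeps, while a smooth error barely decays, which is the gap a neural operator correction step is meant to fill.

```python
import numpy as np

def weighted_jacobi(u, sweeps, w=2/3):
    # Weighted Jacobi relaxation for -u'' = 0 with u(0) = u(1) = 0.
    # The right-hand side is evaluated before assignment, so this is
    # a proper Jacobi (not Gauss-Seidel) sweep.
    for _ in range(sweeps):
        u[1:-1] = (1 - w) * u[1:-1] + w * 0.5 * (u[:-2] + u[2:])
    return u

n = 65
x = np.linspace(0, 1, n)
low = weighted_jacobi(np.sin(1 * np.pi * x), sweeps=50)    # smooth error mode
high = weighted_jacobi(np.sin(48 * np.pi * x), sweeps=50)  # oscillatory error mode
print(np.abs(low).max(), np.abs(high).max())
```

After 50 sweeps the oscillatory mode is reduced to round-off levels while the smooth mode retains most of its amplitude; a hybrid scheme periodically replaces relaxation with a learned operator update to remove the slow low-frequency component.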
Genome-wide association studies generate extensive data, but interpreting these data remains challenging. A Bayesian-network-based method is presented that uses imputed and raw gene expression data to decipher the causal effects of individual genes.
Metagenome-assembled genomes (MAGs) provide insights into microbial dark matter, but contamination remains a concern for downstream analysis. Zou et al. develop a multi-modal deep language model that leverages microbial sequences to remove ‘unexpected’ contigs from MAGs. This approach is compatible with any contig binning tools and increases the number of high-quality bins.
Machine learning often includes secondary objectives, such as sparsity or robustness. To reach these objectives efficiently, the training of a neural network has been interpreted as the exploration of functionally invariant paths in the parameter space.
Walking efficiency declines in older adults. To address this challenge, Tricomi and colleagues present a pair of lightweight, soft robotic shorts that enhance walking efficiency for older adults by assisting leg mobility. This method improves energy efficiency on outdoor tracks while maintaining the users’ natural movement control.
In this Reusability Report, Heirman and Bittremieux evaluate MIST, a tool for annotating small-molecule mass spectrometry data, focusing on reproducibility and generalizability. They call for community efforts in benchmarking, transparency and data sharing to advance metabolomics research.
Predicting TCR–antigen–human leucocyte antigen binding opens the door to neoantigen identification. In this study, a physics-inspired sliding transformer (PISTE) system is used to guide the positioning of amino acid residues along the gradient field of their interactions, boosting binding prediction accuracy.
Forecasting epidemic progression is a complex task influenced by various factors, including human behaviour, pathogen dynamics and environmental conditions. Rodríguez, Kamarthi and colleagues provide a review of machine learning methods for epidemic forecasting from a data-centric computational perspective.
Stimulated emission depletion microscopy is a super-resolution imaging technique that uses point scanning in fluorescence microscopy. pySTED is developed to aid the design and benchmarking of optical microscopy experiments, and is tested in both synthetic and real settings.
A systematic review of machine learning approaches to solve partial differential equations related to fluid dynamics highlights concerns about reproducibility and indicates that studies in this area have reached overly optimistic conclusions.