Mechano-sensing of extracellular matrix (ECM) biochemical and biophysical properties dictates various cellular behaviors including proliferation, migration, and differentiation1,2,3. One major mechanism whereby ECM mechano-sensing proceeds is through the formation of adhesion complexes, which are multi-protein structures with ECM linking, ECM property sensing, and downstream signaling functions4,5. Adhesion complexes and most other biological systems comprise a huge number of components with complex interactions. Because of technological advancements over the past two decades, a large number of affordable tools are now available that can help unravel these complex interactions and reveal biomechanical insights into mechanisms, leading to the discovery of targets for new drugs. These tools have yielded vast RNA- and protein-level information, providing a broader and more accurate picture of these complex systems. However, there is a need for better analysis tools that can extract useful information from these huge datasets and identify signaling connections that are difficult to distinguish without computational methods. In this regard, artificial intelligence approaches may enable several transformative advances in research, such as advanced data analysis (e.g., live-cell imaging, cellular pattern recognition), predictive modeling (e.g., mechanical properties of tissue and cells), automation of experimental procedures (e.g., atomic force microscopy, tweezers), integration of multi-scale data (from cells to tissues), or new insights into mechanotransduction pathways (e.g., gene networks, protein interactions, and cellular pathways). Recently, machine learning (a subfield of artificial intelligence) and several artificial intelligence-based methods have been used to analyze and extract useful information from large datasets6,7,8.
In this review, we highlight approaches that have either been employed in the mechanobiology field or could potentially be applied to raw images, amino acid sequences, and other raw measurements to extract and predict biologically meaningful quantities.

Forces experienced by cells

Cells inside tissues exist within a complex and highly organized microenvironment, the ECM, which is primarily comprised of proteins such as collagens, fibronectin, and elastin, glycoproteins such as proteoglycans, laminin and glycosaminoglycans3,9,10. The specific composition and arrangement of this ECM determines the stiffness and solid stresses that cells experience (Fig. 1a). These physical cues include both endogenous forces, primarily resulting from cytoskeletal contractility within the cells, and exogenous forces from the surrounding microenvironment, including gravity, shear stress, and tensile and compressive forces11,12. Mechano-sensing of these external physical forces leads to morphological changes and downstream signaling events13,14. Various ECM proteins also alter their structure and interactions depending on extrinsic forces: ECM sensitivity to tensile forces leads to fibronectin unfolding15, the enzymatic resistance of collagen fibers16, and the interactions between fibronectin and collagen fibers17. The mechanical properties of the ECM, such as its rigidity and viscoelasticity, mediate the forces transduced to the cells, which play a crucial role in influencing tissue organization (Fig. 1a) and cell behavior18,19,20.

Fig. 1: Understanding the cell and ECM mechanics through biophysical techniques as inputs for AI-based approaches to use in mechanobiology.

a Schematic representing cellular and ECM changes in normal versus malignant epithelial tissues. Malignancy in the epithelial tissues is accompanied by disruption of homeostatic balance between ECM mechanical properties and fibroblast and epithelial cells. During tumor progression, tumor and transformed stromal cells (CAF) interact with the ECM, increasing its mechanical properties (e.g., elasticity, viscoelasticity), leading to changes in cellular actomyosin tension, phenotypes (e.g., size, shape), mechanosignaling and the genetic landscape. b Cellular and ECM mechanical properties are measured at the single protein level using molecular tension detection probes, or at the cellular level using force spectroscopy or traction and monolayer stress microscopy. Molecular tension probes can be based on DNA unfolding by force or FRET sensors, providing a readout of the transduced forces. Force spectroscopy can be based on physical force application using cantilevers (such as AFM) or using optical or magnetic force tweezers. The force-response curves can then be used to measure the physical properties of cells or tissues. Force application on the substrate can be measured using traction force microscopy by either fluorescently labeling gels of different stiffness or using pillar deflections as a force readout. c Measurements of forces from different techniques, either at the multi-cell level (such as fluorescence and light microscopy or AFM curves) or at the bulk and single-cell level (such as DNA/RNA sequencing or genomic enrichment) are used as inputs for machine learning (ML) or artificial intelligence (AI)-based algorithms, which are then trained to provide outputs and predictions such as the cell and ECM mechanical properties, cell states and heterogeneity, or protein structure and gene expression. Here, ML/AI-based methods are denoted by a brain, and the color code of the dots indicates a different type of network or layers in the networks.
Images in (c) adapted from Refs. 6,7,117.

Tissue rigidity and solid stresses

Solid stresses, or the mechanical forces contained in and transmitted by cell ECM, range from < 100 Pa in glioblastomas to ~10 kPa in pancreatic adenocarcinomas21. Solid stresses can increase in normal tissues due to excessive proliferation, cell infiltration, and a dense matrix21. Other direct effects of solid stress include the promotion of invasiveness of cancer cells22 and the stimulation of tumorigenic pathways in colon epithelia23. ECM stiffness increases, e.g., during embryonic development because of increased deposition and crosslinking of collagen and hyaluronic content24. An increase in tissue rigidity is also observed in tumor progression (Fig. 1a), where higher levels of matrix crosslinking occur, leading to enhanced integrin signaling24,25. Morphologically, a stiffer ECM leads to increased cell spreading area, planar motility polarization, increased matrix adhesion formation, disrupted cell-cell adhesion, and pseudopodia formation26. ECM stiffness also leads to changes in the nuclear morphology, such as nuclear elongation, malformations of the nuclear envelope, and rupture of the nuclei, which further affects nuclear stability due to Lamin-A and gene regulation27,28.

Shear/Fluid flow stress

Fluid shear stress, primarily at the interface between blood and the endothelial cells lining the blood vessels, affects various cellular functions, including cell elongation, cell polarization, nuclear shrinkage, suppression of proliferation, and expression of anti-inflammatory genes29,30,31. The magnitude of shear stress depends on the blood velocity, viscosity, and diameter of the vessel. Mechano-sensing of fluid shear stress can lead to the activation of pathways that promote production of nitric oxide (NO) and prostacyclin (PGI2), both known vasodilators, in endothelial cells32. Morphological changes due to shear flow include cell elongation and the appearance of stress fibers aligned with the direction of flow in aortic and umbilical vein endothelial cells33,34. Cyclic pulsatile hydraulic pressure due to shear flow has also been shown to stimulate smooth muscle cells while inhibiting proliferation35.
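As a rough illustration of how velocity, viscosity, and vessel diameter combine, the sketch below uses the textbook result for steady laminar (Poiseuille) flow of a Newtonian fluid in a rigid cylindrical tube, where the wall shear stress is tau = 8 * mu * v / d for mean velocity v. This idealization is an assumption introduced here and is not taken from the cited studies; blood flow is pulsatile and blood is not strictly Newtonian. The function name and the example values are illustrative only.

```python
def wall_shear_stress(mean_velocity_m_s, viscosity_pa_s, diameter_m):
    """Wall shear stress (Pa) for idealized Poiseuille flow: tau = 8 * mu * v / d."""
    if diameter_m <= 0:
        raise ValueError("diameter must be positive")
    return 8.0 * viscosity_pa_s * mean_velocity_m_s / diameter_m


if __name__ == "__main__":
    # Illustrative inputs: 0.2 m/s mean velocity, 3.5 mPa*s viscosity, 4 mm diameter.
    tau = wall_shear_stress(0.2, 3.5e-3, 4e-3)
    print(f"wall shear stress: {tau:.2f} Pa")
```

Under this idealization, halving the diameter at fixed mean velocity doubles the wall shear stress, which is why vessel caliber matters as much as flow speed.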

ECM topography

Mechanisms of cell interaction with the ECM differ significantly between two-dimensional (2D) and three-dimensional (3D) microenvironments. One of the most apparent differences between 2D and 3D ECM-cell interactions is the structural organization of the ECM, wherein 2D culture systems such as tissue culture plates or coverslips coated with ECM proteins like collagen or fibronectin lack the complex 3D architecture found in native tissues, because of which cells experience limited spatial constraints and encounter a uniform mechanical microenvironment. In contrast, 3D ECM environments, such as hydrogels or scaffolds, mimic the natural architecture of tissues more closely with varying degrees of porosity, stiffness, and topographical features. This structural complexity influences cell morphology (Box 1), migration, and differentiation, leading to distinct cellular responses compared to 2D cultures. Such an in vivo environment allows cells to retain many of the differentiated features of their native tissue including their viability, apical-basal polarity, cell-cell contact, metabolism, gene expression, and resistance to exogenous stress such as hypoxia or chemotherapy10,36,37,38,39,40,41.

Cells interacting with a 3D ECM have decreased cortical tension, leading to increased protrusions and higher protein secretion, which regulates ER stress and viability39. A 3D ECM microenvironment also promotes the activation of signaling pathways such as enhanced activation of focal adhesion kinase (FAK) and Rho GTPases, which play key roles in regulating cell motility, cytoskeletal dynamics, and cell-matrix remodeling10,37. Furthermore, the 3D architecture of ECM can modulate growth factor signaling, such as transforming growth factor-beta (TGF-β) and epidermal growth factor receptor (EGFR), leading to differential cellular responses as compared to 2D cultures42,43. Morphologically, cells in 2D culture display flattened morphologies and polarized arrangements due to the planar nature of the substrate, while a 3D ECM microenvironment promotes the formation of multicellular structures, such as spheroids or organoids, that better mimic tissue-level organization and function10.

The forces experienced by cells, such as tension, compression, and shear stress, play a crucial role in regulating cellular behavior (Box 1). However, analyzing how cells respond to varying mechanical environments and predicting their behavior under different force conditions involves large and complex datasets. Machine learning (ML) and AI offer powerful tools to unravel these complexities by analyzing large-scale experimental data, identifying patterns in cell mechanics, and integrating diverse datasets such as force maps, imaging, and gene expression profiles. ML/AI can help in improving the accuracy of mechanistic models, predicting cell responses to mechanical forces, and uncovering new insights into how mechanical signals drive biological processes.

How these physical properties impact cell response and how cells adapt to them

Normal and cancer cells have mechano-sensitive machinery, such as cell-ECM44, cell-cell adhesions45, and stretch-sensitive ion channels46, that allow them to respond to applied forces. In this process, known as mechanotransduction, extracellular mechanical cues are translated into biochemical signals, which in turn affect various cellular processes47. Mechanotransduction requires the sensing of the mechanical properties of the ECM, which depends upon the formation of multi-protein complexes called focal adhesions4. Various components in these adhesion complexes have been implicated in the sensing of specific ECM properties. For example, a myosin IIA-tropomyosin 2.1 complex is responsible for rigidity sensing48,49, and an integrin-talin-FHOD1-actomyosin complex formation is required for sensing the density of ligand nanoclusters50. A theoretical model involving talin as a clutch has also been shown to sense the spatial organization of ligands on varying stiffness18,51. This sensing is then translated into biochemical signals, which can be either activation of signaling pathways through kinases or phosphatases52, or the nuclear translocation of proteins such as paxillin, FAK, or YAP53,54, which then affects genome unfolding and hence gene regulation. These cascades further regulate the cell state and induce quantifiable changes in survival, cell shape morphology, migration, and invasion. For example, it has been shown that stiffer tissues lead to the nuclear translocation of TWIST1 in breast cancer cells, which in turn promotes cell invasion by inhibiting the expression of E-cadherin55. These morphological and genetic changes, which are inferred from various experimental methods, can then be used as inputs to train ML or different AI models to predict cell state and other useful information.

Mechanical cues affect diverse fundamental cell processes, which manifest as modifications in cell state and morphology (Box 1). For example, stiff matrices induce high cellular contractility and a high cellular aspect ratio, which are necessary for mesenchymal stem cells to differentiate into an osteogenic lineage56,57. Conversely, adipogenic differentiation is favored by reduced forces and polarization, both of which are fostered by low rigidity and/or high cell compliance56,57. The tension-induced effects are mediated via Rho GTPases, which activate actomyosin contractility downstream of the FAK/Src pathway as well as G-protein coupled receptor signaling56. Cell morphology and spreading are also linked with cell survival since cells spread on very soft substrates remain small and round, triggering the death-associated protein kinase (DAPK) and activating apoptosis58. Forcing cells into specific shapes using micropatterning was shown to alter cellular contractility and plasma membrane topography, which were found to affect cellular differentiation, thus underlining the role of cell shape and morphology in cellular fate59. ECM topography also profoundly influences cell morphology since cells tend to align along the direction of fibers or grooves in the ECM. This alignment has been shown to induce distinct cell morphologies depending on the organizational features of the ECM within the tissues, as has been documented for spindle-shaped cells in connective tissues60. Topography can also direct cell migration through contact guidance. ECM topology can induce cells to preferentially migrate along ECM fibers or ridges and can additionally influence their migration speed and directionality61.
The dimensionality of the ECM further alters cell morphology and behavior since cells typically spread out and adopt a flattened morphology in 2D environments, whereas in 3D ECMs cells assume a more rounded or elongated morphology, depending on the matrix composition and stiffness39. A 3D extracellular matrix microenvironment also promotes polarized migration or amoeboid-like movement in some cell types, a phenotype that is typically not observed when cells interact with a 2D extracellular matrix microenvironment61. Differences in cell morphology and motility are also observed because of ECM porosity62, fiber architecture63, and ECM degradability64, thereby implying that the physical properties of the ECM play pivotal roles in determining cell fate and phenotype (Box 1).

Understanding how the physical properties of the ECM impact cell responses and how cells adapt to these mechanical cues is central to mechanobiology. However, deciphering these adaptive responses is challenging due to the nonlinear and multiscale nature of mechanobiological processes. This is where machine learning (ML) and artificial intelligence (AI) can come into play. ML/AI tools can analyze vast datasets from imaging, force measurements, and gene expression studies to identify hidden patterns and relationships between mechanical stimuli and cellular responses.

Experimental tools to measure forces

Force spectroscopy techniques

The development of force spectroscopy techniques (Fig. 1b), such as optical and magnetic tweezers, atomic force microscopy (AFM), and nanoindentation methods, has provided fundamental information about the mechanical properties of the ECM and tissues at the molecular and tissue level12. AFM has become a gold-standard technique to measure the mechanical properties of the ECM and cells. In the static AFM mode, a tip indents the sample, creating deformation that is acquired and translated into a force versus indentation curve. The force-indentation curves are then fitted to mathematical models such as the Hertz model to ultimately extract the elastic modulus65. Using this approach, extensive mechanical characterization of the elasticity of distinct tissues, including brain66,67, breast68,69, lung70, or pancreas71, in normal, benign, and malignant scenarios, has been performed72. On the other hand, in the dynamic mode, the cantilever oscillates over a range of frequencies. Alterations in the oscillatory amplitude are associated with the dissipation between tip and sample, reflecting the viscoelastic properties. Models such as the Kelvin–Voigt, power-law, and standard linear solid models are then mostly used to retrieve viscoelastic parameters73. For instance, studies revealed that human prostate tissue showed a more compliant and less viscous response as a function of tumor progression74. Similarly, viscosity in malignant thyroid tissue was found to serve as a good predictor of malignancy75. Interestingly, viscoelastic properties of the brain tissue and distinct cell types have also been retrieved from conventional AFM curves by taking advantage of the intrinsic hysteresis associated with the equipment or redefining previous mathematical models76,77. Nevertheless, AFM is a time-consuming, low-throughput technique that is not affordable for every lab and additionally requires a well-trained user.
It is also difficult to compare results between labs and studies because of the different tips and mathematical models employed to extract the values of the mechanical properties.
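The fitting step described above can be sketched for the simplest case. The example below assumes a spherical tip and the classical Hertz relation F = (4/3) * E / (1 - nu^2) * sqrt(R) * delta^(3/2), with the contact point already subtracted so that all indentations are non-negative. The tip geometry, the Poisson ratio of 0.5, and the function names are assumptions made for illustration; other tip shapes use different contact models, which is one reason values differ between labs.

```python
import math


def hertz_force(indentation_m, youngs_modulus_pa, tip_radius_m, poisson_ratio=0.5):
    """Hertz force (N) for a spherical tip; indentation must be >= 0."""
    reduced_modulus = youngs_modulus_pa / (1.0 - poisson_ratio ** 2)
    return (4.0 / 3.0) * reduced_modulus * math.sqrt(tip_radius_m) * indentation_m ** 1.5


def fit_youngs_modulus(indentations_m, forces_n, tip_radius_m, poisson_ratio=0.5):
    """Least-squares estimate of E (Pa) from a force-indentation curve.

    The Hertz force is linear in E, so the fit has a closed-form solution.
    """
    # Force predicted per unit Young's modulus at each indentation depth.
    basis = [hertz_force(d, 1.0, tip_radius_m, poisson_ratio) for d in indentations_m]
    numerator = sum(b * f for b, f in zip(basis, forces_n))
    denominator = sum(b * b for b in basis)
    if denominator == 0:
        raise ValueError("need at least one non-zero indentation")
    return numerator / denominator


if __name__ == "__main__":
    # Synthetic, noise-free curve for a 1 kPa sample probed with a 2.5 um radius bead.
    depths = [i * 2e-8 for i in range(1, 51)]
    forces = [hertz_force(d, 1000.0, 2.5e-6) for d in depths]
    print(f"fitted E: {fit_youngs_modulus(depths, forces, 2.5e-6):.1f} Pa")
```

Real curves additionally require contact-point detection and a choice of fitting range, both of which are omitted here and both of which can shift the extracted modulus.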

Single-molecule force measurements

The force spectroscopy techniques mentioned above have also been used at the single-molecule level. Single-molecule approaches allow interrogation of the mechanical forces required for folding and unfolding transitions of proteins or of molecular bond forces, amongst others78. In these systems, the tip of the AFM is functionalized with a limited number of ligands. The functionalized tip thus binds to a single receptor, which is attached to silica glass. The AFM retracts with a given velocity and subsequently clamps a force to measure mechanical interactions between the ligand and receptor molecules. Seminal work using this approach showed that the interaction between integrin α5β1 and fibronectin follows a catch-bond relationship79. Similarly, AFM techniques or tweezers have been used to experimentally address the role of mechanical forces in unfolding proteins such as titin, tenascin, talin, or fibronectin80,81,82.
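To make the term catch bond concrete: a catch bond is one whose lifetime increases with applied force over some range before decreasing again. One commonly used phenomenological description is a two-pathway model in which the total off-rate is the sum of a catch pathway that is suppressed by force and a slip pathway that is accelerated by force. The sketch below implements that generic model. It is not the analysis used in the cited work, and every parameter value is invented purely to produce the characteristic biphasic curve.

```python
import math

KBT_PN_NM = 4.11  # thermal energy near room temperature, in pN*nm


def off_rate(force_pn, k_catch=10.0, x_catch_nm=1.0, k_slip=0.01, x_slip_nm=0.5):
    """Two-pathway dissociation rate (1/s). Default parameters are hypothetical."""
    # Catch pathway: force slows dissociation.
    catch = k_catch * math.exp(-force_pn * x_catch_nm / KBT_PN_NM)
    # Slip pathway: force speeds dissociation (Bell-type behavior).
    slip = k_slip * math.exp(force_pn * x_slip_nm / KBT_PN_NM)
    return catch + slip


def bond_lifetime(force_pn, **params):
    """Mean bond lifetime (s) as the inverse of the total off-rate."""
    return 1.0 / off_rate(force_pn, **params)


if __name__ == "__main__":
    for force in (0.0, 10.0, 20.0, 40.0, 60.0):
        print(f"{force:5.1f} pN -> lifetime {bond_lifetime(force):.3f} s")
```

With these illustrative parameters the lifetime rises at intermediate forces and falls at high forces, which is the qualitative signature that a force-clamp experiment looks for.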

Another suitable tool for measuring forces at the molecular level is molecular tension probes83,84. In the past years, there has been an increase in the usage of these sensors, which are based on changes in Förster resonance energy transfer (FRET) caused by the transduced force (Fig. 1b). The distance between the fluorophore and the quencher (or other FRET pairs) attached to different sites thus determines the force readout. These FRET-based molecular tension probes have been attached to integrin-binding ligands84, ECM proteins such as fibronectin17, and force transducers such as talin85. This approach has been used to reveal that integrin tension is highly dynamic and increases with integrin recruitment during focal adhesion formation84. These sensors have also been used to measure forces for other focal adhesion proteins such as vinculin and talin85,86,87 as well as cell-cell junctions such as E-cadherin and VE-cadherin88,89. The first FRET sensor for measuring compressive forces revealed that high compression on the glycocalyx protein mucin induces reciprocal tension on integrin adhesions90. Similarly, CD3-mediated adhesion at the cell edge of T-cells revealed an increase in the ligand-receptor compressive forces91.
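The distance dependence that underlies these probes follows the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor separation and R0 is the Förster radius of the pair. The sketch below evaluates this relation and its inverse. The function names and the 5 nm Förster radius are illustrative assumptions, and converting a distance into a force requires a probe-specific calibration of the linker, which is not shown.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """FRET efficiency for a donor-acceptor pair separated by distance_nm."""
    ratio = distance_nm / forster_radius_nm
    return 1.0 / (1.0 + ratio ** 6)


def distance_from_efficiency(efficiency, forster_radius_nm):
    """Invert the Förster relation; efficiency must lie strictly between 0 and 1."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be between 0 and 1 (exclusive)")
    return forster_radius_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    r0 = 5.0  # hypothetical Förster radius in nm
    for r in (3.0, 5.0, 7.0):
        print(f"r = {r:.1f} nm -> E = {fret_efficiency(r, r0):.3f}")
```

Because of the sixth-power dependence, efficiency changes steeply only for separations close to R0, which sets the usable extension range of a given sensor.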

Traction Force Microscopy (TFM)

TFM was one of the first techniques to measure intrinsic cellular forces at the cell-substrate interface. The forces measured using this technique are primarily actomyosin-mediated, thereby providing an indirect measurement of cellular contractility12. Recently, refined methods employing label-free approaches have also been developed92,93. Additionally, new approaches for measuring cellular tractions in 3D have been implemented93. A complementary technique to traction force microscopy is monolayer stress microscopy, which measures the intracellular and intercellular tension within a given cluster of cells using the fact that traction forces generated by a group of cells must be balanced by the forces transmitted within cells12,94.

Traction forces have also been measured using elastic micropillar arrays95. Micropillar setups are based primarily on photolithographic techniques that allow the creation of nano- or micropatterned master molds to create topographically patterned surfaces, mostly in elastomers (Fig. 1b). The pillar bending stiffness can be computed with traditional mechanics equations from the diameter and the height of the micropillars95. A very elegant study exploring the impact of bending stiffness, cell spreading, and post density on traction forces and focal adhesion showed that spread area and substrate stiffness follow opposite trends; cells on stiffer substrates generate higher average forces, but cells with larger spread areas create lower average forces96. These studies also showed that for different stiffness, traction forces mirror focal adhesion area, indicating a close relationship between cellular forces and focal adhesion formation96.
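The "traditional mechanics equations" mentioned above can be sketched under the usual simplifying assumptions: each pillar is treated as a cylindrical Euler-Bernoulli cantilever fixed at its base and loaded at its tip, with small deflections. In that case the bending stiffness is k = 3 * pi * E * D^4 / (64 * L^3) and the force is F = k * deflection. Whether the cited studies applied additional corrections (for example for base compliance) is not stated here, and the material and geometry values below are illustrative only.

```python
import math


def pillar_stiffness(youngs_modulus_pa, diameter_m, height_m):
    """Bending stiffness (N/m) of a cylindrical pillar: k = 3*pi*E*D^4 / (64*L^3)."""
    return 3.0 * math.pi * youngs_modulus_pa * diameter_m ** 4 / (64.0 * height_m ** 3)


def traction_force(deflection_m, youngs_modulus_pa, diameter_m, height_m):
    """Force (N) inferred from the measured tip deflection, F = k * deflection."""
    return pillar_stiffness(youngs_modulus_pa, diameter_m, height_m) * deflection_m


if __name__ == "__main__":
    # Hypothetical elastomer pillar: E = 2 MPa, 2 um diameter, 6 um height.
    k = pillar_stiffness(2e6, 2e-6, 6e-6)
    force = traction_force(0.5e-6, 2e6, 2e-6, 6e-6)
    print(f"stiffness: {k * 1e3:.1f} nN/um, force at 0.5 um deflection: {force * 1e9:.1f} nN")
```

The strong dependence on geometry (fourth power of diameter, inverse cube of height) is what lets the effective substrate stiffness be tuned by changing pillar dimensions without changing the material.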

Other methods

Tissues in vivo are subjected to compressive forces either from their extracellular surroundings or cell-cell crowding, which translates into mechanical stress propagation. One elegant technique to measure this stress propagation is to use oil droplets with defined mechanical properties in which local mechanical stresses can be extracted from changes in the droplet shape. This technique was first used in 3D aggregates of premalignant mammary epithelial and embryonic tooth mesenchyme cells and, more recently, during incisor growth97,98. One of the drawbacks of this approach is that it can only provide information on anisotropic stresses. A similar approach has been developed, introducing elastic beads of known elasticity into 3D aggregates. In contrast to oil microdroplets, the beads are compressible, allowing the quantification of mechanical stress under external isotropic stress. In multicellular aggregates of malignant murine colon cancer cells, measurements revealed that the mechanical stress is non-uniformly distributed and that the stress profile is associated with the anisotropy of the cellular shape99. Nano- and micro-patterned biomaterials have further enabled the control of cell structure and function by patterning growth factors, ECM proteins, and other bioactive molecules onto surfaces. Specifically, engineering topographical, chemical, and/or mechanical cues in defined geometries has allowed direct regulation of cell adhesion, morphology, cytoskeletal organization, and cell-cell interactions50,51,100,101,102. Other methods, such as microchannels, ECM-functionalized polymer substrates, laser ablation, and drugs targeting actomyosin contractility, have also been used to modulate forces experienced by cells and understand cellular responses in various contexts12.

With a wide variety of methods used to collect data to make sense of the cellular response to ECM properties, there is an urgent need for proper analysis tools that can not only be used to obtain useful unbiased information but also can identify connections between datasets. Analyzing these massive datasets manually is not only time-consuming but also impractical for data storage and computational power. ML/AI tools can handle and process large datasets efficiently, identifying patterns and insights that might be missed by human analysis. These connections (usually correlations) can then allow us to perform further experiments in a more meaningful way by understanding the hidden connections and the interplay between various pathways.

Analysis and predictions in mechanobiology using Machine Learning/Artificial Intelligence

AI-based methods were first envisioned in the early 1950s by Alan Turing, who is considered one of the fathers of AI. Turing wondered: ‘Can machines think?’103. A few years later, John McCarthy responded to this question and coined the term “Artificial Intelligence,” defining it as “the science and engineering of making intelligent machines”. (We refer the reader to these publications for an extensive review of AI models104,105.) Over the past decade, various AI tools have been used to enhance the ability to handle complex data, automate routine and repetitive manual analyses, enhance imaging analysis, integrate diverse data types, and gain deeper insights into cellular processes, ultimately accelerating research and improving outcomes in cell biology.

Particularly, ML methods have taken the lead over the last couple of decades. Four different types of models have been used: supervised, unsupervised, semi-supervised, and reinforcement learning. Briefly, supervised machine learning models use labeled data to train the model, while unsupervised models identify unknown patterns from collected data. Semi-supervised machine learning is a hybrid that uses both labeled and unlabeled data. Reinforcement learning models learn the optimal behavior in an environment to obtain the maximum reward. These models can use a variety of input data, such as brightfield or fluorescence images, traction force or other measures of forces experienced by the cells, gene expression, or various other downstream effects.
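The difference between the supervised and unsupervised settings can be shown with a deliberately tiny example. Below, a nearest-centroid classifier uses labels to learn one mean per class, while a two-cluster k-means finds groups in the same values without any labels. The single feature (a hypothetical cell spread area in square micrometers), the labels, and the function names are all invented for illustration; real pipelines use many features and established libraries.

```python
def fit_centroids(values, labels):
    """Supervised: use the labels to compute one mean value per class."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, value):
    """Assign a new value to the class with the nearest centroid."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def two_means(values, iterations=20):
    """Unsupervised: 1D k-means with two clusters; no labels are used."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        if not group_high:  # all values identical, nothing to split
            break
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return low, high


if __name__ == "__main__":
    areas = [400, 450, 500, 1500, 1600, 1700]  # made-up spread areas
    labels = ["soft", "soft", "soft", "stiff", "stiff", "stiff"]
    centroids = fit_centroids(areas, labels)
    print("supervised prediction for 600:", predict(centroids, 600))
    print("unsupervised cluster centers:", two_means(areas))
```

A semi-supervised variant would combine the two ideas, for example by seeding the cluster centers from the few labeled cells and refining them with the unlabeled ones.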

Machine learning tools for mechanobiology

Machine learning methods have been employed to predict the mechanical properties of soft materials and mechanical forces (Fig. 1c). For instance, by using phase contrast images of the wrinkles of the substrate when cells adhere to a soft substrate, traction forces can be measured106. Studies determined that traction forces and wrinkles could be used as an input to train generative adversarial networks, which can then be used to predict cellular forces using phase contrast images of the wrinkles106. More recently, traction forces were assessed by using fluorescence images of diverse cellular markers8. This work found that the adhesion protein zyxin was the best predictor of the overall magnitude of cellular forces and direction8. These approaches are not restricted to single cells but have also been used to measure forces in groups of cells. In a recent study, a generative adversarial network was used to predict traction forces in cell colonies of distinct sizes and seeded on different stiffnesses107. Importantly, the network was able to predict the classical pattern of asymmetry in the distribution of cellular traction forces when cells adhered to a stiffness gradient, which prior work has shown plays a crucial role in directed cell migration107. Similarly, another study used cell morphological features from brightfield images to train and predict both tractions and stresses in a cellular monolayer108. In addition, data have revealed that areas of high cellular tension favor BMP4-dependent mesoderm differentiation by facilitating the release of β-catenin to promote Wnt signaling109. Potentially, such models can be used in combination with immunofluorescence images to explore how different markers correlate with cellular forces, thereby revealing the fundamental interplay between the spatial location of mechanical forces and cellular markers.

ML models have also been employed to predict tissue stiffness without the use of direct force measurements. Recently, a convolutional neural network named STIFMap was developed to predict tissue elasticity in the context of breast tissue using only fluorescence images6. The supervised model combined force curves collected with atomic force microscopy with fluorescence images of collagen and a nuclear marker. This model permitted the spatial resolution of tissue elasticity values and captured the intrinsic heterogeneity of tissue elasticity6. Intriguingly, the activation of mechanical markers such as active integrin β1 or phospho-myosin light chain was found to coincide with regions of high elasticity6. Another technique to test the mechanical properties of tissues at the nanoscale is nano-indentation. Load-displacement curves from a nano-indenter were used by AI-Dente to predict neo-Hookean (non-linear stress-strain behavior) and Gent models110. AI-Dente uses an inverse or forward approach to obtain mechanical properties, overcoming the limitations of classical approaches based on the Hertzian model beyond the linear strain regime while reducing computational usage and time. Another study used machine learning to retrieve viscoelastic properties of different Newtonian fluids111. The model was trained by employing simulated data of the trajectories of the beads. The model predicted viscoelastic properties using shorter trajectories and with high accuracy compared to classical methods111. This approach is particularly suitable when measuring highly viscous materials such as cells, paving the way to use micro-rheology with optical tweezers in living systems. In the future, it is conceivable that these models could be expanded to be tissue/cell agnostic to obtain different mechanical properties such as elasticity, non-linear elasticity, or viscoelasticity simultaneously.
For instance, while elasticity is usually associated with collagen, viscoelastic tissues contain more glycosaminoglycans. This suggests that combining the current models with additional ECM or cell markers may facilitate the identification of tissue phenotype while simultaneously predicting different physical properties of the tissue. This may permit the segmentation of tissue images based on micromechanical entities and, therefore, permit the dissection regarding which physical parameter plays a major role during biological processes in each specific tissue, which are not necessarily identical. However, the use of additional markers for tissue identification may require multiplexing imaging which is not a common technique, and the prediction of different physical properties may require additional training. To reduce the training phase and experimental procedures, these new models could apply data augmentation methods to subsequently retrieve the physical properties of other tissue.

Finally, ML models can be used to analyze ECM morphology. For instance, in the context of the breast, tumor-associated collagen signature (TACS) is a model that describes three layers of collagen that radiate outward from the tumor’s main body. ML models can be trained based on these geometric features to support faster and more accurate diagnosis. Potentially, a classification like TACS could be employed in any tissue. Also, mammographic density is strongly influenced by ECM composition, being a known risk factor for breast cancer. ML can analyze mammograms to detect subtle textural changes in the ECM that precede tumor formation, potentially identifying patients at high risk based on ECM-related density patterns. The classification of these different patterns, in combination with medical history, may presumably be used as a predictive tool in the context of cancer recurrence and metastasis112. In the context of the brain, tumors tend to have diffuse and irregular borders. ML-enhanced imaging could map ECM density to help distinguish tumors from healthy tissue more precisely, providing surgeons with clearer boundaries for tumor removal. Additionally, these models can be used for analyzing ECM fragments in blood and cerebrospinal fluid (CSF)113. The levels of tenascin in CSF increase in astrocytic tumors. ML models could identify these new biomarkers to classify molecular changes in CSF samples associated with early-stage brain cancer, potentially offering a non-invasive diagnostic option for the disease.

Machine learning in genomics

Several techniques have emerged to explore enriched regions in the genome for regulatory elements or changes in chromatin accessibility, such as ChIP-seq (Fig. 1c) and ATAC-seq, amongst others114,115,116. Classical approaches are based on the detection of peaks, which ultimately allow for the creation of annotation data sets after human visual inspection. This approach, however, is time-consuming and impractical when analyzing large datasets. To overcome this, supervised machine learning models have been employed to identify ChIP-seq peaks. After training with annotated ChIP-seq data, CNN-peaks could predict peaks within previously unknown genomic regions with unprecedented resolution117. More recently, LanceOtron was developed following a similar approach. This deep learning model is able to predict accurate peaks not only in ChIP-seq but also in ATAC-seq and DNase-seq data118. These studies demonstrate the versatility of such approaches in handling large and complex datasets efficiently to uncover complex genetic interactions and regulatory mechanisms.
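To make the classical, threshold-based notion of a "peak" concrete, the following minimal Python sketch calls peaks on a binned coverage track. It is an illustration under stated assumptions only: it is not the method used by CNN-peaks or LanceOtron, and the function name call_peaks, the fold-over-median rule, the parameter values, and the toy coverage values are all hypothetical.

```python
from statistics import median

def call_peaks(coverage, fold_over_background=5.0, min_width=2):
    """Return half-open (start, end) bin intervals whose coverage exceeds
    fold_over_background times the median coverage for at least
    min_width consecutive bins. The median is used as a crude
    background estimate (an assumption made for this sketch)."""
    cutoff = fold_over_background * median(coverage)
    peaks, start = [], None
    for i, value in enumerate(coverage):
        if value > cutoff:
            if start is None:
                start = i  # a candidate peak opens here
        elif start is not None:
            if i - start >= min_width:
                peaks.append((start, i))
            start = None
    # Close a peak that runs to the end of the track.
    if start is not None and len(coverage) - start >= min_width:
        peaks.append((start, len(coverage)))
    return peaks

# Toy coverage track (read counts per genomic bin), invented for illustration.
coverage = [2, 3, 2, 1, 2, 3, 40, 55, 48, 3, 2, 1, 2, 3, 2, 2, 1, 3, 2, 2]
peaks = call_peaks(coverage)
print(peaks)
```

A fixed rule of this kind depends heavily on the chosen cutoff and background estimate, which is one reason why human inspection, and more recently learned models, have been used to decide which enriched regions count as peaks.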

To our knowledge, the methods currently developed for these analyses have not been applied in the context of mechanobiology, but they clearly merit discussion for future applications. Several studies have shown, for instance, how chromatin accessibility changes during stem cell differentiation or in response to alterations in tissue mechanical properties119. The current challenge, therefore, is to integrate the abovementioned models with tissues that either have distinct mechanical properties or contain specific cellular phenotypes. One possibility to address this issue would be to use a reduced amount of training data and employ data augmentation techniques to retrain new models. We speculate that these new models should be able to predict different regulatory elements or chromatin modifications that can subsequently be used to categorize cellular genotypes.

Machine learning for cell phenotypes and states

Classical approaches for identifying cellular phenotypes and states rely on microscopy images (Fig. 1c). While a few techniques have emerged to generate multiplexed images, routine microscopy utilizes only three to four channels. From these channels, features such as cell area, nuclear geometry, and pixel intensity can be extracted to detect cellular states or phenotypes120. The question arises: can machine learning predict cell states or phenotypes (Box 1) using simple images such as transmitted light or immunofluorescence images?
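As an illustration of the kind of hand-crafted features mentioned above (area, geometry, and pixel intensity), the following minimal Python sketch computes them from a binary segmentation mask and a matching intensity image. It assumes that a mask is already available; the function name region_features, the pixel-edge perimeter estimate, and the toy 5 × 5 arrays are hypothetical choices made for this sketch rather than any published pipeline.

```python
import math

def region_features(mask, intensity):
    """Compute area, perimeter, circularity and mean intensity of the
    foreground region of a binary mask (2D lists of equal shape).
    The perimeter counts exposed pixel edges, which is a crude estimate
    that overestimates the boundary length of smooth shapes."""
    rows, cols = len(mask), len(mask[0])
    area, perimeter, total = 0, 0, 0.0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            total += intensity[r][c]
            # Each 4-neighbour that is background or outside the image
            # contributes one exposed edge to the perimeter.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if not inside or not mask[rr][cc]:
                    perimeter += 1
    if area == 0:
        raise ValueError("mask contains no foreground pixels")
    return {
        "area": area,
        "perimeter": perimeter,
        "circularity": 4 * math.pi * area / perimeter ** 2,
        "mean_intensity": total / area,
    }

# Toy example: a 3 x 3 square "nucleus" with uniform intensity 10.
mask = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)]
        for r in range(5)]
intensity = [[10 if mask[r][c] else 1 for c in range(5)] for r in range(5)]
features = region_features(mask, intensity)
```

Feature dictionaries of this kind, computed per cell, could serve as inputs to a classifier, whereas deep learning models typically learn their features directly from the pixels.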

Various supervised machine-learning models have been used to analyze cell morphology from fluorescence images48. For example, In Silico Labeling (ISL) uses unlabeled images to predict cell nuclei. Cell nuclei are altered in malignant scenarios. Therefore, ISL could be a rapid and easily applied approach to obtain a robust amount of data to further analyze nuclear shape. ISL has also been used to predict cell viability, cell type, and subcellular process type with high accuracy. For example, ISL was able to identify neurons when mixed with astrocytes and immature dividing cells7. One challenge that arises is whether this model can predict the cell types of different tissues. If so, a potential application is to identify cellular heterogeneity. Cellular heterogeneity is a hallmark of cancer and has been measured using the intra-tumor heterogeneity classification. If ML models can distinguish between different cell types7, they could potentially provide an objective heterogeneity score that ultimately helps in cancer diagnosis or in assessing tumor aggressiveness. Another image-based application is to analyze the tumor microenvironment to elucidate how immune cells and tumor cells interact. Recently, an immune infiltration score has been defined in contrast to the “cold/hot” classification121. ML models can be trained for several tissues and tumor stages to predict immune infiltration scores. This information may be used to classify how aggressive the cancer is likely to be or how it will respond to therapies.

A self-supervised residual neural network with squeeze-and-excite blocks (SE-RNN) uses multi-channel fluorescence microscopy images to classify cell morphometric phenotypes to depict cell state49. More recently, a supervised model trained with nuclear and cell physical properties was designed to predict cellular state, using the transcription factor YAP as an output122. The model used fluorescence images of a nuclear marker and cell morphology. While it is arguable whether nuclear YAP translocation is an indicator of any particular “cellular state”, YAP translocation into the nucleus has been shown to increase in response to more aligned fibers and increased tissue rigidity123. Therefore, using such models could potentially provide indirect information on the properties of the mechanical environment. Also, in breast cancer, loss of Scribble disrupts 3D acinar formation and promotes tumorigenesis. Loss of Scribble is accompanied by YAP mislocalization into the nucleus. Therefore, prediction of nuclear YAP can be used to further explore other important regulators in biological processes. Such models could also be used to detect various conditions such as cancer or senescence. For instance, chromatin features and nuclear morphologies have been classified using an AI model for cancer diagnostics from liquid biopsies124. Another study used nuclear features as the input of machine-learning classifiers to predict cellular senescence125.

Key opportunities for ML/AI in mechanobiological research

A well-known limitation of a number of mechanobiological assays is the size of the datasets analyzed. In a number of cases, only a few cells (typically ~50 cells) are analyzed because of the experimental setup. ML/AI tools trained on these small datasets can be used to extrapolate the results, making broader connections and correlations at the physiological level. A model pre-trained on large-scale image data (e.g., microscopy images of a specific cell type from a particular context) could be adapted for studying cell mechanics or tissue stiffness in a new experiment with a different cell type. This allows the model to leverage previously learned patterns onto new data, making analysis easier and more automated.

Image and data analysis plays a crucial role in mechanobiology. Extracting quantitative information from microscopy images and analyzing it accurately is essential for advancing mechanobiological research. Various AI tools can be integrated into microscopy or analysis software to remove blur, enhance signal, denoise, and segment images. Such post-processing of microscopy images can improve their analysis. ML/AI can further automate the analysis of complex datasets, including image processing, morphological analysis, segmentation, and feature extraction, thus reducing the need for manual image analysis, which can be time-consuming and prone to human error. This can further lead to the development of integrated pipelines for data analysis, including data integration from multiple sources, fully automated analysis and data visualization, and real-time data analysis.

ML/AI can also be used to generate predictive models, which can predict how cells or tissues will respond to mechanical stimuli based on extracted features or experimental inputs. AI algorithms can use extracted features (e.g., cell stiffness, migration speed, force generation) to predict outcomes such as cell differentiation, proliferation, or migration in response to mechanical stimuli. AI algorithms can also classify cellular behaviors and cell states based on the extracted features, leading to quick identification of distinct subpopulations of cells or tissues based on mechanical properties. For example, an AI algorithm can be used to link or correlate a particular phenotype or genotype with known mechanisms of the same phenotype/genotype in literature and suggest related pathways or mechanisms, thus revealing hidden patterns in experimental data that suggest new mechanistic hypotheses. AI can also be used for the smoother implementation of mechanistic modeling through mathematical frameworks. ML/AI offers ways to enhance and complement traditional mechanistic modeling approaches by improving predictive accuracy, simplifying complex processes, and integrating various data types. ML/AI can also be used to learn from simulation data, creating faster and more efficient models.

ML/AI predictions can further be used to predict mechanical responses in the case of an unexplored condition. In vivo mechanical conditions are usually a complex integration of multiple factors, which are studied using a reductionist approach. To understand the system as a whole requires a bottom-up approach, integrating multiple ECM and cellular mechanical phenotypes and conditions. ML/AI can be used to predict how biological systems will respond under complex conditions that have not yet been experimentally tested (for example, increasing dimensions with stiffnesses, strains, or viscoelasticity). This can be further investigated for complex perturbations, disease states, engineered tissues, or drug treatments. ML/AI models can also be used to extrapolate beyond available experimental data, predicting mechanical behaviors or cellular responses under extreme conditions (e.g., very high forces, long-term exposure to mechanical stimuli) that may be difficult to explore experimentally.

Finally, ML/AI can be used for multimodal data fusion, combining data from imaging, blood biomarkers, genetic profiles, and physical properties of cells and tissues. Such strategies could improve the early detection of diseases, the prediction of disease outcomes, and the understanding of a therapy’s response by identifying complex patterns that can be missed if each data type were considered in isolation.

Challenges for the integration of AI/ML algorithms with mechanobiology

Using ML/AI in mechanobiology poses a significant challenge due to the high level of expertise required to develop, implement, and interpret AI tools effectively. While ML/AI technologies have proven to be powerful in automating data analysis and identifying patterns in vast datasets, harnessing their full potential requires a combination of domain-specific knowledge and technical expertise in AI. ML/AI technologies encompass various complex algorithms such as decision trees, logistic regression, deep learning, and neural networks. Implementing these models effectively requires a deep understanding of the strengths, limitations, and underlying mathematics of each approach. Misapplying a model can result in inaccurate predictions, false correlations, or overlooked insights. This complexity often acts as a barrier for many research labs. Effective ML/AI deployment also requires access to computational resources, such as graphics processing units (GPUs) and high-performance computing environments, which are not readily accessible to all mechanobiology labs.

While large datasets are crucial for training robust and accurate AI models, the unique nature of mechanobiology often limits the availability of such extensive data, creating several obstacles for researchers. For example, measurements using techniques such as FRET, AFM, TFM, or live-cell imaging often produce data from a limited number of samples or conditions, which can lead to poor model performance, including overfitting, where the model ‘memorizes’ the training data but fails to generalize to new, unseen data, thus producing unreliable predictions. To overcome the small dataset problem, data augmentation or the generation of synthetic data has been used. Data augmentation includes rotating or flipping images, artificially increasing the size of the dataset, but these methods may not fully capture the complexity of mechanical forces and cellular responses. This can lead to incorrect conclusions or pursuing unproductive research directions. The limitation of small datasets can be further approached using ML frameworks, such as few-shot learning, which can make predictions by training on a very small dataset. Transfer learning models trained on larger datasets from related fields can also be adapted for mechanobiology-specific applications, which can leverage pre-existing knowledge and patterns learned from other domains and apply them to mechanobiology.
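The rotation and flipping operations described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that an image is represented as a 2D list and that the label of interest is unchanged by rotation and reflection; the function names and the toy 2 × 2 image are hypothetical.

```python
def rotate90(image):
    """Rotate a 2D list image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def flip_horizontal(image):
    """Mirror a 2D list image left to right."""
    return [row[::-1] for row in image]

def augment(image):
    """Return the eight rotation/flip variants of an image
    (four rotations, each with and without a horizontal flip)."""
    variants = []
    current = [list(row) for row in image]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants

# Toy 2 x 2 "image" with distinct pixel values, invented for illustration.
image = [[1, 2],
         [3, 4]]
augmented = augment(image)
print(len(augmented))
```

Each original image yields eight training examples in this scheme. The assumption that the label is preserved should be checked for each application: where the quantity of interest depends on direction, for example the orientation of aligned fibers relative to an applied load, rotated copies may no longer carry the same meaning.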

The performance of both machine learning and deep learning approaches hinges strongly on the amount of training data. Because of the sheer size of the experimental data that is acquired, there is a need to store and share huge amounts of raw data. This becomes especially difficult when individual data files reach gigabytes in size, as is the case for single-molecule localization microscopy (SMLM) raw files or RNA-seq files. Thus, open platforms that can store and provide easy access to these raw files are required, consistent with the FAIR (findability, accessibility, interoperability, and reusability) principles126. This has led to the advent of platforms and databases such as BioStudies, Image Data Resource, and ShareLoc to share different types of biological data127,128,129. Such databases can provide raw data for training deep learning algorithms and help accelerate the development of new quantification methods in numerous applications in the life sciences.

AI/ML algorithms can detect patterns and relationships between variables in large datasets. For instance, an algorithm might correlate a certain gene expression with a particular cell behavior. This, however, should not be confused with causation, which requires demonstrating that changes in one variable directly cause changes in another. AI tools might identify correlations that are not biologically meaningful. These false positives can be misleading, wasting time and resources. Large datasets often contain coincidental correlations. It is therefore feasible that AI tools may highlight these correlations as being significant, leading to misinterpretation of the underlying biological processes. Causation requires controlled experiments and deeper biological understanding, which AI/ML tools alone cannot provide. AI models, especially complex ones, can overfit the data they are trained on, capturing noise rather than true underlying patterns. This reduces their ability to generalize findings to new data or different biological contexts.
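The point about coincidental correlations can be illustrated with a small simulation. The following Python sketch generates purely random "features" and a random "outcome" that are unrelated by construction, and counts how many features nevertheless exceed a correlation threshold. All names, the threshold of 0.6, the sample sizes, and the number of features are hypothetical values chosen for illustration, not taken from any study.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def count_chance_hits(n_samples, n_features, threshold, seed=0):
    """Count random features whose |correlation| with a random outcome
    exceeds the threshold. Every hit is a false positive, because the
    features and the outcome are generated independently."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    hits = 0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        if abs(pearson(feature, outcome)) > threshold:
            hits += 1
    return hits

# Same number of candidate features, few versus many samples.
small = count_chance_hits(n_samples=10, n_features=2000, threshold=0.6)
large = count_chance_hits(n_samples=200, n_features=2000, threshold=0.6)
print(small, large)
```

With only a few samples per feature, some unrelated features pass the threshold by chance alone, whereas with many more samples such chance hits become very unlikely. This is one reason why correlations found in small, high-dimensional datasets call for correction for multiple comparisons and for independent experimental validation.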

Conclusions

From the first observation of adhesion complexes in the early 1970s130 to the recognition of the importance of the instructive signals provided by the ECM in regulating many biological processes, there has been an increasing interest in the field of mechanobiology. This interest has led to the development of techniques and tools to understand how physical properties and parameters impact cellular behavior. Recently, the implementation of high-throughput methods has permitted the acquisition of large amounts of data. To “generalize” the experimental findings and understand them better, there is an urgent need for better and more comprehensive analysis techniques.

AI/ML have emerged as powerful tools to address these challenges. Nevertheless, AI/ML approaches face many challenges of their own, particularly with respect to the creation of fraudulent data131. Noam Chomsky stated that technology “is basically neutral. It is like a hammer. The hammer doesn’t care whether you use it to build a house or whether for torture, using it to crush someone’s skull, the hammer can do either”132. If this premise is true, the proper use of AI/ML tools can be a breakthrough in many research fields, and their misuse can compromise progress.

In this regard, AI tools are indispensable in modern cell biology for their ability to handle and analyze large, complex datasets efficiently. While AI excels at identifying patterns and correlations in large datasets, distinguishing these from true causal relationships requires careful experimental design, validation, and integration with domain knowledge. By addressing this issue, researchers can harness the power of AI while ensuring robust, meaningful, and biologically relevant insights. In this age of big data, AI tools need to be used together with traditional experimental methods to combine both computational power and biological insight. Such tools are paving the way for more personalized and precise approaches in medicine and fostering a deeper understanding of cellular processes.