A structural biology compatible file format for atomic force microscopy

Jiang, Yining; Wang, Zhaokun; Scheuring, Simon

doi:10.1038/s41467-025-56760-7

Article
Open access
Published: 15 February 2025

Nature Communications volume 16, Article number: 1671 (2025)

Abstract

Cryogenic electron microscopy (cryo-EM), X-ray crystallography, and nuclear magnetic resonance (NMR) contribute structural data that are interchangeable, cross-verifiable, and visualizable on common platforms, making them powerful tools for our understanding of protein structures. Unfortunately, atomic force microscopy (AFM) has so far failed to interface with these structural biology methods, despite the recent development of localization AFM (LAFM) that allows extracting high-resolution structural information from AFM data. Here, we build on LAFM and develop a pipeline that transforms AFM data into 3D-density files (.afm) that are readable by programs commonly used to visualize, analyze, and interpret structural data. We show that 3D-LAFM densities can serve as force fields to steer molecular dynamics flexible fitting (MDFF) to obtain structural models of previously unresolved states based on AFM observations in close-to-native environment. Besides, the .afm format enables direct 3D or 2D visualization and analysis of conventional AFM images. We anticipate that the file format will find wide usage and embed AFM in the repertoire of methods routinely used by the structural biology community, allowing AFM researchers to deposit data in repositories in a format that allows comparison and cross-verification with data from other techniques.

Introduction

Cryogenic electron microscopy (cryo-EM)¹, X-ray crystallography², and nuclear magnetic resonance (NMR)³ solve three-dimensional (3D) structures of biomolecules, i.e., proteins, DNA, RNA, and mixed complexes thereof, at high-resolution through averaging of thousands of molecules. These 3D structures, deposited as density in EM Data Bank (EMDB)⁴ and as atomic coordinate models in Protein Data Bank (PDB)⁵ files, form the basis of our understanding of biomolecular structure and often provide insights into the chemistry of biomolecular function. In contrast, atomic force microscopy (AFM)⁶ is a surface technique and therefore is unlikely to ever provide a complete 3D structure independently, without an integrative and cross-methodology approach.

However, AFM is unique in providing structural information on single molecules under close-to-physiological and dynamic conditions and has several strengths compared to other structural biology techniques. First, to study biomolecules or cells, AFM operates in physiological buffers and at ambient temperature and pressure. For the study of membrane proteins, e.g., channels, transporters, and receptors, these molecules can be reconstituted in a lipid membrane of controlled composition⁷. Thus, AFM allows the structural analysis of proteins under conditions arguably more native than the conditions necessary for X-ray diffraction, i.e., proteins in 3D-crystals, or cryo-EM, i.e., proteins in an ~50 nm thin liquid layer imaged at liquid nitrogen temperature (77 K). In addition, in X-ray diffraction and cryo-EM studies, membrane proteins are typically analyzed in environments less native than a reconstituted membrane, namely in micelles, bicelles, or nanodiscs, though recently membrane protein structures were solved in vesicles by cryo-EM⁸. Another advantage of AFM is its more recent speed-enhanced version, termed high-speed atomic force microscopy (HS-AFM)⁹. The term HS-AFM typically refers to image acquisition speeds between 1 and 100 frames per second. These devices have already provided invaluable dynamic information on conformational changes¹⁰, conformational transition pathways^11,12, and rare conformational states of proteins¹³. In addition, HS-AFM operates with short cantilevers that have low hydrodynamic drag and large angular change when deflected¹⁴, as well as faster feedback operation¹⁵, and is, therefore, substantially more sensitive and less invasive than conventional AFM.

As briefly mentioned above, a disadvantage of AFM and HS-AFM data is that they are restricted to surface contouring and thus provide only pseudo-three-dimensional (pseudo-3D) information. While true 3D data, such as data from X-ray diffraction and cryo-EM imaging, associate a density value with every voxel used to describe a structure in x,y,z-space, AFM only associates a single h-value (height) with each x,y-position where the tip interacted with the sample surface. Another disadvantage of AFM and HS-AFM is that molecules have to be surface-adsorbed in order to be analyzed. This is, however, not as limiting as one may think, especially for a 2-dimensional (2D) membrane with inserted membrane proteins that naturally only expose their two hydrophilic faces to the liquid environment, and when using atomically flat mica as a sample surface. On a freshly cleaved mica surface, i.e., a clean and atomically flat layer, it has repeatedly been shown that channels and transporters are free to undergo conformational changes^{12,16,17,18,19,20,21,22,23,24} and to diffuse freely with diffusion coefficients of up to ~0.7 µm²/s⁷, especially membrane proteins that do not protrude strongly from the membrane.

Finally, AFM and HS-AFM topographic data are convoluted by the tip shape, enlarging and convoluting the protein topographies with the AFM tip radius. This limitation was overcome by the introduction of localization atomic force microscopy (LAFM), a super-resolution method that extracts topographic peak positions from individual particle observations and merges them into a topography probability map that can reach a quasi-atomic resolution on protein surfaces²⁵. The fact that only topographic peaks are merged restricts the data content to true tip-sample interactions only, while eliminating data that is prone to emerge from tip convolution. Thus, LAFM is poised to become a standard AFM data processing method to produce data that can interface with other structural biology techniques.

X-ray diffraction and cryo-EM result in electron- or nuclear-density maps, respectively. Subsequently, the known amino acid sequence chain of the protein under investigation is built into these density maps so as to best fit the density while respecting the angular constraints of peptide bonds and their interactions in 3D space²⁶. Thus, X-ray diffraction and cryo-EM data provide the basis for building atomic models of protein structures. AFM cannot provide such constraints, and will likely never be able to provide data that allows building a protein structural model because the volumetric information is missing, but it can provide unique information about the conformational transitions occurring on the surface of a protein along the timeline of a protein in action¹⁰. As we have recently shown, HS-AFM allows single-molecule structural biology, providing multiple structures that can be associated with the states of a molecule at work. Thus, the data is not only acquired in physiological buffers and at ambient temperature and pressure, but it can also provide the spatial and temporal constraints to dock, model, and order atomic models, to result in real movies of proteins in action.

However, so far, AFM data has been published as figure panel images only and remained quite anecdotal because there was no generally usable data nor file format output (such as the EMDB or PDB files) that could be exchanged with, analyzed, cross-validated, or used as input by other techniques. To achieve this goal and establish AFM as a complementary tool for dynamic structural biology, we need to extract data that is of high enough resolution to inform and be comparable to data from other methods, and that is readily readable and accessible in most common structural biology software. Here, we developed a pipeline that transforms AFM data into 3D probability density maps that are compatible with data from other structural biology methods. To achieve this, we automatized and unbiased LAFM peak detection and 3D-space merging, resulting in a 3D surface probability density map encoded in a ‘.afm’ file that is structured like the corresponding density maps from X-ray diffraction and cryo-EM and can be directly read (drag-and-drop) in structural biology software (e.g., Chimera²⁷).

Modeling atomic structures into AFM topographs, to acquire an atomistic model of what AFM images in buffer solution and at ambient temperature and pressure reveal, has been done since the very first sub-molecular resolution AFM images of OmpF in 2D-crystals²⁸. Later, cross-correlation searches of atomic structure surface representations, for the determination of the positional and rotational degrees of freedom of the molecules in AFM images of native photosynthetic membranes, were used to build atomic models of super-complexes²⁹. This approach was extended for further structural insights using 2D image comparisons between AFM images and pseudo-AFM images computed from either PDB structures, simulated structures, or 3D densities generated from the PDB structures using the Gaussian mixture model^30,31,32, and image correlation was also used as a force field to drive simulations^33,34,35,36. Though limited, these applications demonstrate that an integrative approach employing a combination of AFM, structure integration, and molecular dynamics (MD) simulations can advance biophysics and structural biology.

Here, we transform AFM data into 3D probability densities that relate the spatial free energy distribution to the physical (topographical) boundaries for the underlying protein structure and dynamics by the Boltzmann relationship and encode them in ‘.afm’ files. We use the ‘.afm’ 3D-density map files to generate force fields for MD flexible fitting (MDFF)³⁷, taking advantage of the fundamental principle of MDFF where the gradient of the density is related to force. This strategy allows us to apply a physically meaningful bias to the input structure, steering X-ray crystallography, and cryo-EM structures of a protein to obtain models that reflect the observed structural features in the AFM experiment. The data treatment pipeline and file format allow AFM data to be deposited, opened, compared, and analyzed by any researcher in common software, bringing AFM into the toolbox of structural biology.
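The Boltzmann relationship mentioned above can be illustrated with a minimal sketch. It assumes the familiar form ΔG = −k_B·T·ln(P/P_max), referenced to the most probable voxel; the function name, the 300 K default, and the choice of reference are assumptions for illustration and are not taken from the Methods.

```python
import numpy as np

KB_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def free_energy_from_probability(prob, temperature=300.0):
    """Relative free energy (kcal/mol) of each voxel from its probability.

    Assumed form: dG = -kB * T * ln(P / P_max), so the most probable
    voxel sits at 0 and voxels with zero probability map to +inf.
    """
    prob = np.asarray(prob, dtype=float)
    relative = prob / prob.max()
    with np.errstate(divide="ignore"):   # log(0) -> -inf is intended here
        return -KB_KCAL_PER_MOL_K * temperature * np.log(relative)

# Toy probabilities: the second voxel is half as likely as the first.
energies = free_energy_from_probability([0.4, 0.2, 0.0])
```

In this toy case a voxel that is half as probable as the maximum lies k_B·T·ln 2 (about 0.41 kcal/mol at 300 K) above it.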

Results

Construction and evaluation of 3D-LAFM detections

The recent development of LAFM (Methods) allowed us to break resolution limitations set by pixel sampling and tip convolution²⁵. Image expansion permitted translational and rotational fine alignment of particles and extraction of LAFM detections with great spatial precision, which were then merged into high-resolution LAFM maps²⁵. The canonical LAFM algorithm is capable of resolving fine structural details on the molecular surfaces of proteins²⁵, but direct comparisons of LAFM maps to data from other structural biology methods are impractical, primarily because LAFM maps were 2D images while other structural methods generate 3D density or coordinate files. Therefore, it is of utmost importance to transform LAFM maps (and AFM data in general) into a 3D format to propose AFM as a complementary tool for dynamic structural biology. To this end, we extracted LAFM detections through local image expansion³⁸, and subsequently aligned and allocated these detections into a 3D-volume space (Fig. 1a, Supplementary Fig. 1, Methods). Here, we used annexin V (A5), a peripheral membrane protein that assembles into membrane-bound 2D-lattices and has been widely studied using HS-AFM^39,40,41, as an example dataset (2.5 Å/pixel).

The AFM particle stack, the initial dataset from which aligned LAFM detections were extracted, is a stack of n particles containing structural data as x,y-coordinates, each informing about the height, h, of the sample surface. Thus, the particle stack represents an x-y-n matrix recording h (Fig. 1b, raw AFM frames). The local extraction and merging of LAFM detections from n particles retained the height values of each detection, at coordinates x’,y’ with sub-pixel spatial precision (Fig. 1b, aligned LAFM detections, Methods). The 3D-LAFM volume space, instead, is an i-j-k matrix (corresponding to x-y-h axes) merging the count of LAFM detections. This requires voxelization of the spatial x’, y’, and h values. Consequently, we defined a voxel size dv and allocated the aligned LAFM detections into the corresponding voxels in the 3D-LAFM volume space (Fig. 1b, 3D-LAFM detection stack, see Methods). As expected, the LAFM detections localized mostly in voxels characterizing the A5 molecular surface (Fig. 1c, dark voxels), whereas other voxels had 0 or 1 detections. No LAFM detections were excluded by a user-defined, arbitrary, prominence threshold; in contrast, all detections were merged into the 3D-space (1.1 × 10⁵ detections in the A5 example). We reason that local maxima that resulted from imaging noise would be sparsely and equally distributed in the 3D space and, therefore, would not influence the further density-weighted treatment of the data.
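The allocation of aligned detections into voxels can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes detections are given as (x', y', h) coordinates in Å with the volume origin at zero, and the function name `voxelize_detections` is hypothetical.

```python
import numpy as np

def voxelize_detections(xyh, dv, shape):
    """Count detections per voxel of an i-j-k volume.

    xyh   : (N, 3) array of detection coordinates (x', y', h) in Å
    dv    : voxel edge length in Å (assumed isotropic)
    shape : (ni, nj, nk) size of the output volume in voxels
    """
    xyh = np.asarray(xyh, dtype=float)
    idx = np.floor(xyh / dv).astype(int)                  # Å -> voxel index
    inside = np.all((idx >= 0) & (idx < np.array(shape)), axis=1)
    volume = np.zeros(shape, dtype=np.int32)
    # np.add.at accumulates correctly when several detections share a voxel
    np.add.at(volume, tuple(idx[inside].T), 1)
    return volume

detections = np.array([[0.10, 0.10, 0.10],
                       [0.20, 0.25, 0.05],   # falls in the same voxel as the first
                       [0.95, 0.40, 0.70]])
vol = voxelize_detections(detections, dv=0.3, shape=(4, 4, 4))
```

No detection is discarded by a prominence threshold in this sketch; only detections outside the volume bounds are dropped.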

To evaluate the distribution of the aligned LAFM detections in 3D space, we allocated them into two independent half-stacks (each had ~5.3 × 10⁴ detections), then masked the voxels characterizing the A5 molecular surface (Supplementary Fig. 2, Methods), and calculated the Fourier shell correlation (FSC) of the two half-stacks (Fig. 1d). Since distinct procedures were applied to determine the k-value (height dimension) and the i,j-values (lateral dimensions) of 3D-LAFM detections, we interpret the FSC signal as an objective evaluation of the spatial distribution of 3D-LAFM detection data and not as an isotropic ‘spatial resolution’ as is the case for canonical volume data⁴². Therefore, we use, instead of ‘spatial resolution’, the term ‘half-bit wavelength’, λ_hb, for further discussion of the 3D-LAFM data quality⁴³.
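For readers unfamiliar with FSC, the following sketch computes a shell-wise correlation curve between two half-volumes. It assumes cubic volumes and integer-radius shells, and it omits the masking step and the half-bit threshold used to read off λ_hb; the function name and the toy data are illustrative only.

```python
import numpy as np

def fourier_shell_correlation(vol_a, vol_b):
    """FSC per integer-radius shell for two equal-shaped cubic volumes.

    Shell s corresponds to a spatial frequency of s / (n * dv), i.e. a
    wavelength of n * dv / s, for n voxels per edge of size dv.
    """
    n = vol_a.shape[0]
    fa = np.fft.fftshift(np.fft.fftn(vol_a))
    fb = np.fft.fftshift(np.fft.fftn(vol_b))
    grid = np.indices(vol_a.shape) - n // 2               # centered frequency grid
    shell = np.rint(np.sqrt((grid ** 2).sum(axis=0))).astype(int).ravel()
    n_shells = n // 2
    keep = shell < n_shells                               # ignore the cube corners

    def shell_sum(values):
        return np.bincount(shell[keep], weights=values.ravel()[keep],
                           minlength=n_shells)

    cross = shell_sum((fa * np.conj(fb)).real)
    power_a = shell_sum(np.abs(fa) ** 2)
    power_b = shell_sum(np.abs(fb) ** 2)
    return cross / np.sqrt(power_a * power_b)

rng = np.random.default_rng(0)
signal = rng.random((16, 16, 16))
half_a = signal + 0.1 * rng.standard_normal(signal.shape)
half_b = signal + 0.1 * rng.standard_normal(signal.shape)
fsc_self = fourier_shell_correlation(signal, signal)
fsc_ab = fourier_shell_correlation(half_a, half_b)
```

A volume correlated with itself gives 1 in every shell, while two noisy half-volumes fall below 1 wherever noise contributes to a shell.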

The two 3D-LAFM detection half-stacks had a λ_hb of ~1.1 Å at a voxel size of dv = 0.3 Å/voxel (Fig. 1c, d). As expected, the data quality increased with finer voxelization, i.e., decreasing voxel size dv, of the aligned LAFM detections in 3D space. λ_hb saturated at ~0.3 Å/voxel (Fig. 1e, Supplementary Fig. 3a), before being limited by the local expansion LAFM detection extraction (0.167 Å/pixel, Methods). Therefore, a 15× bicubic expansion coefficient (2.5 Å/pixel to 0.167 Å/pixel) was reasonable for the local expansion LAFM detection extraction for the A5 example data. In addition, the 3D-LAFM detection stack quality also depended on the total count of detections, i.e., the number of raw AFM data particles n that were integrated into the process: λ_hb improved as more detections were pooled and then saturated at ~2 × 10⁴ detections (~6 × 10⁴ considering the three-fold symmetry of the A5 trimer) (Supplementary Fig. 3b).

Transformation of 3D-LAFM detections to density map

In localization-based super-resolution techniques, including LAFM, a point spread function or density function informs the likelihood of an observable, e.g., a fluorophore in fluorescence microscopy^44,45 or a peak topographic feature in AFM²⁵, to be detected at its location. To transform these detections (Fig. 2a) into a spatial density distribution (Fig. 2b), we must apply a probability function that characterizes the localization precision of the tip-sample interaction coordinates. In the canonical LAFM algorithm, a 2D Gaussian density function, usually with σ = 1.4 Å, was assigned to each detection, accounting for the solvent-accessible surface of an atom from which the tip-sample interaction on the protein surface originated²⁵.

**Fig. 2: 3D-LAFM density map of annexin-V.**

To transform detection coordinates into density in the 3D-LAFM pipeline, we applied a 3D Gaussian density function to each aligned LAFM detection in the 3D-LAFM detection stack, using the computationally derived σ value equivalent to λ_hb determined from the FSC analysis of the 3D-LAFM detection half-stacks. This step transforms a 3D-LAFM detection stack (Fig. 2a, Supplementary Movie 1) into a 3D-LAFM density map (Fig. 2b, Supplementary Movie 2), and thus concludes the objective 3D-LAFM pipeline from raw AFM images to 3D-density data (Fig. 1b to Fig. 2a, b, follow the yellow dots showing the trajectories of local maxima with low, middle, and high h-values in the first particle in the raw AFM frames). Slices through the A5 3D-density map highlight the high-resolution features that are resolved on the protein surface at the various levels of protrusion height (Fig. 2b, from low (right) to high (left) topographical features).
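The step from a detection count volume to a density map can be sketched as a separable Gaussian convolution. This is an illustration under assumptions: the kernel is truncated at 4σ, σ is given in Å and converted to voxels through dv, and the volume must be larger than the kernel along every axis; the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel_1d(sigma_vox, truncate=4.0):
    """Normalized 1D Gaussian kernel, truncated at `truncate` sigmas."""
    half = int(np.ceil(truncate * sigma_vox))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma_vox) ** 2)
    return kernel / kernel.sum()

def detections_to_density(counts, sigma_angstrom, dv):
    """Blur a detection count volume with an isotropic 3D Gaussian.

    A 3D Gaussian is separable, so one 1D convolution per axis suffices.
    """
    kernel = gaussian_kernel_1d(sigma_angstrom / dv)
    density = counts.astype(float)
    for axis in range(density.ndim):
        density = np.apply_along_axis(np.convolve, axis, density,
                                      kernel, mode="same")
    return density

counts = np.zeros((41, 41, 41))
counts[20, 20, 20] = 1                      # a single detection in the center
density = detections_to_density(counts, sigma_angstrom=1.1, dv=0.3)
```

The example uses σ = 1.1 Å and dv = 0.3 Å/voxel, the values quoted for the A5 detection half-stacks; a single detection becomes a unit-mass Gaussian blob centered on its voxel.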

The A5 monomer consists of four domains, aka annexin-repeats (Fig. 2c, I, II, III, IV), of which repeats I, III, and IV protrude further and are well exposed to AFM contouring (once A5 is trimerized and membrane-bound, Fig. 2d)⁴⁶. Substantial height differences were observed in the 3D-LAFM density map of the A5-trimers where repeats I, III, and IV are located, while repeat II gave only a minor topography signal (Fig. 2b–e, arrowheads I, II, III, IV), reported by a counter-clockwise topographical height decrease, i.e., the height of repeat III > IV > I in HS-AFM data. Precise height differences, Δh, between the repeats could be measured from the i-j planes in the 3D-LAFM density map (Fig. 2b, blue). Considering the voxels with the highest density values, i.e., the most likely 3D location of the repeat of interest, we found that repeat III topped IV by a Δh of ~2 Å while repeat IV topped I by a Δh of ~1 Å. Similarly, the relative distance, Δd, between the repeat protrusions can be measured with high confidence in the j-k (red, for Δdy measurement) and i-k planes (for Δdx measurement), such that repeats III and IV had a Δd of ~26–27 Å and repeats IV and I had Δd of ~24–25 Å. These measurements provide insightful information about the general structural features of the molecules in the HS-AFM experiments under close-to-physiological conditions and showcase the potential of 3D-LAFM to analyze Angstrom-scale conformational changes in enzymatically active proteins.
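A height difference of this kind can be read from a density volume by locating the highest-density voxel within each lateral region. The sketch below is an assumption-laden illustration: the lateral masks that delimit each repeat, the function name, and the toy volume are all hypothetical.

```python
import numpy as np

def peak_height(density, ij_mask, dv):
    """Height (Å) of the highest-density voxel inside a lateral (i, j) region."""
    masked = np.where(ij_mask[:, :, None], density, -np.inf)
    _, _, k = np.unravel_index(np.argmax(masked), masked.shape)
    return k * dv

# Toy i-j-k volume with two protrusions at different heights.
density = np.zeros((10, 10, 20))
density[2, 2, 12] = 5.0                     # stands in for one repeat
density[7, 7, 5] = 3.0                      # stands in for another repeat
region_a = np.zeros((10, 10), dtype=bool)
region_a[:5, :] = True
region_b = ~region_a

dv = 0.3                                    # Å per voxel
delta_h = peak_height(density, region_a, dv) - peak_height(density, region_b, dv)
```

With dv = 0.3 Å/voxel, a difference of seven voxels along k corresponds to Δh = 2.1 Å.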

To make 3D-LAFM density map data interpretable and comparable to results from other structural biology techniques, we compile the 3D-LAFM density map into a ‘.afm’ file which has a file structure similar to the MRC2014 file format that is commonly used for cryo-EM density maps⁴⁷ (Table 1, metadata code for 3D-LAFM density map: AFM1, Methods). Thus, the ‘.afm’ file format is fully compatible with general structural biology software and can be opened using drag-and-drop in e.g. Chimera²⁷ (Fig. 2e). As a consequence, the 3D-LAFM density map is generally compatible with the built-in tools in Chimera for in-depth analysis of the AFM data, which will be discussed later. For 2D visualization of the 3D probability density (for print panels and convenience for the human eye), we developed a pipeline to generate a high-density surface presentation of the 3D-LAFM density map (Fig. 2f, Methods), in which height is assigned an RGB color according to the LAFM false-color scale²⁵. This is useful for the generation of figure panels to communicate the characteristic topography information of AFM. The topographical features of the membrane-bound A5-trimer in the AFM experiment under close-to-physiological conditions were comparable to the A5 X-ray structure solved from proteins in less-physiological 3D-crystal lattices⁴⁶ (Fig. 2g, compare X-ray structure (left) and 3D-LAFM (right two panels)): A5 has overall a concave shape in both methods (Fig. 2c, bottom). In addition, the fine structural features along the backbone on the protein surface, i.e., the surface protruding residues, were resolved in the 3D-LAFM density (Fig. 2g, compare X-ray structure (left) and 3D-LAFM (right two panels)). Finally, we constructed 3D-LAFM density maps from half datasets for FSC analysis, which revealed a surface λ_hb of ~1.4 Å for the A5-trimer 3D-LAFM density map.

Table 1 Standard ‘.afm’ file header
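Since the body of Table 1 is not reproduced in this text, the following is only a sketch of a generic MRC2014-style 1024-byte header, not the '.afm' specification. The field layout follows the public MRC2014 description; placing the 'AFM1' metadata code in the EXTTYP field is an assumption made for illustration, as are the function name and the label text.

```python
import struct
import numpy as np

# Words 1-24, EXTRA (with EXTTYP and NVERSION), ORIGIN, MAP, MACHST, RMS,
# NLABL and 10 x 80-byte labels: 1024 bytes in total, little-endian.
HEADER_FMT = "<10i6f3i3f2i8x4si84x3f4s4sfi800s"

def pack_mrc_style_header(density_kji, voxel_size, exttyp=b"AFM1"):
    """Pack a header for a float32 volume stored as (section, row, column)."""
    nz, ny, nx = density_kji.shape
    label = b"3D-LAFM density map (illustrative label)".ljust(80)
    return struct.pack(
        HEADER_FMT,
        nx, ny, nz, 2,                       # NX, NY, NZ, MODE 2 = float32
        0, 0, 0, nx, ny, nz,                 # N*START, MX, MY, MZ
        nx * voxel_size, ny * voxel_size, nz * voxel_size,   # CELLA (Å)
        90.0, 90.0, 90.0,                    # CELLB (degrees)
        1, 2, 3,                             # MAPC, MAPR, MAPS
        float(density_kji.min()), float(density_kji.max()),
        float(density_kji.mean()),           # DMIN, DMAX, DMEAN
        1, 0,                                # ISPG, NSYMBT
        exttyp, 20140,                       # EXTTYP (assumed slot), NVERSION
        0.0, 0.0, 0.0,                       # ORIGIN
        b"MAP ", b"\x44\x44\x00\x00",        # MAP, MACHST (little-endian)
        float(density_kji.std()), 1,         # RMS, NLABL
        label,
    )

density_kji = np.zeros((4, 5, 6), dtype=np.float32)   # (nz, ny, nx)
header = pack_mrc_style_header(density_kji, voxel_size=0.3)
payload = header + density_kji.astype("<f4").tobytes()
```

A density indexed as [i, j, k] would need to be transposed to (k, j, i) before packing so that x runs fastest, as MAPC = 1 implies.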

3D-LAFM density as force field for flexible fitting

Molecular dynamics flexible fitting (MDFF) has been widely used to fit atomic structures into density maps^{37,48,49,50,51}. MDFF allows all-atom MD simulations of a known structure, i.e., atomic coordinates, under an external force field proportional to the gradient of a density map³⁷, e.g., a cryo-EM density map, in addition to classic MD force fields characterizing the physical laws, U_MD, e.g., CHARMM force fields⁵², and the secondary structure restraints, U_SS. Although only providing surface information, the high-resolution 3D-LAFM density maps (λ_hb ~ 1–2 Å) can serve as an MDFF force field, U_AFM, to steer MD simulations of an atomic structure solved by another method, towards the most probable conformation that matches the topographical features obtained in AFM experiments under close-to-physiological conditions (Fig. 3a, Methods).
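The link between density gradient and force can be made concrete with a small sketch. It assumes the thresholded linear potential commonly used in the MDFF literature, V = ξ·[1 − (Φ − Φ_thr)/(Φ_max − Φ_thr)] above the threshold and ξ below it; this section does not spell out the functional form used for U_AFM, so the form, the scaling factor ξ, and the function names are assumptions.

```python
import numpy as np

def mdff_potential(density, threshold, xi=1.0):
    """Per-voxel potential that is lowest where the density is highest."""
    phi_max = density.max()
    scaled = (density - threshold) / (phi_max - threshold)
    return np.where(density >= threshold, xi * (1.0 - scaled), xi)

def grid_forces(potential, dv):
    """Force field F = -grad(V) on the voxel grid, with spacing dv in Å."""
    grad_i, grad_j, grad_k = np.gradient(potential, dv)
    return -np.stack([grad_i, grad_j, grad_k], axis=-1)

# Toy density that rises linearly along k (the height axis).
density = np.broadcast_to(np.arange(10.0), (6, 6, 10)).copy()
potential = mdff_potential(density, threshold=0.0)
forces = grid_forces(potential, dv=0.3)
```

In this toy volume the force has no lateral component and points along +k, i.e., toward higher density, which is the sense in which a density map steers atoms during flexible fitting.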

**Fig. 3: 3D-LAFM steered molecular dynamics flexible fitting (MDFF) of annexin-V.**

The AFM force field, U_AFM, contains both an active fraction where the force is proportional to the gradient of the 3D-LAFM density map (Fig. 3a), covering the molecular surface as an ~20 Å thick density cushion in the A5 example (Fig. 3b, t = 0 ns, gray), as well as an inactive fraction in the space covering the rest of the molecule below the 3D-LAFM density map filled with a background-matching density value (Methods). Using this force field, we ran a ~60 ns MDFF simulation, consisting of three ~20 ns simulation cycles at 300 K, interspersed by an energy minimization step (Fig. 3b–e, Supplementary Movie 3, Methods). To improve convergence of the MDFF simulations and avoid overfitting (Methods), we applied symmetry restraint in all three simulation cycles so that the subunits evolve towards the same conformation (as the 3D-LAFM map is also n-fold symmetric), divided the structured regions into individual domains and applied domain restraint in the first two simulation cycles, and, in the final MDFF cycle, allowed the structure to freely explore its local energy landscape free of the domain restraint to adopt a low energy conformation.

We observed that the rigid body of the A5 structure moved rapidly into the 3D-LAFM density map within the first ~0.5 ns of the simulation (Fig. 3b, compare t = 0 ns and t = 0.5 ns), and reached an equilibrium conformation after ~10 ns (Fig. 3b, t = 10 ns). This was reported by the internal energy (E) that relaxed to ~1.55 × 10³ kcal/mol (Fig. 3c), the root-mean-square deviation (rmsd) from the initial structure that plateaued at ~9 Å (Fig. 3d), and the normalized correlation of the structure to the density map that flattened at ~0.58 (Fig. 3e), whereas the initial roughly fitted model had a correlation value of ~0.3. Large annexin-repeat movements were completed in the first MDFF cycle (Fig. 3b, t = 20 ns), while the additional MDFF cycles involved local structural adjustment (Fig. 3b), further lowering the internal energy to ~1.48 × 10³ kcal/mol, and improving the correlation coefficient to ~0.61 (Fig. 3c, e, t ~ 40 ns and t ~ 60 ns). It is worth noting that the AFM tip interaction with the sample surface may not always resolve flexible loops on the molecular surface, which is why we focused our analysis of 3D-LAFM MDFF results predominantly on the structured regions.
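Two of the convergence metrics above, rmsd and normalized correlation, can be sketched in a few lines. The sketch assumes matched atom ordering without prior superposition for the rmsd, and a mean-subtracted (Pearson-type) correlation between two volumes on the same grid; the exact definitions used by the authors are given in their Methods and may differ.

```python
import numpy as np

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two (N, 3) coordinate sets."""
    diff = np.asarray(coords_a, dtype=float) - np.asarray(coords_b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def normalized_correlation(map_a, map_b):
    """Mean-subtracted normalized correlation of two same-shaped volumes."""
    a = map_a - map_a.mean()
    b = map_b - map_b.mean()
    return float((a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum()))

rng = np.random.default_rng(0)
start = rng.random((50, 3)) * 40.0
moved = start + np.array([3.0, 0.0, 4.0])   # rigid shift of 5 Å
shift_rmsd = rmsd(start, moved)

target_map = rng.random((8, 8, 8))
cc_same = normalized_correlation(target_map, 2.0 * target_map + 1.0)
```

A rigid 5 Å shift gives an rmsd of 5 Å, and a map compared with a rescaled copy of itself gives a correlation of 1, which makes the two metrics easy to sanity-check before applying them to trajectories.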

Compared to the initial crystal structure (Fig. 3f, purple), the final model from 3D-LAFM MDFF (Fig. 3f, scarlet) preserved the overall tertiary annexin-repeats structure, while rigid-body movements of repeats I and III were evident. To quantify the movements, we tracked the relative annexin-repeat height differences Δh over the simulation trajectory (Fig. 3g). We restricted this analysis to all the atoms in regions with secondary structure in the annexin-repeats, because the 3D-LAFM density that served as force field might comprise tip-sample interaction deformations in the flexible loops. Using repeat II (closest to the membrane) as a reference point, we found that repeats I, III, and IV all moved closer to the membrane during MDFF, to a final Δh_(III–II) ~ 4.4 Å, Δh_(IV–II) ~ 2.5 Å, and Δh_(I–II) ~ 2.5 Å, respectively (Fig. 3g, left). Alternatively, since the 3D-LAFM density map covered ~20 Å of the A5 liquid-facing surface, we estimated Δh using all surface atoms within this range (Fig. 3g, middle), and found a Δh_(III–II) ~ 4.9 Å, Δh_(IV–II) ~ 2.4 Å, and Δh_(I–II) ~ 1.4 Å, respectively.

Excitingly, the MDFF models captured essential structural features in the 3D-LAFM density map, where repeats III and IV shared a Δh of ~2.8 Å and repeats I and IV shared a Δh of ~0.8 Å (Fig. 3g, right, Fig. 2e). In contrast, a Δh of ~5.1 Å between repeats III and IV as well as a Δh of ~1.7 Å between repeats I and IV were estimated from the A5 crystal structure (Fig. 3g, right). Therefore, using the 3D-LAFM MDFF strategy, we obtained the most probable conformation underlying the AFM observation under close-to-physiological conditions. As detected by the Δh values, this A5 conformation adopts an overall flatter annexin-repeat arrangement that is less concave than the X-ray structure. We reason that, unlike the X-ray crystal where A5 is not membrane-bound, and thus no force is acting on the structure, the membrane to which A5 is attached in the HS-AFM experiments may exert a flattening force on the A5 molecules. Therefore a slightly flatter A5 structure, as compared to the X-ray structure, when A5 is membrane surface-associated, appears as a probable structure refinement under native conditions.

3D-LAFM MDFF conformational changes in a transporter

HS-AFM has recently allowed single-molecule structural biology, in which the conformational changes of a single Glt_Ph membrane transporter molecule were tracked under close-to-physiological conditions with high spatio-temporal resolution¹⁰. Glt_Ph comprises three functionally independent transport domains associated with a central scaffold- or trimerization domain (Fig. 4a, top)⁵³. Related to the release of the transport substrate on the intracellular side, large transport domain rearrangements between the inward-facing state (IFS) closed, IFS_closed, and open, IFS_open, conformations were reported in structural studies using X-ray crystallography^54,55,56, cryo-EM⁵⁷, and HS-AFM¹⁰ where IFS_open adopts a conformation with a more tilted (thus opened) transport domain than IFS_closed (Fig. 4a, bottom). In contrast, the trimerization domain is essentially static in the IFS_open–IFS_closed transition. In the HS-AFM single-molecule analysis, ~1000 protomer observations (POs) were obtained from an HS-AFM movie of an individual membrane-reconstituted Glt_Ph from the cytoplasmic side in apo condition, and subsequently sorted using principal component analysis (PCA) into outward-facing, OFS, IFS_open, and IFS_closed conformations¹⁰. Thus, calculating LAFM maps of these states and combining them with conformation-time traces of individual transport domains allowed HS-AFM single-molecule structural biology reporting about the structural interconversions of a single molecule over time¹⁰.

**Fig. 4: 3D-LAFM steered MDFF structural models of Glt_Ph IFS states.**

Here, taking advantage of the PO-sorting results, we constructed 3D-LAFM density maps of IFS_closed and IFS_open (Fig. 4b, Supplementary Fig. 4, Supplementary Movies 4 and 5) and used them to generate force fields (U_AFM–IFSc and U_AFM–IFSo) to steer 3D-LAFM MDFF (Fig. 4c–f, Supplementary Fig. 5). To obtain the most probable atomic models of the states determined in the HS-AFM experiment, we used PDB structures resolved in apo condition as the initial MDFF input, namely an IFS_open cryo-EM structure (PDB_IFSo–apo–EM, PDB 6X12) and an IFS_closed crystal structure (PDB_IFSc–apo–Xray, PDB 4P19). To account for the different methods used in the structure determination, an additional IFS_closed cryo-EM structure was used (PDB_IFSc–trsp–EM, PDB 6X15), though it was acquired in the presence of Na⁺ and Asp (transport condition), unlike the AFM experiments. To assess the transport domain movement, we defined three collective variables (CVs) to characterize the structure and measure the changes of the transport domain in the MDFF trajectories (as well as in the PDB structures): (1) the angle between the trimerization domain and the z-axis (angle φ); (2) the angle between the transport domain and the z-axis (angle θ); and (3) the height difference between the transport and trimerization domains from the cytoplasmic side (Δz) (Fig. 4c).
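The three collective variables can be illustrated with a short sketch. How the domain axes and heights are actually defined is specified in the Methods; here the domain axis is assumed to be the first principal axis of the domain's atom coordinates, and Δz is assumed to be the difference of mean z-coordinates. Both choices, the sign convention, and the function names are hypothetical.

```python
import numpy as np

def principal_axis(coords):
    """First principal axis of an (N, 3) coordinate set, oriented toward +z."""
    centered = coords - coords.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]
    return axis if axis[2] >= 0 else -axis

def angle_to_z(coords):
    """Angle (degrees) between a domain's principal axis and the z-axis."""
    cos_angle = np.clip(principal_axis(coords)[2], -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))

def delta_z(transport, trimerization):
    """Height difference (Å): mean z of transport minus trimerization domain."""
    return float(transport[:, 2].mean() - trimerization[:, 2].mean())

# Toy domains: rods of points along a tilted and an upright direction.
t = np.linspace(-5.0, 5.0, 11)[:, None]
tilt = np.radians(30.0)
transport = t * np.array([np.sin(tilt), 0.0, np.cos(tilt)])
trimerization = t * np.array([0.0, 0.0, 1.0]) + np.array([0.0, 0.0, 2.0])

theta = angle_to_z(transport)          # transport domain vs z-axis
phi = angle_to_z(trimerization)        # trimerization domain vs z-axis
dz = delta_z(transport, trimerization)
```

Evaluating these three numbers for every protomer and frame yields the per-structure CV rows used in the later analysis.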

As a control, we started the MDFF setup with the U_AFM-matching PDB structures (the ‘cis-fitting’ strategy), where U_AFM–IFSc was used as force field to steer PDB_IFSc–apo–Xray and PDB_IFSc–trsp–EM in MDFF (Supplementary Fig. 5a, b), and U_AFM–IFSo was used as force field to steer PDB_IFSo–apo–EM in MDFF (Supplementary Fig. 5c). After ~60 ns simulations, the ‘cis-fitting’ strategy led to an IFS_open structural model from U_AFM–IFSo with CV measurements matching PDB_IFSo–apo–EM. Interestingly, U_AFM–IFSc generated an IFS_closed structural model that was conformationally more akin to PDB_IFSc–trsp–EM than PDB_IFSc–apo–Xray, indicating that the HS-AFM IFS_closed 3D-LAFM data was conformationally more similar to the closed conformation resolved by cryo-EM, where Glt_Ph was reconstituted into membrane nanodiscs in transport condition, than to the X-ray crystallography structure from Glt_Ph in 3D-crystals in the absence of membrane yet in apo condition.

Although the ‘cis-fitting’ MDFF strategy allowed us to obtain structural models of both IFS_open and IFS_closed Glt_Ph that incorporated the AFM structural features under close-to-physiological conditions, the approach requires prior knowledge about which initial PDB structure best matches the AFM force fields. Therefore, as the essential MDFF simulation, we engaged in the ‘trans-fitting’ strategy, intentionally starting with the ‘wrong’ PDB structures (Fig. 4d–i): U_AFM–IFSo was used as a force field to steer PDB_IFSc–apo–Xray (Fig. 4d, g, Supplementary Movie 6) and PDB_IFSc–trsp–EM (Fig. 4e, h, Supplementary Movie 7) in MDFF, and U_AFM–IFSc was used as a force field to steer PDB_IFSo–apo–EM (Fig. 4f, i, Supplementary Movie 8). Excitingly, the ‘trans-fitting’ strategy yielded structural models similar to those obtained using the ‘cis-fitting’ strategy, as evaluated from the CV measurements in the final ~20 ns of the simulation trajectories, meaning that the closed structures opened and the open structure closed their intracellular gate (compare Fig. 4g, h to Supplementary Fig. 5a, b, and compare Fig. 4i to Supplementary Fig. 5c). An expected transport domain opening, with a θ increase of ~12° and ~6° and a Δz reduction of ~7 Å and ~5 Å, was observed at t ~ 5 ns in the MDFF trajectories using U_AFM–IFSo as force field, indicative of a transition from PDB_IFSc–apo–Xray (Fig. 4d) and PDB_IFSc–trsp–EM (Fig. 4e) towards PDB_IFSo–apo–EM. Similarly, an expected transport domain closing, with a θ reduction of ~8° and a Δz increase of ~4 Å, was evident in the MDFF trajectories using U_AFM–IFSc as force field (Fig. 4f). Besides, the rmsd measurements of the ‘trans-fitting’ trajectories (Fig. 4g–i, right) further corroborated that structural models analogous to PDB_IFSo–apo–EM and PDB_IFSc–trsp–EM were obtained from MDFFs with U_AFM–IFSo and U_AFM–IFSc as force fields, respectively. Hence, the ‘cis-fitting’ and ‘trans-fitting’ MDFF simulation runs demonstrated the feasibility of using 3D-LAFM density maps to generate force fields to steer a structure into another meaningful conformation.

Structural analysis of 3D-LAFM MDFF and experimental structures

The converging outcomes of the ‘cis-fitting’ and ‘trans-fitting’ strategies encouraged us to further explore the power of 3D-LAFM MDFF using an objective ‘blind-fitting’ strategy: Indeed, in addition to IFS_closed and IFS_open, we discovered a kinetically locked state in our recent HS-AFM single-molecule study, which we termed IFS_open1 (since its 3D-LAFM map resembled more IFS_open than IFS_closed) that has no structural correspondent in cryo-EM yet (Supplementary Movie 9). Thus, in the ‘blind-fitting’ approach, we used all three input structures (PDB_IFSo_–_apo_–_EM, PDB_IFSc_–_apo_–_Xray, and PDB_IFSc_–_trsp_–_EM) and steered them in MDFF simulations using U_AFM_–_IFSo, U_AFM_–_IFSc, and the uncharted U_AFM_–_IFSo1 that is so far only described by AFM (Methods).

For a comprehensive analysis of the 3D-LAFM MDFF results, we inspected the models from the last 20 ns of all MD trajectories (MD_IFSo_–_apo_–_AFM, MD_IFSc_–_apo_–_AFM, MD_IFSo1_–_apo_–_AFM) starting from any of the experimental structures (PDB_IFSo_–_apo_–_EM, PDB_IFSc_–_apo_–_Xray, and PDB_IFSc_–_trsp_–_EM), as well as these original experimental PDB structures (Fig. 5, Supplementary Fig. 6, n = 1116 protomers, Methods). We then created a CV (Fig. 4c) dataset from all structures, a 1116 × 3 matrix (Methods), and performed PCA on this matrix (Fig. 5a, Supplementary Fig. 6a). As expected, the MDFF results clustered in the pc1-pc2 space, where MD_IFSo_–_apo_–_AFM was close to PDB_IFSo_–_apo_–_EM and MD_IFSc_–_apo_–_AFM was close to PDB_IFSc_–_trsp_–_EM (and considerably farther from PDB_IFSc_–_apo_–_Xray) (Fig. 5a, left). Interestingly, MD_IFSo1_–_apo_–_AFM structures had a slight overlap with MD_IFSo_–_apo_–_AFM structures but formed a clear cluster at a nearby location. To evaluate the structural similarity, we defined a similarity score (ss) that is inversely related to the pairwise distances between all structures from each species in the pc1-pc2 space (Fig. 5a, right, Methods). MD_IFSo1_–_apo_–_AFM had an ss of ~0.55, ~0.38, and ~0.58 to MD_IFSo_–_apo_–_AFM, MD_IFSc_–_apo_–_AFM, and PDB_IFSo_–_apo_–_EM, respectively, reflecting an overall open transport domain arrangement but in a clearly distinctive conformation for the kinetically inactive IFS_open1 in the transport cycle of Glt_Ph. Overall, these 3D-LAFM MDFF results show that the 3D-LAFM density maps can be used as a force field for MDFF that steers any structure into a coherent conformational cluster, and that the method can produce a model of a so far structurally unknown state, from any starting conformation.

**Fig. 5: Protomer structural analysis of Glt_Ph 3D-LAFM MDFF models and PDB structures.**

Complementary to the CVs depicting local structural features, we built an autoencoder (AE) neural network⁵⁸ for an unsupervised all-atom analysis (Fig. 5b, Supplementary Fig. 6b) to further analyze the structural changes during the Glt_Ph IFS_open–IFS_closed transition. AE networks are trained to project high-dimension data onto a low-dimension latent space and then use the latent space information to re-generate the high-dimension data with minimal errors. Owing to their efficacy in learning essential features from high-dimension data, AE networks are widely used to analyze MD trajectories^{59,60,61,62,63,64}, using all-atom coordinates as a training dataset to build 2D latent spaces that reflect the most distinctive structural features inherent to the dataset structures. To this end, we trained an AE using the IFS Glt_Ph derived training dataset (1116 × 1170 matrix, for 1116 protomers of 390 Cα atoms, Methods). Remarkably, the MDFF results were distributed in the trained 2D latent space in a pattern comparable to the PCA of CVs (compare Fig. 5a, left with Fig. 5b, left), where MD_IFSo_–_apo_–_AFM and MD_IFSo1_–_apo_–_AFM clustered near PDB_IFSo_–_apo_–_EM, while MD_IFSc_–_apo_–_AFM clustered near PDB_IFSc_–_apo_–_Xray and PDB_IFSc_–_trsp_–_EM. Hence, the AE-based all-atom approach provided an unsupervised and complementary corroboration of the PCA of CVs.

The presented analysis methods allowed direct and quantitative comparisons between structures obtained from AFM experiments (3D-LAFM MDFF structures) as well as between AFM and other structural biology methods. Thus, 3D-LAFM combined with MDFF offers an avenue to refine conformational states, and to describe transition states and conformations observed in the membrane, at ambient temperature and pressure, as well as in a physiological buffer.

Discussion

AFM has long struggled to be truly complementary to, and to interface with, other structural biology methods, e.g., X-ray crystallography, cryo-EM, and NMR, because AFM did not produce data that could be visualized, analyzed, interpreted, and compared on commonly used structural biology platforms alongside data from these other methods. Here, we built on LAFM and introduced a pipeline to convert AFM single-particle data into 3D density files, i.e., 3D-LAFM density maps, which are encoded in a ‘.afm’ file that has a structure reminiscent of the MRC2014 file format usually used for cryo-EM data (Table 1). The ‘.afm’ file is readable by commonly used structural biology software, e.g., Chimera, allowing direct visualization and interpretation of 3D-LAFM density maps concomitantly with other structural data. Besides, structural features of the protein under investigation can be measured directly in the 3D-LAFM density maps, as showcased here on A5 as well as on IFS Glt_Ph¹⁰, and compared with data from other structural methods.

Moreover, we used the 3D-LAFM density maps to generate external force fields, U_AFM, for MDFF simulations to steer cryo-EM and crystal structures toward the conformational states observed by HS-AFM. This application makes AFM data complementary to other structural methods and potentially enhances our current structural understanding of proteins. Although AFM data does not allow the construction of complete 3D densities, and as a consequence does not allow the structure determination per se, the 3D-LAFM density maps are structurally informative: Indeed, the use of 3D-LAFM density maps to drive MDFF allowed us to obtain models that reflect the structural features observed in the AFM experiments under close-to-physiological conditions, which complements, yet obligatorily builds on, the structures resolved in less physiological conditions by the other methods. Using ‘cis’ and ‘trans’ conformational 3D-LAFM MDFF, we could quantitatively assess the robustness of the approach. Indeed, the IFS Glt_Ph 3D-LAFM MDFF simulations worked successfully: both the ‘cis-fitting’ strategy, where PDB structures in conformations matching the 3D-LAFM density maps were used as the MDFF input models, and the ‘trans-fitting’ strategy, where PDB structures in conformations different from the 3D-LAFM density maps were intentionally used as the MDFF input models, led to indistinguishable structural models at the end of the simulations. Therefore, a ‘blind-fitting’ strategy could be used for 3D-LAFM MDFF for an unbiased search of structural models that match the 3D-LAFM density maps, enabling the potential discovery of so-far unknown conformations.

The structural models obtained from 3D-LAFM MDFF could be directly and quantitatively compared, both with each other and with experimental structures, using PCA of CVs as well as AE neural networks of all-atom coordinates. Thus, the 3D-LAFM pipeline (Supplementary Fig. 7), from raw AFM frames, to 3D-LAFM density maps, and finally to 3D-LAFM MDFF structural models, has the potential to incorporate AFM into the set of methods routinely employed by the structural biology community. In addition to MDFF, alternative simulation methods such as coarse-grained simulations could potentially be employed, for example, in cases of limited-resolution 3D-LAFM density fitting or for modeling large protein complexes to avoid overfitting or reduce computational demand. AFM experiments and MD simulations are becoming an emergent combination in biophysical studies, allowing the comparison of measurable quantities such as inter-molecular interactions and distances^39,65,66.

Furthermore, with an appropriate single-particle sorting strategy such as PCA, as we introduced in the single-molecule structural analysis of Glt_Ph¹⁰, multiple structural models could be obtained from AFM and HS-AFM experiments to study the conformational dynamics of a protein at work (Supplementary Fig. 8, Supplementary Note 1), potentially resulting in atomic scale movies of individual proteins in action. In summary, we take a structural biology approach by (1) analyzing AFM data (picking AFM single-particles, classifying the particles in the case of Glt_Ph, and aligning the particles), subsequently (2) building 3D densities from the AFM data, and finally (3) using these 3D densities as physically meaningful force fields to drive PDB structures into unknown conformations.

These datasets could be deposited into suitable repositories in structural biology compatible formats (3D-LAFM densities in ‘.afm’ format, and 3D-LAFM MDFF models in PDB format), for cross-methodology comparison and structural analysis. In analogy to the EMDB where EM densities are deposited, we have initiated the establishment of an AFM Data Bank (AFMDB) with deposits of current ‘.afm’ files (Methods). Besides single-molecule structural data, other AFM data, e.g., the assembly configuration of tens of aquaporins in a membrane⁷ (Supplementary Fig. 9), or the surface topography and the mechanical properties of a cell (Supplementary Fig. 10), can be encoded in ‘.afm’ files following the ‘AFMx’ code format, facilitating data interpretation, comparison, and cross-validation between AFM and other relevant methods (Supplementary Note 2). We anticipate that the presented methods will integrate AFM into the toolbox of structural biology and advance our understanding of protein structure and dynamics at a single-molecule level under close-to-physiological conditions.

Methods

HS-AFM sample preparation

The A5 used in the study was purchased from Sigma-Aldrich (Annexin-V, 33kD from human placenta). Glt_Ph was purified as previously described^10,18,57. All lipids (1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC), 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE) and 1,2-dioleoyl-sn-glycero-3-phospho-L-serine (DOPS)) were purchased from Avanti Polar Lipids. To prepare small unilamellar vesicles (SUVs), for both A5 and Glt_Ph experiments, lipids were first dissolved in chloroform at a ratio of DOPC:DOPS = 3:2 (w:w, for A5) or DOPC:DOPE:DOPS = 8:1:1 (w:w:w, for Glt_Ph), and then dried by a nitrogen flow and kept in a vacuum chamber overnight for further drying. The dried lipid mixture was resuspended into buffer solutions (for A5: 20 mM HEPES at pH 7.4, 150 mM NaCl, and 2 mM CaCl₂; for Glt_Ph: 10 mM Tris-HCl at pH 7.6, 100 mM NaCl, and 10 mM MgCl₂), and subsequently sonicated to obtain SUVs.

For A5 experiments^39,41, 1.5 μL of diluted SUV (DOPC:DOPS = 3:2, w:w) solution at a lipid concentration of 0.1 mg ml⁻¹ was deposited onto freshly cleaved mica for ~1 min to form supported lipid bilayers, and then rinsed with A5 imaging buffer (20 mM HEPES at pH 7.4, 150 mM NaCl, and 2 mM CaCl₂). During HS-AFM imaging, A5 molecules were added to the HS-AFM fluid chamber and growth of 2D-crystals on preformed membranes could later be observed.

For Glt_Ph reconstitution¹⁰, purified Glt_Ph was diluted to a final protein concentration of 0.5 mg ml⁻¹ with the reconstitution buffer (10 mM Tris-HCl at pH 7.6, 100 mM NaCl, and 10 mM MgCl₂). SUV solution (DOPC:DOPE:DOPS = 8:1:1, w:w:w) with 2% DDM was then added to the protein at a lipid-to-protein ratio of 0.7. This mixture was allowed to equilibrate overnight before the addition of ~5 mg wet Bio-Beads (Bio-Rad) for overnight detergent removal. The reconstitutions were checked by negative-stain electron microscopy for the presence of densely packed proteo-liposomes. For poly-L-lysine (poly-lys) coating, 2 μL of poly-lys at a concentration of 0.001% (0.01 mg ml⁻¹) was deposited onto freshly cleaved mica and incubated for 30 s. Excess poly-lys was rinsed off with the imaging buffer (20 mM Tris-HCl at pH 7.6, 150 mM NaCl, apo condition for Glt_Ph), and the poly-lys coated mica was allowed to air dry before the sample physisorption. For HS-AFM imaging of immobilized Glt_Ph molecules, the membrane-extension membrane protein reconstitution (MEMPR) strategy was applied. In brief, the Glt_Ph reconstitution was diluted in the reconstitution buffer with an additional 1 × CMC DDM to a final protein concentration of ~25 μg ml⁻¹, and the SUVs (DOPC:DOPE:DOPS = 8:1:1, w:w:w) were diluted in the same buffer to a final lipid concentration of 1 mg ml⁻¹. The diluted Glt_Ph reconstitutions (~1× CMC DDM) and the diluted SUVs (~1× CMC DDM) were mixed at a ratio of 1:1 to give the prephysisorption mixture. After equilibration for >15 min, 2 μL of the prephysisorption mixture was deposited onto the poly-lys coated mica for 10 min, and then extensively washed with the imaging buffer more than five times to dilute the detergent and remove excess proteo-liposomes that were not physisorbed.

HS-AFM imaging and data analysis

HS-AFM measurements were performed with an HS-AFM (RIBM) operated in amplitude modulation mode. Igor Pro version 7 was used for HS-AFM data collection. In brief, we used short cantilevers (USC-F1.2-k0.15, NanoWorld) with a nominal spring constant of 0.15 N m^–1, a resonance frequency of ~0.6 MHz, and a quality factor of ~1.5 in the imaging buffers^10,39. All data were acquired at standard laboratory temperature (298 K). HS-AFM movies were flattened and aligned using home-written ImageJ plugins (ImageJ, NIH).

Construction of raw AFM single-particle frames

Particles from the flattened and aligned HS-AFM movies (2.5 Å/pixel in the A5 example) were picked using a home-written ImageJ plugin ‘Particle Picker’, utilizing a 2D cross-correlation-based pattern recognition algorithm. In brief, a reference particle with a user-defined size (64 × 64 pixels in the A5 example) was manually selected from a reference frame and then symmetrized using the molecular symmetry information (3-fold symmetry average in the A5 example). The symmetrized particle served as the reference for particle recognition and picking in the reference frame, after which the average of all picked particles from this step (aligned and symmetrized) was used as the updated reference for particle recognition and picking for the entire movie. During the image recognition process, the algorithm applies a sliding window to each HS-AFM frame, calculating the 2D cross-correlation coefficient value (CCV) between the reference particle (R) and the window (W), both symmetrized, as:

$${CCV}=\frac{1}{{IJ}}{\sum }_{j=1}^{J}{\sum }_{i=1}^{I}\frac{\left({{{\bf{W}}}}_{i,j}-{\mu }_{W}\right)\left({{{\bf{R}}}}_{i,j}-{\mu }_{R}\right)}{{\sigma }_{R}{\sigma }_{W}}$$

(1)

where σ_W and σ_R are the standard deviations and μ_W and μ_R the means of the image height values. Since the particles could be randomly or multi-directionally oriented, the sliding window also rotates to seek the angle giving the highest CCV at each sliding step. Windows with a CCV larger than a user-defined threshold value were marked as ‘pattern-containing’. Since densely packed proteins, especially those forming 2D protein lattices on the membranes, were common in AFM experiments, a ‘minimal particle-particle distance’ value (usually proportional to the 2D lattice unit-cell size) was applied to further eliminate windows that are too close to a pattern-containing window with a higher CCV. Finally, all particles that met the criteria were merged, without rotation adjustment (to avoid pixel interpolation), to construct a stack of raw AFM single-particle frames (Fig. 1a). The angle information was recorded as a separate output for 3D-LAFM detection coordinates alignment. The procedure ensures that all local maxima from these particles contain real AFM height measurements.
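As a minimal sketch of Eq. (1), the following pure-Python functions compute the CCV between a window and a reference and slide the reference over a frame. The function names `ccv` and `best_match` are hypothetical, the rotation search and symmetrization described above are omitted, and flat windows (zero standard deviation) are assigned a CCV of 0 by assumption; the original plugin is an ImageJ implementation that is not reproduced here.

```python
import math

def ccv(window, reference):
    """Cross-correlation coefficient value (Eq. 1) of two equal-size images."""
    w = [v for row in window for v in row]
    r = [v for row in reference for v in row]
    n = len(r)  # n = I * J pixels
    mu_w, mu_r = sum(w) / n, sum(r) / n
    # Population standard deviations, consistent with the 1/(IJ) prefactor.
    sigma_w = math.sqrt(sum((v - mu_w) ** 2 for v in w) / n)
    sigma_r = math.sqrt(sum((v - mu_r) ** 2 for v in r) / n)
    if sigma_w == 0 or sigma_r == 0:
        return 0.0  # assumption: a flat image is treated as "no match"
    return sum((a - mu_w) * (b - mu_r) for a, b in zip(w, r)) / (n * sigma_w * sigma_r)

def best_match(frame, reference):
    """Slide the reference over the frame; return (CCV, top row, left column)."""
    h, w = len(reference), len(reference[0])
    best = (-2.0, -1, -1)
    for top in range(len(frame) - h + 1):
        for left in range(len(frame[0]) - w + 1):
            window = [row[left:left + w] for row in frame[top:top + h]]
            score = ccv(window, reference)
            if score > best[0]:
                best = (score, top, left)
    return best
```

Because the CCV subtracts the mean and divides by the standard deviation, it is insensitive to a height offset or a uniform height scaling of the window, which is why a single threshold can be applied across frames.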

Rationale of automatized and unbiased LAFM

AFM images are pseudo-3D maps, where the image plane represents the x- and y-dimensions, and the pixel intensity (often displayed by a false-color scale) represents the topographical height, h, of the sample surface. However, the convolution of the tip shape compromises the accuracy of sample contouring, affecting the data resolution. The recent development of the LAFM algorithm solved this problem by exclusively extracting the peak detections in AFM images (LAFM detections), where a peak pixel is a local maximum, i.e., a pixel that is higher than all surrounding pixels in a 3 × 3 kernel and therefore represents a real tip-sample interaction that is free from tip convolution²⁵. In the canonical LAFM pipeline, single molecules were picked from a raw AFM image or HS-AFM movie to give a raw data particle stack. These particles were expanded, usually from ~2–5 Å/pixel to ~0.5–1 Å/pixel, and translationally and rotationally fine-aligned, to generate a particle stack from which LAFM detections were extracted. Image expansion is necessary to detect local maxima with greater spatial precision, thus effectively breaking the lateral resolution limitations of the pixel sampling and the tip convolution of the raw data. Besides, image expansion enables translational and rotational alignment of the particles with greater precision than the original pixel sampling would allow, facilitating the merging of LAFM detections that characterize identical structural features, e.g., a surface exposed amino acid, from all particles²⁵.

However, both particle expansion and fine alignment relied on bicubic interpolation. This operation inevitably creates additional local maxima, as well as loses track of precise absolute height values of the local maxima, despite yielding higher precision of their lateral position. Besides, in the canonical LAFM pipeline, local maxima were selected based on their prominence, i.e., their height over the surrounding pixels, to avoid the selection of maxima that potentially emerged from noise, where the local maxima prominence threshold was user-defined. We consider these aspects as limitations that challenge the standardization and generalization of LAFM for dynamic structural biology. To this end, we developed a computational pipeline for objective extraction of LAFM detections from raw AFM data (Supplementary Fig. 1). First, we obtained an HS-AFM movie of A5 trimers as an example dataset (Supplementary Fig. 1a). Second, we picked single particles of A5 trimers from the raw data, and, third, extracted local maxima from these raw particles (Supplementary Fig. 1b). Fourth, we used bicubic interpolation, but only to expand pixels surrounding individual maxima (Supplementary Fig. 1c, step 1, 15 × expansion in the A5 example) for lateral peak localization with sub-pixel precision (Supplementary Fig. 1c, steps 2 and 3), where the new local maxima x,y-position (Supplementary Fig. 1c, d, pink cross) must reside within the 3 × 3 pixel kernel (Fig. 1c, box) that defined the original maxima (Fig. 1c, blue cross). Fifth, we recorded the localized maxima x,y coordinates and their original height (h) as unaligned coordinates of the LAFM detections of each particle (Fig. 1a, bottom).

This pipeline takes advantage of the bicubic interpolation to acquire greater spatial precision of the LAFM detections without introducing additional detections. Indeed, an average of ~4.1 × 10³ detections was extracted from a single A5 particle (64 × 64 pixels, n = 171), while a ten-fold increased number of detections, ~3.7 × 10⁴, was extracted from the same particles after three-fold bicubic expansion (192 × 192 pixels) in the canonical LAFM pipeline. Moreover, because in our approach the height (h) of the local maxima was determined from the raw data AFM particles before expansion, absolute height values were retained for LAFM detection processing. A similar strategy was also reported in another AFM analysis software that allows LAFM map construction³⁸. In parallel to LAFM detection extraction (Fig. 1a, unaligned coordinates), we obtained the AFM particle alignment information, i.e., translational and rotational particle repositioning with sub-pixel precision, of individual particles with respect to each other from aligning globally expanded particles, as in the canonical LAFM pipeline (Fig. 1a, top, alignment information). The fine alignment information provides the relative spatial relationship of unaligned LAFM detections in the particles to a consensus alignment. Hence, supplementing this information to the LAFM detection extraction pipeline results in aligning LAFM detections bypassing the drawbacks of bicubic interpolation. Since LAFM detections were obtained from only locally expanded images, this pipeline was termed local expansion LAFM detection extraction.
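The local-maximum criterion used in this pipeline (a pixel higher than all surrounding pixels in its 3 × 3 kernel, with the height taken from the raw data) can be sketched as follows. The function name `local_maxima` is hypothetical, and skipping border pixels, whose 3 × 3 kernel is incomplete, is an assumption of this sketch rather than a documented choice.

```python
def local_maxima(image):
    """Return (row, col, height) for every pixel strictly higher than its 8 neighbors.

    Heights are read from the raw image, so no interpolated value is recorded.
    Border pixels are skipped (assumption: their 3 x 3 kernel is incomplete).
    """
    peaks = []
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            h = image[r][c]
            neighbors = [image[r + dr][c + dc]
                         for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                         if (dr, dc) != (0, 0)]
            if all(h > v for v in neighbors):
                peaks.append((r, c, h))
    return peaks
```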

Local expansion LAFM detection extraction

In our method, a local maximum position (Supplementary Fig. 1b) is defined as a pixel that is higher than all the surrounding pixels in its 3 × 3 pixel kernel. Accordingly, we selected all local maxima from the raw AFM single-particle frames and then cropped their 9 × 9 pixel neighborhood images to facilitate sub-pixel localization of the detection positions. Thereafter, we expanded the 9 × 9 pixel neighborhood images using bicubic interpolation (Catmull–Rom interpolation⁶⁷). This method utilizes a 16-pixel surface (4 × 4 pixel grid) to calculate the intermediate pixel values in the central 2 × 2 area, through 3rd-order 2D polynomial interpolation. The choice of the 9 × 9 pixel neighborhood (before expansion) around each local maximum position ensures the accuracy of the bicubic interpolation calculation in the central 3 × 3 pixel neighborhood (before expansion), where the sub-pixel position of the local maximum detection must reside. Using this method with an expansion scale of 15× (from 2.5 Å/pixel to 0.167 Å/pixel in the A5 example), we generated a series of 135 × 135 pixel local maximum neighborhood images, where the sub-pixel localized detection coordinates could be found in the corresponding central 45 × 45 pixel target regions (Supplementary Fig. 1c, step 1). The sub-pixel localized detection must be a local maximum in the target region after expansion. In most cases, there is a single local maximum position within this region; if multiple positions were observed, the local maximum closest to the image center (in the x-y plane) was selected (Supplementary Fig. 1c, step 2). The integer pixel x,y coordinates of the detection were then adjusted by the expansion scale to give the final sub-pixel localized coordinate pair, x’ and y’, of the local maximum position (Supplementary Fig. 1c, step 3).
The local maximum h-value, measured from the AFM experiments, and the sub-pixel x’,y’-coordinates (all in decimal format) were then exported as an unaligned LAFM detection (Fig. 1a, bottom row). Finally, all (N) detections were pooled into a stack D, an N × 3 matrix where columns 1, 2, and 3 correspond to the x’, y’, and h values of each detection, respectively. The local expansion LAFM detection extraction was developed in MATLAB (MathWorks). The use of raw height measurements in the automatized and unbiased LAFM extraction workflow further supports the accuracy of the final 3D-LAFM densities and the effectiveness of the overall method.
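The local expansion step can be illustrated with a short pure-Python sketch. It assumes a 9 × 9 patch centered on a raw local maximum, a pixel-centered sampling convention for the 15× expansion, and a fallback to the raw pixel center when no strict maximum is found; the names `catmull_rom`, `bicubic`, and `refine_peak` are hypothetical, and the original MATLAB implementation may differ in these details. Only the 45 × 45 target region plus a one-sample guard ring is evaluated, rather than the full 135 × 135 image.

```python
import math

def catmull_rom(p0, p1, p2, p3, t):
    """1D Catmull-Rom cubic passing through p1 (t = 0) and p2 (t = 1)."""
    return 0.5 * (2 * p1 + (p2 - p0) * t
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                  + (3 * p1 - p0 - 3 * p2 + p3) * t ** 3)

def bicubic(patch, y, x):
    """Interpolated height at fractional (row y, col x) from the surrounding 4 x 4 pixels."""
    iy, ix = math.floor(y), math.floor(x)
    ty, tx = y - iy, x - ix
    rows = [catmull_rom(*patch[iy + d][ix - 1:ix + 3], tx) for d in (-1, 0, 1, 2)]
    return catmull_rom(*rows, ty)

def refine_peak(patch, scale=15):
    """Sub-pixel offset (dy, dx), in raw-pixel units, of the peak in a 9 x 9 patch."""
    center, n = 4, 3 * scale  # n x n samples cover the central 3 x 3 raw pixels

    def pos(g):
        # Coordinate of sample g; g = -1 and g = n form the guard ring.
        return center - 1.5 + (g + 0.5) / scale

    grid = [[bicubic(patch, pos(gy), pos(gx)) for gx in range(-1, n + 1)]
            for gy in range(-1, n + 1)]
    best = None
    for gy in range(1, n + 1):
        for gx in range(1, n + 1):
            v = grid[gy][gx]
            if all(v > grid[gy + a][gx + b]
                   for a in (-1, 0, 1) for b in (-1, 0, 1) if (a, b) != (0, 0)):
                dy, dx = pos(gy - 1) - center, pos(gx - 1) - center
                d2 = dy * dy + dx * dx
                if best is None or d2 < best[0]:
                    best = (d2, dy, dx)  # keep the maximum closest to the center
    return (0.0, 0.0) if best is None else (best[1], best[2])
```

The returned offset is added to the integer pixel position of the raw maximum, while the recorded height remains the raw pixel value, so interpolation affects only the lateral coordinates.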

Alignment of the LAFM detection coordinates

The raw AFM single-particle frames were expanded (5× in the A5 example) using bicubic interpolation. The expanded particles were rotation-adjusted (using the angle output from ‘Particle Picker’) and averaged to generate an alignment reference particle. Then a sliding-window strategy to calculate CCV between expanded particles and the reference particle, both symmetrized before CCV calculation, was applied to find the optimal translation and rotation (based on the angle output from ‘Particle Picker’) of each expanded AFM single-particle frame with respect to the reference particle (for convenience, the particle center is moved to (0, 0) at this step). The particle translation and rotation information was then adjusted by the expansion scale and used to align all LAFM detection coordinates in their x,y dimension (the first two columns in matrix D) (Fig. 1a, top row). In the A5 example, the adjusted coordinate values in D₁ (all x’ values) and D₂ (all y’ values) were distributed within ±80 Å (particle size is 160 × 160 Å), the height values in D₃ (all h values) had a range of ~30 Å, and the total count of detections N is ~42,000.
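A minimal sketch of applying the alignment information to the first two columns of D is given below. It assumes detections in raw-pixel units, a translation measured on the expanded particle (hence divided by the expansion scale), translation applied before rotation, and a counter-clockwise rotation convention; the function name `align_detections` and all of these conventions are illustrative assumptions, not the documented behavior of the original MATLAB code.

```python
import math

def align_detections(detections, shift_expanded_px, angle_deg,
                     particle_size_px=64, expansion=5, pixel_size=2.5):
    """Map unaligned (x', y', h) detections into the consensus frame, in angstrom.

    The particle center is moved to (0, 0); h is passed through unchanged.
    """
    half = particle_size_px / 2.0
    # The shift was found on the expanded particle: rescale it to raw pixels.
    sx, sy = (s / expansion for s in shift_expanded_px)
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    aligned = []
    for x, y, h in detections:
        xc, yc = x - half + sx, y - half + sy   # center, then translate
        xr = cos_a * xc - sin_a * yc            # rotate about the center
        yr = sin_a * xc + cos_a * yc
        aligned.append((xr * pixel_size, yr * pixel_size, h))
    return aligned
```

With the A5 defaults (64 pixels at 2.5 Å/pixel), the aligned x and y values of an unshifted particle fall within ±80 Å, consistent with the range quoted above.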

In the A5 example, the aligned LAFM detections have a lateral resolution of 0.167 Å/pixel and a height resolution <0.25 Å/pixel (depending on the z-piezo sensitivity, ~10 nm/V, the digital-to-analog resolution, 12 bit, and data acquisition range, ±5 V, in the AFM experiments). This z-dimension resolution could be easily boosted to <0.1 Å/pixel by setting the data acquisition range to ±1 V using the HS-AFM gain multiplier¹⁰. The alignment of the LAFM detections was developed in MATLAB.
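The quoted height resolutions follow from dividing the full z range by the number of digitization levels. The short sketch below reproduces this arithmetic from the stated parameters; the function name `height_step_angstrom` and its argument names are hypothetical.

```python
def height_step_angstrom(piezo_nm_per_volt=10.0, converter_bits=12, half_range_volt=5.0):
    """Smallest height increment: full z range divided by the number of levels."""
    full_range_nm = piezo_nm_per_volt * 2 * half_range_volt  # e.g. +/-5 V spans 10 V
    return full_range_nm * 10.0 / (2 ** converter_bits)      # 1 nm = 10 angstrom

print(height_step_angstrom())                     # 0.244140625  (< 0.25 angstrom)
print(height_step_angstrom(half_range_volt=1.0))  # 0.048828125  (< 0.1 angstrom)
```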

3D-LAFM detection stack construction

The aligned LAFM detections with extreme height values should be removed by setting height thresholds h_min and h_max for the following reasons: (1) Extremely high detections, likely caused by the tip moving away from the surface, are unnecessary and increase the computational cost. (2) Including too many membrane detections (for imaging on extended lipid bilayers) or mica detections may bias the density distribution toward the membrane rather than the protein of interest. Thresholds in the x- and y-dimensions could also be applied to further clean the peripheral detections. In our setup, detections with a 2D-distance from the particle center of >0.5 × particle-size (80 Å in the A5 example) were removed. In the A5 example, D had a total count of N ~ 36,000 detections after cleaning.
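The cleaning step amounts to a simple filter on the rows of D. The sketch below assumes centered coordinates in Å and inclusive thresholds; the function name `clean_detections` is hypothetical, and the threshold values themselves are user-defined.

```python
import math

def clean_detections(aligned, h_min, h_max, max_radius):
    """Keep detections with h_min <= h <= h_max within max_radius of the center."""
    return [(x, y, h) for x, y, h in aligned
            if h_min <= h <= h_max and math.hypot(x, y) <= max_radius]
```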

The aligned LAFM detection coordinates, D, were then allocated into a 3D-matrix, the 3D-LAFM detection stack V, with a user-defined voxel size, dv, and a size of I × J × K. The voxel size dv (0.3 Å/voxel used in the A5 example) must be larger than the resolution limit of the detections. Dimensions i and j correspond to the lateral dimensions x (fast-scan) and y (slow-scan) of the AFM particles, while dimension k corresponds to the vertical dimension h of the AFM particles. Voxels in stack V should record the total count of the local detections. Thus, the value of voxel i,j,k, V_i,j,k, is:

$${{{\bf{V}}}}_{{{\rm{i}}},{{\rm{j}}},{{\rm{k}}}}={\sum }_{n=1}^{N}\left[\begin{array}{c}\delta \left(i-0.5{{\rm{d}}}v \, < \, {D}_{1} \, < \, i+0.5{{\rm{d}}}v\right)\times \ldots \\ \delta \left(j-0.5{{\rm{d}}}v \, < \, {D}_{2} \, < \, j+0.5{{\rm{d}}}v\right)\times \ldots \\ \delta \left(k-0.5{{\rm{d}}}v \, < \, {D}_{3} \, < \, k+0.5{{\rm{d}}}v\right)\end{array}\right]$$

(2)

Then, 2D molecular symmetry addition (3-fold symmetry addition in the A5 example) was applied to each i-j plane of stack V to conclude the construction of the 3D-LAFM detection stack (V contained ~100,000 detections in the A5 example). The 3D-LAFM detection stack construction was developed in MATLAB.
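The allocation in Eq. (2) and the symmetry addition can be sketched with a sparse voxel counter. Two simplifications are assumed here: voxel indices are obtained by flooring the coordinate relative to a user-chosen origin, and the n-fold symmetry is applied by rotating the detection coordinates before binning rather than by rotating each i-j plane of V as described above. The names `symmetrize` and `bin_detections` are hypothetical.

```python
import math
from collections import Counter

def symmetrize(detections, fold=3):
    """Add copies of each detection rotated by multiples of 360/fold degrees about z."""
    out = []
    for m in range(fold):
        a = 2 * math.pi * m / fold
        ca, sa = math.cos(a), math.sin(a)
        out.extend((ca * x - sa * y, sa * x + ca * y, h) for x, y, h in detections)
    return out

def bin_detections(detections, dv, origin):
    """Count detections per voxel; returns a sparse {(i, j, k): count} stack."""
    ox, oy, oh = origin
    stack = Counter()
    for x, y, h in detections:
        key = (math.floor((x - ox) / dv),
               math.floor((y - oy) / dv),
               math.floor((h - oh) / dv))
        stack[key] += 1
    return stack
```

A sparse counter is convenient at small voxel sizes (0.3 Å/voxel in the A5 example), where most voxels of a dense array would be empty. Threefold symmetry addition triples the detection count, in line with the increase from ~36,000 to ~100,000 detections.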

3D-LAFM detection stack evaluation

To evaluate the 3D-LAFM detection stack V, a 3D mask that covers the detections characterizing the molecular surface (on-target detections) must be created (Supplementary Fig. 3). To this end, we first projected all detections in V along its k-axis onto a 2D i-j plane (k-projection map). Then, a 2D moving-mean filter was applied to the k-projection map, so that the region of the molecular surface stood out. Using a user-defined threshold, we generated an initial i-j plane mask for the molecular surface 2D geometry. Next, we shrank and/or grew the initial i-j plane mask, pixel by pixel along its circumference, and measured the detection coverage by the mask at each step to search for an optimal mask size. Usually, a decreasing Δcoverage/Δstep (detection coverage change with respect to one-pixel increment) followed by an increasing Δcoverage/Δstep could be found, corresponding to a turning point where the mask starts to include off-target detections. Using this 2D ‘mask growth/shrinking’ strategy, the optimal i-j plane 2D-mask was defined and then applied to each i-j plane of V to select for the on-target detections, resulting in stack V_m2d. Similar to the i-j plane 2D-mask, a 3D mask was made by first applying a 3D moving-mean filter to V_m2d, then creating an initial i-j-k 3D-mask with a user-defined threshold, and finally searching for the optimal i-j-k 3D-mask using the 3D ‘mask growth/shrinking’ strategy. Note that the optimal 2D- and 3D-masks should be nearly insensitive to the user-defined threshold values.

The aligned LAFM detection coordinates in D were then allocated randomly into two 3D-LAFM detection half-stacks and masked with the previously determined optimal 3D mask to eliminate off-target detections, resulting in half-stacks V_a,m3d and V_b,m3d. Then, the FSC method was applied to the two masked half-stacks to determine λ_hb of stack V (r_V) using the half-bit threshold criterion⁴³. The ‘half-bit wavelength’ λ_hb characterizes the distribution of the aligned LAFM detections in the 3D space, and should depend on the imaging quality and the total count of detections (Supplementary Fig. 3). The 3D-LAFM detection stack evaluation was developed in MATLAB.
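The random half-split and the threshold comparison can be sketched as below. The half-bit threshold is written in its commonly used closed form, which is assumed here to be the criterion meant by ref. 43; the FSC curve itself (which requires 3D Fourier transforms of the two half-stacks) is taken as an input and is not computed. The function names `split_halves`, `half_bit_threshold`, and `first_crossing` are hypothetical.

```python
import math
import random

def split_halves(detections, seed=0):
    """Randomly assign detections to two half-sets of (nearly) equal size."""
    shuffled = list(detections)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def half_bit_threshold(n_voxels_in_shell):
    """Half-bit FSC threshold for a Fourier shell containing n voxels (assumed form)."""
    s = 1.0 / math.sqrt(n_voxels_in_shell)
    return (0.2071 + 1.9102 * s) / (1.2071 + 0.9102 * s)

def first_crossing(fsc, shell_sizes):
    """Index of the first shell whose FSC drops below the half-bit threshold, or None."""
    for idx, (value, n) in enumerate(zip(fsc, shell_sizes)):
        if value < half_bit_threshold(n):
            return idx
    return None
```

The spatial frequency of the returned shell is then converted to a wavelength to give λ_hb; that conversion depends on the box size and voxel size and is left out of this sketch.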

3D-LAFM density map construction and evaluation

The 3D-LAFM density map was constructed by applying a 3D density function to each detection in the 3D i-j-k space. In our setup, we used a 3D Gaussian density function N₃ ~ (0, r_V), where r_V is the ‘half-bit wavelength’ value of the 3D-LAFM detection stack V (see “3D-LAFM detection stack evaluation”). This choice to determine the σ value of the 3D Gaussian from the data itself provides a more objective basis for this parameter reflecting the 3D spatial distribution of the aligned LAFM detections, which is likely different for each individual dataset, related to differences of the proteins under investigation and/or instrument- or experiment-dependent noise. One can anticipate that more densely distributed LAFM detections, likely to be found in high-quality raw data with small pixel size should give smaller σ values, which in turn avoids over-smoothing of the probability density. In contrast, lower-quality data will result in wider detection distributions and thus larger σ values, which must be assigned to compute reliable densities from less densely distributed detections. Hence, we constructed the 3D-LAFM density map, P, by a convolution operation, as:

$${{\bf{P}}}\left[i,j,k\right]={{\bf{V}}}\otimes {{{\bf{N}}}}_{3}={\sum }_{{i}^{{\prime} }=-\tau }^{\tau }{\sum }_{{j}^{{\prime} }=-\tau }^{\tau }{\sum }_{{k}^{{\prime} }=-\tau }^{\tau }{{\bf{V}}}\left[{i}^{{\prime} },{j}^{{\prime} },{k}^{{\prime} }\right]\times {{\bf{N}}}\left[i-{i}^{{\prime} },j-{j}^{{\prime} },k-{k}^{{\prime} }\right]$$

(3)

where τ is the size of the 3D density function kernel, and a τ > 5 × r_V was used in our setup. The resulting density map P was then 2D-symmetrized at each i-j plane (see “3D-LAFM detection stack construction”).
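As an illustration of Eq. (3), the sketch below spreads each occupied voxel of a sparse detection stack with a truncated, separable 3D Gaussian. It assumes σ and τ are given in voxel units (i.e., r_V divided by dv) and that the kernel is normalized to unit sum; both are assumptions of this sketch, and the names `gaussian_kernel_1d` and `density_map` are hypothetical.

```python
import math
from collections import defaultdict

def gaussian_kernel_1d(sigma_vox, tau):
    """Normalized 1D Gaussian sampled at integer offsets -tau..tau."""
    w = [math.exp(-0.5 * (o / sigma_vox) ** 2) for o in range(-tau, tau + 1)]
    total = sum(w)
    return [v / total for v in w]

def density_map(stack, sigma_vox, tau):
    """Convolve a sparse {(i, j, k): count} stack with an isotropic 3D Gaussian."""
    g = gaussian_kernel_1d(sigma_vox, tau)
    density = defaultdict(float)
    for (i, j, k), count in stack.items():
        for a in range(-tau, tau + 1):
            for b in range(-tau, tau + 1):
                for c in range(-tau, tau + 1):
                    weight = g[a + tau] * g[b + tau] * g[c + tau]
                    density[(i + a, j + b, k + c)] += count * weight
    return density
```

Choosing τ of at least 5σ, as in the setup above, keeps the truncation error of the kernel negligible. For full-size stacks this triple loop is slow in pure Python and would normally be replaced by an array-based or FFT-based convolution.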

To evaluate the 3D-LAFM density map P, we applied the same 3D density function to the two (unmasked) half-stacks V_a and V_b, and then analyzed their FSC curve using the half-bit threshold criterion to obtain the ‘half-bit wavelength’ value r_P of the 3D-LAFM density map. The 3D-LAFM density map construction and evaluation were developed in MATLAB.

Converting the 3D-LAFM density map to a ‘.afm’ file

The 3D-LAFM density map P (a 3D array of 32-bit single-precision values) was encoded into an MRC2014 extended ‘.afm’ file using a home-written MATLAB ‘encoder’. To allow data deposition into common repositories and to promote the sharing of experimental and analytical details, we used the extra space (Words 25–49, Bytes 97–196) in the MRC2014 file header to store essential parameters (Table 1)⁴⁷. Once a home-written Python ‘decoder’ is installed, the ‘.afm’ file is compatible with Chimera²⁷, a common structural biology 3D viewer (see Supplementary Methods: “Accessing ‘.afm’ files in ChimeraX”). Note that the metadata code ‘AFM1’ is introduced for 3D-LAFM density maps (Table 1, Word 27).
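
To make the byte layout concrete, the following Python sketch writes and reads a stripped-down MRC-style header with the metadata code placed in Word 27. It is an illustration under stated assumptions and not the authors' encoder or decoder: only the placement of ‘AFM1’ in Word 27 is taken from the text, the other Table 1 parameters are not reproduced, and the names `word_offset`, `encode_afm`, and `decode_afm` are hypothetical. A file that viewers accept needs further standard header fields (for example sampling, cell dimensions, and axis order) that are left at zero here.

```python
import struct

HEADER_BYTES = 1024  # fixed length of the MRC2014 main header

def word_offset(word):
    """0-based byte offset of a 1-based, 4-byte header word."""
    return 4 * (word - 1)

def encode_afm(volume, code=b"AFM1"):
    """Pack volume[k][j][i] (nested lists) as a header plus float32 voxels."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    header = bytearray(HEADER_BYTES)
    # Words 1-4: NX, NY, NZ and MODE (2 = 32-bit float).
    struct.pack_into("<4i", header, word_offset(1), nx, ny, nz, 2)
    # Word 27 (bytes 105-108), inside the extra space: metadata code.
    struct.pack_into("<4s", header, word_offset(27), code)
    # Word 53: format identifier; Word 54: little-endian machine stamp.
    struct.pack_into("<4s", header, word_offset(53), b"MAP ")
    header[word_offset(54):word_offset(54) + 4] = b"\x44\x44\x00\x00"
    flat = [v for plane in volume for row in plane for v in row]
    return bytes(header) + struct.pack(f"<{len(flat)}f", *flat)

def decode_afm(blob):
    """Return (metadata code, (nx, ny, nz), volume) from an encoded blob."""
    nx, ny, nz, mode = struct.unpack_from("<4i", blob, word_offset(1))
    if mode != 2:
        raise ValueError("this sketch only handles 32-bit float data")
    code = struct.unpack_from("<4s", blob, word_offset(27))[0].decode("ascii")
    flat = struct.unpack_from(f"<{nx * ny * nz}f", blob, HEADER_BYTES)
    volume = [
        [list(flat[(k * ny + j) * nx:(k * ny + j) * nx + nx]) for j in range(ny)]
        for k in range(nz)
    ]
    return code, (nx, ny, nz), volume

demo = [[[0.0, 0.5], [1.0, 1.5]], [[2.0, 2.5], [3.0, 3.5]]]
blob = encode_afm(demo)
code, dims, restored = decode_afm(blob)
```

Word n starts at byte 4(n − 1) + 1, so Words 25–49 span Bytes 97–196 and Word 27 falls on Bytes 105–108, consistent with the ranges quoted above.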

High-density 3D-LAFM surface representation

The 3D-LAFM density map, encoded as an MRC2014 extended ‘.afm’ file, was opened in Chimera as volume data and displayed in surface mode, which allows an iso-density surface to be calculated at any user-defined density value. We then colored the top ~7 Å of the density map by height using the LAFM false color codes²⁵. To generate a high-density 3D-LAFM surface representation, we collected ~200 iso-density surface snapshots at a user-defined view (e.g., top or side) with descending density values. The snapshots were exported to ‘png’ files with the ‘transparent background’ mode in Chimera. The first three of the four channels in these ‘png’ files record the RGB color values (LAFM false color codes) of the pixels, encoding the surface height as color, while the last channel encodes the transparency (alpha) value, corresponding to the density value (normalized to 0–255) used in the iso-surface generation. If a pixel corresponds to an empty position, i.e., a gap in the high-density iso-surfaces, the background color (RGB = 0/0/0 for black or 255/255/255 for white) was assigned to the RGB channels.

These snapshots were merged using a home-written MATLAB script. The snapshots were first ordered by descending density values to generate a stack of images, i.e., snapshots taken with high-density values were placed at the front of the stack. Then all background pixels in the stack were assigned a value of 0 in their transparency (4th) channel (100% transparent). We then searched each i-j column of the stack for the first non-transparent pixel and assigned its color code (r, g, b) and transparency α to the corresponding pixel in the high-density 3D-LAFM surface representation image.
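
The merging rule can be sketched as follows in Python. This is an illustration rather than the authors' MATLAB script: it assumes that the snapshots are already loaded as nested lists of (r, g, b, a) tuples ordered from high to low density, and that background pixels are recognized by their RGB value matching the chosen background color. The name `merge_snapshots` is hypothetical.

```python
def merge_snapshots(stack, background=(0, 0, 0)):
    """Collapse an ordered RGBA snapshot stack into one RGBA image.

    stack:      list of images, highest-density snapshot first; each image is
                a list of rows, each row a list of (r, g, b, a) in 0-255.
    background: RGB value used for empty positions (assumed black here).
    """
    height, width = len(stack[0]), len(stack[0][0])
    merged = [[background + (0,)] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for image in stack:              # front of the stack first
                r, g, b, a = image[y][x]
                if (r, g, b) == background:  # background becomes fully transparent
                    a = 0
                if a > 0:                    # first non-transparent pixel wins
                    merged[y][x] = (r, g, b, a)
                    break
    return merged

high = [[(255, 0, 0, 200), (0, 0, 0, 200), (0, 0, 0, 200)]]
low = [[(0, 255, 0, 100), (0, 0, 255, 100), (0, 0, 0, 100)]]
merged = merge_snapshots([high, low])
```

In this toy stack the first pixel is taken from the high-density snapshot, the second from the low-density one because the high-density snapshot has a gap there, and the third stays transparent because it is empty in both. A false color that coincides with the background color would be misread as a gap, so the background should be chosen outside the color table.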

The high-density 3D-LAFM surface reflects the most probable height (color) at a position together with the corresponding probability of detection (transparency), similar to the (2D) LAFM map²⁵. However, this conversion loses the width of the detection distribution, which presumably reflects the local dynamics of the imaged region. Therefore, AFM structural features should be analyzed directly in the 3D-LAFM matrices, not in the high-density surface images. The surface representation strategy can be applied to different views of the volume data, but the top view is the most reliable. Note that this representation is intended for visualizing the 3D-LAFM density map, not for detailed comparison or analysis of the 3D-LAFM data against atomic structures.

Molecular dynamics flexible fitting

MDFF simulations³⁷,⁴⁸ were set up in VMD (version 1.9.4)⁶⁸ and performed using NAMD (version 2.13)⁶⁹ with the CHARMM27 force field⁵². To incorporate the AFM data as the external potential used by MDFF, the 3D-LAFM density map was converted to U_AFM (see “3D-LAFM MDFF force field”) and integrated into the potential energy function, as:

$${U}_{{{\rm{total}}}}={U}_{{{\rm{MD}}}}+{U}_{{{\rm{AFM}}}}+{U}_{{{\rm{SS}}}}+{U}_{{{\rm{AX}}}}$$

(4)

where U_total is the total potential, U_MD the conventional MD potential energy function, U_SS the potential to preserve the secondary structure of proteins, and U_AX the auxiliary potentials (see discussions below).

All atomic structures were first rigid-body docked into the target 3D-LAFM density map in ChimeraX. To avoid overfitting and prevent structural artifacts, restraints on dihedral angles and hydrogen bonds were added to enforce the secondary structure of the protein, together with restraints that preserve the cis/trans configuration and chirality (U_SS). If applicable, domain and symmetry restraints were then applied as auxiliary potentials (U_AX)⁷⁰ to maintain domain rigidity and the inherent structural symmetry of the proteins, respectively. All MDFF simulations in this study were performed for 60 ns, comprising three cycles. Each cycle started with an energy minimization step of 2 ps, followed by a 20 ns run, and ended with an additional minimization step of 2 ps. The symmetry restraint was applied in all three cycles, while the domain restraint was applied in the first two cycles to assist potential large domain movements. All simulations were carried out at a constant temperature of 300 K and a pressure of 1.01 bar, using Langevin dynamics with a damping coefficient of 5 ps⁻¹ and a time-step of 1 fs. Detailed parameters and constraints are provided as supplementary files (Supplementary Methods).
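
As a bookkeeping aid, the three-cycle protocol can be written down as a small Python data structure. This is only a restatement of the numbers given above under the assumption that both restraint types apply to the system at hand; the class `Cycle` and its field names are hypothetical, and the actual simulation inputs are the supplementary parameter files.

```python
from dataclasses import dataclass

TIMESTEP_FS = 1.0  # integration time-step stated in the text

@dataclass(frozen=True)
class Cycle:
    index: int
    symmetry_restraint: bool
    domain_restraint: bool
    minimize_ps: float = 2.0   # minimization before and after each run
    run_ns: float = 20.0       # production run per cycle

    @property
    def run_steps(self):
        """Number of integration steps in the run (1 ns = 1e6 fs)."""
        return round(self.run_ns * 1e6 / TIMESTEP_FS)

# Symmetry restraint in all three cycles, domain restraint in the first two.
schedule = [
    Cycle(i, symmetry_restraint=True, domain_restraint=(i <= 2))
    for i in (1, 2, 3)
]
total_ns = sum(c.run_ns for c in schedule)
```

With a 1 fs time-step, each 20 ns run corresponds to 2 × 10⁷ integration steps, and the three runs add up to the 60 ns quoted above.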

3D-LAFM MDFF force field

The 3D-LAFM MDFF force field U_AFM was constructed using a home-written MATLAB script. U_AFM consists of an active fraction (U_AFM-a), where the force is proportional to the gradient of the 3D-LAFM density map and covers the molecular surface, and an inactive fraction (U_AFM-i), where the space is filled with a background-matching density value and covers the rest of the space below the 3D-LAFM density map. U_AFM-a was constructed by normalizing the 3D-LAFM density map to [0, 1]. To construct U_AFM-i, we generated simulated 3D density data from the atomic structure of the protein at a low resolution of 10 Å. Then, the i-j cross-section of the protein was obtained by projecting the 3D simulated density volume along its z-axis (corresponding to the height dimension in AFM) and normalized to [0, 1]. To merge U_AFM-i and U_AFM-a, we calculated the mean value of the background densities in U_AFM-a, using the same 3D-mask as for the 3D-LAFM detection stack evaluation (see “3D-LAFM detection stack evaluation”); this value was ~0.03 in the A5 and Glt_Ph maps. Using this value as a cutoff, we replaced all voxels below it in U_AFM-a with the smaller of the cutoff and the value of the corresponding U_AFM-i voxel to generate U_AFM.
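
The final merging rule lends itself to a short Python sketch. It is an illustration under assumptions, not the authors' MATLAB script: normalization is taken to be min-max scaling, both fractions are assumed to be available as 3D nested lists of equal shape, and the cutoff is passed in rather than computed from the 3D-mask. The names `normalize01` and `merge_force_field` are hypothetical.

```python
def normalize01(volume):
    """Min-max scale a 3D nested list to [0, 1] (assumed normalization)."""
    flat = [v for plane in volume for row in plane for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        raise ValueError("cannot normalize a constant volume")
    return [[[(v - lo) / (hi - lo) for v in row] for row in plane]
            for plane in volume]

def merge_force_field(u_active, u_inactive, cutoff):
    """Replace sub-cutoff voxels of U_AFM-a by min(cutoff, U_AFM-i)."""
    return [
        [
            [min(cutoff, ui) if ua < cutoff else ua
             for ua, ui in zip(row_a, row_i)]
            for row_a, row_i in zip(plane_a, plane_i)
        ]
        for plane_a, plane_i in zip(u_active, u_inactive)
    ]

u_a = [[[0.0, 0.01, 0.5, 1.0]]]   # normalized 3D-LAFM density (active)
u_i = [[[0.0, 0.8, 0.9, 0.2]]]    # normalized footprint density (inactive)
u_afm = merge_force_field(u_a, u_i, cutoff=0.03)
```

In this toy row, the second voxel lies below the cutoff but inside the footprint, so it is raised to the flat cutoff value and exerts no gradient; the first voxel lies outside the footprint and keeps the lower footprint value; voxels at or above the cutoff are left unchanged.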

Analysis of the MDFF trajectories

The internal energy (E) and root-mean-square deviation (RMSD) measurements of the MDFF trajectories were obtained directly from NAMD and VMD, respectively. The normalized cross-correlation between the 3D-LAFM density map and the atomic structures (simulated 3D density data with a resolution of 10 Å) was calculated in Chimera. It should be noted that (HS-)AFM imaging presumably perturbs the flexible regions on the molecular surface. Therefore, 3D-LAFM MDFF force fields are expected to predominantly drive the movement of structured molecular backbones, and movements of flexible regions in these simulations should be interpreted with caution.

For A5 MDFF trajectory analysis, the structured annexin repeats were defined as: repeat I: Residues 17–28, 35–72, and 75–85; repeat II: Residues 88–144 and 148–158; repeat III: Residues 169–217 and 232–245; and repeat IV: Residues 247–259 and 266–318.

For Glt_Ph MDFF trajectory analysis, the movements of the transport and trimerization domains were analyzed using a home-written MATLAB script. In brief, we defined four collective residues: res1: Residues 80–84, 250–263, 287–302, and 404–417 (transport domain cytoplasmic side); res2: Residues 377–382, 326–328, and 223–227 (transport domain extracellular side); res3: Residues 167–170 (trimerization domain cytoplasmic side); and res4: Residues 150–153 (trimerization domain extracellular side). The mean coordinates of these collective residues in the MDFF trajectories were tracked to illustrate the domain movements. Specifically, the angle between vector res2-res1 (transport domain vector) and the z-axis (θ) as well as the angle between vector res4-res3 (trimerization domain vector) and the z-axis (ψ) were used to characterize the opening/closing of the transport and trimerization domains, respectively. The z-axis projection of vector res3-res1 (dz) was used to characterize the height difference between the transport and trimerization domains on the cytoplasmic side, which was exposed to the tip in the AFM experiments.

The computational fitting of a molecule into an experimentally derived conformation using the experimental constraints as an external force field does not necessarily represent a meaningful transition. The process itself is as meaningless as morphing, but the resulting MDFF structure is of interest. Consequently, the analysis of the resulting trajectories should focus on the effective convergence of the structural models to the experimental data, rather than interpreting the trajectory as a physically accurate depiction of protein dynamics.
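
The three Glt_Ph descriptors can be computed from mean coordinates as in the following Python sketch. It is not the authors' MATLAB script and makes two labeled assumptions: each vector points from the cytoplasmic to the extracellular group (res1 to res2, res3 to res4), and dz is taken as the z-coordinate of res3 minus that of res1. The function names are hypothetical, and which atoms enter the mean is left to the caller.

```python
import math

def centroid(points):
    """Mean (x, y, z) of a list of coordinates."""
    n = len(points)
    return tuple(sum(p[axis] for p in points) / n for axis in range(3))

def angle_to_z(tail, head):
    """Angle in degrees between the vector head - tail and the +z axis."""
    v = tuple(h - t for h, t in zip(head, tail))
    norm = math.sqrt(sum(c * c for c in v))
    cos_angle = max(-1.0, min(1.0, v[2] / norm))  # guard against rounding
    return math.degrees(math.acos(cos_angle))

def domain_metrics(res1, res2, res3, res4):
    """Return (theta, psi, dz) for one protomer in one trajectory frame."""
    c1, c2, c3, c4 = (centroid(r) for r in (res1, res2, res3, res4))
    theta = angle_to_z(c1, c2)   # transport domain vector vs z (assumed direction)
    psi = angle_to_z(c3, c4)     # trimerization domain vector vs z
    dz = c3[2] - c1[2]           # assumed sign convention for the height difference
    return theta, psi, dz

theta, psi, dz = domain_metrics(
    res1=[(0.0, 0.0, 0.0)],
    res2=[(1.0, 0.0, 1.0)],
    res3=[(0.0, 0.0, 2.0)],
    res4=[(0.0, 0.0, 5.0)],
)
```

In this toy frame the transport domain vector is tilted by 45° from the z-axis, the trimerization domain vector is parallel to it, and the cytoplasmic reference points differ in height by 2 units.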

Structural analysis

The IFS Glt_Ph protomer structure dataset used for the PCA of CVs and the AE analysis of all-atom coordinates contains 9 PDB protomer structures, including PDB_IFSo-apo-EM (PDB 6X12, 3 protomers), PDB_IFSc-apo-Xray (PDB 4P19, 3 protomers), and PDB_IFSc-trsp-EM (PDB 6X15, 3 protomers), as well as 1107 3D-LAFM MDFF protomer structures selected from 9 MD trajectories, each giving 41 trimer structures (123 protomer structures) at an output-step of 0.5 ns/step from the last 20 ns of simulation. The MD trajectories include (‘PDB-to-U_AFM’): PDB_IFSo-apo-EM-to-U_AFM-IFSo, PDB_IFSo-apo-EM-to-U_AFM-IFSc, PDB_IFSo-apo-EM-to-U_AFM-IFSo1, PDB_IFSc-apo-Xray-to-U_AFM-IFSo, PDB_IFSc-apo-Xray-to-U_AFM-IFSc, PDB_IFSc-apo-Xray-to-U_AFM-IFSo1, PDB_IFSc-trsp-EM-to-U_AFM-IFSo, PDB_IFSc-trsp-EM-to-U_AFM-IFSc, and PDB_IFSc-trsp-EM-to-U_AFM-IFSo1. According to the applied force field U_AFM, the 1107 protomer structures were sorted into three groups, MD_IFSo-apo-AFM, MD_IFSc-apo-AFM, and MD_IFSo1-apo-AFM, each containing 369 protomer structures.

For the PCA of CVs, three variables characterizing the opening of the transport domains (see “Analysis of the MDFF trajectories”) were measured from all protomer inputs, giving an 1116 × 3 matrix for PCA. The first two principal components (the pc1-pc2 space) accounted for >95% of the data variation and hence sufficiently captured the structural features essential to characterize the transport domain opening. The pc1-pc2 space was therefore used for further structural similarity analysis.

For the AE analysis of all-atom coordinates, the protomers were first aligned with respect to their trimerization domain. Then, the x,y,z-coordinates of all shared backbone alpha carbon (Cα, n = 390) atoms were merged into an 1170 × 1 array for each protomer input, giving an 1116 × 1170 matrix for AE analysis. For the AE network construction and training, we used the MATLAB ‘trainAutoencoder’ function series. Although the exact network architecture from ‘trainAutoencoder’ is not available, a customized 4-layer AE architecture reproduced a comparable latent space distribution. The customized network has an input size of 1170, an encoder size of 512-128-32-8 neurons, a latent space size of 2, a decoder size of 8-32-128-512 neurons, and an output size of 1170. The network training used a mean squared error (mse) loss function and a learning rate of 10⁻⁴. For both networks, 80% of the data was used for training and 20% for evaluation. The latent space of the AE network was used for further structural similarity analysis.

We calculated a structural similarity score (ss) for any two species (i and j) considering all protomer-pairs from the two species, as:

$$s{s}_{{ij}}=\exp \left(-\sqrt{\frac{1}{{IJ}}{\sum }_{i=1}^{I}{\sum }_{j=1}^{J}{d}_{{ij}}^{2}}\right)$$

(5)

where $d_{ij}$ is the pairwise 2D Euclidean distance in the pc1-pc2 or latent space between protomers i and j of the two species, and I and J are the numbers of protomers in the respective species. The ss was then normalized to [0, 1] for all pairs in the heatmap presentations.
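
Eq. (5) can be evaluated directly from two sets of 2D points, as in this minimal Python sketch. The function name `similarity_score` is hypothetical, the inputs are assumed to be lists of (pc1, pc2) or latent-space coordinates, and the subsequent normalization across all species pairs is not included.

```python
import math

def similarity_score(species_a, species_b):
    """exp(-RMS pairwise distance) between two sets of 2D points, as in Eq. (5)."""
    total = 0.0
    for ax, ay in species_a:
        for bx, by in species_b:
            total += (ax - bx) ** 2 + (ay - by) ** 2   # squared distance d_ij^2
    rms = math.sqrt(total / (len(species_a) * len(species_b)))
    return math.exp(-rms)

# Two protomers at the origin versus one protomer at (3, 4): every distance is 5.
ss = similarity_score([(0.0, 0.0), (0.0, 0.0)], [(3.0, 4.0)])
```

The score equals 1 when all protomers of both species coincide and decays exponentially with the root-mean-square pairwise distance, so widely separated species approach 0.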

AFM data bank

The AFM Data Bank (AFMDB) aims to archive ‘.afm’ files contributed by AFM researchers worldwide, facilitating the integration of AFM data with other structural biology data for cross-methodology analysis. Submitted ‘.afm’ files should adhere to the format requirements detailed in the “Standard ‘.afm’ file header” section (Table 1), with the corresponding metadata code. Here, we introduce the metadata code ‘AFM1’ for 3D-LAFM density maps. Other AFM-related applications using the ‘.afm’ extension should adopt a metadata code following the ‘AFMx’ format and provide a detailed data description if a new code is introduced. Currently, AFMDB is accessible at https://scheuringlab.com/afmdb-2/, from where all the 3D-LAFM densities presented here can also be downloaded.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

MDFF simulation parameter files, as well as initial and final structure files, are available via Figshare (https://doi.org/10.6084/m9.figshare.27737346)⁷¹. The 3D-LAFM density maps presented are available via Figshare⁷¹ or at the AFM Data Bank (AFMDB, https://scheuringlab.com/afmdb-2). Additional information and raw data are available from the corresponding author upon request. PDB structures used in this article are available in the Protein Data Bank (PDB) with the following accession codes: 1AVR, 6X12, 6X15, and 4P19. Source data are provided with this paper.

Code availability

The code used for 3D-LAFM density map construction and analysis and for 3D-LAFM MDFF force field U_AFM construction, as well as the tools for ‘.afm’ file encoding and decoding, are available on GitHub (https://github.com/rafaeljiang23/3D-LAFM)⁷².

References

1. Liao, M., Cao, E., Julius, D. & Cheng, Y. Structure of the TRPV1 ion channel determined by electron cryo-microscopy. Nature 504, 107–112 (2013).
2. Doyle, D. A. et al. The structure of the potassium channel: molecular basis of K+ conduction and selectivity. Science 280, 69–77 (1998).
3. Riek, R. et al. NMR structure of the mouse prion protein domain PrP(121-231). Nature 382, 180–182 (1996).
4. wwPDB Consortium. EMDB-the electron microscopy data bank. Nucleic Acids Res. 52, D456–D465 (2024).
5. Berman, H. M. et al. The protein data bank. Nucleic Acids Res. 28, 235–242 (2000).
6. Binnig, G., Quate, C. F. & Gerber, C. Atomic force microscope. Phys. Rev. Lett. 56, 930–933 (1986).
7. Jiang, Y. et al. Membrane-mediated protein interactions drive membrane protein organization. Nat. Commun. 13, 7373 (2022).
8. Yao, X., Fan, X. & Yan, N. Cryo-EM analysis of a membrane protein embedded in the liposome. Proc. Natl Acad. Sci. USA 117, 18497–18503 (2020).
9. Ando, T. et al. A high-speed atomic force microscope for studying biological macromolecules. Proc. Natl Acad. Sci. USA 98, 12468–12472 (2001).
10. Jiang, Y. et al. HS-AFM single-molecule structural biology uncovers basis of transporter wanderlust kinetics. Nat. Struct. Mol. Biol. 31, 1286–1295 (2024).
11. Jiao, F. et al. Perforin-2 clockwise hand-over-hand pre-pore to pore transition mechanism. Nat. Commun. 13, 5039 (2022).
12. Uchihashi, T., Iino, R., Ando, T. & Noji, H. High-speed atomic force microscopy reveals rotary catalysis of rotorless F₁-ATPase. Science 333, 755–758 (2011).
13. Lansky, S. et al. A pentameric TRPV3 channel with a dilated pore. Nature 621, 206–214 (2023).
14. Viani, M. et al. Fast imaging and fast force spectroscopy of single biopolymers with a new atomic force microscope designed for small cantilevers. Rev. Sci. Instrum. 70, 4300–4303 (1999).
15. Kodera, N., Sakashita, M. & Ando, T. Dynamic proportional-integral-differential controller for high-speed atomic force microscopy. Rev. Sci. Instrum. 77, 083704 (2006).
16. Shibata, M., Yamashita, H., Uchihashi, T., Kandori, H. & Ando, T. High-speed atomic force microscopy shows dynamic molecular processes in photoactivated bacteriorhodopsin. Nat. Nanotechnol. 5, 208–212 (2010).
17. Ando, T., Uchihashi, T. & Scheuring, S. Filming biomolecular processes by high-speed atomic force microscopy. Chem. Rev. 114, 3120–3188 (2014).
18. Ruan, Y. et al. Direct visualization of glutamate transporter elevator mechanism by high-speed AFM. Proc. Natl Acad. Sci. USA 114, 1584–1588 (2017).
19. Ruan, Y. et al. Structural titration of receptor ion channel GLIC gating by HS-AFM. Proc. Natl Acad. Sci. USA 115, 10333–10338 (2018).
20. Heath, G. R. & Scheuring, S. Advances in high-speed atomic force microscopy (HS-AFM) reveal dynamics of transmembrane channels and transporters. Curr. Opin. Struct. Biol. 57, 93–102 (2019).
21. Sanganna Gari, R. R. et al. Correlation of membrane protein conformational and functional dynamics. Nat. Commun. 12, 4363 (2021).
22. Perrino, A. P., Miyagi, A. & Scheuring, S. Single molecule kinetics of bacteriorhodopsin by HS-AFM. Nat. Commun. 12, 7225 (2021).
23. Heath, G. R., Lin, Y. C., Matin, T. R. & Scheuring, S. Structural dynamics of channels and transporters by high-speed atomic force microscopy. Methods Enzymol. 652, 127–159 (2021).
24. Maity, S. et al. High-speed atomic force microscopy reveals a three-state elevator mechanism in the citrate transporter CitS. Proc. Natl Acad. Sci. USA 119, https://doi.org/10.1073/pnas.2113927119 (2022).
25. Heath, G. R. et al. Localization atomic force microscopy. Nature 594, 385–390 (2021).
26. Allen, G. S. & Stokes, D. L. Modeling, docking, and fitting of atomic structures to 3D maps from cryo-electron microscopy. Methods Mol. Biol. 955, 229–241 (2013).
27. Pettersen, E. F. et al. UCSF chimera-a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).
28. Schabert, F. A., Henn, C. & Engel, A. Native Escherichia coli OmpF porin surfaces probed by atomic force microscopy. Science 268, 92–94 (1995).
29. Scheuring, S. et al. Structural models of the supramolecular organization of AQP0 and connexons in junctional microdomains. J. Struct. Biol. 160, 385–394 (2007).
30. Niina, T., Matsunaga, Y. & Takada, S. Rigid-body fitting to atomic force microscopy images for inferring probe shape and biomolecular structure. PLoS Comput. Biol. 17, e1009215 (2021).
31. Amyot, R., Marchesi, A., Franz, C. M., Casuso, I. & Flechsig, H. Simulation atomic force microscopy for atomic reconstruction of biomolecular structures from resolution-limited experimental images. PLoS Comput. Biol. 18, e1009970 (2022).
32. Ogane, T. et al. Development of hidden Markov modeling method for molecular orientations and structure estimation from high-speed atomic force microscopy time-series images. PLoS Comput. Biol. 18, e1010384 (2022).
33. Niina, T., Fuchigami, S. & Takada, S. Flexible fitting of biomolecular structures to atomic force microscopy images via biased molecular simulations. J. Chem. Theory Comput. 16, 1349–1358 (2020).
34. Fuchigami, S., Niina, T. & Takada, S. Particle filter method to integrate high-speed atomic force microscopy measurements with biomolecular simulations. J. Chem. Theory Comput. 16, 6609–6619 (2020).
35. Dasgupta, B., Miyashita, O. & Tama, F. Reconstruction of low-resolution molecular structures from simulated atomic force microscopy images. Biochim. Biophys. Acta Gen. Subj. 1864, 129420 (2020).
36. Dasgupta, B., Miyashita, O., Uchihashi, T. & Tama, F. Reconstruction of three-dimensional conformations of bacterial ClpB from high-speed atomic-force-microscopy images. Front. Mol. Biosci. 8, 704274 (2021).
37. Trabuco, L., Villa, E., Mitra, K., Frank, J. & Schulten, K. Flexible fitting of atomic structures into electron microscopy maps using molecular dynamics. Structure 16, 673–683 (2008).
38. Heath, G. R., Micklethwaite, E. & Storer, T. M. NanoLocz: image analysis platform for AFM, high-speed AFM, and localization AFM. Small Methods 8, e2301766 (2024).
39. Miyagi, A., Chipot, C., Rangl, M. & Scheuring, S. High-speed atomic force microscopy shows that annexin V stabilizes membranes on the second timescale. Nat. Nanotechnol. 11, 783–790 (2016).
40. Heath, G. R. & Scheuring, S. High-speed AFM height spectroscopy reveals µs-dynamics of unlabeled biomolecules. Nat. Commun. 9, 4983 (2018).
41. Lin, Y. C., Chipot, C. & Scheuring, S. Annexin-V stabilizes membrane defects by inducing lipid phase transition. Nat. Commun. 11, 230 (2020).
42. Rosenthal, P. B. & Rubinstein, J. L. Validating maps from single particle electron cryomicroscopy. Curr. Opin. Struct. Biol. 34, 135–144 (2015).
43. van Heel, M. & Schatz, M. Fourier shell correlation threshold criteria. J. Struct. Biol. 151, 250–262 (2005).
44. Schermelleh, L. et al. Super-resolution microscopy demystified. Nat. Cell Biol. 21, 72–84 (2019).
45. Betzig, E. et al. Imaging intracellular fluorescent proteins at nanometer resolution. Science 313, 1642–1645 (2006).
46. Huber, R. et al. Crystal and molecular structure of human annexin V after refinement. Implications for structure, membrane binding and ion channel formation of the annexin family of proteins. J. Mol. Biol. 223, 683–704 (1992).
47. Cheng, A. et al. MRC2014: extensions to the MRC format header for electron cryo-microscopy and tomography. J. Struct. Biol. 192, 146–150 (2015).
48. Trabuco, L. et al. Applications of the molecular dynamics flexible fitting method. J. Struct. Biol. 173, 420–427 (2011).
49. Singharoy, A. et al. Molecular dynamics-based refinement and validation for sub-5 Å cryo-electron microscopy maps. Elife 5, https://doi.org/10.7554/eLife.16105 (2016).
50. McGreevy, R., Teo, I., Singharoy, A. & Schulten, K. Advances in the molecular dynamics flexible fitting method for cryo-EM modeling. Methods 100, 50–60 (2016).
51. Goh, B. C. et al. Computational methodologies for real-space structural refinement of large macromolecular complexes. Annu. Rev. Biophys. 45, 253–278 (2016).
52. Brooks, B. R. et al. CHARMM: the biomolecular simulation program. J. Comput. Chem. 30, 1545–1614 (2009).
53. Yernool, D., Boudker, O., Jin, Y. & Gouaux, E. Structure of a glutamate transporter homologue from Pyrococcus horikoshii. Nature 431, 811–818 (2004).
54. Verdon, G., Oh, S., Serio, R. N. & Boudker, O. Coupled ion binding and structural transitions along the transport cycle of glutamate transporters. Elife 3, e02283 (2014).
55. Verdon, G. & Boudker, O. Crystal structure of an asymmetric trimer of a bacterial glutamate transporter homolog. Nat. Struct. Mol. Biol. 19, 355–357 (2012).
56. Reyes, N., Ginter, C. & Boudker, O. Transport mechanism of a bacterial homologue of glutamate transporters. Nature 462, 880–885 (2009).
57. Wang, X. & Boudker, O. Large domain movements through the lipid bilayer mediate substrate release and inhibition of glutamate transporters. Elife 9, https://doi.org/10.7554/eLife.58417 (2020).
58. Kramer, M. Nonlinear principal component analysis using autoassociative neural networks. AIChE J. 37, 233–243 (1991).
59. Lemke, T. & Peter, C. EncoderMap: dimensionality reduction and generation of molecule conformations. J. Chem. Theory Comput. 15, 1209–1215 (2019).
60. Tian, H. et al. Explore protein conformational space with variational autoencoder. Front. Mol. Biosci. 8, https://doi.org/10.3389/fmolb.2021.781635 (2021).
61. Jin, Y., Johannissen, L. & Hay, S. Predicting new protein conformations from molecular dynamics simulation conformational landscapes and machine learning. Proteins-Struct. Funct. Bioinformatics 89, 915–921 (2021).
62. Degiacomi, M. Coupling molecular dynamics and deep learning to mine protein conformational space. Structure 27, 1034 (2019).
63. Ramaswamy, V., Musson, S., Willcocks, C. & Degiacomi, M. Deep learning protein conformational space with convolutions and latent interpolations. Phys. Rev. X 11, https://doi.org/10.1103/PhysRevX.11.011052 (2021).
64. Tsuchiya, Y., Taneishi, K. & Yonezawa, Y. Autoencoder-based detection of dynamic allostery triggered by ligand binding based on molecular dynamics. J. Chem. Inf. Model. 59, 4043–4051 (2019).
65. Webby, M. N. et al. Lipids mediate supramolecular outer membrane protein assembly in bacteria. Sci. Adv. 8, eadc9566 (2022).
66. Rico, F., Gonzalez, L., Casuso, I., Puig-Vidal, M. & Scheuring, S. High-speed force spectroscopy unfolds titin at the velocity of molecular dynamics simulations. Science 342, 741–743 (2013).
67. Catmull, E. & Rom, R. Computer-aided geometric design. 317–326 (Academic Press, 1974).
68. Humphrey, W., Dalke, A. & Schulten, K. VMD: visual molecular dynamics. J. Mol. Graph. Model. 14, 33–38 (1996).
69. Phillips, J. et al. Scalable molecular dynamics with NAMD. J. Comput. Chem. 26, 1781–1802 (2005).
70. Chan, K. et al. Symmetry-restrained flexible fitting for symmetric EM maps. Structure 19, 1211–1218 (2011).
71. Jiang, Y., Wang, Z. & Scheuring, S. A structural biology compatible file format for atomic force microscopy. Figshare. https://doi.org/10.6084/m9.figshare.27737346 (2025).
72. Jiang, Y., Wang, Z. & Scheuring, S. A structural biology compatible file format for atomic force microscopy. GitHub. https://doi.org/10.5281/zenodo.14641377 (2025).

Acknowledgements

Work in the Scheuring laboratory was supported by grants from the National Institutes of Health (NIH), National Center for Complementary and Integrative Health (NCCIH), DP1AT010874 (S.S.), and National Institute of Neurological Disorders and Stroke (NINDS), R01NS110790 (S.S.).

Author information

Authors and Affiliations

Biochemistry & Structural Biology, Cell & Developmental Biology, and Molecular Biology (BCMB) Program, Weill Cornell Graduate School of Medical Sciences, New York, NY, USA
Yining Jiang
Weill Cornell Medicine, Department of Anesthesiology, New York, NY, USA
Yining Jiang, Zhaokun Wang & Simon Scheuring
Physiology, Biophysics and Systems Biology Graduate Program, Weill Cornell Graduate School of Medical Sciences, New York, NY, USA
Zhaokun Wang
Weill Cornell Medicine, Department of Physiology and Biophysics, New York, NY, USA
Simon Scheuring

Authors

Yining Jiang
Zhaokun Wang
Simon Scheuring

Contributions

Y.J. and S.S. designed the study; Y.J. and Z.W. developed the 3D-LAFM workflows and codes; Y.J. and S.S. analyzed the 3D-LAFM data; Y.J. designed the 3D-LAFM force field; Y.J., Z.W., and S.S. designed MDFF experiments; Z.W. performed MDFF simulations; Y.J. and Z.W. analyzed the MDFF trajectories; Y.J. and S.S. wrote the manuscript; Y.J., Z.W., and S.S. edited the manuscript; S.S. supervised the study.

Corresponding author

Correspondence to Simon Scheuring.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Movies 1–9

Reporting Summary

Transparent Peer Review file

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Jiang, Y., Wang, Z. & Scheuring, S. A structural biology compatible file format for atomic force microscopy. Nat Commun 16, 1671 (2025). https://doi.org/10.1038/s41467-025-56760-7

Received: 14 November 2024
Accepted: 30 January 2025
Published: 15 February 2025
Version of record: 15 February 2025
DOI: https://doi.org/10.1038/s41467-025-56760-7