Extended Data Figure 1: Integrative structure determination of the S. cerevisiae NPC at 9 Å precision.
From: Integrative structure and functional anatomy of a nuclear pore complex

a, Schematic of integrative structure determination of the S. cerevisiae NPC. Random initial structures of the Nups and their sub-complexes were optimized by satisfying spatial restraints implied by the input information. b, Full description of the integrative structure determination of the S. cerevisiae NPC, which proceeded through four stages8,90,91,92 (Supplementary Table 3): (1) gathering data, (2) representing subunits and translating data into spatial restraints, (3) configurational sampling to produce an ensemble of structures that satisfies the restraints and (4) analysing and validating the ensemble structures and data (Extended Data Figs 7, 8, Supplementary Tables 2–4 and Methods). The integrative structure modelling protocol (stages 2, 3 and 4) was scripted using the Python modelling interface package version 4d97507, which is a library for modelling macromolecular complexes based on our open-source IMP package90 version 2.6 (https://integrativemodeling.org). c, Convergence of the structure score for the 5,529 good-scoring NPC structures; the scores do not continue to improve as more structures are computed, essentially independently of each other. The error bars represent the standard deviations of the best scores, estimated by repeating sampling of NPC structures ten times (n = 10, mean score values plotted). The red dotted line indicates the total score threshold (88,644.1) that defines the good-scoring NPC structures (Methods). d, Distribution of scores for structure samples 1 (red) and 2 (blue), comprising the 5,529 good-scoring NPC structures (n_sample1 = 2,359 and n_sample2 = 3,170 structures). The non-parametric Kolmogorov–Smirnov two-sample test120,121 (two-sided) indicates that the difference between the two score distributions is insignificant (P value (1.0) > 0.05). In addition, the magnitude of the difference is small, as demonstrated by the Kolmogorov–Smirnov two-sample test statistic, D, of 0.045.
Thus, the two score distributions are effectively equal. e, Three criteria for determining the sampling precision (y axis), evaluated as a function of the r.m.s.d. clustering threshold123 (x axis) (n = 5,529 structures). First, the P value is computed using the χ²-test (one-sided) for homogeneity of proportions122 (red dots). Second, an effect size for the χ²-test is quantified by the Cramér’s V value (blue squares). Third, the population of structures in sufficiently large clusters (containing at least ten structures from each sample) is shown as green triangles. The vertical dotted grey line indicates the r.m.s.d. clustering threshold at which three conditions are satisfied (χ²-test P value (0.75) > 0.05 (red, horizontal dotted line), Cramér’s V (0.065) < 0.10 (blue, horizontal dotted line) and the population of clustered structures (0.90) > 0.80 (green, horizontal dotted line)), thus defining the sampling precision of 9 Å. The three solid curves (in red, blue and green) were drawn through the points to help visualize the results. f, Population of sample 1 and 2 structures in the three clusters obtained by threshold-based clustering123 using an r.m.s.d. threshold of 12 Å. The dominant cluster (cluster 1) contains 80.3% of the structures. Cluster precision is shown for each cluster. The precision of the dominant cluster defines the structure precision.
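
The quantities named in panels d and e can be illustrated with a minimal Python sketch. This is not the IMP or Python modelling interface code used for the figure. The function names (`ks_statistic`, `chi2_and_cramers_v`, `clustered_population`) and the toy cluster counts are hypothetical, chosen only to show how the Kolmogorov–Smirnov statistic D, the χ² statistic with Cramér’s V, and the clustered population might be computed from two samples. P values are omitted because they require distribution functions outside the standard library; `scipy.stats.ks_2samp` and `scipy.stats.chi2_contingency` would be one way to obtain them.

```python
from bisect import bisect_right
from math import sqrt


def ks_statistic(scores_a, scores_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between the empirical CDFs."""
    a, b = sorted(scores_a), sorted(scores_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of sample a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of sample b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


def chi2_and_cramers_v(counts_1, counts_2):
    """Chi-squared statistic and Cramer's V for a 2 x k table of cluster counts.

    counts_1[i] and counts_2[i] are the numbers of structures from sample 1
    and sample 2 that fall in cluster i (assumes both samples are non-empty).
    """
    n1, n2 = sum(counts_1), sum(counts_2)
    total = n1 + n2
    chi2 = 0.0
    for c1, c2 in zip(counts_1, counts_2):
        column = c1 + c2
        if column == 0:
            continue  # an empty cluster contributes nothing
        for observed, row in ((c1, n1), (c2, n2)):
            expected = row * column / total
            chi2 += (observed - expected) ** 2 / expected
    # For a 2 x k table with k >= 2, min(rows - 1, columns - 1) = 1.
    cramers_v = sqrt(chi2 / total)
    return chi2, cramers_v


def clustered_population(counts_1, counts_2, min_per_sample=10):
    """Fraction of structures in clusters holding >= min_per_sample from each sample."""
    total = sum(counts_1) + sum(counts_2)
    kept = sum(
        c1 + c2
        for c1, c2 in zip(counts_1, counts_2)
        if c1 >= min_per_sample and c2 >= min_per_sample
    )
    return kept / total


if __name__ == "__main__":
    # Toy counts (hypothetical, not the data behind the figure).
    sample_1 = [80, 15, 5]
    sample_2 = [160, 30, 10]
    chi2, v = chi2_and_cramers_v(sample_1, sample_2)
    population = clustered_population(sample_1, sample_2)
    # Thresholds quoted in panel e: V < 0.10 and clustered population > 0.80.
    print(chi2, v, population, v < 0.10 and population > 0.80)
```

In the procedure described for panel e, such values would be recomputed at each r.m.s.d. clustering threshold, and the sampling precision taken as the threshold at which the P value, Cramér’s V and clustered-population conditions all hold.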