Fig. 3: Error rate estimation by marginalization. | Nature Methods

From: Cell tracking with accurate error prediction

a, Simplified example explaining marginalization. Lines indicate putative links A–D, with thickness indicating their estimated probability. Link A (red) has a low predicted probability. However, the high-probability link D implies that A is true with high certainty by excluding options containing B and C. The probability P(A|G) that A is true given graph structure G can be calculated by comparing the probability of configurations that contain link A with that of configurations that do not. b, Schematic outline of marginalization performed on a subset of links around the link of interest. P(A|G) is given by the summed energy of all configurations containing link A normalized to the summed energy of all configurations. c,d, Measured link likelihood versus naive likelihoods predicted by the neural network (c) or context-aware likelihoods calculated by marginalization (d). Data are shown for all possible links (black) or links that are either in the global solution (blue) or not (gray). For naive likelihoods (c), links in the tracking solution are more likely to be correct than predicted, while, for context-aware likelihoods (d), they more closely match measured likelihoods, reflecting integration of graph information. Dashed line represents perfect calibration. Data for n = 5 organoids. Shaded region is the standard deviation around the mean. e, Context-aware likelihoods versus naive likelihoods. Dots are individual links. Lines are averages for true (green) or false (red) links. Marginalization increased the predicted likelihood of correct links while decreasing it for incorrect links. f,g, Number of links versus predicted naive (f) or context-aware (g) link likelihood. In f, while most links in the globally optimal solution (blue) are predicted with high confidence (>99% probability), a fraction have confidence levels similar to those of rejected links (gray). By contrast, for g, virtually all globally optimal links are now predicted with high confidence.
h, Fraction of links in the globally optimal solution deemed low confidence (<99% probability). The fraction of low-confidence links that were actual errors compared to ground truth (red) is almost identical to the fraction of errors among all links (triangle), indicating that a <99% probability threshold covers virtually all errors. Marginalization thus reclassified many low-confidence links as high-confidence links but not those that represent errors.
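
The marginalization idea in panels a and b can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the naive probabilities are invented, the toy geometry has two cells at one time point and two at the next with every cell linked exactly once, and each configuration is weighted by the product of p for included links and (1 − p) for excluded links rather than by the energy function used in the paper.

```python
from itertools import product

# Hypothetical naive link probabilities (invented, not taken from the paper).
naive = {"A": 0.4, "B": 0.3, "C": 0.3, "D": 0.99}

# Assumed toy geometry: two cells at time t, two cells at time t+1.
# Each link is (source cell, target cell).
links = {"A": (1, 1), "B": (1, 2), "C": (2, 1), "D": (2, 2)}


def is_valid(config):
    """Valid if every source and every target is used exactly once."""
    sources = sorted(links[name][0] for name in config)
    targets = sorted(links[name][1] for name in config)
    return sources == [1, 2] and targets == [1, 2]


def weight(config):
    """Unnormalized weight: p for included links, (1 - p) for excluded ones."""
    w = 1.0
    for name, p in naive.items():
        w *= p if name in config else 1.0 - p
    return w


def marginal(link):
    """P(link | G): weight of valid configurations containing the link,
    normalized by the weight of all valid configurations."""
    names = list(naive)
    total = 0.0
    with_link = 0.0
    for mask in product([False, True], repeat=len(names)):
        config = {n for n, keep in zip(names, mask) if keep}
        if not is_valid(config):
            continue
        w = weight(config)
        total += w
        if link in config:
            with_link += w
    return with_link / total


if __name__ == "__main__":
    for name in naive:
        print(name, round(marginal(name), 4))
```

Under these assumptions only two configurations are valid, {A, D} and {B, C}. Because D is almost certain, nearly all the weight falls on {A, D}, so the context-aware probability of A rises above the 99% threshold even though its naive probability is only 0.4.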
