Extended Data Fig. 7: Performance for different image acquisition parameters.

a) Tracking performance for intestinal organoids imaged on a different scanning confocal microscope (Leica TCS SP8) than the one used to collect the training data (Nikon A1R MP). Consequently, the imaging data had a lower planar resolution (0.4 μm/px rather than 0.32 μm/px), and cells could be imaged deeper into the organoid, but at a low signal-to-noise ratio. Green squares are tracked cells at 60 μm depth, while red and blue squares are their previous and next locations, respectively. The cells marked with an asterisk (*) correspond to cells in the lineage trees (right). Open circles (o) denote missed cell detections. Even at this comparatively low signal-to-noise ratio, most (>80%) cells are detected. The YZ cross-section shows the depth of these cells in terms of cell number. Lineage trees are color-coded by predicted error rate. Most cells can be tracked for multiple hours without potential mistakes. b) Same as in a, but for cells at 50 μm depth. Signal-to-noise ratios are higher here and all cells are detected, while the lineages show cell tracking for long (>10 h) periods without potential mistakes. c, d) Top: measured link likelihood versus the likelihood predicted by the marginalization procedure, either without (c) or with (d) recalibration on newly corrected data. Data are shown for all possible links (black) or for links that are in the global solution (blue). The dotted line indicates perfect calibration. Data are for the four different data sets shown in e and f. Shaded region is S.D.M. Without recalibration (c), predicted likelihoods are overconfident; for example, the 99% probability threshold actually corresponds to a lower level of certainty. Using the actual 99% probability threshold would increase the number of links that need to be checked (c, bottom histogram). However, the number of potential mistakes that go unflagged when using the uncalibrated threshold remains low (0.06% of links). e) Visualization of the recalibration process.
The recalibration constant, defined as the ratio of the old and new scaling temperatures, is estimated by manual error correction of predicted tracks. Graphs show the estimated recalibration constant versus the number of frames corrected for four different data sets, as well as for all data pooled. The estimates converge on the consensus estimate (dotted line) for >5 corrected frames. Shaded area denotes the 95% confidence interval. f) The number of reviewed potential errors (<99% certainty links) as a function of the number of frames corrected. Tight estimates of the recalibration constant are reached after ~200 reviewed potential errors. g) Survival curve indicating the probability that a cell has not divided at time t after its birth. Shown are the original (blue, green) and recalibrated (orange, red) data for the same organoid. h) Same as g, but for the timing of division relative to the sister division. The strong overlap between original and recalibrated data indicates that for downstream applications perfect calibration is often not essential.
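
As an illustration of the recalibration in panels c to f, the following is a minimal sketch, not the authors' implementation. It assumes that link probabilities come from a sigmoid of temperature-scaled log-odds, so that changing the temperature amounts to multiplying the log-odds by a constant c, taken here as T_old / T_new. The function names (`recalibrate`, `estimate_constant`), the grid-search fit and the example numbers are all hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def recalibrate(p_old, c):
    """Rescale the log-odds of p_old by c (assumed here to be T_old / T_new)."""
    return sigmoid(c * logit(p_old))


def negative_log_likelihood(c, p_old, correct):
    """NLL of manually verified links under recalibration constant c."""
    total = 0.0
    for p, ok in zip(p_old, correct):
        q = recalibrate(p, c)
        total -= math.log(q) if ok else math.log(1.0 - q)
    return total


def estimate_constant(p_old, correct):
    """Grid search over c in [0.10, 3.00] (hypothetical range, step 0.01)."""
    grid = [i / 100.0 for i in range(10, 301)]
    return min(grid, key=lambda c: negative_log_likelihood(c, p_old, correct))


# Made-up example: 100 links predicted at 99% certainty, of which
# manual review found 91 to be correct (an overconfident prediction).
p_old = [0.99] * 100
correct = [True] * 91 + [False] * 9
c_hat = estimate_constant(p_old, correct)
p_new = recalibrate(0.99, c_hat)
```

In this made-up example the fitted constant is below 1, so the recalibrated probability of a nominally 99% link drops towards the observed fraction of correct links, which is the direction of the overconfidence correction described for panel c.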