Fig. 4: Analysis and modeling strategy for the live cell transcriptional data. | Nature Communications


From: Stochastic pausing at latent HIV-1 promoters generates transcriptional bursting


A, B Determination of models for transcription initiation. A An example of a complex multiple-state promoter model, describing the different steps leading to transcription initiation and their kinetic relationships. OFF: inactive promoter state; ON: active promoter state; orange ball: RNA polymerase. B The survival function (equal to one minus the cumulative distribution function) describes the distribution of polymerase waiting times (the delay between two successive initiation events). For multiple-state models such as the one depicted on the left, the survival function can be fitted by a sum of exponentials, with the number of exponentials equal to the number of promoter states. C Experimental and machine learning strategy to determine the survival function of polymerase waiting times. Left: the signal of short movies recorded at high temporal resolution results from the convolution of the signal of a single polymerase with the sequence of temporal positions of initiation events. The sequence of initiation events can thus be reconstructed by numerical deconvolution (see Supplementary Note 2), provided that the signal of a single polymerase is known. This allows us to estimate the distribution of waiting times shorter than the movie duration (i.e. a conditional distribution). Right: long movies recorded at a lower temporal resolution, on the order of the residence time of RNA polymerase on the gene (3 min), allow us to estimate the distribution of waiting times greater than the temporal resolution. The two conditional survival functions, short and long, can then be combined to reconstitute the complete, multiple-time-scale survival function. The reconstitution uses affine transformations of the conditional survival functions, defined by two parameters, ps and pl. pl is the probability that the waiting time is larger than the frame interval of the long movie. It is proportional to the number of waiting times hidden within active periods of the long movie, and is estimated from the number of inactive intervals and the cumulative duration of active periods of the long movie (see Supplementary Note 3). ps is the probability that the waiting time is larger than the short movie length and is fitted to minimize the distance between the short and long parts of the distribution. Finally, the complete survival function is fitted with a sum of exponentials to determine the number of promoter states, the kinetics of the transitions between them, and the initiation rate. Multiple models can easily be fitted to the same survival function, and the most appropriate one is selected based on parsimony, parametric indeterminacy and consistency with complementary experiments.
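The reconstitution step in panel C can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the actual estimators are described in Supplementary Notes 2 and 3); it only checks the affine relations that follow from the definitions of the two conditional distributions: for t below the short-movie length, S(t) = (1 - p_s) S_short(t) + p_s, and for t above the long-movie frame interval, S(t) = p_l S_long(t). Here `p_s` and `p_l` stand for ps and pl. All names and numerical values (`WEIGHTS`, `RATES`, `T_SHORT`, `DT_LONG`, `N`) are hypothetical, and the waiting times are simulated from a two-exponential survival function rather than measured. Because the simulated sample is fully observed, `p_s` and `p_l` are simply counted here, whereas in the experiment they have to be fitted or estimated as described in the legend.

```python
import math
import random

# Hypothetical parameters chosen for illustration only (not values from the paper).
WEIGHTS = [0.7, 0.3]   # amplitudes A_i of the exponentials, summing to 1
RATES = [1.0, 0.01]    # rates lambda_i, in 1/s
T_SHORT = 100.0        # short-movie length, in s
DT_LONG = 30.0         # long-movie frame interval, in s
N = 50_000             # number of simulated waiting times


def true_survival(t):
    """Sum-of-exponentials survival function S(t) = sum_i A_i * exp(-lambda_i * t)."""
    return sum(a * math.exp(-lam * t) for a, lam in zip(WEIGHTS, RATES))


def empirical_survival(times, t):
    """Fraction of waiting times strictly greater than t."""
    return sum(1 for x in times if x > t) / len(times)


rng = random.Random(0)


def draw_waiting_time():
    """Draw one waiting time from the two-exponential mixture."""
    lam = RATES[0] if rng.random() < WEIGHTS[0] else RATES[1]
    return rng.expovariate(lam)


times = [draw_waiting_time() for _ in range(N)]

# Conditional samples: what each movie type can resolve.
short_part = [x for x in times if x < T_SHORT]   # waiting times shorter than the short movie
long_part = [x for x in times if x > DT_LONG]    # waiting times longer than the long-movie interval

# Assumption: counted directly from the fully observed simulated sample.
p_s = 1.0 - len(short_part) / N   # P(T >= T_SHORT)
p_l = len(long_part) / N          # P(T > DT_LONG)


def reconstituted(t):
    """Complete survival function rebuilt from the two conditional ones."""
    if t < T_SHORT:
        # Short branch: rescale the conditional survival and shift it by p_s.
        return (1.0 - p_s) * empirical_survival(short_part, t) + p_s
    # Long branch (valid because T_SHORT > DT_LONG): rescale by p_l.
    return p_l * empirical_survival(long_part, t)


for t in (1.0, 10.0, 50.0, 200.0, 500.0):
    print(t, reconstituted(t), true_survival(t))
```

Between `DT_LONG` and `T_SHORT` both branches are valid and should agree; this overlap is what makes it possible to fit ps by minimizing the distance between the short and long parts, as the legend describes. The reconstituted curve could then be passed to any nonlinear least-squares routine to fit the sum of exponentials.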
