Fig. 4: Genome signature protected synthetic E. coli genome.
From: Self-authenticating genomic materials in Escherichia coli via advanced genome signatures

a E. coli genome of 4,521,562 nucleotides with a genome signature coded in ten genes (Supplementary Data 2). Each gene encodes a genome segment ranging from 622,018 to 4,521,562 nucleotides in length. b A genome with ten coded genes was constructed by precise genome editing (“Methods”), and the expression levels of the ten coded genes were quantified using quantitative PCR. Error bars represent the means ± SD from n = 3 biologically independent samples. ∗ represents p < 0.05; two-tailed t test. c Cell growth was measured in standard LB medium. The data represent the means ± SD from n = 3 biologically independent samples; scale bar = 1 cm. The orange bars represent the coded genes, while the blue bars represent the corresponding wild-type genes. d Various combinations of random substitutions (X-axis) and indels (Y-axis) were identified in computational experiments. The success rate of correction for each mutation combination was obtained from n = 100 computations. The mutations (red dot) were successfully identified in the sequenced genome of cells that were treated with ARTP (“Methods”). e The genome of E. coli BL21(DE3) contains the genome signature encoded in a total of 100 genes; their positions on the genome are marked as outer red (mutation correction) bars. A nine-layer nested structure is depicted in the center as curved colored lines, representing the 100 segments that are coded within these 100 genes (Supplementary Data 2). f Gene kbl codes a segment of 76,795 nucleotides (right blue curved line), glyS codes the downstream segment of 64,937 nucleotides (left blue curved line), and aldB codes a segment of 143,271 nucleotides spanning both the kbl and glyS segments (orange curved line). g All 100 coded genes were identified in iterative DM decoding cycles, depicted as numbered gray lines. h A total of 700 substitution mutations were identified and corrected. A red cross shows a mutation successfully corrected at round 337 of the GMS decoding.
Orange lines indicate sequences with undetermined authenticity, while green lines represent sequences with GMS-validated authenticity. i Distribution of decoded genome signature and mutations in the local genome.