Fig. 1: The CALBIA method for the assembly of Mb-scale human genome DNA.

a Design principle of conjugation-associated linear-BAC iterative assembly (CALBIA) of large DNAs. In each assembly round, the strain harboring the helper plasmid pUZ8002 served as the donor and the other strain served as the recipient. The linear BAC plasmid in the donor was transferred into the recipient by conjugation, where it recombined with the recipient's linear BAC plasmid at the overlapping region to yield the assembled plasmid. Selective markers R1 and R2 were used to select recombinants. A and B represent human sequences cloned on the BAC plasmids (red and blue bars, respectively). The small red bar to the left of the blue bar B indicates the region overlapping with A. The tos site (gray square) and telN gene (black dot) are also marked. b Assembly of the human IgHV gene cluster. The left panel shows a schematic diagram of the assembly of the BAC plasmid containing the 1.07-Mb IgHV gene cluster. The right panel shows PFGE validation of the assembled plasmids pA (330 kb), pB (456 kb), pC (612 kb), pD (716 kb), pE (841 kb), pF (951 kb), and pG (1065 kb). The sixteen chromosomes of S. cerevisiae were used as size markers (Bio-Rad, 1703605). c Schematic diagram of the construction of the E. coli strain EMT21. To construct strain EMT5, the oriC of the E. coli MDS42 chromosome was replaced with the tos-BAC cassette, which contains the BAC replication origin, partition system, and tos site, and the tus gene was then deleted. When the TelN expression plasmid was introduced into EMT5 (tus knockout) by transformation, only about a dozen very small colonies appeared on the plate, and only after a week. EMT21 was constructed by moving the TelN expression cassette from the plasmid to the chromosome, and its chromosome was confirmed to be linear. Genome sequencing identified a 4-bp deletion at positions 276–279 of matP that results in a frameshift mutation. d Schematic diagram of the assembly of 2.13 Mb of human DNA on the linear chromosome of E. coli.
The assembled plasmids pZR5 (845 kb), pZR3 (311 kb), pZR12 (423 kb), pZR9 (438 kb), pZR8 (424 kb), and pZR13 (398 kb) were iteratively combined and integrated into the linear chromosome of EMT21, resulting in strain HMT18. e PFGE verification of the assembled human DNA released by Cas9 cleavage. The linear chromosomes carrying the integrated DNA assemblies in strains HMT11, HMT12, HMT13, HMT15, HMT16, and HMT18 are very large (5 to 6 Mb) and mostly remain in the gel block under the electrophoresis conditions used. Cas9 cleavage released the assembled DNA, which ranged from 0.85 to 2.13 Mb and could be resolved by PFGE. Hansenula wingei chromosomal DNA was used as the size marker (Bio-Rad, 1703667). f Validation of the integrity of the assembled 2.13-Mb human DNA by high-throughput sequencing. The lower part shows the sequencing coverage of the insert in HMT18, and the upper part shows the coverage of the 12 original BACs. Read coverage (y-axis) is plotted against position in base pairs from the start of the 2.13-Mb assembly. g PFGE validation of the genetic stability of the integrated 2.13-Mb assembled DNA. The integrated DNA was released from HMT18 by Cas9 nuclease. The size and integrity of the assembly remained unchanged after three days of serial cultivation, as shown in the PFGE gel.