Introduction

The interrogation of antigen-specific antibody repertoires plays a key role in the development of therapies including multispecific antibodies, antibody–drug conjugates and cellular therapies1,2. Since the specificity of an antibody is determined by two independent chains—the variable heavy (VH) and light chains (VL)—each of which is encoded by different mRNA transcripts3, it has, until recently, been difficult to comprehensively and simultaneously probe the sequence and chain pairing diversity in human antibody repertoires4. Conventional antibody discovery has leveraged low-throughput, well-based sorting strategies to interrogate less than 1% of paired VH-VL sequences derived from B-cells within a typical blood draw and even less from harvested secondary lymphoid organs, thus preserving chain pairing at the cost of sequence depth and diversity5,6. The limited sequence diversity explored by single-cell sorting strategies has heralded technologies where the VH and VL repertoires can be mined independently through “deep sequencing”7,8 or by generating and panning immune display libraries where the cognate chain pairings are decoupled and reassembled combinatorially9,10. The unprecedented screening depth of these approaches provides greater access to rare, high affinity antibody sequences. However, the loss of native pairing in antibodies in these combinatorial repertoires is thought to contribute to developability risks—features that lead to their eventual attrition from the drug pipeline11,12,13.

In the last decade, droplet microfluidics has emerged as a transformative technology that addresses the limitations of both low-throughput cell sorting and manipulation, where cognate pairing is preserved, as well as high-throughput interrogation methods that decouple native chain pairings but allow for deeper mining of antibody repertoires14,15. Several groups14,16,17,18 have developed elegant microfluidic platforms where individual B-cells can be isolated into nanoliter-sized droplets and individually manipulated to splice cognate V genes and create natively paired scFv libraries for panning via display-based selections or screening by NGS. More recently, such workflows have been utilized to build target-specific, natively paired immune libraries and demonstrate the greater likelihood of natively-paired binders to specifically and sensitively bind their target relative to counterparts with non-native chain pairings16,19.

However, while affinity and specificity are critical features that determine the clinical translatability of an antibody, the successful development of biologics requires consideration of their physicochemical properties as well20,21. Termed ‘developability’, these properties, which include thermal and colloidal stability as well as the propensity for aggregation or self-interaction, encompass the feasibility of an antibody-based drug to successfully progress from discovery to development22. Since Jain et al.’s comprehensive effort to characterize acceptable metrics of ‘developability’ among clinical-stage antibodies23, there is a strong consensus that antibodies with the most optimal drug-like behavior not only display high affinity and specificity but also well-defined functional and biophysical properties. While anecdotal evidence exists that preserving the inherent diversity and pairing of heavy and light chains found within a native repertoire can contribute to antibodies with superior developability and function11,12,13, there have been few systematic, high-throughput efforts to compare the developability and functional profiles of target-specific antibodies with native and non-native chain pairings11,19.

In this work, we highlight a series of technical accomplishments, both in fluidics and molecular biology, that have been deployed to generate and characterize two majority natively paired human scFv libraries—one targeting soluble Antigen or Target A and the other against membrane-expressed Antigen or Target B. While others have utilized droplet microfluidics-based approaches to link the heavy and light chains from a single BCR and build natively paired scFv libraries, few of these previous efforts have bioinformatically and experimentally validated the degree of native pairing16 and the composition of antigen-specific clones within these in vivo-derived repertoires19. By engineering validated natively paired libraries and comparing them to combinatorial immune libraries built from similar starting diversities, we present evidence that natively paired libraries contain a larger number of antigen-specific antibodies than combinatorial libraries and these cognate paired antibodies display favorable developability and functional profiles. Although potent drug-like antibody candidates can and have been previously identified from combinatorial immune and synthetic antibody repertoires24 as well as through low-throughput single B cell screening strategies25,26, combining high-throughput discovery approaches like phage display with large natively paired repertoires may increase the likelihood of successfully identifying multiple clinically-translatable drug candidates and accelerating the identification of not only high affinity but also functional, developable antibody-based drug candidates.

Results

Acrylamide hydrogel beads efficiently capture V(D)J diversities within natively paired repertoires

Previous approaches for generating natively paired libraries have leveraged a series of sequential emulsions to lyse B cells, capture variable heavy and light genes and splice cognate chains to create scFvs as shown in Fig. 1. However, when generating paired libraries using these standard approaches, we observe three points of inefficiency—(1) limited or skewed diversities of captured variable regions from lysed B cells, (2) the stochastic movement of mRNA transcripts during the transfer of capture beads between emulsions and (3) droplet merging events during in-drop OE RT-PCR—that affect both the preservation of cognate pairing and sequence diversity within the repertoire. Figure 2A highlights each pinchpoint in traditional workflows as well as existing technologies to improve pairing efficiency and diversity of generated libraries.

Fig. 1

Workflow for generating natively paired libraries. A schematic describing the workflow for generating natively paired scFv immune phage libraries using droplet microfluidics. Humanized mice are immunized with an antigen of interest and allowed to mount an immune response. Following immunization, (1) lymphoid organs, specifically the spleen and lymph nodes, are isolated and (2) subjected to a magnetic separation workflow to isolate IgG+ B-cells. IgG+ B-cells are then individually (3) encapsulated alongside oligodT-functionalized, compressible acrylamide beads and cell lysis buffer in nanoliter-sized droplets. Following lysis, (4) the oligodT beads capture RNA transcripts (via the 3’ polyA-tail) and the RNA-laden beads are recovered and subjected to bulk reverse transcription in the presence of high salt conditions to generate cDNA. (5) cDNA-decorated beads are then reencapsulated in nanoliter-sized droplets (alongside custom V-gene specific primer pools) for ~30 rounds of overlap extension PCR (OE-PCR). Following OE-PCR, VH-VL linked constructs or scFv inserts are isolated from drops and subcloned into a phagemid vector for expression on the surface of phage.

Fig. 2

Technologies for improving the generation of natively paired libraries. (a) A schema of the main process improvements used to generate natively paired immune phage libraries: (1) improvements in V gene capture via compressible oligodT beads, (2) improvements in droplet stability through the use of a two-phase “Velcro”-like surfactant, (3) reduction in antibody cognate chain mispairing through buffer optimization. For each process improvement, the associated stage within the workflow where the technical advance occurs is highlighted. (b) The heavy and (c) light chain V gene usage for naïve human repertoires captured using either compressible, oligodT beads or free oligodT. Segments with different shades of a given color or gene family indicate distinct subfamilies. Families representing less than 0.1% of the total frequency are shown in the exploded view to the left of each sector diagram. (d) The experimental scheme used to assess cognate chain pairing in a 3-member library using a bulk reverse transcription (RT) strategy. In this approach, oligodT beads are coencapsulated with single cells from an equal mixture of three distinct hybridomas. Following cell lysis and RNA capture, RNA-laden beads are retrieved from drops, pooled together in a high salt buffer and subjected to bulk RT followed by reencapsulation in “Velcro” drops to generate linked constructs. VH-VL linked constructs are isolated from drops and TOPO-cloned to generate colonies for Sanger sequencing. (e, f) A Circos plot of the cognate chain pairings for forty-four clones assessed by Sanger sequencing for libraries generated using (e) ‘standard’ and (f) ‘high salt’ buffer conditions. Correctly paired heavy and light chains are highlighted by colored streams while incorrectly paired heavy chain-barcode pairings are delineated in gray. (g) The “Velcro”-like surfactant used to stabilize drops during thermocycling. 
Upon droplet generation, the aqueous polyvinyl alcohol (PVA) interacts with the oleic fluoropolymer to create a stabilizing interfacial film on the surface of droplets. (h) Fluorescent image and (i) bright field image of “velcro” surfactant-stabilized drop(s) following 25 rounds of thermocycling. Drops contain either EvaGreen dye (Biotium) and an amplifiable gene template or the dye and no template. Inset schematizes the activity of the dye following PCR in empty and template-containing drops.

One such technology, aimed at capturing in vivo antibody repertoires without biasing VDJ gene representation, is the use of compressible, oligodT-conjugated acrylamide gels to collect mRNA transcripts from lysed cells. Unlike smaller commercial beads with surface-conjugated oligodT, oligodT-conjugated acrylamide gel beads can be individually loaded into 0.5 nL drops (Fig. S1A, B) as previously shown27. Furthermore, these porous gel beads have ~1 × 10^9 polydT oligos dispersed volumetrically27, ensuring that a single hydrogel can capture all mRNA transcripts from an individual B-cell. To assess the capacity of oligodT beads to comprehensively capture heavy and light chain gene repertoires from IgG+ B cells, individual B cells from naïve Trianni mice were either coencapsulated with a single oligodT-conjugated acrylamide gel bead or an equivalent amount of free oligodT (Fig. S1C). Following cell lysis, transcript capture and bulk RT, heavy and light chain genes were individually amplified and deep sequenced to determine VJ gene diversity. Figure 2B, C show the relative distribution of heavy and light chain gene families and subfamilies within repertoires captured with acrylamide gel beads and free oligodT. Although gene families are differentially represented within each repertoire, both free oligodT- and bead-based capture strategies produce near identical repertoires containing each human V or J gene subfamily engineered in Trianni mice. More importantly, these repertoires show similar frequency distributions of V and J genes (Fig. S1D, E), suggesting that individual oligodT-conjugated acrylamide gel beads can comprehensively capture antibody gene repertoires without introducing functional biases.

RT buffer optimization and two-phase surfactant enhance cognate chain pairing integrity

Aside from the unbiased capture of antibody genes, a second critical challenge in paired library construction remains the cognate chain mispairing that arises during the transfer of transcript-laden beads from emulsions optimized for lysis to those optimized for reverse transcription and amplification19. The stochastic movement of mRNA transcripts across beads during these emulsion transfers can result in a single bead displaying several distinct transcripts, giving rise to a mixture of cognate and noncognate chain pairings28. To avoid such mispairings and eliminate emulsion transfers, some studies have leveraged less efficient single emulsion workflows16. However, high concentrations of cell lysate in droplets following B cell lysis can inhibit the downstream splicing and amplification required to maintain scFv diversity within single emulsions15,29. To minimize cognate chain mispairing while maximizing library diversity, we performed lysis and capture in an initial emulsion but isolated transcript-laden beads for lysate-sensitive operations downstream. Bulk reverse transcription was performed in a high-salt buffer to shield electrostatic interactions between nucleic acids and minimize the stochastic transfer of mRNA across beads29 (Fig. 2A). For simple two-member libraries, as shown in Fig. S2, such approaches can mitigate transcript transfer across beads and maintain cognate pairings for 91% of the total library while also retaining the roughly 1:10 initial library distribution. In Fig. 2D, a similar approach is used to generate and interrogate libraries composed of three distinct hybridomas at initial ratios of 1:2:1. When bulk reverse transcription is performed in standard buffers, allowing for unencumbered movement of transcripts across beads, only 39% of Sanger sequenced clones in this three member library maintain cognate pairing as seen in Fig. 2E. 
Interestingly, among these appropriately paired sequences, the library composition skews to 1:3:1, suggesting an overrepresentation of the most dominant native pairing as well as some more promiscuous non-native pairings. However, in Fig. 2F, we see significant improvements on the near random pairing efficiencies observed under standard buffer conditions when reverse transcription is performed in a high salt buffer. Here, by contrast, 78% of clones retain native pairing within the final library and library compositions more closely mirror the original input at 1:2.5:1, suggesting that high salt buffers can not only improve pairing efficiencies but also help maintain diversity distributions within multimember libraries.
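The pairing percentages and composition ratios quoted above reduce to simple counting over the Sanger-sequenced clones. The following is a minimal Python sketch, assuming each clone has already been annotated with a heavy and a light chain identity. The hybridoma labels and the clone list are hypothetical placeholders and are not the study's data.

```python
from collections import Counter

def pairing_summary(clones, cognate):
    """Return (fraction of clones with cognate pairing, per-hybridoma counts).

    clones  : list of (vh_id, vl_id) tuples, one per sequenced clone
    cognate : dict mapping each vh_id to its known cognate vl_id
    """
    # Keep only clones whose light chain matches the heavy chain's known partner.
    correct = [vh for vh, vl in clones if cognate.get(vh) == vl]
    fraction = len(correct) / len(clones) if clones else 0.0
    # Composition of the correctly paired clones, e.g. to compare with the input ratio.
    return fraction, Counter(correct)

# Hypothetical three-member library (labels are illustrative only).
cognate = {"VH_A": "VL_A", "VH_B": "VL_B", "VH_C": "VL_C"}
clones = [
    ("VH_A", "VL_A"), ("VH_B", "VL_B"), ("VH_B", "VL_B"),
    ("VH_C", "VL_C"), ("VH_A", "VL_C"), ("VH_B", "VL_A"),
]
fraction, composition = pairing_summary(clones, cognate)
print(round(fraction, 2))   # 0.67
print(dict(composition))    # {'VH_A': 1, 'VH_B': 2, 'VH_C': 1}
```

Dividing each entry of the composition by the smallest count gives a ratio that can be compared directly with the input cell ratio.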

While the bead-to-bead transfer of transcripts during bulk incubations is one contributor to the observed chain mispairing in natively paired libraries generated via droplet microfluidics, other key contributors are droplet merging events that occur during amplification (Fig. 2A). Commercial non-ionic, tri-block copolymer fluorosurfactants, such as the PEG-PFPE2 surfactant (RAN Biotechnologies), are widely used to stabilize aqueous droplets within fluorinated oil30,31,32,33,34. While these commercial surfactants show remarkable stability at room temperature, synthesis limitations as well as structural features of PEG-PFPE2 have been previously shown to destabilize droplets that are subjected to large temperature changes35. During thermocycling, in particular, a large number of droplets can merge, effectively mixing and splicing distinct, partially amplified scFv cassettes. To minimize droplet merging events during OE-PCR, we utilized a previously described two-component fluorosurfactant36,37,38, schematized in Fig. 2G, capable of forming a highly thermostable “velcro”-like shell encasing droplets within an emulsion (Figs. S3A, 2I). In order to assess the integrity of “Velcro” droplets subjected to thermocycling, two emulsions—one containing a cDNA template as well as a primer-coupled EvaGreen probe that fluoresces in response to target gene amplification and a second emulsion lacking the template—were produced. When each emulsion is independently thermocycled, amplification as well as amplification-mediated dye activation is strictly observed within droplets containing both the probe and template (Fig. S4B, D). By contrast, those droplets lacking the template remain nonfluorescent (Fig. S3C). Likewise, no dye is observed in the oil-phase outside the droplet suggesting effective sequestration of aqueous components within the droplet and limited outward diffusion into the oil phase. 
When these same emulsions are mixed 1:1 and subsequently thermocycled for template-mediated dye activation as seen in Fig. 2H, droplets maintain their integrity, forming well-segregated emulsions containing equal numbers of fluorescent and nonfluorescent members. In aggregate, these data and previous work with this surfactant38 suggest that the “Velcro” surfactant supports the efficient in-drop amplification of gene templates. Perhaps more importantly, however, such surfactants effectively mitigate key contributors to chain mispairing (droplet destabilization, merging and passive diffusion of molecules across the oil–water interface) that occur during thermocycling when using standard surfactants (Fig. S3E–G) and reduce the diversity and integrity of natively paired antibody libraries.

Implementation of key process improvements facilitates native pairing in large, high diversity libraries

Having identified important technologies or process improvements to improve cognate chain pairing in smaller two- and three-member libraries, we were interested in understanding if these technologies could synergize to improve cognate chain pairing in larger, more diverse libraries. To assess this, all three process optimizations were implemented to create three natively paired immune repertoires as shown in Fig. 3A. Each of these repertoires was derived from independently immunized Trianni mice, whose IgG + B cells contribute the majority of the variable heavy and light chain diversity. However, the repertoires were also laced, at 5% frequency, with a Trianni-derived hybridoma of known sequence and the maintenance of this cognate pair was used to infer both native pairing efficiency as well as preservation of clonal diversities. Figure 3B shows representative confocal imaging following the initial encapsulation of the predominant IgG + B cells and lower frequency hybridomas. In spite of the differences in size and frequency of each cell type, single cell encapsulations were largely maintained and, in the presence of lysis buffer, both hybridomas and B cells were fully lysed, uniformly dispersing cytoplasmic tracking dyes throughout the droplet as seen in Fig. 3C.

Fig. 3

50–60% Preservation of cognate chain pairing in natively paired, multi-member libraries. (a) The workflow for demonstrating preservation of cognate chain pairing in multi-member libraries. Humanized mice are immunized with an antigen of interest and allowed to mount an immune response. Following immunization, (1) lymphoid organs are isolated and (2) subjected to a magnetic separation workflow to isolate IgG+ B-cells. IgG+ B-cells, alongside ~5% of a previously sequenced hybridoma, are then (3) coencapsulated with oligodT-functionalized, compressible acrylamide beads and cell lysis buffer in nanoliter-sized droplets. Following lysis, (4) the oligodT beads capture RNA transcripts (via the 3’ polyA-tail) and the RNA-laden beads are recovered for bulk reverse transcription in the presence of a 50 mM salt buffer to generate cDNA. (5) cDNA-decorated beads are then reencapsulated in nanoliter-sized droplets stabilized with a two-component, “Velcro” surfactant for ~30 rounds of overlap extension PCR (OE-PCR). These scFv inserts are then isolated from drops and TOPO-cloned to facilitate Sanger sequencing of ~600 clones per experiment. (b, c) Confocal imaging of individual hybridoma (red) or IgG+ B-cells (green) coencapsulated with oligodT-functionalized acrylamide beads (b) before and (c) after lysis. (d) Agarose gel electrophoresis of the linked scFv PCR product or a no-template control (‘water’) following 25 rounds of OE-PCR and 20 or 28 rounds of nested PCR. The ladder is a GeneRuler 1 kb Plus DNA Ladder (Invitrogen) with sizes of DNA fragments indicated on the left. (e) Funnel diagram of the total number of clones sequenced across experiments (n = 4), the percentage of screened sequences with at least one productive chain, the percentage of screened sequences with a productive VH:VL pairing as well as the percentage of retrieved and correctly paired hybridoma sequences.

Following recovery of transcript-laden oligodT gels, bulk RT under high salt conditions and reencapsulation of cDNA-decorated gels in ‘Velcro’ droplets, droplets were thermocycled to splice and amplify a library of scFv inserts. Figure 3D shows similar density bands at ~ 850 bp, representing the spliced scFv, following both a conventional 28 cycles or a minimal 20 cycles of OE-PCR aimed at reducing amplification-introduced biases in the repertoire. To better understand the native pairing efficiency and diversity within these scFv libraries, each amplified library was gel purified, TOPO cloned and screened by Sanger Sequencing. Of the 400–600 scFvs screened by sequencing from the three independently generated repertoires, 82% formed a productive heavy and light chain pairing, encompassing 100 unique germline pairings. Furthermore, 4–6% of the retrieved VHs from these productive pairings were hybridoma-derived, as schematized in Fig. 3E, suggesting an overall maintenance of clonal diversities—a feature that is mirrored in the large diversity of VH and VL germline pairings as well as the number of unique paratopes (i.e. 85–90% of productive pairings) seen in the accompanying B cell-derived repertoires (Fig. S5). To assess chain pairing efficiencies, the number of hybridoma-derived VHs paired with the appropriate cognate VL was also monitored for each library. It is important to note that across all three generated repertoires, a majority of the hybridoma-derived VHs (> 50%) were paired with the appropriate cognate VL. By contrast, we observed that repertoires generated in the absence of process enhancements, particularly the high-salt RT buffer, showed minimal (< 5–10%) maintenance of cognate pairing among spiked hybridoma-derived sequences (Fig. S4). This observation highlights the importance of process improvements in both chain capture and transcript amplification to support native pairing in low and high diversity multimember libraries.

Generation and characterization of optimized natively paired libraries against human antigens A and B

Given the ability to generate and validate majority natively paired repertoires through improvements in transcript capture and amplification within droplets, we utilized the same workstream, shown in Fig. 4A, to generate paired immune libraries against human Antigen A and B.

Fig. 4

Library generation and repertoire analysis via PacBio SMRT sequencing. (a) The workflow for generating natively paired or combinatorial immune libraries. For each library, three humanized mice are immunized with an antigen of interest (either human Antigen or Target A or human Antigen or Target B) and allowed to mount an immune response. Following immunization, (1) lymphoid organs are isolated and (2) subjected to a magnetic separation workflow to isolate IgG+ B-cells. IgG+ B-cells, alongside ~5% of a previously sequenced hybridoma, are then (3) coencapsulated with oligodT-functionalized, compressible acrylamide beads and cell lysis buffer in nanoliter-sized droplets. Following lysis, (4) the oligodT beads capture RNA transcripts (via the 3’ polyA-tail) and the RNA-laden beads are recovered for bulk reverse transcription in the presence of a 50 mM salt buffer to generate cDNA. (5) cDNA-decorated beads are then reencapsulated in nanoliter-sized droplets stabilized with a two-component, “Velcro” surfactant for ~30 rounds of overlap extension PCR (OE-PCR). These scFv inserts are then isolated from drops for use in sequencing workflows or phagemid subcloning and library expression. (b, c) Violin plots of the number of unique light chains paired with each unique heavy chain for paired and combinatorial control immune libraries targeting (b) human Target A and (c) human Target B. All sequencing data used for this analysis were derived from single-molecule, real-time (SMRT) sequencing. An unpaired t-test with Welch’s correction was conducted with ** and **** indicating a p < 0.01 and 0.001 respectively. (d, e) Heat maps of VH and VL germline gene usage for Sanger-sequenced clones from Target A (d) paired and (e) combinatorial control libraries.

The two antigens were chosen for proof-of-concept library generation given their critical roles in both innate and adaptive immunity and their promise as therapeutic targets for a wide range of biologics and cell therapies. To generate each antigen-specific paired library, between nine and twelve age-matched, humanized Trianni double homozygous (DHZ) mice were subjected to five rounds of intraperitoneal immunization with His-tagged antigen (Fig. S6A). Following immunizations, mice were bled and the presence of antigen-specific antibodies in the serum was assessed by ELISA. Figure S6B, C show the specific binding of antibodies from Target A- and Target B-immunized mice, respectively, across 10^6-fold serum dilutions. For the nine Target A-immunized mice, antigen-specific binding responses are observed for serum dilutions as low as 1:1.25 × 10^6 while the serum from the twelve Target B-immunized mice showed slightly weaker immune responses and concordant antigen-specific binding at dilutions approaching 1:160,000. Most importantly, all mice showed a significant specific immune response post-immunization compared to naïve serum samples collected from the same mice prior to immunization.

To produce paired immune libraries or corresponding randomly paired control libraries, three mice with similar, strong immune responses from each cohort were selected for B-cell isolation, VH-VL splicing and amplification. For each immune library generated, the age and weight of each mouse used, the number of recovered B cells utilized for library generation, and the amount of spliced VH-VL insert retrieved following PCR amplification are shown in Table 1. Figure 4B and C show the ratio of unique light chains per unique heavy chain (VL:VH ratio), as determined by long-read sequencing, for both paired and combinatorial, control libraries of spliced scFvs targeting Antigen A and Antigen B. For each target, paired libraries are defined by a lower VL:VH ratio, indicative of greater conservation in cognate pairing, than combinatorial libraries, where less diverse light chains freely and promiscuously pair with multiple heavy chains.
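The light-chain-per-heavy-chain metric described above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes each long read has already been reduced to a (heavy chain, light chain) identifier pair, and the toy reads below are hypothetical rather than taken from the sequencing data.

```python
from collections import defaultdict
from statistics import mean

def light_chains_per_heavy(pairs):
    """Count the distinct light chains observed with each distinct heavy chain.

    pairs : iterable of (vh_id, vl_id) tuples, one per sequenced scFv read
    """
    partners = defaultdict(set)
    for vh, vl in pairs:
        partners[vh].add(vl)   # repeated reads of the same pair count once
    return {vh: len(vls) for vh, vls in partners.items()}

# Hypothetical reads: in the paired set each heavy chain keeps one partner,
# in the combinatorial set heavy chains are shuffled across light chains.
paired = [("H1", "L1"), ("H1", "L1"), ("H2", "L2"), ("H3", "L1")]
combinatorial = [
    ("H1", "L1"), ("H1", "L2"), ("H1", "L3"),
    ("H2", "L1"), ("H2", "L2"), ("H3", "L3"),
]
paired_counts = light_chains_per_heavy(paired)
comb_counts = light_chains_per_heavy(combinatorial)
print(mean(paired_counts.values()))  # 1
print(mean(comb_counts.values()))    # 2
```

The per-heavy-chain counts, rather than their means alone, are what a violin plot and a two-sample test would be applied to.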

Table 1 Throughput and yield of immune libraries targeting antigen A and B.

At the germline level, the preservation of cognate pairings manifests as a strong preference for particular germlines or pairs (Figs. 4D, S7C) while combinatorial repertoires display a greater diversity of germlines and pairings (Figs. 4E, S7D). This conservation of cognate pairing, however, is less pronounced when the VH:VL ratios are analyzed (Fig. S7A, B), possibly a consequence of shared, common light chains that partially mask promiscuous pairings within combinatorial libraries.

Bioinformatics-led selection and characterization of library-specific clones underscores developability and functional outperformance of droplet-generated, paired repertoires

With paired and combinatorial scFv repertoires against two independent targets built and sequence characterized, we were now interested in understanding the performance of these libraries and their capacity to yield both highly functional and broadly developable antibodies. To assess library performance and identify library-specific clones for downstream characterization, two approaches, one experimental and one bioinformatics-driven, were utilized. For our experimental strategy, each paired or combinatorial repertoire was independently subcloned into a phagemid vector, expressed on the surface of M13 phage, subjected to two rounds of bead-based selection to enrich for target-specific clones and screened at random by phage ELISA (Fig. S8A). Figure S8B–E show the relative target-specific binding signal of 192 antibody-expressing phage selected at random from each of the four generated libraries. While hit rates vary across targets, paired libraries yielded two- to three-fold greater numbers of phage expressing unique, target-specific antibodies than analogous combinatorial repertoires.

To further mine each repertoire for target-specific antibodies, long-read sequencing for each paired and combinatorial library was analyzed using a suite of commercially available bioinformatic tools as outlined in Fig. 5A. In short, each deep-sequenced repertoire was uploaded to IgX, parsed via MiXCR to generate in silico chain paired reads and clustered via IgX Cluster to group sequence similar VH/VL pairs together based on CDR3 size and similarity. Figure 5B, C show the number of clusters and the average number of unique sequences within a cluster for each paired and combinatorial repertoire. Mirroring the germline analysis, paired repertoires tend to contain fewer clusters with a smaller number of unique VH/VL pairs—a feature that may be attributable to their more restricted diversity. Based on phylogenetic similarity to hits identified by phage ELISA or known Antigen A- or Antigen B-targeting benchmark antibodies from internal databases, a subset of clones within each repertoire were identified using IgX Branch.
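To make the idea of grouping VH/VL pairs by CDR3 size and similarity concrete, the sketch below uses a simple greedy scheme: a pair joins the first cluster whose representative has the same CDR3 lengths and at least a chosen fraction of identical positions in both chains. This is an assumption-laden illustration, not the algorithm implemented by IgX Cluster; the 0.8 identity threshold and the CDR3 sequences are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length, non-empty strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cluster_by_cdr3(pairs, min_identity=0.8):
    """Greedily cluster (hcdr3, lcdr3) amino-acid pairs.

    A pair joins the first cluster whose representative (its first member)
    has the same CDR3 lengths and >= min_identity in both chains;
    otherwise it starts a new cluster.
    """
    clusters = []
    for h, l in pairs:
        for members in clusters:
            rep_h, rep_l = members[0]
            same_size = len(rep_h) == len(h) and len(rep_l) == len(l)
            if (same_size and h and l
                    and identity(rep_h, h) >= min_identity
                    and identity(rep_l, l) >= min_identity):
                members.append((h, l))
                break
        else:
            clusters.append([(h, l)])  # no compatible cluster found
    return clusters

# Hypothetical CDR3 pairs (illustrative sequences only).
pairs = [
    ("ARDYYGSSYFDY", "QQSYSTPLT"),
    ("ARDYYGSSYFDV", "QQSYSTPLT"),
    ("ARGGWLRFDY", "QQYNSYPYT"),
    ("ARDYYGSAYFDY", "QQSYSTPWT"),
]
clusters = cluster_by_cdr3(pairs)
print([len(c) for c in clusters])  # [3, 1]
```

The number of clusters and the mean cluster size from such a grouping correspond to the two summary quantities plotted per library.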

Fig. 5

Bioinformatic analysis and selection of unique, library-specific clones for characterization using the IgX and SAbPred platforms. (a) The bioinformatics workflow for identifying unique clones from each library for IgG production and downstream characterization. In this scheme, each immune library was digested to generate DNA fragments containing the target scFv and submitted for long-read SMRT sequencing. Long-read outputs were then converted to a FASTQ format and, alongside FASTA-formatted Sanger data of previously screened hits, processed via MiXCR to generate paired reads. Using IgX Cluster and Branch, both Sanger and NGS data sets were clustered by sequence similarity and visualized as phylogenetic trees. Sequences from the NGS candidate pool were chosen based on similarity to well-characterized Sanger clones within the same cluster with validated function or binding. Chosen antibody sequences were then processed via IgX Track and promiscuous clones present in multiple libraries were removed to generate a final set of library-specific clones chosen for in silico mAb developability and canonical form prediction using TAP and SCALOP respectively. (b, c) Bar graphs showing (b) the number of clusters and (c) the average size of the clusters generated for each library-specific NGS data set using IgX Cluster. (d–g) V gene and CDR3 length for chosen sequences from (d) the Target A paired, (e) the Target A combinatorial, (f) the Target B paired and (g) the Target B combinatorial libraries visualized as bubble plots. (h–k) Heat maps showing the sequence-based canonical form predictions for five CDRs (CDRL1, L2, L3 and CDRH1 and H2) in the bioinformatically chosen paired (h, j) and combinatorial (i, k) repertoires targeting Antigen A (h, i) and Antigen B (j, k). Scale bars indicate the percentage of chosen antibodies with each combination of predicted canonical forms.

For each selected subset, redundant clones (i.e. clones with identical CDRs) that were either found in other repertoires or within the same repertoire were removed and library-specific, unique clones were analyzed via TAP and SCALOP, two algorithms for predicting antibody developability and canonical loop conformation of CDRs. In particular, each repertoire-specific subset was assessed for five in silico parameters measured by TAP that have been previously shown to correlate with antibody developability39.
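The redundancy filter described above can be sketched as follows. The sketch assumes that a clone is identified by the tuple of its six CDR sequences, that clones whose CDRs appear in more than one repertoire are dropped entirely, and that within a repertoire only the first copy of a duplicated clone is kept. The repertoire names, clone IDs and CDR strings are hypothetical.

```python
from collections import defaultdict

def library_specific_unique(repertoires):
    """Return, per repertoire, the IDs of clones that are unique and library-specific.

    repertoires : dict mapping repertoire name -> list of clones, where each
                  clone is a dict with an 'id' and a 'cdrs' tuple of CDR strings
    """
    # Record which repertoires each CDR set appears in.
    seen_in = defaultdict(set)
    for name, clones in repertoires.items():
        for clone in clones:
            seen_in[clone["cdrs"]].add(name)

    result = {}
    for name, clones in repertoires.items():
        kept, used = [], set()
        for clone in clones:
            key = clone["cdrs"]
            # Keep only CDR sets confined to this repertoire, first copy only.
            if len(seen_in[key]) == 1 and key not in used:
                kept.append(clone["id"])
                used.add(key)
        result[name] = kept
    return result

# Hypothetical CDR sets (H1, H2, H3, L1, L2, L3); sequences are illustrative.
x = ("GFTFS", "ISGSG", "ARDYY", "QSVSS", "GAS", "QQYGS")
y = ("GYTFT", "INPNS", "ARGGW", "QSISS", "AAS", "QQSYS")
z = ("GGSIS", "IYYSG", "ARVLR", "QGISN", "DAS", "QQYNS")
repertoires = {
    "A_paired": [{"id": "P1", "cdrs": x}, {"id": "P2", "cdrs": x}, {"id": "P3", "cdrs": y}],
    "A_combinatorial": [{"id": "C1", "cdrs": y}, {"id": "C2", "cdrs": z}],
}
unique = library_specific_unique(repertoires)
print(unique)  # {'A_paired': ['P1'], 'A_combinatorial': ['C2']}
```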

For each parameter or metric, the queried antibodies are compared to a set of 664 post Phase-I therapeutic Fv domains and antibodies with properties outside the metric distributions of Phase I domains were flagged. In Table 2, we report the percentage of such unique, flagged antibodies within each repertoire. While both repertoire subsets contained antibodies with dissimilar properties to Phase I domains, combinatorial libraries contained a larger percentage of such flagged antibodies with anomalous patches of positive or negative charges that have been implicated in higher rates of in vivo clearance and poor expression levels40.
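The flagging step can be illustrated with a short Python sketch. It assumes that an antibody is flagged for a metric when its value lies outside the minimum-to-maximum range observed in the reference set; the actual thresholds applied by TAP may differ. The metric names and all numeric values below are hypothetical placeholders.

```python
def percent_flagged(candidates, reference):
    """Percentage of candidate antibodies flagged for each metric.

    candidates : list of dicts mapping metric name -> value, one per antibody
    reference  : dict mapping metric name -> list of values from the reference set
    An antibody is flagged when its value falls outside the reference range
    (an assumption made for this sketch).
    """
    percentages = {}
    for metric, values in reference.items():
        low, high = min(values), max(values)
        flagged = sum(1 for ab in candidates if not low <= ab[metric] <= high)
        percentages[metric] = 100.0 * flagged / len(candidates)
    return percentages

# Hypothetical reference distributions and candidate scores.
reference = {
    "positive_patch": [0.0, 0.5, 1.2],
    "negative_patch": [0.0, 0.8, 1.5],
}
candidates = [
    {"positive_patch": 0.3, "negative_patch": 0.2},
    {"positive_patch": 2.0, "negative_patch": 1.0},
    {"positive_patch": 0.9, "negative_patch": 1.9},
    {"positive_patch": 1.1, "negative_patch": 0.1},
]
flags = percent_flagged(candidates, reference)
print(flags)  # {'positive_patch': 25.0, 'negative_patch': 25.0}
```

Running the same function on each repertoire subset yields one row of percentages per library, the form in which such results are typically tabulated.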

Table 2 Percentage of flagged antibodies for each therapeutic antibody profiling (TAP) metric.

In order to better understand the sequence and structural diversity of each repertoire, selected paired and combinatorial clones were analyzed for heavy chain V gene distributions and HCDR3 length as schematized in Fig. 5D–G as well as the canonical form of the H1, H2 and L1-3 CDRs as summarized in Fig. 5H–K. Canonical form prediction, an in silico assessment of CDR conformation that is performed by SCALOP, roughly clusters antibodies by their CDR loop structure into canonical, structurally similar bins41. Although all four repertoires make use of similar V gene families and canonical loop forms, paired repertoires broadly utilize a more limited set of canonical loop structures (Fig. 5H, J) arising from a more restricted set of V genes (Fig. 5D, F).

Ultimately, however, understanding the benefits of native pairing on an antibody’s performance requires experimental validation. To determine how sequence and structural diversity as well as the predicted in silico developability affect the expression and target-specific binding of the antibodies, the variable regions of each bioinformatically selected cognate paired or combinatorial antibody were expressed at high throughput and small scale using a previously described plasmid-free, linear expression cassette system42 (Fig. S9A). In Fig. S9B–E, the antibody titer is plotted as a function of target-specific binding for each bioinformatically selected, crude antibody from the Antigen A-targeting as well as the Antigen B-targeting paired and combinatorial repertoires. Though cognate paired and combinatorial antibodies do not show significant absolute differences in expression or binding signal, a larger percentage of cognate paired antibodies show moderate to high expression (i.e. yields > 25 µg/mL) as well as fivefold or higher target-specific signal by ELISA. We were, therefore, interested in understanding how these high-expression, target-specific antibodies, labeled in yellow in Fig. S9B–E, would perform in terms of target-specific binding, function and developability when produced and purified at a larger scale.
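The two shortlisting thresholds quoted above (yield above 25 µg/mL and at least fivefold target-specific ELISA signal) can be expressed as a simple filter. The sketch below is illustrative only: the record fields `titer_ug_per_ml` and `fold_signal` are hypothetical names, and the fold signal is assumed to be measured relative to a background or negative control.

```python
def shortlist(antibodies, min_titer=25.0, min_fold=5.0):
    """Names of antibodies with titer > min_titer and fold signal >= min_fold."""
    return [
        ab["name"]
        for ab in antibodies
        # Strictly greater than the titer cut-off, fivefold or higher on signal.
        if ab["titer_ug_per_ml"] > min_titer and ab["fold_signal"] >= min_fold
    ]
```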

Figure 6A, B show the normalized binding signal by ELISA of the experimentally and bioinformatically selected cognate chain paired and combinatorial IgGs alongside relevant benchmarks from the Antigen B- and Antigen A-targeting repertoires. While three cognate chain paired antibodies showed significant specific binding relative to an irrelevantly targeted negative control, only one of the two combinatorial library-derived Antigen A-targeting antibodies showed significant target-specific binding on par with a second generation monoclonal antibody and clinical benchmark. Notably, as the singular Antigen B-targeting antibody from the combinatorial repertoire showed poor binding to Target B, this cohort of antibodies was not progressed for functional screening. To understand the differential capacity of cognate paired and combinatorial antibodies to elicit downstream antagonist activity, the functional activity of combinatorially and natively paired Antigen A-targeting antibodies was assessed in a commercial, in vitro NFκB/Luciferase-based U2OS reporter assay. Figure 6C, D show the NFκB-mediated activity of the two combinatorial antibodies and the ten cognate paired antibodies targeting Antigen A across an 18-point dose range. Although a single antibody from each repertoire, delineated in green, showed strong antagonist functionality on par with or better than the benchmark, four additional cognate-paired antibodies also elicited mild antagonism. The natively paired antagonist antibody (i.e. antibody #6) was approximately tenfold more potent than the analogous antagonist antibody from the combinatorial repertoire (i.e. antibody #1) and showed improved potency relative to the benchmark, although it is important to note that trends with single antibodies cannot be used to draw more universal conclusions about the role of cognate pairing in enhancing potency. 
Notably, while antibody #6 shows relatively weak binding to Target A (~ fivefold higher than the negative control), it does show strong functional antagonism—a feature that supports previous published work suggesting that reducing antibody affinity can boost its function in specific, immunomodulatory contexts43.

Fig. 6

Binding, function and developability assessments on selected, purified antibodies from paired and combinatorial libraries. (a, b) Single-point, ELISA-based binding for paired and combinatorial library-derived antibodies targeting (a) human Antigen A and (b) human Antigen B. An ANOVA with Dunnett’s multiple comparisons test was performed to assess significance in normalized binding signal between each experimental or positive control antibody and the negative control, with *** representing p < 0.001 and **** representing p < 0.0001. (c, d) An 18-point dose response curve assessing the functional activity of Antigen A-targeting antibodies derived from either the (c) combinatorial or (d) paired libraries alongside the benchmark in an in vitro NFkB-Luc2P/U2OS reporter assay. Antibodies showing strong agonist function on par with or better than the benchmark are marked in green, those with mild agonist function are marked in yellow and those with no agonist function are shown in red. (e, f) Heat maps showing five developability parameters for (e) human Antigen A- and (f) human Antigen B-targeting antibodies: purity by analytical SEC (aSEC), molecular integrity by analytical HIC (aHIC), polydispersity as measured by A280-DLS, melting temperature (Tm), and purity by capillary gel electrophoresis (cGE). All values are normalized to the highest value (100%).

While several groups have demonstrated the superiority of natively paired libraries in generating high affinity and high potency antibodies16,17,18,19, the role that maintaining cognate chain pairing plays in an antibody’s developability profile has been largely unexplored. Previously, Adler et al. speculated that antibodies arising from combinatorial repertoires, by virtue of their random VH-VL pairings, could be inherently less stable than natively paired antibodies that benefit from in vivo selection for stable chains and pairings19. To understand if this is the case, we assessed each experimentally shortlisted antibody from Antigen A- and Antigen B-targeting repertoires for key developability measures: purity by analytical SEC, molecular integrity and exposed hydrophobic patches by analytical HIC, fragmentation by capillary gel electrophoresis (cGE), aggregation propensity by DLS, and thermal stability via DSF. Figure 6E, F show heatmap representations of each developability parameter, normalized to the highest value, for Antigen A- and Antigen B-targeting antibodies, respectively. Globally, regardless of target specificity, the few antibodies validated from combinatorial libraries show lower purities, higher polydispersities and lower melting temperatures (Tm) than the subset of validated antibodies from paired libraries, hinting at potentially poorer overall developability profiles for antibodies with random, non-cognate chain pairings. Interestingly, while these combinatorial antibodies showed poorer purities as well as thermal and colloidal stability, antibodies from both the combinatorial and paired repertoires showed poor molecular integrity by HIC. This latter observation suggests that native pairing, while stabilizing the antibody structure and functional paratopes, does not necessarily reduce its propensity to partially unfold and expose hydrophobic residues that are traditionally buried in its native state. 
From this standpoint, while native pairing does not always enhance molecular integrity, its positive correlation with other aspects of developability and potency suggests that the maintenance of cognate pairing and the in vivo selection underlying it can play a critical role in guiding favorable antibody phenotypes.

Discussion

Adequately interrogating antigen-specific antibody repertoires remains a crucial requirement for developing effective biologics and cell therapies that rely on an antibody fragment to guide specificity4. Traditionally, repertoire screening has been constrained by low-throughput methods that capture a small fraction of the BCRs found in vivo5,6, or by high-throughput sequencing strategies which decouple and recombine antibody chains, losing essential pairing optimizations occurring within the B cell germinal center7,8. More recently, numerous emulsion-free approaches for generating high fidelity natively paired libraries, some with pairing efficiencies exceeding 50–60%, have been reported and successfully leveraged to identify, characterize and develop target-specific antibodies and antibody subsets. Alongside developments in emulsion-free approaches, recent advancements in droplet microfluidics promise an alternative solution by preserving native chain pairing while enabling comprehensive exploration of in vivo antibody repertoires14,15. With repertoires derived from an antigen-challenged animal, these natively paired immune libraries rely on much of the same high affinity, ‘natural’ diversity found in in vivo discovery platforms. However, since this repertoire is displayed on a chassis (either yeast, phage or a mammalian system), such libraries can also leverage the throughput and epitope targeting capabilities unavailable in the ‘black box’ paradigm of in vivo antibody discovery44. Previous work by Rajan et al.16 and others17,18,19 has shown some evidence that natively paired libraries can yield drug leads with higher specificity and sensitivity than synthetic or even randomly paired immune libraries. Nevertheless, despite progress, and partly owing to the limited characterization of microfluidics-generated antibody repertoires, investigations of the function and developability of antibodies with native and non-native chain pairings remain scarce11.

In this work, we provide one of the first systematic efforts to perform in-depth characterization of antibodies derived from microfluidics-generated repertoires and develop process optimizations to enhance native chain pairings within these libraries. The previously developed technologies applied here (compressible acrylamide beads for comprehensive VJ gene capture as well as high salt buffers and a stabilizing surfactant to minimize chain mispairing28,35) provide some advantages over current, state-of-the-art microfluidics-based technologies for natively paired library generation. First, the use of compressible acrylamide beads allows for the complete and efficient capture of VJ gene repertoires from single B-cells27,45. Previous work17,18,19 aimed at generating natively paired libraries has captured RNA from individual B cells via commercially available micron-sized beads. The use of 30–40 beads per cell for the capture of RNA from a single B cell has obvious disadvantages, namely the requirement of generating an order of magnitude more droplets during the downstream amplification than the effective library size, a feature that limits the scalability of microfluidics-based library generation for large, high diversity repertoires. By contrast, each compressible bead used here is functionalized with 10⁹ polyT oligos27 to capture all the mRNA molecules from a given cell (~ 10⁶), ensuring that even highly diverse B cell repertoires with 10 million or more members can be comprehensively captured in a single experiment.
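The scaling argument in this paragraph can be made concrete with a short back-of-the-envelope calculation. The sketch below reads the bead loading as 10⁹ oligos per bead and the mRNA content as roughly 10⁶ molecules per cell, and it assumes one droplet per bead during downstream amplification; the function name `capture_summary` and its parameters are hypothetical.

```python
def capture_summary(n_cells, oligos_per_bead=1e9, mrna_per_cell=1e6,
                    small_beads_per_cell=(30, 40)):
    """Compare single-bead and multi-bead capture for a library of n_cells.

    Returns (oligo excess per cell, droplets needed with one bead per cell,
    droplets needed with 30 to 40 small beads per cell).
    Assumption: each bead occupies its own droplet during amplification.
    """
    excess = oligos_per_bead / mrna_per_cell
    droplets_single = n_cells
    droplets_multi = tuple(n_cells * b for b in small_beads_per_cell)
    return excess, droplets_single, droplets_multi


# Example: a repertoire of 10 million B cells.
excess, single, multi = capture_summary(10_000_000)
print(excess, single, multi)
```

Under these assumptions a single compressible bead carries about a thousandfold excess of capture oligos over the mRNA in one cell, while the multi-bead approach needs 30 to 40 times as many droplets as there are library members.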

Alongside chain capture fidelity, the maintenance of already captured chain pairings remains an important consideration for generating libraries that faithfully replicate in vivo repertoires. Adler et al. had shown that while microfluidics-derived ‘natively-paired’ libraries do have a higher percentage of target-specific antibodies than randomly paired counterparts, identical antibodies from both repertoires could be identified through deep sequencing19. In fact, our own analysis of long-read sequencing from target-specific natively paired and combinatorial libraries supports this observation and suggests that a sizable fraction of a “natively-paired” library is, in fact, mispairings between promiscuous antibody chains. While eliminating such library species is not always possible, reducing the frequency of mispaired species within the library is critical to ensuring its phenotypic fidelity. In this work, both the two-part ‘Velcro’-like surfactant as well as the high salt buffers counteract events in the library generation process that usually give rise to mispaired species, namely droplet coalescence and stochastic movement of mRNA across beads during droplet transfers. Utilizing these two technologies in combination allowed us to construct libraries with ~ 50–70% native pairing as opposed to a minority of natively paired species using traditional workflows. Although such ‘native paired’ repertoires still contain significant mispairing, the supplementation of bioinformatics workflows such as IgX Track, for comparing sequencing data sets and eliminating shared members of discrete repertoires, as well as experimental workflows that pre-enrich the starting B cell population for target antigen specificity44 can mitigate some of the mispairing bias that arises from library engineering.

Ultimately, the identification of true cognate- and randomly-paired antibodies from well-characterized libraries aims to better understand the functional and biophysical profiles of each antibody subclass. Several groups have previously speculated that natively paired antibodies may be more “developable” than randomly paired antibodies16,19, due to the inherent stability of certain preferential chain pairings19,46 as well as paratope-distant, stabilizing mutations that arise during in vivo evolution47. For the triaged subset of target-specific binders investigated in this study, those with native pairings showed improved in silico and in vitro developability relative to their randomly paired counterparts, particularly for metrics implicated in the molecule’s in vivo clearance23,39. While our in vitro developability assessments were performed on a small cohort of target-specific, high-expression antibodies from each repertoire and require validation in a larger, more unbiased subset to definitively attribute native pairing to gains in developability, the bioinformatic workflows employed, particularly our clustering-based clonal selection strategies, suggest that the selected antibodies should be representative in sequence and function of the larger target-specific diversity. In light of this, future work should look at the in vivo pharmacokinetics as well as the anti-drug antibody responses of natively paired antibodies, particularly as they compare to antibodies from randomly paired or synthetic repertoires. Interestingly, one of the developability parameters where some of the investigated natively paired antibodies underperform their combinatorial counterparts is in surface hydrophobicity. Both by in silico measures and HIC-HPLC, among the investigated antibodies from natively paired and combinatorial repertoires, an equivalent percentage displayed predicted or validated surface hydrophobicity. 
While this suggests that native pairing may not reduce an antibody’s tendency to unfold and expose buried hydrophobic patches, for the subset of antibodies investigated, the impact of such unfolding events on the potency and thermal or colloidal stability appears minimal. However, HIC and, more broadly, surface hydrophobicity may be an important predictor of another related liability: self-interactions via exposed hydrophobic patches. While natively paired antibodies are, in theory, subjected to in vivo selection against self-reactivity48,49, future studies may be required to understand if native pairing plays any role in reducing self-association.

In this study, we developed technologies to enhance native pairing efficiency in microfluidics-derived libraries and utilized these technologies in combination with bioinformatic workflows to identify antibody subsets with favorable affinity, potency and in vitro developability. Although the development of these identified antibodies would require a larger panel of clinical and nonclinical assessments, the work presented here points to the potential utility of native pairing in identifying high potency drug leads with biophysical properties that are amenable to clinical translation. While it is clear that natively paired libraries can be instrumental in identifying more drug-like antibody candidates, it may be important to incorporate complementary technologies to the library and panning strategy to synergize with and accelerate the pace of discovery. In combination with natively paired libraries, technologies for the bulk purification of antigen-specific B cells for library generation44 and in-droplet functional screening31,50,51 could streamline antibody identification and validation, providing rich functional insights into antibody design and accelerating the discovery process for more effective therapeutics.

Materials and methods

Production and characterization of target A and target B antigens

Antigen production

Plasmids encoding genes of interest for His6-tagged Antigen A and Antigen B were chemically transfected into mammalian cells for expression. For Antigen A Avi-His and all DTA-His tagged constructs, ThermoFisher ExpiFectamine™ 293 Transfection Kit (Catalog Number: A14525) was used following the manufacturer’s protocol. The pTT5 expression plasmids were transfected into Expi293 cells at a seeding density of 2.8 × 10⁶ cells/mL and cultured in Expi293 Expression Media (ThermoFisher Catalog Number: A14635) at 37 °C, 140 rpm, and 8% CO2. At 24 h post-transfection, ExpiFectamine 293 enhancers 1 and 2 were added and cultures maintained using the incubation conditions described above.

For Antigen B-Avi-His, Expi293F cells conditioned in BalanCD HEK293 Media (VWR Catalog Number: 91165-1L) were seeded at 2.9 × 10⁶ cells/mL and chemically transfected with a 3:10 ratio of polyethylenimine (PEI Max, PolySciences #24765-1) to plasmid DNA. EBNA plasmid was co-transfected with the pTT5-Target B expression plasmid and media was supplemented with 4 mM L-Glutamine. For Target B samples that were biotinylated in vivo, a BirA expressing plasmid was co-transfected and media was supplemented with sterile filtered biotin that was resuspended in OptiMEM media (Gibco, ThermoFisher Catalog Number: 31985070). The transfected cultures were maintained at 37 °C, 140 rpm, and 8% CO2 and, 24 h post-transfection, cells were fed with the BalanCD HEK293 feed media.

Purification of target A and target B antigens

At day 5 (Antigen A-Avi-His, Antigen A-DTA-His, Antigen B-DTA-His) or day 8 (Antigen B-Avi-His), cells were harvested by microfiltration using Sartorius Sartoclear Dynamics® diatomaceous earth included in the Lab V filtration kit (Fisher Catalog Number: 14-578-809) and supernatants were collected and sterile filtered by vacuum through the included 0.22 µm PES membrane.

For all constructs, filtered supernatants were loaded onto a cleaned and equilibrated 5 mL Roche cOmplete His column (Millipore Sigma Catalog Number: 6781535001) for immobilized metal affinity capture using a benchtop Cole Parmer Masterflex® LS peristaltic pump (Catalog Number: EW-07522-20). After loading, the column was washed with 6–10 column volumes (CV) of Phosphate Buffered Saline (PBS, Gibco, Fisher Catalog Number: 10-010-031). For elution from the His6-capture column, Antigen A samples were isocratically eluted with 6 CV of Gibco PBS supplemented with 300 mM Imidazole and Antigen B samples were isocratically eluted with 6 CV of 200 mM Sodium Phosphate, 200 mM Sodium Chloride, 250 mM Imidazole. Eluates were subsequently concentrated using Millipore Amicon® Ultra-15 centrifugal filter units (Catalog Number: UFC901024 for 10 kDa cutoff or UFC900324 for 3 kDa cutoff) at 3500 rpm to a final volume of 4–5 mL. On a Cytiva AKTA Pure 25, proteins were injected on a cleaned and equilibrated Cytiva HiLoad Superdex 200 26/600 (Antigen A, Catalog Number: 28989336) or HiLoad Superdex 200 16/600 (Antigen B, Catalog Number: 28989335) using the Gibco PBS pH 7.4 as the purification buffer. Fractions containing the protein of interest were pooled and concentrated by ultrafiltration. Protein concentrations were measured using a nanodrop with UV absorbance at 280 nm. Concentrations were calculated using computationally derived molar extinction coefficients for the respective construct.
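The A280-based concentration calculation mentioned at the end of this paragraph follows the Beer–Lambert law, c = A / (ε · l). The sketch below is a generic illustration rather than the authors' procedure; the function name, the 1 cm path length default and the example values are assumptions, and the extinction coefficient and molecular weight would come from the sequence of each construct.

```python
def protein_concentration(a280, ext_coeff_m1_cm1, mol_weight_da, path_cm=1.0):
    """Convert an A280 reading into molar and mass concentration.

    a280: absorbance at 280 nm (normalized to path_cm)
    ext_coeff_m1_cm1: molar extinction coefficient in M^-1 cm^-1
    mol_weight_da: molecular weight in Da (g/mol)
    Returns (mol/L, mg/mL); g/L and mg/mL are numerically identical.
    """
    molar = a280 / (ext_coeff_m1_cm1 * path_cm)
    return molar, molar * mol_weight_da


# Hypothetical example values, not taken from the study.
molar, mg_per_ml = protein_concentration(1.0, 50_000, 50_000)
print(f"{molar:.2e} M, {mg_per_ml:.2f} mg/mL")
```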

Biotinylation of antigen A-avi-his

The biotinylated Target A sample had the His6-tag cleaved at 4 °C overnight using Eton Bio TurboTEV protease (Catalog Number: 1500020012) prior to SEC. The TEV protease and His6-tag were removed by passing the mixture over the 5 mL Roche cOmplete His column. The flow through was collected, concentrated, and further purified by SEC. Tag-cleaved, SEC-purified Antigen A was chemically biotinylated following purification using the manufacturer’s protocol in the ThermoFisher EZ-Link™ NHS-PEG4-linked Biotinylation kit (Catalog Number: 21455). The protein was buffer exchanged using the Cytiva PD-10 desalting column (Catalog Number: 17085101) into PBS.

Characterization of antigen A and antigen B antigens

The identity and purity of all proteins were confirmed using LC/MS, antibody binding by biolayer interferometry (Octet Red96) or Western blotting, SDS-PAGE under reducing and non-reducing conditions, analytical size exclusion chromatography, and LAL endotoxin measurements following previously described methodology52. The degree of biotinylation was calculated using reduced, PNGase F (NEB Catalog Number: P0704L) treated sample for biotinylated Antigen B on LC/MS or Vector Laboratories QuantTag™ Biotin Quantification Assay (Catalog Number: BDK-2000) for biotinylated Antigen A. Antigen B was confirmed to have 0.897 biotins per molecule while Antigen A had 4.32 biotins per molecule.

Immunization of Trianni mice and generation of antigen-specific B-cell repertoires

Antigen A-DTA-his and Antigen B-DTA-his were prepared in phosphate-buffered saline (PBS) at 1 mg/mL. The Sigma adjuvant (Sigma Adjuvant S6322) was prepared according to manufacturer’s instructions. For immunizations, 50 µg of prepared antigen was combined 1:1 (v/v) with adjuvant and administered by intraperitoneal injection (i.p.) using a 1 mL syringe with a 27-gauge needle to eight-week-old Trianni HHKK mice (Trianni, San Francisco)53. For Antigen A-immunized mice, five immunizations were delivered to the mice with immunizations occurring every 10 days. Antigen B-immunized mice received 7 immunizations in total with immunizations occurring every 10 days. For both sets of mice, final boosts were administered i.p. without adjuvant, three days before the harvest of lymphoid organs.

Serum titer determination

Serum bleeds were performed retro-orbitally after the 4th and 5th immunization. Serum was tested for the presence of an antigen-specific immune response using antigen-specific enzyme-linked immunosorbent assay (ELISA). 1 µg/mL of biotinylated antigen was coated on 96-well streptavidin-coated plates (Pierce 15500) and incubated overnight at 4 °C. Plates were washed thrice and subsequently blocked with PBS/BSA (1% w/v) for 1 h at 37 °C. Blocking solution was removed and serially diluted serum samples in PBS/BSA (1% w/v) were added to wells and incubated for 1 h at 37 °C. After incubation, the plates were washed with PBS with 0.05% Tween-20 (v/v) (PBST) and anti-mouse IgG horseradish peroxidase (HRP) antibody (Jackson Immuno-Research 715-035-150), diluted at 1:2000 in PBS/BSA (1% w/v), was added to each of the wells. Following incubation for 1 h at 37 °C, plates were washed with PBST and 3,3',5,5'-Tetramethylbenzidine (TMB Sera care TMB-5120–0047) was added to all wells and incubated for 5 min at room temperature (RT). The reaction was stopped with a solution of 0.16 M sulfuric acid (Thermo stop solution N600) and read at 450 nm on a PerkinElmer EnVision Multimode Plate Reader.

B cell isolation

Spleens were collected in PBS supplemented with 5% FBS (v/v) (PBS-FBS) and processed the same day. To process the spleens, tissue was placed in a 35-mm petri dish in 10 mL PBS-FBS and ground between frosted glass slides (Thermo 2955) before filtration to generate a single cell suspension. The splenocyte suspension was then washed in PBS-FBS before incubation at 37 °C for 5 min in 1X red blood cell lysis buffer (Miltenyi Biotec, Cat. No. 130-094-183) to remove contaminating erythrocytes. Cells were then washed twice in PBS-FBS, counted, resuspended in Easy Sep buffer (Stem Cell Technologies 20144) and transferred to a 12 × 75 mm polystyrene tube. From this single cell suspension, IgG+ B cells were isolated using the mouse Pan B-cell isolation kit (StemCell Technologies 19844) as previously described and according to the manufacturer’s protocol.

scFv library generation using droplet microfluidics

For all microfluidic operations, aqueous and continuous phase solutions were loaded into syringes and connected to the microfluidic device using blunt needles (Becton-Dickinson, Franklin Lakes, NJ, USA, PrecisionGlide, cat. no. D929_SS27 × 0.5) and PE-2 tubing (Scientific Commodities, Lake Havasu City, AZ USA, cat. no. BB31695). Syringe pumps (Harvard Apparatus, Holliston, MA, USA, 2000/2200, cat. no. 702001) were used to flow the aqueous and continuous phases. According to the volume required, 1, 3, or 5 mL Becton-Dickinson Disposable Syringes with Luer-Lok™ Tips (ThermoFisher Scientific, Waltham, MA USA 02451; parts 14-823-30, 14-823-435, and 14-829-45, respectively) were used and droplet formation was monitored by a high-speed camera (FASTEC Imaging, San Diego, CA, USA, HiSpec1 4G Mono) mounted on a bright-field inverted microscope (Nikon, 803,743, Melville, NY, USA).

Microfluidic device generation

Microfluidic devices were fabricated in PDMS (polydimethyl siloxane) using previously established soft-lithography methods54. Schematics of the devices used to (1) generate acrylamide capture beads and (2) support bead-based capture of mRNA to produce linked scFvs can be found in the supplementary materials (Fig. S10A–D).

“Velcro” fluorosurfactant synthesis

“Velcro” fluorosurfactant was synthesized as previously described36,37,38. Briefly, a mixture of 0.0301 g of azobisisobutyronitrile (AIBN; Merck Group, Darmstadt, Germany, cat. no. 441090), 0.2 g of N-[3-(4,4,5,5-tetramethyl-1,3,2-dioxaborolan-2-yl)phenyl] acrylamide (Tokyo Chemical Industry, Portland, OR, USA, cat. no. T3826) and 0.927 mL of 1H,1H,2H,2H-heptadecafluorodecyl acrylate (Merck Group, cat. no. 474487) in 3.44 mL of dimethylformamide (DMF; Merck Group, cat. no. D4551) was de-aerated by bubbling with nitrogen. The mixture was then stirred under inert conditions at 70 °C for 48 h. The solution typically turns turbid within an hour. The product was then washed three times in an excess of methanol (Merck Group, cat. no. 1424109). Following each wash, the product was centrifuged and the supernatant was decanted. After the final decanting step, the polymer was dried in a vacuum oven (Cole-Parmer OVV-400–24-120, UX-52411–30, Vernon Hills, IL, USA) at ~ 70 °C overnight. Once the polymer had completely dried, it was added to 15 mL of a ~ 1 M solution of hydrochloric acid (Merck Group, cat. no. H9892) and stirred overnight at 100 °C on a magnetic stir/hot plate under reflux in order to remove the pinacol protecting group of the boronic acid component of the polymer. Following deprotection, the fluoropolymer was washed once more in an excess of methanol (Merck Group, cat. no. 1424109) before being dried overnight in a vacuum oven (Cole-Parmer) at ~ 70 °C. The fluoropolymer was then mixed with HFE7500 (3M™ Novec™ 7500 Engineering Fluid; 3M Company, Chelmsford, MA, cat. no. 7100025016) (2 wt%) and stirred at 60 °C overnight on a magnetic stir/hot plate to disperse the polymer. The dispersion was then filtered with a 1–2 µm filter (Merck Group, cat. no. WHA10462260) and the filtered polymer solution was used for droplet generation as described below.

mRNA capture gel fabrication

Oligo-dT conjugated acrylamide hydrogel beads, the mRNA capture gels, were prepared according to a previously established protocol45. Unlike previous work, these gels were ~ 25 µm in diameter and conjugated with /5Acryd//iSp18/TTT TTT TTT TTT TTT TTT TTV N (IDT, Coralville, Iowa, with Standard Desalting purification). Acrylamide gels were generated by flowing an aqueous phase, containing 4X acrylamide/bis-acrylamide, 1X TBSET, 10% (w/v) APS and 100 µM of the acrydite-modified capture oligodT, through a continuous phase of HFE-7500 containing 2% w/v fluorosurfactant (Ran Biotechnologies, Beverly, MA, cat#: 008-FluoroSurfactant-2wtH-50G 452045) and 0.6% (v/v) TEMED. The aqueous and continuous phases were flowed at 300 µL/hr and 1000 µL/hr, respectively, and droplets were collected under a layer of mineral oil. Following overnight incubation at 65 °C to allow gel-bead polymerization, gels were carefully transferred to an aqueous buffer and washed with TBSET. All droplets were then merged by adding HFE7500 containing 20% (v/v) 1H,1H,2H,2H-perfluoro-1-octanol (PFO; Sigma-Aldrich, cat. no. 370533) and gently inverted several times and centrifuged at 1000 g to separate the gels from the residual continuous phase. To remove residual HFE, gels were again transferred to an aqueous buffer and washed several times with TBSET. A disposable hemocytometer (Bulldog Bio, Portsmouth, NH, part DHC-N420) and phase contrast microscopy were used to count the gels. Gels that have settled by gravity are typically at a concentration of roughly 25–40 × 10⁶/mL.

Cell encapsulation and bead-capture of mRNA

As shown in Fig. S10C, to generate droplets of ~ 0.5 nL, cell encapsulation devices containing three aqueous phase inlets (for capture gels, 2 × lysis buffer, and cells) and a single inlet for the continuous phase of oil and surfactant were used. In order to maximize pre- and post-encapsulation viability, syringes for the inlets were prepared in the following order: (1) continuous phase, (2) 2 × lysis buffer (spacer), (3) capture gels and (4) cells. For the continuous phase, 4 mL of HFE7500 containing 2% (w/v) surfactant was mounted onto a syringe pump and set to flow at 2500 µL/hr. The spacer syringe, containing 1 mL of in-house prepared 2 × lysis buffer (see Supplemental Materials for preparation details) supplemented with 2% (v/v) RNasin® Plus RNAse inhibitor (Promega, WI, Cat. No. N2611), was flowed at 350 µL/hr.

To prepare the capture gel syringe, the desired number of gels, roughly 10 times the number of B-cells to be captured, were pelleted at 5000 g for 30 s. The TBSET supernatant was aspirated and the pelleted gels were washed in a 3 × lysis buffer (1 M NaCl, 2 mM EDTA, 20 mM Tris pH 7.5 supplemented with 0.04% (v/v) SDS, 0.05% (v/v) Tween-20). Two additional washes, one in a 2 × lysis buffer (1 M NaCl, 2 mM EDTA, 20 mM Tris pH 7.5 supplemented with 0.04% (v/v) SDS and 0.05% (v/v) Tween-20) and another in 2 × lysis buffer supplemented with 20% (v/v) RNAse inhibitor, were performed to buffer match the gels and the spacer. Following the final wash, the excess 2 × lysis buffer was removed and the pelleted gels were collected into a length (~ 2 m) of PE2 tubing, ensuring that the gels were close-packed in the tubing with no visible air bubbles. Gel-loaded tubing was then connected to a syringe preloaded with HFE7500, mounted on the syringe pump and set to flow at 250 µL/hr. For the cell syringe, IgG+ B-cells were counted, passed through a 40 µm cell strainer and resuspended in filtered PBS containing 5 mM EDTA and 16% (v/v) Optiprep at a concentration of ~ 400,000/mL (0.4 cells/nL) to ensure a single cell is present in every 10 droplets following droplet formation. Cells were loaded into a syringe immediately prior to start of encapsulation to prevent sedimentation and set to flow at 600 µL/hr.
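The stated loading of roughly one cell per 10 droplets can be checked against the quoted flow rates with a short Poisson calculation. The sketch below assumes that the three aqueous streams (spacer at 350 µL/hr, gels at 250 µL/hr, cells at 600 µL/hr) contribute to each ~0.5 nL droplet in proportion to their flow rates and that cells are distributed randomly; the function name `cell_loading` is hypothetical.

```python
import math


def cell_loading(cells_per_nl, droplet_nl, cell_flow, other_aqueous_flows):
    """Poisson estimate of droplet occupancy.

    Assumption: the cell stream makes up cell_flow / (total aqueous flow)
    of each droplet's volume.
    Returns (mean cells per droplet, P(0 cells), P(1 cell), P(2 or more)).
    """
    cell_fraction = cell_flow / (cell_flow + sum(other_aqueous_flows))
    lam = cells_per_nl * droplet_nl * cell_fraction
    p_empty = math.exp(-lam)
    p_single = lam * p_empty
    p_multi = 1.0 - p_empty - p_single
    return lam, p_empty, p_single, p_multi


lam, p0, p1, p_multi = cell_loading(0.4, 0.5, 600, [350, 250])
print(f"mean cells/droplet = {lam:.2f}")
print(f"P(empty) = {p0:.3f}, P(single) = {p1:.3f}, P(multiple) = {p_multi:.4f}")
```

With these inputs the mean occupancy is 0.1 cells per droplet, consistent with one cell per 10 droplets, and under the Poisson assumption only about 5% of cell-containing droplets would hold more than one cell.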

Once all inlet syringes were connected to the device, each syringe was sequentially primed at high flow rates to expel air in the tubing prior to droplet generation. To ensure that drops were appropriately collected and to prevent droplet merging, a single PE2 tubing was connected to the outlet and the non-device end was submerged in a chilled Eppendorf containing HFE supplemented with 2% (w/v) surfactant.

Bulk reverse transcription

Following mRNA capture, excess oil was removed from the sample and drops were gently washed first with neat HFE and subsequently with HFE supplemented with 10% (v/v) PFO to promote droplet merging. Once droplets were visibly merged, the mRNA-laden gels and associated supernatant were resuspended in 500 µL of water and passed through a 70 µm strainer to remove cellular debris. Filtered gels were then collected, resuspended in fresh water and pelleted at 3000 g for 1 min. Following a second wash at identical conditions, gels were collected, weighed and resuspended in a sufficient volume of 500 mM NaCl buffer to ensure a final salt concentration of 170 mM. The salt-buffered gels were then added to 100 µL reaction volumes containing a final concentration of 1X Superscript IV RT buffer, 5 mM DTT, 5 mM dNTPs, 5% (v/v) RNase inhibitor and 5% (v/v) Superscript IV reverse transcription enzyme and subjected to a single cycle of reverse transcription (i.e. 20 min at 30 °C, 20 min at 37 °C, 20 min at 55 °C).
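The salt adjustment of the weighed gels can be sketched as a simple dilution calculation. This is a minimal illustration under stated assumptions, not the authors' procedure: the gel pellet is treated as salt-free water at 1 mg per µL, and the 170 mM target is taken to refer to the gel suspension itself. The function name is hypothetical.

```python
def salt_buffer_volume_ul(gel_volume_ul, stock_mm=500.0, target_mm=170.0):
    """Volume of stock NaCl buffer to add so the mixture reaches target_mm.

    Solves stock_mm * V = target_mm * (gel_volume_ul + V) for V, assuming
    the gel pellet itself contributes no salt.
    """
    if not 0 < target_mm < stock_mm:
        raise ValueError("target must be positive and below the stock concentration")
    return gel_volume_ul * target_mm / (stock_mm - target_mm)

# Assumption: a 100 mg gel pellet is treated as 100 µL of salt-free volume.
gel_volume = 100.0
buffer_volume = salt_buffer_volume_ul(gel_volume)
final_mm = 500.0 * buffer_volume / (gel_volume + buffer_volume)
print(f"add {buffer_volume:.1f} µL of 500 mM NaCl -> {final_mm:.0f} mM final")
```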

In-drop overlap extension PCR

Once reverse transcription was completed, cDNA-decorated gels were collected, resuspended in 600 µL of wash buffer (see Supplemental Materials for preparation details) and centrifuged at 5000 g for 30 s. Following the wash, supernatant was carefully removed, leaving a small residual volume of gels and buffer behind, and samples were weighed to approximate the number of close-packed gels. To encapsulate these gels in drops for overlap extension PCR, a thermostable ‘Velcro surfactant’ was used to ensure that droplets would maintain integrity over 25–30 rounds of thermocycling. As with the initial lysis, four syringes for the inlets were prepared in the following order: (1) continuous oil phase containing the ‘Velcro surfactant’, (2) 2 × PCR buffer or spacer syringe, (3) a gel-rich 2 × capture gel and enzyme cocktail and (4) a gel-poor 2 × capture gel and enzyme cocktail. For the continuous phase, 4 mL of HFE7500 containing 2% (w/w) of ‘Velcro surfactant’ was mounted onto a syringe pump and set to flow at 2500 µL/hr. The 2 × PCR buffer syringe, containing 1.6 mL of a final concentration of 1X Q5 Buffer, 1X GC Enhancer, 0.5 mM dNTPs, Q5 Enzyme (1:25) and water, was flowed at 600 µL/hr.

For the final two syringes containing the cDNA-decorated capture gels, 1.6 mL of a 2 × gel and enzyme cocktail was prepared by mixing gels and water with a final concentration of Optiprep (16% v/v), 0.5 mg/mL BSA, 10 VH forward primers each at 0.2 µM, a single Vκ reverse primer at 2 µM, 2% (v/v) PVA and a cocktail of inner primers annealing to both the heavy and light chain where each primer is at a concentration of 7.6 nM and 3.8 nM respectively. Once the gel and enzyme cocktail was prepared, it was centrifuged at 1000 g for 30 s. For the ‘gel rich’ syringe, a 1 mL syringe with 150 cm of attached PE2 tubing was preloaded with 400 µL of neat HFE such that the tubing was filled with oil. The PE2 tubing was then fully submerged in the centrifuged gel-enzyme cocktail and the plunger was carefully retracted to withdraw packed gels into no more than half the tubing. The tubing was positioned vertically to ensure that gels remained gravity packed within the tubing. The remaining cocktail, substantially depleted of gels, was gently pipetted into a second 3 mL syringe above a preloaded spacer of 400 µL of neat HFE. To ensure that each droplet contained no more than one gel, the flow rates of the third and fourth syringes were maintained at 250 µL/hr and 350 µL/hr respectively and adjusted based on visual inspection of generated drops. To ensure that drops were appropriately collected, a single PE2 tubing was connected to the outlet and the non-device end was submerged in a chilled Eppendorf vial containing HFE supplemented with 2% (v/v) of ‘Velcro surfactant’.

Collected drops were subjected to an initial denaturation at 98 °C for 1 min and then cycled for two sets of 25 PCR cycles (first set: 98 °C for 15 s, 50 °C for 20 s, 72 °C for 1 min; second set: 98 °C for 15 s, 65 °C for 20 s and 72 °C for 90 s). Following cycling, a final elongation step was performed at 72 °C for 120 s. To break open the ‘shell’-like coating formed by the ‘Velcro surfactant’ and release the linked PCR product, cycled drops were transferred to a QIAshredder column and sheared by centrifugation at 10,000 g for 60 s. The aqueous phase, containing the linked PCR product, was collected and the column and pelleted debris were washed several times with water to collect any residual PCR product. Collected DNA was then ethanol-precipitated and gel-purified for use as template for further amplification.

Nested PCR and linked product amplification

To generate sufficient linked amplicon product for library generation while retaining library diversity, the PCR product was subjected to no more than 15 cycles of nested PCR amplification. For nested PCR, final concentrations of 1X Q5 Buffer, 1X GC enhancer, 0.25 mM dNTPs, Q5 enzyme (1:50), water as well as 0.2 µM of a cocktail of nested forward primers and 0.2 µM of a cocktail of nested reverse primers were mixed with 25 µL of purified linked product template to generate as many 200 µL reactions as needed to amplify all of the linked template. Samples were then subjected to an initial denaturation at 98 °C for 30 s and then cycled for 15 PCR cycles (98 °C for 20 s, 56 °C for 20 s, 72 °C for 90 s). Following cycling, a final elongation step was performed at 72 °C for 60 s. Samples were then collected and gel purified for use in phage-scFv library construction.

Generation of phage-scFv libraries using scFv inserts

Construction of natively paired and combinatorial scFv-phage display libraries

All scFv amplicon libraries containing the original M13 short linker were double digested with the restriction enzymes BstAPI (NEB catalog no. R0654S) and BglII (NEB catalog no. R0144S) and subsequently PCR-purified (Macherey–Nagel, catalog no. 740609). Following double digestion and purification, both natively paired and combinatorial scFv libraries were generated by ligating 5 µg of the appropriate scFv insert into 25 µg of the double-digested phagemid vector, LVEC2295 (proprietary to Sanofi), at a phagemid:insert ratio of 1:3 using T4 DNA Ligase (NEB, catalog no. M0202M). All reactions were conducted in a 1 mL volume at 16 °C and allowed to proceed overnight. Once ligations were complete, each reaction was column-purified (Macherey–Nagel, catalog no. 740609) into a final volume of 20 µL, of which 5 µL was transformed into 50 µL of TG1 electrocompetent cells (Lucigen Corp., catalog no. 605022) in a 0.1 cm cuvette (Bio-Rad Laboratories, catalog no. 1652083). Immediately following electroporation, the electroporated cells were recovered in 1 mL of SOC media for 1 h at 37 °C at 120 rpm, titrated to determine transformation efficiency, and plated overnight on a large 2YT-Agar bioassay dish containing 100 µg/mL ampicillin and 2% glucose (Teknova, catalog no. Y6292). Following overnight incubation, transformed TG1s were scraped from bioassay plates and a maxiprep was performed on scraped cultures to generate the phagemid template used to replace the M13 linker with a flexible, expressible Gly-Gly-Gly-Gly-Ser (G4S) linker.

To exchange the linkers, maxiprepped libraries of M13-linked scFvs were amplified using Phusion Plus with a pool of overlapping primers annealing to either framework 4 (FW4) of the heavy chain or framework 1 (FW1) of the light chain. The overlapping regions of the forward and reverse primers anneal to generate the new G4S linker. Following 20 cycles of PCR, the product was digested with DpnI for 1 h at 37 °C to remove methylated parent or template plasmid, column purified into a final volume of 20 µL and electroporated into TG1 electrocompetent cells as previously described. Sanger validation was performed to ensure that the majority of the library (~ 75–80%) displayed the corrected G4S linker.

Packaging and expression of scFv-phage display libraries

To package both the natively paired and combinatorial libraries into phage, glycerol stocks from TG1 bacteria transformed with G4S-linked libraries were inoculated at a cell density 100–500 times higher than the size of the library. To ensure minimal growth-related biases, large-scale, 1 L cultures were inoculated at an OD600 of 0.1 and grown at 37 °C at 250 rpm until they reached an OD600 of 0.5–0.7. At exponential phase of growth (OD600 = 0.5–0.7), cultures were infected with M13K07 helper phage at a MOI of 20 for 30 min at 37 °C at 150 rpm. Cells were then centrifuged at 3000 g for 30 min to eliminate glucose-containing media and resuspended in 2YT containing kanamycin (50 µg/mL) and ampicillin (100 µg/mL). Cultures were grown overnight at 30 °C and 200 rpm. The next day, cultures were centrifuged at 3000 g for 30 min and the phage-containing supernatant was collected and filtered through a 0.2 µm filter to remove any residual bacteria. Supernatants were immediately added to a 20% PEG8K in 2.5 M NaCl solution at a ratio of 5:1 and incubated on ice for 1 h with shaking at 15 min intervals to ensure complete mixing. PEG-precipitated supernatants were then centrifuged for one hour at 10,000 rpm at 4 °C and resuspended in 8 mL of PBS and 2 mL of PEG8K in 2.5 M NaCl for a second round of PEG precipitation as described above and elsewhere. Following centrifugation, phage pellets were resuspended in 1 mL of PBS. Titers were determined by infecting TG1 cells undergoing exponential growth phase with limiting dilutions of phage and performing colony counts.
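The helper phage dose implied by a target MOI can be estimated as below. This is a hedged sketch: the conversion of 5 × 10^8 cells/mL per OD600 unit is an assumed rule of thumb that varies with strain and spectrophotometer, the stock titer is an example value, and the function names are hypothetical.

```python
def helper_phage_pfu(od600, culture_volume_ml, moi, cells_per_ml_per_od=5e8):
    """Total helper phage (PFU) needed to infect a culture at the given MOI.

    cells_per_ml_per_od is an assumed conversion factor, not a measured value.
    """
    cell_count = od600 * culture_volume_ml * cells_per_ml_per_od
    return cell_count * moi

def stock_volume_ul(pfu_needed, stock_pfu_per_ul):
    """Volume of helper phage stock that supplies pfu_needed."""
    return pfu_needed / stock_pfu_per_ul

# 1 L culture at OD600 0.5, MOI of 20 (values from the text).
pfu_needed = helper_phage_pfu(od600=0.5, culture_volume_ml=1000, moi=20)

# Assumed example stock titer of 2e10 PFU/µL.
volume_ul = stock_volume_ul(pfu_needed, stock_pfu_per_ul=2e10)
print(f"{pfu_needed:.1e} PFU needed, i.e. {volume_ul:.0f} µL of stock")
```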

Comparator phage generation

Comparators were transformed into TG1 cells (Zymo Research Corporation T3017) according to the manufacturer’s protocol. To validate expression of comparators, a single clone of each was inoculated into a 25 mL culture and grown overnight. The next day, cultures were diluted to an OD600 of 0.05 and allowed to grow at 37 °C with shaking until they reached an OD600 of 0.5. At an OD600 of 0.5, cultures were infected with M13K07 helper phage by adding 5 µL of 2 × 10^10 PFU/µL stock to 25 mL cultures. Cultures were infected for 30 min at 37 °C, 150 rpm. Cultures were spun down and the pellet resuspended in 25 mL 2YT containing kanamycin (50 µg/mL) and ampicillin (100 µg/mL) and grown overnight at 30 °C and 200 rpm. The next day, the cultures were spun down and the supernatants filtered through a 0.22 µm filter. A solution of 20% PEG8K/2.5 M NaCl was added at 1/5th the supernatant volume and the mixture was incubated on ice for 1 h. Samples were centrifuged at 9000 rpm for 30 min in a Sorvall RC5C centrifuge. Supernatant was removed and the pellet resuspended in 1 mL sterile filtered PBS.

PacBio analysis of scFv libraries

Long-read PacBio sequencing was performed to evaluate the diversity of the parent scFv library. Glycerol stocks (3 mL) from the parent library were used to isolate the phagemid using a midi-prep plasmid isolation kit (Qiagen, Cat. No. 12943). The final DNA pellet was dissolved in 100 µL of RNase- and DNase-free ddH2O. The DNA concentration was measured with a NanoDrop and 15 µg of the phagemid was then used to isolate the scFv fragment library by restriction digestion. Restriction digestion was performed at 37 °C for 4 h using 10 µL each of the enzymes MluI-HF (NEB, Cat. No. R3198L) and ApaI (NEB, Cat. No. R0114L) in a final reaction volume of 250 µL. After restriction digestion, the digested product was mixed with 50 µL of 6X Gel-Red dye and the sample was run on a 1% agarose gel. The gel was visualized under a UV transilluminator, and the digested scFv fragment library was cut from the gel. The scFv fragment library was then isolated using a gel extraction kit (Fisher Scientific, Cat. No. NC0389463) and the final product was dissolved in 30 µL of RNase- and DNase-free ddH2O. Finally, 2 µg of the scFv library was sequenced on a PacBio sequencer at the DNA Sequencing Center of Brigham Young University.

Bioinformatic analysis of chain pairing efficiency

For datasets from both combinatorial and paired libraries, the number of unique pairs of variable light (VL) and heavy (VH) chains in the NGS and Sanger-derived sequences was obtained based on the complementarity-determining region (CDR) loops. Following the removal of non-amino-acid characters, the CDRs were annotated using the IMGT numbering provided by the ANARCI suite. Sequences with CDR1, CDR2 or CDR3 lengths outside the ranges of 5–12, 0–10 and 2–13 amino acids, respectively, were excluded from the subsequent analysis. For each dataset, the unique dash-concatenated L1–L3 and H1–H3 CDRs were obtained and their corresponding pairs, i.e. sequences with the same IDs, were added to a list. Subsequently, the list was parsed and, for each unique VH/VL CDR pairing, the total number of reads was counted and recorded for the comparison between the combinatorial and microfluidic-based library generation methods.
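The counting step described above can be sketched as follows. This is an illustrative reconstruction rather than the authors' script: it assumes the CDRs have already been annotated (the ANARCI step is not reproduced), that the same length ranges apply to both chains, and that each read supplies three heavy-chain and three light-chain CDR strings. All names and toy sequences are hypothetical.

```python
from collections import Counter

# Inclusive length ranges from the text for CDR1, CDR2 and CDR3.
LENGTH_RANGES = {1: (5, 12), 2: (0, 10), 3: (2, 13)}

def cdrs_in_range(cdrs):
    """True if CDR1-3 of one chain all fall inside the allowed length ranges."""
    return all(
        LENGTH_RANGES[i][0] <= len(cdr) <= LENGTH_RANGES[i][1]
        for i, cdr in enumerate(cdrs, start=1)
    )

def count_pairings(reads):
    """Count reads per unique (heavy CDRs, light CDRs) pairing.

    reads: iterable of (heavy_cdrs, light_cdrs), each a tuple of three strings
    taken from the same read, so the pairing is preserved.
    """
    counts = Counter()
    for heavy, light in reads:
        if cdrs_in_range(heavy) and cdrs_in_range(light):
            counts[("-".join(heavy), "-".join(light))] += 1
    return counts

# Hypothetical toy reads; the last one fails the CDR3 length filter.
example_reads = [
    (("GFTFSSYA", "ISGSGGST", "AKDRGYFDY"), ("QSISSY", "AAS", "QQSYSTPLT")),
    (("GFTFSSYA", "ISGSGGST", "AKDRGYFDY"), ("QSISSY", "AAS", "QQSYSTPLT")),
    (("GFTFSSYA", "ISGSGGST", "AKDRGYFDY"), ("QSVSSN", "GAS", "QQYNNWPT")),
    (("GFTFSSYA", "ISGSGGST", "A"), ("QSISSY", "AAS", "QQSYSTPLT")),
]
pair_counts = count_pairings(example_reads)
for (heavy_key, light_key), n_reads in pair_counts.most_common():
    print(heavy_key, light_key, n_reads)
```

In a natively paired library, a given heavy-chain key would be expected to co-occur mostly with one light-chain key, whereas a combinatorial library spreads each heavy chain across many light chains; the per-pair read counts make that difference measurable.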

Phage display selections and screening

Library panning against target A and target B

The selection and identification of target-specific antibodies was performed by panning both natively paired and combinatorial immune libraries on the appropriate biotinylated antigen target (human Antigen A-avi-his or human Antigen B-avi-his) conjugated to streptavidin beads. Briefly, 200 µL of streptavidin beads were conjugated with either 100 nM biotinylated target antigen (i.e. Antigen A or Antigen B) or 200 nM of an irrelevant depletion antigen. Conjugations were performed on a rotary shaker in 500 µL of PBS with 1% BSA (v/v) (PBS-B) for 30 min at RT. Following conjugation, antigen-coated beads were centrifuged at 8000 g for 5 min, washed twice with 500 µL of PBS with 0.1% Tween 20 (v/v) (PBS-T) and resuspended in 500 µL of PBS-B. For each library, prior to performing positive selections using biotinylated target antigen, one round each of non-specific phage depletion was performed using uncoated streptavidin beads as well as irrelevant antigen-coated beads. Both depletions as well as positive selections were performed on the KingFisher™ mL instrument (Thermo Scientific, catalog no. 5400050).

In short, a 10^10 phage titer of each immune library, accounting for 100–500 times the library size, was diluted in 500 µL of PBS-B and added to the first well within the KingFisher strip. Subsequent wells within the strip were filled with streptavidin beads alone, beads coated with the irrelevant depletion antigen and beads coated with the target (positive selection) antigen. During negative selection, depletion beads were coincubated with the phage library for 30 min before the beads, and bead-bound phage, were collected and removed. Following negative selection, target antigen-conjugated beads were coincubated with the depleted phage library for 1 h of positive selection. Phage bound to target antigen-conjugated beads were washed five times with 500 µL of PBS-T and twice with PBS. After the final wash step, phage bound to the target antigen-bead complex were eluted with 500 µL of freshly made 100 mM triethanolamine (TEA) and neutralized with 250 µL of 0.5 M Tris–HCl at pH 7.5.

Eluted and neutralized output phage were immediately collected and used to infect 5 mL of TG1 cells at an OD600 of 0.5. Infected TG1 cells were incubated at 37 °C for 30 min with no shaking followed by another 30 min of shaking incubation (120 rpm) at 37 °C. Following incubation, infected TG1 cells were centrifuged at 3000 g for 10 min and the pelleted cells were spread on square bioassay plates for outgrowth and generation of glycerol stocks for subsequent rounds of selection. To determine round-to-round enrichment, titrations were performed on both post-depletion input and post-selection output phage by infecting TG1 cells undergoing exponential growth phase with limiting dilutions of phage and performing colony counts. For the subsequent rounds of selection, crude phage supernatant rescued from bioassay plate outgrowths was used as the input for selection following the parameters described above.
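Titers from limiting-dilution colony counts and the resulting round-to-round enrichment can be computed as in the sketch below. The formulas are the conventional ones (colonies × dilution factor ÷ plated volume; output ÷ input recovery); the text does not spell out the exact calculation, and every number here is a hypothetical placeholder rather than data from the study.

```python
def titer_cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Titer of the undiluted sample from one countable dilution plate."""
    return colonies * dilution_factor / plated_volume_ml

def recovery(output_titer, input_titer):
    """Fraction of input phage recovered after selection."""
    return output_titer / input_titer

# Hypothetical counts: (input colonies, input dilution, output colonies, output dilution),
# each plated as 0.1 mL.
rounds = {
    1: (40, 1e8, 25, 1e3),
    2: (50, 1e8, 60, 1e5),
}
recoveries = {}
for round_no, (in_col, in_dil, out_col, out_dil) in rounds.items():
    input_titer = titer_cfu_per_ml(in_col, in_dil, 0.1)
    output_titer = titer_cfu_per_ml(out_col, out_dil, 0.1)
    recoveries[round_no] = recovery(output_titer, input_titer)

# Enrichment between consecutive rounds is the ratio of their recoveries.
enrichment_r2_over_r1 = recoveries[2] / recoveries[1]
print(f"round 2 vs round 1 enrichment: {enrichment_r2_over_r1:.0f}-fold")
```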

ELISA

To validate binding of comparator scFvs to intended targets, 384-well streptavidin-coated plates (Thermo Scientific Cat # 436017) were coated with 0.05 µg/mL Antigen A or Antigen B in 1% BSA in PBS and incubated overnight at 4 °C. The next morning, phage supernatants were blocked in a final concentration of 3% BSA for 2 h with gentle shaking. Plates were washed and also blocked with 3% BSA in PBS-T (PBS containing 0.1% Tween). Following blocking, the blocked phage supernatants were added to the plates and incubated for two hours at room temperature. Following incubation, plates were washed three times with 25X DELFIA Wash Concentrate (Perkin Elmer Cat #1244-114) diluted to 1X and incubated for one hour with an internal antibody against M13 pre-conjugated to DELFIA Eu-N1 Rabbit Anti-Mouse-IgG antibody (Perkin Elmer Cat #AD0207) diluted in DELFIA Assay Buffer (Perkin Elmer Cat #4002-0010). Plates were again washed twice with 1X DELFIA wash buffer and finally 30 µL of DELFIA Enhancement Solution (Perkin Elmer Cat #4001-0010) was added to each well for 15 min before imaging on a PerkinElmer EnVision Multimode Plate Reader.

Bioinformatic selection of hits from paired and combinatorial libraries

Following long-read PacBio sequencing of each library, raw FASTQ files from NGS sequencing as well as FASTA sequence files from Sanger sequencing of ELISA-identified, target-specific hits or known, functionally validated antibodies (Azenta Life Sciences, Waltham, MA) were uploaded to the IGX Platform (IGX Platform, Version 3.0.6, August 2022). Sequence quality was initially assessed via IGX Inspect, a quality control application designed for assessing immune receptor or antibody sequencing data, for average read quality, Q30 scores and V and J gene alignment distributions and read fate. All samples with good sequence quality (a Q30 score of > 85%) and > 95% of reads with successfully extracted heavy and light chain pairs were processed using MiXCR55, an open-source tool within the IGX Platform for the annotation and quantification of B-cell receptor sequences, to bioinformatically link heavy and light chain sequences from each scFv pair together. All identifications and alignments of heavy and light chains were performed using the V(D)J as the region of interest and default stringency settings for alignment.

Once linked heavy and light chain sequences were generated using MiXCR, paired sequences from each natively-paired or combinatorial library were individually clustered alongside target-specific paired and validated hits from Sanger Sequencing using IgX Cluster. Clones were assigned to the same cluster if the receptor amino acids within both the heavy and light chain shared greater than 80% similarity, as measured by Levenshtein Distance56, in the primary amino acid sequence as well as identical V genes, J genes and identical CDR3 lengths. Generated clusters were then visualized by IgX Branch and the phylogeny of clusters with one or more validated, Sanger-Sequenced hits was inspected further. Approximately 100 unique library-derived clones from these clusters, which showed high sequence similarity to validated clones, were selected for each combinatorial or paired library. Selected sequences for a given library that were also found in any of the other three libraries were removed using IgX Track, which allows datasets to be joined, intersected or subtracted to discover highly enriched or specific candidates within a display library. Following redundancy analysis using IgX Track, unique, library-specific clones were assessed by SAbPred57, a collection of computational tools developed by the Oxford Protein Informatics Group, as well as modified internal pipelines for developability and sequence-based canonical form predictions.
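The cluster-membership rule described above can be illustrated with a short sketch. It is not the IgX Cluster implementation: the normalization of Levenshtein distance to a similarity (1 minus distance divided by the longer sequence length) is an assumption, as are the record layout and all names and toy sequences.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chain:
    v_gene: str
    j_gene: str
    cdr3: str
    aa_seq: str

def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer sequence."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

def chains_match(x, y, threshold=0.80):
    """Same V gene, J gene and CDR3 length, and similarity above the threshold."""
    return (
        x.v_gene == y.v_gene
        and x.j_gene == y.j_gene
        and len(x.cdr3) == len(y.cdr3)
        and similarity(x.aa_seq, y.aa_seq) > threshold
    )

def same_cluster(clone_a, clone_b):
    """Each clone is a (heavy, light) pair; both chains must match."""
    return all(chains_match(x, y) for x, y in zip(clone_a, clone_b))

# Hypothetical, truncated toy sequences for illustration only.
light = Chain("IGKV1-39", "IGKJ1", "QQSYST", "DIQMTQSPSS")
clone_1 = (Chain("IGHV3-23", "IGHJ4", "ARDY", "QVQLVQSGAE"), light)
clone_2 = (Chain("IGHV3-23", "IGHJ4", "ARDW", "QVQLVESGAE"), light)
clone_3 = (Chain("IGHV1-69", "IGHJ4", "ARDY", "QVQLVQSGAE"), light)
print(same_cluster(clone_1, clone_2), same_cluster(clone_1, clone_3))
```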

In silico developability (TAP and SCALOP) assessments of paired and combinatorial antibodies

Computational therapeutic antibody profiling (TAP) simulations were performed for bioinformatically-selected unique clones. TAP simulations consist of generating five computational descriptors derived from clinical-stage therapeutic antibodies and aim to serve as a guideline for the identification of antibodies with poor developability39. Herein, the software MOE (Chemical Computing Group ULC. Molecular Operating Environment (MOE), version 2022.02. 2023) (Thorsteinson implementation) was used for the generation of homology models from antibody sequences and for computing five sequence and structure-based descriptors: CDR length, patches of surface hydrophobicity (PSH), patches of positive charge (PPC), patches of negative charge (PNC), and the structural Fv charge symmetry parameter (SFvCSP).

In addition, the canonical loop conformations of the CDRs were also assessed using SCALOP41, which allows for the identification of different antibody binding shapes. It is important to note that SCALOP computations rely solely on sequence. Each CDR (except HCDR3) was assigned to a conformational cluster and labeled with CDR type, sequence length, and cluster rank. For instance, the label 'L2-3-A' denoted the highest-ranked cluster for LCDR2 sequences of 3 amino acids. Each antibody received five CDR labels as its annotation. Structural diversity within a dataset was assessed by counting unique annotations across all antibodies.
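The labeling and counting scheme can be made concrete with a small sketch. It only mimics the bookkeeping described above and does not call SCALOP; the input format, the function names and the toy cluster assignments are assumptions for illustration.

```python
# The five CDRs that receive a canonical-form label (HCDR3 is excluded).
CDR_ORDER = ("H1", "H2", "L1", "L2", "L3")

def cdr_label(cdr_type, length, rank):
    """Build a label such as 'L2-3-A' from CDR type, length and cluster rank."""
    return f"{cdr_type}-{length}-{rank}"

def annotate(assignments):
    """Five-label annotation for one antibody.

    assignments maps each CDR type to a (length, cluster rank) tuple.
    """
    return tuple(cdr_label(cdr, *assignments[cdr]) for cdr in CDR_ORDER)

def structural_diversity(antibodies):
    """Number of distinct five-label annotations in a dataset."""
    return len({annotate(ab) for ab in antibodies})

# Hypothetical cluster assignments for three antibodies; the first two are identical.
ab_1 = {"H1": (8, "A"), "H2": (8, "A"), "L1": (6, "A"), "L2": (3, "A"), "L3": (9, "A")}
ab_2 = dict(ab_1)
ab_3 = {**ab_1, "L3": (9, "B")}
print(annotate(ab_1))
print(structural_diversity([ab_1, ab_2, ab_3]))
```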

LEC screening

Linear expression cassette construction

HC and LC variable regions from lead antibodies and controls were ordered as gene fragments from Twist Bioscience, containing short DNA tag sequences on the 5’ and 3’ ends complementary to the CMV fragment or constant fragments to allow for linear expression cassette construction. Variable heavy and variable light fragments were cloned into linear DNA expression cassettes as IgGs as previously described42. Briefly, a CMV promoter was amplified from the plasmid pTT5-mIg1H and used for both the heavy chain LEC and light chain LEC. A DNA fragment containing a human IgG1 constant region and rbGlob poly (A) tail was amplified from the plasmid pTT5-huIgG1H. The DNA fragment containing human IgK constant region, stop codon, and rbGlob poly (A) tail was amplified from the plasmid pTT5_HuIg kappa. The DNA fragments were purified by gel electrophoresis (ThermoFisher) and QIAquick Gel Extraction Kit (Qiagen Cat #28706X4). To construct heavy and light linear expression cassettes by overlapping PCR, 10 ng of heavy or light chain fragments were used in a 50 µL PCR reaction containing 25 µL of Q5 hot start high fidelity DNA polymerase 2 × Master Mix (New England Biolabs Cat # M0494L), 10 ng of the CMV fragment, 10 ng of the IgG or Kappa constant fragment, 20 pmol of the forward primer (gaccgagcgcagcgagtc) and 20 pmol of the reverse primer (tctccgagggatctcgacc). PCR cycles consisted of one cycle at 98 °C for 1 min followed by 30 cycles of 98 °C for 10 s, 68 °C for 15 s, and 72 °C for 1 min and 40 s, followed by a final extension at 72 °C for 2 min. Amplicons were confirmed by running a small sample of the PCR products on an agarose gel, and the linear expression cassettes were purified using MinElute 96 UF PCR Purification Kit (Qiagen Cat #28004).

Linear expression cassette production

Immediately following purification of cassettes, HC and LC cassettes were co-transfected using 200–300 ng of each into 0.5 mL of Expi293F cells at 3 × 10^6 cells/mL (Thermo Fisher Scientific Cat #A14528) in a 96 well deep-well plate (USA Scientific, 1896-2110) with an ExpiFectamine 293 Transfection Kit (Thermo Fisher Scientific Cat #A14524), following the manufacturer’s recommendations. Cells were incubated at 37 °C, 8% CO2 with shaking at 900 rpm for 6 days.

Antibody characterization

After 6 days, cells were harvested and supernatants quantified using biolayer interferometry on an Octet QK384 using ProA Dip and Read Biosensors (Sartorius Cat #18-0004). The LEC supernatants were diluted 1:4 in Octet running buffer containing PBS pH 7.4 supplemented with 0.1% BSA and 0.02% Tween20 to a final volume of 100 µL in a 384-well polypropylene black flat-bottom plate (Greiner Bio-one Cat #781209). Biosensors were pre-incubated with 200 µL of Octet buffer for 30 min at room temperature in 96-well polypropylene black flat-bottom plates prior to the assay run. Supernatants derived from mock and reference plasmid controls were used as negative and positive controls, respectively. Prior to data collection, both antibody and biosensor plates were incubated for 5 min at 30 °C while shaking at 400 rpm. In the Octet Data Acquisition v13 software (Sartorius), the binding rate of each LEC supernatant to the biosensor was measured using the quantitation with regeneration method for a load time of 1200 ms at 30 °C and shaking at 400 rpm. In addition, a standard curve of a 1:2 serial dilution from 100 to 0.5 µg/mL of a purified human IgG reference was included for the assay (Jackson ImmunoResearch Cat #009-000-003). The evaluation was performed in the Octet Data Analysis software v13.0.0.32 to determine the initial binding rates for all samples, including the standard curve. Antibody concentration was calculated based on the reference antibody standard curve prepared in Octet buffer.

To assess antigen-specificity, 384-well ELISA plates (Thermo Scientific Cat #8755) were coated with either Target B, Target A, or a control antigen at 2 µg/mL in PBS. The following day, plates were washed twice with PBS-T (PBS containing 0.1% Tween) and then blocked for 30 min with PBS containing 3% BSA. Plates were washed again with PBS-T and then incubated with 50 µL of the supernatant for 2 h at room temperature. Plates were washed with PBS-T and then incubated with DELFIA Eu-N1 Anti-Human IgG (Perkin Elmer #1244-330) diluted in DELFIA Assay Buffer (Perkin Elmer Cat #4002-0010). Plates were washed three times with 25X DELFIA Wash Concentrate (Perkin Elmer Cat #1244-114) diluted to 1X. 30 µL of DELFIA Enhancement Solution (Perkin Elmer Cat #4001-0010) was added to plates for 15 min, and then plates were imaged with a PerkinElmer EnVision Multimode Plate Reader.

Developability assessments of natively paired and combinatorial antibodies

Analytical SEC

All size exclusion chromatography (SEC) analyses were done on a Waters ACQUITY UPLC with a Superdex 200 Increase 5/150 GL column (Sigma-Aldrich, Darmstadt, Germany) using phosphate buffered saline pH 7.2 as the mobile phase. For each sample, 10 µg of protein was injected in singlet over the column at 0.3 mL/minute for 15 min and an absorbance at 280 nm (A280) was measured. Buffer blanks and Gel Filtration Standard (Bio-Rad, 1511901) were included at the beginning and end of the run. Resulting chromatograms were blank subtracted and manually integrated to determine percent purity.
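Percent purity from manually integrated peaks is conventionally the main-peak area as a share of the total integrated area. The sketch below assumes that definition, which the text does not spell out, and uses hypothetical peak areas; the function name is illustrative.

```python
def percent_main_peak(peak_areas, main_index):
    """Main-peak area as a percentage of the total integrated area."""
    total_area = sum(peak_areas)
    if total_area <= 0:
        raise ValueError("total integrated area must be positive")
    return 100.0 * peak_areas[main_index] / total_area

# Hypothetical blank-subtracted areas: high-molecular-weight species, monomer, fragments.
areas = [2.0, 95.0, 3.0]
purity = percent_main_peak(areas, main_index=1)
print(f"{purity:.1f}% monomer")
```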

Analytical HIC

All hydrophobic interaction chromatography (HIC) analyses were done on an Agilent 1260 HPLC with a MabPac HIC 10 HPLC column (ThermoFisher, Somerville, MA, USA) using a gradient from 1.5 M ammonium sulfate in 25 mM sodium phosphate pH 7.0 to 25 mM sodium phosphate pH 7.0. For each sample, 5 µg of protein was injected over the column at 0.6 mL/minute for 17.5 min and the absorbance at 280 nm was measured. Buffer blanks were included after every 10 samples. HIC Protein Standard Mix (Waters Corp, 186007953) was injected at the beginning and end of the run. Resulting chromatograms were blank subtracted and manually integrated to determine the percent main peak.

NanoDSF

Nano differential scanning fluorimetry (nanoDSF) was performed using a Prometheus NT (NanoTemper Technologies PR001, Cambridge, MA, USA). Intrinsic tyrosine and tryptophan fluorescence was measured during thermal denaturation from 30 to 90 °C with a ramp of 2 °C/minute.

Functional assessments of antigen A-targeting natively paired and combinatorial antibodies

Cell culture

Promega GloResponse™ NFkB-Luc2P/U2OS reporter cells (Promega Corporation, Madison, WI, cat. no. CS1979A03) were grown under standard cell culture conditions using McCoy’s 5A medium (Invitrogen, Carlsbad, CA, cat. no. 16600-082) supplemented with 10% fetal bovine serum (HyClone/GE Healthcare, Chicago, IL, cat. no. SH30070.02) and 200 µg/mL Hygromycin B (Invitrogen, Carlsbad, CA, cat. no. 10687-010). Prior to seeding into assay plates, reporter cells were resuspended in assay medium (McCoy’s 5A supplemented with 1% fetal bovine serum).

Automated reporter assay

Automated reporter screens were run on a GNF screening system (GNF Systems, San Diego, CA), equipped with a Stäubli TX90L robotic arm (Stäubli Corporation, Pfäffikon, Switzerland) serving an integrated Envision plate reader (Perkin Elmer Inc., Waltham, MA), a Bravo liquid handling platform (Agilent Technologies Inc., Santa Clara, CA), two GNF Model I washers/dispensers, and two ThermoFisher incubators (Forma Environmental Chamber, model 4933, ThermoFisher Scientific, Waltham, MA).

Reporter cells were seeded into 384-well assay plates (white, flat-bottom TC-treated, Greiner Bio-One, Kremsmünster, Austria, cat. no. 781073) at 10,000 cells per well in 12.5 µL of assay medium. Samples and positive control (Target A, R&D Systems, Minneapolis, MN, cat. no. 6420-CL) were added manually to 384-well master plates (polypropylene, V-bottom, Greiner Bio-One, Kremsmünster, Austria, cat. no. 781280). Assay medium was added to master plates and 9-point or 18-point serial dilutions were performed on the Agilent Bravo. Subsequently, 12.5 µL of sample solution was stamped from a master plate into each of three replicate assay plates. Assay plates were incubated for 4 h at 37 °C, and then equilibrated for 15 min at room temperature. In the meantime, the Bio-Glo detection reagent (Bio-Glo Luciferase Assay System, Promega Corporation, Madison, WI, cat. no. G7940) was prepared according to the manufacturer’s instructions. Bio-Glo reagent (50 µL/well) was added to the assay plates via the GNF dispenser and incubated for 5–10 min at room temperature, and luminescence output was measured on the Envision using the US luminescence aperture and 0.1 s/well readout time.

Data analysis and curve fitting

Raw data files from the Envision were imported into Genedata Screener (version 16.0.2, Genedata, Basel, Switzerland). Well meta data was added through a cmt file that was generated by a custom-written Knime pipeline (Knime AG, Zürich, Switzerland). Data was normalized and scaled using negative and positive controls as central and scale reference, respectively. Plate-based RZ’ factors were calculated in Screener and plates with RZ’ < 0.5 were masked and excluded from further analysis. Curve fitting was carried out using the Screener SmartFit algorithm, and valid IC50 or EC50 values were reported as qAC50 values.
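The plate quality filter and control-based normalization can be sketched as below. This is a generic illustration, not Genedata Screener's implementation: it assumes the common robust Z′ definition (medians and MAD-based standard deviations in place of means and standard deviations) and a percent-of-control scaling; the control values are hypothetical.

```python
from statistics import median

MAD_TO_SD = 1.4826  # scales the median absolute deviation to a normal-equivalent SD

def robust_sd(values):
    """MAD-based estimate of the standard deviation."""
    center = median(values)
    return MAD_TO_SD * median(abs(v - center) for v in values)

def rz_prime(negative_controls, positive_controls):
    """Robust Z' factor: 1 - 3 * (sd_pos + sd_neg) / |median_pos - median_neg|."""
    window = abs(median(positive_controls) - median(negative_controls))
    return 1.0 - 3.0 * (robust_sd(positive_controls) + robust_sd(negative_controls)) / window

def normalize(signal, negative_controls, positive_controls):
    """Scale a raw signal so the negative control is 0 and the positive control is 100."""
    low = median(negative_controls)
    high = median(positive_controls)
    return 100.0 * (signal - low) / (high - low)

# Hypothetical raw luminescence values for one plate.
neg = [100, 102, 98, 101, 99]
pos = [1000, 1010, 990, 1005, 995]
plate_rz = rz_prime(neg, pos)
plate_passes = plate_rz >= 0.5  # plates with RZ' < 0.5 are masked
print(f"RZ' = {plate_rz:.3f}, pass = {plate_passes}, 550 -> {normalize(550, neg, pos):.0f}%")
```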

Statistical analysis

Unless otherwise noted, data are presented as mean ± s.d. For single comparisons, an unpaired t test with Welch’s correction was used. For multiple comparisons, one-way analysis of variance was used with post hoc Dunnett’s correction. p < 0.05 was regarded as significant.
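For reference, the Welch-corrected t statistic and its Welch–Satterthwaite degrees of freedom can be computed as in the sketch below. It uses only the standard library and stops short of the p-value, which requires the t distribution from a statistics package; the sample values are hypothetical and the function name is illustrative.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Uses sample variances (n - 1 denominator) and does not assume equal variances.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, dof

# Hypothetical measurements for two groups.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_value, degrees_of_freedom = welch_t(group_a, group_b)
print(f"t = {t_value:.3f}, df = {degrees_of_freedom:.2f}")
```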