Introduction

Rare Earth Elements (REEs; including the lanthanides (Z = 57 to 71), scandium, and yttrium) are crucial ingredients in present-day sustainable energy technologies including wind turbines1,2 and electric vehicles3, and in future ones like high-temperature superconductors4,5,6,7. As a result of this, the demand for REEs is expected to grow by between 3- and 7-fold by 20408. However, existing hydrometallurgical methods for REE mining pose considerable environmental risks9,10,11.

The growing demand for an environmentally-friendly supply chain for REEs has fueled interest in biotechnological processes for REE mining and recycling9,12,13,14,15,16,17,18,19,20. Bioleaching solubilizes metals from minerals and end-of-life feedstocks with a microbially-secreted mineral-dissolving cocktail called a biolixiviant13,14,16,21,22,23. In recent years, the search for more efficient bioleaching strategies has gained significant interest23,24,25.

The Generally Regarded As Safe (GRAS) acidophilic bacterium Gluconobacter oxydans26,27,28,29 has played a significant role in bioleaching research over the past decade14,21,30,31,32,33. This microbe offers a unique capability for secretion of organic acids21,29,34 and creation of a highly acidic biolixiviant (primarily containing gluconic acid14) which is particularly effective for recovering REEs from sources including minerals like allanite15; industrial residues like phosphogypsum30; red mud35, and blast furnace slag36; and recycled materials like fluid cracking catalyst (FCC)21, nickel-metal-hydride (NiMH) batteries37, and retorted phosphor powder (RPP)16,21,35.

We have already improved bioleaching of REEs by G. oxydans by first identifying genes involved in acid production with high-throughput screening16, and then engineering their regulation15. We increased REE-bioleaching from the REE-containing mineral allanite by up to 73% by up-regulating the membrane-bound glucose dehydrogenase, mgdh, and knocking out the phosphate signaling and transport system pstS15.

Despite progress in enhancing REE-bioleaching with G. oxydans, further genetic improvements may be necessary to allow it to leapfrog traditional thermochemical extraction processes33,38. One way to further improve it is to target the regulation of additional genes implicated in bioleaching by acid production screens16. However, it is proposed that metal solubilization in bioleaching happens through three processes: acidolysis via acid production, complexolysis by creating complexes between metal ions and organic compounds, and redoxolysis, which involves electron transfer and oxidative reactions33. G. oxydans’ biolixiviant appears to be more effective than pure gluconic acid controls at the same pH21. This could suggest that the biolixiviant contains additional components that amplify its effectiveness21,22,30,32,39.

We hypothesize that further characterizing the mechanism of bioleaching and then engineering the composition of the biolixiviant is the route to future significant increases in bioleaching efficiency. But the acid production screens that we have used to date may not be sensitive to complexolysis and redoxolysis. Thus, to fully characterize the biolixiviant, it is necessary to identify genes directly linked to bioleaching, rather than just acid production alone. In this study, we conducted a genome-scale screen of a new quality-controlled G. oxydans B58 whole genome knockout collection with a high-throughput assay designed to directly identify genes involved in REE-bioleaching.

Results

G. oxydans’ quality-controlled collection is 85% smaller than previous collection

To facilitate direct high-throughput screening of mineral-dissolution, we built a highly non-redundant Quality-Controlled (QC) whole genome knockout collection for G. oxydans B58. The QC collection was built by removing redundancy from our earlier condensed collection (CC) of G. oxydans transposon disruption mutants16. Throughout this article, we refer to transposon disruption mutants with a “δ” symbol (as opposed to a “Δ” symbol for clean deletion mutants; e.g., δGO_1096).

In total, the QC collection contains a single carefully chosen transposon disruption mutant for 2733 non-essential genes in the G. oxydans B58 genome arrayed onto 57 checker-boarded 96-well plates (Supplementary Data 1). The QC collection is 94% smaller than our G. oxydans saturating-coverage transposon mutant collection used to source all mutants in our earlier work (containing 49,256 mutants)16, and 85% smaller than our previous best curated collection (17,706 mutants)16. An overview of QC collection construction is shown in Fig. 1, and collection statistics are shown in Table 1. Full details of QC collection construction can be found in “Materials and Methods”.

Fig. 1: Construction of a quality-controlled (QC) collection containing a single disruption mutant for 2733 genes in G. oxydans’ genome.
figure 1

A The QC collection was built by choosing the best available mutants from an earlier condensed collection (CC) of G. oxydans mutants16, and by supplementation from the originating saturating-coverage transposon collection (Progenitor Collection or PC)16. Mutants were transferred from the earlier collections by fluid transfer into fresh 96-well plates with a colony picking robot. B The QC collection contains 2733 unique mutants distributed across 57 checker-boarded 96-well plates. This collection represents a 94% reduction in size from the PC, and an 85% reduction in size from the CC, indicated in red. It is worth noting, that the CC collection contained disruption mutants with confirmed identity for 2556 genes (93.5% of the 2733 genes disrupted in the PC), whereas the QC collection aims to include a disruption mutant for 100% of genes disrupted in the PC.

Table 1 Overview of G. oxydans B58 quality-controlled (QC) whole genome knockout collection statistics

Genome-scale screen of QC collection identifies 68 genes not previously implicated in REE-bioleaching

We performed a genome-scale screen of the QC collection to identify gene disruption mutants with differential REE-bioleaching capabilities. We developed a rapid microplate-based colorimetric assay with the REE-chelating dye Arsenazo-III (As-III) to screen for REE extraction18,40,41 from a synthetic monazite powder (Fig. 2A, B) (monazite is the mineral host for REEs at the Mountain Pass Mine in California, one of the largest REE mines in the world10). Unlike natural monazite (which contains multiple light lanthanides), this synthetic mineral contains just neodymium (Nd)42, allowing greater consistency in the determination of extraction efficiency. Two synthetic monazite powder batches were synthesized (Mon1 and Mon2) as described by Balta et al.42, presenting minor differences in Nd-phosphate composition (Supplementary Data 2). Mon1 was composed of monazite-Nd (NdPO4), rhabdophane-Nd (NdPO4·2H2O), and neodymium oxyphosphate (Nd3PO7). In contrast, Mon2 consisted solely of monazite-Nd and rhabdophane-Nd.

Fig. 2: Direct bioleaching screening of the quality-controlled whole genome knockout collection of Gluconobacter oxydans identifies 68 gene disruptions not previously associated with bioleaching.
figure 2

A We developed a rapid 96-well plate based high-throughput screen for REE-bioleaching that uses a colorimetric assay with the Arsenazo III (As-III) REE-chelating dye in presence of synthetic monazite. This assay directly detects REE release, facilitating the identification of disruptions with differential bioleaching capabilities. B Color differences in each well identify which genes control bioleaching. Gene disruptions that reduce bioleaching leave the As-III dye unbound and pink, while disruptions that increase bioleaching turn the dye blue. C The genome-scale screening of the QC collection identified 89 hits including 68 gene disruptions not previously associated with the bioleaching capabilities of G. oxydans. D Gene Ontology (GO) Enrichment was performed with a Fisher’s Exact Test (p < 0.05, yellow dashed line) to analyze the hits identified. Significantly enriched GO terms with at least 2 representatives are displayed, and numbers above the bars represent the significant hits out of the total annotated genes associated with that gene ontology in the genome of G. oxydans (the full list of GO terms can be found in Supplementary Data 3D). Only terms belonging to the Biological Process and Metabolic Function GO groups were significantly enriched.

Mon1 demonstrated greater dissolution than Mon2 during bioleaching. This discrepancy is likely due to the elevated concentration of neodymium oxides in Mon1 (Supplementary Data 2), along with the higher thermal expansion and oxidation state of the Nd3PO743,44. We utilized both Mon1 and Mon2 in our screening to leverage their distinct characteristics. Mon1 enhanced the detection sensitivity for potential hits in the initial preliminary survey, thereby expanding the selection pool for subsequent validation. Mon2, being less reactive, allowed for the isolation of the most significant outliers in the initial survey, offering a closer analog to bioleaching with natural monazite.

Our preliminary survey (Supplementary Data 3A) identified 89 gene disruptions that produced significant changes in Nd extraction from the Mon1 synthetic monazite (Fig. 2C). These hits included disruptions of 21 genes previously linked to acidification16 and 68 genes that have not previously been linked to bioleaching (Supplementary Data 3B).

A follow-up screen of these 89 mutants found 6 mutants that produced statistically significant changes to Nd-extraction from the Mon2 synthetic monazite (Supplementary Data 3C, Table 2). We included 6 additional mutants for further analysis based on their prominence in the initial Mon1 Nd-extraction survey (Table 2).

Table 2 Selected gene disruptions for direct measurement of bioleaching

Metal binding and DNA repair gene ontologies are highly enriched among bioleaching hits

The 68 genes not previously associated with REE-bioleaching that were identified through the initial Mon1 REE-extraction survey were analyzed for gene ontology (GO) enrichment with topGO45. A total of four GO terms were found to be significantly enriched (p < 0.05 by Fisher’s Exact Test) and to be associated with more than one gene (Supplementary Data 3D). These four enriched GO terms belong to the biological process (BP) and molecular function (MF) groups. No significant terms belonging to the cellular component group were found.

The two most statistically significantly enriched GO terms are metal ion binding and nucleotide-excision repair (Fig. 2D, Supplementary Data 3D). We speculate that the disruption of these genes affects bioleaching indirectly by modifying the expression of metabolic pathways that produce metal-solubilizing compounds. For example, metal ion binding proteins are often associated with signaling, sensing, and regulation46,47,48,49, suggesting that their disruption could alter G. oxydans’ bioleaching capabilities. Similarly, disruption of the nucleotide-repair system could impair the ability of G. oxydans to respond to damage generated by oxidative stress resulting from initial biolixiviant production50,51 and thus alter further biolixiviant production.

Gene disruptions of interest demonstrate significant changes in bioleaching

The 12 gene disruption mutants that were highlighted by the Mon1 and Mon2 high-throughput assays (Table 2) were validated through bench-scale experiments where bioleaching was measured by Inductively Coupled Plasma—Mass Spectrometry (ICP-MS) (Fig. 3, Supplementary Data 4A). When compared to the proxy wild-type strain (pWT) (See “Materials and Methods”), 9 of the 12 disruption mutants generated significant changes in bioleaching (p < 0.05). Eight disruption mutants produced greater (with increases of between 56 and 111%) Nd-bioleaching than pWT, while only one mutant (δGO_1096) produced a reduction (−97%) in bioleaching (Fig. 3). When compared to the wild-type strain (WT), all 12 disruption mutants tested produced significantly different extractions (p < 0.01) (Supplementary Fig. 1), showing increases from 75 to 231%, and a 95% reduction by δGO_1096. The changes in extraction efficiency relative to WT after cell density normalization (Table S1) closely align with the results presented in Fig. 3. This complementary data supports our findings and validates our use of pWT as a reference strain.

Fig. 3: Identified G. oxydans gene disruptions raised REE-bioleaching by up to 111% or lowered it by 97%.
figure 3

Direct measurement of bioleaching extraction was performed on disruption mutants of interest following bioleaching experiments with synthetic monazite, using ICP-MS analysis. REE extractions were analyzed with pairwise comparisons among disruption mutants (n = 3) and pWT (n = 3). Levels of extraction significantly different from pWT are labeled with asterisks (*p  <  0.05; **p  <  0.01; ***p  <  0.001, ****p  <  0.0001) or “ns” if p > 0.05, denoting statistical significance after Bonferroni-corrected t-tests (N = 12). Blue squares indicate the mean extraction for each mutant, the center line denotes the median, boxes show the upper and lower quartiles, and whiskers extend to the range of data points within 1.5 times the interquartile range. Strains δGO_1162, δGO_1113, and δGO_1968 did not show statistically significant changes in extraction compared to pWT. Only one disruption mutant, δGO_1096, decreased bioleaching, demonstrating a 97% reduction in extraction. Current best-performing engineered strains, G. oxydans ΔpstS (single knockout) and G. oxydans P112:mgdh, ΔpstS (up-regulation and knockout), were included to contrast the extractions of the tested disruption mutants. Significant changes in extraction efficiency compared against the WT strain can be found in Supplementary Fig. 1.

Although the increases in Nd-bioleaching achieved by 9 of the newly identified gene disruptions are substantial, they are notably smaller than those produced by our current two best-performing strains—G. oxydans ΔpstS and G. oxydans P112:mgdh, ΔpstS15—which demonstrated 322% and 1186% increases over pWT (Fig. 3), and 564% and 1922% over WT, respectively (Supplementary Fig. 1).

Overexpression of gene GO_1096 increases bioleaching

We successfully engineered a markerless deletion strain, ΔGO_1096, and an overexpression strain, P112:GO_1096 (Supplementary Fig. 2), for further assessment of GO_1096’s effects on bioleaching performance. The overexpression strain, P112:GO_1096, demonstrated a significant increase in extraction efficiency compared to WT, with a 64% improvement (Fig. 5, Supplementary Data 4B), underscoring the positive regulatory role of GO_1096 in bioleaching. We performed relative gene expression analysis of P112:GO_1096, which revealed a 7.6-fold increase in GO_1096 expression (Supplementary Fig. 2B, C); however, this substantial upregulation did not directly translate into a proportional increase in extraction efficiency. Interestingly, while the disruption of GO_1096 led to a pronounced reduction in bioleaching (Fig. 3), ΔGO_1096 did not exhibit the same effects on extraction efficiency (Fig. 5).

Many gene disruptions of interest do not affect biolixiviant pH

The pH of the biolixiviants produced by the 12 gene disruption mutants selected were measured (Fig. 4). Unsurprisingly, given our focus on identifying mutants absent in an acid-production screen, yet still noteworthy, 9 of the 12 mutants did not exhibit significant changes to biolixiviant pH when compared to a proxy wild-type strain of G. oxydans. Only the δGO_102, δGO_1096, and δGO_1109 mutants produced statistically significant changes to biolixiviant pH (Fig. 4). δGO_102 and δGO_1109 produced very modest reductions in biolixiviant pH of 0.06 and 0.07 units respectively, which are only barely above the precision-threshold of a pH meter. δGO_1096 produced a large increase in biolixiviant pH of 0.55 units.

Fig. 4: Only three of the identified G. oxydans gene disruptions produce statistically significant changes in biolixiviant pH.
figure 4

REE extraction and biolixiviant pH were plotted for each strain selected for direct bioleaching measurement. Strain replicates that significantly changed the biolixiviant pH compared to proxy wild-type are indicated as follows, pWT “”, δGO_102 “□”, δGO_1109”, and δGO_1096 “Δ”. Other gene disruptions with data points falling within the model are represented as “•”, including data points for δGO_19, δGO_1113, δGO_1162, δGO_1598, δGO_1816, δGO_1968, δGO_1991, δGO_2555, and δGO_2946.

Discussion

The generation of the QC collection facilitates the implementation of cost-effective and elaborate screening methods52,53. The curated collection encompasses disruption mutants for 2733 genes, one single representative for almost every non-essential gene in G. oxydans’ genome16. The reduction in redundancy is crucial to improve the utility of a knockout collection, improving its value as a practical tool for the exploration of gene function53,54. This streamlined collection was essential for conducting a technically challenging genome-scale screen for REE-bioleaching, leading to the identification of previously uncharacterized aspects of this mechanism. To our knowledge, this study represents the first direct whole genome screening for bioleaching of any mined metal.

In total, we identified 89 gene disruptions involved in Nd-bioleaching from synthetic monazite. Notably, 21 genes implicated in acidification, previously reported16, were among the hits found through this screen, corroborating their direct involvement in REE-bioleaching. This outcome verifies the reliability of our As-III screen for Nd-bioleaching but also underscores some challenges related to reduced sensitivity. However, we also identified an additional 68 gene disruptions that were tested in our earlier assays but were not implicated in bioleaching. Out of 12 disruption mutants selected for further analysis, 9 did not produce any statistically significant change in pH (Fig. 4). This indicates that the new screen is more comprehensive than pH indicator screens for acid production. Furthermore, these results suggest that additional components might be present during bioleaching by G. oxydans13,30, potentially increasing solubilization.

Notable and significant changes in bioleaching were validated by ICP-MS through bench-scale experiments (Figs. 3 and 4). Out of the 12 mutants selected for verification, only one (δGO_1096) reduced extraction, suggesting that this gene normally facilitates bioleaching. In contrast, 11 of the mutants appear to regulate bioleaching, and their disruption removes the inhibitory controls on it.

Three results (δGO_1096, δGO_1991, and δGO_1968) imply that changes in stress response pathways can both improve or impair the bioleaching mechanism. This notion has been discussed by Sousa et al. after observing increased resistance to acetic acid at specific levels of oxidative stress in Saccharomyces cerevisiae55.

Disruption of the gene encoding Formamidopyrimidine-DNA Glycosylase Fpg (GO_1968; belonging to both the metal ion binding and nucleotide excision repair gene ontologies terms) increased bioleaching. Fpg is a glycosylase involved in the repair of DNA damage caused by oxidative stress56, and its disruption might lead to a heightened cellular response. For instance, it has been shown that certain levels of mistranslation can improve Escherichia coli’s tolerance to oxidative stress51. The metabolic response to accumulated DNA damage could involve transcription errors and shifts in secondary metabolite production. Such shifts might influence the types or quantities of solubilizing compounds produced by G. oxydans. Moreover, the gene coding for the Efflux ABC Transporter for Glutathione CycD (GO_1991) plays a role in redox balance within the periplasm during oxidative stress conditions57, and its disruption also increases bioleaching.

Notably, only a single gene disruption led to a decrease in bioleaching efficiency (δGO_1096). GO_1096 codes for the GTP-binding protein TypA which has been linked to stress response58. This suggests that its disruption might hamper G. oxydans’ ability to adapt to the stress generated during bioleaching, thereby diminishing efficiency. δGO_1096 was the only mutant that increased biolixiviant pH (out of only 3 that changed the pH at all). This result suggests TypA may facilitate acid-mediated bioleaching as well as non-acid bioleaching. We suspect that this gene was not found in our earlier acid-production screen16 because we swapped the GO_1096 disruption mutant used in our original curated collection16 for a new choice that was more likely to disrupt gene function. Interestingly, δGO_1096’s increase in biolixiviant pH is modest compared to those produced by mutants identified in our previous screen that produced similar reductions in REE-extraction. For example, disruption of the tldD gene, necessary for the synthesis of the PQQ molecule (a cofactor for the membrane-bound glucose dehydrogenase), produced a 92% reduction in REE-extraction (admittedly from a different substrate) but raised biolixiviant pH by ≈ 0.7 units16.

Furthermore, we observed that the disruption of three genes involved in gene expression increased bioleaching: GO_1162 encoding the phosphate regulon regulatory protein PhoB, GO_1109 encoding a LysR substrate binding domain, and GO_1816 encoding the transcriptional regulator YafY.

PhoB is involved in the regulation of inorganic phosphate (Pi) homeostasis59; it can positively or negatively regulate phosphate uptake in the presence of excess or limited Pi60. The disruption of GO_1162 only showed significant changes in bioleaching when compared to the wild-type strain (Fig. 3, Supplementary Fig. 1), indicating that this effect may not be fully caused by a loss of function. However, it is possible that disrupting this transcriptional activator can cause cells to lose the ability to modulate the phosphate regulon61, potentially leading to continuous phosphate-scavenging. Such dysregulation could explain the slight increase in bioleaching, though further investigation would be necessary to confirm this.

The LysR substrate binding domain binds specific ligands for signal recognition, and as a result, triggers the activity of transcriptional regulators62. Its disruption may influence gene repression or activation, altering the metabolic flux in a manner favorable for bioleaching. The regulatory protein YafY presents HTH (helix-turn-helix) and WYL (tryptophan, W; tyrosine, Y; and leucine, L) domains, which are known for their role in gene expression in response to DNA damage63. Disruption of the hypothetical protein GO_102 produced the second-largest increase in Nd-bioleaching (Fig. 3). blastp analysis64 revealed 100% sequence alignment between GO_102 and an HTH-domain containing protein, suggesting it might also be involved in gene expression. These disruptions may lead to significant changes in gene expression levels, which could explain the increase in extraction.

Both δGO_102 and δGO_1109 disruption mutants decreased biolixiviant pH but were not identified by our earlier acid production screen. The pH changes caused by these mutants are modest compared with prominent mutants from our earlier acid production screen (which decreased pH by 0.1 to 0.2 units)16. We suspect that these pH changes were too small to be detected by our earlier screen. This leads us to presume that these mutants may be simultaneously altering acid and non-acid mechanisms of bioleaching.

We speculate that disruption of the hypothetical protein GO_2946 might enhance bioleaching by increasing the permeability of the outer membrane of G. oxydans. blastp analysis64 showed homology of GO_2946 to a glycosyltransferase family 2 (GT2) protein, whose function involves the transfer of sugar moieties among intermediates during the biosynthesis of polysaccharides65. In a previous study, a GT2 protein was found to repress oxidative fermentation in Gluconacetobacter intermedius; and when disrupted, final yields of gluconic acid and acetic acid increased compared to the parental strain65. Changes in the polysaccharide composition of the cell surface might increase the permeability of the membrane, facilitating the diffusion of substrate, therefore enhancing transport of the biolixiviant out of the cell65.

The disruption of a TonB-dependent receptor gene (GO_19) increased bioleaching. This type of system is often associated with the uptake of iron bound to siderophores66. Under iron starvation conditions, bacterial cells tend to upregulate secretory mechanisms in an attempt to manage iron deficiency67,68. Disruption of this gene might alter iron sensing and transport, prompting the cell to increase biolixiviant production.

Disruption of genes involved in nucleotide metabolism increased bioleaching (GO_1598, GO_1113, and GO_2555). The largest increase in bioleaching was produced by δGO_1598. GO_1598 codes for the orotate phosphoribosyltransferase enzyme PyrE which catalyzes the production of orotidine monophosphate (OMP), using phosphoribosyl diphosphate (PRPP) and orotate69. One way that the disruption of GO_1598 could increase bioleaching is by increasing availability of orotate, which is known to chelate metals69, and in particular lanthanides70. GO_1113 codes for XdhC, an accessory protein involved in purine metabolism that binds to Molybdenum Cofactor (MoCo) for maturation of xanthine dehydrogenase71. We speculate that its disruption could create an accumulation of MoCo, and accelerate other enzymatic reactions used in the production of the biolixiviant. GO_2555 codes a hypothetical protein that demonstrated partial homology64 (E-value = 1 × 10−125) to the L-dopachrome tautomerase yellow-f2 protein (EC 5.3.2.3) involved in tyrosine metabolism, which interacts with quinone and flavonoids to produce melanin72,73,74. The enhanced bioleaching observed from this disruption may be attributed to the accumulation of intermediates, which can be utilized in other metabolic pathways associated with bioleaching.

Our results propose that the choice of bioleaching substrate plays a crucial role in determining the extent to which genetic modifications lead to measurable differences in bioleaching performance. This effect likely arises because of the biochemical activity of G. oxydans in the presence of different minerals, altering the metabolic pathways at play during direct bioleaching. For example, in our previous work on retorted phosphor powder, the largest single-target genetically induced improvement—an 18% increase—was achieved by disrupting the pstC gene16. Similarly, an engineered knockout/overexpression strain of G. oxydans (with a pstS deletion and mgdh overexpression via the P112 promoter) improved bioleaching from the mineral allanite by up to 73%15. However, in this study using monazite, single gene disruptions improved bioleaching by as much as 111%, while the same engineered knockout/overexpression strain resulted in a striking 1186% increase over proxy wild-type. These differences suggest that the interplay between microbial activity and metal solubilization is influenced by subtle substrate-associated factors that have yet to be fully elucidated.

The construction of an engineered strain overexpressing GO_1096 confirmed its potential as a promising target for enhancing REE extraction (Fig. 5), underscoring its involvement in the bioleaching mechanism. Although overexpression of GO_1096 resulted in a 7.6-fold increase in gene expression, the corresponding improvements in REE extraction were not strictly proportional, suggesting that GO_1096 likely operates within a broader genetic framework, potentially requiring the coordinated expression of additional genes to achieve optimal bioleaching effects. Moreover, the stark difference in performance between δGO_1096 and ΔGO_1096 strains indicates that the complete removal of GO_1096 does not drastically impair the bioleaching process, hinting at possible compensatory or redundant pathways that mitigate its loss, a phenomenon previously reported in other studies75,76. In addition, the partial disruption in δGO_1096 may inadvertently affect the nearby regulatory elements or neighboring genes, leading to a more pronounced impact on bioleaching. Further investigation would be necessary to fully elucidate the precise role of GO_1096 during bioleaching. Nevertheless, the observed benefits from its overexpression offer a promising outlook, indicating that targets identified in this study hold potential for meaningful improvements in bioleaching.

Fig. 5: Genome-scale screening successfully identifies genetic engineering target to improve bioleaching.
figure 5

Mutant strains consisting of a clean deletion and an overexpression of the gene GO_1096 were generated to test a selected hit as a target for genetic engineering. Extraction efficiency was determined for generated mutants following bioleaching experiments with synthetic monazite and compared to the wild-type (WT) strain. p-value is displayed for both sets of comparisons denoting statistical significance after unpaired t-tests performed against WT. Blue squares indicate the mean extraction for each strain, the center line denotes the median, boxes show the upper and lower quartiles, and whiskers extend to the range of data points within 1.5 times the interquartile range.

It remains unclear whether modification of other genes discussed in this article could lead to transformative advancements in bioleaching by G. oxydans. It is worth noting that when we first observed the effects generated by transposon disruption mutants, they were very modest compared to the improvements achieved by subsequently engineering the same target genes. For example, disruption of pstS only increased bioleaching from retorted phosphor powder by 4.3%16, whereas clean deletion of pstS improved bioleaching from allanite by 30%15. Remarkably, combination of this knockout with upregulation of mgdh increased bioleaching from allanite by 73%15. This suggests—though further testing is necessary—that multiple modifications inspired by the results presented here could have substantial impacts on bioleaching performance from certain substrates.

Conclusion

Our work pivots away from the well-characterized acidolysis mechanism to examine contributions to bioleaching via alternative routes. Our findings introduce additional genes that influence bioleaching yet remained undetected in acidification screens (Figs. 2 and 3). Most genes that we identified do not significantly affect biolixiviant pH (Fig. 4).

These results, along with earlier observations by Reed et al.21, strongly suggest that the bioleaching mechanism of G. oxydans is not purely acid-based. Our data suggests that disrupting genes involved in nucleotide synthesis, transcriptional regulation, DNA repair, nutrient uptake, and oxidative stress response could significantly impact bioleaching. But there is no clear evidence in this list that suggests a single gene coding for a redox molecule (e.g., a flavin) or chelator (e.g., a siderophore) amplifies the effectiveness of bioleaching.

The gene disruptions we have identified appear to be involved in regulatory functions. This suggests that if there are genes involved in complexolysis or redoxolysis within the G. oxydans genome, they could be multiply redundant, and that knocking out a single gene contributes only marginally to changes in bioleaching. Alternatively, these genes may possess some degree of essentiality, rendering their identification challenging. This highlights the need for broader and higher-sensitivity screening approaches that can capture a more diverse array of bioleaching-related genetic activities, potentially leading to enhanced methods for harnessing and optimizing the bioleaching mechanism.

Further research remains necessary to fully elucidate the complexities of G. oxydans’ biolixiviant and its interaction with REEs. The integration of advanced analytical techniques may be required to precisely characterize the metabolic impact of the strains examined in this study and to identify the key components involved in REE-bioleaching.

Materials and methods

Curation of the quality-controlled whole-genome knockout collection

The QC collection was built by removing redundancy from an earlier condensed collection (CC) of G. oxydans transposon mutants and supplementing with additional mutants from a saturating-coverage progenitor collection (PC) from the same work16. The progenitor collection of transposon mutants was cataloged using the Knockout Sudoku probabilistic inference combinatorial pooling method52,77.

The best-choice disruption mutant for each gene in the QC collection was selected by considering the location of the transposon disruption in the gene, and the reliability of the location assignment of the mutant. We followed the generally accepted consensus that the best location for a transposon to disrupt gene function is between 50 bp from the start of the coding region and the middle of the gene (a fractional distance of ≤ 0.5 into the gene). Furthermore, we preferred to select mutants whose location was directly inferred by the Knockout Sudoku algorithm (also known as unambiguous location inference52,77), rather than relying upon probabilistic location inference (also known as ambiguous location inference52,77). If given the choice between a mutant with non-ideal locations and unambiguous location inference, and a mutant in the same gene with ideal location and ambiguous location inference, we chose the former. Mutant genomic location statistics are shown in Table 1. A catalog of the QC collection is included in Supplementary Data 1.

Mutants were transferred from the earlier collections by fluid transfer with a colony picking robot (CP7200, Norgren Systems, Ronceverte, West Virginia, USA) into fresh 96-well polypropylene storage plates containing yeast peptone mannitol (YPM) media (5 g L−1 yeast extract (C7341, Hardy Diagnostics, Santa Maria, California, USA); 3 g L−1 peptone (211677, BD, Franklin Lakes, New Jersey, USA); 25 g L−1 mannitol (BDH9248, VWR Chemicals, Radnor, Pennsylvania, USA)) and 100 μg mL−1 kanamycin. Plates were incubated at 30 °C and 800 rpm in a Multitron shaker (Infors Inc., Laurel, Maryland, USA) for 3 nights to achieve saturation. Following incubation, stocks were prepared by adding glycerol (G5516, Sigma-Aldrich, St. Louis, Missouri, USA) to 20% (v/v) to each well for long-term storage at −80 °C.

Genome-scale REE-bioleaching screen

REE concentrations were measured using the REE-Chelating Dye Arsenazo-III (As-III; 110107, Sigma-Aldrich, St. Louis, Missouri, USA). The As-III solution was prepared following the method described by Hogendoorn et al.40.

The high-throughput screen consisted of a preliminary survey carried out in duplicate followed by a validation screen performed in triplicate. It spanned several weeks and was partitioned into batches of 10–15 plates. Plates containing disruption mutants to be analyzed were replicated from the QC collection using a 96-pin cryo-replicator press (CR1100, EnzyScreen, Heemstede, Netherlands) into 96-well polypropylene round-bottom plates filled with 150 µL YPM supplemented with 100 µg mL−1 kanamycin.

Replicated plates were incubated for two nights at 30 °C, 80% humidity, and 800 rpm in a Multitron shaker. Then, cultures were back-diluted either in duplicate or triplicate, by inoculating 5 µL of bacterial culture in 120 µL of the same medium using the PlateMaster pipetting system (F110761, Gilson Inc., Middleton, Wisconsin, USA). After two nights of incubation, the optical density was measured using a plate reader (18531, Biotek, Charlotte, Vermont, USA), and the biolixiviant production was set up by measuring the remaining volume in each well and adding glucose (0188, VWR Chemicals, Radnor, Pennsylvania, USA) to 20% w/v. Plates were incubated for 24 h at 30 °C, 80% humidity, and 800 rpm. For all the aforementioned steps, 96-well plates were covered with a breathable rayon film (60941-086, VWR Chemicals, Radnor, Pennsylvania, USA).

For bioleaching, 150 µL of biolixiviant was transferred from each well into plates containing synthetic monazite powder. Plates were sealed with an adhesive polypropylene sealing Film (PCR-TS, Axygen Scientific, Union City, CA) and incubated for 24 h at 30 °C and 800 rpm. Final plates containing the leachate were centrifuged at 3214 × g for 12 min to obtain the supernatant to be used for analysis.

For the preliminary survey, bioleaching was performed in presence of synthetic monazite mineral Mon1 (Supplementary Data 3A) at 0.33% pulp density to facilitate an initial identification of disruption strains with differential bioleaching capabilities. For the screen with Mon2 (Supplementary Data 3C), the pulp density was 1%. After centrifugation, a 1:10 dilution of the supernatant from each well was transferred to polystyrene flat-bottom 96-well plates filled with 100 µL of 60 µM As-III Solution. For validation with Mon2, 100 µL of the supernatant was used instead. Analysis plates were shaken for 1 min, then the absorbance at 650 nm was measured with the plate reader.

REE-bioleaching screen analysis

Several factors were considered to confirm the quality of the data collected, namely cross-contamination of wells, systematic differences among wells caused by their position in the plate, and additional errors generated during manipulation.

To address these challenges, the raw data collected through our preliminary high-throughput survey with Mon1 was analyzed with a correcting software78 to detect and remove systematic errors. This software allowed us to normalize our data based on Z-score transformation within plates79. To avoid false positives caused by mutants with large growth defects, wells with growth OD600 < 0.2 were excluded to avoid the potential offsets during the normalization process.

An initial hit identification was carried out by comparing the normalized absorbance values of the disruption mutants to the mean of the plate in which they were analyzed. Using a conservative threshold, disruption mutants with absorbance values deviating by more than 1.5 standard deviations from the plate mean were selected for further screening.

For our screen with Mon2, wells containing the selected disruption mutants were used to inoculate new 96-well plates filled with medium supplemented with antibiotic. To ensure reliable data collection, we arrayed the mutants in a checkerboard pattern, included each mutant in triplicate, avoided perimeter wells to reduce medium evaporation, and included a proxy wild-type strain within each plate.

Raw absorbance data was used for statistical analysis, which included pairwise comparisons conducted utilizing the emmeans package in R while applying a false discovery rate (FDR) correction for p-value adjustment for multiple comparisons80. Disruption mutants that demonstrated significant differences in REE-extraction were considered promising.

Gene ontology enrichment analysis

A gene ontology (GO) enrichment analysis was carried out as described by Schmitz et al.16. Enrichment was performed with the BioConductor topGO package45,81, employing the TopGO Fisher test with the default weight algorithm, applying a significance threshold of 0.05.

Direct REE-bioleaching measurements

Selected disruption mutants that included six significant hits identified through the REE-bioleaching screen and other six additional disruption mutants were used for further validation through bench-scale bioleaching experiments82 with Mon2. Additional disruption mutants were picked based on the features disrupted that represented a trait of interest, these included specific enzymes or disruptions with interesting behavior throughout the screen.

A total of 12 disruption mutant strains were plated on YPM and kanamycin agar (05039, VWR Chemicals, Radnor, PA) for colony isolation. Three biological replicates for each disruption mutant were used to inoculate 3 mL cultures, as well as for wild-type controls in addition to no-bacteria control. After incubation for two nights at 30 °C and 200 rpm in a shaking incubator (29314, Infors Inc., Laurel, MD), cultures were normalized to OD590 0.01 in 50 mL flasks with a final volume of 5 mL and grown for another two nights. Following incubation, the cultures were diluted 1:1 with glucose 40% (w/v) to reach a final concentration of 20%. Flasks were incubated at 30 °C and 200 rpm for 48 h to allow biolixiviant production.

Bioleaching experiments were set up to digest Mon2 at 1% pulp density, for 24 h at 30 °C and 200 rpm. Once obtained, the leachate was filtered through a 0.45 µm AcroPrep Advance 96-well filter plate (8029, Pall Corporation, Show Low, AZ, USA) by centrifugation at 1500 × g for 5 min.

Each sample underwent a 20-fold dilution using 2% trace metal grade nitric acid (JT9368, J.T. Baker, Radnor, Pennsylvania, USA). Neodymium concentration in each sample was measured using an Agilent 7800 ICP-MS. Calibration and verification utilized a REE mix standard (67349, Sigma-Aldrich, St. Louis, Missouri, USA) and a rhodium internal standard (04736, Sigma-Aldrich, m/Z  =  103). Rigorous quality checks included intermittent assessment of standards, blanks, and replicate samples83. Data interpretation was carried out using MassHunter software, version 4.5.

A proxy wild-type strain (pWT) was selected for comparison to account for changes in the wild-type background introduced by the presence of the antibiotic cassette in the genome. The selected pWT contains a transposon insertion in an intergenic region (at base pair 1,262,471; between genes GO_2147 and GO_2148) and does not demonstrate any notable phenotypic changes. This strain can be located in plate 130, well A2 in the progenitor collection.

For each disruption mutant, bioleaching measurements (n  =  3) were evaluated against those of the wild-type and proxy wild-type strains (n  =  3). This was done performing pairwise comparisons with Bonferroni correction using the emmeans package in R. The total REE extracted was deemed significantly different if p  <  0.05/N (N = 12).

Genetic engineering

The gene GO_1096 was selected as the target for genetic engineering experiments to validate screen results. Engineered strains generated in this study included the clean deletion of GO_1096 and the overexpression of GO_1096 via the P112 promoter. Genetic modifications were introduced using the codA/codB-based counterselection system84 for allelic exchange. The suicide vector pKOS6b was linearized with XbaI (FD0684, Thermo Fisher Scientific, Waltham, MA) and used as the backbone for cloning.

To generate a deletion of GO_1096, a thousand base pair regions immediately upstream and downstream of the target locus were amplified and assembled into the linearized pKOS6b using the HiFi DNA Assembly Kit (E5520, New England Biolabs, Ipswich, MA). For transcriptional upregulation of GO_1096, the promoter P112, flanked by a thousand base pairs upstream and downstream of the GO_1096 start site, was cloned into pKOS6b.

Plasmids were transformed into G. oxydans by electroporation, following the protocol described by Mostafa et al.85. Briefly, G. oxydans cells were harvested at an optical density of 0.8 and washed three times with 1 mM HEPES buffer. The final pellet was resuspended in 250 µL of HEPES and glycerol to 12%, flash-frozen in liquid nitrogen, and stored at −80 °C. For transformation, thawed cells were electroporated with 250 ng of plasmid DNA at 2 kV for 3 ms, using the Bio-Rad MicroPulser (1652100, Bio-Rad, Hercules, CA). Cells were then recovered in 800 µL of EP medium (80 g L−1 mannitol, 15 g L−1 yeast extract, 2.5 g L−1 MgSO4·7H2O, 1.5 g L−1 CaCl2, and 0.5 g L−1 glycerol) and incubated overnight. Following recovery, cells were plated on EP agar supplemented with 100 µg mL−1 kanamycin to select for integrants. Colonies were subsequently patched onto YPM agar containing 10 µg mL−1 5-fluorocytosine (5-FC) (F0321, VWR Chemicals, Radnor, PA) for counterselection. Colonies emerging on 5-FC plates were screened using colony PCR. Successful mutants were confirmed by sequencing to verify the intended modification. Strains and plasmids used in this work are listed in Table S2, and oligonucleotides in Table S3.

Gene expression analysis

To assess the effectiveness of the promoter in the overexpressed strain, changes in expression levels were determined by real-time quantitative polymerase chain reaction (RT-qPCR) with QuanStudio7 Pro (Thermo Fisher Scientific, Waltham, MA). Extraction of microbial RNA was achieved using the RNeasy mini kit (74524, Qiagen, Hilden, Germany), following the manufacturer’s protocols for enzymatic lysis and proteinase K digestion of bacteria, and purification of total RNA from bacterial lysate. cDNA synthesis and amplification were conducted using the Luna Universal One-Step RT-qPCR kit (E3005, New England Biolabs, Ipswich, MA). The gap gene was used as the internal reference (housekeeping gene) for normalization. All experiments were performed in triplicate with biological replicates.

Limitations of the study

During high-throughput screening, growth hindrance due to strain disruption poses considerable challenges. For example, despite the pstS gene disruption being flagged as a key player in REE-bioleaching during preliminary testing, its pronounced effect on microbial growth required its exclusion from final analyses because it did not meet the established growth threshold for inclusion. Future studies could address these growth-related issues by incorporating an upregulation collection into the screening process. This approach would not only enhance the insights gained from knockout collections but also allow for the potential identification of essential genes as significant contributors to bioleaching.

Other limitations involve the choice of the colorimetric reporter As-III. This dye was selected based on extensive evidence of its high binding affinity to REE; however, while the formation constant of this chelator is relatively high86,87, there is potential for underreporting of hits containing stronger chelators that could interfere with dye-REE complex formation. As a result, the associated gene disruptions may not have been identified due to competition in REE binding during screening. Additional screening strategies would be necessary to account for such competitive interactions.

Furthermore, confirmation of gene involvement of hits identified in this study will require further experiments, including gene deletion and complementation, to fully evaluate gene function and interaction.

While our study adds to the growing body of work on microbial metal recovery, the application of engineered microbes in bioleaching is still emerging. Successes have been noted with engineered strains in controlled settings such as stirred-tank reactors. Nonetheless, advancing this technology will require careful consideration of containment strategies and regulatory adjustments to appropriately mitigate potential risks.

Statistics and reproducibility

In our genome-scale screen, statistical robustness was ensured through multiple replicates and a rigorous validation approach. Our screen included a preliminary survey carried out in duplicate, followed by a validation screen performed in triplicate. This multistep approach allowed for the establishment of robustness in our findings. To control for potential biases, we considered factors such as cross-contamination of wells, position-related systematic noise, and errors during manipulation.

The data from our preliminary survey with Mon1 was corrected using specialized software to remove systematic errors78, and Z-score transformation was applied within plates for normalization. Wells that showed growth OD600 below 0.2 were excluded to avoid potential offsets during this normalization process. Disruption mutants exhibiting changes of 1.5 standard deviations from the plate mean were selected for further screening.

In the subsequent validation screen with Mon2, disruption mutants were subjected to a more stringent validation process. This included a balanced distribution of mutants across the plates, triplicate data collection, and within-plate comparison strains. Statistical significance for the REE-extraction capabilities of each mutant was determined using pairwise comparisons with the emmeans package in R and applying a false discovery rate (FDR) correction for p-value adjustment in multiple comparisons.

Bench-scale bioleaching experiments further confirmed the reproducibility of our findings, with each mutant strain’s performance benchmarked against that of the wild-type and the proxy wild-type strains. Neodymium extraction levels measured by ICP-MS were compared pairwise, with significance established at p < 0.05 after Bonferroni correction (N = 12). These stringent statistical analyses ensured that the results presented were both robust and reproducible, providing a reliable foundation for future explorations into the genetic determinants of REE-bioleaching.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.