Fig. 1: Working principle of semantic regularization based on a pre-trained LLM.
From: Semantic regularization of electromagnetic inverse problems

a Illustration of solving Eq. (1) with semantic regularization. The blue gradient surface represents the solution space in the semantic manifold \(\mathscr{S}\). The measurements determine the white isolines, which represent the data misfit, i.e., the first term on the right-hand side of Eq. (1). The green surface represents the semantic regularizer, which is determined mainly by the semantic prior \({\boldsymbol{\alpha}}_{0}\), i.e., the second term on the right-hand side of Eq. (1). The optimal solution \({\boldsymbol{\alpha}}\) is found at the intersection of the data-misfit isoline and the semantic regularizer. b Encoder-decoder neural-network architecture for reconstructing the unknown x with semantic regularization. The encoder maps the measurements y to a pair \((\Delta{\boldsymbol{\alpha}},{\boldsymbol{\alpha}}_{0})\), where \({\boldsymbol{\alpha}}_{0}\) is the semantic prior and \({\boldsymbol{\alpha}}={\boldsymbol{\alpha}}_{0}+\Delta{\boldsymbol{\alpha}}\) is the semantically embedded reconstructed unknown, as illustrated in (a). The decoder maps a pair \((\Delta{\boldsymbol{\alpha}},{\boldsymbol{\alpha}}_{0})\) to the reconstructed unknown in the original space. The encoder-decoder network is trained with the multi-step procedure outlined in Supplementary Note 1. In the first step, the loss function defined in Eq. (2), composed of the three terms highlighted in red, is minimized. A subsequent GAN-based training step refines the encoder so that it outputs reasonable semantic priors.
Once trained, the encoder directly outputs a recommended semantic prior \({\boldsymbol{\alpha}}_{0}\) for a given measurement, which, importantly, can be manually replaced by \({\boldsymbol{\alpha}}_{0}'\) as required in the different contexts explored in this work, before the decoder maps \((\Delta{\boldsymbol{\alpha}},{\boldsymbol{\alpha}}_{0}')\) to a reconstructed unknown in the original space.
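The data flow described in panel (b) can be sketched as follows. This is a minimal illustrative stand-in, not the authors' implementation: the trained encoder and decoder are replaced by hypothetical random linear maps, and all dimensions (`N_Y`, `N_ALPHA`, `N_X`) are assumptions chosen only to show the shapes of the objects involved, including the manual swap of the prior \({\boldsymbol{\alpha}}_{0}\to{\boldsymbol{\alpha}}_{0}'\) before decoding.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: measurement space, semantic embedding, object space.
N_Y, N_ALPHA, N_X = 32, 8, 64

# Hypothetical linear stand-ins for the trained encoder/decoder weights.
W_enc_delta = rng.standard_normal((N_ALPHA, N_Y)) * 0.1
W_enc_prior = rng.standard_normal((N_ALPHA, N_Y)) * 0.1
W_dec = rng.standard_normal((N_X, 2 * N_ALPHA)) * 0.1

def encoder(y):
    """Map measurements y to the pair (delta_alpha, alpha_0), cf. panel (b)."""
    return W_enc_delta @ y, W_enc_prior @ y

def decoder(delta_alpha, alpha_0):
    """Map a pair (delta_alpha, alpha_0) back to the original object space."""
    return W_dec @ np.concatenate([delta_alpha, alpha_0])

y = rng.standard_normal(N_Y)          # a measurement vector
delta_alpha, alpha_0 = encoder(y)

# The semantically embedded reconstructed unknown: alpha = alpha_0 + delta_alpha.
alpha = alpha_0 + delta_alpha

# Default reconstruction uses the recommended prior ...
x_rec = decoder(delta_alpha, alpha_0)

# ... but the prior can be swapped manually (alpha_0 -> alpha_0') before decoding.
alpha_0_prime = np.zeros(N_ALPHA)     # e.g. a user-chosen semantic prior
x_rec_prime = decoder(delta_alpha, alpha_0_prime)
```

The point of the sketch is only the interface: the encoder's two outputs are kept separate precisely so that \({\boldsymbol{\alpha}}_{0}\) can be overridden independently of \(\Delta{\boldsymbol{\alpha}}\).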