Figure 6

From: Toward a stable and low-resource PLM-based medical diagnostic system via prompt tuning and MoE structure

Frameworks of MLM-based pre-training, DLM-based pre-training, fine-tuning and prompt tuning. (a) The original input, in which the words "feel", "make" and "on" are masked randomly. (b) The process of MLM-based pre-training: the masked words are replaced with "[MASK]", and the transformer regenerates them. (c) The process of DLM-based pre-training: the masked words are replaced by other words produced by a generator. After the transformer, every token is judged by a discrimination head, named the DLM-Head. In this example, the words "felt" and "eat" are judged to have been replaced, while the other words are judged to be original (not replaced). (d) An example of fine-tuning, where the word embedding corresponding to the token "[CLS]" is used to determine which category the sentence belongs to. (e) An example of prompt tuning, where a prompt designed according to the categories is spliced after the original input, and the discriminator then judges whether the tokens corresponding to the categories have been replaced or not. Moreover, the processes in (c) and (e) share the discrimination head (DLM-Head).
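To make the shared-head idea in panels (c) and (e) concrete, the sketch below shows one possible way a token-level discrimination head could be reused for prompt tuning: the category prompt is spliced after the input, and the category whose token is judged most "original" by the shared head gives the prediction. This is a minimal illustration under assumed interfaces; the names (DLMHead, prompt_tuning_scores, the generic encoder) are hypothetical and not the authors' released code.

```python
import torch
import torch.nn as nn

class DLMHead(nn.Module):
    """Hypothetical token-level discrimination head, shared by DLM-based
    pre-training (c) and prompt tuning (e): for each token representation
    it outputs a logit for 'original' vs. 'replaced'."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.GELU(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden) -> logits: (batch, seq_len)
        return self.classifier(hidden_states).squeeze(-1)


def prompt_tuning_scores(encoder, dlm_head, input_ids, prompt_ids, category_positions):
    """Splice the category prompt after the original input, encode the whole
    sequence, and read the discrimination logits at the category-token positions.
    `encoder` is assumed to map token ids (batch, seq) to hidden states
    (batch, seq, hidden); the category with the most 'original'-looking token
    is taken as the predicted class."""
    spliced = torch.cat([input_ids, prompt_ids], dim=1)   # (batch, seq + prompt)
    hidden = encoder(spliced)                             # (batch, seq + prompt, hidden)
    logits = dlm_head(hidden)                             # (batch, seq + prompt)
    return logits[:, category_positions]                  # (batch, num_categories)
```

Because the same DLM-Head scores tokens in both pre-training and prompt tuning, no new classification layer has to be trained from scratch, which is consistent with the low-resource setting the figure illustrates.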