Table 6 Experimental results of GPT-3.5 and GPT-4 with input prompt.

From: Toward a stable and low-resource PLM-based medical diagnostic system via prompt tuning and MoE structure

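| Task | Features | GPT-3.5 | GPT-4 |
|------|----------|---------|-------|
| AD-D | All F. | 60.85 | 71.55 |
| | Selected F. | 64.75 | \(\textbf{76.75}\) |
| | Easy F. | 60.05 | 68.10 |
| | Biological F. | 37.80 | 48.85 |
| AD-P | All F. | 62.50 | 66.25 |
| | Selected F. | 63.75 | 65.00 |
| | Easy F. | 65.00 | \(\textbf{72.50}\) |
| | Biological F. | 65.00 | 67.50 |
| ICU | | 58.39 | 58.39 |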

  1. For reference, the largest category accounts for 44.75% in the Alzheimer’s disease diagnosis task (AD-D), 65.00% in the Alzheimer’s disease progression prediction task (AD-P), and 58.39% in the ICU death prediction task (ICU).
  2. The best result for each task is marked in bold.
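The reference shares in note 1 are easiest to use as a baseline subtracted from each score. The following is a minimal sketch of that arithmetic, assuming the table entries are accuracy percentages (the table does not name the metric). The names `MAJORITY_BASELINE`, `ROWS`, and `margins_over_baseline` are hypothetical and used only for illustration; the numbers are copied from the table and note 1.

```python
# Largest-category shares from note 1, in percent.
MAJORITY_BASELINE = {"AD-D": 44.75, "AD-P": 65.00, "ICU": 58.39}

# (task, features, GPT-3.5, GPT-4) rows copied from Table 6.
# The ICU row lists no feature subset, so it is stored as None.
ROWS = [
    ("AD-D", "All F.", 60.85, 71.55),
    ("AD-D", "Selected F.", 64.75, 76.75),
    ("AD-D", "Easy F.", 60.05, 68.10),
    ("AD-D", "Biological F.", 37.80, 48.85),
    ("AD-P", "All F.", 62.50, 66.25),
    ("AD-P", "Selected F.", 63.75, 65.00),
    ("AD-P", "Easy F.", 65.00, 72.50),
    ("AD-P", "Biological F.", 65.00, 67.50),
    ("ICU", None, 58.39, 58.39),
]


def margins_over_baseline(rows, baseline):
    """Return (task, features, gpt35_margin, gpt4_margin) in percentage points.

    Assumption: scores and baselines are comparable percentages, so the
    margin is a plain difference rounded to two decimals.
    """
    out = []
    for task, features, gpt35, gpt4 in rows:
        base = baseline[task]
        out.append((task, features, round(gpt35 - base, 2), round(gpt4 - base, 2)))
    return out


if __name__ == "__main__":
    for task, features, m35, m4 in margins_over_baseline(ROWS, MAJORITY_BASELINE):
        label = features or "-"
        print(f"{task:5s} {label:14s} GPT-3.5 {m35:+6.2f}  GPT-4 {m4:+6.2f}")
```

Under this reading, a positive margin means the score exceeds the share of the largest category, a zero margin means it equals that share (as in the ICU row for both models), and a negative margin means it falls below it.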