Fig. 4

The process of caption generation and report writing. The image and a default prompt are given as input to the MLLM, which outputs the corresponding caption for that image. In practice, clinicians can revise and use the MLLM-generated captions when writing the bronchoscopy examination report.
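
The flow in this figure can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this work: the `MLLM` callable interface, the function names `generate_caption` and `draft_report`, and the text of `DEFAULT_PROMPT` are all assumptions introduced here, since the caption does not specify the model API or the actual default prompt.

```python
from typing import Callable, Dict, List, Optional

# Placeholder prompt: the actual default prompt is not given in this caption.
DEFAULT_PROMPT = "Describe the findings in this bronchoscopy image."

# Hypothetical interface: any callable taking (image_bytes, prompt) and returning text.
MLLM = Callable[[bytes, str], str]


def generate_caption(image: bytes, mllm: MLLM, prompt: str = DEFAULT_PROMPT) -> str:
    """Send one image plus the prompt to the model and return its caption."""
    return mllm(image, prompt).strip()


def draft_report(images: List[bytes], mllm: MLLM,
                 revisions: Optional[Dict[int, str]] = None) -> str:
    """Caption each image, then apply clinician revisions keyed by image index."""
    revisions = revisions or {}
    lines = []
    for i, image in enumerate(images):
        caption = generate_caption(image, mllm)
        # A clinician's revision, when present, replaces the model's caption.
        lines.append(f"Image {i + 1}: {revisions.get(i, caption)}")
    return "\n".join(lines)
```

Passing the model as a plain callable keeps the sketch independent of any particular MLLM library, and keying revisions by image index mirrors the step in which the clinician edits individual captions before they enter the report.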