Fig. 7: The overview of our proposed datasets and benchmarks. | Nature Communications

From: Towards generalist foundation model for radiology by leveraging web-scale 2D&3D medical data

a Overview of the Medical Multimodal Dataset (MedMD). Our collected data cover most radiologic modalities and anatomical regions of the human body, including the brain, head and neck, thorax, spine, abdomen, upper limb, lower limb, and pelvis. The dataset combines two types of data: interleaved datasets and visual instruction datasets. \(\mathcal{T}\) denotes the text of interleaved data, \(\mathcal{I}\) the instruction input text, and \(\mathcal{R}\) the response text. b Data statistics of RadMD and RadBench. The left panel shows the distribution of modalities in RadMD, the center panel shows the distribution of 2D and 3D sample pairs in RadMD, and the right panel shows the distribution of anatomical regions among the samples in RadBench.
