Fig. 4 | Scientific Reports

Fig. 4

From: Fine-tuned large language models with structured prompts enable efficient construction of lung cancer knowledge graphs

The three-stage framework for constructing the Lung Cancer Knowledge Graph (LCKG) using an LLM. (a) Information extraction: The fine-tuned KGLM, enhanced with prompt engineering, extracts knowledge triplets from unstructured web data. The prompt provides an instruction and an example of the desired (head, relation, tail) output format. (b) Knowledge fusion: The extracted triplets are integrated with semi-structured clinical data and structured public graph data. This fusion involves rule-matching, entity alignment, and a final quality assessment, followed by manual cleaning of edge cases. (c) Storage and visualization: The unified and cleaned triplets are stored in a Neo4j graph database, which supports Cypher queries and visualization tools.
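
The three stages in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt wording, the `ALIASES` table, the `Entity` label, the `RELATED` relationship type, and every function name are hypothetical, and the model call itself is left out, so the sketch starts from a string standing in for the model's output. Entity alignment is reduced here to lowercasing plus an alias lookup, which is far simpler than the rule-matching and quality assessment the figure describes.

```python
import re
from typing import NamedTuple


class Triplet(NamedTuple):
    head: str
    relation: str
    tail: str


# (a) Information extraction: an instruction plus one example of the
# desired (head, relation, tail) format. The wording is an assumption.
PROMPT_TEMPLATE = (
    "Extract knowledge triplets from the text below.\n"
    "Return one triplet per line as (head, relation, tail).\n"
    "Example: (gefitinib, treats, non-small cell lung cancer)\n\n"
    "Text: {text}\n"
)

TRIPLET_RE = re.compile(r"\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*,\s*([^()]+?)\s*\)")


def build_prompt(text: str) -> str:
    """Fill the hypothetical prompt template with the source text."""
    return PROMPT_TEMPLATE.format(text=text)


def parse_triplets(model_output: str) -> list[Triplet]:
    """Pull every '(head, relation, tail)' group out of the model output."""
    return [Triplet(*m.groups()) for m in TRIPLET_RE.finditer(model_output)]


# (b) Knowledge fusion: a toy alias table stands in for entity alignment.
ALIASES = {"nsclc": "non-small cell lung cancer"}  # hypothetical


def align(entity: str) -> str:
    """Normalize case and whitespace, then map known aliases to one name."""
    key = " ".join(entity.lower().split())
    return ALIASES.get(key, key)


def fuse(*sources: list[Triplet]) -> list[Triplet]:
    """Merge triplets from several sources, dropping aligned duplicates."""
    seen: set[Triplet] = set()
    fused: list[Triplet] = []
    for source in sources:
        for t in source:
            aligned = Triplet(align(t.head), t.relation.strip().lower(), align(t.tail))
            if aligned not in seen:
                seen.add(aligned)
                fused.append(aligned)
    return fused


# (c) Storage: Cypher cannot take a relationship type as a parameter,
# so this sketch stores the relation name as a property instead.
MERGE_QUERY = (
    "MERGE (h:Entity {name: $head}) "
    "MERGE (t:Entity {name: $tail}) "
    "MERGE (h)-[:RELATED {type: $relation}]->(t)"
)


def to_cypher(t: Triplet) -> tuple[str, dict[str, str]]:
    """Return a parameterized Cypher statement and its parameters."""
    return MERGE_QUERY, t._asdict()


model_output = "(Gefitinib, treats, NSCLC)\n(gefitinib, Treats, non-small cell lung cancer)"
public_graph = [Triplet("EGFR", "associated_with", "NSCLC")]
fused = fuse(parse_triplets(model_output), public_graph)
```

Each `(query, params)` pair returned by `to_cypher` could then be passed to a Neo4j session; using `MERGE` rather than `CREATE` keeps repeated loads from duplicating nodes or edges.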
