Fig. 3: ChatClimate data pipeline: from creating the external memory and receiving questions to returning accurate answers grounded in IPCC AR6.
From: ChatClimate: Grounding conversational AI in climate science

The black arrows show the sequence of tasks in the ChatClimate pipeline. LangChain is the Python library we used for splitting text into smaller chunks. Tiktoken is OpenAI’s tokenizer. ‘text-embedding-ada-002’ is the embedding model from OpenAI. GPT-4 is the large language model.
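
The sequence of tasks in the figure (chunk the text, embed the chunks into an external memory, retrieve the chunks closest to a question, pass them to the language model) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the ChatClimate implementation: the word-based splitter stands in for LangChain and Tiktoken, the word-count vector stands in for ‘text-embedding-ada-002’, and the final step only assembles a prompt string instead of calling GPT-4. All function names, parameters, and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=12, overlap=3):
    """Split text into overlapping chunks of whitespace-separated words.

    Stand-in for a real text splitter; sizes here are counted in words,
    not in tokenizer tokens.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy embedding: a sparse word-count vector (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_memory(text, chunk_size=12, overlap=3):
    """Create the external memory: a list of (chunk, embedding) pairs."""
    return [(c, embed(c)) for c in split_into_chunks(text, chunk_size, overlap)]


def retrieve(memory, question, k=2):
    """Return the k chunks whose embeddings are most similar to the question."""
    q = embed(question)
    ranked = sorted(memory, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, passages):
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

In a real deployment the memory would be built once from the report text and stored in a vector database, so that only the retrieval and prompting steps run for each incoming question.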