Table 6 Prompt for LLM-driven binarization of COVID-19 symptoms referenced in audio transcripts
From: Generative AI and unstructured audio data for precision public health
The task is to analyze a health summary provided as input and return a Python dictionary with the following symptoms as keys:
1. Runny nose
2. Fever
3. Loss of appetite
4. Loss of smell
5. Sore throat
6. Nausea
7. Headache
8. Diarrhea
9. Non-productive cough
10. Productive cough
11. Muscle aches
12. Fatigue
13. Shortness of breath
14. Joint pain
15. Chest pain
16. Loss of taste
17. Abdominal pain
For each symptom:
Set the value to ‘1’ if the symptom is explicitly mentioned as being present at any point in the summary, even if the summary later mentions that the speaker is no longer experiencing this symptom (e.g., “The speaker described a sore throat” indicates a sore throat).
Set the value to ‘1’ if there are indirect references to the symptom (including any synonyms) and it is implied as being present (e.g., “The speaker described a scratchy throat” or “The speaker described a hoarse voice” indicates a sore throat).
Set the value to ‘0’ if the symptom is explicitly negated or stated as absent (e.g., “The speaker denied having a sore throat” or “The speaker was worried about a sore throat but did not have one”).
Set the value to ‘0’ if neither direct nor indirect references to the symptom are present in the summary.
Be comprehensive in interpreting both direct and indirect references to symptoms, as well as absences of symptoms. Return only the dictionary in Python format.
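
Because the prompt asks the model to return only a Python dictionary, the reply can be checked mechanically before it is used. The following is a minimal sketch of one way to do that, not the authors' pipeline: the names `SYMPTOMS` and `parse_symptom_response` are hypothetical, and two behaviors are assumptions made for illustration (the dictionary is extracted from between the outermost braces, and a missing symptom key is treated as an error rather than as ‘0’).

```python
import ast

# The 17 symptom keys listed in the prompt, in the same order.
SYMPTOMS = [
    "Runny nose", "Fever", "Loss of appetite", "Loss of smell",
    "Sore throat", "Nausea", "Headache", "Diarrhea",
    "Non-productive cough", "Productive cough", "Muscle aches",
    "Fatigue", "Shortness of breath", "Joint pain", "Chest pain",
    "Loss of taste", "Abdominal pain",
]


def parse_symptom_response(response_text: str) -> dict:
    """Parse a model reply that should contain a Python dict literal.

    Hypothetical helper: returns a dict mapping each of the 17 symptoms
    to the integer 0 or 1, or raises ValueError if the reply is unusable.
    """
    # Assumption: tolerate extra text or code fences around the dictionary
    # by keeping only the span between the first "{" and the last "}".
    start = response_text.find("{")
    end = response_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no dictionary literal found in response")

    # literal_eval evaluates literals only, so no arbitrary code is executed.
    try:
        parsed = ast.literal_eval(response_text[start:end + 1])
    except (SyntaxError, ValueError) as exc:
        raise ValueError(f"response is not a valid Python literal: {exc}")
    if not isinstance(parsed, dict):
        raise ValueError("response literal is not a dictionary")

    result = {}
    for symptom in SYMPTOMS:
        # Assumption: a missing key is an error, not an implicit 0.
        if symptom not in parsed:
            raise ValueError(f"missing symptom key: {symptom!r}")
        value = parsed[symptom]
        # Accept both integer and string forms of 0 and 1.
        if value not in (0, 1, "0", "1"):
            raise ValueError(f"non-binary value for {symptom!r}: {value!r}")
        result[symptom] = int(value)
    return result
```

Rejecting replies with missing keys is the stricter choice; a pipeline that prefers to default absent symptoms to 0, in line with the prompt's last rule, could replace that check with a default value instead.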