Extended Data 2: Design of the natural language processing information extraction model.
From: Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence

Sentences segmented from the raw free text of the electronic health record (EHR) were embedded using word2vec. A long short-term memory (LSTM) model then generated structured records in a query–answer format. The schematic illustrates the process using the free-text example ‘lesion in the upper left lobe of patient’s lung’.
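
The data flow described in this caption can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the embedding table is filled with random vectors standing in for trained word2vec vectors, the LSTM weights are untrained, the dimensions `EMBED_DIM` and `HIDDEN_DIM` are arbitrary toy values, and the `example_record` at the end is a hypothetical query–answer pair shown only to suggest the output format. The names `tokenize`, `build_embeddings`, `LSTMCell` and `encode` are likewise invented for this sketch.

```python
import math
import random

random.seed(0)

EMBED_DIM = 8    # assumed toy size; trained word2vec vectors are usually larger
HIDDEN_DIM = 6   # assumed toy size for the LSTM hidden state


def tokenize(sentence):
    """Lowercase whitespace tokenizer (a simplification of real segmentation)."""
    return sentence.lower().split()


def build_embeddings(tokens):
    """Stand-in for a trained word2vec table: one random vector per word."""
    vocab = sorted(set(tokens))
    return {tok: [random.uniform(-1, 1) for _ in range(EMBED_DIM)] for tok in vocab}


def random_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Standard LSTM cell with input (i), forget (f), candidate (g), output (o) gates."""

    def __init__(self, input_dim, hidden_dim):
        # Untrained weights: a real model would learn these from annotated records.
        self.W = {k: random_matrix(hidden_dim, input_dim + hidden_dim) for k in "ifgo"}
        self.b = {k: [0.0] * hidden_dim for k in "ifgo"}

    def step(self, x, h, c):
        z = x + h  # concatenate the input vector with the previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])] for k in "ifgo"}
        i_gate = [sigmoid(v) for v in pre["i"]]
        f_gate = [sigmoid(v) for v in pre["f"]]
        cand = [math.tanh(v) for v in pre["g"]]
        o_gate = [sigmoid(v) for v in pre["o"]]
        c_new = [f * cv + i * g for f, cv, i, g in zip(f_gate, c, i_gate, cand)]
        h_new = [o * math.tanh(cv) for o, cv in zip(o_gate, c_new)]
        return h_new, c_new


def encode(sentence, cell, embeddings):
    """Run the LSTM over the embedded tokens and return one hidden state per token."""
    h = [0.0] * HIDDEN_DIM
    c = [0.0] * HIDDEN_DIM
    states = []
    for tok in tokenize(sentence):
        h, c = cell.step(embeddings[tok], h, c)
        states.append(h)
    return states


sentence = "lesion in the upper left lobe of patient's lung"
tokens = tokenize(sentence)
embeddings = build_embeddings(tokens)
cell = LSTMCell(EMBED_DIM, HIDDEN_DIM)
states = encode(sentence, cell, embeddings)

# Hypothetical illustration of a structured query-answer record; the actual
# schema and query wording used in the paper are not given in this caption.
example_record = {"query": "lung lesion location", "answer": "upper left lobe"}

print(len(tokens), "tokens ->", len(states), "hidden states of size", len(states[0]))
```

In a trained system, the per-token hidden states would feed an output layer that fills in the answer for each predefined query; that layer is omitted here because the caption does not describe it.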