Extended Data Fig. 9: Supervised clustering model predicting the gluten-specific T cell profile.

a, Diagram illustrating the workflow for model training and prediction. PBMC samples from donors with UCeD are split into two parts as indicated. One part (right) is not tet-enriched and is later used to estimate the prevalence of cells with the gluten-specific T cell profile within the sample. The tet-enriched part (left) is used to train a random forest classification model on the phenotype of the tet-positive cells, using repeated K-fold cross-validation. b, Scatter plot of the mean decrease in the Gini score for each predictor, indicating how important each predictor variable is to the final model.
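
For readers unfamiliar with the two quantities named in this legend, the following is a minimal, illustrative sketch and not the analysis code behind the figure. It shows how the Gini impurity decrease of a single split can be computed, and how repeated K-fold cross-validation reshuffles the samples into K folds on each repeat. In a random forest, the per-predictor score aggregates such decreases over every split made on that predictor and averages them across the trees of the forest. The function names, the toy labels (1 = tet-positive cell, 0 = other cell) and the fold settings are assumptions made for illustration only.

```python
import random
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_c^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Drop in Gini impurity when `parent` is split into `left` and `right`."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


def repeated_kfold(n_samples, k, repeats, seed=0):
    """Yield (train_idx, test_idx) pairs: k folds, reshuffled on each repeat."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]
        for i, test in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, test


# Hypothetical toy labels: 1 = tet-positive cell, 0 = other cell.
parent = [1, 1, 1, 0, 0, 0]

# A split that separates the classes completely removes all impurity.
perfect = gini_decrease(parent, [1, 1, 1], [0, 0, 0])

# A split that leaves both children as mixed as the parent removes none.
useless = gini_decrease(parent, [1, 0], [1, 1, 0, 0])

# 3 folds repeated twice gives 6 train/test partitions of the 6 samples.
splits = list(repeated_kfold(n_samples=6, k=3, repeats=2))
```

A predictor that repeatedly produces splits like `perfect` accumulates a large mean decrease in Gini, whereas one that only produces splits like `useless` scores close to zero.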