Table 1 Performance of various models when trained to predict either trained tasting panel descriptors or RateBeer review scores

Model	Trained panel R²	Trained panel rank	RateBeer R²	RateBeer rank
AdaBoost	0.21	2.88	0.61	5.67
Artificial Neural Network	0.14	5.60	0.46	4.00
Extra Trees	0.22	3.02	0.61	4.67
Gradient boosting	0.21	3.42	0.69	1.50
Lasso regression	0.05	4.94	0.64	4.33
Linear regression	−4.13	7.88	−11.02	8.00
Partial Least Squares Regression	−0.25	7.56	0.57	5.33
Random Forest	0.22	2.86	0.62	3.50
Support Vector Regression	0.18	6.50	0.59	6.50
XGBoost	0.22	4.12	0.62	2.00

The performance metric is the coefficient of determination (R²) for predictions on the test dataset, obtained from multi-output models (Methods). The average rank is each model's mean rank after ranking the individual-attribute models per descriptor, with the lowest value indicating the best model. The highest scores per evaluation metric are indicated in bold. Note that some models yield negative R² values, meaning that the mean of the outcome variable would have been a better predictor than the models' predictions. Values for all descriptors can be found in Supplementary Table S3 (tasting panel) and Supplementary Table S4 (RateBeer).
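As a rough illustration of the two metrics described in the table note, the sketch below computes R² from its standard definition (1 minus the residual sum of squares divided by the total sum of squares) and a per-model average rank across descriptors. The function names `r_squared` and `average_ranks`, and all numbers in the usage lines, are hypothetical and not taken from the study. The ranking step assumes a simple ordering by R² without tie handling, which may differ from the procedure actually used.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def average_ranks(scores):
    """Average rank per model across descriptors.

    scores maps a model name to a list of R² values, one per descriptor.
    Within each descriptor the model with the highest R² gets rank 1.
    Assumption: ties are not handled specially (stable sort order decides).
    """
    models = list(scores)
    n_descriptors = len(next(iter(scores.values())))
    ranks = {m: [] for m in models}
    for j in range(n_descriptors):
        ordered = sorted(models, key=lambda m: scores[m][j], reverse=True)
        for position, m in enumerate(ordered, start=1):
            ranks[m].append(position)
    return {m: mean(r) for m, r in ranks.items()}


# Hypothetical toy data, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
mean_pred = [mean(y_true)] * len(y_true)  # always predict the mean
bad_pred = [4.0, 3.0, 2.0, 1.0]           # worse than predicting the mean

r2_mean = r_squared(y_true, mean_pred)    # zero: no better than the mean
r2_bad = r_squared(y_true, bad_pred)      # negative: worse than the mean

toy_scores = {
    "model_a": [0.30, 0.10, 0.25],
    "model_b": [0.20, 0.15, 0.05],
    "model_c": [-0.50, 0.05, 0.10],
}
toy_ranks = average_ranks(toy_scores)     # lowest average rank = best model
```

Predicting the mean of the outcome gives R² = 0 by construction, so any model whose squared error on the test set exceeds that baseline ends up with a negative R², as seen for linear regression in the table.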