Figure 3 | Scientific Reports

Figure 3

From: Environment and culture shape both the colour lexicon and the genetics of colour perception


Predicting the existence of a dedicated word for blue (Panel A, left) and the frequency of abnormal red-green colour perception (Panel B, right); only the 7 best predictors are presented here (see Methods). 1st row: slope estimates of the best predictors (according to the Bayes factor, WAIC, LOO and K-fold methods) from Bayesian mixed-effects models (BRMS). 2nd row: specificity-based predictor importance from support vector machines (SVMs). 3rd row: accuracy-based predictor importance from random forests (RF), measuring the amount by which the accuracy decreases when one variable is removed from the model; higher values represent more important predictors. 4th row: Gini-index-based predictor importance from random forests (RF), measuring by how much the Gini impurity decreases when a variable is chosen to split a node (note that only relative values matter, and that there is a bias towards using numeric variables to split nodes). 5th row: unconditional predictor importance from conditional random forests (CF); this is similar to the accuracy-based importance from random forests. 6th row: the performance of the four methods (BRMS, SVM, RF and CF) in terms of accuracy (left, as this is a binary classification problem) and \(R^2\) (right, as this is a regression problem). Variable names have been abbreviated for legibility: popSize is population size, humid_m is median humidity, dist2lak is the distance to the closest lake, lat is latitude, genD4 is the 4th dimension of the multidimensional scaling of the between-population genetic distances, climPC1 is the 1st principal component from the Principal Component Analysis (PCA) of the climate variables, macroar is the macroarea, long is longitude, and dist2wat is the distance to the closest body of water (ocean/sea, lake or river). Plots generated using R (ref. 59).
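The Gini-based importance described for the 4th row can be illustrated with a minimal sketch in Python. It is not the authors' R code: the data values, the thresholds and the function names `gini_impurity` and `gini_decrease` are hypothetical and chosen only for illustration, with the variable names `lat` and `humid_m` borrowed from the caption. The sketch computes the impurity decrease for a single split of a single node; in a random forest, such decreases are typically accumulated over all nodes and trees in which a variable is used.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Decrease in Gini impurity when a node is split at values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Impurity of the two child nodes, weighted by their share of the samples.
    children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - children


# Hypothetical toy data: 1 = dedicated word for blue, 0 = no such word.
has_blue = [0, 0, 0, 1, 1, 1]
lat = [2.0, 5.0, 8.0, 40.0, 48.0, 55.0]          # separates the classes cleanly
humid_m = [60.0, 80.0, 62.0, 81.0, 61.0, 79.0]   # barely separates them

decrease_lat = gini_decrease(lat, has_blue, threshold=20.0)
decrease_humid = gini_decrease(humid_m, has_blue, threshold=70.0)

print(f"lat:     {decrease_lat:.4f}")
print(f"humid_m: {decrease_humid:.4f}")
```

In this toy case the split on `lat` removes all of the parent node's impurity, whereas the split on `humid_m` removes very little, so `lat` would rank as the more important predictor. As the caption notes, only this relative ordering is meaningful, not the absolute values.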
