Fig. 4: Estate and vintage decoding from compounds. | Communications Chemistry


From: Predicting Bordeaux red wine origins and vintages from raw gas chromatograms


a, b Dimensionality reduction of the 32 compounds via t-SNE and UMAP. The estate clusters are less distinct than with the raw chromatograms (Fig. 1), and the right-bank wines (A, B, C) are not clearly separated from the left-bank wines (D, E, F, G), particularly with t-SNE. c Performance histograms for decoding estate identity from subsets of compounds using two classifiers, linear discriminant analysis (LDA) and logistic regression (LR), as in Fig. 2. The horizontal black line in each histogram indicates mean performance across data splits, and the dashed line indicates chance performance (14%). The best average decoding accuracy was 91%, obtained with LR on m_concat (all three compound types concatenated, 32 dimensions), though comparable results were observed for the oak and ester compounds. Performance was markedly weaker for the off-flavor (offFla) compounds (41% for LDA). Overall, performance is worse than with the raw chromatograms (Fig. 2). d Same as in (c), but for decoding vintages. Decoding performance was lower overall than for estate identity; however, LDA and LR performed significantly above chance (8%; p < 0.001) for all compound subsets except offFla. Concatenating the three compound types into m_concat yields slightly stronger performance.
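The decoding procedure summarized in panel c can be illustrated with a minimal sketch, assuming scikit-learn and purely synthetic data. The sample counts, the column layout of the compound families, the number of splits, and the test fraction are all hypothetical choices made for illustration; they are not taken from the article, and the accuracies printed here say nothing about the published results. Only the 7 estates, the 32 compound dimensions, the two classifier types, and the subset names come from the caption.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the article's measurements):
# 7 estates, 12 hypothetical samples each, 32 compound concentrations.
n_estates, n_per_estate, n_compounds = 7, 12, 32
labels = np.repeat(np.arange(n_estates), n_per_estate)
centers = rng.normal(size=(n_estates, n_compounds))
X = centers[labels] + rng.normal(scale=2.0, size=(labels.size, n_compounds))

# Hypothetical column layout for the three compound families;
# the real per-family counts are not given in the caption.
subsets = {
    "oak": np.arange(0, 10),
    "ester": np.arange(10, 24),
    "offFla": np.arange(24, 32),
    "m_concat": np.arange(0, 32),  # all three families concatenated
}

classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    # Standardizing before LR is an assumption made here for stable fitting.
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
}

# Repeated stratified train/test splits; one accuracy value per split.
splits = StratifiedShuffleSplit(n_splits=20, test_size=0.25, random_state=0)
chance = 1.0 / n_estates  # about 14% for 7 equally likely estates

results = {}
for subset_name, cols in subsets.items():
    for clf_name, clf in classifiers.items():
        scores = cross_val_score(clf, X[:, cols], labels, cv=splits)
        results[(subset_name, clf_name)] = scores
        print(f"{subset_name:9s} {clf_name:3s} "
              f"mean accuracy = {scores.mean():.2f} (chance = {chance:.2f})")
```

Each entry of `results` holds the per-split accuracies that a histogram like those in panel c would display, with the mean playing the role of the horizontal black line and `chance` the dashed line.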
