Fig. 5: A discriminant analysis of all groups together found the best separation of the data based on the preselected groups, and a machine-learning approach distinguished patients from controls.

A Representation of the discriminant analysis between groups, in 2D and 3D. When all the genes from all groups treated with DMSO or tBHQ were entered into the analysis, three canonical components fully discriminated LR controls, HR controls, LR patients and HR patients. B List of the genes composing canonical components 1, 2 and 3. Canonical 1 discriminates between patients and controls, while canonical 3 discriminates between the GAG-gclc HR and LR genotypes. C An SVM algorithm optimized the discrimination between patients and controls using four input sets: the 76 genes without the GAG-gclc polymorphism (LR/HR) genotype; the 76 genes with the genotype; the 20 most discriminant genes with the genotype; and the 30 most discriminant genes with the genotype. Accuracy, specificity and sensitivity (with the number of misclassified patients and controls) for discriminating between patients and controls are given in the table and in the ROC curves for the different analyses.
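For readers unfamiliar with the metrics reported in panel C, the following minimal Python sketch shows how accuracy, sensitivity and specificity follow from the counts of correctly and incorrectly classified patients and controls. It is an illustration only, not the authors' pipeline: the function name `classification_metrics`, the label strings and the example labels are hypothetical and do not correspond to any data from the study.

```python
def classification_metrics(y_true, y_pred, positive="patient"):
    """Return accuracy, sensitivity and specificity for a two-class problem.

    `positive` is the label treated as the positive class (here, patients);
    every other label is treated as negative (here, controls).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # patient correctly classified
            else:
                fn += 1  # patient misclassified as control
        else:
            if pred == positive:
                fp += 1  # control misclassified as patient
            else:
                tn += 1  # control correctly classified

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # fraction of patients detected
        "specificity": tn / (tn + fp),  # fraction of controls detected
        "misclassified_patients": fn,
        "misclassified_controls": fp,
    }


# Hypothetical labels for illustration only (not data from the study):
# four patients and four controls, with one patient misclassified.
y_true = ["patient"] * 4 + ["control"] * 4
y_pred = ["patient", "patient", "patient", "control"] + ["control"] * 4
metrics = classification_metrics(y_true, y_pred)
```

With these made-up labels, three of four patients and all four controls are classified correctly, so sensitivity is 0.75, specificity is 1.0 and accuracy is 0.875.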