Table 3 Mean (standard deviation) of AUC for the cancer vs. non-cancer classification problem across the four hold-one-site-out folds with \(F{S}_{sd}^{\theta ,\kappa }\) and \(F{S}_{d}^{\theta ,\kappa }\), where \(\theta \in \{SFS,WLCX,mRMR,ROC\}\) and \(\kappa \in \{LDA,QDA,SVM,RF\}\).

For each classifier model, the top five most stable and discriminating features were used to construct \(F{S}_{sd}\), and the top five most discriminating features were used to construct \(F{S}_{d}\). For every feature selection and classification pair, four models were trained and validated, one for each possible combination of three of the four sites: the three chosen sites were pooled for training, and the held-out site was used for validation. The improvement of \(F{S}_{sd}\) over \(F{S}_{d}\) is shown; a positive improvement indicates that \(F{S}_{sd}\) outperformed \(F{S}_{d}\). Note that for this particular problem, the prediction AUCs of all models were very high, nearly perfect in most cases.
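The hold-one-site-out protocol described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' implementation: the nearest-centroid scorer is a stand-in for the LDA, QDA, SVM, and RF classifiers, the toy samples and site names are hypothetical, and the use of the sample standard deviation across folds is an assumption, since the caption does not say which estimator was used. The names `auc`, `hold_one_site_out`, `fit_centroids`, `centroid_score`, and `improvement` are all introduced here for illustration.

```python
from statistics import mean, stdev


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_centroids(train):
    """Stand-in classifier (assumption): store the mean feature vector of each class."""
    def centroid(label):
        rows = [x for x, y in train if y == label]
        return [mean(col) for col in zip(*rows)]
    return centroid(1), centroid(0)


def centroid_score(model, x):
    """Higher score means closer to the positive centroid than to the negative one."""
    pos_c, neg_c = model
    d2 = lambda c: sum((a - b) ** 2 for a, b in zip(x, c))
    return d2(neg_c) - d2(pos_c)


def hold_one_site_out(samples, fit, score):
    """samples: list of (site, features, label). Returns {held_out_site: validation AUC}."""
    results = {}
    for held_out in sorted({site for site, _, _ in samples}):
        # Pool every other site for training; validate on the held-out site only.
        train = [(x, y) for site, x, y in samples if site != held_out]
        test = [(x, y) for site, x, y in samples if site == held_out]
        model = fit(train)
        results[held_out] = auc([y for _, y in test], [score(model, x) for x, _ in test])
    return results


def improvement(aucs_sd, aucs_d):
    """Mean AUC of FS_sd minus mean AUC of FS_d; positive means FS_sd did better."""
    return mean(aucs_sd) - mean(aucs_d)


# Hypothetical toy data: four sites, two features per sample, label 1 = cancer.
toy = [
    ("A", [1.0, 0.2], 1), ("A", [0.1, 0.3], 0),
    ("B", [0.9, 0.1], 1), ("B", [0.2, 0.4], 0),
    ("C", [1.1, 0.3], 1), ("C", [0.0, 0.2], 0),
    ("D", [0.8, 0.2], 1), ("D", [0.3, 0.1], 0),
]

fold_aucs = hold_one_site_out(toy, fit_centroids, centroid_score)
summary = (mean(fold_aucs.values()), stdev(fold_aucs.values()))  # mean (SD) over folds
```

Running the same loop once with the \(F{S}_{sd}\) features and once with the \(F{S}_{d}\) features, then passing the two lists of fold AUCs to `improvement`, would give the kind of difference reported in the table.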
