Extended Data Fig. 3: Estimation of the GAE code length and accuracy of age prediction. | Nature Aging

From: An inflammatory aging clock (iAge) based on deep learning tracks multimorbidity, immunosenescence, frailty and cardiovascular aging

We used 5-fold cross-validation to identify the best code length among lengths from 1 to 10. We selected the code length k whose performance was not statistically significantly worse than that of longer codes (paired t-test, p > 0.05). Within each fold, we performed nested 3-fold cross-validation to select hyper-parameters (depth, weight decay and guidance-ratio). In our experiment, the best code length was 5 (a), as increasing the code length to 6 did not significantly improve the total loss (p = 0.18). With the code length fixed at 5, we used 5-fold cross-validation to select the best hyper-parameter setting (depth = 2, guidance-ratio = 0.2, L2 = 0.001) among all GAEs with code length 5. Finally, we trained the GAE on the whole dataset with the selected hyper-parameter setting and used the resulting predictive function as the inflammatory clock predictor. GAE was compared with other machine learning methods (autoencoder, neural network, PCA and RAW) in (b). The neural network had 2 fully connected layers with 5 nodes in each layer and a tanh activation function. For PCA and RAW, we used elastic net to predict age. The GAE method outperformed linear methods for protein data reconstruction and prediction of chronological age (b). In (c), the predictive performance of gradient boosting decision tree (GBDT) was similar to that of PCA. We conclude that GAE is superior to the traditional machine learning methods tested here.
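
The code-length selection rule can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the per-fold loss values and the reading of "not significantly worse" as "the paired t statistic does not exceed the two-sided critical value at p = 0.05" are assumptions made for illustration, as is the choice to return the shortest qualifying length. The critical value 2.776 applies only to 5 folds (4 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided critical value of Student's t at alpha = 0.05 with 4 degrees of
# freedom (5 folds give 5 paired differences, so df = 5 - 1).
T_CRIT_DF4 = 2.776


def paired_t(losses_a, losses_b):
    """Paired t statistic for per-fold losses measured on the same folds."""
    diffs = [a - b for a, b in zip(losses_a, losses_b)]
    m, sd = mean(diffs), stdev(diffs)
    if sd == 0:
        # Every fold differs by the same amount; treat a nonzero gap as decisive.
        return 0.0 if m == 0 else (float("inf") if m > 0 else float("-inf"))
    return m / (sd / sqrt(len(diffs)))


def select_code_length(fold_losses, t_crit=T_CRIT_DF4):
    """Return the shortest code length not significantly worse than any longer one.

    fold_losses maps each code length to its list of per-fold validation losses.
    A t statistic above t_crit means the shorter code has significantly higher loss.
    """
    lengths = sorted(fold_losses)
    if not lengths:
        raise ValueError("fold_losses must contain at least one code length")
    for k in lengths:
        longer = [j for j in lengths if j > k]
        # The longest code has nothing to be compared against and always qualifies.
        if all(paired_t(fold_losses[k], fold_losses[j]) <= t_crit for j in longer):
            return k


# Hypothetical per-fold total losses (NOT data from the study).
hypothetical_losses = {
    1: [1.00, 1.10, 0.90, 1.05, 0.95],
    2: [0.50, 0.62, 0.41, 0.54, 0.47],
    3: [0.49, 0.63, 0.40, 0.55, 0.46],
}

selected = select_code_length(hypothetical_losses)
print("selected code length:", selected)
```

In this toy input, length 1 has a clearly higher loss than lengths 2 and 3 in every fold, while lengths 2 and 3 differ only by small amounts that change sign across folds, so the rule has no reason to prefer the longer code.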
