Fig. 4: Immune age estimation using external molecular expression data and reinforcement learning.
From: Reading the immune clock: a machine learning model predicts mouse immune age from cellular patterns

a Consistency between model-predicted age and actual age in the test dataset, assessed by R². b R² scores from stratified 5-fold cross-validation, used to evaluate the model’s predictive performance. Each of the five bars represents one fold. The dataset was randomly divided into five subsets, and the model was iteratively trained on four and validated on the remaining one. R² scores represent performance on the validation fold. The red dashed line shows the mean R² score. c Comparison of actual and predicted immune ages for each test sample. These test samples were not included in model training or validation. d–f The external dataset is a synthetic dataset composed of randomly generated expression profiles for 103 molecules extracted from six immune cell populations. This dataset was used to evaluate model generalizability under unseen conditions. d Balanced synthetic data: mean and variability of molecular expression. e Balanced predicted age distribution for the generated external dataset. f Target age versus predicted age for the external dataset, used to evaluate the explanatory power of the built model. The explanatory power (93.43%) corresponds to the coefficient of determination (R²), indicating that ~93% of the variance in target age is explained by the model predictions. $R^2 = 1 - \sum (y_{true} - y_{pred})^2 / \sum (y_{true} - \bar{y}_{true})^2$, where $y_{true}$ is the target age, $\bar{y}_{true}$ is the mean target age, and $y_{pred}$ is the predicted age. g Scheme of ML algorithm training and tool development based on molecular expression levels. A total of 103 immune molecular features derived from six immune cell clusters were integrated. These features were used in a support vector regression (SVR) model to capture the relationship between molecular expression and chronological age. For external validation, synthetic datasets were input into the trained model to predict the nearest immune age values. This diagram summarizes the overall process from data integration to model evaluation.
Data are shown as mean ± S.D.
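
The coefficient of determination defined in panel f can be computed directly from target and predicted ages. The following is a minimal sketch, not the authors' code: the function name `r_squared` and the example ages are hypothetical and chosen only to illustrate the formula.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    y_true: target ages; y_pred: model-predicted ages (same length).
    Hypothetical helper for illustration only.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    mean_true = sum(y_true) / len(y_true)

    # Residual sum of squares: squared prediction errors.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviations from the mean target age.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all target ages are identical")

    return 1 - ss_res / ss_tot


# Illustrative (made-up) target and predicted ages.
target_age = [2, 4, 6, 8]
predicted_age = [2.5, 3.5, 6.5, 7.5]
print(round(r_squared(target_age, predicted_age), 4))  # 0.95
```

With these made-up values, the residual sum of squares is 1 and the total sum of squares is 20, so R² = 0.95, meaning 95% of the variance in target age is explained by the predictions. The reported 93.43% would be read the same way.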