Extended Data Fig. 1: Detailed model architecture and training strategies.

(a) The feature compression subnetwork consists of an input layer of 2,048 neurons, a bottleneck of 512 neurons, and an output layer of 2,048 neurons. (b) The MLP regression subnetwork consists of an input layer of 512 neurons, a hidden layer of 512 neurons, and an output layer with one neuron per predicted gene. (c) In the ensemble learning strategy (bagging), five models were trained independently on five internal training–validation splits, and their predictions were averaged to form the final prediction. (d) In the model selection strategy, the single model with the highest performance on the validation set was chosen to make predictions on the test set. Of note, DeepPT uses the ensemble learning strategy.
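The layer sizes in panels (a) and (b) fully determine the shapes of the two subnetworks, so a minimal sketch may help make the wiring concrete. The sketch below uses PyTorch purely for illustration; the framework, class names, and activation choices are assumptions not stated in the caption, and only the layer dimensions come from the figure. Note that the 512-dimensional bottleneck of (a) matches the input layer of (b), so it is the compressed representation, not the reconstruction, that feeds the regression head.

```python
# Minimal sketch of the two subnetworks in panels (a) and (b).
# Layer sizes follow the caption; everything else (PyTorch, ReLU,
# class and variable names) is an illustrative assumption.
import torch
import torch.nn as nn

class FeatureCompressor(nn.Module):
    """Panel (a): 2,048 -> 512 bottleneck -> 2,048."""
    def __init__(self, in_dim: int = 2048, bottleneck: int = 512):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, bottleneck), nn.ReLU())
        self.decoder = nn.Linear(bottleneck, in_dim)

    def forward(self, x: torch.Tensor):
        z = self.encoder(x)           # compressed 512-d representation
        return z, self.decoder(z)     # reconstruction for the compression loss

class RegressionHead(nn.Module):
    """Panel (b): 512 -> 512 hidden -> one output neuron per gene."""
    def __init__(self, n_genes: int, bottleneck: int = 512, hidden: int = 512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(bottleneck, hidden), nn.ReLU(),
            nn.Linear(hidden, n_genes),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.mlp(z)
```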
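Panels (c) and (d) contrast two ways of using the five independently trained models. The following sketch, under the same assumptions as above (PyTorch modules; the helper names ensemble_predict and select_best_model are hypothetical), shows the averaging-versus-selection distinction; only that logic, and the count of five models, comes from the caption.

```python
# Panel (c), bagging: average predictions across the five trained models.
# Panel (d), model selection: keep only the best model on the validation set.
import torch

def ensemble_predict(models: list, x: torch.Tensor) -> torch.Tensor:
    """Average the per-model predictions (panel c; used by DeepPT)."""
    with torch.no_grad():
        preds = torch.stack([m(x) for m in models])  # (5, batch, n_genes)
    return preds.mean(dim=0)

def select_best_model(models: list, val_scores: list):
    """Return the single model with the highest validation score (panel d)."""
    best = max(range(len(models)), key=lambda i: val_scores[i])
    return models[best]
```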