Fig. 1: Illustration of the model design and training strategy of TransPeakNet.

A The model takes a molecular structure and derives its atomic representations from a GNN. The solvent information is encoded into a latent representation by the solvent encoder. Each atom's representation is concatenated with the solvent representation, and the result is used to predict the cross shifts of carbon and proton. B Model pre-training on the annotated 1D NMR dataset using MTT. C The pre-trained model is refined through an unsupervised process using the unlabeled HSQC dataset. The final output of the model includes both the HSQC cross-peaks and the atom alignment.
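
The concatenation step described in panel A can be illustrated with a minimal sketch. This is not the authors' implementation: the random arrays stand in for the GNN and solvent encoder outputs, the dimensions (5 atoms, 8 atom features, 4 solvent features) are arbitrary, and the linear head `predict_shifts` is a hypothetical placeholder for whatever predictor the model actually uses.

```python
import numpy as np


def concat_atom_solvent(atom_repr, solvent_repr):
    """Append the same solvent vector to every atom's representation."""
    n_atoms = atom_repr.shape[0]
    # Repeat the single solvent vector once per atom (no copy is made).
    tiled = np.broadcast_to(solvent_repr, (n_atoms, solvent_repr.shape[0]))
    return np.concatenate([atom_repr, tiled], axis=1)


def predict_shifts(features, weights, bias):
    """Hypothetical linear head: one (carbon, proton) value pair per atom."""
    return features @ weights + bias


rng = np.random.default_rng(0)

# Stand-ins (assumptions): a real model would compute these from the
# molecular graph and the solvent label, respectively.
atom_repr = rng.normal(size=(5, 8))   # 5 atoms, 8-dim atomic representation
solvent_repr = rng.normal(size=4)     # 4-dim solvent representation

features = concat_atom_solvent(atom_repr, solvent_repr)

# Untrained, randomly initialised head; the outputs carry no chemical meaning.
weights = rng.normal(size=(features.shape[1], 2))
bias = np.zeros(2)
shifts = predict_shifts(features, weights, bias)

print(features.shape, shifts.shape)
```

Because the solvent vector is shared across all atoms of a molecule, every row of `features` ends with the same four values, so the head sees both per-atom and per-sample context.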