Fig. 6: The structure of the CrystalTransformer model.

a The main part of the CrystalTransformer model. InputA and InputX denote atom (chemistry) and structure (coordinates) information, respectively. After passing through the information extraction layers, the inputs are transformed into the A matrix and the X matrix. These two matrices are then concatenated and processed through the Transformer layers, which include multi-head self-attention, feedforward layers, and other components, to produce the output target. b Chemical information extraction layer. InputA is first passed through an embedding layer, followed by a linear transformation. c Coordinate information extraction layer. InputX undergoes data augmentation, followed by a linear transformation.
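
The data flow in panels a to c can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the single Transformer layer, the mean pooling over atoms, the concatenation along the feature axis, and in particular the `augment` function (a placeholder that appends squared coordinates) are all assumptions made for illustration, since the caption does not specify them.

```python
import numpy as np

# All sizes below are hypothetical; the caption does not state them.
N_ELEMENTS, D_EMB, D_A, D_X, N_HEADS = 119, 16, 32, 32, 4
D_MODEL = D_A + D_X  # width after concatenating the A and X matrices


def linear(x, w, b):
    return x @ w + b


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)


def augment(coords):
    # Placeholder for the unspecified data augmentation in panel c:
    # append element-wise squares, giving 6 features per atom.
    return np.concatenate([coords, coords ** 2], axis=-1)


def init_params(rng):
    def w(*shape):
        return rng.normal(0.0, 0.02, size=shape)

    p = {
        "emb": w(N_ELEMENTS, D_EMB),
        "wa": w(D_EMB, D_A), "ba": np.zeros(D_A),
        "wx": w(6, D_X), "bx": np.zeros(D_X),
        "w1": w(D_MODEL, 4 * D_MODEL), "b1": np.zeros(4 * D_MODEL),
        "w2": w(4 * D_MODEL, D_MODEL), "b2": np.zeros(D_MODEL),
        "w_out": w(D_MODEL, 1), "b_out": np.zeros(1),
    }
    for name in ("q", "k", "v", "o"):
        p["w" + name] = w(D_MODEL, D_MODEL)
        p["b" + name] = np.zeros(D_MODEL)
    return p


def multi_head_self_attention(h, p):
    n, d = h.shape
    d_head = d // N_HEADS

    def split(t):  # (n, d) -> (heads, n, d_head)
        return t.reshape(n, N_HEADS, d_head).transpose(1, 0, 2)

    q = split(linear(h, p["wq"], p["bq"]))
    k = split(linear(h, p["wk"], p["bk"]))
    v = split(linear(h, p["wv"], p["bv"]))
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))  # (heads, n, n)
    merged = (attn @ v).transpose(1, 0, 2).reshape(n, d)
    return linear(merged, p["wo"], p["bo"])


def transformer_layer(h, p):
    h = layer_norm(h + multi_head_self_attention(h, p))
    ffn = linear(np.maximum(0.0, linear(h, p["w1"], p["b1"])), p["w2"], p["b2"])
    return layer_norm(h + ffn)


def forward(input_a, input_x, p):
    a = linear(p["emb"][input_a], p["wa"], p["ba"])   # panel b: embedding, then linear
    x = linear(augment(input_x), p["wx"], p["bx"])    # panel c: augmentation, then linear
    h = np.concatenate([a, x], axis=-1)               # panel a: concatenate A and X
    h = transformer_layer(h, p)
    return linear(h.mean(axis=0), p["w_out"], p["b_out"])  # pooled scalar target


params = init_params(np.random.default_rng(0))
input_a = np.array([11, 17])                              # atomic numbers (Na, Cl)
input_x = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]])   # example coordinates
y = forward(input_a, input_x, params)                     # array of shape (1,)
```

Here `input_a` holds one integer per atom and `input_x` one coordinate triple per atom; with untrained random weights the output value itself carries no meaning, and only the shapes and the order of operations follow the caption.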