Extended Data Fig. 1: Quantum circuit tensor encoding.

(a) Schematic representation of the gate embeddings for a single-qubit and a multi-qubit gate. (b) Quantum circuit encoding and decoding pipeline. For encoding (green arrows), an input quantum circuit (top left) is first tokenized according to the proposed vocabulary. The token matrix is then transformed into a continuous tensor using the chosen embeddings v_i (bottom right). To decode a continuous tensor into a circuit (blue arrows), we first use the cosine similarity between the input embeddings and those assigned to existing tokens to generate a token matrix, which is then transformed back into a circuit by means of the vocabulary. The transformation between circuits and tokens depends on this vocabulary, which can be changed at will to suit the desired computing framework or platform. Further details are given in the text.
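
The encoding and decoding steps of panel (b) can be illustrated with a minimal sketch. It is not the implementation used in this work: the vocabulary `VOCAB`, the random embeddings, the embedding dimension, and the helper names `make_embeddings`, `encode`, and `decode` are all hypothetical choices made for illustration. It also assumes that decoding assigns each vector to the token whose embedding has the highest cosine similarity, and it ignores how multi-qubit gates are marked across qubits.

```python
import math
import random

# Hypothetical vocabulary: token id -> gate name (0 marks an empty slot).
VOCAB = {0: "empty", 1: "h", 2: "x", 3: "cx"}


def make_embeddings(num_tokens, dim, seed=0):
    """Draw one random embedding vector v_i per token.

    Random Gaussian vectors are an assumption made for this sketch.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_tokens)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def encode(token_matrix, embeddings):
    """Replace each token by its embedding: (qubits, steps) -> (qubits, steps, dim)."""
    return [[list(embeddings[tok]) for tok in row] for row in token_matrix]


def decode(tensor, embeddings):
    """Map each vector to the token whose embedding is most similar (cosine)."""
    token_ids = range(len(embeddings))
    return [
        [max(token_ids, key=lambda i: cosine(vec, embeddings[i])) for vec in row]
        for row in tensor
    ]


if __name__ == "__main__":
    # Rows are qubits, columns are time steps (a hypothetical 2-qubit circuit).
    tokens = [[1, 3, 0], [0, 3, 2]]
    emb = make_embeddings(num_tokens=len(VOCAB), dim=8)
    tensor = encode(tokens, emb)
    recovered = decode(tensor, emb)
    print([[VOCAB[t] for t in row] for row in recovered])
```

Because decoding only compares directions, a tensor that deviates slightly from the exact embeddings is still mapped to the nearest token, so the round trip from token matrix to tensor and back recovers the original tokens as long as the embeddings of different tokens are not parallel.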